I'm trying to fit a function f(x,y,z) with the following quadratic polynomial:
f(x,y,z) = x^2/(2*mxx) + x*y/mxy + x*z/mxz + y^2/(2*myy) + y*z/myz + z^2/(2*mzz) + f(0,0,0)
It describes some distorted spherical surface in three dimensions. The problem is related to the calculation of effective masses in solid state physics.
Here is a picture of the data to show that it indeed falls off parabolically in all directions, even though the curvature in the z-direction is rather low:
[plot: 1D parabolic cuts of the data along the different directions]
I'm interested in the coefficients, which correspond to effective masses. I've got an array of xyz coordinates, which is regular and centered on the maximum:
[[ 0. 0. 0. ]
[ 0. 0. 0.01282017]
[ 0. 0. 0.02564034]
...
[-0.05026321 -0.05026321 -0.03846052]
[-0.05026321 -0.05026321 -0.02564034]
[-0.05026321 -0.05026321 -0.01282017]]
And a corresponding 1D array of scalar values, one for each point. The number of data points around this maximum can range from 100 to 1000.
This is the code I'm currently trying to use for fitting:
def func(data, mxx, mxy, mxz, myy, myz, mzz):
    x = data[:, 0]
    y = data[:, 1]
    z = data[:, 2]
    return (
        (1 / (2 * mxx)) * (x ** 2)
        + (1 / (1 * mxy)) * (x * y)
        + (1 / (1 * mxz)) * (x * z)
        + (1 / (2 * myy)) * (y ** 2)
        + (1 / (1 * myz)) * (y * z)
        + (1 / (2 * mzz)) * (z ** 2)
    ) + f(0, 0, 0)

energy = data[:, 3]
guess = (mxx, mxy, mxz, myy, myz, mzz)
params, pcov = scipy.optimize.curve_fit(
    func, data, energy, p0=guess, method="trf"
)
Where f(0,0,0) is the value of the function at (0, 0, 0), which I retrieve with the scipy.interpolate.griddata function.
For this problem, the masses should be negative and have values between -0.2 and -2, roughly speaking. I'm creating guess values through a finite difference differentiation.
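Roughly, the guess along each direction is a central second difference of the energy at the maximum, and its reciprocal (the numbers below are only placeholders to show the idea, not my actual data):

import numpy as np

# Central second difference along z at the maximum; the mass guess is its reciprocal,
# since the model behaves like E = z**2 / (2 * mzz) + ... near the extremum.
h = 0.01282017                        # grid spacing along z (see the coordinate array above)
f0 = -0.0195                          # value at the maximum
f_plus, f_minus = -0.0196, -0.0196    # placeholder neighbouring values along +z / -z
d2f_dz2 = (f_plus - 2 * f0 + f_minus) / h**2
mzz_guess = 1.0 / d2f_dz2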
However, I don't get any sensible results from scipy.optimize.curve_fit - typically the coefficients end up with huge numbers (like 1e9). I'm completely lost at this point.
What am I doing wrong :( ?
One of the problems is that you fit 1/m. While this is correct from a physics point of view, it is bad from the algorithm's point of view: if the fitting algorithm needs to change the sign of an m near zero, the coefficients diverge. Consequently, it is better to fit mI = 1/m and propagate the errors accordingly afterwards. Here I use leastsq, which requires some additional calculations for the covariance matrix (as it returns the reduced form). I do the fit with g() and the inverse masses, but you can immediately reproduce your problems by introducing f() and fitting the ms directly.
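If the uncertainties of the masses themselves are needed, they can be propagated from the fitted inverse values. A minimal sketch (the helper below is mine; it assumes you pass the six inverse-mass parameters and the corresponding block of the scaled covariance matrix):

import numpy as np

def masses_with_errors(solI, covI):
    # first-order error propagation for m = 1/mI:
    #   var(m) ~ var(mI) / mI**4,  i.e.  sigma(m) ~ sigma(mI) / mI**2
    solI = np.asarray(solI, dtype=float)
    masses = 1.0 / solI
    sigma = np.sqrt(np.diag(covI)) / solI**2
    return masses, sigma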
A second point is that the data has an offset, i.e. if x = y = z = 0 the data is v = -0.0195. This needs to be introduced into the model.
Finally, I'd say that you already have non-parabolic behaviour in your data.
Nevertheless, here is what it looks like:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(linewidth=300)
from scipy.optimize import leastsq
from scipy.optimize import curve_fit
data = np.loadtxt( "silicon.csv", delimiter=',' )
def f( x, y, z, mxx, mxy, mxz, myy, myz, mzz, offI ):
    # model written in terms of the masses themselves (used here only for plotting)
    out = 1./(2 * mxx) * x * x
    out += 1./( mxy ) * x * y
    out += 1./( mxz ) * x * z
    out += 1./( 2 * myy ) * y * y
    out += 1./( myz ) * y * z
    out += 1./( 2 * mzz ) * z * z
    out += 1./offI
    return out
def g( x, y, z, mxxI, mxyI, mxzI, myyI, myzI, mzzI, off ):
    # model written in terms of the inverse masses (used for the actual fit)
    out = mxxI / 2 * x * x
    out += mxyI * x * y
    out += mxzI * x * z
    out += myyI / 2 * y * y
    out += myzI * y * z
    out += mzzI / 2 * z * z
    out += off
    return out
def residuals( params, indata ):
    out = list()
    for x, y, z, v in indata:
        out.append( v - g( x, y, z, *params ) )
    return out
sol, cov, info, msg, ier = leastsq( residuals, 7*[0], args=( data, ), full_output=True)
s_sq = sum( [x**2 for x in residuals( sol, data) ] )/ (len( data ) - len( sol ) )
print "solution"
print sol
masses = [1/x for x in sol]
print "masses:"
print masses
print "covariance matrix:"
covMX = cov * s_sq
print covMX
print "sum of residuals"
print sum( residuals( sol, data) )
### plotting the cuts
fig = plt.figure('cuts')
ax = dict()
for i in range( 1, 10 ):
    ax[i] = fig.add_subplot( 3, 3, i )
dl = np.linspace( -.2, .2, 25)
#### xx
xdata = [ [ x, v ] for x,y,z,v in data if ( abs(y)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( x, 0, 0, *masses ) for x in dl ), float )
ax[1].plot( *zip(*sorted( xdata ) ), ls='', marker='o')
ax[1].plot( dl, vl )
#### xy
xydata = [ [ x, v ] for x, y, z, v in data if ( abs( x - y )<1e-2 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( xy, xy, 0, *masses ) for xy in dl ), float )
ax[2].plot( *zip(*sorted( xydata ) ), ls='', marker='o')
ax[2].plot( dl, vl )
#### xz
xzdata = [ [ x, v ] for x, y, z, v in data if ( abs( x - z )<1e-2 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( xz, 0, xz, *masses ) for xz in dl ), float )
ax[3].plot( *zip(*sorted( xzdata ) ), ls='', marker='o')
ax[3].plot( dl, vl )
#### yy
ydata = [ [ y, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, y, 0, *masses ) for y in dl ), float )
ax[5].plot( *zip(*sorted( ydata ) ), ls='', marker='o' )
ax[5].plot( dl, vl )
#### yz
yzdata = [ [ y, v ] for x, y, z, v in data if ( abs( y - z )<1e-2 and abs(x) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, yz, yz, *masses ) for yz in dl ), float )
ax[6].plot( *zip(*sorted( yzdata ) ), ls='', marker='o')
ax[6].plot( dl, vl )
#### zz
zdata = [ [ z, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, 0, z, *masses ) for z in dl ), float )
ax[9].plot( *zip(*sorted( zdata ) ), ls='', marker='o' )
ax[9].plot( dl, vl )
#### some diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x - z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, d, *masses ) for d in dl ), float )
ax[7].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[7].plot( dl, vl )
#### some other diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x + z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, -d, *masses ) for d in dl ), float )
ax[8].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[8].plot( dl, vl )
plt.show()
This gives the following output:
solution
[-1.46528595 0.25090717 0.25090717 -1.46528595 0.25090717 -1.46528595 -0.01993436]
masses:
[-0.6824606499739905, 3.985537743156507, 3.9855376943660676, -0.6824606473928339, 3.9855377322848344, -0.6824606467055248, -50.16463861555409]
covariance matrix:
[
[ 4.76417852e-03 -1.46907683e-12 -8.57639600e-12 -2.21281938e-12 -2.38444957e-12 8.42981521e-12 -2.70034183e-05]
[-1.46907683e-12 9.17104397e-04 -7.10573582e-13 1.32125214e-11 7.44553140e-12 1.29909935e-11 -1.11259046e-13]
[-8.57639600e-12 -7.10573582e-13 9.17104389e-04 -8.60004172e-12 -6.14797647e-12 8.27070243e-12 3.11127064e-14]
[-2.21281914e-12 1.32125214e-11 -8.60004172e-12 4.76417860e-03 -4.20477032e-12 9.20893224e-12 -2.70034186e-05]
[-2.38444957e-12 7.44553140e-12 -6.14797647e-12 -4.20477032e-12 9.17104395e-04 1.50963408e-11 -7.28889534e-14]
[ 8.42981530e-12 1.29909935e-11 8.27070243e-12 9.20893175e-12 1.50963408e-11 4.76417849e-03 -2.70034182e-05]
[-2.70034183e-05 -1.11259046e-13 3.11127064e-14 -2.70034186e-05 -7.28889534e-14 -2.70034182e-05 5.77019926e-07]
]
sum of residuals
4.352727352163743e-09
...and here are some 1D cuts that show significant deviation from parabolic behaviour once one is off the main axes.
Related
I am trying to train a very basic linear regression model to predict a linear equation Y = m*X + c
The Weight parameter is optimized to 5 but the Bias parameter is stuck at 0. Am I doing something wrong?
import numpy as np

X = np.array(range(1,1000))
Y = 5 * X + 7

def forward(W, X, b):
    return W * X + b

def getcost(Y, y):
    return np.sum((Y-y)**2) / 1000

def backward(W, b, X, Y, y, lr):
    dW = -2 * np.dot((Y-y).T, X) / 1000
    db = -2 * np.sum(Y-y) / 1000
    W -= lr * dW
    b -= lr * db
    return W, b

W = 0.0
b = 0.0
for i in range(80):
    y = forward(W, X, b)
    cost = getcost(Y, y)
    W, b = backward(W, b, X, Y, y, lr=0.000001)
    print(int(cost), W, b)
The range of X is too extensive; since X and Y have a linear relationship, the model can be trained on a small range of values. The learning rate is also very small, so it will take much longer to converge given how big your input set is. If you really want to use the same data, you can normalize X.
import numpy as np
import matplotlib.pyplot as plt

X = np.array(range(1,30))
Y = 5 * X + 7
# Normalize the X values
#X = (X - np.mean(X)) / np.std(X)
N = len(Y)
learning_rate = 0.001
# Initialize the model parameters m and b
m, b = 0.0, 0.0
errors = []
for p in range(8000):
    hyp = m * X + b
    error = Y - hyp
    m_gradient = -(2/N) * np.sum(X * error)
    b_gradient = -(2/N) * np.sum(error)
    m = m - learning_rate * m_gradient
    b = b - learning_rate * b_gradient
    errors.append(np.mean(error ** 2))
    if p%400==0:
        print(f'm={m} b={b} ' )

# prediction for x = 200, y should be 5*200+7 = 1007
print( m*200+b )
plt.plot(errors)
plt.xlabel('Iteration')
plt.ylabel('Error')
plt.show()
I agree with @Ahsan Nawaz.
The only changes I made to your code are -
Scaled your features (otherwise, increasing the learning_rate gave NaNs)
Increased the learning rate
Increased the number of epochs
Here is your code modified -
import numpy as np
from sklearn.preprocessing import StandardScaler
X = np.array(range(1,1000))
scaler = StandardScaler()
scaler.fit(X.reshape(-1,1))
X = scaler.transform(X.reshape(-1,1)).reshape(-1)
Y = 5 * X + 7
def forward(W, X, b):
    return W * X + b

def getcost(Y, y):
    return np.sum((Y-y)**2) / 1000

def backward(W, b, X, Y, y, lr):
    dW = -2 * np.dot((Y-y).T, X) / 1000
    db = -2 * np.sum(Y-y) / 1000
    W -= lr * dW
    b -= lr * db
    return W, b
W = 0.0
b = 0.0
for i in range(8000):
    y = forward(W, X, b)
    cost = getcost(Y, y)
    W, b = backward(W, b, X, Y, y, lr=0.001)
    print(int(cost), W, b)
Here is the final output -
0 4.999999437318114 6.999999212245364
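As an additional sanity check (my addition, not strictly necessary), the same line can be recovered in closed form with ordinary least squares, which the gradient-descent result should approach:

import numpy as np

# closed-form least squares on the original data: solve [x, 1] @ [W, b] = Y
X = np.array(range(1, 1000), dtype=float)
Y = 5 * X + 7
A = np.vstack([X, np.ones_like(X)]).T          # design matrix [x, 1]
W, b = np.linalg.lstsq(A, Y, rcond=None)[0]
print(W, b)                                    # approximately 5.0 and 7.0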
I have a data set of {x2} values for which two arrays f[x2] and g[x2] are known. The data set {x2} is not uniformly spaced, and I would like to evaluate the convolution integral of f and g using these known samples. Minimal code for this would be something like:
import numpy as np
from scipy.integrate import quad

#irregular grid for data points
x2 = np.geomspace( 5, 10, 100 )
x2n =-np.flip( x2 )
x2 = np.concatenate( ( x2n, x2 ) )
x2 = np.concatenate( (np.array([0.0]) , x2 ), axis=0 )
x_inner = np.linspace( -5,5, 1000 )
x2 = np.concatenate( ( x_inner, x2 ) )
x2 = np.sort(x2)
f2 = np.zeros( x2.shape[0], dtype=np.complex128 )
f2[ np.abs(x2)<=2 ] = 1.0 + 2j
g2 = np.zeros( x2.shape[0], dtype=np.complex128 )
g2 = np.sin( x2**3 )*np.exp( -x2**2 ) + 1j*np.sin( x2 )*np.exp( -x2**2 )
def fg_x( f, g ):
    return f*g

def convolution_quad( f , g ):
    return quad( fg_x, -np.inf, np.inf, args=(g) )
#evaluate convolution of the two arrays over the irregular sample data x2
res2 = convolution_quad( f2, g2)
However, this function call does not work at all; it gives the error:
return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit)
TypeError: only size-1 arrays can be converted to Python scalars
How can one calculate such convolution integrals over a discrete data set using scipy's quad? Such integrals can be evaluated with the trapezoid rule or Simpson's rule, but here I am looking for a more accurate evaluation.
quad requires a continuous function (a Python callable) as input.
Since your data is discrete, you should use discrete convolution instead, i.e.
numpy.convolve
res2 = np.convolve(f2, g2)
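One caveat worth adding (my note, not part of the answer above): np.convolve returns the discrete convolution sum; to approximate the convolution integral you would additionally multiply by the grid spacing, which only makes sense on a uniform grid. A rough sketch under that assumption:

import numpy as np

# uniform grid, so the convolution integral is approximated by a Riemann sum
x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]
f = np.where(np.abs(x) <= 2, 1.0 + 2j, 0.0)
g = np.sin(x**3) * np.exp(-x**2) + 1j * np.sin(x) * np.exp(-x**2)
conv = np.convolve(f, g) * dx          # approximates the integral of f(t) g(x - t) dt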
How can I plot these equations, please? The output is empty - there are only axes but no line.
import numpy as np
import matplotlib.pyplot as plt
r = 50
a = 5
n = 20
t = 5
x = (r + a * np.sin(n * t * 360 )) * np.cos (t * 360 )
y = (r + a * np.sin(n * t * 360 )) * np.sin (t * 360 )
fig, ax = plt.subplots()
ax.plot(x, y)
plt.show()
You are currently just calculating single values for x and y:
>>> import numpy as np
>>> r, a, n, t = 50, 5, 20, 5
>>> x = (r + a * np.sin(n * t * 360 )) * np.cos (t * 360 )
>>> y = (r + a * np.sin(n * t * 360 )) * np.sin (t * 360 )
>>> print(x, y)
-47.22961311822641 6.299155241288046
This means there is no line for matplotlib to plot.
To plot a line, you have to pass two or more points for matplotlib to draw lines between.
import matplotlib.pyplot as plt
import numpy as np
t = np.linspace(0, 2*np.pi, 100) # create an array of 100 points between 0 and 2*pi
x = np.sin(2*t)
y = np.cos(t)
plt.plot(x, y)
plt.show()
Or in your case:
t = np.linspace(0, 2*np.pi, 1000)
# removed the factor *360 as numpy's sin/cos works with radians by default
x = (r + a * np.sin(n * t)) * np.cos(t)
y = (r + a * np.sin(n * t)) * np.sin(t)
plt.plot(x, y)
plt.show()
You are evaluating the function just at t=5. You should give a range of values to evaluate. If you change the t variable to, for example,
t= np.array([0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1])
you will see a graph, but it is up to you to define the range and the step for your needs.
I'm working on some code which needs to be able to perform a 2D Gaussian fit. I mostly based my code on the following question: Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error. The problem now is that I don't really have an initial guess for the different parameters that need to be used.
I've tried this:
def twoD_Gaussian(x_data_tuple, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    (x, y) = x_data_tuple
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp( - (a*((x-xo)**2) + 2*b*(x-xo)*(y-yo)
                                      + c*((y-yo)**2)))
    return g.ravel()
The data.reshape(201,201) is just something I took from the aforementioned question.
mean_gauss_x = sum(x * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_x = np.sqrt(sum(data.reshape(201,201) * (x - mean_gauss_x)**2) / sum(data.reshape(201,201)))
mean_gauss_y = sum(y * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_y = np.sqrt(sum(data.reshape(201,201) * (y - mean_gauss_y)**2) / sum(data.reshape(201,201)))
initial_guess = (np.max(data), mean_gauss_x, mean_gauss_y, sigma_gauss_x, sigma_gauss_y,0,10)
popt, pcov = curve_fit(twoD_Gaussian, (x, y), data, p0=initial_guess)
data_fitted = twoD_Gaussian((x, y), *popt)
If I try this, I get following error message: ValueError: setting an array element with a sequence.
Is the reasoning about the initial parameters correct?
And why do I get this error?
If I use the runnable code from the linked question and substitute your definition of initial_guess:
mean_gauss_x = sum(x * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_x = np.sqrt(sum(data.reshape(201,201) * (x - mean_gauss_x)**2) / sum(data.reshape(201,201)))
mean_gauss_y = sum(y * data.reshape(201,201)) / sum(data.reshape(201,201))
sigma_gauss_y = np.sqrt(sum(data.reshape(201,201) * (y - mean_gauss_y)**2) / sum(data.reshape(201,201)))
initial_guess = (np.max(data), mean_gauss_x, mean_gauss_y, sigma_gauss_x, sigma_gauss_y,0,10)
Then
print(initial_guess)
yields
(13.0, array([...]), array([...]), array([...]), array([...]), 0, 10)
Notice that some of the values in initial_guess are arrays. The optimize.curve_fit function expects initial_guess to be a tuple of scalars. This is the source of the problem.
The error message
ValueError: setting an array element with a sequence
often arises when an array-like is supplied when a scalar value is expected. It is a hint that the source of the problem may have to do with an array having the wrong number of dimensions. For example, it might arise if you pass a 1D array to a function that expects a scalar.
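A minimal illustration of how this error can be triggered (my example, not from the original code): curve_fit converts p0 to a numeric array internally, and a "tuple" that mixes scalars with arrays cannot be turned into a regular array.

import numpy as np

bad_guess = (13.0, np.array([1.0, 2.0]), 0, 10)   # hypothetical mixed tuple
try:
    np.asarray(bad_guess, dtype=float)
except ValueError as exc:
    print(exc)   # setting an array element with a sequence...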
Let's look at this piece of code taken from the linked question:
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)
x and y are 1D arrays, while X and Y are 2D arrays. (I've capitalized all 2D arrays to help distinguish them from 1D arrays).
Now notice that Python sum and NumPy's sum method behave differently when applied to 2D arrays:
In [146]: sum(X)
Out[146]:
array([ 0., 201., 402., 603., 804., 1005., 1206., 1407.,
1608., 1809., 2010., 2211., 2412., 2613., 2814., 3015.,
...
38592., 38793., 38994., 39195., 39396., 39597., 39798., 39999.,
40200.])
In [147]: X.sum()
Out[147]: 4040100.0
The Python sum function is equivalent to
total = 0
for item in X:
total += item
Since X is a 2D array, the loop for item in X is iterating over the rows of X. Each item is therefore a 1D array representing a row of X. Thus, total ends up being a 1D array.
In contrast, X.sum() sums all the elements in X and returns a scalar.
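To make the difference concrete, here is a small check (my snippet) showing that the built-in sum matches summing along axis 0, while the no-axis .sum() collapses everything to one number:

import numpy as np

x = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, x)
print(np.allclose(sum(X), X.sum(axis=0)))   # True: built-in sum adds up the rows
print(X.sum())                              # 4040100.0: a single scalar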
Since initial_guess should be a tuple of scalars,
everywhere you use sum you should instead use the NumPy sum method. For example, replace
mean_gauss_x = sum(x * data) / sum(data)
with
mean_gauss_x = (X * DATA).sum() / (DATA.sum())
import numpy as np
import scipy.optimize as optimize
import matplotlib.pyplot as plt
# define model function and pass independant variables x and y as a list
def twoD_Gaussian(data, amplitude, xo, yo, sigma_x, sigma_y, theta, offset):
    X, Y = data
    xo = float(xo)
    yo = float(yo)
    a = (np.cos(theta) ** 2) / (2 * sigma_x ** 2) + (np.sin(theta) ** 2) / (
        2 * sigma_y ** 2
    )
    b = -(np.sin(2 * theta)) / (4 * sigma_x ** 2) + (np.sin(2 * theta)) / (
        4 * sigma_y ** 2
    )
    c = (np.sin(theta) ** 2) / (2 * sigma_x ** 2) + (np.cos(theta) ** 2) / (
        2 * sigma_y ** 2
    )
    g = offset + amplitude * np.exp(
        -(a * ((X - xo) ** 2) + 2 * b * (X - xo) * (Y - yo) + c * ((Y - yo) ** 2))
    )
    return g.ravel()
# Create x and y indices
x = np.linspace(0, 200, 201)
y = np.linspace(0, 200, 201)
X, Y = np.meshgrid(x, y)
# create data
data = twoD_Gaussian((X, Y), 3, 100, 100, 20, 40, 0, 10)
data_noisy = data + 0.2 * np.random.normal(size=data.shape)
DATA = data.reshape(201, 201)
# add some noise to the data and try to fit the data generated beforehand
mean_gauss_x = (X * DATA).sum() / (DATA.sum())
sigma_gauss_x = np.sqrt((DATA * (X - mean_gauss_x) ** 2).sum() / (DATA.sum()))
mean_gauss_y = (Y * DATA).sum() / (DATA.sum())
sigma_gauss_y = np.sqrt((DATA * (Y - mean_gauss_y) ** 2).sum() / (DATA.sum()))
initial_guess = (
    np.max(data),
    mean_gauss_x,
    mean_gauss_y,
    sigma_gauss_x,
    sigma_gauss_y,
    0,
    10,
)
print(initial_guess)
# (13.0, 100.00000000000001, 100.00000000000001, 57.106515650488404, 57.43620227324201, 0, 10)
# initial_guess = (3,100,100,20,40,0,10)
popt, pcov = optimize.curve_fit(twoD_Gaussian, (X, Y), data_noisy, p0=initial_guess)
data_fitted = twoD_Gaussian((X, Y), *popt)
fig, ax = plt.subplots(1, 1)
ax.imshow(
    data_noisy.reshape(201, 201),
    cmap=plt.cm.jet,
    origin="lower",
    extent=(X.min(), X.max(), Y.min(), Y.max()),
)
ax.contour(X, Y, data_fitted.reshape(201, 201), 8, colors="w")
plt.show()
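As a small usage note (my addition, continuing directly from the script above), the 1-sigma uncertainties of the fitted parameters can be read off the diagonal of pcov:

# popt and pcov come from the optimize.curve_fit call above
perr = np.sqrt(np.diag(pcov))
names = ["amplitude", "xo", "yo", "sigma_x", "sigma_y", "theta", "offset"]
for name, value, err in zip(names, popt, perr):
    print(f"{name:10s} = {value:8.3f} +/- {err:.3f}")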
I'm trying to write an implementation of elgamal with elliptic curves in haskell.
But there's some problem in my point addition function: as long as I keep adding the start point to itself I never reach the point at infinity (O).
Here is my code:
addP :: Curve->Point->Point->Point
addP _ O O = O
addP _ O p = p
addP _ p O = p
addP curve@(a,b,p) (P x1 y1) (P x2 y2) | x1 == x2 && y1 == -y2 = O
                                       | otherwise = P x3 ((m*(x1-x3)-y1) `mod''` p)
    where x3 = (((m*m)-x1-x2) `mod''` p)
          m | x1 /= x2 = (y2-y1)/(x2-x1)
            | otherwise = (3*(x1*x1)+a)/(2*y1)
Where Curve is defined as
-- first double=a, second double=b, third double=p in y^2=x^3+ax+b mod p
type Curve = (Double, Double, Double)
and Point is defined as
data Point = P Double Double |
             O
             deriving (Eq, Read, Show)
Does anyone know what I've done wrong?
as long as I keep adding the start point to itself I never reach the point at infinity (O).
Could you please post the reference/link where you learned this? I have very limited knowledge of elliptic curves, but I know a little bit of Haskell, so I tried to see what is going on with your code. The very first thing I noticed is the use of division and Double while you are using modular arithmetic modulo a prime p. I am not able to see what your mod'' does, so I changed your code a little bit and it's working fine for me.
type Curve = ( Integer , Integer , Integer )
data Point = P Integer Integer | O
    deriving (Eq, Read, Show)

extendedGcd :: Integer -> Integer -> ( Integer , Integer )
extendedGcd a b
    | b == 0 = ( 1 , 0 )
    | otherwise = ( t , s - q * t ) where
        ( q , r ) = quotRem a b
        ( s , t ) = extendedGcd b r

modInv :: Integer -> Integer -> Integer
modInv a b
    | gcd a b /= 1 = error " gcd is not 1 "
    | otherwise = d where
        d = until ( > 0 ) ( + b ) . fst . extendedGcd a $ b

addP :: Curve->Point->Point->Point
addP _ O O = O
addP _ O p = p
addP _ p O = p
addP ( a, b, p ) ( P x1 y1 ) ( P x2 y2 )
    | x1 == x2 && mod ( y1 + y2 ) p == 0 = O
    | otherwise = P x3 ( mod ( m * ( x1 - x3 ) - y1 ) p ) where
        m | x1 /= x2 = ( mod ( y2 - y1 ) p ) * modInv ( mod ( x2 - x1 ) p ) p
          | otherwise = ( 3 * x1 * x1 + a ) * modInv ( 2 * y1 ) p
        x3 = mod ( m * m - x1 - x2 ) p
Let's take the curve y^2 = x^3 + x + 1 modulo 13. Z_13 = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 ]. The quadratic residues ( QR ) of Z_13 are [ 0, 1, 3, 4, 9, 10, 12 ] and the quadratic non-residues ( QNR ) are [ 2, 5, 6, 7, 8, 11 ]. Take x = 0: we have y^2 = 1 ( mod 13 ), and since 1 is a QR the solutions are 1 and 12, giving the two points ( 0, 1 ) and ( 0, 12 ). Putting x = 1, y^2 = 3 ( mod 13 ), so the points corresponding to x = 1 are ( 1, 4 ) and ( 1, 9 ). Putting x = 2, y^2 = 11 ( mod 13 ), and 11 is a QNR, so there is no solution. Whenever a solution exists it gives us two points, which are inverses of each other modulo the prime p ( 13 in this case ). The points on the given curve are ( 0, 1 ), ( 0, 12 ), ( 1, 4 ), ( 1, 9 ), ( 4, 2 ), ( 4, 11 ), ( 5, 1 ), ( 5, 12 ), ( 7, 0 ), ( 8, 1 ), ( 8, 12 ), ( 10, 6 ), ( 10, 7 ), ( 11, 2 ), ( 11, 11 ), ( 12, 5 ), ( 12, 8 ), together with the point at infinity O. You can try all the points and see which ones generate the whole group.
*Main>take 20 . iterate ( addP ( 1 , 1 , 13 ) ( P 7 0 ) ) $ ( P 7 0 )
[P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O]
*Main> take 20 . iterate ( addP ( 1 , 1 , 13 ) ( P 0 12 ) ) $ ( P 0 12 )
[P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6]
Coming back to the ElGamal system:
1. Bob chooses an elliptic curve E( a, b ) over GF( p ) or GF( 2^n ).
2. Bob chooses a point e1( x1, y1 ) on the curve.
3. Bob chooses an integer d.
4. Bob calculates e2( x2, y2 ) = d * e1( x1, y1 ).
5. Bob announces E( a, b, p ), e1( x1, y1 ) and e2( x2, y2 ) as his public key and keeps d as his private key.
Encryption.
Alice selects P, a point on the curve, as her plaintext. She chooses a random number r and computes C1 = r * e1, C2 = P + r * e2.
Decryption.
Bob, after receiving C1 and C2, computes C2 - d * C1 => P + r * e2 - d * r * e1
=> P + r * d * e1 - d * r * e1 => P
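To make the scheme concrete, here is a small numerical check in Python (my sketch, mirroring the Haskell addP above on the toy curve y^2 = x^3 + x + 1 over GF(13); the private key d, the random number r and the message point are chosen arbitrarily for illustration):

p, a = 13, 1
O = None                                  # point at infinity

def add(P1, P2):
    # mirrors the Haskell addP above
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P1 == P2:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # modular inverse via pow (Python 3.8+)
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def mul(k, P):
    # scalar multiple by repeated addition (fine for a toy curve)
    R = O
    for _ in range(k):
        R = add(R, P)
    return R

def neg(P):
    return O if P is O else (P[0], (-P[1]) % p)

e1 = (0, 12)                              # public base point
d = 5                                     # Bob's private key
e2 = mul(d, e1)                           # Bob's public key
P_msg = (10, 6)                           # Alice's plaintext point
r = 4                                     # Alice's random number
C1, C2 = mul(r, e1), add(P_msg, mul(r, e2))
print(add(C2, neg(mul(d, C1))) == P_msg)  # C2 - d*C1 recovers P -> True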
Edit: You are correct! If you take a generator element and keep adding it, you can generate the whole group. See the lecture by Christof Paar [1].
[1]https://www.youtube.com/watch?v=3S9eZRHjP8g&list=PLn_QCKxjl9zmx3VojkDqljZcLCIslz7kB&index=37