Where is the mistake in this Elliptic Curve code - haskell

I'm trying to write an implementation of ElGamal with elliptic curves in Haskell.
But there's some problem in my point addition function: as long as I keep adding the start point to itself I never reach the point at infinity (O).
Here is my code:
addP :: Curve->Point->Point->Point
addP _ O O = O
addP _ O p = p
addP _ p O = p
addP curve@(a,b,p) (P x1 y1) (P x2 y2) | x1 == x2 && y1 == -y2 = O
                                       | otherwise = P x3 ((m*(x1-x3)-y1) `mod''` p)
  where x3 = (((m*m)-x1-x2) `mod''` p)
        m | x1 /= x2 = (y2-y1)/(x2-x1)
          | otherwise = (3*(x1*x1)+a)/(2*y1)
Where Curve is defined as
-- first double=a, second double=b, third double=p in y^2=x^3+ax+b mod p
type Curve = (Double, Double, Double)
and Point is defined as
data Point = P Double Double |
             O
             deriving (Eq, Read, Show)
Does anyone know what I've done wrong?

as long as I keep adding the start point to itself I never reach the point at infinity (O).
Could you please post the reference or link where you learned this? I have very limited knowledge of elliptic curves, but I know a little Haskell, so I tried to see what is going on with your code. The very first thing I noticed is the use of division and Double, although you are doing modular arithmetic modulo a prime p. I can't see what your mod'' does, so I changed your code a little, and it works fine for me.
type Curve = ( Integer , Integer , Integer )
data Point = P Integer Integer | O
deriving (Eq, Read, Show)
extendedGcd :: Integer -> Integer -> ( Integer , Integer )
extendedGcd a b
  | b == 0 = ( 1 , 0 )
  | otherwise = ( t , s - q * t ) where
      ( q , r ) = quotRem a b
      ( s , t ) = extendedGcd b r
modInv :: Integer -> Integer -> Integer
modInv a b
  | gcd a b /= 1 = error " gcd is not 1 "
  | otherwise = d where
      d = until ( > 0 ) ( + b ) . fst . extendedGcd a $ b
addP :: Curve->Point->Point->Point
addP _ O O = O
addP _ O p = p
addP _ p O = p
addP ( a, b, p ) ( P x1 y1 ) ( P x2 y2 )
  | x1 == x2 && mod ( y1 + y2 ) p == 0 = O
  | otherwise = P x3 ( mod ( m * ( x1 - x3 ) - y1 ) p ) where
      m | x1 /= x2 = ( mod ( y2 - y1 ) p ) * modInv ( mod ( x2 - x1 ) p ) p
        | otherwise = ( 3 * x1 * x1 + a ) * modInv ( 2*y1 ) p
      x3 = mod ( m * m - x1 - x2 ) p
Let's take the curve y^2 = x^3 + x + 1 modulo 13. Z_13 = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 ]. The quadratic residues ( QR ) of Z_13 are [ 0, 1, 3, 4, 9, 10, 12 ] and the quadratic non-residues ( QNR ) are [ 2, 5, 6, 7, 8, 11 ]. Take x = 0: we have y^2 = 1 ( mod 13 ), and since 1 is a QR the solutions are y = 1 and y = 12, which gives the two points ( 0, 1 ) and ( 0, 12 ). Putting x = 1 gives y^2 = 3 ( mod 13 ), so the points corresponding to x = 1 are ( 1, 4 ) and ( 1, 9 ). Putting x = 2 gives y^2 = 11 ( mod 13 ), and 11 is a QNR, so there is no solution. Whenever a solution with nonzero y exists, it gives two points that are inverses (negatives) of each other modulo the prime p ( 13 in this case ); a solution with y = 0 gives a single point that is its own inverse. The points on the given curve are ( 0, 1 ), ( 0, 12 ), ( 1, 4 ), ( 1, 9 ), ( 4, 2 ), ( 4, 11 ), ( 5, 1 ), ( 5, 12 ), ( 7, 0 ), ( 8, 1 ), ( 8, 12 ), ( 10, 6 ), ( 10, 7 ), ( 11, 2 ), ( 11, 11 ), ( 12, 5 ), ( 12, 8 ), plus the point at infinity O. You can try all the points and see which ones generate the whole group.
*Main>take 20 . iterate ( addP ( 1 , 1 , 13 ) ( P 7 0 ) ) $ ( P 7 0 )
[P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O,P 7 0,O]
*Main> take 20 . iterate ( addP ( 1 , 1 , 13 ) ( P 0 12 ) ) $ ( P 0 12 )
[P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6,P 7 0,P 10 7,P 0 1,O,P 0 12,P 10 6]
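As a cross-check of the point enumeration and the orbits above, here is a minimal Python sketch of the same arithmetic. The names `add_points`, `curve_points` and `order` are illustrative and not part of the original code; `None` stands in for the point at infinity O, and the built-in `pow(x, -1, p)` (Python 3.8+) plays the role of `modInv`.

```python
# Sketch: affine point arithmetic on y^2 = x^3 + a*x + b over GF(p).
# None represents the point at infinity O (a convention of this sketch).

def add_points(curve, P, Q):
    a, _b, p = curve
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O; also covers doubling a point with y = 0
    if x1 != x2:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p)        # slope of the chord
    else:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p)  # slope of the tangent
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def curve_points(curve):
    """Brute-force list of all affine points (fine for tiny p only)."""
    a, b, p = curve
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x ** 3 + a * x + b)) % p == 0]

def order(curve, P):
    """Smallest n >= 1 with n * P = O."""
    n, Q = 1, P
    while Q is not None:
        Q = add_points(curve, Q, P)
        n += 1
    return n

curve = (1, 1, 13)
points = curve_points(curve)
print(len(points))             # 17 affine points (18 group elements with O)
print(order(curve, (0, 12)))   # 6
print(order(curve, (7, 0)))    # 2
print(order(curve, (1, 4)))    # 18
```

With these toy parameters the group has 18 elements, so a point of order 18 such as ( 1, 4 ) generates all of it, while ( 0, 12 ) only reaches a subgroup of 6 elements.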
Coming back to the ElGamal system:
1. Bob chooses an elliptic curve E( a, b ) over GF( p ) or GF( 2^n ).
2. Bob chooses a point on the curve, e1( x1, y1 ).
3. Bob chooses an integer d.
4. Bob calculates e2( x2, y2 ) = d * e1( x1, y1 ).
5. Bob announces E( a, b, p ), e1( x1, y1 ) and e2( x2, y2 ) as his public key and keeps d as his private key.
Encryption.
Alice selects P, a point on the curve, as her plaintext. She chooses a random number r and computes C1 = r * e1 and C2 = P + r * e2.
Decryption.
After receiving C1 and C2, Bob computes C2 - d * C1 => P + r * e2 - d * r * e1
=> P + r * d * e1 - d * r * e1 => P
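The key generation, encryption and decryption steps above can be sketched end to end in Python. This is a toy illustration under stated assumptions: the curve y^2 = x^3 + x + 1 over GF(13) is far too small to be secure, `None` stands for O, and the helper names (`scalar_mult`, `negate`, `encrypt`, `decrypt`) as well as the chosen values of d, r and the message point are made up for the example.

```python
# Toy EC-ElGamal round trip; parameters are illustrative and insecure.

def add_points(curve, P, Q):
    a, _b, p = curve
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if x1 != x2:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p)
    else:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p)
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def negate(curve, P):
    """Return -P, the reflection of P across the x axis."""
    if P is None:
        return None
    x, y = P
    return (x, (-y) % curve[2])

def scalar_mult(curve, k, P):
    """Compute k * P by double-and-add (k >= 0)."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = add_points(curve, result, addend)
        addend = add_points(curve, addend, addend)
        k >>= 1
    return result

def encrypt(curve, e1, e2, message_point, r):
    c1 = scalar_mult(curve, r, e1)
    c2 = add_points(curve, message_point, scalar_mult(curve, r, e2))
    return c1, c2

def decrypt(curve, d, c1, c2):
    # C2 - d * C1 = P + r*d*e1 - d*r*e1 = P
    return add_points(curve, c2, negate(curve, scalar_mult(curve, d, c1)))

curve = (1, 1, 13)
e1 = (1, 4)                      # base point (assumed choice)
d = 4                            # Bob's private key (assumed choice)
e2 = scalar_mult(curve, d, e1)   # Bob's public point

message_point = (4, 2)
c1, c2 = encrypt(curve, e1, e2, message_point, r=5)
print(decrypt(curve, d, c1, c2) == message_point)  # True
```

In a real system r must be fresh and random for every message, and mapping an arbitrary message to a curve point needs its own encoding step, which this sketch leaves out.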
Edit: You are correct! If you take a generator element and keep adding it to itself, you generate the whole group. See the lecture by Christof Paar [1].
[1]https://www.youtube.com/watch?v=3S9eZRHjP8g&list=PLn_QCKxjl9zmx3VojkDqljZcLCIslz7kB&index=37

Related

Simpson's rule 3/8 for n intervals in Python

I'm trying to write a program that approximates the integral of e^(x^2) between 0 and 1 using this integration formula:
Formula
This is my code so far, but it keeps giving the wrong answer (other methods give 1.46; this one gives 1.006).
I think there may be a problem with the two for loops that compute the Riemann sum, or with the way I've written the formula. I also tried rewriting the formula in other ways, but had no success.
Any kind of help is appreciated.
import math
import numpy as np
def f(x):
    y = np.exp(x**2)
    return y
a = float(input("What is the lower limit? \n"))
b = float(input("What is the upper limit? \n"))
n = int(input("What is the number of intervals? "))
x = np.zeros([n+1])
y = np.zeros([n])
z = np.zeros([n])
h = (b-a)/n
print (h)
x[0] = a
x[n] = b
suma1 = 0
suma2 = 0
for i in np.arange(1,n):
    x[i] = x[i-1] + h
    suma1 = suma1 + f(x[i])
    alfa = (x[i]-x[i-1])/3
for i in np.arange(0,n):
    y[i] = (x[i-1]+ alfa)
    suma2 = suma2 + f(y[i])
    z[i] = y[i] + alfa
int3 = ((b-a)/(8*n)) * (f(x[0])+f(x[n]) + (3*(suma2+f(z[i]))) + (2*(suma1)))
print (int3)
I'm not a math major but I remember helping a friend with this rule for something about waterplane area for ships.
Here's an implementation based on Wikipedia's description of the Simpson's 3/8 rule:
# The input parameters
a, b, n = 0, 1, 10
# Divide the interval into 3*n sub-intervals
# and hence 3*n+1 endpoints
x = np.linspace(a,b,3*n+1)
y = f(x)
# The weight for each points
w = [1,3,3,1]
result = 0
for i in range(0, 3*n, 3):
    # Calculate the area, 4 points at a time
    result += (x[i+3] - x[i]) / 8 * (y[i:i+4] * w).sum()
# result = 1.4626525814387632
You can do it using numpy.vectorize (based on the Wikipedia description of the rule):
a, b, n = 0, 1, 3 * 10**5  # n must be a multiple of 3
h = (b-a) / n
x = np.linspace(0,n,n+1)*h + a
fv = np.vectorize(f)
idx = np.arange(len(x))
(
    3*h/8 * (
        f(x[0]) +
        3 * fv(x[np.mod(idx, 3) != 0]).sum() + # interior points whose index is not a multiple of 3
        2 * fv(x[3:-1:3]).sum() + # interior points whose index is a multiple of 3
        f(x[-1])
    )
)
# Output: approximately 1.46265
If you use numpy's built-in functions (which I think is always possible), performance will improve considerably:
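a, b, n = 0, 1, 3 * 10**5  # n must be a multiple of 3
h = (b-a) / n
x = np.exp(np.square(np.linspace(0,n,n+1)*h + a))
idx = np.arange(len(x))
(
    3*h/8 * (
        x[0] +
        3 * x[np.mod(idx, 3) != 0].sum() + # interior points whose index is not a multiple of 3
        2 * x[3:-1:3].sum() + # interior points whose index is a multiple of 3
        x[-1]
    )
)
# Output: approximately 1.46265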
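For a dependency-free cross-check of the NumPy versions above, here is a plain Python sketch of the composite 3/8 rule. The function name `simpson38` is illustrative. The rule assumes that the number of subintervals n is a multiple of 3: the endpoints get weight 1, interior points whose index is a multiple of 3 get weight 2, and all other interior points get weight 3.

```python
import math

def simpson38(f, a, b, n):
    """Composite Simpson 3/8 rule on [a, b] with n subintervals."""
    if n <= 0 or n % 3 != 0:
        raise ValueError("n must be a positive multiple of 3")
    h = (b - a) / n
    total = f(a) + f(b)                   # endpoints, weight 1
    for i in range(1, n):
        weight = 2 if i % 3 == 0 else 3   # panel boundaries get 2, the rest 3
        total += weight * f(a + i * h)
    return 3 * h / 8 * total

approx = simpson38(lambda x: math.exp(x * x), 0.0, 1.0, 30)
print(approx)  # close to 1.46265
```

Because the 3/8 rule is exact for cubic polynomials, integrating x^3 is a quick sanity check for the weight pattern.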

Fitting a function f(x,y,z) with a quadratic polynomial

I'm trying to fit a function f(x,y,z) with the following quadratic polynomial:
3d polynomial
Some distorted spherical surface in three dimensions. The problem is related to the calculation of effective masses in solid state physics.
Here is a picture of the data to show that it indeed falls off parabolically in all directions, even though the curvature in the z-direction is rather low:
3d parabolas
I'm interested in the coefficients, which correspond to effective masses. I've got an array of xyz coordinates, which is regular and centered on the maximum:
[[ 0. 0. 0. ]
[ 0. 0. 0.01282017]
[ 0. 0. 0.02564034]
...
[-0.05026321 -0.05026321 -0.03846052]
[-0.05026321 -0.05026321 -0.02564034]
[-0.05026321 -0.05026321 -0.01282017]]
And a corresponding 1D array of scalar values, one for each point. The number of data points around this maximum can range from 100 to 1000.
This is the code I'm currently trying to use for fitting:
def func(data, mxx, mxy, mxz, myy, myz, mzz):
    x = data[:, 0]
    y = data[:, 1]
    z = data[:, 2]
    return (
        (1 / (2 * mxx)) * (x ** 2)
        + (1 / (1 * mxy)) * (x * y)
        + (1 / (1 * mxz)) * (x * z)
        + (1 / (2 * myy)) * (y ** 2)
        + (1 / (1 * myz)) * (y * z)
        + (1 / (2 * mzz)) * (z ** 2)
    ) + f(0, 0, 0)
energy = data[:, 3]
guess = (mxx, mxy, mxz, myy, myz, mzz)
params, pcov = scipy.optimize.curve_fit(
    func, data, energy, p0=guess, method="trf"
)
Where f(0,0,0) is the value of the function at (0, 0, 0), which I retrieve with the scipy.interpolate.griddata function.
For this problem, the masses should be negative and have values between -0.2 and -2, roughly speaking. I'm creating guess values through a finite difference differentiation.
However, I don't get any sensible results from scipy.optimize.curve_fit: typically the coefficients end up as huge numbers (like 1e9). I'm completely lost at this point.
What am I doing wrong :( ?
One of the problems is that your fit parameters are the masses m, which enter the model as 1/m. While this is correct from a physics point of view, it is bad from an algorithmic point of view: if the fitting algorithm needs to change the sign of m, it has to pass through values near zero, where the coefficients diverge. Consequently, it is better to fit mI = 1/m and do the corresponding error propagation afterwards. Here I use leastsq, which requires some additional calculations for the covariance matrix (as it returns the reduced form). I do the fit with g() and the inverse masses, but you can immediately reproduce your problems by introducing f() and fitting the ms directly.
A second point is that the data has an offset, i.e. at x = y = z = 0 the data value is v = -0.0195. This needs to be introduced into the model.
Finally, I'd say that you already have non-parabolic behaviour in your data.
Nevertheless, here is how it looks:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(linewidth=300)
from scipy.optimize import leastsq
from scipy.optimize import curve_fit
data = np.loadtxt( "silicon.csv", delimiter=',' )
def f( x, y, z, mxx, mxy, mxz, myy, myz, mzz, offI ):
    out = 1./(2 * mxx) * x * x
    out += 1./( mxy ) * x * y
    out += 1./( mxz ) * x * z
    out += 1./( 2 * myy ) * y * y
    out += 1./( myz ) * y * z
    out += 1./( 2 * mzz ) * z * z
    out += 1./offI
    return out
def g( x, y, z, mxxI, mxyI, mxzI, myyI, myzI, mzzI, off ):
    out = mxxI / 2 * x * x
    out += mxyI * x * y
    out += mxzI * x * z
    out += myyI / 2 * y * y
    out += myzI * y * z
    out += mzzI / 2 * z * z
    out += off
    return out
def residuals( params, indata ):
    out = list()
    for x, y, z, v in indata:
        out.append( v - g( x,y, z, *params ) )
    return out
sol, cov, info, msg, ier = leastsq( residuals, 7*[0], args=( data, ), full_output=True)
s_sq = sum( [x**2 for x in residuals( sol, data) ] )/ (len( data ) - len( sol ) )
print( "solution" )
print( sol )
masses = [1/x for x in sol]
print( "masses:" )
print( masses )
print( "covariance matrix:" )
covMX = cov * s_sq
print( covMX )
print( "sum of residuals" )
print( sum( residuals( sol, data) ) )
### plotting the cuts
fig = plt.figure('cuts')
ax = dict()
for i in range( 1, 10 ):
    ax[i] = fig.add_subplot( 3, 3, i )
dl = np.linspace( -.2, .2, 25)
#### xx
xdata = [ [ x, v ] for x,y,z,v in data if ( abs(y)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( x, 0, 0, *masses ) for x in dl ), float )
ax[1].plot( *zip(*sorted( xdata ) ), ls='', marker='o')
ax[1].plot( dl, vl )
#### xy
xydata = [ [ x, v ] for x, y, z, v in data if ( abs( x - y )<1e-2 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( xy, xy, 0, *masses ) for xy in dl ), float )
ax[2].plot( *zip(*sorted( xydata ) ), ls='', marker='o')
ax[2].plot( dl, vl )
#### xz
xzdata = [ [ x, v ] for x, y, z, v in data if ( abs( x - z )<1e-2 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( xz, 0, xz, *masses ) for xz in dl ), float )
ax[3].plot( *zip(*sorted( xzdata ) ), ls='', marker='o')
ax[3].plot( dl, vl )
#### yy
ydata = [ [ y, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, y, 0, *masses ) for y in dl ), float )
ax[5].plot( *zip(*sorted( ydata ) ), ls='', marker='o' )
ax[5].plot( dl, vl )
#### yz
yzdata = [ [ y, v ] for x, y, z, v in data if ( abs( y - z )<1e-2 and abs(x) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, yz, yz, *masses ) for yz in dl ), float )
ax[6].plot( *zip(*sorted( yzdata ) ), ls='', marker='o')
ax[6].plot( dl, vl )
#### zz
zdata = [ [ z, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, 0, z, *masses ) for z in dl ), float )
ax[9].plot( *zip(*sorted( zdata ) ), ls='', marker='o' )
ax[9].plot( dl, vl )
#### some diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x - z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, d, *masses ) for d in dl ), float )
ax[7].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[7].plot( dl, vl )
#### some other diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x + z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, -d, *masses ) for d in dl ), float )
ax[8].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[8].plot( dl, vl )
plt.show()
This gives the following output:
solution
[-1.46528595 0.25090717 0.25090717 -1.46528595 0.25090717 -1.46528595 -0.01993436]
masses:
[-0.6824606499739905, 3.985537743156507, 3.9855376943660676, -0.6824606473928339, 3.9855377322848344, -0.6824606467055248, -50.16463861555409]
covariance matrix:
[
[ 4.76417852e-03 -1.46907683e-12 -8.57639600e-12 -2.21281938e-12 -2.38444957e-12 8.42981521e-12 -2.70034183e-05]
[-1.46907683e-12 9.17104397e-04 -7.10573582e-13 1.32125214e-11 7.44553140e-12 1.29909935e-11 -1.11259046e-13]
[-8.57639600e-12 -7.10573582e-13 9.17104389e-04 -8.60004172e-12 -6.14797647e-12 8.27070243e-12 3.11127064e-14]
[-2.21281914e-12 1.32125214e-11 -8.60004172e-12 4.76417860e-03 -4.20477032e-12 9.20893224e-12 -2.70034186e-05]
[-2.38444957e-12 7.44553140e-12 -6.14797647e-12 -4.20477032e-12 9.17104395e-04 1.50963408e-11 -7.28889534e-14]
[ 8.42981530e-12 1.29909935e-11 8.27070243e-12 9.20893175e-12 1.50963408e-11 4.76417849e-03 -2.70034182e-05]
[-2.70034183e-05 -1.11259046e-13 3.11127064e-14 -2.70034186e-05 -7.28889534e-14 -2.70034182e-05 5.77019926e-07]
]
sum of residuals
4.352727352163743e-09
...and here are some 1D cuts, which show significant deviation from parabolic behaviour away from the main axes.
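Since g() is linear in the inverse masses and the offset, the same fit can also be written as an ordinary linear least-squares problem that needs no starting guess. The following is a minimal sketch in plain Python using the normal equations; the helper names (`design_row`, `solve_linear`, `fit_quadratic`) are made up for this example, and the data is synthetic, not the silicon data from the question. With NumPy available, numpy.linalg.lstsq on the same design matrix would be the more usual and numerically more robust choice.

```python
# Sketch: fit v = mxxI/2*x^2 + mxyI*x*y + mxzI*x*z + myyI/2*y^2
#               + myzI*y*z + mzzI/2*z^2 + off   by linear least squares.

def design_row(x, y, z):
    """One row of the design matrix, in the parameter order of g()."""
    return [x * x / 2, x * y, x * z, y * y / 2, y * z, z * z / 2, 1.0]

def solve_linear(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * c[k] for k in range(i + 1, n))
        c[i] = (M[i][n] - s) / M[i][i]
    return c

def fit_quadratic(points, values):
    """Return [mxxI, mxyI, mxzI, myyI, myzI, mzzI, off] via normal equations."""
    rows = [design_row(*p) for p in points]
    m = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Atv = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(m)]
    return solve_linear(AtA, Atv)

# Synthetic data generated from known coefficients (assumed values).
true_coeffs = [-1.5, 0.25, 0.25, -1.5, 0.25, -1.5, -0.02]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
points = [(x, y, z) for x in grid for y in grid for z in grid]
values = [sum(c * t for c, t in zip(true_coeffs, design_row(*p))) for p in points]

coeffs = fit_quadratic(points, values)
masses = [1 / c for c in coeffs[:6]]   # invert only after the fit
print(coeffs)
print(masses)
```

The masses are obtained by inverting the fitted coefficients at the very end, which keeps the fitting step itself free of divisions by parameters that may pass through zero.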

Solving a line intercept equation

What are A and B such that the line Ay = Bx + 1 passes through the points (1, 3) and (5, 13) in the Cartesian plane?
I have been trying to solve it using the slope-intercept equation, to no avail. This is taken from Dale Hoffman's Contemporary Calculus.
First, I would reorder to get canonical form,
y = (B/A) * x + (1/A) = m * x + b
Now we find slope (m):
m = dy / dx = (13 - 3) / (5 - 1) = 2.5
sub in to find b:
3 = 2.5 * 1 + b
b = 0.5
Now sub back to find the values you want,
b = 0.5 = 1 / A
A = 2
m = 2.5 = B / 2
B = 5
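The same result also follows from substituting both points into Ay = Bx + 1, which gives a 2x2 linear system in A and B. A small sketch using exact fractions; the function name `solve_line` is illustrative:

```python
from fractions import Fraction

def solve_line(p1, p2):
    """Find A, B such that A*y = B*x + 1 passes through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # Substituting the points gives:
    #   y1*A - x1*B = 1
    #   y2*A - x2*B = 1
    # Solved here with Cramer's rule.
    det = Fraction(x1 * y2 - x2 * y1)
    if det == 0:
        raise ValueError("no unique solution for these two points")
    A = Fraction(x1 - x2) / det
    B = Fraction(y1 - y2) / det
    return A, B

A, B = solve_line((1, 3), (5, 13))
print(A, B)  # 2 5
```

Checking: 2 * 3 = 5 * 1 + 1 and 2 * 13 = 5 * 5 + 1, so both points lie on the line.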

UVa 12921 ( Triple Shots Help )

Please take a look at this problem.
Link : https://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=862&page=show_problem&problem=4800
I have been trying to solve this geometry problem for a few weeks, but I fail every time. My approach is as follows.
Since the three points are all at the same distance from it, the point we are looking for is the center ( H, K ) of a circle that passes through all three points. Let the three points be ( x1, y1 ), ( x2, y2 ), ( x3, y3 ). Then we can write:
(x1 - H)^2 + (y1 - K)^2 = (x2 - H)^2 + (y2 - K)^2
=> (x1^2 + y1^2 - x2^2 - y2^2) - 2H(x1 - x2) - 2K(y1 - y2) = 0
=> A - 2HX1 - 2KY1 = 0 ------ ( i ), where A = x1^2 + y1^2 - x2^2 - y2^2, X1 = x1 - x2, Y1 = y1 - y2
(x2 - H)^2 + (y2 - K)^2 = (x3 - H)^2 + (y3 - K)^2
=> (x2^2 + y2^2 - x3^2 - y3^2) - 2H(x2 - x3) - 2K(y2 - y3) = 0
=> B - 2HX2 - 2KY2 = 0 ------- ( ii ), where B = x2^2 + y2^2 - x3^2 - y3^2, X2 = x2 - x3, Y2 = y2 - y3
Then we can solve these two equations as follows:
So,
A - 2HX1 - ( (B - 2HX2) / Y2 ) * Y1 = 0 [ substituting the value of 2K from eqn ( ii ) ]
=> H = ( AY2 - BY1 ) / ( 2 * ( X1Y2 - X2Y1 ) ) ----- (iii)
And,
=> K = ( B - 2HX2 ) / ( 2 * Y2 ) ----- ( iv )
Now, if the three points are collinear, I print "Impossible". Otherwise I do the calculation above. If ( H, K ) is at the same distance from all three points, I print ( H, K ); otherwise I print "Impossible".
Is my approach correct? (My code answers "Impossible" for every test.) If not, why? Can you give me some idea of how to solve it? Thanks in advance.
Your method is a little complex.
A general form of a circle is
x^2 + y^2 + Dx + Ey + F = 0.
In this form the unknowns D, E and F appear only linearly, so you avoid squaring the variables.
Given 3 points, you can solve for D, E and F by substituting each point for x and y. (The coefficient of F is always 1, so subtracting one equation from the others eliminates F, and you can then solve for D and E quickly using Cramer's rule.)
Once you have D and E, the coordinates of the center are (-D/2, -E/2). (So you can just ignore F in the last step.) Note that when the determinant in Cramer's rule is zero, D or E would be infinite: the points are collinear and it's impossible to find a center for that case.
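The elimination described above can be sketched in a few lines of Python. This only covers the step of finding the center of the circle through three points, not the input and output handling of the UVa problem; the function name `circle_center` and the tolerance `eps` are assumptions of this sketch.

```python
import math

def circle_center(p1, p2, p3, eps=1e-12):
    """Center of the circle x^2 + y^2 + D*x + E*y + F = 0 through three points.

    Returns None when the points are (numerically) collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the first equation from the other two eliminates F:
    #   D*(x2 - x1) + E*(y2 - y1) = -(x2^2 + y2^2 - x1^2 - y1^2)
    #   D*(x3 - x1) + E*(y3 - y1) = -(x3^2 + y3^2 - x1^2 - y1^2)
    a11, a12 = x2 - x1, y2 - y1
    a21, a22 = x3 - x1, y3 - y1
    r1 = -(x2 * x2 + y2 * y2 - x1 * x1 - y1 * y1)
    r2 = -(x3 * x3 + y3 * y3 - x1 * x1 - y1 * y1)
    det = a11 * a22 - a21 * a12
    if abs(det) < eps:
        return None  # collinear points: no circle, no center
    D = (r1 * a22 - r2 * a12) / det   # Cramer's rule
    E = (a11 * r2 - a21 * r1) / det
    return (-D / 2, -E / 2)

center = circle_center((0, 0), (2, 0), (0, 2))
print(center)  # (1.0, 1.0)
print(circle_center((0, 0), (1, 1), (2, 2)))  # None
```

A fixed absolute tolerance is a simplification; for inputs with large coordinates a scale-aware collinearity test would be more appropriate.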

How to select Y values at X position in Groovy?

this is sort of a mathy question...
I had a question prior to this about normalizing monthly data here :
How to produce X values of a stretched graph?
I got a good answer and it works well, the only issue is that now I need to check X values of one month with 31 days against X values of a month with 28.
So my question would be: If I have two sets of parameters like so:
x | y x2 | y2
1 | 10 1.0 | 10
2 | 9 1.81 | 9.2
3 | 8 2.63 | 8.6
4 | 7 3.45 | 7.8
5 | 6 4.27 | 7
6 | 5 5.09 | 6.2
7 | 4 5.91 | 5.4
8 | 3 6.73 | 4.2
9 | 2 7.55 | 3.4
10 | 1 8.36 | 2.6
9.18 | 1.8
10.0 | 1.0
As you can see, the general trend is the same for these two data sets.
However, if I run these values through a cross-correlation function (the general goal), I will get something back that does not reflect this, since the data sets are of two different sizes.
The real world example of this would be, say, if you are tracking how many miles you run per day:
In February (with 28 days), during the first week, you run one mile each day. During the second week, you run two miles each day, etc.
In March (with 31 days), you do the same thing, but run for one mile for eight days, two miles for eight days, three miles for eight days, and four miles for seven days.
The correlation coefficient according to the following function should be almost exactly 1:
class CrossCorrelator {
def variance = { x->
def v = 0
x.each{ v += it**2}
v/(x.size()) - (mean(x)**2)
}
def covariance = {x, y->
def z = 0
[x, y].transpose().each{ z += it[0] * it[1] }
(z / (x.size())) - (mean(x) * mean(y))
}
def coefficient = {x, y->
covariance(x,y) / (Math.sqrt(variance(x) * variance(y)))
}
}
def i = new CrossCorrelator()
i.coefficient(y values, y2 values)
Just looking at the data sets, it seems like the graphs would be exactly the same if I were to grab the values at 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10, and the function would produce a more accurate result.
However, it's skewed since the lengths are not the same.
Is there some way to locate what the values at the integers in the twelve-value data set would be? I haven't found a simple way to do it, but this would be incredibly helpful.
Thanks in advance,
5
Edit: As per request, here is the code that generates the X values of the graphs:
def x = (1..12)
def y = 10
change = {l, size ->
v = [1]
l.each{
v << ((((size-1)/(x.size() - 1)) * it) + 1)
}
v -= v.last()
return v
}
change(x, y)
Edit: Not working code as per another request:
def normalize( xylist, days ) {
xylist.collect { x, y -> [ x * ( days / xylist.size() ), y ] }
}
def change = {l, size ->
def v = [1]
l.each{
v << ((((size-1)/(l.size() - 1)) * it) + 1)
}
v -= v.last()
return v
}
def resample( list, min, max ) {
// We want a graph with integer points from min to max on the x axis
(min..max).collect { i ->
// find the values above and below this point
bounds = list.inject( [ a:null, b:null ] ) { r, p ->
// if the value is less than i, set it in r.a
if( p[ 0 ] < i )
r.a = p
// if it's bigger (and we don't already have a bigger point)
// then set it into r.b
if( !r.b && p[ 0 ] >= i )
r.b = p
r
}
// so now, bounds.a is the point below our required point, and bounds.b
// Deal with the first case (where a is null, because we are at the start)
if( !bounds.a )
[ i, list[ 0 ][ 1 ] ]
else {
// so work out the distance from bounds.a to bounds.b
dist = ( bounds.b[0] - bounds.a[0] )
// And how far the point i is along this line
r = ( i - bounds.a[0] ) / dist
// and recalculate the y figure for this point
y = ( ( bounds.b[1] - bounds.a[1] ) * r ) + bounds.a[1]
[ i, y ]
}
}
}
def feb = [9, 3, 7, 23, 15, 16, 17, 18, 19, 13, 14, 8, 13, 12, 15, 6, 7, 13, 19, 12, 7, 3, 4, 15, 6, 17, 8, 19]
def march = [8, 12, 4, 17, 11, 15, 12, 8, 9, 13, 12, 7, 3, 4, 8, 2, 17, 19, 21, 12, 12, 13, 14, 15, 16, 7, 8, 19, 21, 14, 16]
//X and Y Values for February
z = [(1..28), change(feb, 28)].transpose()
//X and Y Values for March stretched to 28 entries
o = [(1..31), change(march, 28)].transpose()
o1 = normalize(o, 28)
resample(o1, 1, 28)
If I switch "march" in the o variable declaration to (1..31), the script runs successfully. When I try to use "march", I get "java.lang.NullPointerException: Cannot invoke method getAt() on null object".
Also: I try not to directly copy code just because it's bad practice, so one of the functions I changed basically does the same thing, it's just my version. I'll get around to refactoring the rest of it eventually, too. But that's why it's slightly different.
Ok...here we go...this may not be the cleanest bit of code ever...
Let's first generate two distributions, both from 1 to 10 (in the y axis)
def generate( range, max ) {
range.collect { i ->
[ i, max * ( i / ( range.to - range.from + 1 ) ) ]
}
}
// A distribution 10 elements long from 1 to 10
def e1 = generate( 1..10, 10 )
// A distribution 14 elements long from 1 to 10
def e2 = generate( 1..14, 10 )
So now, e1 and e2 are:
[1.00,1.00], [2.00,2.00], [3.00,3.00], [4.00,4.00], [5.00,5.00], [6.00,6.00], [7.00,7.00], [8.00,8.00], [9.00,9.00], [10.00,10.00]
[1.00,0.71], [2.00,1.43], [3.00,2.14], [4.00,2.86], [5.00,3.57], [6.00,4.29], [7.00,5.00], [8.00,5.71], [9.00,6.43], [10.00,7.14], [11.00,7.86], [12.00,8.57], [13.00,9.29], [14.00,10.00]
respectively (to 2dp). Now, using the code from the previous question, we can normalize these to the same x range:
def normalize( xylist, days ) {
xylist.collect { x, y -> [ x * ( days / xylist.size() ), y ] }
}
n1 = normalize( e1, 10 )
n2 = normalize( e2, 10 )
This means n1 and n2 are:
[1.00,1.00], [2.00,2.00], [3.00,3.00], [4.00,4.00], [5.00,5.00], [6.00,6.00], [7.00,7.00], [8.00,8.00], [9.00,9.00], [10.00,10.00]
[0.71,0.71], [1.43,1.43], [2.14,2.14], [2.86,2.86], [3.57,3.57], [4.29,4.29], [5.00,5.00], [5.71,5.71], [6.43,6.43], [7.14,7.14], [7.86,7.86], [8.57,8.57], [9.29,9.29], [10.00,10.00]
But, as you correctly state they have different numbers of sample points, so cannot be compared easily.
But we can write a method to step through each point we want in our graph, find the two closest points, and interpolate a y value from the values of these two points like so:
def resample( list, min, max ) {
// We want a graph with integer points from min to max on the x axis
(min..max).collect { i ->
// find the values above and below this point
bounds = list.inject( [ a:null, b:null ] ) { r, p ->
// if the value is less than i, set it in r.a
if( p[ 0 ] < i )
r.a = p
// if it's bigger (and we don't already have a bigger point)
// then set it into r.b
if( !r.b && p[ 0 ] >= i )
r.b = p
r
}
// so now, bounds.a is the point below our required point, and bounds.b
if( !bounds.a ) // no lower bound...take the first element
[ i, list[ 0 ][ 1 ] ]
else if( !bounds.b ) // no upper bound... take the last element
[ i, list[ -1 ][ 1 ] ]
else {
// so work out the distance from bounds.a to bounds.b
dist = ( bounds.b[0] - bounds.a[0] )
// And how far the point i is along this line
r = ( i - bounds.a[0] ) / dist
// and recalculate the y figure for this point
y = ( ( bounds.b[1] - bounds.a[1] ) * r ) + bounds.a[1]
[ i, y ]
}
}
}
final1 = resample( n1, 1, 10 )
final2 = resample( n2, 1, 10 )
Now, the values final1 and final2 are:
[1.00,1.00], [2.00,2.00], [3.00,3.00], [4.00,4.00], [5.00,5.00], [6.00,6.00], [7.00,7.00], [8.00,8.00], [9.00,9.00], [10.00,10.00]
[1.00,1.00], [2.00,2.00], [3.00,3.00], [4.00,4.00], [5.00,5.00], [6.00,6.00], [7.00,7.00], [8.00,8.00], [9.00,9.00], [10.00,10.00]
(obviously, there is some rounding here, so 2d.p. is hiding the fact that they are not exactly the same)
Phew... Must be home-time after that ;-)
EDIT
As pointed out in the edit to the question, there was a bug in my resample method that caused it to fail in certain conditions...
I believe this has now been fixed in the code above, and from the given example:
def march = [8, 12, 4, 17, 11, 15, 12, 8, 9, 13, 12, 7, 3, 4, 8, 2, 17, 19, 21, 12, 12, 13, 14, 15, 16, 7, 8, 19, 21, 14, 16]
o = [ (1..31), march ].transpose()
// X values squeezed to be between 1 and 28 (instead of 1 to 31)
o1 = normalize(o, 28)
// Then, resample this graph so there are only 28 points
v = resample(o1, 1, 28)
If you plot the original 31 points (in o) and the new graph of 28 points (in v), you get:
Which doesn't look too bad.
I have no idea what the change method was supposed to do, so I have omitted it from this code
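For comparison, the normalize-then-resample idea can be sketched in Python as well. This is an illustrative port, not the Groovy code itself: the function names mirror the answer, the input list is assumed to be sorted by x, and the sample data reuses the 14-point distribution and the march values from above.

```python
def normalize(xy, days):
    """Scale the x values so that the last sample lands on x = days."""
    n = len(xy)
    return [(x * days / n, y) for x, y in xy]

def resample(xy, lo, hi):
    """Linearly interpolate y at every integer x in lo..hi (xy sorted by x)."""
    out = []
    for i in range(lo, hi + 1):
        below, above = None, None
        for p in xy:
            if p[0] < i:
                below = p          # last point strictly left of i
            elif above is None:
                above = p          # first point at or right of i
        if below is None:          # no lower bound: take the first element
            out.append((i, xy[0][1]))
        elif above is None:        # no upper bound: take the last element
            out.append((i, xy[-1][1]))
        else:
            t = (i - below[0]) / (above[0] - below[0])
            out.append((i, below[1] + t * (above[1] - below[1])))
    return out

# A 14-point distribution rising to 10, squeezed onto x = 1..10
e2 = [(i, 10 * i / 14) for i in range(1, 15)]
final2 = resample(normalize(e2, 10), 1, 10)
print(final2)

# The 31 march values, squeezed and resampled to 28 points
march = [8, 12, 4, 17, 11, 15, 12, 8, 9, 13, 12, 7, 3, 4, 8, 2, 17, 19, 21,
         12, 12, 13, 14, 15, 16, 7, 8, 19, 21, 14, 16]
o = list(zip(range(1, 32), march))
v = resample(normalize(o, 28), 1, 28)
print(len(v))  # 28
```

Once both months are resampled to the same integer x positions, the two y lists have equal length and can be passed to a correlation function directly.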
