Arithmetic Mean with exponent of small numbers - rounding

Due to rounding error, I cannot compute the following mean of three numbers:
a=-1.11e4
b=-1.12e4
c=-1.13e4
Mean=1/3 *[exp(a)+exp(b)+exp(c)]
How can I get the result as a log value?

You're trying to find log((exp(a) + exp(b) + exp(c)) / 3), but a, b, and c are so low that the result of exp underflows to 0. You can fix this by adjusting the values so exp doesn't underflow.
Let d = max(a, b, c). Then we have the following equality:
M = log((exp(a) + exp(b) + exp(c)) / 3)
= log(exp(d) * (exp(a-d) + exp(b-d) + exp(c-d)) / 3)
= log(exp(d)) + log((exp(a-d) + exp(b-d) + exp(c-d)) / 3)
= d + log((exp(a-d) + exp(b-d) + exp(c-d)) / 3)
So we can calculate the result as d + log((exp(a-d) + exp(b-d) + exp(c-d)) / 3). Since d is equal to one of a, b, or c, one of the exp arguments is 0, and the rest are at most 0. Thus, one of the exp outputs is 1, and the rest are at most 1. We don't have to worry about overflow or underflow; while an underflow might still occur in one or more exp calls, it won't be a problem any more, since the log argument won't be 0.
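The shift by d can be sketched in Python as follows. This is a minimal illustration, assuming plain floats and the standard math module; the function name log_mean_exp is made up for this example.

```python
import math

def log_mean_exp(values):
    """Return log(mean(exp(v) for v in values)) without underflowing to log(0)."""
    d = max(values)
    # Every argument of exp is <= 0, and exactly 0 for the maximum,
    # so the sum is at least 1 and the log argument is never 0.
    total = sum(math.exp(v - d) for v in values)
    return d + math.log(total / len(values))

a, b, c = -1.11e4, -1.12e4, -1.13e4
print(log_mean_exp([a, b, c]))
```

Here exp(b-d) and exp(c-d) are tiny compared with 1, so the result is essentially a + log(1/3).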


Splitting an int64 into two int32, performing math, then re-joining

I am working within the constraints of hardware that has a 64-bit integer limit and does not support floating point. I am dealing with very large integers that I need to multiply and divide, and when multiplying I overflow the 64 bits. I am prototyping a solution in Python. This is what I have in my function:
upper = x >> 32 #x is cast as int64 before being passed to this function
lower = x & 0x00000000FFFFFFFF
temp_upper = upper * y // z #Dividing first is not an option, as this is not the actual equation I am working with. This is just to make sure in my testing I overflow unless I do the splitting.
temp_lower = lower * y // z
return temp_upper << 32 | lower
This works, somewhat, but I end up losing a lot of precision (my result is sometimes off by a few million). From looking at it, this appears to happen because of the division. If the divisor is large enough, the division shifts the upper part to the right. Then, when I shift it back into place, I have a gap of zeroes.
Unfortunately this topic is very hard to google, since anything with upper/lower brings up results about rounding up/down, anything about splitting ints returns results about splitting them into a char array, and anything about int arithmetic brings up basic algebra with integer math. Maybe I am just not good at googling. Can you give me some pointers on how to do this?
Splitting like this is just one thing I am trying; it doesn't have to be the solution. All I need is to be able to go over the 64-bit integer limit temporarily. The final result will be under 64 bits (after the division). I remember learning in college about splitting a number up like this, doing the math, and re-combining. But, as I said, I am having trouble finding anything online on how to do the actual math.
Lastly, my numbers are sometimes small, so I can't chop off the right bits. I need the results to be equivalent to what I would get with something like int128.
I suppose a different way to look at this problem is as follows. Since I have no problem with splitting the int64, we can forget about that part. We can pretend that two int64s are being fed to me, one upper and one lower. I can't combine them, because they won't fit into a single int64, so I need to divide them by z first. The combining step is easy. How do I do the division?
Thanks.
As I understand it, you want to compute (x*y)//z.
Your numbers x, y, z all fit in 64 bits, except that you need 128 bits for the intermediate product x*y.
The problem you have is indeed related to the division. With x = (h << 32) + l, you have
h * y = qh * z + rh
l * y = ql * z + rl
(h*y << 32) + l*y = ((qh << 32) + ql) * z + ((rh << 32) + rl)
but nothing guarantees that (rh << 32) + rl < z, and in your case the high bits of l*y overlap the low bits of h*y, so you get the wrong quotient, potentially off by many units.
What you should do as the second operation is rather:
(rh << 32) + l*y = ql' * z + rl'
Then the total quotient is (qh << 32) + ql'.
But of course, you must take care to avoid overflow when evaluating the left-hand side...
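The remainder carry described above can be checked with a short Python prototype. This is only a sketch: Python integers are unbounded, so it demonstrates the arithmetic identity and not the overflow behaviour of the target hardware, where (rh << 32) + l*y may itself exceed 64 bits. The function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF

def muldiv_naive(x, y, z):
    """The approach from the question: divide both halves independently."""
    h, l = x >> 32, x & MASK32
    return ((h * y // z) << 32) + (l * y // z)

def muldiv_carry(x, y, z):
    """Carry the remainder of the high half into the low half before dividing."""
    h, l = x >> 32, x & MASK32
    qh, rh = divmod(h * y, z)
    ql, _ = divmod((rh << 32) + l * y, z)
    return (qh << 32) + ql

x, y, z = 1234567890123456789, 987654321, 1000003
print(muldiv_carry(x, y, z) == (x * y) // z)
print((x * y) // z - muldiv_naive(x, y, z))  # size of the error of the naive split
```

The naive version can only undershoot, because it throws away rh << 32 before the second division.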
Since you are splitting only one of the operands of x*y, I'll assume that the intermediate result always fits in 96 bits.
If that is correct, then your problem is to divide x*y, which has three 32-bit limbs, by z, which has two 32-bit limbs.
This is similar to the Burnikel-Ziegler divide-and-conquer algorithm for division.
The algorithm can be decomposed like this:
obtain the 3 limbs a2,a1,a0 of multiplication x*y by using karatsuba for example
split z into 2 limbs z1,z0
perform the div32( (a2,a1,a0) , (z1,z0) )
Here is some pseudocode. It only deals with positive operands and comes with no guarantee of correctness, but it should give you an idea of the implementation:
p = 1<<32;
function (a1,a0) = split(a)
a1 = a >> 32;
a0 = a - (a1 * p);
function (a2,a1,a0) = mul22(x,y)
(x1,x0) = split(x) ;
(y1,y0) = split(y) ;
(h1,h0) = split(x1 * y1);
assert(h1 == 0); -- assume that results fits on 96 bits
(l1,l0) = split(x0 * y0);
(m1,m0) = split((x1 - x0) * (y0 - y1)); -- karatsuba trick
a0 = l0;
(carry,a1) = split( l1 + l0 + h0 + m0 );
a2 = l1 + m1 + h0 + carry;
function (q,r) = quorem(a,b)
q = a // b;
r = a - (b * q);
function (q1,q0,r0) = div21(a1,a0,b0)
(q1,r1) = quorem(a1,b0);
(q0,r0) = quorem( r1 * p + a0 , b0 );
(q1,q0) = split( q1 * p + q0 );
function q = div32(a2,a1,a0,b1,b0)
(q,r) = quorem(a2*p+a1,b1*p+b0);
q = q * p;
(a2,a1)=split(r);
if a2<b1
(q1,q0,r)=div21(a2,a1,b1);
assert(q1==0); -- since a2<b1...
else
q0=p-1;
r=(a2-b1)*p+a1+b1;
(d1,d0) = split(q0*b0);
r = (r-d1)*p + a0 - d0;
q = q + q0; -- add the low quotient digit; q so far holds only the high part
while(r < 0)
q = q - 1;
r = r + b1*p + b0;
function t=muldiv(x,y,z)
(a2,a1,a0) = mul22(x,y);
(z1,z0) = split(z);
if z1 == 0
(q2,q1,r1)=div21(a2,a1,z0);
assert(q2==0); -- otherwise result will not fit on 64 bits
t = q1*p + ( ( r1*p + a0 )//z0);
else
t = div32(a2,a1,a0,z1,z0);
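Since the question mentions prototyping in Python, the pseudocode can be ported roughly as follows. This is a sketch under the same assumptions (non-negative operands, x1*y1 below 2^32), with the low quotient digit q0 explicitly added to the result. Python's unbounded integers hide any overflow that the real hardware would hit, and the correction loop can run for a long time when the top limb of z is small compared with its low limb; production code usually normalizes the divisor first.

```python
P = 1 << 32  # limb base

def split(a):
    """Return (high, low) with a == high * P + low and 0 <= low < P."""
    hi = a >> 32
    return hi, a - hi * P

def mul22(x, y):
    """Limbs (a2, a1, a0) of x * y, assuming x1 * y1 < P."""
    x1, x0 = split(x)
    y1, y0 = split(y)
    h1, h0 = split(x1 * y1)
    assert h1 == 0, "product does not fit the 96-bit assumption"
    l1, l0 = split(x0 * y0)
    m1, m0 = split((x1 - x0) * (y0 - y1))  # Karatsuba middle term, may be negative
    carry, a1 = split(l1 + l0 + h0 + m0)
    return l1 + m1 + h0 + carry, a1, l0

def div21(a1, a0, b0):
    """Divide the two-limb number (a1, a0) by one limb b0: (q1, q0, remainder)."""
    q1, r1 = divmod(a1, b0)
    q0, r0 = divmod(r1 * P + a0, b0)
    return q1, q0, r0

def div32(a2, a1, a0, b1, b0):
    """Quotient of the three-limb number (a2, a1, a0) by (b1, b0), with b1 > 0."""
    b = b1 * P + b0
    q, r = divmod(a2 * P + a1, b)
    r1, r0 = split(r)
    if r1 < b1:
        _, q0, r = div21(r1, r0, b1)  # estimate the low digit from the top limb of b
    else:
        q0 = P - 1
        r = (r1 - b1) * P + r0 + b1
    r = r * P + a0 - q0 * b0
    while r < 0:  # the estimate is never too small, so only decrement
        q0 -= 1
        r += b
    return q * P + q0

def muldiv(x, y, z):
    a2, a1, a0 = mul22(x, y)
    z1, z0 = split(z)
    if z1 == 0:
        q2, q1, r1 = div21(a2, a1, z0)
        assert q2 == 0, "result does not fit in 64 bits"
        return q1 * P + (r1 * P + a0) // z0
    return div32(a2, a1, a0, z1, z0)

x, y, z = 0xFEDCBA9876543210, 0x9ABCDEF1, (1 << 62) + 12345
print(muldiv(x, y, z) == (x * y) // z)
```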

Re-writing sympy expression as cubic polynomial with defined variables and coefficients

Given an expression in sympy, how do I re-write the expression as a polynomial defined as [1]:
D11*(omega**2/k**2)**3 + D22*(omega**2/k**2)**2 + D33*(omega**2/k**2) + D44 = 0
Note that this is different from the similar question asked here (Rewrite equation as polynomial).
Let
x=(omega**2/k**2)
then
D11*x**3 + D22*x**2 + D33*x + D44 = 0
I would like to find D11, D22, D33, and D44, given that x=omega**2/k**2
Normally, the collect function (http://docs.sympy.org/latest/tutorial/simplification.html) would collect similar terms, but in this situation, it does not seem to work well.
Here is a simple example that helps to explain what I am trying to accomplish. The output should be in the form D11*(omega**2/k**2)**3 + D22*(omega**2/k**2)**2 + D33*(omega**2/k**2) + D44 = 0
from sympy import symbols, collect
from IPython.display import display
omega = symbols('omega')
k = symbols('k')
a = symbols('a')
b = symbols('b')
c = symbols('c')
D11 = a*b*c
D22 = c+b
D33 = a+c*b + b
D44 = a+b
x = omega**2/k**2
expr = (D11*x**3 + D22*x**2 + D33*x + D44)
expr0 = expr.expand()
expr1 = collect(expr0, x)
display(expr1)
The output is:
a*b*c*omega**6/k**6 + a + b + omega**2*(a + b*c + b)/k**2 + omega**4*(b + c)/k**4
Although numerically correct, I would like the polynomial in the form [1] given above, and once it is in the form, I would like to extract the D11, D22, D33, and D44 coefficients.
Using evaluate=False in the collect function gets me closer to the goal, since the output now becomes:
{omega**2/k**2: a + b*c + b, omega**4/k**4: b + c, omega**6/k**6: a*b*c, 1: a + b}
Starting with your expr1,
expr1 = a*b*c*omega**6/k**6 + a + b + omega**2*(a + b*c + b)/k**2 + omega**4*(b + c)/k**4
it seems the easiest way to get the coefficients is to turn x into a symbol, at least temporarily:
Poly(expr1.subs(x, Symbol('x')), Symbol('x')).all_coeffs()
returns [a*b*c, b + c, a + b*c + b, a + b] (the coefficients are listed starting with the highest degree).
I would probably have x = Symbol('x') there to begin with, and only use expr.subs(x, omega**2/k**2) when needed.
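A minimal sketch of that suggestion, assuming SymPy is installed and reusing the example coefficients from the question: x stays a plain symbol while the coefficients are extracted, and omega**2/k**2 is substituted only at the end.

```python
from sympy import Poly, expand, symbols

a, b, c, omega, k, x = symbols("a b c omega k x")

# Example coefficients taken from the question.
D11, D22, D33, D44 = a*b*c, c + b, a + c*b + b, a + b

# Build the cubic in the plain symbol x.
expr_x = D11*x**3 + D22*x**2 + D33*x + D44

# Coefficients, highest degree first.
coeffs = Poly(expr_x, x).all_coeffs()
print(coeffs)

# Substitute the physical ratio only when it is actually needed.
expr_physical = expr_x.subs(x, omega**2 / k**2)
print(expr_physical)
```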
SymPy's internal order of terms in a sum cannot be changed. To "rearrange" a SymPy expression means to print it in a more human-friendly form. This is largely a string manipulation problem, as we are no longer producing a SymPy object.
str(Poly(expr1.subs(x, Symbol('x')), Symbol('x'))).replace('x', '(' + str(x) + ')')
returns Poly(a*b*c*(omega**2/k**2)**3 + (b + c)*(omega**2/k**2)**2 + (a + b*c + b)*(omega**2/k**2) + a + b, (omega**2/k**2), domain='ZZ[a,b,c]')
Adding .split(',')[0].replace('Poly(', '') to the above removes the meta-data of a polynomial, leaving a*b*c*(omega**2/k**2)**3 + (b + c)*(omega**2/k**2)**2 + (a + b*c + b)*(omega**2/k**2) + a + b

Distance to a straight line in standard form

For a 3D straight line expressed in the standard form
a1*x + b1*y + c1*z + d1 = 0
a2*x + b2*y + c2*z + d2 = 0
and a given point x0,y0,z0
what is the distance from the point to the straight line?
Distance from point P0 to parametric line L(t) = Base + t * Dir is
Dist = Length(CrossProduct(Dir, P0 - Base)) / Length(Dir)
To find direction vector:
Dir = CrossProduct((a1,b1,c1), (a2,b2,c2))
To get an arbitrary base point, find any solution of the system of two equations in three unknowns:
a1*x + b1*y + c1*z + d1 = 0
a2*x + b2*y + c2*z + d2 = 0
Check the 2x2 minors formed from the a and b, the a and c, and the b and c coefficients. When a minor is non-zero, the remaining variable can be taken as the free one. For example, if a1 * b2 - b1 * a2 != 0, choose z as the free variable: set it to zero (or any other value) and solve the system for the two unknowns x and y.
(I have omitted the special cases of parallel or coinciding planes.)
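The whole recipe can be sketched in plain Python as below. The function names and the choice of the largest direction component to pick the free variable are assumptions made for this illustration; the three minors mentioned above are, up to sign, the components of Dir.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def point_line_distance(plane1, plane2, point):
    """Distance from point to the line where the planes (a, b, c, d) intersect."""
    n1, d1 = plane1[:3], plane1[3]
    n2, d2 = plane2[:3], plane2[3]
    direction = cross(n1, n2)
    # The free variable is the one whose direction component is largest.
    free = max(range(3), key=lambda i: abs(direction[i]))
    if direction[free] == 0:
        raise ValueError("planes are parallel or coincident")
    i, j = [idx for idx in range(3) if idx != free]
    # Set the free variable to 0 and solve the 2x2 system by Cramer's rule.
    det = n1[i] * n2[j] - n1[j] * n2[i]
    base = [0.0, 0.0, 0.0]
    base[i] = (-d1 * n2[j] + d2 * n1[j]) / det
    base[j] = (-n1[i] * d2 + n2[i] * d1) / det
    diff = tuple(p - q for p, q in zip(point, base))
    num = cross(direction, diff)
    return math.sqrt(sum(t * t for t in num)) / math.sqrt(sum(t * t for t in direction))

# The planes z = 1 and y = 2 meet in a line parallel to the x axis.
print(point_line_distance((0, 0, 1, -1), (0, 1, 0, -2), (0, 5, 5)))
```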

Count the Number of Zero's between Range of integers

Is there a direct formula or method to find the number of zeros within a given range? Suppose two integers M and N are given. If I have to find the total number of zero digits in this range, what should I do?
Let M = 1234567890 & N = 2345678901
And answer is : 987654304
Thanks in advance .
Reexamining the Problem
Here is a simple solution in Ruby, which inspects each integer from the interval [m,n], determines the string of its digits in the standard base 10 positional system, and counts the occurring 0 digits:
def brute_force(m, n)
  if m > n
    return 0
  end
  z = 0
  m.upto(n) do |k|
    z += k.to_s.count('0')
  end
  z
end
If you run it in an interactive Ruby shell you will get
irb> brute_force(1,100)
=> 11
which is fine. However using the interval bounds from the example in the question
m = 1234567890
n = 2345678901
you will find that this takes considerable time. On my machine it needs far more than a couple of seconds; so far I have had to cancel it.
So the real question is not only to come up with the correct zero counts but to do it faster than the above brute force solution.
Complexity: Running Time
The brute force solution searches the base 10 string of each number k for zeros, n-m+1 times in total. Each string has length floor(log_10(k))+1, so it will not use more than
O(n (log(n)+1))
string digit accesses. In the slow example, n was roughly 10^9.
Reducing Complexity
Yiming Rong's answer is a first attempt to reduce the complexity of the problem.
If the function for calculating the number of zeros regarding the interval [m,n] is F(m,n), then it has the property
F(m,n) = F(1,n) - F(1,m-1)
so that it suffices to look for a most likely simpler function G with the property
G(n) = F(1,n).
Divide and Conquer
Coming up with a closed formula for the function G is not that easy. E.g.
the interval [1,1000] contains 192 zeros, but the interval [1001,2000] contains 300 zeros, because a case like k = 99 in the first interval would correspond to k = 1099 in the second interval, which yields another zero digit to count. k=7 would show up as 1007, yielding two more zeros.
What one can try is to express the solution for some problem instance in terms of solutions to simpler problem instances. This strategy is called divide and conquer in computer science. It works if at some complexity level it is possible to solve the problem instance and if one can deduce the solution of a more complex problem from the solutions of the simpler ones. This naturally leads to a recursive formulation.
E.g. we can formulate a solution for a restricted version of G, which is only working for some of the arguments. We call it g and it is defined for 9, 99, 999, etc. and will be equal to G for these arguments.
It can be calculated using this recursive function:
# zeros for 1..n, where n = (10^k)-1: 0, 9, 99, 999, ..
def g(n)
  if n <= 9
    return 0
  end
  n2 = (n - 9) / 10
  return 10 * g(n2) + n2
end
Note that this function is much faster than the brute force method: To count the zeros in the interval [1, 10^9-1], which is comparable to the m from the question, it just needs 9 calls, its complexity is
O(log(n))
Again note that this g is not defined for arbitrary n, only for n = (10^k)-1.
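For readers who prefer Python, the same recursion can be written and cross-checked against a brute-force count as follows. This is a direct port of the Ruby function above; the brute-force helper is only there for the comparison.

```python
def g(n):
    """Zero digits in the decimal representations of 1..n, valid only for n = 10**k - 1."""
    if n <= 9:
        return 0
    n2 = (n - 9) // 10
    return 10 * g(n2) + n2

def brute_force(m, n):
    """Count zero digits in m..n by inspecting every number."""
    return sum(str(k).count("0") for k in range(m, n + 1))

for n in (9, 99, 999, 9999):
    print(n, g(n), brute_force(1, n))
```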
Derivation of g
It starts with finding the recursive definition of the function h(n),
which counts zeros in the numbers from 1 to n = (10^k) - 1, if the decimal representation has leading zeros.
Example: h(999) counts the zero digits for the number representations:
001..009
010..099
100..999
The result would be h(999) = 297.
Using k = floor(log10(n+1)), k2 = k - 1, n2 = (10^k2) - 1 = (n-9)/10 the function h turns out to be
h(n) = 9 [k2 + h(n2)] + h(n2) + n2 = 9 k2 + 10 h(n2) + n2
with the initial condition h(0) = 0. This allows g to be formulated as
g(n) = 9 [k2 + h(n2)] + g(n2)
with the initial condition g(0) = 0.
From these two definitions we can define the difference d between h and g as well, again as a recursive function:
d(n) = h(n) - g(n) = h(n2) - g(n2) + n2 = d(n2) + n2
with the initial condition d(0) = 0. Trying some examples leads to a geometric series, e.g. d(9999) = d(999) + 999 = d(99) + 99 + 999 = d(9) + 9 + 99 + 999 = 0 + 9 + 99 + 999 = (10^0)-1 + (10^1)-1 + (10^2)-1 + (10^3)-1 = (10^4 - 1)/(10-1) - 4. This gives the closed form
d(n) = n/9 - k
This allows us to express g in terms of g only:
g(n) = 9 [k2 + h(n2)] + g(n2) = 9 [k2 + g(n2) + d(n2)] + g(n2) = 9 k2 + 9 d(n2) + 10 g(n2) = 9 k2 + n2 - 9 k2 + 10 g(n2) = 10 g(n2) + n2
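The recursions for h and g and the closed form d(n) = n/9 - k can be checked numerically with a few lines of Python. This is a sketch that simply transcribes the formulas above; k is taken as the number of decimal digits of n.

```python
def h(n):
    """Zero digits in 1..n with leading zeros padded to k digits, for n = 10**k - 1."""
    if n == 0:
        return 0
    k2 = len(str(n)) - 1
    n2 = (n - 9) // 10
    return 9 * k2 + 10 * h(n2) + n2

def g(n):
    """Zero digits in 1..n without leading zeros, for n = 10**k - 1."""
    if n <= 9:
        return 0
    n2 = (n - 9) // 10
    return 10 * g(n2) + n2

for n in (9, 99, 999, 9999, 99999):
    k = len(str(n))
    # d(n) = h(n) - g(n) should equal n/9 - k
    print(n, h(n), g(n), h(n) - g(n), n // 9 - k)
```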
Derivation of G
Using the above definitions and naming the k digits of the representation q_k, q_k2, .., q2, q1 we first extend h into H:
H(q_k q_k2..q_1) = q_k [k2 + h(n2)] + r (k2-kr) + H(q_kr..q_1) + n2
with initial condition H(q_1) = 0 for q_1 <= 9.
Note the additional definition r = q_kr..q_1. To understand why it is needed look at the example H(901), where the next level call to H is H(1), which means that the digit string length shrinks from k=3 to kr=1, needing an additional padding with r (k2-kr) zero digits.
Using this, we can extend g to G as well:
G(q_k q_k2..q_1) = (q_k-1) [k2 + h(n2)] + k2 + r (k2-kr) + H(q_kr..q_1) + g(n2)
with initial condition G(q_1) = 0 for q_1 <= 9.
Note: It is likely that one can simplify the above expressions like in case of g above. E.g. trying to express G just in terms of G and not using h and H. I might do this in the future. The above is already enough to implement a fast zero calculation.
Test Result
recursive(1234567890, 2345678901) =
987654304
expected:
987654304
success
See the source and log for details.
Update: I changed the source and log according to the more detailed problem description from that contest (allowing 0 as input, handling invalid inputs, 2nd larger example).
You can use a standard approach to find the zero counts m for [1, M-1] and n for [1, N]; the count for [M, N] is then n - m.
Standard approaches are easily available: Counting zeroes.
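One such standard approach counts, for every decimal position, how many numbers in 1..n have a zero there. The sketch below is not the G recursion derived above but a different, commonly used positional count; the function names are made up for this example, and the count starts at 1, so the number 0 itself is not included.

```python
def zeros_upto(n):
    """Count zero digits in the decimal representations of 1..n."""
    count = 0
    p = 1  # place value of the digit position being examined
    while p <= n:
        high, cur, low = n // (p * 10), (n // p) % 10, n % p
        if cur == 0:
            # Prefixes 1..high-1 allow any low part; prefix high allows 0..low.
            count += (high - 1) * p + low + 1
        else:
            # Prefixes 1..high each allow any low part.
            count += high * p
        p *= 10
    return count

def zeros_between(m, n):
    """Count zero digits in m..n inclusive, for 1 <= m <= n."""
    return zeros_upto(n) - zeros_upto(m - 1)

print(zeros_between(1, 100))                    # 11
print(zeros_between(1234567890, 2345678901))    # 987654304
```

Each call looks at every digit position once, so the work grows with the number of digits rather than with the size of the interval.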

How to implement Frobenius pseudoprime algorithm?

Someone told me that the Frobenius pseudoprime algorithm takes three times longer to run than the Miller–Rabin primality test but has seven times the resolution. So if one were to run the former ten times and the latter thirty times, both would take the same time to run, but the former would provide about 233% more analytical power. While trying to find out how to perform the test, I found the following paper, which gives the algorithm at the end:
A Simple Derivation for the Frobenius Pseudoprime Test
There is an attempt at implementing the algorithm below, but the program never prints out a number. Could someone who is more familiar with the math notation or algorithm verify what is going on please?
Edit 1: The code below has corrections added, but the implementation for compute_wm_wm1 is missing. Could someone explain the recursive definition from an algorithmic standpoint? It is not "clicking" for me.
Edit 2: The erroneous code has been removed, and an implementation of the compute_wm_wm1 function has been added below. It appears to work but may require further optimization to be practical.
from random import SystemRandom
from math import gcd  # fractions.gcd was removed in Python 3.9
random = SystemRandom().randrange
def find_prime_number(bits, test):
    number = random((1 << bits - 1) + 1, 1 << bits, 2)
    while True:
        for _ in range(test):
            if not frobenius_pseudoprime(number):
                break
        else:
            return number
        number += 2
def frobenius_pseudoprime(integer):
    assert integer & 1 and integer >= 3
    a, b, d = choose_ab(integer)
    w1 = (a ** 2 * extended_gcd(b, integer)[0] - 2) % integer
    m = (integer - jacobi_symbol(d, integer)) >> 1
    wm, wm1 = compute_wm_wm1(w1, m, integer)
    if w1 * wm != 2 * wm1 % integer:
        return False
    b = pow(b, (integer - 1) >> 1, integer)
    return b * wm % integer == 2
def choose_ab(integer):
    a, b = random(1, integer), random(1, integer)
    d = a ** 2 - 4 * b
    while is_square(d) or gcd(2 * d * a * b, integer) != 1:
        a, b = random(1, integer), random(1, integer)
        d = a ** 2 - 4 * b
    return a, b, d
def is_square(integer):
    if integer < 0:
        return False
    if integer < 2:
        return True
    x = integer >> 1
    seen = set([x])
    while x * x != integer:
        x = (x + integer // x) >> 1
        if x in seen:
            return False
        seen.add(x)
    return True
def extended_gcd(n, d):
    x1, x2, y1, y2 = 0, 1, 1, 0
    while d:
        n, (q, d) = d, divmod(n, d)
        x1, x2, y1, y2 = x2 - q * x1, x1, y2 - q * y1, y1
    return x2, y2
def jacobi_symbol(n, d):
    j = 1
    while n:
        while not n & 1:
            n >>= 1
            if d & 7 in {3, 5}:
                j = -j
        n, d = d, n
        if n & 3 == 3 == d & 3:
            j = -j
        n %= d
    return j if d == 1 else 0
def compute_wm_wm1(w1, m, n):
    a, b = 2, w1
    for shift in range(m.bit_length() - 1, -1, -1):
        if m >> shift & 1:
            a, b = (a * b - w1) % n, (b * b - 2) % n
        else:
            a, b = (a * a - 2) % n, (a * b - w1) % n
    return a, b
print('Probably prime:\n', find_prime_number(300, 10))
You seem to have misunderstood the algorithm completely due to not being familiar with the notation.
def frobenius_pseudoprime(integer):
assert integer & 1 and integer >= 3
a, b, d = choose_ab(integer)
w1 = (a ** 2 // b - 2) % integer
That comes from the line
W_0 ≡ 2 (mod n) and W_1 ≡ a^2 * b^(-1) − 2 (mod n)
But the b^(-1) doesn't mean 1/b here; it means the modular inverse of b modulo n, i.e. an integer c with b·c ≡ 1 (mod n). You can most easily find such a c by continued fraction expansion of b/n or, equivalently, but with slightly more computation, by the extended Euclidean algorithm. Since you're probably not familiar with continued fractions, I recommend the latter.
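A minimal sketch of the extended Euclidean route, with a hypothetical helper name mod_inverse; recent Python versions (3.8 and later) can also compute the same value with pow(b, -1, n).

```python
def mod_inverse(b, n):
    """Return c in [0, n) with b * c % n == 1, or raise if gcd(b, n) != 1."""
    old_r, r = b % n, n
    old_s, s = 1, 0  # invariant: old_r == old_s * b (mod n)
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("b has no inverse modulo n")
    return old_s % n

# W_1 for example parameters a = 5, b = 3, n = 7
a, b, n = 5, 3, 7
w1 = (a * a * mod_inverse(b, n) - 2) % n
print(mod_inverse(b, n), w1)
```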
m = (integer - d // integer) // 2
comes from
n − (∆/n) = 2m
and misunderstands the Jacobi symbol as a fraction/division (admittedly, I have displayed it here even more like a fraction, but since the site doesn't support LaTeX rendering, we'll have to make do).
The Jacobi symbol is a generalisation of the Legendre symbol - denoted identically - which indicates whether a number is a quadratic residue modulo an odd prime (if n is a quadratic residue modulo p, i.e. there is a k with k^2 ≡ n (mod p) and n is not a multiple of p, then (n/p) = 1, if n is a multiple of p, then (n/p) = 0, otherwise (n/p) = -1). The Jacobi symbol lifts the restriction that the 'denominator' be an odd prime and allows arbitrary odd numbers as 'denominators'. Its value is the product of the Legendre symbols with the same 'numerator' for all primes dividing n (according to multiplicity). More on that, and how to compute Jacobi symbols efficiently in the linked article.
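For illustration, the usual binary algorithm for the Jacobi symbol can be sketched in Python as follows. This is a generic textbook version, not taken from the paper; it reduces the numerator modulo the denominator first so that negative numerators are handled as well.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a:
        # Pull out factors of 2: (2/n) is -1 exactly when n is 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign when both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

print(jacobi(2, 7), jacobi(3, 7), jacobi(7, 7))
```

For a prime denominator p the result agrees with Euler's criterion, a^((p-1)/2) mod p.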
The line should correctly read
m = (integer - jacobi_symbol(d,integer)) // 2
I completely fail to understand the following lines. Logically, what should follow here is the calculation of
W_m and W_{m+1} using the recursion
W_{2j} ≡ W_j^2 − 2 (mod n)
W_{2j+1} ≡ W_j * W_{j+1} − W_1 (mod n)
An efficient method of using that recursion to compute the required values is given around formula (11) of the PDF.
w_m0 = w1 * 2 // m % integer
w_m1 = w1 * 2 // (m + 1) % integer
w_m2 = (w_m0 * w_m1 - w1) % integer
The remainder of the function is almost correct, except of course that it now gets the wrong data due to earlier misunderstandings.
if w1 * w_m0 != 2 * w_m2:
The (in)equality here should be modulo integer, namely if (w1*w_m0 - 2*w_m2) % integer != 0.
return False
b = pow(b, (integer - 1) // 2, integer)
return b * w_m0 % integer == 2
Note, however, that if n is a prime, then
b^((n-1)/2) ≡ (b/n) (mod n)
where (b/n) is the Legendre (or Jacobi) symbol (for prime 'denominators', the Jacobi symbol is the Legendre symbol), hence b^((n-1)/2) ≡ ±1 (mod n). So you could use that as an extra check, if Wm is not 2 or n-2, n can't be prime, nor can it be if b^((n-1)/2) (mod n) is not 1 or n-1.
Probably computing b^((n-1)/2) (mod n) first and checking whether that's 1 or n-1 is a good idea, since if that check fails (that is the Euler pseudoprime test, by the way) you don't need the other, no less expensive, computations anymore, and if it succeeds, it's very likely that you need to compute it anyway.
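That early exit could look like the following sketch. The helper name is made up for this example; a False result proves that n is composite, while True only means the more expensive checks are still worth running.

```python
def passes_euler_check(b, n):
    """Return False if b^((n-1)/2) mod n is neither 1 nor n-1 (n odd, n >= 3)."""
    r = pow(b, (n - 1) // 2, n)
    return r == 1 or r == n - 1

print(passes_euler_check(2, 13))  # 13 is prime
print(passes_euler_check(2, 15))  # 15 = 3 * 5
```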
Regarding the corrections, they seem correct, except for one that made a glitch I previously overlooked possibly worse:
if w1 * wm != 2 * wm1 % integer:
That applies the modulus only to 2 * wm1.
Concerning the recursion for the Wj, I think it is best to explain with a working implementation, first in toto for easy copy and paste:
from math import log
def compute_wm_wm1(w1,m,n):
    a, b = 2, w1
    bits = int(log(m,2)) - 2
    if bits < 0:
        bits = 0
    mask = 1 << bits
    while mask <= m:
        mask <<= 1
    mask >>= 1
    while mask > 0:
        if (mask & m) != 0:
            a, b = (a*b-w1)%n, (b*b-2)%n
        else:
            a, b = (a*a-2)%n, (a*b-w1)%n
        mask >>= 1
    return a, b
Then with explanations in between:
def compute_wm_wm1(w1,m,n):
We need the value of W1, the index of the desired number, and the number by which to take the modulus as input. The value W0 is always 2, so we don't need that as a parameter.
Call it as
wm, wm1 = compute_wm_wm1(w1,m,integer)
in frobenius_pseudoprime (aside: not a good name, most of the numbers returning True are real primes).
a, b = 2, w1
We initialise a and b to W_0 and W_1 respectively. At each point, a holds the value of W_j and b the value of W_{j+1}, where j is the value of the bits of m so far consumed. For example, with m = 13, the values of j, a and b develop as follows:
consumed  remaining  j   a     b
(none)    1101       0   w_0   w_1
1         101        1   w_1   w_2
11        01         3   w_3   w_4
110       1          6   w_6   w_7
1101      (none)     13  w_13  w_14
The bits are consumed left-to-right, so we have to find the first set bit of m and place our 'pointer' right before it
bits = int(log(m,2)) - 2
if bits < 0:
bits = 0
mask = 1 << bits
I subtracted a bit from the computed logarithm just to be entirely sure that we don't get fooled by a floating point error (by the way, using log limits you to numbers of at most 1024 bits, about 308 decimal digits; if you want to treat larger numbers, you have to find the base-2 logarithm of m in a different way, using log was the simplest way, and it's just a proof of concept, so I used that here).
while mask <= m:
mask <<= 1
Shift the mask until it's greater than m, so the set bit points just before m's first set bit. Then shift one position back, so we point at the bit.
mask >>= 1
while mask > 0:
if (mask & m) != 0:
a, b = (a*b-w1)%n, (b*b-2)%n
If the next bit is set, the value of the initial portion of consumed bits of m goes from j to 2*j+1, so the next values of the W sequence we need are W_{2j+1} for a and W_{2j+2} for b. By the above recursion formula,
W_{2j+1} = W_j * W_{j+1} - W_1 (mod n)
W_{2j+2} = W_{j+1}^2 - 2 (mod n)
Since a was W_j and b was W_{j+1}, a becomes (a*b - W_1) % n and b becomes (b * b - 2) % n.
else:
a, b = (a*a-2)%n, (a*b-w1)%n
If the next bit is not set, the value of the initial portion of consumed bits of m goes from j to 2*j, so a becomes W_{2j} = (W_j^2 - 2) (mod n), and b becomes
W_{2j+1} = (W_j * W_{j+1} - W_1) (mod n).
mask >>= 1
Move the pointer to the next bit. When we have moved past the final bit, mask becomes 0 and the loop ends. The initial portion of consumed bits of m is now all of m's bits, so the value is of course m.
Then we can
return a, b
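As a sanity check, the bit-by-bit doubling can be compared with stepping through the sequence one index at a time. The sketch below assumes the linear recurrence W_{j+1} = W_1 * W_j - W_{j-1} (mod n), which is consistent with the two doubling formulas above but is not quoted from the paper, and it uses m.bit_length() in place of the floating-point logarithm, so it is not limited to 1024-bit numbers.

```python
def compute_wm_wm1(w1, m, n):
    """Return (W_m, W_{m+1}) mod n by consuming the bits of m left to right."""
    a, b = 2, w1
    for shift in range(m.bit_length() - 1, -1, -1):
        if (m >> shift) & 1:
            a, b = (a * b - w1) % n, (b * b - 2) % n   # j -> 2j + 1
        else:
            a, b = (a * a - 2) % n, (a * b - w1) % n   # j -> 2j
    return a, b

def compute_wm_wm1_naive(w1, m, n):
    """Same pair, computed one index at a time (assumed linear recurrence)."""
    prev, cur = 2 % n, w1 % n
    for _ in range(m):
        prev, cur = cur, (w1 * cur - prev) % n
    return prev, cur

w1, n = 3, 1000003
for m in (1, 2, 13, 100, 1000):
    print(m, compute_wm_wm1(w1, m, n) == compute_wm_wm1_naive(w1, m, n))
```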
Some additional remarks:
def find_prime_number(bits, test):
while True:
number = random(3, 1 << bits, 2)
for _ in range(test):
if not frobenius_pseudoprime(number):
break
else:
return number
Primes are not too frequent among the larger numbers, so just picking random numbers is likely to take a lot of attempts to hit one. You will probably find a prime (or probable prime) faster if you pick one random number and check candidates in order.
Another point is that such a test as the Frobenius test is disproportionally expensive to find that e.g. a multiple of 3 is composite. Before using such a test (or a Miller-Rabin test, or a Lucas test, or an Euler test, ...), you should definitely do a bit of trial division to weed out most of the composites and do the work only where it has a fighting chance of being worth it.
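A small pre-filter along those lines might look like this sketch. The list of small primes and the helper names are arbitrary choices for this example; surviving the filter says nothing about primality, it only means the expensive test is worth running.

```python
SMALL_PRIMES = (3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47)

def survives_trial_division(n):
    """Return False if odd n has a small prime factor other than itself."""
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    return True

def next_candidate(start):
    """First odd number >= start that survives trial division."""
    n = start | 1
    while not survives_trial_division(n):
        n += 2
    return n

print(next_candidate(90))  # 91, 93 and 95 are filtered out cheaply
```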
Oh, and the is_square function isn't prepared to deal with arguments less than 2, divide-by-zero errors lurk there,
def is_square(integer):
if integer < 0:
return False
if integer < 2:
return True
x = integer // 2
should help.

Resources