Levenshtein distance cost - haskell

I am new to haskell and I encountered a performance issue that is so grave that it must be my code and not the haskell platform.
I have a python implementation of the Levenshtein distance (own code) and I passed (or tried to do so) this to haskell. The result is the following:
bool2int :: Bool -> Int
bool2int True = 1
bool2int False = 0
levenshtein :: Eq a => [a] -> [a] -> Int -> Int -> Int
levenshtein u v 0 0 = 0
levenshtein u v i 0 = i
levenshtein u v 0 j = j
levenshtein u v i j = minimum [1 + levenshtein u v i (j - 1),
1 + levenshtein u v (i - 1) j,
bool2int (u !! (i - 1) /= v !! (j - 1) ) + levenshtein u v (i - 1) (j - 1) ]
distance :: Eq a => [a] -> [a] -> Int
distance u v = levenshtein u v (length u) (length v)
Now, the difference in execution time for strings of length 10 or more is of various powers of 10 between python and haskell. Also with some rough time measuring (wall clock, as I haven't found a clock() command in haskell so far) it seems that my haskell implementation has not cost O(mn), but some other exorbitantly fast growing cost.
Nota bene: I do not want my haskell implementation to compete speed wise with the python script. I just want it to run in a "sensible" time and not in multiples of the time the whole universe exists.
Questions:
What am I doing wrong, that my implementation is so darn slow?
How to fix it?
Talking about "lazy evaluation": I gather that if levenshtein "cat" "kit" 2 2 is called thrice, it is only calculated once. Is this right?
There must be something built-in for my bool2int, right?
Any other input is highly appreciated if it shoves me ahead on the rough path to mastering haskell.
EDIT: Here goes the python code for comparison:
#! /usr/bin/python3.2
# -*- coding, utf-8 -*-
class Levenshtein:
def __init__ (self, u, v):
self.__u = ' ' + u
self.__v = ' ' + v
self.__D = [ [None for x in self.__u] for x in self.__v]
for x, _ in enumerate (self.__u): self.__D [0] [x] = x
for x, _ in enumerate (self.__v): self.__D [x] [0] = x
#property
def distance (self):
return self.__getD (len (self.__v) - 1, len (self.__u) - 1)
def __getD (self, i, j):
if self.__D [i] [j] != None: return self.__D [i] [j]
self.__D [i] [j] = min ( [self.__getD (i - 1, j - 1) + (0 if self.__v [i] == self.__u [j] else 1),
self.__getD (i, j - 1) + 1,
self.__getD (i - 1, j) + 1] )
return self.__D [i] [j]
print (Levenshtein ('first string', 'second string').distance)

What am I doing wrong, that my implementation is so darn slow?
Your algorithm has exponential complexity. You seem to be assuming that the calls are being memoized for you, but that's not the case.
How to fix it?
You'll need to add explicit memoization, possibly using an array or some other method.
Talking about "lazy evaluation": I gather that if levenshtein "cat" "kit" 2 2 is called thrice, it is only calculated once. Is this right?
No, Haskell does not do automatic memoization. Laziness means that if you do let y = f x in y + y, then the f x will only be evaluated (once) if the result of the sum is demanded. It does not mean that f x + f x will evaluate in only one call to f x. You have to be explicit when you want to share results from subexpressions.
There must be something built-in for my bool2int, right?
Yes, there is an instance Enum Bool, so you can use fromEnum.
*Main> fromEnum True
1
*Main> fromEnum False
0
Any other input is highly appreciated if it shoves me ahead on the rough path to mastering haskell.
While writing stuff from scratch may be fun and educational, it is important to learn to take advantage of the many great libraries on Hackage when doing common things like this.
For example there is an implementation of the Levenshtein distance in the edit-distance package.
I translated your Haskell code back to Python for comparison:
def levenshtein(u, v, i, j):
if i == 0: return j
if j == 0: return i
return min(1 + levenshtein(u, v, i, (j-1)),
1 + levenshtein(u, v, (i-1), j),
(u[i-1] != v[j-1]) + levenshtein(u, v, (i-1), (j-1)))
def distance(u, v):
return levenshtein(u, v, len(u), len(v))
if __name__ == "__main__":
print distance("abbacabaab", "abaddafaca")
Even without fixing the O(n) indexing issue that chrisdb pointed out in his answer, this performs slower than the Haskell version when compiled:
$ time python levenshtein.py
6
real 0m4.793s
user 0m4.690s
sys 0m0.020s
$ time ./Levenshtein
6
real 0m0.524s
user 0m0.520s
sys 0m0.000s
Of course, they both lose to the properly memoized version in the edit-distance package:
$ time ./LevenshteinEditDistance
6
real 0m0.015s
user 0m0.010s
sys 0m0.000s
Here's a simple memoized implementation using Data.Array:
import Data.Array
distance u v = table ! (m, n)
where table = listArray ((0, 0), (m, n)) [levenshtein i j | i <- [0..m], j <- [0..n]]
levenshtein 0 j = j
levenshtein i 0 = i
levenshtein i j = minimum [ 1 + table!(i, j-1)
, 1 + table!(i-1, j)
, fromEnum (u'!(i-1) /= v'!(j-1)) + table!(i-1, j-1) ]
u' = listArray (0, m-1) u
v' = listArray (0, n-1) v
m = length u
n = length v
main = print $ distance "abbacabaab" "abaddafaca"
It performs similarly to your original Python code:
$ time python levenshtein-original.py
6
real 0m0.037s
user 0m0.030s
sys 0m0.000s
$ time ./LevenshteinArray
6
real 0m0.017s
user 0m0.010s
sys 0m0.000s

It looks to me like the likely cause is the use of !! for random access. Haskell lists are linked lists, which are poorly suited to algorithms that require random rather than sequential access.
You might want to try replacing the lists with something better suited to random access. If you are interested in strings you could use Data.Text or ByteStrings, which are underlyingly arrays and should be fast. Or you could use something like Data.Vector perhaps.
EDIT: Actually, it looks like Data.Text will have the same issue, since the documentation says indexing is O(n). Probably converting to a vector would be best.

Related

Number of Different Subsequences GCDs

Number of Different Subsequences GCDs
You are given an array nums that consists of positive integers.
The GCD of a sequence of numbers is defined as the greatest integer that divides all the numbers in the sequence evenly.
For example, the GCD of the sequence [4,6,16] is 2.
A subsequence of an array is a sequence that can be formed by removing some elements (possibly none) of the array.
For example, [2,5,10] is a subsequence of [1,2,1,2,4,1,5,10].
Return the number of different GCDs among all non-empty subsequences of nums.
Example 1:
Input: nums = [6,10,3]
Output: 5
Explanation: The figure shows all the non-empty subsequences and their GCDs.
The different GCDs are 6, 10, 3, 2, and 1.
from itertools import permutations
import math
class Solution:
def countDifferentSubsequenceGCDs(self, nums: List[int]) -> int:
s = set()
g =0
for i in range(1,len(nums)+1):
comb = combinations(nums,i)
for i in comb:
if len(i)==1:
u = i[0]
s.add(u)
else:
g = math.gcd(i[0],i[1])
s.add(g)
if len(i)>2:
for j in range(2,len(i)):
g = math.gcd(g,i[j])
s.add(g)
g = 0
y = len(s)
return y
I am getting TLE for this input. Can someone pls help?
[5852,6671,170275,141929,2414,99931,179958,56781,110656,190278,7613,138315,58116,114790,129975,144929,61102,90624,60521,177432,57353,199478,120483,75965,5634,109100,145872,168374,26215,48735,164982,189698,77697,31691,194812,87215,189133,186435,131282,110653,133096,175717,49768,79527,74491,154031,130905,132458,103116,154404,9051,125889,63633,194965,105982,108610,174259,45353,96240,143865,184298,176813,193519,98227,22667,115072,174001,133281,28294,42913,136561,103090,97131,128371,192091,7753,123030,11400,80880,184388,161169,155500,151566,103180,169649,44657,44196,131659,59491,3225,52303,141458,143744,60864,106026,134683,90132,151466,92609,120359,70590,172810,143654,159632,191208,1497,100582,194119,134349,33882,135969,147157,53867,111698,14713,126118,95614,149422,145333,52387,132310,108371,127121,93531,108639,90723,416,141159,141587,163445,160551,86806,120101,157249,7334,60190,166559,46455,144378,153213,47392,24013,144449,66924,8509,176453,18469,21820,4376,118751,3817,197695,198073,73715,65421,70423,28702,163789,48395,90289,76097,18224,43902,41845,66904,138250,44079,172139,71543,169923,186540,77200,119198,184190,84411,130153,124197,29935,6196,81791,101334,90006,110342,49294,67744,28512,66443,191406,133724,54812,158768,113156,5458,59081,4684,104154,38395,9261,188439,42003,116830,184709,132726,177780,111848,142791,57829,165354,182204,135424,118187,58510,137337,170003,8048,103521,176922,150955,84213,172969,165400,111752,15411,193319,78278,32948,55610,12437,80318,18541,20040,81360,78088,194994,41474,109098,148096,66155,34182,2224,146989,9940,154819,57041,149496,120810,44963,184556,163306,133399,9811,99083,52536,90946,25959,53940,150309,176726,113496,155035,50888,129067,27375,174577,102253,77614,132149,131020,4509,85288,160466,105468,73755,4743,41148,52653,85916,147677,35427,88892,112523,55845,69871,176805,25273,99414,143558,90139,180122,140072,127009,139598,61510,17124,190177,10591,22199,34870,44485,43661,141089,55829,70258,198998,87094,157342,132616,66924,96498,88828,89204,29862,76341,61654,158331,187462,128135,35481,152033,144487,27336,84077,10260,106588,19188,99676,38622,32773,89365,30066,161268,153986,99101,20094,149627,144252,58646,148365,21429,69921,95655,77478,147967,140063,29968,120002,72662,28241,11994,77526,3246,160872,175745,3814,24035,108406,30174,10492,49263,62819,153825,110367,42473,30293,118203,43879,178492,63287,41667,195037,26958,114060,99164,142325,77077,144235,66430,186545,125046,82434,26249,54425,170932,83209,10387,7147,2755,77477,190444,156388,83952,117925,102569,82125,104479,16506,16828,83192,157666,119501,29193,65553,56412,161955,142322,180405,122925,173496,93278,67918,48031,141978,54484,80563,52224,64588,94494,21331,73607,23440,197470,117415,23722,170921,150565,168681,88837,59619,102362,80422,10762,85785,48972,83031,151784,79380,64448,87644,26463,142666,160273,151778,156229,24129,64251,57713,5341,63901,105323,18961,70272,144496,18591,191148,19695,5640,166562,2600,76238,196800,94160,129306,122903,40418,26460,131447,86008,20214,133503,174391,45415,47073,39208,37104,83830,80118,28018,185946,134836,157783,76937,33109,54196,37141,142998,189433,8326,82856,163455,176213,144953,195608,180774,53854,46703,78362,113414,140901,41392,12730,187387,175055,64828,66215,16886,178803,117099,112767,143988,65594,141919,115186,141050,118833,2849]
I'm going to add "an answer" here because most "not horribly slow" programs I've seen for this are way too elaborate.
Call the input xs. The fastest way I know of asks, for each integer j in 1 through max(xs), can j be the gcd of some non-empty subset of xs? Of course if max(xs) can be huge, that can be slow. But in the context you apparently took this from (LeetCode), it cannot be huge.
So, given j, how do we know whether some subset's gcd is j? Actually easy! We look at all and only the multiples of j in xs. The gcd of all of those is at least j. If, at any point along the way, their gcd so far is j, we found a subset whose gcd is j. Else the running gcd exceeds j after processing all of j's multiples, so no subset's gcd is j.
def numgcds(xs):
from math import gcd
limit = max(xs) + 1
result = 0
xsset = set(xs)
for j in range(1, limit):
g = 0
for x in range(j, limit, j):
if x in xsset:
g = gcd(x, g)
if g == j:
result += 1
break
return result
Where L is max(xs), worst-case runtime is O(L * log(L)). Across outer loop iterations, the inner loop goes around (at worst) L times at first, then L/2 times, then L/3, and so on. That sums to L*(1/1 + 1/2 + 1/3 + ... + 1/L). The second factor (the sum of reciprocals) is the L'th "harmonic number", and is approximately the natural logarithm of L.
More Gonzo
I don't really like having the runtime depend on the largest integer in the input. For example, numgcds([20000000]) takes 20 million iterations of the outer loop to determine that there's only one gcd, and can take appreciable time (about 30 seconds on my box just now).
Instead, with more code, we can build some dicts that eliminate all searching. For each divisor d of an integer in xs, d2xs[d] is the list of multiples of d in xs. The keys of d2xs are the only possible gcds we need to check, and a key's associated values are exactly (no searching needed) the multiples of the key in xs.
The collection of all possible divisors of all integers in xs can be found by factoring each integer in xs, and generating all possible combinations of its factors' prime powers.
This is harder to code, but can run very much faster. numgcds([20000000]) is essentially instant. And it runs about 10 times faster for the largish example you gave.
def gendivisors(x):
from collections import Counter
from itertools import product
from math import prod
c = Counter(factor(x))
pows = []
for p, k in c.items():
pows.append([p**i for i in range(k+1)])
for t in product(*pows):
yield prod(t)
def numgcds(xs):
from math import gcd
from collections import defaultdict
d2xs = defaultdict(list)
for x in xs:
for d in gendivisors(x):
d2xs[d].append(x)
result = 0
for j, mults in d2xs.items():
g = 0
for x in mults:
g = gcd(x, g)
if g == j:
result += 1
break
return result
I'm not including code for factor(n) - pick your favorite. The code requires it return an iterable (list, generator iterator, tuple, doesn't matter) of all n's prime factors. Order doesn't matter. As special cases, list(factor(i)) should return [i] for i equal to 0 or 1.
For ordinary cases, list(factor(p)) == [p] for a prime p, and, e.g., sorted(factor(20)) == [2, 2, 5].
Worst-case timing is much harder to nail, but the key bit is that a reasonable implementation of factor(n) will have worst-case time O(sqrt(n)).

What is the most efficient way to calculate power of larger numbers in python

I am working on a cryptography assignment and trying to decrypt an rsa cipher. There is a step in which I calculate
plaintext = (cipher number) a mod n
Where a = 39222883199635910704838468978553915708840633543073057195 and
n = 68102916241556954755526080258049854830403581949186670592
cipher number = 49158615403582362779085177062796191820833652424030845810
When I apply the plain text formula I am supposed to get a small number which will later be used. But this mod line runs for ever and I cannot get an output. For small a and n numbers it works fast. How can I make it work faster?
Here is the part of the code that has to do with my question
while True:
# Get next line from file
line = fileHandler.readline()
plaintext = pow(int(line), a) % n
pow can actually take 3 arguments for this case:
pow(int(line), a, n)
According to the docs:
... if z is present, return x to the power y, modulo z (computed more efficiently than pow(x, y) % z).
The most optimized way i found is :
pow = (a, b, mod) #where mod is you modulo
Example :
mod = 1000000007
ans = pow(2, n-1, mod)
print(ans)
This is way faster than using :
ans = ((pow(2, n-1) % 1000000007) + 1000000007) % 1000000007 #10 times slower

efficiently generating all integers within a binary mask

Suppose I have some binary mask mask. (e.g. 0b101011011101)
Is there an efficient method of computing all integers k such that k & mask == k? (where & is the bitwise AND operator) (alternatively, k & ~mask == 0)
If mask has m ones, then there are exactly 2m numbers that satisfy this property, so it seems like there should be some kind of process that is O(2m). Enumerating the integers less than the mask is wasteful (though easy to eliminate values that do not apply).
I figured it out... you can identify all the single bit patterns like as follows, since the least significant 1 bit of any integer k is cleared when calculating k & (k-1):
def onebits(x):
while x > 0:
# find least significant 1 bit
xprev = x
x &= x-1
yield x ^ xprev
and then I can use the ruler function to XOR in various combinations of 1 bits to emulate which bits of a counter are toggled each time:
def maskcount(mask):
maskbits = []
m = 0
for ls1 in onebits(mask):
m ^= ls1
maskbits.append(m)
# ruler function modified from
# http://lua-users.org/wiki/LuaCoroutinesVersusPythonGenerators
def ruler(k):
for i in range(k):
yield i
for x in ruler(i): yield x
x = 0
yield x
for k in ruler(len(maskbits)):
x ^= maskbits[k]
yield x
which looks like this:
>>> for x in maskcount(0xc05):
... print format(x, '#016b')
0b00000000000000
0b00000000000001
0b00000000000100
0b00000000000101
0b00010000000000
0b00010000000001
0b00010000000100
0b00010000000101
0b00100000000000
0b00100000000001
0b00100000000100
0b00100000000101
0b00110000000000
0b00110000000001
0b00110000000100
0b00110000000101
An easy way to solve the problem is to find the bits that are set in mask, and then simply count with i, but then replacing the bits of i with corresponding bits from the mask.
def codes(mask):
bits = filter(None, (mask & (1 << i) for i in xrange(mask.bit_length())))
for i in xrange(1 << len(bits)):
yield sum(b for j, b in enumerate(bits) if (i >> j) & 1)
print list(codes(39))
That gives you O(log(N)) work per iteration (where N is the number of bits set in mask).
It's possible to be more efficient, and do O(1) work per iteration by counting using gray codes. With gray code counting, only a single bit changes each iteration so it's possible to efficiently update the current value, v. Obviously this is much harder to understand than the simple solution above.
def codes(mask):
bits = filter(None, (mask & (1 << i) for i in xrange(mask.bit_length())))
blt = dict((1 << i, b) for i, b in enumerate(bits))
p, v = 0, 0
for i in xrange(1 << len(bits)):
n = i ^ (i >> 1)
v ^= blt.get(p^n, 0)
p = n
yield v
print list(codes(39))
A disadvantage of using gray codes is that the results are not returned in numeric order. But luckily that wasn't a condition in the question!

How to implement Frobenius pseudoprime algorithm?

Someone told me that the Frobenius pseudoprime algorithm take three times longer to run than the Miller–Rabin primality test but has seven times the resolution. So then if one where to run the former ten times and the later thirty times, both would take the same time to run, but the former would provide about 233% more analyse power. In trying to find out how to perform the test, the following paper was discovered with the algorithm at the end:
A Simple Derivation for the Frobenius Pseudoprime Test
There is an attempt at implementing the algorithm below, but the program never prints out a number. Could someone who is more familiar with the math notation or algorithm verify what is going on please?
Edit 1: The code below has corrections added, but the implementation for compute_wm_wm1 is missing. Could someone explain the recursive definition from an algorithmic standpoint? It is not "clicking" for me.
Edit 2: The erroneous code has been removed, and an implementation of the compute_wm_wm1 function has been added below. It appears to work but may require further optimization to be practical.
from random import SystemRandom
from fractions import gcd
random = SystemRandom().randrange
def find_prime_number(bits, test):
number = random((1 << bits - 1) + 1, 1 << bits, 2)
while True:
for _ in range(test):
if not frobenius_pseudoprime(number):
break
else:
return number
number += 2
def frobenius_pseudoprime(integer):
assert integer & 1 and integer >= 3
a, b, d = choose_ab(integer)
w1 = (a ** 2 * extended_gcd(b, integer)[0] - 2) % integer
m = (integer - jacobi_symbol(d, integer)) >> 1
wm, wm1 = compute_wm_wm1(w1, m, integer)
if w1 * wm != 2 * wm1 % integer:
return False
b = pow(b, (integer - 1) >> 1, integer)
return b * wm % integer == 2
def choose_ab(integer):
a, b = random(1, integer), random(1, integer)
d = a ** 2 - 4 * b
while is_square(d) or gcd(2 * d * a * b, integer) != 1:
a, b = random(1, integer), random(1, integer)
d = a ** 2 - 4 * b
return a, b, d
def is_square(integer):
if integer < 0:
return False
if integer < 2:
return True
x = integer >> 1
seen = set([x])
while x * x != integer:
x = (x + integer // x) >> 1
if x in seen:
return False
seen.add(x)
return True
def extended_gcd(n, d):
x1, x2, y1, y2 = 0, 1, 1, 0
while d:
n, (q, d) = d, divmod(n, d)
x1, x2, y1, y2 = x2 - q * x1, x1, y2 - q * y1, y1
return x2, y2
def jacobi_symbol(n, d):
j = 1
while n:
while not n & 1:
n >>= 1
if d & 7 in {3, 5}:
j = -j
n, d = d, n
if n & 3 == 3 == d & 3:
j = -j
n %= d
return j if d == 1 else 0
def compute_wm_wm1(w1, m, n):
a, b = 2, w1
for shift in range(m.bit_length() - 1, -1, -1):
if m >> shift & 1:
a, b = (a * b - w1) % n, (b * b - 2) % n
else:
a, b = (a * a - 2) % n, (a * b - w1) % n
return a, b
print('Probably prime:\n', find_prime_number(300, 10))
You seem to have misunderstood the algorithm completely due to not being familiar with the notation.
def frobenius_pseudoprime(integer):
assert integer & 1 and integer >= 3
a, b, d = choose_ab(integer)
w1 = (a ** 2 // b - 2) % integer
That comes from the line
W0 ≡ 2 (mod n) and W1 ≡ a2b−1 − 2 (mod n)
But the b-1 doesn't mean 1/b here, but the modular inverse of b modulo n, i.e. an integer c with b·c ≡ 1 (mod n). You can most easily find such a c by continued fraction expansion of b/n or, equivalently, but with slightly more computation, by the extended Euclidean algorithm. Since you're probably not familiar with continued fractions, I recommend the latter.
m = (integer - d // integer) // 2
comes from
n − (∆/n) = 2m
and misunderstands the Jacobi symbol as a fraction/division (admittedly, I have displayed it here even more like a fraction, but since the site doesn't support LaTeX rendering, we'll have to make do).
The Jacobi symbol is a generalisation of the Legendre symbol - denoted identically - which indicates whether a number is a quadratic residue modulo an odd prime (if n is a quadratic residue modulo p, i.e. there is a k with k^2 ≡ n (mod p) and n is not a multiple of p, then (n/p) = 1, if n is a multiple of p, then (n/p) = 0, otherwise (n/p) = -1). The Jacobi symbol lifts the restriction that the 'denominator' be an odd prime and allows arbitrary odd numbers as 'denominators'. Its value is the product of the Legendre symbols with the same 'numerator' for all primes dividing n (according to multiplicity). More on that, and how to compute Jacobi symbols efficiently in the linked article.
The line should correctly read
m = (integer - jacobi_symbol(d,integer)) // 2
The following lines I completely fail to understand, logically, here should follow the calculation of
Wm and Wm+1 using the recursion
W2j ≡ Wj2 − 2 (mod n)
W2j+1 ≡ WjWj+1 − W1 (mod n)
An efficient method of using that recursion to compute the required values is given around formula (11) of the PDF.
w_m0 = w1 * 2 // m % integer
w_m1 = w1 * 2 // (m + 1) % integer
w_m2 = (w_m0 * w_m1 - w1) % integer
The remainder of the function is almost correct, except of course that it now gets the wrong data due to earlier misunderstandings.
if w1 * w_m0 != 2 * w_m2:
The (in)equality here should be modulo integer, namely if (w1*w_m0 - 2*w_m2) % integer != 0.
return False
b = pow(b, (integer - 1) // 2, integer)
return b * w_m0 % integer == 2
Note, however, that if n is a prime, then
b^((n-1)/2) ≡ (b/n) (mod n)
where (b/n) is the Legendre (or Jacobi) symbol (for prime 'denominators', the Jacobi symbol is the Legendre symbol), hence b^((n-1)/2) ≡ ±1 (mod n). So you could use that as an extra check, if Wm is not 2 or n-2, n can't be prime, nor can it be if b^((n-1)/2) (mod n) is not 1 or n-1.
Probably computing b^((n-1)/2) (mod n) first and checking whether that's 1 or n-1 is a good idea, since if that check fails (that is the Euler pseudoprime test, by the way) you don't need the other, no less expensive, computations anymore, and if it succeeds, it's very likely that you need to compute it anyway.
Regarding the corrections, they seem correct, except for one that made a glitch I previously overlooked possibly worse:
if w1 * wm != 2 * wm1 % integer:
That applies the modulus only to 2 * wm1.
Concerning the recursion for the Wj, I think it is best to explain with a working implementation, first in toto for easy copy and paste:
def compute_wm_wm1(w1,m,n):
a, b = 2, w1
bits = int(log(m,2)) - 2
if bits < 0:
bits = 0
mask = 1 << bits
while mask <= m:
mask <<= 1
mask >>= 1
while mask > 0:
if (mask & m) != 0:
a, b = (a*b-w1)%n, (b*b-2)%n
else:
a, b = (a*a-2)%n, (a*b-w1)%n
mask >>= 1
return a, b
Then with explanations in between:
def compute_wm_wm1(w1,m,n):
We need the value of W1, the index of the desired number, and the number by which to take the modulus as input. The value W0 is always 2, so we don't need that as a parameter.
Call it as
wm, wm1 = compute_wm_wm1(w1,m,integer)
in frobenius_pseudoprime (aside: not a good name, most of the numbers returning True are real primes).
a, b = 2, w1
We initialise a and b to W0 and W1 respectively. At each point, a holds the value of Wj and b the value of Wj+1, where j is the value of the bits of m so far consumed. For example, with m = 13, the values of j, a and b develop as follows:
consumed remaining j a b
1101 0 w_0 w_1
1 101 1 w_1 w_2
11 01 3 w_3 w_4
110 1 6 w_6 w_7
1101 13 w_13 w_14
The bits are consumed left-to-right, so we have to find the first set bit of m and place our 'pointer' right before it
bits = int(log(m,2)) - 2
if bits < 0:
bits = 0
mask = 1 << bits
I subtracted a bit from the computed logarithm just to be entirely sure that we don't get fooled by a floating point error (by the way, using log limits you to numbers of at most 1024 bits, about 308 decimal digits; if you want to treat larger numbers, you have to find the base-2 logarithm of m in a different way, using log was the simplest way, and it's just a proof of concept, so I used that here).
while mask <= m:
mask <<= 1
Shift the mask until it's greater than m,so the set bit points just before m's first set bit. Then shift one position back, so we point at the bit.
mask >>= 1
while mask > 0:
if (mask & m) != 0:
a, b = (a*b-w1)%n, (b*b-2)%n
If the next bit is set, the value of the initial portion of consumed bits of m goes from j to 2*j+1, so the next values of the W sequence we need are W2j+1 for a and W2j+2 for b. By the above recursion formula,
W_{2j+1} = W_j * W_{j+1} - W_1 (mod n)
W_{2j+2} = W_{j+1}^2 - 2 (mod n)
Since a was Wj and b was Wj+1, a becomes (a*b - W_1) % n and b becomes (b * b - 2) % n.
else:
a, b = (a*a-2)%n, (a*b-w1)%n
If the next bit is not set, the value of the initial portion of consumed bits of m goes from j to 2*j, so a becomes W2j = (Wj2 - 2) (mod n), and b becomes
W2j+1 = (Wj * Wj+1 - W1) (mod n).
mask >>= 1
Move the pointer to the next bit. When we have moved past the final bit, mask becomes 0 and the loop ends. The initial portion of consumed bits of m is now all of m's bits, so the value is of course m.
Then we can
return a, b
Some additional remarks:
def find_prime_number(bits, test):
while True:
number = random(3, 1 << bits, 2)
for _ in range(test):
if not frobenius_pseudoprime(number):
break
else:
return number
Primes are not too frequent among the larger numbers, so just picking random numbers is likely to take a lot of attempts to hit one. You will probably find a prime (or probable prime) faster if you pick one random number and check candidates in order.
Another point is that such a test as the Frobenius test is disproportionally expensive to find that e.g. a multiple of 3 is composite. Before using such a test (or a Miller-Rabin test, or a Lucas test, or an Euler test, ...), you should definitely do a bit of trial division to weed out most of the composites and do the work only where it has a fighting chance of being worth it.
Oh, and the is_square function isn't prepared to deal with arguments less than 2, divide-by-zero errors lurk there,
def is_square(integer):
if integer < 0:
return False
if integer < 2:
return True
x = integer // 2
should help.

Python 3 integer division. How to make math operators consistent with C

I need to port quite a few formulas from C to Python and vice versa. What is the best way to make sure that nothing breaks in the process?
I am primarily worried about automatic int/int = float conversions.
You could use the // operator. It performs an integer division, but it's not quite what you'd expect from C:
A quote from here:
The // operator performs a quirky kind of integer division. When the
result is positive, you can think of
it as truncating (not rounding) to 0
decimal places, but be careful with
that.
When integer-dividing negative numbers, the // operator rounds “up”
to the nearest integer. Mathematically
speaking, it’s rounding “down” since
−6 is less than −5, but it could trip
you up if you were expecting it to
truncate to −5.
For example, -11 // 2 in Python returns -6, where -11 / 2 in C returns -5.
I'd suggest writing and thoroughly unit-testing a custom integer division function that "emulates" C behaviour.
The page I linked above also has a link to PEP 238 which has some interesting background information about division and the changes from Python 2 to 3. There are some suggestions about what to use for integer division, like divmod(x, y)[0] and int(x/y) for positive numbers, perhaps you'll find more useful things there.
In C:
-11/2 = -5
In Python:
-11/2 = -5.5
And also in Python:
-11//2 = -6
To achieve C-like behaviour, write int(-11/2) in Python. This will evaluate to -5.
Some ways to compute integer division with C semantics are as follows:
def div_c0(a, b):
if (a >= 0) != (b >= 0) and a % b:
return a // b + 1
else:
return a // b
def div_c1(a, b):
q, r = a // b, a % b
if (a >= 0) != (b >= 0) and r:
return q + 1
else:
return q
def div_c2(a, b):
q, r = divmod(a, b)
if (a >= 0) != (b >= 0) and r:
return q + 1
else:
return q
def mod_c(a, b):
return (a % b if b >= 0 else a % -b) if a >= 0 else (-(-a % b) if b >= 0 else a % b)
def div_c3(a, b):
r = mod_c(a, b)
return (a - r) // b
With timings:
import itertools
n = 100
l = [x for x in range(-n, n + 1)]
ll = [(a, b) for a, b in itertools.product(l, repeat=2) if b]
funcs = div_c0, div_c1, div_c2, div_c3
for func in funcs:
correct = all(func(a, b) == funcs[0](a, b) for a, b in ll)
print(f"{func.__name__} correct:{correct} ", end="")
%timeit [func(a, b) for a, b in ll]
# div_c0 correct:True 100 loops, best of 5: 10.3 ms per loop
# div_c1 correct:True 100 loops, best of 5: 11.5 ms per loop
# div_c2 correct:True 100 loops, best of 5: 13.2 ms per loop
# div_c3 correct:True 100 loops, best of 5: 15.4 ms per loop
Indicating the first approach to be the fastest.
For implementing C's % using Python, see here.
In the opposite direction:
Since Python 3 divmod (or //) integer division requires the remainder to have the same sign as divisor at non-zero remainder case, it's inconsistent with many other languages (quote from 1.4. Integer Arithmetic).
To have your "C-like" result same as Python, you should compare the remainder result with divisor (suggestion: by xor on sign bits equals to 1, or multiplication with negative result), and in case it's different, add the divisor to the remainder, and subtract 1 from the quotient.
// Python Divmod requires a remainder with the same sign as the divisor for
// a non-zero remainder
// Assuming isPyCompatible is a flag to distinguish C/Python mode
isPyCompatible *= (int)remainder;
if (isPyCompatible)
{
int32_t xorRes = remainder ^ divisor;
int32_t andRes = xorRes & ((int32_t)((uint32_t)1<<31));
if (andRes)
{
remainder += divisor;
quotient -= 1;
}
}
(Credit to Gawarkiewicz M. for pointing this out.)
You will need to know what the formula does, and understand both the C implementation and how to implement it in Python. But unless you are doing integer maths it should be quite similar, and if you are doing integer maths, the question is why. :)
Integer maths are either done because of some specific purpose, often related to computers, or because it's faster than floats when doing massive computations, like Fractint does for fractals, and in that case Python is usually not the right choice. ;)

Resources