Brocard's problem asks for integer solutions of n! + 1 = m^2. The solutions are pairs of integers (n, m) called Brown numbers, such as (4, 5); only three such pairs are known.
A very literal implementation of Brocard's problem:
import math

def brocard(n, m):
    if math.factorial(n) + 1 == m**2:
        return (n, m)
    else:
        return

a = 10000
for n in range(a):
    for m in range(a):
        b = brocard(n, m)
        if b is not None:
            print(b)
The time complexity of this should be O(n^2) because of the nested for loops over two different variables, plus the complexity of whatever algorithm math.factorial uses (apparently divide-and-conquer). Is there any way to improve upon O(n^2)?
There are other interpretations on SO like this. How does the time complexity of this compare with my implementation?
Your algorithm is O(n^3).
You have two nested loops, and inside you use factorial(), having O(n) complexity itself.
Your algorithm tests all (n,m) combinations, even those where factorial(n) and m^2 are far apart, e.g. n=1 and m=10000.
You always recompute the factorial(n) deep inside the loop, although it's independent of the inner loop variable m. So, it could be moved outside of the inner loop.
And, instead of always computing factorial(n) from scratch, you could do that incrementally. Whenever you increment n by 1, you can multiply the previous factorial by n.
A different, better approach would be not to use nested loops, but to always keep n and m in a number range so that factorial(n) is close to m^2, to avoid checking number pairs that are vastly off. We can do this by deciding which variable to increment next. If the factorial is smaller, then the next brocard pair needs a bigger n. If the square is smaller, we need a bigger m.
In pseudo code, that would be
n = 1; m = 1; factorial = 1;
while n < 10000 and m < 10000
    if factorial + 1 == m^2
        found a brocard pair
        // the next brocard pair will have different n and m,
        // so we can increment both
        n = n + 1
        factorial = factorial * n
        m = m + 1
    else if factorial + 1 < m^2
        // n is too small for the current m
        n = n + 1
        factorial = factorial * n
    else
        // m is too small for the given n
        m = m + 1
In each loop iteration we increment n, m, or both, so there are at most 20000 iterations. There is no inner loop in the algorithm, so the number of iterations is O(n). So, this should be fast enough for n and m up to the millions range.
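For reference, here is a minimal Python sketch of the pseudo code above. The function name brocard_pairs and the two limit parameters are my own choices for illustration, not something from the question.

```python
def brocard_pairs(limit_n, limit_m):
    """Return all (n, m) with n! + 1 == m**2, for n < limit_n and m < limit_m."""
    pairs = []
    n, m, factorial = 1, 1, 1
    while n < limit_n and m < limit_m:
        target = factorial + 1
        square = m * m
        if target == square:
            pairs.append((n, m))
            # the next pair must have a different n and m
            n += 1
            factorial *= n
            m += 1
        elif target < square:
            # n is too small for the current m
            n += 1
            factorial *= n
        else:
            # m is too small for the current n
            m += 1
    return pairs


print(brocard_pairs(10000, 10000))  # [(4, 5), (5, 11), (7, 71)]
```

Because m is capped, n never grows beyond a handful of steps here, so the factorial stays small and the loop finishes almost instantly.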
P.S. There are still some optimizations possible.
Factorials (after n=1, known to have no brocard pair) are always even numbers, so m^2 must be odd to satisfy the brocard condition, meaning that we can always increment m by 2, skipping the even number in between.
For larger n values, the factorial increases much faster than the square. So, instead of incrementing m until its square reaches the factorial+1 value, we could recompute the next plausible m as integer square root of factorial+1.
Or, using the square root approach, just compute the integer square root of factorial(n), and check if it matches, without any incremental steps for m.
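The last variant (no incremental steps for m at all) could look roughly like this. It is only a sketch: the name brocard_pairs_isqrt is made up, and it assumes Python 3.8+ for math.isqrt.

```python
from math import isqrt


def brocard_pairs_isqrt(limit_n):
    """Return all (n, m) with n! + 1 == m**2 for 1 <= n < limit_n."""
    pairs = []
    factorial = 1
    for n in range(1, limit_n):
        factorial *= n              # incremental factorial, no recomputation
        m = isqrt(factorial + 1)    # exact integer square root (floor)
        if m * m == factorial + 1:
            pairs.append((n, m))
    return pairs


print(brocard_pairs_isqrt(50))  # [(4, 5), (5, 11), (7, 71)]
```

Here the loop runs over n only, so the cost is dominated by the big-integer multiplication and square root for each n.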
Given a set of non-negative distinct integers, and a value m, determine if there is a subset of the given set with sum divisible by m.
The solution on geeksforgeeks states that-
If n > m there will always be a subset with sum divisible by m (which is easy to prove with pigeonhole principle). So we need to handle only cases of n <= m.
Can somebody please explain what this case means and what is its relation to pigeonhole principle? Also how is this case different from n <= m?
Here is a more verbose version of that argument:
Label the numbers a1, a2, ... an in any order. Now consider the sums:
b1=a1
b2=a1+a2
b3=a1+a2+a3
...
bn=a1+a2+...+an
These are either all distinct numbers, or one of the a's is 0 (and 0 is divisible by m).
Now if any of the b's is divisible by m we are done.
Otherwise:
The remainder of a number that is not divisible by m lies in the range 1...(m-1), so there are m-1 possible remainders.
Since none of b1...bn is divisible by m, each of them has a remainder in the range 1...(m-1). So you must place n numbers (the b's, our pigeons) into m-1 remainders (the pigeonholes).
Since n > m > m-1, we have more pigeons than pigeonholes => there must be at least two pigeons in the same pigeonhole.
That means: there must be at least two b's with the same remainder: call them bi, bj (i<j). Since all of our b's are distinct and bi % m == bj % m (the remainders of bi/m and bj/m are the same) => bj - bi = x * m (where x is a positive integer). Therefore bj - bi is divisible by m, and bj - bi = ai+1 + ... + aj. Therefore ai+1 + ... + aj is divisible by m, which is exactly what we wanted to prove.
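The argument is constructive, so it can be turned into a small search. The sketch below is my own illustration (the name divisible_run is hypothetical): it tracks the remainder of each prefix sum and stops as soon as a remainder repeats. Treating the empty prefix as remainder 0 covers the "some b is divisible by m" case as well.

```python
def divisible_run(values, m):
    """Return (i, j) with sum(values[i:j]) % m == 0 and i < j, or None."""
    first_seen = {0: 0}   # remainder -> length of the first prefix that has it
    prefix = 0
    for j, v in enumerate(values, start=1):
        prefix = (prefix + v) % m
        if prefix in first_seen:
            # two prefixes share a remainder, so their difference is divisible by m
            return first_seen[prefix], j
        first_seen[prefix] = j
    return None


print(divisible_run([3, 1, 4, 1, 5], 5))  # (1, 3): 1 + 4 = 5
```

Note that this only looks at contiguous runs. When n >= m that is enough to guarantee a hit, but for n < m a result of None does not prove that no subset works: for values [1, 2, 3] and m = 4 it returns None although {1, 3} sums to 4.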
Let us create a new array a[ ] holding the prefix sums of the given values (the value[ ] array).
a[0] = value[0]
a[1] = value[0] + value[1]
...
a[n-1] = value[0] + value[1] + ... + value[n-1]
Now we have n new numbers. If any of them is divisible by m we are done.
Otherwise, dividing the elements of a[ ] by m can only leave remainders in the range [1, m - 1], i.e. m - 1 possible values.
But we have a total of n > m - 1 values.
So, by the pigeonhole principle, there exist two indices 0 <= j < i <= n-1 such that a[i] mod m == a[j] mod m.
Due to the above statement, we can say that a[i] - a[j] is divisible by m.
We also know that a[i] = value[i] + value[i - 1] + ... + value[0] and a[j] = value[j] + value[j - 1] + ... + value[0].
So, a[i] - a[j] = value[i] + value[i - 1] + ... + value[j + 1] is also divisible by m.
We're doing the classic problem of determining the number of ways that we can make change that amounts to Z given a set of coins.
For example, Amount=5 and Coins={1, 2, 3}. One way we can make 5 is {2, 3}.
The naive recursive solution has a time complexity of factorial time.
f(n) = n * f(n-1) = n!
My professor argued that it actually has a time complexity of O(2^n), because we only choose to use a coin or not. That intuitively makes sense. However, how come my recurrence doesn't work out to be O(2^n)?
EDIT:
My recurrence is as follows:
              f(5, {1, 2, 3})
             /              \            .....
    f(4, {2, 3})        f(3, {1, 3})     .....
Notice how the branching factor decreases by 1 at every step.
Formally,
T(n) = n*T(n-1) = n!
The recurrence doesn't work out to what you expect it to work out to because it doesn't reflect the number of operations made by the algorithm.
If the algorithm decides for each coin whether to output it or not, then you can model its time complexity with the recurrence T(n) = 2*T(n-1) + O(1) with T(1)=O(1); the intuition is that for each coin you have two options---output the coin or not; this obviously solves to T(n)=O(2^n).
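To make the 2^n count concrete, here is a small sketch of the "use the coin or not" recursion with a call counter. It assumes each coin may be used at most once (as in the {2, 3} example from the question); the name count_ways and the counter argument are mine, just for illustration.

```python
def count_ways(coins, amount, i=0, counter=None):
    """Count subsets of coins[i:] that sum to amount (each coin used at most once)."""
    if counter is not None:
        counter[0] += 1                      # one more node in the recursion tree
    if i == len(coins):
        return 1 if amount == 0 else 0
    skip = count_ways(coins, amount, i + 1, counter)
    take = count_ways(coins, amount - coins[i], i + 1, counter)
    return skip + take


counter = [0]
ways = count_ways([1, 2, 3], 5, counter=counter)
print(ways, counter[0])  # 1 15
```

With n = 3 coins and no pruning the tree is a full binary tree of depth 3, i.e. 2^(n+1) - 1 = 15 calls, which is O(2^n).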
I too was trying to analyze the time complexity for the brute force which performs depth first search:
def countCombinations(coins, n, amount, k=0):
    if amount == 0:
        return 1
    res = 0
    for i in range(k, n):
        if coins[i] <= amount:  # check the coin we are about to use
            remaining_amount = amount - coins[i]  # considering this coin, try for remaining sum
            # in next round include this coin too
            res += countCombinations(coins, n, remaining_amount, i)
    return res
But we can see that a coin used in one round can be used again in the next round, so (at least along the first coin's path) we have up to n choices at each stage. This is equivalent to permutation with repetition: n^r for n items arranged into r positions.
ex: coins = [1, 1, 1, 1]; sum = 4
This generates a recursion tree where, along the first path, every diverging subpath is a solution until sum = 0. Each stage on the way to the sum has up to n different subpaths, and there are at most sum stages (when the smallest coin is 1), so the time complexity is O(n^sum).
Note, however, that there is another algorithm which uses the take/not-take approach and has at most 2 branches per node in the recursion tree. A (loose) upper bound on the time complexity of this algorithm is O(2^(n*m)), where m is the target sum.
ex: say coins = [1, 1] and sum = 2: there are 11 nodes to visit in the recursion tree, with 6 paths (leaves), and the bound is 2^(2*2) = 2^4 = 16. So visiting 11 nodes against a maximum of 16 is consistent, although the upper bound is a little loose.
def get_count(coins, n, sum):
    if(n == 0):  # no coins left, to try a combination that matches the sum
        return 0
    if(sum == 0):  # no more sum left to match, means that we have completely co-incided with our trial
        return 1  # (return success)
    # don't-include the last coin in the sum calc so, leave it and try rest
    excluded = get_count(coins, n-1, sum)
    included = 0
    if(coins[n-1] <= sum):
        # include the last coin in the sum calc, so reduce by its quantity in the sum
        # we assume here that n is constant ie, it is supplied in unlimited(we can choose same coin again and again),
        included = get_count(coins, n, sum-coins[n-1])
    return included+excluded
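The node and leaf counts quoted for coins = [1, 1], sum = 2 can be checked with an instrumented copy of get_count. This is just a sketch; the name get_count_traced and the stats dictionary are my additions.

```python
def get_count_traced(coins, n, total, stats):
    """Same recursion as get_count, but counts visited nodes and leaves."""
    stats["nodes"] += 1
    if n == 0:
        stats["leaves"] += 1
        return 0
    if total == 0:
        stats["leaves"] += 1
        return 1
    excluded = get_count_traced(coins, n - 1, total, stats)
    included = 0
    if coins[n - 1] <= total:
        included = get_count_traced(coins, n, total - coins[n - 1], stats)
    return included + excluded


stats = {"nodes": 0, "leaves": 0}
ways = get_count_traced([1, 1], 2, 2, stats)
print(ways, stats)  # 3 {'nodes': 11, 'leaves': 6}
```

The result 3 treats the two 1-coins as different denominations (first+first, first+second, second+second).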
I am trying to understand the implementation of the linear-time suffix array construction algorithm by Kärkkäinen and Sanders. Details of the algorithm can be found here.
I managed to understand the overall concept, but I am failing to match it with the provided implementation and hence am not able to grasp it clearly.
Here are the initial code paths which are confusing me.
As per paper : n0, n1, n2 represent the number of triplets starting at i mod 3 = (0,1,2)
As per code : n0 = (n + 2) / 3, n1 = (n + 1) / 3, n2 = n / 3; => How have these initialisations been derived?
As per paper : We need to create T` which is the concatenation of triplets at i mod 3 != 0
As per code : n02 = n0 + n2; s12 = [n02] ==> Where does n02 come from? It should be n12, i.e. n1 + n2.
As per code : for (int i = 0, j = 0; i < n + (n0 - n1); i++) fill s12 with triplets such that i%3 != 0; => Why does the for loop run n + (n0 - n1) times? Shouldn't it simply be n1 + n2?
I am not able to proceed because of these :( Please help.
Consider the following example where the length of the input is n=13:
STA | CKO | WER | FLO | W
As per code : n0 = (n + 2) / 3, n1 = (n + 1) / 3, n2 = n / 3; => How have these initialisations been derived?
Note that the number of triplets i mod3 = 0 is n/3 if n mod3 = 0 and n/3+1 otherwise (if n mod3 = 1 or n mod3 = 2). In the current example n/3 = 4 but since the last triplet 'W' is not complete it is not counted in the integer division. A 'trick' to make this computation directly is to use (n+2)/3. Effectively, if n mod3 = 0 then the result of the integer divisions (n+2)/3 and n/3 will be the same. However, if n mod3 = 1 or 2 then the result of (n+2)/3 will be n/3+1. The same applies to n1 and n2.
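If you want to convince yourself of the rounding trick, a quick brute-force comparison does it. This is only an illustrative sketch in Python (the original implementation is not Python, and the name triplet_counts is mine).

```python
def triplet_counts(n):
    """n0, n1, n2 as computed in the code: positions i < n with i mod 3 = 0, 1, 2."""
    return (n + 2) // 3, (n + 1) // 3, n // 3


# compare against simply counting the positions
for n in range(50):
    brute = tuple(sum(1 for i in range(n) if i % 3 == r) for r in range(3))
    assert triplet_counts(n) == brute

print(triplet_counts(13))  # (5, 4, 4)
```

For the n = 13 example this gives n0 = 5 (positions 0, 3, 6, 9, 12) and n1 = n2 = 4.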
As per code : n02 = n0 + n2; s12 = [n02] ==> Where does n02 come from? It should be n12, i.e. n1 + n2.
As per code : for (int i = 0, j = 0; i < n + (n0 - n1); i++) fill s12 with triplets such that i%3 != 0; => Why does the for loop run n + (n0 - n1) times? Shouldn't it simply be n1 + n2?
Both questions have the same answer. In our example we'd have a B12 buffer like this:
B12 = B1 U B2 = {TA KO ER LO}
So you'd first sort the suffixes and end up with a suffix array of B12, which has 8 elements. To proceed to the merging step we first need to compute the suffix array of B0, which is obtained by sorting the tuples (B0(i),rank(i+1))... But this concrete case in which the last triplet has only one element (W) has a problem, because rank(i+1) is not defined for the last element of B0:
B0 = {0,3,6,9,12}
which sorted alphabetically results in
SA0 = {3, 9, 0, ?, ?}
Since the indices 6 and 12 contain a 'W', it is not enough to sort alphabetically, we need to check which goes first in the rank table, so let's check the rank of their suffixes.. oh, wait! rank(13) is not defined!
And that's why we add a dummy 0 to the last triplet of the input when the last triplet only contains one element (if n mod3 = 1). So then the size of B12 is n0+n2, no matter the size of n1, and one needs to add an extra element to B12 if B0 is larger than B1 (in which case n0-n1 = 1).
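To see that the loop bound n + (n0 - n1) always yields exactly n0 + n2 positions, here is a small Python sketch of just the index selection (not the real implementation; s12_positions is a made-up name).

```python
def s12_positions(n):
    """Positions i with i mod 3 != 0 visited by the loop 'i < n + (n0 - n1)'."""
    n0, n1, n2 = (n + 2) // 3, (n + 1) // 3, n // 3
    positions = [i for i in range(n + (n0 - n1)) if i % 3 != 0]
    return positions, n0 + n2


positions, n02 = s12_positions(13)
print(positions, n02)  # [1, 2, 4, 5, 7, 8, 10, 11, 13] 9
```

For n = 13 the extra position 13 is the dummy suffix; when n mod 3 is 0 or 2 we have n0 = n1, so no extra position is added and n0 + n2 equals n1 + n2.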
Hope it was clear.
Someone told me that the Frobenius pseudoprime algorithm takes three times longer to run than the Miller–Rabin primality test but has seven times the resolution. So if one were to run the former ten times and the latter thirty times, both would take the same time to run, but the former would provide about 2.33 times the analytical power (70 versus 30). In trying to find out how to perform the test, I found the following paper, with the algorithm at the end:
A Simple Derivation for the Frobenius Pseudoprime Test
There is an attempt at implementing the algorithm below, but the program never prints out a number. Could someone who is more familiar with the math notation or algorithm verify what is going on please?
Edit 1: The code below has corrections added, but the implementation for compute_wm_wm1 is missing. Could someone explain the recursive definition from an algorithmic standpoint? It is not "clicking" for me.
Edit 2: The erroneous code has been removed, and an implementation of the compute_wm_wm1 function has been added below. It appears to work but may require further optimization to be practical.
from random import SystemRandom
from math import gcd  # fractions.gcd was removed in Python 3.9

random = SystemRandom().randrange

def find_prime_number(bits, test):
    # start from a random odd number with the top bit set
    number = random((1 << bits - 1) + 1, 1 << bits, 2)
    while True:
        for _ in range(test):
            if not frobenius_pseudoprime(number):
                break
        else:
            return number
        number += 2
def frobenius_pseudoprime(integer):
    assert integer & 1 and integer >= 3
    a, b, d = choose_ab(integer)
    # extended_gcd(b, integer)[0] serves as the modular inverse of b
    w1 = (a ** 2 * extended_gcd(b, integer)[0] - 2) % integer
    m = (integer - jacobi_symbol(d, integer)) >> 1
    wm, wm1 = compute_wm_wm1(w1, m, integer)
    # compare both sides modulo integer
    if (w1 * wm - 2 * wm1) % integer != 0:
        return False
    b = pow(b, (integer - 1) >> 1, integer)
    return b * wm % integer == 2

def choose_ab(integer):
    a, b = random(1, integer), random(1, integer)
    d = a ** 2 - 4 * b
    while is_square(d) or gcd(2 * d * a * b, integer) != 1:
        a, b = random(1, integer), random(1, integer)
        d = a ** 2 - 4 * b
    return a, b, d

def is_square(integer):
    if integer < 0:
        return False
    if integer < 2:
        return True
    x = integer >> 1
    seen = set([x])
    while x * x != integer:
        x = (x + integer // x) >> 1
        if x in seen:
            return False
        seen.add(x)
    return True

def extended_gcd(n, d):
    x1, x2, y1, y2 = 0, 1, 1, 0
    while d:
        n, (q, d) = d, divmod(n, d)
        x1, x2, y1, y2 = x2 - q * x1, x1, y2 - q * y1, y1
    return x2, y2

def jacobi_symbol(n, d):
    n %= d  # reduce first so that a negative n (e.g. a negative discriminant) is handled
    j = 1
    while n:
        while not n & 1:
            n >>= 1
            if d & 7 in {3, 5}:
                j = -j
        n, d = d, n
        if n & 3 == 3 == d & 3:
            j = -j
        n %= d
    return j if d == 1 else 0

def compute_wm_wm1(w1, m, n):
    a, b = 2, w1
    for shift in range(m.bit_length() - 1, -1, -1):
        if m >> shift & 1:
            a, b = (a * b - w1) % n, (b * b - 2) % n
        else:
            a, b = (a * a - 2) % n, (a * b - w1) % n
    return a, b

print('Probably prime:\n', find_prime_number(300, 10))
You seem to have misunderstood the algorithm completely due to not being familiar with the notation.
def frobenius_pseudoprime(integer):
    assert integer & 1 and integer >= 3
    a, b, d = choose_ab(integer)
    w1 = (a ** 2 // b - 2) % integer
That comes from the line
W_0 ≡ 2 (mod n) and W_1 ≡ a^2 * b^(-1) − 2 (mod n)
But the b^(-1) doesn't mean 1/b here, but the modular inverse of b modulo n, i.e. an integer c with b·c ≡ 1 (mod n). You can most easily find such a c by continued fraction expansion of b/n or, equivalently, but with slightly more computation, by the extended Euclidean algorithm. Since you're probably not familiar with continued fractions, I recommend the latter.
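As a rough sketch of the extended Euclidean route (the name mod_inverse is my own, not from the paper or the question):

```python
def mod_inverse(b, n):
    """Return c with (b * c) % n == 1, via the extended Euclidean algorithm."""
    old_r, r = b % n, n
    old_s, s = 1, 0          # invariant: old_s * b ≡ old_r and s * b ≡ r (mod n)
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("b has no inverse modulo n")
    return old_s % n


# W_1 for a = 4, b = 3, n = 7: the inverse of 3 mod 7 is 5
print(mod_inverse(3, 7), (4 ** 2 * mod_inverse(3, 7) - 2) % 7)  # 5 1
```

On Python 3.8 and later the built-in pow(b, -1, n) computes the same inverse, so the helper is mainly there to show what is going on.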
m = (integer - d // integer) // 2
comes from
n − (∆/n) = 2m
and misunderstands the Jacobi symbol as a fraction/division (admittedly, I have displayed it here even more like a fraction, but since the site doesn't support LaTeX rendering, we'll have to make do).
The Jacobi symbol is a generalisation of the Legendre symbol - denoted identically - which indicates whether a number is a quadratic residue modulo an odd prime (if n is a quadratic residue modulo p, i.e. there is a k with k^2 ≡ n (mod p) and n is not a multiple of p, then (n/p) = 1, if n is a multiple of p, then (n/p) = 0, otherwise (n/p) = -1). The Jacobi symbol lifts the restriction that the 'denominator' be an odd prime and allows arbitrary odd numbers as 'denominators'. Its value is the product of the Legendre symbols with the same 'numerator' for all primes dividing the 'denominator' (according to multiplicity). More on that, and how to compute Jacobi symbols efficiently in the linked article.
The line should correctly read
m = (integer - jacobi_symbol(d,integer)) // 2
I completely fail to understand the following lines. Logically, the calculation of W_m and W_{m+1} should follow here, using the recursion
W_{2j} ≡ W_j^2 − 2 (mod n)
W_{2j+1} ≡ W_j * W_{j+1} − W_1 (mod n)
An efficient method of using that recursion to compute the required values is given around formula (11) of the PDF.
w_m0 = w1 * 2 // m % integer
w_m1 = w1 * 2 // (m + 1) % integer
w_m2 = (w_m0 * w_m1 - w1) % integer
The remainder of the function is almost correct, except of course that it now gets the wrong data due to earlier misunderstandings.
if w1 * w_m0 != 2 * w_m2:
The (in)equality here should be modulo integer, namely if (w1*w_m0 - 2*w_m2) % integer != 0.
return False
b = pow(b, (integer - 1) // 2, integer)
return b * w_m0 % integer == 2
Note, however, that if n is a prime, then
b^((n-1)/2) ≡ (b/n) (mod n)
where (b/n) is the Legendre (or Jacobi) symbol (for prime 'denominators', the Jacobi symbol is the Legendre symbol), hence b^((n-1)/2) ≡ ±1 (mod n). So you could use that as an extra check: if W_m is not 2 or n-2, n can't be prime, nor can it be if b^((n-1)/2) (mod n) is not 1 or n-1.
Probably computing b^((n-1)/2) (mod n) first and checking whether that's 1 or n-1 is a good idea, since if that check fails (that is the Euler pseudoprime test, by the way) you don't need the other, no less expensive, computations anymore, and if it succeeds, it's very likely that you need to compute it anyway.
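A minimal sketch of that preliminary check (the function name passes_euler_check is only for illustration; it checks for ±1 and does not compare against the Jacobi symbol):

```python
def passes_euler_check(b, n):
    """True if b^((n-1)/2) mod n is 1 or n-1, as it must be for an odd prime n not dividing b."""
    r = pow(b, (n - 1) // 2, n)
    return r == 1 or r == n - 1


print(passes_euler_check(2, 13))   # True:  2^6 = 64 ≡ 12 = n - 1 (mod 13)
print(passes_euler_check(2, 15))   # False: 2^7 = 128 ≡ 8 (mod 15)
```

Passing the check proves nothing by itself (561 passes for b = 2 although 561 = 3 * 11 * 17), but failing it means n is certainly composite, so the more expensive W_m computation can be skipped.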
Regarding the corrections, they seem correct, except for one that made a glitch I previously overlooked possibly worse:
if w1 * wm != 2 * wm1 % integer:
That applies the modulus only to 2 * wm1.
Concerning the recursion for the Wj, I think it is best to explain with a working implementation, first in toto for easy copy and paste:
from math import log

def compute_wm_wm1(w1,m,n):
    a, b = 2, w1
    bits = int(log(m,2)) - 2
    if bits < 0:
        bits = 0
    mask = 1 << bits
    while mask <= m:
        mask <<= 1
    mask >>= 1
    while mask > 0:
        if (mask & m) != 0:
            a, b = (a*b-w1)%n, (b*b-2)%n
        else:
            a, b = (a*a-2)%n, (a*b-w1)%n
        mask >>= 1
    return a, b
Then with explanations in between:
def compute_wm_wm1(w1,m,n):
We need the value of W1, the index of the desired number, and the number by which to take the modulus as input. The value W0 is always 2, so we don't need that as a parameter.
Call it as
wm, wm1 = compute_wm_wm1(w1,m,integer)
in frobenius_pseudoprime (aside: not a good name, most of the numbers returning True are real primes).
    a, b = 2, w1
We initialise a and b to W_0 and W_1 respectively. At each point, a holds the value of W_j and b the value of W_{j+1}, where j is the value of the bits of m so far consumed. For example, with m = 13, the values of j, a and b develop as follows:
consumed  remaining   j    a      b
          1101        0    w_0    w_1
1         101         1    w_1    w_2
11        01          3    w_3    w_4
110       1           6    w_6    w_7
1101                  13   w_13   w_14
The bits are consumed left-to-right, so we have to find the first set bit of m and place our 'pointer' right before it
    bits = int(log(m,2)) - 2
    if bits < 0:
        bits = 0
    mask = 1 << bits
I subtracted a bit from the computed logarithm just to be entirely sure that we don't get fooled by a floating point error. (Using log was simply the easiest way to locate the top bit for a proof of concept; since it goes through floating point, a more robust choice for very large m is an exact integer method such as m.bit_length().)
    while mask <= m:
        mask <<= 1
Shift the mask until it's greater than m, so the set bit points just before m's first set bit. Then shift one position back, so we point at the bit.
    mask >>= 1
    while mask > 0:
        if (mask & m) != 0:
            a, b = (a*b-w1)%n, (b*b-2)%n
If the next bit is set, the value of the initial portion of consumed bits of m goes from j to 2*j+1, so the next values of the W sequence we need are W_{2j+1} for a and W_{2j+2} for b. By the above recursion formula,
W_{2j+1} = W_j * W_{j+1} - W_1 (mod n)
W_{2j+2} = W_{j+1}^2 - 2 (mod n)
Since a was W_j and b was W_{j+1}, a becomes (a*b - W_1) % n and b becomes (b * b - 2) % n.
        else:
            a, b = (a*a-2)%n, (a*b-w1)%n
If the next bit is not set, the value of the initial portion of consumed bits of m goes from j to 2*j, so a becomes W_{2j} = (W_j^2 - 2) (mod n), and b becomes
W_{2j+1} = (W_j * W_{j+1} - W_1) (mod n).
        mask >>= 1
Move the pointer to the next bit. When we have moved past the final bit, mask becomes 0 and the loop ends. The initial portion of consumed bits of m is now all of m's bits, so the value is of course m.
Then we can
    return a, b
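One way to sanity-check the bit-by-bit computation is to compare it with a slow step-by-step computation. The sketch below assumes that the W sequence also satisfies the linear recurrence W_{j+1} = W_1 * W_j - W_{j-1} (mod n); I am stating that as an assumption, although it is consistent with the two doubling formulas above. The name naive_wm_wm1 is made up, and compute_wm_wm1 is the bit_length variant from the question.

```python
def compute_wm_wm1(w1, m, n):
    """Return (W_m, W_{m+1}) mod n by consuming the bits of m left to right."""
    a, b = 2, w1
    for shift in range(m.bit_length() - 1, -1, -1):
        if (m >> shift) & 1:
            a, b = (a * b - w1) % n, (b * b - 2) % n
        else:
            a, b = (a * a - 2) % n, (a * b - w1) % n
    return a, b


def naive_wm_wm1(w1, m, n):
    """Same values, one index at a time (assumed linear recurrence)."""
    prev, cur = 2 % n, w1 % n
    for _ in range(m):
        prev, cur = cur, (w1 * cur - prev) % n
    return prev, cur


print(compute_wm_wm1(5, 13, 101) == naive_wm_wm1(5, 13, 101))  # True
```

The naive version takes m steps, while the bit-by-bit version takes about log2(m) steps, which is what makes the test usable for numbers with hundreds of bits.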
Some additional remarks:
def find_prime_number(bits, test):
    while True:
        number = random(3, 1 << bits, 2)
        for _ in range(test):
            if not frobenius_pseudoprime(number):
                break
        else:
            return number
Primes are not too frequent among the larger numbers, so just picking random numbers is likely to take a lot of attempts to hit one. You will probably find a prime (or probable prime) faster if you pick one random number and check candidates in order.
Another point is that such a test as the Frobenius test is disproportionally expensive to find that e.g. a multiple of 3 is composite. Before using such a test (or a Miller-Rabin test, or a Lucas test, or an Euler test, ...), you should definitely do a bit of trial division to weed out most of the composites and do the work only where it has a fighting chance of being worth it.
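A trial-division pre-filter could be as simple as the following sketch. The list of small primes and the name survives_trial_division are arbitrary choices of mine; how far to go with trial division is a tuning question.

```python
SMALL_PRIMES = (3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47)


def survives_trial_division(number):
    """False if an odd number has a small prime factor (other than itself)."""
    for p in SMALL_PRIMES:
        if number % p == 0:
            return number == p
    return True


print([k for k in range(3, 60, 2) if survives_trial_division(k)])
# [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59]
```

Only candidates for which this returns True would then be handed to frobenius_pseudoprime. A survivor is not necessarily prime (53 * 53 = 2809 survives, for instance), it is just worth the expensive test.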
Oh, and the is_square function isn't prepared to deal with arguments less than 2, divide-by-zero errors lurk there,
def is_square(integer):
    if integer < 0:
        return False
    if integer < 2:
        return True
    x = integer // 2
should help.