Project Euler #23 Optimization [Python 3.6] - python-3.x

I'm having trouble getting my code to run quickly for Project Euler Problem 23. The problem is pasted below:
A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.
A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.
As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.
Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
And my code:
import math
import bisect
numbers = list(range(1, 20162))
tot = set()
numberabundance = []
abundant = []
for n in numbers:
    m = 2
    divisorsum = 1
    while m <= math.sqrt(n):
        if n % m == 0:
            divisorsum += m + (n / m)
        m += 1
    if math.sqrt(n) % 1 == 0:
        divisorsum -= math.sqrt(n)
    if divisorsum > n:
        numberabundance.append(1)
    else:
        numberabundance.append(0)
temp = 1
# print(numberabundance)
for each in numberabundance:
    if each == 1:
        abundant.append(temp)
    temp += 1
abundant_set = set(abundant)
print(abundant_set)
for i in range(12, 20162):
    for k in abundant:
        if i - k in abundant_set:
            tot.add(i)
            break
        elif i - k < i / 2:
            break
print(sum(numbers.difference(tot)))
I know the issue lies in the for loop at the bottom, but I'm not quite sure how to fix it. I've tried modeling it after some of the other answers I've seen here, but none of them seem to work. Any suggestions? Thanks.

Your upper bound is incorrect - the question states all integers greater than 28123 can be written ..., not 20162
After changing the bound, generation of abundant is correct, although you could do this generation in a single pass by directly adding to a set abundant, instead of creating the bitmask array numberabundance.
The final loop is also incorrect - as per the question, you must
Find the sum of all the positive integers
whereas your code
for i in range(12, 20162):
will skip numbers below 12 and also doesn't include the correct upper bound.
I'm a bit puzzled about your choice of
elif i - k < i / 2:
Since the abundants are already sorted, I would just check if the inner loop had passed the midpoint of the outer loop:
if k > i / 2:
Also, since we just need the sum of these numbers, I would just keep a running total, instead of having to do a final sum on a collection.
So here's the result after making the above changes:
import math
numbers = list(range(1, 28124))  # 1..28123 inclusive
abundant = set()
for n in numbers:
    m = 2
    divisorsum = 1
    while m <= math.sqrt(n):
        if n % m == 0:
            divisorsum += m + (n // m)
        m += 1
    # a perfect square had its root counted twice in the loop above
    if math.sqrt(n) % 1 == 0:
        divisorsum -= int(math.sqrt(n))
    if divisorsum > n:
        abundant.add(n)
#print(sorted(abundant))
# sets are unordered, so iterate over a sorted copy to make the early break valid
abundant_sorted = sorted(abundant)
nonabundantsum = 0
for i in numbers:
    issumoftwoabundants = False
    for k in abundant_sorted:
        if k > i / 2:
            break
        if i - k in abundant:
            issumoftwoabundants = True
            break
    if not issumoftwoabundants:
        nonabundantsum += i
print(nonabundantsum)
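For comparison, here is a minimal sketch of a sieve-based variant of the same idea. It is not the code from the answer above: the function name sum_non_abundant_sums and its limit parameter are made up for illustration, and it assumes the 28123 bound from the problem statement. Instead of trial division per number, it accumulates proper-divisor sums for all numbers at once, then marks every reachable sum of two abundant numbers.

```python
def sum_non_abundant_sums(limit=28123):
    """Sum of all n in 1..limit that are not a sum of two abundant numbers."""
    # divisor_sum[n] accumulates the proper divisors of n (sieve style).
    divisor_sum = [0] * (limit + 1)
    for d in range(1, limit // 2 + 1):
        for multiple in range(2 * d, limit + 1, d):
            divisor_sum[multiple] += d

    # Built in increasing order, so this list is already sorted.
    abundant = [n for n in range(1, limit + 1) if divisor_sum[n] > n]

    # Mark every value a + b <= limit with a <= b both abundant.
    is_sum = [False] * (limit + 1)
    for idx, a in enumerate(abundant):
        for b in abundant[idx:]:
            if a + b > limit:
                break  # larger b only makes the sum bigger
            is_sum[a + b] = True

    return sum(n for n in range(1, limit + 1) if not is_sum[n])


if __name__ == "__main__":
    print(sum_non_abundant_sums())
```

The design choice here is to trade a little memory (two lists of length limit + 1) for avoiding repeated square-root and modulo work.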

Related

Check whether a subset with sum divisible by m exists

Given a set of non-negative distinct integers, and a value m, determine if there is a subset of the given set with sum divisible by m.
The solution on geeksforgeeks states that-
If n > m there will always be a subset with sum divisible by m (which is easy to prove with pigeonhole principle). So we need to handle only cases of n <= m.
Can somebody please explain what this case means and what is its relation to pigeonhole principle? Also how is this case different from n <= m?
Here is a more verbose version of that argument:
Label the numbers a1, a2, ... an in any order. Now consider the sums:
b1=a1
b2=a1+a2
b3=a1+a2+a3
...
bn=a1+a2+...+an
These are either all unique numbers or one of the a's is 0 (which is divisible by m).
Now if any of the bs is divisible by m we are done.
Otherwise:
The remainder of a number that is not divisible by m lies in the range 1...(m-1). So there are m-1 possible remainders.
Since the numbers b1...bn weren't divisible by m, they must have remainders in the range of 1...(m-1). So you must assign the n bs (pigeons) to m-1 remainders (pigeonholes).
We have more pigeons than pigeonholes (n > m > m-1) => there must be at least two pigeons in the same pigeonhole.
That means: there must be at least two bs with the same remainder: call them bi, bj (i<j). Since all of our bs are unique and bi % m == bj % m (the remainders of bi/m and bj/m are the same) => bj - bi = x * m (where x is a positive integer). Therefore bj - bi is divisible by m and bj - bi = a(i+1) + ... + aj. Therefore a(i+1) + ... + aj is divisible by m, which is exactly what we wanted to prove.
Let us create a new set of numbers (i.e. a[ ] array) by doing prefix sum of given values (i.e. value[ ] array).
a[0] = value[0]
a[1] = value[0] + value[1]
a[n-1] = value[0] + value[1] + .... + value[n-1]
Now we have n new numbers. If any of them are divisible by m we are done.
Otherwise, if we divide the a[ ] array elements by m, the remainders lie in the range [1, m - 1].
But we have a total of n values, and n > m - 1.
So, there exist two distinct indices 0 <= i, j <= n-1 in a such that a[i] mod(m) == a[j] mod(m).
Due to the above statement, we can say that a[i] - a[j] is divisible by m.
Now, let's consider i > j.
We also know that, a[i] = value[i] + value[i - 1] + ... + value[0] and a[j] = value[j] + value[j - 1] + ... + value[0].
So, a[i] - a[j] = value[i] + value[i - 1] + ... + value[j + 1] is also divisible by m.
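The prefix-sum argument above can be sketched in code. This is only an illustration under stated assumptions: the function name run_divisible_by_m is hypothetical, and it also treats the empty prefix (remainder 0) as a pigeon, which covers the "some prefix is itself divisible by m" case with the same lookup. It searches contiguous runs only, so a None result for n <= m does not prove that no subset exists.

```python
def run_divisible_by_m(values, m):
    """Return a contiguous run of values whose sum is divisible by m, or None.

    When len(values) >= m, the pigeonhole argument guarantees a run is found.
    """
    # Maps a prefix-sum remainder to how many elements that prefix contains.
    first_seen = {0: 0}  # the empty prefix has remainder 0
    prefix = 0
    for count, value in enumerate(values, start=1):
        prefix = (prefix + value) % m
        if prefix in first_seen:
            # Two prefixes share a remainder, so the elements between them
            # sum to a multiple of m.
            return values[first_seen[prefix]:count]
        first_seen[prefix] = count
    return None


if __name__ == "__main__":
    print(run_divisible_by_m([1, 2, 4, 8, 16], 4))
```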

What is the time complexity of this division function (no divide or multiply operators used)?

I solved this leetcode question https://leetcode.com/problems/divide-two-integers/ . The goal is to get the quotient of the division of dividend by divisor without using a multiplication or division operator. Here is my solution:
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1,-1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    Q = divisor
    while dividend >= divisor:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i+=1
        if dividend < Q:
            Q = divisor
            i = 0
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 -1:
        return 2**31 - 1
    return res
So I am having trouble analyzing the time complexity of this solution. I know it should be O(log(something)). Usually for algorithms we say they are O(log(n)) when the input gets divided by 2 at each iteration but here I multiply the divisor by 2 Q<<= 1 at each iteration so at each step I take a bigger step towards the solution. Obviously if the dividend is the same for a bigger divisor my algorithm will be faster. Similarly the bigger the dividend for the same divisor we get a slower run time.
My guess is the equation governing the runtime of this algorithm is basically of the form O(dividend/divisor) (duh that's division) with some logs in there to account for me multiplying Q by 2 at each step Q <<= 1... I can't figure out what exactly.
EDIT:
When I first posted the question, the algorithm I posted was the one below; Alain Merigot's answer is based on that algorithm. The difference is that in the version at the top the dividend never goes below 0, which results in a faster run time.
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1,-1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    tmp_divisor = divisor
    while dividend >= divisor:
        old_dividend, old_res = dividend, res
        dividend = dividend - tmp_divisor
        tmp_divisor <<= 1
        res += (1 << i)
        i+=1
        if dividend < 0:
            dividend = old_dividend
            res = old_res
            tmp_divisor >>= 2
            i -= 2
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 -1:
        return 2**31 - 1
    return res
Your algorithm is O(m^2) in the worst-case, where m is the number of bits in the result. In terms of the inputs, it would be O(log(dividend/divisor) ^ 2).
To see why, consider what your loop does. Let a=dividend, b=divisor. The loop subtracts b, 2b, 4b, 8b, ... from a as long as it's big enough, then repeats this sequence again and again until a<b.
It can be equivalently written as two nested loops:
while dividend >= divisor:
    Q = divisor
    i = 0
    while Q <= dividend:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i+=1
For each iteration of the outer loop, the inner loop will perform fewer iterations because dividend is smaller. In the worst case, the inner loop will do only one iteration fewer for each iteration of the outer loop. This happens when the result is 1+3+7+15+...+(2^n-1) for some n. In this case, it can be shown that n = O(log(result)), but the total number of inner loop iterations is O(n^2), i.e. quadratic in the size of the result.
To improve this to be linear in the size of the result, first calculate the largest needed values of Q and i. Then work backwards from that, subtracting 1 from i and shifting Q right each iteration. This guarantees no more than 2n iterations total.
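A minimal sketch of that improvement might look like the following. The name divide_linear is made up for illustration, the divisor is assumed to be non-zero, and the 32-bit clamp is copied from the question's code rather than being part of the technique.

```python
def divide_linear(dividend, divisor):
    """Truncating integer division using only shifts, adds and subtracts.

    Assumes divisor != 0 (a zero divisor would loop forever below).
    """
    sign = -1 if (dividend < 0) != (divisor < 0) else 1
    dividend, divisor = abs(dividend), abs(divisor)

    # Phase 1: find the largest q = divisor * 2**i that still fits.
    q, i = divisor, 0
    while (q << 1) <= dividend:
        q <<= 1
        i += 1

    # Phase 2: walk back down, deciding one bit of the result per step.
    res = 0
    while i >= 0:
        if dividend >= q:
            dividend -= q
            res += 1 << i
        q >>= 1
        i -= 1

    if sign == -1:
        res = -res
    # Same overflow rule as in the question's code.
    if res < -2**31 or res > 2**31 - 1:
        return 2**31 - 1
    return res


if __name__ == "__main__":
    print(divide_linear(10, 3), divide_linear(7, -3))
```

Each phase runs at most once per bit of the quotient, which is where the "no more than 2n iterations" bound comes from.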
Worst case complexity is easy to find.
Every iteration generates a bit of the result, and the number of iterations is equal to the number of bits in the quotient.
When divisor=1, quotient=dividend, and in that case the number of iterations is equal to the number of bits in dividend after the leading (most significant) 1. It is maximized when dividend=2^(n-1)+k, where n is the number of bits and k is any number such that 1≤k<2^(n-1). This will obviously be the worst case.
After the first iteration, dividend=dividend-divisor(=dividend-1) and divisor=2^1
After iteration m, divisor=2^m and dividend=dividend-(1+2^1+..+2^(m-1))=dividend-(2^m-1)
Iterations stop when dividend is <0. As dividend=2^(n-1)+k, with k>0, this happens for m=n.
Hence, the number of steps in the worst case is n, and the complexity is linear in the number of bits of the dividend.

make change in python (maximum recursion depth exceeded in comparison)

So I have a recursive solution to the make change problem that works sometimes. It is:
def change(n, c):
    if (n == 0):
        return 1
    if (n < 0):
        return 0;
    if (c + 1 <= 0 and n >= 1):
        return 0
    return change(n, c - 1) + change(n - coins[c - 1], c);
where coins is my array of coins, for example [1,5,10,25], n is the amount to make change for, for example 1000, and c is the length of the coins array - 1. This solution works in some situations. But when I need it to run in under two seconds and I use:
coins: [1,5,10,25]
n: 1000
I get a:
RecursionError: maximum recursion depth exceeded in comparison
So my question is, what would be the best way to optimize this. Using some sort of flow control? I don't want to do something like.
# Set recursion limit
sys.setrecursionlimit(10000000000)
UPDATE:
I now have something like
def coinss(n, c):
    if n == 0:
        return 1
    if n < 0:
        return 0
    nCombos = 0
    for c in range(c, -1, -1):
        nCombos += coinss(n - coins[c - 1], c)
    return nCombos
but it takes forever. It'd be ideal to have this run in under a second.
As suggested in the answers above you could use DP for a more optimal solution.
Also your conditional check -
if (c + 1 <= 0 and n >= 1)
should be
if (c <= 1 ):
as n will always be >=1 and c <= 1 will prevent any calculations if the number of coins is less than or equal to 1.
While using recursion you will always run into this. If you set the recursion limit higher, you may be able to use your algorithm on a bigger number, but you will always be limited. The recursion limit is there to keep you from getting a stack overflow.
The best way to solve this for bigger change amounts would be to switch to an iterative approach. There are known algorithms for this; see Wikipedia:
https://en.wikipedia.org/wiki/Change-making_problem
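As a rough illustration of the iterative, dynamic-programming approach suggested above, here is a minimal bottom-up sketch. The function name count_change and the ways table are made up for this example. It counts combinations (order of coins does not matter), which is what the recursive version in the question is meant to count.

```python
def count_change(amount, coins):
    """Count the ways to form `amount` from `coins`, ignoring order."""
    # ways[v] = number of combinations that sum to v using the coins seen so far.
    ways = [0] * (amount + 1)
    ways[0] = 1  # one way to make 0: use no coins
    for coin in coins:
        # Going upwards lets the same coin be reused any number of times.
        for value in range(coin, amount + 1):
            ways[value] += ways[value - coin]
    return ways[amount]


if __name__ == "__main__":
    print(count_change(1000, [1, 5, 10, 25]))
```

Processing one coin at a time in the outer loop is what prevents the same combination from being counted in several orders. The work is proportional to amount times the number of coins, and there is no recursion, so the recursion limit never comes into play.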
Note that you have a bug here:
if (c + 1 <= 0 and n >= 1):
is like
if (c <= -1 and n >= 1):
So c can be 0 and pass on to the next step, where c-1 is used as the index. That runs because Python accepts negative indexes, but the result is still wrong (coins[-1] yields 25), so your solution sometimes counts one combination too many.
I've rewritten your algorithm with recursive and stack approaches:
Recursive (fixed, no need for c at init thanks to an internal recursive method, but still overflows the stack):
coins = [1,5,10,25]
def change(n):
    def change_recurse(n, c):
        if n == 0:
            return 1
        if n < 0:
            return 0
        if c <= 0:
            return 0
        return change_recurse(n, c - 1) + change_recurse(n - coins[c - 1], c)
    return change_recurse(n,len(coins))
iterative/stack approach (not dynamic programming), doesn't recurse, just uses a "stack" to store the computations to perform:
def change2(n):
    def change_iter(stack):
        result = 0
        # continue while the stack isn't empty
        while stack:
            # process one computation
            n,c = stack.pop()
            if n == 0:
                # one solution found, increase counter
                result += 1
            if n > 0 and c > 0:
                # not found, request 2 more computations
                stack.append((n, c - 1))
                stack.append((n - coins[c - 1], c))
        return result
    return change_iter([(n,len(coins))])
Both methods return the same values for low values of n.
for i in range(1,200):
    a,b = change(i),change2(i)
    if a != b:
        print("error",i,a,b)
the code above runs without any error prints.
Now print(change2(1000)) takes a few seconds but prints 142511 without blowing the stack.

Efficient Mersenne prime generator in python

I have made a code that doesn't seem to be very efficient. It only calculates a few of the primes.
This is my code:
num=float(1)
a=1
while(num>0): # Create variable to hold the factors and add 1 and itself (all numbers have these factors)
    factors = [1, num]
    # For each possible factor
    for i in range(2, int(num/4)+3):
        # Check that it is a factor and that the factor and its corresponding factor are not already in the list
        if float(num) % i == 0 and i not in factors and float(num/i) not in factors:
            # Add i and its corresponding factor to the list
            factors.append(i)
            factors.append(float(num/i))
    num=float(num)
    number=num
    # Takes an integer, returns true or false
    number = float(number)
    # Check if the only factors are 1 and itself and it is greater than 1
    if (len(factors) == 2 and number > 1):
        num2=2**num-1
        factors2=[1, num]
        for i in range(2, int(num2/4)+3):
            # Check that it is a factor and that the factor and its corresponding factor are not already in the list
            if float(num2) % i == 0 and i not in factors2 and float(num2/i) not in factors2:
                # Add i and its corresponding factor to the list
                factors2.append(i)
                factors2.append(float(num2/i))
        if(len(factors2)==2 and num2>1):
            print(num2)
            a=a+1
    num=num+2
How can I make my code more efficient and be able to calculate the Mersenne Primes quicker. I would like to use the program to find any possible new perfect numbers.
All the solutions shown so far use bad algorithms, missing the point of Mersenne primes completely. The advantage of Mersenne primes is we can test their primality more efficiently than via brute force like other odd numbers. We only need to check an exponent for primeness and use a Lucas-Lehmer primality test to do the rest:
def lucas_lehmer(p):
    s = 4
    m = 2 ** p - 1
    for _ in range(p - 2):
        s = ((s * s) - 2) % m
    return s == 0
def is_prime(number):
    """
    the efficiency of this doesn't matter much as we're
    only using it to test the primeness of the exponents
    not the mersenne primes themselves
    """
    if number % 2 == 0:
        return number == 2
    i = 3
    while i * i <= number:
        if number % i == 0:
            return False
        i += 2
    return True
print(3) # to simplify code, treat first mersenne prime as a special case
for i in range(3, 5000, 2): # generate up to M20, found in 1961
    if is_prime(i) and lucas_lehmer(i):
        print(2 ** i - 1)
The OP's code bogs down after M7 524287 and @FrancescoBarban's code bogs down after M8 2147483647. The above code generates M18 in about 15 seconds! Here's up to M14, generated in about 1/4 of a second:
3
7
31
127
8191
131071
524287
2147483647
2305843009213693951
618970019642690137449562111
162259276829213363391578010288127
170141183460469231731687303715884105727
6864797660130609714981900799081393217269435300143305409394463459185543183397656052122559640661454554977296311391480858037121987999716643812574028291115057151
531137992816767098689588206552468627329593117727031923199444138200403559860852242739162502265229285668889329486246501015346579337652707239409519978766587351943831270835393219031728127
This program bogs down above M20, but it's not a particularly efficient implementation. It's simply not a bad algorithm.
import math
def is_it_prime(n):
    # n is already a factor of itself
    factors = [n]
    #look for factors
    for i in range(1, int(math.sqrt(n)) + 1):
        #if i is a factor of n, append it to the list
        if n%i == 0: factors.append(i)
        else: pass
    #if the list has more than 2 factors n is not prime
    if len(factors) > 2: return False
    #otherwise n is prime
    else: return True
# start at 2: with n = 1 the test value is 1, which would wrongly be reported as prime
n = 2
while True:
    #a prime P is a Mersenne prime if P = 2 ^ n - 1
    test = (2 ** n) - 1
    #if test is prime it is also a Mersenne prime
    if is_it_prime(test):
        print(test)
    else: pass
    n += 1
It will probably get stuck at 2147483647, but you know, the next Mersenne prime is 2305843009213693951... so don't worry if it takes more time than you expected ;)
If you just want to check if a number is prime, then you do not need to find all its factors. You already know 1 and num are factors. As soon as you find a third factor then the number cannot be prime. You are wasting time looking for the fourth, fifth etc. factors.
A Mersenne number is of the form 2^n - 1, and so is always odd. Hence all its factors are odd. You can halve the run-time of your loop if you only look for odd factors: start at 3 and step 2 to the next possible factor.
Factors come in pairs, one larger than the square root and one smaller. Hence you only need to look for factors up to the square root, as @Francesco's code shows. That can give you a major time saving for the larger Mersenne numbers.
Putting these two points together, your loop should be more like:
#look for factors
for i in range(3, int(math.sqrt(n)) + 1, 2):
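Combining the three points above (stop at the first factor found, test odd candidates only, stop at the square root) could look like this minimal sketch. The function name is_odd_prime is made up for illustration, and math.isqrt requires Python 3.8 or later; on older versions int(math.sqrt(n)) is the closest substitute, at the cost of floating-point rounding for very large n.

```python
import math


def is_odd_prime(n):
    """Trial division for an odd n >= 3, such as a Mersenne number 2**p - 1."""
    # Only odd candidates up to the integer square root need checking.
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False  # first factor found, no need to look further
    return True


if __name__ == "__main__":
    for p in range(2, 20):
        candidate = 2 ** p - 1
        if is_odd_prime(candidate):
            print(candidate)
```

This is still brute force, so it only helps for the small Mersenne numbers; it is not a replacement for the Lucas-Lehmer test shown earlier.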

split a value into values in max, min range

I want to find an efficient algorithm to split an integer into values that each lie within a given min, max range. There should be as few values as possible.
For example:
max = 7, min = 3
then
8 = 4 + 4
9 = 4 + 5
16 = 5 + 5 + 6 (not 4 + 4 + 4 + 4)
EDIT
To make it more clear, let take an example. Assume that you have a bunch of apples and you want to pack them into baskets. Each basket can contain 3 to 7 apples, and you want the number of baskets to be used is as small as possible.
** I mentioned that the value should be evenly divided, but that's not so important. I am more concerned about using fewer baskets.
This struck me as a fun problem so I had a go at hacking out a quick solution. I think this might be an interesting starting point, it'll either give you a valid solution with as few numbers as possible, or with numbers as similar to each other as possible, all within the bounds of the range defined by the min_bound and max_bound
number = int(input("Number: "))
min_bound = 3
max_bound = 7
def improve(solution):
    solution = list(reversed(solution))
    for i, num in enumerate(solution):
        if i >= 2:
            average = sum(solution[:i]) / i
            if average.is_integer():
                for x in range(i):
                    solution[x] = int(average)
                break
    return solution
def find_numbers(number, division, common_number):
    extra_number = number - common_number * division
    numbers_in_solution = [common_number] * division
    if extra_number < min_bound and \
            extra_number + common_number <= max_bound:
        numbers_in_solution[-1] += extra_number
    elif extra_number < min_bound or extra_number > max_bound:
        return None
    else:
        numbers_in_solution.append(extra_number)
    solution = improve(numbers_in_solution)
    return solution
def tst(number):
    try:
        solution = None
        for division in range(number//max_bound, number//min_bound + 1): # Reverse the order of this for numbers as close in value to each other as possible.
            if round (number / division) in range(min_bound, max_bound + 1):
                solution = find_numbers(number, division, round(number / division))
            elif (number // division) in range(min_bound, max_bound + 1): # Rarely required but catches edge cases
                solution = find_numbers(number, division, number // division)
            if solution:
                print(sum(solution), solution)
                break
    except ZeroDivisionError:
        print("Solution is 1, your input is less than the max_bound")
tst(number)
for x in range(1,100):
    tst(x)
This code is just to demonstrate an idea, I'm sure it could be tweaked for better performance.
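As an alternative to the search above, here is a minimal closed-form sketch. It is not the answer's method: the function name split_evenly is made up for illustration. It assumes that the fewest possible parts is the ceiling of number / max_bound, and that a split exists exactly when that many parts of size min_bound do not already exceed the number (using more parts would only raise the minimum total further).

```python
def split_evenly(number, min_bound, max_bound):
    """Split number into the fewest parts, each within [min_bound, max_bound].

    Returns the parts in ascending order, or None if no split exists.
    """
    if number < min_bound:
        return None
    # Fewest parts that can hold the total: ceil(number / max_bound).
    count = -(-number // max_bound)
    # Even the smallest allowed parts would overshoot, so no split exists.
    if count * min_bound > number:
        return None
    # Spread the total as evenly as possible across `count` parts.
    base, extra = divmod(number, count)
    return [base] * (count - extra) + [base + 1] * extra


if __name__ == "__main__":
    for n in (8, 9, 16):
        print(n, split_evenly(n, 3, 7))
```

With min = 3 and max = 7 this reproduces the examples from the question: 8 gives [4, 4], 9 gives [4, 5], and 16 gives [5, 5, 6].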

Resources