Is there any way to make this code more efficient? - python-3.x

I have to write code that counts the elements with the maximum number of divisors between two given numbers A[0] and A[1] (both inclusive). The input is read as space-separated values on a line, and the first line gives the number of test cases. The code works, but it takes a long time to execute. Can anyone please help me make it more efficient?
import numpy as np
from sys import stdin
t=input()
for i in range(int(t)):
    if int(t)<=100 and int(t)>=1:
        divisor=[]
        A=list(map(int,stdin.readline().split(' ')))
        def divisors(n):
            count=0
            for k in range(1,int(n/2)+1):
                if n%k==0:
                    count+=1
            return count
        for j in np.arange(A[0],A[1]+1):
            divisor.append(divisors(j))
        print(divisor.count(max(divisor)))
Sample input:
2
2 9
1 10
Sample Output:
3
4

There is a way to calculate the number of divisors from the prime factorisation of a number.
Given the prime factorisation, counting divisors is faster than trial division (which is what you do here).
But the prime factorisation itself has to be fast. For small numbers, a pre-calculated list of prime numbers (easy to build) makes both the factorisation and the divisor count fast. If you know the upper limit of the numbers you test (let's call it L), then you only need the prime numbers up to sqrt(L). Given the prime factorisation of a number n = p_1^e_1 * p_2^e_2 * ... * p_k^e_k, the number of divisors is simply (1+e_1) * (1+e_2) * ... * (1+e_k).
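As a minimal sketch of the idea above, assuming the upper limit L is known in advance: the names primes_up_to, count_divisors, L and PRIMES are hypothetical and chosen only for this illustration. Note that this counts every divisor including n itself, whereas the loop in the question stops at n/2, so the two counts differ by one.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def count_divisors(n, primes):
    """Number of divisors of n (including 1 and n) via its prime factorisation.

    Assumes `primes` contains every prime up to sqrt(n).
    """
    count = 1
    for p in primes:
        if p * p > n:
            break
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        count *= exponent + 1        # contributes the factor (1 + e_i)
    if n > 1:
        count *= 2                   # one remaining prime factor with exponent 1
    return count

L = 10_000                           # hypothetical upper limit of the inputs
PRIMES = primes_up_to(math.isqrt(L))
```

With PRIMES built once, each call such as count_divisors(360, PRIMES) only divides by primes up to the square root of its argument instead of by every integer up to n/2.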
You can go further and pre-calculate and/or memoize the number of divisors of frequently used numbers up to some limit. This saves a lot of time at the cost of more memory; otherwise you can calculate it directly (for example with the previous method).
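One possible way to pre-calculate the counts, sketched under the assumption that all inputs stay below a known limit, is a divisor sieve: every d adds one to each of its multiples. The function names below are hypothetical. The table takes memory proportional to the limit and roughly limit * ln(limit) additions to build, after which every test case is a slice and a count.

```python
def divisor_counts_up_to(limit):
    """Return a list where counts[n] is the number of divisors of n (counts[0] is unused)."""
    counts = [0] * (limit + 1)
    for d in range(1, limit + 1):
        # d divides d, 2d, 3d, ... so each of those gains one divisor
        for multiple in range(d, limit + 1, d):
            counts[multiple] += 1
    return counts

def count_max_divisor_elements(lo, hi, counts):
    """How many numbers in [lo, hi] have the largest divisor count in that range."""
    window = counts[lo:hi + 1]
    return window.count(max(window))
```

The table would be built once before reading the test cases and then reused for every query.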
Apart from that, you can optimise the code a bit. For example, you can avoid casting with int(t) (and similar) every time: do it once and store the result in a variable.
Numpy can be avoided altogether; it is superfluous here and I doubt it adds any speed advantage.
That should make your code faster, but you always need to measure performance with real tests.

Related

What's the Big-O-notation for this algorithm for printing the prime numbers?

I am trying to figure out the time complexity of the below problem.
import math
def prime(n):
    for i in range(2,n+1):
        for j in range(2, int(math.sqrt(i))+1):
            if i%j == 0:
                break
        else:
            print(i)
prime(36)
This program prints the prime numbers up to 36.
My understanding of the above program:
for every i up to n, the inner loop runs at most sqrt(i) times.
So the Big-O notation is O(n sqrt(n)).
Is my understanding right? Please correct me if I am wrong...
Time complexity measures the increase in the number of steps (basic operations) as the input scales up:
O(1) : constant (hash look-up)
O(log n) : logarithmic in base 2 (binary search)
O(n) : linear (search for an element in an unsorted list)
O(n^2) : quadratic (bubble sort)
Determining the exact complexity of an algorithm requires a lot of math and algorithms knowledge. You can find a detailed description here: time complexity
Also keep in mind that these bounds describe behaviour for very large values of n. As a rough rule of thumb, nested for loops suggest O(n^2), although the bound is lower when the inner loop runs fewer than n times, as it does here (at most sqrt(i) iterations).
You can add a step counter inside your inner for loop, record its value for different values of n, and plot the results. Then you can compare your graph with the graphs of n, log n, n * sqrt(n) and n^2 to see where your algorithm falls.
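The step-counting experiment could look roughly like the sketch below. The function count_inner_steps is a hypothetical helper that mirrors prime(n) from the question but counts inner-loop iterations instead of printing; the ratios are printed in place of a plot so the example needs no plotting library.

```python
import math

def count_inner_steps(n):
    """Number of inner-loop iterations that prime(n) from the question executes."""
    steps = 0
    for i in range(2, n + 1):
        for j in range(2, int(math.sqrt(i)) + 1):
            steps += 1
            if i % j == 0:
                break
    return steps

# Compare the measured steps with the candidate growth rates.
for n in (100, 1_000, 10_000, 100_000):
    steps = count_inner_steps(n)
    ratio_n_sqrt_n = steps / (n * math.sqrt(n))
    ratio_n_squared = steps / n ** 2
    print(n, steps, round(ratio_n_sqrt_n, 4), round(ratio_n_squared, 6))
```

A ratio that stays bounded as n grows is consistent with that growth rate being an upper bound; a ratio that keeps shrinking suggests the bound holds but is not tight.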

why is np.exp(x) not equal to np.exp(1)**x

Why is np.exp(x) not equal to np.exp(1)**x?
For example:
np.exp(400)
>>>5.221469689764144e+173
np.exp(1)**400
>>>5.221469689764033e+173
np.exp(400)-np.exp(1)**400
>>>1.1093513018771065e+160
I think this difference comes from an optimisation in numpy.
To see why, consider how Euler's number is defined in math:
e = (1 + 1/n)**n as n goes to infinity.
I think numpy stops at a certain finite order:
the numpy exp documentation here is not very clear about how Euler's number is calculated.
Because this order is finite rather than infinite, you get this small difference between the two calculations.
Indeed, exp(400) is the limit of (1 + 400/n)**n as n grows:
>>> (1 + 400/n)**n
5.221642085428121e+173
>>> numpy.exp(400)
5.221469689764144e+173
Here n = 1000000000000, which is still very small, and this produces a relative difference on the order of 1e-5.
Indeed, Euler's number has no exact finite decimal representation. Like Pi, you can only work with an approximate value.
It looks like a rounding issue. In the first case it internally uses a very precise value of e, while in the second you start from a less precise value, and raising it to the 400th power makes the precision issue much more apparent.
The actual result when using the Windows calculator is 5.2214696897641439505887630066496e+173, so you can see your first outcome is fine, while the second is not.
5.2214696897641439505887630066496e+173 // calculator
5.221469689764144e+173 // exp(400)
5.221469689764033e+173 // exp(1)**400
Working back from your result, it looks like it's using a value of e with about 15 digits of precision.
2.7182818284590452353602874713527 // e
2.7182818284590450909589085441968 // 400th root of the 2nd result
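The rounding explanation can be checked with a short experiment. This is only a sketch: it uses the standard-library decimal module as a high-precision reference and math.e / math.exp in place of np.exp, on the assumption that both operate on the same IEEE double values. The helper rel_error is a hypothetical name.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50                  # 50 significant digits for the reference values

true_e = Decimal(1).exp()               # e to 50 digits
true_e400 = Decimal(400).exp()          # e**400 to 50 digits

def rel_error(approx, exact):
    """Relative error of a float against a high-precision Decimal reference."""
    return float(abs(Decimal(approx) - exact) / exact)

err_e = rel_error(math.e, true_e)               # error already present in the double nearest to e
err_exp = rel_error(math.exp(400), true_e400)   # exp evaluated directly
err_pow = rel_error(math.e ** 400, true_e400)   # rounded e raised to the 400th power

print(f"relative error of e         : {err_e:.3e}")
print(f"400 * relative error of e   : {400 * err_e:.3e}")
print(f"relative error of exp(400)  : {err_exp:.3e}")
print(f"relative error of e ** 400  : {err_pow:.3e}")
```

Raising a value with relative error r to the 400th power multiplies that error by roughly 400, which is why the second result is off in the 14th significant digit while the first is accurate to the last one or two bits.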

Solving math with integers larger than any available integer data type

In some programming competitions where the numbers are larger than any available integer data type, we often use strings instead.
Question 1:
Given these large numbers, how to calculate e and f in the below expression?
(a/b) + (c/d) = e/f
note: GCD(e,f) = 1, i.e. they must be in minimised form. For example {e,f} = {1,2} rather than {2,4}.
Also, all a,b,c,d are large numbers known to us.
Question 2:
Can someone also suggest a way to find GCD of two big numbers (bigger than any available integer type)?
I would suggest using full bytes or words rather than strings.
It is relatively easy to think in base 256 instead of base 10 and a lot more efficient for the processor to not do multiplication and division by 10 all the time. Ideally, choose a word size that is half the processor's natural word size, as that makes carry easy to implement. Of course thinking in base 64K or 4G is slightly more complex, but even better than base 256.
The only downside is generating the initial big numbers from the ascii input, which you get for free in base 10. Using a larger word size you can make this more efficient by processing a number of digits initially into a single word (eg 9 digits at a time into 4G), then performing a long multiply of that single word into the correct offset in your large integer format.
A compromise might be to run your engine in base 1 billion: This will still be 9 or 81 times more efficient than using base 10!
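A minimal sketch of the base 1 billion layout, with hypothetical helper names (to_limbs, add_limbs, to_text): each list element holds 9 decimal digits, least significant first. Python integers are already arbitrary precision, so this only illustrates the representation; in a language with fixed-width types each limb would sit in a 32-bit word with a 64-bit intermediate for the carry.

```python
BASE = 10 ** 9      # each limb holds 9 decimal digits
DIGITS = 9

def to_limbs(text):
    """Parse a non-negative decimal string into little-endian base-10**9 limbs."""
    text = text.lstrip("0") or "0"
    limbs = []
    for end in range(len(text), 0, -DIGITS):
        start = max(0, end - DIGITS)
        limbs.append(int(text[start:end]))   # 9 input digits become one limb
    return limbs

def add_limbs(x, y):
    """Schoolbook addition with carry on two limb lists."""
    result, carry = [], 0
    for i in range(max(len(x), len(y))):
        total = carry + (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0)
        result.append(total % BASE)
        carry = total // BASE
    if carry:
        result.append(carry)
    return result

def to_text(limbs):
    """Format limbs back into a decimal string, zero-padding every limb but the top one."""
    return str(limbs[-1]) + "".join(f"{limb:09d}" for limb in reversed(limbs[:-1]))
```

For example, to_limbs("1234567890123") yields [567890123, 1234], and adding 1 to eighteen nines carries through both limbs into a third.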
The simplest way to solve this equation is to multiply a/b * d/d and c/d * b/b so they both have the common denominator b*d.
I think you will then need to prime factorise your big numbers e and f to find any common factors. Remember to search again for the same factor squared.
Of course, that means you have to write a prime generating sieve. You only need to generate factors up to the square root, or half the digits of the min value of e and f.
You could prime factorise b and d to get a lower initial denominator, but you will need to do it again anyway after the addition.
I think that the way to solve this is to separate the problem:
Process the input numbers as an array of characters (ie. std::string)
Make a class where each object can store an std::list (or similar) that represents one of the large numbers, and can do the needed arithmetic with your data
You can then solve your problems normally, without having to worry about your large inputs causing overflow.
Here's a webpage that explains how you can have such an arithmetic class (with sample code in C++ showing addition).
Once you have such an arithmetic class, you no longer need to worry about how to store the data or any overflow.
I get the impression that you already know how to find the GCD when you don't have overflow issues, but just in case, here's an explanation of finding the GCD (with C++ sample code).
As for the specific math problem:
// given formula: a/b + c/d = e/f
// = ( ( a*d + b*c ) / ( b*d ) )
// Define some variables here to save on copying
// (I assume that your class that holds the
// large numbers is called "ARITHMETIC")
ARITHMETIC numerator = a*d + b*c;
ARITHMETIC denominator = b*d;
ARITHMETIC gcd = GCD( numerator , denominator );
// because we know that GCD(e,f) is 1, this implies:
ARITHMETIC e = numerator / gcd;
ARITHMETIC f = denominator / gcd;
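For comparison, the same steps can be sketched in Python, whose built-in integers are arbitrary precision and so play the role of the ARITHMETIC class. The names euclid_gcd and add_fractions are hypothetical. The GCD is written out as Euclid's algorithm to show that it only needs the remainder operation, which is what a hand-written big-number class would have to provide.

```python
def euclid_gcd(x, y):
    """Greatest common divisor by Euclid's algorithm (repeated remainder)."""
    while y:
        x, y = y, x % y
    return x

def add_fractions(a, b, c, d):
    """Return (e, f) in lowest terms such that a/b + c/d == e/f."""
    numerator = a * d + b * c
    denominator = b * d
    g = euclid_gcd(numerator, denominator)
    return numerator // g, denominator // g

# The inputs arrive as decimal strings, as in the question.
a, b, c, d = (int(s) for s in ("1", "4", "1", "4"))
e, f = add_fractions(a, b, c, d)
```

Here e and f come out as 1 and 2, the reduced form of 8/16.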

Python3 Exceed time limit when finding prime number using square root method

I'm trying to write a Python 3 function that determines whether a number is prime.
I was specifically required to use only the following method:
"Divide the input by all the positive prime numbers smaller than its square root."
For example, if the given number is 33, I have to divide 33 by [2,3,5] (the primes smaller than 5.xx, the square root of 33).
Moreover, in the process of finding [2,3,5], I cannot use any method other than the required one.
My code is as follows:
def is_prime(num):
    import math
    a=math.sqrt(num)
    llist=[2,3]
    pri=0
    for i in range(2,int(a)+1):
        root=math.sqrt(i)
        for m in llist:
            if m<root:
                left=i%m
                if left!=0:
                    llist.append(i)
    if num in llist:
        return True
    for m in llist:
        if num%m==0:
            return False
        if num%m!=0:
            pri=pri+1
        if pri==len(llist):
            return True
The code does not run properly when the input exceeds 7 digits; it just stops responding.
Apparently there is an infinite loop somewhere in my code that I can't find.
I'd be very grateful if someone could help me out with this one.
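One possible reading of the required method is sketched below; it is an illustration under that reading, not a confirmed fix. As far as the flattened code can be read, the loop above appears to append a candidate to llist once for every smaller entry that does not divide it, so composites and duplicates accumulate and the list grows very quickly, which would explain the slowdown without a true infinite loop. The sketch appends each candidate at most once, and only when no known prime up to its own square root divides it.

```python
import math

def is_prime(num):
    """Trial division by primes up to sqrt(num), finding those primes the same way."""
    if num < 2:
        return False
    primes = []                              # primes found so far, in increasing order
    for candidate in range(2, math.isqrt(num) + 1):
        candidate_is_prime = True
        for p in primes:
            if p * p > candidate:            # no prime factor can exceed sqrt(candidate)
                break
            if candidate % p == 0:
                candidate_is_prime = False
                break
        if candidate_is_prime:
            if num % candidate == 0:         # found a prime divisor of num
                return False
            primes.append(candidate)
    return True
```

For a 7-digit input the outer loop only runs up to about 3162, so a call such as is_prime(1000003) finishes almost immediately.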

Find euclidean distance between rows of two huge CSR matrices

I have two sparse matrices, A and B. A is 120000*5000 and B is 30000*5000. I need the Euclidean distance between each row of B and every row of A, and then the 5 rows of A with the lowest distance to the selected row of B. Because the data is very large I am using CSR; otherwise I get a memory error. For each row of A, this computes (x_b - x_a)^2 5000 times, sums the terms and takes a square root. The process takes a very long time, around 11 days! Is there any way I can do this more efficiently? I only need the 5 rows with the lowest distance to each row in B.
I am implementing K-Nearest Neighbours and A is my training set and B is my test set.
Well - I don't know whether you could 'vectorize' that code so that it runs in native code instead of Python. Getting that is always the trick to speeding up numpy and scipy.
If you can run that code natively on a 1GHz CPU, with 1 FP instruction per clock cycle, you'd get it done in a little under 10 hours.
(5000 * 2 * 30000 * 120000) / 1024 ** 3 (about 33,500 seconds)
Raise that to 1.5GHz x 2 physical CPU cores x 4-way SIMD instructions with multiply + accumulate (Intel AVX extensions, available in most CPUs) and you could get that number crunching down to one hour, at 2 x 100% on a modest core i5 machine. But that would require full SIMD optimization in native code - far from a trivial task (although, if you decide to go down this path, further questions on S.O. could get help from people eager to get their hands dirty with SIMD coding :-) ) - interfacing this code in C with Scipy is not hard using cython, for example (you only need that part to get to the 10 hour figure above)
Now... as for algorithm optimization, and keeping things Python :-)
Fact is, you don't need to fully calculate all distances from rows in A - you just need to keep track of the 5 nearest rows - and any time the running sum of squares gets larger than that of the 5th nearest row (so far), you just abort the calculation for that row.
You could use Python's heapq operations for that:
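import heapq
import math

def get_closer_rows(b_row, a):
    # Max-heap (distances negated) of the 5 nearest (squared distance, row index) pairs so far.
    # Assumes each row can be iterated element by element, e.g. a dense sequence.
    result = [(float("-inf"), None)] * 5
    for i, a_row in enumerate(a):
        distance_sq = 0
        count = 0
        for element_a, element_b in zip(a_row, b_row):
            distance_sq += (element_a - element_b) ** 2
            # Every 64 elements, abort once this row is already worse than the 5th nearest.
            if not count % 64 and distance_sq > -result[0][0]:
                break
            count += 1
        else:
            # Push the new row and drop the farthest of the six candidates.
            heapq.heappushpop(result, (-distance_sq, i))
    # Return (distance, row index) pairs, nearest first.
    return sorted((math.sqrt(-d), i) for d, i in result)

closer_rows_to_b = []
for row in b:
    closer_rows_to_b.append(get_closer_rows(row, a))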
Note the auxiliary "count", which avoids the expensive retrieval and comparison of values on every multiplication.
Now, if you can run this code using pypy instead of regular Python, I believe it could get the full benefit of JITting, and you could see a noticeable improvement over your times if you are running the code in pure Python (i.e. non numpy/scipy vectorized code).
