Why would spawning a process make computation run twice as fast?

The following computation takes about 10.4 seconds on my Precision 5520:
import time
before = time.time()
sum = 0
for i in range(1, 100000000):
    sum += i
print(time.time() - before, sum)
On the same laptop, the following takes only 5.2 seconds:
import multiprocessing as mp
import time
def foo():
    before = time.time()
    sum = 0
    for i in range(1, 100000000):
        sum += i
    print(time.time() - before, sum)
mp.Process(target=foo).start()
This result is consistent. In fact, it holds (with a slightly smaller speed-up factor) even if I run cpu_count processes simultaneously.
So, why would spawning a process make computation run twice as fast?

It is not the process that makes the computation fast; it is running the computation inside a function. In the CPython interpreter, accessing a global variable is slower than accessing a local one: globals are looked up by name in a dictionary, whereas locals live in a fixed array indexed by position. If you simply call foo() in the same process, the computation time is also roughly halved. Here is an example:
import time
def foo():
    before = time.time()
    sum = 0
    for i in range(1, 100000000):
        sum += i
    print(time.time() - before, sum)
# Slow (8.806 seconds)
before = time.time()
sum = 0
for i in range(1, 100000000):
    sum += i
print(time.time() - before, sum)
# Fast (4.362 seconds)
foo()
Note that overwriting the built-in function sum is a bit dangerous: it may break other code that uses it and cause confusing errors.
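To see the difference without timing anything, one can compare the bytecode CPython generates for the same loop inside a function and at module level. This is a minimal sketch: the names summed_in_function, MODULE_LEVEL_SOURCE and opnames are made up for illustration, and exact opcode names vary between CPython versions, so the check only looks for the "FAST" and "NAME" families.

```python
import dis

def summed_in_function(n):
    # Inside a function, total and i are local variables.
    total = 0
    for i in range(1, n):
        total += i
    return total

# The same loop as module-level code, where total and i are globals.
MODULE_LEVEL_SOURCE = """
total = 0
for i in range(1, n):
    total += i
"""

def opnames(code):
    """Return the set of opcode names used by a function or code object."""
    return {ins.opname for ins in dis.get_instructions(code)}

function_ops = opnames(summed_in_function)
module_ops = opnames(compile(MODULE_LEVEL_SOURCE, "<module-level>", "exec"))

# Locals use the *_FAST opcodes (array slots); module-level names use the
# *_NAME opcodes (dictionary lookups by name).
print(sorted(op for op in function_ops if "FAST" in op))
print(sorted(op for op in module_ops if op.endswith("_NAME")))
```

The function body only touches its variables through the FAST family, while the module-level version goes through a name lookup on every read and write of total and i.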

Related

lazily iterate over generator in multiprocessing pool

I generate data using a generator (this data is memory intensive, although it is not the case in this dummy example) and then I have to make some calculations over that data. Since these calculations take much longer than data generation, I wish to do them in parallel. Here is the code I wrote (with dummy functions for simplicity):
from math import sqrt
from multiprocessing import Pool
def fibonacci(number_iter):
    i = 0
    j = 1
    for round in range(number_iter):
        yield i
        k = i + j
        i, j = j, k
def factors(n):
    f = set()
    for i in range(1, n+1, 1):
        if n % i == 0:
            f.add(i)
    return f
if __name__ == "__main__":
    pool = Pool()
    results = pool.map(factors, fibonacci(45))
I learnt from other questions (see here and here) that map consumes the iterator fully. I wish to avoid that because that consumes a prohibitive amount of memory (that is why I am using a generator in the first place!).
How can I do this by lazily iterating over my generator function? The answers in the questions mentioned before have not been of help.
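One possible approach, sketched here under assumptions rather than taken from the question, is to feed the pool in bounded batches so that only batch_size items of the generator exist in memory at a time. The helper names batches and lazy_map and the batch size of 8 are hypothetical; itertools.islice and Pool.map are standard-library calls. The input size is reduced to 25 so the sketch runs quickly.

```python
from itertools import islice
from multiprocessing import Pool

def fibonacci(number_iter):
    i, j = 0, 1
    for _ in range(number_iter):
        yield i
        i, j = j, i + j

def factors(n):
    return {i for i in range(1, n + 1) if n % i == 0}

def batches(iterable, size):
    """Yield lists of at most `size` items, pulling from the source lazily."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def lazy_map(pool, func, iterable, batch_size):
    """Map func over iterable one batch at a time, yielding results in order."""
    for batch in batches(iterable, batch_size):
        # Only this batch is materialized; the generator is not advanced
        # further until the batch has been processed.
        yield from pool.map(func, batch)

if __name__ == "__main__":
    with Pool() as pool:
        for result in lazy_map(pool, factors, fibonacci(25), batch_size=8):
            print(len(result))
```

The trade-off of this design is that each batch waits for its slowest item before the next batch starts, so a larger batch_size keeps the workers busier at the cost of more memory.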

Huge difference in process timing of functions (timeit, time)

For didactic purposes, I want to measure the timing of some functions (slightly more complex than the ones shown) and later discuss their big-O scaling behaviour. But I have problems with the reliability of the numbers produced:
My code:
import time
import numpy as np
# Note: time.clock() was removed in Python 3.8; time.perf_counter() is the replacement there.
def run_time(fun, runs):
    times = []
    for i in range(runs):
        t0 = time.clock()
        fun()
        t1 = time.clock() - t0
        times.append(t1)
    return np.mean(times), np.std(times)#, times
def fact0(n):
    product = 1
    for i in range(n):
        product = product * (i+1)
    return product
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
print(run_time(lambda: fact0(500), 100000))
print(run_time(lambda: fact1(500), 100000))
and I usually get something like:
(0.000186065330000082, 5.08689027009196e-05)
(0.0002853808799999845, 8.285739309454826e-05)
So the std is large relative to the mean (more than a quarter of it). That's awful for me. Furthermore, I expected fact0() to be much faster than fact1() because it does not recurse.
If I now use timeit:
import timeit
mysetup = ""
mycode = '''
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
fact1(500)
'''
print(timeit.timeit(setup = mysetup, stmt = mycode, number = 100000)/100000)
I get a mean that is about three orders of magnitude smaller:
2.463713497854769e-07
After the corrections discussed below, it is:
0.00028513264190871266
which agrees well with the run_time version.
So my question is: how do I time functions appropriately? Why is there such a huge difference between my two timing methods? Any advice? I would love to stick to "time.clock()", and (again for didactic reasons) I don't want to use cProfile or even more complex modules.
I am grateful for any comments...
(edited due to comments by @Carcigenicate)
You never call the function in mycode, so you're only timing how long it takes the function to be defined (which I would expect to be quite fast).
You need to call the function:
import timeit
mycode = '''
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
fact1(500)
'''
print(timeit.timeit(stmt=mycode, number=100000) / 100000)
Really though, this is suboptimal, since the timing then includes the function definition. I'd change this up and just pass a function:
import timeit
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
print(timeit.timeit(lambda: fact1(500), number=100000) / 100000)
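To get a feel for the run-to-run spread as well as the mean, a minimal sketch using timeit.repeat might look like the following. The repeat and number values are arbitrary choices for illustration, kept small so the sketch finishes quickly.

```python
import timeit

def fact1(n):
    """Recursive factorial, as in the question."""
    if n == 0:
        return 1
    return n * fact1(n - 1)

# Each of the 5 entries is the total time for 200 calls of fact1(500).
totals = timeit.repeat(lambda: fact1(500), repeat=5, number=200)
per_call = [t / 200 for t in totals]

print("best per-call time:  %.3e s" % min(per_call))
print("worst per-call time: %.3e s" % max(per_call))
```

Averaging over many calls per measurement smooths out timer resolution and scheduling noise, and the minimum of the repeats is usually the most stable figure to compare between functions.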

Calling a decorated function twice returns only the decorator and not the function

Here is my code that defines the timed decorator:
from functools import wraps, lru_cache
def timed(fn):
    from time import perf_counter
    @wraps(fn)
    def inner(*args,**kwargs):
        start = perf_counter()
        result = fn(*args,**kwargs)
        end = perf_counter()
        timer = end - start
        fs = '{} took {:.3f} microseconds'
        print(fs.format(fn.__name__, timer * 1000000))
        return result
    return inner
Here is the function definition:
@timed
@lru_cache
def factorial(n):
    result = 1
    cache = dict()
    if n < 2:
        print('Calculating factorial for n > 1')
        result = 1
        print(f'factorial of {result} is {result}')
    else:
        for i in range(1,n+1):
            if i in cache.items():
                result = cache[i]
                #print(f'factorial of {i} is {result}')
            else:
                result *= i
                cache[i] = result
        print(f'factorial of {i} is {result}')
        #print(f'{cache}')
    return result
Here are the calls to the functions:
factorial(3)
factorial(10)
factorial(10)
Here is the output
factorial of 3 is 6
factorial took 32.968 microseconds
factorial of 10 is 3628800
factorial took 11.371 microseconds
**factorial took 0.323 microseconds**
Question:
Why is there no printout when I call factorial(10) the second time?
Because the whole point of lru_cache is to cache the function's arguments and the return values associated with them and to minimize the number of actual executions of the decorated function.
When you call factorial(10) for the second time, the function isn't called, but rather the value is fetched from the cache. And this is also why the second call is 35 times faster - because the function isn't even called, and that's the exact purpose of functools.lru_cache.
You only want to cache pure functions; your factorial isn't pure, because it has a side effect of writing to standard output.
For the behavior you want, define two functions: a pure function that you cache, and an impure wrapper that uses the pure function.
@lru_cache
def factorial_math(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
@timed
def factorial(n):
    result = factorial_math(n)
    print(f'factorial of {n} is {result}')
    return result
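The caching behaviour described above can be observed directly. The following is a small illustration, where the function square and the calls list are made-up names; cache_info() is the statistics method that functools.lru_cache attaches to the wrapped function.

```python
from functools import lru_cache

calls = []  # records every real execution of the function body

@lru_cache(maxsize=None)
def square(n):
    calls.append(n)  # side effect: only happens when the body actually runs
    return n * n

first = square(10)   # cache miss: the body runs
second = square(10)  # cache hit: the body is skipped entirely
info = square.cache_info()

print(first, second, calls)
print(info.hits, info.misses)
```

The second call returns the same value, but calls still holds a single entry, which is the same reason the print inside the cached factorial does not fire again.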

how to improve computation speed when finding max in a large list? excluding Numpy

Consider a game with an array (num) containing some integers. I can take any integer, remove it from the array, and add half of that number (rounded up) back to the array. I can do this for a fixed number of moves (k).
The challenge is to minimize the sum of the final array.
My problem is that the test cases fail when dealing with large arrays :(
What is an efficient way to compute this?
My first step for the challenge is taking max(num) and replacing it with ceil(max(num)/2), k times.
Another option is using sort(reverse=True) in every loop iteration and replacing the largest value.
I've played with different sorting algorithms, read about them here, and tried the bisect module. They are all very new to me and did not get me past the test-case threshold, so I hope someone here can lend a hand to a newbie.
import math
def minSum(num, k):
    for i in range(k):
        num.sort(reverse=True)
        num.insert(0, math.ceil(num[0] / 2))
        num.remove(num[1])
    return sum(num)
minSum([10,20,7],4)
>>> 14
First off, inserting at the beginning of a Python list is much slower than inserting at the end, since every element has to be moved. Also, there's absolutely no reason to do so in the first place. You could just sort without reverse=True, then use pop to remove and return the last item and bisect.insort to put it back in at the right place without needing to re-sort the whole list.
from bisect import insort
from math import ceil
def min_sum(num, k):
    num.sort()
    for i in range(k):
        largest = num.pop()
        insort(num, ceil(largest/2))
    return sum(num)
This should already be significantly faster than your original approach. However, in the worst case this is still O(n lg n) for the sort and O(k*n) for the processing; if your input is constructed such that halving each successive largest element makes it the new smallest element, you'll end up inserting it at the start which incurs an O(n) memory movement.
We can do better by using a priority queue approach, implemented in Python by the heapq library. You can heapify a list in linear time, and then use heapreplace to remove and replace the largest element successively. A slight awkwardness here is that heapq only implements a min-heap, so we'll need an extra pass to negate our input list at the beginning. One bonus side-effect is that since we now need to round down instead of up, we can just use integer division instead of math.ceil.
from heapq import heapify, heapreplace
def min_sum(num, k):
    for i in range(len(num)):
        num[i] = -num[i]
    heapify(num)
    for i in range(k):
        largest = num[0]
        heapreplace(num, largest // 2)
    return -sum(num)
This way the initial list negation and heapification takes O(n), and then processing is only O(k lg n) since each heapreplace is a lg n operation.
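The rounding trick is easy to verify: floor-dividing a negated number by 2 gives the negated rounded-up half. The sketch below checks that identity and then traces the question's example input; min_sum_trace is a hypothetical variant that copies the list and records which values were halved, and is not part of the answer's code.

```python
from heapq import heapify, heapreplace
from math import ceil

# Floor division of the negated value equals the negated ceiling of the half.
for x in range(1, 50):
    assert (-x) // 2 == -ceil(x / 2)

def min_sum_trace(num, k):
    """Return the final sum plus the list of values that were halved."""
    heap = [-x for x in num]   # negated copy, so the caller's list is untouched
    heapify(heap)
    halved = []
    for _ in range(k):
        largest = heap[0]      # most negative entry = largest original value
        halved.append(-largest)
        heapreplace(heap, largest // 2)
    return -sum(heap), halved

total, halved = min_sum_trace([10, 20, 7], 4)
print(total, halved)  # → 14 [20, 10, 10, 7]
```

The result matches the 14 from the question, and the trace shows that each move picks the current maximum.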
To add some timing for all of the algorithms: @tzaman's heapq algorithm is by far the fastest, and it gives the same answer. Merely changing the sort so that it is not reversed gives little speedup.
import random
import time
from bisect import insort
from heapq import heapify, heapreplace
from math import ceil
def makelist(num_elements):
    # random integers in the range 0..99
    return [int(random.random()*100) for _ in range(int(num_elements))]
def minSum(num, k):
    for i in range(k):
        num.sort(reverse=True)
        num.insert(0, ceil(num[0] / 2))
        num.remove(num[1])
    return sum(num)
def minSum2(num, k):
    last_idx = len(num) - 1
    for i in range(k):
        num.sort()
        num[last_idx] = ceil(num[last_idx] / 2)
    return sum(num)
def min_sum(num, k):
    num.sort()
    for i in range(k):
        largest = num.pop()
        insort(num, ceil(largest/2))
    return sum(num)
def min_heap(num, k):
    for i in range(len(num)):
        num[i] = -num[i]
    heapify(num)
    for i in range(k):
        largest = num[0]
        heapreplace(num, largest // 2)
    return -sum(num)
if __name__ == '__main__':
    mylist = makelist(1e4)
    k = len(mylist) + 1
    t0 = time.time()
    # we have to make a copy of mylist for all of the functions
    # otherwise mylist will be modified
    print('minSum: ', minSum(mylist.copy(), k))
    t1 = time.time()
    print('minSum2: ', minSum2(mylist.copy(), k))
    t2 = time.time()
    print('min_sum: ', min_sum(mylist.copy(), k))
    t3 = time.time()
    print('min_heap: ', min_heap(mylist.copy(), k))
    t4 = time.time()
    print()
    print('time for each method for k = %.0e: ' % k)
    print('minSum: %f sec' % (t1-t0))
    print('minSum2: %f sec' % (t2-t1))
    print('min_sum: %f sec' % (t3-t2))
    print('min_heap: %f sec' % (t4-t3))
and here is console output:
minSum: 205438
minSum2: 205438
min_sum: 205438
min_heap: 205438
time for each method for k = 1e+04:
minSum: 2.386861 sec
minSum2: 2.199656 sec
min_sum: 0.046802 sec
min_heap: 0.015600 sec

Measuring time since last time the if statement ran (Python)

I want my if statement to run only if more than x seconds have passed since it last ran. I just can't find the way.
As you've provided no code, let's say this is your program:
while True:
    if doSomething:
        print("Did it!")
We can ensure that the if statement will only run if it has been x seconds since it last ran by doing the following:
from time import time
doSomething = 1
x = 1
timeLastDidSomething = time()
while True:
    if doSomething and time() - timeLastDidSomething > x:
        print("Did it!")
        timeLastDidSomething = time()
Hope this helps!
You'll want to use the time() function in the time module.
import time
...
old_time = time.time()
...
while True:  # this is your game loop, presumably
    ...
    now = time.time()
    if old_time + x <= now:
        old_time = now
        # only runs once every x seconds.
    ...
import time
# Time in seconds
time_since_last_if = 30
time_if_ended = None
# Your loop
while your_condition:
    # Until the if has run once, we can only rely on the initial value of time_since_last_if
    if time_if_ended is not None:
        time_since_last_if = time.time() - time_if_ended
    if your_condition and time_since_last_if >= 30:
        do_something()
        # record when the if last ran, so we know when it becomes available again
        time_if_ended = time.time()
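The same idea can be packaged so that the loop body stays short. This is a minimal sketch, not a definitive implementation: the class name Throttle, its ready() method and the injectable clock parameter are all hypothetical, and time.monotonic is used instead of time.time on the assumption that the interval should not be affected by system clock adjustments.

```python
import time

class Throttle:
    """Allow an action at most once every `interval` seconds."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the logic can be tested
        self.last_run = None    # None means the action has never run

    def ready(self):
        """Return True (and record the time) if the interval has elapsed."""
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.interval:
            self.last_run = now
            return True
        return False

# Usage: the if body runs at most once every 0.05 seconds for about 0.2 seconds.
throttle = Throttle(interval=0.05)
runs = 0
deadline = time.monotonic() + 0.2
while time.monotonic() < deadline:
    if throttle.ready():
        runs += 1
    time.sleep(0.01)
print("ran", runs, "times")
```

Unlike the snippets above, this version lets the first check pass immediately; initialise last_run to the current time instead if the first run should also wait.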

Resources