lazily iterate over generator in multiprocessing pool - python-3.x

I generate data using a generator (this data is memory intensive, although it is not the case in this dummy example) and then I have to make some calculations over that data. Since these calculations take much longer than data generation, I wish to do them in parallel. Here is the code I wrote (with dummy functions for simplicity):
from math import sqrt
from multiprocessing import Pool
def fibonacci(number_iter):
    i = 0
    j = 1
    for round in range(number_iter):
        yield i
        k = i + j
        i, j = j, k
def factors(n):
    f = set()
    for i in range(1, n+1, 1):
        if n % i == 0:
            f.add(i)
    return f
if __name__ == "__main__":
    pool = Pool()
    results = pool.map(factors, fibonacci(45))
I learnt from other questions (see here and here) that map consumes the iterator fully. I wish to avoid that because that consumes a prohibitive amount of memory (that is why I am using a generator in the first place!).
How can I do this by lazily iterating over my generator function? The answers in the questions mentioned before have not been of help.
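One possible way to keep memory bounded, sketched here under assumptions rather than taken from the linked questions, is to pull a fixed-size batch from the generator with itertools.islice, map that batch, and only then pull the next one. The helper name batched_map and its batch_size parameter are hypothetical; map_fn stands for any map-like callable, such as the built-in map or a pool's map method. The demonstration uses the built-in map and a trivial square function so that it runs without starting any processes.

```python
from itertools import islice


def batched_map(map_fn, func, iterable, batch_size):
    """Yield func(x) for each x, pulling at most batch_size inputs at a time."""
    iterator = iter(iterable)
    while True:
        # islice advances the generator by at most batch_size items,
        # so only one batch of inputs is held in memory at any time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield from map_fn(func, batch)


def fibonacci(number_iter):
    i, j = 0, 1
    for _ in range(number_iter):
        yield i
        i, j = j, i + j


def square(n):
    return n * n


# Demonstration with the built-in map; no worker processes are started here.
squares = list(batched_map(map, square, fibonacci(8), batch_size=3))
print(squares)
```

With a pool, the same helper could be called as batched_map(pool.map, factors, fibonacci(45), batch_size) inside the if __name__ == "__main__": guard. Each batch is then computed in parallel, at the cost of waiting for the slowest item of a batch before the next batch is pulled.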

Related

Why would spawning a process make computation run twice as fast?

The following computation takes about 10.4 seconds on my Precision 5520:
import time
before = time.time()
sum = 0
for i in range(1, 100000000):
    sum += i
print(time.time() - before, sum)
On the same laptop, the following takes only 5.2 seconds:
import multiprocessing as mp
import time
def foo():
    before = time.time()
    sum = 0
    for i in range(1, 100000000):
        sum += i
    print(time.time() - before, sum)
mp.Process(target=foo).start()
This result is consistent. In fact, it holds (with a slightly smaller speed-up factor) even if I run cpu_count processes simultaneously.
So, why would spawning a process make computation run twice as fast?
It is not the process that makes the computation fast; it is running the computation inside a function. In the CPython interpreter, global variable accesses are slower than local variable accesses. If you simply call foo() in the same process, the computation time is also roughly halved. Here is an example:
import time
def foo():
    before = time.time()
    sum = 0
    for i in range(1, 100000000):
        sum += i
    print(time.time() - before, sum)
# Slow (8.806 seconds)
before = time.time()
sum = 0
for i in range(1, 100000000):
    sum += i
print(time.time() - before, sum)
# Fast (4.362 seconds)
foo()
Note that overwriting the built-in function sum is a bit dangerous, as it may break code that uses it and cause weird errors.
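The difference between global and local accesses is visible in the bytecode. A minimal sketch, assuming CPython and using the standard dis module: the same loop is compiled once as module-level code and once as a function body, and the instruction names are compared. The names LOOP_SOURCE, loop_in_function and opnames are made up for this illustration, and the variable is called total to avoid shadowing the built-in sum.

```python
import dis

LOOP_SOURCE = """
total = 0
for i in range(10):
    total += i
"""


def loop_in_function():
    total = 0
    for i in range(10):
        total += i
    return total


def opnames(code):
    """Return the set of bytecode instruction names in a code object."""
    return {instruction.opname for instruction in dis.get_instructions(code)}


# Module-level code stores its variables by name in the globals dictionary.
module_ops = opnames(compile(LOOP_SOURCE, "<module-level>", "exec"))
# Function code stores its local variables in indexed slots of the frame.
function_ops = opnames(loop_in_function.__code__)

print("module level uses STORE_NAME:", "STORE_NAME" in module_ops)
print("function uses STORE_NAME:", "STORE_NAME" in function_ops)
print("function uses a STORE_FAST variant:",
      any(name.startswith("STORE_FAST") for name in function_ops))
```

The exact instruction names differ between CPython versions, which is why the last check only looks at the STORE_FAST prefix.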

Huge difference in process timing of functions (timeit, time)

For some didactical purposes, I want to measure the timing of some functions (slightly more complex ones, than the ones shown) and discuss later on the big O scaling behaviour. But I have problems with the reliability of the numbers produced:
My code:
import time
import numpy as np
def run_time(fun, runs):
    times = []
    for i in range(runs):
        t0 = time.clock()
        fun()
        t1 = time.clock() - t0
        times.append(t1)
    return np.mean(times), np.std(times)#, times
def fact0(n):
    product = 1
    for i in range(n):
        product = product * (i+1)
    return product
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
print(run_time(lambda: fact0(500), 100000))
print(run_time(lambda: fact1(500), 100000))
and I usually get something like:
(0.000186065330000082, 5.08689027009196e-05)
(0.0002853808799999845, 8.285739309454826e-05)
So the std is large relative to the mean (roughly 27-29% in the runs shown). That's awful for me. Furthermore, I expected fact0() to be way faster than fact1() due to no recursion.
If I now use timeit:
import timeit
mysetup = ""
mycode = '''
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
fact1(500)
'''
print(timeit.timeit(setup = mysetup, stmt = mycode, number = 100000)/100000)
I'll get something about three orders of magnitude smaller for the mean:
2.463713497854769e-07
After the below discussed corrections it is:
0.00028513264190871266
Which is in good agreement with the run_time version.
So my question is: how do I time functions appropriately? Why the huge difference between my two methods of timing? Any advice? I would love to stick to "time.clock()" and I don't want to use (again for didactical reasons) cProfile or even more complex modules.
I am grateful for any comments...
(edited due to comments by @Carcigenicate)
You never call the function in mycode, so you're only timing how long it takes the function to be defined (which I would expect to be quite fast).
You need to call the function:
import timeit
mycode = '''
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
fact1(500)  # 500 as in the question; fact1(100000) would exceed the default recursion limit
'''
print(timeit.timeit(stmt=mycode, number=100000) / 100000)
Really though, this is suboptimal, since the timing includes the function definition. I'd change this up and just pass a function:
import timeit
def fact1(n):
    if n == 0:
        return 1
    else:
        return n * fact1(n-1)
# n = 500 as in the question; a recursion depth of 100000 would raise RecursionError
print(timeit.timeit(lambda: fact1(500), number=100000) / 100000)
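time.clock() was deprecated in Python 3.3 and removed in Python 3.8, so on current interpreters the run_time helper from the question needs a different clock. A minimal sketch, assuming time.perf_counter() as the replacement and the standard-library statistics module in place of NumPy (pstdev is the population standard deviation, which is also what np.std computes by default). It also shows timeit.repeat, where taking the minimum over repetitions is a common way to reduce noise from other processes. The run counts are arbitrary.

```python
import statistics
import time
import timeit


def fact0(n):
    product = 1
    for i in range(n):
        product = product * (i + 1)
    return product


def run_time(fun, runs):
    """Return (mean, standard deviation) of the per-call wall time in seconds."""
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fun()
        times.append(time.perf_counter() - t0)
    return statistics.mean(times), statistics.pstdev(times)


mean, std = run_time(lambda: fact0(500), 1000)
print(mean, std)

# timeit.repeat returns one total time per repetition; divide by `number`
# to get a per-call figure and take the minimum as the least-disturbed run.
totals = timeit.repeat(lambda: fact0(500), number=1000, repeat=5)
print(min(totals) / 1000)
```

Timing a single short call with two clock reads, as run_time does, includes the overhead of the clock calls themselves, which is one reason the spread between individual measurements can be large.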

how to improve computation speed when finding max in a large list? excluding Numpy

Consider a game with an array (=num) containing some integers. I can take any integer, remove it from the array, and add half of that number (rounded up) back to the array. I can do this for a fixed number of moves (=k).
The challenge is to minimize the sum of the final array.
My problem is that the test cases fail when dealing with large arrays :(
What is an efficient way to compute this?
My first step for the challenge is taking max(num) and replacing it with the result of ceil(max(num)/2), k times.
Another option is using sort(reverse=True) in every loop and replacing the largest value.
I've played with different sorting algorithms, read here, and tried the bisect module. They are all very new to me and didn't get past the test-case threshold, so I hope someone here can provide a helping hand for a newbie.
import math
def minSum(num, k):
    for i in range(k):
        num.sort(reverse=True)
        num.insert(0, math.ceil(num[0] / 2))
        num.remove(num[1])
    return sum(num)
minSum([10,20,7],4)
>>> 14
First off, inserting at the beginning of a Python list is much slower than inserting at the end, since every element has to be moved. Also, there's absolutely no reason to do so in the first place. You could just sort without reverse=True, then use pop to remove and return the last item and bisect.insort to put it back in at the right place without needing to re-sort the whole list.
from bisect import insort
from math import ceil
def min_sum(num, k):
    num.sort()
    for i in range(k):
        largest = num.pop()
        insort(num, ceil(largest/2))
    return sum(num)
This should already be significantly faster than your original approach. However, in the worst case this is still O(n lg n) for the sort and O(k*n) for the processing; if your input is constructed such that halving each successive largest element makes it the new smallest element, you'll end up inserting it at the start which incurs an O(n) memory movement.
We can do better by using a priority queue approach, implemented in Python by the heapq library. You can heapify a list in linear time, and then use heapreplace to remove and replace the largest element successively. A slight awkwardness here is that heapq only implements a min-heap, so we'll need an extra pass to negate our input list at the beginning. One bonus side-effect is that since we now need to round down instead of up, we can just use integer division instead of math.ceil.
from heapq import heapify, heapreplace
def min_sum(num, k):
    for i in range(len(num)):
        num[i] = -num[i]
    heapify(num)
    for i in range(k):
        largest = num[0]
        heapreplace(num, largest // 2)
    return -sum(num)
This way the initial list negation and heapification takes O(n), and then processing is only O(k lg n) since each heapreplace is a lg n operation.
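The rounding remark can be checked directly: floor division rounds toward negative infinity, so halving a negated value with // gives the negation of the rounded-up half. The sketch below verifies that identity over an arbitrary range and then runs a non-mutating variant of the heap approach on the example from the question. The name min_sum_heap is hypothetical, and copying the input instead of negating it in place is a design choice of this sketch, not part of the answer above.

```python
from heapq import heapify, heapreplace
from math import ceil

for n in range(0, 1000):
    # -n // 2 floors toward negative infinity, e.g. -7 // 2 == -4,
    # which is the negation of ceil(7 / 2) == 4.
    assert -n // 2 == -ceil(n / 2)


def min_sum_heap(num, k):
    """Halve (rounding up) the current maximum k times; return the final sum."""
    heap = [-x for x in num]      # negate so the min-heap acts as a max-heap
    heapify(heap)
    for _ in range(k):
        # heap[0] is the negated maximum; replace it with its negated half.
        heapreplace(heap, heap[0] // 2)
    return -sum(heap)


print(min_sum_heap([10, 20, 7], 4))
```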
To add some timing for all of the algorithms: it looks like @tzaman's heapq algorithm is by far the fastest, and it gives the same answer. Simply changing the sort to non-reversed does not give much speedup.
import random
import time
from bisect import insort
from heapq import heapify, heapreplace
from math import ceil
def makelist(num_elements):
    mylist = range(int(num_elements))
    mylist = list(mylist)
    for i in mylist:
        mylist[i] = int(random.random()*100)
    return mylist
def minSum(num, k):
    for i in range(k):
        num.sort(reverse=True)
        num.insert(0, ceil(num[0] / 2))
        num.remove(num[1])
    return sum(num)
def minSum2(num, k):
    last_idx = len(num) - 1
    for i in range(k):
        num.sort()
        num[last_idx] = ceil(num[last_idx] / 2)
    return sum(num)
def min_sum(num, k):
    num.sort()
    for i in range(k):
        largest = num.pop()
        insort(num, ceil(largest/2))
    return sum(num)
def min_heap(num, k):
    for i in range(len(num)):
        num[i] = -num[i]
    heapify(num)
    for i in range(k):
        largest = num[0]
        heapreplace(num, largest // 2)
    return -sum(num)
if __name__ == '__main__':
    mylist = makelist(1e4)
    k = len(mylist) + 1
    t0 = time.time()
    # we have to make a copy of mylist for all of the functions
    # otherwise mylist will be modified
    print('minSum: ', minSum(mylist.copy(), k))
    t1 = time.time()
    print('minSum2: ', minSum2(mylist.copy(), k))
    t2 = time.time()
    print('min_sum: ', min_sum(mylist.copy(), k))
    t3 = time.time()
    print('min_heap: ', min_heap(mylist.copy(), k))
    t4 = time.time()
    print()
    print('time for each method for k = %.0e: ' % k)
    print('minSum: %f sec' % (t1-t0))
    print('minSum2: %f sec' % (t2-t1))
    print('min_sum: %f sec' % (t3-t2))
    print('min_heap: %f sec' % (t4-t3))
and here is console output:
minSum: 205438
minSum2: 205438
min_sum: 205438
min_heap: 205438
time for each method for k = 1e+04:
minSum: 2.386861 sec
minSum2: 2.199656 sec
min_sum: 0.046802 sec
min_heap: 0.015600 sec
------------------
(program exited with code: 0)
Press any key to continue . . .

why is siftdown working in heapsort, but not siftup?

I have a programming assignment as follows:
You will need to convert the array into a heap using only O(n) swaps, as was described in the lectures. Note that you will need to use a min-heap instead of a max-heap in this problem. The first line of the output should contain single integer m — the total number of swaps. m must satisfy conditions 0 ≤ m ≤ 4n. The next m lines should contain the swap operations used to convert the array a into a heap. Each swap is described by a pair of integers i,j — the 0-based indices of the elements to be swapped
I have implemented a solution using the sift-up technique, comparing each element with its parent's value. It gave correct solutions for small test cases (fewer than 10 integers in the array, verified by manual checking), but it could not pass the test case with 100000 integers as input.
This is the code for that:
class HeapBuilder:
    def __init__(self):
        self._swaps = [] #array of tuples or arrays
        self._data = []
    def ReadData(self):
        n = int(input())
        self._data = [int(s) for s in input().split()]
        assert n == len(self._data)
    def WriteResponse(self):
        print(len(self._swaps))
        for swap in self._swaps:
            print(swap[0], swap[1])
    def swapup(self,i):
        if i !=0:
            if self._data[int((i-1)/2)]> self._data[i]:
                self._swaps.append(((int((i-1)/2)),i))
                self._data[int((i-1)/2)], self._data[i] = self._data[i],self._data[int((i-1)/2)]
                self.swapup(int((i-1)/2))
    def GenerateSwaps(self):
        for i in range(len(self._data)-1,0,-1):
            self.swapup(i)
    def Solve(self):
        self.ReadData()
        self.GenerateSwaps()
        self.WriteResponse()
if __name__ == '__main__':
    heap_builder = HeapBuilder()
    heap_builder.Solve()
On the other hand, I have implemented a heap sort using the sift-down technique with a similar comparison process, and this version has passed every test case.
The following is the code for this method:
class HeapBuilder:
    def __init__(self):
        self._swaps = [] #array of tuples or arrays
        self._data = []
    def ReadData(self):
        n = int(input())
        self._data = [int(s) for s in input().split()]
        assert n == len(self._data)
    def WriteResponse(self):
        print(len(self._swaps))
        for swap in self._swaps:
            print(swap[0], swap[1])
    def swapdown(self,i):
        n = len(self._data)
        min_index = i
        l = 2*i+1 if (2*i+1<n) else -1
        r = 2*i+2 if (2*i+2<n) else -1
        if l != -1 and self._data[l] < self._data[min_index]:
            min_index = l
        if r != - 1 and self._data[r] < self._data[min_index]:
            min_index = r
        if i != min_index:
            self._swaps.append((i, min_index))
            self._data[i], self._data[min_index] = \
                self._data[min_index], self._data[i]
            self.swapdown(min_index)
    def GenerateSwaps(self):
        for i in range(len(self._data)//2 ,-1,-1):
            self.swapdown(i)
    def Solve(self):
        self.ReadData()
        self.GenerateSwaps()
        self.WriteResponse()
if __name__ == '__main__':
    heap_builder = HeapBuilder()
    heap_builder.Solve()
can someone explain what is wrong with sift/swap up method?
Trying to build a heap by "swapping up" from the bottom won't always work. The resulting array will not necessarily be a valid heap. For example, consider this array: [3,4,2,6,5,7,1]. Viewed as a tree, that is:
      3
  4       2
6   5   7   1
Your algorithm starts at the last item and swaps up towards the root. So you swap 1 with 2, and then you swap 1 with 3. That gives you:
      1
  4       3
6   5   7   2
You then continue with the rest of the items, none of which have to be moved.
The result is an invalid heap: that last item, 2, should be the parent of 3.
The key here is that the swapping up method assumes that when you've processed a[i], then the item that ends up in that position is in its final place. Contrast that to the swap down method that allows repeated adjustment of items that are lower in the heap.
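The counterexample can be reproduced with a short script. This is a sketch using iterative sift_up and sift_down helpers and an is_min_heap check, all written for this illustration rather than taken from the question's HeapBuilder class. The input is the tree drawn above read level by level, that is [3, 4, 2, 6, 5, 7, 1], and the two loops follow the index order used by the two GenerateSwaps versions in the question.

```python
def is_min_heap(a):
    """True if every element is >= its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


def sift_up(a, i):
    # Move a[i] towards the root while it is smaller than its parent.
    while i > 0 and a[(i - 1) // 2] > a[i]:
        parent = (i - 1) // 2
        a[parent], a[i] = a[i], a[parent]
        i = parent


def sift_down(a, i):
    # Move a[i] towards the leaves while a child is smaller.
    n = len(a)
    while True:
        smallest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] < a[smallest]:
                smallest = child
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


data = [3, 4, 2, 6, 5, 7, 1]

swapped_up = data.copy()
for i in range(len(swapped_up) - 1, 0, -1):       # loop order of the failing version
    sift_up(swapped_up, i)

swapped_down = data.copy()
for i in range(len(swapped_down) // 2, -1, -1):   # loop order of the passing version
    sift_down(swapped_down, i)

print(swapped_up, is_min_heap(swapped_up))
print(swapped_down, is_min_heap(swapped_down))
```

A checker like is_min_heap is also a convenient way to test a submission locally on random inputs before sending it to the grader.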

Lazy Sorting HackerRank Python

I am new to coding and so the following code I wrote may be incorrect or sub-optimal. However, the problem I have is that I do not understand the input and thus cannot get the code to run (I only tested it with custom inputs).
The essence of the problem is that you have some sequence of numbers and you want to arrange the sequence monotonically (nondecreasing or nonincreasing). You do this by a random shuffle. How many shuffles does it take for you to get to the monotonic sequence via a random shuffle? You can find the problem here and here is my code below:
#!/bin/python3 ------ The following import is given in the prompt
import os
import sys
# Complete the solve function below. Here is my code below
def solve(P):
    P.sort()
    correct = P
    count = []
    i = 0
    # Here I am trying to calculate the number of ways to get the desired monotonic sequence
    # I count the number of repeated numbers in the sequence as separate items in a list
    for j in range(i,len(correct)):
        if correct[i] != correct[j] or i == len(correct) - 1:
            count.append(j-i)
            i = j
            j = len(correct)
        else:
            j = j + 1
        summ = 0
        for k in range(len(count)):
            summ = summ + count[k]
        if summ == len(correct):
            i = len(correct)
    poss = [1]*len(count)
    for k in range(len(count)):
        for l in range(1,count[k]+1):
            poss[k] = poss[k]*l
    possible = 1
    for x in poss:
        possible = possible * x
    # This is calculating the number of different permutations of n distinct elements
    total = 1
    n = len(correct)
    for i in range(1,n+1):
        total = total * i
    # Calculating the probability to the 6th decimal place
    probability = float(possible / total)
    expected = round(1/probability, 6)
    print(expected)
# The following code is given in the prompt to input the test cases
if __name__ == '__main__':
    fptr = open(os.environ['OUTPUT_PATH'], 'w')
    P_count = int(input())
    P = list(map(int, input().rstrip().split()))
    result = solve(P)
    fptr.write(str(result) + '\n')
    fptr.close()
In my code, I just assumed that P is the sequence of numbers you receive from the input.
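The calculation the code is aiming for can be written more compactly. This is a minimal sketch, not a verified solution to the linked problem. It assumes that each shuffle is a uniformly random permutation, that only the nondecreasing arrangement counts as success, and that an input which is already sorted gets no special treatment. The name expected_shuffles is hypothetical. With success probability p per shuffle, the number of shuffles is geometrically distributed, so its expectation is 1/p.

```python
from collections import Counter
from math import factorial


def expected_shuffles(P):
    """Expected number of uniform random shuffles until P is nondecreasing."""
    # Number of permutations that look sorted: equal values may be
    # swapped among themselves without changing the sequence.
    favourable = 1
    for multiplicity in Counter(P).values():
        favourable *= factorial(multiplicity)
    # The success probability is favourable / n!, so the expectation is n! / favourable.
    return factorial(len(P)) / favourable


print(f"{expected_shuffles([5, 2]):.6f}")
print(f"{expected_shuffles([1, 1, 2]):.6f}")
```

Note that solve in the code above prints its value but does not return it, so result is None and the harness writes the string "None" to the output file. Returning the number from solve, as this sketch does, avoids that.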
