I have a scenario with three tasks: A, B, and C. A has no dependencies, B depends on A, and C has no dependencies. I need to run A and C in parallel, then run B once A is complete. I am thinking a priority queue with threading is the best approach. Please suggest.
listTasks = ['A', 'B', 'C']
depDic = {'A': [], 'B': ['A'], 'C': []}

for task in listTasks:
    if not len(depDic[task]):
        th = threadObject()  # placeholder for a thread that runs the task
        th.start()
    else:
        pass  # trying to figure out this logic
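One possible way to fill in the missing branch, as a minimal sketch rather than a definitive answer: start one thread per task and let each thread wait on a `threading.Event` for every task it depends on. The function name `run_with_dependencies` and the `work` callback are hypothetical, and the sketch assumes the dependency table has no cycles.

```python
import threading

def run_with_dependencies(tasks, deps, work):
    """Start one thread per task; each thread waits for its dependencies first."""
    done = {name: threading.Event() for name in tasks}

    def runner(name):
        for dep in deps[name]:
            done[dep].wait()      # block until the dependency has finished
        try:
            work(name)
        finally:
            done[name].set()      # release dependents even if work() raised

    threads = [threading.Thread(target=runner, args=(name,)) for name in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

finished = []                     # records the order in which tasks complete
run_with_dependencies(
    ["A", "B", "C"],
    {"A": [], "B": ["A"], "C": []},
    finished.append,              # stand-in for the real work
)
print(finished.index("A") < finished.index("B"))   # True
```

With only a handful of tasks, waiting on events like this may be simpler than maintaining a priority queue; A and C start immediately, and B proceeds as soon as A signals completion.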
I want to do a job as fast as possible, so I should parallelize it using processes (not threads, because of the GIL). My problem is that I can't start the processes at the same time: it always starts p1, then p2 when p1 ends, and so on. How can I start all my processes at the same time? My simplified code:
import multiprocessing
import time

if __name__ == '__main__':

    def work(data, num):
        if num == 0:
            time.sleep(5)
        print("starts:", num)
        # ... heavy work that takes a random number of seconds ...
        print("ends", num)

    # ...
    for k in range(0, 2):
        p = multiprocessing.Process(target=work(data, k))
        p.daemon = True
        p.start()
result:
starts 0
ends 0
starts 1
ends 1
starts 2
ends 2
What I expected:
starts 0
starts 1
starts 2
ends 1 or 2
ends 1 or 2
ends 0 (because of time.sleep)
Why does my script always wait until the first process has finished before starting the next one?
First of all, making your program parallel/concurrent does not always make it faster, as Amdahl's law suggests.
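To make the Amdahl's law point concrete, here is a small sketch of the formula: if a fraction p of the job can be parallelized across n workers, the best possible speedup is 1 / ((1 - p) + p / n). The function name and the example numbers are illustrative assumptions, not measurements of your program.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of a job can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# If half of the job is inherently serial, 4 workers give at most 1.6x.
print(amdahl_speedup(0.5, 4))   # 1.6
```

Even with an unlimited number of workers, that half-serial job could never run more than twice as fast.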
Secondly, you need to pass the function itself as target and its arguments through the args parameter. Writing target=work(data, k) calls work right away in the parent process, so each call runs to completion (including the time.sleep(5)) before the next Process is even created. Start all the processes first, then call join() on each of them to wait for them to finish, as such:
process_pool = []

for k in range(0, 5):
    p = multiprocessing.Process(target=work, args=('your_data', k))
    p.daemon = True
    process_pool.append(p)

for process in process_pool:
    process.start()

for process in process_pool:
    process.join()
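The difference between target=work(data, k) and target=work, args=(data, k) can be seen without spawning any processes. `threading.Thread` accepts the same `target` and `args` parameters as `multiprocessing.Process`, so this sketch uses threads purely to record where the function body runs; the helper name `where_it_runs` is made up for the illustration.

```python
import threading

def where_it_runs(use_args):
    """Return the name of the thread in which work() actually executed."""
    seen = []

    def work(data, num):
        seen.append(threading.current_thread().name)

    if use_args:
        # The function object is handed over; the new thread calls it.
        t = threading.Thread(target=work, args=("data", 0), name="worker")
    else:
        # work(...) is evaluated here, in the caller, and returns None,
        # so the new thread is left with nothing to run.
        t = threading.Thread(target=work("data", 0), name="worker")
    t.start()
    t.join()
    return seen[0]

print(where_it_runs(use_args=True))    # worker
print(where_it_runs(use_args=False))   # name of the calling thread
```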
Tried 2 code examples from first answer here: Python sharing a lock between processes. Result is the same.
import multiprocessing
import time
from threading import Lock


def target(arg):
    if arg == 1:
        lock.acquire()
        time.sleep(1.1)
        print('hi')
        lock.release()
    elif arg == 2:
        while True:
            print('not locked')
            time.sleep(0.5)


def init(lock_: Lock):
    global lock
    lock = lock_


if __name__ == '__main__':
    lock_ = multiprocessing.Lock()
    with multiprocessing.Pool(initializer=init, initargs=[lock_], processes=2) as pool:
        pool.map(target, [1, 2])
Why does this code print:
not locked
not locked
not locked
hi
not locked
instead of
hi
not locked
Well, call your worker processes "1" and "2". They both start. 2 prints "not locked", sleeps half a second, and loops around to print "not locked" again. But note that what 2 is printing has nothing to do with whether lock is locked. Nothing in the code 2 executes even references lock, let alone synchronizes on lock. After another half second, 2 wakes up to print "not locked" for a third time, and goes to sleep again.
While that's going on, 1 starts, acquires the lock, sleeps for 1.1 seconds, and then prints "hi". It then releases the lock and ends. At the time 1 gets around to printing "hi", 2 has already printed "not locked" three times, and is about 0.1 seconds into its latest half-second sleep.
After "hi" is printed, 2 will continue printing "not locked" about twice per second forever more.
So the code appears to be doing what it was told to do.
What I can't guess, though, is how you expected to see "hi" first and then "not locked". That would require some kind of timing miracle, where 2 didn't start executing at all before 1 had been running for over 1.1 seconds. Not impossible, but extremely unlikely.
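The timing argument above can be checked with a little arithmetic. This is an idealized sketch that assumes both workers start at exactly t = 0 and that printing takes no time; the function name `timeline` and the 1.6 second cutoff are arbitrary choices for the illustration.

```python
def timeline(hi_delay=1.1, period=0.5, horizon=1.6):
    """Idealized print order for the two workers, both starting at t = 0."""
    events = [(hi_delay, "hi")]                # worker 1 prints once, after its sleep
    tick = 0
    while tick * period < horizon:             # worker 2 prints every `period` seconds
        events.append((tick * period, "not locked"))
        tick += 1
    return [message for _, message in sorted(events)]

print(timeline())
# ['not locked', 'not locked', 'not locked', 'hi', 'not locked']
```

Worker 2 prints at 0.0, 0.5 and 1.0 seconds, so three "not locked" lines land before the "hi" at 1.1 seconds, which matches the output in the question.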
Changes
Here's one way to get the output you want, although I'm making many guesses about your intent.
If you don't want 2 to start before 1 ends, then you have to force that. One way is to have 2 begin by acquiring lock at the start of what it does. That also requires guaranteeing that lock is in the acquired state before any worker begins.
So acquire it before map() is called. Then there's no point left to having 1 acquire it at all - 1 can just start at once, and release it when it ends, so that 2 can proceed.
There are few changes to the code, but I'll paste all of it in here for convenience:
import multiprocessing
import time
from threading import Lock


def target(arg):
    if arg == 1:
        time.sleep(1.1)
        print('hi')
        lock.release()
    elif arg == 2:
        lock.acquire()
        print('not locked')
        time.sleep(0.5)


def init(lock_: Lock):
    global lock
    lock = lock_


if __name__ == '__main__':
    lock_ = multiprocessing.Lock()
    lock_.acquire()
    with multiprocessing.Pool(initializer=init, initargs=[lock_], processes=2) as pool:
        pool.map(target, [1, 2])
I'm having performance issues with multithreading.
I have a code snippet that reads 8MB buffers in parallel:
import copy
import itertools
import threading
import time

# Basic implementation of thread pool.
# Based on multiprocessing.Pool
class ThreadPool:

    def __init__(self, nb_threads):
        self.nb_threads = nb_threads

    def map(self, fun, iter):
        if self.nb_threads <= 1:
            return map(fun, iter)
        nb_threads = min(self.nb_threads, len(iter))

        # ensure 'iter' does not evaluate lazily
        # (generator or xrange...)
        iter = list(iter)

        # map to results list
        results = [None] * nb_threads
        def wrapper(i):
            def f(args):
                results[i] = map(fun, args)
            return f

        # slice iter in chunks
        chunks = [iter[i::nb_threads] for i in range(nb_threads)]

        # create threads
        threads = [threading.Thread(target = wrapper(i), args = [chunk]) \
                   for i, chunk in enumerate(chunks)]

        # start and join threads
        [thread.start() for thread in threads]
        [thread.join() for thread in threads]

        # reorder results
        r = list(itertools.chain.from_iterable(map(None, *results)))
        return r

payload = [0] * (1000 * 1000) # 8 MB
payloads = [copy.deepcopy(payload) for _ in range(40)]

def process(i):
    for i in payloads[i]:
        j = i + 1

if __name__ == '__main__':

    for nb_threads in [1, 2, 4, 8, 20]:

        t = time.time()
        c = time.clock()

        pool = ThreadPool(nb_threads)
        pool.map(process, xrange(40))

        t = time.time() - t
        c = time.clock() - c

        print nb_threads, t, c
Output:
1 1.04805707932 1.05
2 1.45473504066 2.23
4 2.01357698441 3.98
8 1.56527090073 3.66
20 1.9085559845 4.15
Why does the threading module miserably fail at parallelizing mere buffer reads?
Is it because of the GIL? Or is it some weird configuration on my machine, where one process is allowed only one access to RAM at a time (I get a decent speed-up if I switch ThreadPool for multiprocessing.Pool in the code above)?
I'm using CPython 2.7.8 on a Linux distro.
Yes, Python's GIL prevents Python code from running in parallel across multiple threads. You describe your code as doing "buffer reads", but it's really running arbitrary Python code (in this case, iterating over a list adding 1 to other integers). If your threads were making blocking system calls (like reading from a file, or from a network socket), then the GIL would usually be released while the thread blocked waiting on the external data. But since most operations on Python objects can have side effects, you can't do several of them in parallel.
One important reason for this is that CPython's garbage collector uses reference counting as its main way to know when an object can be cleaned up. If several threads try to update the reference count of the same object at the same time, they might end up in a race condition and leave the object with the wrong count. The GIL prevents that from happening, as only one thread can be making such internal changes at a time. Every time your process code does j = i + 1, it's going to be updating the reference counts of the integer objects 0 and 1 a couple of times each. That's exactly the kind of thing the GIL exists to guard.
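The reference-count bookkeeping described above can be observed directly with `sys.getrefcount`. This is a small sketch that assumes CPython; it uses a fresh `object()` rather than a small integer, since newer CPython versions may treat small integers specially. The function name `refcount_delta` is made up for the example.

```python
import sys

def refcount_delta():
    """Count how many references a list of three aliases adds to one object."""
    marker = object()
    before = sys.getrefcount(marker)
    aliases = [marker, marker, marker]    # three more references to the same object
    after = sys.getrefcount(marker)
    return after - before

print(refcount_delta())   # 3 on CPython
```

Every such update is a read-modify-write on the object's counter, which is the kind of internal change that the GIL serializes.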
I have a multiprocessing.manager.Array object that will be shared by multiple workers to tally observed events: each element in the array holds the tally of a different event type. Incrementing a tally requires both read and write operations, so I believe that to avoid race conditions, each worker needs to request a lock that covers both stages, e.g.
with lock:
    my_array[event_type_index] += 1
My intuition is that it should be possible to place a lock on a specific array element. With that type of lock, worker #1 could increment element 1 at the same time that worker #2 is incrementing element 2. This would be especially helpful for my application (n-gram counting), where the array length is quite large and collisions would be rare.
However, I can't figure out how to request an element-wise lock for an array. Does such a thing exist in multiprocessing, or is there a workaround?
For more context, I've included my current implementation below:
import multiprocessing as mp
import string
from queue import Empty

def count_ngrams_in_sentence(n, ngram_counts, char_to_idx_dict, sentence_queue, lock):
    while True:
        try:
            my_sentence_str = sentence_queue.get_nowait()
            my_sentence_indices = [char_to_idx_dict[i] for i in my_sentence_str]
            my_n = n.value
            for i in range(len(my_sentence_indices) - my_n + 1):
                my_index = int(sum([my_sentence_indices[i+j]*(27**(my_n - j - 1)) \
                                    for j in range(my_n)]))
                with lock:  # lock the whole array?
                    ngram_counts[my_index] += 1
            sentence_queue.task_done()
        except Empty:
            break
    return

if __name__ == '__main__':
    n = 4
    num_ngrams = 27**n
    num_workers = 2
    sentences = [ ... list of sentences in lowercase ASCII + spaces ... ]
    manager = mp.Manager()
    sentence_queue = manager.JoinableQueue()
    for sentence in sentences:
        sentence_queue.put(sentence)
    n = manager.Value('i', value=n, lock=False)
    char_to_idx_dict = manager.dict([(i,ord(i)-97) for i in string.ascii_lowercase] + [(' ', 26)],
                                    lock=False)
    lock = manager.Lock()
    ngram_counts = manager.Array('l', [0]*num_ngrams, lock=lock)

    workers = [mp.Process(target=count_ngrams_in_sentence,
                          args=[n,
                                ngram_counts,
                                char_to_idx_dict,
                                sentence_queue,
                                lock]) for i in range(num_workers)]
    for worker in workers:
        worker.start()
    sentence_queue.join()
The Array you get from multiprocessing comes with a built-in lock. To do your own element-wise locking, you'll have to switch to RawArray.
Keep a list of locks. Before modifying an index, acquire the lock for that index, then release it when you are done.
locks[i].acquire()
array[i,:]=0
locks[i].release()
As I said, if the array is a multiprocessing.RawArray or similar, multiple processes can read or write simultaneously. For some types of arrays, reading from or writing to the array is inherently atomic - the lock is essentially built in. Carefully research this before proceeding.
As for performance, indexing into a list is on the order of nanoseconds in Python, and acquiring and releasing a lock on the order of microseconds. It's not a huge issue.
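For an array as large as the one in the question, one lock per element may be more locks than is practical. A common variation, swapped in here as an assumption rather than something the answer above prescribes, is to share a fixed pool of locks and pick one by index modulo the pool size. The class name `StripedCounter` and the pool size are hypothetical, and the sketch only exercises the counter in a single process.

```python
import multiprocessing as mp

class StripedCounter:
    """Shared tally array guarded by a small, fixed pool of locks."""

    def __init__(self, size, num_locks=16):
        self.counts = mp.RawArray('l', size)    # shared memory, no built-in lock
        self.locks = [mp.Lock() for _ in range(num_locks)]

    def increment(self, index):
        # Elements that map to different locks can be updated concurrently.
        with self.locks[index % len(self.locks)]:
            self.counts[index] += 1

counter = StripedCounter(size=27 ** 2, num_locks=8)
counter.increment(5)
counter.increment(5)
counter.increment(13)
print(counter.counts[5], counter.counts[13])   # 2 1
```

The counter would need to be handed to each worker when the `mp.Process` is created, since shared arrays and locks of this kind are meant to be passed at process creation rather than sent through a queue.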
Alright, so here's what I'm trying to do, and I'm almost sure I don't know what phrase to use to find what I'm looking for so I'll do my best to be as clear as possible with limited terminology knowledge.
I'm using Lua (or at least attempting to) to generate race tracks/segments for a D&D game.
So here's what I've done, but I can't figure out how to get one table to reference another. And no matter how hard I research or play around, it just won't work.
Table Dump:
--Track Tables
local raceClass = { 'SS', 'S', 'A', 'B', 'C', 'D', 'N' }
local trackLength = { 50, 30, 25, 20, 15, 10, 5 }
local trackDifficulty = { 3, 3, 3, 2, 2, 1, 1 }
local trackTypes = { 'Straightaway', 'Curve', 'Hill', 'Water', 'Jump' }
So, just to explain a little here. First off, we have the class of the race: N for novice, SS for most difficult. Next, we have the length of the resulting track. SS is a 50 segment track. N is a 5 segment track. Each race class has a difficulty cap on each segment of track. SS, S and A all have a cap of 3. D and N have a cap of 1. Then, each segment of track is further broken down into its type. Those are generated using this slab of code:
--Track Generation
math.randomseed(os.time())
for i = 1, trackLength do
    local trackTypeIndex = math.random(1, #trackTypes)
    local SP = math.random(1, trackDifficulty) --SP is Stamina Cost for that segment.
    print(trackTypes[trackTypeIndex]..' of SP '..SP)
end
io.read() --So it doesn't just close the window but waits for some user input.
Now it gets into the part where I start to lose myself. I want the DM to be able to input the selected race class and get a generated list of the resulting race track.
--DM Input
print('Race Class? N, D, C, B, A, S, SS')
io.flush()
local classChoice = io.read()
So, the DM puts in the class choice; let's go with N. What I can't find is a piece of code that'll take the value of classChoice and match it to an entry in raceClass, then use that position to select the matching entries in trackLength and trackDifficulty, and finally run the rest of the Track Generation code with those values and print the results, getting something along the lines of:
Straightaway of SP 1
Curve of SP 1
Water of SP 1
Water of SP 1
Jump of SP 1
That would be for a low-end novice race, which is only 5 segments long and has a max difficulty of 1. The higher classes would still generate the longer, more difficult tracks. I'm trying to be as specific as I can to minimize any confusion my inexperience in code could cause.
I think you'd be better off with a different table structure:
local raceClass = {
    SS = {50, 3},
    S = {30, 3},
    A = {25, 3},
    B = {20, 2},
    C = {15, 2},
    D = {10, 1},
    N = {5, 1},
}
Now, you can access all the data for a raceClass easily. The code would be like:
print "Race Class? N, D, C, B, A, S, SS"
io.flush()
local classChoice = (io.read "*line"):upper() -- To convert the input to upper case characters
if not raceClass[classChoice] then
    -- Wrong input was given
end
local SP, Length = raceClass[classChoice][2], raceClass[classChoice][1]