Why doesn't multiprocessing Lock acquiring work? - python-3.x

I tried two code examples from the first answer here: Python sharing a lock between processes. The result is the same.
import multiprocessing
import time
from threading import Lock

def target(arg):
    if arg == 1:
        lock.acquire()
        time.sleep(1.1)
        print('hi')
        lock.release()
    elif arg == 2:
        while True:
            print('not locked')
            time.sleep(0.5)

def init(lock_: Lock):
    global lock
    lock = lock_

if __name__ == '__main__':
    lock_ = multiprocessing.Lock()
    with multiprocessing.Pool(initializer=init, initargs=[lock_], processes=2) as pool:
        pool.map(target, [1, 2])
Why does this code print:
not locked
not locked
not locked
hi
not locked
instead of:
hi
not locked

Well, call your worker processes "1" and "2". They both start. 2 prints "not locked", sleeps half a second, and loops around to print "not locked" again. But note that what 2 is printing has nothing to do with whether lock is locked. Nothing in the code 2 executes even references lock, let alone synchronizes on lock. After another half second, 2 wakes up to print "not locked" for a third time, and goes to sleep again.
While that's going on, 1 starts, acquires the lock, sleeps for 1.1 seconds, and then prints "hi". It then releases the lock and ends. At the time 1 gets around to printing "hi", 2 has already printed "not locked" three times, and is about 0.1 seconds into its latest half-second sleep.
After "hi" is printed, 2 will continue printing "not locked" about twice per second forever more.
So the code appears to be doing what it was told to do.
What I can't guess, though, is how you expected to see "hi" first and then "not locked". That would require some kind of timing miracle, where 2 didn't start executing at all before 1 had been running for over 1.1 seconds. Not impossible, but extremely unlikely.
Changes
Here's one way to get the output you want, although I'm making many guesses about your intent.
If you don't want 2 to start before 1 ends, then you have to force that. One way is to have 2 begin by acquiring lock at the start of what it does. That also requires guaranteeing that lock is in the acquired state before any worker begins.
So acquire it before map() is called. Then there's no point in having 1 acquire it at all: 1 can just start at once, and release the lock when it ends, so that 2 can proceed.
There are only a few changes to the code, but I'll paste all of it here for convenience:
import multiprocessing
import time
from threading import Lock

def target(arg):
    if arg == 1:
        time.sleep(1.1)
        print('hi')
        lock.release()
    elif arg == 2:
        lock.acquire()
        print('not locked')
        time.sleep(0.5)

def init(lock_: Lock):
    global lock
    lock = lock_

if __name__ == '__main__':
    lock_ = multiprocessing.Lock()
    lock_.acquire()
    with multiprocessing.Pool(initializer=init, initargs=[lock_], processes=2) as pool:
        pool.map(target, [1, 2])
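With those changes, worker 2 blocks on lock.acquire() until worker 1 has slept 1.1 seconds, printed hi and released the lock, so running the script should print the output you were after:
hi
not locked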

Related

Asyncio gather difference

From my understanding, both code blocks are doing the same thing. Why is there a difference in execution time?
import asyncio
import time
...

# Block 1:
start_time = time.time()
tasks = [
    get_from_knowledge_v2(...),
    get_from_knowledge_v2(...),
    get_from_knowledge_v2(...),
]
data_list = await asyncio.gather(*tasks)
print("TIME TAKEN::", time.time() - start_time)

# Block 2:
start_time = time.time()
data1 = await get_from_knowledge_v2(...)
data2 = await get_from_knowledge_v2(...)
data3 = await get_from_knowledge_v2(...)
print("WITHOUT ASYNCIO GATHER TIME TAKEN::", time.time() - start_time)
Result:
TIME TAKEN:: 0.6016566753387451
WITHOUT ASYNCIO GATHER TIME TAKEN:: 1.7620849609375
The asyncio.gather function runs the awaitables you pass to it concurrently. If at least one of them is doing I/O, the event loop can use that waiting time for useful context switches and make progress on the others, which gives you a certain degree of parallelism.
In this case I assume that get_from_knowledge_v2 does some HTTP request in a way that supports asynchronous execution.
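As an illustration only (the real function wasn't shown), such a function might wrap an asynchronous HTTP call, for example with aiohttp:
import aiohttp

async def get_from_knowledge_v2(url: str) -> dict:
    # Hypothetical stand-in: awaiting the HTTP I/O here is what lets the
    # event loop switch to the other calls while this one is waiting.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.json()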
In the second code block you have no concurrency between the three get_from_knowledge_v2 calls. Instead you execute them sequentially (with respect to each other). In other words, while you are awaiting the first one of them, the second one will not start; the surrounding coroutine is suspended at each await until that call completes.
Note: This does not mean that outside of that code block no concurrency is happening/possible. If this sequential code block is inside an async function (i.e. coroutine), you can execute that concurrently with some other coroutine. It is just that inside that code block, those get_from_knowledge_v2 coroutines are executed sequentially.
The time you measured confirms this rather nicely since you have three coroutines and gather allows them to be executed almost in parallel, while the other code block executes them sequentially, thus leading to an almost three times longer execution time.
PS
Maybe a minimal concrete example will help illustrate what I mean:
from asyncio import gather, run, sleep
from time import time

async def sleep_and_print(seconds: float) -> None:
    await sleep(seconds)
    print("slept", seconds, "seconds")

async def concurrent_sleeps() -> None:
    await gather(
        sleep_and_print(3),
        sleep_and_print(2),
        sleep_and_print(1),
    )

async def sequential_sleeps() -> None:
    await sleep_and_print(3)
    await sleep_and_print(2)
    await sleep_and_print(1)

async def something_else() -> None:
    print("Doing something else that takes 4 seconds...")
    await sleep(4)
    print("Done with something else!")

async def main() -> None:
    start = time()
    await concurrent_sleeps()
    print("concurrent_sleeps took", round(time() - start, 1), "seconds\n")

    start = time()
    await sequential_sleeps()
    print("sequential_sleeps took", round(time() - start, 1), "seconds\n")

    start = time()
    await gather(
        sequential_sleeps(),
        something_else(),
    )
    print("sequential_sleeps & something_else together took", round(time() - start, 1), "seconds")

if __name__ == '__main__':
    run(main())
Running that script gives the following output:
slept 1 seconds
slept 2 seconds
slept 3 seconds
concurrent_sleeps took 3.0 seconds
slept 3 seconds
slept 2 seconds
slept 1 seconds
sequential_sleeps took 6.0 seconds
Doing something else that takes 4 seconds...
slept 3 seconds
Done with something else!
slept 2 seconds
slept 1 seconds
sequential_sleeps & something_else together took 6.0 seconds
This illustrates that the sleeping was done almost in parallel inside concurrent_sleeps, with the 1 second sleep finishing first, then the 2 second sleep, then the 3 second sleep.
It shows that the sleeping is done sequentially inside sequential_sleeps and in the call order, meaning it first slept 3 seconds, then it slept 2 seconds, then 1 second.
And finally, executing sequential_sleeps concurrently with something_else shows that they are executed almost in parallel, with the 3-second-sleep finishing first (after 3 seconds), then one second later something_else finished, then another second later the 2-second-sleep, then after another second the 1-second-sleep. Together they still took approximately 6 seconds.
That last part is what I meant when I said you can still execute another coroutine concurrently with the sequential block of code. In itself, the code block will still always remain sequential.
I hope this is clearer now.
PPS
Just to throw another option into the mix, you can also achieve concurrency by using Tasks. Calling asyncio.create_task will immediately schedule the coroutine for execution on the event loop. The task it creates should be awaited at some point, but the underlying coroutine will start running almost immediately after calling create_task. You can add this to the example script above:
from asyncio import create_task
...

async def task_sleeps() -> None:
    t3 = create_task(sleep_and_print(3))
    t2 = create_task(sleep_and_print(2))
    t1 = create_task(sleep_and_print(1))
    await t3
    await t2
    await t1

async def main() -> None:
    ...
    start = time()
    await task_sleeps()
    print("task_sleeps took", round(time() - start, 1), "seconds\n")
And you'll see the following again:
...
slept 1 seconds
slept 2 seconds
slept 3 seconds
task_sleeps took 3.0 seconds
Tasks are a nice option to decouple the execution of some coroutine from its surrounding context to an extent, but you need to keep track of them in some way.
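If you are on Python 3.11 or later, one way to keep track of them is asyncio.TaskGroup, which awaits every task created inside it when the async with block exits. A minimal sketch that could be added to the example script above:
from asyncio import TaskGroup

async def task_group_sleeps() -> None:
    # All tasks created via tg.create_task() are awaited automatically
    # when the async with block is left.
    async with TaskGroup() as tg:
        tg.create_task(sleep_and_print(3))
        tg.create_task(sleep_and_print(2))
        tg.create_task(sleep_and_print(1))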

How to create new non-blocking processes in python? (examples does not work)

I want to do a job as fast as possible, so I should parallelize it using processes (not threads, because of the GIL). My problem is that I can't start the processes at the same time: it always starts p1, then p2 when p1 ends, and so on. How can I start all my processes at the same time? My simplified code:
import multiprocessing
import time

if __name__ == '__main__':
    def work(data, num):
        if num == 0:
            time.sleep(5)
        print("starts:", num)
        # ****** heavy work that lasts a random number of seconds ******
        print("ends", num)
        # **********

    for k in range(0, 2):
        p = multiprocessing.Process(target=work(data, k))
        p.daemon = True
        p.start()
result:
starts 0
ends 0
starts 1
ends 1
starts 2
ends 2
What i expected:
starts 0
starts 1
starts 2
ends 1 or 2
ends 1 or 2
ends 0 (because of time.sleep)
Why does my script always wait until the first process is finished before starting the next one?
First of all, making your program parallel/concurrent does not always make it faster, as Amdahl's law suggests.
Secondly, the real problem is that target=work(data, k) calls work immediately in the main process (blocking on time.sleep(5) for the first call) and only passes its return value to Process, so each run finishes before the next one begins. Pass the function itself as target and its arguments with the args parameter, then start all the processes and join() them afterwards to wait for them to finish:
process_pool = []
for k in range(0, 5):
    p = multiprocessing.Process(target=work, args=('your_data', k))
    p.daemon = True
    process_pool.append(p)

for process in process_pool:
    process.start()

for process in process_pool:
    process.join()
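For reference, here is a self-contained version of that idea, with a stand-in work function (the original one wasn't shown) and the start-up code guarded by if __name__ == '__main__' so it also works with the spawn start method on Windows and macOS:
import multiprocessing
import time

def work(data, num):
    if num == 0:
        time.sleep(5)  # stand-in for the original delay
    print("starts:", num)
    # ... heavy work that takes a random number of seconds ...
    print("ends", num)

if __name__ == '__main__':
    process_pool = []
    for k in range(0, 3):
        p = multiprocessing.Process(target=work, args=('your_data', k))
        p.daemon = True
        process_pool.append(p)
    for process in process_pool:
        process.start()
    for process in process_pool:
        process.join()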

Multiprocessing in Python with time offset

I have code in which there are two functions (fun1(), fun2()). Now I want to execute these two functions simultaneously, but with some time offset.
Note: The execution time of fun1() and fun2() is the same.
Explanation:
The code is running.
fun1() is running and doing some tasks.
After a particular time offset (say 10 seconds) I want to run fun2() along with fun1().
When fun1() is done it should stop (Here fun2() is still running).
Again after 10 seconds fun1() should run and when fun2() is done it should stop (Here fun1() is still running).
And this process should repeat.
For parallel execution, I tried Multiprocess in python.
Below is a sample code.
from multiprocessing import Process
from datetime import datetime

def fun1():
    # do something
    pass

def fun2():
    # do something
    pass

def main():
    t = 0            # initially time = 0
    t_offset = 10    # time offset
    processes = []
    p1 = Process(target=fun1)  # Process p1 for fun1
    p2 = Process(target=fun2)  # Process p2 for fun2
    processes.append(p1)
    processes.append(p2)
    while True:
        dt = datetime.now()
        t = datetime.now().second
        if dt.second == 0:    # Here process p1 is started at the beginning of a minute.
            p1.start()
        if t == t_offset:     # Here the process should start after the 10-second offset.
            p2.start()
Is there any solution to the above problem? Can I have two processes running together with a time offset between them?
Just add a time delay to the process you want by adding some lines of code at its start, e.g.:
import time

n = 0
while n < 3000:
    time.sleep(1)
    n = n + 1
    print(str(3000 - n) + ' seconds until the process starts')
This causes a delay of 3000 seconds before the body of the process starts.
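As a sketch of an alternative (with stand-in fun1 and fun2, since the real ones weren't shown), you can also put the offset in the parent process by sleeping between the two start() calls:
import time
from multiprocessing import Process

def fun1():
    print("fun1 started")
    time.sleep(20)  # stand-in for the real work in fun1
    print("fun1 finished")

def fun2():
    print("fun2 started")
    time.sleep(20)  # stand-in for the real work in fun2
    print("fun2 finished")

if __name__ == '__main__':
    p1 = Process(target=fun1)
    p2 = Process(target=fun2)
    p1.start()
    time.sleep(10)  # the 10-second offset between the two starts
    p2.start()
    p1.join()
    p2.join()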

Yield stop the action by penetrating a while loop

I read this minimal demonstration of a coroutine in A Curious Course on Coroutines and Concurrency:
def countdown(n):
    print("Counting down from", n)
    while n > 0:
        yield n
        n -= 1

# why does it stop?
x = countdown(10)
# no output was produced
It prints no result when I first call it:
In [10]: x
Out[10]: <generator object countdown at 0x1036e4228>
but it does produce output once I call next on it:
In [14]: next(x)
Out[14]: Counting down from 10
In [15]: next(x)
Out[15]: 1
In [16]: next(x)
Why is print("Counting down from", n) not executed directly when I invoke the function countdown()?
I think Counting down from 10 should be printed regardless of the yield, since it is a sequential process.
What stops print("Counting down from", n) from running? I am aware that in
do something
yield
the yield pauses whatever comes after it, but in the countdown example, how can the yield, which sits inside the while loop, prevent print("Counting down from", n) from running at all?
If I understand your question correctly, you expect to see the Counting down from 10 text printed out immediately when you call countdown(10). But that reflects a misunderstanding of how generator functions work.
A yield expression isn't something that just interrupts the control flow of a normal function. Rather, any function that contains a yield anywhere in it becomes a generator function, which works differently than a normal function.
When you call a generator function, none of its code runs immediately. Instead, Python just creates a generator object that encapsulates the state of the function call (which at first will just record that you're at the very top of the function which hasn't started running yet). The generator object is what gets returned to the caller.
It is only after you call next on the generator object that the function's code starts to run. It runs until it reaches a yield expression, and the value being yielded is what that next call returns. The state of the running function is saved as part of the generator object, and it remains paused until you call next on it again.
The important thing to note is that the generator object doesn't ever run ahead of a yield until the outside code is done with the yielded value and asks for another one. We use generator functions specifically because they are lazy!
Here's a simple script that might help you understand how this works; unlike your example generator, it doesn't try to do anything useful:
import time

def generator_function():
    print("generator start")
    yield 1
    print("generator middle")
    yield 2
    print("generator end")

print("creating the generator")
generator_object = generator_function()

print("taking a break")
time.sleep(1)

print("getting first value")
val1 = next(generator_object)
print("got", val1)

print("taking a break")
time.sleep(1)

print("getting second value")
val2 = next(generator_object)
print("got", val2)

print("taking a break")
time.sleep(1)

print("try getting a third value (it won't work)")
try:
    val3 = next(generator_object)  # note, the assignment never occurs, since next() raises
    print("got", val3)  # this line won't ever be reached
except Exception as e:
    print("got an exception instead of a value:", type(e))
The print statements from the generator will always appear between the "getting" and "got" messages from the outer code.
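For reference, running the script should produce output along these lines (with a one-second pause after each "taking a break"):
creating the generator
taking a break
getting first value
generator start
got 1
taking a break
getting second value
generator middle
got 2
taking a break
try getting a third value (it won't work)
generator end
got an exception instead of a value: <class 'StopIteration'>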

Python double-ended queue issue

I need to create a program that runs constantly until the user presses q to end it. The program asks the user for a number, puts the number in a queue, and then prints the queue with the new element added. If the number starts with a 0 (e.g. 01, 02), it should be added at the left-hand side without the leading 0; otherwise it goes on the right-hand side. The user can remove an item from the end of the queue by typing r.
I got the starting point where it asks the user and goes until 'q' is pressed.
while True:
    if input("\n\n\nType a number to add it to the queue or q to exit: ") == 'q':
        break
Separate out the input call from the logic that depends on the value it returns. Instead, assign the value to a variable that you can examine several times:
while True:
    val = input(...)
    if val == 'q':
        break
    if val.startswith('0'):
        ...
    else:
        ...
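Building on that skeleton, a minimal sketch of the whole exercise could look like this (assuming the intended behaviour: a number with a leading 0 goes on the left with the zero stripped, anything else goes on the right, and r removes an item from the right-hand end):
from collections import deque

queue = deque()

while True:
    val = input("\n\n\nType a number to add it to the queue, r to remove, or q to exit: ")
    if val == 'q':
        break
    if val == 'r':
        if queue:
            queue.pop()  # remove from the right-hand end
    elif val.startswith('0'):
        queue.appendleft(val.lstrip('0'))  # add on the left, without the leading zeros
    else:
        queue.append(val)  # add on the right
    print(list(queue))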
