Multiprocessing threadpool concatenates arguments - python-3.x

I have a very long list, data; let's assume that it looks like this:
[(a, a, 1),
(b, b, 1),
(c, c, 1),
(d, d, 1),
(e, e, 1),
(f, f, 1),
(g, g, 1),
(h, h, 1),
(i, i, 1),]
I am trying to use multithreading as follows:
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
pool.starmap(help_func, data)
Help_func is as follows:
def help_func(in_vala, in_valb, in_valc):
    print("asking for " + str(in_vala) + " asking for " + str(in_valb))
    receiver(in_vala)
and receiver is a simple test function as so:
def receiver(group):
    print(group)
When I run my program, I can see that the output from help_func is correct, i.e., it enumerates the values of data.
However, when I look at the values generated at the receiver(), I notice some weird prints which look like:
a
b
c
de
e
f
gh
i
I am struggling to see why this might be the case. Something goes wrong when calling receiver, perhaps because receiver is non-blocking?
How should I work around this issue?
Also, when I use ThreadPool(1), I do not see this issue. My actual problem has a much larger function that is called from help_func, so I would like to run it under multiple threads ideally.

You are encountering a classic concurrency problem: operations you assume are atomic are not. The print function actually writes two things: the data you pass to it and the end argument, which defaults to "\n".
So the concatenation is the result of one thread writing its data, then another thread writing its data, and then both writing their newlines.
This is all explained much better in this Raymond Hettinger talk.
P.S.: I hope you are aware of the Python GIL. In short: only one thread can execute Python bytecode at a time. If you want to speed up a CPU-bound function, use multiprocessing; multithreading helps when your threads are blocked most of the time (for example, networking code mostly waits for packets to arrive, so threads are fine for that).
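One way to keep each line intact is to serialize the print calls with a lock. The following is a minimal sketch, not the only fix: print_lock, the optional out parameter, and the run helper are names introduced here for illustration and are not part of the question's code.

```python
import io
import threading
from multiprocessing.dummy import Pool as ThreadPool

# Hypothetical module-level lock shared by all worker threads.
print_lock = threading.Lock()

def receiver(group, out=None):
    # Holding the lock makes "write text, then write newline" one unit,
    # so another thread cannot write in between.
    with print_lock:
        print(group, file=out)  # out=None means sys.stdout

def run(data, workers=4):
    # Collect output in memory so the result is easy to inspect.
    out = io.StringIO()
    with ThreadPool(workers) as pool:
        pool.starmap(receiver, [(item, out) for item in data])
    return out.getvalue()

if __name__ == "__main__":
    print(run(list("abcdefghi")), end="")
```

The lines may still appear in a different order from run to run, since the lock only guarantees that each line is written whole. Building the full string, newline included, and passing it to a single write call is another common approach, but whether that is atomic depends on the underlying stream, so the lock is the safer assumption.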

Related

Why should I ever curry functions in Python

I have used Haskell in the past, so I understand its usage and benefits there. But in Python, I'm struggling to see why I would ever want to curry (besides stylistic preferences). Posts here and sites like this usually mention that currying allows for function reusability, but I don't see it. The common example of:
def f(a, b):
    return a + b
vs
def f(a):
    def g(b):
        return a + b
    return g
doesn't seem to showcase any real benefit. In fact, the nested function definition and even the call f(a)(b) seem less clear than the 'standard' function definition. Can someone provide an example where currying showcases some net benefit?
I often curry functions when using the multiprocessing module for running a function in parallel. Its pool.map() requires the function to accept a single argument (the data being worked on) but I often need additional parameters that are invariant. For this, I create a partial version of the function using functools.partial:
from functools import partial
from multiprocessing import Pool
def do_work(constant_param, variable_param):
    context = setup_spam(constant_param)
    return calc_eggs(context, variable_param)
pool = Pool()
inputs = [1, 2, 3]
partial_func = partial(do_work, "potatoes")
results = pool.map(partial_func, inputs)
This effectively does the following calls in parallel:
do_work("potatoes", 1)
do_work("potatoes", 2)
do_work("potatoes", 3)
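A runnable variant of the same idea might look like the sketch below. It makes two assumptions for the sake of a self-contained example: the body of do_work is a made-up stand-in for setup_spam and calc_eggs, and the pool is the thread-backed multiprocessing.dummy.Pool, which exposes the same map() interface.

```python
from functools import partial
from multiprocessing.dummy import Pool as ThreadPool  # same map() API as Pool

def do_work(constant_param, variable_param):
    # Hypothetical stand-in for setup_spam(...) and calc_eggs(...).
    return f"{constant_param}-{variable_param}"

def run(inputs):
    # Fix the invariant first argument; map() supplies the second one.
    partial_func = partial(do_work, "potatoes")
    with ThreadPool(2) as pool:
        return pool.map(partial_func, inputs)

if __name__ == "__main__":
    print(run([1, 2, 3]))
```

With a process-based Pool there is a practical reason to prefer partial over a hand-written nested function: a partial of a module-level function can be pickled and sent to worker processes, whereas a function defined inside another function cannot be pickled by the standard pickle module.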

Multithreading based on duplicated jOOλ streams

The code below represents a toy example of the problem I am trying to solve.
Imagine that we have an original stream of data, originalStream, and that the goal is to apply two very different data-processing steps to it. As an example here, one step will multiply each element by 2 and sum the result (dataProcess1) and the other will multiply by 4 and sum the result (dataProcess2). Obviously the operations would not be so simple in real life.
The idea is to use jOOλ to duplicate the stream and apply one operation to each of the two resulting streams. However, the trick is that I want to run the two processing steps in different threads. Since originalStream.duplicate() is not thread-safe out of the box, the code below fails to reliably give the right result, which should be: result1 = 570; result2 = 180. Instead the code may unpredictably fail with an NPE, yield the wrong result, or (sometimes) even give the right result.
The question is how to minimally modify the code such that it will become thread-safe.
Note that I do not want to first collect the stream into a list and then generate 2 new streams. Instead I want to stay with streams until they are eventually collected at the end of the data process. It may not be the most efficient nor the most logical thing to want to do but I think it is nevertheless conceptually interesting. Note also that I wish to keep using org.jooq.lambda.Seq (group: 'org.jooq', name: 'jool', version: '0.9.12') as much as possible as the real data processing functions will use methods that are specific to this library and not present in regular Java streams.
Seq<Long> originalStream = seq(LongStream.range(0, 10));
Tuple2<Seq<Long>, Seq<Long>> duplicatedOriginalStream = originalStream.duplicate();
ExecutorService executor = Executors.newFixedThreadPool(2);
List<Future<Long>> res = executor.invokeAll(Arrays.asList(
    () -> duplicatedOriginalStream.v1.map(x -> 2 * x).zipWithIndex().map(x -> x.v1 * x.v2).reduce((x, y) -> x + y).orElse(0L),
    () -> duplicatedOriginalStream.v2.map(x -> 4 * x).reduce((x, y) -> x + y).orElse(0L)
));
executor.shutdown();
System.out.printf("result1 = %d\tresult2 = %d\n", res.get(0).get(), res.get(1).get());

How to explain Read/Write global variables in multi threads environment

I am not familiar with multithreading, locks, or atomic/non-atomic operations.
Recently I saw an interview question as below.
Put f1 and f2 in two separate threads and run them at the same time, when both of them return, what is the value of a?
int a = 2, b = 0, c = 0
func f1()
{
    a = a * 2
    a = b
}
func f2()
{
    c = a + 11
    a = c
}
I tried to implement the above code in an Objective-C environment and what I got is a = 11. I'm not sure if this is right, since I put f1 on the main queue and ran f2 asynchronously on a global dispatch queue, which could be incorrect.
If someone could give an answer and explain the process at the level of register access, CPU processing, and memory usage, that would be great.
The answer is that the value of a is unpredictable. It can be anything. Since access to a is not atomic and there is no synchronization, different threads might see a different value for a depending on timing. If you manage to make a unaligned and run it on x86, you might even see a torn value for a, that is, a value that neither thread ever wrote.
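To get a feel for how much the outcome varies even in the most favorable case, the sketch below enumerates every interleaving of the four statements. It assumes that each statement executes atomically and that all threads see writes immediately, which is a stronger guarantee than the unsynchronized code actually has; the helper names (f1_steps, f2_steps, interleavings, possible_final_values) are invented for this illustration.

```python
from itertools import combinations

def f1_steps():
    # f1: a = a * 2; a = b
    return [
        lambda s: s.update(a=s["a"] * 2),
        lambda s: s.update(a=s["b"]),
    ]

def f2_steps():
    # f2: c = a + 11; a = c
    return [
        lambda s: s.update(c=s["a"] + 11),
        lambda s: s.update(a=s["c"]),
    ]

def interleavings(p, q):
    """Yield every merge of p and q that keeps each list's own order."""
    total = len(p) + len(q)
    for p_slots in combinations(range(total), len(p)):
        p_iter, q_iter = iter(p), iter(q)
        yield [next(p_iter) if i in p_slots else next(q_iter)
               for i in range(total)]

def possible_final_values():
    finals = set()
    for schedule in interleavings(f1_steps(), f2_steps()):
        state = {"a": 2, "b": 0, "c": 0}  # initial values from the question
        for step in schedule:
            step(state)
        finals.add(state["a"])
    return finals

if __name__ == "__main__":
    print(sorted(possible_final_values()))  # [0, 11, 13, 15]
```

Even under these idealized assumptions there are four possible results, and the a = 11 observed in the question is simply the schedule in which f1 finishes before f2 starts. Without atomicity the set of outcomes can only grow.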

asyncio with map&reduce flavor and without flooding the event loop

I am trying to use asyncio in real applications and it is not going that
easily; help from asyncio gurus is badly needed.
Tasks that spawn other tasks without flooding event loop (Success!)
Consider a task like crawling the web starting from some "seed" web pages. Each
web page leads to the generation of new download tasks in exponential(!)
progression. However, we want neither to flood the event loop nor to
overload our network. We'd like to control the task flow. This is what I
achieve well with a modification of Maxime's nice solution proposed here:
https://mail.python.org/pipermail/python-list/2014-July/687823.html
map & reduce (Fail)
Well, but I also need a very natural thing, a kind of map() & reduce()
(or functools.reduce() if we are on Python 3 already). That is, I need to
call a "summarizing" function once all the download tasks for the
links from a page have completed. This is where I fail :(
I propose an oversimplified but still useful test to model the use case:
let's use the Fibonacci function in its inefficient recursive form.
That is, let coro_sum() be what we apply with reduce() and coro_fib() be what we apply with
map(). Something like this:
@asyncio.coroutine
def coro_sum(x):
    return sum(x)
@asyncio.coroutine
def coro_fib(x):
    if x < 2:
        return 1
    res_coro = executor_pool.spawn_task_when_arg_list_of_coros_ready(
        coro=coro_sum,
        arg_coro_list=[coro_fib(x - 1), coro_fib(x - 2)])
    return res_coro
So that we could run the following tests.
Test #1 on one worker:
executor_pool = ExecutorPool(workers=1)
executor_pool.as_completed( coro_fib(x) for x in range(20) )
Test #2 on two workers:
executor_pool = ExecutorPool(workers=2)
executor_pool.as_completed( coro_fib(x) for x in range(20) )
It is very important that every coro_fib() and coro_sum()
invocation is done via a Task on some worker, not just spawned implicitly
and unmanaged!
It would be cool to find asyncio gurus interested in this very natural goal.
Your help and ideas would be very much appreciated.
best regards
Valery
There are multiple ways to compute the Fibonacci series asynchronously. First, check whether the explosive variant actually fails in your case:
@asyncio.coroutine
def coro_sum(summands):
    return sum(summands)
@asyncio.coroutine
def coro_fib(n):
    if n == 0: s = 0
    elif n == 1: s = 1
    else:
        summands, _ = yield from asyncio.wait([coro_fib(n-2), coro_fib(n-1)])
        s = yield from coro_sum(f.result() for f in summands)
    return s
Inside the else branch, you could instead compute the two summands one after the other:
a = yield from coro_fib(n-2) # don't continue until it's ready
b = yield from coro_fib(n-1)
s = yield from coro_sum([a, b])
In general, to prevent the exponential growth, you could use the asyncio.Queue (synchronization via communication) or asyncio.Semaphore (synchronization using a mutex-like counter) primitives.
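As a rough illustration of the semaphore approach, here is a sketch written with async/await rather than the @asyncio.coroutine and yield from style used above (the generator-based style was removed in Python 3.11). The limiter parameter, the main helper, and the asyncio.sleep(0) placeholder for real work are assumptions made for this example. Note that the semaphore is held only around the summing step, not while waiting for the recursive calls; holding it across the recursion would deadlock as soon as the recursion depth exceeds the limit.

```python
import asyncio

async def coro_sum(summands, limiter):
    # At most `workers` summing steps run at the same time.
    async with limiter:
        await asyncio.sleep(0)  # placeholder for real work (e.g. I/O)
        return sum(summands)

async def coro_fib(n, limiter):
    if n < 2:
        return n
    # Wait for both sub-results without holding the semaphore.
    a, b = await asyncio.gather(
        coro_fib(n - 2, limiter),
        coro_fib(n - 1, limiter),
    )
    return await coro_sum([a, b], limiter)

async def main(n, workers):
    limiter = asyncio.Semaphore(workers)
    return await coro_fib(n, limiter)

if __name__ == "__main__":
    print(asyncio.run(main(10, workers=2)))  # 55
```

This bounds how many summing steps are in progress at once, but it does not bound how many coroutines are created; for that, a bounded asyncio.Queue feeding a fixed number of worker tasks is the more direct tool.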

What is process interleaving? (in the realm of Concurrency)

I'm not quite sure what this term means. I saw it during a course where we are learning about concurrency. I've seen a lot of definitions for data interleaving, but I couldn't find anything about process interleaving.
Looking at the term, my instincts tell me it is the use of threads to run more than one process simultaneously. Is that correct?
If you imagine a process as a (possibly infinite) sequence/trace of statements (e.g. obtained by loop unfolding), then the set of possible interleavings of several processes consists of all sequences that merge the statements of those processes while keeping the statements of each individual process in their original order.
Consider for example the processes
int i;
proctype A() {
    i = 1;
}
proctype B() {
    i = 2;
}
Then the possible interleavings are i = 1; i = 2 and i = 2; i = 1, i.e. the possible final values for i are 1 and 2. This can be of course more complex, for instance in the presence of guarded statements: Then the next possible statements in an interleaving sequence are not necessarily those at the position of the next program counter, but only those that are allowed by the guard; consider for example the proctype
proctype B() {
    if
    :: i == 0 -> i = 2
    :: else -> skip
    fi
}
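Then the possible interleavings (given A() as before, i initially 0, and each option of the if treated as a single atomic step) are i = 1; skip and i = 2; i = 1, so there is only one possible final value for i. Note that in Promela the guard i == 0 and the assignment i = 2 are separate statements, so unless the option is wrapped in atomic { ... }, A() can also run between them, which leaves i = 2.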
Indeed the notion of interleavings is crucial for Spin's view of concurrency. In a trace semantics, the set of possible traces of concurrent processes is the set of possible interleavings of the traces of the individual processes.
It simply means performing (data access or execution or ...) in an arbitrary order** (see the note). In the case of concurrency, it usually refers to action interleaving.
If processes P and Q are in parallel composition (P||Q), then their actions will be interleaved. Consider the following processes:
PLAYING = (play_music -> stop_music -> STOP).
PERFORMING = (dance -> STOP).
||PLAY_PERFORM = (PLAYING || PERFORMING).
So each primitive process can be shown as follows (generated by the LTSA model-checking tool):
Then the possible traces as the result of action interleaving will be:
dance -> play_music -> stop_music
play_music -> dance -> stop_music
play_music -> stop_music -> dance
Here is the LTSA tool generated output of this example.
**note: "arbitrary" here means an arbitrary choice of which process executes next, not an arbitrary order within a process. The code in each process is always executed sequentially.
If it is still something that you're not comfortable with you can take a look at: https://www.doc.ic.ac.uk/~jnm/book/firstbook/pdf/ch3.pdf
Hope it helps! :)
Operating systems support tasks (or processes). But for now let's think of "activities".
Activities can be executed in parallel. Here are two activities, P and Q:
P: abc
Q: def
a, b, c, d, e, f are operations.
Each operation has always the same effect independent of what other
operations may be executing at the same time (atomicity).
What is the effect of executing the two activities concurrently? We
do not know for sure, but we know that it will be the same as obtained
by executing sequentially an INTERLEAVING of the two activities
[interleavings are also called SCHEDULES]. Here are the possible
interleavings of these two activities:
abcdef
abdcef
abdecf
abdefc
adbcef
......
defabc
That is, the operations of the two activities are sequenced in all possible ways that preserve the order in which the operations appeared in the two activities. A serial interleaving [serial schedule] of two activities is one where all the operations of one activity precede all the operations of the other activity.
The importance of the concept of interleaving is that it allows us to express the meaning of concurrent programs: The parallel execution of activities is equivalent to the sequential execution of one of the interleavings of these activities.
For detailed information: https://cis.temple.edu/~ingargio/cis307/readings/interleave.html
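The list of interleavings of P and Q above can be generated mechanically. The sketch below is one possible way to do it; the function name interleavings is chosen here for illustration, and each activity is represented as a string with one character per operation.

```python
def interleavings(p, q):
    """Yield every merge of p and q that preserves the order within each."""
    if not p or not q:
        # One activity is finished: the other runs to completion.
        yield p + q
        return
    # Either the next operation comes from p ...
    for rest in interleavings(p[1:], q):
        yield p[0] + rest
    # ... or it comes from q.
    for rest in interleavings(p, q[1:]):
        yield q[0] + rest

if __name__ == "__main__":
    schedules = list(interleavings("abc", "def"))
    print(len(schedules))  # 20
    print(schedules[:5])   # ['abcdef', 'abdcef', 'abdecf', 'abdefc', 'adbcef']
    print(schedules[-1])   # defabc
```

For two activities of three operations each there are 20 interleavings (6 choose 3), of which exactly two, abcdef and defabc, are serial schedules.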
