Python multiprocessing: set() in dict() doesn't work - python-3.x

When a set() is stored inside a manager dict(), calling .add() on it has no effect.
import multiprocessing
manager = multiprocessing.Manager()
shared_dict = manager.dict()
def worker1(d, key):
    if key not in shared_dict:
        d[key] = {'0': set(), '1': set()}
def worker2(d, key):
    if key not in shared_dict:
        d[key] = {'0': set(), '1': set()}
    d[key]['0'].add(1)
    d[key]['1'].add(2)
process1 = multiprocessing.Process(
    target=worker1, args=[shared_dict, 'a'])
process2 = multiprocessing.Process(
    target=worker2, args=[shared_dict, 'b'])
process1.start()
process2.start()
process1.join()
process2.join()
I expected the following output:
{'a': {'1': set([]), '0': set([])}, 'b': {'1': set([2]), '0': set([1])}}
instead of:
{'a': {'1': set([]), '0': set([])}, 'b': {'1': set([]), '0': set([])}}

You can read about your problem in the Python documentation, which says:
If standard (non-proxy) list or dict objects are contained in a
referent, modifications to those mutable values will not be propagated
through the manager because the proxy has no way of knowing when the
values contained within are modified.
So, under "normal" circumstances if you create a new reference to an object and modify it, the modification is applied to the object no matter which reference you use to modify it:
a = set([1])
b = a
b.add(2)
print(a, b) # {1, 2} {1, 2}
In the manager, however, the modifications are not applied to the object, for the reason quoted above. Nevertheless, you can create a new reference to the object, change the value from there, and then reassign the modified version to the dict.
import multiprocessing
manager = multiprocessing.Manager()
shared_dict = manager.dict()
def worker1(d, key):
    d.setdefault(key, {'0': set(), '1': set()})
def worker2(d, key):
    d.setdefault(key, {'0': set(), '1': set()})
    buffer = d[key]  # local copy of the nested dict
    for i, (k, v) in enumerate(buffer.items()):
        buffer[k].add(i)
    d[key] = buffer  # reassign so the manager sees the change
process1 = multiprocessing.Process(
    target=worker1, args=[shared_dict, 'a'])
process2 = multiprocessing.Process(
    target=worker2, args=[shared_dict, 'b'])
process1.start()
process2.start()
process1.join()
process2.join()
By the way, you can use dict.setdefault to replace those if statements.
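If you want to see the effect without starting any processes, here is a rough sketch that imitates it. CopyOnReadDict is a made-up stand-in, not part of multiprocessing; it only assumes that reading a value from a proxy hands back a pickled copy, which is what makes the in-place .add() get lost.

```python
import pickle


class CopyOnReadDict:
    """Hypothetical stand-in for a manager dict: reads return copies."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        # A pickle round trip mimics a value travelling between processes.
        return pickle.loads(pickle.dumps(self._data[key]))

    def __setitem__(self, key, value):
        self._data[key] = pickle.loads(pickle.dumps(value))


store = CopyOnReadDict()
store['b'] = {'0': set(), '1': set()}

# Mutating the value returned by a read only changes a throwaway copy.
store['b']['0'].add(1)
lost = store['b']

# Read, modify, then assign back so the store sees the change.
buffer = store['b']
buffer['0'].add(1)
buffer['1'].add(2)
store['b'] = buffer
kept = store['b']

print(lost)
print(kept)
```

The first print still shows two empty sets, while the second shows the values that were written back through assignment.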

Related

The problem of using {}.fromkeys(['k1','k2'],[]) and {'k1':[],'k2':[]}

list1 = [99,55]
dict1 = {'k1':[],'k2':[]}
for num in list1:
    if num > 77:
        dict1['k1'].append(num)
    else:
        dict1['k2'].append(num)
print(dict1)
{'k1':[99],'k2':[55]}
But when I replaced dict1 = {'k1':[],'k2':[]} with {}.fromkeys(['k1','k2'],[]), the result became {'k1': [99, 55], 'k2': [99, 55]}
Why does this happen? I really have no idea.
This happens because you are passing the same list object to both keys. This is the same situation as when you create an alias for a variable:
a = []
b = a
a.append(55)
b.append(99)
print(b)
prints [55, 99] because it is the same list instance.
If you want a concise way to initialize every key in a list of keys with its own empty list, you can do this:
dict1 = {k: [] for k in ('k1', 'k2')}
This will create a new list instance for every key.
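A quick way to check the difference yourself is to compare object identity with "is". This is just a small sketch; the variable names shared and separate are mine.

```python
keys = ('k1', 'k2')

# fromkeys evaluates the default once, so every key maps to the same list.
shared = dict.fromkeys(keys, [])
# The comprehension evaluates [] once per key, so each key gets its own list.
separate = {k: [] for k in keys}

shared['k1'].append(99)
separate['k1'].append(99)

print(shared['k1'] is shared['k2'])      # True
print(separate['k1'] is separate['k2'])  # False
print(shared)    # {'k1': [99], 'k2': [99]}
print(separate)  # {'k1': [99], 'k2': []}
```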
Alternatively, you can use defaultdict
from collections import defaultdict
list1 = [99,55]
dict1 = defaultdict(list)
for num in list1:
    if num > 77:
        dict1['k1'].append(num)
    else:
        dict1['k2'].append(num)
print(dict1)
Also works.
dict.fromkeys() can also be given a mutable object as the default value.
If we then append a value to the original list, the change shows up under every key.
Example:
list1 = ['a', 'b', 'c', 'd']
list2 = ['SALIO']
dict1 = dict.fromkeys(list1, list2)
print(dict1)
output:
{'a': ['SALIO'], 'b': ['SALIO'], 'c': ['SALIO'], 'd': ['SALIO']}
then you can use this:
list1 = ['k1', 'k2']
dict1 = {'k1':[],'k2':[]}
list2 = [99,55]
for num in list2:
    if num > 77:
        a = ['k1']
        dict1 = dict.fromkeys(a, [num])
    else:
        b = ['k2']
        dict2 = dict.fromkeys(b, [num])
res = {**dict1, **dict2}
print(res)
output:
{'k1': [99], 'k2': [55]}
You can also merge the two dicts with a small helper function:
def Merge(dict1, dict2):
    return(dict2.update(dict1))
Then:
print(Merge(dict1, dict2))  # prints None, because dict.update() returns None
print(dict2)  # dict2 has been updated in place

Python:dict comprehension and eval function variable scope

Code 1: for loop
def foo():
    one = '1'
    two = '2'
    three = '3'
    d = {}
    for name in ('one', 'two', 'three'):
        d[name] = eval(name)
    print(d)
foo()
output:
{'one': '1', 'two': '2', 'three': '3'}
Code 2: dict comprehension
def foo():
    one = '1'
    two = '2'
    three = '3'
    print({name: eval(name) for name in ('one', 'two', 'three')})
foo()
output:
NameError: name 'one' is not defined
Code 3: add global keyword
def foo():
    global one, two, three # why?
    one = '1'
    two = '2'
    three = '3'
    print({name: eval(name) for name in ('one', 'two', 'three')})
foo()
output:
{'one': '1', 'two': '2', 'three': '3'}
Dict comprehensions and generator expressions create their own local scope. But going by how closures work, why can't Code 2 access the variables one, two and three of the enclosing function foo, while Code 3 can build the dictionary once those variables are declared global?
So is it because the eval function and the dict comprehension have different scopes?
I hope someone can help me; I would be grateful!
To understand what's happening, try this:
def foo():
    global one
    one = '1'
    two = '2'
    print({'locals, global': (locals(), globals()) for _ in range(1)})
foo()
Output
{'locals, global': ({'_': 0, '.0': <range_iterator object at ...>},
{'__name__': '__main__', '__package__': None, ..., 'one': '1'})}
The built-in eval has the full signature eval(expression[, globals[, locals]]); when globals and locals are omitted, the expression is evaluated in the namespaces of the place where eval is called.
As you can see in the previous output, locals() inside the comprehension is not the local symbol table of the function, because list/dict comprehensions have their own scope (see https://bugs.python.org/msg348274 for instance).
To get the output you expected, you just have to pass the function's local symbol table to eval.
def bar():
    one = '1'
    two = '2'
    three = '3'
    func_locals = locals() # bind the locals() here
    print({name: eval(name, globals(), func_locals) for name in ('one', 'two', 'three')})
bar()
Output
{'one': '1', 'two': '2', 'three': '3'}
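Since the names are plain variable names, you may not need eval at all. As a sketch (baz is just a name I picked for the example), you can capture locals() before the comprehension and look the names up in that dictionary directly:

```python
def baz():
    one = '1'
    two = '2'
    three = '3'
    # Capture the function's namespace before entering the comprehension.
    func_locals = locals()
    # A plain dictionary lookup replaces eval(name).
    return {name: func_locals[name] for name in ('one', 'two', 'three')}


result = baz()
print(result)  # {'one': '1', 'two': '2', 'three': '3'}
```

This only works for simple names; if the strings were arbitrary expressions, you would still need eval with the explicit namespaces shown above.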

How can I write a program in Python Dictionary that prints repeated keys values?

This is my INPUT:
dic1 = {'a':'USA', 'b':'Canada', 'c':'France'}
dic2 = {'c':'Italy', 'd':'Norway', 'e':'Denmark'}
dic3 = {'e':'Finland', 'f':'Japan', 'g':'Germany'}
I want output something like this:
{'g': 'Germany', 'e': ['Denmark', 'Finland'], 'd': 'Norway', 'c': ['Italy', 'France'], 'f': 'Japan', 'b': 'Canada', 'a': 'USA'}
That is programming: you think through the steps you need to get to your desired result, and write code to perform these steps, one at a time.
A function like this can do it:
def merge_dicts(*args):
    merged = {}
    for dct in args:
        for key, value in dct.items():
            if key not in merged:
                merged[key] = []
            merged[key].append(value)
    return merged
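Note that merge_dicts wraps every value in a list, including keys that appear only once, whereas the desired output keeps single values as plain strings. If that matters, one possible follow-up step is sketched below; collapse_singletons is a helper name I made up, and merge_dicts is repeated (using setdefault) so the snippet runs on its own.

```python
def merge_dicts(*dicts):
    merged = {}
    for dct in dicts:
        for key, value in dct.items():
            # Collect every value seen for a key, in the order the dicts are given.
            merged.setdefault(key, []).append(value)
    return merged


def collapse_singletons(merged):
    # Unwrap lists holding a single value; keep real duplicates as lists.
    return {k: v[0] if len(v) == 1 else v for k, v in merged.items()}


dic1 = {'a': 'USA', 'b': 'Canada', 'c': 'France'}
dic2 = {'c': 'Italy', 'd': 'Norway', 'e': 'Denmark'}
dic3 = {'e': 'Finland', 'f': 'Japan', 'g': 'Germany'}

result = collapse_singletons(merge_dicts(dic1, dic2, dic3))
print(result)
```

The values for a repeated key come out in the order the dictionaries were passed, so 'c' gives ['France', 'Italy'] here.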

recursively add to python dictionary from tuple

I'm attempting to take the tuple ('a', 'b', 'c') and create a layered dictionary like this:
{'a': {'b': {'c': {}}}}. I'm using recursion to do this. When I print out the dictionary after each stage of my script (the many prints are only there for debugging), it shows that the dictionary is being built correctly, but then it is taken apart again and ends up wrong. The dictionary I'm left with is {'c': {}}. I must be doing something wrong in the recursion part. Any help will be much appreciated. Here is my code:
def incr_dict(dct, tpl):
    if len(tpl) == 0:
        dct = dct
        print(dct)
        print('1')
    else:
        dct = {tpl[-1]:dct}
        print(dct)
        print('2')
        incr_dict(dct, tpl[0:-1])
        print(dct)
        print('3')
    return dct
dct = {}
tpl = ('a', 'b', 'c')
dct=incr_dict(dct, tpl)
print(dct)
print('4')
You're ALMOST there!! Change the incr_dict(dct, tpl[0:-1]) line to read return incr_dict(dct, tpl[0:-1]). I believe that will fix the problem.
When using recursion, it is important to return the recursive call -- otherwise the 'higher levels' of the recursion can't make use of the new information. By returning the recursion, the execution will continue to recurse until the terminating condition is met, and then the computed values will begin to be returned up the chain until they are finally returned from the first invocation of the function.
The final code should look as follows:
def incr_dict(dct, tpl):
    if len(tpl) == 0:
        dct = dct
    else:
        dct = {tpl[-1]:dct}
        return incr_dict(dct, tpl[0:-1])
    return dct
dct = {}
tpl = ('a', 'b', 'c')
dct=incr_dict(dct, tpl)
print(dct)
I removed some of the debugging statements for clarity.
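If you don't strictly need recursion, the same structure can be built with a loop that wraps the dictionary from the innermost key outward. This is only an alternative sketch, and nest_keys is a name I chose for it:

```python
def nest_keys(keys):
    nested = {}
    # Start from the last key so each earlier key wraps what was built so far.
    for key in reversed(keys):
        nested = {key: nested}
    return nested


result = nest_keys(('a', 'b', 'c'))
print(result)  # {'a': {'b': {'c': {}}}}
```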

Counter class extension

I am having a problem finding an elegant way to create a Counter() class that can:
Take an arbitrary number of keys and return a nested dictionary based on this list of keys.
Increment the value in this nested dictionary by an arbitrary amount as well.
For example:
counter = Counter()
for line in fin:
    if a:
        counter.incr(key1, 1)
    else:
        counter.incr(key2, key3, 2)
print(counter)
Ideally I am hoping to get a result that looks like: {key1: 20, key2: {key3: 40}}. But I am stuck on creating this arbitrarily nested dictionary from a list of keys. Any help is appreciated.
You can subclass dict and create your own nested structure.
Here's my attempt at writing such a class:
class Counter(dict):
    def incr(self, *args):
        if len(args) < 2:
            raise TypeError("incr() takes at least 2 arguments (%d given)" % len(args))
        curr = self
        # Everything but the last argument is the key path; the last is the increment.
        keys, count = args[:-1], args[-1]
        for depth, key in enumerate(keys, 1):
            if depth == len(keys):
                curr[key] = curr.setdefault(key, 0) + count
            else:
                curr = curr.setdefault(key, {})
counter = Counter()
counter.incr('key1', 1)
counter.incr('key2', 'key3', 2)
counter.incr('key1', 7)
print(counter)  # {'key1': 8, 'key2': {'key3': 2}}
There are two possibilities.
First, you can always fake the nested-keys thing by using a flat Counter with a "key path" made of tuples:
counter = Counter()
for line in fin:
    if a:
        counter.incr((key1,), 1)
    else:
        counter.incr((key2, key3), 2)
But then you'll need to write a str-replacement—or, better, a wrapper class that implements __str__. And while you're at it, you can easily write an incr wrapper that lets you use exactly the API you wanted:
def incr(self, *args):
    super().incr(args[:-1], args[-1])
Alternatively, you can build your own Counter-like class on top of a nested dict. The code for Counter is written in pure Python, and the source is pretty simple and readable.
From your code, it looks like you don't have any need to access things like counter[key2][key3] anywhere, which means the first option is probably going to be simpler and more appropriate.
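To make the first option concrete, here is a minimal sketch built on collections.Counter with tuple keys. The free functions incr and to_nested are my own illustration, not an existing API, and to_nested assumes no key path is a prefix of another one.

```python
from collections import Counter


def incr(counts, *args):
    # All arguments but the last form the key path; the last is the increment.
    *path, amount = args
    counts[tuple(path)] += amount


def to_nested(counts):
    # Expand the flat tuple keys into a nested dict, e.g. for printing.
    nested = {}
    for path, total in counts.items():
        level = nested
        for key in path[:-1]:
            level = level.setdefault(key, {})
        level[path[-1]] = total
    return nested


counts = Counter()
incr(counts, 'key1', 1)
incr(counts, 'key2', 'key3', 2)
incr(counts, 'key1', 7)

nested = to_nested(counts)
print(nested)  # {'key1': 8, 'key2': {'key3': 2}}
```

The counting itself stays flat and simple; the nested view is only produced when you need to display it.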
A Counter is meant to hold numeric counts as its values, so you will not be able to represent a nested dictionary with a Counter.
Here is one way to do this with a normal dictionary (counter = {}). First, to increment the value for a single key:
counter[key1] = counter.setdefault(key1, 0) + 1
Or for an arbitrary list of keys to create the nested structure:
tmp = counter
for key in key_list[:-1]:
    tmp = tmp.setdefault(key, {})
tmp[key_list[-1]] = tmp.setdefault(key_list[-1], 0) + 1
I would probably turn this into the following function:
def incr(counter, val, *keys):
    tmp = counter
    for key in keys[:-1]:
        tmp = tmp.setdefault(key, {})
    tmp[keys[-1]] = tmp.setdefault(keys[-1], 0) + val
Example:
>>> counter = {}
>>> incr(counter, 1, 'a')
>>> counter
{'a': 1}
>>> incr(counter, 2, 'a')
>>> counter
{'a': 3}
>>> incr(counter, 2, 'b', 'c', 'd')
>>> counter
{'a': 3, 'b': {'c': {'d': 2}}}
>>> incr(counter, 3, 'b', 'c', 'd')
>>> counter
{'a': 3, 'b': {'c': {'d': 5}}}
