Python: dict comprehension and eval function variable scope - python-3.x

Code 1: for loop
def foo():
    one = '1'
    two = '2'
    three = '3'
    d = {}
    for name in ('one', 'two', 'three'):
        d[name] = eval(name)
    print(d)
foo()
output:
{'one': '1', 'two': '2', 'three': '3'}
Code 2: dict comprehension
def foo():
    one = '1'
    two = '2'
    three = '3'
    print({name: eval(name) for name in ('one', 'two', 'three')})
foo()
output:
NameError: name 'one' is not defined
Code 3: add global keyword
def foo():
    global one, two, three  # why?
    one = '1'
    two = '2'
    three = '3'
    print({name: eval(name) for name in ('one', 'two', 'three')})
foo()
output:
{'one': '1', 'two': '2', 'three': '3'}
Dict comprehensions and generator expressions create their own local scope. Going by how closures work, I would expect the comprehension to see the variables of the enclosing function, so why can't Code 2 access the variables one, two and three of the outer function foo? And why does Code 3 succeed in building the dictionary once those variables are declared global?
So is it because the eval function and the dict comprehension have different scopes?
I hope someone can help me, I will be grateful!

To understand what's happening, try this:
def foo():
    global one
    one = '1'
    two = '2'
    print({'locals, global': (locals(), globals()) for _ in range(1)})
foo()
Output
{'locals, global': ({'_': 0, '.0': <range_iterator object at ...>},
{'__name__': '__main__', '__package__': None, ..., 'one': '1'})}
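The builtin eval has the signature eval(expression[, globals[, locals]]). When globals and locals are omitted, the expression is evaluated with the globals and locals of the scope where eval is called.
As you can see in the previous output, locals() inside the comprehension is not the local symbol table of the function, because list/dict comprehensions have their own scope (see https://bugs.python.org/msg348274 for instance). The names one, two and three are therefore found neither in the comprehension's locals nor in the globals, unless they are declared global as in Code 3. (This describes Python versions before 3.12; since 3.12 comprehensions are inlined and locals() inside them also includes the enclosing function's variables.)
To get the output you expected, you just have to pass the local symbol table of the function to eval.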
def bar():
    one = '1'
    two = '2'
    three = '3'
    func_locals = locals()  # bind the locals() here
    print({name: eval(name, globals(), func_locals) for name in ('one', 'two', 'three')})
bar()
Output
{'one': '1', 'two': '2', 'three': '3'}
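A possible variation on the approach above, as a minimal sketch: once the function's namespace has been captured with locals(), the names can also be looked up directly in that dictionary, which avoids eval altogether. The function name collect and the variables via_eval and via_lookup are illustrative and not part of the original answer.

```python
def collect():
    one = '1'
    two = '2'
    three = '3'
    # Capture the function's namespace before entering the comprehension.
    func_locals = locals()
    names = ('one', 'two', 'three')
    # Same technique as the answer: hand the captured namespace to eval.
    via_eval = {name: eval(name, globals(), func_locals) for name in names}
    # Alternative without eval: a plain dictionary lookup.
    via_lookup = {name: func_locals[name] for name in names}
    return via_eval, via_lookup


via_eval, via_lookup = collect()
print(via_eval)
print(via_eval == via_lookup)
```

Both dictionaries should be identical; the lookup version is usually preferable when the names are known to be simple variable names rather than arbitrary expressions.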

Related

How to check that all variables inside a python function are local?

Given a Python function definition, is there some tool that can check that all variables used inside the function are local (either passed in as a parameter or declared within the function)?
To get all the local variables you can use dir():
e.g.
def func(a, b):
    name = 'John'
    city = 'New York'
    age = 33
    # print all the local variables inside this function
    print(dir())
func(2, 3)
#output
['a', 'age', 'b', 'city', 'name']
You can also use locals():
def func(a, b):
    name = 'John'
    city = 'New York'
    age = 33
    # print all the local variables inside this function
    print(locals())
func(2, 3)
#output
{'a': 2, 'b': 3, 'name': 'John', 'city': 'New York', 'age': 33}
Indentation also helps here: the body of a function is indented, so names assigned inside that indented body are local to the function, while names assigned at the top level (with no indentation) are global. Note that loops and if blocks are indented too, but they do not create a new scope.
a = 10
def func(a, b):
    name = 'John'
    city = 'New York'
    age = 33
    # print all the local variables inside this function
    print(dir())
func(2, 3)
#output
['a', 'age', 'b', 'city', 'name']
In the code above, the module-level a = 10 is global, while the parameter a in the function's parentheses is local and hides the global a inside func.
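The answers above list a function's local names, but the question asks for a way to check that a function uses nothing else. One possible approach, sketched here under assumptions, is the standard library's inspect.getclosurevars, which reports the global, nonlocal and unresolved names a function refers to. The helper non_local_names and the sample functions are hypothetical. Because the report is built from the names recorded in the compiled code, it can over-report on some Python versions (for example attribute names), so treat it as a hint rather than a strict check.

```python
import inspect

threshold = 10  # module-level (global) name


def uses_global(x):
    # Refers to the global name "threshold".
    return x > threshold


def only_locals(x, y):
    # Uses nothing but its parameters and a local variable.
    total = x + y
    return total


def non_local_names(func):
    """Return the names func refers to that are not local to it."""
    cv = inspect.getclosurevars(func)
    # globals: found in the module namespace; nonlocals: from enclosing
    # functions; unbound: names that could not be resolved at inspection time.
    return set(cv.globals) | set(cv.nonlocals) | set(cv.unbound)


print(non_local_names(uses_global))
print(non_local_names(only_locals))
```

An empty set suggests the function depends only on its parameters, its own local variables and builtins.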

Create one nested object with two objects from dictionary

I'm not sure if the title of my question is the right description to the issue I'm facing.
I'm reading the following table of data from a spreadsheet and passing it as a dataframe:
Name  Description  Value
foo   foobar       5
baz   foobaz       4
bar   foofoo       8
I need to transform this table of data to json following a specific schema.
I'm trying to get the following output:
{'global': {'Name': 'bar', 'Description': 'foofoo', 'spec': {'Value': '8'}}}
So far I'm able to get the global and spec objects but I'm not sure how I should combine them to get the expected output above.
I wrote this:
for index, row in df.iterrows():
    if row['Description'] == 'foofoo':
        # "global" is a reserved keyword in Python, so the variable is named global_
        global_ = row.to_dict()
        spec = row.to_dict()
        del global_['Value']
        del spec['Name']
        del spec['Description']
        print("global:", global_)
        print("spec:", spec)
with the following output:
global: {'Name': 'bar', 'Description': 'foofoo'}
spec: {'Value': '8'}
How can I combine these two objects to get to the desired output?
This should give you that output:
# "global" is a reserved keyword in Python, so the dict is named global_ here
global_['spec'] = spec
combined = {'global': global_}
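As a self-contained illustration of the nesting step, here is a minimal sketch that uses a plain dict in place of the dataframe row, so it runs without pandas. The names row, spec_keys and global_ are assumptions chosen for the example, and the json.dumps call is only there because the question mentions JSON as the target format.

```python
import json

# Stand-in for row.to_dict() from the question.
row = {'Name': 'bar', 'Description': 'foofoo', 'Value': '8'}

# Keys that should move into the nested "spec" object.
spec_keys = ('Value',)

# Everything except the spec keys stays at the top level of "global".
global_ = {k: v for k, v in row.items() if k not in spec_keys}
# The spec keys are nested under "spec".
global_['spec'] = {k: row[k] for k in spec_keys}

combined = {'global': global_}
print(combined)
print(json.dumps(combined))
```

Building the two parts with comprehensions avoids copying the row twice and deleting keys from each copy.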
Try this and see if it runs faster: the slow speed might be due to iterrows. I suggest exporting the dataframe to a dictionary first and iterating over that instead.
   Name  Description  Value
0  foo   foobar       5
1  baz   foobaz       4
2  bar   foofoo       8
# Export the dataframe to a dictionary, using the 'index' option
M = df.to_dict('index')
r = {}
q = []
# iterate through the dictionary items (key, value pairs)
for i, j in M.items():
    # assign the row dict to the key 'global'
    r['global'] = j
    # popitem() works similarly to pop on a list:
    # it takes out the last item
    # and removes it from the parent dictionary;
    # this nests the spec key inside the global key
    r['global']['spec'] = dict([j.popitem()])
    # this ensures the dictionaries already present are not overwritten;
    # you could use copy.copy or copy.deepcopy to ensure the same state
    q.append(dict(r))
{'global': {'Name': 'foo', 'Description': 'foobar', 'spec': {'Value': 5}}}
{'global': {'Name': 'baz', 'Description': 'foobaz', 'spec': {'Value': 4}}}
{'global': {'Name': 'bar', 'Description': 'foofoo', 'spec': {'Value': 8}}}
See the documentation for dict.popitem().

Python multiprocessing: set() in dict() doesn't work

When there is a set() in dict(), the .add() function doesn't work.
import multiprocessing
manager = multiprocessing.Manager()
shared_dict = manager.dict()
def worker1(d, key):
    if key not in d:
        d[key] = {'0': set(), '1': set()}
def worker2(d, key):
    if key not in d:
        d[key] = {'0': set(), '1': set()}
    d[key]['0'].add(1)
    d[key]['1'].add(2)
process1 = multiprocessing.Process(
    target=worker1, args=[shared_dict, 'a'])
process2 = multiprocessing.Process(
    target=worker2, args=[shared_dict, 'b'])
process1.start()
process2.start()
process1.join()
process2.join()
print(shared_dict)
I expected the following output:
{'a': {'1': set([]), '0': set([])}, 'b': {'1': set([2]), '0': set([1])}}
instead of:
{'a': {'1': set([]), '0': set([])}, 'b': {'1': set([]), '0': set([])}}
You can read about your problem in the Python documentation, which says:
If standard (non-proxy) list or dict objects are contained in a
referent, modifications to those mutable values will not be propagated
through the manager because the proxy has no way of knowing when the
values contained within are modified.
So, under "normal" circumstances if you create a new reference to an object and modify it, the modification is applied to the object no matter which reference you use to modify it:
a = set([1])
b = a
b.add(2)
print(a, b) # {1, 2} {1, 2}
In the manager, however, the modifications are not propagated to the shared object, for the reason quoted above. Nevertheless, you can create a new reference to the nested object, change the value from there, and then reassign the modified version to the dict.
import multiprocessing
manager = multiprocessing.Manager()
shared_dict = manager.dict()
def worker1(d, key):
    d.setdefault(key, {'0': set(), '1': set()})
def worker2(d, key):
    d.setdefault(key, {'0': set(), '1': set()})
    buffer = d[key]  # local copy of the nested dict
    for i, (k, v) in enumerate(buffer.items()):
        buffer[k].add(i)
    d[key] = buffer  # reassign so the manager sees the change
process1 = multiprocessing.Process(
    target=worker1, args=[shared_dict, 'a'])
process2 = multiprocessing.Process(
    target=worker2, args=[shared_dict, 'b'])
process1.start()
process2.start()
process1.join()
process2.join()
By the way, dict.setdefault is used here to replace the if statements from the question.

What's the one liner to split a string to dictionary with default value in python3?

I have an input string input_str = 'a=1;b=2;c' and I want to split it into a dictionary like {'a': '1', 'b': '2', 'c': '.'}
input_str = 'a=1;b=2;c'
default = '.'
output = dict(s.split('=') if '=' in s else {s ,default} for s in input_str.split(';'))
print(output)
# Output I get:
{'a': '1', 'b': '2', '.': 'c'}
# Output I want:
{'a': '1', 'b': '2', 'c': '.'}
The following code works, but I was looking for a one-liner with a dict comprehension.
my_result = {}
input_str = 'a=1;b=2;c'
for s in input_str.split(';'):
    if '=' in s:
        key, val = s.split('=')
        my_result[key] = val
    else:
        my_result[s] = '.'
I noticed that the else branch in the one-liner above, {s ,default}, is treated as a set. How can I turn it into a key/value pair for the dictionary?
As you noted, {s, default} defines a set, and the order of sets is undefined.
All you need to do to remedy this is to use a list instead.
dict(s.split('=', 1) if '=' in s else [s, default] for s in input_str.split(';'))
Note that the first split() call now has a second argument of 1, which means each item is split at most once, no matter how many '=' characters it contains. Without it, parsing an input such as a=bad=value;b=2 would raise a ValueError, because dict() needs exactly two elements per item.
Also note that this is unlikely to be very useful in real life unless you have very restricted requirements. What happens if you want to include a value that contains a ';' character?
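Since the question asked specifically for a dict comprehension, here is one possible sketch using str.partition, which always returns a three-item tuple (text before the separator, the separator, text after it). When '=' is absent, the separator slot is an empty string, which can be used to pick the default. This is an alternative to the answer above, not a replacement for it.

```python
input_str = 'a=1;b=2;c'
default = '.'

output = {
    # An empty separator means the item had no '=', so use the default.
    key: (value if sep else default)
    for key, sep, value in (s.partition('=') for s in input_str.split(';'))
}
print(output)  # {'a': '1', 'b': '2', 'c': '.'}
```

Like split('=', 1), partition only splits at the first '=', so a value such as bad=value is kept intact.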

How do I create a default dictionary of dictionaries

I am trying to write some code that involves creating a default dictionary of dictionaries. However, I have no idea how to initialise/create such a thing. My current attempt looks something like this:
from collections import defaultdict
inner_dict = {}
dict_of_dicts = defaultdict(inner_dict(int))
The purpose of this default dict of dictionaries is as follows: for each pair of words that I read from a file (e.g. [['M UH M', 'm oo m']]), each space-delimited segment of the first word becomes a key in the outer dictionary, and for each space-delimited segment of the second word I count how often that segment occurs alongside it.
For example, the input
[['M UH M', 'm oo m']]
should produce
defaultdict(<class 'dict'>, {'M': {'m': 2}, 'UH': {'oo': 1}})
Having just run this, it doesn't seem to raise any errors; however, I was wondering whether something like this actually produces a default dictionary of dictionaries.
Apologies if this is a duplicate, however previous answers to these questions have been confusing and in a different context.
To initialise a defaultdict that creates dictionaries as its default value you would use:
d = defaultdict(dict)
For this particular problem, a collections.Counter would be more suitable:
>>> from collections import defaultdict, Counter
>>> d = defaultdict(Counter)
>>> for a, b in zip(*[x.split() for x in ['M UH M', 'm oo m']]):
...     d[a][b] += 1
...
>>> print(d)
defaultdict(<class 'collections.Counter'>, {'M': Counter({'m': 2}), 'UH': Counter({'oo': 1})})
Edit
You expressed interest in a comment about the equivalent without a Counter. Here is the equivalent using a plain dict
>>> from collections import defaultdict
>>> d = defaultdict(dict)
>>> for a, b in zip(*[x.split() for x in ['M UH M', 'm oo m']]):
...     d[a][b] = d[a].get(b, 0) + 1
...
>>> print(d)
defaultdict(<class 'dict'>, {'M': {'m': 2}, 'UH': {'oo': 1}})
You could also use a normal dictionary and its setdefault method.
my_dict.setdefault(key, default) will look up my_dict[key] and ...
... if the key already exists, return its current value without modifying it, or ...
... assign the default value (my_dict[key] = default) and then return that.
So whenever you want to get a value from your outer dictionary, you can call my_dict.setdefault(key, {}) instead of the normal my_dict[key]. This retrieves the real value assigned to the key if it's present, or otherwise a new empty dictionary as the default value, which is automatically stored in your outer dictionary as well.
Example:
outer_dict = {"M": {"m": 2}}
inner_dict = outer_dict.setdefault("UH", {})
# outer_dict = {"M": {"m": 2}, "UH": {}}
# inner_dict = {}
inner_dict["oo"] = 1
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1}}
# inner_dict = {"oo": 1}
inner_dict = outer_dict.setdefault("UH", {})
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1}}
# inner_dict = {"oo": 1}
inner_dict["xy"] = 3
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1, "xy": 3}}
# inner_dict = {"oo": 1, "xy": 3}
This way you always get a valid inner_dict, either an empty default one or the one that's already present for the given key. As dictionaries are mutable data types, modifying the returned inner_dict will also modify the dictionary inside outer_dict.
The other answers propose alternative solutions or show that you can make a default dictionary of dictionaries using d = defaultdict(dict),
but the question asked how to make a default dictionary of default dictionaries. My naive first attempt was this:
from collections import defaultdict
my_dict = defaultdict(defaultdict(list))
However, this throws an error: TypeError: first argument must be callable or None
So my second attempt, which works, is to make a callable using the lambda keyword to create an anonymous function:
from collections import defaultdict
my_dict = defaultdict(lambda: defaultdict(list))
which is more concise than the alternative using a regular function:
from collections import defaultdict
def default_dict_maker():
    return defaultdict(list)
my_dict = defaultdict(default_dict_maker)
You can check that it works by assigning:
>>> my_dict[2][3] = 5
>>> my_dict[2][3]
5
or by reading a value that has not been set:
>>> my_dict[0][0]
[]
>>> my_dict[5]
defaultdict(<class 'list'>, {})
tl;dr
This is your one-line answer: my_dict = defaultdict(lambda: defaultdict(list))
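Applied to the counting problem from the question, the same lambda pattern could look like the sketch below, with int as the inner default so that missing counts start at 0. The variable names pairs, counts and plain are illustrative, and the input is the single example pair given in the question.

```python
from collections import defaultdict

pairs = [['M UH M', 'm oo m']]

# Outer keys create an inner defaultdict on first access;
# inner keys start at 0 on first access.
counts = defaultdict(lambda: defaultdict(int))

for first, second in pairs:
    # Pair up the space-delimited segments of both words.
    for outer_key, inner_key in zip(first.split(), second.split()):
        counts[outer_key][inner_key] += 1

# Convert to plain dicts for a compact printout.
plain = {key: dict(inner) for key, inner in counts.items()}
print(plain)  # {'M': {'m': 2}, 'UH': {'oo': 1}}
```

One thing to keep in mind with this design: merely reading counts[key] for an unknown key inserts it, so use "key in counts" when you only want to test for membership.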

Resources