Python: iterating over a dictionary empties it - python-3.x

I have some code I'm analyzing, and I've found that iterating over a dictionary empties it. I fixed the problem by making a deepcopy of the dictionary and iterating over the copy in the code that displays the values; later code then iterates over the original dictionary to assign values to a 2D array. Why does iterating over the original dictionary to display it empty it, so that the dictionary is unusable afterwards? Any replies welcome.
import copy
# This line fixed the problem
trans = copy.deepcopy(transitions)
print("\nTransistions = ")
# Original line was:
# for state, next_states in transitions.items():
# Which empties the dictionary, so not usable after that
for state, next_states in trans.items():
    for i in next_states:
        print("\nstate = ", state, " next_state = ", i)
# Later code which with original for loop showed empty dictionary
for state, next_states in transitions.items():
    for next_state in next_states:
        print("\n one_step trans state = ", state, " next_state = ", next_state)
        one_step[state,next_state] += 1
A print of the dictionary:
Transistions =
{0: <map object at 0x0000000003391550>, 1: <map object at 0x00000000033911D0>, 2: <map object at 0x0000000003391400>, 3: <map object at 0x00000000033915F8>, 4: <map object at 0x0000000003391320>}
Type:
Transistions =
<class 'dict'>
Edit: Here's the code that uses map. Any suggestions on how to edit it to create the dictionary without using map?
numbers = dict((state_set, n) for n, state_set in enumerate(sets))
transitions = {}
for state_set, next_sets in state_set_transitions.items():
    dstate = numbers[state_set]
    transitions[dstate] = map(numbers.get, next_sets)

Iterating over a dict doesn't empty it. Iterating over a map iterator exhausts it: in Python 3, map returns a one-shot iterator, so the values stored in your dict yield nothing on a second pass.
Wherever you generated your transitions dict, you should have used a list comprehension instead of map, so that the values are lists instead of iterators:
[whatever for x in thing]
instead of
map(lambda x: whatever, thing)
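As a rough illustration of the difference, the sketch below builds the same dictionary twice, once with map and once with a list comprehension. The contents of `numbers` and `state_set_transitions` are made-up stand-ins for the asker's data, and the names `lazy`, `eager`, `first_pass`, and `second_pass` are hypothetical.

```python
# Hypothetical stand-ins for the asker's `numbers` and `state_set_transitions`
numbers = {"a": 0, "b": 1, "c": 2}
state_set_transitions = {"a": ["b", "c"], "b": ["c"], "c": []}

lazy = {}   # values are one-shot map iterators
eager = {}  # values are lists, which can be iterated repeatedly
for state_set, next_sets in state_set_transitions.items():
    dstate = numbers[state_set]
    lazy[dstate] = map(numbers.get, next_sets)
    eager[dstate] = [numbers[s] for s in next_sets]

# The first pass consumes every map iterator; the second pass finds them empty
first_pass = {state: list(nxt) for state, nxt in lazy.items()}
second_pass = {state: list(nxt) for state, nxt in lazy.items()}

# Lists give the same result on every pass
eager_first = {state: list(nxt) for state, nxt in eager.items()}
eager_second = {state: list(nxt) for state, nxt in eager.items()}

print(first_pass)    # {0: [1, 2], 1: [2], 2: []}
print(second_pass)   # {0: [], 1: [], 2: []}
print(eager_first == eager_second)  # True
```

Note that the dictionary itself keeps all of its keys in both cases; only the map objects stored as values are used up.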

Related

Is there a clean way to access the local iteration scope of a generator expression?

I am receiving a generator expression as an input argument to my function, and it looks like this:
(<some calculation> for item1 in list1) # single iterator case
(<some calculation> for item1 in list1 for item2 in list2) # multiple iterator case
The function is doing some internal stuff but in the end needs to return a dictionary that looks like this:
{item1: <calculated value>, ..} # single case
{(item1, item2): <calculated value>, ..} # multiple case
This works but I'd prefer to avoid messing around with what is obviously generator expression internals. Is there a clean way of making this work? I can imagine this could break in different versions of Python.
def to_dict(gen_expr):
    d = {}
    for x in gen_expr:
        local_vars = gen_expr.gi_frame.f_locals
        key = tuple(v for k, v in local_vars.items()
                    if k != ".0")  # ".0" points to the iterator object
        if len(key) == 1:
            key = key[0]
        d[key] = x
    return d
EDIT:
This works in the IPython console but running the same piece of code in the Anaconda prompt crashes on f_locals, which then suddenly also contains all objects from the generator code block. Because these objects are not (necessarily) hashable, it raises a TypeError on d[key] = x. So my fears about (ab)using generator internals were quickly confirmed.
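One possible way to avoid frame inspection altogether, assuming the caller can be changed to yield (key, value) pairs instead of bare values, is sketched below. This is not from the original question; the sample lists and calculations are hypothetical.

```python
def to_dict(pairs):
    """Build a dict from an iterable of (key, value) pairs."""
    return dict(pairs)

# Hypothetical sample inputs
list1 = [1, 2, 3]
list2 = ["a", "b"]

# Single iterator case: the generator yields (item1, value)
single = to_dict((item1, item1 ** 2) for item1 in list1)

# Multiple iterator case: the key is itself a tuple
multiple = to_dict(((item1, item2), item2 * item1)
                   for item1 in list1 for item2 in list2)

print(single)            # {1: 1, 2: 4, 3: 9}
print(multiple[(2, "b")])  # bb
```

The trade-off is that the key has to be spelled out at the call site, but nothing depends on interpreter internals such as gi_frame.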

Calculating the occurrence of unknown values in a list

I very often face the following problem:
I have a list with unknown elements in it (but each element is of the same type, e.g. str) and I want to count the occurrences of each element. Sometimes I also want to do something with the counts, so I usually store them in a dictionary.
My problem is that I cannot "auto-initialize" a dictionary entry with += 1, so I first have to check whether the given element is already in the dictionary.
My usual go-to solution:
dct = {}
for i in iterable:
    if i in dct:
        dct[i] += 1
    else:
        dct[i] = 1
Is there a simpler solution to this problem?
Yes! A defaultdict.
from collections import defaultdict
dct = defaultdict(int)
for i in iterable:
    dct[i] += 1
Docs: https://docs.python.org/3.3/library/collections.html#collections.defaultdict
You can auto-initialise with other types too:
d = defaultdict(str)
d[i] += 'hello'
If you're just counting things, you could use a Counter instead:
from collections import Counter
c = Counter(iterable) # c is a subclass of dict
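A small side-by-side sketch of the two approaches, using a made-up list `words` as the input:

```python
from collections import Counter, defaultdict

words = ["a", "b", "a", "c", "b", "a"]  # hypothetical sample data

# defaultdict(int) supplies 0 for a missing key, so += 1 works right away
dct = defaultdict(int)
for w in words:
    dct[w] += 1

# Counter does the counting in one call
c = Counter(words)

print(dict(dct))          # {'a': 3, 'b': 2, 'c': 1}
print(c.most_common(1))   # [('a', 3)]
```

Counter also offers helpers such as most_common, which may save further code if the counts are processed afterwards.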

Pyspark Runtime Error Dictionary Changed size during iteration [duplicate]

I have an object like this:
{hello: 'world', "foo.0.bar": v1, "foo.0.name": v2, "foo.1.bar": v3}
It should be expanded to:
{ hello: 'world', foo: [{'bar': v1, 'name': v2}, {bar: v3}]}
I wrote the code below, which splits each key by '.', removes the old key, and adds the new keys if the key contains '.', but it fails with RuntimeError: dictionary changed size during iteration
def expand(obj):
    for k in obj.keys():
        expandField(obj, k, v)
def expandField(obj, f, v):
    parts = f.split('.')
    if(len(parts) == 1):
        return
    del obj[f]
    for i in xrange(0, len(parts) - 1):
        f = parts[i]
        currobj = obj.get(f)
        if (currobj == None):
            nextf = parts[i + 1]
            currobj = obj[f] = re.match(r'\d+', nextf) and [] or {}
        obj = currobj
    obj[len(parts) - 1] = v
for k, v in obj.iteritems():
RuntimeError: dictionary changed size during iteration
Like the message says: you changed the number of entries in obj inside expandField() while expand() was still in the middle of looping over those entries.
You might try instead creating a new dictionary of the form you wish, or somehow recording the changes you want to make, and then making them AFTER the loop is done.
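A minimal sketch of the "build a new dictionary" approach follows. It is not the asker's code: the helper `_listify` is a hypothetical name, and the sketch assumes that no key is both a leaf and a prefix of another key (for example "foo" together with "foo.0") and that any dict whose keys are all digit strings is meant to become a list.

```python
def expand(flat):
    """Return a new nested structure; `flat` itself is never modified."""
    nested = {}
    for key, value in flat.items():
        parts = key.split(".")
        node = nested
        for part in parts[:-1]:
            # Create intermediate dicts on demand
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return _listify(nested)


def _listify(node):
    """Recursively turn dicts whose keys are all digit strings into lists."""
    if not isinstance(node, dict):
        return node
    converted = {k: _listify(v) for k, v in node.items()}
    if converted and all(k.isdigit() for k in converted):
        # Order list items by their numeric index
        return [converted[k] for k in sorted(converted, key=int)]
    return converted


flat = {"hello": "world", "foo.0.bar": "v1",
        "foo.0.name": "v2", "foo.1.bar": "v3"}
expanded = expand(flat)
print(expanded)
# {'hello': 'world', 'foo': [{'bar': 'v1', 'name': 'v2'}, {'bar': 'v3'}]}
```

Because the loop only reads from `flat` and writes to `nested`, the dictionary being iterated never changes size.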
You might want to copy your keys into a list and iterate over that list instead of the dict itself, e.g.:
def expand(obj):
    keys = list(obj.keys())  # freeze keys iterator into a list
    for k in keys:
        # look the value up here; `v` was never defined in the original expand()
        expandField(obj, k, obj[k])
I'll let you check whether the resulting behavior matches what you expect.
Edited as per comments, thank you!
I had a similar issue when I wanted to change a dictionary's structure (removing/adding dicts within other dicts).
In my situation I created a deepcopy of the dict. With a deepcopy of my dict, I was able to iterate through it and remove keys as needed. From the Python documentation for deepcopy:
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
Hope this helps!
For those experiencing
RuntimeError: dictionary changed size during iteration
also make sure you're not iterating through a defaultdict when trying to access a non-existent key! I caught myself doing that inside the for loop, which caused the defaultdict to create a default value for this key, causing the aforementioned error.
The solution is to convert your defaultdict to dict before looping through it, i.e.
d = defaultdict(int)
d_new = dict(d)
or make sure you're not adding/removing any keys while iterating through it.
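The defaultdict pitfall described above can be reproduced with a short sketch. The keys and the "_missing" suffix are made up for illustration.

```python
from collections import defaultdict

d = defaultdict(int, {"a": 1, "b": 2})

error_seen = False
try:
    for key in d:
        # A plain lookup of a missing key inserts a default entry,
        # which changes the dict's size mid-iteration
        d[key + "_missing"]
except RuntimeError:
    error_seen = True

# Converting to a plain dict removes the auto-insert behaviour;
# .get() then reads missing keys without adding them
safe = dict(defaultdict(int, {"a": 1, "b": 2}))
lookups = [safe.get(key + "_missing", 0) for key in safe]

print(error_seen)  # True
print(lookups)     # [0, 0]
print(len(safe))   # 2
```

Using .get() directly on the defaultdict would also avoid the insertion, since .get() never calls the default factory.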
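Rewriting this part
def expand(obj):
    for k in obj.keys():
        expandField(obj, k, v)
to the following
def expand(obj):
    # In Python 2, keys() already returns a list; in Python 3 it is a live view,
    # so wrap it in list() to take a snapshot before modifying obj
    keys = list(obj.keys())
    for k in keys:
        if k in obj:
            expandField(obj, k, obj[k])
should make it work.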

Nesting dictionaries with a for loop

I am trying to add a dictionary within a dictionary, as in the code below:
i = 0
A = {}
x = [...]
for i in x:
    (a, b) = func(x)  # this returns two different dictionaries as a and b
    for key in a.keys():
        A[key] = {}
        A[key][i] = a[key]
    print('A:', A)
When I execute it, the 'A' dictionary is printed on every pass through the loop. But I need the results in one single dictionary, say "C".
How do I do that?
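One possible fix, sketched under assumptions: `func` below is a hypothetical stand-in that takes a single item and returns two dictionaries, and the inner dictionary is created with setdefault so that it is not reset to {} on every pass. The print is moved after the loop so the combined dictionary is shown once.

```python
def func(item):
    """Hypothetical stand-in for the asker's func; returns two dicts."""
    return {"p": item * 10, "q": item * 100}, {}

x = [1, 2]  # hypothetical sample input
A = {}
for i in x:
    a, b = func(i)
    for key in a:
        # setdefault creates the inner dict only the first time `key` is seen
        A.setdefault(key, {})[i] = a[key]

print('A:', A)  # A: {'p': {1: 10, 2: 20}, 'q': {1: 100, 2: 200}}
```

In the original loop, `A[key] = {}` replaces the inner dictionary each time, so only the entry for the last item survives.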

How to get keys from nested dictionary of arbitrary length in Python

I have a dictionary object in Python. Let's call it dict. This object could contain another dictionary, which may in turn contain another dictionary, and so on.
dict = { 'k': v, 'k1': v1, 'dict2':{'k3': v3, 'k4':v4} , 'dict3':{'k5':v5, dict4:{'k6':v6}}}
This is just an example; the outermost dictionary can be of any length. I want to extract the keys from such a dictionary in the following two ways:
Get a list of only the keys:
[k,k1,k2,k3,k4,k5,k6]
Get a list of the keys of each dictionary, associated with its parent dictionary, something like this:
outer_dict_keys = [k ,dict2, dict3]
dict2_keys = [k3,k4]
dict3_keys = [k5, dict4]
dict4_keys = [k6]
The length of the outermost dictionary is always changing, so I cannot hard-code anything.
What is the best way to achieve the above result?
Use a mix of iteration and recursion. After quoting undefined names, making spacing uniform, and removing 'k2' from the first result, I came up with the code below. (Written and tested for 3.4, it should run on any 3.x and might on 2.7.) A key thing to remember is that, before Python 3.7, the iteration order of dicts is arbitrary and can vary with each run, which is why the code sorts the keys; from 3.7 on, dicts keep insertion order. Recursion as done here visits sub-dicts in depth-first rather than breadth-first order. For dict0, both orders are the same, but if dict4 were nested in dict2 rather than dict3, they would not be.
dict0 = {'k0': 0, 'k1': 1, 'dict2':{'k3': 3, 'k4': 4},
         'dict3':{'k5': 5, 'dict4':{'k6': 6}}}
def keys(dic, klist=None):
    # Use None rather than a mutable default so repeated calls start fresh
    if klist is None:
        klist = []
    subdics = []
    for key in sorted(dic):
        val = dic[key]
        if isinstance(val, dict):
            subdics.append(val)
        else:
            klist.append(key)
    for subdict in subdics:
        keys(subdict, klist)
    return klist
result = keys(dict0)
print(result, '\n', result == ['k0','k1','k3','k4','k5','k6'])
def keylines(dic, name='outer_dict', lines=None):
    # Same reasoning as above: avoid a shared mutable default list
    if lines is None:
        lines = []
    vals = []
    subdics = []
    for key in sorted(dic):
        val = dic[key]
        if isinstance(val, dict):
            subdics.append((key,val))
        else:
            vals.append(key)
    vals.extend(pair[0] for pair in subdics)
    lines.append('{}_keys = {}'.format(name, vals))
    for subdict in subdics:
        keylines(subdict[1], subdict[0], lines)
    return lines
result = keylines(dict0)
for line in result:
    print(line)
print()
expect = [
    "outer_dict_keys = ['k0', 'k1', 'dict2', 'dict3']",
    "dict2_keys = ['k3', 'k4']",
    "dict3_keys = ['k5', 'dict4']",
    "dict4_keys = ['k6']"]
# Report any line that differs from the expected output, character by character
for actual, want in zip(result, expect):
    if actual != want:
        print(want)
        for i, (c1, c2) in enumerate(zip(actual, want)):
            if c1 != c2:
                print(i, c1, c2)
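For contrast with the depth-first recursion above, here is a hedged sketch of a breadth-first variant using a queue. The function name `leaf_keys_breadth_first` and the dictionary `moved` (dict0 with dict4 relocated under dict2) are hypothetical, chosen to show the case where the two orders differ.

```python
from collections import deque

def leaf_keys_breadth_first(dic):
    """Collect non-dict keys level by level (hypothetical helper)."""
    found = []
    queue = deque([dic])
    while queue:
        current = queue.popleft()
        for key in sorted(current):
            val = current[key]
            if isinstance(val, dict):
                queue.append(val)  # visit sub-dicts after the current level
            else:
                found.append(key)
    return found

# dict4 moved under dict2, the case where the two orders differ
moved = {'k0': 0, 'k1': 1,
         'dict2': {'k3': 3, 'k4': 4, 'dict4': {'k6': 6}},
         'dict3': {'k5': 5}}
bfs_result = leaf_keys_breadth_first(moved)
print(bfs_result)  # ['k0', 'k1', 'k3', 'k4', 'k5', 'k6']
```

A depth-first traversal of the same data would reach 'k6' before 'k5', because it descends into dict4 before moving on to dict3.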
