populating a dictionary of lists behavior [duplicate]


My attempt to programmatically create a dictionary of lists is failing to allow me to individually address dictionary keys. Whenever I create the dictionary of lists and try to append to one key, all of them are updated. Here's a very simple test case:
data = {}
data = data.fromkeys(range(2),[])
data[1].append('hello')
print data
Actual result: {0: ['hello'], 1: ['hello']}
Expected result: {0: [], 1: ['hello']}
Here's what works
data = {0:[],1:[]}
data[1].append('hello')
print data
Actual and Expected Result: {0: [], 1: ['hello']}
Why is the fromkeys method not working as expected?

When [] is passed as the second argument to dict.fromkeys(), all values in the resulting dict will be the same list object.
In Python 2.7 or above, use a dict comprehension instead:
data = {k: [] for k in range(2)}
In earlier versions of Python, there is no dict comprehension, but a list comprehension can be passed to the dict constructor instead:
data = dict([(k, []) for k in range(2)])
In 2.4-2.6, it is also possible to pass a generator expression to dict, and the surrounding parentheses can be dropped:
data = dict((k, []) for k in range(2))
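One way to see the difference is to compare object identity with `is`. This is only an illustrative sketch; the names `shared` and `separate` are made up for the example and do not come from the question.

```python
# dict.fromkeys evaluates [] once and stores that one list under every key.
shared = dict.fromkeys(range(2), [])
print(shared[0] is shared[1])      # True: both keys refer to the same list

# A dict comprehension evaluates [] once per key, producing distinct lists.
separate = {k: [] for k in range(2)}
print(separate[0] is separate[1])  # False

separate[1].append('hello')
print(separate)                    # {0: [], 1: ['hello']}
```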

Try using a defaultdict instead:
from collections import defaultdict
data = defaultdict(list)
data[1].append('hello')
This way, the keys don't need to be initialized with empty lists ahead of time. The defaultdict() object instead calls the factory function given to it, every time a key is accessed that doesn't exist yet. So, in this example, attempting to access data[1] triggers data[1] = list() internally, giving that key a new empty list as its value.
The original code with .fromkeys shares one (mutable) list. Similarly,
alist = [1]
data = dict.fromkeys(range(2), alist)
alist.append(2)
print(data)
would output {0: [1, 2], 1: [1, 2]}. This is called out in the dict.fromkeys() documentation:
All of the values refer to just a single instance, so it generally doesn’t make sense for value to be a mutable object such as an empty list.
Another option is to use the dict.setdefault() method, which retrieves the value for a key after first checking it exists and setting a default if it doesn't. .append can then be called on the result:
data = {}
data.setdefault(1, []).append('hello')
Finally, to create a dictionary from a list of known keys and a given "template" list (where each value should start with the same elements, but be a distinct list), use a dictionary comprehension and copy the initial list:
alist = [1]
data = {key: alist[:] for key in range(2)}
Here, alist[:] creates a shallow copy of alist, and this is done separately for each value. See How do I clone a list so that it doesn't change unexpectedly after assignment? for more techniques for copying the list.
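Note that a shallow copy only duplicates the outer list. If the template itself contains mutable items, those inner objects are still shared, and `copy.deepcopy` is one way around that. The sketch below assumes a hypothetical nested `template`; it is not part of the original example.

```python
import copy

template = [[1], [2]]          # hypothetical template containing nested lists

# Shallow copies: each value is a new outer list, but the inner lists are shared.
shallow = {key: template[:] for key in range(2)}
shallow[0][0].append('x')
print(shallow[1][0])           # [1, 'x']  (the inner list is shared)

# Deep copies: nothing is shared between the values or with the template.
template = [[1], [2]]
deep = {key: copy.deepcopy(template) for key in range(2)}
deep[0][0].append('x')
print(deep[1][0])              # [1]
```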

You could use a dict comprehension:
>>> keys = ['a','b','c']
>>> value = [0, 0]
>>> {key: list(value) for key in keys}
{'a': [0, 0], 'b': [0, 0], 'c': [0, 0]}

This answer explains the behavior to anyone flummoxed by the results of instantiating a dict with fromkeys() and a mutable default value.
Consider:
#Python 3.4.3 (default, Nov 17 2016, 01:08:31)
# start by validating that different variables pointing to an
# empty mutable are indeed different references.
>>> l1 = []
>>> l2 = []
>>> id(l1)
140150323815176
>>> id(l2)
140150324024968
So any change to l1 will not affect l2, and vice versa.
This would be true for any mutable object, including a dict.
# create a new dict from an iterable of keys
>>> dict1 = dict.fromkeys(['a', 'b', 'c'], [])
>>> dict1
{'c': [], 'b': [], 'a': []}
This can be a handy function.
Here we are assigning to each key a default value, which happens to be an empty list.
# the dict has its own id.
>>> id(dict1)
140150327601160
# but look at the ids of the values.
>>> id(dict1['a'])
140150323816328
>>> id(dict1['b'])
140150323816328
>>> id(dict1['c'])
140150323816328
Indeed they are all using the same ref!
A change to one is a change to all, since they are in fact the same object!
>>> dict1['a'].append('apples')
>>> dict1
{'c': ['apples'], 'b': ['apples'], 'a': ['apples']}
>>> id(dict1['a'])
140150323816328
>>> id(dict1['b'])
140150323816328
>>> id(dict1['c'])
140150323816328
For many, this was not what was intended!
Now let's try it with an explicit copy of the list being used as the default value.
>>> empty_list = []
>>> id(empty_list)
140150324169864
And now create a dict with a copy of empty_list.
>>> dict2 = dict.fromkeys(['a', 'b', 'c'], empty_list[:])
>>> id(dict2)
140150323831432
>>> id(dict2['a'])
140150327184328
>>> id(dict2['b'])
140150327184328
>>> id(dict2['c'])
140150327184328
>>> dict2['a'].append('apples')
>>> dict2
{'c': ['apples'], 'b': ['apples'], 'a': ['apples']}
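Still no joy! The slice empty_list[:] is evaluated only once, before fromkeys() is called, so every key still receives that same single copy.
I hear someone shout, it's because I used an empty list!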
>>> not_empty_list = [0]
>>> dict3 = dict.fromkeys(['a', 'b', 'c'], not_empty_list[:])
>>> dict3
{'c': [0], 'b': [0], 'a': [0]}
>>> dict3['a'].append('apples')
>>> dict3
{'c': [0, 'apples'], 'b': [0, 'apples'], 'a': [0, 'apples']}
The default behavior of fromkeys() is to assign None to the value.
>>> dict4 = dict.fromkeys(['a', 'b', 'c'])
>>> dict4
{'c': None, 'b': None, 'a': None}
>>> id(dict4['a'])
9901984
>>> id(dict4['b'])
9901984
>>> id(dict4['c'])
9901984
Indeed, all of the values are the same (and the only!) None.
Now, let's iterate through the dict, in one of myriad ways, and change each value.
>>> for k, _ in dict4.items():
...     dict4[k] = []
>>> dict4
{'c': [], 'b': [], 'a': []}
Hmm. Looks the same as before!
>>> id(dict4['a'])
140150318876488
>>> id(dict4['b'])
140150324122824
>>> id(dict4['c'])
140150294277576
>>> dict4['a'].append('apples')
>>> dict4
{'c': [], 'b': [], 'a': ['apples']}
But they are indeed different []s, which was in this case the intended result.
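The root cause in every failing case above is that the value argument is evaluated once, before fromkeys() runs. A rough way to make that visible is to count how often the expression producing the list is executed. The factory `make_list` and the `call_log` list are hypothetical names used only for this sketch.

```python
call_log = []

def make_list():
    """Hypothetical factory that records each call and returns a new list."""
    call_log.append(1)
    return []

keys = ['a', 'b', 'c']

# The argument expression runs once, before fromkeys() is even entered.
d1 = dict.fromkeys(keys, make_list())
calls_fromkeys = len(call_log)                 # 1

call_log.clear()
# The value expression runs once per key inside the comprehension.
d2 = {k: make_list() for k in keys}
calls_comprehension = len(call_log)            # 3

print(calls_fromkeys, calls_comprehension)     # 1 3
print(d1['a'] is d1['b'], d2['a'] is d2['b'])  # True False
```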

You can use this:
l = ['a', 'b', 'c']
d = dict((k, [0, 0]) for k in l)

You are populating your dictionary with references to a single list, so when you update it, the update is reflected across all the references. Try a dictionary comprehension instead. See
Create a dictionary with list comprehension in Python
d = {k: [] for k in keys}  # keys is any iterable of the keys you need

You could use this:
data[:1] = ['hello']

Related

Increment a dictionary value by one in Python while querying the hash table only once, like map.merge in Java

For example, given a list of str, ['a', 'b', 'a', 'a', 'b'], I want to get the count of each distinct string: {'a': 3, 'b': 2}.
The naive method is as follows:
lst = ['a', 'b', 'a', 'a', 'b']
counts = dict()
for w in lst:
    counts[w] = counts.get(w, 0) + 1
However, this needs two hash table queries per element. When the get method is first called, the bucket location is already known, so in principle the bucket value could be modified in place without searching for the bucket location a second time. I know that in Java map.merge provides this optimization: https://stackoverflow.com/a/33711386/10969942
How can I do this in Python?

There is no such method in Python. Whether visible or not, under the covers the table lookup will be done twice. But, as the answer you linked to said about Java, nobody much cares: hash table lookup is typically fast, and since you just looked up a key, all the info needed to look it up again is likely sitting in L1 cache.
Here are two more idiomatic ways of spelling your task. The double lookup isn't directly visible in either, but it still occurs under the covers:
>>> lst = ['a', 'b', 'a', 'a', 'b']
>>> from collections import defaultdict
>>> counts = defaultdict(int) # default value is int(); i.e., 0
>>> for w in lst:
...     counts[w] += 1
>>> counts
defaultdict(<class 'int'>, {'a': 3, 'b': 2})
and
>>> from collections import Counter
>>> Counter(lst)
Counter({'a': 3, 'b': 2})
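To make the double lookup in the question's get-then-store pattern visible, one option is a key type that counts how often it is hashed. This is a rough sketch: `CountingKey` is a hypothetical class written for the illustration, and it only demonstrates the plain dict version, not defaultdict or Counter.

```python
class CountingKey:
    """Hypothetical key type that counts how often it is hashed."""
    hash_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name

lst = [CountingKey(c) for c in ['a', 'b', 'a', 'a', 'b']]

counts = {}
for w in lst:
    counts[w] = counts.get(w, 0) + 1   # one hash for get(), one for the store

print(CountingKey.hash_calls)                  # 10, i.e. 2 per element
print({k.name: v for k, v in counts.items()})  # {'a': 3, 'b': 2}
```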

Python Groupby keys list of dictionaries

I have the following for instance:
x = [{'A':1},{'A':1},{'A':2},{'B':1},{'B':1},{'B':2},{'B':3},{'C':1},{'D':1}]
and I would like to get a dictionary like this:
x = [{'A': [1,2], 'B': [1,2,3], 'C':[1], 'D': [1]}]
Do you have any idea how I could get this please?

You could use a collections.defaultdict of sets to collect unique values, then convert the final result to a dictionary with values as lists using a dict comprehension:
from collections import defaultdict
lst = [{'A':1},{'A':1},{'A':2},{'B':1},{'B':1},{'B':2},{'B':3},{'C':1},{'D':1}]
result = defaultdict(set)
for dic in lst:
    for key, value in dic.items():
        result[key].add(value)
print({key: list(value) for key, value in result.items()})
Output:
{'A': [1, 2], 'B': [1, 2, 3], 'C': [1], 'D': [1]}
That said, it's probably better to add your data directly to the defaultdict to begin with, instead of creating a list of singleton dictionaries (a data structure I don't recommend) and then converting the result.
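As a sketch of that suggestion, assume the data is available as (key, value) pairs rather than single-item dictionaries; the `pairs` list below is a hypothetical stand-in for wherever the data actually comes from. Using sorted() instead of list() also makes the order of each value list deterministic, since set iteration order is not guaranteed in general.

```python
from collections import defaultdict

# Hypothetical input: the same data as (key, value) pairs instead of
# a list of single-item dictionaries.
pairs = [('A', 1), ('A', 1), ('A', 2), ('B', 1), ('B', 1),
         ('B', 2), ('B', 3), ('C', 1), ('D', 1)]

grouped = defaultdict(set)
for key, value in pairs:
    grouped[key].add(value)      # the set silently drops duplicates

# sorted() returns a list with the values in ascending order.
result = {key: sorted(values) for key, values in grouped.items()}
print(result)   # {'A': [1, 2], 'B': [1, 2, 3], 'C': [1], 'D': [1]}
```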

Using dict.setdefault
Ex:
x = [{'A':1},{'A':1},{'A':2},{'B':1},{'B':1},{'B':2},{'B':3},{'C':1},{'D':1}]
res = {}
for i in x:
    for k, v in i.items():
        res.setdefault(k, set()).add(v)
#or res = [{k: list(v) for k, v in res.items()}]
print(res)
Output:
{'A': {1, 2}, 'B': {1, 2, 3}, 'C': {1}, 'D': {1}}

Dict comprehension from dict to inverse dict

I have the following data:
a = {1: {'data': 243}, 2: {'data': 253}, 4: {'data':243}}
And I want to invert it, so that the data values become the keys and the original keys become the values. My first try:
b = dict(map(lambda id: (a[id]['data'], id), a))
But when I do this, the 1 gets overwritten by the 4, so the result is:
{243: 4, 253: 2}
So what I would like to get is a structure like this:
{243: [1, 4], 253: [2]}
How do I do this?

I find the code below a simpler and more readable way of approaching your problem.
from collections import defaultdict
a = {1: {'data': 243}, 2: {'data': 253}, 4: {'data':243}}
result = defaultdict(list)
for k, v in a.items():
    result[v['data']].append(k)
print(result)
Output:
defaultdict(<class 'list'>, {243: [1, 4], 253: [2]})

This can be done with a dict comprehension and itertools.groupby(). Because groupby only groups consecutive items that share a key, the dict items must first be sorted by that same key.
from itertools import groupby
a = {1: {'data': 243}, 2: {'data': 253}, 4: {'data': 243}}
# key extractor function suitable for both sorted() and groupby()
keyfunc = lambda i: i[1]['data']
{g[0]: [i[0] for i in g[1]] for g in groupby(sorted(a.items(), key=keyfunc), key=keyfunc)}
here g is a grouping tuple (key, items), where
g[0] is whatever keyfunc extracts (in this case the 'data' value), and
g[1] is an iterable over dict items, i.e. (key, value) tuples, hence the additional list comprehension to extract the keys only.
result:
{243: [1, 4], 253: [2]}
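The same groupby approach may be easier to follow with tuple unpacking instead of g[0] and g[1]. The following is just a restatement under that assumption; `data_of`, `items` and `inverse` are names chosen for this sketch.

```python
from itertools import groupby

a = {1: {'data': 243}, 2: {'data': 253}, 4: {'data': 243}}

def data_of(item):
    """Return the 'data' value of a (key, value) item of a."""
    return item[1]['data']

# groupby() only merges adjacent items with equal keys, so sort first.
items = sorted(a.items(), key=data_of)

inverse = {
    data: [key for key, _ in group]
    for data, group in groupby(items, key=data_of)
}
print(inverse)   # {243: [1, 4], 253: [2]}
```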

Data appending in dictionary

I have a requirement to make a function which takes n number of arguments and returns the values in a dictionary data structure.
For example:
Input: it will take arguments in a list
list =['a','b','c']
this list can go to n number of values.
Output: Function returns the value as
{'a':[1,2,'x'],
'b':[3,4,'y'],
'c':[5,6,'z']
}
I have used python 3.x for the same and tried below code, which gave an error unhashable type: 'list':
def Myfunc(*args):
    dir={}
    for x in args:
        lst=[1,2,3] # This list has static value here but in actual code,
                    # I am generating some dynamic value. Length of list always 3.
        dir[x]=lst
z=Myfunc(['a','b','c'])

*args is meant to be used to pass a variable number of arguments to the function. So in your case, if you did
z = Myfunc('a', 'b', 'c')
then it would work as you expected.
You're actually passing just one argument, so the for loop runs only once, with x = ['a', 'b', 'c'], and a list is unhashable, so it cannot be used as a dictionary key.
To keep passing a single list, change the declaration to:
def Myfunc(arg):
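Put together, a single-parameter version might look like the sketch below. The name `build_dict` and the fixed [1, 2, 3] placeholder are assumptions for illustration; the real code would generate its own three-item list for each key.

```python
def build_dict(keys):
    """Hypothetical rewrite of the function that takes one iterable of keys."""
    result = {}
    for key in keys:
        # Placeholder for the dynamically generated three-item list;
        # creating it inside the loop gives every key its own list.
        result[key] = [1, 2, 3]
    return result

z = build_dict(['a', 'b', 'c'])
print(z)   # {'a': [1, 2, 3], 'b': [1, 2, 3], 'c': [1, 2, 3]}
```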

Did you mean to pass a list as in the code below?
Also, please note that your function doesn't return anything.
I changed name of the variable from dir to d, because dir is a built-in python function:
https://docs.python.org/3/library/functions.html?highlight=dir#dir
In [31]: def Myfunc(*args):
    ...:     d={}
    ...:     for x in args:
    ...:         lst=[1,2,3] #this list has static value here but in actual code, I am generating some dynamic value. Length of list always 3.
    ...:         d[x]=lst
    ...:     return d
    ...:
In [32]: z = Myfunc(*['a','b','c'])
In [33]: z
Out[33]: {'a': [1, 2, 3], 'b': [1, 2, 3], 'c': [1, 2, 3]}

How do I index an object in python?

I have a dictionary of objects,
e.g.
{'a': (one, two, three), 'b': (four, five, six)},
and I want to know how to pull out specific parts of each object in the dictionary, so that I end up with a list of the things that are in a certain position in each object.
For example, ending up with [two, five] (the second position in each object).
How do you index the object so that this is possible?

You can't do this directly with an index operation, but the usual Pythonic approach is to use a list comprehension; e.g.
>>> D = {'a': ('one', 'two', 'three'), 'b': ('four', 'five', 'six')}
>>> [val[1] for val in D.values()]
['two', 'five']
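Keep in mind that before Python 3.7 dictionaries did not guarantee any ordering, so on those versions the order of the result is ambiguous. From Python 3.7 on, the result follows the insertion order of the keys.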
If you want a dictionary of the results, you can use a dictionary comprehension, e.g.
>>> {key:val[1] for key, val in D.items()}
{'a': 'two', 'b': 'five'}
For more information, you might check out the Python List Comprehension Docs.
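If the same extraction is needed for several positions, it can be wrapped in a small helper, or written with operator.itemgetter. Both variants below are sketches; `nth_of_each` is a hypothetical name, and the result order assumes Python 3.7+, where dicts keep insertion order.

```python
from operator import itemgetter

D = {'a': ('one', 'two', 'three'), 'b': ('four', 'five', 'six')}

def nth_of_each(mapping, n):
    """Hypothetical helper: the n-th element of every value, in dict order."""
    return [value[n] for value in mapping.values()]

print(nth_of_each(D, 1))                     # ['two', 'five']
print(list(map(itemgetter(1), D.values())))  # ['two', 'five']
```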
