get() inside dict comprehension - python-3.x

New to Python/programming, and working on dict comprehensions (Python 3).
Is it possible to use the get method inside a dict comprehension on the dict you're creating?
e.g. d = {key: d.get(key, 0) + 1 for key in some_list}
Feel free to provide a better way of accomplishing the example, but I'm really interested in understanding whether using get on the dict you're creating is possible/valid.

Nope, not possible. If you want a dict of counts, use the collections.Counter dict subclass:
import collections
counts = collections.Counter(some_list)
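To see why it fails: the name d is only bound after the whole comprehension has been evaluated, so d.get cannot see the dict under construction. Below is a minimal sketch of that behaviour next to an explicit loop that does work; some_list holds made-up sample data and count_with_comprehension is a hypothetical helper name.

```python
from collections import Counter

some_list = ['a', 'b', 'b', 'c']  # hypothetical sample data


def count_with_comprehension(items):
    # 'd' is only bound once the comprehension has finished,
    # so looking it up inside the comprehension fails.
    d = {key: d.get(key, 0) + 1 for key in items}
    return d


try:
    count_with_comprehension(some_list)
    failed = False
except NameError:  # also covers UnboundLocalError, a subclass of NameError
    failed = True

# An explicit loop works because 'counts' already exists when get() is called.
counts = {}
for key in some_list:
    counts[key] = counts.get(key, 0) + 1

print(failed)
print(counts == Counter(some_list))
```

If d had already been defined before the comprehension, d.get would silently read that older dict instead of the new one, which is not what the question intends either.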

Python has a dedicated class for handling this kind of problem: collections.Counter.
import collections
seq = ['a', 'b', 'b', 'c']
d = collections.Counter(seq) # Counter({'b': 2, 'a': 1, 'c': 1})

Related

python merging nested dictionaries into a new dictionary and update that dictionary without affecting originals

I'm using Python 3 and I'm trying to merge two dictionaries of dictionaries into one dictionary. I'm successful, however, when I add a new key/value pair to the new dictionary, it also adds it to both of the original dictionaries. I would like the original dictionaries to remain unchanged.
dict1 = {'key_val_1': {'a': '1', 'b': '2', 'c': '3'}}
dict2 = {'key_val_2': {'d': '4', 'e': '5', 'f': '6'}}
dict3 = dict1 | dict2
for x in dict3:
    dict3[x]['g'] = '7'
The above code will append 'g': '7' to all 3 dictionaries and I only want to alter dict3. I have to assume that this is the intended behavior, but for the life of me I can't understand why (or how to get the desired results).
I believe the root of your problem is the assumption that merging dict1 and dict2 with | copies the nested dictionaries. It does not: the merge creates a new outer dictionary whose values are references to the same inner dict objects (a shallow copy). So when you change an inner dict through dict3, you are changing the very object that dict1 or dict2 also refers to. To avoid this, either deep-copy the dictionaries before merging them, or build the merged dictionary yourself and copy each inner dict as you go.
Using deepcopy:
from copy import deepcopy
dict3 = deepcopy(dict1) | deepcopy(dict2)
Now dict3 contains independent copies of the nested dicts from dict1 and dict2.
To merge the dicts:
def merge(d1, d2):
    rslt = dict()
    for k in d1.keys():
        rslt[k] = d1[k].copy()  # Note: still necessary to copy the underlying dict
    for k in d2.keys():
        rslt[k] = d2[k].copy()
    return rslt
then use:
dict3 = merge(dict1, dict2)
The issue you are having arises because dict3 holds references to the sub-dicts in dict1 and dict2, and dict objects are mutable. So when you change a dict in one place, the change is visible everywhere that dict is referenced. You can verify this with the id() function. Example:
>>> print(id(dict1['key_val_1']))
140294429633472
>>> print(id(dict3['key_val_1']))
140294429633472
>>> print(id(dict2['key_val_2']))
140294429633728
>>> print(id(dict3['key_val_2']))
140294429633728
From the example above you can see that the sub-dicts of dict1 and dict2 are the same objects that dict3 references. So when you modify them through dict3, the original dicts are modified too, because dicts are mutable.
So, to solve your issue, you need to make a deep copy of each sub-dict before merging them.
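A minimal sketch of the difference, reusing dict1 and dict2 from the question. The names shallow, shares_inner and independent are made up for illustration, and {**dict1, **dict2} is used in place of dict1 | dict2 only so the snippet also runs on Python versions older than 3.9.

```python
from copy import deepcopy

dict1 = {'key_val_1': {'a': '1', 'b': '2', 'c': '3'}}
dict2 = {'key_val_2': {'d': '4', 'e': '5', 'f': '6'}}

# Shallow merge: the outer dict is new, the inner dicts are shared.
shallow = {**dict1, **dict2}
shares_inner = shallow['key_val_1'] is dict1['key_val_1']

# Deep-copy each inner dict while merging, so the result is independent.
independent = {key: deepcopy(value) for key, value in shallow.items()}
for key in independent:
    independent[key]['g'] = '7'

print(shares_inner)              # the shallow merge shares the inner dict
print('g' in dict1['key_val_1'])  # the originals were not touched
print(independent['key_val_1']['g'])
```

For inner dicts that contain only immutable values, as here, value.copy() would be enough; deepcopy matters once the inner dicts hold further mutable objects.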

Python Dictionary - Combine Dictionaries

Given a list of Dictionaries, return a new Dictionary of all of their keys combined.
This is what I have done so far:
def combine_dictionaries(dictionary_list):
    # your code goes here
    my_dictionary = {}
    for key in dictionary_list:
        my_dictionary.update(key, dictionary_list[key])
    return my_dictionary
This is the error it produces:
list indices must be integers or slices, not dict
Can someone let me know how to get an integer when I have been provided a list of dictionaries?
The expected result should look something like this:
{'a': 3, 'b': 2, 'c': 4, 4: 4, 3: 3}
Your function is almost there.
I believe that you should pass only key to update, because the dict.update method accepts either another dictionary or an iterable of key/value pairs. Each item you get from iterating over the list is already a dictionary, so there is no need to index the list with it.
def combine_dictionaries(dictionary_list):
    my_dictionary = {}
    for key in dictionary_list:
        my_dictionary.update(key)
    return my_dictionary
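As a quick check, here is the same function run on a made-up input list (sample is an assumption, the question does not show its input) that happens to produce the expected result quoted above. The loop variable is renamed to dictionary to reflect what it actually holds.

```python
def combine_dictionaries(dictionary_list):
    combined = {}
    for dictionary in dictionary_list:
        # update() copies every key/value pair; later dicts win on duplicate keys
        combined.update(dictionary)
    return combined


# Hypothetical input, chosen to reproduce the expected output from the question.
sample = [{'a': 1, 'b': 2}, {'a': 3, 'c': 4}, {4: 4, 3: 3}]
result = combine_dictionaries(sample)
print(result)
```

Note that when the same key appears in several dictionaries, the value from the last one is kept.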

Python: Create directory structure according to dict

In Python, given a dict (list or sequence) as in:
d = {'a': 9, 'b': 6, 'c': 10},
I would like to create, in a clever way, a directory structure like this:
'a_9/b_6/c_10'.
import os
def make_dir(dictionary):
    for key, val in dictionary.items():
        new_dir = f"{key}_{val}"
        os.mkdir(new_dir)
        os.chdir(new_dir)  # note: this changes the process's working directory
EDIT: Or even more concise
import os
os.makedirs("/".join(f"{key}_{val}" for key, val in d.items()))
Dictionaries preserve insertion order as of Python 3.7 (in CPython 3.6 this was only an implementation detail), so the snippets above follow the order in which the keys were inserted. If you want the directories ordered by key instead, sort the items first:
new_dir = os.path.join(*[f'{k}_{v}' for k, v in sorted(d.items())])
os.makedirs(new_dir)
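A self-contained sketch of the os.makedirs approach, run inside a temporary directory so it does not touch the current working directory. The use of tempfile and the names base, relative, target and created are assumptions for illustration only.

```python
import os
import tempfile

d = {'a': 9, 'b': 6, 'c': 10}

with tempfile.TemporaryDirectory() as base:
    # Build 'a_9/b_6/c_10' with the platform's path separator.
    relative = os.path.join(*(f"{key}_{val}" for key, val in d.items()))
    target = os.path.join(base, relative)
    # exist_ok=True makes the call safe to repeat.
    os.makedirs(target, exist_ok=True)
    created = os.path.isdir(target)

print(relative)
print(created)
```

Unlike the os.chdir loop, this leaves the process's working directory unchanged.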

Is there a simple way to manually iterate through existing pandas groupby objects?

Is there a simple way to manually iterate through existing pandas groupby objects?
import pandas as pd
df = pd.DataFrame({'x': [0, 1, 2, 3, 4], 'category': ['A', 'A', 'B', 'B', 'B']})
grouped = df.groupby('category')
In the application, a for name, group in grouped: loop follows. For manual testing I would like to do something like group = grouped[0] and then run the code inside the for loop by hand. Unfortunately this does not work. The best thing I could find (here) was
group = df[grouped.ngroup()==0]
which relies on the original DataFrame and not solely on the groupby object, and is therefore not optimal imo.
Any iterable (here the GroupBy object) can be turned into an iterator:
group_iter = iter(grouped)
The line below will be the equivalent of selecting the first group (indexed by 0):
name, group = next(group_iter)
To get the next group, just repeat:
name, group = next(group_iter)
And so on...
Source: https://treyhunner.com/2018/02/python-range-is-not-an-iterator/
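If you would rather pick a group by name than step through an iterator, a possible alternative is the GroupBy object's groups mapping together with get_group. This is a sketch of a different technique from the iterator answer above, and the variable names are made up.

```python
import pandas as pd

df = pd.DataFrame({'x': [0, 1, 2, 3, 4], 'category': ['A', 'A', 'B', 'B', 'B']})
grouped = df.groupby('category')

# grouped.groups maps each group name to the row labels in that group.
names = list(grouped.groups)

# Fetch the first group by its name, without touching df directly.
name = names[0]
group = grouped.get_group(name)

print(name)
print(group)
```

This gives the same name, group pair that the first iteration of the for loop would produce.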

How do I create a default dictionary of dictionaries

I am trying to write some code that involves creating a default dictionary of dictionaries. However, I have no idea how to initialise/create such a thing. My current attempt looks something like this:
from collections import defaultdict
inner_dict = {}
dict_of_dicts = defaultdict(inner_dict(int))
I want to use this default dict of dictionaries as follows: for each pair of words that I produce from a file I open (e.g. [['M UH M', 'm oo m']]), each space-delimited segment of the first word becomes a key in the outer dictionary, and for the segment at the same position in the second word I count how often it occurs.
For example
[['M UH M', 'm oo m']]
(<class 'dict'>, {'M': {'m': 2}, 'UH': {'oo': 1}})
Having just run this now it doesn't seem to have output any errors, however I was just wondering if something like this will actually produce a default dictionary of dictionaries.
Apologies if this is a duplicate, however previous answers to these questions have been confusing and in a different context.
To initialise a defaultdict that creates dictionaries as its default value you would use:
d = defaultdict(dict)
For this particular problem, a collections.Counter would be more suitable as the default value:
>>> from collections import defaultdict, Counter
>>> d = defaultdict(Counter)
>>> for a, b in zip(*[x.split() for x in ['M UH M', 'm oo m']]):
...     d[a][b] += 1
...
>>> print(d)
defaultdict(<class 'collections.Counter'>, {'M': Counter({'m': 2}), 'UH': Counter({'oo': 1})})
Edit
You expressed interest in a comment about the equivalent without a Counter. Here is the equivalent using a plain dict
>>> from collections import defaultdict
>>> d = defaultdict(dict)
>>> for a, b in zip(*[x.split() for x in ['M UH M', 'm oo m']]):
...     d[a][b] = d[a].get(b, 0) + 1
...
>>> print(d)
defaultdict(<class 'dict'>, {'M': {'m': 2}, 'UH': {'oo': 1}})
You could also use a normal dictionary and its setdefault method.
my_dict.setdefault(key, default) will look up my_dict[key] and ...
... if the key already exists, return its current value without modifying it, or ...
... assign the default value (my_dict[key] = default) and then return that.
So whenever you want to get a value from your outer dictionary, you can call my_dict.setdefault(key, {}) instead of the normal my_dict[key]. It returns either the real value stored under that key, if it's present, or a new empty dictionary as the default value, which is automatically stored in your outer dictionary as well.
Example:
outer_dict = {"M": {"m": 2}}
inner_dict = outer_dict.setdefault("UH", {})
# outer_dict = {"M": {"m": 2}, "UH": {}}
# inner_dict = {}
inner_dict["oo"] = 1
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1}}
# inner_dict = {"oo": 1}
inner_dict = outer_dict.setdefault("UH", {})
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1}}
# inner_dict = {"oo": 1}
inner_dict["xy"] = 3
# outer_dict = {"M": {"m": 2}, "UH": {"oo": 1, "xy": 3}}
# inner_dict = {"oo": 1, "xy": 3}
This way you always get a valid inner_dict, either an empty default one or the one that's already present for the given key. As dictionaries are mutable data types, modifying the returned inner_dict will also modify the dictionary inside outer_dict.
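Applied to the word pairs from the question, the setdefault approach could look like the sketch below. The name pairs is made up, and pairing the segments by position is an assumption based on the example output in the question.

```python
pairs = [['M UH M', 'm oo m']]  # sample data from the question

outer_dict = {}
for first, second in pairs:
    # Pair up the space-delimited segments by position.
    for a, b in zip(first.split(), second.split()):
        # Get the inner dict for 'a', creating an empty one on first use.
        inner_dict = outer_dict.setdefault(a, {})
        inner_dict[b] = inner_dict.get(b, 0) + 1

print(outer_dict)
```

The result is a plain dict of plain dicts, so there are no defaultdict side effects when you later look up keys that do not exist.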
The other answers propose alternative solutions or show that you can make a default dictionary of dictionaries using d = defaultdict(dict),
but the question asked how to make a default dictionary of default dictionaries. My naive first attempt was this:
from collections import defaultdict
my_dict = defaultdict(defaultdict(list))
However, this throws an error: *** TypeError: first argument must be callable or None
So my second attempt, which works, is to make a callable using the lambda keyword to create an anonymous function:
from collections import defaultdict
my_dict = defaultdict(lambda: defaultdict(list))
which is more concise than the alternative method using a regular function:
from collections import defaultdict
def default_dict_maker():
    return defaultdict(list)
my_dict = defaultdict(default_dict_maker)
you can check it works by assigning:
>>> my_dict[2][3] = 5
>>> my_dict[2][3]
5
or by trying to return a value:
>>> my_dict[0][0]
[]
>>> my_dict[5]
defaultdict(<class 'list'>, {})
tl;dr
this is your one-line answer: my_dict = defaultdict(lambda: defaultdict(list))
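For the counting use case from the question, the same pattern with int as the inner default is one possible sketch. The names counts and plain are made up, and the segments are paired by position as in the question's example.

```python
from collections import defaultdict

# Outer default: a new inner defaultdict; inner default: the integer 0.
counts = defaultdict(lambda: defaultdict(int))

for a, b in zip('M UH M'.split(), 'm oo m'.split()):
    counts[a][b] += 1

# Convert to plain dicts for printing or comparison.
plain = {outer: dict(inner) for outer, inner in counts.items()}
print(plain)
```

Keep in mind that merely reading counts[key] inserts that key, so use the in operator or .get() when you only want to test for presence.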
