Dictionary copy() - is a shallow copy deep sometimes? - python-3.x

According to the official docs, the dictionary copy() is shallow, i.e. it returns a new dictionary that contains the same key-value pairs:
dict1 = {1: "a", 2: "b", 3: "c"}
dict1_alias = dict1
dict1_shallow_copy = dict1.copy()
My understanding is that if we del an element of dict1, both dict1_alias and dict1_shallow_copy should be affected; a deep copy, however, would not be.
del dict1[2]
print(dict1)
>>> {1: 'a', 3: 'c'}
print(dict1_alias)
>>> {1: 'a', 3: 'c'}
But dict1_shallow_copy's 2nd element is still there!
print(dict1_shallow_copy)
>>> {1: 'a', 2: 'b', 3: 'c'}
What am I missing?

A shallow copy means that the elements themselves are shared; it is only the dictionary object itself that is new.
>>> a = {'a': [1, 2, 3],  # create a list instance at a['a']
...      'b': 4,
...      'c': 'efd'}
>>> b = a.copy() #shallow copy a
>>> b['a'].append(2) #change b['a']
>>> b['a']
[1, 2, 3, 2]
>>> a['a'] #a['a'] changes too, it refers to the same list
[1, 2, 3, 2]
>>> del b['b'] #here we do not change b['b'], we change b
>>> b
{'a': [1, 2, 3, 2], 'c': 'efd'}
>>> a #so a remains unchanged
{'a': [1, 2, 3, 2], 'b': 4, 'c': 'efd'}
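For contrast, here is a minimal sketch (reusing the same a from above) of what copy.deepcopy does: the nested list is copied as well, so mutating it through one dictionary no longer shows up in the other.
>>> import copy
>>> a_deep = copy.deepcopy(a)  # deep copy: the nested list is duplicated too
>>> a_deep['a'].append(9)
>>> a_deep['a']
[1, 2, 3, 2, 9]
>>> a['a']  # the original list is untouched
[1, 2, 3, 2]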

Related

Is it possible to make a dictionary with lists as values from a list of dictionaries, in a one line comprehension?

If I have a list of dicts A:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
can I make the following dict:
B = {'a': [1, 3], 'b': [2, 4]}
using only dict/list comprehension?
Bonus: can I also account for varied keys in A, e.g.:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4, 'c': 5}]
B = {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
I have managed to do this with for loops and if statements, but was hoping for something that runs faster.
Try:
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]
out = {k: [d.get(k) for d in A] for k in set(k for d in A for k in d)}
print(out)
Prints:
{'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
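If key order matters, a sketch of a variant (not something the question requires) keeps the keys in first-seen order by collecting them with dict.fromkeys instead of a set; this relies on dicts preserving insertion order (Python 3.7+):
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]
keys = dict.fromkeys(k for d in A for k in d)  # unique keys, first-seen order
out = {k: [d.get(k) for d in A] for k in keys}
print(out)  # {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}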

Extending a list

Currently I am tidying up one of my projects by using the more Pythonic way of doing things. Now I'm struggling with extending a list by values from a dictionary that are themselves lists.
my_dict = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [4, 5, 6]}
criteria = ['a', 'c']
my_list = []
for c in criteria:
    my_list.extend(my_dict[c])
Results in [1, 2, 3, 4, 5, 6], which is the sought-after result, whereas
my_list = []
my_list.extend(my_dict[c] for c in criteria)
Results in a nested list [[1, 2, 3], [4, 5, 6]]. I can't quite figure out why this is happening.
Your code does not work as intended because it extends the list with the result of the generator expression, which yields a sequence of lists:
>>> list(my_dict[c] for c in criteria)
[[1, 2, 3], [4, 5, 6]]
This is because my_dict[c] is itself a list.
A more Pythonic way is to use a list comprehension:
my_dict = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [4, 5, 6]}
criteria = ['a', 'c']
my_list = [item for k in criteria for item in my_dict[k]]
>>> my_list
[1, 2, 3, 4, 5, 6]
This uses a nested loop to select the values from the dict by criteria and to flatten the lists that are those values.
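If you prefer to keep the generator style, an alternative sketch using itertools.chain.from_iterable (one option among several) flattens the selected lists lazily:
from itertools import chain

my_dict = {'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [4, 5, 6]}
criteria = ['a', 'c']

# chain.from_iterable yields the items of each selected list in turn
my_list = list(chain.from_iterable(my_dict[c] for c in criteria))
print(my_list)  # [1, 2, 3, 4, 5, 6]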

Can anyone help me find a reason why my function, after return, changes the values of the dict I am passing? [duplicate]

While reading the documentation for dict.copy(), I saw that it makes a shallow copy of the dictionary. The same goes for the book I am following (Beazley's Python Reference), which says:
The m.copy() method makes a shallow copy of the items contained in a mapping object and places them in a new mapping object.
Consider this:
>>> original = dict(a=1, b=2)
>>> new = original.copy()
>>> new.update({'c': 3})
>>> original
{'a': 1, 'b': 2}
>>> new
{'a': 1, 'c': 3, 'b': 2}
So I assumed this would also update original (and add 'c': 3), since I was doing a shallow copy, like when you do it for a list:
>>> original = [1, 2, 3]
>>> new = original
>>> new.append(4)
>>> new, original
([1, 2, 3, 4], [1, 2, 3, 4])
This works as expected.
Since both are shallow copies, why is it that dict.copy() doesn't work as I expect it to? Or is my understanding of shallow vs. deep copying flawed?
By "shallow copying" it means the content of the dictionary is not copied by value, but just creating a new reference.
>>> a = {1: [1,2,3]}
>>> b = a.copy()
>>> a, b
({1: [1, 2, 3]}, {1: [1, 2, 3]})
>>> a[1].append(4)
>>> a, b
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
In contrast, a deep copy will copy all contents by value.
>>> import copy
>>> c = copy.deepcopy(a)
>>> a, c
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
>>> a[1].append(5)
>>> a, c
({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})
So:
b = a: reference assignment; a and b point to the same object.
b = a.copy(): shallow copy; a and b become two separate dict objects, but their contents still share the same references.
b = copy.deepcopy(a): deep copy; a and b's structure and contents become completely independent.
Take this example:
original = dict(a=1, b=2, c=dict(d=4, e=5))
new = original.copy()
Now let's change a value in the 'shallow' (first) level:
new['a'] = 10
# new = {'a': 10, 'b': 2, 'c': {'d': 4, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 4, 'e': 5}}
# no change in original: the assignment rebinds 'a' in new only (integers are immutable, so the shared 1 could not be changed in place anyway)
Now let's change a value one level deeper:
new['c']['d'] = 40
# new = {'a': 10, 'b': 2, 'c': {'d': 40, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 40, 'e': 5}}
# new['c'] and original['c'] point to the same mutable dictionary, so the change shows up in both
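A quick way to confirm what the comments claim, reusing original and new from this example (a small sketch using identity checks):
print(new is original)            # False: two separate dict objects
print(new['c'] is original['c'])  # True: the nested dict is shared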
It's not a matter of deep copy vs. shallow copy; none of what you're doing is a deep copy.
Here:
>>> new = original
you're creating a new reference to the list/dict referenced by original.
while here:
>>> new = original.copy()
>>> # or
>>> new = list(original) # dict(original)
you're creating a new list/dict which is filled with copies of the references to the objects contained in the original container.
Adding to kennytm's answer: when you do a shallow copy with parent.copy(), a new dictionary is created with the same keys, but the values are not copied; they are referenced. If you add a new value to parent_copy it won't affect parent, because parent_copy is a new dictionary, not a reference.
parent = {1: [1, 2, 3]}
parent_copy = parent.copy()
parent_reference = parent
print(id(parent), id(parent_copy), id(parent_reference))
# 140690938288400 140690938290536 140690938288400
print(id(parent[1]), id(parent_copy[1]), id(parent_reference[1]))
# 140690938137128 140690938137128 140690938137128
parent_copy[1].append(4)
parent_copy[2] = ['new']
print(parent, parent_copy, parent_reference)
# {1: [1, 2, 3, 4]} {1: [1, 2, 3, 4], 2: ['new']} {1: [1, 2, 3, 4]}
The id() values of parent[1] and parent_copy[1] are identical, which implies that the list [1, 2, 3] referenced by parent[1] and parent_copy[1] is the same object (stored at id 140690938137128).
But the id() values of parent and parent_copy are different, which implies they are different dictionaries: parent_copy is a new dictionary whose values are references to the values of parent.
"new" and "original" are different dicts; that's why you can update just one of them. The items are shallow-copied, not the dict itself.
In your second part (the list example), you should use new = original.copy(); .copy() and = are different things.
Contents are shallow-copied.
So if the original dict contains a list or another dictionary, modifying one of them through the original or through its shallow copy will modify it (the list or the dict) in the other as well.
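To tie this back to the question title: a function that mutates a dict argument mutates the caller's dict, because only a reference is passed in. Here is a minimal sketch (the function and field names are made up for illustration) showing why a shallow copy is not always enough as a defensive copy, while copy.deepcopy is:
import copy

def add_tag(record):
    record['tags'].append('seen')  # mutates the nested list in place

data = {'id': 1, 'tags': []}
add_tag(dict(data))                # shallow copy: the 'tags' list is still shared
print(data)                        # {'id': 1, 'tags': ['seen']}

data = {'id': 1, 'tags': []}
add_tag(copy.deepcopy(data))       # deep copy: the nested list is duplicated
print(data)                        # {'id': 1, 'tags': []}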

A better way to create a dictionary out of two lists with duplicated values in one

I have two lists:
a = ["A", "B", "B", "C", "D", "A"]
b = [1, 2, 3, 4, 5, 6]
I want to have a dictionary like the one below:
d = {"A":[1, 6], "B":[2, 3], "C":[4], "D":[5]}
Right now I am doing something like this:
d = {i: [] for i in set(a)}
for c in zip(a, b):
    d[c[0]].append(c[1])
Is there a better way to do this?
You can use the dict.setdefault method to initialize each key as a list and then append the current value to it while you iterate:
d = {}
for k, v in zip(a, b):
    d.setdefault(k, []).append(v)
With the sample input, d would become:
{'A': [1, 6], 'B': [2, 3], 'C': [4], 'D': [5]}
Running your code didn't produce the output you wanted. The below is a bit more verbose, but it makes it easy to see what the code is doing.
a = ["A", "B", "B", "C", "D", "A"]
b = [1, 2, 3, 4, 5, 6]
c = {}
for a, b in zip(a, b):
if a in c:
if isinstance(a, list):
c[a].append(b)
else:
c[a] = [c[a], b]
else:
c[a] = b
print(c)
OUTPUT
{'A': [1, 6], 'B': [2, 3], 'C': 4, 'D': 5}
A less verbose alternative would be to use a defaultdict with list as the default type; you can then append all the items to it. This will mean even single items end up in a list. To me it's much cleaner, as you will know the data type of each value in the dict. However, if you really do want single items not wrapped in a list, you can clean it up afterwards with a dict comprehension.
from collections import defaultdict

a = ["A", "B", "B", "C", "D", "A"]
b = [1, 2, 3, 4, 5, 6]
c = defaultdict(list)
for key, val in zip(a, b):  # avoid shadowing a and b inside the loop
    c[key].append(val)
d = {k: v if len(v) > 1 else v[0] for k, v in c.items()}
print(c)
print(d)
OUTPUT
defaultdict(<class 'list'>, {'A': [1, 6], 'B': [2, 3], 'C': [4], 'D': [5]})
{'A': [1, 6], 'B': [2, 3], 'C': 4, 'D': 5}

Making a special sort of dictionary

I have a Python program in which I have a list that resembles the one below:
a = [[1,2,3], [4,2,7], [5,2,3], [7,8,5]]
Here I want to create a dictionary using the middle value of each sublist as the key; it should look something like this:
b = {2:[[1,2,3], [4,2,7], [5,2,3]], 8: [[7,8,5]]}
How can I achieve this?
You can do it simply like this:
a = [[1, 2, 3], [4, 2, 7], [5, 2, 3], [7, 8, 5]]
b = {}
for l in a:
    m = l[len(l) // 2]  # get the middle element
    if m in b:
        b[m].append(l)
    else:
        b[m] = [l]
print(b)
Output:
{2: [[1, 2, 3], [4, 2, 7], [5, 2, 3]], 8: [[7, 8, 5]]}
You could also use a defaultdict to avoid the if in the loop:
from collections import defaultdict

b = defaultdict(list)
for l in a:
    m = l[len(l) // 2]
    b[m].append(l)
print(b)
Output:
defaultdict(<class 'list'>, {2: [[1, 2, 3], [4, 2, 7], [5, 2, 3]], 8: [[7, 8, 5]]})
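If you'd rather not see the defaultdict wrapper in the output, converting back with dict() should give a plain dictionary (a small usage note, not part of the original answer):
print(dict(b))
# {2: [[1, 2, 3], [4, 2, 7], [5, 2, 3]], 8: [[7, 8, 5]]}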
Here is a solution that uses dictionary comprehension:
from itertools import groupby

a = [[1, 2, 3], [4, 2, 7], [5, 2, 3], [7, 8, 5]]

def get_mid(x):
    return x[len(x) // 2]

b = {key: list(val) for key, val in groupby(sorted(a, key=get_mid), get_mid)}
print(b)
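For the sample a above, this should produce the same grouping as the loop-based answers; the sorted(...) call matters because groupby only groups consecutive items with equal keys.
Output:
{2: [[1, 2, 3], [4, 2, 7], [5, 2, 3]], 8: [[7, 8, 5]]}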
