How to extend values in dictionaries using key in Python - python-3.x

How do I extend the values in a dictionary from a list of dictionaries using the keys as the main constraint, say:
d = {'a': (), 'b': 0, 'c': "", 'd': ""}
l = [ {'a': (23, 48), 'b': 34, 'c': "fame", 'd': "who"},
      {'a': (94, 29), 'b': 3, 'c': "house", 'd': "cats"},
      {'a': (23, 12), 'b': 93, 'c': "imap", 'd': "stack"},
]
to give
d = {'a': [(23, 48), (94, 29), (23, 12)], 'b': [34, 3, 93],
     'c': ["fame", "house", "imap"], 'd': ['who', 'cats', 'stack'] }
Code used:
for i in l:
    d["a"].extend(i.get('a'))
    d["b"].extend(i.get('b'))
    d["c"].extend(i.get('c'))
    d['d'].extend(i.get('d'))

You should initialize d as an empty dict instead, so that you can iterate through l and the key-value pairs to keep appending the values to the sub-list of d at the given keys:
l = [
{'a': (23, 48), 'b': 34, 'c': "fame", 'd': "who"},
{'a': (94, 29), 'b': 3, 'c': "house", 'd': "cats"},
{'a': (23, 12), 'b': 93, 'c': "imap", 'd': "stack"},
]
d = {}
for s in l:
    for k, v in s.items():
        d.setdefault(k, []).append(v)
d becomes:
{'a': [(23, 48), (94, 29), (23, 12)],
'b': [34, 3, 93],
'c': ['fame', 'house', 'imap'],
'd': ['who', 'cats', 'stack']}
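It may help to see why the attempt in the question fails. There, d['a'] starts as a tuple and d['b'] as an int, and neither has an extend method. Even on a real list, extend adds each element of its argument separately instead of adding the value as a whole. A minimal sketch of the difference (the variable names are illustrative, not from the question):

```python
# append() adds its argument as a single element.
appended = []
appended.append((23, 48))
appended.append("fame")
print(appended)  # [(23, 48), 'fame']

# extend() iterates over its argument and adds each item separately,
# so tuples are flattened and strings are split into characters.
extended = []
extended.extend((23, 48))
extended.extend("fame")
print(extended)  # [23, 48, 'f', 'a', 'm', 'e']

# extend() with a non-iterable such as an int raises TypeError.
try:
    [].extend(34)
except TypeError as exc:
    print("TypeError:", exc)
```

This is why the answer uses append on a list for every key.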
If the sub-dicts in l may contain other keys, you can instead initialize d as a dict of empty lists under the desired keys:
l = [
{'a': (23, 48), 'b': 34, 'c': "fame", 'd': "who"},
{'a': (94, 29), 'b': 3, 'c': "house", 'd': "cats"},
{'a': (23, 12), 'b': 93, 'c': "imap", 'd': "stack"},
{'e': 'choices'}
]
d = {k: [] for k in ('a', 'b', 'c', 'd')}
for s in l:
    for k in d:
        d[k].append(s.get(k))
in which case d becomes:
{'a': [(23, 48), (94, 29), (23, 12), None],
'b': [34, 3, 93, None],
'c': ['fame', 'house', 'imap', None],
'd': ['who', 'cats', 'stack', None]}

You can also use collections.defaultdict with list as the default factory, so that every missing key starts out as an empty list (https://docs.python.org/2/library/collections.html#collections.defaultdict):
import collections
d = collections.defaultdict(list)
keys = ['a', 'b', 'c', 'd']
for i in l:
    for k in keys:
        d[k].append(i[k])
print(d)
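Note that i[k] raises KeyError when one of the dictionaries lacks a key. Below is a self-contained sketch of the same defaultdict idea, assuming that a missing key should be recorded as None. That choice, and the extra {'e': ...} entry, are illustrative assumptions rather than part of the answer.

```python
import collections

l = [
    {'a': (23, 48), 'b': 34, 'c': "fame", 'd': "who"},
    {'a': (94, 29), 'b': 3, 'c': "house", 'd': "cats"},
    {'e': 'choices'},  # hypothetical entry lacking the wanted keys
]
keys = ['a', 'b', 'c', 'd']

d = collections.defaultdict(list)
for i in l:
    for k in keys:
        # .get() returns None instead of raising KeyError for a missing key
        d[k].append(i.get(k))

print(dict(d))
```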

Related

How can we include only specific JSON fields for comparison in DeepDiff (Python)?

Here we have two JSON documents. I want to include only the 'id' field in the comparison; the rest of the fields should be ignored.
j1 = {'MyList': [{'a': 1, 'b': 2, 'c': [{'id': '1'}]},
{'a': 1, 'b': 2, 'c': [{'id': '2'}]}], "j": "2222"}
j2 = {'MyList': [{'a': 1, 'b': 2, 'c': [{'id': '4'}]},
{'a': 1, 'b': 2, 'c': [{'id': '7'}]}], "j": "7777"}
Please suggest how we can achieve this using DeepDiff.
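No answer is recorded here. One library-independent approach, offered only as a sketch, is to reduce both documents to their 'id' fields before comparing, so that every other field is ignored by construction. The helper collect_ids below is hypothetical (it is not part of DeepDiff), and the comparison uses plain dict lookups rather than DeepDiff itself.

```python
def collect_ids(obj, path=()):
    """Return {path: value} for every 'id' key found in nested dicts/lists."""
    found = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == 'id':
                found[path + (key,)] = value
            else:
                found.update(collect_ids(value, path + (key,)))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            found.update(collect_ids(item, path + (index,)))
    return found

j1 = {'MyList': [{'a': 1, 'b': 2, 'c': [{'id': '1'}]},
                 {'a': 1, 'b': 2, 'c': [{'id': '2'}]}], "j": "2222"}
j2 = {'MyList': [{'a': 1, 'b': 2, 'c': [{'id': '4'}]},
                 {'a': 1, 'b': 2, 'c': [{'id': '7'}]}], "j": "7777"}

ids1 = collect_ids(j1)
ids2 = collect_ids(j2)

# Keep only the paths whose 'id' value differs (or exists on one side only).
changed = {p: (ids1.get(p), ids2.get(p))
           for p in ids1.keys() | ids2.keys()
           if ids1.get(p) != ids2.get(p)}
print(changed)
```

The difference in "j" never shows up, because only 'id' entries are collected.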

Is it possible to make a dictionary with lists as values from a list of dictionaries, in a one line comprehension?

If I have list of dicts A:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
can I make the following dict:
B = {'a': [1, 3], 'b': [2, 4]}
using only dict/list comprehension?
bonus: can I also account for varied keys in A e.g:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4, 'c': 5}]
B = {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
I have managed to do this with for loops and if statements, but was hoping for something that processes faster.
Try:
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]
out = {k: [d.get(k) for d in A] for k in set(k for d in A for k in d)}
print(out)
Prints:
{'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
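A set does not guarantee iteration order, so the keys of out may come out in a different order than shown. If first-seen key order matters, a possible variant (assuming Python 3.7+, where dicts preserve insertion order) collects the keys with dict.fromkeys instead:

```python
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]

# dict.fromkeys() drops duplicates and keeps keys in first-seen order.
keys = dict.fromkeys(k for d in A for k in d)
out = {k: [d.get(k) for d in A] for k in keys}
print(out)  # {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
```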

Strange behavior by list of dictionaries [duplicate]

This question already has answers here:
How to copy a dictionary and only edit the copy
(23 answers)
Closed 1 year ago.
I have a list of dictionaries as follows:
a = [{'a':1, 'b':2, 'c':3}, {'d':4, 'e':5, 'f':6}]
Now I want another list b to have the same contents as a, but with one extra (key, value) pair. So I do it as:
b = a.copy()
for item in b:
    item['x'] = 6
But now both the lists a and b have 'x': 6 sitting in them.
>>> b
[{'a': 1, 'b': 2, 'c': 3, 'x': 6}, {'d': 4, 'e': 5, 'f': 6, 'x': 6}]
>>> a
[{'a': 1, 'b': 2, 'c': 3, 'x': 6}, {'d': 4, 'e': 5, 'f': 6, 'x': 6}]
I also tried this:
c = a[:]
for item in c:
    item['q'] = 12
And now all the three lists have 'q': 12.
>>> c
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
>>> b
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
>>> a
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
I can't understand how this works. It would have been expected if I had done b = a, but why does it happen for b = a.copy() and c = a[:]?
Thanks in advance:)
To copy the list together with all the objects it references, use the deepcopy() function from the copy module instead of the list's copy() method:
import copy
a = [{'a':1, 'b':2, 'c':3}, {'d':4, 'e':5, 'f':6}]
b = copy.deepcopy(a)
You can use @maziyank's solution. The explanation is that a.copy() and a[:] make only shallow copies: each creates a new list, but the new list still refers to the same dictionary objects as a. Only copy.deepcopy() copies the dictionaries as well.
Thus any change made to one of those dictionaries shows up in every list that refers to it.
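The sharing can be checked directly with the is operator. The sketch below also shows an alternative that builds new dictionaries in a list comprehension; that alternative is an addition here, not part of the answers, and it copies each dictionary only one level deep, which is enough when the values are immutable as in this example.

```python
import copy

a = [{'a': 1, 'b': 2, 'c': 3}, {'d': 4, 'e': 5, 'f': 6}]

shallow = a.copy()
print(shallow is a)        # False: a new list object
print(shallow[0] is a[0])  # True: but the same dict inside it

# deepcopy() copies the dicts too, so edits do not reach a.
deep = copy.deepcopy(a)
for item in deep:
    item['x'] = 6

# Alternative: build a new dict per item with the extra pair added.
b = [{**item, 'x': 6} for item in a]

print(a)  # unchanged: no 'x' key anywhere
print(b)
```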

writing a generator that yields dictionaries of base frequencies of nucleotides

I am trying to write a function that returns a generator that can be iterated over all starting position of a k-window in the DNA sequence. For each starting position, the generator returns the nucleotide frequencies in the window as a dictionary.
def sliding(s,k):
    d = {}
    for i in range(len(s)-3):
        chunk = ''.join([s[i],s[i+(k-3)],s[i+(k-2)],s[i+(k-1)]])
        for j in chunk:
            if j not in d:
                d[j] = 1
            else:
                d[j] += 1
        yield d
seq = "ACGTTGCA"
for d in sliding(seq,4):
    print(d)
Output:
{'A': 1, 'C': 1, 'G': 1, 'T': 1}
{'A': 1, 'C': 2, 'G': 2, 'T': 3}
{'A': 1, 'C': 2, 'G': 4, 'T': 5}
{'A': 1, 'C': 3, 'G': 5, 'T': 7}
{'A': 2, 'C': 4, 'G': 6, 'T': 8}
Expected Output:
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 2, 'C': 0, 'A': 0, 'G': 2}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
However, as one can see, my function uses the same dictionary for all the windows, so the nucleotide counts accumulate under the same keys in every iteration. Each window (chunk) should get its own dictionary.
You should initialize d inside the loop instead so that it starts with a new dict for each iteration:
for i in range(len(s) - 3):
    d = {}
    ...
If you want the dicts in the output to always have the same keys even if their values are 0, as suggested by your expected output, you can initialize a dict with all of the distinct letters as keys, and copy the dict to d for each iteration:
initialized_dict = dict.fromkeys(s, 0)
for i in range(len(s) - 3):
    d = initialized_dict.copy()
    ...
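Putting the two suggestions together, a complete version might look like the sketch below. Two details are assumptions that go beyond the answer: the window is taken with the slice s[i:i + k], and the loop bound is len(s) - k + 1 rather than the hard-coded len(s) - 3, which gives the same range for k = 4.

```python
def sliding(s, k):
    """Yield a dict of nucleotide counts for each k-wide window of s."""
    # One zero entry per distinct letter, so every window has the same keys.
    zero_counts = dict.fromkeys(s, 0)
    for i in range(len(s) - k + 1):
        d = zero_counts.copy()  # fresh dict for this window
        for base in s[i:i + k]:
            d[base] += 1
        yield d

seq = "ACGTTGCA"
windows = list(sliding(seq, 4))
for d in windows:
    print(d)
```

Because each window gets its own copy, collecting the results in a list keeps five distinct dictionaries instead of five references to one.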

Simultaneously create attributes and edges (if same attr. exists) in NetworkX

After creating nodes in NetworkX, I would like to add edges between nodes if both nodes have at least one attribute with the same value.
It seems to be a problem that not all nodes contain the same number of attributes - could this be the case, and if so, how should I solve it?
import networkx as nx
from itertools import product
# Mothernodes
M = [('E_%d' % h, {'a': i, 'b': j, 'c': k, 'd': l})
     for h, (i, j, k, l) in enumerate(product(range(2), repeat=4), start=1)]
# children nodes
a = [ ( 'a_%d' % i, {'a' : i}) for i in range(0,2) ]
b = [ ( 'b_%d' % i, {'b' : i}) for i in range(0,2) ]
c = [ ( 'c_%d' % i, {'c' : i}) for i in range(0,2) ]
d = [ ( 'd_%d' % i, {'d' : i}) for i in range(0,2) ]
# graph containing both
M_c = nx.Graph()
M_c.add_nodes_from(M)
ls_children = [a, b, c , d]
for ls_c in ls_children:
    M_c.add_nodes_from(ls_c)
list(M_c.nodes(data=True))[0:20]
[('E_9', {'a': 1, 'b': 0, 'c': 0, 'd': 0}),
('d_0', {'d': 0}),
('E_10', {'a': 1, 'b': 0, 'c': 0, 'd': 1}),
('b_0', {'b': 0}),
('E_2', {'a': 0, 'b': 0, 'c': 0, 'd': 1}),
('E_1', {'a': 0, 'b': 0, 'c': 0, 'd': 0}),
('c_1', {'c': 1}),
...
]
# attempting to add edges if one attribute is overlapping/the same:
for start in M_c.nodes(data=True):
    for end in M_c.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr] == end[1][attr]:
                M_c.add_edge(start[0] ,end[0] )
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-4-22b88809e853> in <module>()
      2     for end in M_c.nodes(data=True):
      3         for attr in list(start[1].keys()):
----> 4             if start[1][attr] == end[1][attr]:
      5                 M_c.add_edge(start[0] ,end[0] )
KeyError: 'b'
EDIT2:
I have now tried to test both start and end for existence of attributes, but I still get an error:
for start in M_c.nodes(data=True):
    for end in M_c.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr]:
                if end[1][attr]:
                    if start[1][attr] == end[1][attr]:
                        M_c.add_edge(start[0], end[0] )
            # Adding an else and continue statement does not affect the error,
            # even adding three of them, for each if statement
            # else:
            #     continue
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-5-32ae2a6095e5> in <module>()
      3         for attr in list(start[1].keys()):
      4             if start[1][attr]:
----> 5                 if end[1][attr]:
      6                     if start[1][attr] == end[1][attr]:
      7                         M_c.add_edge(start[0], end[0] )
KeyError: 'a'
Creating edges between nodes sharing the same attribute value
import networkx as nx
import random
import matplotlib.pyplot as plt
# create nodes
G = nx.Graph()
_ = [G.add_node(i, a=random.randint(1,10),
                b=random.randint(1,10),
                c=random.randint(1,10)) for i in range(20)]
for node in G.nodes(data=True):
    print(node)
[out:]
(0, {'a': 10, 'b': 10, 'c': 3})
(1, {'a': 10, 'b': 6, 'c': 2})
(2, {'a': 4, 'b': 5, 'c': 5})
(3, {'a': 5, 'b': 10, 'c': 8})
(4, {'a': 3, 'b': 10, 'c': 8})
(5, {'a': 7, 'b': 1, 'c': 9})
(6, {'a': 10, 'b': 8, 'c': 8})
(7, {'a': 7, 'b': 7, 'c': 2})
(8, {'a': 6, 'b': 3, 'c': 9})
(9, {'a': 1, 'b': 5, 'c': 7})
(10, {'a': 3, 'b': 2, 'c': 8})
(11, {'a': 4, 'b': 1, 'c': 6})
(12, {'a': 3, 'b': 10, 'c': 3})
(13, {'a': 4, 'b': 5, 'c': 3})
(14, {'a': 7, 'b': 10, 'c': 4})
(15, {'a': 1, 'b': 1, 'c': 10})
(16, {'a': 1, 'b': 9, 'c': 1})
(17, {'a': 3, 'b': 8, 'c': 4})
(18, {'a': 6, 'b': 8, 'c': 7})
(19, {'a': 7, 'b': 10, 'c': 4})
Creating edges
for start in G.nodes(data=True):
    for end in G.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr] == end[1][attr]:
                G.add_edge(start[0] ,end[0] )
Plotting the network
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos)
nx.draw_networkx_edges(G,pos)
nx.draw_networkx_labels(G,pos, font_color='w')
plt.axis('off')
plt.show()
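The loop above works here because every node has all three attributes. In the question, the nodes carry different attribute sets, so end[1][attr] raises KeyError. The test `if end[1][attr]:` in EDIT2 does not help, because it still indexes the missing key (and it would also skip attributes whose value is 0). One way around this, sketched below in plain Python, is to compare only the keys that both nodes share. The helper name shared_attribute_edges is hypothetical, and using combinations to visit each pair once (which also avoids self-loops) is a design choice of this sketch.

```python
from itertools import combinations

def shared_attribute_edges(nodes):
    """Return (name1, name2) pairs that agree on at least one shared attribute.

    `nodes` is an iterable of (name, attribute_dict) tuples, the same shape
    that G.nodes(data=True) yields.
    """
    edges = []
    for (name1, attrs1), (name2, attrs2) in combinations(nodes, 2):
        common = attrs1.keys() & attrs2.keys()  # keys present in both dicts
        if any(attrs1[key] == attrs2[key] for key in common):
            edges.append((name1, name2))
    return edges

# A small subset of the nodes from the question.
nodes = [
    ('E_1', {'a': 0, 'b': 0, 'c': 0, 'd': 0}),
    ('E_2', {'a': 0, 'b': 0, 'c': 0, 'd': 1}),
    ('a_0', {'a': 0}),
    ('a_1', {'a': 1}),
    ('d_1', {'d': 1}),
]
edges = shared_attribute_edges(nodes)
print(edges)
```

With NetworkX, the resulting pairs could then be added in one call, for example M_c.add_edges_from(shared_attribute_edges(M_c.nodes(data=True))).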
