python3 recursion based on tree structure - python-3.x

I have this code with a recursive function:
def myPermutation (newString, newDict):
    if sum(newDict.values()) == 0:
        print(newDict)
        return
    else:
        curDict = newDict
        nextDict=newDict
        for char in newString:
            # print('from line 09 -> ', curDict)
            # print('from line 10 -> ', char, curDict[char])
            if curDict[char] == 0:
                continue
            else:
                # print(char)
                print(curDict)
                nextDict[char] -= 1
                print(nextDict)
                myPermutation(newString, nextDict)
                nextDict=curDict
        return
newString = 'AB'
# newDict = curDict(newString)
newDict = {'A': 1, 'B': 1}
# print(newString, newDict)
test = myPermutation(newString, newDict)
# print(test)
My output is this:
{'A': 1, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}
It looks like my recursive function is not working correctly. While debugging, I found that when the function starts the second iteration of the top-level loop (moving from 'A' to 'B' at the first level of the tree), the dictionary has changed from {'A':1, 'B':1} to {'A':0, 'B':0}. The expected output should be something like:
{'A': 1, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}
{'A': 1, 'B': 1}
{'A': 1, 'B': 0}
{'A': 1, 'B': 0}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}

The following code:
curDict = newDict
nextDict=newDict
creates two names that both point to the same object. If you change one, the other will change. Maybe you want a deep copy?
import copy
curDict = copy.deepcopy(newDict)
nextDict = copy.deepcopy(newDict)
This means you can now change curDict independently of nextDict. Since the values here are plain integers, a shallow copy (newDict.copy()) would also be enough.
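Note that the line nextDict=curDict at the end of the loop body binds both names to one object again, so copying once at the top of the function may not be enough for longer strings. A minimal sketch of one way around this, assuming the goal is to enumerate permutations of the characters counted in the dictionary: take a fresh copy for every branch, so sibling branches never share state. The names my_permutation, prefix, and results are illustrative and not from the original code.

```python
def my_permutation(chars, counts, prefix="", results=None):
    """Collect every permutation of the multiset described by counts."""
    if results is None:
        results = []
    if sum(counts.values()) == 0:
        # All characters used up: prefix is one complete permutation.
        results.append(prefix)
        return results
    for char in chars:
        if counts[char] == 0:
            continue
        # Fresh dict per branch; a shallow copy suffices because values are ints.
        next_counts = counts.copy()
        next_counts[char] -= 1
        my_permutation(chars, next_counts, prefix + char, results)
    return results


print(my_permutation("AB", {"A": 1, "B": 1}))
```

Because each recursive call receives its own dictionary, the caller's counts are unchanged when the call returns, which is the behaviour the expected output in the question relies on.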

Related

Is it possible to make a dictionary with lists as values from a list of dictionaries, in a one line comprehension?

If I have list of dicts A:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
can I make the following dict:
B = {'a': [1, 3], 'b': [2, 4]}
using only dict/list comprehension?
bonus: can I also account for varied keys in A e.g:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4, 'c': 5}]
B = {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
I have managed to do this with for loops and if statements, but was hoping for something that runs faster.
Try:
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]
out = {k: [d.get(k) for d in A] for k in set(k for d in A for k in d)}
print(out)
Prints:
{'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
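A set has no guaranteed iteration order, so the keys of out above may appear in a different order from one run to the next. If a stable order matters, one possible variant (a sketch, not part of the original answer) collects the keys with dict.fromkeys, which keeps them in first-seen order on Python 3.7+:

```python
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]

# dict.fromkeys deduplicates the keys while preserving first-seen order.
keys = dict.fromkeys(k for d in A for k in d)

# d.get(k) returns None for dicts that lack the key.
out = {k: [d.get(k) for d in A] for k in keys}
print(out)
```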

Writing a generator that yields dictionaries of base frequencies of nucleotides

I am trying to write a function that returns a generator that iterates over every starting position of a k-window in a DNA sequence. For each starting position, the generator yields the nucleotide frequencies in the window as a dictionary.
def sliding(s,k):
    d = {}
    for i in range(len(s)-3):
        chunk = ''.join([s[i],s[i+(k-3)],s[i+(k-2)],s[i+(k-1)]])
        for j in chunk:
            if j not in d:
                d[j] = 1
            else:
                d[j] += 1
        yield d
seq = "ACGTTGCA"
for d in sliding(seq,4):
    print(d)
Output:
{'A': 1, 'C': 1, 'G': 1, 'T': 1}
{'A': 1, 'C': 2, 'G': 2, 'T': 3}
{'A': 1, 'C': 2, 'G': 4, 'T': 5}
{'A': 1, 'C': 3, 'G': 5, 'T': 7}
{'A': 2, 'C': 4, 'G': 6, 'T': 8}
Expected Output:
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 2, 'C': 0, 'A': 0, 'G': 2}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
However, as one can see, my function uses the same dictionary for all the windows, so the nucleotide counts accumulate under the same keys in every iteration. Each window (chunk) should get its own dictionary.
You should initialize d inside the loop instead so that it starts with a new dict for each iteration:
for i in range(len(s) - 3):
    d = {}
    ...
If you want the dicts in the output to always have the same keys even if their values are 0, as suggested by your expected output, you can initialize a dict with all of the distinct letters as keys, and copy the dict to d for each iteration:
initialized_dict = dict.fromkeys(s, 0)
for i in range(len(s) - 3):
    d = initialized_dict.copy()
    ...
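Putting the two suggestions together, a complete version might look like the sketch below. It goes beyond the answer in two ways that should be read as assumptions: the window is taken with a slice s[i:i + k] instead of four hard-coded indices, and the loop bound is len(s) - k + 1 rather than len(s) - 3, so that other values of k also work.

```python
def sliding(s, k):
    """Yield a dict of nucleotide counts for each window of width k in s."""
    zero_counts = dict.fromkeys(s, 0)      # every distinct letter, count 0
    for i in range(len(s) - k + 1):        # assumption: one window per valid start
        d = zero_counts.copy()             # fresh dict for this window
        for base in s[i:i + k]:
            d[base] += 1
        yield d


for d in sliding("ACGTTGCA", 4):
    print(d)
```

Each yielded dictionary is a separate object, so collecting them with list(sliding(seq, 4)) keeps the per-window counts intact.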

Is there a simple way I can convert this list into a dictionary (Python)

The list is this:
List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
I am hoping for an outcome where each item is paired with the number of times it appears in the list, e.g.:
List1 = ['a:1']
without importing Counter.
You could pass a generator expression to dict():
dict((x, List1.count(x)) for x in set(List1))
Example output:
{'d': 1, 'f': 1, 'l': 1, 'c': 1, 'j': 1, 'e': 1, 'i': 1, 'a': 1, 'h': 2, 'b': 1, 'm': 1, 'n': 1, 'k': 1, 'g': 1}
(Edited to match edited question.)
Use a dictionary comprehension and count.
>>> List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
>>> mapping = {v: List1.count(v) for v in List1}
>>> mapping
{'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1, 'f': 1,
'g': 1, 'h': 2, 'i': 1, 'j': 1, 'k': 1, 'l': 1, 'm': 1, 'n': 1}
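Both answers call List1.count once per element, which rescans the whole list each time. For long lists, a single pass that updates a running count may be preferable. A minimal sketch, still without Counter (the name counts is illustrative):

```python
List1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'h', 'i', 'j', 'k', 'l', 'm', 'n']

counts = {}
for item in List1:
    # get() supplies 0 the first time an item is seen.
    counts[item] = counts.get(item, 0) + 1

print(counts)
```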

Simultaneously create attributes and edges (if same attr. exists) in NetworkX

After creating nodes in NetworkX, I would like to add edges between nodes if both nodes share at least one attribute with the same value.
It seems to be a problem that not all nodes contain the same number of attributes - could this be the case, and if so, how should I solve it?
import networkx as nx
from itertools import product
# Mothernodes
M = [('E_%d' % h, {'a': i, 'b': j, 'c': k, 'd': l})
     for h, (i, j, k, l) in enumerate(product(range(2), repeat=4), start=1)]
# children nodes
a = [ ( 'a_%d' % i, {'a' : i}) for i in range(0,2) ]
b = [ ( 'b_%d' % i, {'b' : i}) for i in range(0,2) ]
c = [ ( 'c_%d' % i, {'c' : i}) for i in range(0,2) ]
d = [ ( 'd_%d' % i, {'d' : i}) for i in range(0,2) ]
# graph containing both
M_c = nx.Graph()
M_c.add_nodes_from(M)
ls_children = [a, b, c , d]
for ls_c in ls_children:
    M_c.add_nodes_from(ls_c)
list(M_c.nodes(data=True))[0:20]
[('E_9', {'a': 1, 'b': 0, 'c': 0, 'd': 0}),
('d_0', {'d': 0}),
('E_10', {'a': 1, 'b': 0, 'c': 0, 'd': 1}),
('b_0', {'b': 0}),
('E_2', {'a': 0, 'b': 0, 'c': 0, 'd': 1}),
('E_1', {'a': 0, 'b': 0, 'c': 0, 'd': 0}),
('c_1', {'c': 1}),
...
]
# attempting to add edges if one attribute is overlapping/the same:
for start in M_c.nodes(data=True):
    for end in M_c.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr] == end[1][attr]:
                M_c.add_edge(start[0] ,end[0] )
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-4-22b88809e853> in <module>()
2 for end in M_c.nodes(data=True):
3 for attr in list(start[1].keys()):
----> 4 if start[1][attr] == end[1][attr]:
5 M_c.add_edge(start[0] ,end[0] )
KeyError: 'b'
EDIT2:
I have now tried to test both start and end for existence of attributes, but I still get an error:
for start in M_c.nodes(data=True):
    for end in M_c.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr]:
                if end[1][attr]:
                    if start[1][attr] == end[1][attr]:
                        M_c.add_edge(start[0], end[0] )
                    # Adding an else and continue statement does not affect the error,
                    # even adding three of them, for each if statement
                    # else:
                    #     continue
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-5-32ae2a6095e5> in <module>()
3 for attr in list(start[1].keys()):
4 if start[1][attr]:
----> 5 if end[1][attr]:
6 if start[1][attr] == end[1][attr]:
7 M_c.add_edge(start[0], end[0] )
KeyError: 'a'
Creating edges between nodes sharing the same attribute value
import networkx as nx
import random
import matplotlib.pyplot as plt
# create nodes
G = nx.Graph()
_ = [G.add_node(i, a=random.randint(1,10),
                b=random.randint(1,10),
                c=random.randint(1,10)) for i in range(20)]
for node in G.nodes(data=True):
    print(node)
[out:]
(0, {'a': 10, 'b': 10, 'c': 3})
(1, {'a': 10, 'b': 6, 'c': 2})
(2, {'a': 4, 'b': 5, 'c': 5})
(3, {'a': 5, 'b': 10, 'c': 8})
(4, {'a': 3, 'b': 10, 'c': 8})
(5, {'a': 7, 'b': 1, 'c': 9})
(6, {'a': 10, 'b': 8, 'c': 8})
(7, {'a': 7, 'b': 7, 'c': 2})
(8, {'a': 6, 'b': 3, 'c': 9})
(9, {'a': 1, 'b': 5, 'c': 7})
(10, {'a': 3, 'b': 2, 'c': 8})
(11, {'a': 4, 'b': 1, 'c': 6})
(12, {'a': 3, 'b': 10, 'c': 3})
(13, {'a': 4, 'b': 5, 'c': 3})
(14, {'a': 7, 'b': 10, 'c': 4})
(15, {'a': 1, 'b': 1, 'c': 10})
(16, {'a': 1, 'b': 9, 'c': 1})
(17, {'a': 3, 'b': 8, 'c': 4})
(18, {'a': 6, 'b': 8, 'c': 7})
(19, {'a': 7, 'b': 10, 'c': 4})
Creating edges
for start in G.nodes(data=True):
    for end in G.nodes(data=True):
        for attr in list(start[1].keys()):
            if start[1][attr] == end[1][attr]:
                G.add_edge(start[0] ,end[0] )
Plotting the network
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos)
nx.draw_networkx_edges(G,pos)
nx.draw_networkx_labels(G,pos, font_color='w')
plt.axis('off')
plt.show()
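The loop above works because every node in G defines a, b, and c. In the question's graph the child nodes carry only one attribute, so end[1][attr] raises KeyError whenever end lacks the key; writing if end[1][attr]: does not help, because it performs the same lookup (and a truthiness test would also skip attributes whose value is 0). A minimal sketch of one way to handle this, comparing only the keys both nodes define. It is written without NetworkX so it runs on its own; the node list is a hypothetical miniature of the question's data and shared_attribute_edges is an illustrative name.

```python
from itertools import combinations

# Hypothetical miniature of the question's data: (name, attributes) pairs,
# the same shape that M_c.nodes(data=True) yields.
nodes = [
    ("E_1", {"a": 0, "b": 0}),
    ("E_2", {"a": 0, "b": 1}),
    ("a_0", {"a": 0}),
    ("b_1", {"b": 1}),
]


def shared_attribute_edges(nodes):
    """Return node-name pairs that agree on at least one attribute both define."""
    edges = []
    # combinations() visits each unordered pair once and never pairs a node
    # with itself, so no self-loops are produced.
    for (u, u_attrs), (v, v_attrs) in combinations(nodes, 2):
        common = u_attrs.keys() & v_attrs.keys()  # keys present on both nodes
        if any(u_attrs[k] == v_attrs[k] for k in common):
            edges.append((u, v))
    return edges


edges = shared_attribute_edges(nodes)
print(edges)
```

With a real graph, the resulting list could presumably be passed to M_c.add_edges_from(edges) after calling the function on list(M_c.nodes(data=True)).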

Why do I get TypeError when trying to do dot product?

If I have one integer and multiply it by each integer in a container (tuple) and add them together -- similar to a dot product -- I get the right answer. When I convert them to floats, I get a TypeError:
TypeError: can't multiply sequence by non-int of type 'float'
sig = {'a': 1.0, 'b': 2.0, 'c': 3.0}
exp = {'a': (1.0,2.0,3.0), 'b': (1.0,2.0,3.0), 'c': (1.0,2.0,3.0)}
man_dot = {'a': 1*1+1*2+1*3, 'b': 2*1+2*2+2*3, 'c': 3*1+3*2+3*3}
weighted_dict = {}
for s in sig:
    print("this is s:\n{}".format(s))
    for e in exp:
        print("this is e:\n{}".format(e))
        weighted_dict[s] = sum(sig[s] * exp[e])
# weighted_dict should be equivalent to man_dot
# weighted_dict should be {'a': 6, 'c': 18, 'b': 12}
This script must handle operations with floats, so how can I modify it to do so? Why does this happen? Is there a better way of doing this with some math-oriented library?
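Your problem is that you are trying to multiply the tuple (1.0, 2.0, 3.0) by the float 1.0. Python only defines sequence * int, which repeats the sequence, so a float operand raises the error above. Multiply each item of the tuple instead: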
sig = {'a': 1.0, 'b': 2.0, 'c': 3.0}
exp = {'a': (1.0,2.0,3.0), 'b': (1.0,2.0,3.0), 'c': (1.0,2.0,3.0)}
man_dot = {'a': 1*1+1*2+1*3, 'b': 2*1+2*2+2*3, 'c': 3*1+3*2+3*3}
weighted_dict = {}
for s in sig:
    print("this is s:\n{}".format(s))
    for e in exp:
        print("this is e:\n{}".format(e))
        weighted_dict[s] = sum([sig[s] * item for item in exp[e]])
>>> weighted_dict
{'c': 18.0, 'a': 6.0, 'b': 12.0}
>>>
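The integer version only appeared to work: int * tuple repeats the tuple, and summing n copies of a tuple happens to equal n times its sum. The inner loop over exp also overwrites weighted_dict[s] on every pass, which goes unnoticed here because all three tuples are identical. A shorter sketch, assuming the intent is to pair each weight with the tuple stored under the same key:

```python
sig = {'a': 1.0, 'b': 2.0, 'c': 3.0}
exp = {'a': (1.0, 2.0, 3.0), 'b': (1.0, 2.0, 3.0), 'c': (1.0, 2.0, 3.0)}

# Assumption: weight sig[k] applies to the tuple exp[k] with the same key.
weighted_dict = {k: sum(sig[k] * x for x in exp[k]) for k in sig}
print(weighted_dict)

# int * tuple is repetition, not element-wise multiplication.
repeated = 2 * (1, 2, 3)
print(repeated, sum(repeated))
```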
