writing a generator that yields dictionaries of base frequencies of nucleotides - python-3.x

I am trying to write a function that returns a generator which iterates over every starting position of a k-wide window in a DNA sequence. For each starting position, the generator should yield the nucleotide frequencies in that window as a dictionary.
def sliding(s,k):
    d = {}
    for i in range(len(s)-3):
        chunk = ''.join([s[i],s[i+(k-3)],s[i+(k-2)],s[i+(k-1)]])
        for j in chunk:
            if j not in d:
                d[j] = 1
            else:
                d[j] += 1
        yield d
seq = "ACGTTGCA"
for d in sliding(seq,4):
    print(d)
Output:
{'A': 1, 'C': 1, 'G': 1, 'T': 1}
{'A': 1, 'C': 2, 'G': 2, 'T': 3}
{'A': 1, 'C': 2, 'G': 4, 'T': 5}
{'A': 1, 'C': 3, 'G': 5, 'T': 7}
{'A': 2, 'C': 4, 'G': 6, 'T': 8}
Expected Output:
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 2, 'C': 0, 'A': 0, 'G': 2}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
However, as the output shows, my function reuses the same dictionary for all the windows, so the nucleotide counts keep accumulating under the same keys in every iteration. Each window (chunk) should get its own dictionary.

You should initialize d inside the loop instead so that it starts with a new dict for each iteration:
for i in range(len(s) - 3):
    d = {}
    ...
If you want the dicts in the output to always have the same keys even if their values are 0, as suggested by your expected output, you can initialize a dict with all of the distinct letters as keys, and copy the dict to d for each iteration:
initialized_dict = dict.fromkeys(s, 0)
for i in range(len(s) - 3):
    d = initialized_dict.copy()
    ...
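Putting both suggestions together, a complete generator might look like the sketch below. It assumes the window width should come from k rather than the hard-coded 3, and that every letter occurring anywhere in s should appear as a key; the names template and counts are illustrative only.

```python
def sliding(s, k):
    """Yield a dict of nucleotide counts for each k-wide window of s."""
    # Assumption: every letter found anywhere in s is a key, starting at 0.
    template = dict.fromkeys(s, 0)
    for i in range(len(s) - k + 1):
        counts = template.copy()      # fresh dict for every window
        for base in s[i:i + k]:       # slice instead of indexing each position
            counts[base] += 1
        yield counts


windows = list(sliding("ACGTTGCA", 4))
for window in windows:
    print(window)
```

Because each window gets its own copy of the template, the caller can keep the yielded dictionaries around (for example in a list) without later windows overwriting earlier ones.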

Related

Is it possible to make a dictionary with lists as values from a list of dictionaries, in a one line comprehension?

If I have list of dicts A:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
can I make the following dict:
B = {'a': [1, 3], 'b': [2, 4]}
using only dict/list comprehension?
bonus: can I also account for varied keys in A e.g:
A = [{ 'a': 1, 'b': 2}, {'a': 3, 'b': 4, 'c': 5}]
B = {'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
I have managed to do this with for loops and if statements, but I was hoping for something that runs faster.
Try:
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]
out = {k: [d.get(k) for d in A] for k in set(k for d in A for k in d)}
print(out)
Prints:
{'a': [1, 3], 'b': [2, 4], 'c': [None, 5]}
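One caveat with the comprehension above: a set has no guaranteed iteration order, so the keys of out may not come out as a, b, c on every run. If first-seen key order matters (an assumption, the question does not ask for it), dict.fromkeys can collect the keys instead of set:

```python
A = [{"a": 1, "b": 2}, {"a": 3, "b": 4, "c": 5}]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
keys = dict.fromkeys(k for d in A for k in d)

# d.get(k) returns None for dictionaries that lack the key.
out = {k: [d.get(k) for d in A] for k in keys}
print(out)
```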

Strange behavior by list of dictionaries [duplicate]

I have a list of dictionaries as follows:
a = [{'a':1, 'b':2, 'c':3}, {'d':4, 'e':5, 'f':6}]
Now I want another list b to have same contents as a but with one (key,value) pair extra. So I do it as:
b = a.copy()
for item in b:
    item['x'] = 6
But now both the lists a and b have 'x': 6 sitting in them.
>>> b
[{'a': 1, 'b': 2, 'c': 3, 'x': 6}, {'d': 4, 'e': 5, 'f': 6, 'x': 6}]
>>> a
[{'a': 1, 'b': 2, 'c': 3, 'x': 6}, {'d': 4, 'e': 5, 'f': 6, 'x': 6}]
I also tried this:
c = a[:]
for item in c:
    item['q'] = 12
And now all the three lists have 'q': 12.
>>> c
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
>>> b
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
>>> a
[{'a': 1, 'b': 2, 'c': 3, 'x': 6, 'q': 12}, {'d': 4, 'e': 5, 'f': 6, 'x': 6, 'q': 12}]
I can't understand how this works. It would have made sense if I had done b = a, but why does it happen with b = a.copy() and c = a[:]?
Thanks in advance:)
To copy a list together with all the objects it references, use the deepcopy() function from the copy module instead of the list's copy() method.
import copy
a = [{'a':1, 'b':2, 'c':3}, {'d':4, 'e':5, 'f':6}]
b = copy.deepcopy(a)
You can use @maziyank's solution. The explanation is that a.copy() and a[:] make only a shallow copy: they create a new list, but its elements are still the same dictionary objects that a holds. Only copy.deepcopy() copies the dictionaries as well.
Thus any change made to one of those dictionaries through one list shows up in every list that references the same dictionary.
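The difference can be checked with the is operator. The sketch below uses the list from the question; the variable names shallow, deep and b_alt are illustrative. It also shows an alternative that avoids the copy module when the dictionaries are flat (an assumption that holds for the question's data).

```python
import copy

a = [{'a': 1, 'b': 2, 'c': 3}, {'d': 4, 'e': 5, 'f': 6}]

shallow = a.copy()        # new list, but the same dict objects inside
deep = copy.deepcopy(a)   # new list holding new dict objects

shallow_shares = shallow[0] is a[0]   # True: same dict object
deep_shares = deep[0] is a[0]         # False: independent dict

# Editing the deep copy leaves the original untouched.
for item in deep:
    item['x'] = 6

# Alternative for flat dicts: build new dicts with the extra key directly.
b_alt = [{**item, 'x': 6} for item in a]

print(a)
print(deep)
```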

Return multiple lines in a for loop

d = {'U': 4, '_': 2, 'C': 2, 'K': 1, 'D': 4, 'T': 6, 'Q': 1, 'V': 2, 'A': 9, 'F': 2, 'O': 8, 'J': 1, 'I': 9, 'N': 6, 'P': 2, 'S': 4, 'M': 2, 'W': 2, 'E': 12, 'Z': 1, 'G': 3, 'Y': 2, 'B': 2, 'L': 4, 'R': 6, 'X': 1, 'H': 2}
def __str__(self):
    omgekeerd = {}
    for sleutel, waarde in self.inhoud.items():
        letters = omgekeerd.get(waarde, '')
        letters += sleutel
        omgekeerd[waarde] = letters
    for aantal in sorted(omgekeerd):
        return '{}: {}'.format(aantal, ''.join(sorted(omgekeerd[aantal])))
I need to return the value, followed by a ':' and then by every letter that has that value.
The problem is that when I use return, it only returns one value instead of every value on a new line.
I can't use print() because the method __str__(self) has to return a string.
The return statement ends function execution and specifies a value to
be returned to the function caller.
I believe your function terminates too early because the return statement sits inside the loop, so it runs on the first iteration.
What you could do is store what you would like to return in a separate list or dictionary and, once the loop is done, return that list or dictionary.
If I understood you correctly, this might be what you are looking for:
def someFunc():
    d = {'U': 4, '_': 2, 'C': 2, 'K': 1, 'D': 4, 'T': 6, 'Q': 1, 'V': 2, 'A': 9,
         'F': 2, 'O': 8, 'J': 1, 'I': 9, 'N': 6, 'P': 2, 'S': 4, 'M': 2, 'W': 2, 'E': 12,
         'Z': 1, 'G': 3, 'Y': 2, 'B': 2, 'L': 4, 'R': 6, 'X': 1, 'H': 2}
    result = {}
    # map every value to the list of keys that carry it
    for key, value in d.items():
        result[value] = [k for k, v in d.items() if v == value]
    return result
# call the function and iterate over the returned dictionary in sorted order
for key, value in sorted(someFunc().items()):
    print(key, sorted(value))
Result:
1 ['J', 'K', 'Q', 'X', 'Z']
2 ['B', 'C', 'F', 'H', 'M', 'P', 'V', 'W', 'Y', '_']
3 ['G']
4 ['D', 'L', 'S', 'U']
6 ['N', 'R', 'T']
8 ['O']
9 ['A', 'I']
12 ['E']
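Since the method in the question is __str__, which has to return a single string, another option is to build one line per value and join the lines with newlines. This is a minimal sketch, assuming a hypothetical wrapper class Letters that stores the dictionary in self.inhoud as the question does; the list regels is also an illustrative name.

```python
class Letters:  # hypothetical class name, not from the question
    def __init__(self, inhoud):
        self.inhoud = inhoud

    def __str__(self):
        # Invert the mapping: value -> string of letters with that value.
        omgekeerd = {}
        for sleutel, waarde in self.inhoud.items():
            omgekeerd[waarde] = omgekeerd.get(waarde, '') + sleutel
        # Collect one formatted line per value instead of returning early.
        regels = [
            '{}: {}'.format(aantal, ''.join(sorted(omgekeerd[aantal])))
            for aantal in sorted(omgekeerd)
        ]
        return '\n'.join(regels)


text = str(Letters({'A': 9, 'I': 9, 'G': 3, 'O': 8}))
print(text)
```

The single return after the loop hands back all lines at once, so print(obj) shows every value on its own line.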

is there a simple way i can convert this list into a dictionary (python)

the list is this :
List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
I am hoping for an outcome where each item is mapped to the number of times it appears in the list, e.g.:
List1 = ['a:1']
without importing Counter from the collections module.
You could pass a generator expression to dict():
dict((x, List1.count(x)) for x in set(List1))
Example output:
{'d': 1, 'f': 1, 'l': 1, 'c': 1, 'j': 1, 'e': 1, 'i': 1, 'a': 1, 'h': 2, 'b': 1, 'm': 1, 'n': 1, 'k': 1, 'g': 1}
(Edited to match edited question.)
Use a dictionary comprehension and count.
>>> List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
>>> mapping = {v: List1.count(v) for v in List1}
>>> mapping
{'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1, 'f': 1,
'g': 1, 'h': 2, 'i': 1, 'j': 1, 'k': 1, 'l': 1, 'm': 1, 'n': 1}
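Both answers call List1.count() once per item, which rescans the whole list each time. For longer lists, a single pass with dict.get also works without Counter. A minimal sketch, with counts as an illustrative name:

```python
List1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'h', 'i', 'j', 'k', 'l', 'm', 'n']

counts = {}
for item in List1:
    # get() returns 0 the first time an item is seen.
    counts[item] = counts.get(item, 0) + 1

print(counts)
```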

python3 recursion based on tree structure

I have the code with recursion function:
def myPermutation (newString, newDict):
    if sum(newDict.values()) == 0:
        print(newDict)
        return
    else:
        curDict = newDict
        nextDict=newDict
        for char in newString:
            # print('from line 09 -> ', curDict)
            # print('from line 10 -> ', char, curDict[char])
            if curDict[char] == 0:
                continue
            else:
                # print(char)
                print(curDict)
                nextDict[char] -= 1
                print(nextDict)
                myPermutation(newString, nextDict)
                nextDict=curDict
    return
newString = 'AB'
# newDict = curDict(newString)
newDict = {'A': 1, 'B': 1}
# print(newString, newDict)
test = myPermutation(newString, newDict)
# print(test)
My output is this:
{'A': 1, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}
It looks like my recursive function is not working correctly. I did some debugging and found that when the function starts the second loop iteration at the top level (moving from 'A' to 'B' at the first level of the tree), the dictionary has changed from {'A':1, 'B':1} to {'A':0, 'B':0}. The expected output should be something like:
{'A': 1, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 1}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}
{'A': 1, 'B': 1}
{'A': 1, 'B': 0}
{'A': 1, 'B': 0}
{'A': 0, 'B': 0}
{'A': 0, 'B': 0}
The following code:
curDict = newDict
nextDict=newDict
creates two names that both point to the same object. If you change one, the other will change. Maybe you want a deep copy?
import copy
curDict = copy.deepcopy(newDict)
nextDict = copy.deepcopy(newDict)
This means you can now change curDict independently of nextDict.
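Note that the nextDict=curDict line at the end of the loop in the question would bind both names to one object again after the first iteration. One way around that is to take a fresh copy for every branch. The sketch below is a reworked variant under that assumption, not the asker's exact code: it records each step in a trace list (a parameter added here so the result can be inspected) instead of printing inside the recursion.

```python
import copy


def my_permutation(new_string, new_dict, trace):
    """Walk the permutation tree, recording each dictionary state in trace."""
    if sum(new_dict.values()) == 0:
        trace.append(dict(new_dict))
        return
    for char in new_string:
        if new_dict[char] == 0:
            continue
        trace.append(dict(new_dict))
        # Fresh copy per branch, so new_dict is never modified here.
        next_dict = copy.deepcopy(new_dict)
        next_dict[char] -= 1
        trace.append(dict(next_dict))
        my_permutation(new_string, next_dict, trace)


trace = []
my_permutation('AB', {'A': 1, 'B': 1}, trace)
for step in trace:
    print(step)
```

For a flat dictionary of integers like this one, new_dict.copy() would be enough; deepcopy is kept to match the answer above.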

Resources