How to individually add each letter of the alphabet into a dictionary

I'm trying to add each letter of the alphabet into a python dictionary, but I don't want to add it manually.
I have tried using string.ascii_lowercase, but it does not add each letter individually into the dictionary. Is there a way to add each letter in individually without doing it manually?
import string
dict = {'letter':string.ascii_lowercase, 'appearances':0}
print(dict['letter'], dict['appearances'])
I'm trying to get it to print out, 'a' 0, 'b' 0, etc. However, instead, it is printing out 'abcdefg...z' 0. Is there a way to enter then print out each letter individually followed by 0?

Initialize your dictionary with a dict comprehension:
import string
d = {k: 0 for k in string.ascii_lowercase}
for k, v in d.items():
    print(k, v)
Prints:
a 0
b 0
c 0
d 0
...and so on.
Dictionary d contains:
{'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0, 'f': 0, 'g': 0, 'h': 0, 'i': 0, 'j': 0, 'k': 0, 'l': 0, 'm': 0, 'n': 0, 'o': 0, 'p': 0, 'q': 0, 'r': 0, 's': 0, 't': 0, 'u': 0, 'v': 0, 'w': 0, 'x': 0, 'y': 0, 'z': 0}
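Since the question's 'appearances' key suggests the zeros are meant to be counters, here is a minimal sketch of how the initialized dictionary could then be used to count letters in a piece of text. The function name count_letters and the sample text are assumptions for illustration, not part of the original question.

```python
import string


def count_letters(text):
    """Count how often each lowercase ASCII letter appears in text."""
    # Start every letter at 0 so letters that never appear are still present.
    counts = {letter: 0 for letter in string.ascii_lowercase}
    for ch in text.lower():
        if ch in counts:  # skip spaces, digits, punctuation
            counts[ch] += 1
    return counts


counts = count_letters("Hello, world")
for letter, appearances in counts.items():
    print(letter, appearances)
```

Every one of the 26 letters is printed, including those with a count of 0.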

Related

writing a generator that yields dictionaries of base frequencies of nucleotides

I am trying to write a function that returns a generator that can be iterated over all starting position of a k-window in the DNA sequence. For each starting position, the generator returns the nucleotide frequencies in the window as a dictionary.
def sliding(s,k):
    d = {}
    for i in range(len(s)-3):
        chunk = ''.join([s[i],s[i+(k-3)],s[i+(k-2)],s[i+(k-1)]])
        for j in chunk:
            if j not in d:
                d[j] = 1
            else:
                d[j] += 1
        yield d
seq = "ACGTTGCA"
for d in sliding(seq,4):
    print(d)
Output:
{'A': 1, 'C': 1, 'G': 1, 'T': 1}
{'A': 1, 'C': 2, 'G': 2, 'T': 3}
{'A': 1, 'C': 2, 'G': 4, 'T': 5}
{'A': 1, 'C': 3, 'G': 5, 'T': 7}
{'A': 2, 'C': 4, 'G': 6, 'T': 8}
Expected Output:
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 2, 'C': 0, 'A': 0, 'G': 2}
{'T': 2, 'C': 1, 'A': 0, 'G': 1}
{'T': 1, 'C': 1, 'A': 1, 'G': 1}
However, as the output shows, my function reuses the same dictionary for all the windows, so the nucleotide counts keep accumulating under the same keys in every iteration. Every window (chunk) should get its own dictionary.

You should initialize d inside the loop instead so that it starts with a new dict for each iteration:
for i in range(len(s) - 3):
    d = {}
    ...
If you want the dicts in the output to always have the same keys even if their values are 0, as suggested by your expected output, you can initialize a dict with all of the distinct letters as keys, and copy the dict to d for each iteration:
initialized_dict = dict.fromkeys(s, 0)
for i in range(len(s) - 3):
    d = initialized_dict.copy()
    ...
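Putting the two suggestions together, a complete generator might look like the sketch below. It assumes the window is simply the slice s[i:i + k] and that there are len(s) - k + 1 windows, which generalizes the hard-coded len(s) - 3 from the question; that generalization is an assumption, not something stated in the original answer.

```python
def sliding(s, k):
    """Yield one nucleotide-frequency dict per k-wide window of s."""
    # Template with every distinct letter of s set to 0, so each
    # yielded dict has the same keys even when a count is zero.
    template = dict.fromkeys(s, 0)
    for i in range(len(s) - k + 1):
        d = template.copy()  # fresh dict for every window
        for base in s[i:i + k]:
            d[base] += 1
        yield d


windows = list(sliding("ACGTTGCA", 4))
for d in windows:
    print(d)
```

For the sequence in the question this yields five dictionaries whose counts match the expected output (the key order may differ).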

How can I improve this algorithm to count the frequency of characters in a string?

To sort the characters of a string by their frequency of appearance, in descending order, I've developed the following algorithm.
First I convert the string to a dictionary, using each char as a key and its frequency of appearance as the value. Afterwards I convert the dictionary to a nested list sorted in descending order.
I'd like to know how to improve the algorithm. Was it a good approach? Can it be done differently? All proposals are welcome.
#Libraries
from operator import itemgetter
# START
# Function
# String to Dict. Value as freq.
# of appearance and char as key.
def frequencyChar(string):
    #string = string.lower() # Optional
    freq = 0
    thisDict = {}
    for char in string:
        if char.isalpha(): # just chars
            freq = string.count(char)
            thisDict[char] = freq # {key:value}
    return(thisDict)
str2Dict = frequencyChar("Would you like to travel with me?")
#print(str2Dict)
# Dictionary to list
list_key_value = [[k,v] for k, v in str2Dict.items()]
# Descending sorted list
list_key_value = sorted(list_key_value, key=itemgetter(1), reverse=True)
print("\n", list_key_value, "\n")
#END

You're doing way too much work. collections.Counter counts things for you automatically, and even displays them sorted by frequency (use most_common() to get that order as a list):
from collections import Counter
s = "Would you like to travel with me?"
freq = Counter(s)
# Counter({' ': 6, 'o': 3, 'l': 3, 'e': 3, 't': 3, 'u': 2, 'i': 2, 'W': 1, 'd': 1, 'y': 1, 'k': 1, 'r': 1, 'a': 1, 'v': 1, 'w': 1, 'h': 1, 'm': 1, '?': 1})
If you want to remove the spaces from the count:
del freq[' ']
# Counter({'o': 3, 'l': 3, 'e': 3, 't': 3, 'u': 2, 'i': 2, 'W': 1, 'd': 1, 'y': 1, 'k': 1, 'r': 1, 'a': 1, 'v': 1, 'w': 1, 'h': 1, 'm': 1, '?': 1})
Also just in general, your algorithm is doing too much work. string.count involves iterating over the whole string for each character you're trying to count. Instead, you can just iterate once over the whole string, and for every letter you just keep incrementing the count stored under that letter's key (initialize it to 1 if it's a letter you haven't seen before). That's essentially what Counter is doing for you.
Spelling it out:
count = {}
for letter in the_string:
    if not letter.isalpha():
        continue
    if letter not in count:
        count[letter] = 1
    else:
        count[letter] += 1
And then to sort it you don't need to convert to a list first, you can just do it directly:
from operator import itemgetter
ordered = sorted(count.items(), key=itemgetter(1), reverse=True)
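As a compact alternative, the letters-only filter from the question can be combined with Counter in one pass. This is a sketch only; the function name letter_frequencies is made up for illustration.

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, count) pairs, most frequent first."""
    # Feed only alphabetic characters to Counter, so spaces and
    # punctuation never enter the tally.
    counts = Counter(ch for ch in text if ch.isalpha())
    # most_common() returns a list of tuples sorted by descending count.
    return counts.most_common()


result = letter_frequencies("Would you like to travel with me?")
print(result)
```

Letters with equal counts keep the order in which they were first seen, because the underlying sort is stable.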

Return multiple lines in a for loop

d = {'U': 4, '_': 2, 'C': 2, 'K': 1, 'D': 4, 'T': 6, 'Q': 1, 'V': 2, 'A': 9, 'F': 2, 'O': 8, 'J': 1, 'I': 9, 'N': 6, 'P': 2, 'S': 4, 'M': 2, 'W': 2, 'E': 12, 'Z': 1, 'G': 3, 'Y': 2, 'B': 2, 'L': 4, 'R': 6, 'X': 1, 'H': 2}
def __str__(self):
    omgekeerd = {}
    for sleutel, waarde in self.inhoud.items():
        letters = omgekeerd.get(waarde, '')
        letters += sleutel
        omgekeerd[waarde] = letters
    for aantal in sorted(omgekeerd):
        return '{}: {}'.format(aantal, ''.join(sorted(omgekeerd[aantal])))
I need to return each value, followed by a ':' and then by every letter that has that value.
The problem is that when I use return, it only returns one value instead of every value on a new line.
I can't use print() because that is not supported by the method __str__(self).

The return statement ends function execution and specifies a value to be returned to the function caller.
I believe your code terminates too early because the return statement sits inside the loop, so the function exits on the first iteration.
What you could do is store what you would like to return in a separate list/dictionary and then, when everything is done, return the new dict/list that holds the results.
If I understood you correctly, this is what you might be looking for:
# Python 2 syntax: iteritems() and the print statement
def someFunc():
    d = {'U': 4, '_': 2, 'C': 2, 'K': 1, 'D': 4, 'T': 6, 'Q': 1, 'V': 2, 'A': 9,
         'F': 2, 'O': 8, 'J': 1, 'I': 9, 'N': 6, 'P': 2, 'S': 4, 'M': 2, 'W': 2, 'E': 12,
         'Z': 1, 'G': 3, 'Y': 2, 'B': 2, 'L': 4, 'R': 6, 'X': 1, 'H': 2}
    result = {}
    for key, value in d.iteritems():
        result[value] = [k for k,v in d.iteritems() if v == value]
    return result
# call function and iterate over given dictionary
for key, value in someFunc().iteritems():
    print key, value
Result:
1 ['K', 'J', 'Q', 'X', 'Z']
2 ['C', 'B', 'F', 'H', 'M', 'P', 'W', 'V', 'Y', '_']
3 ['G']
4 ['D', 'L', 'S', 'U']
6 ['N', 'R', 'T']
8 ['O']
9 ['A', 'I']
12 ['E']
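Because __str__ must return a single string, another option is to collect one line per value and join them with newlines at the end. The sketch below is Python 3; the class name LetterCounts and its constructor are assumptions, since the question only shows the method and the attribute self.inhoud.

```python
class LetterCounts:
    """Hypothetical wrapper around a {letter: count} dictionary."""

    def __init__(self, inhoud):
        self.inhoud = inhoud

    def __str__(self):
        # Invert the mapping: count -> list of letters with that count.
        omgekeerd = {}
        for sleutel, waarde in self.inhoud.items():
            omgekeerd.setdefault(waarde, []).append(sleutel)
        # Build every line first, then return them all as one string.
        lines = ['{}: {}'.format(aantal, ''.join(sorted(omgekeerd[aantal])))
                 for aantal in sorted(omgekeerd)]
        return '\n'.join(lines)


text = str(LetterCounts({'A': 2, 'B': 1, 'C': 2}))
print(text)
```

The single return now sits after the loop, so every value ends up on its own line in the returned string.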

is there a simple way i can convert this list into a dictionary (python)

The list is this:
List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
I am hoping for an outcome where each item is paired with the number of times it appears in the list, e.g.:
List1 = ['a:1']
without importing Counter.

You could pass a generator expression to dict():
dict((x, List1.count(x)) for x in set(List1))
Example output:
{'d': 1, 'f': 1, 'l': 1, 'c': 1, 'j': 1, 'e': 1, 'i': 1, 'a': 1, 'h': 2, 'b': 1, 'm': 1, 'n': 1, 'k': 1, 'g': 1}

(Edited to match edited question.)
Use a dictionary comprehension and count.
>>> List1 = ['a','b','c','d','e','f','g','h','h','i','j','k','l','m','n']
>>> mapping = {v: List1.count(v) for v in List1}
>>> mapping
{'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1, 'f': 1,
'g': 1, 'h': 2, 'i': 1, 'j': 1, 'k': 1, 'l': 1, 'm': 1, 'n': 1}
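Both answers call List1.count() once per element, which rescans the list each time. For longer lists, a single pass with dict.get avoids that and still needs no imports. This is a sketch; the function name count_items is hypothetical.

```python
def count_items(items):
    """Map each distinct item to the number of times it occurs."""
    counts = {}
    for item in items:
        # get() returns 0 the first time an item is seen.
        counts[item] = counts.get(item, 0) + 1
    return counts


List1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'h', 'i', 'j', 'k', 'l', 'm', 'n']
mapping = count_items(List1)
print(mapping)
```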

merging multiple list of lists in python 3

Let's say I have multiple lists of lists. I'll include a shortened version of three of them in this example.
list1=[['name', '1A5ZA'], ['length', 83], ['A', 28], ['V', 31], ['I', 24]]
list2=[['name', '1AJ8A'], ['length', 49], ['A', 18], ['V', 11], ['I', 20]]
list3=[['name', '1AORA'], ['length', 96], ['A', 32], ['V', 49], ['I', 15]]
all of the lists are in the same format: they have the same number of nested lists, with the same labels.
I generate each of these lists with the following function
def GetResCount(sequence):
    residues=[['A',0],['V',0],['I',0],['L',0],['M',0],['F',0],['Y',0],['W',0],
              ['S',0],['T',0],['N',0],['Q',0],['C',0],['U',0],['G',0],['P',0],['R',0],
              ['H',0],['K',0],['D',0],['E',0]]
    name=sequence[0:5]
    AAseq=sequence[27:]
    for AA in AAseq:
        for n in range(len(residues)):
            if residues[n][0] == AA:
                residues[n][1]=residues[n][1]+1
    length=len(AAseq)
    nameLsit=(['name', name])
    lengthList=(['length', length])
    residues.insert(0,lengthList)
    residues.insert(0,nameLsit)
    return residues
the script takes a sequence such as this
1A5ZA:A|PDBID|CHAIN|SQUENCEMKIGIVGLGRVGSSTAFAL
and will create a list similar to the ones mentioned above.
As each individual list is generated, I would like to append it to a final form, such that all of them combined together looks like this:
final=[['name', '1A5ZA', '1AJ8A', '1AORA'], ['length', 83, 49, 96], ['A', 28, 18, 32], ['V', 31, 11, 49], ['I', 24, 20, 15]]
Maybe the final form of the data isn't in the right format. I am open to suggestions on how to format the final form better...
To summarize, what the script should do is take a sequence of letters with the name of the sequence at the beginning, count the occurrences of each letter within the sequence as well as the overall sequence length, and output the name, length and letter frequencies to a list. Then it should combine the info from each sequence into a larger list (maybe a dictionary?).
At the very end all of this info will go into a spreadsheet that will look like this:
name length A V I
1A5ZA 83 28 31 24
1AJ8A 49 18 11 20
1AORA 96 32 49 15
I'm including this last bit because maybe I'm not starting in the right way to end up with what I want.
Anyway,
I hope you made it here and thanks for the help!

So if you are looking for a table then a dict might be a better approach. (Note: collections.Counter does the same as your counting), e.g.:
from collections import Counter
def GetResCount(sequence):
    name, AAseq = sequence[0:5], sequence[27:]
    residuals = {'name': name, 'length': len(AAseq), 'A': 0, 'V': 0, 'I': 0, 'L': 0,
                 'M': 0, 'F': 0, 'Y': 0, 'W': 0, 'S': 0, 'T': 0, 'N': 0, 'Q': 0, 'C': 0,
                 'U': 0, 'G': 0, 'P': 0, 'R': 0, 'H': 0, 'K': 0, 'D': 0, 'E': 0}
    residuals.update(Counter(AAseq))
    return residuals
In []:
GetResCount('1A5ZA:A|PDBID|CHAIN|SQUENCEMKIGIVGLGRVGSSTAFAL')
Out[]:
{'name': '1A5ZA', 'length': 19, 'A': 2, 'V': 2, 'I': 2, 'L': 2, 'M': 1, 'F': 1, 'Y': 0,
'W': 0, 'S': 2, 'T': 1, 'N': 0, 'Q': 0, 'C': 0, 'U': 0, 'G': 4, 'P': 0, 'R': 1,
'H': 0, 'K': 1, 'D': 0, 'E': 0}
Note: the keys will only come out in this order in Py3.6+ (insertion order is guaranteed from 3.7), but we can fix that later as we create the table if necessary.
Then you can create a list of the dicts, e.g. (assuming you are reading these lines from a file):
with open(<file>) as file:
    data = [GetResCount(line.strip()) for line in file]
Then you can load it directly into pandas, e.g.:
In []:
import pandas as pd
columns = ['name', 'length', 'A', 'V', 'I', ...] # columns = list(data[0].keys()) - Py3.6+
df = pd.DataFrame(data, columns=columns)
print(df)
Out[]:
name length A V I ...
0 1A5ZA 83 28 31 24 ...
1 1AJ8A 49 18 11 20 ...
2 1AORA 96 32 49 15 ...
...
You could also just dump it out to a file with csv.DictWriter():
from csv import DictWriter
fieldnames = ['name', 'length', 'A', 'V', 'I', ...]
with open(<output>, 'w', newline='') as file:
    writer = DictWriter(file, fieldnames)
    writer.writeheader()  # write the column names as the first row
    writer.writerows(data)
Which would output something like:
name,length,A,V,I,...
1A5ZA,83,28,31,24,...
1AJ8A,49,18,11,20,...
1AORA,96,32,49,15,...
...
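For reference, here is a self-contained sketch of the whole pipeline, from raw sequence lines to CSV text, using only the standard library. It writes to an in-memory buffer instead of a file so it runs without any paths; the names RESIDUES, get_res_count and rows_to_csv are made up for illustration, and the slicing offsets (5 and 27) are taken from the question's header format.

```python
import csv
import io
from collections import Counter

# Residue letters in the column order used in the question.
RESIDUES = "AVILMFYWSTNQCUGPRHKDE"


def get_res_count(sequence):
    """Build one table row (a dict) from a 'NAME...header...SEQUENCE' line."""
    name, aa_seq = sequence[:5], sequence[27:]
    counts = Counter(aa_seq)
    row = {'name': name, 'length': len(aa_seq)}
    # Counter returns 0 for missing keys, so absent residues become 0.
    row.update({res: counts[res] for res in RESIDUES})
    return row


def rows_to_csv(rows):
    """Serialize the row dicts to CSV text with a header line."""
    fieldnames = ['name', 'length'] + list(RESIDUES)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [get_res_count('1A5ZA:A|PDBID|CHAIN|SQUENCEMKIGIVGLGRVGSSTAFAL')]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Restricting each row to the letters in RESIDUES keeps every row aligned with the header, even if a sequence contains an unexpected character.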
