Related
This is my code that takes a number of codons. Codons are a group of three nucleotides, each coding for an Amino Acid
codon_sequence=[]
print("Enter no. of codons you want")
n=int(input())
for i in range(n):
codon=str(input())
codon_sequence.append(codon)
print(codon_sequence)
for i in range(n):
if(codon_sequence[i]=="UUU" or "UUC" or "TTT" or "TTC"):
print("Phe_")
elif(codon_sequence[i]=="UUA" or "UUG" or "CUU" or "CUC" or "CUG" or "CUA" or "TTA" or "TTG" or "CTT" or "CTC" or "CTG" or "CTA"):
print("Leu_")
elif(codon_sequence[i]=="UCU" or "UCC" or "UCG" or "UCA" or "AGU" or "AGC" or "TCT" or "TCC" or "TCG" or "TCA" or "AGT" or "AGC"):
print("Ser_")
elif(codon_sequence[i]=="UAU" or "UAC" or "TAT" or "TAC"):
print("Tyr_")
elif(codon_sequence[i]=="UGU" or "UGC" or "TGT" or "TGC"):
print("Cys_")
elif(codon_sequence[i]=="UGG" or "TGG"):
print("Trp_")
elif(codon_sequence[i]=="CCU" or "CCC" or "CCA" or "CCG" or "CCT"):
print("Pro_")
elif(codon_sequence[i]=="CGU" or "CGC" or "CGA" or "CGG" or "AGA" or "AGG" or "CGT"):
print("Arg_")
elif(codon_sequence[i]=="CAU" or "CAC" or "CAT"):
print("His_")
elif(codon_sequence[i]=="CAA" or "CAG"):
print("Gln_")
elif(codon_sequence[i]=="AUU" or "AUC" or "AUA" or "ATT" or "ATC" or "ATA"):
print("Ile_")
elif(codon_sequence[i]=="AUG"):
print("Met_")
elif(codon_sequence[i]=="ACU" or "ACC" or "ACA" or "ACG" or "ACT"):
print("Thr_")
elif(codon_sequence[i]=="GUU" or "GUC" or "GUA" or "GUG" or "GTT" or "GTC" or "GTA" or "GTG"):
print("Val_")
elif(codon_sequence[i]=="GCU" or "GCC" or "GCA" or "GCG" or "GCT"):
print("Ala_")
elif(codon_sequence[i]=="GGU" or "GGC" or "GGA" or "GGG" or "GGT"):
print("Gly_")
elif(codon_sequence[i]=="GAU" or "GAC" or "GAT"):
print("Asp_")
elif(codon_sequence[i]=="GAA" or "GAG"):
print("Glu_")
elif(codon_sequence[i]=="AAU" or "AAC" or "AAT"):
print("Asn_")
elif(codon_sequence[i]=="AAA" or "AAG"):
print("Lys_")
else:
print("Stop_")
This is however, giving me only 'Phe_' as result, and ignores all other conditions
Reason why your code is not hitting the elif blocks
Your if and elif blocks should look like this.
It should check if codon_sequence[i] is equal to a string of interest.
if(codon_sequence[i]=="UUU" or codon_sequence[i]=="UUC" or codon_sequence[i]=="TTT" or codon_sequence[i]=="TTC"):
Instead you have an or condition against just plain strings like UUC.
This will result in the first if condition always being True.
Thereby you will never hit the elif block.
Also a better way of writing the if statement would be:
if codon_sequence[i] in ["UUU", "UUC", "TTT", "TTC"]:
print("Phe_")
This would be a great candidate for a switch statement, but as the previous answer mentioned you can't put an "or" between each string like you're doing.
I thought that setting a fixed number of decimal points to all numbers of an array of Decimals, and the new arrays resulting from operations thereof, could be achieved by simply doing:
from decimal import *
getcontext().prec = 5 # 4 decimal points
v = Decimal(0.005)
print(v)
0.005000000000000000104083408558608425664715468883514404296875
However, I get spurious results that I know are the consequence of the contribution of these extra decimals to the calculations. Therefore, as a workaround, I used the round() function like this:
C_subgrid= [Decimal('33.340'), Decimal('33.345'), Decimal('33.350'), Decimal('33.355'), Decimal('33.360'), Decimal('33.365'), Decimal('33.370'), Decimal('33.375'), Decimal('33.380'), Decimal('33.385'), Decimal('33.390'), Decimal('33.395'), Decimal('33.400'), Decimal('33.405'), Decimal('33.410'), Decimal('33.415'), Decimal('33.420'), Decimal('33.425'), Decimal('33.430'), Decimal('33.435'), Decimal('33.440'), Decimal('33.445'), Decimal('33.450'), Decimal('33.455'), Decimal('33.460'), Decimal('33.465'), Decimal('33.470'), Decimal('33.475'), Decimal('33.480'), Decimal('33.485'), Decimal('33.490'), Decimal('33.495'), Decimal('33.500'), Decimal('33.505'), Decimal('33.510'), Decimal('33.515'), Decimal('33.520'), Decimal('33.525'), Decimal('33.530'), Decimal('33.535'), Decimal('33.540'), Decimal('33.545'), Decimal('33.550'), Decimal('33.555'), Decimal('33.560'), Decimal('33.565'), Decimal('33.570'), Decimal('33.575'), Decimal('33.580'), Decimal('33.585'), Decimal('33.590'), Decimal('33.595'), Decimal('33.600'), Decimal('33.605'), Decimal('33.610'), Decimal('33.615'), Decimal('33.620'), Decimal('33.625'), Decimal('33.630'), Decimal('33.635'), Decimal('33.640'), Decimal('33.645'), Decimal('33.650'), Decimal('33.655'), Decimal('33.660'), Decimal('33.665'), Decimal('33.670'), Decimal('33.675'), Decimal('33.680'), Decimal('33.685'), Decimal('33.690'), Decimal('33.695'), Decimal('33.700'), Decimal('33.705'), Decimal('33.710'), Decimal('33.715'), Decimal('33.720'), Decimal('33.725'), Decimal('33.730'), Decimal('33.735'), Decimal('33.740'), Decimal('33.745'), Decimal('33.750'), Decimal('33.755'), Decimal('33.760'), Decimal('33.765'), Decimal('33.770'), Decimal('33.775'), Decimal('33.780'), Decimal('33.785'), Decimal('33.790'), Decimal('33.795'), Decimal('33.800'), Decimal('33.805'), Decimal('33.810'), Decimal('33.815'), Decimal('33.820'), Decimal('33.825'), Decimal('33.830'), Decimal('33.835'), Decimal('33.840'), Decimal('33.845'), Decimal('33.850'), Decimal('33.855'), Decimal('33.860'), Decimal('33.865'), Decimal('33.870'), Decimal('33.875'), Decimal('33.880'), Decimal('33.885'), Decimal('33.890'), Decimal('33.895'), Decimal('33.900'), Decimal('33.905'), Decimal('33.910'), Decimal('33.915'), Decimal('33.920'), Decimal('33.925'), Decimal('33.930'), Decimal('33.935'), Decimal('33.940'), Decimal('33.945'), Decimal('33.950'), Decimal('33.955'), Decimal('33.960'), Decimal('33.965'), Decimal('33.970'), Decimal('33.975'), Decimal('33.980'), Decimal('33.985'), Decimal('33.990'), Decimal('33.995'), Decimal('34.000'), Decimal('34.005'), Decimal('34.010'), Decimal('34.015'), Decimal('34.020'), Decimal('34.025'), Decimal('34.030'), Decimal('34.035'), Decimal('34.040'), Decimal('34.045'), Decimal('34.050'), Decimal('34.055'), Decimal('34.060'), Decimal('34.065'), Decimal('34.070'), Decimal('34.075'), Decimal('34.080'), Decimal('34.085'), Decimal('34.090'), Decimal('34.095'), Decimal('34.100'), Decimal('34.105'), Decimal('34.110'), Decimal('34.115'), Decimal('34.120'), Decimal('34.125'), Decimal('34.130'), Decimal('34.135'), Decimal('34.140')]
C_subgrid = [round(v, 4) for v in C_subgrid]
I got the values of C_subgrid list by printing it out during execution of my code, and I pasted it here. Not sure where the single quotes come from. This code snipped worked fine in Python2.7, but when I upgraded to Python 3.7 it started raising this error:
File "/home2/thomas/Documents/4D-CHAINS_dev/lib/peak.py", line 301, in <listcomp>
C_subgrid = [round(v, 4) for v in C_subgrid] # convert all values to fixed decimal length floats!
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]
Strangely, if I run it within ipython it works fine, only within my code it creates problems. Can anybody think of any possible reason?
I am reading the text file consisting of bengali words. But I am unable to print the dependent vowels like KA,KI etc...
Here is my sample code and output
import unicodedata
bengali_phoneme_maplist={u'অ':'A',u'আ':'AA',u'ই':'I',u'ঈ':'II',u'উ':'U',u'ঊ ':'UU',u'ঋ ':'R',u'ঌ ':'L',u'এ ':'E',u'ঐ ':'AI',u'ও ':'O',u'ঔ ':'AU',u'ক':'KA',u'খ ':'KHA',u'গ ':'GA',u'ঘ':'GHA',u'ঙ ':'NGA',u'চ ':'CA',u'ছ':'CHA',u'জ ':'JA',u'ঝ':'JHA',u'ঞ':'NYA',u'ট ':'TTA',u'ঠ':'TTHA',u'ড ':'DDA',u'ঢ':'DDHA',u'ণ ':'NNA',u'ত ':'TA',u'ত ':'THA',u'দ':'DA',u'ধ':'DHA',u'ন':'NA',u'প':'PA',u'ফ':'PHA',u'ব':'BA',u'ভ':'BHA',u'ম ':'MA',u'য ':'YA',u'র':'RA',u'ল ':'LA',u'শ ':'SHA',u'ষ':'SSA',u'স ':'SA',u'হ':'ha',u' া ':'AAV',u' ি':'IV',u'ী':'IIV',u'ু':'UV',u'ূ':'UUV',u'ৃ':'RRV',u'ৄ ':'RR',u'ৄ':'EV',u' ৈ':'EV',u'়':'NUKTHA',u'ঽ':'AVAGRAHA'}
bengali_phoneme_maplist_normalise={unicodedata.normalize('NFKD',k):v
for k,v in bengali_phoneme_maplist.items()}
with open('bengali.txt','r')as infile:
lines=infile.readlines()
for index,line in enumerate(lines):
print('Phonemes in line{0}.total{1} symbols'.format(index,len(line)))
unknown=[]
words=line.split()
for word in words:
print(word,':',sep=' ', end='')
for character in word:
c=unicodedata.normalize('NFKD',character).casefold()
try:
print(bengali_phoneme_maplist_normalise[c],sep='',end='')
except KeyError:
print('_',sep='',end='')
if c not in unknown:
unknown.append(c)
print()
if unknown:
print('Unrecognised symbols:{0},total {1} symbols'.format(','.join(unknown),len(unknown)))
Sample input:
শিল্পাঞ্চলে ঢোকার মুখে, স্ন্যাক্সবারে খাবার কিনছিলেন, বহুজাতিক তথ্যপ্রযুক্তি সংস্থার কর্মী, শুভময় বন্দ্যোপাধ্যায়
Sample output:
Phonemes in line0.total129 symbols
text_000002 :___________
"শিল্পাঞ্চলে :_____PA_NYA____
ঢোকার :DDHA_KA_RA
মুখে, :_UV___
স্ন্যাক্সবারে :__NA___KA__BA_RA_
খাবার :__BA_RA
কিনছিলেন, :KA_NACHA___NA_
Unrecognisedsymbols:t,e,x,_,0,2,",শ,ি,ল,্,া,চ,ে,ো,ম,খ,,,স,য,জ,ত,থ,ং,য়,),
(Note that I know nohting about Bengali. :)
There are a few problems in your code:
There are many extra SPACE chars in the bengali_phoneme_maplist definition. For example, u'ঊ ' should be u'ঊ'. And it seems like it's not easy to input chars like u'া' in an text editor so I suggest you directly use unicode in the code, like '\u09be':'AAV'. (Actually I'd suggest you use '\uxxxx' for all chars and write the real chars in comments.)
u'ত':'TA',u'ত':'THA' should change to u'ত':'TA',u'থ':'THA'.
The chars in bengali_phoneme_maplist are not complete. For example there's no ো , ৌ , ্ and ং
After fixing these errors you will get the correct result.
An example like:
print("1.", get_list_nums_without_9([589775, 677017, 34439, 48731548, 782295632, 181967909]))
print("2.", get_list_nums_without_9([6162, 29657355, 5485406, 422862350, 74452, 480506, 2881]))
print("3.", get_list_nums_without_9([292069010, 73980, 8980155, 921545108, 75841309, 6899644]))
print("4.", get_list_nums_without_9([]))
nums = [292069010, 73980, 8980155, 21545108, 7584130, 688644, 644908219, 44281, 3259, 8527361, 2816279, 985462264, 904259, 3869, 609436333, 36915, 83705, 405576, 4333000, 79386997]
print("5.", get_list_nums_without_9(nums))
I'm trying to get number list without 9. If the list is empty or if all of the numbers in the list contain the digit 9, the function should return an empty list. I tried the function below, it doesn't work.
def get_list_nums_without_9(a_list):
j=0
for i in a_list:
a_list[j]=i.rstrip(9)
j+=1
return a_list
expected:
1. [677017, 48731548]
2. [6162, 5485406, 422862350, 74452, 480506, 2881]
3. []
4. []
5. [21545108, 7584130, 688644, 44281, 8527361, 83705, 405576, 4333000]
your lists contain integers. To remove the ones containing 9 the best way is to test if 9 belongs to the number as string and rebuild the output using a list comprehension with a conditional.
(Besides, rstrip removes the trailing chars of a string. Not suitable at all for your problem)
def get_list_nums_without_9(a_list):
return [x for x in a_list if "9" not in str(x)]
testing with your data:
>>> numbers = [
... [589775, 677017, 34439, 48731548, 782295632, 181967909],
... [6162, 29657355, 5485406, 422862350, 74452, 480506, 2881],
... [292069010, 73980, 8980155, 921545108, 75841309, 6899644],
... [],
... [292069010, 73980, 8980155, 21545108, 7584130, 688644, 644908219,
... 44281, 3259, 8527361, 2816279, 985462264, 904259, 3869, 609436333,
... 36915, 83705, 405576, 4333000, 79386997]
... ]
>>> for i, l in enumerate(numbers, 1):
... print("{}. {}".format(i, get_list_nums_without_9(l)))
1. [677017, 48731548]
2. [6162, 5485406, 422862350, 74452, 480506, 2881]
3. []
4. []
5. [21545108, 7584130, 688644, 44281, 8527361, 83705, 405576, 4333000]
You don't want to replace anything in the list, you want to generate a new list based on what you detect in the old list:
def get_list_nums_without_9(numbers):
sans_9 = []
for number in numbers:
if '9' not in str(number):
sans_9.append(number)
return sans_9
This will get you an empty list if there are no valid numbers.
In my code I have the following line:
fprintf(logfile,'Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: %d\tSigma: %d\tDisp: %.1f\r\n',parameter_sets(ps,:));
which is too long, so I want to break it to:
fprintf(logfile,'Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: ...
%d\tSigma: %d\tDisp: %.1f\r\n',parameter_sets(ps,:));
However, since the brake is within a string, MATLAB see the formatting %d sign in the second line as a start of a comment, and ignore this line (and produce an error...).
So I tried to make it clearer with a [] that warp the string:
fprintf(logfile,['Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR: ...
%d\tSigma: %d\tDisp: %.1f\r\n'],parameter_sets(ps,:));
but no help, it still interpret the second line as a comment. I also tried with and without the ellipsis (...) in different places, with no success.
So how can I write a line in a formatted way (i.e. a reasonable length) if it has a % sign in it?
Divide it in two lines like this:
fprintf(logfile,['Parameters: Size: %d\tH: %.4f\tF: %.1f\tI: %.3f\tR:', ...
'%d\tSigma: %d\tDisp: %.1f\r\n'],parameter_sets(ps,:));
% notice the apostrophe and comma(',) before ellpsis(...) at the end of first line
% and apostrophe(') at the start of the second line