reduce the number of IF statements in Python - python-3.x

I have written a function that is going to have up to 72 if statements, and I was hoping to write much shorter code, but I have no idea where to start.
The function reads the self.timeselect variable when a radio button is selected, and the result is saved to a text file called missing_time.txt. If the result is 1, it saves "0000" to the file; if the result is 2, it saves "0020"; and so on, for 72 possible values.
Is there a smarter way to simplify the function?
def buttonaction():
    selectedchoice = ""
    if self.timeselect.get() == 1:
        selectedchoice = "0000"
        orig_stdout = sys.stdout
        f = open('missing_time.txt', 'w')
        sys.stdout = f
        print(selectedchoice)
        f.close()
    if self.timeselect.get() == 2:
        selectedchoice = "0020"
        orig_stdout = sys.stdout
        f = open('missing_time.txt', 'w')
        sys.stdout = f
        print(selectedchoice)
        f.close()
self.timeselect = tkinter.IntVar()
self.Radio_1 = tkinter.Radiobutton(text="0000", variable=self.timeselect, indicator=0, value=1)
self.Radio_1.place(x=50, y=200)
self.Radio_2 = tkinter.Radiobutton(text="0020", variable=self.timeselect, indicator=0, value=2)
self.Radio_2.place(x=90, y=200)

choice_map = {
    1 : "0000",
    2 : "0020"
}
def buttonaction():
    selected = self.timeselect.get()
    if 0 < selected < 73: # This works as intended in Python
        selectedchoice = choice_map[selected]
        # Do you intend to append to file instead of replacing it?
        # See text below.
        with open("missing_time.txt", 'w') as outfile:
            outfile.write(selectedchoice + "\n")
        print(selectedchoice)
Better yet, if there is a pattern that relates the value of self.timeselect.get() to the string that you write out, generate selectedchoice directly from that pattern instead of using a dictionary to do the mapping.
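As a sketch of that idea: the two values shown in the question (1 gives "0000", 2 gives "0020") would fit 72 slots spaced 20 minutes apart across 24 hours. That spacing is only a guess at the pattern, and slot_to_hhmm and step_minutes are hypothetical names used for illustration.

```python
def slot_to_hhmm(slot, step_minutes=20):
    """Map a 1-based slot number to an HHMM string.

    Assumes (not confirmed by the question) that consecutive slots
    are step_minutes apart, starting at midnight.
    """
    if not 1 <= slot <= 72:
        raise ValueError("slot must be between 1 and 72")
    hours, minutes = divmod((slot - 1) * step_minutes, 60)
    return f"{hours:02d}{minutes:02d}"


print(slot_to_hhmm(1))   # first slot
print(slot_to_hhmm(72))  # last slot
```

If the real values follow a different rule, only the arithmetic inside the function would need to change.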
Edit
I find it a bit odd that you are clearing the file "missing_time.txt" every time you call buttonaction. If your intention is to append to it, change the file mode accordingly.
Also, instead of opening and closing the file each time, you might just want to open it once and pass the file handle to buttonaction, or keep it as a global, depending on how you use it.
Finally, if you do not intend to catch the KeyError from an invalid key, you can do what @Clifford suggests and use choice_map.get(selected, "some default value that does not have to be str").
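A minimal sketch of the dict.get approach, using only the two mappings shown in the question. The helper name lookup_choice is hypothetical.

```python
choice_map = {1: "0000", 2: "0020"}  # only the two values given in the question


def lookup_choice(selected):
    """Return the mapped string, or None when the value has no entry."""
    # dict.get never raises KeyError; it returns None unless a default is given.
    return choice_map.get(selected)


print(lookup_choice(2))
print(lookup_choice(99))            # no entry, so None
print(choice_map.get(99, "n/a"))    # explicit default instead of None
```

The caller can then test the result against the default before writing anything to the file.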

All you need to do in this case is construct a string from the integer value self.timeselect.get().
selectedchoice = self.timeselect.get()
if 0 < selectedchoice < 73:
    orig_stdout = sys.stdout
    f = open('missing_time.txt', 'w')
    sys.stdout = f
    print(str(selectedchoice).zfill(4))  # Convert choice to a string
                                         # padded with leading zeros
                                         # to 4 characters
    f.close()
Further in the interests of simplification, redirecting stdout and restoring it is a cumbersome method of outputting to a file. Instead, you can write directly to the file:
with open('missing_time.txt', 'w') as f:
    f.write(str(selectedchoice).zfill(4) + "\n")
Note that because we use the with context manager here, f is automatically closed when we leave this context so there is no need to call f.close(). Ultimately you end up with:
selectedchoice = self.timeselect.get()
if 0 < selectedchoice < 73:
    with open('missing_time.txt', 'w') as f:
        f.write(str(selectedchoice).zfill(4) + "\n")
Even if you did use the conditionals, each one differs only in the first line, so only that part needs to be conditional; the rest can be performed once after the conditionals. Moreover, all the conditions are mutually exclusive, so you can use elif:
choice = self.timeselect.get()
if choice == 1:
    selectedchoice = "0000"
elif choice == 2:
    selectedchoice = "0020"
...
if 0 < choice < 73:  # test the integer, not the string
    with open('missing_time.txt', 'w') as f:
        f.write(selectedchoice + "\n")
In circumstances where there is no direct arithmetic relationship between the selected value and the required string, or the available choices are perhaps not contiguous, it is possible to implement a switch using a dictionary:
choiceToString = {
    1: "0001",
    2: "0002",
    ...
    72: "0072",
}
selectedchoice = choiceToString.get(self.timeselect.get(), "Invalid Choice")
if selectedchoice != "Invalid Choice":
    with open('missing_time.txt', 'w') as f:
        f.write(selectedchoice + "\n")

Since Python has no switch statement (before the match statement introduced in Python 3.10), you can't really reduce the number of if statements. But I see two ways to optimize your code and reduce its length.
First, you can use
if condition:
elif condition:
instead of
if condition:
if condition:
since self.timeselect.get() can't evaluate to more than one int.
Second, you can wrap all the code that doesn't vary in a function.
You can get rid of selectedchoice and put
orig_stdout = sys.stdout
f = open('missing_time.txt', 'w')
sys.stdout = f
print(selectedchoice)
f.close()
in a function writeToFile(selectedOption)
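A rough sketch of that refactoring. It writes to the file directly instead of redirecting sys.stdout, which is a substitution on my part; the names write_to_file and choice_to_string and the path parameter are hypothetical.

```python
def write_to_file(selected_option, path="missing_time.txt"):
    """Overwrite path with the chosen string followed by a newline."""
    with open(path, "w") as f:
        f.write(selected_option + "\n")


def choice_to_string(value):
    """Only the part that varies stays in the if/elif chain."""
    if value == 1:
        return "0000"
    elif value == 2:
        return "0020"
    return None  # remaining cases elided, as in the question


selected = choice_to_string(2)
if selected is not None:
    write_to_file(selected)
```

Each branch now contains a single line, and the file handling lives in one place.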

I'm assuming that the values are arbitrary and there's no defined pattern. I also see that the only thing that changes in your code is the selectedchoice variable. You can use a dictionary in such cases. A dictionary's elements are key/value pairs, so you can reference the key and get the value.
dictionary = {
    1:"0000",
    2:"0020",
    3:"0300",
    4:"4000"
}
def buttonAction():
    choice = self.timeselect.get()
    if choice in dictionary:  # only known values have a mapping
        selectedChoice = dictionary[choice]
        f = open('missing_time.txt', 'w')
        f.write(selectedChoice + " ")
        f.close()
        print(selectedChoice)

Related

Having Issues Concatenating Strings into list without \n - Python3

I am currently having some issues trying to append strings into a new list. However, when I get to the end, my list looks like this:
['MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH\nIIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT\nCGLANMFAYHTLGYEDLDELQKEPQPLVFVIELLQVDAPSDYQRETWNLSNHEKMKAVPV\nLHGEGNRLFKLGRYEEASSKYQEAIICLRNLQTKEKPWEVQWLKLEKMINTLILNYCQCL\nLKKEEYYEVLEHTSDILRHHPGIVKAYYVRARAHAEVWNEAEAKADLQKVLELEPSMQKA\nVRRELRLLENRMAEKQEEERLRCRNMLSQGATQPPAEPPTEPPAQSSTEPPAEPPTAPSA\nELSAGPPAEPATEPPPSPGHSLQH\n']
I'd like to remove the newlines somehow. I looked at other questions here, and most suggest using .rstrip; however, after adding that to my code, I get the same output. What am I missing? Apologies if this question has been asked before.
My input looks like this (first 3 lines shown):
sp|Q9NZN9|AIPL1_HUMAN Aryl-hydrocarbon-interacting protein-like 1 OS=Homo sapiens OX=9606 GN=AIPL1 PE=1 SV=2
MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH
IIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT
from sys import argv
protein = argv[1] #fasta file
sequence = '' #string linker
get_line = False #False = not the sequence
Uniprot_ID = []
sequence_list =[]
with open(protein) as pn:
    for line in pn:
        line.rstrip("\n")
        if line.startswith(">") and get_line == False:
            sp, u_id, name = line.strip().split('|')
            Uniprot_ID.append(u_id)
            get_line = True
            continue
        if line.startswith(">") and get_line == True:
            sequence.rstrip('\n')
            sequence_list.append(sequence) #add the amino acids onto the list
            sequence = '' #resets the str
        if line != ">" and get_line == True: #if the first line is not a fasta ID and is it a sequence?
            sequence += line
print(sequence_list)
Per the documentation, rstrip removes trailing characters, that is, the ones at the end. You probably misunderstood others' use of it to remove \ns, because typically those appear only at the end.
To replace a character with something else in an entire string, use replace instead.
These commands do not modify your string! They return a new string, so if you want to change something 'in' a current string variable, assign the result back to the original variable:
>>> line = 'ab\ncd\n'
>>> line.rstrip('\n')
'ab\ncd' # note: this is the immediate result, which is not assigned back to line
>>> line = line.replace('\n', '')
>>> line
'abcd'
When I asked this question, I hadn't taken the time to look at the documentation and understand my code. After looking, I realized two things:
My code wasn't actually getting what I was interested in.
For the specific question I asked, I could have simply used line.strip() to remove the '\n'.
sequence = '' #string linker
get_line = False #False = not the sequence
uni_seq = {}
"""this block of code takes a uniprot FASTA file and creates a
dictionary with the key as the uniprot id and the value as a sequence"""
with open (protein) as pn:
    for line in pn:
        if line.startswith(">"):
            if get_line == False:
                sp, u_id, name = line.strip().split('|')
                Uniprot_ID.append(u_id)
                get_line = True
            else:
                uni_seq[u_id] = sequence
                sequence_list.append(sequence)
                sp, u_id, name = line.strip().split('|')
                Uniprot_ID.append(u_id)
                sequence = ''
        else:
            if get_line == True:
                sequence += line.strip() # removes the newline space
uni_seq[u_id] = sequence
sequence_list.append(sequence)
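The same idea can be sketched as a small function that collects sequence lines in a list and joins them once per record. This assumes every header looks like ">sp|ID|name ..." as in the sample input; read_fasta is a hypothetical name, and it accepts any iterable of lines (an open file works too).

```python
def read_fasta(lines):
    """Return {uniprot_id: sequence} from an iterable of FASTA lines.

    Assumes headers of the form '>sp|ID|name ...'.
    """
    records = {}
    current_id = None
    parts = []
    for line in lines:
        line = line.strip()          # drops the trailing newline
        if not line:
            continue                 # skip blank lines
        if line.startswith(">"):
            if current_id is not None:
                records[current_id] = "".join(parts)
            current_id = line.split("|")[1]
            parts = []
        elif current_id is not None:
            parts.append(line)
    if current_id is not None:       # flush the last record
        records[current_id] = "".join(parts)
    return records


sample = [
    ">sp|Q9NZN9|AIPL1_HUMAN Aryl-hydrocarbon-interacting protein-like 1\n",
    "MDAALLLNVEGVKK\n",
    "IIIGNMFKLEVWEI\n",
]
print(read_fasta(sample))
```

Because each line is stripped before it is stored, no newline characters end up in the joined sequence.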

How to print 1st string of file's line during second iteration in python

My file contents are:
ttsighser66
dagadfgadgadgfadg
dafgad
fgadfgad
ttsighser63
sadfsadf
asfdas
My code:
file=open("C:\\file.txt","r")
cont = []
for i in file:
    dd = i.strip("\n")
    cont.append(dd)
    cc = ",".join(cont)
    if "tt" in i:
        cc = ",".join(cont[:-1])
        print(cont[-1], cc)
        cont = []
My code generates the output below:
ttsighser66
ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad
But I want output in the format below:
ttsighser66,dagadfgadgadgfadg,dafgad,fgadfgad
ttsighser63,sadfsadf,asfdas
file=open("file.txt","r")
cont = []
for i in file:
    dd = i.strip("\n")
    cont.append(dd)
    #print('cc',cont)
    if "ttsighser" in i and len(cont) != 1:
        cc = ",".join(cont[:-1])
        print(cc)
        cont = []
        cont.append(dd)
print(",".join(cont))
If you don't need to store any strings to a list and just need to print strings, you could try this instead.
with open("file.txt", "r") as f:
    line_counter = 0
    file_lines = f.readlines()
    for i in file_lines:
        dd = i.strip()
        if "tt" in dd:
            print("{0}{1}".format("\n" if line_counter > 0 else "", dd), end="")
        else:
            print(",{0}".format(dd), end="")
        line_counter += 1
    print("")
The reason why your code displays
ttsighser66
ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad
instead of
ttsighser66,dagadfgadgadgfadg,dafgad,fgadfgad
ttsighser63,sadfsadf,asfdas
is that when you first encounter 'ttsighser66', it is appended to cont. Then, since 'ttsighser66' contains 'tt', execution enters the conditional branch.
In the conditional branch, cc = ",".join(cont[:-1]) joins every string in cont except the last one. However, since cont contains only 'ttsighser66' at that point, cont[:-1] is [] (an empty list), so ",".join(cont[:-1]) is an empty string and cc is empty. print(cont[-1], cc) therefore displays just ttsighser66.
For the second line, ttsighser63 dagadfgadgadgfadg,dafgad,fgadfgad is displayed because cont already contains more than one value, so the values before 'ttsighser63' are displayed as well.
The remaining strings are not displayed because, according to your code, another string containing 'tt' would be needed before the strings in cc could be displayed.
Essentially, you need a pair of strings containing 'tt' to display the strings between them.
Additional remark: the line cc = ",".join(cont) in your code is redundant, since its value is never used before it is either overwritten inside the conditional branch or recomputed on the next iteration.
Version 1 (all data in a list of strings, printed once)
fp=open("file.txt", "r")
data = []
for line in fp:
    if "tt" in line:
        data.append(line.strip())
    else:
        data.append(data.pop() + "," + line.strip())
fp.close()
print("\n".join(data))
Version 2 (all data in a single string, printed once)
fp=open("file.txt","r")
data = ""
for line in fp:
    if "tt" in line:
        data += "\n" + line.strip()
    else:
        data += ","+line.strip()
fp.close()
data = data[1:]
print (data)

How to wrap a Python text stream to replace strings on the fly?

Given how convoluted my solution seems to be, I am probably doing it all wrong.
Basically, I am trying to replace strings on the fly in a text stream (e.g. open('filename', 'r') or io.StringIO(text)). The context is that I'm trying to let pandas.read_csv() handle "Infinity" as "inf" instead of choking on it.
I do not want to slurp the whole file in memory (it can be big, and even if the resulting DataFrame will live in memory, no need to have the whole text file too). Efficiency is a concern. So I'd like to keep using read(size) as the main way to get text in (no readline which is quite slower). The difficulty comes from the cases where read() might return a block of text that ends in the middle of one of the strings we'd like to replace.
Anyway, below is what I've got so far. It handles the conditions I've thrown at it so far (lines longer than size, search strings at the boundary of some read block), but I'm wondering if there is something simpler.
Oh, BTW, I don't handle anything else than calls to read().
import io
class ReplaceIOFile(io.TextIOBase):
    def __init__(self, iobuffer, old_list, new_list):
        self.iobuffer = iobuffer
        self.old_list = old_list
        self.new_list = new_list
        self.buf0 = ''
        self.buf1 = ''
        self.sub_has_more = True
    def read(self, size=None):
        if size is None:
            size = 2**16
        while len(self.buf0) < size and self.sub_has_more:
            eol = 0
            while eol <= 0:
                txt = self.iobuffer.read(size)
                self.buf1 += txt
                if len(txt) < size:
                    self.sub_has_more = False
                    eol = len(self.buf1) + 1
                else:
                    eol = self.buf1.rfind('\n') + 1
            txt, self.buf1 = self.buf1[:eol], self.buf1[eol:]
            for old, new in zip(self.old_list, self.new_list):
                txt = txt.replace(old, new)
            self.buf0 += txt
        val, self.buf0 = self.buf0[:size], self.buf0[size:]
        return val
Example:
text = """\
name,val
a,1.0
b,2.0
e,+Infinity
f,-inf
"""
size = 4 # or whatever -- I tried 1,2,4,10,100,2**16
with ReplaceIOFile(io.StringIO(text), ['Infinity'], ['inf']) as f:
    while True:
        buf = f.read(size)
        print(buf, end='')
        if len(buf) < size:
            break
Output:
name,val
a,1.0
b,2.0
e,+inf
f,-inf
So for my application:
# x = pd.read_csv(io.StringIO(text), dtype=dict(val=np.float64)) ## crashes
x = pd.read_csv(ReplaceIOFile(io.StringIO(text), ['Infinity'], ['inf']), dtype=dict(val=np.float64))
Out:
  name       val
0    a  1.000000
1    b  2.000000
2    e       inf
3    f      -inf
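One possibly simpler way to deal with matches split across read() blocks, sketched here as a generator rather than as the file-like class above: hold back the last len(old) - 1 characters of each block, since only that tail can be the start of a match that completes in the next block. This is an illustrative alternative, not the approach from the question; it handles a single search string, and replace_stream and read_chunks are hypothetical names. Being a generator, it would still need a thin read() wrapper before it could be handed to pandas.read_csv.

```python
import io


def read_chunks(stream, size):
    """Yield successive read(size) blocks from a text stream until EOF."""
    return iter(lambda: stream.read(size), "")


def replace_stream(chunks, old, new):
    """Yield the text from chunks with every occurrence of old replaced by new."""
    if not old:
        raise ValueError("old must be a non-empty string")
    keep = len(old) - 1          # longest tail that may begin a split match
    pending = ""
    for chunk in chunks:
        pending += chunk
        parts = pending.split(old)
        # Every part except the last was followed by a complete match.
        head = "".join(part + new for part in parts[:-1])
        tail = parts[-1]
        cut = max(len(tail) - keep, 0)
        yield head + tail[:cut]
        pending = tail[cut:]     # carried over to the next block
    yield pending


text = "name,val\na,1.0\ne,+Infinity\nf,-inf\n"
chunks = read_chunks(io.StringIO(text), 4)
print("".join(replace_stream(chunks, "Infinity", "inf")), end="")
```

The carried-over tail is always shorter than the search string, so the extra memory stays constant regardless of file size.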

Evaluating a mathematical expression without eval() on Python3 [duplicate]

I'm working on a "copy-paste calculator" that detects any mathematical expressions copied to the system clipboard, evaluates them, and copies the answer to the clipboard, ready to be pasted. Although the code uses the eval() function, I'm not terribly concerned, since the user normally knows what they are copying. That said, I want to find a better way that doesn't handicap the calculations (e.g. by removing the ability to calculate multiplications or exponents).
Here's the important parts of my code:
#! python3
import pyperclip, time
parsedict = {"×": "*",
             "÷": "/",
             "^": "**"} # Get rid of anything that cannot be evaluated
def stringparse(string): # Remove whitespace and replace unevaluateable objects
    a = string
    a = a.replace(" ", "")
    for i in a:
        if i in parsedict.keys():
            a = a.replace(i, parsedict[i])
    print(a)
    return a
def calculate(string):
    parsed = stringparse(string)
    ans = eval(parsed) # EVIL!!!
    print(ans)
    pyperclip.copy(str(ans))
def validcheck(string): # Check if the copied item is a math expression
    proof = 0
    for i in mathproof:
        if i in string:
            proof += 1
        elif "http" in string: #TODO: Create a better way of passing non-math copies
            proof = 0
            break
    if proof != 0:
        calculate(string)
def init(): # Ensure previous copies have no effect
    current = pyperclip.paste()
    new = current
    main(current, new)
def main(current, new):
    while True:
        new = pyperclip.paste()
        if new != current:
            validcheck(new)
            current = new
            pass
        else:
            time.sleep(1.0)
            pass
if __name__ == "__main__":
    init()
Q: What should I use instead of eval() to calculate the answer?
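You should use ast.parse (the fragment below is meant to sit inside a function, hence the bare return statements):
import ast
try:
    tree = ast.parse(expression, mode='eval')
except SyntaxError:
    return    # not a Python expression
# Note: ast.Num is deprecated since Python 3.8 and removed in 3.12;
# ast.Constant takes its place there.
if not all(isinstance(node, (ast.Expression,
        ast.UnaryOp, ast.unaryop,
        ast.BinOp, ast.operator,
        ast.Num)) for node in ast.walk(tree)):
    return    # not a mathematical expression (numbers and operators)
result = eval(compile(tree, filename='', mode='eval'))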
Note that for simplicity this allows all the unary operators (+, -, ~, not) as well as the arithmetic and bitwise binary operators (+, -, *, /, %, //, **, <<, >>, &, |, ^) but not the logical or comparison operators. It should be straightforward to refine or expand the allowed operators.
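If the goal is to avoid eval() altogether, the validated tree can also be evaluated by walking it and applying functions from the operator module. The following is a minimal sketch of that variation, not the code from the answer above: it assumes Python 3.8 or later (where numeric literals parse to ast.Constant), supports only the operators listed in the two tables, and safe_eval is a hypothetical name. It does not guard against very expensive inputs such as huge exponents.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a whitelisted expression node."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def safe_eval(expression):
    """Evaluate an arithmetic expression string without calling eval()."""
    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on bad input
    return _evaluate(tree.body)


print(safe_eval("(45 - 22*2) + 34**2"))
```

Function calls, names, and attribute access all fall through to the ValueError, so input like __import__('os') is rejected before anything runs.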
Without using eval, you'd have to implement a parser or use an existing package such as simpleeval (I'm not the author, and there are others, but I have tested that one successfully).
In one line, plus import:
>>> from simpleeval import simpleeval
>>> simpleeval.simple_eval("(45 + -45) + 34")
34
>>> simpleeval.simple_eval("(45 - 22*2) + 34**2")
1157
Now, if I try to hack the calculator by importing a module:
>>> simpleeval.simple_eval("import os")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "K:\CODE\COTS\python\simpleeval\simpleeval.py", line 466, in simple_eval
return s.eval(expr)
File "K:\CODE\COTS\python\simpleeval\simpleeval.py", line 274, in eval
return self._eval(ast.parse(expr.strip()).body[0].value)
AttributeError: 'Import' object has no attribute 'value'
Caught! The cryptic error message comes from the fact that simpleeval can evaluate variables that you can optionally pass in through a dictionary. Catch the AttributeError exception to intercept malformed expressions. No need for eval for that.
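In plain Python 3, without using any built-in evaluation function (note that this handles only single-digit operands and applies the operators strictly left to right, ignoring precedence):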
input_string = '1+1-1*4+1'
result = 0
counter = -1
for ch in range(len(input_string)):
    if counter == ch:
        continue
    if input_string[ch] in ['-', '+', '/', '*', '**']:
        next_value = int(input_string[ch+1])
        if input_string[ch] == '-':
            result -= next_value
            counter = ch+1
        elif input_string[ch] == '+':
            result += next_value
            counter = ch+1
        elif input_string[ch] == '*':
            result *= next_value
            counter = ch+1
        elif input_string[ch] == '/':
            result /= next_value
            counter = ch+1
        elif input_string[ch] == '**':
            result **= next_value
            counter = ch+1
    else:
        result = int(input_string[ch])
print(result)
Output:
5

Having trouble with str.find()

I'm trying to use str.find() and it keeps raising an error. What am I doing wrong?
import codecs
def countLOC(inFile):
    """ Receives a file and then returns the amount
    of actual lines of code by not counting commented
    or blank lines """
    LOC = 0
    for line in inFile:
        if line.isspace():
            continue
        comment = line.find('#')
        if comment > 0:
            for letter in range(comment):
                if not letter.whitespace:
                    LOC += 1
                    break
    return LOC
if __name__ == "__main__":
    while True:
        file_loc = input("Enter the file name: ").strip()
        try:
            source = codecs.open(file_loc)
        except:
            print ("**Invalid filename**")
        else:
            break
    LOC_count = countLOC(source)
    print ("\nThere were {0} lines of code in {1}".format(LOC_count,source.name))
Error
File "C:\Users\Justen-san\Documents\Eclipse Workspace\countLOC\src\root\nested\linesOfCode.py", line 12, in countLOC
comment = line.find('#')
TypeError: expected an object with the buffer interface
Use the built-in function open() instead of codecs.open().
You're running afoul of the difference between non-Unicode (Python 3 bytes, Python 2 str) and Unicode (Python 3 str, Python 2 unicode) string types. Python 3 won't convert automatically between non-Unicode and Unicode like Python 2 will. Using codecs.open() without an encoding parameter returns an object which yields bytes when you read from it.
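The mismatch can be reproduced without any file, as a small sketch: a bytes line only accepts a bytes pattern, and a str line only a str pattern. The sample line, the UTF-8 encoding, and the helper find_hash are assumptions made for illustration.

```python
raw_line = b"x = 1  # comment\n"        # what a binary-mode read yields
text_line = raw_line.decode("utf-8")     # what a text-mode read yields (encoding assumed)


def find_hash(line):
    """Search for '#' using a pattern of the same type as the line."""
    pattern = b"#" if isinstance(line, bytes) else "#"
    return line.find(pattern)


print(find_hash(raw_line), find_hash(text_line))

try:
    raw_line.find("#")                   # str pattern against a bytes line
    mixed_ok = True
except TypeError:
    mixed_ok = False                     # Python 3 refuses to mix the two types
print(mixed_ok)
```

Opening the file with the built-in open() in text mode yields str lines, so line.find('#') works as written in the question.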
Also, your countLOC function won't work:
for letter in range(comment):
    if not letter.whitespace:
        LOC += 1
        break
That for loop will iterate over the numbers from zero to one less than the position of '#' in the string (letter = 0, 1, 2...); whitespace isn't a method of integers, and even if it were, you're not calling it.
Also, you're never incrementing LOC if the line doesn't contain #.
A "fixed" but otherwise faithful (and inefficient) version of your countLOC:
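def countLOC(inFile):
    LOC = 0
    for line in inFile:
        if line.isspace():
            continue
        comment = line.find('#')
        if comment > 0:
            # count the line only if code precedes the '#'
            for letter in line[:comment]:
                if not letter.isspace():
                    LOC += 1
                    break
        elif comment < 0:
            # no '#' at all; a line starting with '#' (comment == 0) is skipped
            LOC += 1
    return LOC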
How I might write the function:
def count_LOC(in_file):
    loc = 0
    for line in in_file:
        line = line.lstrip()
        if len(line) > 0 and not line.startswith('#'):
            loc += 1
    return loc
Are you actually passing an open file to the function? Maybe try printing type(file) and type(line), as there's something fishy here -- with an open file as the argument, I just can't reproduce your problem! (There are other bugs in your code but none that would cause that exception). Oh btw, as best practice, DON'T use names of builtins, such as file, for your own purposes -- that causes incredible amounts of confusion!
