How to count number of entries in a text file - python-3.x

I have a text file:
10
List of ARFCNs = 987 988 989 991 992 993 995 999 1000 1004 1008 1009 1010 1011 1012 1018 1019 1020 1023
I want code that returns the number of values in the list, i.e. 19 in this case.
Furthermore, I need to use this count so that if it is either 0 or more than 1, one line of text is printed, but if it is exactly 1, a different statement is printed.
I know it's a very basic question, but I can't find a specific solution.
I tried using len, but I'm not sure what the delimiter should be. Splitting on a space gives a faulty answer. I want only the entries after the = sign to be counted and then processed, probably in an if/else statement.

Here is the full code:
with open('problem.txt', 'r') as myfile:
    file_str = myfile.read()
my_list = []  # initializing a list
# take the text after '=' and split it on whitespace;
# split() with no argument also discards empty strings
for word in file_str.split('=')[1].split():
    my_list.append(word)
if len(my_list) == 0:
    print('the number of entries is zero')
else:
    print(f'the number of entries is {len(my_list)}')

This will help you:
numbers = data.split("=")[1].split()
count = len(numbers)
Here data is the whole string that you read from the text file. Calling split() with no argument splits on any run of whitespace and drops empty strings, so no cleanup loop is needed.
This gives you the count of numbers, which you can then use further.
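To tie the count to the branching the question asks for, here is a minimal sketch. It assumes the file content looks like the sample above; the helper name count_entries and the printed messages are placeholders, not part of the original answers.

```python
def count_entries(text):
    """Count whitespace-separated values after the first '=' in text."""
    # partition() yields an empty tail when there is no '=' at all
    _, _, tail = text.partition("=")
    # split() with no argument ignores repeated spaces and the trailing newline
    return len(tail.split())


# Sample content copied from the question
sample = (
    "10\n"
    "List of ARFCNs = 987 988 989 991 992 993 995 999 1000 1004 "
    "1008 1009 1010 1011 1012 1018 1019 1020 1023\n"
)
count = count_entries(sample)

if count == 1:
    print("exactly one entry")  # placeholder message
else:
    print(f"{count} entries (zero or more than one)")  # placeholder message
```

Using partition() instead of split("=")[1] avoids an IndexError when a file has no = sign; in that case the count is simply 0.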

Related

Count pattern matches in paragraphs separated by empty lines in Python

I want to count matches in rows that have a pattern TRP or PHE or MET - I need to count it per paragraph (separated by empty lines). Then I would like to calculate the percentage of the matches by dividing the matches count by the number of lines in each paragraph. Is there a quick python solution for this?
My input looks like:
THR 61 65.21
LEU 62 63.85
PRO 63 54.61
LEU 64 50.74
ALA 65 57.40
PRO 66 56.49
ASP 67 56.77
PRO 68 55.94
TYR 69 56.06
PRO 70 56.55
GLY 71 57.74
HIS 72 55.69
ASN 73 64.70
PRO 74 65.70

ASP 422 65.05
SER 423 53.19
SER 424 45.39
ARG 425 47.80
ALA 426 48.84
ARG 427 46.19
ALA 428 46.81
SER 429 51.64
GLY 430 56.53
GLY 431 69.14

ASP 471 59.01
VAL 472 51.82
ASP 473 52.63
GLN 474 45.86
LEU 475 44.30
SER 476 45.83
LEU 477 45.78
THR 478 37.91
PRO 479 44.77
VAL 480 41.47
VAL 481 46.86
PRO 482 46.12
GLY 483 46.38
PRO 484 49.42
PRO 485 57.74
I tried with awk but it is too hard...
This should do the trick, assuming your input is a txt file.
Even if your input is not a text file, you can load it accordingly.
import re

def calc_percentage(log, line_count):
    # an empty paragraph has no lines, so report 0% instead of dividing by zero
    if line_count == 0:
        return {pattern: 0.0 for pattern in log}
    percentage = {}
    for pattern, hits in log.items():
        percentage[pattern] = (hits / line_count) * 100
    return percentage

patterns = ['TRP', 'THR', 'PRO']
# log = {'TRP': 0, 'THR': 0, 'PRO': 0} also works when there are only a few patterns
log = {pattern: 0 for pattern in patterns}
para = 1
count = 0
with open("input.txt") as input_file:
    for line in input_file:
        count += 1
        for pattern in log:
            if pattern in line:
                log[pattern] += 1
        # a blank line marks the end of a paragraph
        if re.match(r'\r?\n', line):
            line_count = count - 1
            print(f"end of para {para} & number of lines {line_count}")
            print(f"count from paragraph {para} is {log}")
            percentage = calc_percentage(log, line_count)
            print(f"percentages are as follows {percentage}")
            para = para + 1
            # reset for next paragraph
            count = 0
            log = {pattern: 0 for pattern in patterns}
# handling last paragraph
line_count = count
print(f"end of para {para} & number of lines {line_count}")
print(f"count from paragraph {para} is {log}")
percentage = calc_percentage(log, line_count)
print(f"percentages are as follows {percentage}")
This task is straightforward in awk if the record separator is set to read paragraphs (one or more blank lines between lines) using RS="" (special meaning explained towards the bottom of this page of the awk manual: https://www.gnu.org/software/gawk/manual/html_node/awk-split-records.html), and the field separator is set to read lines as fields using FS="\n". In my example I have set these in a BEGIN block, but shell switches could also be used.
Once the fields are configured, pattern blocks are established for each search pattern. The action of each is to increment a counter (action only applied when pattern is present). A final universal block can print the count and the number of fields/lines (NF) for that record, and perform whatever arithmetic is required with them.
awk procedure run on file.txt:
awk ' BEGIN {RS="";FS="\n";} /TRP/{aa++;} /PHE/{aa++} /MET/{aa++} {print "set " NR": " 0+aa " matches in " NF " lines, Ratio=" (0+aa)/NF; aa=0}' file.txt
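Note that the patterns are separated into distinct blocks so that the counter is incremented once for each pattern found in a paragraph; if a combined or (|) pattern had been used, the count would only increase once even if two different patterns were present. Because each block runs at most once per record, a pattern that occurs on several lines of the same paragraph is still counted only once.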
output
set 1: 0 matches in 14 lines, Ratio=0
set 2: 0 matches in 10 lines, Ratio=0
set 3: 0 matches in 15 lines, Ratio=0
If totals for the file are required, a second counter variable can be added to each block that is not reset in the last block, along with a counter to accumulate the NF count for each record. In such a case, an END block can be used to sum and calculate overall ratios.
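Since the question asked for a Python solution that counts matching lines per paragraph, the same idea can be sketched as follows. This is only an illustration: the function name paragraph_stats and the short sample text are made up, and a line is counted once even if it contains more than one pattern.

```python
import re


def paragraph_stats(text, patterns=("TRP", "PHE", "MET")):
    """Return (matching_lines, total_lines, percentage) for each paragraph."""
    stats = []
    # One or more blank lines separate paragraphs
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue  # skip empty input
        hits = sum(1 for ln in lines if any(p in ln for p in patterns))
        stats.append((hits, len(lines), 100.0 * hits / len(lines)))
    return stats


# Made-up sample data in the same three-column layout as the question
sample = "TRP 1 50.0\nLEU 2 60.0\n\nPHE 3 40.0\nMET 4 45.0\nALA 5 47.0\nGLY 6 48.0\n"
for number, (hits, total, pct) in enumerate(paragraph_stats(sample), start=1):
    print(f"paragraph {number}: {hits}/{total} lines match ({pct:.1f}%)")
```

For a real file, the text could be read with open(...).read() and passed to the function unchanged.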

Horizontal print of a complex string block

Once again I'm asking for your advice. I'm trying to print a complex string block; it should look like this:
  32         1      9999      523
+  8    - 3801    + 9999    -  49
----    ------    ------    -----
  40     -3800     19998      474
I wrote the function arrange_printer() to arrange the characters in the correct format, so that it could be reused for printing the list. This is how my code looks so far:
import operator
import sys
def arithmetic_arranger(problems, boolean: bool):
    arranged_problems = []
    if len(problems) <= 5:
        for num in range(len(problems)):
            arranged_problems += arrange_printer(problems[num], boolean)
    else:
        sys.exit("Error: Too many problems")
    return print(*arranged_problems, end=' ')

def arrange_printer(oper: str, boolean: bool):
    oper = oper.split()
    ops = {"+": operator.add, "-": operator.sub}
    a = int(oper[0])
    b = int(oper[2])
    if len(oper[0]) > len(oper[2]):
        size = len(oper[0])
    elif len(oper[0]) < len(oper[2]):
        size = len(oper[2])
    else:
        size = len(oper[0])
    line = '------'
    ope = '  %*i\n%s %*i\n%s' % (size,a,oper[1],size,b,'------'[0:size+2])
    try:
        res = ops[oper[1]](a,b)
    except:
        sys.exit("Error: Operator must be '+' or '-'.")
    if boolean == True:
        ope = '%s\n%*i' % (ope,size+2, res)
    return ope

arithmetic_arranger(['20 + 300', '1563 - 465 '], True)
#arrange_printer(' 20 + 334 ', True)
Sadly, I'm getting this format:
2 0
+ 3 0 0
- - - - -
3 2 0 1 5 6 3
- 4 6 5
- - - - - -
1 0 9 8
If you print the return value of arrange_printer(), as in the last commented line, the format is as desired.
Any suggestions for improving my code or adopting good coding practices are welcome; I'm starting to get a feel for programming in Python.
Thank you for your help!
The first problem I see is that you use += to add an item to the arranged_problems list. Strings are iterable. somelist += someiterable iterates over the someiterable, and appends each element to somelist. To append, use somelist.append()
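The difference is easy to see on a small example (the variable names here are only for illustration):

```python
extended = []
extended += "ab\ncd"       # iterates over the string, adding one character at a time
print(extended)            # ['a', 'b', '\n', 'c', 'd']

appended = []
appended.append("ab\ncd")  # adds the whole string as a single element
print(appended)            # ['ab\ncd']
```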
Now once you fix this, it still won't work like you expect it to, because print() works by printing what you give it at the location of the cursor. Once you're on a new line, you can't go back to a previous line, because your cursor is already on the new line. Anything you print after that will go to the new line at the location of the cursor, so you need to arrange multiple problems such that their first lines all print first, then their second lines, and so on. Just fixing append(), you'd get this output:
   20
+ 300
-----
  320   1563
-  465
------
  1098
You get a string with \n denoting the start of the new line from each call to arrange_printer(). You can split this output into lines, and then process each row separately.
For example:
def arithmetic_arranger(problems, boolean:bool):
    arranged_problems = []
    if len(problems) > 5:
        print("Too many problems")
        return
    for problem in problems:
        # Arrange and split into individual lines
        lines = arrange_printer(problem, boolean).split('\n')
        # Append the list of lines to our main list
        arranged_problems.append(lines)
    # Now, arranged_problems contains one list for each problem.
    # Each list contains individual lines we want to print
    # Use zip() to iterate over all the lists inside arranged_problems simultaneously
    for problems_lines in zip(*arranged_problems):
        # problems_lines is e.g.
        # ('   20', '  1563')
        # ('+ 300', '-  465') etc
        # Unpack this tuple and print it, separated by spaces.
        print(*problems_lines, sep=" ")
Which gives the output:
   20   1563
+ 300 -  465
----- ------
  320   1098
If you expect each problem to have a different number of lines, then you can use the itertools.zip_longest() function instead of zip()
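A minimal sketch of the zip_longest() variant, assuming each problem is already a multi-line string. The helper name side_by_side and the four-space gap are illustrative choices, not part of the code above.

```python
from itertools import zip_longest


def side_by_side(blocks, gap="    "):
    """Join multi-line strings horizontally, padding the shorter blocks."""
    split_blocks = [block.split("\n") for block in blocks]
    widths = [max(len(line) for line in lines) for lines in split_blocks]
    rows = []
    # zip_longest() keeps going until the longest block is exhausted,
    # filling the missing lines of shorter blocks with ""
    for row in zip_longest(*split_blocks, fillvalue=""):
        cells = (cell.ljust(width) for cell, width in zip(row, widths))
        rows.append(gap.join(cells).rstrip())
    return "\n".join(rows)


# The second block has no result line, so it is one line shorter
print(side_by_side(["   20\n+ 300\n-----\n  320", "  1563\n-  465\n------"]))
```

Padding each cell to its block's width keeps later columns aligned even when an earlier block has run out of lines.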
To collect all my other comments in one place:
return print(...) is pretty useless. print() doesn't return anything. return print(...) will always cause your function to return None.
Instead of iterating over range(len(problems)) and accessing problems[num], just do for problem in problems and then use problem instead of problems[num]
Debugging is an important skill, and the sooner into your programming career you learn it, the better off you will be.
Stepping through your program with a debugger allows you to see how each statement affects your program and is an invaluable debugging tool

Python, How do I ignore a string such as 'done' amongst numbers to sum the total from a list

while True:
numbers = input('> ')
if numbers == 'done':
break
total = 0
for number in numbers:
if numbers == int:
total = total + numbers
print(total)
I've had a hard time understanding exactly what you want to do with this code; please use a proper code block next time, with proper indentation. My guess is that you want to take an input number like 345 and output 3+4+5. If the input is not an int, it should break the loop. I've come up with 2 different solutions, depending on what you need.
This code simply takes the input and checks whether it is "done"; if it is not "done", it tries to add up the digits. This is an easy-to-understand solution, but it will produce an error if the input is any string other than "done" that is not made up of digits.
while True:
    numbers = input(">")
    if numbers == "done":
        break
    else:
        total = 0
        for number in numbers:
            total += int(number)
        print(total)
This approach tests for the "done" string again, but afterwards also checks whether the input can be converted into an int. If not, the error is caught and "invalid input" is printed. If you want the program to terminate on any non-numeric string, you can put break in the except section.
while True:
    numbers = input(">")
    if numbers == "done":
        break
    else:
        try:
            testing = int(numbers)
            total = 0
            for number in numbers:
                total += int(number)
        except ValueError:
            total = "invalid input"
        print(total)
I'm a beginner myself, and if an experienced person can show me a better way to do this, I would be very interested.
while True:
    numbers = input('> ')
    if numbers == 'done':
        break
total = 0
for number in numbers:
    if numbers == int:
        total = total + numbers
print(total)
Assuming this is your code, here's a solution to your problem:
numbers = []
while True:
    a = input('>')
    if a == 'done':
        break
    else:
        numbers.append(a)
total = 0
for number in numbers:
    total = total + int(number)
print(total)
By default, input() returns everything as a string, so we convert each value to an integer to find the total.
We also use a list to store all the values we accept.
Another solution is:
numbers = []
while True:
    a = input('>')
    if a == 'done':
        break
    else:
        numbers.append(a)
p = map(int, numbers)
print(sum(p))
Hope you understand the solution :-)
total = 0
average = 0
count = 0
while True:
    numbers = input('> ')
    if numbers == 'done':
        break
    try:
        total = int(numbers) + total
        count = count + 1
    except ValueError:
        print('nope')
try:
    average = total / count
except ZeroDivisionError:
    print('error')
print(total)
print(average)
print(count)
Try this:
total = 0
number_of_inputs = 0
while True:
    number_string = input('Enter a number: ')
    try:
        total += float(number_string)
        number_of_inputs += 1
    except ValueError:
        break  # we weren't given a number, so exit the loop
# Now that we're outside of the loop, print out the total:
print('The total is:', total)
if number_of_inputs > 0:
    average = total / number_of_inputs
    print('The average is:', average)
else:
    print('The average cannot be calculated, as no inputs were given.')
Do you see what's happening? The while loop keeps asking for numbers and adding them to total until a non-number (like "done") is given. Once it gets that non-number, the float() function will fail, the exception it throws will get caught, and the code will immediately break out of the while loop.
And once out of the loop, the total and the average are printed out.
A few things you should be aware of:
If the user gave no inputs (which is possible here), the total will correctly print out as 0, but if you try to calculate the average, you will get an error, due to dividing by number_of_inputs (which is also 0). That is why I check that number_of_inputs is greater than zero before I even attempt to calculate the average.
Originally I used int() to convert the string to a number, but I changed it to use float() instead. I figure that since you want to calculate an average, averages are not necessarily integers (even if all the inputs are), so there's no point in enforcing integer input. That is why I changed the int() to float(), but whether or not you want to use it is up to you.
ValueError isn't a function; it's an Exception. At this point you probably don't know what Exceptions are, so just know that they are special cases that can happen, and they're often used for catching errors, such as bad input values.
In the code I posted above, the loop is always expecting numerical input. But as soon as we have input that can't be converted to a number, the program then says, "Hey, I have an exception to what we're expecting! The exception is that there's an error in the value!" Then the program, instead of continuing to the next line of code (which is number_of_inputs += 1) will then execute the block of code under the except ValueError: section. And in the code above, all it does is call break, which exits the loop.
Once out of the loop, the code prints out the total and the average.
If it weren't for the try: and except ValueError: lines in the code, then the program would abruptly end (with a lengthy error message) once someone gave a non-numerical input. That happens because the call to float() doesn't know how to convert a value like "done" to a number, so it raises an error, and an uncaught error makes the program quit.
However, by using try: and except ValueError:, we are anticipating that someone might give non-numerical input. When that happens (which it will, when the user is finished giving inputs) -- instead of quitting -- we want an alternate action to take. And we specify that alternate action to be a simple break out of the loop -- which will allow the program to continue with whatever is after the loop.
I hope this makes sense. If it doesn't, it will make more sense once you start learning about Exceptions in Python.

I am getting a "Time Limit Exceeded " error in the following code. How to fix that error

The following code is my attempt at checking whether the sum of a number and its reverse is a palindrome. If the sum is a palindrome, the sum is displayed. Otherwise the process is repeated until we get a palindrome. When I run it, I get a time limit exceeded error. Where do I need to correct the code?
def pal(n1):
    temp=n1
    rev=0
    while(temp>0):
        rev=(rev*10)+(temp%10)
        temp=temp/10
    sum1=n1+rev
    temp=sum1
    rev=0
    while(temp>0):
        rev=(rev*10)+(temp%10)
        temp=temp/10
    if(rev==sum1):
        print(sum1)
    else:
        pal(sum1)
n=int(input())
pal(n)
I expect the output of a number 453 to be 6666.
i.e.
453+354=807 (not a palindrome. So repeat the process)
807+708=1515
1515+5151=6666 (it is a palindrome)
Your problem is that the loop condition is while temp > 0, but inside that loop you are using float division: temp=temp/10. So temp keeps shrinking as a fraction instead of dropping to 0 once the digits run out. For example:
>>> 8/10
0.8
>>> 0.8/10
0.08
What you want is to change your divisions to int division:
>>> 8//10
0
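Applied to the arithmetic approach from the question, the fix could look like the sketch below. The function names reverse_number and reverse_and_add are illustrative, and the code assumes a non-negative integer for which the process eventually reaches a palindrome.

```python
def reverse_number(n):
    """Reverse the decimal digits of a non-negative integer."""
    rev = 0
    while n > 0:
        rev = rev * 10 + n % 10
        n //= 10  # floor division keeps n an int, so it reaches 0
    return rev


def reverse_and_add(n):
    """Add n to its reverse, repeating until the sum is a palindrome."""
    total = n + reverse_number(n)
    while total != reverse_number(total):
        total += reverse_number(total)
    return total


print(reverse_and_add(453))  # 6666
```

A loop is used instead of recursion so that long chains of additions cannot hit the recursion limit.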
Still, you might consider working with strings, which is much easier in this case:
def pal(n):
    rev_n = str(n)[::-1]
    sum_str = str(n + int(rev_n))
    while sum_str != sum_str[::-1]:
        # print(sum_str)
        sum_rev = sum_str[::-1]
        sum_str = str(int(sum_str) + int(sum_rev))
    print(sum_str)
And with the commented print this gives:
>>> pal(453)
807
1515
6666
Here is one way of doing this using string manipulation, which goes a lot easier than trying to do this with numbers. It is also a more direct translation of what you describe afterwards. (I do not really see the link between your code and your description...)
def is_palindrome(text):
    # : approach1, faster for large inputs
    # mid_length = len(text) // 2
    # offset = 0 if len(text) % 2 else 1
    # return text[:mid_length] == text[:-mid_length - offset:-1]
    # : approach2, faster for small inputs
    return text == text[::-1]

def palindrome_sum(num):
    while not is_palindrome(num):
        num = str(int(num) + int(num[::-1]))
    return num

num = input()  # 453
palindrome = palindrome_sum(num)
print(palindrome)
# 6666

How do I sum up values from a text file in Python?

I know there are a couple of posts about this question on S.O., but they have not helped me solve my problem. I am trying to use an accumulator to sum up the values in a text file. When there is a number on each line, my code just prints each line that is in the file. When there is a blank line in between, I get an error message. I think it is a simple oversight, but I am new to Python so I am not sure what I am doing wrong.
My code:
def main():
    #Open a file named numbers.txt
    numbers_file = open('numbers.txt','r')
    #read the numbers on the file
    number = numbers_file.readline()
    while number != '':
        #convert to integer
        int_number = int(number)
        #create accumulator
        total = 0
        #Accumulates a total number
        total += int_number
        #read the numbers on the file
        number = numbers_file.readline()
        #Print the data that was inside the file
        print(total)
    #Close the numbers file
    numbers_file.close()
#Call the main function
main()
Inputs in the text file:
100

200

300

400

500
Gives me error message:
ValueError: invalid literal for int() with base 10: '\n'
Inputs in the text file:
100
200
300
400
500
Prints:
100
200
300
400
500
You need to exclude empty lines because you can't convert them to an int(). One pythonic (EAFP) way to do this is to catch the exception and ignore (though this will silently ignore any non-number line):
with open('numbers.txt','r') as numbers_file:
    total = 0
    for line in numbers_file:
        try:
            total += int(line)
        except ValueError:
            pass
print(total)
Or you can explicitly test that you don't have an empty string after you .strip() all the whitespace (this would still error for a non-numeric line, e.g. 'hello'):
with open('numbers.txt','r') as numbers_file:
    total = 0
    for line in numbers_file:
        if line.strip():
            total += int(line)
print(total)
This second one can be written as a generator expression:
with open('numbers.txt','r') as numbers_file:
    total = sum(int(line) for line in numbers_file if line.strip())
print(total)
You are assigning the value 0 to your accumulator each time you go through the loop, before you add the new value. This means you're adding the new value to 0 each time, which means you're just printing the new value.
If you move the line total = 0 to occur before the loop, then it should work as you were hoping.
If you want, you can clean this up a little:
numbers_file = open('numbers.txt','r')
total = 0
for number in numbers_file:
    if number.strip():
        int_number = int(number)
        total += int_number
print(total)
numbers_file.close()
would be a first pass. The check if number.strip() is true only when the line still contains something after its whitespace is removed, so blank lines (which are just '\n') are skipped.
Hi, you are not removing the newline character, \n.
To make sure you only get strings that can be converted to numbers, you have to strip the other characters.
With e.g.
a = '100\ntest'
print(a.isnumeric())  # False
a = '103478'
print(a.isnumeric())  # True
you can test whether there is a character that prevents conversion to a number.
The regular expression package lets you manipulate strings easily.
See this Stack Overflow thread.
import re
a = 'jkfads1000ki'
re.sub(r'\D', '', a)
'1000'
See the Python docs on re.
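Putting the two ideas together, a minimal sketch of summing lines after stripping non-digit characters might look like this. The function name sum_numbers and the sample lines are illustrative. Note that removing every non-digit also drops minus signs and decimal points, so this only suits files of plain non-negative integers.

```python
import re


def sum_numbers(lines):
    """Sum the digits found on each line, skipping lines without any digits."""
    total = 0
    for line in lines:
        digits = re.sub(r"\D", "", line)  # keep only the digit characters
        if digits:  # blank or purely non-numeric lines leave an empty string
            total += int(digits)
    return total


# Made-up lines: two clean numbers, a blank line and a noisy value
print(sum_numbers(["100\n", "\n", "200\n", "jkfads1000ki\n"]))  # 1300
```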

Resources