How do I write a Python program that computes the average from a .dat file? - statistics

I have this so far but I don't know how to write over the .dat file:
def main():
    fname = input("Enter filename:")
    infile = open(fname, "r")
    data = infile.read()
    print(data)
    for line in infile.readlines():
        score = int(line)
        counts[score] = counts[score]+1
    infile.close()
    total=0
    for c in enumerate(counts):
        total = total + i*c
    average = float(total)/float(sum(counts))
    print(average)
main()
Here is my .dat file:
4
3
5
6
7
My statistics professor expects us to learn Python to compute the mean and standard deviation. All I need to know is how to do the mean and then I've got the rest figured out. I want to know how Python writes over each line in a .dat file. Could someone tell me how to fix this code? I've never done programming before.

To answer your question, as I understand it, in three parts:
How to read the file in
In your example you use
infile.read()
which reads the entire contents of the file into a string and takes you to the end of file. Therefore the following
infile.readlines()
will read nothing more. You should omit the first read().
How to compute the mean
There are many ways to do this in Python, some more elegant than others, and the best one depends on exactly what the problem is. In the simplest case you can sum and count the values as you go, then divide the sum by the count at the end to get the result:
infile = open("d.dat", "r")
total = 0.0
count = 0
for line in infile:
    print("reading in line: ", line)
    try:
        line_value = float(line)
        total += line_value
        count += 1
        print("value = ", line_value, "running total =", total, "valid lines read = ", count)
    except ValueError:
        pass  # skip non-numeric lines or characters
infile.close()
if count > 0:
    print("mean =", total / count)
The try/except part is just in case you have lines or characters in the file that can't be turned into floats; these will be skipped.
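Since the stated goal is the mean and the standard deviation, the same reading loop can be wrapped in a function and the arithmetic handed to the standard library's statistics module. This is a minimal sketch, assuming one number per line as in the sample file; the names read_values and summarize are made up for illustration.

```python
import statistics


def read_values(path):
    """Return the numeric lines of a file as a list of floats."""
    values = []
    with open(path, "r") as infile:
        for line in infile:
            try:
                values.append(float(line))
            except ValueError:
                # Skip blank or non-numeric lines.
                continue
    return values


def summarize(path):
    """Return (mean, sample standard deviation) of the values in a file."""
    values = read_values(path)
    # stdev() is the sample standard deviation (divides by n - 1);
    # use statistics.pstdev() for the population version.
    return statistics.mean(values), statistics.stdev(values)
```

For the five values in the question (4, 3, 5, 6, 7), summarize would give a mean of 5.0. Note that stdev() needs at least two values.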
How to write to the .dat file
Finally, you seem to be asking how to write the result back out to the d.dat file. I am not sure you really need to do this; it should be acceptable to just display the result as in the code above. However, if you do need to write it back to the same file, close it after reading from it, reopen it for writing (in 'append' mode, so output goes to the end of the file), and output the result using write().
outfile = open("d.dat","a")
outfile.write("\naverage = final total / number of data points = " + str(total/count)+"\n")
outfile.close()

fname = input("Enter filename:")
infile = open(fname, "r")
data = infile.readline() #Reads first line
print(data)
data = infile.readline() #Reads second line
print(data)
You can put this in a loop.
Also, these values will come in as strings; convert them to floats using float(data) each time.
Also, the guys over at StackOverflow are not as bad at math as you think. This could have easily been answered there. (And maybe in a better fashion)
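The readline() approach from this answer can be put in a loop along the following lines. This is only a sketch: the function name read_floats is hypothetical, and it assumes every non-blank line holds a single number.

```python
def read_floats(path):
    """Read one float per line using readline() in a loop."""
    values = []
    with open(path, "r") as infile:
        while True:
            data = infile.readline()
            if data == "":
                # readline() returns an empty string only at end of file.
                break
            if data.strip():
                # Blank lines are skipped; everything else is converted.
                values.append(float(data))
    return values
```

A blank line in the middle of the file comes back as "\n", not "", which is why the end-of-file test compares against the empty string.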

Related

How to iterate and find specific characters in a file using Python?

I have an exercise related to file handling in Python which I am struggling with a lot. I have tried many times but keep failing to create the program. These are the 2 questions.
1) Write a program that takes as input a phrase and a path to a text file on the computer. The program should return either true if the phrase is present in the file or false if the phrase is not present in the file.
2) Write a program that reads a csv file, where each line contains a comma-separated list of numbers and writes the sum of each line in another file. Use try-catch to handle potential errors such as empty lines or non-numeric values.
Can you please help me figure out these 2 questions? Thanks in advance.
For question 1 I wrote:
phrase = input("Enter a phrase: ")
file = open("example.txt", "r")
if phrase in file:
    print("True")
else:
    print("False")
For question 2 I wrote:
file = open("example.csv", "r")
print(f.readlines())
try:
    for i in file:
        line = line + 1
        file2 = open("lines.csv", "w")
        file2.write(line)
        file2.close
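One possible way to approach both exercises is sketched below. The function names are made up for illustration, and the sketch assumes the phrase should match anywhere in the file's text. In the first attempt above, `phrase in file` compares the phrase against whole lines (including their trailing newline), so it only succeeds when a line equals the phrase exactly; reading the text with read() and testing the resulting string avoids that.

```python
def phrase_in_file(phrase, path):
    """Return True if phrase occurs anywhere in the file's text."""
    with open(path, "r") as f:
        return phrase in f.read()


def write_line_sums(in_path, out_path):
    """Write the sum of each comma-separated line of in_path to out_path."""
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        for line in src:
            try:
                total = sum(float(field) for field in line.split(","))
            except ValueError:
                # Empty lines and non-numeric fields end up here and are skipped.
                continue
            dst.write(str(total) + "\n")
```

Whether a bad line should be skipped, reported, or written as an error marker is a design choice the exercise leaves open; this sketch skips it.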

Updating values in an external file only works if I restart the shell window

Hi there and thank you in advance for your response! I'm very new to Python, so please keep that in mind as you read through this, thanks!
So I've been working on some code for a very basic game using Python (just for practice). I've written a function that opens another file, selects a variable from it, and adjusts that variable by an amount or, if it's a string, changes it into another string. The function looks like this:
def ovr(file, target, change):
    with open(file, "r+") as open_file:
        opened = open_file.readlines()
        open_file.close()
    with open(file, "w+") as open_file:
        position = []
        for appended_list, element in enumerate(opened):
            if target in element:
                position.append(appended_list)
        if type(change) == int:
            opened[position[0]] = (str(target)) + (" = ") + (str(change)) + (str("\n"))
            open_file.writelines(opened)
            open_file.close()
        else:
            opened[position[0]] = (str(target)) + (" = ") + ("'") + (str(change)) + ("'") + (str("\n"))
            open_file.writelines(opened)
            open_file.close()
for loop in range(5):
    ovr(file = "test.py", target = "gold", change = gold + 1)
At the end I have a basic loop that should rewrite my file 5 times, each time increasing the amount of gold by 1. If I call this ovr() function outside of the loop and just run the program over and over, it works just fine, increasing the number in the external file by 1 each time.
Edit: I should mention that as it stands, if I run this loop the value of gold increases by 1. If I close the shell and rerun the loop, it increases by 1 again, becoming 2. If I change the loop to run any number of times, it only ever increases the value of gold by 1.
Edit 2: I found a truly horrific way of fixing this issue. If anyone has a better way, for the love of god please let me know. Code below.
for loop in range(3):
    ovr(file = "test.py", target = "gold", change = test.gold + 1)
    reload(test)
    sleep(1)
    print(test.gold)
The sleep part is because it takes longer to rewrite the file than it does to run the full loop.
You can go for a workaround and write your new information into a second file called file1.txt.
That way you can keep your working loop outside of the function that writes the file. After running your loop, you can change the content of your file with the following steps.
This way you don't need to rewrite your loop and can still change your file content.
First step:
with open('file.txt', 'r') as input_file, open('file1.txt', 'w') as output_file:
    for line in input_file:
        output_file.write(line)
Second step:
old_value = 0  # placeholder: the value currently stored in the file
new_value = 1  # placeholder: the value to store instead
with open('file1.txt', 'r') as input_file, open('file.txt', 'w') as output_file:
    for line in input_file:
        # 'gold = ' follows the "target = value" format written by ovr() in the question
        if line.strip() == 'gold = ' + str(old_value):
            output_file.write('gold = ' + str(new_value) + '\n')
        else:
            output_file.write(line)
Then you have updated your text file.
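A likely reason the loop in the question only ever adds 1 is that `gold + 1` is computed from the value Python already holds in memory; rewriting test.py on disk does not change a name that was imported earlier, so every pass writes the same number. One way around both reload() and sleep() is to read the current value back from the file on each pass. This is a sketch under the assumption that the file stores an integer on a line of the form `gold = 0`; the helper names are hypothetical.

```python
def read_value(path, target):
    """Return the integer assigned to target on a 'target = value' line."""
    with open(path, "r") as f:
        for line in f:
            name, sep, value = line.partition("=")
            if sep and name.strip() == target:
                return int(value)
    raise KeyError(target)


def write_value(path, target, value):
    """Rewrite the 'target = ...' line with a new value, keeping other lines."""
    with open(path, "r") as f:
        lines = f.readlines()
    with open(path, "w") as f:
        for line in lines:
            name, sep, _ = line.partition("=")
            if sep and name.strip() == target:
                # !r writes ints bare and strings in quotes, as ovr() does.
                f.write(f"{target} = {value!r}\n")
            else:
                f.write(line)


def increment(path, target, times):
    """Add 1 to the stored value `times` times, re-reading the file each pass."""
    for _ in range(times):
        write_value(path, target, read_value(path, target) + 1)
```

Because each `with` block closes the file before the next read, no delay is needed between passes.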

Replacing a float number in txt file

Firstly, I would like to say that I am newbie in Python.
I will try to explain my problem as best as I can.
The main aim of the code is to be able to read, modify and copy a txt file.
In order to do that I would like to split the problem up in three different steps.
1 - Copy the first N lines into a new txt file (CopyFile), exactly as they are in the original file (OrigFile)
2 - Access a specific line where I want to replace a float number with another one. I want to append this line to CopyFile.
3 - Copy the rest of OrigFile, from the line after the one in point 2 to the end of the file.
At the moment I have been able to do step 1 with the following code:
with open("OrigFile.txt") as myfile:
    head = [next(myfile) for x in range(10)] #read first 10 lines of txt file
copy = open("CopyFile.txt", "w") #create a txt file named CopyFile.txt
copy.write("".join(head)) #convert list into str
copy.close() #close txt file
For the second step, my idea is to go directly to the line I am interested in and find the float number I would like to change. Code:
import linecache
import re
line11 = linecache.getline("OrigFile.txt", 11) #opening and accessing directly to line 11
FltNmb = re.findall(r"\d+\.\d+", line11) #regular expressions to identify float numbers
My problem comes when I need to replace FltNmb with a new number, given that I have to change it inside line11. How could I achieve that?
Open both files and write each line sequentially while incrementing a line counter.
Add a condition for line 11 that replaces the float number; the rest of the lines are written without modification:
import re

with open("CopyFile.txt", "w") as newfile:
    with open("OrigFile.txt") as myfile:
        linecounter = 1
        for line in myfile:
            if linecounter == 11:
                # "^" anchors the match to the start of the line; drop it if the number appears elsewhere
                newline = re.sub(r"^(\d+\.\d+)", "<new number>", line)
                newfile.write(newline)
            else:
                newfile.write(line)
            linecounter += 1
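The same idea can be written with enumerate() keeping the line count, so that all three steps from the question happen in one pass. This is a sketch with a hypothetical function name; it reuses the float pattern from the question and assumes that only the first float on the chosen line should change.

```python
import re

# Same pattern as in the question: digits, a dot, digits.
FLOAT_PATTERN = re.compile(r"\d+\.\d+")


def copy_with_replacement(src_path, dst_path, line_number, new_value):
    """Copy src to dst, replacing the first float on one 1-based line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for number, line in enumerate(src, start=1):
            if number == line_number:
                # count=1 limits the substitution to the first match.
                line = FLOAT_PATTERN.sub(str(new_value), line, count=1)
            dst.write(line)
```

For the case in the question this would be called as copy_with_replacement("OrigFile.txt", "CopyFile.txt", 11, new_number), where new_number is whatever float should be written.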

How can I expand List capacity in Python?

read = open('700kLine.txt')
# use readline() to read the first line
line = read.readline()
aList = []
for line in read:
    try:
        num = int(line.strip())
        aList.append(num)
    except:
        print ("Not a number in line " + line)
read.close()
print(aList)
There are 700k lines in that file (every line holds a number of at most 2 digits).
I can only get ~280k of those lines into aList.
So, how can I expand aList's capacity from 280k to 700k or more? (Is there a different solution for this case?)
Hello, I just solved that problem. Thanks for all your help. That was an obvious buffer problem.
The solution is just to increase the size of the output buffer.
See: Increase output buffer when running or debugging in PyCharm
Please try this.
filename = '700kLine.txt'
with open(filename) as f:
    data = f.readlines()
print(data)
print(type(data)) #stores the data in a list
Yes, you can.
Once a list is defined, you can add, edit or delete its elements. To add more elements at the end, use the append function:
MyList.append(data)
Where MyList is the name of the list and data is the element you want to add.
I tried to re-create your problem:
# creating 700kLine file
with open('700kLine.txt', 'w') as f:
    for i in range(700000):
        f.write(str(i+1) + '\n')
# creating list from file entries
aList = []
with open('700kLine.txt', 'r') as f:
    for line in f:
        num = int(line.strip())
        aList.append(num)
# print(aList)
print(aList[:30])
Jupyter notebook throws an error while printing all 700K lines because too much memory is used. If you really want to print all 700k values, run the Python script from a terminal.
Could it be that your computer ran out of memory while processing the file? I tried an infinite loop that keeps appending a number to a list and ended up with a len(list) of about 47 million (47119572); the code I used to test is below.
I tried this code on an online REPL and it reached a significantly lower len(list).
list = []
while True:
    try:
        if len(list) > 0:
            list.append(list[-1] + 1)
        else:
            list.append(1)
    except MemoryError:
        print("memory error, last count is: ", list[-1])
        raise MemoryError
Maybe try saving bits of data read instead of reading the whole file at once?
Just my assumption.
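If memory or console output really were the limit, the suggestion above of handling the data piece by piece could look roughly like this. It is a sketch with a made-up function name: instead of keeping and printing 700k entries, it keeps a running tally per distinct number, which stays small when every line holds at most two digits.

```python
from collections import Counter


def tally_numbers(path):
    """Count how often each integer occurs, reading one line at a time."""
    tally = Counter()
    skipped = 0
    with open(path) as f:
        for line in f:
            try:
                tally[int(line.strip())] += 1
            except ValueError:
                # Count lines that are not integers instead of printing them.
                skipped += 1
    return tally, skipped
```

Checking sum(tally.values()) against the expected line count is a quick way to confirm that every line was read, without printing the whole list.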

Python 3.6.1: Code does not execute after a for loop

I've been learning Python and I wanted to write a script to count the number of characters in a text and calculate their relative frequencies. But first, I wanted to know the length of the file. My intention is that, while the script goes from line to line counting all the characters, it would print the current line and the total number of lines, so I could know how much it is going to take.
I executed a simple for loop to count the number of lines, and then another for loop to count the characters and put them in a dictionary. However, when I run the script with the first for loop, it stops early. It doesn't even go into the second for loop as far as I know. If I remove this loop, the rest of the code goes on fine. What is causing this?
Excuse my code. It's rudimentary, but I'm proud of it.
My code:
import string
fname = input ('Enter a file name: ')
try:
    fhand = open(fname)
except:
    print ('Cannot open file.')
    quit()
#Problematic bit. If this part is present, the script ends abruptly.
#filelength = 0
#for lines in fhand:
#    filelength = filelength + 1
counts = dict()
currentline = 1
for line in fhand:
    if len(line) == 0: continue
    line = line.translate(str.maketrans('','',string.punctuation))
    line = line.translate(str.maketrans('','',string.digits))
    line = line.translate(str.maketrans('','',string.whitespace))
    line = line.translate(str.maketrans('','',""" '"’‘“” """))
    line = line.lower()
    index = 0
    while index < len(line):
        if line[index] not in counts:
            counts[line[index]] = 1
        else:
            counts[line[index]] += 1
        index += 1
    print('Currently at line: ', currentline, 'of', filelength)
    currentline += 1
listtosort = list()
totalcount = 0
for (char, number) in list(counts.items()):
    listtosort.append((number,char))
    totalcount = totalcount + number
listtosort.sort(reverse=True)
for (number, char) in listtosort:
    frequency = number/totalcount*100
    print ('Character: %s, count: %d, Frequency: %g' % (char, number, frequency))
It looks fine the way you are doing it; however, to reproduce your problem I downloaded and saved a Gutenberg text book. It's a Unicode issue. There are two ways to resolve it: open the file in binary mode or specify the encoding. As it's text, I'd go with the utf-8 option.
I'd also suggest you structure it differently; below is a basic structure that makes sure the file is closed after it has been opened.
filename = "GutenbergBook.txt"
try:
    #fhand = open(filename, 'rb')
    #open read only and utf-8 encoding
    fhand = open(filename, 'r', encoding = 'utf-8')
except IOError:
    print("couldn't find the file")
else:
    try:
        for line in fhand:
            #put your code here
            print(line)
    except (OSError, UnicodeDecodeError):
        # a decoding problem surfaces here, while the lines are being read
        print("Error reading the file")
    finally:
        fhand.close()
For the OP, this is a specific case. However, for visitors: if the code below your for statement does not execute, it is not a Python built-in issue; the most likely cause is exception handling in a parent caller.
If your iteration is inside a function that is called inside a try/except block of the caller, then any error that occurs during the loop will propagate out of the loop and be caught there, so the code after the loop is skipped.
This issue can be hard to find, especially when you are dealing with intricate architecture.
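Separately from the encoding and exception points above, one more thing is worth checking in the question's script: a file object can only be iterated once per pass. The commented-out counting loop runs fhand to the end, so a later `for line in fhand` finds nothing left to read unless the file is rewound or reopened. A minimal sketch, with a hypothetical helper name:

```python
def count_lines_then_rewind(fhand):
    """Count the lines in an open text file, then rewind it for a second pass."""
    filelength = 0
    for _ in fhand:
        filelength += 1
    # The loop above leaves the file position at the end; without this
    # seek, a second "for line in fhand" would see no lines at all.
    fhand.seek(0)
    return filelength
```

In the script this would replace the commented-out block, for example `filelength = count_lines_then_rewind(fhand)`, after which the character-counting loop can run over the same file object.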

Resources