Count the number of characters in a file - python-3.x

The question:
Write a function file_size(filename) that returns a count of the number of characters in the file whose name is given as a parameter. You may assume that when being tested in this CodeRunner question your function will never be called with a non-existent filename.
For example, if data.txt is a file containing just the following line: Hi there!
A call to file_size('data.txt') should return the value 10. This includes the newline character that will be added to the line when you're creating the file (be sure to hit the 'Enter' key at the end of each line).
What I have tried:
def file_size(data):
    """Count the number of characters in a file"""
    infile = open('data.txt')
    data = infile.read()
    infile.close()
    return len(data)
print(file_size('data.txt'))
# data.txt contains 'Hi there!' followed by a newline character.
I get the correct answer for this file, but I fail a test that uses a longer file, which should have a character count of 81; I still get 10. I want the code to return the correct size for any file.
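The attempt above ignores its parameter and always opens 'data.txt', which would explain why every test returns 10. A minimal sketch of a fix, assuming the files are plain text readable with the default encoding, opens the name that was passed in:

```python
def file_size(filename):
    """Return the number of characters in the file named filename."""
    # Open the file that was passed in, not a hard-coded name.
    with open(filename) as infile:
        return len(infile.read())
```

With this version, file_size('data.txt') still returns 10 for the one-line example, and any other filename is counted on its own contents.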

Related

file reading in python using different methods

# open file in read mode
f=open(text_file,'r')
# iterate over the file object
for line in f.read():
    print(line)
# close the file
f.close()
The content of the file is "Congratulations you have successfully opened the file"! When I run this code, the output comes out in the following form:
c (newline) o (newline) n (newline) g.................
...... that is, each character is printed individually on a new line because I used read(). But with readline it gives the answer on a single line. Why is that?
f.read() returns one string with all characters (the full file content).
Iterating a string iterates it character wise.
Use
for line in f: # no read()
instead to iterate line wise.
f.read() returns the whole file as a string. A for ... in loop iterates over something; for a string, it iterates over its characters.
With readline(), it should not print the whole line at once either. It would read the first line of the file, then print it character by character, like read(). Is it possible that you used readlines(), which returns the lines as a list?
One more thing: the with statement takes a "closable" object and closes it automatically at the end of the block. You can also iterate over a file object directly. So your code can be improved like this:
with open(text_file, 'r') as f:
    for i in f:
        print(i)
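The three behaviours discussed in these answers can be compared side by side. The sketch below uses a throwaway temporary file with made-up contents, not the file from the question:

```python
import os
import tempfile

# Hypothetical two-line sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
path = tmp.name

with open(path) as f:
    chars = [c for c in f.read()]      # iterating a string yields characters
with open(path) as f:
    lines = [line for line in f]       # iterating the file yields lines
with open(path) as f:
    as_list = f.readlines()            # readlines() returns a list of lines

os.remove(path)

print(chars[:5])         # ['f', 'i', 'r', 's', 't']
print(lines)             # ['first line\n', 'second line\n']
print(as_list == lines)  # True
```

Each line keeps its trailing newline, so print(line, end='') avoids the blank lines that plain print(line) would put between them.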

How to find a line which contains a string without any suffix and prefix in a string?

I tried to find a solution on different platforms but couldn't, so I am asking here.
I am reading a file to find the line that contains a specific string (user input). The problem is that my code picks up every line that contains it. For example:
Here the user input is: "Mon_ErrEntryEspSqPlaus"
Output line:
/begin MEASUREMENT Icsp_Dem_Deb_LfEve_Mon_ErrEntryEspSqPlaus
Here the matched string has extra text attached to it, which is not what I want.
I want to read only the line below:
941 "Mon_ErrEntryEspSqPlaus"
In this line the user input string has no suffix or prefix.
Here is the code:
import re
def a2l_reader(parameter):
    count = 0
    count_1 = 0
    with open("TPT.a2l", errors = 'replace') as myfile:
        for num, line in enumerate(myfile,1):
            if parameter in line:
                if re.match(r'sample', line):
                    count += 1
                else:
                    count_1 += 1
    print(count)
    print(count_1)
The question is how to find the specific line that contains the string without any suffix or prefix, since I need the number associated with that string.
Thanks in advance.
Instead of
if parameter in line:
you can simply do
if parameter == line:
and it will only proceed if the whole line matches the input exactly. The first example (the one in your code) matches whenever the input occurs as a substring of the line.
If the string is only one token on the line, you can instead split the line on spaces and check for membership using in:
Split by spaces and then check in the list:
if parameter in re.split("( )",line):
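Another option, offered as a sketch rather than the answer's own method, is a regular expression with word boundaries. Because the underscore counts as a word character, \b does not match inside Icsp_Dem_Deb_LfEve_Mon_ErrEntryEspSqPlaus, but it does match next to the double quotes in the wanted line (a plain token comparison would need those quotes and the trailing newline stripped first). The helper name has_exact_name is made up for illustration.

```python
import re

def has_exact_name(parameter, line):
    """Return True if parameter occurs in line as a whole word."""
    # re.escape guards against regex metacharacters in the user input.
    pattern = r'\b' + re.escape(parameter) + r'\b'
    return re.search(pattern, line) is not None

name = 'Mon_ErrEntryEspSqPlaus'
print(has_exact_name(name, '/begin MEASUREMENT Icsp_Dem_Deb_LfEve_Mon_ErrEntryEspSqPlaus'))  # False
print(has_exact_name(name, '941 "Mon_ErrEntryEspSqPlaus"'))  # True
```

This assumes the searched name starts and ends with a letter, digit, or underscore; otherwise \b would not behave as described.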

Search for numbers after a certain string in an output file?

I have an output file with a load of information in and I want to read a number value that appears after a specific word.
In my file, I have a line such as
"Final energy, E = -82137.1098 eV"
What I would like to do is search my file for the string 'Final energy' and then read and store the number value.
So far I have managed to search the file for 'Final energy' and print the entire line containing that string but I can't seem to find a way to then read the number.
So far my code goes like this
energystring = 'Final energy'
with open(filename, 'r') as file:
    for line in file:
        if energystring in line:
            energyline = line
            print(energyline)
Thank you for any help you can give.
You just need to parse the number out of the string then. You can split the string on whitespace to get all the words, try to cast each word to a float, and get the one that works. Since there's only one number in the string, whatever successfully casts to float is your energy number.
def get_energy_level(line):
    for word in line.split():
        try:
            return float(word)
        except ValueError:
            pass
with open(filename, 'r') as file:
    for line in file:
        if energystring in line:
            energy_level = get_energy_level(line)
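If the line could ever contain other tokens that float() accepts (for example the words nan or inf), a regular expression anchored on the "E =" label is a slightly stricter alternative. This is only a sketch, assuming the line always has the form shown in the question; the names ENERGY_RE and parse_final_energy are hypothetical.

```python
import re

# Matches "Final energy, E = <number>"; the number may be signed and
# may carry a decimal part or an exponent.
ENERGY_RE = re.compile(
    r'Final energy,\s*E\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)'
)

def parse_final_energy(line):
    """Return the energy as a float, or None if the line does not match."""
    match = ENERGY_RE.search(line)
    return float(match.group(1)) if match else None

print(parse_final_energy('Final energy, E = -82137.1098 eV'))  # -82137.1098
```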

Python3 - Problem during removing a line from a text file

I am trying to delete a line from a text file after opening it and without storing it in any list variable using f.readlines() or anything like that.
I don't have the option of reading the file's contents into a variable, changing them, and writing them back to the same file or to another one. The file is constantly being appended to by another program, so I cannot do anything of that kind.
I am using f.seek() to reset the pointer to the beginning of the file, and f.readline() together with f.tell() to find the length of the first line. After that I try to replace each character with a blank space using a while loop.
pos=0
eol = 0
ll=0
with open('file1.txt','rb+') as f:
    f.seek(pos,1) #position at the beginning of the file
    print(f.readline()) #reading the first line
    pos = f.tell() #storing the length of first line
#the while loop will run from 0 to pos and replace every character with blank space
while eol != pos:
    with open('file1.txt','rb+') as f:
        f.seek(eol,1)
        f.write(b' ')
    eol += 1 #incrementing the eol variable to move the file pointer to next character
The code works, but with one problem that I can't figure out.
For example, if this is the original file
file1.txt
this is line 1
this is line 2
this is line 3
then after running the program, my output is
this is line 2
this is line 3
The first line is deleted, but there is a bunch of white space in front of the second line.
Maybe I am missing some simple logic here.
Any help will be appreciated.
Thank you
Update:
If I have understood correctly, I changed the code as follows, writing b'\r' (a carriage return) instead of b' ', which resulted in this:
the code:
while eol != pos-1:
    with open('file1.txt','rb+') as f:
        f.seek(eol,0)
        f.write(b'\r')
    eol += 1
the result:
original:
this is line 1
this is line 2
this is line 3
after execution:
this is line 2
this is line 3
As you can see, the first line is removed, but '\r' characters remain in its place.
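Both attempts overwrite bytes in place, and overwriting never shortens a file: the bytes of the first line are still there, only with different values, which is why padding shows up where the line used to be. The sketch below illustrates this on a throwaway temporary file (not the real file1.txt). It demonstrates the effect only and is not a way to remove the line.

```python
import os
import tempfile

# Throwaway file with the same three lines as the example.
with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
    tmp.write(b'this is line 1\nthis is line 2\nthis is line 3\n')
path = tmp.name

size_before = os.path.getsize(path)

with open(path, 'rb+') as f:
    first = f.readline()               # b'this is line 1\n'
    f.seek(0)                          # back to the start of the file
    f.write(b' ' * (len(first) - 1))   # blank the text, keep the newline

size_after = os.path.getsize(path)
with open(path, 'rb') as f:
    content = f.read()
os.remove(path)

print(size_before == size_after)               # True: no bytes were removed
print(content.startswith(b' ' * 14 + b'\n'))   # True: line 1 is now 14 spaces
```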

Best way to fix inconsistent csv file in python

I have a csv file which is not consistent. It looks like the sample below: some rows have a middle name and some do not. I don't know the best way to fix this. The middle name will always be in the second position if it exists, but if there is no middle name the last name is in the second position.
john,doe,52,florida
jane,mary,doe,55,texas
fred,johnson,23,maine
wally,mark,david,44,florida
Let's say that you have ① wrong.csv and want to produce ② fixed.csv.
You want to read a line from ①, fix it and write the fixed line to ②, this can be done like this
with open('wrong.csv') as input, open('fixed.csv', 'w') as output:
    for line in input:
        line = fix(line)
        output.write(line)
Now we want to define the fix function...
Each line has either 4 or 5 fields, separated by commas, so we split the line using the comma as a delimiter, return the line unmodified if the number of fields is 4, and otherwise join field 0 and field 1 (Python counts from zero...), reassemble the output line and return it to the caller.
def fix(line):
    items = line.split(',') # items is a list of strings
    if len(items) == 4: # the line is OK as it stands
        return line
    # join first and middle name
    first_middle = ' '.join((items[0], items[1]))
    # we want to return a "fixed" line,
    # i.e., a string not a list of strings
    # we have to join the new name with the remaining info
    return ','.join([first_middle]+items[2:])
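The same idea can be sketched with the standard csv module, which takes care of quoting and line endings. This version assumes, as in the sample data, that a row has 4 fields without a middle name and 5 with one; fix_row is a hypothetical helper name, and the in-memory text stands in for wrong.csv.

```python
import csv
import io

def fix_row(row):
    """Merge first and middle name when the row has a middle name."""
    if len(row) == 5:                       # first, middle, last, age, state
        return [row[0] + ' ' + row[1]] + row[2:]
    return row                              # already first, last, age, state

# In-memory stand-in for wrong.csv, using two rows from the question.
wrong = io.StringIO(
    'john,doe,52,florida\n'
    'jane,mary,doe,55,texas\n'
)
fixed = [fix_row(row) for row in csv.reader(wrong)]
print(fixed)
# [['john', 'doe', '52', 'florida'], ['jane mary', 'doe', '55', 'texas']]
```

To write the result to a file, csv.writer(...).writerows(fixed) on a file opened with newline='' would be the usual counterpart.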
