file reading in python, need help for homework - python-3.x

Write a function func(infilepath) that reads the file whose file path is infilepath, and prints the number of times each character (excluding newline characters) appears in the file, in sorted order of the characters.
Any help would be greatly appreciated!

This won't be the exact answer, but enough to get you started!
First, open a file:
f = open("file.txt", "r")
Then read lines
lines = f.readlines()
Define a dictionary. Split each line by spaces, then increment the dictionary entry by one if the item is already present in the dictionary, else initialize it to 1.
chars = {}
lines = [line.strip() for line in lines]
for line in lines:
    line = line.split(" ")
    for i in line:
        if i not in chars:
            # the first occurrence counts as 1, not 0
            chars[i] = 1
        else:
            chars[i] += 1
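If you want to count individual characters rather than space-separated pieces, here is a minimal sketch of one way to do it. It assumes the file is plain text in the default encoding. The helper name count_characters is mine, not part of the assignment, and collections.Counter is swapped in for the hand-written dictionary.

```python
from collections import Counter


def count_characters(infilepath):
    """Return a Counter of every character in the file except newlines."""
    counts = Counter()
    with open(infilepath, "r") as infile:
        for line in infile:
            # Counter.update() on a string counts each character in it
            counts.update(line.replace("\n", ""))
    return counts


def func(infilepath):
    """Print each character and its count, in sorted order of the characters."""
    counts = count_characters(infilepath)
    for char in sorted(counts):
        # repr() keeps spaces and tabs visible in the output
        print(repr(char), counts[char])
```

Calling func("file.txt") would then print one line per distinct character. Splitting the counting from the printing is only a convenience so the counts can be checked separately.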
More about file handling: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/06-file-handling.md
More about sets/lists/dictionaries: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/04-list-set-dict.md
Some practical examples to get you thinking: https://github.com/thewhitetulip/build-app-with-python-antitextbook/blob/master/manuscript/13-examples.md

Related

extract words from a text file and print next line

sample input
in parsing a text file .txt = ["'blah.txt'", "'blah1.txt'", "'blah2.txt'" ]
the expected output in another text file out_path.txt
blah.txt
blah1.txt
blah2.txt
This is the code that I tried; it just appends "[]" to the input file. I also tried a Perl one-liner to replace the double and single quotes.
read_out_fh = open('out_path.txt',"r")
for line in read_out_fh:
    for word in line.split():
        curr_line = re.findall(r'"(\[^"]*)"', '\n')
        print(curr_line)
This happens because the content you read from a file is a string, not a list, even if the text is formatted like a list. That is why re.findall gives you []. (Your call also searches the literal string '\n' instead of the line, so it can never match.) Iterating with for line in read_in_fh: gives you each line as a plain string, which is why you are not getting the desired output. So I wrote something that first transforms the string into a list. While doing that I also removed the "" and '' as you mentioned, then wrote the result to a new file, example.txt.
Note: change the file names to match your files.
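# Open the output once, so it is not truncated again for every input line
with open('file.txt', "r") as read_out_fh, open("example.txt", "w") as output:
    for line in read_out_fh:
        # Drop the newline and the brackets, remove both kinds of quotes, then split into names
        words = line.strip().strip("[]").replace('"', '').replace("'", '').split(", ")
        for word in words:
            #print(word)
            output.write(word + '\n')
example.txt (output file):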
blah.txt
blah1.txt
blah2.txt
The code below works for the example you gave in the question:
# Content of textfile.txt:
asdasdasd=["'blah.txt'", "'blah1.txt'", "'blah2.txt'"]asdasdasd
# Code:
import re
read_in_fh = open('textfile.txt', "r")
write_out_fh = open('out_path.txt', "w")
for line in read_in_fh:
    # Capture everything between the square brackets
    find_list = re.findall(r'\[(".*?"*)\]', line)
    if not find_list:
        continue  # no bracketed list on this line
    for element in find_list[0].split(","):
        element_formatted = element.replace('"', '').replace("'", "").strip()
        write_out_fh.write(element_formatted + "\n")
read_in_fh.close()
write_out_fh.close()
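As an alternative to stripping quotes by hand, here is a rough sketch that lets Python parse the bracketed part with ast.literal_eval. This is a different technique from the answers above. It assumes each line holds at most one list, written as a valid Python list of strings. The function name extract_names is made up for illustration.

```python
import ast
import re


def extract_names(line):
    """Return the file names found in the first [...] list on the line."""
    match = re.search(r'\[.*?\]', line)
    if match is None:
        return []  # no bracketed list on this line
    # literal_eval turns the text '["a", "b"]' into a real list of strings;
    # it raises ValueError or SyntaxError if the text is not a valid literal
    items = ast.literal_eval(match.group(0))
    # Each item still carries its inner single quotes, so strip them
    return [str(item).strip("'\"") for item in items]
```

Each returned name can then be written to the output file followed by '\n', as in the answers above.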

Using a function to print the characters from a file?

So I have a text file, and I need to define a function to open the file, read through it, and then return and print the number of characters within the file.
So far I've got:
def num_chars_in_file(file):
    path = 'planets.txt'
    file_handle = open(path)
    for text in file_handle:
        file = file_handle.readlines()
    print(file)
print(f"\nProblem 1: {num_chars_in_file()}")
# I'm not sure where to go from here.
You could create a count variable to store the cumulative total of characters as you iterate over each line, something like this:
def num_chars_in_file():
    path = 'planets.txt'
    file_handle = open(path)
    count = 0
    for text in file_handle:
        count += len(text.rstrip())
    file_handle.close() # Make sure to close the file if you're not using with
    return count
print(f"\nProblem 1: {num_chars_in_file()}")
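For comparison, here is a sketch of the same idea using with, so the file is closed automatically. Taking the path as a parameter is my assumption. The original hard-codes 'planets.txt'. Note that rstrip("\n") removes only the newline, whereas the rstrip() above also drops trailing spaces and tabs.

```python
def num_chars_in_file(path):
    """Return the number of characters in the file, not counting newlines."""
    count = 0
    with open(path) as file_handle:
        for text in file_handle:
            # Strip only the line ending so trailing spaces are still counted
            count += len(text.rstrip("\n"))
    return count
```

You would call it as num_chars_in_file('planets.txt') and print the returned value.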
with open('my_words.txt') as infile:
    lines=0
    words=0
    characters=0
    for line in infile:
        wordslist=line.split()
        lines=lines+1
        words=words+len(wordslist)
        # counts non-whitespace characters only
        characters += sum(len(word) for word in wordslist)
print(lines)
print(words)
print(characters)
Try this to print the number of lines, words, and characters in the file.
Refer to this similar question for more details.

reading text line by line in python 3.6

I have a date.txt file that contains codes, for example:
1111111111111111
2222222222222222
3333333333333333
4444444444444444
I want to check each code on a website.
I tried:
with open('date.txt', 'r') as f:
    data = f.readlines()
    for line in data:
        words = line.split()
    send_keys(words)
But this only copies the last line.
I need a loop that checks line by line until every code has been checked.
Thanks for the help.
4 am is too late for my little brain.
==
Edit:
Solved:
while lines > 0:
    lines = lines - 1
    with open('date.txt', 'r') as f:
        data = f.readlines()
        words = data[lines]
        print(words)
Try this, I think it will work:
line_1 = file.readline()
line_2 = file.readline()
Repeat this for as many lines as you would like to read.
One thing to keep in mind: each line returned by readline() keeps its trailing newline, so if you print these lines there will be a blank line after each one unless you strip it first.
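A loop over the file object itself may be simpler than calling readline() repeatedly. Below is a small sketch that collects every non-empty code from the file, in order. The function name read_codes is hypothetical, and it assumes one code per line.

```python
def read_codes(path):
    """Return the non-empty lines of the file with surrounding whitespace removed."""
    codes = []
    with open(path, "r") as f:
        for line in f:          # the file object yields one line per iteration
            code = line.strip() # drop the trailing newline and any spaces
            if code:            # skip blank lines
                codes.append(code)
    return codes
```

You could then write for code in read_codes('date.txt'): and call your own send_keys(code) inside that loop, so it runs once per code instead of once at the end.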

Search text file for word from list then output word that matched in Python 3.x

I have been searching a large directory of text files for files that match a list of words. How do I have Python output the word from the list that matches?
This is what I have so far. It writes the file name every time one of the words from the list is found. I want to add the matching word to the line with the file name so I have the file name and 1 matched word each time. How do I do that?
ngwrds= ['words'...]
for filename in os.listdir(os.getcwd()):
    with open(filename, 'r') as searchfile:
        for line in searchfile:
            if any(x in line for x in ngwrds):
                with open("keyword.txt", 'a') as out:
                    out.write(filename + '\n')
The input is a long text file; a line might read like this:
The company reported depreciation of $1.20.
Then, if one of the search words from the list was depreciation, the output file would look like this:
filename depreciation
Thank you.
I am not sure what out is and I can't run your code from where I am, but you could try something like this:
ngwrds= ['words'...]
# out is assumed to be an already-open output file, as in the question
for filename in os.listdir(os.getcwd()):
    with open(filename, 'r') as searchfile:
        for line in searchfile:
            line = line.strip().split(" ")
            for word in line:
                if word in ngwrds:
                    # end each record with a newline so matches do not run together
                    out.write(filename + " " + word + '\n')
strip gets rid of whitespace on either end of the line. split returns a list of the words in the line.
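Here is a self-contained sketch of the same idea that returns (filename, word) pairs instead of writing them straight away. The function name find_matches and its parameters are made up for illustration. It also strips punctuation from each token, which is my addition, so that a token such as "depreciation." at the end of a sentence would still match.

```python
import os
import string


def find_matches(directory, keywords):
    """Return (filename, word) pairs for every keyword found in the directory's files."""
    wanted = set(keywords)  # set lookup is faster than scanning a list
    matches = []
    for filename in sorted(os.listdir(directory)):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        # errors="ignore" skips bytes that cannot be decoded as text
        with open(path, "r", errors="ignore") as searchfile:
            for line in searchfile:
                for token in line.split():
                    # remove leading and trailing punctuation such as "." or ","
                    word = token.strip(string.punctuation)
                    if word in wanted:
                        matches.append((filename, word))
    return matches
```

Each pair can then be written out as filename + " " + word + "\n". Writing the results to a file outside the searched directory avoids the output file being searched as well.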

python3 opening files and reading lines

Can you explain what is going on in this code? I don't seem to understand how you can open the file and read it line by line, instead of all of the sentences at the same time, in a for loop. Thanks.
Let's say I have these sentences in a document file:
cat:dog:mice
cat1:dog1:mice1
cat2:dog2:mice2
cat3:dog3:mice3
Here is the code:
from sys import argv
filename = input("Please enter the name of a file: ")
f = open(filename,'r')
d1ct = dict()
print("Number of times each animal visited each station:")
print("Animal Id Station 1 Station 2")
for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
    (AnimalId, Timestamp, StationId,) = line.split(':')
    key = (AnimalId,StationId,)
    if key not in d1ct:
        d1ct[key] = 0
    d1ct[key] += 1
The magic is at:
for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
Python file objects are special in that they can be iterated over in a for loop. On each iteration, it retrieves the next line of the file. Because it includes the last character in the line, which could be a newline, it's often useful to check and remove the last character.
As Moshe wrote, open file objects can be iterated. However, they are not of the file type in Python 3.x (as they were in Python 2.x). If the file object is opened in text mode, then the unit of iteration is one text line including the \n.
You can use line = line.rstrip() to remove the \n plus any trailing whitespace.
If you want to read the content of the file at once (into a multiline string), you can use content = f.read().
There is a minor bug in the code. The open file should always be closed. I mean you should call f.close() after the for loop. Or you can wrap the open in the newer with construct, which will close the file for you. I suggest getting used to the latter approach.
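To illustrate the with approach together with the line-by-line iteration, here is a rough sketch of the counting loop from the question. The function name count_visits is hypothetical, and it assumes every non-empty line has exactly three colon-separated fields, as in the sample data. collections.Counter replaces the manual "if key not in d1ct" check.

```python
from collections import Counter


def count_visits(lines):
    """Count (animal id, station id) pairs from an iterable of 'a:t:s' lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")  # remove the line ending, if any
        if not line:
            continue              # ignore blank lines
        animal_id, _timestamp, station_id = line.split(":")
        counts[(animal_id, station_id)] += 1
    return counts
```

Because an open file is itself an iterable of lines, you could use it as with open(filename, 'r') as f: counts = count_visits(f), and the file is closed as soon as the with block ends.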

Resources