Why doesn't the output of the "open" function have an index attribute? - python-3.x

I started to learn programming in Python 3 and I am doing a project that reads the content of a text file and tells you how many words are in the file. Being me, I always want to challenge myself, so I tried to add the name of the file to the output message so that in the future I can build a GUI for it and so on.
The error that I get is: AttributeError: '_io.TextIOWrapper' object has no attribute 'index'
Here is my code:
# Open text file
document = open("text2.txt", "r+")
# Reads the text file and splits it into arrays
text_split = document.read().split()
# Count the words
words = len(text_split)
# Display the counted words
document_name = document[document.index("name=")]
output = "In the file {} there are {} words.".format(document_name, words)
print (output)

I decided to take @Jean-François Fabre's advice and abandoned the idea of also printing the name of the file (for now).
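For reference, the object returned by open() is a file object, not a string or a list, so it supports neither indexing nor .index(). It does expose the path it was opened with through its name attribute. A minimal sketch, assuming a hypothetical helper called count_words that is not part of the original code:

```python
def count_words(path):
    """Return the file name and the number of whitespace-separated words."""
    # "with" closes the file automatically when the block ends
    with open(path, "r") as document:
        words = len(document.read().split())
        # document.name holds the path that was passed to open()
        return document.name, words
```

Calling count_words("text2.txt") returns the name and the count, which can then be passed to "In the file {} there are {} words.".format(name, words).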

Related

How to search a text file using input method

I have a .txt file that I want to search for specific words, or phrases. I want to be able to use an input to do this. Then I would like the file parsed for the input and printed. Basically something like this:
input("Search For:")I WANT TO ENTER MY SEARCH TERM HERE
print(I WANT TO PRINT WHAT I SEARCHED FOR ABOVE)
I am able to do this another way by creating a variable, and then just changing the variable name as needed, but this is not ideal for me. Any ideas on how to create an input to search my .txt?
word = 'Scrubbing'
# variable to store search term
with open(r'/Users/kev/PycharmProjects/find_text/common.txt', 'r') as fp:
    lines = fp.readlines()
    # read all lines in a list
    for line in lines:
        if line.find(word) != -1:
            # check if string present on a current line
            print(word, 'string exists in file')
            print('Line Number:', lines.index(line))
            print('Line:', line)
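One possible way to take the search term from the keyboard is to store the return value of input() in a variable and pass it to a search function. The sketch below assumes hypothetical names (find_lines, main) and a placeholder file name; it also uses enumerate for line numbers, since lines.index(line) reports the first matching position when the same line appears more than once:

```python
def find_lines(path, term):
    """Return (line_number, line) pairs for every line that contains term."""
    matches = []
    with open(path, "r") as fp:
        # enumerate numbers the lines from 1, even when lines are duplicated
        for number, line in enumerate(fp, start=1):
            if term in line:
                matches.append((number, line.rstrip("\n")))
    return matches


def main():
    # hypothetical file name; replace it with the path to your own .txt file
    path = "common.txt"
    # input() returns whatever the user types as a string
    term = input("Search For: ")
    for number, line in find_lines(path, term):
        print("Line Number:", number)
        print("Line:", line)
```

Calling main() prompts for the term and prints every matching line.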

Automating The Boring Stuff With Python - Chapter 8 - Exercise - Regex Search

I'm trying to complete the exercise for Chapter 8, which takes a user-supplied regular expression and uses it to search each string in each text file in a folder.
I keep getting the error:
AttributeError: 'NoneType' object has no attribute 'group'
The code is here:
import os, glob, re
os.chdir("C:\Automating The Boring Stuff With Python\Chapter 8 - \
Reading and Writing Files\Practice Projects\RegexSearchTextFiles")
userRegex = re.compile(input('Enter your Regex expression :'))
for textFile in glob.glob("*.txt"):
    currentFile = open(textFile) #open the text file and assign it to a file object
    textCurrentFile = currentFile.read() #read the contents of the text file and assign to a variable
    print(textCurrentFile)
    #print(type(textCurrentFile))
    searchedText = userRegex.search(textCurrentFile)
    searchedText.group()
When I try this individually in the IDLE shell it works:
>>> textCurrentFile = "What is life like for those left behind when the last foreign troops flew out of Afghanistan? Four people from cities and provinces around the country told the BBC they had lost basic freedoms and were struggling to survive."
>>> userRegex = re.compile(input('Enter the your Regex expression :'))
Enter the your Regex expression :troops
>>> searchedText = userRegex.search(textCurrentFile)
>>> searchedText.group()
'troops'
But I can't seem to make it work in the code when I run it. I'm really confused.
Thanks
Since you are looping over all .txt files, some of them may not contain the word "troops". To check this, don't call .group(); just run:
print(textFile, textCurrentFile, searchedText)
If you see that searchedText is None, the contents of textFile (that is, textCurrentFile) don't contain the word "troops", because search() returns None when nothing matches.
You could either:
Add the word troops to all .txt files.
Only select the target .txt files, not all of them.
Check whether a match was found before accessing .group():
print(searchedText.group() if searchedText else None)
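The third option can be sketched as follows. The function name first_matches and the in-memory sample texts are assumptions made for illustration; in the exercise the texts would come from the .txt files in the folder:

```python
import re


def first_matches(pattern, texts):
    """Map each name to its first regex match, or None when nothing matches."""
    regex = re.compile(pattern)
    results = {}
    for name, text in texts.items():
        match = regex.search(text)
        # search() returns None when there is no match, so test before .group()
        results[name] = match.group() if match else None
    return results


# hypothetical in-memory stand-ins for the contents of two .txt files
sample = {
    "a.txt": "the last foreign troops flew out",
    "b.txt": "no such word in this file",
}
print(first_matches("troops", sample))
```

Files without a match map to None instead of raising AttributeError.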

How to read many files have a specific format in python

I am a little confused about how to read all the lines in many files whose names run from "datalog.txt.98" to "datalog.txt.120".
This is my code:
import json
file = "datalog.txt."
i = 97
for line in file:
    i+=1
    f = open (line + str (i),'r')
    for row in f:
        print (row)
Here, you will find an example of one line in one of those files:
I really need your help.
I suggest using a loop to open the files one after another.
To better understand this project, I would recommend researching the following topics:
for loops,
String manipulation,
Opening a file and reading its content,
List manipulation,
String parsing.
This is one of my favourite beginner guides.
To generate the integers at the end of the file name, I would look into Python for loops.
I think this is what you are trying to do
# create a list to store all your file content
files_content = []
# the prefix is of type string
filename_prefix = "datalog.txt."
# loop from 0 to 13
for i in range(0,14):
    # make the filename variable with the prefix and
    # the integer i which you need to convert to a string type
    filename = filename_prefix + str(i)
    # open the file read all the lines to a variable
    with open(filename) as f:
        content = f.readlines()
    # append the file content to the files_content list
    files_content.append(content)
To get rid of white space from file parsing, add the missing line before the append:
content = [x.strip() for x in content]
files_content.append(content)
Here's an example of printing out files_content:
for file in files_content:
    print(file)
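The question asks for the suffixes 98 to 120, while the loop above runs from 0 to 13. A minimal sketch adapted to that range, assuming a hypothetical helper called read_numbered_files and assuming that missing files should be skipped without an error:

```python
import os


def read_numbered_files(prefix, first, last):
    """Read prefix+first .. prefix+last (inclusive) into a dict of line lists."""
    contents = {}
    # range() excludes its stop value, so add 1 to include the last number
    for i in range(first, last + 1):
        filename = prefix + str(i)
        if not os.path.exists(filename):
            # skipping missing files is an assumption; raise instead if preferred
            continue
        with open(filename) as f:
            # strip() removes the newline and surrounding whitespace of each line
            contents[filename] = [line.strip() for line in f]
    return contents
```

Calling read_numbered_files("datalog.txt.", 98, 120) returns a dictionary that maps each file name to its list of lines.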

Can't check the content of an email

I am trying to read the content of an mbox file and compare it with a list of words read from a different file. I believe the problem is that I am reading them wrong, since the output does not match what I expect, knowing the content of the files.
I have tried to read them both as rb and r with no luck. I then tried to put the txt file into a list, but the mbox file cannot be inserted into a list. As a further test, I tried to read the content of the email by using the get_payload() function, but it returns bytes that are not useful to me.
import mailbox

# Opening the file that contains the blacklisted words and printing it
with open("blacklist.txt",'r') as afile:
    buf=afile.read()
    print(buf)
# Opening the mbox files
mbox = mailbox.mbox('Andishe.mbox')
# To read the content of the mbox file when it has multiple messages
for message in mbox:
    if message.is_multipart():
        print ("from :",message['from'])
        print ("to :",message['to'])
        content = message.as_string()
        # print(content)
    else:
        print ("from :",message['from'])
        print ("to :",message['to'])
        content = message.as_string()
        # print(content)
# To check and see if the black listed words are inside the content of the email
for file in content:
    if file in buf:
        print("file contains blacklisted words" + file)
    else:
        print("file does not contain blacklisted words")
I would expect the results to be like this:
some black listed word
file contains blacklisted words + the black listed word
But I am stuck in a loop that keeps printing itself, the following is a part of what gets printed:
file contains blacklisted wordsr
file contains blacklisted wordso
file contains blacklisted wordsm
file contains blacklisted words
I have no idea what those r, o, m stand for or where they are coming from?
I have figured out where I was going wrong:
1- I was reading the content of the txt file wrong. I should have used this:
blacklist=[]
for line in afile:
    blacklist.append(line.strip('\n'))
This way, I get rid of the end-of-line character and keep each line as a single word.
2- I was also doing my for loop wrong, since I did not append the content of the mbox file. This fixed the issue:
content_string = ''.join(content)
content_string = content_string.lower()
for word in blacklist:
    if word.lower() in content_string:
        print("This black listed word exists in content : ",word)
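The stray r, o, m in the output come from iterating over a string: a for loop over a string yields one character at a time, and a single character is very likely to appear somewhere in the blacklist text. A minimal sketch of the word-based check, using hypothetical helper names (load_blacklist, find_blacklisted) and made-up sample data in place of blacklist.txt and the mbox message:

```python
def load_blacklist(lines):
    """Turn an iterable of lines into a list of lowercase, non-empty words."""
    return [line.strip().lower() for line in lines if line.strip()]


def find_blacklisted(blacklist, content):
    """Return the blacklisted words that occur anywhere in content."""
    content = content.lower()
    # iterate over the words, not over the characters of the message
    return [word for word in blacklist if word in content]


# hypothetical sample data standing in for blacklist.txt and one message body
blacklist = load_blacklist(["Lottery\n", "\n", "prize\n"])
message_text = "From: someone\nYou have won the LOTTERY!"
print(find_blacklisted(blacklist, message_text))
```

With a real mailbox, the string returned by message.as_string() for each message could be passed as content.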

How to convert Navigable String to File Object

I am trying to get some data from a website (using the modules named requests & BeautifulSoup) and print it in a text file but every time I try to do so, it says the following:
TypeError: descriptor 'write' requires a 'file' object but received a 'NavigableString'
I have tried using the csv library to import the data but since I couldn't add the line by line data to the csv, I decided to add all the output to a text file and then take out the data I require.
file_object = open("name-list.txt", "w") #Opening the file
name = soup.find(class_='table-responsive') #Extracting the data
name_list = name.find_all('td') #Refining the data
for final in name_list:
    all = final.contents[0] #Final result
    file.write(all) #This is where the Error Comes
file.close()
When I use print(all) in the for loop, I get the output that I need which consists of multi-line text including the names, age, gender, etc. of the people from the table on the website but when I try to print that output into the text file, the error pops up.
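The code opens the file as file_object but then calls file.write and file.close. The error message is consistent with the name file referring to the built-in file type and not to the opened file, so write receives the NavigableString where it expects a file instance. A minimal sketch of the likely fix, assuming a hypothetical helper called write_cells and plain strings standing in for the values of final.contents[0]:

```python
def write_cells(path, cells):
    """Write each cell's text to path, one cell per line."""
    # call write() on the opened file object, not on a type named "file"
    with open(path, "w") as file_object:
        for cell in cells:
            # str() turns a str subclass such as NavigableString into plain text
            file_object.write(str(cell) + "\n")
```

In the original loop this would amount to calling file_object.write(str(all)) and closing with file_object.close(); the name all also shadows a built-in function, so a different variable name may be preferable.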
