Using Python to parse through files for data - python-3.x

I have two files: a template file and a file that holds the values for the template's variables. I am trying to take the template file, fill in its variables with the values from the second file, and write the combined result to a third file. I am able to copy one file to another using the following snippet of code:
`
print("Enter the Name of Source File: ")
sFile = input()
print("Enter the Name of Target File: ")
tFile = input()
fileHandle = open(sFile, "r")
texts = fileHandle.readlines()
fileHandle.close()
fileHandle = open(tFile, "w")
for s in texts:
    fileHandle.write(s)
fileHandle.close()
print("\nFile Copied Successfully!")
`
However, I am not sure how to do this for two or more files and then combine them into one file. Any help or guidance is appreciated.

This is certainly not the most elegant solution, but I think it should work for you.
# You could add as many files to this list as you want.
list_of_files = []
count = 1
while True:
    print(f"Enter the Name of Source File {count} (Enter blank when done adding files): ")
    sFile = input()
    # If the input is not empty then add the filename to list_of_files.
    if sFile:
        list_of_files.append(sFile)
        count += 1
    else:
        break
print("Enter the Name of Target File: ")
tFile = input()
# "with open" opens the file and closes it automatically when the block ends.
with open(tFile, 'a+') as target:
    # This will loop over all the files in your list.
    for file in list_of_files:
        with open(file, 'r') as tmp:
            target.write('\n' + tmp.read())
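The question also asks about filling a template with values from a second file, which plain concatenation does not cover. The question does not show either file's format, so the following is only a minimal sketch under assumptions: the template uses `$name` style placeholders (the standard library's `string.Template` syntax) and the values file holds one `key = value` pair per line. The function names `parse_values` and `fill_template` are hypothetical.

```python
from string import Template


def parse_values(text):
    """Parse 'key = value' lines into a dict, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            values[key.strip()] = value.strip()
    return values


def fill_template(template_path, values_path, output_path):
    """Write the template to output_path with its $placeholders filled in."""
    with open(template_path, encoding="utf-8") as f:
        template = Template(f.read())
    with open(values_path, encoding="utf-8") as f:
        values = parse_values(f.read())
    # safe_substitute leaves unknown placeholders untouched instead of raising KeyError.
    result = template.safe_substitute(values)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(result)
    return result
```

With a template line such as `Dear $name` and a values line `name = Ann`, calling `fill_template("template.txt", "values.txt", "output.txt")` would write `Dear Ann` to the third file. If the real files use a different placeholder or value syntax, only `parse_values` and the `Template` call should need to change.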

Related

Trying to implement a user error message in python3

I have made a program that reads in a user expression and a file path, and then picks out each line from the user's file that contains the expression. My code is as follows:
# Necessary imports
import os
# Variables
userExpression = [] # Variable for user expression
userFile = [] # Variable for user file
fileLines = [] # Variable for lines of text in the users file
lineNum = 0 # Variable for keeping track of line numbers
userExpression = " " + input("Please enter the expression you wish to find: ") + " " # Read in and store user's expression
userFile = input("Enter the path of your file: ") # Read in and store file path of user's file
myFile = open(userFile) # Opening user file
print(" ") # Used to make output easier to read
print("HOORAY!! File found!")
print("File lines that include your expressions are found below: ")
print(" ") # Used to make output easier to read
# Store each line of text into a list
for line in myFile:
    lineNum += 1
    if line.lower().find(userExpression) != -1:
        fileLines.append("Line " + str(lineNum) + ": " + line.rstrip('\n'))
# Print out file text stored in list
for element in fileLines:
    print(element)
myFile.close()
The last thing I want to do is display an error message if the user inputs an incorrect file path. I'm new to Python, so honestly I'm not really sure where to even start.
You can solve this using a loop and a try/except block:
while True:
    userFile = input("Enter the path of your file: ") # Read in and store file path of user's file
    try:
        myFile = open(userFile) # Opening user file
        break
    except OSError:
        print("Invalid file path!")
What the code does:
1. Wait for the user to tell the program the file name.
2. Check if the file can be opened.
3. If the file can be opened, exit the loop.
4. If the file cannot be opened, warn the user and go back to step 1.
Edit: this solution ensures Python can actually access the file, not only that it exists.
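The retry loop described in the steps above can also report why the file could not be opened. The following is a minimal sketch, not the answerer's code: the function name `open_user_file` and its `ask` and `report` parameters are hypothetical and exist only so the prompt and the messages can be swapped out (for example in a test). The exception classes are Python 3 built-ins.

```python
def open_user_file(ask=input, report=print):
    """Keep asking for a path until a file can be opened for reading."""
    while True:
        path = ask("Enter the path of your file: ")
        try:
            return open(path, encoding="utf-8")
        except FileNotFoundError:
            report(f"No such file: {path!r}")
        except IsADirectoryError:
            report(f"That is a directory, not a file: {path!r}")
        except PermissionError:
            report(f"No permission to read: {path!r}")
        except OSError as err:
            # Any other operating-system level failure.
            report(f"Could not open {path!r}: {err}")
```

Calling `myFile = open_user_file()` would replace the `open(userFile)` line in the question. Catching the specific subclasses first and `OSError` last keeps unrelated errors, such as a typo in the program itself, from being hidden the way a bare `except:` would hide them.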
You can use the os.path module to check if a file exists and if it’s a regular file (as opposed to a directory, for instance). First you import the module:
from os import path
You use it like this:
if not path.exists(userFile):
    print("File does not exist")
And to check if it’s a regular file:
if not path.isfile(userFile):
    print("Not a regular file")
You can check more in this post.

How do I check whether a file already contains the text I want to append?

I am currently working on a project. So I want to read all the *.pdf files in a directory, extract their text and append it to a text file. So far so good. I was able to do this, yeah.
Now the problem: if I am reading the same directory again, it appends the same files again. Is there a way to check whether the extracted text is already in the file and thus, skip the whole thing?
My code for this looks like this right now (I created the directory variable already):
`
for filename in os.listdir(directory):
    if filename.endswith(".pdf"):
        file = os.path.join(directory, filename)
        print(file)
        #parse data from file
        file_data = parser.from_file(file)
        #get files text content
        text = file_data['content']
        #print(type(text))
        print("len ", len(text))
        #print(text)
        #save to textfile
        f = open("test2.txt", "a+", encoding = 'utf-8')
        f.write(text)
        f.close()
    else:
        continue
`
Thanks in advance!
One thing you could do is load the output file's contents and check whether the text is already in it:
if text not in open("test2.txt", encoding="utf-8").read():
    pass  # text is not in the file yet: write here
else:
    pass  # text is already in file, don't write
However, this is very inefficient, because the whole output file is read again for every PDF. A better way is to keep a file with the names of the files you have already written, and check that:
(at the beginning of your code):
files = open("files.txt").read().splitlines()
(before parser.from_file(file)):
if file in files:
    continue # don't read or write
(after f.close()):
files.append(file)
(after the whole loop has finished):
with open("files.txt", "w") as f:
    f.write("\n".join(files))
Putting it all together:
files = open("files.txt").read().splitlines()
for filename in os.listdir(directory):
    if filename.endswith(".pdf"):
        file = os.path.join(directory, filename)
        if file in files:
            continue # don't read or write
        print(file)
        #parse data from file
        file_data = parser.from_file(file)
        #get files text content
        text = file_data['content']
        #print(type(text))
        print("len ", len(text))
        #print(text)
        #save to textfile
        f = open("test2.txt", "a+", encoding = 'utf-8')
        f.write(text)
        f.close()
        files.append(file)
    else:
        continue
with open("files.txt", "w") as f:
    f.write("\n".join(files))
Note that you need to create a file named files.txt in the current directory.
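The bookkeeping idea above can be packaged as one function. This is a minimal sketch under assumptions, not the answerer's code: the name `append_new_texts` and its parameters are hypothetical, the text extractor is passed in as a function so the sketch does not depend on any particular PDF library, and a set is used for the lookup. Unlike the version above, it treats a missing log file as a first run.

```python
import os


def append_new_texts(directory, output_path, log_path, extract_text):
    """Append text from each not-yet-seen .pdf in directory to output_path.

    extract_text is a caller-supplied function (path -> str); log_path lists
    the files that earlier runs already handled. Returns the newly added paths.
    """
    # Load the names recorded by earlier runs; a missing log means a first run.
    try:
        with open(log_path, encoding="utf-8") as log:
            seen = set(log.read().splitlines())
    except FileNotFoundError:
        seen = set()

    added = []
    with open(output_path, "a", encoding="utf-8") as out, \
            open(log_path, "a", encoding="utf-8") as log:
        for filename in sorted(os.listdir(directory)):
            path = os.path.join(directory, filename)
            if not filename.endswith(".pdf") or path in seen:
                continue
            # "or ''" guards against an extractor that returns None.
            out.write(extract_text(path) or "")
            log.write(path + "\n")  # record right away so a crash loses little
            seen.add(path)
            added.append(path)
    return added
```

With the parser from the question, the call might look like `append_new_texts(directory, "test2.txt", "files.txt", lambda p: parser.from_file(p)["content"])`. Tracking file names rather than searching the output for the text keeps each run proportional to the number of new files.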

How to replace all instances of a string in text file with Python 3?

I am trying to replace all instances of a given string in a text file. I am trying to read the file line by line and then use the replace function; however, it just outputs a blank file instead of the expected result. What could I be doing wrong?
file = input("Enter a filename: ")
remove = input("Enter the string to be removed: ")
fopen = open(file, 'r+')
lines = []
for line in fopen:
    line = line.replace(remove,"")
fopen.close()
Try this:
# Make sure this is the valid path to your file
file = input("Enter a filename: ")
remove = input("Enter the string to be removed: ")
# Read in the file (the handle is named f so it does not overwrite the filename)
with open(file, "r") as f:
    filedata = f.read()
# Replace the target string
filedata = filedata.replace(remove, "")
# Write the file out again
with open(file, "w") as f:
    f.write(filedata)
Note: You might want to use the with open syntax, which closes the file automatically; the benefit is elaborated in this answer by Jason Sundram.
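The read, replace, and write-back steps can be wrapped in a small reusable function. This is a minimal sketch; the name `replace_in_file` and its parameters are hypothetical, and it assumes the file is small enough to read into memory at once.

```python
def replace_in_file(path, old, new="", encoding="utf-8"):
    """Replace every occurrence of old in the file at path; return how many were found."""
    if not old:
        raise ValueError("the string to replace must not be empty")
    with open(path, encoding=encoding) as f:
        text = f.read()
    count = text.count(old)
    if count:  # only rewrite the file when something actually changes
        with open(path, "w", encoding=encoding) as f:
            f.write(text.replace(old, new))
    return count
```

Returning the count makes it easy to tell the user whether anything was replaced, for example `print(replace_in_file(file, remove), "occurrence(s) removed")`.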

Appending in Python 3 from a list

def add():
    aList = input("Enter file name with file extension too: ")
    file = open(aList, "a")
    txt = input("Enter the text you would like to add on to the txt file: ")
    aList.append (txt);
    print ("Updated List : ", aList)
I need this to append on to an external file like this:
NIGHT
SMOKE
GHOST
TOOTH
ABOUT
CAMEL
BROWN
FUNNY
CHAIR
PRICE
This list is called "List.txt"
I input this as the first variable and for the second variable I input "Hello" but I'm not really sure why it's giving me an error.
I just need it to add on to this list...
The error occurs because aList holds the file name, which is a string, and strings have no append method. If you open a file in append mode, you can simply use write to add the text at the end of the file. The additional '\n' inserts a line break so that the next entry starts on a new line.
aList = input("Enter the file name with the file extension too: ")
file = open(aList, "a")
txt = input("Enter the text you would like to add on to the txt file: ")
file.write(txt + '\n')
file.close()
print("Updated List : ", aList)
Also, please make sure that you close the file when you are done with it. Or, even better, use the with statement, which will close the file at the end of the block.
aList = input("Enter the file name with the file extension too: ")
with open(aList, "a") as file:
    txt = input("Enter the text you would like to add on to the txt file: ")
    file.write(txt + '\n')
print("Updated List : ", aList)
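One detail the append-mode approach leaves open: if the last line of the existing file (PRICE in the question's list) has no trailing line break, the new word would be glued onto it. The following is a minimal sketch that checks for this first. The function name `append_line` is hypothetical, and the last-byte check assumes an encoding such as UTF-8 or ASCII in which a line break is the single byte `\n`.

```python
import os


def append_line(path, text, encoding="utf-8"):
    """Append text as its own line, adding a line break first if the file lacks one."""
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as f:
            f.seek(-1, os.SEEK_END)  # jump to the last byte of the file
            needs_newline = f.read(1) != b"\n"
    with open(path, "a", encoding=encoding) as f:
        if needs_newline:
            f.write("\n")
        f.write(text + "\n")
```

For the question's example, `append_line("List.txt", "Hello")` would put Hello on its own line after PRICE whether or not the file currently ends with a line break.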

Something's wrong with my Python code (complete beginner)

So I am completely new to Python and can't figure out what's wrong with my code.
I need to write a program that asks for the name of the existing text file and then of the other one, that doesn't necessarily need to exist. The task of the program is to take content of the first file, convert it to upper-case letters and paste to the second file. Then it should return the number of symbols used in the file(s).
The code is:
file1 = input("The name of the first text file: ")
file2 = input("The name of the second file: ")
f = open(file1)
file1content = f.read()
f.close
f2 = open(file2, "w")
file2content = f2.write(file1content.upper())
f2.close
print("There is ", len(str(file2content)), "symbols in the second file.")
I created two text files to check whether Python performs the operations correctly. Turns out the length of the file(s) is incorrect as there were 18 symbols in my file(s) and Python showed there were 2.
Could you please help me with this one?
Issues I see with your code:
close is a method, so you need to call it with the () operator; f.close without parentheses only refers to the method and does not close the file.
It is usually preferable in any case to use the with form of opening a file; the file is then closed automatically at the end of the block.
In Python 3 the write method returns the number of characters written, so file2content is the integer 18, and len(str(file2content)) is 2, the number of digits in "18". That is why the program reports 2 symbols.
There is no reason to read the entire file contents in; just loop over each line if it is a text file.
(Not tested) but I would write your program like this:
file1 = input("The name of the first text file: ")
file2 = input("The name of the second file: ")
chars = 0
with open(file1) as f, open(file2, 'w') as f2:
    for line in f:
        f2.write(line.upper())
        chars += len(line)
print("There are", chars, "symbols in the second file.")
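Since `write()` in Python 3 returns the number of characters it wrote, the count can be taken directly from it. This is a minimal sketch of that variant; the function name `copy_upper` is hypothetical. Counting what was written rather than what was read also stays correct for the few characters whose upper-case form is longer, such as the German ß, which becomes SS.

```python
def copy_upper(source_path, target_path, encoding="utf-8"):
    """Copy source to target in upper case; return the number of characters written."""
    written = 0
    with open(source_path, encoding=encoding) as src, \
            open(target_path, "w", encoding=encoding) as dst:
        for line in src:
            # In Python 3, write() returns how many characters it wrote.
            written += dst.write(line.upper())
    return written
```

A call such as `print("There are", copy_upper(file1, file2), "symbols in the second file.")` would replace the loop and the counter in the program above.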
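Note for Python 2 only: there, input() evaluates what the user types, so use raw_input() instead. In Python 3, which the code above targets, input() already returns the entered text as a string and raw_input() does not exist.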
