extract words from a text file and print next line - python-3.x

Sample input (a line in the text file being parsed):
.txt = ["'blah.txt'", "'blah1.txt'", "'blah2.txt'" ]
The expected output, in another text file, out_path.txt:
blah.txt
blah1.txt
blah2.txt
Here is the code that I tried; it just appends "[]" to the input file. I also tried a Perl one-liner that replaces double and single quotes.
read_out_fh = open('out_path.txt',"r")
for line in read_out_fh:
    for word in line.split():
        curr_line = re.findall(r'"(\[^"]*)"', '\n')
        print(curr_line)

This happens because the contents of a file are read as a string, not as a list, even if the text is formatted like a list. Also, in your code re.findall is applied to the string '\n' rather than to the line that was read, which is why you get [] every time. So I wrote something that first transforms the string into a list. While doing that I also removed the "" and '' as you mentioned, and then wrote the result to a new file, example.txt.
Note: change the file names to match your files.
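# Read the list-formatted line and write one file name per line.
with open('file.txt', "r") as read_in_fh, open("example.txt", "w") as output:
    for line in read_in_fh:
        # Drop the newline and brackets, remove both kinds of quotes, split on commas.
        words = line.strip().strip("[]").replace('"', '').replace("'", '').split(",")
        for word in words:
            output.write(word.strip() + '\n')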
example.txt (output file):
blah.txt
blah1.txt
blah2.txt

The code below works for the example you gave in the question:
# Content of textfile.txt:
asdasdasd=["'blah.txt'", "'blah1.txt'", "'blah2.txt'"]asdasdasd
# Code:
import re
read_in_fh = open('textfile.txt', "r")
write_out_fh = open('out_path.txt', "w")
for line in read_in_fh:
    # Capture everything between the square brackets.
    find_list = re.findall(r'\[(".*?"*)\]', line)
    if not find_list:
        continue  # skip lines that contain no bracketed list
    for element in find_list[0].split(","):
        # Remove both kinds of quotes and the surrounding whitespace.
        element_formatted = element.replace('"', '').replace("'", "").strip()
        write_out_fh.write(element_formatted + "\n")
read_in_fh.close()
write_out_fh.close()
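Since the text between the brackets is written like a Python list, another option is to let Python parse it instead of removing the quotes by hand. The following is only a sketch of that alternative, not the method used in either answer: it uses ast.literal_eval from the standard library, the helper name extract_names is made up for illustration, and it assumes each line contains at most one well-formed [...] list.

```python
import ast
import re


def extract_names(line):
    """Return the items of the first [...] list literal found in line."""
    match = re.search(r'\[.*?\]', line)
    if match is None:
        return []
    # Parse the bracketed text as a Python list of strings.
    items = ast.literal_eval(match.group(0))
    # Each item still carries its inner quotes, e.g. "'blah.txt'".
    return [item.strip("'\"") for item in items]


line = '''asdasdasd=["'blah.txt'", "'blah1.txt'", "'blah2.txt'"]asdasdasd'''
print("\n".join(extract_names(line)))
```

Each returned name can then be written to out_path.txt followed by '\n', as in the answers above.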

Related

Python : Updating multiple words in a text file based on text in another text file using in_place module

I have a text file, say storyfile.txt.
The content of storyfile.txt is:
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe
I have another file, hashfile.txt, that contains some words separated by commas (,).
The content of hashfile.txt is:
All,mimsy,were,the,borogoves,raths,outgrabe
My objective is to:
1. Read hashfile.txt.
2. Insert a hashtag on each of the comma-separated words.
3. Read storyfile.txt, search for the same words as in hashfile.txt, and add a hashtag to these words.
4. Update storyfile.txt with the words that are hashtagged.
My Python code so far:
import in_place
hashfile = open('hashfile.txt', 'w+')
n1 = hashfile.read().rstrip('\n')
print(n1)
checkWords = n1.split(',')
print(checkWords)
repWords = ["#"+i for i in checkWords]
print(repWords)
hashfile.close()
with in_place.InPlace('storyfile.txt') as file:
    for line in file:
        for check, rep in zip(checkWords, repWords):
            line = line.replace(check, rep)
        file.write(line)
The output can be seen here:
https://dpaste.de/Yp35
Why am I getting this kind of output?
Why does the last sentence have no newlines in it?
Where am I going wrong?
The current working code for a single piece of text:
import in_place
with in_place.InPlace('somefile.txt') as file:
    for line in file:
        line = line.replace('mome', 'testZ')
        file.write(line)
See if this helps. It fulfills the objective that you mentioned, though I have not used the in_place module.
hash_list = []
with open("hashfile.txt", 'r') as f:
    for i in f.readlines():
        for j in i.split(","):
            hash_list.append(j.strip())
with open("storyfile.txt", "r") as f:
    for i in f.readlines():
        for j in hash_list:
            i = i.replace(j, "#"+j)
        print(i)
Let me know if you require further clarification on the same.
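One likely cause of the strange output in the question is the 'w+' mode: it truncates hashfile.txt when the file is opened, so read() returns an empty string, and replacing the empty string with "#" inserts a "#" between every character. The sketch below opens the file in 'r' mode instead and also writes the result back to storyfile.txt, which the answer above does not do. It matches whole words with a regular expression rather than str.replace, which is a swapped-in technique; the function names add_hashtags and hashtag_file are made up for illustration.

```python
import re


def add_hashtags(text, words):
    """Prefix each whole-word occurrence of any word in words with '#'."""
    if not words:
        return text
    # Longest words first; \b keeps 'the' from matching inside 'there'.
    ordered = sorted(words, key=len, reverse=True)
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in ordered) + r')\b'
    return re.sub(pattern, lambda m: '#' + m.group(0), text)


def hashtag_file(story_path, hash_path):
    """Rewrite story_path with hashtags on the words listed in hash_path."""
    # 'r' mode: opening with 'w+' would empty the file before it is read.
    with open(hash_path, 'r') as f:
        words = [w.strip() for w in f.read().split(',') if w.strip()]
    with open(story_path, 'r') as f:
        story = f.read()
    with open(story_path, 'w') as f:
        f.write(add_hashtags(story, words))


print(add_hashtags("All mimsy were the borogoves,", ["All", "mimsy", "the"]))
```

Calling hashtag_file('storyfile.txt', 'hashfile.txt') would update the story file in place, so it is worth keeping a copy of the original while experimenting.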

How do I replace the 4th item in a list that is in a file that starts with a particular string?

I need to search for a name in a file and, in the line starting with that name, replace the fourth item in the list, which is separated by commas. I have begun trying to program this with the following code, but I have not got it to work.
with open("SampleFile.txt", "r") as f:
    newline=[]
    for word in f.line():
        newline.append(word.replace(str(String1), str(String2)))
with open("SampleFile.txt", "w") as f:
    for line in newline :
        f.writelines(line)
#this piece of code replaced every occurrence of String1 with String 2
f = open("SampleFile.txt", "r")
for line in f:
    if line.startswith(Name):
        if line.contains(String1):
            newline = line.replace(str(String1), str(String2))
#this came up with a syntax error
You could provide some dummy data, which would help people answer your question. I suggest you back up your data: you can save the edited data to a new file, or you can copy the old file to a backup folder before working on the data (think about using "from shutil import copyfile" and then "copyfile(src, dst)"). Otherwise, a mistake could easily ruin your data with no easy way to restore it.
You can't simply replace the string with "newline = line.replace(str(String1), str(String2))"! Think about "strong" as your search term and a line like "Armstrong,Paul,strong,44": if you replace "strong" with "weak" you would get "Armweak,Paul,weak,44".
I hope the following code helps you:
filename = "SampleFile.txt"
filename_new = filename.replace(".", "_new.")
search_term = "Smith"
with open(filename) as src, open(filename_new, 'w') as dst:
    for line in src:
        if line.startswith(search_term):
            items = line.split(",")
            # Index 4-1 is the fourth comma-separated item.
            items[4-1] = items[4-1].replace("old", "new")
            line = ",".join(items)
        dst.write(line)
If you work with a CSV file, you should have a look at the csv module.
PS My files contain the following data (the filenames are not in the files!):
SampleFile.txt           SampleFile_new.txt
Adams,George,m,old,34    Adams,George,m,old,34
Adams,Tracy,f,old,32     Adams,Tracy,f,old,32
Smith,John,m,old,53      Smith,John,m,new,53
Man,Emily,w,old,44       Man,Emily,w,old,44
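For comparison, here is a rough sketch of the same edit done with the csv module mentioned above. It is not the answer's code: the function name replace_fourth_item is made up, it compares the whole first field with the name instead of using startswith, and it only changes the fourth field when that field is exactly equal to the old value. In-memory io.StringIO objects stand in for the two files so the example runs on its own.

```python
import csv
import io


def replace_fourth_item(src, dst, name, old, new):
    """Copy CSV rows from src to dst; on rows whose first field equals
    name, change the fourth field from old to new."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if len(row) >= 4 and row[0] == name and row[3] == old:
            row[3] = new
        writer.writerow(row)


# In-memory stand-ins for SampleFile.txt and SampleFile_new.txt.
src = io.StringIO("Adams,George,m,old,34\nSmith,John,m,old,53\n")
dst = io.StringIO()
replace_fourth_item(src, dst, "Smith", "old", "new")
print(dst.getvalue(), end="")
```

With real files, the two arguments would be open(filename, newline='') and open(filename_new, 'w', newline=''), as the csv documentation recommends.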

How to print a file containing a list

So basically I have a list in a file and I only want to print the lines containing an A.
Here is a small part of the list:
E5341,21/09/2015,C102,440,E,0
E5342,21/09/2015,C103,290,A,290
E5343,21/09/2015,C104,730,N,0
E5344,22/09/2015,C105,180,A,180
E5345,22/09/2015,C106,815,A,400
So I only want to print the lines containing A.
Sorry, I'm still new at Python.
I gave it a try using one "print" to print the whole line but ended up failing; I guess I will always suck at Python.
You just have to:
open file
read lines
for each line, split at ","
for each line, if the 5th part of the split string is equal to "A", print the line
Code:
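filepath = 'file.txt'
with open(filepath, 'r') as f:
    lines = f.readlines()
for line in lines:
    fields = line.strip().split(',')
    # The 5th field (index 4) holds the status letter.
    if len(fields) > 4 and fields[4] == "A":
        print(line, end='')  # the line already ends with a newline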

Search text file for word from list then output word that matched in Python 3.x

I have been searching a large directory of text files for files that match a list of words. How do I have Python output the word from the list that matches?
This is what I have so far. It writes the file name every time one of the words from the list is found. I want to add the matching word to the line with the file name so I have the file name and 1 matched word each time. How do I do that?
ngwrds= ['words'...]
for filename in os.listdir(os.getcwd()):
    with open(filename, 'r') as searchfile:
        for line in searchfile:
            if any(x in line for x in ngwrds):
                with open("keyword.txt", 'a') as out:
                    out.write(filename + '\n')
The input is a long text file. A line might read like this:
The company reported depreciation of $1.20.
If one of the search words from the list was depreciation, then the output file would look like this:
filename depreciation
Thank you.
I am not sure what out is and I can't run your code from where I am but you could try something like this:
ngwrds= ['words'...]
for filename in os.listdir(os.getcwd()):
    with open(filename, 'r') as searchfile:
        for line in searchfile:
            line = line.strip().split(" ")
            for word in line:
                if word in ngwrds:
                    out.write(filename + " " + word + "\n")
strip gets rid of whitespace on either end of line. split returns a list of the words in line.
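One thing to watch with splitting on spaces is that a word followed by punctuation, such as "depreciation," or "depreciation.", will not be equal to the search word. The sketch below strips punctuation from each token before comparing. The helper name matching_words, the keyword set, and the file name report.txt are all made up for illustration, and lowercasing the tokens is an assumption about how the search list is written.

```python
import string


def matching_words(line, keywords):
    """Return the keywords found as whole words in line, in order of appearance."""
    found = []
    for token in line.split():
        # Strip leading/trailing punctuation so "depreciation." still matches.
        word = token.strip(string.punctuation).lower()
        if word in keywords and word not in found:
            found.append(word)
    return found


keywords = {"depreciation", "amortization"}  # hypothetical search list
filename = "report.txt"                      # hypothetical file name
line = "The company reported depreciation of $1.20."
for word in matching_words(line, keywords):
    print(filename + " " + word)
```

Inside the directory loop from the question, the print call would become a write to keyword.txt of filename + " " + word + "\n".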

python3 opening files and reading lines

Can you explain what is going on in this code? I don't seem to understand how you can open the file and read it line by line, instead of all of the sentences at the same time, in a for loop. Thanks.
Let's say I have these sentences in a document file:
cat:dog:mice
cat1:dog1:mice1
cat2:dog2:mice2
cat3:dog3:mice3
Here is the code:
from sys import argv
filename = input("Please enter the name of a file: ")
f = open(filename,'r')
d1ct = dict()
print("Number of times each animal visited each station:")
print("Animal Id Station 1 Station 2")
for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
    (AnimalId, Timestamp, StationId,) = line.split(':')
    key = (AnimalId,StationId,)
    if key not in d1ct:
        d1ct[key] = 0
    d1ct[key] += 1
The magic is at:
for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
Python file objects are special in that they can be iterated over in a for loop. On each iteration, it retrieves the next line of the file. Because it includes the last character in the line, which could be a newline, it's often useful to check and remove the last character.
As Moshe wrote, open file objects can be iterated. Only, they are not of the file type in Python 3.x (as they were in Python 2.x). If the file object is opened in text mode, then the unit of iteration is one text line including the \n.
You can use line = line.rstrip() to remove the \n plus any trailing whitespace.
If you want to read the content of the file at once (into a multiline string), you can use content = f.read().
There is a minor bug in the code. The open file should always be closed. I mean you should call f.close() after the for loop. Or you can wrap the open in the newer with construct, which will close the file for you. I suggest getting used to the latter approach.
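To make the points above concrete, here is a small sketch of the same counting loop written as a function that accepts any iterable of lines. It is an illustration rather than the original program: the function name count_visits is made up, collections.Counter replaces the manual "if key not in d1ct" check, and an io.StringIO object stands in for the open file so the example runs without a file on disk.

```python
import io
from collections import Counter


def count_visits(lines):
    """Count (animal_id, station_id) pairs from 'animal:timestamp:station' lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip('\n')  # each line arrives with its trailing newline
        if not line:
            continue  # ignore blank lines
        animal_id, timestamp, station_id = line.split(':')
        counts[(animal_id, station_id)] += 1
    return counts


# A text-mode file object yields one line per iteration; StringIO behaves the same.
sample = io.StringIO("cat:dog:mice\ncat1:dog1:mice1\ncat:dog:mice\n")
counts = count_visits(sample)
print(counts)
```

With a real file, the call would be wrapped as "with open(filename) as f: counts = count_visits(f)", so the file is closed automatically when the block ends.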
