How do I delete lines starting with ">" while reading a file to a string in Python - python-3.x

I want to read all files ending with ".fasta" in the mydir directory one by one and save their content, except lines starting with ">", to a string called "data" for further analysis, while also ignoring newline characters. So far I have this:
for file in os.listdir(mydir):
    if file.endswith(".fasta"):
        with open(file, 'r') as myfile:
            data = myfile.read().replace('\n', '')
How do I read the file into a string AND, in the same command, skip all lines starting with ">"?

Here you go
for file in os.listdir(mydir):
    if file.endswith(".fasta"):
        with open(file, 'r') as myfile:
            data = "".join(line for line in myfile if line[:1] != '>')

Related

search and replace using a file for computer name

I've got to search for the computer name and replace this with another name in Python. These are stored in a file separated by a space.
xerox fj1336
mongodb gocv1344
ec2-hab-223 telephone24
I know this can be done in Linux using a simple while loop.
What I've tried is:
# input file
fin = open("comp_name.txt", "rt")
# output file to write the result to
fout = open("comp_name.txt", "wt")
# for each line in the input file
for line in fin:
    # read, replace the string and write to the output file
    fout.write(line.replace('xerox ', 'fj1336'))
# close input and output files
fin.close()
fout.close()
But the output doesn't really work, and even if it did, it would only replace the one line.
You can try this way:
with open('comp_name.txt', 'r+') as file:
    content = file.readlines()
    for i, line in enumerate(content):
        content[i] = line.replace('xerox', 'fj1336')
    file.seek(0)
    print(str(content))
    file.writelines(content)
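For what it's worth, the original attempt produced nothing because opening comp_name.txt with 'wt' truncates it before the read loop sees a single line. A read-everything, then write-everything pattern avoids that; here is a minimal sketch of the same single replacement:
# Read the whole file first, then write it back; this avoids truncating
# comp_name.txt while it is still being read.
with open('comp_name.txt', 'rt') as fin:
    text = fin.read()

text = text.replace('xerox', 'fj1336')

with open('comp_name.txt', 'wt') as fout:
    fout.write(text)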

Pass a file with filepaths to Python in Ubuntu terminal to analyze each file?

I have a text file with file paths:
path1
path2
path3
...
path100000000
I have my Python script app.py that should run on each file (path1, path2, ...).
Please advise: what is the best way to do it?
Should I just get it as an argument, and then:
with open(input_file, "r") as f:
    lines = f.readlines()
    for line in lines:
        main_function(line)
Yes, that should work, except that readlines() doesn't remove newline characters.
with open(input_file, "r") as f:
    lines = f.readlines()
    for line in lines:
        main_function(line.strip())
Note: the above code assumes the file is in the same directory as the Python script file.
You are using a context manager, so place the code inside the context.
So, according to your comment:
If you want to pass the filename and read the file contents inside main_function, then the above code will work.
If you want to read each file and then pass the file contents, you will have to modify the above code to first read the content and then pass it to the function:
with open(input_file, "r") as f:
    lines = f.readlines()
    for line in lines:
        with open(line.strip(), "r") as data_file:
            # read the whole file and hand its contents to the function
            main_function(data_file.read())
Note: the above code reads each whole file as a single string (text).
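To pass the list file from the Ubuntu terminal (e.g. python3 app.py paths.txt), sys.argv works; a minimal sketch, where main_function stands in for whatever analysis app.py already defines:
import sys

def main_function(path):
    # placeholder for the real analysis done by app.py
    print("analyzing", path)

if __name__ == "__main__":
    input_file = sys.argv[1]  # e.g. python3 app.py paths.txt
    with open(input_file, "r") as f:
        for line in f:
            path = line.strip()
            if path:  # skip blank lines
                main_function(path)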

How to search and replace character such as "\" in a file using Python?

I have a text file where I want to replace the character \ by ,. After reading the answer by @Jack Aidley in this SO post:
# Read in the file
with open('file.txt', 'r') as file:
    filedata = file.read()

# Replace the target string
filedata = filedata.replace('n', '***IT WORKED!!!***')

# Write the file out again
with open('file.txt', 'w') as file:
    file.write(filedata)
I could successfully change content such as a simple letter like n into ***IT WORKED!!!***. However, if I replace
filedata.replace('n', '***IT WORKED!!!***')
by
filedata.replace('\', '***IT WORKED!!!***')
I get the syntax error:
SyntaxError: EOL while scanning string literal
How can I replace all the \ by ,?
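The backslash is Python's escape character, so a lone '\' inside the quotes swallows the closing quote and triggers the EOL error. Escape it as '\\' instead; a minimal sketch of the same read/replace/write pattern:
# Read in the file
with open('file.txt', 'r') as file:
    filedata = file.read()

# '\\' is a single escaped backslash (note that r'\' would still be a
# syntax error, since a raw string cannot end with a lone backslash)
filedata = filedata.replace('\\', ',')

# Write the file out again
with open('file.txt', 'w') as file:
    file.write(filedata)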

Adjusting the content of a txt file

I am trying to filter specific chars out of a txt file by copying the content selectively to a string and writing this to a second file:
file = open(filepath, 'r')
file2 = open("C:/.../test2.txt", 'w')
newline = ""
for line in file:
    for letter in line:
        if letter == "#": continue
        else: newline += letter
    newline += "\n"
file2.write(newline)
I only manage to copy and mutate the content of file1 by appending a newline character after reading each line, but with the effect of having undesired empty lines in my new txt2 file:
fewfewfw
fwefewf
How do I prevent having to remove these empty lines afterwards? Is there a better way to adjust a txt file anyway?
If you are trying to remove all the # symbols from the file, use:
with open(filepath, 'r') as f1, open("C:/.../test2.txt", 'w') as f2:
    content = f1.read()
    f2.write(content.replace('#', ''))
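If you would rather keep the line-by-line structure of your original attempt, note that each line read from the file already ends with '\n', so appending another one is what produces the blank lines. A minimal sketch without the extra append (reusing the output path from the question as a placeholder):
with open(filepath, 'r') as infile, open("C:/.../test2.txt", 'w') as outfile:
    for line in infile:
        # each line keeps its own '\n'; only the '#' characters are removed
        outfile.write(line.replace('#', ''))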

Python - Spyder 3 - Open a list of .csv files and remove all double quotes in every file

I've read everything I can find and tried about 20 examples from SO and Google, and nothing seems to work.
This should be very simple, but I cannot get it to work. I just want to point to a folder and replace every double quote in every file in the folder. That is it. (And I don't know Python well at all, hence my issues.) I have no doubt that some of the scripts I've tried to retask must work, but my lack of Python skill is getting in the way. This is as close as I've gotten, and I get errors. If I don't get errors, it seems to do nothing. Thanks.
import glob
import csv

mypath = glob.glob('\\C:\\csv\\*.csv')
for fname in mypath:
    with open(mypath, "r") as infile, open("output.csv", "w") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        for row in reader:
            writer.writerow(item.replace("""", "") for item in row)
You don't need to use csv-specific file opening and writing; I think that makes it more complex. How about this instead:
import os

mypath = r'\path\to\folder'
for file in os.listdir(mypath):  # This will loop through every file in the folder
    if '.csv' in file:  # Check if it's a csv file
        fpath = os.path.join(mypath, file)
        fpath_out = fpath + '_output'  # Create an output file with a similar name to the input file
        with open(fpath) as infile:
            lines = infile.readlines()  # Read all lines
        with open(fpath_out, 'w') as outfile:
            for line in lines:  # One line at a time
                outfile.write(line.replace('"', ''))  # Remove each " and write the line
Let me know if this works, and respond with any error messages you may have.
I found the solution to this based on the original answer provided by u/Jeff. It was actually smart quotes (u'\u201d', to be exact), not straight quotes. That is why I could get nothing to work. That is a great way to spend like two days; now if you'll excuse me, I have to go jump off the roof. But for posterity, here is what I used that worked. (And note: there is the left-curving smart quote as well; that is u'\u201c'.)
import os

mypath = 'C:\\csv\\'
myoutputpath = 'C:\\csv\\output\\'
for file in os.listdir(mypath):  # This will loop through every file in the folder
    if '.csv' in file:  # Check if it's a csv file
        fpath = os.path.join(mypath, file)
        fpath_out = os.path.join(myoutputpath, file)  # Write to the output folder using the same file name
        with open(fpath) as infile:
            lines = infile.readlines()  # Read all lines
        with open(fpath_out, 'w') as outfile:
            for line in lines:  # One line at a time
                outfile.write(line.replace(u'\u201d', ''))  # Remove each right curly quote and write the line
        # No explicit close() calls needed: the with blocks close both files.
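If the files might mix straight and curly quotes, a variant that strips all three characters in one pass could look like the sketch below (assuming the same folder layout as above; the utf-8 encoding is an assumption about the files):
import os

mypath = 'C:\\csv\\'
myoutputpath = 'C:\\csv\\output\\'
# Translation table that deletes straight quotes plus right/left curly quotes.
quote_table = str.maketrans('', '', '"\u201d\u201c')

for name in os.listdir(mypath):
    if name.endswith('.csv'):
        with open(os.path.join(mypath, name), encoding='utf-8') as infile, \
             open(os.path.join(myoutputpath, name), 'w', encoding='utf-8') as outfile:
            for line in infile:
                outfile.write(line.translate(quote_table))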
