Other methods of reading a file without newlines - python-3.x

I'm learning Python and was wondering whether the code below can be written differently while doing the same thing. Its purpose is to split the file on \n and remove whitespace from the right of each line:
contents = file.readlines()
stripped_contents = [(element.rstrip()) for element in contents]

You don't need to call readlines() if you're only going to iterate over the file, because a file object already yields its lines one at a time. Also, use a more descriptive name such as "line" instead of "element".
stripped_lines = [line.rstrip() for line in file]
If you're going to use stripped_lines immediately as an iterable and only once, use a generator expression instead.
stripped_lines = (line.rstrip() for line in file)
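A minimal sketch of the difference between the two forms. It uses io.StringIO as a stand-in for an open file, and the sample text and variable names are made up for illustration:

```python
import io

# io.StringIO stands in for a real file object; the sample text is hypothetical.
sample = "first line   \nsecond line\t\nthird line\n"

# List comprehension: builds the whole list in memory at once.
stripped_list = [line.rstrip() for line in io.StringIO(sample)]

# Generator expression: yields one stripped line at a time, and only once.
stripped_gen = (line.rstrip() for line in io.StringIO(sample))
first = next(stripped_gen)      # consumes the first line
remaining = list(stripped_gen)  # whatever is left after that

print(stripped_list)
print(first, remaining)
```

Once the generator is exhausted it cannot be reused, which is why the list form is the safer choice when the lines are needed more than once.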

Related

Removing \n from a list

I'm currently learning about mail merging and was given a challenge on it. The idea is to open a names file, read the name on the current line, replace it in the letter, and save that letter as a new item.
I figured a for loop would be a good way to do this:
Open file > for loop > append names to list > loop over the list and replace, etc.
Except when I actually try to append the names to the list, I get this:
['Aang\nZuko\nAppa\nKatara\nSokka\nMomo\nUncle Iroh\nToph']
The code I am using is:
invited_names = []
with open("./Input/Names/invited_names.txt") as names:
    invited_names.append(names.read())
for item in invited_names:
    new_names = [str.strip("\n") for str in invited_names]
    print(new_names)
I have tried to replace the \n and have now tried .strip(), but I have not been able to remove the \n. Any ideas?
EDIT: not sure if it helps but the .txt file for the names looks like this:
Aang
Zuko
Appa
Katara
Sokka
Momo
Uncle Iroh
Toph
As you can see, read() returns the whole content of invited_names.txt as one giant string. Instead, you can use readlines(), which returns a list containing one string per line (thanks to codeflush.dev for the comment). Then use the extend() method to add that list to invited_names.
Also, you are using a for loop and a list comprehension at the same time, so the same list comprehension runs once per loop iteration. You can drop either one; I would keep the list comprehension because it is concise and efficient.
Try this code:
invited_names = []
with open("./Input/Names/invited_names.txt") as names:
    invited_names.extend(names.readlines())  # one list item per line
# "name" is used as the loop variable so the built-in str is not shadowed
new_names = [name.strip("\n") for name in invited_names]
print(new_names)
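As an alternative sketch (not taken from the answer above), str.splitlines() splits the text on line boundaries and drops the newline characters in one step. The temporary file and its contents here are hypothetical stand-ins for invited_names.txt:

```python
import os
import tempfile

# Write a small hypothetical names file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Aang\nZuko\nUncle Iroh\nToph")
    path = tmp.name

with open(path) as names:
    # read() returns one string; splitlines() cuts it into lines without "\n".
    invited_names = names.read().splitlines()

os.remove(path)
print(invited_names)
```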

Remove all text after last occurrence in file

I have a bunch of lines inside a text file that looks like this
STANGHOLMEN_BVP01_03_ME41_DELTAT_PV
STANGHOLMEN_TA02_TF01_FO_OP
STANGHOLMEN_VV01_PV01_SP2
STANGHOLMEN_VS01_GT11_EFFBEG_X1
I am trying to remove the text after the last occurrence of _
So this is how I want my text to look:
STANGHOLMEN_BVP01_03_ME41_DELTAT
STANGHOLMEN_TA02_TF01_FO
STANGHOLMEN_VV01_PV01
STANGHOLMEN_VS01_GT11_EFFBEG
It's usually around 700 lines. What is the best way to do this?
You can parse the file line by line and add the content to a new file. To split the string you can use rsplit with maxsplit=1.
with open("f_in.txt") as f_in, open("f_out.txt", "w") as f_out:
    for line in f_in:
        f_out.write(line.rsplit('_', maxsplit=1)[0])
        f_out.write("\n")
You can use rfind() from the standard library, which returns the index of the last occurrence of a substring (searching from the right). It is the simplest way, but not so reliable, because it returns -1 when the substring is not found.
last_index = string.rfind("_")
Next, slice your string up to that index:
new_string = string[:last_index]
You can use rsplit() and take the element at index 0 of the result.
For example, if txt = 'STANGHOLMEN_VS01_GT11_EFFBEG_X1', then
txt1 = txt.rsplit('_', 1)[0] will give you everything up to EFFBEG.
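To illustrate how rfind() and rsplit() behave differently when a line contains no underscore, here is a small sketch. The helper names strip_suffix_rfind and strip_suffix_rsplit are invented for this example:

```python
def strip_suffix_rfind(text):
    # rfind returns -1 when "_" is absent, so guard against slicing with -1.
    index = text.rfind("_")
    return text if index == -1 else text[:index]


def strip_suffix_rsplit(text):
    # rsplit returns the whole string as the only element when "_" is absent.
    return text.rsplit("_", 1)[0]


with_underscore = "STANGHOLMEN_VS01_GT11_EFFBEG_X1"
without_underscore = "STANGHOLMEN"

# Without the guard, slicing with -1 silently drops the last character.
unguarded = without_underscore[:without_underscore.rfind("_")]

print(strip_suffix_rfind(with_underscore))
print(strip_suffix_rsplit(without_underscore))
print(unguarded)
```

The rsplit() version needs no special case, which is one reason to prefer it here.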
with open("f_in.txt") as f_in, open("f_out.txt", "w") as f_out:
    for line in f_in:
        f_out.write(line.rsplit('_', maxsplit=1)[0])
        f_out.write("\n")
This worked, but now all my text is on one long line; before, it was split into separate lines.

File reading in Python using different methods

# open file in read mode
f = open(text_file, 'r')
# iterate over the file object
for line in f.read():
    print(line)
# close the file
f.close()
The file contains "Congratulations you have successfully opened the file". When I run this code, the output looks like this:
c (newline) o (newline) n (newline) g ...
That is, each character is printed on its own line because I used read(). With readline() the output comes out on a single line. Why is that?
f.read() returns one string with all characters (the full file content).
Iterating over a string goes through it character by character.
Use
for line in f: # no read()
instead to iterate line wise.
f.read() returns the whole file as one string. A for loop iterates over whatever it is given, and for a string that means iterating over its characters.
readline() should not print the whole line either: it reads the first line of the file, and the loop then prints that line character by character, just as with read(). Is it possible that you used readlines(), which returns the lines as a list?
One more thing: the with statement takes a closable object and closes it automatically at the end of the block, and you can iterate directly over a file object. So your code can be improved like this:
with open(text_file, 'r') as f:
    for i in f:
        print(i)
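The differences between read(), readline(), readlines(), and iterating over the file directly can be sketched side by side. io.StringIO is used as a stand-in for a file opened with open(), and the two-line sample text is made up:

```python
import io

text = "first line\nsecond line\n"  # hypothetical file content

# read(): one string, so iterating gives one item per character.
chars = [c for c in io.StringIO(text).read()]

# readline(): only the first line, still a single string.
one_line = io.StringIO(text).readline()

# readlines(): a list with one string per line.
all_lines = io.StringIO(text).readlines()

# Iterating the file object: the same lines, read one at a time.
iterated = [line for line in io.StringIO(text)]

print(chars[:5])
print(repr(one_line))
print(all_lines)
```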

Reading a list of tuples from a text file in python

I am reading a text file and I want to read a list of tuples so that I can add another tuple to it in my program and write that appended tuple back to the text file.
Example in the file
[('john', 'abc')]
Want to write back to the file as
[('john', 'abc'), ('jack', 'def')]
However, whenever I write back to the file, the appended tuple seems to be added in double quotes along with square brackets. I just want it to appear as above.
You can write a reusable function which takes 2 parameters file_path (on which you want to write tuple), tup (which you want to append) and put your logic inside that. Later you can supply proper data to this function and it will do the job for you.
Note: Don't forget to read the documentation as comments in code
tuples.txt (Before writing)
[('john', 'abc')]
Code
def add_tuple_to_file(file_path, tup):
    with open(file_path, 'r+') as f:
        content = f.read().strip() # read content from file and remove whitespaces around
        tuples = eval(content) # convert string format tuple to original tuple object (not possible using json.loads())
        tuples.append(tup) # append new tuple `tup` to the old list
        f.seek(0) # After reading file, file pointer reaches to end of file so place it again at beginning
        f.truncate() # truncate file (erase old content)
        f.write(str(tuples)) # write back the updated list
# Try
add_tuple_to_file("./tuples.txt", ('jack', 'def'))
tuples.txt (After writing back)
[('john', 'abc'), ('jack', 'def')]
References
https://www.geeksforgeeks.org/python-ways-to-convert-string-to-json-object/
How to open a file for both reading and writing?
You can use ast.literal_eval to get the list object from the string.
import ast
s = "[('john', 'abc')]"
o = ast.literal_eval(s)
print(repr(o)==s)
o.append(('jack', 'def'))
newstr = repr(o)
print(newstr)
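Putting the two answers together, a possible sketch of the file round trip with ast.literal_eval in place of eval might look like this. The temporary file is a hypothetical stand-in for tuples.txt, and this variant of add_tuple_to_file is an illustration rather than either answerer's exact code:

```python
import ast
import os
import tempfile


def add_tuple_to_file(file_path, tup):
    """Append tup to the list literal stored in file_path."""
    with open(file_path, "r+") as f:
        # literal_eval only accepts Python literals, so it will not run arbitrary code.
        tuples = ast.literal_eval(f.read().strip())
        tuples.append(tup)
        f.seek(0)      # go back to the start of the file
        f.truncate()   # erase the old content
        # repr() of the list itself (not of a string), so no extra quotes are written.
        f.write(repr(tuples))


# Hypothetical stand-in for tuples.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("[('john', 'abc')]")
    path = tmp.name

add_tuple_to_file(path, ("jack", "def"))
with open(path) as f:
    result = f.read()
os.remove(path)
print(result)
```

Extra double quotes in the file usually mean that a string holding the list was appended or written instead of the list object, which is what parsing first avoids.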

Make python program read 2 files line by line in sync and conduct program on each line

This question is twofold:
Background: I have 2 large files, each line of file 1 is "AATTGGCCAA" and each line of file 2 is "AATTTTCCAA". Each file has 20,000 lines and I have a python code I have to run on each pair of lines in turn.
Firstly, how would you go about getting the python code to run on the same numbered line of each file e.g. line 1 of both files?
Secondly, how would you get the file to move down to line 2 on both files after running on line 1 etc?
File objects are iterators. You can pass them to any function that expects an iterable object and it will work. For your specific use case, you want to use the zip builtin function, which iterates over several objects in parallel and yields tuples with one object from each iterable.
with open(filename1) as file1, open(filename2) as file2:
    for line1, line2 in zip(file1, file2):
        do_something(line1, line2)
In Python 3, zip is an iterator, so this is efficient. If you needed to do the same thing in Python 2, you'd probably want to use itertools.izip instead, as the regular zip would cause all the data from both files to be read into a list up front.
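A runnable sketch of the zip approach follows. io.StringIO objects stand in for the two open files, and count_mismatches is a hypothetical placeholder for whatever do_something is meant to do:

```python
import io


def count_mismatches(seq1, seq2):
    # Hypothetical stand-in for do_something: positions where the sequences differ.
    return sum(a != b for a, b in zip(seq1, seq2))


# io.StringIO objects stand in for the two open files.
file1 = io.StringIO("AATTGGCCAA\nAATTGGCCAA\n")
file2 = io.StringIO("AATTTTCCAA\nAATTGGCCAA\n")

results = []
for line1, line2 in zip(file1, file2):
    # Lines keep their trailing newline, so strip it before comparing.
    results.append(count_mismatches(line1.rstrip("\n"), line2.rstrip("\n")))

print(results)
```

Note that zip stops as soon as the shorter input runs out, so any extra lines in the longer file are silently ignored.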
File objects are iterators. You can open them and then call next() on the object to get the next line (in Python 2 this was the .next() method). An example:
for line in file1:
    other_line = next(file2)
    do_something(line, other_line)
The following code uses two Python features:
1. Generator function
2. File object treated as iterator
def get_line(file_path):
    # Generator function
    with open(file_path) as file_obj:
        for line in file_obj:
            # Give one line and return control to the calling scope
            yield line
# Generator function will not be executed here
# Instead we get two generator instances
lines_a = get_line(path_to_file_a)
lines_b = get_line(path_to_file_b)
while True:
    try:
        # Now grab one line from each generator
        line_pair = (next(lines_a), next(lines_b))
    except StopIteration:
        # This exception means that we hit EOF in one of the files so exit the loop
        break
    do_something(line_pair)
This assumes that your code is wrapped in a do_something(line_pair) function that accepts a tuple of length 2 holding the pair of lines.
Here's the code that allows you to process lines in sync from multiple files:
from contextlib import ExitStack
with ExitStack() as stack:
    files = [stack.enter_context(open(filename)) for filename in filenames]
    for lines in zip(*files):
        do_something(*lines)
For example, with 2 files it calls do_something(line_from_file1, line_from_file2) for each pair of lines in the given files.
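A self-contained sketch of the ExitStack approach, using three small temporary files whose names and contents are invented for the example. Collecting the lines into rows stands in for calling do_something:

```python
import os
import tempfile
from contextlib import ExitStack

# Create three small hypothetical input files.
filenames = []
for content in ("a1\na2\n", "b1\nb2\n", "c1\nc2\n"):
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(content)
        filenames.append(tmp.name)

rows = []
with ExitStack() as stack:
    # ExitStack closes every file it was handed when the block exits.
    files = [stack.enter_context(open(name)) for name in filenames]
    for lines in zip(*files):
        # Stand-in for do_something(*lines): keep the lines without newlines.
        rows.append(tuple(line.rstrip("\n") for line in lines))

all_closed = all(f.closed for f in files)

for name in filenames:
    os.remove(name)
print(rows, all_closed)
```

The advantage over a plain with statement is that the number of files does not have to be known when the code is written.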

Resources