Printing an entire list, instead of one line - string

I am having trouble writing the entire list into an outfile. Here is the code:
with open(infile, "r") as f:
    lines = f.readlines()
    for l in lines:
        if "ATOM" in l:
            split = l.split()
            if split[-1] == "1":
                print(split)
                #print(type(split))
                with open(newFile, "w") as f:
                    f.write("Model Number One" + "\n")
                    f.write(str(split))
When I use print(split) it allows me to see the entire list. Here is the updated code:
with open(infile, "r") as f:
    lines = f.readlines()
    for l in lines:
        if "ATOM" in l:
            split = l.split()
            if split[-1] == "1":
                #print(split)
                print(type(split))
                with open(newFile, "w") as f:
                    f.write("Model Number One" + "\n")
                    for i in range(len(split)):
                        f.write(str(split))
However, when I try to use f.write(split) I get an error because the function can only take a str not a list. So, I used f.write(str(split)) and it worked. The only issue now is that it only writes the last item in the list, not the whole list.

The function print is slightly more permissive than the method f.write, in the sense that it can accept lists and various other types of objects as input. f.write is usually called with pre-formatted strings, as you noticed.
I think the issue with the code is that the write routine is nested inside the loop. Because newFile is reopened with "w" on every matching line, Python erases any contents already stored in newFile, so you end up with only the last line read (l).
The problem can be easily fixed by changing the open call to open( newFile,"a"). The flag "a" tells Python to append the new contents to the existing file newFile (without erasing information). If newFile does not exist yet, Python will automatically create it.
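As a hedged sketch of that one-flag change applied to the asker's code (the inner handle is renamed to out so it no longer shadows f, and a newline is appended after each written list):
import re  # not required here; shown only to keep the sketch self-contained if pasted elsewhere

with open(infile, "r") as f:
    lines = f.readlines()
    for l in lines:
        if "ATOM" in l:
            split = l.split()
            if split[-1] == "1":
                with open(newFile, "a") as out:            # "a" appends instead of truncating
                    out.write("Model Number One" + "\n")   # note: still written once per matching line
                    out.write(str(split) + "\n")           # newline keeps entries on separate lines
If the repeated header is unwanted, an alternative is to open newFile once, before the loop, and keep that handle open while iterating; that avoids both the truncation and the duplicated header line.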

Related

python: How to read a file and store each line using map function?

I'm trying to rewrite a program that I wrote, getting rid of all for loops.
The original code reads a file with thousands of lines that are structured like:
Ex. 2 lines of the file (only the start of each line is shown):
LPPD;LEMD;...
DAAE;LFML;...
As you can see, the first line starts with LPPD;LEMD and the second line starts with DAAE;LFML. I'm only interested in the very first and second element of each line.
The original code I wrote is:
# Libraries
import sys
from collections import Counter
import collections
from itertools import chain
from collections import defaultdict
import time

# START
# #time=0
start = time.time()

# Defining default program argument
if len(sys.argv) == 1:
    fileName = "file.txt"
else:
    fileName = sys.argv[1]

takeOffAirport = []
landingAirport = []

# Reading file
lines = 0  # Counter for file lines
try:
    with open(fileName) as file:
        for line in file:
            words = line.split(';')
            # Relevant data, item1 and item2 from each file line
            origin = words[0]
            destination = words[1]
            # Populating lists
            landingAirport.append(destination)
            takeOffAirport.append(origin)
            lines += 1
except IOError:
    print("\n\033[0;31mIoError: could not open the file:\033[00m %s" % fileName)

airports_dict = defaultdict(list)
# Merge lists into a dictionary key:value
for key, value in chain(Counter(takeOffAirport).items(),
                        Counter(landingAirport).items()):
    # 'AIRPORT_NAME':[num_takeOffs, num_landings]
    airports_dict[key].append(value)

# Sum key values and add it as another value
for key, value in airports_dict.items():
    # 'AIRPORT_NAME':[num_totalMovements, num_takeOffs, num_landings]
    airports_dict[key] = [sum(value), value]

# Sort dictionary by the top 10 total movements
airports_dict = sorted(airports_dict.items(),
                       key=lambda kv: kv[1], reverse=True)[:10]
airports_dict = collections.OrderedDict(airports_dict)

# Print results
print("\nAIRPORT" + "\t\t#TOTAL_MOVEMENTS" + "\t#TAKEOFFS" + "\t#LANDINGS")
for k in airports_dict:
    print(k, "\t\t", airports_dict[k][0],
          "\t\t\t", airports_dict[k][1][1],
          "\t\t", airports_dict[k][1][0])

# #time=1
end = time.time() - start
print("\nAlgorithm execution time: %0.5f" % end)
print("Total number of lines read in the file: %u\n" % lines)

airports_dict.clear
takeOffAirport.clear
landingAirport.clear
My goal is to simplify the program using map, reduce and filter. So far I have sorted out the creation of the two independent lists, one with the first element of each file line and another with the second element, by using:
# Creates two independent lists with the first and second element from each line
takeOff_Airport = list(map(lambda sub: (sub[0].split(';')[0]), lines))
landing_Airport = list(map(lambda sub: (sub[0].split(';')[1]), lines))
I was hoping to find a way to open the file and achieve the exact same result as the original code by being able to open the file through a map() function, so I could pass each list to the maps defined above, takeOff_Airport and landing_Airport.
So if we have a file as such
line 1
line 2
line 3
line 4
and we do like this
open(file_name).read().split('\n')
we get this
['line 1', 'line 2', 'line 3', 'line 4', '']
Is this what you wanted?
Edit 1
I feel this is somewhat redundant, but since map applies a function to each element of an iterable, we will have to have our file name in a list, and we of course define our function:
def open_read(file_name):
    return open(file_name).read().split('\n')

print(list(map(open_read, ['test.txt'])))
This gets us
>>> [['line 1', 'line 2', 'line 3', 'line 4', '']]
So first off, calling split('\n') on each line is silly; the line is guaranteed to have at most one newline, at the end, and nothing after it, so you'd end up with a bunch of ['all of line', ''] lists. To avoid the empty string, just strip the newline. This won't leave each line wrapped in a list, but frankly, I can't imagine why you'd want a list of one-element lists containing a single string each.
So I'm just going to demonstrate using map+strip to get rid of the newlines, using operator.methodcaller to perform the strip on each line:
from operator import methodcaller

def readFile(fileName):
    try:
        with open(fileName) as file:
            return list(map(methodcaller('strip', '\n'), file))
    except IOError:
        print("\n\033[0;31mIoError: could not open the file:\033[00m %s" % fileName)
Sadly, since your file is context managed (a good thing, just inconvenient here), you do have to listify the result; map is lazy, and if you didn't listify before the return, the with statement would close the file, and pulling data from the map object would die with an exception.
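A small hedged illustration of that pitfall (readFile_lazy and test.txt are made-up names for the demonstration):
from operator import methodcaller

def readFile_lazy(fileName):    # hypothetical variant, for illustration only
    with open(fileName) as file:
        return map(methodcaller('strip', '\n'), file)   # map returned without list()

lines = readFile_lazy('test.txt')
next(lines)   # ValueError: I/O operation on closed file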
To get around that, you can implement it as a trivial generator function, so the generator context keeps the file open until the generator is exhausted (or explicitly closed, or garbage collected):
def readFile(fileName):
    try:
        with open(fileName) as file:
            yield from map(methodcaller('strip', '\n'), file)
    except IOError:
        print("\n\033[0;31mIoError: could not open the file:\033[00m %s" % fileName)
yield from will introduce a tiny amount of overhead over directly iterating the map, but not much, and now you don't have to slurp the whole file if you don't want to; the caller can just iterate the result and get a stripped line on each iteration without pulling the whole file into memory. It does have the slight weakness that opening the file will be done lazily, so you won't see the exception (if there is any) until you begin iterating. This can be worked around, but it's not worth the trouble if you don't really need it.
I'd generally recommend the latter implementation as it gives the caller flexibility. If they want a list anyway, they just wrap the call in list and get the list result (with a tiny amount of overhead). If they don't, they can begin processing faster, and have much lower memory demands.
Mind you, this whole function is fairly odd; replacing IOErrors with prints and (implicitly) returning None is hostile to API consumers (they now have to check return values, and can't actually tell what went wrong). In real code, I'd probably just skip the function and insert:
with open(fileName) as file:
    for line in map(methodcaller('strip', '\n'), file):
        # do stuff with line (with newline pre-stripped)
inline in the caller; maybe define strip_newline = methodcaller('strip', '\n') globally to use a friendlier name. It's not that much code, and I can't imagine that this specific behavior is needed in that many independent parts of your file, and inlining it removes the concerns about when the file is opened and closed.
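To tie this back to the question, here is a hedged sketch (not part of the original answer) of how the generator version of readFile could feed the asker's two map calls; the generator is materialised once, because the first map would otherwise exhaust it:
lines = list(readFile("file.txt"))    # readFile is the generator version defined above
takeOff_Airport = list(map(lambda line: line.split(';')[0], lines))
landing_Airport = list(map(lambda line: line.split(';')[1], lines))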

Python data extract from text file - script stops before expected data match

Suppose I have this data in a text file. The script extracts everything between index1 and index2 and includes those marker lines in the output file. But for some reason it stops a few lines before index2.
Dumb Data
index1 0000
random data
index1 0000
random data
index1 0000
index2 0000
Here is my code; it starts writing to my output file as soon as it sees index1, but then if it sees index2, it should write that last match and exit. But it never exits, it seems to hang and stop a few lines before index2, always on the same line though. If the data wasn't sensitive I would paste the actual data.
import re

myvar = False
myfile = open('extract', 'w')
with open('input.txt') as f:
    for line in f:
        if re.search(r'index1', line):
            myvar = True
            myfile.write(line)
        elif re.search(r'index2', line):
            myvar = False
            break
        elif myvar == True:
            myfile.write(line)
            continue
myfile.close
f.close
The thing is, it works with my dummy data, but not with the real data; it stops on this line. It starts with a form feed, which I thought might be messing it up, but there are multiple form feeds before this one which are printed to the output file.
FF (redacted) whitespace whitespace (redacted) datetime at datetime page 50
Thank you.
Following our discussion ...
You can simply your code, eliminate the loop and remove the cause of your error by switching from re.search to re.findall. This will produce a list - technically a tuple - with all the matches.
If you want to eliminate duplicates, you can transfer the list to a set, which is an unordered list without duplicates.
You should also wrap the output file in a context manager (with open) in the same way you have the input file. This has a better chance of closing the file properly.
If you want to take actions on the set, you can loop through it as if it were a list, or if you need to get just one element (e.g. for testing on the next part of your code), you can convert to a list - list(j)[0]
import re

output = []
with open("extract.txt", 'w') as myfile:
    with open("input2.txt", 'r') as f:
        output = re.findall(r'index1.*?index3', f.read(), re.DOTALL)
        j = set(output)
        for x in j:
            myfile.write(x + '\n')
With a single element, it would change to:
with open("extract.txt", 'w') as myfile:
with open("input2.txt", 'r') as f:
output = re.findall(r'index1.*?index3',f.read(), re.DOTALL)
myfile.write(list(set(output))[0] + '\n')
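For reference, a hedged guess at the original symptom: myfile.close without parentheses only references the method and never closes (or flushes) the output file, so the last buffered lines can appear to be missing while the script is still running. A sketch of the original line-by-line approach with both files in a with block (keeping the question's index1/index2 markers):
import re

with open('extract', 'w') as myfile, open('input.txt') as f:
    in_block = False                       # hypothetical flag name; True once index1 has been seen
    for line in f:
        if re.search(r'index1', line):
            in_block = True
            myfile.write(line)
        elif re.search(r'index2', line):
            myfile.write(line)             # write the closing marker line, then stop
            break
        elif in_block:
            myfile.write(line)
# both files are flushed and closed automatically when the with block exits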

How do I replace the 4th item in a list that is in a file that starts with a particular string?

I need to search for a name in a file and, in the line starting with that name, replace the fourth item in the list that is separated by commas. I have begun trying to program this with the following code, but I have not got it to work.
with open("SampleFile.txt", "r") as f:
newline=[]
for word in f.line():
newline.append(word.replace(str(String1), str(String2)))
with open("SampleFile.txt", "w") as f:
for line in newline :
f.writelines(line)
#this piece of code replaced every occurence of String1 with String 2
f = open("SampleFile.txt", "r")
for line in f:
    if line.startswith(Name):
        if line.contains(String1):
            newline = line.replace(str(String1), str(String2))
#this came up with a syntax error
You could give some dummy data, which would help people answer your question. I also suggest you back up your data: you can save the edited data to a new file, or you can copy the old file to a backup folder before working on the data (think about using "from shutil import copyfile" and then "copyfile(src, dst)"). Otherwise, by making a mistake you could easily ruin your data without being able to restore it.
You can't replace the string with "newline = line.replace(str(String1), str(String2))"! Think about "strong" as your search term and a line like "Armstrong,Paul,strong,44" - if you replace "strong" with "weak" you would get "Armweak,Paul,weak,44".
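A quick interpreter session (not in the original answer) shows the problem:
>>> "Armstrong,Paul,strong,44".replace("strong", "weak")
'Armweak,Paul,weak,44'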
I hope the following code helps you:
filename = "SampleFile.txt"
filename_new = filename.replace(".", "_new.")
search_term = "Smith"
with open(filename) as src, open(filename_new, 'w') as dst:
for line in src:
if line.startswith(search_term):
items = line.split(",")
items[4-1] = items[4-1].replace("old", "new")
line = ",".join(items)
dst.write(line)
If you work with a csv-file you should have a look at the csv module.
PS My files contain the following data (the filenames are not in the files!!!):
SampleFile.txt              SampleFile_new.txt
Adams,George,m,old,34       Adams,George,m,old,34
Adams,Tracy,f,old,32        Adams,Tracy,f,old,32
Smith,John,m,old,53         Smith,John,m,new,53
Man,Emily,w,old,44          Man,Emily,w,old,44
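As a hedged follow-up to the csv hint above (not part of the original answer), the same replacement with the csv module might look like this, assuming the layout shown in the PS:
import csv

search_term = "Smith"
with open("SampleFile.txt", newline="") as src, \
        open("SampleFile_new.txt", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    for row in reader:
        if row and row[0] == search_term:
            row[3] = row[3].replace("old", "new")   # fourth field, zero-based index 3
        writer.writerow(row)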

Something's wrong with my Python code (complete beginner)

So I am completely new to Python and can't figure out what's wrong with my code.
I need to write a program that asks for the name of the existing text file and then of the other one, that doesn't necessarily need to exist. The task of the program is to take content of the first file, convert it to upper-case letters and paste to the second file. Then it should return the number of symbols used in the file(s).
The code is:
file1 = input("The name of the first text file: ")
file2 = input("The name of the second file: ")
f = open(file1)
file1content = f.read()
f.close
f2 = open(file2, "w")
file2content = f2.write(file1content.upper())
f2.close
print("There is ", len(str(file2content)), "symbols in the second file.")
I created two text files to check whether Python performs the operations correctly. Turns out the length of the file(s) is incorrect as there were 18 symbols in my file(s) and Python showed there were 2.
Could you please help me with this one?
Issues I see with your code:
close is a method, so you need to call it with the () operator; otherwise f.close does not do what you think.
It is usually preferred in any case to use the with form of opening a file: then it is closed automatically at the end.
In Python 3 the write method returns the number of characters written, not the text itself, so file2content is the integer 18 and len(str(file2content)) is 2, which is exactly the output you saw.
There is no reason to read the entire file contents in; just loop over each line if it is a text file.
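A short hedged demonstration of the point about write() (demo_out.txt is a made-up file name):
# In Python 3, a text file's write() returns the number of characters written:
with open("demo_out.txt", "w") as out:
    returned = out.write("HELLO WORLD, HI!!!")   # an 18-character string
print(returned)             # 18
print(len(str(returned)))   # 2  <- the "2 symbols" the asker saw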
(Not tested) but I would write your program like this:
file1 = input("The name of the first text file: ")
file2 = input("The name of the second file: ")
chars = 0
with open(file1) as f, open(file2, 'w') as f2:
    for line in f:
        f2.write(line.upper())
        chars += len(line)
print("There are ", chars, "symbols in the second file.")
On Python 2, input() does not do what you expect; use raw_input() there instead. (On Python 3, input() is fine, as used above.)

python3 opening files and reading lines

Can you explain what is going on in this code? I don't seem to understand
how you can open the file and read it line by line instead of all of the sentences at the same time in a for loop. Thanks
Let's say I have these sentences in a document file:
cat:dog:mice
cat1:dog1:mice1
cat2:dog2:mice2
cat3:dog3:mice3
Here is the code:
from sys import argv

filename = input("Please enter the name of a file: ")
f = open(filename, 'r')
d1ct = dict()

print("Number of times each animal visited each station:")
print("Animal Id Station 1 Station 2")

for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
    (AnimalId, Timestamp, StationId,) = line.split(':')
    key = (AnimalId, StationId,)
    if key not in d1ct:
        d1ct[key] = 0
    d1ct[key] += 1
The magic is at:
for line in f:
    if '\n' == line[-1]:
        line = line[:-1]
Python file objects are special in that they can be iterated over in a for loop. On each iteration, the loop retrieves the next line of the file. Because that line still includes its last character, which is usually a newline, it's often useful to check for it and remove it.
As Moshe wrote, open file objects can be iterated. Only, they are not of the file type in Python 3.x (as they were in Python 2.x). If the file object is opened in text mode, then the unit of iteration is one text line including the \n.
You can use line = line.rstrip() to remove the \n plus any trailing whitespace.
If you want to read the content of the file at once (into a multiline string), you can use content = f.read().
There is a minor bug in the code: an open file should always be closed, which means calling f.close() after the for loop. Alternatively, you can wrap the open in the newer with construct, which closes the file for you; I suggest getting used to the latter approach.
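A hedged sketch (not from the original answers) of the asker's counting loop rewritten with a with block and rstrip(), as suggested above:
d1ct = dict()
with open(filename) as f:                 # filename as read earlier via input()
    for line in f:
        line = line.rstrip('\n')          # drop the trailing newline, if present
        AnimalId, Timestamp, StationId = line.split(':')
        key = (AnimalId, StationId)
        d1ct[key] = d1ct.get(key, 0) + 1
# the file is closed automatically when the with block ends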
