Eliminate footer and header information from multiple text files (why can't I eliminate the last line as easily as I can eliminate the first lines?) - io

I have been trying all day.
# successfully writes the data from line 17 and next lines
# to new (temp) file named and saved in the os
import os
import glob
files = glob.glob('/Users/path/Documents/test/*.txt')
for myspec in files:
    temp_filename = 'foo.temp.txt'
    with open(myspec) as f:
        for n in range(17):
            f.readline()
        with open(temp_filename, 'w') as w:
            w.writelines(f)
    os.remove(myspec)
    os.rename(temp_filename, myspec)
    # delete original file and rename the temp file so it replaces the original file
print("done")
The above works and it works well! I love it. I am very happy.
But this below does NOT work (same files, I am preprocessing files) :
# trying unsuccessfully to remove the last line which is line
# 2048 in all files and save again like above
import os
import glob
files = glob.glob('/Users/path/Documents/test/*.txt')
for myspec in files:
    temp_filename = 'foo.temp.txt'
    with open(myspec) as f:
        for n in range(-1):
            f.readline()
        with open(temp_filename, 'w') as w:
            w.writelines(f)
    os.remove(myspec)
    os.rename(temp_filename, myspec)
    # delete original file and rename the temp file so it replaces the original file
print("done")
This does not work. It doesn't give an error, and it prints done, but it does not change the file. I have tried range(-1) all the way up to range(-7), thinking maybe there were blank lines at the end that I could not see. This is the only difference between the two blocks of code. If anyone could help, that would be great.
To summarize: I permanently got rid of the headers, and now I still have a one-line footer that I cannot get rid of permanently.
Thank you so much for any help. I need to write permanently edited files because I have a ton of code that wants 2 or 3 column files without all the header and footer junk, and the junk and file types vary widely. If I lose the junk permanently, ASCII can guess the file types correctly. I really do not want to rewrite that code right now: it's very complicated, involves uncertainty, and took me months to get working correctly. I don't read the files until I'm inside a function, and there are many files that are displayed in multiple drop-downs. I have been at this all day and have tried other methods, but I'd like to make the method above work: pop off the last line and write the result back to a permanent file. It doesn't like the -1. Right now it is just one specific line (line 2048 after the header is removed), so just removing line 2048 would be fine too. It's the last line of the files, which are a batch of TSV files that are CCD readouts. Thanks in advance!
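For reference, range(-1) is an empty sequence, so the loop in the second block never calls f.readline() and the file is copied unchanged. Reading forward with readline() also cannot skip something at the end of a file. Below is a minimal sketch of one alternative, assuming each file is small enough to hold in memory; the helper name drop_last_line is hypothetical and not from the original post.

```python
import os
import tempfile

print(list(range(-1)))  # prints [] because a negative stop gives an empty range


def drop_last_line(path):
    """Rewrite the file at `path` without its final line."""
    with open(path) as f:
        lines = f.readlines()
    # Write to a temp file in the same directory, then swap it into place.
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, 'w') as w:
        w.writelines(lines[:-1])  # everything except the last line
    os.replace(temp_path, path)
```

Inside the existing loop this could be called as drop_last_line(myspec) for each file returned by glob.glob.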

Related

Unable to write in first line of csv file

I'm trying to append values to a csv file and convert it into a list. However, I'm not able to write on the first line of the csv file. Instead, the code starts writing from the second line. Any clarifications would be appreciated.
Thanks
time = get_time()
time_list = []
with open('time_data.csv', 'a', newline='') as time_file:
    time_file_write = csv.writer(time_file, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    time_file_write.writerow([time])
with open('time_data.csv', 'r') as time_data:
    read = csv.reader(time_data)
    for i in read:
        time_list.append(int(i[0]))
The error is not reproducible. But...
Older versions of Windows would produce a CSV with a CRLF before the first line and no CRLF after the last line. Newer versions of Windows keep the first line intact and add an additional CRLF after the last line. On Linux there will be no unwanted CRLF anywhere. Even if you redefine the settings when writing, the basic behavior remains.
A few points to note:
The csv writerow takes different settings for Windows / Linux / etc., as given under lib/csv.py. The settings are different for different versions of Windows, but they are the same for all flavors and versions of Linux. So, though the error does not show up in Windows here, you might have...
When asking, it is better to provide simple, to-the-point, immediately executable code, like this:
import csv
time = '12'
with open('time_data.csv', 'a', newline='') as time_file:
    time_file_write = csv.writer(time_file, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    time_file_write.writerow([time])
with open('time_data.csv', 'r') as time_data:
    read = csv.reader(time_data)
    for i in read:
        print(i)
This code, with (1) get_time() removed and (2) the definition and population of time_list removed, would help you understand where the issue is. It ran as expected.
This code (without get_time()) does not generate a blank first line on Windows either, and the line time = get_time() adds ambiguity because we do not know what it returns. You need to check that too in order to resolve the issue, if an older Windows version is not the cause.
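One way to check what actually sits at the start of the file is to look at its raw bytes. Because the file is opened in append mode, anything already in it (including an empty first line) stays in front of the newly written row. This is only a diagnostic sketch; the helper name first_bytes and the default of 40 bytes are assumptions made for illustration.

```python
def first_bytes(path, count=40):
    """Return the first `count` raw bytes of a file, line endings included."""
    with open(path, 'rb') as f:
        return f.read(count)
```

Printing first_bytes('time_data.csv') before and after the append would show whether a leading b'\r\n' or b'\n' was already present.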

Why won't this Python script replace one variable with another variable?

I have a CSV file with two columns in it, the one on the left being an old string, and the one directly to the right being the new one. I have a heap of .xml files that contain the old strings, which I need to replace/update with the new ones.
The script is supposed to open each .xml file one at a time and replace all of the old strings in the CSV file with the new ones. I have tried to use a replace function to replace instances of the old string, called 'column[0]', with the new string, called 'column[1]'. However, I must be missing something, as this seems to do nothing. If I change the first variable in the replace function to an actual string with quotation marks, the replace function works. However, if both the terms in the replace function are variables, it doesn't.
Does anyone know what I am doing wrong?
import os
import csv
with open('csv.csv') as csv:
    lines = csv.readline()
    column = lines.split(',')
fileNames=[f for f in os.listdir('.') if f.endswith('.xml')]
for f in fileNames:
    x=open(f).read()
    x=x.replace(column[0],column[1])
    print(x)
Example of CSV file:
oldstring1,newstring1
oldstring2,newstring2
Example of .xml file:
Word words words oldstring1 words words words oldstring2
What I want in the new .xml files:
Word words words newstring1 words words words newstring2
The problem here is that you are treating the CSV file as a normal text file and not looping over all the lines in it.
You need to read the file using a csv reader.
The following code should work for your task:
import os
import csv
with open('csv.csv') as csvfile:
    # read every old/new pair up front; a csv reader can only be iterated once
    rows = list(csv.reader(csvfile))
fileNames=[f for f in os.listdir('.') if f.endswith('.xml')]
for f in fileNames:
    x=open(f).read()
    for row in rows:
        x=x.replace(row[0],row[1])
    print(x)
It looks like this is better done using sed. However, if we want to use Python, it seems to me that what you want to do is best achieved by:
1. reading all the obsolete/replacement pairs and storing them in a list of lists;
2. looping over the .xml files, as specified on the command line, using the handy fileinput module, specifying that we want to operate in place and keep backup files around;
3. applying all the replacements to every line in each of the .xml files;
4. putting the modified line back in the original file (simply using print, thanks to fileinput's magic), with end='' because the lines are not stripped (to preserve any whitespace) and so each already ends with its own newline.
import fileinput
import sys
old_new = [line.strip().split(',') for line in open('csv.csv')]
for line in fileinput.input(sys.argv[1:], inplace=True, backup='.bak'):
    for old, new in old_new:
        line = line.replace(old, new)
    print(line, end='')
If you save the code in replace.py, you will execute it like this
$ python3 replace.py *.xml subdir/*.xml another_one/a_single.xml

re-organize data stored in a csv

I have successfully downloaded my data from a given url and for storing it into a csv file I used the following code:
fx = open(destination_url, "w") #write data into a file
for line in lines: #loop through the string
    fx.write(line + "\n")
fx.close() # close the file object
return
What happened is that the data is stored, but not in separate lines. As one can see in the snapshot, the data is not separated into different lines when I use the '\n'.
Every separate line of data that I wanted seems to be separated by a '\r' (marked in yellow) within the same cell in the csv file.
I know I am missing something here, but can I get some pointers on rearranging each line that ends with a \r into a separate line?
I hope I have made myself clear.
Thanks
~V
There is a method called writelines:
https://www.tutorialspoint.com/python/file_writelines.htm
There are some examples in the given link; you can try that first, and in principle it should work. If the above method does not work, we need the format of the data (what is inside each element), so print it out during each iteration.
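If the downloaded strings really do contain embedded '\r' characters, as the question describes, one possible approach is to split each string on its line boundaries before writing. This is a sketch under that assumption; the helper names split_embedded_lines and write_rows are hypothetical. Note that str.splitlines() splits on '\r', '\n', '\r\n' and a few other boundary characters.

```python
def split_embedded_lines(chunks):
    """Split every string in `chunks` on its embedded line boundaries."""
    rows = []
    for chunk in chunks:
        rows.extend(chunk.splitlines())
    return rows


def write_rows(path, rows):
    """Write one row per line to the file at `path`."""
    with open(path, 'w') as fx:
        for row in rows:
            fx.write(row + '\n')
```

In the question's code this would amount to calling write_rows(destination_url, split_embedded_lines(lines)) in place of the loop.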

Overwriting specific lines in Python

I have a simple program that manipulates some stored data on some text files. However I have to store the name and the password on different files for python to read.
I was wondering if I could get these two words (The name and the password) on two separate lines on one file and get python to overwrite just one of the lines based on what I choose to overwrite (either the password or the name).
I can get python to read specific lines with:
linenumber=linecache.getline("example.txt",4)
Ideally I'd like something like this:
linenumber=linecache.writeline("example.txt","Hello",4)
So this would just write "Hello" in "example.txt" only on line 4.
But unfortunately it doesn't seem to be as simple as that, I can get the words to be stored on separate files but overall doing this on a larger scale, I'm going to have a lot of text files all named differently and with different words on them.
If anyone would be able to help, it would be much appreciated!
Thanks, James.
You can try with the built-in open() function:
def overwrite(filename,newline,linenumber):
    # note: linenumber is a 0-based index here, unlike linecache.getline
    try:
        with open(filename,'r') as reading:
            lines = reading.readlines()
        lines[linenumber]=newline+'\n'
        with open(filename,'w') as writing:
            for i in lines:
                writing.write(i)
        return 0
    except:
        return 1 #when reading/writing gone wrong, eg. no such a file
Be careful! It writes all the lines all over again in a loop, and if an exception occurs, example.txt may already be blank. You may want to keep all the lines in a list the whole time so you can write them back to the file in the exception handler, or keep backups of your old files.
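One way to reduce the risk of ending up with a blank file is to write the new contents to a temporary file and only then swap it into place. The following is a sketch of that idea, not the answer's method; the name overwrite_line is hypothetical, and the line number is taken as 1-based to match linecache.getline.

```python
import os
import tempfile


def overwrite_line(filename, new_text, line_number):
    """Replace line `line_number` (1-based) of `filename` with `new_text`."""
    with open(filename) as src:
        lines = src.readlines()
    if not 1 <= line_number <= len(lines):
        raise IndexError('line number out of range')
    lines[line_number - 1] = new_text + '\n'
    # Write the full contents to a temp file next to the original.
    directory = os.path.dirname(os.path.abspath(filename))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as dst:
            dst.writelines(lines)
        os.replace(temp_path, filename)  # the original is replaced only now
    except Exception:
        os.unlink(temp_path)  # clean up; the original file is left untouched
        raise
```

With this, overwrite_line("example.txt", "Hello", 4) would rewrite only line 4 and leave the other lines as they were.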

file.read() not working as intended in string comparison

I've been trying to get the following code to create a .txt file, write a string to it, and then print a message if that string is in the file. This is merely a study for a more complex project, but even given its simplicity, it's still not working.
Code:
import io
file = open("C:\\Users\\...\\txt.txt", "w+") #"..." is the rest of the file destination
file.write('wololo')
if "wololo" in file.read():
    print ("ok")
This code always skips the if as though there were no "wololo" inside the file, even though I've checked every time and it was properly in there.
I'm not exactly sure what the problem could be, and I've spent a great deal of time searching everywhere for a solution, all to no avail. What could be wrong in this simple code?
Oh, and if I was to search for a string in a much bigger .txt file, would it still be wise to use file.read()?
Thanks!
When you write to your file, the cursor is moved to the end of the file. If you want to read the data afterwards, you'll have to move the cursor back to the beginning of the file, like this:
file = open("txt.txt", "w+")
file.write('wololo')
file.seek(0)
if "wololo" in file.read():
    print ("ok")
file.close() # Remember to close the file
If the file is big, you should consider iterating over the file line by line instead. This avoids storing the entire file in memory. Also consider using a context manager (the with keyword), so that you don't have to explicitly close the file yourself.
with open('bigdata.txt') as ifile: # text mode, so each line is a str
    for line in ifile:
        if 'wololo' in line:
            print('OK')
            break # stop at the first match
    else:
        # runs only if the loop finished without hitting break
        print('String not in file')
