How to delete numbers at the start of lines of a file - python-3.x

I have some files containing many lines. Each line starts with some numbers separated by ";". How can I delete these numbers and the ";" characters? I tried split to isolate the numbers so that I could delete them, but split keeps the word next to the numbers attached to them, so deleting the numbers deletes that word too. I don't want to delete the words, just the numbers and the ";". Or is there a way to do this in Notepad++?
The sample file:
https://www.dropbox.com/s/yvgc659f9rrfhop/N.txt?dl=0
file = "c:/Python34/N.txt"
h = ["1","2","3","4","5","6","7","8","9","0", ";"]
with open (file) as f:
    for line in f:
        for i in h:
            if i in line:
                line.replace(i, "")
        print (line)
        with open ("new.txt", "w") as f2:
            f2.write(line)

Regular expressions can deal with this:
import re
file = 'c:/Python34/N.txt'
with open(file) as f:
    contents = re.sub(r'\d+;', '', f.read())
with open('new.txt', 'w') as f2:
    f2.write(contents)
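Note that a pattern such as \d+; matches anywhere in a line, so digits followed by ";" later in the text would be removed as well. A minimal sketch that only touches the numbers at the very start of each line, assuming every line begins with one or more "number;" groups (the name strip_leading_numbers is made up for illustration):

```python
import re

# One or more "digits;" groups anchored at the start of a line.
LEADING_NUMBERS = re.compile(r'^(?:\d+;)+', flags=re.MULTILINE)

def strip_leading_numbers(text):
    """Remove the leading "number;" groups from every line of text."""
    return LEADING_NUMBERS.sub('', text)

sample = "12;7;hello 5;x\n3;world\n"
print(strip_leading_numbers(sample), end='')
```

The result can be written to new.txt exactly as in the answer above; only the pattern differs.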

Related

How to delete digits-only lines from a text file?

Let's say I have a text file in which each line holds either an alphanumeric value or a purely numeric value exactly 10 digits long, like the one shown below:
abcdefgh
0123456789
edf6543jewjew
9876543219
I want to delete all the lines that contain only those random 10-digit numbers, i.e. the expected output for the above example is the following:
abcdefgh
edf6543jewjew
How can one do this in Python 3.x?
with open("yourTextFile.txt", "r") as f:
    lines = f.readlines()
with open("yourTextFile.txt", "w") as f:
    for line in lines:
        if not line.strip('\n').isnumeric():
            f.write(line)
        elif len(line.strip('\n')) != 10:
            f.write(line)
Open the input file, read all its lines, filter out the lines that contain only digits, then write the remaining lines to a new file.
import re
# input_file_path and output_file_path are placeholders for your own paths.
with open(input_file_path) as file:
    lines = file.readlines()
output_lines = [line for line in lines if not re.match(r'^[0-9]+$', line.strip('\n'))]
with open(output_file_path, 'w') as file:
    # Each line still ends with its own '\n', so write them as they are.
    file.writelines(output_lines)
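import re
# Raw string so the backslash in the Windows path is not read as an escape.
fh = open(r'Desktop\Python13.txt', 'r+')
content = fh.readlines()
fh.seek(0)
# Build a new list; removing items from a list while iterating over it skips lines.
kept = [line for line in content if not re.fullmatch(r'[0-9]{10}', line.rstrip('\n'))]
fh.write(''.join(kept))
fh.truncate()
fh.close()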
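The approaches above overwrite the original file directly, so a crash halfway through the write can leave it truncated. A hedged sketch of one alternative: write the kept lines to a temporary file in the same directory and swap it in afterwards. The function name remove_ten_digit_lines is hypothetical, and the sketch assumes "digits only" means exactly ten ASCII digits per line.

```python
import os
import re
import tempfile

def remove_ten_digit_lines(path):
    """Rewrite path without the lines that consist of exactly ten digits."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the original so os.replace stays
    # on the same filesystem.
    with open(path) as src, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as tmp:
        for line in src:
            if not re.fullmatch(r'[0-9]{10}', line.rstrip('\n')):
                tmp.write(line)
    # Both files are closed here; swap the filtered copy into place.
    os.replace(tmp.name, path)
```

Calling remove_ten_digit_lines("yourTextFile.txt") would then filter that file in place.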

Adjusting the content of a txt file

I am trying to filter specific characters out of a .txt file by selectively copying its content to a string and writing that string to a second file:
file = open(filepath, 'r')
file2 = open("C:/.../test2.txt", 'w')
newline = ""
for line in file:
    for letter in line:
        if letter == "#": continue
        else: newline += letter
    newline += "\n"
file2.write(newline)
I can only copy and modify the content of file1 by appending a newline character after reading each line, but that leaves undesired empty lines in my new file:
fewfewfw
fwefewf
How do I prevent having to remove these empty lines afterwards? Is there a better way to adjust a txt file anyway?
If you are trying to remove all the # symbols from the file, use:
with open(filepath, 'r') as f1, open("C:/.../test2.txt", 'w') as f2:
    content = f1.read()
    f2.write(content.replace('#', ''))
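If the file is large, or more than one character has to go, the same idea can be applied line by line. A minimal sketch, assuming the unwanted characters are passed as a string; the helper name copy_without is made up for illustration. Because each line read from a file already ends with its own newline, nothing extra is appended, which is what avoids the empty lines from the question.

```python
import io

def copy_without(src, dst, unwanted='#'):
    """Copy src to dst line by line, dropping every character in unwanted."""
    # The three-argument form of str.maketrans builds a deletion table.
    table = str.maketrans('', '', unwanted)
    for line in src:
        dst.write(line.translate(table))

# Demonstration with in-memory files; real code would pass open(...) objects.
source = io.StringIO("few#few#fw\nfwe#fewf\n")
target = io.StringIO()
copy_without(source, target)
print(target.getvalue(), end='')
```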

python add numbers in front of lines and export it to a new file

I have a file (A.txt) that contains a series of lines. I would like to read file A and create a new file B, adding a number and a semicolon at the beginning of each line and a space before its text. At the moment I have:
with open('A.txt','r+') as f:
    for index, line in enumerate(f.readlines(), start=1):
        print('{:4d}: {}'.format(index, line.rstrip()))
The above code takes file A and adds the numbers in the format I want. The problem is that I do not know how to leave A.txt as it is, only reading its contents, and make all the changes in B.txt.
Any ideas, please?
Open file B in write mode, open("B.txt", "w"), then instead of calling print, call write on the new file object.
with open("A.txt", "r") as a, open("B.txt", "w") as b:
    b.write(...)
Your program would look like:
with open("A.txt", "r") as a, open("B.txt", "w") as b:
    index = 1
    for line in a:
        b.write("{:4d}: {}\n".format(index, line.rstrip()))
        index += 1
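The manual counter can also be replaced by enumerate, as in the question's own attempt. A small sketch that keeps the formatting in a separate function so it can be checked without touching any files; number_lines is a hypothetical name, and the "{:4d}: {}" format is taken from the code above.

```python
def number_lines(lines, start=1):
    """Return the lines prefixed with a right-aligned running number."""
    return ["{:4d}: {}\n".format(index, line.rstrip())
            for index, line in enumerate(lines, start=start)]

print(''.join(number_lines(["first\n", "second\n"])), end='')
```

With real files this becomes b.writelines(number_lines(a)) inside the same with block, leaving A.txt untouched.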

How to print a file containing a list

So basically I have a list in a file and I only want to print the lines containing an A.
Here is a small part of the list:
E5341,21/09/2015,C102,440,E,0
E5342,21/09/2015,C103,290,A,290
E5343,21/09/2015,C104,730,N,0
E5344,22/09/2015,C105,180,A,180
E5345,22/09/2015,C106,815,A,400
So I only want to print the lines containing A.
Sorry, I'm still new to Python.
I tried using one "print" to print the whole line but it failed; I guess I will always suck at Python.
You just have to:
open the file
read the lines
split each line at ","
for each line, if the 5th part of the split string is equal to "A", print the line
Code:
filepath = 'file.txt'
with open(filepath, 'r') as f:
    lines = f.readlines()
for line in lines:
    if line.split(',')[4] == "A":
        # line already ends with '\n', so stop print from adding a second one
        print(line, end='')
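Since the data is comma-separated, the csv module from the standard library can do the splitting. A hedged sketch, assuming the status letter is always the fifth field; the function name rows_with_status and its parameters are made up for illustration. Rows that are too short, such as blank lines, are skipped instead of raising an IndexError.

```python
import csv

def rows_with_status(lines, status='A', column=4):
    """Return the parsed rows whose field at index column equals status."""
    return [row for row in csv.reader(lines)
            if len(row) > column and row[column] == status]

sample = [
    "E5341,21/09/2015,C102,440,E,0\n",
    "E5342,21/09/2015,C103,290,A,290\n",
    "E5343,21/09/2015,C104,730,N,0\n",
    "E5344,22/09/2015,C105,180,A,180\n",
    "E5345,22/09/2015,C106,815,A,400\n",
]
for row in rows_with_status(sample):
    print(','.join(row))
```

An open file object can be passed in place of the sample list, because csv.reader accepts any iterable of lines.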

python csv format all rows to one line

I have a CSV file and I would like to get all the rows into one column. I've tried importing it into MS Excel and formatting it with Notepad++. However, with each attempt a piece of the data is treated as a new row.
How can I format the file with Python's csv module so that it removes the string "BRAS" and corrects the format? Each row is enclosed in double quotes (") and the delimiter is a pipe (|).
Update:
"aa|bb|cc|dd|
ee|ff"
"ba|bc|bd|be|
bf"
"ca|cb|cd|
ce|cf"
The above is supposed to be 3 rows, however my editors see them as 5 rows or 6 and so forth.
import csv
import fileinput
with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        if 'BRAS' not in line:
            w.write(line)
N.B. I get a Unicode error when trying to run this in Python:
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 18: character maps to <undefined>
This is a quick hack for small input files (the content is read to memory).
#!python2
fnameIn = 'ventoya.csv'
fnameOut = 'ventoya2.csv'
with open(fnameIn) as fin, open(fnameOut, 'w') as fout:
    data = fin.read() # content of the input file
    data = data.replace('\n', '') # make it one line
    data = data.replace('""', '|') # split char instead of doubled ""
    data = data.replace('"', '') # remove the first and last "
    print data
    for x in data.split('|'): # split by bar
        fout.write(x + '\n') # write to separate lines
Or if the goal is only to fix the extra (unwanted) newline to form a single-column CSV file, the file can be fixed first, and then read through the csv module:
#!python2
import csv
fnameIn = 'ventoya.csv'
fnameFixed = 'ventoyaFixed.csv'
fnameOut = 'ventoya2.csv'
# Fix the input file.
with open(fnameIn) as fin, open(fnameFixed, 'w') as fout:
    data = fin.read() # content of the file
    data = data.replace('\n', '') # remove the newlines
    data = data.replace('""', '"\n"') # add the newlines back between the cells
    fout.write(data)
# It is overkill, but now the fixed file can be read using
# the csv module.
with open(fnameFixed, 'rb') as fin, open(fnameOut, 'wb') as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for row in reader:
        writer.writerow(row)
You don't even need code to solve this.
1: Open the file in Notepad++
2: In the first line, select from the | symbol through to the start of the next line
3: Go to Replace and replace the selected text with |
Search mode can be Normal or Extended :)
Well, since the line breaks are consistent, you could go in and do find/replace as suggested, but you could also do a quick conversion with your python script:
import csv
import fileinput
linecount = 0
with open('ventoya.csv') as f, open('ventoya2.csv', 'w') as w:
    for line in f:
        line = line.rstrip()
        # remove unwanted breaks by concatenating pairs of rows
        if linecount%2 == 0:
            line1 = line
        else:
            full_line = line1 + line
            full_line = full_line.replace(' ','')
            # remove spaces from front of 2nd half of line
            # if you want comma delimiters, uncomment next line:
            # full_line = full_line.replace('|',',')
            if 'BRAS' not in full_line:
                w.write(full_line + '\n')
        linecount += 1
This works for me with the test data, and you can change the delimiters while writing to the file if you want. The nice thing about doing it with code is: 1. you can do it with code (always fun) and 2. you can remove the line breaks and filter the content written to the file at the same time.
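If the stray line breaks do not always fall after the same number of lines, pairing rows will not work. Because every record is wrapped in double quotes, the csv module can be left to find the record boundaries itself, as a quoted field may contain line breaks. A Python 3 sketch under that assumption; rejoin_records and its drop_marker parameter are hypothetical names, and dropping whole records that contain the marker mirrors the 'BRAS' check in the code above.

```python
import csv
import io

def rejoin_records(text, drop_marker=None):
    """Return one list of cells per quoted record in text."""
    records = []
    # newline='' lets the csv module see the embedded line breaks itself.
    reader = csv.reader(io.StringIO(text, newline=''),
                        delimiter='|', quotechar='"')
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        # A quoted record arrives as a single field that still contains
        # the pipes and the unwanted line breaks; drop the breaks.
        joined = '|'.join(row).replace('\r', '').replace('\n', '')
        if drop_marker is not None and drop_marker in joined:
            continue
        records.append(joined.split('|'))
    return records

sample = '"aa|bb|cc|dd|\nee|ff"\n"ba|bc|bd|be|\nbf"\n"ca|cb|cd|\nce|cf"\n'
for cells in rejoin_records(sample):
    print('|'.join(cells))
```

When reading from disk, the file would be opened with newline='' and an explicit encoding argument. Which encoding the file in the question actually uses is not known from the error message alone.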
