ValueError using split in Python 3.6.5 - python-3.x

ValueError: not enough values to unpack (expected 2, got 1)
I am new to Python. I'm trying to run the following script and getting the above error on line 3. I'm running Python 3.6.5. Any ideas?
with open ('namespace.txt', 'r') as f, open ('testfile.txt', 'w') as fo:
    for line in f:
        t,y =line.split()
        fo.write(t + '\n')
        print(t)
f.close
fo.close

One of your lines has fewer than two whitespace-separated fields, so the two-variable unpacking fails. If what you want is the first field, you could do this instead:
with open('namespace.txt', 'r') as f, open('testfile.txt', 'w') as fo:
    for line in f:
        t = line.split()
        fo.write(t[0] + '\n')
        print(t)
# no explicit close is needed: the with statement closes both files
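If some lines might be blank or have only one field, a small guard avoids both the ValueError and an IndexError. This is just a sketch of one way to handle it, not part of the original answer:

with open('namespace.txt', 'r') as f, open('testfile.txt', 'w') as fo:
    for line in f:
        fields = line.split()
        if not fields:
            # skip blank lines entirely
            continue
        fo.write(fields[0] + '\n')
        print(fields[0])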

Related

Multi line binary file read line by line and convert to char in python

My text file has data like this:
01100110011010010111001001110011011101000010000001101100011010010110111001100101
0111001101100101011000110110111101101110011001000010000001101100011010010110111001100101
01110100011010000110100101110010011001000010000001101100011010010110111001100101
I need to convert this data file to English text with Python, but my program gets the following error:
ValueError: invalid literal for int() with base 2: ''
Please help me solve this.
def bit2strings():
    with open('test_doc.txt', 'r') as f:
        x = (f.read())
        for line in x.split(' '):
            data = line
            if data == '':
                print(data)
                break
            else:
                data = f.read(8)
                plaintext = chr(int(data, 2))
                print(plaintext, end='')
                data = f.read(8)
I know this is not very advanced code, but I did eventually write a program that solves my problem. I am a beginner in Python, so please give me some comments on how to improve this code further. This is my code:
def bit2strings():
    with open('test_doc.txt', 'r') as f:
        for i in f:
            #print(i)
            for j in range(len(i)//8):
                s = ((i[j * 8:j * 8 + 8]))
                #print(s)
                get_string = ''.join(chr(int(s, 2)))
                print(get_string, end='')
            print(end='\n')

if __name__ == '__main__':
    bit2strings()
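For comparison, a tidier version of the same idea could look like the sketch below. It assumes each line is a run of 8-bit ASCII codes with no separators; the helper name bits_to_string is only illustrative:

def bits_to_string(line):
    """Convert a string of '0'/'1' characters into text, 8 bits per character."""
    line = line.strip()
    return ''.join(chr(int(line[i:i + 8], 2)) for i in range(0, len(line), 8))

def bit2strings(filename='test_doc.txt'):
    with open(filename, 'r') as f:
        for line in f:
            print(bits_to_string(line))

if __name__ == '__main__':
    bit2strings()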

Python: input saving multiline string

while True:
    try:
        line = input("paste:")
    except EOFError:
        break
f = open("notam_new.txt", "w+")
f.write(line)
f.close()
This code returns only the last line of the multi-line input after Ctrl+D.
I also tried:
notam = input("paste new notam: ")
f = open("notam_new.txt", "w+")
f.write(notam)
f.close()
This gets only the first row.
Any ideas?
You're setting line in a loop, so every iteration you're just overwriting it with the next one. You need to accumulate your lines in a list (created before the while True) so you can keep track of all of them, and then write them to the file in a loop. You also need to add a newline, since input() strips it.
lines = []
while True:
    try:
        lines.append(input("paste:"))
    except EOFError:
        break

with open("notam_new.txt", "w+") as f:
    for line in lines:
        f.write(line)
        f.write('\n')
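Alternatively, if you simply want everything pasted up to Ctrl+D as one block, you can read standard input directly. This is a sketch, not part of the original answer:

import sys

print("paste:")
text = sys.stdin.read()  # reads everything until EOF (Ctrl+D, or Ctrl+Z on Windows)
with open("notam_new.txt", "w") as f:
    f.write(text)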

Python Split itertools output to multi files(BIG output)

So I have created a script that reads lines from a file (1500 lines) and writes them out 10 per line, producing every possible combination with product (a b c d a, a b c d b, etc.).
The thing is, the moment I run the script my computer freezes completely (because it writes so much data).
So I thought: is it possible to have the script save to a file every 100 MB and store the current state, so that when I run it again it will actually resume from where it stopped (the last line of the 100 MB file)?
Or if you have another solution I would love to hear it :P
Here's the script:
from itertools import product
with open('file.txt', 'r') as f:
    content = f.readlines()
comb = product(content, repeat=10)
new_content = [elem for elem in list(comb)]
with open('log.txt', 'w') as f:
    for line in new_content:
        f.write(str(line) + '\n')
The line
new_content = [elem for elem in list(comb)]
takes the generator and transforms it into a list in memory, twice. The result is the same as just doing
new_content = list(comb)
Your computer freezes up because this will use all of the available RAM.
Since you use new_content only for iterating over it, you could just iterate over the initial generator directly instead:
from itertools import product
with open('file.txt', 'r') as f:
    content = f.readlines()
comb = product(content, repeat=10)
with open('log.txt', 'w') as f:
    for line in comb:
        f.write(str(line) + '\n')
But now this will fill up your harddisk, since with an input size of 1500 lines it will produce 57665039062500000000000000000000 lines (1500**10) of output.
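You can check that count for yourself:

print(1500 ** 10)  # 57665039062500000000000000000000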
I would open the file in a separate function and yield a line at a time - that way you're never going to blow your memory.
def read_file(filename):
    with open(filename, "r") as f:
        for line in f:
            yield line
Then you can use this in your code:
for line in read_file("log.txt"):
    f.write(line + "\n")
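Neither answer addresses the save-and-resume part of the question. One possible approach (only a sketch; the chunk size and the progress file name are my own choices, not from the original answers) is to record how many combinations have been written so far and skip that many with itertools.islice when the script restarts:

from itertools import product, islice
import os

CHUNK = 100000             # combinations written per batch; tune as needed
PROGRESS = 'progress.txt'  # hypothetical file recording how many lines were written

with open('file.txt', 'r') as f:
    content = f.readlines()

# figure out how many combinations were written by previous runs
done = 0
if os.path.exists(PROGRESS):
    with open(PROGRESS) as p:
        done = int(p.read() or 0)

# skip what has already been written (slow for a huge offset, but simple)
comb = islice(product(content, repeat=10), done, None)

with open('log.txt', 'a') as out:
    while True:
        batch = list(islice(comb, CHUNK))
        if not batch:
            break
        out.writelines(str(line) + '\n' for line in batch)
        done += len(batch)
        with open(PROGRESS, 'w') as p:
            p.write(str(done))

Keep in mind that this only makes resuming possible; the total amount of output is still astronomically large.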

Reading each line from text file

I have a script which reads each line from a text file, but somehow it prints everything at once. I want it to process one line, finish, and then move on to the next. Here is the code:
f = open('textfile.txt', 'r')
file = f.read()
for x in file:
    print(x, file.strip())
    comSerialPort.write(x.encode('utf-8'))
Use readlines instead of read
with open('textfile.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        print(line)
        # do stuff with each line
Use a with statement and then iterate over the lines.
Ex:
with open('textfile.txt', 'r') as infile:
    for line in infile:
        print(line)
        comSerialPort.write(line.strip().encode('utf-8'))
Note: read() reads the entire content of the file.
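If the goal is also to wait for the device to react to each line before sending the next one, a rough sketch could look like this. It assumes comSerialPort is a pyserial serial.Serial object and that the device sends a reply; the port name and baud rate below are placeholders:

import serial

comSerialPort = serial.Serial('COM3', 9600, timeout=2)  # placeholder port and baud rate

with open('textfile.txt', 'r') as infile:
    for line in infile:
        comSerialPort.write(line.strip().encode('utf-8'))
        reply = comSerialPort.readline()  # blocks (up to the timeout) for the device's response
        print(line.strip(), '->', reply)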

Issue opening large number of files in Python

I'm trying to process a pipe-separated text file with the following format:
18511|1|2587198|2004-03-31|0|100000|0|1.97|0.49988|100000||||
18511|2|2587198|2004-06-30|0|160000|0|3.2|0.79669|60000|60|||
18511|3|2587198|2004-09-30|0|160000|0|2.17|0.79279|0|0|||
18511|4|2587198|2004-09-30|0|160000|0|1.72|0.79118|0|0|||
18511|5|2587198|2005-03-31|0|0|0|0|0|-160000|-100|||19
18511|1|2587940|2004-03-31|0|240000|0|0.78|0.27327|240000||||
18511|2|2587940|2004-06-30|0|560000|0|1.59|0.63576|320000|133.33||24|
18511|3|2587940|2004-09-30|0|560000|0|1.13|0.50704|0|0|||
18511|4|2587940|2004-09-30|0|560000|0|0.96|0.50704|0|0|||
18511|5|2587940|2005-03-31|0|0|0|0|0|-560000|-100|||14
For each line I want to isolate the second field and write that line to a file with that field as part of the filename, e.g. issue1.txt, issue2.txt, where the number is the second field in the file excerpt above. This number can be in the range 1 to 56. My code is shown below:
with open('d:\\tmp\issueholding.txt') as f, open('d:\\tmp\issue1.txt', 'w') as out_f1,\
open('d:\\tmp\issue2.txt', 'w') as out_f2,open('d:\\tmp\issue3.txt', 'w') as out_f3,\
open('d:\\tmp\issue4.txt', 'w') as out_f4,open('d:\\tmp\issue5.txt', 'w') as out_f5,\
open('d:\\tmp\issue6.txt', 'w') as out_f6,open('d:\\tmp\issue7.txt', 'w') as out_f7,\
open('d:\\tmp\issue8.txt', 'w') as out_f8,open('d:\\tmp\issue9.txt', 'w') as out_f9,\
open('d:\\tmp\issue10.txt', 'w') as out_f10,open('d:\\tmp\issue11.txt', 'w') as out_f11,\
open('d:\\tmp\issue12.txt', 'w') as out_f12,open('d:\\tmp\issue13.txt', 'w') as out_f13,\
open('d:\\tmp\issue14.txt', 'w') as out_f14,open('d:\\tmp\issue15.txt', 'w') as out_f15,\
open('d:\\tmp\issue16.txt', 'w') as out_f16,open('d:\\tmp\issue17.txt', 'w') as out_f17,\
open('d:\\tmp\issue18.txt', 'w') as out_f18,open('d:\\tmp\issue19.txt', 'w') as out_f19,\
open('d:\\tmp\issue20.txt', 'w') as out_f20,open('d:\\tmp\issue21.txt', 'w') as out_f21,\
open('d:\\tmp\issue22.txt', 'w') as out_f22,open('d:\\tmp\issue23.txt', 'w') as out_f23,\
open('d:\\tmp\issue24.txt', 'w') as out_f24,open('d:\\tmp\issue25.txt', 'w') as out_f25,\
open('d:\\tmp\issue32.txt', 'w') as out_f32,open('d:\\tmp\issue33.txt', 'w') as out_f33,\
open('d:\\tmp\issue34.txt', 'w') as out_f34,open('d:\\tmp\issue35.txt', 'w') as out_f35,\
open('d:\\tmp\issue36.txt', 'w') as out_f36,open('d:\\tmp\issue37.txt', 'w') as out_f37,\
open('d:\\tmp\issue38.txt', 'w') as out_f38,open('d:\\tmp\issue39.txt', 'w') as out_f39,\
open('d:\\tmp\issue40.txt', 'w') as out_f40,open('d:\\tmp\issue41.txt', 'w') as out_f41,\
open('d:\\tmp\issue42.txt', 'w') as out_f42,open('d:\\tmp\issue43.txt', 'w') as out_f43,\
open('d:\\tmp\issue44.txt', 'w') as out_f44,open('d:\\tmp\issue45.txt', 'w') as out_f45,\
open('d:\\tmp\issue46.txt', 'w') as out_f46,open('d:\\tmp\issue47.txt', 'w') as out_f47,\
open('d:\\tmp\issue48.txt', 'w') as out_f48,open('d:\\tmp\issue49.txt', 'w') as out_f49,\
open('d:\\tmp\issue50.txt', 'w') as out_f50,open('d:\\tmp\issue51.txt', 'w') as out_f51,\
open('d:\\tmp\issue52.txt', 'w') as out_f52,open('d:\\tmp\issue53.txt', 'w') as out_f53,\
open('d:\\tmp\issue54.txt', 'w') as out_f54,open('d:\\tmp\issue55.txt', 'w') as out_f55,\
open('d:\\tmp\issue56.txt', 'w') as out_f56:
    for line in f:
        field1_end = line.find('|') + 1
        field2_end = line.find('|', field1_end)
        f2 = line[field1_end:field2_end]
        out_f56.write(line)
My two issues are:
1) When trying to run the above I get the following error message
File "", line unknown
SyntaxError: too many statically nested blocks
2) How do I change the line out_f56.write(line) so that I can use the variable f2 to pick the output file rather than hard-coding it?
I am running this in a Jupyter notebook with Python 3 under Windows. To be clear, the input file has approximately 235 million records, so performance is key.
Appreciate any help or suggestions
Try something like this (see comments in code for explanation):
with open(R"d:\tmp\issueholding.txt") as f:
for line in f:
# splitting line into list of strings at '|' character
fields = line.split('|')
# defining output file name according to issue code in second field
# NB: list-indexes are zero-based, therefore use 1
out_name = R"d:\tmp\issue%s.txt" % fields[1]
# opening output file and writing current line to it
# NB: make sure you use the 'a+' mode to append to existing file
with open(out_name, 'a+') as ff:
ff.write(line)
To avoid opening files repeatedly inside the reading loop, you could do the following:
from collections import defaultdict

with open(R"D:\tmp\issueholding.txt") as f:
    # setting up dictionary to hold lines grouped by issue code
    # using a defaultdict here to automatically create a list when inserting
    # the first item
    collected_issues = defaultdict(list)
    for line in f:
        # splitting line into list of strings at '|' character and retrieving
        # current issue code from second token
        issue_code = line.split('|')[1]
        # appending current line to list of collected lines associated with
        # current issue code
        collected_issues[issue_code].append(line)
    else:
        for issue_code in collected_issues:
            # defining output file name according to issue code
            out_name = R"D:\tmp\issue%s.txt" % issue_code
            # opening output file and writing collected lines to it
            with open(out_name, 'a+') as ff:
                ff.write("".join(collected_issues[issue_code]))
This of course creates an in-memory dictionary holding all lines retrieved from the input file. Given your specification, that may well not be feasible on your machine. An alternative is to split up the input file and process it chunk by chunk instead. This can be done by defining a helper that reads a fixed number of lines (here: 1000) from the input file at a time. A possible final solution could then look like this:
from itertools import islice
from collections import defaultdict

def get_chunk_of_lines(file, N):
    """
    Retrieves N lines from the specified opened file.
    """
    return [x.strip() for x in islice(file, N)]

def collect_issues(lines):
    """
    Collects and groups issues from the specified lines.
    """
    collected_issues = defaultdict(list)
    for line in lines:
        # splitting line into list of strings at '|' character and retrieving
        # current issue code from second token
        issue_code = line.split('|')[1]
        # appending current line to list of collected lines associated with
        # current issue code
        collected_issues[issue_code].append(line)
    return collected_issues

def export_grouped_issues(issues):
    """
    Exports collected and grouped issues.
    """
    for issue_code in issues:
        # defining output file name according to issue code
        out_name = R"D:\tmp\issue%s.txt" % issue_code
        # opening output file and writing collected lines to it
        # NB: lines were stripped in get_chunk_of_lines, so newlines are re-added here
        with open(out_name, 'a+') as f:
            f.write("\n".join(issues[issue_code]) + "\n")

with open(R"D:\tmp\issueholding.txt") as issue_src:
    chunk_cnt = 0
    while True:
        # retrieving 1000 input lines at a time
        line_chunk = get_chunk_of_lines(issue_src, 1000)
        # exiting the while loop if no chunk is left
        if not line_chunk:
            break
        chunk_cnt += 1
        print("+ Working on chunk %d" % chunk_cnt)
        # collecting, grouping and exporting issues
        issues = collect_issues(line_chunk)
        export_grouped_issues(issues)
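Since there are at most 56 distinct issue codes, another option (not part of the answer above, just a sketch) is to open each output file once, on first use, and keep the handles in a dictionary for the duration of the run. 56 open files is well below typical OS limits and avoids both the repeated opens and the in-memory grouping:

with open(R"D:\tmp\issueholding.txt") as f:
    handles = {}  # maps issue code -> open output file
    try:
        for line in f:
            issue_code = line.split('|')[1]
            if issue_code not in handles:
                handles[issue_code] = open(R"D:\tmp\issue%s.txt" % issue_code, 'w')
            handles[issue_code].write(line)
    finally:
        for handle in handles.values():
            handle.close()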
