How to maintain multiple stream positions in a Python stream - python-3.x

I'd like to use two stream pointers within a single stream and position them independently. How do I make a copy of the first stream so that, from this point on, the copy doesn't mirror the state of the original?
In particular, I'm interested in streams of type io.BytesIO.
import io
stream1 = open("Input.jpg", "rb")
stream2 = stream1
print('A', stream1.tell(), stream2.tell())
stream1.seek(10)
print('B', stream1.tell(), stream2.tell())
My goal is to see the output
A 0 0
B 10 0
However, I see
A 0 0
B 10 10
@varela
Thanks for the response. Unfortunately, this doesn't work when the stream doesn't have a file descriptor (which happens when the stream wasn't created by opening a file). For example, instead of stream1 = open("Input.jpg", "rb"):
stream1 = io.BytesIO()
image.save(stream1, format='JPEG')
Any suggestions on how to handle this case?
Thanks.

You can open the file twice, like:
stream1 = open("Input.jpg", "rb")
stream2 = open("Input.jpg", "rb")
Then they will be independent. When you do stream2 = stream1 you just copy the object reference, which doesn't create a new object.
You need to remember to close both file objects as well.
Usually duplicating a file descriptor is not needed. It's possible with low-level system operations, but I wouldn't recommend it unless you really have a use case for it, for example:
import os

# return integer file handles
# note: os.O_BINARY exists only on Windows; on POSIX, open with os.O_RDONLY alone
fd1 = os.open("Input.jpg", os.O_BINARY | os.O_RDONLY)
fd2 = os.dup(fd1)

# you can convert them to file objects if required
stream1 = os.fdopen(fd1, 'rb')
stream2 = os.fdopen(fd2, 'rb')
Here are some use cases where os.dup makes sense: dup2 / dup - why would I need to duplicate a file descriptor?
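For the io.BytesIO case raised in the comment above there is no file descriptor to duplicate. One option (a sketch, not part of the original answer) is to snapshot the buffer's current contents into a second, independent BytesIO via getvalue():
import io

stream1 = io.BytesIO()
stream1.write(b'...jpeg bytes...')  # e.g. image.save(stream1, format='JPEG')

# getvalue() returns a copy of the entire buffer regardless of the current
# position, so stream2 is fully independent of stream1 from this point on
stream2 = io.BytesIO(stream1.getvalue())

stream1.seek(10)
print(stream1.tell(), stream2.tell())  # prints: 10 0
Note that this copies the buffer, so a large image will briefly occupy memory twice.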

Related

Need assistance on creating for loop

I'm trying to write bytes from one file to a second file, then go back to the first file and delete the bytes written. I'm doing this one byte at a time: the first byte is copied and written to the second file, then that byte is removed from the first file.
The problem I'm having is creating a for loop (assuming that's the best way to go about this) to make this happen. My current code is below:
in_file = open('file', "rb")
data = in_file.read()
length = len(in_file.read())
in_file.close()
out_file = open('file2', "wb")
out_file.write(data[length:length+1])
out_file.close()
in_file = open('file', "wb")
in_file.write(data[1:])
in_file.close()
in_file = open('file', "rb")
data = in_file.read()
length = len(in_file.read())
in_file.close()
out_file = open('file2', "ab")
out_file.write(data[length:length+1])
out_file.close()
in_file = open('file', "wb")
in_file.write(data[1:])
in_file.close()
in_file = open('file', "rb")
data = in_file.read()
length = len(in_file.read())
in_file.close()
out_file = open('file2', "ab")
out_file.write(data[length:length+1])
out_file.close()
in_file = open('file', "wb")
in_file.write(data[1:])
in_file.close()
I guess the way I saw this happening is that I write the first byte outside of the loop, and then a for loop appends each subsequent byte between the two files. I've tried creating a for loop for that sequence, but I keep receiving errors about trying to access a closed file, so I'm not sure when/where to close my files. The reason I'm doing this is that eventually I will convert each byte (the files I'm dealing with contain obfuscated bytes that I need to convert back) to a different byte value.
I appreciate any assistance!
Keep the two files open until you're done; repeatedly closing and reopening them for every byte is slow and error-prone.
You may want to flush the output file object before the final (and only) close, i.e.
in_file.close()   # no need to flush the input file
out_file.flush()  # do flush the output file
out_file.close()
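A minimal sketch of the loop the question asks for, assuming the goal is to copy the source one byte at a time and truncate it afterwards (the byte-transformation step is left as a comment):
in_file = open('file', 'rb')
out_file = open('file2', 'wb')

data = in_file.read()
for i in range(len(data)):
    byte = data[i:i+1]
    # convert the obfuscated byte here before writing, if needed
    out_file.write(byte)

out_file.flush()
out_file.close()
in_file.close()

# truncate the source file now that everything has been copied
open('file', 'wb').close()
Both files stay open for the whole loop, which avoids the closed-file errors described above.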

TypeError: string argument expected, got 'bytes'

I would like to convert the hex sequence below to an image. In sifting through quite a number of problems similar to mine, none have come as close as the one solved in https://stackoverflow.com/a/33989302/13648455. My code is below; where could I be going wrong?
data = "2a2b2c2a2b2c2a2b2c2a2b2cb1"
buf = io.StringIO()
for line in data.splitlines():
line = line.strip().replace(" ", "")
if not line:
continue
bytez = binascii.unhexlify(line)
buf.write(bytez)
with open("image.jpg", "wb") as f:
f.write(buf.getvalue())
io.StringIO() creates an in-memory text stream, which only accepts str.
You need io.BytesIO() instead, which creates a bytes object to which you can write your binary data:
buf = io.BytesIO()
...
buf.write(bytez)
See also io — Core tools for working with streams
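Putting it together, a minimal runnable version of the question's code with BytesIO substituted (the hex string is taken from the question; whether those bytes form a valid JPEG is a separate matter):
import binascii
import io

data = "2a2b2c2a2b2c2a2b2c2a2b2cb1"

buf = io.BytesIO()
for line in data.splitlines():
    line = line.strip().replace(" ", "")
    if not line:
        continue
    buf.write(binascii.unhexlify(line))  # bytes write cleanly to BytesIO

with open("image.jpg", "wb") as f:
    f.write(buf.getvalue())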

Two pieces of Python code create zip archives; one of the two is broken

I want to create a zip file dynamically and return it in an HTTP response. I use the Python 3.7 zipfile module.
I tried both an io buffer and a tmp dir; neither of them creates a valid zip archive. The archive opens only if it's saved on disk.
import zipfile
import io
#==============================================
# V1
file_like_object = io.BytesIO()
myZipFile = zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile.writestr(u'test.py', b'test')
tmparchive="zip1.zip"
out = open(tmparchive,'wb') ## Open temporary file as bytes
out.write(file_like_object.getvalue())
out.close()
r = open(tmparchive, 'rb')
print (r.read())
r.close()
#==============================================
# V2
tmparchive2 = 'zip2.zip'
myZipFile2 = zipfile.ZipFile(tmparchive2, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile2.writestr(u'test.py', b'test')
r2 = open(tmparchive2, 'rb')
print (r2.read())
r2.close()
#====================================================
It's preferable to use a context manager like so:
import zipfile, io
file_like_object = io.BytesIO()
with zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED) as myZipFile:
    myZipFile.writestr(u'test.txt', b'test')
# file_like_object.getvalue() are the bytes you send in your http response.
I wrote it to a file. It's definitely a valid zip file.
If you want to open the archive, you need to save it to disk. Applications like Explorer and 7-Zip have no way to read the BytesIO object that exists in the python process. They can only open archives saved to disk.
Calling print(r.read()) isn't going to open the archive. It's just going to print the bytes that make up the tiny zip file you just created.
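For reference, a minimal end-to-end sketch: closing the archive (here, by leaving the with block) is what writes the zip central directory; only then do the buffer's bytes form a valid archive that can be saved and verified:
import io
import zipfile

file_like_object = io.BytesIO()
with zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(u'test.txt', b'test')
# the archive is finalized once the with-block exits

with open("zip1.zip", "wb") as out:
    out.write(file_like_object.getvalue())

print(zipfile.is_zipfile("zip1.zip"))  # True
This is also why V1 in the question produces a broken file: getvalue() is called before the ZipFile is ever closed.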

python-snappy streaming data in a loop to a client

I would like to send multiple compressed arrays from a server to a client using python-snappy, but I cannot get it to work after the first array. Here is a snippet of what is happening:
(sock is just the network socket that these are communicating through)
Server:
for i in range(n):  # number of arrays to send
    val = items[i][1]  # this is the array
    y = (json.dumps(val)).encode('utf-8')
    b = io.BytesIO(y)
    # snappy.stream_compress requires a file-like object as input, as far as I know
    with b as in_file:
        with sock as out_file:
            snappy.stream_compress(in_file, out_file)
Client:
for i in range(n):  # same n as before
    data = ''
    b = io.BytesIO()
    # snappy.stream_decompress requires a file-like object to write to, as far as I know
    snappy.stream_decompress(sock, b)
    data = b.getvalue().decode('utf-8')
    val = json.loads(data)
val = json.loads(data) works only on the first iteration; afterwards it stops working. When I do print(data), only the first iteration prints anything. I've verified that the server flushes and sends all the data, so I believe the problem is in how I receive it.
I could not find a different way to do this. I searched, and the only thing I could find is this post, which has led me to what I currently have.
Any suggestions or comments?
with doesn't do what you think; refer to its documentation. It calls sock.__exit__() after the block executes, which closes the socket. That's not what you intended.
# what you wrote
with b as in_file:
    with sock as out_file:
        snappy.stream_compress(in_file, out_file)
# what you meant
snappy.stream_compress(b, sock)
By the way:
The line data = '' is redundant because it's reassigned anyway.
Adding to @paul-scharnofske's answer:
Likewise, on the receiving side: stream_decompress doesn't stop until end-of-file, which means it reads until the socket is closed. So if you send multiple separate compressed chunks, it will read all of them before finishing, which is not what you intend. Bottom line: you need to add "framing" around each chunk so that the receiving end knows where one chunk ends and the next begins. One way to do that (a sketch follows these steps) is, for each array to be sent:
Create an io.BytesIO object with the JSON-encoded input, as you're doing now
Create a second io.BytesIO object for the compressed output
Call stream_compress with the two BytesIO objects (you can write into a BytesIO in addition to reading from it)
Obtain the len of the output object
Send the length encoded as a 32-bit integer, say, with struct.pack("!I", length)
Send the output object
On the receiving side, reverse the process. For each array:
Read 4 bytes (the length)
Create a BytesIO object. Receive exactly length bytes, writing those bytes to the object
Create a second BytesIO object
Pass the received object as input and the second object as output to stream_decompress
json-decode the resulting output object
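A sketch of both sides following those steps, reusing the snappy and socket objects from the question (the helper names send_framed, recv_exact, and recv_framed are made up for illustration):
import io
import json
import struct
import snappy

def send_framed(sock, arr):
    # compress the JSON-encoded array into an in-memory output stream
    raw = io.BytesIO(json.dumps(arr).encode('utf-8'))
    compressed = io.BytesIO()
    snappy.stream_compress(raw, compressed)
    payload = compressed.getvalue()
    # 4-byte big-endian length prefix, then the compressed chunk
    sock.sendall(struct.pack("!I", len(payload)))
    sock.sendall(payload)

def recv_exact(sock, n):
    # sock.recv(n) may return fewer than n bytes, so loop until done
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf += chunk
    return buf

def recv_framed(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    compressed = io.BytesIO(recv_exact(sock, length))
    out = io.BytesIO()
    snappy.stream_decompress(compressed, out)
    return json.loads(out.getvalue().decode('utf-8'))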

A python program for file download. But only data in last packet is written

I wrote a server program and a client program; the client is designed to download files from the server. The following is part of the client program:
if Type == DS.CONTROL_IDX:
    if data["Success"] == True:
        stream = ""
        package = data["Package"]
        file_name = data["File_Name"]
        extension = data["Extend"]
        if extension == DS.TXT:
            fp = open(os.path.join(folder_root, file_name), 'w')
            for i in range(package):
                recv_msg = sock.recv(DS.FIXED_RECEIVE_LENGTH)
                data = json.loads(recv_msg)
                stream = stream + data["File"]
            pdb.set_trace()
            write_len = fp.write(stream)
            fp.flush()
            os.fsync(fp.fileno())
            fp.close()
    else:
        print("download fail")
In the server program, I split the big file into packets that are padded to the same size. In the for loop, the data carried in each packet is appended to stream, a string object. Finally, stream is written to the local file fp. However, as shown in the result screenshots result1 and result2, only 1 KB of data is written, and 1 KB is also the maximum size of data carried in one packet.
Could someone help me? Thank you very much!
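One thing worth checking (an assumption on my part; no answer is recorded here): sock.recv(DS.FIXED_RECEIVE_LENGTH) may return fewer bytes than requested, so a single recv per packet can silently truncate data and break the json.loads call. A sketch of a helper that reads a full fixed-size packet (recv_packet is a hypothetical name):
def recv_packet(sock, size):
    # keep calling recv until exactly `size` bytes have arrived
    buf = b''
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-packet")
        buf += chunk
    return buf

# inside the loop from the question:
# recv_msg = recv_packet(sock, DS.FIXED_RECEIVE_LENGTH)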
