Reopening a closed StringIO object in Python 3

So, I create a StringIO object to treat my string as a file:
>>> a = 'Me, you and them\n'
>>> import io
>>> f = io.StringIO(a)
>>> f.read(1)
'M'
And then I proceed to close the 'file':
>>> f.close()
>>> f.closed
True
Now, when I try to open the 'file' again, Python does not permit me to do so:
>>> p = open(f)
Traceback (most recent call last):
  File "<pyshell#166>", line 1, in <module>
    p = open(f)
TypeError: invalid file: <_io.StringIO object at 0x0325D4E0>
Is there a way to 'reopen' a closed StringIO object, or do I have to create a new one with io.StringIO()?
Thanks!

I have a nice hack, which I am currently using for testing (since my code performs I/O operations, handing it a StringIO is a nice workaround).
If this problem is a one-time thing:
from io import StringIO

st = StringIO()
close = st.close         # keep a reference to the real close
st.close = lambda: None  # make close a no-op for now
f(st)          # some function which does I/O and finally closes st
st.getvalue()  # this is available now
close()        # actually close the buffer
If you don't want to store the close function, you can instead call StringIO.close(st) at the end.
If this is a recurring thing, you can also define a context manager:
import contextlib

@contextlib.contextmanager
def uncloseable(fd):
    """
    Context manager which turns the fd's close operation into a no-op
    for the duration of the context.
    """
    close = fd.close
    fd.close = lambda: None
    yield fd
    fd.close = close
which can be used in the following way:
st = StringIO()
with uncloseable(st):
    f(st)
# Now st is still open!!!
I hope this helps you with your problem, and if not, I hope you will find the solution you are looking for.
Note: This should work exactly the same for other file-like objects.
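To make the trick concrete, here is a runnable sketch of the whole round trip; f below is a hypothetical stand-in for code that does I/O and insists on closing its stream:

```python
import contextlib
import io

@contextlib.contextmanager
def uncloseable(fd):
    """Turn fd's close into a no-op for the duration of the context."""
    close = fd.close
    fd.close = lambda: None
    yield fd
    fd.close = close

def f(fd):
    # Hypothetical stand-in for code that does I/O and then closes its stream.
    fd.write('Me, you and them\n')
    fd.close()

st = io.StringIO()
with uncloseable(st):
    f(st)                 # the close() inside f is a no-op
result = st.getvalue()    # the buffer survived
st.close()                # the real close still works afterwards
```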

No, there is no way to re-open an io.StringIO object. Instead, just create a new object with io.StringIO().
Calling close() on an io.StringIO object discards the "file contents" data, so re-opening could not restore them anyway.
If you need the data, call getvalue() before closing.
See also the StringIO documentation here:
The text buffer is discarded when the close() method is called.
and here:
getvalue()
Return a str containing the entire contents of the buffer.
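A minimal sketch of that workflow: capture the contents with getvalue() before closing, then build a fresh object from the saved text if you need to "reopen":

```python
import io

f = io.StringIO('Me, you and them\n')
first = f.read(1)     # consume 'M'
saved = f.getvalue()  # grab the full buffer before closing
f.close()

# "Reopening" is really just constructing a new object from the saved text.
g = io.StringIO(saved)
```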

The builtin open() creates a file object (i.e. a stream), but in your example, f is already a stream.
That's the reason why you get TypeError: invalid file.
After the method close() has executed, any stream operation will raise ValueError.
And the documentation does not mention how to reopen a closed stream.
If you want to use (reopen) the stream again later, maybe you should not close() it yet.
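If the goal is simply to read the stream again, one approach (sketched below) is to rewind it with seek(0) instead of closing it:

```python
import io

f = io.StringIO('Me, you and them\n')
a = f.read(1)   # 'M'
f.seek(0)       # rewind instead of closing
b = f.read(1)   # 'M' again
f.close()       # close only once you are completely done
```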

When you call f.close(), the in-memory buffer is thrown away; afterwards you are effectively asking for data that no longer exists.
Here is what you could do instead:
import io

a = 'Me, you and them\n'
f = io.StringIO(a)
f.read(1)
f.close()
# Put the text from a, without the first char, into a new StringIO.
p = io.StringIO(a[1:])
# do some work with p.
I think your confusion comes from thinking of io.StringIO as a file on the block device. If you had used open() rather than StringIO, your example would be correct and you could reopen the file. StringIO is not a file: it is the idea of a file object, held in memory. A real file object has a buffer too, but its data also exists physically on the block device. A StringIO is just a buffer, a staging area in memory for the data within it. When you call open(), a buffer is created, but the data still lives on the block device.
Perhaps this is more what you want
fo = open('f.txt', 'w+')
fo.write('Me, you and them\n')
fo.seek(0)  # rewind, otherwise read(1) starts at the end of what was just written
fo.read(1)
fo.close()
# reopen the now closed file
p = open('f.txt', 'r')
# do stuff with p
p.close()
Here we are writing the string to the block device, so that when we close the file, the information written to it remains after it is closed. Because this creates a file in the directory the program is run in, it may be a good idea to give the file an extension. For example, you could name the file f.txt instead of f.

Related

FileNotFoundError But The File Is There: Cryptography Edition

I'm working on a script that takes a checksum and directory as inputs.
Without too much background, I'm looking for 'malware' (ie. a flag) in a directory of executables. I'm given the SHA512 sum of the 'malware'. I've gotten it to work (I found the flag), but I ran into an issue with the output after generalizing the function for different cryptographic protocols, encodings, and individual files instead of directories:
FileNotFoundError: [Errno 2] No such file or directory: 'lessecho'
There is indeed a file lessecho in the directory, and as it happens, is close to the file that returns the actual flag. Probably a coincidence. Probably.
Below is my Python script:
#!/usr/bin/python3
import hashlib, sys, os

"""
### TO DO ###
Add other encryption techniques
Include file read functionality
"""

def main(to_check=sys.argv[1:]):
    dir_to_check = to_check[0]
    hash_to_check = to_check[1]
    BUF_SIZE = 65536
    for f in os.listdir(dir_to_check):
        sha256 = hashlib.sha256()
        with open(f, 'br') as f:  # <--- line where the issue occurs
            while True:
                data = f.read(BUF_SIZE)
                if not data:
                    break
                sha256.update(data)
            f.close()
        if sha256.hexdigest() == hash_to_check:
            return f

if __name__ == '__main__':
    k = main()
    print(k)
Credit to Randall for his answer here
Here are some humble trinkets from my native land in exchange for your wisdom.
Your listdir call is giving you bare filenames (e.g. lessecho), but that is within the dir_to_check directory (which I'll call foo for convenience). To open the file, you need to join those two parts of the path back together, to get a proper path (e.g. foo/lessecho). The os.path.join function does exactly that:
for f in os.listdir(dir_to_check):
    sha256 = hashlib.sha256()
    with open(os.path.join(dir_to_check, f), 'br') as f:  # add os.path.join call here!
        ...
There are a few other issues in the code, unrelated to your current error. One is that you're using the same variable name f for both the file name (from the loop) and file object (in the with statement). Pick a different name for one of them, since you need both available (because I assume you intend return f to return the filename, not the recently closed file object).
And speaking of the closed file, you're actually closing the file object twice. The first one happens at the end of the with statement (that's why you use with). The second is your manual call to f.close(). You don't need the manual call at all.
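Putting those fixes together — os.path.join, separate names for the filename and the file object, and no manual close — one possible corrected version of the function looks like this (variable names are illustrative):

```python
import hashlib
import os
import sys

BUF_SIZE = 65536

def main(to_check=sys.argv[1:]):
    dir_to_check = to_check[0]
    hash_to_check = to_check[1]
    for name in os.listdir(dir_to_check):       # bare filename from listdir
        sha256 = hashlib.sha256()
        with open(os.path.join(dir_to_check, name), 'rb') as fobj:
            while True:
                data = fobj.read(BUF_SIZE)
                if not data:
                    break
                sha256.update(data)
        # the with-block already closed fobj; no manual close needed
        if sha256.hexdigest() == hash_to_check:
            return name                          # the filename, not the file object
```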

Problem with using CreatePseudoConsole to create a pty pipe in python

I am trying to make a pty pipe. For that I have to use the CreatePseudoConsole function from the Windows API. I am loosely copying this, which is this but in Python.
I don't know if it's relevant but I am using Python 3.7.9 and Windows 10.
This is my code:
from ctypes.wintypes import DWORD, HANDLE, SHORT
from ctypes import POINTER, HRESULT
import ctypes
import msvcrt
import os

# The COORD data type used for the size of the console
class COORD(ctypes.Structure):
    _fields_ = [("X", SHORT),
                ("Y", SHORT)]

# HPCON is the same as HANDLE
HPCON = HANDLE

CreatePseudoConsole = ctypes.windll.kernel32.CreatePseudoConsole
CreatePseudoConsole.argtypes = [COORD, HANDLE, HANDLE, DWORD, POINTER(HPCON)]
CreatePseudoConsole.restype = HRESULT

def create_console(width: int, height: int) -> HPCON:
    read_pty_fd, write_fd = os.pipe()
    read_pty_handle = msvcrt.get_osfhandle(read_pty_fd)
    read_fd, write_pty_fd = os.pipe()
    write_pty_handle = msvcrt.get_osfhandle(write_pty_fd)
    # Create the console
    size = COORD(width, height)
    console = HPCON()
    result = CreatePseudoConsole(size, read_pty_handle, write_pty_handle,
                                 DWORD(0), ctypes.byref(console))
    # Check if any errors occurred
    if result != 0:
        raise ctypes.WinError(result)
    # Add references for the fds to the console
    console.read_fd = read_fd
    console.write_fd = write_fd
    # Return the console object
    return console

if __name__ == "__main__":
    consol = create_console(80, 80)
    print("Writing...")
    os.write(consol.write_fd, b"abc")
    print("Reading...")
    print(os.read(consol.read_fd, 1))
    print("Done")
The problem is that it isn't able to read from the pipe. I expected it to print "a" but it just gets stuck on the os.read. Please note that this is the first time I use the WinAPI so the problem is likely to be there.
There is nothing wrong with the code: what you got wrong is your expectations.
What you are doing is writing to the pipe meant to feed ‘keyboard’ input to the program, and reading from another pipe that returns ‘screen’ output from the program. But there is no actual program at the other end of either pipe, and so there is nothing the read call can ever return.
The HPCON handle returned by the CreatePseudoConsole API is supposed to be passed in a thread attribute list to a newly-spawned process via CreateProcess. Pretty much nothing can be done with it other than that. (You cannot even connect yourself to the pseudoconsole, as you would be able to on Unix.) After the handle is passed in this manner, you can communicate with the process using the read_fd and write_fd descriptors.
An article on MSDN provides a full sample in C that creates a pseudoconsole and passes it to a new process; the exact same thing is done by the very source you linked.

Python avoid partial writes with non-blocking write to named pipe

I am running python3.8 on linux.
In my script, I create a named pipe, and open it as follows:
import os
import posix
import time
file_name = 'fifo.txt'
os.mkfifo(file_name)
f = posix.open(file_name, os.O_RDWR | os.O_NONBLOCK)
os.set_blocking(f, False)
Without yet having opened the file for reading elsewhere ( for instance, with cat), I start to write to the file in a loop.
base_line = 'abcdefghijklmnopqrstuvwxyz'
s = base_line * 10000 + '\n'
while True:
    try:
        posix.write(f, s.encode())
    except BlockingIOError as e:
        print("Exception occurred: {}".format(e))
        time.sleep(.5)
When I then go to read from the named pipe with cat, I find that there was a partial-write that took place.
I am confused how I can know how many bytes were written in this instance. Since the exception was thrown, I do not have access to the return value (the number of bytes written). The documentation suggests that BlockingIOError has a characters_written attribute; however, when I try to access this field, an AttributeError is raised.
In summary: How can I either avoid this partial write in the first place, or at least know how much was partially written in this instance?
os.write performs an unbuffered write. The docs state that BlockingIOError only has a characters_written attribute when a buffered write operation would block.
If any bytes were successfully written before the pipe became full, that number of bytes will be returned from os.write. Otherwise, you'll get an exception. Of course, something like a drive failure will also cause an exception, even if some bytes were written. This is no different from how POSIX write works, except instead of returning -1 on error, an exception is raised.
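As a sketch of that pattern with plain os.write — track the count yourself and treat BlockingIOError as "the pipe accepted nothing this round". Here write_all_nonblocking is an illustrative helper, demonstrated on an anonymous pipe, which has the same non-blocking write semantics as a fifo:

```python
import os

def write_all_nonblocking(fd, data):
    """Write as much of data as the pipe will accept; return bytes written."""
    written = 0
    while written < len(data):
        try:
            written += os.write(fd, data[written:])
        except BlockingIOError:
            break   # pipe is full; the caller can retry later from `written`
    return written

r, w = os.pipe()
os.set_blocking(w, False)
n = write_all_nonblocking(w, b'x' * 1_000_000)  # far more than the pipe buffer
# n is the number of bytes actually accepted: the partial write is now visible
os.close(r)
os.close(w)
```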
If you don't like dealing with the exception, you can use a wrapper around the file descriptor, such as a io.FileIO object. I've modified your code since it tries to write the entire buffer every time you looped back to the os.write call (if it failed once, it will fail every time):
import io
import os
import time
base_line = 'abcdefghijklmnopqrstuvwxyz'
data = (base_line * 10000 + '\n').encode()
file_name = 'fifo.txt'
os.mkfifo(file_name)
fd = os.open(file_name, os.O_RDWR | os.O_NONBLOCK)
# os.O_NONBLOCK makes os.set_blocking(fd, False) unnecessary.
with io.FileIO(fd, 'wb') as f:
    written = 0
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            time.sleep(.5)
        else:
            written += n
BTW, you might use the selectors module instead of time.sleep; I noticed a slight delay when trying to read from the pipe because of the sleep delay, which shouldn't happen if you use the selectors module:
import selectors

with io.FileIO(fd, 'wb') as f:
    written = 0
    sel = selectors.DefaultSelector()
    sel.register(f, selectors.EVENT_WRITE)
    while written < len(data):
        n = f.write(data[written:])
        if n is None:
            # Wait here until we can start writing again.
            sel.select()
        else:
            written += n
    sel.unregister(f)
Some useful information can also be found in the answer to POSIX named pipe (fifo) drops record in nonblocking mode.

How to write to a file in case of multiprocessing?

I am trying to write to a text file in append mode. The write() method is being called in a function that is run through multiprocessing. I am closing the file at the very end of the code, thinking that everything will have been written before it is closed. But what happens is the opposite: the file gets closed in each process. I want it closed once, after all the processes have ended.
This is how I am doing this.
import concurrent.futures

f = open('input.txt', 'a')
pairs = ['pair1', 'pair2', 'pair3', 'pair4', 'pair5']

def validity_check(pair):
    f.write(f'{pair}\n')

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor() as executor:
        idx = 0
        while True:
            executor.map(validity_check, pairs[idx:idx + 5])
            idx = idx + 5
            if idx >= len(pairs):
                break
    f.close()
I want all the pairs written to the file before it closes. Thanks!
The answer to this question may provide a solution to your problem:
Python multiprocessing safely writing to a file
This way you would have a handler/manager that writes to the file, and all the other processes hand their information to that handler. The handler manages all file I/O.

How to use thread lock when reading and writing a csv file in python?

I am working on refining dummy blockchain code, and want to make it impossible to read and write the csv file if it's already being used. What should I do?
I've put start(), join(), acquire(), release(), etc. everywhere I could think of, but it didn't work at all. I once got a "Permission denied" message while I had the file open; however, it still gave me the information in the file. (All the other functions are working properly.)
def readBlockchain(blockchainFilePath, mode='internal'):
    get_lock.acquire()
    print("readBlockchain is called")
    importedBlockchain = []
    try:
        with open(blockchainFilePath, 'r', newline='') as file:
            blockReader = csv.reader(file)
            for line in blockReader:
                block = Block(line[0], line[1], line[2], line[3], line[4], line[5], line[6])
                importedBlockchain.append(block)
        print("Pulling blockchain from csv...")
        get_lock.release()
        return importedBlockchain
    except:
        if mode == 'internal':
            blockchain = generateGenesisBlock()
            importedBlockchain.append(blockchain)
            writeBlockchain(importedBlockchain)
            get_lock.release()
            return importedBlockchain
        else:
            get_lock.release()
            return None
I expect the csv file not to be readable while it is open elsewhere, and to become readable again after the file is closed.
I'll look forward to your answers!
Thanks.
Have a look at mutexes: they enable you to acquire and lock a resource, and to unlock it once your job is finished.
Link: mutex
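A minimal sketch of that idea with Python's threading.Lock: using `with` for both the lock and the file guarantees the lock is released even if the read raises, which the manual acquire()/release() branches above make easy to get wrong (the lock object and function names are illustrative):

```python
import csv
import threading

blockchain_lock = threading.Lock()   # shared by every reader and writer

def read_blockchain(path):
    with blockchain_lock:            # released automatically, even on error
        with open(path, 'r', newline='') as f:
            return list(csv.reader(f))

def write_blockchain(path, rows):
    with blockchain_lock:
        with open(path, 'w', newline='') as f:
            csv.writer(f).writerows(rows)
```

Note that a threading.Lock only serializes threads within one process; if other programs open the csv file too, you would need OS-level file locking instead.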