What does "deallocated bytearray object has exported buffers" mean exactly - python-3.x

I am trying to run an encryption algorithm using AES256, but I get this error instead:
"deallocated bytearray object has exported buffers"
I can't find any proper explanation of what the error itself actually means, so I'm having trouble debugging this. Can anyone explain?
For context, this seems to happen mainly with large files over 1 GB.
import os
from cryptography.fernet import Fernet

# fernet is a Fernet instance created elsewhere, e.g. fernet = Fernet(key)
for root, dirs, files in os.walk(dirPath):
    for name in files:
        filePath = os.path.join(root, name)
        # read the whole file into memory
        with open(filePath, 'rb') as _file:
            textStr = _file.read()
        encrypted = fernet.encrypt(textStr)
        # overwrite the original file with the ciphertext
        with open(filePath, 'wb') as _file:
            _file.write(encrypted)
The above code is my attempt to encrypt all files in a directory.

It's referring to the buffer protocol, a way of making views of raw memory in Python. It's not frequently used at the Python layer; you usually see it in CPython's built-in modules and in third-party C extension modules. The easiest way to use it from Python itself is with the memoryview type.
My guess is that something in the code (yours or the module you're using) made a view of a bytearray object as a buffer, then decref-ed the bytearray to zero references before releasing the buffer.
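A minimal sketch of the same bookkeeping from pure Python (the variable names are illustrative, not from the original code): a memoryview over a bytearray counts as an exported buffer, and CPython refuses to resize the bytearray while that export is alive.

data = bytearray(b"plaintext")
view = memoryview(data)    # exports the bytearray's buffer

try:
    data.extend(b"!")      # resizing with a live export is refused
except BufferError as exc:
    print(exc)             # "Existing exports of data: object cannot be re-sized"

view.release()             # drop the export
data.extend(b"!")          # now it works

The SystemError in the question is the same bookkeeping failing at the C level: a bytearray was deallocated while some C code still held an exported buffer, which matches the guess above about a reference-counting bug in one of the extension modules involved.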

Related

Writing BytesIO objects to in-memory Zipfile

I have a Flask-based webapp in which I'm trying to do everything in memory, without touching the disk at all.
I have created an in-memory Word doc (using python-docx library) and an in-memory Excel file (using openpyxl). They are both of type BytesIO. I want to return them both with Flask, so I want to zip them up and return the zipfile to the user's browser.
My code is as follows:
import io
import zipfile

inMemory = io.BytesIO()
zipfileObj = zipfile.ZipFile(inMemory, mode='w', compression=zipfile.ZIP_DEFLATED)
try:
    print('adding files to zip archive')
    zipfileObj.write(virtualWorkbook)
    zipfileObj.write(virtualWordDoc)
When the zipfile tries to write the virtualWorkbook I get the following error:
{TypeError}stat: path should be string, bytes, os.PathLike or integer, not BytesIO
I have skimmed the entirety of the internet but have come up empty-handed, so if someone could explain what I'm doing wrong that would be amazing
Seems like it's easier to mount a tmpfs/ramdisk or something similar to a specific directory, like here, and just use tempfile.NamedTemporaryFile() as usual.
You could use the writestr method instead; it accepts either str or bytes as the data argument.
virtualWorkbook.seek(0)  # rewind the BytesIO so read() returns the saved bytes
zipfileObj.writestr(zipfile.ZipInfo('folder/name.docx'),
                    virtualWorkbook.read())
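A fuller sketch under the same assumptions (virtualWorkbook and virtualWordDoc are BytesIO objects that openpyxl/python-docx have already saved into; the archive names are illustrative):

import io
import zipfile

inMemory = io.BytesIO()
with zipfile.ZipFile(inMemory, mode='w', compression=zipfile.ZIP_DEFLATED) as archive:
    for arcname, stream in (('report.docx', virtualWordDoc),
                            ('workbook.xlsx', virtualWorkbook)):
        stream.seek(0)                         # rewind before reading the saved bytes
        archive.writestr(arcname, stream.read())
inMemory.seek(0)  # rewind the finished zip so it can be sent from the start

The key difference is that ZipFile.write expects a filesystem path (hence the stat error), while writestr takes the data directly; the finished inMemory buffer can then be returned with something like Flask's send_file.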

Julia: Using ProtoBuf to read messages from gzipped file

A sensor provides a stream of frames containing object coordinates, which are stored in ProtoBuf format in a gzipped file. I would like to read this file in Julia.
Using protoc, I have generated the Protobuf files for both Python and Julia, coordinate_push.py and coordinate_push.jl
My Python code is as follows:
import gzip

# _DecodeVarint32 lives in protobuf's internal decoder module and reads the
# varint length prefix that precedes each message
from google.protobuf.internal.decoder import _DecodeVarint32
from src.proto import coordinate_push

frameList = []
with gzip.open(filePath) as f:
    data = f.read()

next_pos, pos = 0, 0
while pos < len(data):
    msg = coordinate_push.CoordinatesFrame()
    # next_pos is the length of the next message; pos is advanced past the prefix
    next_pos, pos = _DecodeVarint32(data, pos)
    msg.ParseFromString(data[pos:pos + next_pos])
    frameList.append(msg)
    pos += next_pos
I'd like to rewrite the above in Julia, and don't know where to start. Part of the problem is that I haven't fully understood the Python script (IO is not my strong point).
I understand that I need:
to open the gzip file, presumably with using GZip; file = GZip.open(file_path, "r")
to read in the data, along the lines of using ProtoBuf; data = readproto(iob, CoordinatesFrame())
What I don't understand is:
how to define iob, and especially how to link it to file (in the Julia Protobuf manual, we had iob = PipeBuffer(), but here it's a gzip-file that we'd like to read)
how to replicate the while-loop in Julia, and in particular the mysterious _DecodeVarint32 (I'm on Windows, in case that's relevant)
whether the file coordinate_push.jl has to be in the same directory as my main file, and if not, how I can properly import it (it is currently in a proto subfolder, and in Python I'd import it using from src.proto import coordinate_push)
Insight on any of the three points would be highly appreciated.
You should open an issue on the Gzip GitHub repo and ask this first part of your question there (I am not a Gzip expert unfortunately).
On the second point, I suggest looking here: https://github.com/JuliaIO/FileIO.jl/blob/master/README.md for lots of examples of FileIO loops, which seems to be exactly what you need to replicate that Python loop. For the second part of this question, your best bet for that function is to try to hunt down its definition on GitHub or in the docs somewhere.
For the third question, coordinate_push.jl does not need to be in the same folder as your "main file" (I am not sure what you mean by this, so perhaps it would help to add context on the structure of your files). To import that file, all you need to do is add include("path/to/coordinate_push.jl") at the top of the file you want to call/run the code from. It's worth noting that the path can be either the absolute path or (in some cases) the relative project path.

Read with MXRecordIO from bytes object

Is there a way that I can use mx.recordio.MXRecordIO to read from a bytes object rather than a file object?
For example I'm currently doing:
import mxnet as mx

results_file = 'results.rec'
with open(results_file, 'wb') as f:
    f.write(results)

recordio = mx.recordio.MXRecordIO(results_file, 'r')
temp = recordio.read()
But if possible I'd rather not have to write to file as an intermediate step. I've tried using BytesIO, but can't seem to get it to work.
Currently there is no way of achieving this, sorry. This is non-trivial because the RecordIO reading/parsing is done in C++, and you can't simply forward the stream to the C++ API.
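As a workaround sketch under that constraint (the function name is illustrative, and it still touches a filesystem, just a short-lived one; pointing the temporary directory at a tmpfs/ramdisk avoids real disk I/O):

import os
import tempfile
import mxnet as mx

def read_records_from_bytes(results):
    # MXRecordIO only accepts a path, so spill the bytes to a temporary file
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'results.rec')
        with open(path, 'wb') as f:
            f.write(results)
        recordio = mx.recordio.MXRecordIO(path, 'r')
        records = []
        while True:
            item = recordio.read()
            if item is None:  # read() returns None once the file is exhausted
                break
            records.append(item)
        recordio.close()
    return records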

Load spydata file

I'm coming from R + Rstudio. In RStudio, you can save objects to an .RData file using save()
save(object_to_save, file = "C:/path/where/RData/file/will/be/saved.RData")
You can then load() the objects :
load(file = "C:/path/where/RData/file/was/saved.RData")
I'm now using Spyder and Python3, and I was wondering if the same thing is possible.
I'm aware everything in the global environment can be saved to a .spydata file using the save button in Spyder's Variable Explorer.
But I'm looking for a way to save to a .spydata file from code. Basically, I want just the code that sits under those buttons.
Bonus points if the answer includes a way to save an object (or multiple objects) and not the whole env.
(Please note I'm not looking for an answer using pickle or shelve, but really something similar to R's load() and save().)
(Spyder developer here) There's no way to do what you ask for with a command in Spyder consoles.
If you'd like to see this in a future Spyder release, please open an issue in our issues tracker about it, so we don't forget to consider it.
Considering the comment here, we can:
rename the file from .spydata to .tar
extract the archive (using a file manager, for example); it will deliver a .pickle file (and maybe a .npy)
load the objects that were saved from the environment:
import pickle

with open(path, 'rb') as f:
    data_temp = pickle.load(f)
That object will be a dictionary containing the saved objects.
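A small sketch of the same steps done in code rather than through a file manager (the file name saved_session.spydata and the extraction folder are illustrative; tarfile detects the format from the content, so renaming to .tar isn't strictly needed):

import glob
import pickle
import tarfile

spydata_path = 'saved_session.spydata'

with tarfile.open(spydata_path, 'r') as tar:
    tar.extractall('spydata_contents')  # unpacks the .pickle (and any .npy) members

# the archive contains a pickled dict mapping variable names to their values
pickle_path = glob.glob('spydata_contents/*.pickle')[0]
with open(pickle_path, 'rb') as f:
    data_temp = pickle.load(f)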

How do I pickle an object hierarchy, *each object usually its own individual file* so that saving is fast?

I want to use pickle, specifically cPickle, to serialize my objects' data as a folder of files representing modules, projects, module objects, scene objects, etc. Is there an easy way to do this?
Thus unpickling will be a little tricky: each parent object stores references to its child/sibling objects at runtime, but the pickled data of the parent will hold only a file path to each object.
I started with a PathUtil class that all classes inherit, but have been running into issues. Has anyone solved a similar problem/feature of data file saving / restoring?
The more transparently it works with existing code, the better. For instance, if using a metaclass __call__ lets the existing constructor syntax stay the same, that will be a plus: the metaclass's __call__ would check for the pickle file first and load it if it exists, falling back to default construction if it doesn't.
You can override __getstate__ to write to a new pickle file and return its path, and __setstate__ to unpickle the file.
import os
import pickle

DIRNAME = 'path/to/my/pickles/'

class AutoPickleable:
    def __getstate__(self):
        # pickle this object's attributes to a separate file...
        state = dict(self.__dict__)
        path = os.path.join(DIRNAME, str(id(self)))
        with open(path, 'wb') as f:
            pickle.dump(state, f)
        # ...and let the outer pickle store only the path
        return path

    def __setstate__(self, path):
        # restore the attributes from the side-file written by __getstate__
        with open(path, 'rb') as f:
            state = pickle.load(f)
        self.__dict__.update(state)
Now, each type which should have this special auto-pickleable behavior, should subclass AutoPickleable.
When you want to dump the files, you can do pickle.dumps(obj) or copy.deepcopy(obj) and ignore the result.
Unpickling works as usual (pickle.load). If you want to restore the objects from a file path (and not from the results of pickle.dumps), it is a bit trickier. Let me know if you want it, and I'll add details. In any case, if you wrap your AutoPickleable object with a "standard" object, and do all pickle operations on that, it should all work.
There are several potential problems with this approach, but for a "clean" case such as the one you describe, it might work.
Some notes:
There is no way to "dynamically" specify the directory to write to. It has to be globally accessible, and set before the pickling operation
Probably wouldn't work if several objects keep references to the same AutoPickleable object, or if you have circular references (in general, pickle handles these cases with no problem)
There is no code here to clean the directory / delete the files.
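A minimal usage sketch of the idea above (Scene and Project are hypothetical classes; the DIRNAME folder must already exist):

import pickle

class Scene(AutoPickleable):
    def __init__(self, name):
        self.name = name

class Project(AutoPickleable):
    def __init__(self):
        self.scenes = [Scene('intro'), Scene('outro')]

project = Project()

# Dumping: each AutoPickleable writes its own side-file, and the outer
# pickle stream only contains the paths returned by __getstate__.
blob = pickle.dumps(project)

# Restoring from the in-memory blob works as usual; each __setstate__
# re-reads its side-file.
restored = pickle.loads(blob)
print([scene.name for scene in restored.scenes])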
