Two pieces of Python code create a zip archive; one of the two is broken - python-3.x

Ultimately I want to create a zip file dynamically and return it in an HTTP response. I use the zipfile library from Python 3.7.
I tried both an io buffer and a tmp dir; neither of them creates a valid zip archive. The archive can only be opened if it's saved to disk.
import zipfile
import io
#==============================================
# V1
file_like_object = io.BytesIO()
myZipFile = zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile.writestr(u'test.py', b'test')
tmparchive="zip1.zip"
out = open(tmparchive,'wb') ## Open temporary file as bytes
out.write(file_like_object.getvalue())
out.close()
r = open(tmparchive, 'rb')
print (r.read())
r.close()
#==============================================
# V2
tmparchive2 = 'zip2.zip'
myZipFile2 = zipfile.ZipFile(tmparchive2, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile2.writestr(u'test.py', b'test')
r2 = open(tmparchive2, 'rb')
print (r2.read())
r2.close()
#====================================================

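It's preferable to use a context manager, which closes the archive before you read the buffer (the zip central directory is only written when the ZipFile is closed):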
import zipfile, io
file_like_object = io.BytesIO()
with zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED) as myZipFile:
    myZipFile.writestr(u'test.txt', b'test')
# file_like_object.getvalue() are the bytes you send in your http response.
I wrote it to a file, and it's definitely a valid zip file.
If you want to open the archive in another application, you need to save it to disk. Applications like Explorer and 7-Zip have no way to read a BytesIO object that exists only in the Python process. They can only open archives saved to disk.
Calling print(r.read()) isn't going to open the archive. It just prints the bytes that make up the tiny zip file you just created.
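To check whether a buffer holds a complete archive without leaving Python, something like the following sketch may help. The helper names build_zip_bytes and is_valid_zip are made up for illustration; the only library calls used are zipfile.ZipFile and zipfile.is_zipfile. It compares reading the buffer before and after the archive is closed.

```python
import io
import zipfile


def build_zip_bytes(close_first):
    """Build a one-entry archive in memory and return the buffer's bytes.

    With close_first=False the bytes are taken while the archive is still
    open, which mirrors V1 in the question.
    """
    buffer = io.BytesIO()
    archive = zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED)
    archive.writestr("test.py", b"test")
    if close_first:
        archive.close()           # closing writes the central directory
        return buffer.getvalue()
    data = buffer.getvalue()      # taken before the central directory exists
    archive.close()
    return data


def is_valid_zip(data):
    """Report whether the bytes look like a complete zip archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


print(is_valid_zip(build_zip_bytes(close_first=False)))
print(is_valid_zip(build_zip_bytes(close_first=True)))
```

The second call is the one whose bytes are safe to put into an HTTP response body.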

Related

python3 os write, end file

I am trying to open a file for writing, and I am using os.write() because I need to lock my files. I don't know how to write a string to the file and have the remaining file content removed. For example, the code below first writes qwertyui to the file, and then writes asdf to the file, which results in the file containing asdftyui. I would like to know how I can do this so that the resulting content of the file is asdf.
import os
fileName = 'file.txt'
def write(newContent):
    fd = os.open(fileName,os.O_WRONLY|os.O_EXLOCK)
    os.write(fd,str.encode(newContent))
    os.close(fd)
write('qwertyui')
write('asdf')
Add os.O_TRUNC to the flags:
os.open(fileName,os.O_WRONLY|os.O_EXLOCK|os.O_TRUNC)
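import os
fileName = 'file.txt'
def write(newContent):
    # O_RDWR instead of O_WRONLY: os.read() needs a readable descriptor
    fd = os.open(fileName,os.O_RDWR|os.O_EXLOCK)
    Lastvalue = os.read(fd, os.path.getsize(fd))
    # os.read() left the offset at the end of the file; rewind before writing
    os.lseek(fd, 0, os.SEEK_SET)
    os.write(fd,str.encode(Lastvalue.decode() +newContent))
    os.close(fd)
write('qwertyui')
write('asdf')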
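Note that os.O_EXLOCK is only defined on BSD-derived systems such as macOS. Where it is missing, one possible approach is an advisory lock via fcntl.flock, truncating only after the lock is held. This is a sketch under that assumption, not a drop-in equivalent: write_locked is a hypothetical name, O_CREAT is added so the sketch works on a missing file, and fcntl is Unix-only.

```python
import fcntl
import os


def write_locked(path, new_content):
    """Replace the whole content of `path` while holding an exclusive lock."""
    # O_CREAT is an addition for this sketch so a missing file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        os.ftruncate(fd, 0)              # drop the old content under the lock
        os.write(fd, new_content.encode())
    finally:
        os.close(fd)                     # closing the descriptor releases the lock
```

Truncating with os.ftruncate after locking, rather than passing O_TRUNC to os.open, avoids emptying the file before the lock has been acquired.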

Is there a way to download N files from S3 as just one file?

Current code:
#!/usr/bin/python
import boto3
s3=boto3.client('s3')
list=s3.list_objects(Bucket='my_bucket_name')['Contents']
for key in list:
    s3.download_file('my_bucket_name', key['Key'], key['Key'])
The path in question holds N files. This way I download them and end up with N local files as well. I just want one single file.
I did:
import glob
data=""
files = []
for file in glob.glob("*.json"):
    files.append(file)
for file in files:
    with open(file) as fp:
        data += fp.read()
    data += "\n"
with open ('output.json', 'w') as fp:
    fp.write(data)
Is there a way to do it faster or even using boto to stream downloaded bytes to a file?
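One way to avoid the intermediate files is to copy each object's body straight into a single output file. The sketch below assumes a boto3 S3 client is passed in and uses its get_object call, whose response carries a readable "Body" stream. The function name download_as_one_file and the newline separator are illustrative choices, not part of boto3.

```python
import shutil


def download_as_one_file(s3_client, bucket, keys, out_path, separator=b"\n"):
    """Append every object listed in `keys` to one local file, in order."""
    with open(out_path, "wb") as out:
        for key in keys:
            # get_object returns a dict whose "Body" is a readable stream
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
            shutil.copyfileobj(body, out)   # copies in chunks, not all at once
            out.write(separator)


# Usage against a real bucket (needs boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   contents = s3.list_objects(Bucket="my_bucket_name")["Contents"]
#   keys = [obj["Key"] for obj in contents]
#   download_as_one_file(s3, "my_bucket_name", keys, "output.json")
```

This still issues one request per object; it only removes the N temporary files and the second pass over them.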

How to read specific information from .txt files in a .tar file without fully unzipping it (Python 3)

I have many (about 1000) .bz2 files, each one (50-200 MB) containing 4 .txt (.dat) files. How can I read some specific information from the .txt (.dat) files without decompressing them? I am only a beginner Python 3 user, so please give me some hints or maybe useful examples. Thank you.
I wrote code which actually unzips the .txt files into a temp folder, but it takes about 40 s to process one 170 MB tar... only one... whereas I have thousands.
import os
import shutil
import tarfile
import tempfile
pa = '/home/user/tar' #.tar(s) location
fds = sorted(os.listdir(pa))
i = 0
for bz in fds:
    path = os.path.join(pa, bz)
    i +=1
    # assumes the archives are bzip2-compressed tars (.tar.bz2)
    archive = tarfile.open(path, 'r:bz2')
    tmpdir = tempfile.mkdtemp(dir=os.getcwd())
    archive.extract('example.txt', path=tmpdir)
    archive.close()
    path_to_my_file = os.path.join(tmpdir, 'example.txt')
    # here goes some simple manipulation with my .txt (like print something)
    shutil.rmtree(tmpdir)
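If the archives are bzip2-compressed tars, a member can be read as a file-like object without writing anything to disk. The sketch below assumes that layout and UTF-8 text. find_lines is a hypothetical helper name, while tarfile.open and TarFile.extractfile are standard library calls.

```python
import tarfile


def find_lines(archive_path, member_name, needle):
    """Return the lines of one archive member that contain `needle`."""
    matches = []
    with tarfile.open(archive_path, "r:bz2") as tar:
        # extractfile gives a readable binary stream; no temp folder is needed
        member = tar.extractfile(member_name)
        for raw_line in member:
            line = raw_line.decode("utf-8", errors="replace")
            if needle in line:
                matches.append(line.rstrip("\n"))
    return matches
```

bzip2 has no index, so the data in front of the requested member still has to be decompressed. What this saves is the temporary directory and the disk writes, not the decompression itself.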

How to compress csv encoded file into zip archive directly?

I want to write data to a cp1250-encoded file and zip it without temporarily storing it on the filesystem.
I figured out that I need something like this
f = io.TextIOBase(newline='', encoding='cp1250')
writer = csv.writer(f, delimiter=';', dialect='excel', quoting=csv.QUOTE_ALL)
writer.writerow([3,3,3,4])
with ZipFile('cvs.zip', 'w') as zip_file:
    zip_file.writestr('test.cvs', f.getvalue())
But now on the third line I get:
io.UnsupportedOperation: write
This is probably because I use io.TextIOBase, but with io.StringIO I can't set an encoding.
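One possible way around this, sketched under the assumption that the whole CSV fits in memory: let csv.writer write text into an io.StringIO, then encode the resulting string to cp1250 when handing it to ZipFile.writestr. The function name write_csv_to_zip is made up for illustration.

```python
import csv
import io
from zipfile import ZIP_DEFLATED, ZipFile


def write_csv_to_zip(zip_path, member_name, rows, encoding="cp1250"):
    """Write `rows` as CSV into a zip member, encoded with `encoding`."""
    text = io.StringIO(newline="")   # newline="" keeps the csv line endings as written
    writer = csv.writer(text, delimiter=";", dialect="excel", quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    with ZipFile(zip_path, "w", compression=ZIP_DEFLATED) as zip_file:
        # StringIO holds str; the encoding is applied once, on the way into the zip
        zip_file.writestr(member_name, text.getvalue().encode(encoding))
```

The zip itself can also be kept off the filesystem by passing an io.BytesIO object as zip_path, since ZipFile accepts file-like objects.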

Custom filetype in Python 3

How do I start creating my own file type in Python? I have a design in mind, but how do I pack my data into a file with a specific format?
For example, I would like my file format to be a mix of an archive (like zip, apk, jar, etc., which are basically all archives) with some room for packed files, plus a section of the file containing settings and serialized data that will not be accessible to an archive-manager application.
My requirement is to do all this with the default modules of CPython, without external modules.
I know this can take long to explain and do, but I can't see how to start on this in Python 3.x with CPython.
Try this:
from zipfile import ZipFile
import json
data = json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
with ZipFile('foo.filetype', 'w') as myzip:
    myzip.writestr('digest.json', data)
The file is now a zip archive containing a JSON file (that's easy to read back in many languages) for your data. You can add files to the archive with myzip.write or myzip.writestr. You can read the data back with:
with ZipFile('foo.filetype', 'r') as myzip:
    json_data_read = myzip.read('digest.json')
    newdata = json.loads(json_data_read)
Edit: you can append arbitrary data to the file with:
f = open('foo.filetype', 'a')
f.write(data)
f.close()
This works for WinRAR, but Python can no longer process the zip file.
Use this:
import base64
import gzip
import ast
def save(data):
    # {!r} stores repr(data), so strings keep their quotes for literal_eval
    data = "[{!r}]".format(data).encode()
    data = base64.b64encode(data)
    return gzip.compress(data)
def load(data):
    data = gzip.decompress(data)
    data = base64.b64decode(data)
    return ast.literal_eval(data.decode())[0]
How to use this with a file:
open(filename, "wb").write(save(data)) # save data
data = load(open(filename, "rb").read()) # load data
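It may look as if this could be opened with an archive program, but all such a program would find inside the gzip layer is base64 text, which has to be decoded before the data is readable.
You can also store any value that ast.literal_eval can parse, such as strings, bytes, numbers, tuples, lists and dicts.
Example: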
open(filename, "wb").write(save({"foo": "bar"})) # dict
open(filename, "wb").write(save("foo bar")) # string
open(filename, "wb").write(save(b"foo bar")) # bytes
# there's more you can store!
This may not be appropriate for your question but I think this may help you.
I faced a similar problem... and ended up creating a zip file and then renaming its extension to my custom file format... But it can still be opened with WinRAR.
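For the layout described in the question (packed files plus a separate settings section), one stdlib-only possibility is a small header in front of a zip payload. The sketch below is only an illustration: the MAGIC signature, the uint32 length prefix and the function names are all invented here, and the settings are assumed to be JSON-serializable.

```python
import io
import json
import struct
import zipfile

MAGIC = b"MYFT"  # hypothetical 4-byte signature for the custom format


def save_container(path, settings, files):
    """Write MAGIC, a length-prefixed JSON settings block, then a zip payload."""
    settings_bytes = json.dumps(settings).encode("utf-8")
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    with open(path, "wb") as out:
        out.write(MAGIC)
        out.write(struct.pack("<I", len(settings_bytes)))  # little-endian uint32
        out.write(settings_bytes)
        out.write(payload.getvalue())


def load_container(path):
    """Return (settings, {member name: bytes}) read back from a container file."""
    with open(path, "rb") as src:
        if src.read(4) != MAGIC:
            raise ValueError("not a container file")
        (settings_len,) = struct.unpack("<I", src.read(4))
        settings = json.loads(src.read(settings_len).decode("utf-8"))
        payload = io.BytesIO(src.read())   # everything after the header is the zip
    with zipfile.ZipFile(payload) as archive:
        files = {name: archive.read(name) for name in archive.namelist()}
    return settings, files
```

Some archive managers may still locate a zip payload that sits behind extra leading bytes, so this keeps the settings block out of the archive listing rather than protecting it.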

Resources