How to compress csv encoded file into zip archive directly? - python-3.x

I want to write data to a cp1250-encoded file and zip it without temporarily storing it on the filesystem.
I figured out that I need something like this:
f = io.TextIOBase(newline='', encoding='cp1250')
writer = csv.writer(f, delimiter=';', dialect='excel', quoting=csv.QUOTE_ALL)
writer.writerow([3,3,3,4])
with ZipFile('cvs.zip', 'w') as zip_file:
    zip_file.writestr('test.cvs', f.getvalue())
But on the third line I get:
io.UnsupportedOperation: write
This is probably because I use io.TextIOBase, but I can't set an encoding on any StringIO.
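One possible approach, sketched under assumptions: io.StringIO holds text and has no encoding, so the encoding can be applied when the text is turned into bytes for the archive. The function name csv_rows_to_zip_bytes, its parameters, and the sample rows are hypothetical, and the member is named test.csv here rather than test.cvs.

```python
import csv
import io
import zipfile


def csv_rows_to_zip_bytes(rows, member_name="test.csv", encoding="cp1250"):
    """Return the bytes of a zip archive holding one encoded CSV member."""
    # StringIO stores text only; newline="" leaves the csv module's "\r\n" untouched.
    text_buffer = io.StringIO(newline="")
    writer = csv.writer(text_buffer, delimiter=";", quoting=csv.QUOTE_ALL)
    writer.writerows(rows)

    # Apply the encoding once, at the point where text becomes bytes.
    payload = text_buffer.getvalue().encode(encoding)

    # Build the archive in memory; nothing touches the filesystem.
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", compression=zipfile.ZIP_DEFLATED) as zip_file:
        zip_file.writestr(member_name, payload)
    return zip_buffer.getvalue()


# Sample rows are made up for illustration.
archive_bytes = csv_rows_to_zip_bytes([[3, 3, 3, 4], ["žába", "kůň"]])
```

The returned bytes can be sent over the network or written to disk in a single write. If the archive should go straight to a file, passing a path such as 'cvs.zip' to ZipFile instead of the BytesIO buffer should work the same way.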

Related

Is there a way to download N files from S3 as just one file?

Current code:
#!/usr/bin/python
import boto3
s3=boto3.client('s3')
list=s3.list_objects(Bucket='my_bucket_name')['Contents']
for key in list:
    s3.download_file('my_bucket_name', key['Key'], key['Key'])
The given path contains N files. This way I download them and end up with N local files, but I want just one single file.
I did:
import glob
data = ""
files = []
for file in glob.glob("*.json"):
    files.append(file)
for file in files:
    with open(file) as fp:
        data += fp.read()
        data += "\n"
with open('output.json', 'w') as fp:
    fp.write(data)
Is there a way to do it faster, or even to use boto to stream the downloaded bytes into a single file?
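A minimal sketch of one way to skip the intermediate files, assuming the objects are small enough to list in one call (list_objects returns at most 1000 keys per request). It reads each object's streaming body from get_object and copies it into one open file. The function name download_objects_to_one_file and its parameters are hypothetical; the client is passed in so the function itself does not depend on credentials.

```python
import shutil


def download_objects_to_one_file(s3_client, bucket, out_path, separator=b"\n"):
    """Copy the body of every listed object into a single local file."""
    response = s3_client.list_objects(Bucket=bucket)
    keys = [item["Key"] for item in response.get("Contents", [])]

    with open(out_path, "wb") as out_file:
        for key in keys:
            # "Body" is a file-like stream; copyfileobj reads it in chunks.
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
            shutil.copyfileobj(body, out_file)
            out_file.write(separator)
    return keys


# Usage (needs boto3 and AWS credentials):
# import boto3
# download_objects_to_one_file(boto3.client("s3"), "my_bucket_name", "output.json")
```

This keeps memory use low because no object is held in full, and it writes each byte to disk only once.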

How to read file as .dat and write it as a .txt

I'm making a tool that reads data from a .dat file into a list, then writes that list to a .txt file (basically a .dat to .txt converter). However, whenever I run it, the file it creates has a .txt extension but contains the .dat data. After troubleshooting, I found that the variable written to the .txt file holds normal, legible text, not weird .dat data...
Here is my code (please don't roast me, I'm very new and I know it has lots of mistakes):
#import dependencies
import sys
import pickle
import time
#define constants and get file path
data = []
index = 0
path = input("Absolute file path:\n")
#checks if last character is a space (common in copy+pasting) and removes it if there is a space
if path.endswith(' '):
    path = path[:-1]
#load the .dat file into a list named bits
with open(path, 'rb') as fp:
    bits = pickle.load(fp)
#convert the data from bits into a new list called data
while index < len(bits):
    print("Decoding....\n")
    storage = bits[index]
    print("Decoding....\n")
    str(storage)
    print("Decoding....\n")
    data.append(storage)
    print("Decoding....\n")
    index += 1
    print("Decoding....\n")
    time.sleep(0.1)
#removes the .dat of the file
split = path[:-4]
#creates the new txt file with _converted.txt added to the end
with open(f"{split}_convert.txt", "wb") as fp:
    pickle.dump(data, fp)
#tells the user where the file has been created
close_file = str(split)+"_convert.txt"
print(f"\nA decoded txt file has been created. Run this command to open it: cd {close_file}\n\n")
Quick review: I'm setting a variable named data which contains all of the data from the .dat file, then I want to save that variable to a .txt file. But whenever I save it, the .txt file has the contents of the .dat file, even though print(data) shows the data as normal, legible text. Thanks for any help.
with open(f"{split}_convert.txt", "wb") as fp:
    pickle.dump(data, fp)
Here the file is opened in wb mode and pickle.dump writes the pickled (binary) form of data to it, whatever the file extension is. To write plain text to a .txt file, open it in w mode and use write:
with open(f"{split}_convert.txt", "w") as fp:
    fp.write(data)
Since data is a list, you can't pass it to write directly either. Write each item in a loop, converting it to a string and adding a newline:
with open(f"{split}_convert.txt", "w") as fp:
    for line in data:
        fp.write(f"{line}\n")
For more details on file writing, check this article as well: https://www.tutorialspoint.com/python3/python_files_io.htm
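Put together, the converter could look roughly like this. It is a sketch that assumes the .dat file holds a pickled list; the function name convert_pickle_to_text is hypothetical, and the _convert.txt suffix follows the question's code.

```python
import pickle
from pathlib import Path


def convert_pickle_to_text(dat_path):
    """Unpickle a list from dat_path and write one item per line to a .txt file."""
    # Strip stray whitespace, which is common when a path is pasted.
    dat_path = Path(str(dat_path).strip())

    # Only unpickle files you trust: pickle.load can run arbitrary code.
    with open(dat_path, "rb") as fp:
        items = pickle.load(fp)

    # "data.dat" becomes "data_convert.txt" next to the source file.
    txt_path = dat_path.with_name(f"{dat_path.stem}_convert.txt")
    with open(txt_path, "w", encoding="utf-8") as fp:
        for item in items:
            fp.write(f"{item}\n")
    return txt_path
```

Calling convert_pickle_to_text(input("Absolute file path:\n")) would mirror the original script's prompt.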

Two pieces of Python code create a zip archive; one of the two is broken

Initially I wanted to create a zip file dynamically and return it in an HTTP response. I use the zipfile library from Python 3.7.
I tried both an io buffer and a temporary file, and neither of them creates a valid zip archive. The archive only opens if it's saved to disk.
import zipfile
import io
#==============================================
# V1
file_like_object = io.BytesIO()
myZipFile = zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile.writestr(u'test.py', b'test')
tmparchive="zip1.zip"
out = open(tmparchive,'wb') ## Open temporary file as bytes
out.write(file_like_object.getvalue())
out.close()
r = open(tmparchive, 'rb')
print (r.read())
r.close()
#==============================================
# V2
tmparchive2 = 'zip2.zip'
myZipFile2 = zipfile.ZipFile(tmparchive2, "w", compression=zipfile.ZIP_DEFLATED)
myZipFile2.writestr(u'test.py', b'test')
r2 = open(tmparchive2, 'rb')
print (r2.read())
r2.close()
#====================================================
Neither version closes the ZipFile, and the central directory that makes the archive readable is written only on close(). It's preferable to use a context manager, which closes it for you, like so:
import zipfile, io
file_like_object = io.BytesIO()
with zipfile.ZipFile(file_like_object, "w", compression=zipfile.ZIP_DEFLATED) as myZipFile:
    myZipFile.writestr(u'test.txt', b'test')
# file_like_object.getvalue() are the bytes you send in your http response.
I wrote it to file. It's definitely a valid zip file.
If you want to open the archive, you need to save it to disk. Applications like Explorer and 7-Zip have no way to read the BytesIO object that exists in the python process. They can only open archives saved to disk.
Calling print(r.read()) isn't going to open the archive. It's just going to print the bytes that make up the tiny zip file you just created.
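To see what closing the ZipFile changes, here is a small check that compares the buffer before and after close(), using zipfile.is_zipfile. The variable names are illustrative.

```python
import io
import zipfile

buffer = io.BytesIO()
archive = zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED)
archive.writestr("test.py", b"test")

# Snapshot before close(): the member data is present, the central directory is not.
unfinished = buffer.getvalue()

# close() appends the central directory and the end-of-archive record.
archive.close()
finished = buffer.getvalue()

print(zipfile.is_zipfile(io.BytesIO(unfinished)))  # False
print(zipfile.is_zipfile(io.BytesIO(finished)))    # True
```

The `with` statement calls close() automatically, which is why the bytes read after the block form a complete archive.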

How to register .gz format in shutil.register_archive_format to use same format in shutil.unpack_archive

I have Example.json.gz and I want to unpack (extract) it in Python using shutil.unpack_archive().
However, it raises shutil.ReadError: Unknown archive format, because '.gz' is not in the list of default formats.
So it has to be registered first using shutil.register_archive_format. Can somebody please help me register the format and unpack the file?
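Note that shutil.register_archive_format registers formats for creating archives with shutil.make_archive; for shutil.unpack_archive you need shutil.register_unpack_format. Define a function that knows how to extract a .gz file and then register it. You could use the gzip library, for instance: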
import os
import re
import gzip
import shutil
def gunzip_something(gzipped_file_name, work_dir):
    """gunzip the given gzipped file"""
    # see warning about filename
    filename = os.path.split(gzipped_file_name)[-1]
    filename = re.sub(r"\.gz$", "", filename, flags=re.IGNORECASE)
    with gzip.open(gzipped_file_name, 'rb') as f_in:  # <<========== extraction happens here
        with open(os.path.join(work_dir, filename), 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
try:
    shutil.register_unpack_format('gz', ['.gz', ], gunzip_something)
except shutil.RegistryError:
    # the format name or extension is already registered
    pass
shutil.unpack_archive("Example.json.gz", os.curdir, 'gz')
WARNING: if you extract into the directory where the gzipped file resides and the file name does not end in .gz, the output path is the same as the input path, so the source file would most likely be overwritten.

Custom filetype in Python 3

How do I start creating my own file type in Python? I have a design in mind, but how do I pack my data into a file with a specific format?
For example, I would like my file format to be a mix of an archive (like zip, apk, jar, etc., which are basically all archives) with some room for packed files, plus a section of the file containing settings and serialized data that will not be accessed by an archive-manager application.
My requirement is to do all this with the default modules of CPython, without external modules.
I know that this can take long to explain and do, but I can't see how to start this in Python 3.x with CPython.
Try this:
from zipfile import ZipFile
import json
data = json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
with ZipFile('foo.filetype', 'w') as myzip:
    myzip.writestr('digest.json', data)
The file is now a zip archive containing a JSON file for the data (that's easy to read back in many languages). You can add files to the archive with myzip.write or myzip.writestr. You can read the data back with:
with ZipFile('foo.filetype', 'r') as myzip:
    json_data_read = myzip.read('digest.json')
    newdata = json.loads(json_data_read)
Edit: you can append arbitrary data to the file with:
f = open('foo.filetype', 'a')
f.write(data)
f.close()
This works for WinRAR, but Python can no longer process the zip file.
Use this:
import base64
import gzip
import ast
def save(data):
    # {!r} stores the repr, so strings keep their quotes and can be parsed back
    data = "[{!r}]".format(data).encode()
    data = base64.b64encode(data)
    return gzip.compress(data)
def load(data):
    data = gzip.decompress(data)
    data = base64.b64decode(data)
    return ast.literal_eval(data.decode())[0]
How to use this with a file:
open(filename, "wb").write(save(data)) # save data
data = load(open(filename, "rb").read()) # load data
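This may look like something an archive program can open. An archive program can undo the gzip layer, but it will then see only base64 text, which has to be decoded to get at the data.
You can also store any value that has a Python literal form (numbers, strings, bytes, lists, tuples, dicts, sets, booleans, None).
Example: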
open(filename, "wb").write(save({"foo": "bar"})) # dict
open(filename, "wb").write(save("foo bar")) # string
open(filename, "wb").write(save(b"foo bar")) # bytes
# there's more you can store!
This may not be appropriate for your question but I think this may help you.
I faced a similar problem and ended up creating a zip file and then renaming its extension to my custom file format. It can still be opened with WinRAR, though.

Resources