Zip files in the cloud (Amazon S3) without first writing them to a local file (no write privileges) - python-3.x

I need to zip some files in Amazon S3 without first writing them to a local file. My code worked in development, but I have very limited write privileges in production.
folder = output_dir
files = fs.glob(folder)
f = BytesIO()
zip = zipfile.ZipFile(f, 'a', zipfile.ZIP_DEFLATED)
for file in files:
    filename = os.path.basename(file)
    image = fs.get(file, filename)
    zip.write(filename)
zip.close()
The problem is at this line in production:
image = fs.get(file, filename)
It fails because I don't have write privileges there.
My last resort is to write to the /tmp/ directory, which I do have privileges for.
Is there a way to zip files from a URL path, or directly in the cloud?

I ended up using Python's tempfile module, which turned out to be a perfect solution.
NamedTemporaryFile guarantees a named temporary file that is visible in the file system and is deleted automatically when it is closed. No manual cleanup is needed.
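
For anyone wanting to see roughly what that looks like, here is a minimal sketch rather than the exact code I ran. The download argument is a hypothetical callable that stands in for fs.get(file, filename) from the question, and the function name is made up for illustration. It assumes a POSIX system: on Windows, a NamedTemporaryFile generally cannot be reopened by name while it is still open.

```python
import io
import os
import tempfile
import zipfile


def zip_via_tempfiles(remote_paths, download):
    """Zip remote files, staging each one in an auto-deleted temp file.

    download(remote_path, local_path) is a hypothetical callable standing
    in for the question's fs.get(file, filename).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for remote_path in remote_paths:
            # The temp file lives in the system temp directory and is
            # removed automatically when this with-block exits.
            with tempfile.NamedTemporaryFile() as tmp:
                download(remote_path, tmp.name)
                # arcname keeps only the base name inside the archive.
                archive.write(tmp.name, arcname=os.path.basename(remote_path))
    buffer.seek(0)
    return buffer
```

With the names from the question, a call might look like zip_via_tempfiles(fs.glob(folder), fs.get), assuming fs.get accepts a remote path followed by a local path. The returned BytesIO holds the finished archive, so only the temp directory is ever written to.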

Related

Update file in folder inside the s3bucket python

I have a folder inside an S3 bucket, and I need to update a file inside that existing folder
using Python. Can anyone assist?
I found an answer to my own question. The code below works perfectly
from my local folder. I open each file in the folder with 'rb' (read binary) and upload it. Uploading to a key that already exists replaces the object stored there, which is what updates the file.
import os
import shutil

import boto3

s3 = boto3.client(
    's3',
    region_name='ap-south-1',
    aws_access_key_id=S3_access_key,
    aws_secret_access_key=S3_secret_key
)
path_data = os.path.join(MEDIA_ROOT, "screenShots", folder_name)
dir = os.listdir(path_data)
for file in dir:
    with open(path_data + "/" + file, 'rb') as data:
        # Upload under the key "<folder_name>/<file>"
        s3.upload_fileobj(data, s3_Bucket_name, folder_name + "/" + file)
# Remove the local folder once everything has been uploaded
shutil.rmtree(path_data)

Renaming the folder zip file extracts to

This might just be an edge case, but I extracted a zip file to a directory using the zipfile module. When extracting, zipfile names the directory it extracts to.
Is there a way to specify the name of the folder that zipfile creates when it extracts the files? I am hitting an error because I am testing with the same zipped folder, so the extraction keeps using the old folder name, which already exists. Here is my code:
orginalFolderName = jobFolder + name
with zipfile.ZipFile(directory, "r") as zip_ref:
    zip_ref.extractall(jobFolder)
os.rename(orginalFolderName, newFoldername)
directory = newFoldername
with zipfile.ZipFile(filepath) as z:
    z.extractall(dest_folder)
filepath - complete path of the zip file
dest_folder - destination folder
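
If the goal is to pick the name of the top-level folder at extraction time rather than renaming it afterwards, one possible approach is to rewrite each entry's path before extracting it. This is only a sketch: extract_with_new_root and new_root are names made up for illustration, and it assumes the archive's entries all sit under a single top-level folder.

```python
import zipfile


def extract_with_new_root(zip_path, dest_folder, new_root):
    """Extract zip_path into dest_folder, renaming its top-level folder.

    Assumes the archive's entries sit under one top-level folder.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            # Paths inside an archive always use forward slashes.
            top, sep, rest = member.filename.partition("/")
            if not sep:
                # Top-level file: keep its name, place it under new_root.
                rest = top
            # Changing filename only changes where the entry is written.
            member.filename = f"{new_root}/{rest}"
            archive.extract(member, dest_folder)
```

For example, extract_with_new_root(filepath, dest_folder, "renamed") would write the archive's contents under dest_folder/renamed, so a folder with the old name is never created.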

Python ZipFile module problem when file is encrypted

I have the following short program
from zipfile import ZipFile
procFile1 = "C:\\Temp\\XLFile-Demo.zip"
procFile2 = "C:\\Temp2\\XLFile-Demo-PW123.zip"
# Unencrypted file
print("Unencrypted file")
myzip1 = ZipFile(procFile1)
print(myzip1.infolist())
myzip1.extractall("C:\\Temp")
# Encrypted file
print("Encrypted file")
myzip2 = ZipFile(procFile2)
print(myzip2.infolist())
myzip2.setpassword(bytes('123', 'utf-8'))
myzip2.extractall("C:\\Temp2")
At this Amazon Drive link are the two files. They are identical except that one zip is protected with the password 123.
Executing the above code successfully extracts the unencrypted one but raises the error NotImplementedError: That compression method is not supported for the other.
Unencrypted file
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=deflate external_attr=0x20 file_size=31964 compress_size=29252>]
Encrypted file
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=99 external_attr=0x20 file_size=31964 compress_size=29280>]
Am I doing anything wrong on my end?
The error came up when the file was zipped using WinRAR's ZIP option. I installed 7-Zip, created the encrypted archive with it instead, and extraction now works.
The .infolist() output for the 7-Zip file is the following:
[<ZipInfo filename='XLFile-Demo.xlsx' compress_type=deflate external_attr=0x20 file_size=31964 compress_size=29340>]
Incidentally, WinRAR can handle this file, and 7-Zip can correctly process the encrypted ZIP archive created by WinRAR.
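
The compress_type=99 in the failing listing is the telling detail. Method 99 is the marker for WinZip-style AES encryption, and the standard zipfile module only decrypts the older ZipCrypto scheme, which would explain the NotImplementedError. A small check like the one below can flag such entries before calling extractall. It is a sketch, and the names AES_METHOD and aes_encrypted_members are made up for illustration.

```python
import zipfile

# Method 99 in a ZIP header marks WinZip-style AES encryption; the standard
# library only decrypts the older ZipCrypto scheme.
AES_METHOD = 99


def aes_encrypted_members(infos):
    """Return the names of entries whose compression method is 99 (AES)."""
    return [info.filename for info in infos if info.compress_type == AES_METHOD]
```

Calling aes_encrypted_members(myzip2.infolist()) on the WinRAR archive from the question should list XLFile-Demo.xlsx. For archives like that, the options are to recreate them with legacy ZIP encryption or to use a third-party library with AES support, such as pyzipper.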

Copy files from blob container to another container using python

I am trying to copy specific files from one folder to another. When I use the wildcard operator (*) at the end of the prefix, the copy does not happen.
But if I provide just the folder name, all the files from the source folder are copied to the target folder without any issues.
Problem: the file copy does not happen when the wildcard operator is used.
Can you please help me fix the problem?
def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
    try:
        blob_service = BlockBlobService(account_name=account_name, account_key=account_key)
        files = blob_service.list_blobs(copy_from_container, prefix=copy_from_prefix)
        for f in files:
            #print(f.name)
            blob_service.copy_blob(copy_to_container, f.name.replace(copy_from_prefix,""), f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{f.name}")
    except:
        print('Could not copy files from source to target')
copy_from_prefix = 'Folder1/FileName_20191104*.csv'
copy_blob_files (accountName, accesskey, copy_fromcontainer, copy_to_container, copy_from_prefix)
The copy_blob method does not support wildcards.
1. If you want to copy blobs that match a specific pattern, you can filter them in the list_blobs() method with prefix (which also does not support wildcards). In your case, the prefix looks like copy_from_prefix = 'Folder1/FileName_20191104'; note that there is no wildcard.
The code below works on my side: all files matching the pattern are copied, and the blob names are replaced:
from azure.storage.blob import BlockBlobService
account_name = "xxx"
account_key = "xxx"
copy_from_container = "test7"
copy_to_container = "test4"
#remove the wildcard
copy_from_prefix = 'Folder1/FileName_20191104'
def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
    try:
        block_blob_service = BlockBlobService(account_name,account_key)
        files = block_blob_service.list_blobs(copy_from_container,copy_from_prefix)
        for file in files:
            block_blob_service.copy_blob(copy_to_container,file.name.replace(copy_from_prefix,""),f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{file.name}")
    except:
        print('could not copy files')
copy_blob_files(account_name,account_key,copy_from_container,copy_to_container,copy_from_prefix)
2. Another way, as others mentioned, is to call azcopy from Python (you can use azcopy v10, which is just a single .exe file). For wildcards in azcopy, you can follow this doc. Write your own azcopy command, and then write your Python code as below:
import subprocess
#the path of azcopy.exe, v10 version
exepath = "D:\\azcopy\\v10\\azcopy.exe"
myscript= "your azcopy command"
#call the azcopy command
subprocess.call(myscript)
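
For option 1, if the part after the wildcard matters too (the .csv suffix in the question), one possibility is to list blobs by the literal prefix and then filter the names on the client side with the standard fnmatch module. This is a sketch: literal_prefix and matching_names are helper names made up for illustration, and they work on plain strings, so they make no Azure calls themselves.

```python
from fnmatch import fnmatchcase


def literal_prefix(pattern):
    """Return the part of pattern before its first wildcard character."""
    for index, char in enumerate(pattern):
        if char in "*?[":
            return pattern[:index]
    return pattern


def matching_names(names, pattern):
    """Keep only the names that match the full wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]
```

The idea would be to pass literal_prefix(pattern) as the prefix to list_blobs(), collect each blob's name, and copy only those returned by matching_names(). Note that fnmatch lets * match the / character as well, so the pattern can span virtual folders.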
AzCopy supports wildcards, so you could execute AzCopy from your Python code.
An example of how to do this can be found here: How to run Azure CLI commands using python?

Can Dirsync for Python sync files and folders in two directions

I want to create a script to sync files between two directories, and was going to utilise Dirsync and Python 3 for this.
from dirsync import sync
sync('C:/03py/Sync/Sync1','C:/03py/Sync/Sync2','sync', twoway=True, create=True)
After running the file for the first time, the folders are synced. I then put a dummy file and folder into the target directory and reran the above script, hoping the file and folder would be copied back into the source directory. However I get the following:
Only in C:/03py/Sync/Sync2
<< TESTTWOFOLDER
<< _TESTTWOWAY.txt
I am not certain if I am using the above commands correctly.
I don't know if this helps:
from dirsync import sync
source_path = 'C:\\wamp\\www\\first-python-app\\http'
target_path = 'C:\\wamp\\www\\first-python-app\\dev'
sync(target_path, source_path, 'sync', twoway=True, purge=True)
sync(source_path, target_path, 'sync')
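
If dirsync still does not copy the new entries back, a fallback that uses only the standard library is sketched below. To be clear, this is not dirsync: it swaps in filecmp.dircmp and shutil, and the function names are made up for illustration. It only copies entries that exist on one side; files present on both sides are left untouched, so conflicting edits are not resolved.

```python
import filecmp
import os
import shutil


def _copy_entries(names, src_dir, dst_dir):
    """Copy the named files or folders from src_dir into dst_dir."""
    for name in names:
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)


def copy_missing_both_ways(dir_a, dir_b):
    """Copy entries that exist on only one side to the other side."""
    comparison = filecmp.dircmp(dir_a, dir_b)
    _copy_entries(comparison.left_only, dir_a, dir_b)
    _copy_entries(comparison.right_only, dir_b, dir_a)
    # Recurse into subdirectories that exist on both sides.
    for name in comparison.common_dirs:
        copy_missing_both_ways(
            os.path.join(dir_a, name), os.path.join(dir_b, name)
        )
```

With the paths from the question, the call would be copy_missing_both_ways('C:/03py/Sync/Sync1', 'C:/03py/Sync/Sync2'), which should bring TESTTWOFOLDER and _TESTTWOWAY.txt back into Sync1.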

Resources