New coder here. For work, I receive requests to put certain files in an already established S3 bucket under a requested "path."
For example: "Create a path of (bucket name)/1/2/3/ with folder3 containing (requested files)"
I'm looking to create a Python 3 script to upload multiple files from my local machine to a specified bucket and "path", using CLI arguments to specify the file(s), bucket name, and "path"/key. I understand S3 doesn't technically have a folder structure, and that the "folders" have to be part of the object key, which is why I put "path" in quotes.
I have a working script doing what I want it to do, but the bucket/key is hard-coded at the moment and I'm looking to get away from that with the use (and understanding) of CLI arguments. This is what I have so far -- it just doesn't upload the file, though it builds the path in S3 successfully :/
EDIT: Below is the working version of what I was looking for!
import argparse
#import os
import boto3

def upload_to_s3(file_name, bucket, path):
    s3 = boto3.client('s3')
    s3.upload_file(file_name, bucket, path)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--file_name')
    parser.add_argument('--bucket')
    parser.add_argument('--path')
    args = parser.parse_args()
    upload_to_s3(args.file_name, args.bucket, args.path)
My input is:
python3 s3_upload_args_experiment.py --file_name test.txt --bucket mybucket2112 --path 1/2/3/test.txt
Everything executes properly!
Thank you much!
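Since the stated goal was uploading multiple files in one call, here is a minimal, untested sketch that extends the script above with nargs='+' (the --file_names argument name is my own), building each object key from the shared "path" prefix plus the file's base name:

import argparse
import os

import boto3

def upload_to_s3(file_names, bucket, prefix):
    s3 = boto3.client('s3')
    for file_name in file_names:
        # Key = "path" prefix + the file's base name, e.g. 1/2/3/test.txt
        key = prefix.rstrip('/') + '/' + os.path.basename(file_name)
        s3.upload_file(file_name, bucket, key)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--file_names', nargs='+', required=True)
    parser.add_argument('--bucket', required=True)
    parser.add_argument('--path', required=True)
    args = parser.parse_args()
    upload_to_s3(args.file_names, args.bucket, args.path)

It could then be invoked as, for example: python3 s3_upload_args_experiment.py --file_names test.txt other.txt --bucket mybucket2112 --path 1/2/3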
I really need someone's help here.
I am currently trying to store a zip file in GCP Secret Manager and then retrieve the zip file in Python.
from google.cloud import secretmanager
import base64
client = secretmanager.SecretManagerServiceClient()
PROJECT_ID = "project"
secret_id = "secret"
version_id = 1
name = f"projects/{PROJECT_ID}/secrets/{secret_id}/versions/{version_id}"
response = client.access_secret_version(name=name)
bytes_returned = base64.b64decode(response.payload.data)
with open("my_zip.zip", "wb") as binary_file:
# Write bytes to file
binary_file.write(bytes_returned)
Once I try to open the zip, however, it complains that the file is in an incorrect format.
When I download using gcloud commands, everything seems to work:
gcloud secrets versions access latest --secret "bi_cass_secure_bundle" --format "json" | \
jq -r .payload.data | \
base64 --decode > results_binary.zip
I have also tried the method explained here, but with no luck:
Create zip file object from bytestring in Python?
Thanks in advance and I am sending you all some good karma
Extra notes:
So even if I have a valid zip file and write its bytes to another zip file, I get the same error, so it seems to be something to do with the Python library:
with open("valid.zip", 'rb') as file_data:
bytes_content = file_data.read()
with open("test_valid.zip", "wb") as binary_file:
# Write bytes to file
binary_file.write(bytes_content)
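One thing worth noting about the two approaches above (stated as my understanding, not tested here): gcloud with --format json returns the payload base64-encoded, which is why the base64 --decode step is needed there, whereas the Python client already returns raw bytes in response.payload.data. A minimal sketch that skips the extra b64decode (project/secret names are placeholders) would be:

from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = "projects/project/secrets/secret/versions/1"
response = client.access_secret_version(name=name)

# payload.data is already the raw secret bytes here, so write it out directly
with open("my_zip.zip", "wb") as binary_file:
    binary_file.write(response.payload.data)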
I have a folder inside the S3 bucket, and I need to update a file inside that existing folder using Python. Can anyone assist?
I found an answer to my own question. The code below works perfectly from my local folder. I opened the folder and used 'rb' (read binary) to read each file before uploading it.
import os
import shutil

import boto3

# S3_access_key, S3_secret_key, MEDIA_ROOT, folder_name and s3_Bucket_name
# are assumed to be defined elsewhere in the project
s3 = boto3.client(
    's3',
    region_name='ap-south-1',
    aws_access_key_id=S3_access_key,
    aws_secret_access_key=S3_secret_key
)

# Local folder containing the files to upload
path_data = os.path.join(MEDIA_ROOT, "screenShots", folder_name)

for file in os.listdir(path_data):
    with open(path_data + "/" + file, 'rb') as data:
        # Key = "folder"/filename, which creates the folder structure in S3
        s3.upload_fileobj(data, s3_Bucket_name, folder_name + "/" + file)

# Remove the local folder once everything has been uploaded
shutil.rmtree(path_data)
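Alternatively, if there is no need to open each file manually, boto3's upload_file accepts a local filename directly; a rough equivalent of the loop above (same variables assumed to be defined) would be:

for file in os.listdir(path_data):
    # upload_file takes a local filename rather than an open file object
    s3.upload_file(os.path.join(path_data, file), s3_Bucket_name, folder_name + "/" + file)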
I have an AWS Lambda function that generates PDFs using the html-pdf library with custom fonts.
At first, I imported my fonts externally from Google Fonts, but then the PDF's size grew by ten times.
So I tried to import my fonts locally, src('file:///var/task/fonts/...ttf/woff2'), but still no luck.
Lastly, I tried to create a fonts folder in the main project, added all of my fonts, plus the file fonts.config:
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<dir>/var/task/fonts/</dir>
<cachedir>/tmp/fonts-cache/</cachedir>
<config></config>
</fontconfig>
and set the following env:
FONTCONFIG_PATH = /var/task/fonts
but still no luck (I haven't installed fontconfig, since I'm not sure how to, or whether I even need to).
My runtime env is Node.js 8.1.0.
You can upload your fonts into an S3 bucket and then download them to the Lambda's /tmp directory during its execution. In case your lib creates .pkl files, you should first change your working directory to /tmp (Lambda is not allowed to write in the default root directory).
The following Python code downloads your files from a fonts/ directory in an S3 bucket to the /tmp/fonts "local" directory.
import os

import boto3

# Work out of /tmp, the only writable location in Lambda
os.chdir('/tmp')
os.mkdir(os.path.join('/tmp/', 'fonts'))

s3 = boto3.resource('s3')
s3_client = boto3.client('s3')

bucket_name = "bucket_name"
my_bucket = s3.Bucket(bucket_name)

for file in my_bucket.objects.filter(Prefix="fonts/"):
    filename = file.key
    short_filename = filename.replace('fonts/', '')
    if len(short_filename) > 0:
        s3_client.download_file(
            bucket_name,
            filename,
            "/tmp/fonts/" + short_filename,
        )
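One caveat with the snippet above: Lambda containers are frequently reused, so /tmp/fonts may already exist on a warm invocation and os.mkdir would then raise FileExistsError. A small, hedged variation that tolerates container reuse:

import os

font_dir = '/tmp/fonts'
# exist_ok avoids FileExistsError when a warm container already created the directory
os.makedirs(font_dir, exist_ok=True)

# Optionally skip the S3 download entirely if the fonts are already there
fonts_already_present = bool(os.listdir(font_dir))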
I am trying to copy specific files from one folder to another. When I use the wildcard operator (*) at the end, the copy does not happen.
But if I provide just the folder name, then all the files from the source folder are copied to the target folder without any issues.
Problem: the file copy does not happen when the wildcard operator is used.
Can you please help me fix the problem?
def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
    try:
        blob_service = BlockBlobService(account_name=account_name, account_key=account_key)
        files = blob_service.list_blobs(copy_from_container, prefix=copy_from_prefix)
        for f in files:
            #print(f.name)
            blob_service.copy_blob(copy_to_container, f.name.replace(copy_from_prefix, ""), f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{f.name}")
    except:
        print('Could not copy files from source to target')

copy_from_prefix = 'Folder1/FileName_20191104*.csv'
copy_blob_files(accountName, accesskey, copy_fromcontainer, copy_to_container, copy_from_prefix)
The copy_blob method does not support wildcards.
1. If you want to copy a specific pattern of blobs, you can filter the blobs in the list_blobs() method with a prefix (which also does not support wildcards). In your case, the prefix looks like copy_from_prefix = 'Folder1/FileName_20191104'; note that there is no wildcard.
The code below works on my side; all files matching the specified pattern are copied, and the blob name is replaced:
from azure.storage.blob import BlockBlobService

account_name = "xxx"
account_key = "xxx"
copy_from_container = "test7"
copy_to_container = "test4"

#remove the wildcard
copy_from_prefix = 'Folder1/FileName_20191104'

def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
    try:
        block_blob_service = BlockBlobService(account_name, account_key)
        files = block_blob_service.list_blobs(copy_from_container, copy_from_prefix)
        for file in files:
            block_blob_service.copy_blob(copy_to_container, file.name.replace(copy_from_prefix, ""), f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{file.name}")
    except:
        print('could not copy files')

copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix)
2. Another way, as others mentioned, is to use Python to call AzCopy (you can use AzCopy v10, which is just a .exe file). For using wildcards with AzCopy, you can follow this doc. Then write your own AzCopy command and, finally, write your Python code as below:
import subprocess

# The path of azcopy.exe, v10 version
exepath = "D:\\azcopy\\v10\\azcopy.exe"
myscript = "your azcopy command"

# Call the azcopy command
subprocess.call(myscript)
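As a rough, untested illustration only (the account names, containers, and SAS tokens below are placeholders), an AzCopy v10 command with a wildcard in the source URL, invoked from Python, might look like this:

import subprocess

exepath = "D:\\azcopy\\v10\\azcopy.exe"
# Wildcards are allowed in the source path; both URLs need a valid SAS token appended
source = "https://<account>.blob.core.windows.net/test7/Folder1/FileName_20191104*.csv?<SAS>"
destination = "https://<account>.blob.core.windows.net/test4?<SAS>"

subprocess.call([exepath, "copy", source, destination])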
AzCopy supports wildcards, and you could execute AzCopy from your Python code.
An example of how to do this can be found here: How to run Azure CLI commands using python?
I am able to load my txt file using the line below on my local machine.
lines = open(args['train_file1'], mode='r').read().split('\n')
args is a dict which holds the path of the training file.
Now I changed the working Python version to 3.5, and I am getting this error. I am clueless why this error is coming up; the file is present in that directory.
FileNotFoundError: [Errno 2] No such file or directory: 'gs://bot_chat-227711/data/movie_lines.txt'
If I understood your question correctly, you are trying to read a file from Cloud Storage in App Engine.
You cannot do so directly with the open function, as files in Cloud Storage are located in buckets in the cloud. Since you are using Python 3.5, you can use the Python client library for GCS in order to work with files located in GCS.
This is a small example, that reads your file located in your Bucket, in a handler on an App Engine application:
from flask import Flask
from google.cloud import storage

app = Flask(__name__)

@app.route('/openFile')
def openFile():
    client = storage.Client()
    bucket = client.get_bucket('bot_chat-227711')
    blob = bucket.get_blob('data/movie_lines.txt')
    your_file_contents = blob.download_as_string()
    return your_file_contents

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080, debug=True)
Note that you will need to add the line google-cloud-storage to your requirements.txt file in order to import and use this library.
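If the goal is simply to reproduce the original one-liner outside of a Flask handler, a minimal sketch using the same bucket and object names taken from the error message could be:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('bot_chat-227711')
blob = bucket.get_blob('data/movie_lines.txt')

# download_as_string returns bytes, so decode before splitting into lines
lines = blob.download_as_string().decode('utf-8').split('\n')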