I am running into a problem trying to update an AWS GameLift script with a Python script that zips a directory and uploads all its contents as a newer version to AWS GameLift.
from zipfile import ZipFile
import os
from os.path import basename
import boto3
import sys, getopt

def main(argv):
    versInput = sys.argv[1]

    # initializes client for updating the script in AWS GameLift
    client = boto3.client('gamelift')

    # Where the directory is relative to the script directory. In this case,
    # one folder down: the contents of the RealtimeServer dir
    dirName = '../RealtimeServer'

    # create a ZipFile object
    with ZipFile('RealtimeServer.zip', 'w') as zipObj:
        # Iterate over all the files in the directory
        for folderName, subfolders, filenames in os.walk(dirName):
            rootlen = len(dirName) + 1
            for filename in filenames:
                # create complete filepath of file in directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip, stripping the leading directory prefix
                zipObj.write(filePath, filePath[rootlen:])

    response = client.update_script(
        ScriptId="SCRIPT_ID_GOES_HERE",
        Version=sys.argv[1],
        ZipFile=b'--zip-file \"fileb://RealtimeServer.zip\"'
    )

if __name__ == "__main__":
    main(sys.argv[1])
I plan on using it by giving it a new version number every time I make changes, with:
python updateScript.py "0.1.1"
This is meant to help speed up development. However, I am doing something wrong with the ZipFile parameter of client.update_script()
For context, I can use the AWS CLI directly from the commandline and update a script without a problem by using:
aws gamelift update-script --script-id SCRIPT_STRING_ID_HERE --script-version "0.4.5" --zip-file fileb://RealtimeServer.zip
However, I am not sure what is going on because it fails to unzip the file when I try it:
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the UpdateScript operation: Failed to unzip the zipped file.
UPDATE:
After reading more documentation about the ZipFile parameter:
https://docs.aws.amazon.com/gamelift/latest/apireference/API_UpdateScript.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/gamelift.html#GameLift.Client.update_script
I tried sending a base64 encoded version of the zip file. However, that didn't work. I put the following code before the client.update_script() call and used b64EncodedZip as the ZipFile parameter.
import base64

with open("RealtimeServer.zip", "rb") as f:
    data = f.read()
    b64EncodedZip = base64.b64encode(data)
I was able to get it to work with some help from a maintainer of boto3 over at https://github.com/boto/boto3/issues/2646
(Thanks @swetashre!)
Here is the code. It will only work for zip files up to 5 MB; uploading anything larger requires going through an S3 bucket.
from zipfile import ZipFile
import os
from os.path import basename
import boto3
import sys, getopt

def main(argv):
    versInput = sys.argv[1]

    # initializes client for updating the script in AWS GameLift
    client = boto3.client('gamelift')

    # Where the directory is relative to the script directory. In this case,
    # one folder down: the contents of the RealtimeServer dir
    dirName = '../RealtimeServer'

    # create a ZipFile object
    with ZipFile('RealtimeServer.zip', 'w') as zipObj:
        # Iterate over all the files in the directory
        for folderName, subfolders, filenames in os.walk(dirName):
            rootlen = len(dirName) + 1
            for filename in filenames:
                # create complete filepath of file in directory
                filePath = os.path.join(folderName, filename)
                # Add file to zip, stripping the leading directory prefix
                zipObj.write(filePath, filePath[rootlen:])

    # Pass the raw bytes of the zip directly; boto3 handles the encoding
    with open('RealtimeServer.zip', 'rb') as f:
        contents = f.read()

    response = client.update_script(
        ScriptId="SCRIPT_ID_GOES_HERE",
        Version=sys.argv[1],
        ZipFile=contents
    )

if __name__ == "__main__":
    main(sys.argv[1])
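For zip files over that 5 MB inline limit, update_script can instead point at a zip you have already uploaded to S3, via its StorageLocation parameter. A rough sketch (bucket, key, and role ARN are placeholders; the IAM role must allow GameLift to read the object):

import boto3

def update_script_via_s3(script_id, version, bucket, key, role_arn):
    # Upload the zip built above to S3, then tell GameLift where to find it
    s3 = boto3.client('s3')
    s3.upload_file('RealtimeServer.zip', bucket, key)

    client = boto3.client('gamelift')
    return client.update_script(
        ScriptId=script_id,
        Version=version,
        StorageLocation={
            'Bucket': bucket,
            'Key': key,
            'RoleArn': role_arn,  # role GameLift assumes to fetch the zip
        },
    )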
I got the script working but I did it by avoiding the use of boto3. I don't like it but it works.
os.system("aws gamelift update-script --script-id \"SCRIPT_ID_GOES_HERE\" --script-version " + sys.argv[1] + " --zip-file fileb://RealtimeServer.zip")
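If you do keep the CLI shell-out, subprocess.run is a safer drop-in for os.system, since the arguments are passed as a list rather than interpolated into a shell string; a sketch:

import subprocess
import sys

# Same CLI call as above, without building a shell command by hand
subprocess.run(
    ["aws", "gamelift", "update-script",
     "--script-id", "SCRIPT_ID_GOES_HERE",
     "--script-version", sys.argv[1],
     "--zip-file", "fileb://RealtimeServer.zip"],
    check=True,  # raise if the CLI exits non-zero
)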
If anyone knows how to get boto3 to work for updating an AWS Gamelift script then please let me know.
Related
I recently started getting into Python and I am having a hard time searching through directories and matching files based on a regex that I have created.
Basically I want it to scan through all the directories in another directory, find all the files that end with .zip, .rar, or .r01, and then run various commands depending on the file type.
import os, re

rootdir = "/mnt/externa/Torrents/completed"

for subdir, dirs, files in os.walk(rootdir):
    if re.search('(w?.zip)|(w?.rar)|(w?.r01)', files):
        print "match: " . files
import os
import re

rootdir = "/mnt/externa/Torrents/completed"
regex = re.compile('(.*zip$)|(.*rar$)|(.*r01$)')

for root, dirs, files in os.walk(rootdir):
    for file in files:
        if regex.match(file):
            print(file)
The code below answers the question in the following comment:
That worked really well, is there a way to do this if match is found on regex group 1 and do this if match is found on regex group 2 etc ? – nillenilsson
import os
import re

regex = re.compile('(.*zip$)|(.*rar$)|(.*r01$)')

for root, dirs, files in os.walk("../Documents"):
    for file in files:
        res = regex.match(file)
        if res:
            if res.group(1):
                print("ZIP", file)
            if res.group(2):
                print("RAR", file)
            if res.group(3):
                print("R01", file)
It might be possible to do this in a nicer way, but this works.
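One arguably nicer variant uses named groups, so match.lastgroup reports which alternative matched without numbered-group bookkeeping (directory path as above):

import os
import re

# Named groups: lastgroup is the name of the alternative that matched
pattern = re.compile(r'(?P<zip>.*\.zip$)|(?P<rar>.*\.rar$)|(?P<r01>.*\.r01$)')

for root, dirs, files in os.walk("../Documents"):
    for file in files:
        match = pattern.match(file)
        if match:
            print(match.lastgroup.upper(), file)  # e.g. "ZIP somefile.zip"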
Given that you are a beginner, I would recommend using glob in place of a quickly written file-walking-regex matcher.
Snippets of functions using glob and a file-walking-regex matcher
The below snippet contains two file-regex searching functions (one using glob and the other using a custom file-walking-regex matcher). The snippet also contains a "stopwatch" function to time the two functions.
import time
from datetime import timedelta
import os
import re
import glob


def stopwatch(method):
    def timed(*args, **kw):
        ts = time.perf_counter()
        result = method(*args, **kw)
        te = time.perf_counter()
        duration = timedelta(seconds=te - ts)
        print(f"{method.__name__}: {duration}")
        return result
    return timed
@stopwatch
def get_filepaths_with_oswalk(root_path: str, file_regex: str):
    files_paths = []
    pattern = re.compile(file_regex)
    for root, directories, files in os.walk(root_path):
        for file in files:
            if pattern.match(file):
                files_paths.append(os.path.join(root, file))
    return files_paths
@stopwatch
def get_filepaths_with_glob(root_path: str, file_regex: str):
    return glob.glob(os.path.join(root_path, file_regex))
Comparing runtimes of the above functions
Using the above two functions to find 5,076 files matching filename_*.csv (as a glob) or filename_(.*).csv (as a regex) in a dir called root_path containing 66,948 files:
>>> glob_files = get_filepaths_with_glob(root_path, 'filename_*.csv')
get_filepaths_with_glob: 0:00:00.176400
>>> oswalk_files = get_filepaths_with_oswalk(root_path,'filename_(.*).csv')
get_filepaths_with_oswalk: 0:03:29.385379
The glob method is much faster and the code for it is shorter.
For your case
For your case, you can probably use something like the following to get your *.zip, *.rar, and *.r01 files:
files = []
for ext in ['*.zip', '*.rar', '*.r01']:
    files += get_filepaths_with_glob(root_path, ext)
Here's an alternative using glob.
from pathlib import Path

rootdir = "/mnt/externa/Torrents/completed"

for extension in 'zip rar r01'.split():
    for path in Path(rootdir).glob('*.' + extension):
        print("match:", path)
I would do it this way:
import re
from pathlib import Path


def glob_re(path, regex="", glob_mask="**/*", inverse=False):
    p = Path(path)
    if inverse:
        res = [str(f) for f in p.glob(glob_mask) if not re.search(regex, str(f))]
    else:
        res = [str(f) for f in p.glob(glob_mask) if re.search(regex, str(f))]
    return res
NOTE: by default it recursively scans all subdirectories. If you want to scan only the current directory, explicitly pass glob_mask="*".
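For example, a usage sketch against the question's directory:

# Recursively find .zip/.rar/.r01 files (path from the original question)
matches = glob_re("/mnt/externa/Torrents/completed", regex=r"\.(zip|rar|r01)$")
for path in matches:
    print("match:", path)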
How can I make this code work?
There is a zip file with folders and .png files in it. The folder ".\icons_by_year" is empty. I need to get every file out of the archive one by one, without unzipping it to disk first, and copy each to the root of the selected folder (so no extra subfolders are created).
class ArrangerOutZip(Arranger):
    def __init__(self):
        self.base_source_folder = '\\icons.zip'
        self.base_output_folder = ".\\icons_by_year"

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        reg_pattern = re.compile('.+\.\w{1,4}$')
        f = open(self.base_source_folder, 'rb')
        zfile = zipfile.ZipFile(f)
        for cont in zfile.namelist():
            if reg_pattern.match(cont):
                with zfile.open(cont) as file:
                    shutil.copyfileobj(file, self.base_output_folder)
        zfile.close()
        f.close()


arranger = ArrangerOutZip()
arranger.proceed()
shutil.copyfileobj uses file objects for source and destination. To open the destination you need to construct a file path for it. pathlib is part of the standard Python library and is a nice way to handle file paths. And ZipFile.extract does some of the work of creating intermediate output directories for you (plus sets file metadata), so it can be used instead of copyfileobj.
One risk of unzipping files is that they can contain absolute or relative paths outside of the target directory you intend (e.g., "../../badvirus.exe"). extract is a bit too lax about that - putting those files in the root of the target directory - so I wrote a little something to reject the whole zip if you are being messed with.
With a few tweaks to make this a testable program:
from pathlib import Path
import re
import zipfile
#import shutil


#class ArrangerOutZip(Arranger):
class ArrangerOutZip:
    def __init__(self, base_source_folder, base_output_folder):
        self.base_source_folder = Path(base_source_folder).resolve(strict=True)
        self.base_output_folder = Path(base_output_folder).resolve()

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        """Unzip files matching pattern to base_output_folder, raising
        ValueError if any resulting paths are outside of that folder.
        Output folder created if it does not exist."""
        reg_pattern = re.compile(r'.+\.\w{1,4}$')
        with open(self.base_source_folder, 'rb') as f:
            with zipfile.ZipFile(f) as zfile:
                wanted_files = [cont for cont in zfile.namelist()
                                if reg_pattern.match(cont)]
                rebased_files = self._rebase_paths(wanted_files,
                                                   self.base_output_folder)
                for cont, rebased in zip(wanted_files, rebased_files):
                    print(cont, rebased, rebased.parent)
                    # option 1: use shutil
                    #rebased.parent.mkdir(parents=True, exist_ok=True)
                    #with zfile.open(cont) as file, open(rebased, 'wb') as outfile:
                    #    shutil.copyfileobj(file, outfile)
                    # option 2: zipfile does the work for you
                    zfile.extract(cont, self.base_output_folder)

    @staticmethod
    def _rebase_paths(pathlist, target_dir):
        """Rebase relative file paths to target directory, raising
        ValueError if any resulting paths are not within target_dir"""
        target = Path(target_dir).resolve()
        newpaths = []
        for path in pathlist:
            newpath = target.joinpath(path).resolve()
            newpath.relative_to(target)  # raises ValueError if not subpath
            newpaths.append(newpath)
        return newpaths


#arranger = ArrangerOutZip('\\icons.zip', '.\\icons_by_year')
import sys
try:
    arranger = ArrangerOutZip(sys.argv[1], sys.argv[2])
    arranger.proceed()
except IndexError:
    print("usage: test.py zipfile targetdir")
I'd take a look at the zipfile library's ZipFile.getinfo() and also zipfile.Path, since the ZipFile constructor can work with paths that way if you intend to do any creation. zipfile.Path constructs a path-like object into an archive and appears to be modeled on pathlib. Assuming you don't need to create zip files, you can ignore zipfile.Path.
However, that's not exactly what I wanted to point out. Rather, consider ZipFile.getinfo(). There is a person who I think is getting at this exact situation here:
https://www.programcreek.com/python/example/104991/zipfile.getinfo
Those examples use getinfo() to look up per-entry metadata. It's also clear that not every zip file carries that info.
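As a minimal sketch of what infolist()/getinfo() expose per entry (archive name taken from the question):

import zipfile

with zipfile.ZipFile("icons.zip") as zf:
    # infolist() yields one ZipInfo per entry; getinfo(name) looks up one
    # entry by its archived path and raises KeyError if it is missing.
    for info in zf.infolist():
        if not info.is_dir():
            print(info.filename, info.file_size)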
How can I check whether a particular file is present inside a particular directory in my S3 bucket? I use Boto3 and tried this code (which doesn't work):
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'dootdoot.jpg'
objs = list(bucket.objects.filter(Prefix=key))
if len(objs) > 0 and objs[0].key == key:
    print("Exists!")
else:
    print("Doesn't exist")
While checking for an S3 folder, there are two scenarios:
Scenario 1
import boto3


def folder_exists_and_not_empty(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder should not be empty.
    '''
    s3 = boto3.client('s3')
    if not path.endswith('/'):
        path = path + '/'
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'Contents' in resp
The above code uses MaxKeys=1, which is more efficient: even if the folder contains a lot of files, it responds quickly with just one of the contents.
Note that it checks for Contents in the response.
Scenario 2
import boto3


def folder_exists(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder could be empty.
    '''
    s3 = boto3.client('s3')
    path = path.rstrip('/')
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'CommonPrefixes' in resp
Note that it strips the trailing / from the path: this prefix matches the folder itself without looking inside it.
Also note that it checks for CommonPrefixes in the response, not Contents.
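A quick usage sketch of the two helpers (bucket and path are placeholders):

# Scenario 1: folder exists AND contains at least one object
print(folder_exists_and_not_empty('my-bucket', 'reports/2021'))

# Scenario 2: folder exists, possibly empty
print(folder_exists('my-bucket', 'reports/2021'))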
import boto3
import botocore

client = boto3.client('s3')


def checkPath(file_path):
    result = client.list_objects(Bucket="Bucket", Prefix=file_path)
    # 'Contents' is present only when at least one key matches the prefix
    return 'Contents' in result
If the provided file_path exists, the function returns True.
Example: for 's3://bucket/dir1/dir2/dir3/file.txt', file_path could be 'dir1/dir2' or 'dir1/'.
Note: the file path should start with the first directory just after the bucket name.
Basically, a directory/file in S3 is an object. I have created a method for this (IsObjectExists) that returns True or False. If the directory/file doesn't exist, the loop body is never entered and the method returns False; otherwise it returns True.
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('<givebucketnamehere>')


def IsObjectExists(path):
    for object_summary in bucket.objects.filter(Prefix=path):
        return True
    return False


if IsObjectExists("<giveobjectnamehere>"):
    print("Directory/File exists")
else:
    print("Directory/File doesn't exist")
Note that if you are checking a folder, make sure the string ends with /. Otherwise, if you check for a folder called Hello that doesn't exist while a folder called Hello_World does, the method will return True. In that case, append the / character to the folder name, as handled in the example below:
foldername = "Hello/"
if IsObjectExists(foldername):
    print("Directory/File exists")
import boto3
import botocore

client = boto3.client('s3')

result = client.list_objects_v2(Bucket='athenards', Prefix='cxdata')
# 'Contents' is absent when nothing matches the prefix
for obj in result.get('Contents', []):
    if obj['Key'] == 'cxdata/':
        print("true")
Please try the following code (note that it uses the older boto 2 interface):

Get subdirectory info folder:

folders = bucket.list("", "/")
for folder in folders:
    print(folder.name)

P.S. Reference: How to use python script to copy files from one bucket to another bucket at the Amazon S3 with boto
The following code should work...
import boto3
import botocore


def does_exist(bucket_name, folder_name):
    s3 = boto3.resource(
        service_name='s3',
        region_name='us-east-2',
        aws_access_key_id='********************',
        aws_secret_access_key='********************'
    )
    objects = s3.meta.client.list_objects_v2(Bucket=bucket_name, Delimiter='/', Prefix='')
    # print(objects)
    folders = objects['CommonPrefixes']
    folders_in_bucket = []
    for f in folders:
        print(f['Prefix'])
        folders_in_bucket.append(f['Prefix'])
    return folder_name in folders_in_bucket


print("does it exist?", does_exist('images-bucket', 'ddd/'))
As @Vinayak mentioned in one of the answer's comments in March 2020, the way to get a 'folder' list in boto3 is:
objects = s3.list_objects_v2(Bucket=BUCKET_NAME, Delimiter='/', Prefix='')
Running this with the latest versions of boto3 and botocore as of August 2021 (1.18.27 and 1.21.27 respectively) gives the following error:
AttributeError: 's3.ServiceResource' object has no attribute 'list_objects_v2'
This happens when s3 was created with boto3.resource('s3', ...): an s3.ServiceResource has no list_objects_v2() method. Instead, the ServiceResource has a meta attribute that exposes the underlying low-level Client, letting you call the Client's methods from the resource object, like this: s3.meta.client.list_objects_v2()
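Put together, a minimal sketch (bucket name is a placeholder):

import boto3

s3 = boto3.resource('s3')

# Reach the low-level Client through the resource's meta attribute
response = s3.meta.client.list_objects_v2(
    Bucket='my-bucket', Delimiter='/', Prefix=''
)
for prefix in response.get('CommonPrefixes', []):
    print(prefix['Prefix'])  # top-level 'folders'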
Hope that helps!
Use this to check whether a folder exists and is not empty:
import boto3


def folder_exists_and_not_empty(bucket_name: str, object_key: str) -> bool:
    '''
    Folder should exist.
    Folder should not be empty.
    '''
    if not object_key.endswith('/'):
        object_key = object_key + '/'
    s3 = boto3.resource("s3")
    bucket = s3.Bucket(bucket_name)
    # The folder marker object itself, if present with a directory content type
    current_object = [
        file.key for file in bucket.objects.filter(Prefix=object_key)
        if file.key == object_key
        and str(file.get()['ContentType']).startswith('application/x-directory')
    ]
    # Everything under the prefix except the marker object
    list_files = [
        file.key for file in bucket.objects.filter(Prefix=object_key)
        if file.key != object_key
    ]
    return len(current_object) == 1 and len(list_files) > 0
I would like to know if there is any way to activate the webhook for all intents (other than activating them one by one). Thank you!
There is no such functionality as of now, but I had a similar problem and this is how I solved it:
1. Download the zip file of all the intents
2. Write a program (I wrote mine in Python) to go through all the files, ignoring files whose names end with usersays
3. Change "webhookUsed": false to "webhookUsed": true
4. Upload the zip file, replacing the existing intents via the Restore from zip option
UPDATE 1:
Below is the code:
import zipfile
import json
import os
import glob

cwd = os.getcwd()

zip_ref = zipfile.ZipFile(cwd + '/filename.zip', 'r')
zip_ref.extractall('zipped')
zip_ref.close()

cwd = cwd + '/zipped/intents'
files = glob.glob(cwd + "/*.json")

for file in files:
    print(file)
    if "usersay" not in file:
        json_data = json.loads(open(file).read())
        json_data['webhookUsed'] = True
        with open(file, 'w') as outfile:
            json.dump(json_data, outfile)
Place the zip file you get from Dialogflow in the same directory as the code above and run the program.
After running it, navigate to the directory named zipped, zip all of its contents, and follow step 4.
UPDATE 2:
Updated the code to make it compatible with multilingual Dialogflow agents.
Hope it helps.
Aside from activating it one by one, or downloading the zip file, setting it one by one in the JSON, and uploading the results - no.
@sid8491 thank you so much, this worked for me!
I had to make some changes so that it worked correctly. Your answer has been very helpful. This is my final script:
import zipfile
import json
import os
import glob

cwd = os.getcwd()

zip_ref = zipfile.ZipFile(cwd + '/Bill.zip', 'r')
zip_ref.extractall('zipped')
zip_ref.close()

cwd = cwd + '/zipped/intents'
files = glob.glob(cwd + "/*.json")

for file in files:
    print(file)
    if "usersay" not in file:
        json_data = json.loads(open(file, encoding="utf8").read())
        json_data['webhookUsed'] = True
        with open(file, 'w') as outfile:
            json.dump(json_data, outfile)
    else:
        print("Usersay file", file)
I am currently using the Python script below to download data from AWS S3 to my local machine. The only problem is that when I run it, I have to manually enter the exact folder the files should be downloaded from. The S3 bucket I use creates a new folder for each day, and I would like to download files only from the current day's folder. I tried creating a variable from the system date and passing it into the bucket list prefix, but the script did nothing, nor did it throw an error. Could anyone help me with this?
import boto, os
import datetime
from os import path

current_date = datetime.datetime.now().strftime("%Y-%m-%d")

LOCAL_PATH = '/Users/user/Desktop/rep'
AWS_ACCESS_KEY_ID = 'ACCESS'
AWS_SECRET_ACCESS_KEY = 'SECRET'
bucket_name = 'bucket'

# connect to the bucket
conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(bucket_name)

# go through the list of files
bucket_list = bucket.list(prefix='Nation/State/City/2018-05-01')
#bucket_list = bucket.list(prefix='Nation/State/City/current_date')
#bucket_list = bucket.list()

for l in bucket_list:
    keyString = str(l.key)
    d = LOCAL_PATH + keyString
    try:
        l.get_contents_to_filename(d)
    except OSError:
        # check if dir exists
        if not os.path.exists(d):
            os.makedirs(d)
Thanks..
Your Python code doesn't do what you want. The error is here:
bucket_list = bucket.list(prefix='Nation/State/City/current_date')
In this context, current_date is just literal text inside the prefix string; the variable's value is never used. To fix it, change the line above to:
bucket_list = bucket.list(prefix='Nation/State/City/{}'.format(current_date))
This line takes the value of the current_date variable and inserts it into your prefix string, replacing the {}.
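On Python 3.6+, an f-string does the same substitution more directly:

# Equivalent f-string form
bucket_list = bucket.list(prefix=f'Nation/State/City/{current_date}')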
I would also recommend checking this tutorial on string formatters:
https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3