AWS Glue Job Python Script Boto3 - want to hide credentials - python-3.x

Has anyone found a way to hide boto3 credentials in a python script that gets called from AWS Glue?
Right now I have my access key ID and secret access key embedded within my script, and I am pretty sure that this is not good practice...

I found the answer! When I established the IAM role for my Glue job, I didn't realize I was opening its permissions up to boto3 as well.
The answer is that I don't need to pass my credentials at all. I simply use this:
mySession = boto3.Session(region_name='my_region_name')
and it works like a charm!
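For context, a minimal sketch of what this looks like inside a Glue script, assuming the job's IAM role grants the needed permissions (the region name and the list_buckets call are just illustrative):

import boto3

# No keys in code: boto3 resolves credentials from the Glue job's IAM role
session = boto3.Session(region_name='us-east-1')
s3 = session.client('s3')
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])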

Related

Running queries against Amazon S3 using Boto3

Because of Glacier Deep Archive's cost overhead for small objects, I am writing an archiver. It would be most helpful to be able to ask boto3 for a list of the objects in a bucket that are not already in the desired storage class. Thanks to this answer, I know I can do this in a shell:
aws s3api list-objects --bucket $BUCKETNAME --query 'Contents[?StorageClass!=`DEEP_ARCHIVE`]'
Is there a way to pass that query parameter into boto3? I haven't dug into the source yet; I had assumed boto3 was essentially a wrapper around the command-line tools, but I can't find docs or examples of this technique anywhere.
Sadly, you can't do this directly, as the --query option is specific to the AWS CLI. But boto3 is the Python AWS SDK, so you can very easily post-process its output to obtain the same results as you get from the CLI.
The --query option is based on JMESPath, so if you really want to use the same expressions in Python, you can use the jmespath package.
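For example, a rough sketch of filtering a listing the same way the CLI command above does (the bucket name is a placeholder):

import boto3
import jmespath  # pip install jmespath

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# The same JMESPath expression the CLI's --query option would accept
expr = jmespath.compile('Contents[?StorageClass!=`DEEP_ARCHIVE`]')

for page in paginator.paginate(Bucket='my-bucket'):  # placeholder bucket
    for obj in expr.search(page) or []:
        print(obj['Key'], obj['StorageClass'])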
Alternatively, query the S3 Inventory size column with Athena:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory-athena-query.html

Using shared AWS config/credentials files locally in NodeJS ExpressJS app

I'm asking this question because I spent a long time trawling Google trying to find an answer, and finally stumbled upon a single line in an AWS doc that pointed me to the correct answer. It is not obvious, so I hope this saves someone some time.
Problem:
I am trying to call AWS services via the AWS SDK in my NodeJS Express app, but the SDK is not loading the profile and key details I have in my ~/.aws/credentials and ~/.aws/config files. When I try to log the credentials object, it is empty. I am trying to use a profile which I have selected via the AWS_PROFILE environment variable, set in a .env file. According to the AWS docs, the SDK should look in these files under ~/.aws, but it doesn't seem to be doing this.
So the answer lies in a single line in one of the AWS docs: http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/getting-started-nodejs.html
Similarly, if you have set your region correctly in your config file, you can display that value by setting the AWS_SDK_LOAD_CONFIG environment variable to a truthy value
Adding AWS_SDK_LOAD_CONFIG=true to my .env file caused the SDK to start using the credentials stored in the ~/.aws/config and ~/.aws/credentials files.
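So the working .env ended up looking roughly like this (the profile name is a placeholder for whatever is defined in your ~/.aws/config):

AWS_PROFILE=my-profile
AWS_SDK_LOAD_CONFIG=true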
I don't know why the docs don't make it more explicit that you need to do this (that page suggests you only need to set it to pick up the region); everything else I read suggested that these files would be checked by the SDK regardless.
Hopefully this helps save someone else a lot of frustration and Googling!

Need to upload directory content to S3 bucket

My scenario: I am currently using the AWS CLI to upload my directory's content to an S3 bucket with the following command:
aws s3 sync results/foo s3://bucket/
Now I need to replace this with Python code. I am exploring the boto3 documentation to find the right way to do it. I see options such as:
https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/s3.html#S3.Client.upload_file
https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/s3.html#S3.ServiceResource.Object
Could someone suggest which is the right approach?
I am aware that I would have to obtain temporary credentials by calling boto3.client('sts').assume_role(RoleArn=..., RoleSessionName=...) and use them subsequently.
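For reference, a sketch of that assume-role step (the role ARN and session name are hypothetical):

import boto3

sts = boto3.client('sts')
resp = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/my-upload-role',  # hypothetical
    RoleSessionName='upload-session',                         # hypothetical
)
creds = resp['Credentials']
s3 = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)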
The AWS CLI is actually written in Python and uses the same API calls you can use.
The important thing to realize is that Amazon S3 only has an API call to upload/download one object at a time.
Therefore, your Python code would need to:
Obtain a list of files to copy
Loop through each file and upload it to Amazon S3
Of course, if you want sync functionality (which copies only new or modified files), your program will need more intelligence to figure out which files to copy.
Boto3 has two general types of methods:
client methods that map 1:1 with API calls, and
resource methods that are more Pythonic but might make multiple API calls in the background
Which type you use is your own choice. Personally, I find the client methods easier for uploading/downloading objects, and the resource methods good when looping through resources (e.g. "for each EC2 instance, for each EBS volume, check each tag").
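A minimal sketch of the upload loop using the client API, with the bucket and directory names taken from the CLI example above (no sync logic, so every file is uploaded):

import os
import boto3

s3 = boto3.client('s3')
local_dir = 'results/foo'
bucket = 'bucket'

for root, _dirs, files in os.walk(local_dir):
    for name in files:
        path = os.path.join(root, name)
        # Object key relative to the synced directory, with forward slashes
        key = os.path.relpath(path, local_dir).replace(os.sep, '/')
        s3.upload_file(path, bucket, key)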

How do I get the HTTP URL for an object on Amazon S3 using Botocore?

I've always used the Boto library for Amazon's S3 service, but because of a Python 3.4 project I had to use Botocore instead. I've figured out how to do most things, but I can't seem to find how to do one (pretty essential) thing: generating URLs.
In Boto I would simply set a Key and call the generate_url method. How do I do this in Botocore? I know how to download and save files, but I would much rather just hand out a link, for server performance and what-not.
I had a similar question because I wanted to use boto with tornado. There is an asynchronous version of botocore called tornado_botocore, which seems promising.
Taking a look at the S3Connection generate_url method:
https://github.com/boto/boto/blob/develop/boto/s3/connection.py
After a quick skim of the code, it looks like the method itself does not require a server request, it simply generates the URL locally. I was able to verify the lack of a server request by running the following without a network connection:
import boto

# generate_url signs the URL locally; no request is sent to AWS
client = boto.connect_s3()
client.generate_url(60, 'GET', 'test', '')  # 60s expiry, bucket 'test', empty key
This means that this particular boto API is safe to use with tornado. Unfortunately, this doesn't address your Python 3.4 issue.
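As an aside, on later botocore/boto3 versions the client itself exposes generate_presigned_url, which likewise signs locally; a minimal sketch (the bucket and key names are placeholders):

import boto3

s3 = boto3.client('s3')
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'test', 'Key': 'some-key'},  # placeholders
    ExpiresIn=60,
)
print(url)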

Handling SharePoint login via Jython

I am trying to read files from a SharePoint folder and write them into a different file.
I managed to write a simple script using urllib:
import urllib

# For SharePoint this would be the full http:// URL of the file
page = urllib.urlopen("file.txt")
output = open('c:/test.txt', 'w')
output.write(page.read())
output.close()
The issue is that the SharePoint site requires authentication, and I am not able to find the correct urllib call for passing the user and password info.
I have even tried urlencode, but I am still getting access denied. I do have permission to see the file when it is viewed from a browser.
Jython 2.1 is being used, so urllib2 cannot be used.
Please suggest any other function that would do the job.
Thanks in advance for your help.
Which version of SharePoint are we talking about? If it is 2010, I suggest you take a look at the REST API: http://msdn.microsoft.com/en-us/library/ff798339.aspx
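If the site happens to allow HTTP Basic authentication, one thing worth trying with the old urllib API that Jython 2.1 does have is adding an Authorization header by hand. A rough sketch (the URL and credentials are placeholders, and this will not help against NTLM/Kerberos-only sites):

import base64
import urllib

url = 'http://sharepoint.example.com/sites/team/file.txt'  # placeholder URL
user, password = 'user', 'password'                        # placeholders

# Build the Basic auth header manually; encodestring appends a newline, so strip it
auth = base64.encodestring('%s:%s' % (user, password)).replace('\n', '')

opener = urllib.URLopener()
opener.addheader('Authorization', 'Basic ' + auth)
page = opener.open(url)

output = open('c:/test.txt', 'w')
output.write(page.read())
output.close()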
