I have been trying to write a small script to download all the contents of an S3 folder to a Lambda function's /tmp directory. To do this I need to list all objects in a specific bucket. Unfortunately I keep getting the following error:
An error occurred (403) when calling the HeadObject operation: Forbidden
Here is how I try to download all the files from a folder:
# initialize S3
try:
    s3 = boto3.resource('s3',
        aws_access_key_id=os.getenv('S3USERACCESSKEY'),
        aws_secret_access_key=os.getenv('S3USERSECRETKEY')
    )
    s3_client = boto3.client('s3',
        aws_access_key_id=os.getenv('S3USERACCESSKEY'),
        aws_secret_access_key=os.getenv('S3USERSECRETKEY')
    )
except Exception as e:
    logger.error("Could not connect to s3 bucket: " + str(e))
# Function to download whole folders from s3
for s3_key in s3_client.list_objects(Bucket=os.getenv('S3BUCKETNAME'))['Contents']:
    s3_object = s3_key['Key']
    if not s3_object.endswith("/"):
        s3_client.download_file('bucket', s3_object, s3_object)
    else:
        import os
        if not os.path.exists(s3_object):
            os.makedirs(s3_object)
The access keys above have full admin rights:
EDIT
Still no success after removing my manual keys; here are the rights I attached to the Lambda role:
Here is the actual error from CloudWatch:
The code now looks like so:
# initialize S3
try:
    s3 = boto3.resource('s3')
    s3_client = boto3.client('s3')
except Exception as e:
    [....]
It seems like "Forbidden" might be a different issue than permissions, but I can't find any documentation on it.
Make sure the access key belongs to a user whose IAM policy grants access to the S3 bucket.
If you run from Lambda, there is no need to use an access key at all; just attach the IAM role to the Lambda function:
https://docs.aws.amazon.com/lambda/latest/dg/accessing-resources.html
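If the execution-role route is taken, a minimal sketch of the handler might look like this (assuming the role grants s3:ListBucket and s3:GetObject on the bucket, and reusing the S3BUCKETNAME environment variable from the question):

import os
import boto3

# no keys passed: the Lambda execution role supplies the credentials
s3_client = boto3.client('s3')

def lambda_handler(event, context):
    bucket = os.getenv('S3BUCKETNAME')
    for obj in s3_client.list_objects_v2(Bucket=bucket).get('Contents', []):
        key = obj['Key']
        if not key.endswith('/'):
            # /tmp is the only writable path inside Lambda
            s3_client.download_file(bucket, key, os.path.join('/tmp', os.path.basename(key)))
    return 'done'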
Did you import boto?
Try to run only this:
UPDATE
import boto3
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
Related
Even though I am passing the correct access key, ID and token, I am getting an error while running the code below. Is anything missing in this code?
import boto3

session = boto3.Session(
    region_name='us-east-1',
    aws_secret_access_key='XXXX',
    aws_access_key_id='YYYY',
    aws_session_token='ZZZZ')

s3_client = session.client('s3')
response = s3_client.get_object(Bucket='dev-bucket-test',
                                Key='abc.xlsx')
data = response['Body'].read()
print(data)
Error:
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the GetObject operation: The AWS Access Key Id you provided does not exist in our records.
I would like to suggest a better approach.
There is a credentials file in the ~/.aws directory; try putting your credentials there under the [default] profile, and it will let you make all calls without writing credentials in code.
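For reference, the ~/.aws/credentials file is a small INI-style file; a minimal [default] profile looks like this (the values are placeholders):

[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
aws_session_token = YOUR_SESSION_TOKEN

With that in place, a plain boto3.Session() or boto3.client('s3') picks the credentials up automatically, with no keys in the code.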
I am trying to write a response to AWS S3 as a new file each time.
Below is the code I am using
s3 = boto3.resource('s3', region_name=region_name)
s3_obj = s3.Object(s3_bucket, f'/{folder}/{file_name}.json')
resp_ = s3_obj.put(Body=json.dumps(response_json).encode('UTF-8'))
I can see that I get a 200 response and the file in the directory as well, but it also produces the exception below:
[DEBUG] 2020-10-13T08:29:10.828Z. Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7f2cf2fdfe123>>
My code throws a 500 exception even though it works. I have other business logic as part of the Lambda, and everything else works just fine since the write to S3 is the last operation. Any help would be appreciated.
The Key (filename) of an Amazon S3 object should not start with a slash (/).
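Applied to the snippet above, the fix is just to drop the leading slash from the key; this sketch reuses the names (region_name, s3_bucket, folder, file_name, response_json) exactly as they appear in the question:

import json
import boto3

s3 = boto3.resource('s3', region_name=region_name)
# no leading slash: the key is relative to the bucket root
s3_obj = s3.Object(s3_bucket, f'{folder}/{file_name}.json')
resp_ = s3_obj.put(Body=json.dumps(response_json).encode('UTF-8'))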
I have a Lambda function that triggers when an S3 upload happens. It then downloads the object to /tmp and sends it to GCP Storage. The issue is that the log files can be up to 900 MB, so there is not enough space in the Lambda function's /tmp storage. Is there a way around this?
I tried sending to memory, but I believe the memory is read only.
There is also talk about mounting EFS, but I'm not sure that will work.
# retrieve bucket name and file_key from the S3 event
logger.info(event)
s3_bucket_name = event['Records'][0]['s3']['bucket']['name']
file_key = event['Records'][0]['s3']['object']['key']
logger.info('Reading {} from {}'.format(file_key, s3_bucket_name))
logger.info(s3_bucket_name)
logger.info(file_key)
# s3 download file
s3.download_file(s3_bucket_name, file_key, '/tmp/{}'.format(file_key))
# upload to google bucket
bucket = google_storage.get_bucket(google_bucket_name)
blob = bucket.blob(file_key)
blob.upload_from_filename('/tmp/{}'.format(file_key))
This is the error from the CloudWatch logs for the Lambda function.
[ERROR] OSError: [Errno 28] No space left on device
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 30, in lambda_handler
s3.download_file(s3_bucket_name, file_key, '/tmp/
from google.cloud import storage  # google-cloud-storage package

storage_client = storage.Client()
bucket = storage_client.get_bucket("YOUR_BUCKET_NAME")
blob = bucket.blob("file/path.csv")  # file path on your gcs
blob.upload_from_filename("/tmp/path.csv")  # tmp file
I hope that will help you.
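Since the core problem here is that the object does not fit in /tmp, one possible workaround, offered only as a sketch (it assumes blob.upload_from_file can consume the boto3 streaming body without retries that need to rewind it; the function and bucket names are placeholders), is to stream the S3 object straight into the GCS blob without touching disk:

import boto3
from google.cloud import storage

s3 = boto3.client('s3')
gcs = storage.Client()

def copy_s3_object_to_gcs(s3_bucket, key, gcs_bucket_name):
    # the Body of a get_object response is a file-like streaming object
    body = s3.get_object(Bucket=s3_bucket, Key=key)['Body']
    blob = gcs.bucket(gcs_bucket_name).blob(key)
    # force a chunked (resumable) upload so only one chunk is held in memory at a time
    blob.chunk_size = 8 * 1024 * 1024
    blob.upload_from_file(body)

Mounting EFS, as the question mentions, is the other route if a real local file is required.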
I'm using the boto3 package to connect to S3 from outside AWS (i.e. the script is currently not being run within the AWS 'cloud', but from my MBP connecting to the relevant bucket). My code:
s3 = boto3.resource(
    "s3",
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
)

bucket = s3.Bucket(self.settings['S3']['bucket_test'])
for bucket_in_all in boto3.resource('s3').buckets.all():
    if bucket_in_all.name == self.settings['S3']['bucket_test']:
        print("Bucket {} verified".format(self.settings['S3']['bucket_test']))
Now I'm receiving this error message:
botocore.exceptions.ClientError: An error occurred (SignatureDoesNotMatch) when calling the ListBuckets operation
I'm aware of the order in which AWS credentials are checked, and I have tried different permutations of my environment variables and ~/.aws/credentials. I know the credentials passed in my .py script should take precedence, yet I'm still seeing this SignatureDoesNotMatch error. Any ideas where I may be going wrong? I've also tried:
# Create a session
session = boto3.session.Session(
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
    aws_session_token=self.settings['CREDENTIALS']['session_token'],
    region_name=self.settings['CREDENTIALS']['region_name']
)

s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
...however I also see the same error traceback.
Actually, this was partly answered by @John Rotenstein and @bdcloud; nevertheless, I need to be more specific...
The following code in my case was not necessary and causing the error message:
# Create a session
session = boto3.session.Session(
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
    aws_session_token=self.settings['CREDENTIALS']['session_token'],
    region_name=self.settings['CREDENTIALS']['region_name']
)
The credentials now stored in self.settings mirror those in ~/.aws/credentials. Weirdly (and like last week, where the reverse happened), I now have access. It could be that a simple reboot of my laptop meant that my new credentials (since I updated them again yesterday) in ~/.aws/credentials were then 'accepted'.
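In other words, what remained after dropping the explicit session is just the default credential lookup, a minimal sketch of which is:

import boto3

# no explicit keys or session token: boto3 falls back to the environment
# and then to ~/.aws/credentials
s3 = boto3.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)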
I am trying to connect to AWS S3 without using credentials. I attached a role with S3 full access to my instance so that I can check whether a file exists or not; if it does not exist, I upload it to the S3 bucket. If it does exist, I want to check the md5sum and, if it differs from the local file, upload a new version.
I try to get the key of a file in S3 via boto using bucket.get_key('mykey') and get this error:
File "/usr/local/lib/python3.5/dist-packages/boto/s3/bucket.py", line 193, in get_key key, resp = self._get_key_internal(key_name, headers, query_args_l)
File "/usr/local/lib/python3.5/dist-packages/boto/s3/bucket.py", line 232, in _get_key_internal response.status, response.reason, '') boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden"
I searched and added "validate=False" when getting the bucket, but this didn't resolve my issue. I'm using Python 3.5, boto and boto3.
Here is my code:
import boto3
import boto
from boto import ec2
import os
import boto.s3.connection
from boto.s3.key import Key
bucket_name = "abc"
conn = boto.s3.connect_to_region('us-west-1', is_secure = True, calling_format = boto.s3.connection.OrdinaryCallingFormat())
bucket = conn.get_bucket(bucket_name, validate=False)
key = bucket.get_key('xxxx')
print (key)
I don't know why I get that error. Please help me understand this problem. Thanks!
Updated
I've just found the root cause of this problem. It was caused by "The difference between the request time and the current time is too large".
Because of that clock skew, it couldn't get the key of the file from the S3 bucket. I updated the ntp service to synchronize the local time with UTC, and it now runs successfully.
Synchronization time by:
sudo service ntp stop
sudo ntpdate -s 0.ubuntu.pool.ntp.org
sudo service ntp start
Thanks!
The IAM role is last in the search order. I bet you have credentials stored earlier in the search order that don't have full S3 access. Check Configuration Settings and Precedence and make sure no other credentials are present, so that the IAM role is used to fetch the credentials (see the sketch after the list below for a quick way to verify which identity is actually being used). Though the documentation is for the CLI, the same order applies to scripts too.
The AWS CLI looks for credentials and configuration settings in the following order:
Command line options – region, output format and profile can be specified as command options to override default settings.
Environment variables – AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN.
The AWS credentials file – located at ~/.aws/credentials on Linux, macOS, or Unix, or at C:\Users\USERNAME\.aws\credentials on Windows. This file can contain multiple named profiles in addition to a default profile.
The CLI configuration file – typically located at ~/.aws/config on Linux, macOS, or Unix, or at C:\Users\USERNAME\.aws\config on Windows. This file can contain a default profile, named profiles, and CLI-specific configuration parameters for each.
Container credentials – provided by Amazon Elastic Container Service on container instances when you assign a role to your task.
Instance profile credentials – these credentials can be used on EC2 instances with an assigned instance role, and are delivered through the Amazon EC2 metadata service.
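To check which of these sources boto3 actually ended up using, a small sketch like the following can help (sts:GetCallerIdentity needs no special permissions; the .method attribute on the resolved credentials names the provider that supplied them):

import boto3

session = boto3.Session()
creds = session.get_credentials()
# e.g. 'env', 'shared-credentials-file', 'container-role', 'iam-role'
print("credential source:", creds.method if creds else "none found")

sts = session.client('sts')
print("caller identity:", sts.get_caller_identity()['Arn'])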