How to get AWS S3 object location/URL using Python 3.8?

I am uploading a file to AWS S3 from an AWS Lambda function (Python 3.8) with the following code:
file_obj = open(filename, 'rb')
s3_upload = s3.put_object(Bucket="aaa", Key="aaa.png", Body=file_obj)
return {
    'statusCode': 200,
    'body': json.dumps("Executed Successfully")
}
I want to get the location/URL of the S3 object in return. In Node.js we use the .Location property of the upload response for this.
Any idea how to do this using Python 3.8?

The URL of an S3 object has a known format and follows virtual-hosted-style access:
https://bucket-name.s3.Region.amazonaws.com/keyname
Thus, you can construct the URL yourself:
import boto3

bucket_name = 'aaa'
aws_region = boto3.session.Session().region_name
object_key = 'aaa.png'
s3_url = f"https://{bucket_name}.s3.{aws_region}.amazonaws.com/{object_key}"
return {
    'statusCode': 200,
    'body': json.dumps({'s3_url': s3_url})
}
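Note that this URL only works directly if the object is publicly readable. For a private object, a presigned URL is the usual alternative; here is a minimal sketch using the same bucket and key as above (the one-hour expiry is an arbitrary choice):
import boto3

s3 = boto3.client('s3')
# Generate a time-limited URL that also works for private objects
s3_url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'aaa', 'Key': 'aaa.png'},
    ExpiresIn=3600  # link validity in seconds
)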

Related

python aws botocore.response.streamingbody to json

I am using boto3 to access files from S3. The objective is to read the files and convert them to JSON.
But the issue is that none of the files have a file extension (no .csv, .json, etc.), although the data in each file is structured like JSON:
client = boto3.client(
    's3',
    aws_access_key_id='AKEY',
    aws_secret_access_key='ASAKEY',
    region_name='us-east-1'
)
obj = client.get_object(
    Bucket='bucketname',
    Key='*filename without extension*'
)
obj['Body'] returns a <botocore.response.StreamingBody> object.
Is it possible to read the data within it?
The extension does not matter. Assuming your file contains valid JSON, you can load it (with import json at the top):
my_json = json.loads(obj['Body'].read())
The response is a dictionary object that returns a StreamingBody in its 'Body' attribute. So here is the solution; find more information in the boto3 S3 get_object documentation.
client = boto3.client('s3')
response = client.get_object(
    Bucket='<<bucket_name_here>>',
    Key='<<file key from AWS management console (S3 info)>>'
)
jsonContent = json.loads(response['Body'].read())
print(jsonContent)
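Note that a StreamingBody can only be consumed once, so call read() a single time and reuse the result. For large objects the body can also be streamed line by line; here is a minimal sketch assuming the file is actually JSON Lines (one JSON document per line), a common reason data "structured like JSON" fails to parse as a single document:
import json
import boto3

client = boto3.client('s3')
response = client.get_object(Bucket='bucketname', Key='filename-without-extension')

# iter_lines() streams the body in chunks instead of loading it all into memory
records = [json.loads(line) for line in response['Body'].iter_lines() if line]
print(records[:5])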

Phone book search on AWS Lambda and S3

I want to make a serverless application in AWS Lambda for phone book searches.
What I've done:
Created a bucket and uploaded a CSV file to it.
Created a role with full access to the bucket.
Created a Lambda function
Created API Gateway with GET and POST methods
The Lambda function contains the following code:
import boto3
import json

s3 = boto3.client('s3')

resp = s3.select_object_content(
    Bucket='namebbacket',
    Key='sample_data.csv',
    ExpressionType='SQL',
    Expression="SELECT * FROM s3object s where s.\"Name\" = 'Jane'",
    InputSerialization={'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': 'NONE'},
    OutputSerialization={'CSV': {}},
)

for event in resp['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
        print("Stats details bytesScanned: ")
        print(statsDetails['BytesScanned'])
        print("Stats details bytesProcessed: ")
        print(statsDetails['BytesProcessed'])
        print("Stats details bytesReturned: ")
        print(statsDetails['BytesReturned'])
When I access the Invoke URL, I get the following error:
{errorMessage = Handler 'lambda_handler' missing on module 'lambda_function', errorType = Runtime.HandlerNotFound}
CSV structure: Name, PhoneNumber, City, Occupation
How to solve this problem?
Please refer to this documentation topic to learn how to write a Lambda function in Python. You are missing the Handler. See: AWS Lambda function handler in Python
Welcome to S.O. #smac2020 links you to the right place: AWS Lambda function handler in Python. In short, AWS Lambda needs to know where to find your code, hence the "handler". Though a better way to think about it might be "entry point."
Here is a close approximation of your function, refactored for use on AWS Lambda:
import json
import boto3

def function_to_be_called(event, context):
    # TODO implement
    s3 = boto3.client('s3')
    resp = s3.select_object_content(
        Bucket='stack-exchange',
        Key='48836509/dogs.csv',
        ExpressionType='SQL',
        Expression="SELECT * FROM s3object s where s.\"breen_name\" = 'pug'",
        InputSerialization={'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': 'NONE'},
        OutputSerialization={'CSV': {}},
    )
    for event in resp['Payload']:
        if 'Records' in event:
            records = event['Records']['Payload'].decode('utf-8')
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!'),
        'pugInfo': records
    }
This function produces the following result:
Response
{
    "statusCode": 200,
    "body": "\"Hello from Lambda!\"",
    "currentWorkdingDirectory": "/var/task",
    "currentdirlist": [
        "lambda_function.py"
    ],
    "pugInfo": "1,pug,toy\r\n"
}
The "entry point" for this function is in a Python file called lambda_function.py and the function function_to_be_called. Together these are the "handler." We can see this in the Console:
or using the API through Boto3
import boto3

awslambda = boto3.client('lambda')
awslambda.get_function_configuration(FunctionName='s3SelectFunction')
Which returns:
{'CodeSha256': 'mFVVlakisUIIsLstQsJUpeBIeww4QhJjl7wJaXqsJ+Q=',
'CodeSize': 565,
'Description': '',
'FunctionArn': 'arn:aws:lambda:us-east-1:***********:function:s3SelectFunction',
'FunctionName': 's3SelectFunction',
'Handler': 'lambda_function.function_to_be_called',
'LastModified': '2021-03-10T00:57:48.651+0000',
'MemorySize': 128,
'ResponseMetadata': ...
'Version': '$LATEST'}
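If the configured handler does not match the file and function names, Lambda raises exactly the Runtime.HandlerNotFound error from the question. As a sketch (reusing the function name above), the handler can also be corrected through the API instead of the console:
import boto3

awslambda = boto3.client('lambda')

# Handler is '<file name without .py>.<function name>'
awslambda.update_function_configuration(
    FunctionName='s3SelectFunction',
    Handler='lambda_function.function_to_be_called'
)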

Kaggle login and unzip file to store in s3 bucket

Create a Lambda function for Python 3.7. The role attached to the Lambda function should have S3 access and basic Lambda execution. Read the data from https://www.kaggle.com/therohk/india-headlines-news-dataset/download and save it into S3 as CSV. The file is a zip; how do I unzip it and store it in a temp file?
My AWS Lambda function below is failing:
# Lambda handler to download the news headline dataset from Kaggle
import urllib3
import boto3
from botocore.client import Config
from zipfile import ZipFile

http = urllib3.PoolManager()

def lambda_handler(event, context):
    bucket_name = 'news-data-kaggle'
    file_name = "india-news-headlines.csv"
    lambda_path = "/tmp/" + file_name
    kaggle_info = {'UserName': "bossdk", 'Password': "xxx"}
    url = "https://www.kaggle.com/account/login"
    data_url = "https://www.kaggle.com/therohk/india-headlines-news-dataset/download"
    r = http.request('POST', url, fields=kaggle_info)
    # urllib3 responses have no iter_content(); stream with preload_content=False instead
    r = http.request('GET', data_url, preload_content=False)
    with open(lambda_path, 'wb') as f:
        for chunk in r.stream(512 * 1024):
            if chunk:
                f.write(chunk)
    # The downloaded file is a zip archive; open the CSV inside it
    data = ZipFile(lambda_path)
    # S3 Connect
    s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
    # Bucket objects have put_object, not put; upload the extracted CSV
    s3.Bucket(bucket_name).put_object(Key=file_name, Body=data.open(file_name), ACL='public-read')
    return {
        'status': 'True',
        'statusCode': 200,
        'body': 'Dataset Uploaded'
    }
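As an aside, POSTing credentials to the Kaggle login page generally does not authenticate the subsequent download. A sketch of a more reliable route, assuming the official kaggle package is bundled with the deployment and API credentials are provided via the KAGGLE_USERNAME and KAGGLE_KEY environment variables (the inner CSV filename is taken from the question):
import os
import boto3
from kaggle.api.kaggle_api_extended import KaggleApi

def lambda_handler(event, context):
    os.environ.setdefault('KAGGLE_CONFIG_DIR', '/tmp')  # /tmp is Lambda's only writable path
    api = KaggleApi()
    api.authenticate()  # reads KAGGLE_USERNAME / KAGGLE_KEY from the environment
    # Download and unzip the dataset into /tmp
    api.dataset_download_files('therohk/india-headlines-news-dataset', path='/tmp', unzip=True)
    boto3.client('s3').upload_file('/tmp/india-news-headlines.csv',
                                   'news-data-kaggle', 'india-news-headlines.csv')
    return {'statusCode': 200, 'body': 'Dataset Uploaded'}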

How to download PDF file from AWS API gateway in python

I'm creating an AWS API endpoint (GET) to serve a PDF file and facing a serialization issue.
An AWS Lambda function is mapped to fetch the file from S3:
import boto3
import base64

def lambda_handler(event, context):
    response = client.get_object(
        Bucket='test-bucket',
        Key=file_path,
    )
    data = response['Body'].read()
    return {
        'statusCode': 200,
        'isBase64Encoded': True,
        'body': data,
        'headers': {
            'content-type': 'application/pdf',
            'content-disposition': 'attachment; filename=test.pdf'
        }
    }
[ERROR] Runtime.MarshalError: Unable to marshal response: bytes is not JSON serializable.
If I return str(data, "utf-8"), the PDF file downloads but fails to open.
Please suggest where I'm going wrong.
Thanks.
You will need to initialize the client variable first and then base64-encode the data coming back from S3, as follows:
import json
import boto3
import base64

client = boto3.client('s3')

def lambda_handler(event, context):
    bucket_name = 'bucket-name'
    file_name = 'file-name.pdf'
    fileObject = client.get_object(Bucket=bucket_name, Key=file_name)
    file_content = fileObject["Body"].read()
    print(bucket_name, file_name)
    # b64encode returns bytes; decode so the value is JSON serializable
    return base64.b64encode(file_content).decode('utf-8')
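If the file must come back through API Gateway rather than as a raw base64 string, a Lambda proxy integration expects the standard response shape with isBase64Encoded set. A minimal sketch (API Gateway must also have application/pdf registered as a binary media type for the decode to happen):
import base64
import boto3

client = boto3.client('s3')

def lambda_handler(event, context):
    fileObject = client.get_object(Bucket='bucket-name', Key='file-name.pdf')
    file_content = fileObject['Body'].read()
    return {
        'statusCode': 200,
        'isBase64Encoded': True,  # tells API Gateway to base64-decode the body
        'body': base64.b64encode(file_content).decode('utf-8'),
        'headers': {
            'Content-Type': 'application/pdf',
            'Content-Disposition': 'attachment; filename=test.pdf'
        }
    }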

Writing string to S3 with boto3: "'dict' object has no attribute 'put'"

In an AWS Lambda function, I am using boto3 to put a string into an S3 file:
import boto3
s3 = boto3.client('s3')
data = s3.get_object(Bucket=XXX, Key=YYY)
data.put('Body', 'hello')
I am told this:
[ERROR] AttributeError: 'dict' object has no attribute 'put'
The same happens with data.put('hello'), which is the method recommended by the top answers at How to write a file or data to an S3 object using boto3, and with data.put_object: 'dict' object has no attribute 'put_object'.
What am I doing wrong?
Conversely, reading works great (with data.get('Body').read().decode('utf-8')).
put_object is a method of the s3 client object; data is just the dictionary returned by get_object, so it has no put method.
Here is a full working example with Python 3.7:
import json
import logging
import boto3

s3 = boto3.client('s3')

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    bucket = 'mybucket'
    key = 'id.txt'
    id = None
    # Write id to S3
    s3.put_object(Body='Hello!', Bucket=bucket, Key=key)
    # Read id from S3
    data = s3.get_object(Bucket=bucket, Key=key)
    id = data.get('Body').read().decode('utf-8')
    logger.info("Id:" + id)
    return {
        'statusCode': 200,
        'body': json.dumps('Id:' + id)
    }
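If a .put(...) call on an object-like handle is preferred, the boto3 resource API provides exactly that shape; a short sketch with the same bucket and key as above:
import boto3

s3 = boto3.resource('s3')

# Object(...) is a resource with a put method, unlike the dict returned by client.get_object
s3.Object('mybucket', 'id.txt').put(Body='hello')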
