Phone book search on AWS Lambda and S3

Phone book search on AWS Lambda and S3 - python-3.x

I want to make a serverless application in AWS Lambda for phone book searches.
What I've done:
Created a bucket and uploaded a CSV file to it.
Created a role with full access to the bucket.
Created a Lambda function
Created API Gateway with GET and POST methods
The Lambda function contains the following code:
import boto3
import json
s3 = boto3.client('s3')
resp = s3.select_object_content(
Bucket='namebbacket',
Key='sample_data.csv',
ExpressionType='SQL',
Expression="SELECT * FROM s3object s where s.\"Name\" = 'Jane'",
InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': 'NONE'},
OutputSerialization = {'CSV': {}},
)
for event in resp['Payload']:
if 'Records' in event:
records = event['Records']['Payload'].decode('utf-8')
print(records)
elif 'Stats' in event:
statsDetails = event['Stats']['Details']
print("Stats details bytesScanned: ")
print(statsDetails['BytesScanned'])
print("Stats details bytesProcessed: ")
print(statsDetails['BytesProcessed'])
print("Stats details bytesReturned: ")
print(statsDetails['BytesReturned'])
When I access the Invoke URL, I get the following error:
{errorMessage = Handler 'lambda_handler' missing on module 'lambda_function', errorType = Runtime.HandlerNotFound}
CSV structure: Name, PhoneNumber, City, Occupation
How to solve this problem?

Please refer to this documentation topic to learn how to write a Lambda function in Python. You are missing the Handler. See: AWS Lambda function handler in Python

Wecome to S.O. #smac2020 links you to the right place AWS Lambda function handler in Python. In short, AWS Lambda needs to know where to find your code, hence the "handler". Though a better way to think about it might be "entry-point."
Here is a close approximation of your function, refactored for use on AWS Lambda:
import json
import boto3
def function_to_be_called(event, context):
# TODO implement
s3 = boto3.client('s3')
resp = s3.select_object_content(
Bucket='stack-exchange',
Key='48836509/dogs.csv',
ExpressionType='SQL',
Expression="SELECT * FROM s3object s where s.\"breen_name\" = 'pug'",
InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}, 'CompressionType': 'NONE'},
OutputSerialization = {'CSV': {}},
)
for event in resp['Payload']:
if 'Records' in event:
records = event['Records']['Payload'].decode('utf-8')
return {
'statusCode': 200,
'body': json.dumps('Hello from Lambda!'),
'pugInfo': records
}
This function produces the following result:
Response
{
"statusCode": 200,
"body": "\"Hello from Lambda!\"",
"currentWorkdingDirectory": "/var/task",
"currentdirlist": [
"lambda_function.py"
],
"pugInfo": "1,pug,toy\r\n"
}
The "entry point" for this function is in a Python file called lambda_function.py and the function function_to_be_called. Together these are the "handler." We can see this in the Console:
or using the API through Boto3
import boto3
awslambda = boto3.client('lambda')
awslambda.get_function_configuration('s3SelectFunction')
Which returns:
{'CodeSha256': 'mFVVlakisUIIsLstQsJUpeBIeww4QhJjl7wJaXqsJ+Q=',
'CodeSize': 565,
'Description': '',
'FunctionArn': 'arn:aws:lambda:us-east-1:***********:function:s3SelectFunction',
'FunctionName': 's3SelectFunction',
'Handler': 'lambda_function.function_to_be_called',
'LastModified': '2021-03-10T00:57:48.651+0000',
'MemorySize': 128,
'ResponseMetadata': ...
'Version': '$LATEST'}

Related

How to get AWS S3 object Location/URL using python 3.8?

I am uploading a file to AWS S3 using AWS Lambda function (Python3.8) with the following code.
file_obj = open(filename, 'rb')
s3_upload = s3.put_object( Bucket="aaa", Key="aaa.png", Body=file_obj)
return {
'statusCode': 200,
'body': json.dumps("Executed Successfully")
}
I want to get the location/url of the S3 object in return. In Node.js we use the .location parameter for getting the object location/url.
Any idea how to do this using python 3.8?

The url of S3 objects has known format and follows Virtual hosted style access:
https://bucket-name.s3.Region.amazonaws.com/keyname
Thus, you can construct the url yourself:
bucket_name = 'aaa'
aws_region = boto3.session.Session().region_name
object_key = 'aaa.png'
s3_url = f"https://{bucket_name}.s3.{aws_region}.amazonaws.com/{object_key}"
return {
'statusCode': 200,
'body': json.dumps({'s3_url': s3_url})
}

Using AWS Lambda to read Kafka(MSK) event source

I am trying to read values from a kafka topic (AWS MSK) using AWS lambda.
The event record when printed from lambda looks like this:
{'eventSource': 'aws:kafka', 'eventSourceArn': 'arn:aws:kafka:ap-northeast-1:987654321:cluster/mskcluster/79y80c66-813a-4f-af0e-4ea47ba107e6', 'records': {'Transactions-0': [{'topic': 'Transactions', 'partition': 0, 'offset': 4798, 'timestamp': 1603565835915, 'timestampType': 'CREATE_TIME', 'value': 'eyJFdmVudFRpbWUiOiAiMjAyMC0xMC0yNCAxODo1NzoxNS45MTUzMjQiLCAiSVAiOiAiMTgwLjI0MS4xNTkuMjE4IiwgIkFjY291bnROdW1iZXIiOiwiMTQ2ODA4ODYiLCAiVXNlck5hbWUiOi67iQW1iZXIgUm9tYXJvIiwgIkFtb3VudCI6ICI1NTYyIiwgIlRyYW5zYWN0aW9uSUQiOiAiTzI4Qlg3TlBJbWZmSXExWCIsICJDb3VuTHJ5IjogIk9tYW4ifQ=='}]}}
How can I extract the 'topic' and 'value' fields? The value one is base64 encoded.
I get the following error:
NameError: name 'record' is not defined
I am trying the following code:
import json
import base64
def lambda_handler(event, context):
print(event)
message = event['records']
payload=base64.b64decode(record["message"]["value"])
print("Decoded payload: " + str(payload))
Sample MSK event structure

In your code snippet the record variable you try to pass to the decode function does not exist. An example to iterate over the records is:
records = event['records']['Transactions-0']
for record in records:
payload=base64.b64decode(record["message"]["value"])
print("Decoded payload: " + str(payload))
Every function call contains multiple records per topic. Though you could also iterate over those if you have multiple like Transactions-1,...

Getting Error While Inserting JSON data to DynamoDB using Python

HI,
I am trying to put the json data into AWS dynamodb table using AWS Lambda , however i am getting an error like below. My json file is uploaded to S3 bucket.
Parameter validation failed:
Invalid type for parameter Item, value:
{
"IPList": [
"10.1.0.36",
"10.1.0.27"
],
"TimeStamp": "2020-04-22 11:43:13",
"IPCount": 2,
"LoadBalancerName": "internal-ALB-1447121364.us-west-2.elb.amazonaws.com"
}
, type: <class 'str'>, valid types: <class 'dict'>: ParamValidationError
Below i my python script:-
import boto3
import json
s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
def lambda_handler(event, context):
bucket = event['Records'][0]['s3']['bucket']['name']
json_file_name = event['Records'][0]['s3']['object']['key']
json_object = s3_client.get_object(Bucket=bucket,Key=json_file_name)
jsonFileReader = json_object['Body'].read()
jsonDict = json.loads(jsonFileReader)
table = dynamodb.Table('test')
table.put_item(Item=jsonDict)
return 'Hello'
Below is my json content
"{\"IPList\": [\"10.1.0.36\", \"10.1.0.27\"], \"TimeStamp\": \"2020-04-22 11:43:13\",
\"IPCount\": 2, \"LoadBalancerName\": \"internal-ALB-1447121364.us-west-2.elb.amazonaws.com\"}"
Can someone help me, how can insert data to dynamodb.

json.loads(jsonFileReader) returns string, but table.put_item() expects dict. Use json.load() instead.

get aws lambda to start transcribe job with filename without extension

I have a lambda function which will start a transcribe job when an object is put into the s3 bucket. I am having trouble getting setting the transcribe job to be the file name without the extension; also the file is not putting into the correct prefix folder in S3 bucket for some reason, here's what I have:
import json
import boto3
import time
import os
from urllib.request import urlopen
transcribe = boto3.client('transcribe')
def lambda_handler(event, context):
if event:
file_obj = event["Records"][0]
bucket_name = str(file_obj['s3']['bucket']['name'])
file_name = str(file_obj['s3']['object']['key'])
s3_uri = create_uri(bucket_name, file_name)
job_name = filename
print(os.path.splitext(file_name)[0])
transcribe.start_transcription_job(TranscriptionJobName = job_name,
Media = {'MediaFileUri': s3_uri},
MediaFormat = 'mp3',
LanguageCode = "en-US",
OutputBucketName = "sbox-digirepo-transcribe-us-east-1",
Settings={
# 'VocabularyName': 'string',
'ShowSpeakerLabels': True,
'MaxSpeakerLabels': 2,
'ChannelIdentification': False
})
while Ture:
status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED", "FAILED"]:
break
print("Transcription in progress")
time.sleep(5)
s3.put_object(Bucket = bucket_name, Key="output/{}.json".format(job_name), Body=load_)
return {
'statusCode': 200,
'body': json.dumps('Transcription job created!')
}
def create_uri(bucket_name, file_name):
return "s3://"+bucket_name+"/"+file_name
the error i get is
[ERROR] BadRequestException: An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: 1 validation error detected: Value 'input/7800533A.mp3' at 'transcriptionJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[0-9a-zA-Z._-]+
so my desired output should have the TranscriptionJobName value to be 7800533A for this case, and the result OutputBucketName to be in s3bucket/output. any help is appreciated, thanks in advance.

The TranscriptionJobName argument is a friendly name for your job and is pretty limited based on the regex. You're passing it the full object key, which contains the prefix input/, but / is a disallowed character in the job name. You could just split out the file name portion in your code:
job_name = file_name.split('/')[-1]
I put a full example of uploading media and starting an AWS Transcribe job on GitHub that puts all this in context.

aws firehose lambda function invocation gives wrong output strcuture format

When i insert a data object to aws firhose stream using a put operation it works fine .As lambda function is enabled on my firehose stream .hence a lambda function is invoked but gives me a output structure response error :
"errorMessage":"Invalid output structure: Please check your function and make sure the processed records contain valid result status of Dropped, Ok, or ProcessingFailed."
so now i have created my lambda function like this way to make the correct output strcuture :
import base64
import json
print('Loading function')
def lambda_handler(event, context):
output=[]
print('event'+str(event))
for record in event['records']:
payload = base64.b64decode(record['data'])
print('payload'+str(payload))
payload=base64.b64encode(payload)
output_record={
'recordId':record['recordId'],
'result': 'Ok',
'data': base64.b64encode(json.dumps('hello'))
}
output.append(output_record)
return { 'records': output }
Now i am getting the follwing eror on encoding the 'data' field as
"errorMessage": "a bytes-like object is required, not 'str'",
and if i change the 'hello' to bytes like b'hello' then i get the following error :
"errorMessage": "Object of type bytes is not JSON serializable",

import json
import base64
import gzip
import io
import zlib
def lambda_handler(event, context):
output = []
for record in event['records']:
payload = base64.b64decode(record['data']).decode('utf-8')
output_record = {
'recordId': record['recordId'],
'result': 'Ok',
'data': base64.b64encode(payload.encode('utf-8')).decode('utf-8')
}
output.append(output_record)
return {'records': output}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Phone book search on AWS Lambda and S3 - python-3.x

Please refer to this documentation topic to learn how to write a Lambda function in Python. You are missing the Handler. See: AWS Lambda function handler in Python

Related

How to get AWS S3 object Location/URL using python 3.8?

Using AWS Lambda to read Kafka(MSK) event source

Getting Error While Inserting JSON data to DynamoDB using Python

get aws lambda to start transcribe job with filename without extension

aws firehose lambda function invocation gives wrong output strcuture format

Categories

Resources