How to use AWS Secrets Manager Caching for Python Lambda?

I am following the aws-secretsmanager-caching-python documentation and trying to cache a secret retrieved from Secrets Manager. However, for some reason, I always get a timeout, with no helpful errors to troubleshoot further. I am able to retrieve the secrets properly if I call Secrets Manager directly (without caching).
My main function in the Lambda function looks like this:
import botocore
import botocore.session
from aws_secretsmanager_caching import SecretCache, SecretCacheConfig
from cacheSecret import getCachedSecrets
def lambda_handler(event, context):
    result = getCachedSecrets()
    print(result)
and I have created cacheSecret as follows:
from aws_secretsmanager_caching import SecretCache
from aws_secretsmanager_caching import InjectKeywordedSecretString, InjectSecretString
cache = SecretCache()
@InjectKeywordedSecretString(secret_id='my_secret_name', cache=cache, secretKey1='keyname1', secretKey2='keyname2')
def getCachedSecrets(secretKey1, secretKey2):
    print(secretKey1)
    print(secretKey2)
    return secretKey1
In the above code, my_secret_name is the name of the secret created in Secrets Manager, and secretKey1 and secretKey2 are the secret key names, which have string values.
Error:
{
"errorMessage": "2021-03-31T15:29:08.598Z 01f5ded3-7658-4zb5-ae66-6f300098a6e47 Task timed out after 3.00 seconds"
}
Can someone please suggest what needs to be fixed in the above to make this work? Also, I am not sure where to define the secret name and secret key names if we don't use decorators.

Your function is hitting the Lambda timeout, which defaults to 3 seconds. The timeout can be raised in the function configuration: https://docs.aws.amazon.com/lambda/latest/dg/configuration-function-common.html
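For the second part of the question (where to put the secret name), here is a minimal sketch modelled on the library's README; it assumes the secret my_secret_name holds a JSON string with the keys keyname1 and keyname2. The decorator carries the secret_id and the mapping of function parameters to JSON keys, so the handler can call getCachedSecrets() with no arguments; without decorators, you pass the secret name to cache.get_secret_string() and parse the JSON yourself:
import json

import botocore.session
from aws_secretsmanager_caching import SecretCache, SecretCacheConfig, InjectKeywordedSecretString

# Create the cache once per container so it is reused across warm invocations.
client = botocore.session.get_session().create_client('secretsmanager')
cache = SecretCache(config=SecretCacheConfig(), client=client)

@InjectKeywordedSecretString(secret_id='my_secret_name', cache=cache, secretKey1='keyname1', secretKey2='keyname2')
def getCachedSecrets(secretKey1, secretKey2):
    # secretKey1/secretKey2 are injected from the cached secret's JSON keys.
    return secretKey1, secretKey2

def getCachedSecretsWithoutDecorator(secret_name='my_secret_name'):
    # Non-decorator path: fetch the cached secret string and parse it yourself.
    secret = json.loads(cache.get_secret_string(secret_name))
    return secret['keyname1'], secret['keyname2']
If the function still times out after this, the usual suspects are the 3-second default timeout mentioned above and a VPC-attached function with no route to Secrets Manager (via a NAT gateway or a Secrets Manager VPC endpoint).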

Related

Azure function failing : "statusCode": 413, "message": "request entity too large"

I have an Python script that creates a CSV file and loads it into an Azure Container. I want to run it as an Azure Function, but it is failing once I deploy it to Azure.
The code works fine in Google Colab (code here minus the connection string). It also works fine when I run it locally as an Azure function via the CLI (func start command).
Here is the code from the __init__.py file that is deployed:
import logging
import azure.functions as func
import numpy as np
import pandas as pd
from datetime import datetime
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__
import tempfile
def main(mytimer: func.TimerRequest) -> None:
    logging.info('Python trigger function.')
    temp_path = tempfile.gettempdir()
    dateTimeObj = datetime.now()
    timestampStr = dateTimeObj.strftime("%d%b%Y%H%M%S")
    filename = f"{timestampStr}.csv"
    df = pd.DataFrame(np.random.randn(5, 3), columns=['Column1', 'Column2', 'Colum3'])
    df.to_csv(f"{temp_path}{filename}", index=False)
    blob = BlobClient.from_connection_string(
        conn_str="My connection string",
        container_name="container2",
        blob_name=filename)
    with open(f"{temp_path}{filename}", "rb") as data:
        blob.upload_blob(data)
The deployment to Azure succeeds, but the function fails in Azure. When I look in the Azure Functions portal, the output from the Code + Test menu says
{
"statusCode": 413,
"message": "request entity too large"
}
The CSV file produced by the script is 327 B, so tiny. Besides that error message, I can't see any good information on what is causing the failure. Can anyone suggest a solution or a way forward?
Here is the requirements file contents
# Do not include azure-functions-worker as it may conflict with the Azure Functions platform
azure-functions
azure-storage-blob==12.8.1
logging
numpy==1.19.3
pandas==1.3.0
DateTime==4.3
backports.tempfile==1.0
azure-storage-blob==12.8.1
Here is what I have tried.
This article suggests the issue is linked to CORS (Cross-Origin Resource Sharing). However, adding https://functions.azure.com to the CORS allowed domains in the Resource sharing menu of my storage account's settings didn't solve the problem (I also republished the Azure function).
Any help, or suggestions would be appreciated.
I tried reproducing the issue with all the information you have provided and got the same error. After adding a wildcard value * to the allowed origins under CORS, the error went away (verified via the portal's Test/Run).
Alternatively, adding the following domain names to the CORS allowed origins also fixes the issue:
https://functions.azure.com
https://functions-staging.azure.com

Lambda function gets stuck when calling RDS via SQLalchemy URI

I have a FastAPI application. Initially, I was passing my DB URI through an ngrok tunnel, like this, in my SAM template. In this setup the Lambda uses the PostgreSQL database on my local machine.
DbConnnectionString:
Type: String
Default: postgresql://<uname>:<pwd>@x.tcp.ngrok.io:PORT/DB
This is how I read the URI in my Python code
# config.py
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

DATABASE_URL = os.environ.get('DB_URI')
db_engine = create_engine(DATABASE_URL)
db_session = sessionmaker(autocommit=False, autoflush=False, bind=db_engine)
print(f"Configs initialized for {API_V1_STR}")
# app.py
# 3rd party
from fastapi import FastAPI
# Custom
from config.app_config import PROJECT_NAME, db_engine
from models.db_models import Base
print("Creating all database")
Base.metadata.create_all(bind=db_engine)
app = FastAPI(title=PROJECT_NAME)
print("APP created")
In this setup, everything seems to work as expected.
But whenever I replace the DB URL with the RDS one, the call gets stuck at the Base.metadata.create_all step. When this happens the Lambda always times out and throws an exception.
If I run the code locally using uvicorn this error doesn't occur.
Everything works as expected.
When I use sam local invoke even with RDS URL, the API call works without any issues.
This problem occurs only while deployed in AWS Lambda.
I notice that the configs are initialized twice in this setup, once before START RequestId and once after.
I have tried reading up on it but not clear what could I do to fix this. Any help would be much appreciated.
It was my bad! I didn't pay attention to security groups. It was a connection timeout all along. Once I fixed the port access in the security groups, the Lambda started working as expected.
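To make this kind of misconfiguration easier to spot in the future, here is a hedged sketch of one option, assuming the default psycopg2 driver (which passes connect_timeout through to libpq): a short connect timeout makes the engine raise an OperationalError quickly instead of hanging until the Lambda timeout.
# Optional hardening (assumes the psycopg2 driver): fail fast when the database
# is unreachable instead of hanging until the Lambda timeout.
from sqlalchemy import create_engine

db_engine = create_engine(
    DATABASE_URL,
    connect_args={"connect_timeout": 5},  # seconds to wait for the TCP connection
    pool_pre_ping=True,                   # validate pooled connections before reuse
)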

Timeout when writing custom metric data to CloudWatch with AWS lambda

I'm running a vanilla AWS lambda function to count the number of messages in my RabbitMQ task queue:
import boto3
from botocore.vendored import requests
cloudwatch_client = boto3.client('cloudwatch')
def get_queue_count(user="user", password="password", domain="<my domain>/api/queues"):
    url = f"https://{user}:{password}@{domain}"
    res = requests.get(url)
    message_count = 0
    for queue in res.json():
        message_count += queue["messages"]
    return message_count

def lambda_handler(event, context):
    metric_data = [{'MetricName': 'RabbitMQQueueLength', "Unit": "None", 'Value': get_queue_count()}]
    print(metric_data)
    response = cloudwatch_client.put_metric_data(MetricData=metric_data, Namespace="RabbitMQ")
    print(response)
Which returns the following output on a test run:
Response:
{
"errorMessage": "2020-06-30T19:50:50.175Z d3945a14-82e5-42e5-b03d-3fc07d5c5148 Task timed out after 15.02 seconds"
}
Request ID:
"d3945a14-82e5-42e5-b03d-3fc07d5c5148"
Function logs:
START RequestId: d3945a14-82e5-42e5-b03d-3fc07d5c5148 Version: $LATEST
/var/runtime/botocore/vendored/requests/api.py:72: DeprecationWarning: You are using the get() function from 'botocore.vendored.requests'. This dependency was removed from Botocore and will be removed from Lambda after 2021/01/30. https://aws.amazon.com/blogs/developer/removing-the-vendored-version-of-requests-from-botocore/. Install the requests package, 'import requests' directly, and use the requests.get() function instead.
DeprecationWarning
[{'MetricName': 'RabbitMQQueueLength', 'Value': 295}]
END RequestId: d3945a14-82e5-42e5-b03d-3fc07d5c5148
You can see that I'm able to interact with the RabbitMQ API just fine--the function hangs when trying to post the metric.
The lambda function uses the IAM role put-custom-metric, which uses the policies recommended here, as well as CloudWatchFullAccess for good measure.
Resources on my internal load balancer, where my RabbitMQ server lives, are protected by a VPN, so I need to associate this function with the proper VPC/security group. I know this part is working, because otherwise the communication with RabbitMQ would fail.
I read this post where multiple contributors suggest increasing the function memory and timeout settings. I've done both of these, and the timeout persists.
I can run this locally without any issue to create the metric on CloudWatch in less than 5 seconds.
@noxdafox has written a brilliant plugin that got me most of the way there, but in the end I went with a pure Lambda-based solution. It was surprisingly tricky to get the CloudWatch plugin running with Docker, and afterwards I had trouble with the container shutting down its services and halting processing of the message queue. Additionally, I wanted to be able to normalize the queue count by the number of worker services in my ECS cluster, so I was going to need to connect to at least one AWS resource from within my VPC anyhow. I figured it was best to keep everything simple and in the same place.
import os
import boto3
from botocore.vendored import requests
USER = os.getenv("RMQ_USER")
PASSWORD = os.getenv("RMQ_PASSWORD")
cloudwatch_client = boto3.client(service_name='cloudwatch', endpoint_url="https://MYCLOUDWATCHURL.monitoring.us-east-1.vpce.amazonaws.com")
ecs_client = boto3.client(service_name='ecs', endpoint_url="https://vpce-MYECSURL.ecs.us-east-1.vpce.amazonaws.com")
def get_message_count(user=USER, password=PASSWORD, domain="rabbitmq.stockbets.io/api/queues"):
    url = f"https://{user}:{password}@{domain}"
    res = requests.get(url)
    message_count = 0
    for queue in res.json():
        message_count += queue["messages"]
    print(f"message count: {message_count}")
    return message_count

def get_worker_count():
    worker_data = ecs_client.describe_services(cluster="prod", services=["worker"])
    worker_count = worker_data["services"][0]["runningCount"]
    print(f"worker count: {worker_count}")
    return worker_count

def lambda_handler(event, context):
    message_count = get_message_count()
    worker_count = get_worker_count()
    print(f"msgs per worker: {message_count / worker_count}")
    metric_data = [
        {'MetricName': 'MessagesPerWorker', "Unit": "Count", 'Value': message_count / worker_count},
        {'MetricName': 'NTasks', "Unit": "Count", 'Value': worker_count}
    ]
    cloudwatch_client.put_metric_data(MetricData=metric_data, Namespace="RabbitMQ")
Creating the VPC endpoints was easier than I thought it would be. For CloudWatch, you want to search for the "monitoring" VPC endpoint during the creation step (not "cloudwatch" or "logs"). Searching for "ecs" gets you what you need for the ECS connection.
Once your Lambda is up, you need to configure the metric and accompanying alarms, and then tie those to an auto-scaling policy, but that's probably beyond the scope of this post. Leave a comment if you have questions about how I worked that out.
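For what it's worth, here is a hedged sketch of what wiring the published metric to an alarm could look like with boto3; the alarm name, threshold, and scaling policy ARN are placeholders rather than values from my actual setup:
# Illustrative only: the alarm name, threshold, and policy ARN are placeholders.
cloudwatch_client.put_metric_alarm(
    AlarmName='rabbitmq-messages-per-worker-high',
    Namespace='RabbitMQ',
    MetricName='MessagesPerWorker',
    Statistic='Average',
    Period=60,
    EvaluationPeriods=2,
    Threshold=10,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['<scaling-policy-arn>'],  # placeholder for the auto-scaling policy ARN
)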
The only reason you might want to use a Lambda function to achieve your goal is if you do not own the RabbitMQ cluster. The fact that your logic hangs during communication suggests a network issue, most likely due to misconfigured security groups.
If you can change the cluster configuration, I'd suggest you to install and configure the CloudWatch metrics exporter plugin which does most of the heavy-lifting work for you.
If your cluster runs on Docker, I believe the custom Docker file to be the best solution. If you run your Docker instances in AWS via ECS/Fargate, the plugin should be able to automatically infer the credentials from the Task Role through ExAws. Otherwise, just follow the README instructions on how to set the credentials yourself.

list_datasets() method does nothing in AWS Lambda

I am trying to get the list of datasets from BigQuery inside an AWS Lambda function. But while executing the client.list_datasets() method, nothing happens and the Lambda times out.
My code is as follows:
from google.cloud.bigquery import Client
from google.oauth2.service_account import Credentials
credentials = Credentials.from_service_account_info(
    service_account_dict)
client = Client(
    project=service_account_dict.get("project_id"),
    credentials=credentials
)
datasets = client.list_datasets()
print(datasets)
for dataset in datasets:
    print("dataset info", dataset.__dict__)
The output of the first print statement is:
<google.api_core.page_iterator.HTTPIterator object at 0x7fbae4975550>
But the second print of dataset.__dict__ is never reached; looping over the HTTPIterator object never completes.
BTW, the code works perfectly fine on my local machine.
The AWS VPC that I used for the Lambda function was causing this issue. The VPC blocked requests to external APIs (in my case, the BigQuery API).
Configuring the VPC subnets and a NAT gateway so the Lambda function has outbound internet access (a 0.0.0.0/0 route through the NAT gateway) solved the issue.
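This also explains why the first print succeeded: list_datasets() returns a lazy HTTPIterator, and the HTTP request is only issued once you start iterating. A small sketch that surfaces the failure quickly instead of waiting for the Lambda timeout (the 10-second value is an arbitrary choice):
# The API call happens here, on iteration, not when list_datasets() returns.
# timeout (seconds) bounds each underlying HTTP request so a blocked network
# fails fast instead of hanging until the Lambda timeout.
datasets = client.list_datasets(timeout=10)
for dataset in datasets:
    print("dataset id:", dataset.dataset_id)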

boto3 s3 connection error: An error occurred (SignatureDoesNotMatch) when calling the ListBuckets operation

I'm using the boto3 package to connect to an S3 bucket from outside AWS (i.e. the script is currently not being run within the AWS 'cloud', but from my MBP connecting to the relevant bucket). My code:
s3 = boto3.resource(
    "s3",
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
)
bucket = s3.Bucket(self.settings['S3']['bucket_test'])
for bucket_in_all in boto3.resource('s3').buckets.all():
    if bucket_in_all.name == self.settings['S3']['bucket_test']:
        print("Bucket {} verified".format(self.settings['S3']['bucket_test']))
Now I'm receiving this error message:
botocore.exceptions.ClientError: An error occurred (SignatureDoesNotMatch) when calling the ListBuckets operation
I'm aware of the sequence in which AWS credentials are checked, and have tried different permutations of my environment variables and ~/.aws/credentials, and I know that the credentials in my .py script should take precedence; however, I'm still seeing this SignatureDoesNotMatch error message. Any ideas where I may be going wrong? I've also tried:
# Create a session
session = boto3.session.Session(
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
    aws_session_token=self.settings['CREDENTIALS']['session_token'],
    region_name=self.settings['CREDENTIALS']['region_name']
)
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
...however I also see the same error traceback.
Actually, this was partly answered by @John Rotenstein and @bdcloud; nevertheless I need to be more specific...
The following code in my case was not necessary and was causing the error message:
# Create a session
session = boto3.session.Session(
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
    aws_session_token=self.settings['CREDENTIALS']['session_token'],
    region_name=self.settings['CREDENTIALS']['region_name']
)
The credentials now stored in self.settings mirror ~/.aws/credentials. Weirdly (and like last week, where the reverse happened), I now have access. It could be that a simple reboot of my laptop meant that the new credentials in ~/.aws/credentials (I updated these again yesterday) were then 'accepted'.
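One detail worth noting about the second snippet in the question: the session is created with explicit credentials, but boto3.resource('s3') is then called on the default session, so those credentials are never used. If you do want the explicit credentials to apply, a minimal sketch (using the same placeholder settings keys as in the question) would build the resource from the session:
# Build the resource from the explicit session so its credentials are actually used;
# boto3.resource('s3') on its own falls back to the default credential chain.
session = boto3.session.Session(
    aws_access_key_id=self.settings['CREDENTIALS']['aws_access_key_id'],
    aws_secret_access_key=self.settings['CREDENTIALS']['aws_secret_access_key'],
    aws_session_token=self.settings['CREDENTIALS']['session_token'],
    region_name=self.settings['CREDENTIALS']['region_name'],
)
s3 = session.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)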
