Error while connecting to Redshift from an AWS Lambda function - python-3.x

I am trying to connect to Redshift from AWS Lambda Python code using the psycopg2 library. When running the same code from an EC2 instance I do not get any error, but from Lambda I get the error response below.
{
  "errorMessage": "FATAL: no pg_hba.conf entry for host \"::xxxxx\", user \"xxxx\", database \"xxxx\", SSL off\n",
  "errorType": "OperationalError",
  "stackTrace": [
    [
      "/var/task/aws_unload_to_s3_audit.py",
      86,
      "lambda_handler",
      "mainly()"
    ],
    [
      "/var/task/aws_unload_to_s3_audit.py",
      74,
      "mainly",
      "con = psycopg2.connect(conn_string)"
    ],
    [
      "/var/task/psycopg2/__init__.py",
      130,
      "connect",
      "conn = _connect(dsn, connection_factory=connection_factory, **kwasync)"
    ]
  ]
}

My suggestion would be to check the network configuration for Redshift; chances are that the connection is being refused.
Places to check -
Redshift security group
VPC configuration, if the Lambda resides in a private subnet
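Since the error message also reports "SSL off", it may be worth retrying the connection with SSL explicitly enabled once the network path is confirmed. A minimal sketch, assuming placeholder host, credentials, and database (none of these values come from the question):

import psycopg2

# Placeholder connection parameters; substitute your own cluster endpoint and credentials.
conn_string = (
    "host=my-cluster.xxxx.us-east-1.redshift.amazonaws.com "
    "port=5439 dbname=mydb user=myuser password=mypassword "
    "sslmode=require"  # request an SSL connection to the cluster
)

con = psycopg2.connect(conn_string)
with con.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())
con.close()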

Related

AWS Lambda, boto3 - start instances, error while testing (not traceable)

I am trying to create a Lambda function to automatically start/stop/reboot some instances (with some additional tasks in the future).
I created the IAM role with a policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:StartInstances",
        "ec2:StopInstances",
        "ec2:RebootInstances"
      ],
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/critical": "true"
        }
      },
      "Resource": [
        "arn:aws:ec2:*:<12_digits>:instance/*"
      ],
      "Effect": "Allow"
    }
  ]
}
The Lambda function has been granted access to the correct VPC, subnet, and security group.
I assigned the role to a new Lambda function (Python 3.9):
import boto3
from botocore.exceptions import ClientError

# instance IDs copied from my AWS Console
instances = ['i-xx1', 'i-xx2', 'i-xx3', 'i-xx4']
ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    print(str(instances))
    try:
        print('The break occurs here \u2193')
        response = ec2.start_instances(InstanceIds=instances, DryRun=True)
    except ClientError as e:
        print(e)
        if 'DryRunOperation' not in str(e):
            print("You don't have permission to reboot instances.")
            raise
    try:
        response = ec2.start_instances(InstanceIds=instances, DryRun=False)
        print(response)
    except ClientError as e:
        print(e)
    return response
I cannot find where the error occurs, because there is no message about it in the test output. I thought it might be a matter of execution time, so I set the time limit to 5 minutes to rule that out. For example:
Test Event Name
chc_lambda_test1
Response
{
  "errorMessage": "2022-07-30T19:15:40.088Z e037d31d-5658-40b4-8677-1935efd3fdb7 Task timed out after 300.00 seconds"
}
Function Logs
START RequestId: e037d31d-5658-40b4-8677-1935efd3fdb7 Version: $LATEST
['i-xx', 'i-xx', 'i-xx', 'i-xx']
The break occurs here ↓
END RequestId: e037d31d-5658-40b4-8677-1935efd3fdb7
REPORT RequestId: e037d31d-5658-40b4-8677-1935efd3fdb7 Duration: 300004.15 ms Billed Duration: 300000 ms Memory Size: 128 MB Max Memory Used: 79 MB Init Duration: 419.46 ms
2022-07-30T19:15:40.088Z e037d31d-5658-40b4-8677-1935efd3fdb7 Task timed out after 300.00 seconds
Request ID
e037d31d-5658-40b4-8677-1935efd3fdb7
I also tried increasing the Lambda memory, but that did not help (memory is not the issue, since Max Memory Used is only 79 MB).
The main reason the issue occurred is the lack of internet access in the subnet assigned to the Lambda function. I added (as Ervin Szilagyi suggested) a VPC endpoint (assigned to the subnet and security group).
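For reference, an interface VPC endpoint for the EC2 API can also be created with boto3. A minimal sketch, where the region, VPC, subnet, and security group IDs are placeholders rather than values from the question:

import boto3

ec2 = boto3.client('ec2', region_name='eu-west-1')  # placeholder region

# Create an interface endpoint so the Lambda's private subnet can reach the EC2 API
# without internet access (all IDs below are placeholders).
response = ec2.create_vpc_endpoint(
    VpcEndpointType='Interface',
    VpcId='vpc-0123456789abcdef0',
    ServiceName='com.amazonaws.eu-west-1.ec2',
    SubnetIds=['subnet-0123456789abcdef0'],
    SecurityGroupIds=['sg-0123456789abcdef0'],
    PrivateDnsEnabled=True,
)
print(response['VpcEndpoint']['VpcEndpointId'])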
The next step was to provide authorization. Following the idea in Unauthorized operation error occurs when using Boto3 to launch an EC2 instance with an IAM role, I added the IAM access key and secret key to the client invocation:
ec2 = boto3.client(
    'ec2',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)
However, please be careful with the security settings. I am a new user (working on my private projects at the moment), so you should not treat this solution as secure.

Why can't my AWS Lambda node JS app access my MongoDB Atlas cluster?

Context :
I have just created a Node.JS application and deployed it with the Serverless Framework on AWS Lambda.
Problem :
I would like that application to be able to access my (free tier) MongoDB Atlas Cluster. For this I am using mongoose.
Setup :
I have an IAM user with AdministratorAccess rights. This user has been authorized on my MongoDB cluster.
I am using authMechanism=MONGODB-AWS, therefore using the token and secret of that IAM user. The password has been correctly URL-encoded.
This is the piece of code used to create a connection:
const uri = "mongodb+srv://myIAMtoken:myIAMsecret@cluster0.tfws6.mongodb.net/DBNAME?authSource=%24external&authMechanism=MONGODB-AWS&retryWrites=true&w=majority"
mongoose.connect(uri, {useNewUrlParser: true, useUnifiedTopology: true })
When I run this code on my laptop, the connection is made and I can retrieve the data needed.
However, when I deploy this exact same code on AWS Lambda (through Serverless), I get this response:
message "Internal server error"
The trace on CloudWatch looks like this :
{
  "errorType": "Runtime.UnhandledPromiseRejection",
  "errorMessage": "MongoError: bad auth : aws sts call has response 403",
  "reason": {
    "errorType": "MongoError",
    "errorMessage": "bad auth : aws sts call has response 403",
    "code": 8000,
    "ok": 0,
    "codeName": "AtlasError",
    "name": "MongoError",
    "stack": [
      "MongoError: bad auth : aws sts call has response 403",
      " at MessageStream.messageHandler (/var/task/node_modules/mongodb/lib/cmap/connection.js:268:20)",
      " at MessageStream.emit (events.js:314:20)",
      " at processIncomingData (/var/task/node_modules/mongodb/lib/cmap/message_stream.js:144:12)",
      " at MessageStream._write (/var/task/node_modules/mongodb/lib/cmap/message_stream.js:42:5)",
      " at doWrite (_stream_writable.js:403:12)",
      " at writeOrBuffer (_stream_writable.js:387:5)",
      " at MessageStream.Writable.write (_stream_writable.js:318:11)",
      " at TLSSocket.ondata (_stream_readable.js:718:22)",
      " at TLSSocket.emit (events.js:314:20)",
      " at addChunk (_stream_readable.js:297:12)",
      " at readableAddChunk (_stream_readable.js:272:9)",
      " at TLSSocket.Readable.push (_stream_readable.js:213:10)",
      " at TLSWrap.onStreamRead (internal/stream_base_commons.js:188:23)"
    ]
  },
  "promise": {},
  "stack": [
    "Runtime.UnhandledPromiseRejection: MongoError: bad auth : aws sts call has response 403",
    " at process.<anonymous> (/var/runtime/index.js:35:15)",
    " at process.emit (events.js:314:20)",
    " at processPromiseRejections (internal/process/promises.js:209:33)",
    " at processTicksAndRejections (internal/process/task_queues.js:98:32)"
  ]
}
I thought it was a network access issue from AWS, so I tried fetching "http://google.com": no problem, my Node app could access the page and return the response. So my app has internet access but cannot reach my MongoDB cloud instance. My MongoDB cluster is accessible from any IP address.
This is reaching the limits of my knowledge :-)
If you are using an IAM-type MongoDB user, then you don't need the username and password in the connection string.
const uri = "mongodb+srv://cluster0.tfws6.mongodb.net/DBNAME?authSource=$external&authMechanism=MONGODB-AWS&retryWrites=true&w=majority"
When you invoke your Lambda to connect to the MongoDB cluster, the IAM role it uses is the execution role of the Lambda:
"arn:aws:iam::ACCOUNT_ID:role/SLS_SERVICE_NAME-ENVIRONMENT-AWS_REGION-lambdaRole"
for example:
"arn:aws:iam::123456789012:role/awesome-service-dev-us-east-1-lambdaRole"
Check the default IAM section of the Serverless Framework docs:
https://www.serverless.com/framework/docs/providers/aws/guide/iam/#the-default-iam-role

Input format for Tensorflow models on GCP AI Platform

I have uploaded a model to GCP AI Platform Models. It's a simple Keras multistep model with 5 features trained on 168 lagged values. When I try to test the model, I get this strange error message:
"error": "Prediction failed: Error during model execution: <_MultiThreadedRendezvous of RPC that terminated with:\n\tstatus = StatusCode.FAILED_PRECONDITION\n\tdetails = \"Error while reading resource variable dense_7/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_7/bias)\n\t [[{{node model_2/dense_7/BiasAdd/ReadVariableOp}}]]\"\n\tdebug_error_string = \"{\"created\":\"#1618946146.138507164\",\"description\":\"Error received from peer ipv4:127.0.0.1:8081\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":1061,\"grpc_message\":\"Error while reading resource variable dense_7/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense_7/bias)\\n\\t [[{{node model_2/dense_7/BiasAdd/ReadVariableOp}}]]\",\"grpc_status\":9}\"\n>"
The input is in the following format: a list of shape (1, 168, 5).
See the example below:
{
"instances":
[[[ 3.10978284e-01, 2.94650396e-01, 8.83664149e-01,
1.60210423e+00, -1.47402699e+00],
[ 3.10978284e-01, 2.94650396e-01, 5.23466315e-01,
1.60210423e+00, -1.47402699e+00],
[ 8.68576328e-01, 7.78699823e-01, 2.83334426e-01,
1.60210423e+00, -1.47402699e+00]]]
}
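For reference, a payload of that shape can be assembled like this. This is only a minimal sketch; the array below is random placeholder data, not the real features:

import json
import numpy as np

# Build a dummy batch of shape (1, 168, 5): 1 sample, 168 lagged time steps, 5 features.
window = np.random.randn(1, 168, 5)

# AI Platform expects a JSON body with an "instances" key holding plain nested lists.
payload = {"instances": window.tolist()}
print(json.dumps(payload)[:200])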

Node AWS Lambda fetch request failing

I am using node-fetch to perform a request to an API (hosted on AWS Lambda/API Gateway with Serverless Framework) from a lambda. The lambda is failing with the below invocation error:
{
  "errorType": "FetchError",
  "errorMessage": "request to https://[API].us-east-2.amazonaws.com/[ENDPOINT] failed, reason: connect ETIMEDOUT [IP]:443",
  "code": "ETIMEDOUT",
  "message": "request to https://[API].us-east-2.amazonaws.com/[ENDPOINT] failed, reason: connect ETIMEDOUT [IP]:443",
  "type": "system",
  "errno": "ETIMEDOUT",
  "stack": [
    "FetchError: request to https://[API].us-east-2.amazonaws.com/[ENDPOINT] failed, reason: connect ETIMEDOUT [IP]:443",
    " at ClientRequest.<anonymous> (/var/task/node_modules/node-fetch/lib/index.js:1461:11)",
    " at ClientRequest.emit (events.js:315:20)",
    " at TLSSocket.socketErrorListener (_http_client.js:426:9)",
    " at TLSSocket.emit (events.js:315:20)",
    " at emitErrorNT (internal/streams/destroy.js:92:8)",
    " at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)",
    " at processTicksAndRejections (internal/process/task_queues.js:84:21)"
  ]
}
Here is the lambda in question with extraneous code removed:
"use strict";
import { PrismaClient } from "#prisma/client";
import fetch from "node-fetch";
const prisma = new PrismaClient();
module.exports.handler = async (event, context, callback) => {
const users = await prisma.user.findMany();
for (const user of users) {
await fetch(...); // this is where the error occurs
}
};
The code works fine locally (both the code in the lambda itself and manually making the request). Because of that, I thought this might be fixed by setting up a NAT for the lambda or configuring the VPC to have external internet access, though I'm not sure how to do that with Serverless Framework if that is indeed the issue. The lambda attempting to perform the fetch request is in the same VPC as the API. Any help or ideas is greatly appreciated!
I solved this by adding a VPC endpoint for the lambda function. I believe an alternative solution (though possibly more expensive) is to set up a NAT gateway for the Lambda.
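For completeness, the NAT gateway alternative mentioned above can be sketched with boto3 roughly as follows. All IDs are placeholders, and this assumes the Lambda sits in a private subnet whose route table you control:

import boto3

ec2 = boto3.client('ec2', region_name='us-east-2')  # placeholder region

# Allocate an Elastic IP and create a NAT gateway in a *public* subnet (placeholder IDs).
eip = ec2.allocate_address(Domain='vpc')
nat = ec2.create_nat_gateway(
    SubnetId='subnet-0aaa1111bbbb22223',   # public subnet
    AllocationId=eip['AllocationId'],
)
nat_id = nat['NatGateway']['NatGatewayId']
ec2.get_waiter('nat_gateway_available').wait(NatGatewayIds=[nat_id])

# Route the private subnet's outbound traffic through the NAT gateway.
ec2.create_route(
    RouteTableId='rtb-0ccc3333dddd44445',  # route table of the Lambda's private subnet
    DestinationCidrBlock='0.0.0.0/0',
    NatGatewayId=nat_id,
)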

Unable to build local AMLS environment with private wheel

I am trying to write a small program using the AzureML Python SDK (v1.0.85) to register an Environment in AMLS and use that definition to construct a local Conda environment when experiments are run (for a pre-trained model). The code works fine for simple scenarios where all dependencies are loaded from Conda / public PyPI, but when I introduce a private dependency (e.g. a utils library) I get an InternalServerError with the message "Error getting recipe specifications".
The code I am using to register the environment is (after having authenticated to Azure and connected to our workspace):
from pathlib import Path

from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

# ws (Workspace) and config are defined earlier, after authenticating to Azure
# and connecting to our workspace.
environment_name = config['environment']['name']
py_version = "3.7"
conda_packages = ["pip"]
pip_packages = ["azureml-defaults"]
private_packages = ["./env-wheels/utils-0.0.3-py3-none-any.whl"]

print(f"Creating environment with name {environment_name}")
environment = Environment(name=environment_name)
conda_deps = CondaDependencies()

print(f"Adding Python version: {py_version}")
conda_deps.set_python_version(py_version)

for conda_pkg in conda_packages:
    print(f"Adding Conda dependency: {conda_pkg}")
    conda_deps.add_conda_package(conda_pkg)

for pip_pkg in pip_packages:
    print(f"Adding Pip dependency: {pip_pkg}")
    conda_deps.add_pip_package(pip_pkg)

for private_pkg in private_packages:
    print(f"Uploading private wheel from {private_pkg}")
    private_pkg_url = Environment.add_private_pip_wheel(workspace=ws, file_path=Path(private_pkg).absolute(), exist_ok=True)
    print(f"Adding private Pip dependency: {private_pkg_url}")
    conda_deps.add_pip_package(private_pkg_url)

environment.python.conda_dependencies = conda_deps
environment.register(workspace=ws)
And the code I am using to create the local Conda environment is:
amls_environment = Environment.get(ws, name=environment_name, version=environment_version)
print(f"Building environment...")
amls_environment.build_local(workspace=ws)
The exact error message being returned when build_local(...) is called is:
Traceback (most recent call last):
  File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\core\environment.py", line 814, in build_local
    raise error
  File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\core\environment.py", line 807, in build_local
    recipe = environment_client._get_recipe_for_build(name=self.name, version=self.version, **payload)
  File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\_restclient\environment_client.py", line 171, in _get_recipe_for_build
    raise Exception(message)
Exception: Error getting recipe specifications. Code: 500
: {
  "error": {
    "code": "ServiceError",
    "message": "InternalServerError",
    "detailsUri": null,
    "target": null,
    "details": [],
    "innerError": null,
    "debugInfo": null
  },
  "correlation": {
    "operation": "15043e1469e85a4c96a3c18c45a2af67",
    "request": "19231be75a2b8192"
  },
  "environment": "westeurope",
  "location": "westeurope",
  "time": "2020-02-28T09:38:47.8900715+00:00"
}
Process finished with exit code 1
Has anyone seen this error before or able to provide some guidance around what the issue may be?
The issue was with our firewall blocking the required requests between AMLS and the storage container (I presume to fetch the environment definitions / private wheels).
We resolved this by updating the firewall with appropriate ALLOW rules for the AMLS service to contact and read from the attached storage container.
Assuming you'd like to run the script on a remote compute target, my suggestion would be to pass the environment you just retrieved to a RunConfiguration, then pass that to a ScriptRunConfig, Estimator, or PythonScriptStep:
from azureml.core import ScriptRunConfig
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
src = ScriptRunConfig(source_directory=project_folder, script='train.py')
# Set compute target to the one created in previous step
src.run_config.target = cpu_cluster.name
# Set environment
amls_environment = Environment.get(ws, name=environment_name, version=environment_version)
src.run_config.environment = amls_environment
run = experiment.submit(config=src)
run
Check out the rest of the notebook here.
If you're looking for a local run this notebook might help.
