I have a Lambda deployed on AWS, in a VPC with internet access via a NAT gateway. It is deployed with the Serverless Framework.
The Lambda uses some Middy middlewares and fetches some credentials from SSM.
The problem is that the SSM fetch randomly times out!
Here's the lambda code:
/* requirements are omitted */
const authorize = async (_event, _context) => {
  try {
    const ssm = new SSM({
      maxRetries: 6, // lowers a chance to hit service rate limits, default is 3
      retryDelayOptions: { base: 200 }
    })
    const params = {
      Names: ["param1", "param2"],
      WithDecryption: true
    }
    const fetch = () => new Promise(resolve => {
      ssm.getParameters(params, function(err, data) {
        if (err) resolve(err, err.stack); // an error occurred
        else resolve(data); // successful response
      })
    })
    const res = await fetch()
    return {
      statusCode: 200,
      body: JSON.stringify(res)
    }
  } catch (_err) {
    console.error(_err)
    return {
      statusCode: 500,
      body: 'error'
    }
  }
}

export default middy(authorize)
  .use(warmup({ waitForEmptyEventLoop: false }))
  .use(doNotWaitForEmptyEventLoop({ runOnError: true }))
  .use(httpSecurityHeaders())
The Lambda is timing out because SSM is throttling you; with your current configuration (6 retries, 200 ms base delay) it takes around 26 seconds before your Lambda gives up.
You are running into the SSM standard-throughput limits here.
You can enable increased throughput with:
aws ssm update-service-setting --setting-id arn:aws:ssm:*region*:*account-id*:servicesetting/ssm/parameter-store/high-throughput-enabled --setting-value true
Be aware that an extra cost is incurred for every GetParameters call afterwards ($0.05 per 10,000 requests).
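If you prefer to flip this setting from code rather than the CLI, a minimal sketch using the same SDK might look like this (it assumes the short-form setting ID and that the caller has the ssm:UpdateServiceSetting permission):

const { SSM } = require('aws-sdk')

// Assumption: region comes from the Lambda environment; the setting ID is the
// short form of the ARN used in the CLI command above.
const ssm = new SSM({ region: process.env.AWS_REGION })

const enableHighThroughput = () =>
  ssm.updateServiceSetting({
    SettingId: '/ssm/parameter-store/high-throughput-enabled',
    SettingValue: 'true'
  }).promise()

enableHighThroughput().catch(console.error)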
I have a lambda function using Node 12.
I need to add a new connection to a Redis database hosted in AWS ElastiCache.
Both are in one private VPC and the security groups/subnets are configured properly.
Solution:
globals.js:
const redis = require('redis');

const redisClient = redis.createClient(
  `redis://${process.env.REDIS_HOST}:${process.env.REDIS_PORT}/${process.env.REDIS_DB}`,
);

redisClient.on('error', (err) => {
  console.log('REDIS CLIENT ERROR:' + err);
});

module.exports.globals = {
  REDIS: require('../helpers/redis')(redisClient),
};
index.js (outside handler):
const { globals } = require('./config/globals');
global.app = globals;
const lambda_handler = (event, context, callback) => { ... }
exports.handler = lambda_handler;
helpers/redis/index.js:
const get = require('./get');
module.exports = (redisClient) => {
return {
get: get(redisClient)
};
};
helpers/redis/get.js:
module.exports = (redisClient) => {
  return (key, cb) => {
    redisClient.get(key, (err, reply) => {
      if (err) {
        cb(err);
      } else {
        cb(null, reply);
      }
    });
  };
};
Function call:
app.REDIS.get(redisKey, (err, reply) => {
  console.log(`REDIS GET: ${err} ${reply}`);
});
Problem:
When increasing the Lambda timeout to a value greater than the Redis timeout, I get this error:
REDIS CLIENT ERROR:Error: Redis connection to ... failed - connect ETIMEDOUT ...
Addition:
I tried quitting/closing the connection after each transaction:
module.exports = (redisClient) => {
  return (cb) => {
    redisClient.quit((err, reply) => {
      if (err) {
        cb(err);
      } else {
        cb(null, reply);
      }
    });
  };
};
app.REDIS.get(redisKey, (err, reply) => {
  console.log(`REDIS GET: ${err} ${reply}`);
  if (err) {
    cb(err);
  } else {
    if (reply) {
      app.REDIS.quit(() => {
        cb()
      });
    }
  }
})
Error:
REDIS GET: AbortError: GET can't be processed. The connection is already closed.
Extra Notes:
I have to use callbacks, which is why I pass them in the examples above
I'm using "redis": "^3.0.2"
It's not a configuration issue: the cache was accessed hundreds of times in a short period before it started giving the timeout errors.
Everything works normally locally
"It's not a configuration issue: the cache was accessed hundreds of times in a short period before it started giving the timeout errors."
I think this is the origin of the issue: the Redis database has probably hit its size limit and cannot process new data. Can you delete old data from it?
It is also possible that ElastiCache has a limit on new TCP client connections, and once it is exhausted, new connections are refused with an error message similar to the one you mentioned.
If the Redis client in a Lambda function cannot establish a connection, the Lambda invocation fails and a new one is started. The new Lambda opens one more connection to Redis, Redis cannot process it, one more Lambda is started, and so on.
So, at some moment, we hit the limit on active Redis connections and the system is in a deadlock.
I think you can temporarily stop all Lambda functions and scale up the ElastiCache Redis database.
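Not from the answer above, but one way to keep connections from piling up across invocations is to open the client inside the handler and always quit it before returning. A minimal sketch, assuming the same redis@^3 callback API and environment variables as in the question:

const redis = require('redis');

exports.handler = (event, context, callback) => {
  // Open a fresh connection per invocation instead of reusing a module-level client.
  const client = redis.createClient(
    `redis://${process.env.REDIS_HOST}:${process.env.REDIS_PORT}/${process.env.REDIS_DB}`,
  );

  client.on('error', (err) => console.log('REDIS CLIENT ERROR:' + err));

  client.get(event.redisKey, (err, reply) => {
    // Always close the connection so idle clients don't accumulate in ElastiCache.
    client.quit(() => {
      if (err) {
        callback(err);
      } else {
        callback(null, { statusCode: 200, body: JSON.stringify(reply) });
      }
    });
  });
};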
When using the @opentelemetry/plugin-https and the aws-sdk together in a Node.js application, the OpenTelemetry plugin adds the traceparent header to each AWS request. This works fine if the aws-sdk does not need to retry. When the aws-sdk retries a request, the following errors can occur:
InvalidSignatureException: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.
SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.
The first AWS request contains the following headers:
traceparent: '00-32c9b7adee1da37fad593ee38e9e479b-875169606368a166-01'
Authorization: 'AWS4-HMAC-SHA256 Credential=<credential>, SignedHeaders=host;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-target, Signature=<signature>'
Note that the SignedHeaders doesn't include traceparent.
The retried request contains the following headers:
traceparent: '00-c573e391a455a207469ffa4fb75b3cab-6f20c315628cfcc0-01'
Authorization: AWS4-HMAC-SHA256 Credential=<credential>, SignedHeaders=host;traceparent;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-target, Signature=<signature>
Note that the SignedHeaders does include traceparent.
Before the retry request is sent, @opentelemetry/plugin-https sets a new traceparent header, and this makes the signature of the AWS request invalid.
Here is code which reproduces the issue (you may need to run the script a few times before hitting the rate limit which causes the retries):
const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require("@opentelemetry/node");
const { SimpleSpanProcessor } = require("@opentelemetry/tracing");
const { JaegerExporter } = require("@opentelemetry/exporter-jaeger");

const provider = new NodeTracerProvider({
  plugins: {
    https: {
      enabled: true,
      path: "@opentelemetry/plugin-https"
    }
  }
});

const exporter = new JaegerExporter({ serviceName: "test" });
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

const AWS = require("aws-sdk");

const main = async () => {
  const cwl = new AWS.CloudWatchLogs({ region: "us-east-1" });
  const promises = new Array(100).fill(true).map(() => new Promise((resolve, reject) => {
    cwl.describeLogGroups(function (err, data) {
      if (err) {
        console.log(err.name);
        console.log("Got error:", err.message);
        console.log("ERROR Request Authorization:");
        console.log(this.request.httpRequest.headers.Authorization);
        console.log("ERROR Request traceparent:");
        console.log(this.request.httpRequest.headers.traceparent);
        console.log("Retry count:", this.retryCount);
        reject(err);
        return;
      }
      resolve(data);
    });
  }));
  const result = await Promise.all(promises);
  console.log(result.length);
};

main().catch(console.error);
Possible solutions:
Ignore all calls to AWS in the @opentelemetry/plugin-https (see the sketch after this list).
Ignoring the calls to AWS will lead to losing all spans for AWS requests.
Add the traceparent header to the unsignableHeaders in the aws-sdk - AWS.Signers.V4.prototype.unsignableHeaders.push("traceparent");
Changing the prototype seems like a hack and also doesn't handle the case where another node module uses a different version of the aws-sdk.
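For reference, the first option could be wired up roughly like this, assuming the plugin version in use supports the ignoreOutgoingUrls option of the HTTP plugin configuration (and accepting the drawback above that all AWS spans are lost):

const { NodeTracerProvider } = require("@opentelemetry/node");

// Skip tracing any outgoing request to *.amazonaws.com so the traceparent
// header is never injected into signed AWS requests.
const provider = new NodeTracerProvider({
  plugins: {
    https: {
      enabled: true,
      path: "@opentelemetry/plugin-https",
      ignoreOutgoingUrls: [/amazonaws\.com/]
    }
  }
});

provider.register();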
Is there another solution which could allow me to keep the spans for aws requests and guarantees that the signature of all aws requests will be correct?
Update (16.12.2020):
The issue seems to be fixed in the AWS SDK v3.
The following code throws the correct error (ThrottlingException):
const opentelemetry = require("@opentelemetry/api");
const { NodeTracerProvider } = require("@opentelemetry/node");
const { SimpleSpanProcessor } = require("@opentelemetry/tracing");
const { JaegerExporter } = require("@opentelemetry/exporter-jaeger");
const { CloudWatchLogs } = require("@aws-sdk/client-cloudwatch-logs");

const provider = new NodeTracerProvider({
  plugins: {
    https: {
      enabled: true,
      path: "@opentelemetry/plugin-https"
    }
  }
});

const exporter = new JaegerExporter({ serviceName: "test" });
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

const main = async () => {
  const cwl = new CloudWatchLogs({ region: "us-east-1" });
  const promises = new Array(100).fill(true).map(() => new Promise((resolve, reject) => {
    cwl.describeLogGroups({ limit: 50 })
      .then(resolve)
      .catch((err) => {
        console.log(err.name);
        console.log("Got error:", err.message);
        reject(err);
      });
  }));
  const result = await Promise.all(promises);
  console.log(result.length);
};

main().catch(console.error);
I have a function which accesses multiple AWS resources and I now need to test it, but I don't know how to mock these resources.
I have tried following the aws-sdk-mock GitHub documentation, but didn't get far with it.
function someData(event, configuration, callback) {
  // sts set-up
  var sts = new AWS.STS(configuration.STS_CONFIG);
  sts.assumeRole({
    DurationSeconds: 3600,
    RoleArn: process.env.CROSS_ACCOUNT_ROLE,
    RoleSessionName: configuration.ROLE_NAME
  }, function(err, data) {
    if (err) {
      // an error occurred
      console.log(err, err.stack);
    } else {
      // successful response
      // resolving static credential
      var creds = new AWS.Credentials({
        accessKeyId: data.Credentials.AccessKeyId,
        secretAccessKey: data.Credentials.SecretAccessKey,
        sessionToken: data.Credentials.SessionToken
      });
      // Query function
      var dynamodb = new AWS.DynamoDB({ apiVersion: configuration.API_VERSION, credentials: creds, region: configuration.REGION });
      var docClient = new AWS.DynamoDB.DocumentClient({ apiVersion: configuration.API_VERSION, region: configuration.REGION, endpoint: configuration.DDB_ENDPOINT, service: dynamodb });
      // extract params
      var ID = event.queryStringParameters.Id;
      console.log('metrics of id ' + ID);
      var params = {
        TableName: configuration.TABLE_NAME,
        ProjectionExpression: configuration.PROJECTION_ATTR,
        KeyConditionExpression: '#ID = :ID',
        ExpressionAttributeNames: {
          '#ID': configuration.ID
        },
        ExpressionAttributeValues: {
          ':ID': ID
        }
      };
      queryDynamoDB(params, docClient).then((response) => {
        console.log('Params: ' + JSON.stringify(params));
        // if the query is Successful
        if (typeof(response[0]) !== 'undefined') {
          response[0]['Steps'] = process.env.STEPS;
          response[0]['PageName'] = process.env.STEPS_NAME;
        }
        console.log('The response you get', response);
        var success = {
          statusCode: HTTP_RESPONSE_CONSTANTS.SUCCESS.statusCode,
          body: JSON.stringify(response),
          headers: {
            'Content-Type': 'application/json'
          },
          isBase64Encoded: false
        };
        return callback(null, success);
      }, (err) => {
        // return internal server error
        return callback(null, HTTP_RESPONSE_CONSTANTS.BAD_REQUEST);
      });
    }
  });
}
This is the Lambda function I need to test; some environment variables are also used here.
I tried writing a unit test for the above function using aws-sdk-mock but still cannot figure out how to actually do it. Any help will be appreciated. Below is my test code:
describe('test getMetrics', function() {
  var expectedOnInvalid = HTTP_RESPONSE_CONSTANTS.BAD_REQUEST;

  it('should assume role ', function(done) {
    var event = {
      queryStringParameters: {
        Id: '123456'
      }
    };
    AWS.mock('STS', 'assumeRole', 'roleAssumed');
    AWS.restore('STS');
    AWS.mock('Credentials', 'credentials');
    AWS.restore('Credentials');
    AWS.mock('DynamoDB.DocumentClient', 'get', 'message');
    AWS.mock('DynamoDB', 'describeTable', 'message');
    AWS.restore('DynamoDB');
    AWS.restore('DynamoDB.DocumentClient');

    someData(event, configuration, (err, response) => {
      expect(response).to.deep.equal(expectedOnInvalid);
      done();
    });
  });
});
I am getting the following error :
{ MultipleValidationErrors: There were 2 validation errors:
* MissingRequiredParameter: Missing required key 'RoleArn' in params
* MissingRequiredParameter: Missing required key 'RoleSessionName' in params
Try setting the aws-sdk module explicitly.
Project structures that don't include the aws-sdk at the top level node_modules project folder will not be properly mocked. An example of this would be installing the aws-sdk in a nested project directory. You can get around this by explicitly setting the path to a nested aws-sdk module using setSDK().
const AWSMock = require('aws-sdk-mock');
const AWS = require('aws-sdk');
AWSMock.setSDKInstance(AWS);
For more details, read the aws-sdk-mock documentation; it explains this even better.
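As a rough illustration (not from the answer above), mocking the two services your function actually calls could look something like the sketch below. The fake credentials and items are placeholders, and it assumes queryDynamoDB() ultimately calls DocumentClient.query:

const AWSMock = require('aws-sdk-mock');
const AWS = require('aws-sdk');

AWSMock.setSDKInstance(AWS);

describe('someData', function() {
  afterEach(() => AWSMock.restore());

  it('returns a 200 response when the query succeeds', function(done) {
    // Mock STS.assumeRole with clearly fake credentials.
    AWSMock.mock('STS', 'assumeRole', (params, callback) => {
      callback(null, {
        Credentials: {
          AccessKeyId: 'fake-key',
          SecretAccessKey: 'fake-secret',
          SessionToken: 'fake-token'
        }
      });
    });

    // Assumption: queryDynamoDB() calls DocumentClient.query under the hood.
    AWSMock.mock('DynamoDB.DocumentClient', 'query', (params, callback) => {
      callback(null, { Items: [{ Id: '123456' }] });
    });

    var event = { queryStringParameters: { Id: '123456' } };

    someData(event, configuration, (err, response) => {
      expect(response.statusCode).to.equal(200);
      done();
    });
  });
});

The important detail is that the mocks are registered before someData() constructs its STS and DocumentClient objects, which is what allows aws-sdk-mock to intercept the calls.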
I strongly disagree with @ttulka's answer, so I have decided to add my own as well.
Given you received an event in your Lambda function, it's very likely you'll process the event and then invoke some other service. It could be a call to S3, DynamoDB, SQS, SNS, Kinesis...you name it. What is there to be asserted at this point?
Correct arguments!
Consider the following event:
{
"data": "some-data",
"user": "some-user",
"additionalInfo": "additionalInfo"
}
Now imagine you want to invoke documentClient.put and you want to make sure that the arguments you're passing are correct. Let's also say that you DON'T want the additionalInfo attribute to be persisted, so, somewhere in your code, you'd have this line to get rid of that attribute:
delete event.additionalInfo
right?
You can now create a unit test to assert that the correct arguments were passed into documentClient.put, meaning the final object should look like this:
{
"data": "some-data",
"user": "some-user"
}
Your test must assert that documentClient.put was invoked with a JSON which deep equals the JSON above.
If you or any other developer now, for some reason, removes the delete event.additionalInfo line, tests will start failing.
And this is very powerful! If you make sure that your code works the way you expect, you basically don't have to worry about creating integration tests at all.
Now, if a SQS consumer Lambda expects the body of the message to contain some field, the producer Lambda should always take care of it to make sure the right arguments are being persisted in the Queue. I think by now you get the idea, right?
I always tell my colleagues that if we can create proper unit tests, we should be good to go in 95% of cases, leaving integration tests out. Of course it's better to have both, but given the amount of time spent creating integration tests (setting up environments, credentials, sometimes even different accounts), it's often not worth it. But that's just MY opinion. Both you and @ttulka are more than welcome to disagree.
Now, back to your question:
You can use Sinon to mock and assert arguments in your Lambda functions. If you need to mock a 3rd-party service (like DynamoDB, SQS, etc.), you can create a mock object and replace it in your file under test using Rewire. This is usually the road I take and it has been great so far.
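A rough sketch of that approach for the documentClient.put example above; the module path ./save-user, its exported save() function, the table name, and the docClient variable name inside it are assumptions made for illustration:

const sinon = require('sinon');
const rewire = require('rewire');
const { expect } = require('chai');

// Hypothetical module under test that calls docClient.put(...).promise() internally.
const handler = rewire('./save-user');

it('persists the event without additionalInfo', async () => {
  // Fake DocumentClient with a stubbed put() mimicking the aws-sdk promise style.
  const putStub = sinon.stub().returns({ promise: () => Promise.resolve({}) });
  handler.__set__('docClient', { put: putStub });

  await handler.save({
    data: 'some-data',
    user: 'some-user',
    additionalInfo: 'additionalInfo'
  });

  // Assert the exact arguments: additionalInfo must have been stripped out.
  expect(putStub.firstCall.args[0]).to.deep.equal({
    TableName: 'users', // assumed table name
    Item: { data: 'some-data', user: 'some-user' }
  });
});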
I see unit testing as a way to check if your domain (business) rules are met.
As long as your Lambda contains exclusively integration of AWS services, it doesn't make much sense to write a unit test for it.
Mocking all the resources means your test will only be testing communication among those mocks; such a test has no value.
External resources mean input/output, this is what integration testing focuses on.
Write integration tests and run them as a part of your integration pipeline against real deployed resources.
This is how we can mock STS in Node.js.
import { STS } from 'aws-sdk';

export default class GetCredential {
  constructor(public sts: STS) { }

  public async getCredentials(role: string) {
    // Assumes a logger is available on the class; replace with console if needed.
    this.log.info('Retrieving credential...', { role });
    const apiRole = await this.sts
      .assumeRole({
        RoleArn: role,
        RoleSessionName: 'test-api',
      })
      .promise();

    if (!apiRole?.Credentials) {
      throw new Error(`Credentials for ${role} could not be retrieved`);
    }

    return apiRole.Credentials;
  }
}
Mock for the above function:
import { STS } from 'aws-sdk';
import GetCredential from './GetCredential';

const sts = new STS();
let testService: GetCredential;

beforeEach(() => {
  testService = new GetCredential(sts);
});

describe('Given getCredentials has been called', () => {
  it('The method returns a credential', async () => {
    const credential = {
      AccessKeyId: 'AccessKeyId',
      SecretAccessKey: 'SecretAccessKey',
      SessionToken: 'SessionToken'
    };

    const mockGetCredentials = jest.fn().mockReturnValue({
      promise: () => Promise.resolve({ Credentials: credential }),
    });
    testService.sts.assumeRole = mockGetCredentials;

    const result = await testService.getCredentials('fakeRole');

    expect(result).toEqual(credential);
  });
});
I have an AWS Lambda that uses the Sequelize ORM to talk to AWS Aurora. It works fine the first time it's accessed, but after some unknown number of minutes the Lambda errors out with a Sequelize error saying access denied for user@ip.address.
async function connect() {
  const signer = new AWS.RDS.Signer({
    'region': region,
    'username': dbUsername,
    'hostname': dbEndpoint,
    'port': dbPort
  });

  let token;
  await signer.getAuthToken((error, result) => {
    if (error) {
      throw error;
    }
    token = result;
  });
  return token;
};

const sequelizeOptions = {
  'host': dbEndpoint,
  'port': dbPort,
  'ssl': true,
  'dialect': 'mysql',
  'dialectOptions': {
    'ssl': 'Amazon RDS',
    'authSwitchHandler': (data, callback) => {
      if (data.pluginName === 'mysql_clear_password') {
        const password = token + '\0';
        const buffer = Buffer.from(password);
        callback(null, buffer);
      }
    }
  },
  pool: {
    max: 5,
    min: 0,
    acquire: 30000,
    idle: 10000
  }
};

let token;

exports.create = async () => {
  token = await connect();
  return new Sequelize(dbName, dbUsername, token, sequelizeOptions);
}

exports.buildResponse = resultsArray => {
  return {
    "statusCode": 200,
    headers: {
      "Access-Control-Allow-Origin": "*",
      "Access-Control-Allow-Credentials": true
    },
    "body": JSON.stringify(resultsArray),
    "isBase64Encoded": false
  };
};
reference: article
Posting as a more explicit answer than my previous comment.
Short answer
As you are reusing a token and db connection created outside of the lambda handler, one or both of those things is timing out.
Longer answer
Lambdas run in containers; those containers are re-used until they are killed due to inactivity or a code change, but once a container is running, only the code inside the handler function runs on subsequent invocations.
This means that code outside of a handler function only runs when a new container is started (because there is no running container, or because a concurrent invocation is received).
If code outside of the handler creates something that is time-limited, like a DB connection or a token with a limited lifetime, and the Lambda is invoked often enough that the container is never killed, that time will simply run out.
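A minimal sketch of one way around this, reusing the create() helper from the question: track when the token was issued and rebuild the connection inside the handler once it approaches the 15-minute lifetime of RDS IAM auth tokens (the ./db module path and variable names are illustrative):

const db = require('./db'); // the module from the question exposing create() and buildResponse()

let sequelize;
let tokenIssuedAt = 0;

// RDS IAM auth tokens are valid for 15 minutes; refresh a little earlier to be safe.
const TOKEN_TTL_MS = 14 * 60 * 1000;

const getConnection = async () => {
  if (!sequelize || Date.now() - tokenIssuedAt > TOKEN_TTL_MS) {
    if (sequelize) {
      await sequelize.close(); // drop the connection built with the expired token
    }
    sequelize = await db.create(); // signs a fresh token and reconnects
    tokenIssuedAt = Date.now();
  }
  return sequelize;
};

exports.handler = async (event) => {
  const connection = await getConnection();
  const results = await connection.query('SELECT 1 + 1 AS two');
  return db.buildResponse(results);
};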
The Issue:
I have a node.js (8.10) AWS Lambda function that takes a json object and publishes it to an IOT topic. The function successfully publishes to the topic, however, once fired it is continuously called until I throttle the concurrency to zero to halt any further calling of the function.
I'm trying to figure out what I've implemented incorrectly that causes more than one instance of the function to be called.
The Function:
Here is my function:
var AWS = require('aws-sdk');

exports.handler = function (event, context) {
  var iotdata = new AWS.IotData({ endpoint: 'xxxxxxxxxx.iot.us-east-1.amazonaws.com' });
  var params = {
    topic: '/PiDevTest/SyncDevice',
    payload: JSON.stringify(event),
    qos: 0
  };
  iotdata.publish(params, function(err, data) {
    if (err) {
      console.log(err, err.stack);
    } else {
      console.log("Message sent.");
      context.succeed();
    }
  });
};
My test json is:
{
"success": 1,
"TccvID": "TestID01"
}
The test console has a response of "null", but the IOT topic shows the data from the test json, published to the topic about once per second.
What I've Tried
- I've attempted to define the handler as its own named function called handler, and then set exports.handler = handler;. This didn't produce any errors, but didn't successfully post to the IoT topic either.
- I thought maybe the issue was with the Node.js callback. I've tried implementing it and leaving it out (current iteration above), but neither way seemed to make a difference. I had read somewhere that the function will retry if it errors, but I believe that only happens three times, so it wouldn't explain the indefinite calling of the function.
- I've also tried calling the function from another Lambda to make sure that the issue wasn't the AWS test tool. This produced the same behavior, though.
Summary:
What am I doing incorrectly that causes this function to publish the json data indefinitely to the iot topic?
Thanks in advance for your time and expertise.
Use aws-iot-device-sdk to create an MQTT client and use its message handler and publish method to publish your messages to the IoT topic. Sample MQTT client code is below; a short usage sketch follows after the class.
import * as DeviceSdk from 'aws-iot-device-sdk';
import * as AWS from 'aws-sdk';

let instance: any = null;

export default class IoTClient {
  client: any;

  /**
   * Constructor
   *
   * @param {boolean} createNewClient - Whether or not to use existing client instance
   */
  constructor(createNewClient = false, options = {}) {
  }

  async init(createNewClient, options) {
    if (createNewClient && instance) {
      instance.disconnect();
      instance = null;
    }
    if (instance) {
      return instance;
    }

    instance = this;
    this.initClient(options);
    this.attachDebugHandlers();
  }

  /**
   * Instantiate AWS IoT device object
   * Note that the credentials must be initialized with empty strings;
   * When we successfully authenticate to the Cognito Identity Pool,
   * the credentials will be dynamically updated.
   *
   * @param {Object} options - Options to pass to DeviceSdk
   */
  initClient(options) {
    const clientId = getUniqueId();

    this.client = DeviceSdk.device({
      region: options.region || getConfig('iotRegion'),

      // AWS IoT Host endpoint
      host: options.host || getConfig('iotHost'),

      // clientId created earlier
      clientId: options.clientId || clientId,

      // Connect via secure WebSocket
      protocol: options.protocol || getConfig('iotProtocol'),

      // Set the maximum reconnect time to 500ms; this is a browser application
      // so we don't want to leave the user waiting too long for reconnection after
      // re-connecting to the network/re-opening their laptop/etc...
      baseReconnectTimeMs: options.baseReconnectTimeMs || 500,
      maximumReconnectTimeMs: options.maximumReconnectTimeMs || 1000,

      // Enable console debugging information
      debug: (typeof options.debug === 'undefined') ? true : options.debug,

      // AWS access key ID, secret key and session token must be
      // initialized with empty strings
      accessKeyId: options.accessKeyId,
      secretKey: options.secretKey,
      sessionToken: options.sessionToken,

      // Let redux handle subscriptions
      autoResubscribe: (typeof options.debug === 'undefined') ? false : options.autoResubscribe,
    });
  }

  disconnect() {
    this.client.end();
  }

  attachDebugHandlers() {
    this.client.on('reconnect', () => {
      logger.info('reconnect');
    });

    this.client.on('offline', () => {
      logger.info('offline');
    });

    this.client.on('error', (err) => {
      logger.info('iot client error', err);
    });

    this.client.on('message', (topic, message) => {
      logger.info('new message', topic, JSON.parse(message.toString()));
    });
  }

  updateWebSocketCredentials(accessKeyId, secretAccessKey, sessionToken) {
    this.client.updateWebSocketCredentials(accessKeyId, secretAccessKey, sessionToken);
  }

  attachMessageHandler(onNewMessageHandler) {
    this.client.on('message', onNewMessageHandler);
  }

  attachConnectHandler(onConnectHandler) {
    this.client.on('connect', (connack) => {
      logger.info('connected', connack);
      onConnectHandler(connack);
    });
  }

  attachCloseHandler(onCloseHandler) {
    this.client.on('close', (err) => {
      logger.info('close', err);
      onCloseHandler(err);
    });
  }

  publish(topic, message) {
    this.client.publish(topic, message);
  }

  subscribe(topic) {
    this.client.subscribe(topic);
  }

  unsubscribe(topic) {
    this.client.unsubscribe(topic);
    logger.info('unsubscribed from topic', topic);
  }
}
Note: getConfig() reads environment variables from a YAML file, or you can specify the values directly here.
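A rough usage sketch for the Lambda from the question; the endpoint, region, and empty credential strings are placeholders, and it assumes init() is called before publishing since the constructor above is intentionally empty:

import IoTClient from './IoTClient';

const iotClient = new IoTClient();

export const handler = async (event) => {
  // Assumption: init() wires up the underlying device connection (see the class above).
  await iotClient.init(false, {
    host: 'xxxxxxxxxx.iot.us-east-1.amazonaws.com', // placeholder IoT endpoint
    region: 'us-east-1',
    accessKeyId: '',   // per the class comment, credentials start as empty strings
    secretKey: '',
    sessionToken: '',
  });

  // Publish the incoming event once; nothing here re-invokes the Lambda.
  iotClient.publish('/PiDevTest/SyncDevice', JSON.stringify(event));

  return { statusCode: 200 };
};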
While he only posted it as a comment, MarkB pointed me in the right direction.
The problem was that another Lambda was listening to the same topic and invoking the Lambda I was working on. This resulted in circular logic, as the exit condition was never met. Fixing that code solved the issue.