When writing a Lambda function that uses the aws-sdk Node.js package for a Shopify webhook, I noticed that the require statement for the package takes over 3 seconds to load. This is a problem because Shopify requires a response from its webhooks within 5 seconds. I extracted the code out of my function to test it on its own and got the same results:
exports.handler = (event, context, callback) => {
    const start = new Date().getTime();
    console.log('start: ' + new Date());
    require('aws-sdk');
    console.log('end:' + new Date());
    const end = new Date().getTime();
    console.log('length: ' + (end - start) + 'ms');
    callback();
};
Here was the output:
START RequestId: Version: $LATEST
2017-11-30T13:23:57.506Z start: Thu Nov 30 2017 13:23:57 GMT+0000 (UTC)
END RequestId:
REPORT RequestId: Duration: 3001.29 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 31 MB
2017-11-30T13:24:00.499Z Task timed out after 3.00 seconds
The issue appears to be that I was not giving the Lambda function enough memory to load in the package. I increased the memory with the following results:
512MB brought it down to 1000ms
1GB brought it down to 500ms
Related
I have an AWS Lambda function that does a very expensive job, sometimes taking 14 minutes to finish (the function has a 15-minute timeout).
The call to this Lambda is made using the lambda.invoke method from the aws-sdk with the following settings:
const lambda = new AWS.Lambda({
    maxRetries: 0,
    httpOptions: {
        timeout: (15 * 60 * 1000) + (20 * 1000) // time to wait for a response (Lambda timeout + 20s)
    }
});
In my Node.js logs I'm able to see:
Feb 07 14:29:43.825 invoking lambda
Feb 07 14:45:04.396
Closing job 151218 due to some error: "TimeoutError: Connection timed out after 920000ms"
While in my AWS Lambda logs I see:
Feb 07 14:29:44.329 INIT_START Runtime Version: java:8.v14
...
Feb 07 14:40:22.293 REPORT RequestId: xxx Duration: 635736.88 ms
As you can see, the invoke call and the Lambda start line up, but the Lambda ends at 14:40 while the Node.js invoke method stays stuck waiting for the response and times out at 14:45.
I've read the AWS documentation regarding timeouts (https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/) and everything seems correct, so I do not know why my invoke method does not return once the Lambda finishes successfully.
Any ideas?
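One pattern worth considering for calls this long (an assumption about the architecture, not something the question confirms): invoke the function asynchronously with InvocationType 'Event', so the client gets an HTTP 202 immediately instead of holding a socket open for up to 15 minutes, and deliver the result through another channel (SNS, SQS, a callback). The function name and payload below are made up for illustration:

```javascript
// Hypothetical fire-and-forget invocation parameters. With InvocationType
// 'Event', lambda.invoke() returns as soon as the service accepts the request,
// so no connection has to survive the full 15-minute run.
const params = {
    FunctionName: 'my-long-job',             // hypothetical function name
    InvocationType: 'Event',                 // async: respond with 202 immediately
    Payload: JSON.stringify({ jobId: 151218 }),
};
// lambda.invoke(params).promise() would then resolve almost immediately,
// and the worker Lambda reports its result out-of-band when it finishes.
```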
I have a Lambda function which spins up a worker thread.
hello-world.js:
const { Worker } = require('node:worker_threads')

export const hello = async (event, context) => {
    console.log('Hello World')
    const w = new Worker('./build/other-file.js')
    return new Promise(resolve => {
        w.on('exit', () => {
            console.log('Goodbye')
            resolve()
        })
    })
}
other-file.js:
console.log('Blah Blah')
Defined in serverless.yml as:
functions:
  helloWorld:
    handler: build/hello-world.hello
    maximumRetryAttempts: 0
If I invoke this locally with serverless, I get what I'd expect:
15:40:59 $ sls invoke local --function helloWorld
Using serverless-localstack
Hello World
Blah Blah
Goodbye
If I invoke it on AWS, each message from the function itself is also prefixed with a timestamp and an execution ID, which is very nice. However, it doesn't do this for messages from the worker thread.
15:44:00 $ aws lambda invoke --function-name sheetbuilder-dev-helloWorld out --log-type Tail --query 'LogResult' --output text | base64 -d
START RequestId: ebf33295-c59e-4182-a5b0-d436210f5e6f Version: $LATEST
2022-09-05T14:51:48.833Z ebf33295-c59e-4182-a5b0-d436210f5e6f INFO Hello World
Blah Blah
2022-09-05T14:51:48.973Z ebf33295-c59e-4182-a5b0-d436210f5e6f INFO Goodbye
END RequestId: ebf33295-c59e-4182-a5b0-d436210f5e6f
REPORT RequestId: ebf33295-c59e-4182-a5b0-d436210f5e6f Duration: 154.59 ms Billed Duration: 155 ms Memory Size: 1024 MB Max Memory Used: 73 MB Init Duration: 192.28 ms
15:51:49 $
This also matches what I see in the CloudWatch logs.
Hence my questions are:
Why's there a difference between the default output and the worker thread output?
Is it possible / easy to make the worker thread output match the main output? (or at the very least include timestamp/execution ID)
Is it possible to also make the serverless local output consistent?
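For the second question, one workable approach (a sketch, not the runtime's own mechanism): Lambda's Node.js runtime patches console.* in the main thread only, so a worker can reproduce the prefix itself if you hand it the request ID, e.g. via workerData. The exact tab-separated layout below is an assumption modelled on the observed output:

```javascript
// Sketch for other-file.js: wrap console.log so worker output carries a
// timestamp/request-ID prefix like the main thread's. In real use, requestId
// would be passed in via workerData; it's hard-coded here for illustration.
const requestId = 'ebf33295-c59e-4182-a5b0-d436210f5e6f';

function lambdaStylePrefix(id, message) {
    return `${new Date().toISOString()}\t${id}\tINFO\t${message}`;
}

const original = console.log;
console.log = (...args) => original(lambdaStylePrefix(requestId, args.join(' ')));

console.log('Blah Blah'); // now printed with timestamp and request ID
```

The same wrapper run unconditionally would also make the serverless local output consistent, since locally it is a no-op difference.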
I wrote a simple load-testing script that runs N hits against an HTTP endpoint over M async parallel lanes. Each lane waits for its previous request to finish before starting a new one. For my specific use case, the script randomly picks a numeric "width" parameter to add to the URL each time. The endpoint returns between 200k and 900k of image data per request depending on the width parameter, but my script does not care about this data and simply relies on garbage collection to clean it up.
const fetch = require('node-fetch');

const MIN_WIDTH = 200;
const MAX_WIDTH = 1600;
const loadTestUrl = `
http://load-testing-server.com/endpoint?width={width}
`.trim();

async function fetchAll(url) {
    const res = await fetch(url, {
        method: 'GET'
    });
    if (!res.ok) {
        throw new Error(res.statusText);
    }
}

async function doSingleRun(runs, id) {
    const runStart = Date.now();
    console.log(`(id = ${id}) - Running ${runs} times...`);
    for (let i = 0; i < runs; i++) {
        const start = Date.now();
        const width = Math.floor(Math.random() * (MAX_WIDTH - MIN_WIDTH)) + MIN_WIDTH;
        try {
            const result = await fetchAll(loadTestUrl.replace('{width}', `${width}`));
            const duration = Date.now() - start;
            console.log(`(id = ${id}) - Width ${width} Success. ${i + 1}/${runs}. Duration: ${duration}`);
        } catch (e) {
            const duration = Date.now() - start;
            console.log(`(id = ${id}) - Width ${width} Error fetching. ${i + 1}/${runs}. Duration: ${duration}`, e);
        }
    }
    console.log(`(id = ${id}) - Finished run. Duration: ` + (Date.now() - runStart));
}

(async function () {
    const RUNS = 200;
    const parallelRuns = 10;
    const promises = [];
    const parallelRunStart = Date.now();
    console.log(`Running ${parallelRuns} parallel runs`);
    for (let i = 0; i < parallelRuns; i++) {
        promises.push(doSingleRun(RUNS, i));
    }
    await Promise.all(promises);
    console.log(`Finished parallel runs. Duration ${Date.now() - parallelRunStart}`);
})();
When I run this with Node 14.17.3 on my MacBook Pro running macOS 10.15.7 (Catalina), even with a modest parallel lane count of 3, after about 120 (x 3) hits of the endpoint the following happens in succession:
1. Console output ceases in the terminal, indicating the script has halted.
2. Other applications such as my browser become unable to make network connections.
3. Within 1-2 minutes, other applications on my machine begin to slow down and eventually freeze up.
4. My entire system crashes with a kernel panic and the machine reboots.
panic(cpu 2 caller 0xffffff7f91ba1ad5): userspace watchdog timeout: remoted connection watchdog expired, no updates from remoted monitoring thread in 60 seconds, 30 checkins from thread since monitoring enabled 640 seconds ago after loadservice: com.apple.logd, total successful checkins since load (642 seconds ago): 64, last successful checkin: 10 seconds ago
service: com.apple.WindowServer, total successful checkins since load (610 seconds ago): 60, last successful checkin: 10 seconds ago
I can very easily stop the progression of these symptoms by pressing Ctrl+C in the script's terminal and force quitting it. Everything quickly returns to normal, and I can repeat the experiment multiple times as long as I stop it before it crashes the machine.
I monitored Activity Monitor during the progression: there is very little (~1%) CPU usage and memory usage reaches maybe 60-70 MB, though it is pretty evident that network activity peaks during the script's run.
In my search for others with this problem there were only two Stack Overflow articles that came close:
node.js hangs other programs on my mac
Node script causes system freeze when uploading a lot of files
Anyone have any idea why this would happen? It seems very dangerous that a single app/script could so easily bring down a machine without being killed first by the OS.
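One thing worth checking (an assumption about the cause, but consistent with the symptoms): fetchAll never reads the response body. node-fetch does not free the underlying socket until the body is consumed, so every hit can leave a connection pinned open, and hundreds of half-finished sockets is exactly the kind of resource exhaustion that can starve the rest of the machine. A sketch of the drain, written here against the global fetch available in Node 18+ so it is self-contained; with node-fetch v2 you would call res.buffer() in place of res.arrayBuffer():

```javascript
// Sketch: consume the response body even though we discard it, so the HTTP
// agent can reuse or close the socket instead of leaving it pinned open.
async function fetchAll(url) {
    const res = await fetch(url, { method: 'GET' });
    if (!res.ok) {
        throw new Error(res.statusText);
    }
    await res.arrayBuffer(); // drain the body; node-fetch v2: res.buffer()
}
```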
I am using the Alexa Node.js SDK to implement a skill. On session start (at the LaunchRequest intent), I want to store some variables in the session attributes. As per the blog here, I am using this.attributes.key to store the session attributes.
const handlers = {
    'LaunchRequest': function () {
        database.startSession()
            .then(data => {
                // console.log(data); // data does have token
                this.attributes.token = data.token;
                // this.attributes['token'] = data.token; // Tried this too
                this.emit(':ask', responses.launch, responses.launchReprompt);
            })
            .catch(err => {
                console.error(err);
                this.emit(":ask", responses.error);
            });
    },
    // ... more handlers
};
However, on the launch command, I get this error:
There was a problem with the requested skill's response
I see no error in logs.
This is my response (as visible in the Alexa test developer console):
{
    "body": {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                "ssml": "<speak> Ok, exiting App. </speak>"
            },
            "shouldEndSession": true
        },
        "sessionAttributes": {},
        "userAgent": "ask-nodejs/1.0.25 Node/v8.10.0"
    }
}
As per here, sessionAttributes should contain what I set as session variables using this.attributes, but it is somehow empty.
How can I resolve this?
Edit: If I comment out this.attributes line, I get the welcome message correctly.
This is my startSession function, if its helpful.
async function startSession() {
    return {
        token: await getToken(),
        // ... more attributes
    };
}
Edit 2: A very weird thing I noticed: if I just do this.attributes.token="foobar", the session attribute gets set correctly. So I am assuming there is a problem with my async function. Note that console.log(data) still prints the data correctly, including the token attribute.
Edit 3: CloudWatch logs

START RequestId: Version: $LATEST
2018-08-15T14:00:47.639Z Warning: Application ID is not set
END RequestId:
REPORT RequestId: Duration: 315.05 ms Billed Duration: 400 ms Memory Size: 128 MB Max Memory Used: 73 MB

START RequestId: Version: $LATEST
2018-08-15T14:00:47.749Z Warning: Application ID is not set
2018-08-15T14:00:48.564Z { token: 'token', filter: 'foobar' }
END RequestId:
REPORT RequestId: Duration: 849.98 ms Billed Duration: 900 ms Memory Size: 128 MB Max Memory Used: 74 MB

START RequestId: Version: $LATEST
2018-08-15T14:00:49.301Z Warning: Application ID is not set
END RequestId:
REPORT RequestId: Duration: 0.72 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 74 MB
We found that the maximum size of the response object is 24 KB (reference1, reference2, reference3).
My data size was way more than 24 KB. Hence the session attributes did not get stored, and it resulted in an exit intent. The solution is to store the data in a database such as DynamoDB.
Special credits to Will.
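Given the 24 KB limit above, a small guard can make this failure loud instead of silent. The helper below is just an illustration, not part of the Alexa SDK:

```javascript
// Sketch: check whether a payload fits in Alexa session attributes before
// storing it. 24 KB is the response-size ceiling discussed above; anything
// bigger belongs in external storage such as DynamoDB, with only a key kept
// in this.attributes.
function fitsInSession(attributes, limitBytes = 24 * 1024) {
    return Buffer.byteLength(JSON.stringify(attributes), 'utf8') < limitBytes;
}
```

If fitsInSession(data) is false, persist data under a key in DynamoDB and store only that key in the session attributes.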
We have created a Node.js-based Lambda function named execInspector which is triggered once every day. This function is based on the AWS Lambda blueprint "inspector-scheduled-run" in Node.js.
The problem we see is that the scheduled job fails randomly on one day or another. We get only the below logs from the CloudWatch log stream.
In a week, it randomly runs about 4 or 5 times and fails on the remaining days. Based on the log, it consumes only a small amount of memory/time for its execution, but we are not sure why it fails randomly. It also retries itself 3 times before getting killed.
From the log below we can also see that the job takes only about 35 MB and 60 seconds on average. We tried changing the Node.js runtime and increasing memory and timeouts well beyond these limits, but nothing worked.
Can you please help with some alternate approaches to handle these failures automatically, and does anyone have insight into why this is happening?
Additional Inputs:
We have already set the maximum timeout of 5 minutes as well, but it still fails with "timed out after 300 secs.".
What I mean here is that the task of just triggering the Inspector run takes less than 30 seconds on average. Since it's a PaaS-based solution, I cannot expect it to always complete within 30 seconds, but 60 seconds should be more than enough for a job it normally finishes within 30.
Sample CloudWatch log (successful run):
18:01:00
START RequestId: 12eb468a-4174-11e7-be7b-6d0faaa584aa Version: $LATEST
18:01:03
2017-05-25T18:01:02.935Z 12eb468a-4174-11e7-be7b-6d0faaa584aa { assessmentRunArn: 'arn:aws:inspector:us-east-1:102461617910:target/0-Ly60lmEP/template/0-POpZxSLA/run/0-MMx30fLl' }
18:01:03
END RequestId: 12eb468a-4174-11e7-be7b-6d0faaa584aa
18:01:03
REPORT RequestId: 12eb468a-4174-11e7-be7b-6d0faaa584aa Duration: 2346.37 ms Billed Duration: 2400 ms Memory Size: 128 MB Max Memory Used: 33 MB
CloudWatch log (failed run):
The log below is repeated 3 times, which appears to be the retry attempts:
06:32:52
START RequestId: 80190395-404a-11e7-845d-1f88a00ed4f3 Version: $LATEST
06:32:56
2017-05-24T06:32:55.942Z 80190395-404a-11e7-845d-1f88a00ed4f3 Execution Started...
06:33:52
END RequestId: 80190395-404a-11e7-845d-1f88a00ed4f3
06:33:52
REPORT RequestId: 80190395-404a-11e7-845d-1f88a00ed4f3 Duration: 60000.88 ms Billed Duration: 60000 ms Memory Size: 128 MB Max Memory Used: 32 MB
06:33:52
2017-05-24T06:33:52.437Z 80190395-404a-11e7-845d-1f88a00ed4f3 Task timed out after 60.00 seconds
Lambda code:
'use strict';

/**
 * A blueprint to schedule a recurring assessment run for an Amazon Inspector assessment template.
 *
 * This blueprint assumes that you've already done the following:
 * 1. onboarded with the Amazon Inspector service https://aws.amazon.com/inspector
 * 2. created an assessment target - what hosts you want to assess
 * 3. created an assessment template - how you want to assess your target
 *
 * Then, all you need to do to use this blueprint is to define an environment variable in the Lambda console called
 * `assessmentTemplateArn` and provide the template arn you want to run on a schedule.
 */

const AWS = require('aws-sdk');

const inspector = new AWS.Inspector();

const params = {
    assessmentTemplateArn: process.env.assessmentTemplateArn,
};

exports.handler = (event, context, callback) => {
    try {
        // Inspector.StartAssessmentRun response will look something like:
        // {"assessmentRunArn":"arn:aws:inspector:us-west-2:123456789012:target/0-wJ0KWygn/template/0-jRPJqnQh/run/0-Ga1lDjhP"}
        inspector.startAssessmentRun(params, (error, data) => {
            if (error) {
                console.log(error, error.stack);
                return callback(error);
            }
            console.log(data);
            return callback(null, data);
        });
    } catch (error) {
        console.log('Caught Error: ', error);
        callback(error);
    }
};
The log says your request is timing out after 60 seconds. You can set the timeout as high as 5 minutes, according to https://aws.amazon.com/blogs/aws/aws-lambda-update-python-vpc-increased-function-duration-scheduling-and-more/. If your task takes about 60 seconds and the timeout is 60 seconds, then some runs may simply be timing out; that's what the log suggests to me. Otherwise, post some code from the function.
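If the hang is in the StartAssessmentRun call itself, putting a deadline on that call (shorter than the function timeout) turns a silent stall into a catchable error that can be logged or retried. This is a generic sketch, not something from the blueprint:

```javascript
// Sketch: race a promise against a deadline so a hung SDK call fails fast
// instead of running into the Lambda task timeout and being killed silently.
function withDeadline(promise, ms) {
    let timer;
    const deadline = new Promise((resolve, reject) => {
        timer = setTimeout(() => reject(new Error(`deadline of ${ms}ms exceeded`)), ms);
    });
    return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage idea (aws-sdk v2 supports .promise() on request objects):
// await withDeadline(inspector.startAssessmentRun(params).promise(), 30000);
```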