I want to determine how large an array will be in memory when my function executes. Determining the size of the array is easy, but I am not seeing a correlation between the size of my array and the Max Memory Used that gets recorded at the end of a Lambda execution.
There is no apparent correlation when inspecting process.memoryUsage() before and after setting the array, nor in the Max Memory Used reported by Lambda. I can't find a good resource that explains how/what Lambda actually uses to determine the memory used. Any help would be appreciated.
This question made me curious myself, so I decided to run some tests to see how memory allocation works inside an AWS Lambda container.
Test 1: Create array with 100,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
  const arr = [];
  for (let i = 0; i < 100000; i++) {
    arr.push(i);
  }
  console.log(process.memoryUsage());
  return 'done';
};
Result: 56 MB
2019-04-30T01:00:59.577Z cd473d5b-986c-436e-8b36-b114410c84cf
{ rss: 35299328, heapTotal: 11853824, heapUsed: 7590320, external: 8224 }
REPORT RequestId: 2a7548f9-5d2f-4060-8f9e-deb228730d8c Duration: 155.74 ms Billed Duration: 200 ms Memory Size: 128 MB Max Memory Used: 56 MB
Test 2: Create array with 1,000,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
  const arr = [];
  for (let i = 0; i < 1000000; i++) {
    arr.push(i);
  }
  console.log(process.memoryUsage());
  return 'done';
};
Result: 99 MB
2019-04-30T01:03:44.582Z 547a9de8-35f7-48e2-a53f-ab669b188f9a
{ rss: 80093184, heapTotal: 55263232, heapUsed: 52951088, external: 8224 }
REPORT RequestId: 547a9de8-35f7-48e2-a53f-ab669b188f9a Duration: 801.68 ms Billed Duration: 900 ms Memory Size: 128 MB Max Memory Used: 99 MB
Test 3: Create array with 10,000,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
  const arr = [];
  for (let i = 0; i < 10000000; i++) {
    arr.push(i);
  }
  console.log(process.memoryUsage());
  return 'done';
};
Result: 128 MB
REPORT RequestId: f1df4f39-e0fc-4b44-8f90-c3c0e3d9c12d Duration: 3001.33 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 128 MB
2019-04-30T00:54:32.970Z f1df4f39-e0fc-4b44-8f90-c3c0e3d9c12d Task timed out after 3.00 seconds
I think we can say fairly confidently that the memory used by the Lambda container does go up with the size of an array in memory; in the third test we maxed out the memory and timed out. My assumption is that the process controlling the Lambda execution also monitors how much memory that execution acquires, likely via cat /proc/meminfo as trognanders suggests.
Okay, so I used the following code and increased the number of array elements to find a correlation. Three tests were run at each maximum array size. The Lambda was set to 1024 MB. Each array element is a 10-character string (10 bytes).
const util = require('util');
const exec = util.promisify(require('child_process').exec);

async function GetContainerUsage() {
  const { stdout, stderr } = await exec('cat /proc/meminfo');
  // Index 19 is the "Active" value, which seems to be what Lambda tracks
  let memInfoSplits = stdout.split(/[\n: ]/).filter(val => val.trim());
  return Math.round(memInfoSplits[19] / 1024); // kB -> MB
}
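Indexing the split output by position is brittle, since /proc/meminfo fields can vary between kernel versions; a slightly safer sketch (my variant, not the original code) parses the file by key name instead:

function parseMemInfo(raw) {
  // Build a { Key: kB } map from lines like "Active:  54320 kB",
  // so "Active" can be looked up by name rather than by index.
  const info = {};
  for (const line of raw.split('\n')) {
    const m = line.match(/^(\w+(?:\(\w+\))?):\s+(\d+)/);
    if (m) info[m[1]] = Number(m[2]); // value in kB
  }
  return info;
}
// Usage: Math.round(parseMemInfo(stdout).Active / 1024) // MB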
function GetMemoryUsage() {
  const used = process.memoryUsage();
  for (let key in used)
    used[key] = Math.round(used[key] / 1024 / 1024); // bytes -> MB
  return used;
}
exports.handler = async (event, context) => {
  let max = event.ArrTotal;
  let arr = [];
  for (let i = 0; i < max; i++) {
    arr.push("1234567890"); // 10 bytes
  }

  let csvLine = [];
  let jsMemUsed = GetMemoryUsage();
  let containerMemUsed = await GetContainerUsage();
  csvLine.push(event.ArrTotal);
  csvLine.push(jsMemUsed.rss);
  csvLine.push(jsMemUsed.heapTotal);
  csvLine.push(jsMemUsed.heapUsed);
  csvLine.push(jsMemUsed.external);
  csvLine.push(containerMemUsed);
  console.log(csvLine.join(','));
  return true;
};
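The handler reads the array size from the incoming event, so each test run is driven by a test event such as (values illustrative):

{ "ArrTotal": 1000000 }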
This produced the following CSV values (the code logs the first six columns; the final column is presumably the Max Memory Used figure from each invocation's REPORT line):
Array Count, JS rss, JS heapTotal, JS heapUsed, external, System Active, Lambda reported usage
1,30,7,5,0,53,54
1,31,7,5,0,53,55
1,30,8,5,0,53,55
1000,30,8,5,0,53,55
1000,30,8,5,0,53,55
1000,30,8,5,0,53,55
10000,30,8,5,0,53,55
10000,31,8,6,0,54,56
10000,33,7,5,0,54,57
100000,32,12,7,0,56,57
100000,34,11,8,0,57,59
100000,36,12,10,0,59,61
1000000,64,42,39,0,88,89
1000000,60,36,34,0,84,89
1000000,60,36,34,0,84,89
10000000,271,248,244,0,294,297
10000000,271,248,244,0,295,297
10000000,271,250,244,0,295,297
Graphed, these values show a clear linear relationship.
So at 10 million elements the array itself should be 10,000,000 * 10 bytes = 100 MB. There must be some overhead I am missing somewhere, as about 200 MB is used elsewhere. But at least there is a clear linear correlation, which I can now work with.
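A rough estimate of that overhead from the data above (my arithmetic, not the original answer's): rss goes from about 30 MB at 1 element to about 271 MB at 10 million elements, i.e. roughly (271 - 30) MB / 10,000,000 ≈ 25 bytes per element, well above the 10 raw string bytes. That is plausibly V8's bookkeeping: each array slot holds an 8-byte tagged pointer, and repeatedly growing the array leaves over-allocated, not-yet-collected backing stores behind.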
Capacity specification for FaaS vs PaaS
The whole idea of computing with Lambda functions (FaaS) is minimal capacity planning. That said, since the cloud provider cannot default every choice, memory settings and timeouts are the settings AWS exposes to configure a function. Notably, if you test it out you will see that the memory setting determines not just the memory but also the CPU compute capacity, as quoted by AWS:
Lambda allocates CPU power linearly in proportion to the amount of memory configured. At 1,792 MB, a function has the equivalent of 1 full vCPU (one vCPU-second of credits per second)
Ref https://docs.aws.amazon.com/lambda/latest/dg/resource-model.html
Hence, it is not enough to consider only the runtime memory footprint; the CPU speed with which the function executes and finishes also matters.
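A minimal, hypothetical handler to observe this yourself: the same CPU-bound loop finishes faster as the memory setting (and therefore the CPU share) is raised.

exports.handler = async () => {
  const start = Date.now();
  let sum = 0;
  for (let i = 0; i < 1e8; i++) sum += i; // pure CPU work, no allocation
  // The reported duration should drop as the memory setting increases
  return { durationMs: Date.now() - start, sum };
};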
AWS does not call out what capacity, CPU, memory, server type, or IOPS they use in these containers, nor do they expose that usage in any CloudWatch metrics the way an EC2 instance does. Hence, we need to choose the memory setting based on testing.
Each Lambda (Node.js) has its own memory footprint and dedicated set of node module dependencies, so each one needs to be load- and performance-tested to tune its memory and timeout settings; this cannot be planned upfront.
General research observation
With any standard Node.js-based Lambda function that has logging and just does hello world, deployed without a VPC:
128 MB may show an execution time of, say, 150+ ms and a billing of 200 ms for 128 MB
256 MB may show an execution time of, say, 80+ ms and a billing of 100 ms for 256 MB
A lower memory setting does not necessarily mean lower cost, so fine-tuning based on load and performance tests is the best way to determine the right memory setting, as the worked example below shows.
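To make that concrete: Lambda bills compute in GB-seconds (memory setting times billed duration, ignoring the fixed per-request charge), so the two configurations above cost the same while one responds twice as fast (my arithmetic, using the figures above):

128 MB for 200 ms -> (128 / 1024) GB * 0.200 s = 0.025 GB-s
256 MB for 100 ms -> (256 / 1024) GB * 0.100 s = 0.025 GB-s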
Attributes like timeout are purely based on how long the function takes to complete its activity, which can be much higher for batch job operations (say 10 m) than for a web service expecting a quick response (say 10 s). Timing out early instead of waiting on long-pending dependencies is important to avoid high billing on high-throughput APIs. In the API case, a slow timeout can cause additional containers (function instances) to spin up to handle new requests, which can also affect the number of IPs allocated within the subnet hosting the function (if the function runs inside a VPC).
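Both knobs live on the function configuration; for example, with the AWS CLI (function name illustrative):

aws lambda update-function-configuration \
    --function-name my-function \
    --memory-size 256 \
    --timeout 10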
Lambda limits on ENIs and IPs, and the maximum Lambda concurrency within an account/region, are also important factors to consider while planning capacity.
Ref https://docs.aws.amazon.com/lambda/latest/dg/limits.html
I am using the Alexa Node.js SDK to implement a skill. On session start (at the LaunchRequest intent), I want to store some variables in the session attributes. As per the blog here, I am using this.attributes.key to store the session attributes.
const handlers = {
  'LaunchRequest': function () {
    database.startSession()
      .then(data => {
        // console.log(data); // data does have token
        this.attributes.token = data.token;
        // this.attributes['token'] = data.token; // Tried this too
        this.emit(':ask', responses.launch, responses.launchReprompt);
      })
      .catch(err => {
        console.error(err);
        this.emit(':ask', responses.error);
      });
  },
  // ... more handlers
};
However, on the launch command, I get this error:
There was a problem with the requested skill's response
I see no error in logs.
This is my response (as visible in the Alexa test developer console):
{
  "body": {
    "version": "1.0",
    "response": {
      "outputSpeech": {
        "type": "SSML",
        "ssml": "<speak> Ok, exiting App. </speak>"
      },
      "shouldEndSession": true
    },
    "sessionAttributes": {},
    "userAgent": "ask-nodejs/1.0.25 Node/v8.10.0"
  }
}
As per here, sessionAttributes should contain what I set as session variables using this.attributes, but it is somehow empty.
How can I resolve this?
Edit: If I comment out the this.attributes line, I get the welcome message correctly.
This is my startSession function, in case it is helpful.
async function startSession() {
  return {
    token: await getToken(),
    // ... more attributes
  };
}
Edit 2: A very weird thing I noticed: if I just do this.attributes.token = "foobar", the session attribute gets set correctly. So I am assuming there is a problem with my async function. Note that console.log(data) still prints the data correctly, with the token attribute.
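One quick way to check how much data is actually being stored is to log the serialized size of the attributes before emitting (an illustrative snippet, not part of my original handler):

// Hypothetical debugging line inside the .then() callback:
console.log('attribute bytes:', Buffer.byteLength(JSON.stringify(this.attributes)));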
Edit 3: CloudWatch logs
START RequestId: Version: $LATEST
2018-08-15T14:00:47.639Z Warning: Application ID is not set
END RequestId:
REPORT RequestId: Duration: 315.05 ms Billed Duration: 400 ms Memory Size: 128 MB Max Memory Used: 73 MB
START RequestId: Version: $LATEST
2018-08-15T14:00:47.749Z Warning: Application ID is not set
2018-08-15T14:00:48.564Z { token: 'token', filter: 'foobar' }
END RequestId:
REPORT RequestId: Duration: 849.98 ms Billed Duration: 900 ms Memory Size: 128 MB Max Memory Used: 74 MB
START RequestId: Version: $LATEST
2018-08-15T14:00:49.301Z Warning: Application ID is not set
END RequestId:
REPORT RequestId: Duration: 0.72 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 74 MB
We found that the maximum size of the response object is 24 KB (reference1, reference2, reference3).
My data was well over 24 KB, so the session attributes did not get stored, and this resulted in the exit intent. The solution is to store the data in a database such as DynamoDB.
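With the v1 alexa-sdk shown in the logs here (ask-nodejs 1.x), one way to do that is the SDK's built-in DynamoDB persistence, which saves this.attributes to a table instead of round-tripping them through the response; a minimal sketch (the table name is hypothetical):

const Alexa = require('alexa-sdk');

exports.handler = function (event, context, callback) {
  const alexa = Alexa.handler(event, context, callback);
  // With a table name set, the SDK persists this.attributes to
  // DynamoDB between requests instead of in the response payload.
  alexa.dynamoDBTableName = 'MySkillAttributes'; // hypothetical table
  alexa.registerHandlers(handlers);
  alexa.execute();
};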
Special credits to Will.
When writing a Lambda function which utilizes the aws-sdk Node.js package for a Shopify webhook, I noticed that the require statement for the package takes over 3 seconds to run. This is causing issues because Shopify requires a response from its webhooks within 5 seconds. I extracted the code out of my function to test it by itself and got the same results:
exports.handler = (event, context, callback) => {
  const start = new Date().getTime();
  console.log('start: ' + new Date());
  require('aws-sdk');
  console.log('end: ' + new Date());
  const end = new Date().getTime();
  console.log('length: ' + (end - start) + 'ms');
  callback();
};
Here was the output:
START RequestId: Version: $LATEST
2017-11-30T13:23:57.506Z start: Thu Nov 30 2017 13:23:57 GMT+0000 (UTC)
END RequestId:
REPORT RequestId: Duration: 3001.29 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 31 MB
2017-11-30T13:24:00.499Z Task timed out after 3.00 seconds
The issue appears to be that I was not giving the Lambda function enough memory; since Lambda allocates CPU power in proportion to the configured memory, the low setting also meant very little CPU for parsing the package. Increasing the memory gave the following results:
512 MB brought it down to 1000 ms
1 GB brought it down to 500 ms
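Another way to cut the load time with the v2 aws-sdk is to require only the client you actually use instead of the whole package, which parses far less JavaScript at cold start; a sketch, assuming only S3 is needed:

// Loads just the S3 client rather than every service client.
const S3 = require('aws-sdk/clients/s3');

const s3 = new S3({ region: 'us-east-1' }); // region is illustrative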
I am processing image files with GraphicsMagick/ImageMagick and Node.js in AWS Lambda. Some of the files are > 200 MB in size, which causes the Lambda function to hit its memory limit. I have set the maximum memory of 1.5 GB.
The log file displays:
REPORT RequestId: xxx Duration: 23200.51 ms Billed Duration: 23300 ms Memory Size: 1536 MB Max Memory Used: 1536 MB
Code:
async.series([
  function getOriginalSize(p_next) {
    // size
    gm(s3_img.Body).size(function (err, size) {
      if (!err) {
        width_orig = size.width;
        height_orig = size.height;
        p_next(null, 'getOriginalSize');
      }
    });
  },
  function identify(p_next) {
    gm(s3_img).flatten();
    gm(s3_img.Body).identify(function (err, id_info) {
      // THIS IS WHERE THE FOLLOWING ERROR OCCURS:
      // { [Error: Command failed: ] code: null, signal: 'SIGKILL' }
      ...
    });
  }
]);
I have not found an answer to this and would be grateful for any kind of tips or comments.
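One thing worth trying within this setup: ImageMagick's resource limits can be lowered so the underlying process spills its pixel cache to disk instead of exhausting the container's RAM and being SIGKILLed. A hedged sketch, assuming the gm module's limit() pass-through to ImageMagick's -limit option:

// Cap ImageMagick's in-memory pixel cache; past these limits it
// spills to disk (slower, but avoids the out-of-memory SIGKILL).
gm(s3_img.Body)
  .limit('memory', '256MB')
  .limit('map', '512MB')
  .identify(function (err, id_info) {
    if (err) return console.error(err);
    console.log(id_info);
  });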