Why is Node.js memory not consumed for a specific loop count? - node.js

I was trying to find a memory leak in my code. I found that when 1 < n < 257 the measurement shows 0 KB consumed, but as soon as I set n to 257 it consumes 304 KB and then increases proportionally with n.
function somefunction() {
    var n = 256;
    var x = {};
    for (var i = 0; i < n; i++) {
        x['some' + i] = { "abc": ("abc#yxy.com" + i) };
    }
}

// Memory Leak
var init = process.memoryUsage();
somefunction();
var end = process.memoryUsage();
console.log("memory consumed 2nd Call : " + ((end.rss - init.rss) / 1024) + " KB");

It's probably not a leak. You cannot always expect the GC to purge everything that soon.
See Why does nodejs have incremental memory usage?
If you want to force garbage collection, see
How to request the Garbage Collector in node.js to run?
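As a minimal sketch of that second suggestion (assuming Node is started with --expose-gc so that global.gc() exists; note that rss may still not shrink immediately even after a forced collection):
// Run with: node --expose-gc measure.js
function somefunction(n) {
    var x = {};
    for (var i = 0; i < n; i++) {
        x['some' + i] = { "abc": ("abc#yxy.com" + i) };
    }
}

global.gc();                               // collect garbage before the baseline reading
var before = process.memoryUsage().rss;
somefunction(257);
global.gc();                               // collect again so short-lived objects are purged
var after = process.memoryUsage().rss;
console.log("rss delta: " + ((after - before) / 1024) + " KB");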

Related

What metric does AWS Lambda (specifically the Node.js runtime) use to determine the Max Memory Used?

I want to determine how large an array will be in memory when my function executes. Determining the size of the array is easy, but I am not seeing a correlation between the size of my array and the Max Memory Used that gets recorded at the end of a Lambda execution.
There is no apparent correlation between process.memoryUsage() inspected before and after setting the array and the Max Memory Used reported by the Lambda. I can't find a good resource that indicates how/what Lambda actually uses to determine the memory used. Any help would be appreciated.
This question made me curious myself, so I decided to run some tests to see how memory allocation works inside an AWS Lambda container.
Test 1: Create array with 100,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
    const arr = [];
    for (let i = 0; i < 100000; i++) {
        arr.push(i);
    }
    console.log(process.memoryUsage());
    return 'done';
};
Result: 56 MB
2019-04-30T01:00:59.577Z cd473d5b-986c-436e-8b36-b114410c84cf { rss: 35299328, heapTotal: 11853824, heapUsed: 7590320, external: 8224 }
REPORT RequestId: 2a7548f9-5d2f-4060-8f9e-deb228730d8c Duration: 155.74 ms Billed Duration: 200 ms Memory Size: 128 MB Max Memory Used: 56 MB
Test 2: Create array with 1,000,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
    const arr = [];
    for (let i = 0; i < 1000000; i++) {
        arr.push(i);
    }
    console.log(process.memoryUsage());
    return 'done';
};
Result: 99 MB
2019-04-30T01:03:44.582Z 547a9de8-35f7-48e2-a53f-ab669b188f9a { rss: 80093184, heapTotal: 55263232, heapUsed: 52951088, external: 8224 }
REPORT RequestId: 547a9de8-35f7-48e2-a53f-ab669b188f9a Duration: 801.68 ms Billed Duration: 900 ms Memory Size: 128 MB Max Memory Used: 99 MB
Test 3: Create array with 10,000,000 elements in memory
Memory size: 128MB
exports.handler = async (event) => {
    const arr = [];
    for (let i = 0; i < 10000000; i++) {
        arr.push(i);
    }
    console.log(process.memoryUsage());
    return 'done';
};
Result: 128 MB
REPORT RequestId: f1df4f39-e0fc-4b44-8f90-c3c0e3d9c12d Duration: 3001.33 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 128 MB
2019-04-30T00:54:32.970Z f1df4f39-e0fc-4b44-8f90-c3c0e3d9c12d Task timed out after 3.00 seconds
I think we can pretty confidently say that the memory used by the lambda container does go up based on the size of an array in memory; in our third test we ended up maxing out our memory and timing out. My assumption here is that the process that controls the execution of the lambda also monitors how much memory that execution acquires; likely by cat /proc/meminfo as trognanders suggests.
Okay, so I used the following code and increased the number of array elements to get a correlation. Three tests were run for each maximum array size. The Lambda was set to 1024 MB. Each array element is a 10-character (10-byte) string.
const util = require('util');
const exec = util.promisify(require('child_process').exec);

async function GetContainerUsage() {
    const { stdout, stderr } = await exec('cat /proc/meminfo');
    // console.log(stdout);
    let memInfoSplits = stdout.split(/[\n: ]/).filter(val => val.trim());
    // console.log(memInfoSplits[19]); // This returns the "Active" value which seems to be used
    return Math.round(memInfoSplits[19] / 1024);
}

function GetMemoryUsage() {
    const used = process.memoryUsage();
    for (let key in used)
        used[key] = Math.round(used[key] / 1024 / 1024);
    return used;
}

exports.handler = async (event, context) => {
    let max = event.ArrTotal;
    let arr = [];
    for (let i = 0; i < max; i++) {
        arr.push("1234567890"); // 10 bytes
    }
    let csvLine = [];
    let jsMemUsed = GetMemoryUsage();
    let containerMemUsed = await GetContainerUsage();
    csvLine.push(event.ArrTotal);
    csvLine.push(jsMemUsed.rss);
    csvLine.push(jsMemUsed.heapTotal);
    csvLine.push(jsMemUsed.heapUsed);
    csvLine.push(jsMemUsed.external);
    csvLine.push(containerMemUsed);
    console.log(csvLine.join(','));
    return true;
};
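For reference, the handler above expects the element count in the test event, so an invocation payload such as { "ArrTotal": 1000000 } produces one CSV line for that array size.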
This outputs the following values in CSV form:
Array Count, JS rss, JS heapTotal, JS heapUsed, external, System Active, Lambda reported usage
1,30,7,5,0,53,54
1,31,7,5,0,53,55
1,30,8,5,0,53,55
1000,30,8,5,0,53,55
1000,30,8,5,0,53,55
1000,30,8,5,0,53,55
10000,30,8,5,0,53,55
10000,31,8,6,0,54,56
10000,33,7,5,0,54,57
100000,32,12,7,0,56,57
100000,34,11,8,0,57,59
100000,36,12,10,0,59,61
1000000,64,42,39,0,88,89
1000000,60,36,34,0,84,89
1000000,60,36,34,0,84,89
10000000,271,248,244,0,294,297
10000000,271,248,244,0,295,297
10000000,271,250,244,0,295,297
Graphed, these values show a clear linear trend (graph omitted).
At 10 million elements the array is assumed to be 10 million * 10 bytes = 100 MB. There must be some overhead I am missing somewhere, as roughly 200 MB is used elsewhere, but at least there is a clear linear correlation, which I can now work with.
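Some of that gap is presumably V8's own bookkeeping (array slack, pointer-sized element slots, GC headroom) rather than the raw string bytes. A rough local sketch to estimate the real per-element heap cost of this workload (plain Node rather than Lambda; run with --expose-gc for steadier numbers):
// Estimate average heap bytes per element for the same push-a-10-char-string workload.
const N = 1000000;
if (global.gc) global.gc();                // optional: force a collection for a cleaner baseline
const before = process.memoryUsage().heapUsed;
const arr = [];
for (let i = 0; i < N; i++) {
    arr.push("1234567890");
}
const after = process.memoryUsage().heapUsed;
console.log(((after - before) / N).toFixed(1) + " bytes per element (approx.)");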
Capacity specification for FaaS vs PaaS
The whole idea of doing computing with Lambda functions (FaaS) is to worry as little as possible about capacity planning. That said, since it is not possible for the cloud provider to pick sensible defaults for every workload, memory settings and timeouts are the knobs AWS exposes to configure the function. If you test it out, you may see that the memory setting does not just determine memory but also CPU compute capacity. As quoted by AWS:
Lambda allocates CPU power linearly in proportion to the amount of memory configured. At 1,792 MB, a function has the equivalent of 1 full vCPU (one vCPU-second of credits per second)
Ref https://docs.aws.amazon.com/lambda/latest/dg/resource-model.html
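Going by that ratio, a 256 MB function gets roughly 256 / 1792 ≈ 0.14 of a vCPU, so CPU-bound code can be expected to run several times slower there than at 1,792 MB.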
Hence, it is not enough to consider just the runtime memory footprint; the CPU speed with which the function executes and finishes matters as well.
AWS does not call out what capacity or CPU/memory/server type/IOPS they use in these containers, nor do they expose that usage in any CloudWatch metrics the way an EC2 instance does.
Hence we need to choose the memory setting based on testing.
Each Lambda (Node.js) will have its own memory footprint and its own dedicated set of node module dependencies. Hence, each one needs to be load- and performance-tested to tune its memory and timeout settings; this cannot be planned upfront.
General research observation
With any standard Node.js-based Lambda function that just logs and returns "hello world", deployed without a VPC:
128 MB may show an execution time of, say, 150+ ms and a billed duration of 200 ms at 128 MB
256 MB may show an execution time of, say, 80+ ms and a billed duration of 100 ms at 256 MB
A lower memory setting does not necessarily mean lower cost, so fine-tuning based on load and performance tests is the best way to determine the memory setting to use.
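For example, using the numbers above and ignoring the per-request charge, both configurations cost the same in GB-seconds: 0.125 GB × 0.2 s = 0.025 GB-s versus 0.25 GB × 0.1 s = 0.025 GB-s, while the 256 MB configuration responds roughly twice as fast.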
Attributes like the timeout depend purely on how long the function takes to complete its work, which can be far higher for batch operations (say 10 m) than for a web service that expects a quick response (say 10 s). Timing out early instead of waiting on long-pending dependencies is important to avoid a high bill for high-throughput APIs. For APIs, a slow timeout can cause additional containers (function instances) to spin up to handle new requests, which can also drive up the number of IPs allocated within the subnet that hosts the function (when the function runs inside a VPC).
Lambda limits on ENIs and IPs, and the maximum Lambda concurrency within an account/region, are important factors to consider when planning capacity.
Ref https://docs.aws.amazon.com/lambda/latest/dg/limits.html

Increase Electron Memory limit

My electron app crashes as soon as the memory usage reaches 2,000 MB.
I can test it by having this code in my main process file which intentionally raises the memory usage:
const all = [];
let big = [];
all.push(big);
for (let i = 0; i < 2000000000; i++) {
    const newLen = big.push(Math.random());
    if (newLen % 500000 === 0) {
        big = [];
        all.push(big);
        console.log('all.length: ' + all.length);
        console.log('heapTotal: ' + Math.round(process.memoryUsage().heapTotal / 1e6));
    }
}
console.log(all.length);
I have tried everything:
require('v8').setFlagsFromString('--max-old-space-size=4096');
app.commandLine.appendSwitch('js-flags', '--max-old-space-size=4096');
But nothing is working...
Tested on electron v3.0.0-beta.12 AND on electron v2.0.9 ~ 2.0.x
How can I increase the memory limit in Electron so my app does not crash as soon as it hits 2 GB of RAM usage?
No such problem in electron >8.0.3 (at least).
Tested with multiple Buffers, ~2GB each (buffer.constants.MAX_LENGTH)
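If you need to confirm whether a --max-old-space-size flag actually took effect in a given Electron version, a quick check (assuming Node integration is available where you run it) is to log the heap limit V8 reports; note that app.commandLine.appendSwitch generally has to run before the app's ready event to take effect:
// Log the heap size limit V8 is actually running with.
const v8 = require('v8');
const limitMB = Math.round(v8.getHeapStatistics().heap_size_limit / 1024 / 1024);
console.log('V8 heap size limit: ' + limitMB + ' MB');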

Node.js + Redis memory leak. Am I doing something wrong?

var redis = require("redis"),
    client = redis.createClient();

for (var i = 0; i < 1000000; i++) {
    client.publish('channel_1', 'hello!');
}
After the code is executed, the Node process consumes 1.2 GB of memory and stays there; GC does not reduce the allocated memory. If I simulate 2 million messages, or 4 x 500,000, Node crashes with an out-of-memory error.
Node: 0.8.*, tried 4.1.1 later but nothing changed
Redis: 2.8 , works well (1MB allocated memory).
My server will be publishing more than 1 million messages per hour. So this is absolutely not acceptable (process crashing every hour).
updated test
var redis = require("redis"),
    client = redis.createClient();

var count = 0;
var x;

function loop() {
    count++;
    console.log(count);
    if (count > 2000) {
        console.log('cleared');
        clearInterval(x);
    }
    for (var i = 0; i < 100000; i++) {
        client.set('channel_' + i, 'hello!');
    }
}

x = setInterval(loop, 3000);
This allocates ~50 MB, with a peak at 200 MB, and GC now drops memory back to 50 MB.
If you take a look at the node_redis client source, you'll see that every send operation returns a boolean that indicates whether the command queue length has passed the high water mark (by default 1000). If you were to log this return value (or, alternatively, enable redis.debug_mode), there is a good chance you'll see false a lot, an indication that you're sending more requests than Redis can handle all at once.
If this turns out not to be the case, then the command queue is indeed being cleared regularly which means GC is most likely the issue.
Either way, try jfriend00's suggestion. Sending 1M+ async messages with no delay (so basically all at once) is not a good test. The queue needs time to clear and GC needs time to do its thing.
Sources:
Backpressure and Unbounded Concurrency & Node-redis client return values
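A minimal sketch of that advice for the original publish test, written against the same callback-style node_redis client: send in bounded batches and yield between them so the command queue can drain (the batch size and delay below are arbitrary placeholders):
var redis = require("redis");
var client = redis.createClient();

var TOTAL = 1000000;
var BATCH = 10000;   // arbitrary batch size
var sent = 0;

function sendBatch() {
    for (var i = 0; i < BATCH && sent < TOTAL; i++, sent++) {
        client.publish('channel_1', 'hello!');
    }
    if (sent < TOTAL) {
        setTimeout(sendBatch, 50);   // give the queue and the GC time to catch up
    } else {
        client.quit();
    }
}

sendBatch();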

How to improve throughput with FileStream in a single-threaded application

I am trying to get top I/O performance in a data streaming application with eight SSDs in RAID-5 (each SSD advertises and delivers 500 MB/sec reads).
I create a FileStream with a 64 KB buffer and read many blocks in a blocking fashion (pun not intended). Here's what I have now with 80 GB in 20K files, no fragmentation:
Legacy blocking reads are at 1270 MB/sec with single thread, 1556 MB/sec with 6 threads.
What I noticed with a single thread is that a single core's worth of CPU time is spent in the kernel (8.3% red in Process Explorer with 12 cores). With 6 threads, approximately 5x the CPU time is spent in the kernel (41% red in Process Explorer with 12 cores).
I would really like to avoid the complexity of a multi-threaded application in this I/O-bound scenario.
Is it possible to achieve these transfer rates in a single-threaded application? That is, what would be a good way to reduce the amount of time in kernel mode?
How, if at all, would the new Async feature in C# help?
For comparison, ATTO disk benchmark shows 2500 MB/sec at these block sizes on this hardware and low CPU utilization. However, ATTO dataset size is mere 2GB.
Using LSI 9265-8i RAID controller, with 64k stripe size, 64k cluster size.
Here's a sketch of the code in use. I don't write production code this way, it's just a proof of concept.
volatile bool _somethingLeftToRead = false;
long _totalReadInSize = 0;

void ProcessReadThread(object obj)
{
    TestThreadJob job = obj as TestThreadJob;
    var dirInfo = new DirectoryInfo(job.InFilePath);
    int chunk = job.DataBatchSize * 1024;
    //var tile = new List<byte[]>();
    var sw = new Stopwatch();
    var allFiles = dirInfo.GetFiles();
    var fileStreams = new List<FileStream>();
    long totalSize = 0;
    _totalReadInSize = 0;

    foreach (var fileInfo in allFiles)
    {
        totalSize += fileInfo.Length;
        var fileStream = new FileStream(fileInfo.FullName,
            FileMode.Open, FileAccess.Read, FileShare.None, job.FileBufferSize * 1024);
        fileStreams.Add(fileStream);
    }

    var partial = new byte[chunk];
    var taskParam = new TaskParam(null, partial);
    var tasks = new List<Task>();
    int numTasks = (int)Math.Ceiling(fileStreams.Count * 1.0 / job.NumThreads);
    sw.Start();
    do
    {
        _somethingLeftToRead = false;
        for (int taskIndex = 0; taskIndex < numTasks; taskIndex++)
        {
            if (_threadCanceled)
                break;
            tasks.Clear();
            for (int thread = 0; thread < job.NumThreads; thread++)
            {
                if (_threadCanceled)
                    break;
                int fileIndex = taskIndex * job.NumThreads + thread;
                if (fileIndex >= fileStreams.Count)
                    break;
                var fileStream = fileStreams[fileIndex];
                taskParam.File = fileStream;
                if (job.NumThreads == 1)
                    ProcessFileRead(taskParam);
                else
                    tasks.Add(Task.Factory.StartNew(ProcessFileRead, taskParam));
                //tile.Add(partial);
            }
            if (_threadCanceled)
                break;
            if (job.NumThreads > 1)
                Task.WaitAll(tasks.ToArray());
        }
        //tile = new List<byte[]>();
    }
    while (_somethingLeftToRead);
    sw.Stop();

    foreach (var fileStream in fileStreams)
        fileStream.Close();

    totalSize = (long)Math.Round(totalSize / 1024.0 / 1024.0);
    UpdateUIRead(false, totalSize, sw.Elapsed.TotalSeconds);
}

void ProcessFileRead(object taskParam)
{
    TaskParam param = taskParam as TaskParam;
    int readInSize;
    if ((readInSize = param.File.Read(param.Bytes, 0, param.Bytes.Length)) != 0)
    {
        _somethingLeftToRead = true;
        _totalReadInSize += readInSize;
    }
}
There are a number of issues here.
First, I see that you are not using non-cached I/O. This means that the system will try to cache your data in RAM and service reads out of that cache, so you get an extra data copy on every read. Do non-cached I/O.
Next, you appear to be creating/destroying threads inside a loop. This is inefficient.
Lastly, you need to investigate the alignment of the data. Crossing read-block boundaries can add to your costs.
I would advocate using non-cached, async I/O. I'm not sure how to accomplish this in C# (but it should be easy).
EDITED: Also, why are you using RAID 5? Unless the data is write-once, this is likely to have hideous performance on SSDs. Notably, the erase block size is typically 512K, meaning when you write something smaller, the SSD will need to read the 512K in its firmware, change the data, and then write it somewhere else. You might want to make the stripe size = size of erase block. Also, you should check to see what the alignment of the writes are as well.

Why do arrays take less memory than buffers in node.js?

I'm deciding on the best way to store a lot of timeseries data in memory and I made a simple benchmark to compare buffers vs simple arrays:
var buffers = {};
var started = Date.now();
var before = process.memoryUsage().heapUsed;
for (var i = 0; i < 100000; i++) {
    buffers[i] = new Buffer(4);
    buffers[i].writeFloatLE(i + 1.2, 0);
    // buffers[i] = [i+1.2];
}
console.log(Date.now() - started, 'ms');
console.log((process.memoryUsage().heapUsed - before) / 1024 / 1024);
And the results are as follows:
Arrays: 22 ms, 8.391242980957031 MB heap used
Buffers: 123 ms, 9.9490966796875 MB heap used
So according to this benchmark arrays are 5+ times faster and take 18% less memory. Is this correct? I certainly expected buffers to take less memory.
There's an overhead (in time and space) for each Buffer object you create.
I expect you'll get better space (and maybe time) performance if you compare
buffers[i] = new Buffer(4 * 1000);
for (var j = 0; j < 1000; ++j) {
    buffers[i].writeFloatLE(i + j + 1.2, 4 * j);
}
With
buffers[i] = [];
for (var j = 0; j < 1000; ++j) {
    buffers[i].push(i + j + 1.2);
}
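As a self-contained sketch of that comparison (the key count of 1000 and the bench helper are illustrative, and Buffer.alloc stands in for the deprecated new Buffer):
function usedBytes() {
    const m = process.memoryUsage();
    return m.heapUsed + m.external;   // Buffer backing stores count as "external", arrays as heap
}

function bench(label, fill) {
    const before = usedBytes();
    const started = Date.now();
    const buffers = {};
    for (let i = 0; i < 1000; i++) {
        fill(buffers, i);
    }
    console.log(label, Date.now() - started, 'ms,',
        ((usedBytes() - before) / 1024 / 1024).toFixed(2), 'MB');
    return buffers; // keep the data reachable so it stays in the measurement
}

// One 4000-byte Buffer per key, holding 1000 floats.
const big = bench('one big Buffer per key:', (buffers, i) => {
    buffers[i] = Buffer.alloc(4 * 1000);
    for (let j = 0; j < 1000; ++j) {
        buffers[i].writeFloatLE(i + j + 1.2, 4 * j);
    }
});

// A plain array of 1000 numbers per key, for comparison.
const plain = bench('plain array per key:', (buffers, i) => {
    buffers[i] = [];
    for (let j = 0; j < 1000; ++j) {
        buffers[i].push(i + j + 1.2);
    }
});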
