Node.js in Azure Functions is much slower than local Node.js - node.js

I'm a beginner with Node.js and Azure.
I'm trying to use the wav-encoder npm module in my program,
so I wrote the code below:
var WavEncoder = require('wav-encoder');
const whiteNoise1sec = {
    sampleRate: 40000,
    channelData: [
        new Float32Array(40000).map(() => Math.random() - 0.5),
        new Float32Array(40000).map(() => Math.random() - 0.5)
    ]
};
It runs on my local machine in less than 2 seconds,
but when I upload similar code to Azure Functions, it takes more than 2 minutes.
Below is the code in my function. It is triggered by an HTTP REST call.
var WavEncoder = require('wav-encoder');
module.exports = function (context, req) {
    context.log('JavaScript HTTP trigger function processed a request.');
    const whiteNoise1sec = {
        sampleRate: 40000,
        channelData: [
            new Float32Array(40000).map(() => Math.random() - 0.5),
            new Float32Array(40000).map(() => Math.random() - 0.5)
        ]
    };
    context.res = {
        // status: 200, /* Defaults to 200 */
        body: whiteNoise1sec
    };
    context.done();
};
Do you know how I can improve performance on Azure?
context.res = {
    // status: 200, /* Defaults to 200 */
    body: whiteNoise1sec
};
I found that this assignment causes the slow performance.
If I assign a large array to context.res.body, it takes a long time when I call context.done().
Is a large JSON response not suitable for Azure Functions?

It's a bit hard to analyze performance issues like this, but there are a few things to consider and look at.
Cold vs. warm function performance
If the function hasn't been invoked in a while, or ever (I think the threshold is about 10 or 20 minutes), it goes idle, meaning it gets deprovisioned. The next time you hit that function, it needs to be loaded from storage. Because of the architecture and its reliance on a certain type of storage, I/O for small files is currently slow. There is work in progress to improve that, but a large npm tree can cause more than 1 minute of loading time just to fetch all the small JS files. If the function is warm, however, it should respond in the millisecond range (depending on the work your function is doing; see below for more).
Workaround: use this to pack your function
Slower CPU on the Consumption SKU
On the Consumption SKU, you are scaled out to many instances (in the hundreds), but each instance is affinitized to a single core. That is fine for I/O-bound operations, regular Node functions (since they are single-threaded anyway), and so on. But if your function needs CPU for CPU-bound workloads, it's not going to perform as you expect.
Workaround: you can use dedicated SKUs for CPU-bound workloads.
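Separately from cold starts and CPU, the question observes that a large context.res.body is slow. One thing worth checking is how a Float32Array serializes: with plain JSON.stringify, a typed array becomes an object keyed by index rather than a JSON array, which is considerably larger. The sketch below only demonstrates that behaviour. Whether the Functions host serializes the body this way, and whether that explains the slowdown, is an assumption, and jsonForms is a hypothetical helper name.

```javascript
// Compare how a typed array and a plain array serialize with JSON.stringify.
function jsonForms(samples) {
  return {
    typed: JSON.stringify(samples),             // object keyed by index
    plain: JSON.stringify(Array.from(samples))  // ordinary JSON array
  };
}

const forms = jsonForms(new Float32Array([0.5, -0.25]));
console.log(forms.typed); // {"0":0.5,"1":-0.25}
console.log(forms.plain); // [0.5,-0.25]
```

If this turns out to matter, converting the channels with Array.from() before assigning them to the response body, or returning the encoded WAV bytes instead of raw samples, would be the things to try.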


DynamoDB PutItem using all heap memory - NodeJS

I have a CSV with over a million lines, and I want to import all the lines into DynamoDB. I'm able to loop through the CSV just fine; however, when I try to call DynamoDB PutItem on these lines, I run out of heap memory after about 18k calls.
I don't understand why this memory is being used or how I can get around this issue. Here is my code:
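const { createReadStream } = require('fs');
const { createInterface } = require('readline');
const { once } = require('events');
// `parse` (a synchronous CSV parser) and `dynamodb` (the AWS SDK client) are assumed to be set up elsewhere.
let insertIntoDynamoDB = async () => {
    const file = './file.csv';
    let index = 0;
    const readLine = createInterface({
        input: createReadStream(file),
        crlfDelay: Infinity
    });
    readLine.on('line', async (line) => {
        let record = parse(`${line}`, {
            delimiter: ',',
            skip_empty_lines: true,
            skip_lines_with_empty_values: false
        });
        await dynamodb.putItem({
            Item: {
                "Id": {
                    S: record[0][2]
                },
                "newId": {
                    S: record[0][0]
                }
            },
            TableName: "My-Table-Name"
        });
        index++;
        if (index % 1000 === 0) {
            // (body not shown in the original)
        }
    });
    // halts process until all lines have been processed
    await once(readLine, 'close');
    console.log('FINAL: ' + index);
};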
If I comment out the DynamoDB call, I can loop through the file just fine and read every line. Where is this memory usage coming from? My DynamoDB write throughput is at 500, and adjusting this value has no effect.
For anyone trudging through the internet trying to find out why DynamoDB is consuming all the heap memory, there is a GitHub bug report found here:
Basically, the AWS SDK only has 50 sockets for making HTTP requests. If all sockets are in use, further requests are queued until a socket becomes available. When processing millions of requests, these sockets get consumed immediately, and then the queue builds up until it blows up the heap.
So, then how do you get around this?
1. Increase heap size
2. Increase number of sockets
3. Control how many "events" you are queueing
Options 1 and 2 are the easy way out, but they do not scale. They might work for your scenario if you are doing a one-off job, but if you are trying to build a robust solution, you will want to go with number 3.
To do number 3, I determine the max heap size and divide it by how large I think an "event" will be in memory. For example: I assume an updateItem event for DynamoDB would be 100,000 bytes. My heap size was 4GB, so 4,000,000,000 B / 100,000 B = 40,000 events. However, I only take 50% of this many events (20,000) to leave room on the heap for other work that the Node application might be doing. This percentage can be lowered or increased depending on your preference. Once I have the number of events, I read a line from the CSV and consume an event; when the write has completed, I release the event back into the pool. If there are no events available, I pause the input stream from the CSV until an event becomes available.
Now I can upload millions of entries to dynamodb without any worry of blowing up the heap.
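The pool described above can be sketched roughly as follows. This is a minimal illustration, not the code the answer used: EventPool, poolSize and importLines are hypothetical names, and writeItem stands in for whatever function performs the DynamoDB write and returns a promise. Waiting inside the for await loop stops the loop from pulling further lines; how much the readline interface buffers internally in the meantime is not addressed here.

```javascript
// Hypothetical helper (not part of the AWS SDK): a counting pool of "events".
class EventPool {
  constructor(size) {
    this.free = size;   // events currently available
    this.waiters = [];  // resolvers waiting for an event
  }
  acquire() {
    if (this.free > 0) {
      this.free--;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }
  release() {
    const next = this.waiters.shift();
    if (next) next();   // hand the event straight to the next waiter
    else this.free++;
  }
}

// Pool size from the estimate in the text: heap / bytes per event, scaled by a safety fraction.
function poolSize(heapBytes, bytesPerEvent, fraction) {
  return Math.floor((heapBytes / bytesPerEvent) * fraction);
}

// Feed lines to an async writer without ever holding more events than the pool allows.
// `lines` is any iterable or async iterable; `writeItem(line)` must return a promise.
async function importLines(lines, writeItem, pool) {
  const pending = new Set();
  const errors = [];
  for await (const line of lines) {
    await pool.acquire(); // waits here when no event is available
    const p = writeItem(line)
      .catch((err) => { errors.push(err); })
      .finally(() => { pool.release(); pending.delete(p); });
    pending.add(p);
  }
  await Promise.all(pending);
  return errors; // failed writes are collected instead of thrown
}
```

With the numbers from the text, poolSize(4e9, 100000, 0.5) gives 20,000 events, and the readline interface from the question could be passed as lines.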

How can the AWS Lambda concurrent execution limit be reached?

The original test code below is largely correct, but in Node.js the various AWS services should be set up a bit differently, as per the SDK link provided by @Michael-sqlbot:
// manager
const AWS = require("aws-sdk")
const https = require('https');
const agent = new https.Agent({
    maxSockets: 498 // workers hit this level; expect plus 1 for the manager instance
});
const lambda = new AWS.Lambda({
    apiVersion: '2015-03-31',
    region: 'us-east-2', // Initial concurrency burst limit = 500
    httpOptions: { // <--- replace the default of 50 (https) by
        agent: agent // <--- plugging the modified Agent into the service
    }
});
// NOW begin the manager handler code
In planning for a new service, I am doing some preliminary stress testing. After reading about the 1,000 concurrent execution limit per account and the initial burst rate (which in us-east-2 is 500), I was expecting to achieve at least the 500 burst concurrent executions right away. The screenshot below of CloudWatch's Lambda metric shows otherwise. I cannot get past 51 concurrent executions no matter what mix of parameters I try. Here's the test code:
// worker
exports.handler = async (event) => {
    // declare sleep promise
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
    // return after one second
    let nStart = new Date().getTime()
    await sleep(1000)
    return new Date().getTime() - nStart; // report the exact ms the sleep actually took
};
// manager
exports.handler = async(event) => {
    const invokeWorker = async() => {
        try {
            let lambda = new AWS.Lambda() // NO! DO NOT DO THIS, SEE UPDATE ABOVE
            var params = {
                FunctionName: "worker-function",
                InvocationType: "RequestResponse",
                LogType: "None"
            };
            return await lambda.invoke(params).promise()
        }
        catch (error) {
            // (error handling not shown)
        }
    };
    try {
        let nStart = new Date().getTime()
        let aPromises = []
        // invoke workers
        for (var i = 1; i <= 3000; i++) {
            aPromises.push(invokeWorker())
        }
        // record time to complete spawning
        let nSpawnMs = new Date().getTime() - nStart
        // wait for the workers to ALL return
        let aResponses = await Promise.all(aPromises)
        // sum all the actual sleep times
        const reducer = (accumulator, response) => { return accumulator + parseInt(response.Payload) };
        let nTotalWorkMs = aResponses.reduce(reducer, 0)
        // show me
        let nTotalET = new Date().getTime() - nStart
        return {
            jobsCount: aResponses.length,
            spawnCompletionMs: nSpawnMs,
            spawnCompletionPct: `${Math.floor(nSpawnMs / nTotalET * 10000) / 100}%`,
            totalElapsedMs: nTotalET,
            totalWorkMs: nTotalWorkMs,
            parallelRatio: Math.floor(nTotalET / nTotalWorkMs * 1000) / 1000
        };
    }
    catch (error) {
        // (error handling not shown)
    }
};
{
    "jobsCount": 3000,
    "spawnCompletionMs": 1879,
    "spawnCompletionPct": "2.91%",
    "totalElapsedMs": 64546,
    "totalWorkMs": 3004205,
    "parallelRatio": 0.021
}
Request ID:
Am I hitting a different limit that I have not mentioned? Is there a flaw in my test code? I was attempting to hit the limit here with 3,000 workers, but there was NO throttling encountered, which I guess is due to the Asynchronous invocation retry behaviour.
Edit: There is no VPC involved on either Lambda; the setting in the select input is "No VPC".
Edit: Showing Cloudwatch before and after the fix
There were a number of potential suspects, particularly due to the fact that you were invoking Lambda from Lambda, but your focus on consistently seeing a concurrency of 50 — a seemingly arbitrary limit (and a suspiciously round number) — reminded me that there's an anti-footgun lurking in the JavaScript SDK:
In Node.js, you can set the maximum number of connections per origin. If maxSockets is set, the low-level HTTP client queues requests and assigns them to sockets as they become available.
Here of course, "origin" means any unique combination of scheme + hostname, which in this case is the service endpoint for Lambda in us-east-2 that the SDK is connecting to in order to call the Invoke method,
This lets you set an upper bound on the number of concurrent requests to a given origin at a time. Lowering this value can reduce the number of throttling or timeout errors received. However, it can also increase memory usage because requests are queued until a socket becomes available.
When using the default of https, the SDK takes the maxSockets value from the globalAgent. If the maxSockets value is not defined or is Infinity, the SDK assumes a maxSockets value of 50.
Lambda concurrency is not the only factor that decides how scalable your functions are. If your Lambda function is running within a VPC, it will require an ENI (Elastic Network Interface), which allows network traffic to and from the container (Lambda function).
It's possible your throttling occurred due to too many ENIs being requested (50 at a time). You can check this by viewing the logs of the manager Lambda function and looking for an error message when it tries to invoke one of the child containers. If the error looks something like the following, you'll know ENIs are your issue.
Lambda was not able to create an ENI in the VPC of the Lambda function because the limit for Network Interfaces has been reached.

Is making sequential HTTP requests a blocking operation in node?

Note that irrelevant information to my question will be 'quoted'
like so (feel free to skip these).
I am using node to make in-order HTTP requests on behalf of multiple clients. This way, what originally took the client(s) several different page loads to get the desired result, now only takes a single request via my server. I am currently using the ‘async’ module for flow control and ‘request’ module for making the HTTP requests. There are approximately 5 callbacks which, using console.time, takes about ~2 seconds from start to finish (sketch code included below).
Now I am rather inexperienced with node, but I am aware of the
single-threaded nature of node. While I have read many times that node
isn’t built for CPU-bound tasks, I didn’t really understand what that
meant until now. If I have a correct understanding of what’s going on,
this means that what I currently have (in development) is in no way
going to scale to even more than 10 clients.
Since I am not an expert at node, I ask this question (in the title) to get a confirmation that making several sequential HTTP requests is indeed blocking.
If that is the case, I expect I will ask a different SO question (after doing the appropriate research) discussing various possible solutions, should I choose to continue approaching this problem in node (which itself may not be suitable for what I'm trying to do).
Other closing thoughts
The code I mentioned earlier:
var async = require('async');
var request = require('request');
async.waterfall([
    function(cb) {
        request(someUrl1, function(err, res, body) {
            // load and parse the given web page.
            // make a callback with data parsed from the web page
        });
    },
    function(someParameters, cb) {
        request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
            // more computation
            // make a callback with a session cookie given by the visited url
        });
    },
    function(jar, cb) {
        request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
            // do more parsing + computation
            // make another callback with the results
        });
    },
    function(moreParameters, cb) {
        request({url: someUrl4, method: 'POST', jar: jar, form : {/*data*/}}, function(err, res, body) {
            // make final callback after some more computation.
            // This part takes about ~1s to complete
        });
    }
], function (err, result) {
    console.timeEnd('4');
});
Normally, I/O in node.js is non-blocking. You can test this out by making several requests simultaneously to your server. For example, if each request takes 1 second to process, a blocking server would take 2 seconds to process 2 simultaneous requests, but a non-blocking server would take just a bit more than 1 second to process both requests.
However, you can deliberately make requests blocking by using the sync-request module instead of request. Obviously, that's not recommended for servers.
Here's a bit of code to demonstrate the difference between blocking and non-blocking I/O:
var req = require('request');
var sync = require('sync-request');
// Load N times (yes, it's a real website):
var N = 10;
console.log('BLOCKING test ==========');
var start = new Date().valueOf();
for (var i=0;i<N;i++) {
    var res = sync('GET','')
    console.log('Downloaded ' + res.getBody().length + ' bytes');
}
var end = new Date().valueOf();
console.log('Total time: ' + (end-start) + 'ms');
console.log('NON-BLOCKING test ======');
var loaded = 0;
var start = new Date().valueOf();
for (var i=0;i<N;i++) {
    req('',function( err, response, body ) {
        loaded++;
        console.log('Downloaded ' + body.length + ' bytes');
        if (loaded == N) {
            var end = new Date().valueOf();
            console.log('Total time: ' + (end-start) + 'ms');
        }
    })
}
Running the code above you'll see the non-blocking test takes roughly the same amount of time to process all requests as it does for a single request (for example, if you set N = 10, the non-blocking code executes 10 times faster than the blocking code). This clearly illustrates that the requests are non-blocking.
Additional answer:
You also mentioned that you're worried about your process being CPU intensive. But in your code, you're not benchmarking CPU utilization. You're mixing network request time (I/O, which we know is non-blocking) with CPU processing time. To measure how much time the request spends in blocking mode, change your code to this:
async.waterfall([
    function(cb) {
        request(someUrl1, function(err, res, body) {
            // load and parse the given web page.
            // make a callback with data parsed from the web page
        });
    },
    function(someParameters, cb) {
        request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
            // more computation
            // make a callback with a session cookie given by the visited url
        });
    },
    function(jar, cb) {
        request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
            // do more parsing + computation
            // make another callback with the results
        });
    },
    function(moreParameters, cb) {
        request({url: someUrl4, method: 'POST', jar: jar, form : {/*data*/}}, function(err, res, body) {
            // some more computation.
            // make final callback
        });
    }
], function (err, result) {
});
Your code is non-blocking because it uses non-blocking I/O with the request() function. This means that node.js is free to service other requests while your series of http requests is being fetched.
What async.waterfall() does is order your requests to be sequential and pass the results of one on to the next. The requests themselves are non-blocking, and async.waterfall() does not change or influence that. The series you have just means that you have multiple non-blocking requests in a row.
What you have is analogous to a series of nested setTimeout() calls. For example, this sequence of code takes 5 seconds to get to the inner callback (like your async.waterfall() takes n seconds to get to the last callback):
setTimeout(function() {
    setTimeout(function() {
        setTimeout(function() {
            setTimeout(function() {
                setTimeout(function() {
                    // it takes 5 seconds to get here
                }, 1000);
            }, 1000);
        }, 1000);
    }, 1000);
}, 1000);
But, this uses basically zero CPU because it's just 5 consecutive asynchronous operations. The actual node.js process is involved for probably no more than 1ms to schedule the next setTimeout() and then the node.js process literally could be doing lots of other things until the system posts an event to fire the next timer.
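To make the point concrete, here is a small sketch in which two "clients" each run three sequential asynchronous steps. Timers stand in for the HTTP requests, and handleClient and demo are hypothetical names, not part of the question's code. Because each step only waits, the two sequences interleave instead of running one after the other.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One "client": three sequential steps, each waiting on a timer that
// stands in for a non-blocking HTTP request.
async function handleClient(name, log) {
  for (let step = 1; step <= 3; step++) {
    await sleep(20);
    log.push(`${name}:${step}`);
  }
}

// Run two clients at the same time and record the order in which steps finish.
async function demo() {
  const log = [];
  const start = Date.now();
  await Promise.all([handleClient('A', log), handleClient('B', log)]);
  return { log, elapsedMs: Date.now() - start };
}

demo().then(({ log, elapsedMs }) => {
  console.log(log.join(' '), `(${elapsedMs} ms)`);
});
```

Client B finishes its first step before client A finishes its last, and the total elapsed time is close to that of a single client rather than the sum of both.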
You can read more about how the node.js event queue works in these references:
Run Arbitrary Code While Waiting For Callback in Node?
blocking code in non-blocking http server
Hidden threads in Javascript/Node that never execute user code: is it possible, and if so could it lead to an arcane possibility for a race condition?
How does JavaScript handle AJAX responses in the background? (written about the browser, but concept is the same)
If I have a correct understanding of what’s going on, this means that
what I currently have (in development) is in no way going to scale to
even more than 10 clients.
This is not a correct understanding. A node.js process can easily have thousands of non-blocking requests in flight at the same time. Your sequentially measured time is only a start to finish time - it has nothing to do with CPU resources or other OS resources consumed (see comments below on non-blocking resource consumption).
I still have concerns about using node for this particular
application then. I'm worried about how it will scale considering that
the work it is doing is not simple I/O but computationally intensive.
I feel as though I should switch to a platform that enables
multi-threading. Does what I'm asking/the concern I'm expressing make
sense? I could just be spitting total BS and have no idea what I'm
talking about.
Non-blocking I/O consumes almost no CPU (only a little when the request is originally sent and then a little when the result arrives back), but while the computer is waiting for the remote result, no CPU is consumed at all and no OS thread is consumed. This is one of the reasons that node.js scales well for non-blocking I/O, as no resources are used while the computer is waiting for a response from a remote site.
If your processing of the request is computationally intensive (e.g. takes a measurable amount of pure blocking CPU time to process), then yes you would want to explore getting multiple processes involved in running the computations. There are multiple ways to do this. You can use clustering (so you simply have multiple identical node.js processes each working on requests from different clients) with the nodejs clustering module. Or, you can create a work queue of computationally intensive work to do and have a set of child processes that do the computationally intensive work. Or, there are several other options too. This is not the type of problem that one needs to switch away from node.js to solve - it can be solved using node.js just fine.
You can use a queue to process concurrent HTTP calls in Node.js:
var cq = require('concurrent-queue');
test_queue = cq();
// request action method
testQueue: function(req, res) {
    // queuing each request to process sequentially
    test_queue(req.user, function (err, user) {
        console.log(' done');
        res.json(200, user)
    });
},
// Queue will be processed one by one.
test_queue.limit({ concurrency: 1 }).process(function (user, cb) {
    console.log( + ' started')
    // async calls will go there
    setTimeout(function () {
        // on callback of async, call cb and return response.
        cb(null, user)
    }, 1000);
});
Please remember that this should only be used for sensitive business calls where a resource must be accessed or updated by only one user at a time.
This serializes the requests, so your users have to wait and response times will be slower.
You can make it faster by creating a resource-dependent queue, so that there is a separate queue for each shared resource: calls for the same resource are executed one at a time, while calls for different resources are executed asynchronously.
Let's suppose that you want to implement this on the basis of the current user, so that HTTP calls for the same user execute one at a time while calls for different users execute asynchronously.
testQueue: function(req, res) {
// if queue not exist for current user.
if(! (test_queue.hasOwnProperty( ){
// initialize queue for current user
test_queue[] = cq();
// initialize queue processing for current user
// Queue will be processed one by one.
test_queue[].limit({ concurrency: 1 }).process(function (task, cb) {
console.log( + ' started')
// async functionality will go there
setTimeout(function () {
cb(null, task)
}, 1000)
// queuing each request in user specific queue to process sequentially
test_queue[](req.user, function (err, user) {
res.json(200, user)
console.log(' done');
This will be fast and will only serialize calls for the resource that you choose.
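The per-resource idea can also be sketched without any queue library, using one promise chain per key. This is an illustrative alternative, not the concurrent-queue API: enqueue and chains are hypothetical names, and the key would be whatever identifies the shared resource (for example, a user id).

```javascript
// One promise chain per key: tasks with the same key run one after another,
// tasks with different keys run independently of each other.
const chains = new Map();

function enqueue(key, task) {
  const previous = chains.get(key) || Promise.resolve();
  // Start this task only after the previous task for the same key has settled.
  const run = previous.then(() => task());
  // Keep the chain usable even if this task fails, so later tasks still run.
  const tail = run.catch(() => {});
  chains.set(key, tail);
  // Drop the entry once this key has no more pending work.
  tail.then(() => {
    if (chains.get(key) === tail) chains.delete(key);
  });
  return run; // resolves or rejects with the task's own outcome
}
```

A request handler could then call enqueue(userKey, () => doWork(req)) and send the response when the returned promise settles, where userKey and doWork are placeholders for the application's own values.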

node.js azure cache latency

I am working on a node.js express application which uses Azure cache. I have deployed the service to Azure and I notice a latency of 50ms or so for get and put requests.
The methods I am using are:
var time1, time2;
var start =;
var cacheObject = this.cache;
cacheObject.put('test1', { first: 'Jane', last: 'Doe' }, function (error) {
if (error) throw error;
time1= - start;
start =;
cacheObject.get('test1', function (error, data) {
if (error) throw error;
console.log('Data from cache:' + data);
time2 = (;
res.send({t1: time1, t2: time2});
The time for put is represented by time1 and time2 represents the time for get.
From reading other posts on the internet, I understood that the latency should be on the order of a couple of ms, so 50ms seems a bit high. Am I using the methods properly? Are there any special settings I need to set up on the management portal? Or is 50ms latency expected?
A few obvious things to check first:
Is the client code running in the same region as the cache? The minimum possible latency is the network round trip time, which may be around 50ms between regions.
Is the Node.js timer precise enough to measure a small number of milliseconds? I'm not familiar with Node.js, but in .NET you should use the Stopwatch class for timing, rather than DateTime.Now.
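On the timer question: Node.js provides process.hrtime.bigint(), a monotonic clock with nanosecond resolution, which is better suited to measuring a few milliseconds than wall-clock time. A minimal sketch follows; timeMs is a hypothetical helper name, and the operation passed in is assumed to return a promise.

```javascript
// Time an async operation with Node's monotonic high-resolution clock.
async function timeMs(operation) {
  const start = process.hrtime.bigint(); // nanoseconds, as a BigInt
  const result = await operation();
  const elapsedNs = process.hrtime.bigint() - start;
  return { result, ms: Number(elapsedNs) / 1e6 };
}
```

To time the cache calls from the question, the callback-style put and get would first need to be wrapped in promises, for example with util.promisify, assuming they follow the usual (error, data) callback convention.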

Node.js Synchronous Library Code Blocking Async Execution

Suppose you've got a 3rd-party library that's got a synchronous API. Naturally, attempting to use it in an async fashion yields undesirable results in the sense that you get blocked when trying to do multiple things in "parallel".
Are there any common patterns that allow us to use such libraries in an async fashion?
Consider the following example (using the async library from NPM for brevity):
var async = require('async');
function ts() {
    return new Date().getTime();
}
var startTs = ts();
process.on('exit', function() {
    console.log('Total Time: ~' + (ts() - startTs) + ' ms');
});
// This is a dummy function that simulates some 3rd-party synchronous code.
function vendorSyncCode() {
    var future = ts() + 50; // ~50 ms in the future.
    while(ts() <= future) {} // Spin to simulate blocking work.
}
// My code that handles the workload and uses `vendorSyncCode`.
function myTaskRunner(task, callback) {
    // Do async stuff with `task`...
    vendorSyncCode(task);
    // Do more async stuff...
    callback();
}
// Dummy workload.
var work = (function() {
    var result = [];
    for(var i = 0; i < 100; ++i) result.push(i);
    return result;
})();
// Problem:
// -------
// The following two calls will take roughly the same amount of time to complete.
// In this case, ~6 seconds each.
async.each(work, myTaskRunner, function(err) {});
async.eachLimit(work, 10, myTaskRunner, function(err) {});
// Desired:
// --------
// The latter call with 10 "workers" should complete roughly an order of magnitude
// faster than the former.
Are fork/join or spawning worker processes manually my only options?
Yes, it is your only option.
If you need to use 50ms of CPU time to do something, and need to do it 10 times, then you'll need 500ms of CPU time to do it. If you want it to be done in less than 500ms of wall clock time, you need to use more CPUs. That means multiple node instances (or a C++ addon that pushes the work out onto the thread pool). How to get multiple instances depends on your app structure: a child that you feed the work to using child_process.send() is one way, and running multiple servers with cluster is another. Breaking up your server is another way. Say it's an image store application that is mostly fast at processing requests, unless someone asks to convert an image into another format, which is CPU-intensive. You could push the image processing portion into a different app and access it through a REST API, leaving the main app server responsive.
If you aren't concerned that it takes 50ms of cpu to do the request, but instead you are concerned that you can't interleave handling of other requests with the processing of the cpu intensive request, then you could break the work up into small chunks, and schedule the next chunk with setInterval(). That's usually a horrid hack, though. Better to restructure the app.
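A related option, not mentioned in the answer and only available in newer Node.js versions, is the built-in worker_threads module. It follows the same idea of moving the synchronous work onto other CPUs. The sketch below is an assumption-laden illustration: runInWorker is a hypothetical helper, the worker source is passed as a string so the example stays in one file, and the spin loop stands in for the synchronous vendor call.

```javascript
const { Worker } = require('worker_threads');

// Source of the worker, passed as a string so the sketch stays in one file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Stand-in for the synchronous vendor call: spin for workerData.ms milliseconds.
  const end = Date.now() + workerData.ms;
  while (Date.now() <= end) {}
  parentPort.postMessage(workerData.id);
`;

// Run one blocking task in its own thread and resolve with the id it posts back.
function runInWorker(id, ms) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { id, ms } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

Starting a thread per task has its own cost, so for a workload like the 100 tasks in the question, a small fixed set of long-lived workers that each receive many tasks would be the more realistic design.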
