ExpressJS Middleware bodyParser has very poor performance

ExpressJS Middleware bodyParser has very poor performance - node.js

Few days ago we added NewRelic, APM to our Rest API, which is written in NodeJS and uses EXPRESS JS as a development framework.
Now we see a lot of users experience poor response times, because of JSON parser middleware.
Here is one of those requests reported in NewRelic:
Drilled-down report:
As you can see the most of the time is consumed by middleware JSON Parser.
We were thinking maybe issue comes from big JSON payloads, which is sometimes sent out from API. But for this response, illustrated above, API returned data with contentLength=598, which shouldn't be too huge JSON.
We also use compression middleware as visible on drilled down request execution screenshot. Which should be reducing size of IO sent back and forth to clients.
At this moment we have a doubt for a parameter limit which is passed to middleware when initialized. {limit:50kb} But when testing locally it doesn't make any difference.
We were thinking to switch with protobuf, also think about the way to parse JSON payloads asynchronously. Because JSON.parse which is used by middleware is synchronous process and stops non-blocking IO.
But before staring those changes and experiments, please if anyone had same kind of problem to suggest any possible solution.
Benchmarking:
For benchmarking on local/stage environments we use JMeter and generate loads to check when such timeouts may happen, but we are not able to catch this when testing with JMeter.
Thank You.

Express comes with an embedded bodyparser, and you can try it if you want. It should perform better since it's integrated.
const express = require('express');
const app = express();
app.use(express.urlencoded()); // support for GET
// No limited usage,
app.use(express.json()); // support for POST
// If you want add request limit,
app.use(express.json({ limit: "1mb" }));

Related

How does Koa help avoid "monkey patching" and how "Hapi" or "Express" don't do the same?

I have hard times understanding why people preach Koa as solving the "monkey patching" problem (whereas one needs to modify prepackaged code). (see https://www.quora.com/Should-I-learn-Express-js-or-Koa-js-for-node/answer/Yvan-Scher?share=1 or http://blog.onclickinnovations.com/koa-js/).
How is Koa special in that regards? How isn't Hapi or Express the same in that regards?

Having done Koa for 2 years, and some express.js recently, I ran into 1 big example of this.
Say you have a controller that emits a response, and you want to intercept that response and do something with it (e.g.: gzip it, or convert it to some other format).
This works easily natively with koa because you can just do something like this:
function myMw(ctx, next) {
await next();
ctx.response.body = gzip(ctx.response.body);
}
The above is a fictional example, but you get the idea.
With express your code for this looks like absolute garbage. Easy to see in the express gzip middleware:
https://github.com/expressjs/compression/blob/master/index.js
This has to do with the fact that express middlewares provide direct access to the HTTP socket for writing responses (with send()).
I'm suspecting this is where this sentiment comes from. Frankly I don't understand why people still use Express. Mostly habitual and the vast amounts of tutorials I reckon. Express was great, but it's painful today.

node.js body on http request object vs body on express request object

I'm trying to build an http module that suppose to work with an express server.
while reading the http module api, I see that it doesn't save the body inside the request object.
So my questions are:
If I want to build an express server which works with the official http module, how should I get the body?
I consider to implement the http module in the following way: listening to the socket, and if I get content-length header, listetning to the rest of the socket stream till I get all the body, save it as a memeber of the http request, and only then send the request object to the express server handler.
What are the pros and cons of my suggestion above vs letting the express server to "listen" to the body of the request via request.on('data',callback(data))
I mean , why shouldn't I keep the body inside the 'request' object the same way I keep the headers?

It's hard to answer your question without knowing exactly what you want to do. But I can give you some detail about how the request body is handled by Node/Express, and hopefully you can take things from there.
When handling a request (either directly via Node's request handler, or through Express's request handlers), the body of the request won't automatically be received: you have to open an HTTP stream to receive it.
The type of the body content should be determined by the Content-Type request header. The two most common body types are application/x-www-form-urlencoded and multipart/form-data. It's possible, however, to use any content type you want, which is usually more common for APIs (for example, using application/json is becoming more common for REST APIs).
application/x-www-form-urlencoded is pretty straightforward; name=value pairs are URL encoded (using JavaScript's built-in encodeURIComponent, for example), then combined with an ampersand (&). They're usually UTF-8 encoded, but that can also be specified in Content-Type.
multipart/form-data is more complicated, and can also typically be quite large, as vkurchatkin's answer points out (meaning you may not want to bring it into memory).
Express makes available some middleware to automatically handle the various types of body parsing. Usually, people simply use bodyParser, though you have to be careful with that middleware. It's really just a convenience middleware that combines json, urlencoded, and multipart. However, multipart has been deprecated. Express is still bundling Connect 2.12, which still includes multipart. When Express updates its dependency, though, the situation will change.
As I write this, bodyParser, json, urlencoded, and multipart have all been removed from Connect. Everything but multipart has been moved into the module body-parser (https://github.com/expressjs/body-parser). If you need multipart support, I recommend Busboy (https://npmjs.org/package/busboy), which is very robust. At some point, Express will update it's dependency on Connect, and will most likely add a dependency to body-parser since it has been removed from Connect.
So, since bodyParser bundles deprecated code (multipart), I recommend explicitly linking in only json and urlencoded (and you could even omit json if you're not accepting any JSON-encoded bodies):
app.use(express.json());
app.use(express.urlencoded());
If you're writing middleware, you probably don't want to automatically link in json and urlencoded (much less Busboy); that would break the modular nature of Express. However, you should specify in your documentation that your middleware requires the req.body object to be available (and fail gracefully if it isn't): you can go on to say that json, urlencoded, and Busboy all provide the req.body object, depending on what kind of content types you need to accept.
If you dig into the source code for urlencoded or json, you will find that they rely on another Node module, raw-body, which simply opens the request stream and retrieves the body content. If you really need to know the details of retrieving the body from a request, you will find everything you need in the source code for that module (https://github.com/stream-utils/raw-body/blob/master/index.js).
I know that's a lot of detail, but they're important details!

You can do that, it fairly simple. bodyParser middleware does that, for example (https://github.com/expressjs/body-parser/blob/master/index.js#L27). The thing is, request body can be really large (file upload, for example), so you generally don't want to put that in memory. Rather you can stream it to disk or s3 or whatnot.

Capture and log the api calls made to the server along with the passed data and the results

I have an api written in node.js that handles calls coming in from websites, desktop applications, iOS applications etc. There are probably 50+ endpoints and each end point can accept anywhere from 1 parameter to possibly 10-20 depending on what is intendeding to be accomplished. These can be GET/POST/PUT/DEL
I want to start load testing my API and simulating users activities.
What I am looking for is suggestions on how you can capture the API call and the parameters that were passed along with it in a logical way.
I use forever to run my app and everything is written to a log file so my initial reaction was to do something like add a piece of middleware to the express routes that would capture the endpoint as well as the req.params and req.body but then I need to put this middleware in all 50+ routes kind of tedious.
Anyone done something like this before and has a good idea on how to capture calls / data with those calls as well as possibly capturing what is returned from my API.
Perhaps some module?
I need to have this in a readable format to provide to other people so they can structure a fake set of calls... so raw log files aren't really helpful unless they are outputted.... "pretty".
Thanks!

You're on the right track – just add your logger middleware via app.use, which runs the middleware on every request (rather than adding it to each route).
In fact, the Express docs give an example of using logger middleware:
var express = require('express');
var app = express();
// simple logger
app.use(function(req, res, next){
console.log('%s %s', req.method, req.url);
next();
});
Connect (on which Express is built) provides logger middleware, so you can just do:
var logFile = fs.createWriteStream('./myLogFile.log', {flags: 'a'});
app.use(express.logger({stream: logFile}));

Node.js express correct use of bodyParser middleware

I am new to node.js and express and have been experimenting with them for a while. Now I am confused with the design of the express framework related to parsing the request body.
From the official guide of express:
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
app.use(logErrors);
app.use(clientErrorHandler);
app.use(errorHandler);
After setting up all the middleware, then we add the route that we want to handle:
app.post('/test', function(req, res){
//do something with req.body
});
The problem with this approach is that all request body will be parsed first before the route validity is checked. It seems very inefficient to parse the body of invalid requests. And even more, if we enable the upload processing:
app.use(express.bodyParser({uploadDir: '/temp_dir'}));
Any client can bombard the server by uploading any files (by sending request to ANY route/path!!), all which will be processed and kept in the '/temp_dir'. I can't believe that this default method is being widely promoted!
We can of course use the bodyParser function when defining the route:
app.post('/test1', bodyParser, routeHandler1);
app.post('/test2', bodyParser, routeHandler2);
or even perhaps parse the body in each function that handle the route. However, this is tedious to do.
Is there any better way to use express.bodyParser for all valid (defined) routes only, and to use the file upload handling capability only on selected routes, without having a lot of code repetitions?

Your second method is fine. Remember you can also pass arrays of middleware functions to app.post, app.get and friends. So you can define an array called uploadMiddleware with your things that handle POST bodies, uploads, etc, and use that.
app.post('/test1', uploadMiddleware, routeHandler1);
The examples are for beginners. Beginner code to help you get the damn thing working on day 1 and production code that is efficient and secure are often very different. You make a certainly valid point about not accepting uploads to arbitrary paths. As to parsing all request bodies being 'very inefficient', that depends on the ratio of invalid/attack POST requests to legitimate requests that are sent to your application. The average background radiation of attack probe requests is probably not enough to worry about until your site starts to get popular.
Also here's a blog post with further details of the security considerations of bodyParser.

Connect 1.8.0 - multipart support - formidable reporting progress - large files?

I've come across TJ Holowaychuk post on multipart support and how bodyParser now does what I used to to with formidable directly.I think it's quite handy, at the same time I am now confused on how to deal with:
- uploading large files?
Does connect-form handle everything that is needed to support uploading of files that are for example 100MB in size?
- report progress during upload?
I used to get the form.parse(..) event called before the change, now since it's all handled by bodyParser it's never called...
Holowaychuk says that "The downside to this is that if you wish to report upload progress, or access files and fields as the request is streamed, you will have to use formidable directly" (http://tjholowaychuk.com/). I tried to use it directly. The only way it worked for me was to put:
...
app.use(app.router);
before:
app.use(express.bodyParser());
So I thought that solved my problem until I wanted to use sessions which didn't work because router has to be put before bodyparser to make the upload work and:
app.use(express.cookieParser());
app.use(express.session({...})
has to go after:
app.use(express.bodyParser());
which throws off sessions....
So:
WHAT IS THE RIGHT WAY OF HANDLING/CONFIGURING FILE UPLOADS so that reporting progress works and sessions work all together, small and large files using this new way with connect-form?
I'm not super experienced with Node and express so if you do answer please keep that in mind if possible.
Thank you!

There's no one right answer, bodyParser() wont be for everyone. In the next version of connect we'll have multpart(), json() and urlencoded() of which bodyParser() uses all three, so if you dont want the req.files support you would just use the other two. If we can think of an elegant way to expose some of the events with multipart() / bodyParser() I'm down for that, until then you could add your own formidable middleware above bodyParser()

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string