Node.js/Express Netflix Issue - node.js

I'm a little confused by this issue Netflix ran into with Express. They started to see a buildup of latency in their APIs. We use Express for everything, and I'd like to avoid any sudden problems.
Here's a link to the article.
http://www.infoq.com/news/2014/12/expressjs-burned-netflix
The way it's written, it sounds like a problem with Express, and how it's handling routing. But in the end, they state the following:
"After dig into their source code the team found out the problem. It resided in a periodic function that was being executed 10 times per hour and whose main purpose was to refresh route handlers from an external source. When the team fixed the code so that the function would stop adding duplicate route handlers, the latency and CPU usage increases went away."
I don't understand what exactly they were trying to do. I don't believe this was something that Express was doing on its own. It sounds like they were doing something a bit oddball, and it didn't work out. I'd think load testing would have revealed this. Can anyone who understands this better comment on what the problem actually was? The entire section at the top of the article talks about how Express iterates through the routes list, but I really don't see how iterating over what should not be a very large array would cause that much of a delay.

The best counterpoint explanation of this I've seen is Eran Hammer's. The comments are also illuminating. Of particular interest are the following excerpts from Yunong Xiao's (the author of the Netflix post) comment:
The specific problem we encountered was not a global handler but the
express static file handler with a simple string path. We were adding
the same static router handler each time we refreshed our routes.
since this route handler was in the global routing array, it meant
that every request that was serviced by our app had to iterate though
this handler.
It was absolutely our mis-use of the Express API that caused this --
after all, we were leaking that specific handler! However, had Express
1) not stored static handlers with simple strings in the global
routing array, and 2) rejected duplicate routing handlers, or 3) not
taken 1ms of CPU time to merely iterate through this static handler,
then we would not have experienced such drastic performance problems.
Express would have masked the fact that we had this leak -- and
perhaps this would have bit us down the road in another subtle way.
Our application has over 100 GET routes (and growing), even using the
Express's Router feature -- which lets you compose arrays of handlers
for each path inside the global route array, we'd still have to
iterate through all 100 handlers for each request. Instead, we built
our own custom global route handler, which takes in the context of a
request (including its path) and returns a set of handlers specific to
the request such that we don't have to iterate through handlers we
don't need.
This was our implementation, which separated the global handlers that
every request needs from handlers specific to each request. I'm sure
more optimal solutions are out there.
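The leak described in the comment can be illustrated with a toy model. This is only a sketch: ToyApp, serveStatic, handlersFor and the route names are hypothetical, the class is not Express's real implementation, and the per-path table at the end is a guess at the general shape of a "handlers specific to the request" lookup, not Netflix's code.

```javascript
// Toy model of a global handler stack. NOT Express internals; names are hypothetical.
class ToyApp {
  constructor() {
    this.stack = []; // every request is checked against each entry, top to bottom
  }

  // Naive registration: always appends, even if the same handler is already there.
  use(path, handler) {
    this.stack.push({ path, handler });
  }

  // Idempotent registration: skip a (path, handler) pair that is already registered.
  useOnce(path, handler) {
    const exists = this.stack.some(
      (layer) => layer.path === path && layer.handler === handler
    );
    if (!exists) this.stack.push({ path, handler });
  }
}

const serveStatic = () => {}; // stand-in for a static file handler

// A refresh that runs 10 times per hour for 24 hours and re-adds the handler each time.
const leaky = new ToyApp();
for (let i = 0; i < 240; i++) leaky.use('/static', serveStatic);

// The same schedule, but duplicates are rejected.
const fixed = new ToyApp();
for (let i = 0; i < 240; i++) fixed.useOnce('/static', serveStatic);

console.log(leaky.stack.length); // 240 entries to walk on every request
console.log(fixed.stack.length); // 1

// Per-path lookup: return only the handlers a given path needs,
// instead of iterating over every registered handler.
const listUsers = () => {};
const showUser = () => {};
const routeTable = new Map([
  ['/users', [listUsers]],
  ['/users/me', [showUser]],
]);

function handlersFor(path) {
  return routeTable.get(path) || [];
}
```

In the leaky version the cost of each request grows with uptime rather than with the number of real routes, which matches the gradual latency build-up the question asks about.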

Related

Knot Resolver: Parallelism and concurrency in modules

Context
Dear Knot Resolver users, I have a module that hooks into Knot's finish phase,
static knot_layer_api_t _layer = {
.finish = &collect,
};
The purpose of the collect function, static int collect(knot_layer_t *ctx), is to ask an external oraculum via a REST API whether a particular domain is listed as hosting a malware or phishing campaign, and whether it should be resolved or sinkholed.
It works well as long as Knot Resolver is not targeted with hundreds of concurrent DNS requests.
When that happens, given the fact that the oraculum's API response time varies and could be as long as tens to hundreds of milliseconds on occasion,
clients start to temporarily perceive very long response times from Knot Resolver, far exceeding the hard timeout set on communication to oraculum's API.
Possible problem
I think that scaling with processes actually makes the module very inefficient, because queries are queued and processed by the module one by one (within a particular process). That means that if n queries almost hit the oraculum API's timeout limit t, the client that sent query n+1 to this particular kresd process will perceive a very long response time of roughly n*t in total.
Or would it? Am I completely off?
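The arithmetic behind this worry can be sketched in a few lines. JavaScript is used purely for illustration here; the function names and the timings (in milliseconds) are hypothetical, and the model assumes that each lookup blocks the process until it returns.

```javascript
// Serial model: each query waits for every query ahead of it in the same process.
function serialCompletionTimes(durations) {
  let clock = 0;
  return durations.map((d) => (clock += d));
}

// Overlapped model: all lookups start at time 0 and are waited on concurrently.
function overlappedCompletionTimes(durations) {
  return durations.map((d) => d);
}

const n = 5;   // queries that almost hit the timeout
const t = 200; // the timeout limit, in ms
const durations = [...Array(n).fill(t), 10]; // n slow lookups, then one fast one

console.log(serialCompletionTimes(durations));     // [ 200, 400, 600, 800, 1000, 1010 ]
console.log(overlappedCompletionTimes(durations)); // [ 200, 200, 200, 200, 200, 10 ]
```

Under the serial model the last client waits n*t plus its own 10 ms, even though its own lookup was fast.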
When I prototyped similar functionality in GoDNS using goroutines, GoDNS server (at the cost of hideous CPU usage) let numerous
DNS clients' queries talk to the oraculum and return to clients "concurrently".
Question
1) Is it O.K. to use Apache Portable Runtime threading or OpenMP threading and to start hiding the API's response time in the module? Isn't it a complete Knot Resolver antipattern?
2) I'm caching oraculum's API responses in a simple in-memory ephemeral LRU cache that resides in each kresd process. Would it be possible to use kresd's own MVCC cache instead for my arbitrary structure?
3) Is it possible that the problem is elsewhere, for instance, that Knot Resolver doesn't expect any blocking delay in the finish layer and thus some network queue is filled and subsequent DNS queries are rejected and/or intolerably delayed?
Thanks for pointers (pun intended)
A Knot Resolver developer here :-) (I also repeat some things answered by Jan already.)
Scaling-with-processes is able to work fine. Waiting for responses from name-servers is done by libuv (via event-loop and callbacks, all within a single thread).
Due to the single-threaded style, no layer function should be blocked (on I/O), as that would make everything block on it. AFAIK currently the only case when this can really happen is when (part of) the cache gets swapped-out.
There is the YIELD state (http://knot-resolver.readthedocs.io/en/latest/lib.html?highlight=yield). It's used when a sub-request is needed before processing of the layer can continue, but I currently don't know the details of how it works. I don't think it's directly applicable, as resuming the layers currently seems to be triggered only by a sub-request finishing.
Cache: if you put your module before the rrcache module and you change the RRset, the already-modified version is what gets cached.
Knot DNS developer here (not Resolver though). I think you are right. My understanding is that the layer code is executed synchronously in the daemon thread. The asynchrony appears only at the resolver network I/O level.
Internally the server runs libuv loop which just executes callbacks for events on primitives provided by libuv (sockets, timers, signals, etc.). The problem is that you just cannot suspend the running callback (C function) at an arbitrary point, escape back to libuv loop, and continue with the callback execution at some point later.
That said, asynchronous waiting for an event can happen only where this was expected. And the code driving layers doesn't expect that.
Answers:
1) I'm not very familiar with libapr or OpenMP. But I don't think this could really be solved without reworking the layer interface and making it asynchronous.
2) The shared cache could be used for sure. If you cannot find the API, the jolly Knot DNS folks will happily accept a patch or help you write one.
3) This is exactly the case. Knot Resolver doesn't expect blocking code in the layer finish callback.

Node.js optimizing module for best performance

I'm writing a crawler module which calls itself recursively to download more and more links, depending on a depth option passed to it.
Besides that, I'm doing more work on the resources I've downloaded (enriching or changing them depending on the configuration passed to the crawler). This process goes on recursively until it's done, which might take a lot of time (or not) depending on the configuration used.
I wish to optimize it to be as fast as possible and not to hinder any Node.js application that will use it. I've set up an Express server in which one of the routes launches the crawler for a user-defined (query string) host. After launching a few crawling sessions for different hosts, I've noticed that I can sometimes get really slow responses from other routes that only return simple text. The delay can be anywhere from a few milliseconds to something like 30 seconds, and it seems to happen at random times (well, nothing is random, but I can't pinpoint the cause). I've read a JetBrains article about CPU profiling using the V8 profiler functionality that is integrated with WebStorm, but unfortunately it only shows how to collect the information and how to view it; it doesn't give me any hints on how to find such problems, so I'm pretty much stuck here.
Could anyone help me with this matter and guide me? Any tips on what my crawler might be doing that could hinder the Express server (a lot of recursive calls?), or on how to find the hotspots I'm looking for and optimize them?
It's hard to say anything more specific on how to optimize code that is not shown, but I can give some advice that is relevant to the described situation.
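One thing that comes to mind is that you may be running some blocking code. Never use deep recursion without using setTimeout or setImmediate to break it up and give the event loop a chance to run once in a while. (In current Node.js versions process.nextTick does not help here, because its callbacks run before the event loop moves on to pending I/O.)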
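A minimal sketch of breaking such work into chunks follows. It assumes the crawl can be modelled as a tree of nodes with a children array; crawlInChunks, visit and chunkSize are hypothetical names. It uses setImmediate rather than process.nextTick, which is my choice rather than something the answer prescribes: in current Node.js versions nextTick callbacks run before pending I/O is serviced, so they do not let other requests through.

```javascript
// Walk a tree of nodes, yielding to the event loop after every `chunkSize` nodes.
function crawlInChunks(root, visit, chunkSize, done) {
  const pending = [root]; // explicit stack instead of deep recursion

  function runChunk() {
    let processed = 0;
    while (pending.length > 0 && processed < chunkSize) {
      const node = pending.pop();
      visit(node);
      for (const child of node.children || []) pending.push(child);
      processed++;
    }
    if (pending.length > 0) {
      setImmediate(runChunk); // let timers and I/O callbacks run before continuing
    } else {
      done();
    }
  }

  runChunk();
}

// Usage with a small hypothetical tree of seven nodes.
const tree = {
  url: '/',
  children: [
    { url: '/a', children: [{ url: '/a/1' }, { url: '/a/2' }] },
    { url: '/b', children: [{ url: '/b/1' }, { url: '/b/2' }] },
  ],
};
const visited = [];
crawlInChunks(tree, (node) => visited.push(node.url), 2, () => {
  console.log('visited', visited.length, 'nodes');
});
```

A smaller chunkSize keeps other routes more responsive at the cost of a slower crawl; the right value depends on how expensive each visit is.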

Performance/Latency differences - Dynamic vs Static Routes in Node.js

If I decide to use a DRY approach and set up my routing dynamically, where one route can handle multiple different tasks, can this cause latency issues?
This is my first Node.js project and I'm using it only as a backend to handle requests using a RESTful architecture, where some of the data the end user requests can be quite large.
Are there performance differences between dynamic and static routes in Node.js? I have around 10 different resources, each obtainable at its own specific route:
app.get('/resource1', ....
app.get('/resource2', ....
app.get('/resource3', ....
app.get('/resource4', ....
....
About half will pass params or some sort of query. I currently have it configured dynamically, with sorting logic followed by the request handling, like so:
app.get('/:resource* ', ....
[sorting logic for every case]
[handle request]
I'm assuming this will result in higher latency. What are the trade-offs and best practice in this case?
Most route-handling logic takes so little time to process that you would not be able to detect the difference. Regardless, it is best to write it in the way that is clearest, so use static routes where they make sense. Network latency will affect things much more than a small amount of processing to sort routes.
See the following, and the useful links from it, for information about performance and latency: https://gist.github.com/jboner/2841832
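If a dynamic route is still preferred, one way to keep it readable is a lookup table in place of per-case sorting logic. This is only a sketch: resourceHandlers, dispatch and the handler bodies are hypothetical, and the Express wiring is shown as a comment so that the block runs without Express installed.

```javascript
// Hypothetical handlers; each takes the parsed query and returns the response body.
const resourceHandlers = new Map([
  ['resource1', (query) => ({ name: 'resource1', limit: Number(query.limit) || 10 })],
  ['resource2', () => ({ name: 'resource2' })],
]);

// A single Map lookup replaces the sorting logic for every case.
function dispatch(resource, query = {}) {
  const handler = resourceHandlers.get(resource);
  if (!handler) {
    return { status: 404, body: { error: 'unknown resource' } };
  }
  return { status: 200, body: handler(query) };
}

// Possible wiring in an Express app (not executed here):
// app.get('/:resource', (req, res) => {
//   const result = dispatch(req.params.resource, req.query);
//   res.status(result.status).json(result.body);
// });

console.log(dispatch('resource1', { limit: '5' })); // { status: 200, body: { name: 'resource1', limit: 5 } }
console.log(dispatch('nope').status);               // 404
```

The lookup costs about the same regardless of how many resources exist, so the choice between this and static routes comes down to clarity rather than latency.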

Should I cache results of functions involving mass file I/O in a node.js server app?

I'm writing my first 'serious' Node/Express application, and I'm becoming concerned about the number of O(n) and O(n^2) operations I'm performing on every request. The application is a blog engine, which indexes and serves up articles stored in markdown format in the file system. The contents of the articles folder do not change frequently, as the app is scaled for a personal blog, but I would still like to be able to add a file to that folder whenever I want, and have the app include it without further intervention.
Operations I'm concerned about
When /index is requested, my route is iterating over all files in the directory and storing them as objects
When a "tag page" is requested (/tag/foo) I'm iterating over all the articles, and then iterating over their arrays of tags to determine which articles to present in an index format
Now, I know that this is probably premature optimisation as the performance is still satisfactory over <200 files, but definitely not lightning fast. And I also know that in production, measures like this wouldn't be considered necessary/worthwhile unless backed by significant benchmarking results. But as this is purely a learning exercise/demonstration of ability, and as I'm (perhaps excessively) concerned about learning optimal habits and patterns, I worry I'm committing some kind of sin here.
Measures I have considered
I get the impression that a database might be a more typical solution, rather than filesystem I/O. But this would mean monitoring the directory for changes and processing/adding new articles to the database, a whole separate operation/functionality. If I did this, would it make sense to be watching that folder for changes even when a request isn't coming in? Or would it be better to check the freshness of the database, then retrieve results from the database? I also don't know how much this helps ultimately, as database calls are still async/slower than internal state, aren't they? Or would a database query, e.g. articles where tags contain x be O(1) rather than O(n)? If so, that would clearly be ideal.
Also, I am beginning to learn about techniques/patterns for caching results, e.g. a property on the function containing the previous result, which could be checked for and served up without performing the operation. But I'd need to check if the folder had new files added to know if it was OK to serve up the cached version, right? But more fundamentally (and this is the essential newbie query at hand) is it considered OK to do this? Everyone talks about how node apps should be stateless, and this would amount to maintaining state, right? Once again, I'm still a fairly raw beginner, and so reading the source of mature apps isn't always as enlightening to me as I wish it was.
Also have I fundamentally misunderstood how routes work in node/express? If I store a variable in index.js, are all the variables/objects created by it destroyed when the route is done and the page is served? If so I apologise profusely for my ignorance, as that would negate basically everything discussed, and make maintaining an external database (or just continuing to redo the file I/O) the only solution.
First off, the request and response objects that are part of each request last only for the duration of a given request and are not shared by other requests. They will be garbage collected as soon as they are no longer in use.
But, module-scoped variables in any of your Express modules last for the duration of the server. So, you can load some information in one request, store it in a module-level variable and that information will still be there when the next request comes along.
Since multiple requests can be "in-flight" at the same time if you are using any async operations in your request handlers, then if you are sharing/updating information between requests you have to make sure you have atomic updates so that the data is shared safely. In node.js, this is much simpler than in a multi-threaded response handler web server, but there still can be issues if you're doing part of an update to a shared object, then doing some async operation, then doing the rest of an update to a shared object. When you do an async operation, another request could run and see the shared object.
When not doing an async operation, your Javascript code is single threaded so other requests won't interleave until you go async.
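The interleaving risk described above can be sketched as follows. The shared object, the pause helper and both update functions are hypothetical; pause stands in for any real async operation such as a file read.

```javascript
// Module-level state that every request handler can see.
let shared = { articles: [], tags: [] };

// Stand-in for real async I/O.
const pause = () => new Promise((resolve) => setImmediate(resolve));

// Risky: mutates the shared object, goes async, then finishes the update.
async function updateInTwoSteps(articles, tags) {
  shared.articles = articles;
  await pause(); // another request can run here and see new articles with old tags
  shared.tags = tags;
}

// Safer: build a complete new object, then swap it in with one synchronous assignment.
async function updateAtomically(articles, tags) {
  const next = { articles, tags: [] };
  await pause();
  next.tags = tags;
  shared = next; // readers see either the old state or the new one, never a mix
}
```

With the second form, a request that runs during the await keeps working with the old, internally consistent object.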
It sounds like you want to cache your parsed state into a simple in-memory Javascript structure and then intelligently update this cache of information when new articles are added.
Since you already have the code to parse your set of files and tags into in-memory Javascript variables, you can just keep that code. You will want to package that into a separate function that you can call at any time and it will return a newly updated state.
Then, you want to call it when your server starts and that will establish the initial state.
All your routes can be changed to operate on the cached state and this should speed them up tremendously.
Then, all you need is a scheme to decide when to update the cached state (e.g. when something in the file system changed). There are lots of options and which to use depends a little bit on how often things will change and how often the changes need to get reflected to the outside world. Here are some options:
1) You could register a file system watcher for a particular directory of your file system and when it triggers, you figure out what has changed and update your cache. You can make the update function as dumb (just start over and parse everything from scratch) or as smart (figure out what one item changed and update only that part of the cache) as it is worth doing. I'd suggest you start simple and only invest more in it when you're sure that effort is needed.
2) You could just manually rebuild the cache once every hour. Updates would take an average of 30 minutes to show, but this would take 10 seconds to implement.
3) You could create an admin function in your server to instruct the server to update its cache now. This might be combined with option 2, so that if you added new content, it would automatically show within an hour, but if you wanted it to show immediately, you could hit the admin page to tell it to update its cache.
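A minimal sketch of the cached-state approach, combining the watcher and the hourly rebuild, might look like this. Everything here is an assumption made for illustration: the function names, the cache shape, and the convention that an article's first line reads "tags: a, b". Synchronous fs calls are used for brevity and would briefly block the event loop on a large folder.

```javascript
const fs = require('fs');
const path = require('path');

// Module-level cache: lives as long as the server process does.
let cache = { articles: [], byTag: new Map() };

// "Dumb" rebuild: parse every .md file in `dir` from scratch.
function buildCache(dir) {
  const articles = fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.md'))
    .map((name) => {
      const text = fs.readFileSync(path.join(dir, name), 'utf8');
      const firstLine = text.split('\n')[0];
      // Hypothetical convention: the first line lists the tags.
      const tags = firstLine.startsWith('tags:')
        ? firstLine.slice(5).split(',').map((t) => t.trim()).filter(Boolean)
        : [];
      return { name, tags, text };
    });

  // Index articles by tag so a tag page is one Map lookup, not a scan.
  const byTag = new Map();
  for (const article of articles) {
    for (const tag of article.tags) {
      if (!byTag.has(tag)) byTag.set(tag, []);
      byTag.get(tag).push(article);
    }
  }
  return { articles, byTag };
}

// Replace the cache in one assignment; an admin route could call this directly.
function refresh(dir) {
  cache = buildCache(dir);
}

// Build once at startup, then rebuild on file system events and once an hour.
function startAutoRefresh(dir) {
  refresh(dir);
  const watcher = fs.watch(dir, () => refresh(dir));
  const timer = setInterval(() => refresh(dir), 60 * 60 * 1000);
  return () => {
    watcher.close();
    clearInterval(timer);
  };
}
```

Route handlers would then read cache.articles or cache.byTag.get(tag) instead of touching the file system on every request.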

node.js express custom format debug logging

A seemingly simple question, but I am unsure of the node.js equivalent to what I'm used to (say from Python, or LAMP), and I actually think there may not be one.
Problem statement: I want to use basic, simple logging in my express app. Maybe I want to output DEBUG messages, or INFO messages, or just some stats to the log for consumption by other back-end systems later.
1) I want all log messages, however, to contain some fields: remote-ip and request url, for example.
2) On the other hand, code that logs is everywhere in my app, including deep inside the call tree.
3) I don't want to pass (req,res) down into every node in the call tree (this just creates a lot of parameter passing where they are mostly not needed, and complicates my code, as I need to pass these into async callbacks and timeouts etc.)
In other systems, where there is a thread per request, I will store the (req,res) pair (where all the data I need is) in a thread-local-storage, and the logger will read this and format the message.
In node, there is only one thread. What is my alternative here? What's "the request context in which a specific piece of code is running"?
The only way I can think of achieving something like this is by looking at a trace, and using reflection to look at local variables up the call tree. I hate that, plus would need to implement this for all callbacks, setTimeouts, setIntervals, new Function()'s, eval's, ... and the list goes on.
What are other people doing?
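One option in current Node.js versions is AsyncLocalStorage from the built-in async_hooks module, which plays roughly the role that thread-local storage plays in thread-per-request systems. The sketch below is only an illustration: requestContext, withRequestContext, formatLog and the sample request values are hypothetical names, and the Express wiring is shown as a comment rather than executed.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

const requestContext = new AsyncLocalStorage();

// Logger that can be called from anywhere; it reads the context instead of taking (req, res).
function formatLog(level, message) {
  const ctx = requestContext.getStore() || {};
  return `[${level}] ip=${ctx.ip || '-'} url=${ctx.url || '-'} ${message}`;
}

// Run `fn` so that everything it calls, including async callbacks, sees this request's context.
function withRequestContext(req, fn) {
  return requestContext.run({ ip: req.ip, url: req.url }, fn);
}

// Possible Express middleware (not executed here):
// app.use((req, res, next) => withRequestContext(req, next));

// Code deep in the call tree needs no req/res parameters.
function deepInTheCallTree() {
  return formatLog('DEBUG', 'loading user');
}

const line = withRequestContext({ ip: '203.0.113.7', url: '/users/1' }, deepInTheCallTree);
console.log(line);                                     // [DEBUG] ip=203.0.113.7 url=/users/1 loading user
console.log(formatLog('INFO', 'outside any request')); // [INFO] ip=- url=- outside any request
```

The stored context follows the request through timers and promise callbacks started inside run, so it does not need to be passed down by hand.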
