Should each website be its own `node.js` process - node.js

We host about 150 websites (possibly scaling to 300+) that we are considering migrating to node.js. Most of the sites are fairly low traffic <1mil pageviews per month.
Should each website be it's own node.js process, or should we serve all websites using the same node.js process (or small set of load balanced processes). Is there a technical limit or a reasonable limit to the number of node processes per server?
Process per site: Feels inefficient, but I don't know if it actually is inefficient. Would ensure one buggy site doesn't affect other sites.
Process per core/small set of processes: Likely higher performance, but what happens when I need to update a sites codebase, won't it take down other sites? Also, code failures in one site would affect other sites.
Ideally, I would prefer one process per site so that we could host all sites from each worker server. That way when load increases we can just spin up another identical worker server and load balance between the two without having to arbitrarily say SiteA goes to ServerA and SiteB goes to ServerB. Any node.js gurus available to offer some wisdom?
All static file requests will be handled likely by Nginx or something like Varnish.

There are a lot of issues at play here. The big picture answer is, it depends... as it always does when you bring in the whole "performance" discussion. That being said, the simplest way to get a solid Node set up is to note the following basic facts about NodeJS, and I will also comment on their implications as they pertain to your questions.
The concurrency you get with Node works really good in certain situations, namely IO heavy operations. What we're really talking about here is minimizing the amount of downtime to wait for the next request. Because of this, Node works really well in an environment where there is one process per core on a machine. Node does really well at maximizing the amount of CPU available to serve requests under heavy load. This being said, if you have literally ZERO other work going on in your even loop, you can see minor performance increases (in terms of max requests/second/processor core) by having multiple node processes per core. But, I've never seen any benefit from increasing this number past 3. Even under circumstances where the entire event loop was literally just a file server.
On the process per site comment. This is a bad idea for many reasons. For one, a well put together node server can process thousands of requests per second. Our (company name omitted) servers, hosted through Amazon EC2 on medium clusters (lots of ram, mid CPU clock, 4 cores), typically fail around 3000 requests per second per cluster. Our servers do a fair bit of CPU work, for simple file servers I'm sure you can do much better. Strictly speaking, sure, per site, you will be able to serve more requests by launching each site in its own process/core/escalating quickly here! But it's not necessary from a cost and over complication of your architecture point of view. What I WOULD recommend, is investing in a setup with a lot of RAM. The ability for your server to cache often requested files will effect your performance infinitely more than launching an abundance of processes for a given machine.
On the whole RAM thing. The number of processes you want to launch for a given core is dependant on two things. One is how much synchronous work done in your event loop. The more synchronous work, the more time between a given request coming in and the event loop being ready to adress the next one. If you have a busy event loop, you will be in a situation where you require more processes/CPU Core. The other thing that can effect this, particularly relevant for file servers, is the amount of RAM. Node runs much better in a high ram environment, but you can say this about ANY file server really... What this has to do with, is the number of active asynchronous operations. One downside of the way node works, is under heavy loads, you can get a large number of event handlers active at once. This is great for concurrency/simplicity, however, if your server is busy waiting around for a lot of async disk/IO to happen it will slow down and crash much sooner than if you had plenty of RAM. If you don't have enough RAM to handle all of these event handlers, you will want to keep to the 1 process/core arrangement. Otherwise, it is easier for Node to spin up many event handlers simultaneously, and again cause you to crash sooner than you would otherwise.
I don't really have enough information to tell you what you SHOULD do. This depends entirely too much on the architecture of your specific server, sites, size of your sites, amount of data... etc. But these three pieces of knowledge are the basic things that help you get the most out of your Node server. To be honest, your idea about load balancing mixed with the considerations above, should do nicely for you. Surely, microoptimizations are possible, but if you do these things, you should easily see requests/second in the thousands before you start experiencing crashes because of DDOS type of conditions.

No, don't do it. Keep it simple! And check out http://12factor.net/.
A few hundred processes is nothing compared to the simplicity you otherwise lose. It would be a terrible decision, on so many levels, to have more than one site (or, "logical application unit") served by a single Node process.
If you're asking this question, you may want to explore Node more before you "migrate" to Node. Error handling and separation of concerns are more complicated in Node than in other situations. Specifically, neither the domain nor cluster APIs are mature. But really it's the philosophy of clean and simple application deployment that you'd be violating. I could go on and on.

Related

NodeJS Monitoring Website (Worker Threads?/Multi Process?)

I am doing small project of application that will monitor some servers.
It will base on telnet port check, ping, and also it will use libraries to connect directly to databases (MSSQL, Oracle, MySQL) to check their status.
I wonder what will be the best effective solution for this idea, currently with around 30 servers it works quite smooth, around 2.5sec to check status for all of them (running async). However I am worried that in the future with more servers it might get worse. Hence thinking about using some alternative like Worker Threads maybe? or some multi processing? Any ideas? Everything is happening in internal network so I do not expect huge latency.
Thank you in advance.
Have you ever tried the PM2 cluster mode:
https://pm2.keymetrics.io/docs/usage/cluster-mode/
The telnet stuff is TCP, which Node.js does very well using OS-level networking events. The connections to databases can vary. In the case of Oracle, you'll likely be using the node-oracledb. Those are SQL*Net connections that rely on the OCI libs and Node.js' thread pool. The thread pool defaults to four threads, but you can grow it up to 128 per Node.js process. See this doc for info:
https://oracle.github.io/node-oracledb/doc/api.html#-143-connections-threads-and-parallelism
Having said all that, other than increasing the size of the thread pool, I wouldn't recommend you make any changes. Why fight fires before they're burning? No need to over-engineer things. You're getting acceptable performance given the current number of servers you have.
How many servers do you plan to add in, say, 5 years? What's the difference in timing if you run the status checks for half of the servers vs all of them? Perhaps you could use that kind of data to make an educated guess as to where things would go.
As you add new ones, keep track of the total time to check the status. Is it slipping? If so, look into where the time is being spent and write the solution that will help.

I'm not sure how to correctly configure my server setup

This is kind of a multi-tiered question in which my end goal is to establish the best way to setup my server which will be hosting a website as well as a service (using Socket.io) for an iOS (and eventually an Android) app. Both the app service and the website are going to be written in node.js as I need high concurrency and scaling for the app server and I figured whilst I'm at it may as well do the website in node because it wouldn't be that much different in terms of performance than something different like Apache (from my understanding).
Also the website has a lower priority than the app service, the app service should receive significantly higher traffic than the website (but in the long run this may change). Money isn't my greatest priority here, but it is a limiting factor, I feel that having a service that has 99.9% uptime (as 100% uptime appears to be virtually impossible in the long run) is more important than saving money at the compromise of having more down time.
Firstly I understand that having one node process per cpu core is the best way to fully utilise a multi-core cpu. I now understand after researching that running more than one per core is inefficient due to the fact that the cpu has to do context switching between the multiple processes. How come then whenever I see code posted on how to use the in-built cluster module in node.js, the master worker creates a number of workers equal to the number of cores because that would mean you would have 9 processes on an 8 core machine (1 master process and 8 worker processes)? Is this because the master process usually is there just to restart worker processes if they crash or end and therefore does so little it doesnt matter that it shares a cpu core with another node process?
If this is the case then, I am planning to have the workers handle providing the app service and have the master worker handle the workers but also host a webpage which would provide statistical information on the server's state and all other relevant information (like number of clients connected, worker restart count, error logs etc). Is this a bad idea? Would it be better to have this webpage running on a separate worker and just leave the master worker to handle the workers?
So overall I wanted to have the following elements; a service to handle the request from the app (my main point of traffic), a website (fairly simple, a couple of pages and a registration form), an SQL database to store user information, a webpage (probably locally hosted on the server machine) which only I can access that hosts information about the server (users connected, worker restarts, server logs, other useful information etc) and apparently nginx would be a good idea where I'm handling multiple node processes accepting connection from the app. After doing research I've also found that it would probably be best to host on a VPS initially. I was thinking at first when the amount of traffic the app service would be receiving will most likely be fairly low, I could run all of those elements on one VPS. Or would it be best to have them running on seperate VPS's except for the website and the server status webpage which I could run on the same one? I guess this way if there is a hardware failure and something goes down, not everything does and I could run 2 instances of the app service on 2 different VPS's so if one goes down the other one is still functioning. Would this just be overkill? I doubt for a while I would need multiple app service instances to support the traffic load but it would help reduce the apparent down time for users.
Maybe this all depends on what I value more and have the time to do? A more complex server setup that costs more and maybe a little unnecessary but guarantees a consistent and reliable service, or a cheaper and simpler setup that may succumb to downtime due to coding errors and server hardware issues.
Also it's worth noting I've never had any real experience with production level servers so in some ways I've jumped in the deep end a little with this. I feel like I've come a long way in the past half a year and feel like I'm getting a fairly good grasp on what I need to do, I could just do with some advice from someone with experience that has an idea with what roadblocks I may come across along the way and whether I'm causing myself unnecessary problems with this kind of setup.
Any advice is greatly appreciated, thanks for taking the time to read my question.

What are the most important statistics to look at when deploying a Node.js web-application?

First - a little bit about my background: I have been programming for some time (10 years at this point) and am fairly competent when it comes to coding ideas up. I started working on web-application programming just over a year ago, and thankfully discovered nodeJS, which made web-app creation feel a lot more like traditional programming. Now, I have a node.js app that I've been developing for some time that is now running in production on the web. My main confusion stems from the fact that I am very new to the world of the web development, and don't really know what's important and what isn't when it comes to monitoring my application.
I am using a Joyent SmartMachine, and looking at the analytics options that they provide is a little overwhelming. There are so many different options and configurations, and I have no clue what purpose each analytic really serves. For the questions below, I'd appreciate any answer, whether it's specific to Joyent's Cloud Analytics or completely general.
QUESTION ONE
Right now, my main concern is to figure out how my application is utilizing the server that I have it running on. I want to know if my application has the right amount of resources allocated to it. Does the number of requests that it receives make the server it's on overkill, or does it warrant extra resources? What analytics are important to look at for a NodeJS app for that purpose? (using both MongoDB and Redis on separate servers if that makes a difference)
QUESTION TWO
What other statistics are generally really important to look at when managing a server that's in production? I'm used to programs that run once to do something specific (e.g. a raytracer that finishes running once it has computed an image), as opposed to web-apps which are continuously running and interacting with many clients. I'm sure there are many things that are obvious to long-time server administrators that aren't to newbies like me.
QUESTION THREE
What's important to look at when dealing with NodeJS specifically? What are statistics/analytics that become particularly critical when dealing with the single-threaded event loop of NodeJS versus more standard server systems?
I have other questions about how databases play into the equation, but I think this is enough for now...
We have been running node.js in production nearly an year starting from 0.4 and currenty 0.8 series. Web app is express 2 and 3 based with mongo, redis and memcached.
Few facts.
node can not handle large v8 heap, when it grows over 200mb you will start seeing increased cpu usage
node always seem to leak memory, or at least grow large heap size without actually using it. I suspect memory fragmentation, as v8 profiling or valgrind shows no leaks in js space nor resident heap. Early 0.8 was awful in this respect, rss could be 1GB with 50MB heap.
hanging requests are hard to track. We wrote our middleware to monitor these especially as our app is long poll based
My suggestions.
use multiple instances per machine, at least 1 per cpu. Balance with haproxy, nginx or such with session affinity
write midleware to report hanged connections, ie ones that code never responded or latency was over threshold
restart instances often, at least weekly
write poller that prints out memory stats with process module one per minute
Use supervisord and fabric for easy process management
Monitor cpu, reported memory stats and restart on threshold
Whichever the type of web app, NodeJS or otherwise, load testing will answer whether your application has the right amount of server resources. A good website I recently found for this is Load Impact.
The real question to answer is WHEN does the load time begin to increase as the number of concurrent users increase? A tipping point is reached when you get to a certain number of concurrent users, after which the server performance will start to degrade. So load test according to how many users you expect to reach your website in the near future.
How can you estimate the amount of users you expect?
Installing Google Analytics or another analytics package on your pages is a must! This way you will be able to see how many daily users are visiting your website, and what is the growth of your visits from month-to-month which can help in predicting future expected visits and therefore expected load on your server.
Even if I know the number of users, how can I estimate actual load?
The answer is in the F12 Development Tools available in all browsers. Open up your website in any browser and push F12 (or for Opera Ctrl+Shift+I), which should open up the browser's development tools. On Firefox make sure you have Firebug installed, on Chrome and Internet Explorer it should work out of the box. Go to the Net or Network tab and then refresh your page. This will show you the number of HTTP requests, bandwidth usage per page load!
So the formula to work out daily server load is simple:
Number of HTTP requests per page load X the average number of pages load per user per day X Expected number of concurrent users = Total HTTP Requests to Server per Day
And...
Number of MBs transferred per page load X the average number of pages load per user per day X Expected number of concurrent users = Total Bandwidth Required per Day
I've always found it easier to calculate these figures on a daily basis and then extrapolate it to weeks and months.
Node.js is single threaded so you should definitely start a process for every cpu your machine has. Cluster is by far the best way to do this and has the added benefit of being able to restart died workers and to detect unresponsive workers.
You also want to do load testing until your requests start timing out or exceed what you consider a reasonable response time. This will give you a good idea of the upper limit your server can handle. Blitz is one of the many options to have a look at.
I have never used Joyent's statistics, but NodeFly and their node-nodefly-gcinfo is a great tools to monitor node processes.

"Everything runs in parallel except your code".. wait what?

I am trying to learn Node.js and some of points that I understand:
Node.js does'nt create a seperate process for each request, instead it is just one process which processes all requests.
It is asynchronous which means you can attach a callback to a long-lasting process and continue your rest of the work without waiting for it to finish.
What I really don't understand is author's point in Understanding node.js - "Everything runs in parallel except your code". I have understood the analogy and the code that explains it but still I don't get it what is the distinction between "Everything" and "code". I have more often heard this about node.js.
Also, people pat node.js for its efficiency since memory overhead for one concurrent connection may be as low as 8KB but what about CPU load. Does node.js make it way less as compared to PHP+Apache?
Node.js uses a single thread any time it is running the JavaScript in your application. Tasks that are asynchronous (network, filesystem, etc.) are all handled on separate threads automatically for you. This means that you get much of the usefulness of a multithreaded application without having to worry about all of the trouble that comes with locking resources and what not.
Node is not a tool for every job. It is ideal for applications that are IO bound. For example, if your application required a ton of work to process templates and what not, Node probably isn't for you. If instead you're just shuffling data around, Node can be very effective.
The reason Node is often quoted as being faster than servers like Apache is that it doesn't create a thread and all of the resources with it to handling requests. In Apache, most of the time, that thread handling requests is waiting on network or filesystem data. While it does this, it is wasting resources. With Node, only one thread processes those requests (in your application). Again, this is great for some things, but if you have a lot of processing to do, Node would not be effective as it can really only handle a single request at a time in these situations.
This video does a pretty good job of explaining: http://www.youtube.com/watch?v=F6k8lTrAE2g&feature=youtube_gdata
Everything runs in parallel except your code.
It means if you do
while(true){}
anywhere in your code the entire node application will stop. While the code you write executes, nothing else does. Requests will not be handled, responses won't be returned, nothing. You have to be extremely careful to not hog the cpu in node.
but what about CPU load?
That completely depends on the nature of your application and the load. If your app is busy, it'll use more cpu.
Imagine a busy intersection with a traffic cop in the middle. When the cop is doing his job properly, hundreds of cars can pass through the intersection in a very fast and efficient way.
If the cop starts receiving and answering SMS messages on his cell while doing traffic, then things might go out of hand really quickly.
The traffic cop is your node.js app, and the time he spends doing SMS is what the author refers to as "your code".
In other words: node.js performance will shine the more you use it as a traffic cop. The more you start using it to do things other than pulling and pushing data (i.e.: sorting a list of numbers, rendering an html template, etc.), the more your capacity to accept and process new connections quickly will suffer.
"Everything" refers to everything else besides your code. For example, the stuff that handles HTTP. Another way to say the same thing is "your code doesn't wait for node.js to do stuff, like send data over TCP, because that's done asynchronously."
To answer your second question, I don't know which has less CPU load, I'm guessing they're similar. Node.js' touted advantage is the CPU is better utilized due to the aforementioned asynchronicity.

IIS Application Pool Optimisation

We are experiencing some performance issues with one of our websites and one of the things I am looking at is optimising the application pool. Are there any recommended methods of calculating maximum virtual/used memory to allow the pool?
The default settings of IIS are pretty well thought out. I'd first start by placing the application in its own application pool to make sure there aren't any other websites that might be causing the problem.
From there, I'd work at analyzing the code to make sure it is the resources the application has access to that are limiting the website. If the resources are what is limiting, it should be easy to figure out what needs priority in the application pool.
If your application is running massive memory software, it would be apparent by the spike and eventual ceiling hit of no more memory available, thus, you'd up your memory for that pool. Likewise, if the site just bogs down over time, you might check how long your sessions are staying alive and possibly shorten them. Just some things to start out with.
As far as hard and fast rules, I haven't come across any and usually it is so 'per-application' specific, that it is hard to define. I would just pay attention to performance counters and then turn one thing on, check them again, and then try another thing, making sure to keep all tests agnostic so you know what is working and what is not.

Resources