Which Node.js Concurrent Web Server is best on Heroku? - node.js

I have just learned about Heroku and was excited to test it out. I quickly assembled their demos with Node.js and stumbled across a problem. When running the application locally, ApacheBench reports roughly 3500 requests/s, but on the cloud that drops to 10 requests/s and does not rise or fall with network latency. I cannot believe that this is the performance they are asking 5 cents/hour for, and I strongly suspect that my application is not multi-threaded.
This is my code.js: http://pastebin.com/hyM47Ue7
What configuration do I need to apply in order to get it running faster on Heroku? Or what other web servers for Node.js could I use?
I am thankful for every answer on this topic.

Your little example is not multi-threaded. (Not even on your own machine.) But you don't need to pay for more dynos immediately, as you can make use of multiple cores on a dyno; see the answer to "Running Node.js App with cluster module is meaningless in Heroku?"
To repeat that answer: a node solution to using multiple processes that should increase your throughput is to use the (built-in) cluster module.
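As a rough illustration of the cluster approach, here is a minimal sketch. It assumes the platform may provide a WEB_CONCURRENCY environment variable and falls back to the core count otherwise; the function names and the port fallback are placeholders, not taken from the question's code.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Use WEB_CONCURRENCY when it holds a positive integer, otherwise the core count.
function workerCount(env, cpuCount) {
  const requested = parseInt(env.WEB_CONCURRENCY, 10);
  return Number.isInteger(requested) && requested > 0 ? requested : cpuCount;
}

function start() {
  // isPrimary is the newer name; older Node versions only expose isMaster.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    const count = workerCount(process.env, os.cpus().length);
    for (let i = 0; i < count; i += 1) cluster.fork();
    // Replace a worker that dies so capacity stays constant.
    cluster.on('exit', () => cluster.fork());
  } else {
    // Every worker listens on the same port; connections are shared between them.
    http
      .createServer((req, res) => res.end(`handled by pid ${process.pid}\n`))
      .listen(process.env.PORT || 3000);
  }
}

module.exports = { workerCount, start };
```

Calling start() at the bottom of the entry file is enough: the primary forks the workers, and each worker re-runs the same file and ends up in the server branch.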
I would guess that you can easily get more than 10 req/s from a Heroku dyno; see this benchmark, for example:
http://openhood.com/ruby/node/heroku/sinatra/mongo_mapper/unicorn/express/mongoose/cluster/2011/06/14/benchmark-ruby-versus-node-js/
What do you use to benchmark?

You're right, the web server is not multi-threaded; paying for more web dynos adds more instances of it rather than threads. I've found Heroku is handy for prototyping; depending on the monetary value of your time, you may or may not want to use it to set up a scalable server instead of using EC2 directly.

Related

NodeJS Monitoring Website (Worker Threads?/Multi Process?)

I am working on a small application that will monitor some servers.
It will be based on telnet port checks and ping, and it will also use libraries to connect directly to databases (MSSQL, Oracle, MySQL) to check their status.
I wonder what the most effective solution for this would be. Currently, with around 30 servers, it works quite smoothly: around 2.5 s to check the status of all of them (running async). However, I am worried that with more servers in the future it might get worse, so I am thinking about an alternative like Worker Threads, or maybe some kind of multi-processing. Any ideas? Everything happens on an internal network, so I do not expect huge latency.
Thank you in advance.
Have you tried PM2 cluster mode?
https://pm2.keymetrics.io/docs/usage/cluster-mode/
The telnet check is plain TCP, which Node.js handles very well using OS-level networking events. The connections to databases can vary. In the case of Oracle, you'll likely be using node-oracledb. Those are SQL*Net connections that rely on the OCI libraries and Node.js' thread pool. The thread pool defaults to four threads, but you can grow it through the UV_THREADPOOL_SIZE environment variable, up to 128 per Node.js process (newer libuv releases raise that cap to 1024). See this doc for info:
https://oracle.github.io/node-oracledb/doc/api.html#-143-connections-threads-and-parallelism
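To make the TCP part concrete, here is a minimal sketch of a port check built on Node's net module. The function names, the result shape and the 2-second default timeout are assumptions for illustration; the database checks are not covered.

```javascript
const net = require('net');

// Resolves to { host, port, up, ms }. It never rejects, so one dead host cannot abort a batch.
function checkPort(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const started = Date.now();
    const socket = net.connect({ host, port });
    const finish = (up) => {
      socket.destroy();
      resolve({ host, port, up, ms: Date.now() - started });
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.on('error', () => finish(false));
  });
}

// Check every target concurrently and report the total wall-clock time.
async function checkAll(targets) {
  const started = Date.now();
  const results = await Promise.all(targets.map((t) => checkPort(t.host, t.port)));
  return { results, totalMs: Date.now() - started };
}
```

Because all sockets are opened at once, the total time is roughly that of the slowest host rather than the sum over all hosts, which is why 30 servers can finish in a couple of seconds on a single thread.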
Having said all that, other than increasing the size of the thread pool, I wouldn't recommend you make any changes. Why fight fires before they're burning? No need to over-engineer things. You're getting acceptable performance given the current number of servers you have.
How many servers do you plan to add in, say, 5 years? What's the difference in timing if you run the status checks for half of the servers vs all of them? Perhaps you could use that kind of data to make an educated guess as to where things would go.
As you add new ones, keep track of the total time to check the status. Is it slipping? If so, look into where the time is being spent and write the solution that will help.
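The "educated guess" can be as simple as a linear extrapolation from two measurements. The sketch below assumes the total time grows roughly linearly with the number of servers, which may not hold in practice. The 30-server figure is the one from the question; the 15-server figure is a made-up placeholder to be replaced by a real measurement.

```javascript
// Linear extrapolation from two (count, ms) measurements to a target server count.
function projectCheckTime(small, large, targetCount) {
  const msPerServer = (large.ms - small.ms) / (large.count - small.count);
  return large.ms + (targetCount - large.count) * msPerServer;
}

// 30 servers in about 2500 ms is from the question; 15 servers in 1400 ms is hypothetical.
const projected = projectCheckTime({ count: 15, ms: 1400 }, { count: 30, ms: 2500 }, 120);
console.log(`projected total for 120 servers: ${Math.round(projected)} ms`);
```

With these placeholder numbers the projection for 120 servers comes out at roughly 9.1 seconds. If the measured times later drift above the projection, that is the signal to look at where the time is going.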

Node cluster versus Dynos scaling

After reading a handful of articles on scaling Node apps, I have not yet made up my mind about when I should use Node's built-in cluster module and when I should simply add more dynos.
Let me tell you I have already read the following threads on StackOverflow:
How to properly scale nodejs app on heroku using clusters
Running Node.js App with cluster module is meaningless in Heroku?
As far as I understood it, if I make use of node cluster functionality I will end up with the total memory available divided by the number of forked processes.
On the other hand, if I add one more dyno I will double the memory available.
So, what is the point of using node clusters?
It's not really an either-or situation. You can make use of multiple node cluster instances on multiple dynos. Memory isn't really what you want to look at, though, since that would be a shared resource. CPU / core usage is more relevant to clustering in node, since each node process can only make use of one CPU core at a time.
It's really going to depend on which dynos you are using, too.
Have you seen these suggestions on the official heroku docs yet?
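One way to reason about the trade-off is to size the number of cluster workers per dyno by both constraints: the memory each worker needs and the cores available. The helper below is a hypothetical sketch; its name, parameters and example figures are assumptions and do not describe any particular dyno type.

```javascript
// How many workers fit on one dyno, given a per-worker memory budget and a core count.
function workersForDyno(dynoMemoryMb, perWorkerMb, cores) {
  const byMemory = Math.floor(dynoMemoryMb / perWorkerMb);
  // Never exceed the core count, and always run at least one worker.
  return Math.max(1, Math.min(cores, byMemory));
}

console.log(workersForDyno(1024, 256, 8)); // limited by memory
console.log(workersForDyno(2560, 512, 2)); // limited by cores
```

When memory is the limit, adding a dyno helps more than adding workers; when cores are the limit and memory is left over, clustering inside the dyno uses capacity that would otherwise sit idle.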

I'm not sure how to correctly configure my server setup

This is kind of a multi-tiered question in which my end goal is to establish the best way to setup my server which will be hosting a website as well as a service (using Socket.io) for an iOS (and eventually an Android) app. Both the app service and the website are going to be written in node.js as I need high concurrency and scaling for the app server and I figured whilst I'm at it may as well do the website in node because it wouldn't be that much different in terms of performance than something different like Apache (from my understanding).
Also the website has a lower priority than the app service, the app service should receive significantly higher traffic than the website (but in the long run this may change). Money isn't my greatest priority here, but it is a limiting factor, I feel that having a service that has 99.9% uptime (as 100% uptime appears to be virtually impossible in the long run) is more important than saving money at the compromise of having more down time.
Firstly, I understand that having one Node process per CPU core is the best way to fully utilise a multi-core CPU. After some research I now understand that running more than one per core is inefficient because the CPU has to context-switch between the processes. Why, then, whenever I see code posted on how to use Node's built-in cluster module, does the master process create a number of workers equal to the number of cores? That would mean 9 processes on an 8-core machine (1 master process and 8 worker processes). Is this because the master process is usually there just to restart worker processes if they crash or exit, and therefore does so little that it doesn't matter that it shares a CPU core with another Node process?
If this is the case then, I am planning to have the workers handle providing the app service and have the master worker handle the workers but also host a webpage which would provide statistical information on the server's state and all other relevant information (like number of clients connected, worker restart count, error logs etc). Is this a bad idea? Would it be better to have this webpage running on a separate worker and just leave the master worker to handle the workers?
So overall I want the following elements: a service to handle requests from the app (my main point of traffic); a website (fairly simple, a couple of pages and a registration form); an SQL database to store user information; a webpage (probably hosted locally on the server machine) which only I can access and which shows information about the server (users connected, worker restarts, server logs and so on); and apparently nginx would be a good idea where multiple Node processes accept connections from the app.
After doing research I've also found that it would probably be best to host on a VPS initially. At first, when the traffic to the app service will most likely be fairly low, I was thinking I could run all of those elements on one VPS. Or would it be best to run them on separate VPSs, except for the website and the server status webpage, which could share one? That way, if there is a hardware failure, not everything goes down, and I could run 2 instances of the app service on 2 different VPSs so that if one goes down the other is still functioning. Would this just be overkill? I doubt I would need multiple app service instances to support the traffic load for a while, but it would help reduce the apparent downtime for users.
Maybe this all depends on what I value more and have the time to do? A more complex server setup that costs more and maybe a little unnecessary but guarantees a consistent and reliable service, or a cheaper and simpler setup that may succumb to downtime due to coding errors and server hardware issues.
Also it's worth noting I've never had any real experience with production level servers so in some ways I've jumped in the deep end a little with this. I feel like I've come a long way in the past half a year and feel like I'm getting a fairly good grasp on what I need to do, I could just do with some advice from someone with experience that has an idea with what roadblocks I may come across along the way and whether I'm causing myself unnecessary problems with this kind of setup.
Any advice is greatly appreciated, thanks for taking the time to read my question.

Load Balancing in Nodejs

I recently started with Node and have been reading a lot about its limitation of being single-threaded and not utilising all your cores. Then I read this:
http://bit.ly/1n2YW68 (which talks about the new cluster module of Node.js for load balancing)
I'm not sure I completely agree with it :) because, before starting with Node, my first thought on how to make it utilise the cores with proper load balancing was to do it at the web server, using something like the upstream module in nginx,
for example:
upstream domain1 {
    server nodeapp1;
    server nodeapp2;
    server nodeapp3;
}
So my question is: is there a significant advantage to using the cluster module to utilise the cores, compared with load balancing at the web server? Or is the blog post too far from real use?
Note: I'm not concerned with load balancing handled by app servers like Passenger (Passenger has Node.js support as well, but that is not the answer I'm looking for :)), which I already know since I'm mostly a Ruby programmer.
One other option for clustering Node.js applications is to deploy the app using PM2.
Clustering is as easy as this; you don't need to implement it by hand:
pm2 start app.js -i max
PM2 auto-detects the number of available CPUs and runs as many processes as possible.
Read about PM2 cluster mode here:
http://pm2.keymetrics.io/docs/usage/cluster-mode/
For controlling the load of IO operations, I wrote a library called QueueP using the memoization concept. You can even customize the memoization logic and gain speedup values of more than 10, sometimes
https://www.npmjs.com/package/queuep
As far as I know, the built in node cluster is not a good solution yet (load is not evenly distributed across cores). Until v0.12: http://strongloop.com/strongblog/whats-new-in-node-js-v0-12-cluster-round-robin-load-balancing/
So you should use nginx until then. After that we will see some benchmarks comparing both options and see if the built in cluster module is a good choice.

Scaling Node.JS across multiple cores / servers

OK, so I have an idea I want to pursue, but before I do I need to understand a few things fully.
Firstly, the way I think I'm going to go ahead with this system is to have 3 servers, which are described below:
The First Server will be my web Front End, this is the server that will be listening for connection and responding to clients, this server will have 8 cores and 16GB Ram.
The Second Server will be the Database Server, pretty self explanatory really, connect to the host and set / get data.
The Third Server will be my storage server, this will be where downloadable files are stored.
My first question is:
On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?
My second question is:
Is there a system out there I can drop into my application framework that will allow me to talk to the other cores and pass messages around to save I/O?
and final question:
Is there any system I can use to help move the content from my storage server to the request on the front-end server with as little overhead as possible, speed is a concern here as we would have 500+ clients downloading and uploading concurrently at peak times.
I have finally convinced my employer that node.js is extremely fast and its the latest in programming technology, and we should invest in a platform for our Intranet system, but he has requested detailed documentation on how this could be scaled across the current hardware we have available.
"On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?"
Try to look at node.js cluster module which is a multi-core server manager.
Firstly, I wouldn't describe the setup you propose as 'scaling'; it's more like 'spreading'. You only have one app server serving the requests. If you add more app servers in the future, you will have a scaling problem then.
I understand that node.js is single-threaded, which implies that it can only use a single core. Not my area of expertise on how to/if you can scale it, will leave that part to someone else.
I would suggest NFS mounting a directory on the storage server to the app server. NFS has relatively low overhead. Then you can access the files as if they were local.
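With the storage server mounted locally, the front end can stream files straight from the mount. The sketch below assumes a mount point of /mnt/storage, which is a placeholder, and the function names are hypothetical. It streams rather than buffers and rejects paths that try to escape the mount.

```javascript
const fs = require('fs');
const http = require('http');
const path = require('path');

// Assumed mount point of the storage server's export on the front-end machine.
const STORAGE_ROOT = '/mnt/storage';

// Map a requested path onto the mount, returning null for anything that escapes it.
function resolveDownloadPath(root, requested) {
  const base = path.resolve(root);
  const full = path.resolve(base, '.' + path.sep + requested);
  return full.startsWith(base + path.sep) ? full : null;
}

function createDownloadServer(root = STORAGE_ROOT) {
  return http.createServer((req, res) => {
    let file = null;
    try {
      const urlPath = decodeURIComponent(new URL(req.url, 'http://localhost').pathname);
      file = resolveDownloadPath(root, urlPath);
    } catch (err) {
      file = null; // malformed URL or escape sequence
    }
    if (!file) {
      res.writeHead(400);
      res.end('bad path\n');
      return;
    }
    // Stream instead of buffering, so concurrent downloads do not hold whole files in memory.
    fs.createReadStream(file)
      .on('error', () => {
        if (!res.headersSent) res.writeHead(404);
        res.end('not found\n');
      })
      .pipe(res);
  });
}
```

A server created with createDownloadServer().listen(port) then serves files by URL path. Uploads and access control are left out of this sketch.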
Concerning your first question: use cluster (we already use it in a production system, works like a charm).
When it comes to worker messaging, I cannot really help you out, but your best bet is cluster too. Maybe there will be some functionality that provides "inter-core" messaging across all cluster workers in the future (I don't know the roadmap of cluster, but it seems like an idea).
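For what it's worth, cluster already exposes a message channel between the primary process and each worker (worker.send() in the primary, process.send() in a worker), so worker-to-worker messaging can be built by relaying through the primary. The sketch below shows one way to do that; the relay and startRelay names and the message shape are made up for illustration.

```javascript
const cluster = require('cluster');

// Send msg to every worker except the sender. `workers` maps id -> worker, like cluster.workers.
function relay(workers, fromId, msg) {
  let delivered = 0;
  for (const [id, worker] of Object.entries(workers)) {
    if (Number(id) !== fromId) {
      worker.send(msg);
      delivered += 1;
    }
  }
  return delivered;
}

function startRelay(workerCount) {
  // isPrimary is the newer name; older Node versions only expose isMaster.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i += 1) {
      const worker = cluster.fork();
      // A message from one worker is fanned out to its siblings via the primary.
      worker.on('message', (msg) => relay(cluster.workers, worker.id, msg));
    }
  } else {
    process.on('message', (msg) => console.log(`worker ${cluster.worker.id} got`, msg));
    process.send({ from: cluster.worker.id, text: 'hello' });
  }
}
```

This only works between processes on the same machine. Passing messages between separate servers would need something external, such as a message queue or a shared datastore.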
For your third requirement, I'd use a low-overhead protocol like NFS or (if you can go really crazy when it comes to infrastructure) a high-speed SAN backend.
One more piece of advice: use MongoDB as your database backend. You can start with low-end hardware and scale up your database instance with ease using MongoDB's sharding and replica set features (if that is some kind of requirement).
