Node: Scale socket.io / nowjs - scale across different instances - node.js

Before starting to write my application I need to know what to do when a single node.js instance (express and (socket.io or nowjs)) isn't enough anymore.
You might tell me now that I shouldn't care about scale until it's time to, but I don't want to develop an application and then run into trouble because socket.io or nowjs can't easily be scaled across multiple instances.
I recently read that socket.io now supports a way to scale using Redis (which I also have no experience with). NowJS is built on top of socket.io - does it work the same way? On nowjs.org you can read that a "distributed version of NowJS" is under development and is going to cost money.

If you need to scale node, the first place people usually start is putting a load balancer in front of multiple node instances. The standard for this today is nginx, though I would like to check out the node balancer 'bouncy' that came out recently. Here's an example of someone using nginx as a reverse proxy to manage multiple node instances:
Node.js + Nginx - What now?
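To give a rough idea of what "a load balancer in front of multiple node instances" means, here is a minimal round-robin sketch that uses only Node's built-in http module. It is not how nginx or bouncy work internally, and the backend addresses and ports are made up for illustration:

```javascript
// Minimal round-robin reverse proxy using only Node's built-in http module.
// The backend addresses and port numbers below are hypothetical.
const http = require('http');

// Returns a function that cycles through the backends in order.
function createRoundRobin(backends) {
  let next = 0;
  return function pick() {
    const backend = backends[next];
    next = (next + 1) % backends.length;
    return backend;
  };
}

// Builds (but does not start) a proxy server that forwards each request
// to the backend chosen by pick().
function createBalancer(backends) {
  const pick = createRoundRobin(backends);
  return http.createServer((clientReq, clientRes) => {
    const { host, port } = pick();
    const proxyReq = http.request(
      { host, port, method: clientReq.method, path: clientReq.url, headers: clientReq.headers },
      (proxyRes) => {
        clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
        proxyRes.pipe(clientRes);
      }
    );
    // Report an unreachable backend as 502 Bad Gateway.
    proxyReq.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(proxyReq);
  });
}

const backends = [
  { host: '127.0.0.1', port: 3001 },
  { host: '127.0.0.1', port: 3002 },
];
// To run it: createBalancer(backends).listen(8080);
```

Note that this sketch only forwards plain HTTP requests; it does nothing about WebSocket upgrades or keeping a client on the same backend, which is where socket.io gets tricky.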
The second thing you mention is socket.io/nowjs. Depending on how you're using these frameworks, you could get into a situation where you want to share context between clients who are hitting multiple node.js instances. If this is the case, I would recommend using a persistent store, like redis, to bridge the gap between your node instances. Here's an example:
How to reuse redis connection in socket.io?
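As a rough illustration of "bridging the gap" between instances, the sketch below has two instances relay broadcasts through a shared channel. An in-process EventEmitter stands in for Redis pub/sub here, and the Instance class and its method names are hypothetical, not part of socket.io or nowjs:

```javascript
// Sketch: two Node "instances" relay broadcasts through a shared channel.
// An in-process EventEmitter stands in for Redis pub/sub here; in a real
// deployment each instance would be a separate process talking to Redis.
const { EventEmitter } = require('events');

class Instance {
  constructor(name, bus) {
    this.name = name;
    this.bus = bus;
    this.clients = new Map(); // clientId -> array of received messages
    // Deliver every message published on the bus to local clients.
    bus.on('broadcast', (message) => {
      for (const inbox of this.clients.values()) inbox.push(message);
    });
  }

  connect(clientId) {
    this.clients.set(clientId, []);
  }

  // Publish to the shared bus instead of looping over local clients only.
  broadcast(text) {
    this.bus.emit('broadcast', { from: this.name, text });
  }

  inbox(clientId) {
    return this.clients.get(clientId);
  }
}

const bus = new EventEmitter();
const a = new Instance('node-a', bus);
const b = new Instance('node-b', bus);
a.connect('alice');
b.connect('bob');
a.broadcast('hello');
console.log(b.inbox('bob')); // bob is connected to node-b but still gets the message
```

The point is only that a broadcast goes to the shared channel first, so clients connected to a different instance are not left out.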
Hopefully this is enough information and reading to get you started. Let me know if you have any questions.
Happy coding!

Another useful link on 'Scaling Socket.IO' https://github.com/dshaw/talks/tree/master/2011-10-jsclub (slides and sample application)

Just as a side note on using nginx as a reverse proxy with socket.io: the way I understand it at least, nginx 1.0.x, which is the stable version, does not support proxying of HTTP/1.1 connections (which is needed in order to make socket.io work with WebSockets). There is a workaround described here: http://www.letseehere.com/reverse-proxy-web-sockets to make it work, or you can use something like https://github.com/nodejitsu/node-http-proxy instead; the guys at nodejitsu say this should support it.
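To show why the proxy has to understand HTTP/1.1 upgrades, here is a minimal sketch of tunnelling a WebSocket handshake with Node's built-in http and net modules. This is not how node-http-proxy is implemented; the function names and the backend address are assumptions made for illustration:

```javascript
// Sketch: forwarding WebSocket upgrade requests with Node's http and net modules.
const http = require('http');
const net = require('net');

// A WebSocket handshake is an HTTP/1.1 GET carrying "Upgrade: websocket"
// and a Connection header that lists "upgrade".
function isWebSocketUpgrade(req) {
  const upgrade = String(req.headers.upgrade || '').toLowerCase();
  const connection = String(req.headers.connection || '').toLowerCase();
  return (
    req.method === 'GET' &&
    upgrade === 'websocket' &&
    connection.split(',').some((token) => token.trim() === 'upgrade')
  );
}

// Rebuilds the raw handshake so it can be replayed to the backend.
function serializeHandshake(req) {
  const lines = [`${req.method} ${req.url} HTTP/1.1`];
  for (let i = 0; i < req.rawHeaders.length; i += 2) {
    lines.push(`${req.rawHeaders[i]}: ${req.rawHeaders[i + 1]}`);
  }
  return lines.join('\r\n') + '\r\n\r\n';
}

// Attaches an 'upgrade' handler that tunnels the socket to one backend.
function proxyUpgrades(server, backend) {
  server.on('upgrade', (req, clientSocket, head) => {
    if (!isWebSocketUpgrade(req)) return clientSocket.destroy();
    const upstream = net.connect(backend.port, backend.host, () => {
      upstream.write(serializeHandshake(req));
      if (head && head.length) upstream.write(head);
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });
    upstream.on('error', () => clientSocket.destroy());
    clientSocket.on('error', () => upstream.destroy());
  });
}

// Hypothetical backend address; call server.listen(8080) to actually run it.
const server = http.createServer();
proxyUpgrades(server, { host: '127.0.0.1', port: 3001 });
```

A proxy that only speaks HTTP/1.0 to its backends never gets this far, because the Upgrade handshake is an HTTP/1.1 mechanism.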

Related

Isn't Nginx load balancing like a proxy server?

Mainly, is there any difference between using nginx as a load balancer for a bunch of upstream servers and using a small Node.js proxy server that sits between a bunch of servers and one public host?
It may look obvious to you, but nginx is very new to me and I barely know anything about it.
Also, I guess my question is: is there any performance advantage to using nginx as a proxy server that distributes load, versus running your own Node.js code that acts as a proxy for the requests?
If the concern is introducing one more technology, I'd say keep the custom Node.js proxy as a short-term solution.
As a long-term solution, Nginx as a reverse proxy in front of an array of backends makes a lot of sense for a number of technical and maintenance reasons. An application rarely stays the same: you add new features, replace legacy code and deploy new services, so the way to go is to use the right tool for the right task. Nginx is proven and has been chosen by many heavily loaded applications on the web. Its memory consumption and CPU utilisation are low and stable.
Most people use Nginx as a reverse proxy rather than anything else (that is the biggest reason to use Nginx, by the way) because of how powerful and feature-rich it is.
From the request-response life cycle point of view, Nginx rotates between backends and sends the request to another one if a given backend is dead, so a request is not lost just because one backend is down.
From the maintenance point of view, dynamic upstreams with a REST interface (part of the commercial version) look good enough. Even with the open source version it is easy to roll out an upstream update followed by a graceful reload (HUP signal). Nginx also supports zero-downtime binary upgrades (USR2 + QUIT).
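The rotate-and-retry behaviour described above can be illustrated with a small sketch. This is only the idea, not Nginx's actual algorithm; the function name, the backend addresses and the health check are all hypothetical:

```javascript
// Sketch of the rotate-and-retry idea: walk the backend list starting at
// `start` and return the first backend the health check reports as alive.
function pickLiveBackend(backends, isAlive, start = 0) {
  for (let i = 0; i < backends.length; i++) {
    const candidate = backends[(start + i) % backends.length];
    if (isAlive(candidate)) return candidate;
  }
  return null; // every backend is down
}

// Hypothetical backends; `down` simulates a failed health check.
const pool = ['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000'];
const down = new Set(['10.0.0.1:3000']);
const chosen = pickLiveBackend(pool, (b) => !down.has(b));
console.log(chosen); // 10.0.0.2:3000
```

If you write your own Node.js proxy, this kind of logic (plus health checks, timeouts and reloads) is what you end up maintaining yourself.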

Is NodeJS supposed to be standalone (i.e. without Apache or nginx)?

OK, so I finally decided I was way behind on some of the frameworks/platforms that are out there, like AngularJS, NodeJS, Knockout, Backbone, etc. So I decided to learn NodeJS first, and have set it up on a local Ubuntu Server VM.
I was wondering if NodeJS is supposed to be paired with other server software like Apache, nginx, etc., letting Apache/nginx serve up the basic pages and letting node handle the data communications, since their site says it's "for easily building fast, scalable network applications".
I have seen several questions on S.O. asking how to get NodeJS to run on port 80, which implies the askers either want to run node as a regular server or just don't want to specify a port on every request. I have not seen anyone comment that node was not meant to be used like a regular server, so I was hoping to get an answer on this.
Node.js can be used standalone, and there are good frameworks for doing so, such as express. You can easily run a cluster of processes on the same physical machine (and on the same port) via the native cluster module. Also, I'm sure you can use Node.js as a reverse proxy too, but some developers prefer other tools for that (at my company, we use Nginx with some of our node.js apps).
So, in short: you don't need Nginx or Apache at all, but you can use them if you want. Some people find it convenient to use Nginx for load balancing, or for other things like handling HTTPS or serving static content. It's your choice in the end.
You should play with the native http or https library first, and then check out express or another framework. You will see which parts of Node.js you love and which you don't.
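For reference, here is a minimal sketch of the native cluster module mentioned above, with several worker processes sharing one port. It assumes Node 16 or later (older versions use cluster.isMaster instead of cluster.isPrimary); the function names and port 3000 are arbitrary choices for illustration:

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Use the requested worker count, or one worker per CPU core by default.
function chooseWorkerCount(requested, cpuCount = os.cpus().length) {
  return Number.isInteger(requested) && requested > 0 ? requested : cpuCount;
}

// Call this at the top level of the script: the primary process forks the
// workers, and every worker re-runs the script and starts an HTTP server.
function startCluster(port, requestedWorkers) {
  if (cluster.isPrimary) {
    const count = chooseWorkerCount(requestedWorkers);
    for (let i = 0; i < count; i++) cluster.fork();
    // Replace a worker that exits so capacity stays constant.
    cluster.on('exit', () => cluster.fork());
  } else {
    http
      .createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      })
      .listen(port); // all workers share the same port
  }
}

// Usage (port 3000 is an arbitrary choice): startCluster(3000);
```

This only spreads load across the cores of one machine; going beyond a single machine is where a separate load balancer such as Nginx comes in.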

Scaling Node.js App - Which provider?

I have been working with a couple of Node.js frameworks to create applications, which I typically deploy on Heroku. Recently I came across this disclaimer on Derby's documentation page that states:
Note that while Derby supports multiple servers, it currently requires
that clients repeatedly connect to the same server. Heroku does not
support sticky sessions or WebSockets, so it isn’t possible to use
more than one dyno. You’ll have to use a different hosting option to
scale your app.
This is obviously concerning for scalability. Because of the statement above, I understand it is not a Node limitation, but a Heroku limitation.
First, is this accurate? That is - I cannot scale Node apps on Heroku?
If that is the truth, where should I turn? AWS?
Thanks.
Derby is not a typical Node.js app. Derby (and Meteor) are essentially complete frameworks built on top of Node. WebSockets are not yet supported on Heroku: https://devcenter.heroku.com/articles/using-socket-io-with-node-js-on-heroku
However, the typical alternative to sticky sessions for regular Node apps is to use a datastore like Redis (or Postgres or Mongo) to store session data. This is a far more robust approach than sticky sessions, as it's resilient to the failure of any particular device.
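A minimal sketch of the idea, assuming nothing about any particular session library: session data lives in a shared store, so it does not matter which server a request lands on. A plain Map stands in for Redis/Postgres/Mongo, and the class and function names are made up for illustration:

```javascript
// Sketch: session data lives in a shared store, so any server can handle
// any request. A Map stands in for Redis/Postgres/Mongo here.
const crypto = require('crypto');

class SharedSessionStore {
  constructor() {
    this.sessions = new Map();
  }
  create(data) {
    const id = crypto.randomBytes(16).toString('hex');
    this.sessions.set(id, { ...data });
    return id;
  }
  get(id) {
    return this.sessions.get(id) || null;
  }
  update(id, patch) {
    const current = this.get(id);
    if (!current) return null;
    const next = { ...current, ...patch };
    this.sessions.set(id, next);
    return next;
  }
}

// Each "server" keeps no session state of its own.
function createServerHandler(name, store) {
  return function handle(sessionId) {
    const session = store.get(sessionId);
    if (!session) return { server: name, error: 'unknown session' };
    const updated = store.update(sessionId, { visits: session.visits + 1 });
    return { server: name, user: updated.user, visits: updated.visits };
  };
}

const store = new SharedSessionStore();
const serverA = createServerHandler('A', store);
const serverB = createServerHandler('B', store);
const sid = store.create({ user: 'alice', visits: 0 });
serverA(sid); // first request lands on server A
const second = serverB(sid); // second request lands on server B
console.log(second); // { server: 'B', user: 'alice', visits: 2 }
```

Server B sees the visit that server A recorded, which is exactly what sticky sessions would otherwise be needed for.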
Take a look at http://12factor.net/ for more info on horizontal scaling.

What are my options when it comes to node.js lifecycle?

Are there any examples or conventions out there of how to use node.js to host multiple web apps?
I'm already aware that node itself can be used to build a server, but I'm curious as to whether there have been implementations where you aren't necessarily running it all the time. Strictly for the reason that perhaps there are multiple sites being hosted, each with their own copy of a framework, static files and custom functionality.
Or maybe you do run one instance of node and code a multiple-site architecture to ensure one bad site doesn't take the server down in some way?
Virtual hosts, ensuring that one site can't crash others... these are all things that have been considered on other platforms, but I have had some difficulty finding them for node! :)
I am already aware of connect, express and other middleware, however it doesn't cover what I'm asking here.
If you're worried about runtime isolation, each "site" should run its own node process. Then use a proxy like node-http-proxy that will do host-header-based routing. Another great node-based option is bouncy, but you don't necessarily need to use node to do the host-based routing. You could just as well use haproxy, nginx, etc.
The baseline RAM overhead of each node process is very small (~10mb - 15mb). Also, if you do HTTP based routing you can spread your sites easily across machines, user home directories, etc.
If you want to handle the site/host registration programmatically, I would use seaport and then communicate the hostname and host + port details back to the proxy so that the routing table can be dynamic. This would also make it fairly easy to scale a site across multiple node processes.
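Here is a rough sketch of what such a dynamic, host-header-based routing table could look like. It does not use seaport, bouncy or node-http-proxy; the HostRouter class, hostnames and ports are hypothetical, and IPv6 literals in the Host header are ignored for brevity:

```javascript
// Sketch of a dynamic, host-header-based routing table for a proxy.
class HostRouter {
  constructor() {
    this.routes = new Map(); // hostname -> array of targets
  }

  register(hostname, target) {
    const key = hostname.toLowerCase();
    if (!this.routes.has(key)) this.routes.set(key, []);
    this.routes.get(key).push(target);
  }

  unregister(hostname, target) {
    const key = hostname.toLowerCase();
    const targets = (this.routes.get(key) || []).filter(
      (t) => t.host !== target.host || t.port !== target.port
    );
    if (targets.length) this.routes.set(key, targets);
    else this.routes.delete(key);
  }

  // Resolve a Host header such as "Blog.example.com:80" to a target,
  // rotating through the registered processes for that site.
  resolve(hostHeader) {
    const key = String(hostHeader || '').split(':')[0].toLowerCase();
    const targets = this.routes.get(key);
    if (!targets || targets.length === 0) return null;
    const target = targets.shift();
    targets.push(target);
    return target;
  }
}

const router = new HostRouter();
router.register('blog.example.com', { host: '127.0.0.1', port: 4001 });
router.register('blog.example.com', { host: '127.0.0.1', port: 4002 });
router.register('shop.example.com', { host: '127.0.0.1', port: 5001 });
```

The proxy would call router.resolve(req.headers.host) for each request, and each site process would call register when it starts and unregister when it stops.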
Good luck!

Node.js as an application container

Apache and Node.js have something in common. The more I use Node.js, the more I like Node.js; similarly, the more I use Apache, the more I like Node.js.
One good thing about Apache though, it can do a lot of things through the same port. PHP, Python, Perl, different apps, different paths, the whole magilla. Node.js doesn't do that, and it isn't supposed to but I would like to do something similar.
I would like to give it a list of URL prefixes (or ideally regexps) and enough information that, if it receives a request matching a particular prefix, it passes the request off to a subordinate instance running a specified script (starting such an instance if one isn't already running, and closing it down when doing so seems prudent). Basically, I want nodejs-proxy and cluster cooperating. With it, I could run several apps together on the same machine through port 80.
This seems pretty easy and very useful and I was about to just write it myself when it occurred to me, "This seems pretty easy and very useful -- probably someone has already written it!" Any suggestions?
Node.js doesn’t have any built-in ability to route requests to different applications, but frameworks like this are in development.
Nodejitsu’s Haibu comes to mind — it manages child processes for each application and uses node-http-proxy to route the requests.
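To make the request in the question a bit more concrete, here is a minimal sketch of mapping URL prefixes to apps and starting each app only on first use. This is not Haibu's API; the PrefixRouter class, the script paths and the injected startApp function are all hypothetical:

```javascript
// Sketch: map URL prefixes to apps and start each app only on first use.
class PrefixRouter {
  // startApp(script) must start the app and return a handle for it; it is
  // injected so the sketch does not depend on a particular process manager.
  constructor(startApp) {
    this.startApp = startApp;
    this.routes = []; // { prefix, script }
    this.running = new Map(); // script -> handle
  }

  add(prefix, script) {
    this.routes.push({ prefix, script });
    // Longest prefix first, so "/blog/admin" wins over "/blog".
    this.routes.sort((x, y) => y.prefix.length - x.prefix.length);
  }

  // Matches whole path segments only: "/blog" matches "/blog/post/1"
  // but not "/blogger".
  match(url) {
    const path = url.split('?')[0];
    return (
      this.routes.find(
        (r) => path === r.prefix || path.startsWith(r.prefix.replace(/\/$/, '') + '/')
      ) || null
    );
  }

  // Returns the handle for the app serving this URL, starting it if needed.
  resolve(url) {
    const route = this.match(url);
    if (!route) return null;
    if (!this.running.has(route.script)) {
      this.running.set(route.script, this.startApp(route.script));
    }
    return this.running.get(route.script);
  }
}

// A fake starter that just records what would have been started. With real
// child processes it could be: (script) => require('child_process').fork(script)
const started = [];
const apps = new PrefixRouter((script) => {
  started.push(script);
  return { script };
});
apps.add('/blog', 'blog/server.js');
apps.add('/blog/admin', 'admin/server.js');
```

Shutting idle apps down again and actually proxying the request to the child process are left out; that is the part tools like the ones mentioned here take care of.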
You could take a look at http://expressjs.com which I describe as a 'sinatra for node'. It gives the whole URL/pattern based routing thing. You can couple this with https://github.com/visionmedia/express-resource to create a kinda RESTful style resource approach.
To me it sounds like you're looking for an event-based HTTP proxy (to replace Apache) - in that regard, nginx seems to be the current king of the hill.
Use dokku (Docker based) which will spawn your apps and provide a reverse proxy via nginx. Containers are isolated, you have a choice of buildpacks and your deployments have 0 downtime all by pushing repos via git and auth via ssh.
You can follow this easy guide on DigitalOcean on how to deploy your Node.js apps or just watch the guide from the man himself.

Resources