I have been working with a couple of Node.js frameworks to create applications, which I typically deploy to Heroku. Recently I came across this disclaimer on Derby's documentation page, which states:
Note that while Derby supports multiple servers, it currently requires
that clients repeatedly connect to the same server. Heroku does not
support sticky sessions or WebSockets, so it isn’t possible to use
more than one dyno. You’ll have to use a different hosting option to
scale your app.
This is obviously concerning for scalability. From the statement above, I understand this to be a Heroku limitation rather than a Node one.
First, is this accurate? That is, is it really impossible to scale Node apps on Heroku?
If that is the truth, where should I turn? AWS?
Thanks.
Derby is not a typical Node.js app. Derby (and Meteor) are essentially complete frameworks built on top of Node. WebSockets are not yet supported on Heroku: https://devcenter.heroku.com/articles/using-socket-io-with-node-js-on-heroku
However, the typical alternative to sticky sessions for regular Node apps is to use a datastore like Redis (or Postgres, or Mongo) to store session data. This is a far more robust approach than sticky sessions, as it's resilient to the failure of any particular server.
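As a minimal sketch of that pattern, assuming express-session with the connect-redis store (pre-v7 require-style API) and placeholder connection details:

```js
var express = require('express');
var session = require('express-session');
var RedisStore = require('connect-redis')(session); // pre-v7 API

var app = express();

app.use(session({
  // Session data lives in Redis, so any dyno/instance can serve any request.
  store: new RedisStore({ host: 'localhost', port: 6379 }), // placeholders
  secret: 'replace-with-a-real-secret',
  resave: false,
  saveUninitialized: false
}));

app.get('/', function (req, res) {
  req.session.views = (req.session.views || 0) + 1;
  res.send('Views this session: ' + req.session.views);
});

app.listen(process.env.PORT || 3000);
```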
Take a look at http://12factor.net/ for more info on horizontal scaling.
I use Redis because it allows me to scale my applications horizontally (across multiple servers). By using its pub/sub features, all my servers can communicate with each other without needing to share memory, as in the sketch below.
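For context, this is roughly the pattern I mean (a minimal sketch using the node_redis client; the channel name is just an example):

```js
var redis = require('redis');

var sub = redis.createClient();
var pub = redis.createClient(); // a connection in subscriber mode can't publish

// Every server process runs this; messages published by any one of them
// are delivered to all of them via Redis, with no shared memory.
sub.on('message', function (channel, message) {
  console.log('received on ' + channel + ': ' + message);
});

sub.subscribe('events', function () {
  // Publish once the subscription is confirmed.
  pub.publish('events', JSON.stringify({ type: 'user-joined', id: 42 }));
});
```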
So far, so good: we can add more Node.js servers. But all these servers subscribe to one single Redis server, so we end up with many Node.js servers communicating through just one Redis instance. We can serve more clients, but we still have only one Redis.
From my tests the Redis server uses fewer resources and can therefore handle more load, but in this design I think it is still a single point of failure. What do you think?
What are the best ways to design a scalable system? I know about Redis master/slave replication, but I am still not convinced it is the best solution.
Yes, Redis is a single point of failure in what you describe. Not only in the sense that when it is down your app is down, but also in the sense that if one of your processes removes or corrupts the data, it is lost forever.
What you can do is use multiple Redis servers with a good backup strategy (see the Sentinel sketch after the links below).
See this tutorial for clustering:
https://redis.io/topics/cluster-tutorial
See these tutorials for backups:
http://zdk.blinkenshell.org/redis-backup-and-restore/
https://redis.io/topics/persistence
https://www.digitalocean.com/community/tutorials/how-to-back-up-and-restore-your-redis-data-on-ubuntu-14-04
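Beyond backups, a common way to remove the single point of failure is replication with automatic failover via Redis Sentinel. A minimal client-side sketch, assuming the ioredis package and example host names:

```js
// Connect through Sentinel so the client follows the current master
// if a failover happens. The host names and the master group name
// ('mymaster') are examples; match them to your Sentinel config.
var Redis = require('ioredis');

var redis = new Redis({
  sentinels: [
    { host: 'sentinel-1', port: 26379 },
    { host: 'sentinel-2', port: 26379 }
  ],
  name: 'mymaster'
});

redis.set('greeting', 'hello');
redis.get('greeting', function (err, value) {
  console.log(value); // 'hello'
});
```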
I'm programming a server-side application which will manage requests from:
Game client
Website (HTTP requests)
API
As of now I'm using a single Node.js application for every type of request. The problem is that, with a growing user base, this approach will become a bottleneck.
I would like some advice on how to design the server-side architecture so that it will scale.
The only solution I know of is to run multiple servers with the same application, sharing state through a Redis server.
Is it possible in Node.js to split the handling of these request types across multiple servers? Maybe one or more servers for each type of request?
Currently I'm using:
NodeJS
Redis
MySQL
Express
Socket.io
Thanks in advance, can you recommend some books on this matter?
On a single machine, you can take advantage of a multicore architecture with the Node.js Cluster module (https://nodejs.org/api/cluster.html).
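A minimal sketch, close to the example in the Cluster docs: the master forks one worker per CPU core, and the workers share a single port:

```js
var cluster = require('cluster');
var http = require('http');
var os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  os.cpus().forEach(function () {
    cluster.fork();
  });
  // Replace workers that die so capacity is maintained.
  cluster.on('exit', function (worker) {
    console.log('worker ' + worker.process.pid + ' died, forking a new one');
    cluster.fork();
  });
} else {
  // All workers share port 3000; the master distributes connections.
  http.createServer(function (req, res) {
    res.end('handled by pid ' + process.pid + '\n');
  }).listen(3000);
}
```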
I think it is a good idea to split the API and the website into different applications. If you decide to run multiple Node.js applications on one machine, try pm2 (http://pm2.keymetrics.io/). You could probably also split your API into a bunch of small applications, which is called a microservice architecture. I personally don't like the microservice approach, but you can check the web for its pros and cons.
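For example, a hypothetical pm2 ecosystem file (format as in recent pm2 versions) that runs the website and the API as separate apps; the names and script paths are placeholders:

```js
// ecosystem.config.js - started with `pm2 start ecosystem.config.js`.
// pm2 supervises and restarts each app independently; exec_mode
// 'cluster' uses the Cluster module under the hood.
module.exports = {
  apps: [
    { name: 'website', script: './website/server.js', instances: 2, exec_mode: 'cluster' },
    { name: 'api',     script: './api/server.js',     instances: 2, exec_mode: 'cluster' }
  ]
};
```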
Also, if you deploy your application (or bunch of applications) on different virtual/physical machines, which is usual in production, you can use HAProxy for load balancing and fault tolerance (http://www.haproxy.org/).
Am I correct in the assumption that without access to the MongoDB server, there is not much point developing with Meteor?
Meteor is a great framework for building, packaging and deploying apps and sites. From a development POV, the templating and reactive DB work make prototyping so much easier than with most MVCs.
I understand that under the hood, websockets and DDP provide the realtime syncing magic, which means that you need access to the MongoDB server, something you don't have with PaaS solutions like Google App Engine, Parse or Kinvey.
So backend developers don't derive much benefit from Meteor, since they still need to maintain the server stack and deal with scalability issues themselves.
Is there a path to create and deploy products with Meteor without having to build and maintain the backend infrastructure? Heroku is still pretty close to the bone when it comes to managing infrastructure.
Wondering if there's a way to have CRUD operations go through a REST driver that maps to whatever PaaS you want, and have the PaaS post change logs to a server that strictly handles websocket connections. Basically, pass the CRUD operations to a PaaS and maintain your own websocket server(s).
MeteorPedia has a page on deploying to PaaS: http://www.meteorpedia.com/read/Category:PaaS_providers
Recently, Google App Engine added support for custom VMs.
You can also use MongoHQ or similar for the database.
Is there an easy way to manage offline data in a web app and synchronize with a server when there is a connection? I have been looking at Meteor, CouchDB and the like, but I'm still not sure what the least painful way would be.
I could of course implement it myself with sockets or something similar, but if something is already made for the purpose, I don't see a reason to do it again.
I'm planning to work with Node as the server.
Thanks
You're talking about two things: 1) how to store/persist data when offline (a storage mechanism), and 2) how to synchronize with a server when online (a communication mechanism). The answer to 1 is some kind of local storage, and there are several ways of doing that (localStorage, WebSQL, filesystem APIs, etc.) depending on your platform. The answer to 2 really depends on how urgent your synchronization needs are, but in general you can use HTTP itself with periodic (long-)polling, websockets, and the like.
On top of both storage and communication mechanisms there are numerous libraries that make the job simpler, like Meteor (communication) and CouchDB (storage), and many more. There are even libraries that take care of the actual synchronization mechanism (with possible conflict resolution as well), but this depends very much on your actual application.
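To make the two halves concrete, here is a naive browser-side sketch: pending changes are persisted in localStorage and flushed to the server when connectivity returns. The /sync endpoint, the payload shape, and the use of fetch are assumptions for illustration:

```js
// A naive offline queue: persist pending changes in localStorage and
// flush them to the server whenever the browser reports connectivity.
var QUEUE_KEY = 'pendingChanges';

function queueChange(change) {
  var queue = JSON.parse(localStorage.getItem(QUEUE_KEY) || '[]');
  queue.push(change);
  localStorage.setItem(QUEUE_KEY, JSON.stringify(queue));
  flush(); // try immediately; this is a no-op while offline
}

function flush() {
  if (!navigator.onLine) return;
  var queue = JSON.parse(localStorage.getItem(QUEUE_KEY) || '[]');
  if (queue.length === 0) return;
  fetch('/sync', { // hypothetical endpoint
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(queue)
  }).then(function (res) {
    // Only clear the queue once the server has accepted the batch.
    if (res.ok) localStorage.removeItem(QUEUE_KEY);
  });
}

// Retry whenever the browser regains connectivity.
window.addEventListener('online', flush);
```

A real implementation would also need conflict resolution on the server side, which is exactly the part the libraries above try to handle for you.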
Updated: This framework looks promising, but I haven't tested it myself:
http://blog.nateps.com/announcing-racer-experimental-realtime-model
You might want to look at cloud services as well. These are best if you are developing a new application as they push you more to a serverless model, and of course you have to be happy using a service.
Simperium is an interesting cloud service - the only one I can find today that does syncing (unlike Firebase and Spire.io, which are similar in other respects), and for iOS it includes offline storage, while for JavaScript clients you'd need to cover the local storage yourself using HTML5 features. Backbone.js seems to have some support for this, and Simperium can integrate with Backbone using a similar API style.
For non-cloud services, DerbyJS is an open source project that includes Racer, the data synchronization library mentioned in the earlier answer - both are under rapid development and not yet complete, but they look interesting if your timescales allow, and they don't require a cloud service. There is a comparison of DerbyJS to Meteor that is useful - although it's written by the DerbyJS developers, it's not too biased.
I also looked at CouchDB, which has some interesting built-in replication features, but I didn't like its use of indexes that are updated lazily when a query needs them (or by a batch process), and I wasn't happy with exposing the server DB directly to clients to enable replication/sync. Generally I think it's best to decouple the client side local storage from the server side DB, and of course for a web app it would be hard to use CouchDB on the client.
Before starting to write my application, I need to know what to do when a single Node.js instance (Express with socket.io or nowjs) isn't enough anymore.
You might tell me that I shouldn't care about scale until it's actually needed, but I don't want to build an application and then run into trouble because socket.io or nowjs can't easily be scaled across multiple instances.
I recently read that socket.io now supports a way to scale using Redis (which I also have no experience with). NowJS is built on top of socket.io - does it work the same way? On nowjs.org you can read that a "distributed version of NowJS" is under development and is going to cost money.
If you need to scale Node, the first place people usually start is putting a load balancer in front of multiple Node instances. The standard for this today is nginx, though I would also like to check out the Node-based balancer 'bouncy' that came out recently. Here's an example of someone using the nginx reverse proxy to manage multiple Node instances:
Node.js + Nginx - What now?
The second thing you mention is socket.io/nowjs. Depending on how you're using these frameworks, you could get into a situation where you want to share context between clients that are hitting different Node.js instances. If this is the case, I would recommend using a persistent store, like Redis, to bridge the gap between your Node instances. Here's an example:
How to reuse redis connection in socket.io?
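For illustration, a minimal sketch of the Redis-backed approach using the socket.io-redis adapter (API as of socket.io 1.x/2.x; host and port are placeholders):

```js
// Attach the Redis adapter so broadcasts are relayed through Redis
// pub/sub to every socket.io instance behind the balancer.
var io = require('socket.io')(3000);
var redisAdapter = require('socket.io-redis');

io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

io.on('connection', function (socket) {
  socket.on('chat', function (msg) {
    // Reaches clients on ALL instances, not just this process.
    io.emit('chat', msg);
  });
});
```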
Hopefully this is enough information and reading to get you started, let me know if you have any questions.
Happy coding!
Another useful link on 'Scaling Socket.IO' https://github.com/dshaw/talks/tree/master/2011-10-jsclub (slides and sample application)
Just as a side note on the discussion of using nginx as a reverse proxy with socket.io: the way I understand it, at least, nginx 1.0.x (the current stable version) does not support proxying of HTTP/1.1 connections, which is needed to make socket.io work with websockets. There is a workaround described here: http://www.letseehere.com/reverse-proxy-web-sockets, or you can use something like https://github.com/nodejitsu/node-http-proxy instead; the guys at nodejitsu say it should support this.
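For illustration, here is a minimal round-robin balancer sketch built on node-http-proxy (assuming the http-proxy 1.x API; the ports are examples). It forwards websocket upgrade requests as well as plain HTTP, which is what socket.io needs:

```js
// Round-robin balancer over two local Node instances using http-proxy.
var http = require('http');
var httpProxy = require('http-proxy');

var targets = ['http://127.0.0.1:3001', 'http://127.0.0.1:3002'];
var i = 0;
var proxy = httpProxy.createProxyServer({});

var server = http.createServer(function (req, res) {
  // Alternate plain HTTP requests between the backends.
  proxy.web(req, res, { target: targets[i++ % targets.length] });
});

// Forward websocket upgrade requests as well.
server.on('upgrade', function (req, socket, head) {
  proxy.ws(req, socket, head, { target: targets[i++ % targets.length] });
});

server.listen(8000);
```

Note that plain round-robin is not sticky; as discussed above, you would pair this with a Redis-backed store so it doesn't matter which instance a client lands on.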