I have designed a Node.js server that allows users to log in, join a room, and exchange data with other users in that room using websockets. However, I am now looking for a way to make this setup scalable.
I spent all afternoon researching various load balancers such as nginx and HAProxy, but I still can't figure out how to organise my setup.
Initially users can log in and view the active rooms. Making this part scalable is no biggie. However, they can then join a specific room, and at that point they need to be connected to the same node.js instance as the others in the room. It's this part I have trouble figuring out.
For now my solution consists of creating two different types of node.js instances: one generic type to handle the login and room overview requests, and one room type that handles a number of rooms. The generic type then keeps track of which specific instance is responsible for which room and can deliver the correct address to the user's application.
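To make the idea concrete, here is a minimal sketch of the bookkeeping the generic instance would do. The class name `RoomDirectory`, the "least-loaded instance" rule, and the addresses are all assumptions for illustration, not part of the actual setup:

```javascript
// Hypothetical room directory kept by the "generic" instance.
// The instance addresses below are made-up examples.
class RoomDirectory {
  constructor(instanceAddresses) {
    // address -> number of rooms currently hosted there
    this.load = new Map(instanceAddresses.map((addr) => [addr, 0]));
    // roomId -> address of the room instance hosting it
    this.rooms = new Map();
  }

  // Return the address for a room; on first use, assign the least-loaded instance.
  addressFor(roomId) {
    if (!this.rooms.has(roomId)) {
      let best = null;
      for (const [addr, count] of this.load) {
        if (best === null || count < this.load.get(best)) best = addr;
      }
      if (best === null) throw new Error("no room instances registered");
      this.rooms.set(roomId, best);
      this.load.set(best, this.load.get(best) + 1);
    }
    return this.rooms.get(roomId);
  }
}

const directory = new RoomDirectory(["ws://10.0.0.1:8001", "ws://10.0.0.2:8001"]);
const first = directory.addressFor("lobby");
const second = directory.addressFor("games");
const again = directory.addressFor("lobby"); // same instance as the first lookup
```

Every user asking for the same room gets the same address back, which is the property the room instances depend on.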
However, I am not satisfied with this solution, so I am open to suggestions. I understand that this is a rather vague question, but I am not looking for an exact solution, rather hints as to how to organize everything.
udidu touched on a possible solution, but to expand, you should look at a scalable pub/sub solution; Redis, a popular data store, has pub/sub built in and I use it often to great effect.
Using Redis (or some other system) to help make sure every instance of your Node.js app receives information about who's chatting in which room removes the requirement that all users in a room be connected to the same Node.js instance.
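A rough sketch of that idea, with a plain in-process `EventEmitter` standing in for Redis so it runs without a server. The `createInstance` helper and its methods are hypothetical; with real Redis the `emit` would be a PUBLISH and the `on` a SUBSCRIBE:

```javascript
const { EventEmitter } = require("node:events");

// Stand-in for Redis pub/sub: one shared emitter, channels are event names.
const broker = new EventEmitter();

// Hypothetical app instance: it only knows about its own locally connected users.
function createInstance(name) {
  const localUsers = new Map(); // userId -> { room, inbox }
  const subscribed = new Set(); // rooms this instance already listens to

  return {
    name,
    join(userId, room) {
      localUsers.set(userId, { room, inbox: [] });
      if (!subscribed.has(room)) {
        subscribed.add(room);
        // Hand each published message to the local members of that room.
        broker.on(room, (message) => {
          for (const user of localUsers.values()) {
            if (user.room === room) user.inbox.push(message);
          }
        });
      }
    },
    send(room, message) {
      broker.emit(room, message); // with Redis this would be a PUBLISH
    },
    inboxOf(userId) {
      return localUsers.get(userId).inbox;
    },
  };
}

const a = createInstance("node-a");
const b = createInstance("node-b");
a.join("alice", "room1");
b.join("bob", "room1");
b.join("carol", "room2");
a.send("room1", "hello"); // reaches bob even though he is on another instance
```

Alice and Bob share a room while sitting on different instances, and Carol in another room receives nothing.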
Related
I am building two sets of services on a website (all written in NodeJS on the server), both are using a RESTful approach. For the sake of modularity I decided to make both services separate entities. The first service deals with the products of the site and the second specifically deals with user related functions. So the first might have functions like getProducts, deleteProduct etc... The second would have functions like isLoggedIn, register, hasAccessTo etc... The product module will make several calls to the user module to make sure that the person making the calls has the privilege to do so.
Now the reason I separated them like this was because in the near future I foresee a separate product range opening up, which will need to use the same user system as the first (even sharing the same database). The user system will use a database that spans the entire site and all subsequent products.
My question is about communication between these projects and the users project. What is the most effective way of keeping the users module separate without suffering any significant speed hits? If the product API made a call to the user API on the same server (localhost), is there a significant cost to this, versus building the user API into each of the subsequent projects? Is there a better way to do this, through interprocess communication maybe? Is simply having the users API run as its own service an effective solution?
If you have two nodes on the same server (machine), network latency will not hurt performance much, because both are on localhost.
The nodes will then communicate through a REST API, so under the hood you will be using Node.js sockets. You could use Unix sockets instead of HTTP over TCP because they are faster, BUT they are harder to debug, so I recommend you don't do that (though it's good to know the alternatives).
And finally, your system looks like the "actor" design pattern. At first glance this design pattern is a little difficult to understand, but you could have a look at these if you want more info about the actor model:
Actor model for NodeJS https://github.com/benlau/nactor
Actor model explanation http://en.wikipedia.org/wiki/Actor_model
I want to monitor in real time the data that users enter in the comments table.
I have an Apache server running, and suppose there is also a Node server on port 1337.
How would I make it so that every time someone saves new data, it returns, e.g., the total number of rows in the comments table and shows it in a view?
Maybe one way is to call $this->Comment->save($this->request->data); over a different port using HttpSocket?
Yes, it is possible.
You have multiple ways of solving this; let me give you my ideas:
You can simply use long-polling and not use Node.js at all. It's a suitable solution if there won't be too much traffic; otherwise you will have a bad time.
You can use websockets and not use Node.js at all. Here you have a basic guide about websockets and PHP. However, I am almost sure you won't be able to create "rooms", that is, send notifications for specific comments.
You can also use Ratchet. This is a more sophisticated library to handle websockets and it supports rooms.
Lastly, if you want to dive fully into Node.js and CakePHP, I would suggest starting by watching this talk given at Cakefest 2012, which describes exactly your scenario.
After you have watched that, you might want to learn a little about Socket.io. This is a more complex solution, but it's what I have used when integrating CakePHP and Node.js to create real time applications.
The strategy here is to have the users join a room when they visit /article/view/123, let's say the room name is the articleID, then socket.io will be listening for events happening in this room.
You will have a CakePHP method that handles the save. Then, when the user submits the form, you don't call the Cake action directly; instead you have socket.io dispatch an event, in which you pass the data to the server (Node.js), and Node.js calls your CakePHP function that saves the data. When Node.js receives confirmation from CakePHP, you broadcast an event (using socket.io) that lets all users connected to that room know that a comment has been made.
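The flow above can be sketched roughly like this, without socket.io or CakePHP so that it runs on its own. Everything here is a stand-in: `saveViaCake` is a stub for the HTTP call Node.js would make to the Cake action, and `joinRoom` / `onCommentSubmitted` are hypothetical names for what the socket.io handlers would do:

```javascript
// Room membership: articleId -> Set of connected client objects.
const rooms = new Map();

function joinRoom(articleId, client) {
  if (!rooms.has(articleId)) rooms.set(articleId, new Set());
  rooms.get(articleId).add(client);
}

// Stub for the request Node.js would send to the CakePHP save action.
// The shape of the confirmation ({ ok, total }) is an assumption.
const savedComments = [];
function saveViaCake(articleId, text) {
  if (text.trim() === "") return { ok: false };
  savedComments.push({ articleId, text });
  const total = savedComments.filter((c) => c.articleId === articleId).length;
  return { ok: true, total };
}

// What the server does when a client dispatches a "new comment" event.
function onCommentSubmitted(articleId, text) {
  const result = saveViaCake(articleId, text);
  if (!result.ok) return false; // nothing saved, so nothing is broadcast
  for (const client of rooms.get(articleId) ?? []) {
    client.received.push({ text, total: result.total });
  }
  return true;
}

const reader1 = { received: [] };
const reader2 = { received: [] };
const elsewhere = { received: [] };
joinRoom("123", reader1);
joinRoom("123", reader2);
joinRoom("456", elsewhere);
onCommentSubmitted("123", "First!"); // saved, then broadcast to room "123"
onCommentSubmitted("123", "   ");    // rejected by the stub, no broadcast
```

The broadcast only happens after the save is confirmed, and it carries the new comment count so each view in the room can update it.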
You have basically the choice between Websockets and long polling.
Websockets (with Ratchet and Autobahn.js)
Long Polling Using Comet
Decide which technology you want to use and start implementing your use case. Consider that Websockets are more or less new. Depending on your requirements you might not be able to use Websockets because you might have to support crappy browsers. See this page.
I'm using Backbone.iobind to bind my client Backbone models over socket.io to the back-end server, which in turn stores it all to MongoDB.
I'm using socket.io so I can synchronize changes back to other clients Backbone models.
The problem starts when I try to run the same thing over a cluster of node.js servers.
Setting a session store was easy using connect-mongo which stores the session to MongoDB.
But now I can't inform all the clients on every change, since the clients are distributed between the different node.js servers.
The only solution I found is to set a pub/sub queue between the different node.js servers (e.g. mubsub), which seems like a very heavy weight solution that will trigger an event on all the servers for every change.
How did you reach the conclusion that pub/sub is a "very heavy weight solution"?
Sounds like you got it right up until that part :-)
Oh, and pub/sub is not a queue.
Let's examine that claim:
The nice thing about pub/sub is that you publish and subscribe to channels/topics.
So, using the classic chat server example, let's say you have a million users connected in total, but #myroom only has 50 users in it.
When a message is sent to #myroom, it's being published once. No duplication whatsoever.
In most use-cases you won't even need to store it on disk/RAM, so we're mostly looking at network/bandwidth here. And, I mean, you're probably throwing more data (probably over the wire?) to MongoDB already, so I assume that's not your bottleneck.
If you also use socket.io's rooms feature (which is basically its own pub/sub mechanism), that means only 50 users will have that message emitted to them over the websocket.
And no, socket.io won't iterate over 1M clients to find out which of them are in room #myroom ;-)
So the message is published once, each subscriber (node.js instance) will get notified once, and only the relevant clients -- socket.io won't waste CPU cycles in order to find them as it keeps track of them when they join() or leave() a room -- will receive the message.
Doesn't that sound pretty efficient and light-weight?
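To illustrate why emitting to a room doesn't have to scan every connected client, here is a small sketch of a room index that is kept up to date on join/leave. This is not socket.io's actual implementation, just the general technique; `RoomIndex` and the client ids are made up:

```javascript
// Membership is updated on join/leave, so emitting never scans all clients.
class RoomIndex {
  constructor() {
    this.members = new Map(); // room -> Set of client ids
    this.deliveries = 0;      // how many messages were handed to clients
  }

  join(room, clientId) {
    if (!this.members.has(room)) this.members.set(room, new Set());
    this.members.get(room).add(clientId);
  }

  leave(room, clientId) {
    const set = this.members.get(room);
    if (!set) return;
    set.delete(clientId);
    if (set.size === 0) this.members.delete(room);
  }

  // Returns the ids the message would be sent to; the cost depends only on
  // the size of the room, not on the total number of clients.
  emit(room, message) {
    const recipients = [...(this.members.get(room) ?? [])];
    this.deliveries += recipients.length;
    return recipients;
  }
}

const index = new RoomIndex();
for (let i = 0; i < 10000; i++) index.join("#lobby", `user${i}`);
for (let i = 0; i < 50; i++) index.join("#myroom", `user${i}`);
index.leave("#myroom", "user0");
const recipients = index.emit("#myroom", "hi");
```

Only the remaining members of #myroom are touched; the 10,000 lobby users are never looked at.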
Give Redis a shot.
It's really simple to set up, runs entirely in memory, is blazing fast, replication is extremely simple, etc.
That's the way socket.io recommends passing events between nodes.
You can find more information/code here.
Additionally, if MongoDB can't handle the load at any point, you can use Redis as your session-store as well.
Hope this helps!
Brief Description:
Well, I've been looking for an answer to this question for many days, but there only seem to be answers to 'How to create a Push Notification Server' and similar questions. I am using node.js and it's quite easy to 'create' a push notification server using sock.js (I've heard socket.io isn't as good as sock.js). No problem so far. But what I want to know is how to model such a server.
Details:
OK, so, let's say I've an application where there's a chat service (just an example this is, actual thing is big as you might have guessed). A person sends a message in a room and all the people in the room get notified. But what I want is a 'stateful' chat - that is, I want to store the messages in a data store. Here's where the trouble comes. Storing the message in the database and later telling everyone that "Hey, there's a message for you". This seems easy when we need the real-time activity for just one part of the app. What to do when the whole app is based on real-time communication? Besides this, I also want to have a RESTful api.
My solution (with which I am not really happy)
What I thought of doing was this: (on the server side of course)
              Data Store
                  ||
Data Layer (which talks to data store)
                  ||
          ------------------
          |                |
  Real-Time Server  Restful server
And here, the Real-time server listens to interesting events that the data-layer publishes. Whenever something interesting happens, the server notifies the client. But which client? - This is the problem with my method
Hope you can be of help. :)
UPDATE:
I think I forgot to emphasize an important part of my question. How to implement a pub-sub system? (NOTE: I don't want the actual code, I'll manage that myself; just how to go about doing it is where I need some help). The problem is that I get quite boggled when writing the code - what to do and how (my confusion is quite apparent from this question itself). Could you please provide some references to read or some advice as to how to begin with this thing?
I am not sure if I understood you correctly; but I will summarize how I read it:
We have a real-time chat server that uses socket connections to publish new messages to all connected clients.
We have a database where we want to keep chat logs.
We also have a RESTful interface to the realtime server for getting current chats in a lazier manner.
And you want to architect your system this way:
In the above diagram, the components I circled with a purple curve want to be updated like all other clients. Am I right? I don't know what you meant by "Data Layer", but I assumed it is a daemon that writes to the database and also interfaces with the database for other components.
In this architecture, everything is okay in the direction you meant. I mean DataStore is connected by servers to access data, maybe to query client credentials for authentication, maybe to read user preferences etc.
For your other expectation from these components, I mean to allow these components to be updated like connected clients, why don't you allow them to be clients, too?
Your realtime server is a server for clients; but it is also a client for data layer, or database server, if we prefer a more common naming. So we already know that there is nothing that stops a server from being a client. Then, why can't our database system and restful system also be clients? Connect them to realtime server the same way you connect browsers and other clients. Let them enjoy being one of the people. :)
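As a rough sketch of what "let them be clients too" could look like, here the realtime server is reduced to a single `EventEmitter` hub, and the persistence side and the REST side subscribe exactly the way a browser would. The names (`hub`, `chatLog`, `latestByRoom`) and the message shape are assumptions for illustration:

```javascript
const { EventEmitter } = require("node:events");

// The real-time server reduced to a hub that treats every subscriber alike.
const hub = new EventEmitter();

// Browser-like client: just collects what it is sent.
const browserInbox = [];
hub.on("message", (msg) => browserInbox.push(msg.text));

// Persistence client: stands in for the component writing chat logs to the data store.
const chatLog = [];
hub.on("message", (msg) => chatLog.push({ ...msg, storedAt: chatLog.length }));

// REST-side client: keeps the latest message per room for lazy GET requests.
const latestByRoom = new Map();
hub.on("message", (msg) => latestByRoom.set(msg.room, msg.text));

hub.emit("message", { room: "general", user: "ann", text: "hi" });
hub.emit("message", { room: "general", user: "bob", text: "hello" });
```

The hub never needs to know which subscriber is a browser and which is the database writer, so the "but which client?" problem turns into ordinary subscription.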
I hope I did not understand everything completely wrong and this makes sense for the question.
I'm building a web application that will allow team collaboration. That is, a user within a team will be able to edit shared data, and their edits should be pushed to other connected team members.
Are Socket.io rooms a reasonable way of achieving this?
i.e. (roughly speaking):
All connected team members will join the same room (dynamically created upon first team member connecting).
Any edits received by the server will be broadcast to the room (in addition to being persisted, etc).
On the client-side, any edits received will be used to update the shared data displayed in the browser accordingly.
Obviously it will need to somehow handle simultaneous updates to the same data.
Does this seem like a reasonable approach?
Might I need to consider something more robust, such as having a Redis database to hold the shared data during an editing session (with it being 'flushed' to the persistent DB at regular intervals)?
All you need is Socket.IO (with RedisStore) and Express.js. With Socket.IO you can set up rooms and also limit access per room to only authenticated users.
Using Redis you can make your app scale beyond a single process.
Useful links for you to read:
Handling Socket.IO, Express and sessions
Scaling Socket.IO
How to reuse redis connection in socket.io?
socket.io chat with private rooms
How to handle user and socket pairs with node.js + redis
Node.js, multi-threading and Socket.io