Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I am making a chat app and would like to add the functionality of showing an online/offline symbol next to users. How can I do this reliably with minimum number of srever requests and database writes?
One way I found was to update a lastSeenAt field in the user document every time that user requests a page and use this to indicate whether the user is online/offline. Another way is to ping the server from the client side at fixed intervals of time and then update the lastSeenAt field.
Both theses ways would require a lot of database writes and/or server requests. Is there a way to do this more efficiently?
You have to push data from your server to the clients upon change, with something like web-sockets or server sent events. socket.io is a popular tool for implementing such functionally.
With socket.io or specifically websockets you could track the if a client is connected or not and have a flag in database that tracks whenever a client connects or
disconnects(keep in mind, the client might be having multiple connections(multiple devices, or even browser tabs), so if one connection disconnect he still might be online.
Assuming you will have multiple state-less webservers(common practice), then once a client connects or disconnects you want to notify other interested clients, then you should a use a pubsub pattern to notify other servers which will notify their connected clients respectively. There's a lot of implementations such as Redis, zeromq, AWS SNS, GCP Cloud Pubsub. You could even use MongoDB with tailable cursors as your pubsub.
However this might mean a lot of constant connections to your server, so this might hurt your scaliblity. If it proves to be to expensive for you then your lastSeenAt approach with some sane polling. You can find out which works better with your setup by just running some experiments.
If database writes and requests worries, you could always throw more servers at it. You could have a microservice specifically for this so your main server wouldn't be affected much and also you could have a database specifically for this. If performance worries you, you can use a in-memory database.
Also you can also have some caching in your servers with simple time based invalidation or you can sync the online/offline data between your servers with a pubsub pattern.
I would suggest try to make your setup as simple as possible, and you can run experiments to see how it handles your expected traffic.
Look into Express.io where you can combine HTTP calls with sockets. (the best method for real-time communication is Sockets).
Related
I'm trying to learn Node.js and adequate design approaches.
I've implemented a little API server (using express) that fetches a set of data from several remote sites, according to client requests that use the API.
This process can take some time (several fecth / await), so I want the user to know how is his request doing. I've read about socket.io / websockets but maybe that's somewhat an overkill solution for this case.
So what I did is:
For each client request, a requestID is generated and returned to the client.
With that ID, the client can query the API (via another endpoint) to know his request status at any time.
Using setTimeout() on the client page and some DOM manipulation, I can update and display the current request status every X, like a polling approach.
Although the solution works fine, even with several clients connecting concurrently, maybe there's a better solution?. Are there any caveats I'm not considering?
TL;DR The approach you're using is just fine, although it may not scale very well. Websockets are a different approach to solve the same problem, but again, may not scale very well.
You've identified what are basically the only two options for real-time (or close to it) updates on a web site:
polling the server - the client requests information periodically
using Websockets - the server can push updates to the client when something happens
There are a couple of things to consider.
How important are "real time" updates? If the user can wait several seconds (or longer), then go with polling.
What sort of load can the server handle? If load is a concern, then Websockets might be the way to go.
That last question is really the crux of the issue. If you're expecting a few or a few dozen clients to use this functionality, then either solution will work just fine.
If you're expecting thousands or more to be connecting, then polling starts to become a concern, because now we're talking about many repeated requests to the server. Of course, if the interval is longer, the load will be lower.
It is my understanding that the overhead for Websockets is lower, but still can be a concern when you're talking about large numbers of clients. Again, a lot of clients means the server is managing a lot of open connections.
The way large services handle this is to design their applications in such a way that they can be distributed over many identical servers and which server you connect to is managed by a load balancer. This is true for either polling or Websockets.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
I'm learning about server side websockets and have a question about a certain implementation and whether it's simply not a good idea or if it could be made to work.
Suppose a user is looking at their profile for a website they've joined and for some reason, they're expecting something specific to their account to change, e.g. their karma points to increase based on a funny post they made (Suppose their karma is private).
You could have a /karma websocket endpoint, and post to it whenever anyone's karma changes, but then everyone could potentially see everyone else's karma and in this scenario it's supposed to be private (Filtering it out on the client side is unacceptable, as I wouldn't want anyone except the target user to even receive a notification).
You could somehow store the userId against the web socket connection, and only send the karma notification to the intended user, but that doesn't feel scalable (It might work for this particular example, but probably not in cases where it's something like "send this message to everyone called john", as you could have a very large foreach loop to process).
The alternative is simply long polling with an ajax request, with a simple "get karma for this user" on a timer, which would work just fine, but be less fancy and require more requests to the server than is actually necessary (e.g. you might poll 1000 times but your karma changes just once during that time).
What's a good way of addressing such a requirement, keep it simple with long polling, or is there a websocket'y way that I'm missing?
If the requirement was "publicly get latest karma changes for all users" then websockets would be ideal, but I just can't see an obvious and simple way to make it work for more granular requirements.
Thanks
First off, I know of no circumstance where long polling is more efficient than a webSocket connection for server push. So, if you want to push data from server to client and have the client get it pretty much real time, you will want to use a webSocket connection from client to server and then you can send data to the client at any time and the client will receive and process it immediately.
The rest of your question is a little hard to understand what your objection is to using a webSocket.
If you're concerned about data being kept private (only the specific user can see their karma value), then that's just a matter of proper implementation to enforce that. When a webSocket connection is established, the user has to be authenticated so you know exactly which user is making the webSocket connection. Assuming your web pages have some sort of user authentication already, you can piggy back on that same auth when the webSocket connection is established because cookies are passed with the http request that starts a webSocket connection. So, now let's assume you have an authenticated webSocket connection and the server knows which webSocket belongs to which user.
So, now it's a matter of only sending appropriate data on each webSocket. Your server needs to implement the correct logic for taking a given karma change for a given user, find the authenticated webSocket connections belonging to that user and then sending a message over only those webSocket connections to alert the client that there's a new karma value.
I'm using Backbone.iobind to bind my client Backbone models over socket.io to the back-end server which in turn store it all to MongoDB.
I'm using socket.io so I can synchronize changes back to other clients Backbone models.
The problems starts when I try to run the same thing over a cluster of node.js servers.
Setting a session store was easy using connect-mongo which stores the session to MongoDB.
But now I can't inform all the clients on every change, since the clients are distributed between the different node.js servers.
The only solution I found is to set a pub/sub queue between the different node.js servers (e.g. mubsub), which seems like a very heavy weight solution that will trigger an event on all the servers for every change.
How did you reach the conclusion that pub/sub is a "very heavy weight solution"?
Sounds like you got it right up until that part :-)
Oh, and pub/sub is not a queue.
Let's examine that claim:
The nice thing about pub/sub is that you publish and subscribe to channels/topics.
So, using the classic chat server example, let's say you have a million users connected in total, but #myroom only has 50 users in it.
When a message is sent to #myroom, it's being published once. No duplication whatsoever.
In most use-cases you won't even need to store it on disk/RAM, so we're mostly looking at network/bandwidth here. And, I mean, you're probably throwing more data (probably over the wire?) to MongoDB already, so I assume that's not your bottleneck.
If you also use socket.io's rooms features (which is basically its own pub/sub mechanism), that means only 5 users will have that message emitted to them over the websocket.
And no, socket.io won't iterate over 1M clients to find out which of them are in room #myroom ;-)
So the message is published once, each subscriber (node.js instance) will get notified once, and only the relevant clients -- socket.io won't waste CPU cycles in order to find them as it keeps track of them when they join() or leave() a room -- will receive the message.
Doesn't that sound pretty efficient and light-weight?
Give Redis a shot.
It's really simple to set-up, runs entirely in memory, blazing-fast, replication is extremely simple, etc.
That's the way socket.io recommends passing events between nodes.
You can find more information/code here.
Additionally, if MongoDB can't handle the load at any point, you can use Redis as your session-store as well.
Hope this helps!
I have a good old-style LAMP webapp. A week ago I needed to add a push notification mechanism to it.
Therefore, what I did was to add node.js+socket.io on the server and poll the MySQL database every 10 seconds using node.js to check whether there were new items: if so, I would have sent them to the client(s) with socket.io.
I was pretty happy with the result, even if that is not a proper realtime notification (as there is a lag of up to 10 secs).
Now, I am about to build a new webapp which will need push notifications, too. I am wondering whether to go with the same approach as the first one (that I believe is more stable and mature) or to go totally Node.js, without PHP and Apache. As for the database, I have already decided to go for MongoDB.
Finally, my question is: if I go for Node.js+Socket.io+MongoDB will I get a truly near-real-time webapp? I mean, as soon as a new record is inserted into MongoDB, will there be some sort of event triggered that I can catch via node.js, do some checking on it and, if relevant, send the notification to the client? Or will there be anyway some sort of polling on the db server-side and lag, as with my first LAMP webapp?
A related question: can you build a realtime webapp on MySQL without doing any polling as I did with my first app. Or do you need MongoDB (or Redis)?
I hope this question is not too silly - sorry, I am just starting with Node.js and co.
Thanks.
I understand your problem because I switched to node.js from php/apache/mysql too.
Generally node.js is stable, modules and your scripts are the main reasons for errors
Real-time has nothing to do with database, it's all about client and server, you can query as many data as you want in your requests and push it to the other client.
Choosing node.js is very wise but it's harder to implement.
When you insert a new record to your db, the event is the request itself, you will make a push event along with the database query something like:
// Please note this is not real code, just an example of the idea
app.get('/query', function(request, response){
// Query your database
db.query('SELECT * FROM users', function(rows){
// Push notification to dan
socket.emit('database_query_executed', 'to_dan', rows);
// End request
response.end('success');
})
})
Of course you can use MySQL! And any database you want, as I said real-time has nothing to do with databases because the database is in the middle of the process and it's totally optional.
If you want to use node.js for push notifications and php/apache for mysql then you will need to create 2 requests for each server something like:
// this is javascript
ajax('http://node.yoursite.com/push', node_options)
ajax('http://php.yoursite.com/mysql_query', php_options)
or if you want just one request, or you want to use a form, you can call your php and inside php you can create an http or net request to node.js from php, something like:
// this is php
new HttpRequest('http://node.youtsite.com/push', HttpRequest::METH_GET);
Using:
A regular MongoDB Collection as the Store,
A MongoDB Capped Collection with Tailable Cursors as the Queue,
A Node worker with Socket.IO watching the Queue as the Worker,
A Node server to serve the page with the Socket.IO client, and to receive POSTed data (or however else the data gets added) as the Server
It goes like:
The new data gets sent to the Server,
The Server puts the data in the Store,
The Server adds the data's ObjectID to the Queue,
The Queue will send the newly arrived ObjectID to the open Tailable Cursor on the Worker,
The Worker goes and gets the actual data in the ObjectID from the Store,
The Worker emits the data through the socket,
The client receives the data from the socket.
This is 'push' from the initial addition of the data all the way to receipt at the client - no polling, so as real-time as you can get given the processing time at each step.
Re: triggers in MongoDB - please see this answer: https://stackoverflow.com/a/12405093/1651408
There are much more convenient triggers in MySQL, but to call Node.js from them would require a bit of work with MySQL UDFs (user-defined functions), for instance pushing data through a Unix socket. Please note that this is necessary only when other applications (besides your Node.js process) are updating the database, and be sure to choose InnoDB as storage in this case (row- vs. table-level locking).
Can see no big problem with your technology choice of sockets.io, even if client-side web sockets aren't supported, you'll fall back (gracefully, I hope) to polling.
Finally, your question is not silly at all, since push technology is definitely superior to the flood of polling requests - it scales better. EDIT: However, would not describe either technology as real-time.
Another EDIT: for a quite well-known and successful setup of this kind please read this: http://blog.fogcreek.com/the-trello-tech-stack/
Have you discovered Chole? It works separately from your web sever and interfaces with it by using HTTP POSTs. That way you can code your web app any which way you want.
Actually Using Push Technology like Socket.IO helps you to use
the server's resource efficiently and also helps you to leverage old browsers to modern browsers making websocket or websocket-like connection.
10 sec polling is a HTTP request which is expensive especially when a lot of users present.
Unlike polling technology, push technology is relatively cheap. Users' client is opening a dedicated socket(ie. websocket) to listen to the server's push notification.
And usually your client-side JavaScript do some actions when the push notification is received.
Using your LAMP stack and Socket.IO with different port (other than 80) will be good enough to implement what you need.
But using Node.js + MongoDB + Socket.IO actually helps you to manage your server's resource much efficiently.
Because those three have non-blocking nature.
If you understand non-blocking concept correctly and implement your app appropriately,
your identical app, an app with same feature but with different language and different database, would be able to handle a lot more requests than general LAMP stack.
Above picture is a famous chart of comparing Non-blocking vs Thread way to handle concurrency
Apache(Thread) vs Nginx(Non-blocking)
MySQL is a great database. I believe you won't need join and transactions for realtime notification.
MongoDB does not have those two features unless you implement similar features by yourself.
Because of not having those two and some characteristics of its own, MongoDB can store and fetch data much faster than traditional SQL databases.
Switching from MySQL to MongoDB will decrease the time taking to insert and fetch data.
with JS you can open a socket to your server (not old browser), the server will have a ah-hoc program (on an ad-hoc port, so you need the permission to open door and run program on your server) that will send data (almost) realtime from and to the client, and without the HTTP's protocol overhead.old browser will just fall-back to polling mechanism.
I can't see other way to do this (probably there are already "coocked" framework that do this)
Brief Description:
Well, since many days I've been looking for an answer to this question but there seems to be answers for 'How to create a Push Notification Server' and like questions. I am using node.js and it's quite easy to 'create' a push notification server using sock.js (I've heard socket.io isn't good as compared to sock.js). No problem till here. But what I want is how to model such a server.
Details:
OK, so, let's say I've an application where there's a chat service (just an example this is, actual thing is big as you might have guessed). A person sends a message in a room and all the people in the room get notified. But what I want is a 'stateful' chat - that is, I want to store the messages in a data store. Here's where the trouble comes. Storing the message in the database and later telling everyone that "Hey, there's a message for you". This seems easy when we need the real-time activity for just one part of the app. What to do when the whole app is based on real-time communication? Besides this, I also want to have a RESTful api.
My solution (with which I am not really happy)
What I thought of doing was this: (on the server side of course)
Data Store
||
Data Layer (which talks to data store)
||
------------------
| |
Real-Time Server Restful server
And here, the Real-time server listens to interesting events that the data-layer publishes. Whenever something interesting happens, the server notifies the client. But which client? - This is the problem with my method
Hope you can be of help. :)
UPDATE:
I think I forgot to emphasize an important part of my question. How to implement a pub-sub system? (NOTE: I don't want the actual code, I'll manage that myself; just how to go about doing it is where I need some help). The problem is that I get quite boggled when writing the code - what to do how (my confusion is quite apparent from this question itself). Could please provide some references to read or some advice as to how to begin with this thing?
I am not sure if I understood you correctly; but I will summarize how I read it:
We have a real-time chat server that uses socket connections to publish new messages to all connected clients.
We have a database where we want to keep chat logs.
We have also a restful interface to access the realtime server to get current chats in a lazier manner.
And you want to architect your system this way:
In the above diagram, the components I circled with purple curve wants to be updated like all other clients. Am I right? I don't know what you meant with "Data Layer" but I thought it is a daemon that will be writing to database and also interfacing database for other components.
In this architecture, everything is okay in the direction you meant. I mean DataStore is connected by servers to access data, maybe to query client credentials for authentication, maybe to read user preferences etc.
For your other expectation from these components, I mean to allow these components to be updated like connected clients, why don't you allow them to be clients, too?
Your realtime server is a server for clients; but it is also a client for data layer, or database server, if we prefer a more common naming. So we already know that there is nothing that stops a server from being a client. Then, why can't our database system and restful system also be clients? Connect them to realtime server the same way you connect browsers and other clients. Let them enjoy being one of the people. :)
I hope I did not understand everything completely wrong and this makes sense for the question.