Creating A Database Snapshot Listener - node.js

Forgive me if I'm heading down the wrong path here, if so, would be grateful if someone could point me in the right direction.
I'm curious about building a snapshot listener in Node/Express that returns database updates similar to how the snapshot listener on cloud firestore works.
For example, a front-end client would be able to listen through a single call, then receive updates in real-time without having to make additional calls.
For simplicity's sake, imagine for some reason we wanted to wrap Firestore's snapshot listener in a node/express function, then pass it onto the client and have identical functionality. How would you go about doing this, or am I totally wide of the mark?

Answering this as Community wiki. As mentioned in the comments,
Building your own persistent listener is definitely possible. If Firebase can do it, so can others.
Web sockets are an option indeed, but not required. Firestore's realtime listeners don't use web sockets for example, but the listeners on Firebase's other database (Realtime Database) do.

Related

Node.js REST API wrapper for async messaging

Given an event driven micro service architecture with asynchronous messaging, what solutions are there to implementing a 'synchronous' REST API wrapper such that requests to the REST interface wait for a response event to be published before sending a response to the client?
Example: POST /api/articles
Internally this would send a CreateArticleEvent in the services layer, eventually expecting an ArticleCreatedEvent in response containing the ID of the persisted article.
Only then would the REST interface response to the end client with this ID.
Dealing with multiple simultaneous requests - is keeping an in-memory map of inflight requests in the REST api layer keyed by some correlating identifier conceptually a workable approach?
How can we deal with timing out requests after a certain period?
Generally you don't need to maintain a map of in-flight requests, because this is basically done for you by node.js's http library.
Just use express as it's intended, and this is probably something you never really have to worry about, as long as you avoid any global state.
If you have a weirder pattern in mind to build, and not sure how to solve it. It might help to share a simple example. Chances are that it's not hard to rebuild and avoid global state.
With express, have you tried middleware? You can chain a series of callback functions with a certain timeout after the article is created.
I assume you are in the context of Event Sourcing and microservices? If so I recommend that you don't publish a CreateArticleEvent to the event store, and instead directly create the article in the database and then publish the ArticleCreatedEvent to the Event store.
Why you ask? Generally this pattern is created to orchestrate different microservices. In the example show in the link above, it was used to orchestrate how the Customer service should react when an Order is created. Note the past tense. The Order Service created the order, and Customer Service reacts to it.
In your case it is easier (and probably better) to just insert the order into the database (by calling the ArticleService directly) and responding with the article ID. Then just publish the ArctileCreatedEvent to your event store, to trigger other microservices that may want to listen to it (like, for example, trigger a notification to the editor for review).
Event Sourcing is a good pattern, but we don't need to apply it to everything.

Is possible integrate Nodejs with Cakephp?

I want to monitor in real time the data that users enter in comments table.
I have an Apache server running, and suppose that has a node server on port 1337.
How would I do that every time someone save new data, eg return me the total number of table rows in comment and show it in a view?
Maybe way is to make the $this->Comment->save($this->request->data); using a different port using Httpsockect?
Yes, it is possible.
You have multiple ways of solving this, let me give you my ideas
You can simply use long-polling and don't use Node.js at all. It's a suitable solution if there won't be too much traffic there, otherwise you will have a bad time.
You can use websockets and don't use Node.js at all. Here you have a basic guide about websockets and PHP. Although, I am almost sure you won't be able to create "rooms", that is, sending notifications for specific comments.
You can also use Ratchet. This is a more sophisticated library to handle websockets and it supports rooms.
Lastly, if you want to full dive in with Node.js and CakePHP, I would suggest start by watching this talk given on Cakefest 2012 which exactly describe your scenario.
After you have watched that, you might want to learn a little about Socket.io. This is a more complex solution, but it's what I have used when integrating CakePHP and Node.js to create real time applications.
The strategy here is to have the users join a room when they visit /article/view/123, let's say the room name is the articleID, then socket.io will be listening for events happening in this room.
You will have a Cakephp method that handles the save. Then, when user submits the form you don't call directly the Cake action, you have socket.io to dispatch an event, then in your event you pass the data to the server (Node.js) and nodejs will call your cakephp function that saves the data. When Nodejs receives confirmation from CakePHP then you broadcast an event (using socket.io), this event will let know all users connected to that room that a comment has been made.
You have basically the choice between Websockets and long polling.
Websockets (with Ratchet and Autobahn.js)
Long Polling Using Comet
Decide which technology you want to use and start implementing your use case. Consider that Websockets are more or less new. Depending on your requirements you might not be able to use Websockets because you might have to support crappy browsers. See this page.

How are Node.js+Socket.io+MongoDB webapps truly asynchronous?

I have a good old-style LAMP webapp. A week ago I needed to add a push notification mechanism to it.
Therefore, what I did was to add node.js+socket.io on the server and poll the MySQL database every 10 seconds using node.js to check whether there were new items: if so, I would have sent them to the client(s) with socket.io.
I was pretty happy with the result, even if that is not a proper realtime notification (as there is a lag of up to 10 secs).
Now, I am about to build a new webapp which will need push notifications, too. I am wondering whether to go with the same approach as the first one (that I believe is more stable and mature) or to go totally Node.js, without PHP and Apache. As for the database, I have already decided to go for MongoDB.
Finally, my question is: if I go for Node.js+Socket.io+MongoDB will I get a truly near-real-time webapp? I mean, as soon as a new record is inserted into MongoDB, will there be some sort of event triggered that I can catch via node.js, do some checking on it and, if relevant, send the notification to the client? Or will there be anyway some sort of polling on the db server-side and lag, as with my first LAMP webapp?
A related question: can you build a realtime webapp on MySQL without doing any polling as I did with my first app. Or do you need MongoDB (or Redis)?
I hope this question is not too silly - sorry, I am just starting with Node.js and co.
Thanks.
I understand your problem because I switched to node.js from php/apache/mysql too.
Generally node.js is stable, modules and your scripts are the main reasons for errors
Real-time has nothing to do with database, it's all about client and server, you can query as many data as you want in your requests and push it to the other client.
Choosing node.js is very wise but it's harder to implement.
When you insert a new record to your db, the event is the request itself, you will make a push event along with the database query something like:
// Please note this is not real code, just an example of the idea
app.get('/query', function(request, response){
// Query your database
db.query('SELECT * FROM users', function(rows){
// Push notification to dan
socket.emit('database_query_executed', 'to_dan', rows);
// End request
response.end('success');
})
})
Of course you can use MySQL! And any database you want, as I said real-time has nothing to do with databases because the database is in the middle of the process and it's totally optional.
If you want to use node.js for push notifications and php/apache for mysql then you will need to create 2 requests for each server something like:
// this is javascript
ajax('http://node.yoursite.com/push', node_options)
ajax('http://php.yoursite.com/mysql_query', php_options)
or if you want just one request, or you want to use a form, you can call your php and inside php you can create an http or net request to node.js from php, something like:
// this is php
new HttpRequest('http://node.youtsite.com/push', HttpRequest::METH_GET);
Using:
A regular MongoDB Collection as the Store,
A MongoDB Capped Collection with Tailable Cursors as the Queue,
A Node worker with Socket.IO watching the Queue as the Worker,
A Node server to serve the page with the Socket.IO client, and to receive POSTed data (or however else the data gets added) as the Server
It goes like:
The new data gets sent to the Server,
The Server puts the data in the Store,
The Server adds the data's ObjectID to the Queue,
The Queue will send the newly arrived ObjectID to the open Tailable Cursor on the Worker,
The Worker goes and gets the actual data in the ObjectID from the Store,
The Worker emits the data through the socket,
The client receives the data from the socket.
This is 'push' from the initial addition of the data all the way to receipt at the client - no polling, so as real-time as you can get given the processing time at each step.
Re: triggers in MongoDB - please see this answer: https://stackoverflow.com/a/12405093/1651408
There are much more convenient triggers in MySQL, but to call Node.js from them would require a bit of work with MySQL UDFs (user-defined functions), for instance pushing data through a Unix socket. Please note that this is necessary only when other applications (besides your Node.js process) are updating the database, and be sure to choose InnoDB as storage in this case (row- vs. table-level locking).
Can see no big problem with your technology choice of sockets.io, even if client-side web sockets aren't supported, you'll fall back (gracefully, I hope) to polling.
Finally, your question is not silly at all, since push technology is definitely superior to the flood of polling requests - it scales better. EDIT: However, would not describe either technology as real-time.
Another EDIT: for a quite well-known and successful setup of this kind please read this: http://blog.fogcreek.com/the-trello-tech-stack/
Have you discovered Chole? It works separately from your web sever and interfaces with it by using HTTP POSTs. That way you can code your web app any which way you want.
Actually Using Push Technology like Socket.IO helps you to use
the server's resource efficiently and also helps you to leverage old browsers to modern browsers making websocket or websocket-like connection.
10 sec polling is a HTTP request which is expensive especially when a lot of users present.
Unlike polling technology, push technology is relatively cheap. Users' client is opening a dedicated socket(ie. websocket) to listen to the server's push notification.
And usually your client-side JavaScript do some actions when the push notification is received.
Using your LAMP stack and Socket.IO with different port (other than 80) will be good enough to implement what you need.
But using Node.js + MongoDB + Socket.IO actually helps you to manage your server's resource much efficiently.
Because those three have non-blocking nature.
If you understand non-blocking concept correctly and implement your app appropriately,
your identical app, an app with same feature but with different language and different database, would be able to handle a lot more requests than general LAMP stack.
Above picture is a famous chart of comparing Non-blocking vs Thread way to handle concurrency
Apache(Thread) vs Nginx(Non-blocking)
MySQL is a great database. I believe you won't need join and transactions for realtime notification.
MongoDB does not have those two features unless you implement similar features by yourself.
Because of not having those two and some characteristics of its own, MongoDB can store and fetch data much faster than traditional SQL databases.
Switching from MySQL to MongoDB will decrease the time taking to insert and fetch data.
with JS you can open a socket to your server (not old browser), the server will have a ah-hoc program (on an ad-hoc port, so you need the permission to open door and run program on your server) that will send data (almost) realtime from and to the client, and without the HTTP's protocol overhead.old browser will just fall-back to polling mechanism.
I can't see other way to do this (probably there are already "coocked" framework that do this)

RESTful backend and socket.io to sync

Today, i had the idea of the following setup. Create a nodejs server along with express and socket.io. With express, i would create a RESTful API, which is connected to a mongo. BackboneJS or similar would connect the client to that REST API.
Now every time the mongodb(ie the data in it iam interested in) changes, socket.io would fire an event to the client, which would carry a courser to the data which has changed. The client then would trigger the appropriate AJAX requests to the REST to get the new data, where it needs it.
So, the socket.io connection would behave like a synchronize trigger. It would be there for the entire visit and could also manage sessions that way. All the payload would be send over http.
Pros:
REST API for use with other clients than web
Auth could be done entirely over socket.io. Only sending token along with REST requests.
Use the benefits of REST.
Would also play nicely with pub/sub service like Redis'
Cons:
Greater overhead, than using pure socket.io.
What do you think, are there any great disadvantages i did not think of?
I agree with #CharlieKey, you should send the updated data rather than re-requesting.
This is exactly what Tower is doing:
save some data: https://github.com/viatropos/tower/blob/development/src/tower/model/persistence.coffee#L77
insert into mongodb (cursor is a query/persistence abstraction): https://github.com/viatropos/tower/blob/development/src/tower/model/cursor/persistence.coffee#L29
notify sockets: https://github.com/viatropos/tower/blob/development/src/tower/model/cursor/persistence.coffee#L68
emit updated records to client: https://github.com/viatropos/tower/blob/development/src/tower/server/net/connection.coffee#L62
The disadvantage of using sockets as a trigger to re-request with Ajax is that every connected client will have to fetch the data, so if 100 people are on your site there's going to be 100 HTTP requests every time data changes - where you could just reuse the socket connections.
I think that pushing the updated data with the socket.io event would be better than re-requesting the lastest. Even better you could only push the modified pieces of data decreasing the amount of data sent over the line. Overall though a interesting idea.
I'd look into Now.js since it does pretty much exactly what you need.
It creates a namespace which is shared among the client and server. The server can call functions on the client directly and vice versa.
That is if you insist on your current infrastructure decision to use MongoDB and Node.js, otherwise there would be CouchDB which is a full web server and document database with sophisticated replication mechanisms built-in.

How to model Push Notifications on server

Brief Description:
Well, since many days I've been looking for an answer to this question but there seems to be answers for 'How to create a Push Notification Server' and like questions. I am using node.js and it's quite easy to 'create' a push notification server using sock.js (I've heard socket.io isn't good as compared to sock.js). No problem till here. But what I want is how to model such a server.
Details:
OK, so, let's say I've an application where there's a chat service (just an example this is, actual thing is big as you might have guessed). A person sends a message in a room and all the people in the room get notified. But what I want is a 'stateful' chat - that is, I want to store the messages in a data store. Here's where the trouble comes. Storing the message in the database and later telling everyone that "Hey, there's a message for you". This seems easy when we need the real-time activity for just one part of the app. What to do when the whole app is based on real-time communication? Besides this, I also want to have a RESTful api.
My solution (with which I am not really happy)
What I thought of doing was this: (on the server side of course)
Data Store
||
Data Layer (which talks to data store)
||
------------------
| |
Real-Time Server Restful server
And here, the Real-time server listens to interesting events that the data-layer publishes. Whenever something interesting happens, the server notifies the client. But which client? - This is the problem with my method
Hope you can be of help. :)
UPDATE:
I think I forgot to emphasize an important part of my question. How to implement a pub-sub system? (NOTE: I don't want the actual code, I'll manage that myself; just how to go about doing it is where I need some help). The problem is that I get quite boggled when writing the code - what to do how (my confusion is quite apparent from this question itself). Could please provide some references to read or some advice as to how to begin with this thing?
I am not sure if I understood you correctly; but I will summarize how I read it:
We have a real-time chat server that uses socket connections to publish new messages to all connected clients.
We have a database where we want to keep chat logs.
We have also a restful interface to access the realtime server to get current chats in a lazier manner.
And you want to architect your system this way:
In the above diagram, the components I circled with purple curve wants to be updated like all other clients. Am I right? I don't know what you meant with "Data Layer" but I thought it is a daemon that will be writing to database and also interfacing database for other components.
In this architecture, everything is okay in the direction you meant. I mean DataStore is connected by servers to access data, maybe to query client credentials for authentication, maybe to read user preferences etc.
For your other expectation from these components, I mean to allow these components to be updated like connected clients, why don't you allow them to be clients, too?
Your realtime server is a server for clients; but it is also a client for data layer, or database server, if we prefer a more common naming. So we already know that there is nothing that stops a server from being a client. Then, why can't our database system and restful system also be clients? Connect them to realtime server the same way you connect browsers and other clients. Let them enjoy being one of the people. :)
I hope I did not understand everything completely wrong and this makes sense for the question.

Resources