Server-side tasks for CouchDB

I need to perform some background tasks periodically in CouchDB (I guess that could be done through a cronjob; I'm just curious about native CouchDB approaches). I also need to retrieve some resources over HTTP on the server (e.g. to authenticate through OAuth2 and store the token permanently in some document). Could this be achieved somehow (e.g. by integrating Node.js with CouchDB)? I don't really like the idea of having a Node.js web server in front of CouchDB; I'm trying to avoid that additional layer and use CouchDB as the HTTP server, DB backend, and home for the server-side business logic.

CouchDB is a database. Its primary job is to store data. Yes, it has some JavaScript parts, but those are there to help it build indexes or convert to and from JSON.
Asking CouchDB to run periodic cron-style tasks, or to fetch HTTP resources, is like asking MySQL to do the same. Unfortunately, it's not possible.
You do not necessarily need an HTTP server in front. You can build a 2.1-tier architecture, with direct browser-to-CouchDB connections as before, but run your periodic or long-lasting back-end programs yourself; they simply read and write CouchDB data as a normal user (perhaps an admin user).
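For illustration, a minimal sketch of such a back-end program, assuming Node.js 18+ (built-in fetch), CouchDB on localhost:5984, and a hypothetical tokens database; fetchNewToken() stands in for your real OAuth2 exchange:

// Hypothetical periodic worker: refresh an OAuth2 token and persist it in a
// CouchDB document, talking to CouchDB as an ordinary HTTP client.
const COUCH = 'http://localhost:5984';
const AUTH = {
  // assumed admin credentials
  Authorization: 'Basic ' + Buffer.from('admin:secret').toString('base64'),
};

async function fetchNewToken() {
  // placeholder for your real OAuth2 call
  return 'example-token';
}

async function refreshToken() {
  const token = await fetchNewToken();

  // Read the current doc (if any) so the update carries the right _rev.
  const res = await fetch(`${COUCH}/tokens/oauth2`, { headers: AUTH });
  const existing = res.ok ? await res.json() : {};

  await fetch(`${COUCH}/tokens/oauth2`, {
    method: 'PUT',
    headers: { ...AUTH, 'Content-Type': 'application/json' },
    body: JSON.stringify({ _rev: existing._rev, token, updated: Date.now() }),
  });
}

setInterval(refreshToken, 60 * 60 * 1000); // hourly, cron-style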

Related

How to access a MongoDB database from a different Server

In a microservice architecture, I have 2 separate services with different mongoDb databases. Teacher-Service and Student-service.
Now I'm trying to create a login function, once a user submits an email I'd love to query both databases to determine if the user is either a teacher or a student.
How do I implement this in Node.js using the Express framework?
I guess you can make HTTP requests to both services (synchronous communication).
For example: GET http://teachers.service.com/api/users?email=<input-email> and GET http://students.service.com/api/users?email=<input-email>, with an appropriate auth header in mind, of course :).
Another way for services to communicate is asynchronous messaging, using transporters like RabbitMQ, Kafka, NATS and so on. You can look through their documentation to get up to speed.
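To illustrate the synchronous option, a hedged sketch of the login route (the service URLs follow the examples above; the auth header and response shapes are assumptions):

// Query both services over HTTP and decide the role from whichever answers.
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/login', async (req, res) => {
  const email = encodeURIComponent(req.body.email);
  // Assumed auth scheme; adjust to whatever the services actually expect.
  const headers = { Authorization: `Bearer ${process.env.SERVICE_TOKEN}` };

  // Ask both services in parallel; a 404 (or thrown error) means "not found".
  const [teacher, student] = await Promise.all([
    axios.get(`http://teachers.service.com/api/users?email=${email}`, { headers })
      .then((r) => r.data).catch(() => null),
    axios.get(`http://students.service.com/api/users?email=${email}`, { headers })
      .then((r) => r.data).catch(() => null),
  ]);

  if (teacher) return res.json({ role: 'teacher', user: teacher });
  if (student) return res.json({ role: 'student', user: student });
  res.status(404).json({ error: 'user not found' });
});

app.listen(3000);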

Firestore snapshot listener limit in node.js application

I have a nodejs backend that is serving as a gRPC server in front of a cloud firestore datastore. In perusing the best practices documentation for Firestore, I noticed: "limit snapshot listeners to 100 per client".
This is a pretty reasonable limitation if a "client" is a web UI or flutter application, but does the same limitation apply to a node.js or golang server connecting to the database via the admin interface? Suddenly, in the best case I am looking at 100 concurrent users per server process, which isn't super-great, if those users each request a single resource in streaming mode.
So: does that 100 snapshot listeners per client limitation apply when the "client" is actually a backend API service? And if so, what are some best practices to work around this?
(yes, I know I could just use the regular client API in the client itself, and will be doing that, I am mostly wondering about the limitations in an academic sense, as I was considering using streaming GRPC because there's a fair bit of data massaging that needs to happen between the storage representation and what the client consumes, so putting that all into a single place on a server where I control the rollout frequency is easier than dealing with data representation sync errors because some client is using an older implementation of a transformer method. Plus: that's extra data / code to ship to clients).
The 100-snapshot-listeners-per-client limit should apply to any client, including a backend API service.
Firestore has no way to distinguish where the calls come from, so there is no built-in mechanism to exempt backend clients from the limitation.
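As for workarounds, one hedged sketch (an assumption-laden pattern, not an official recommendation) is to multiplex a single collection-level listener across all connected users, so the listener count stays constant no matter how many users the server handles:

// One shared firebase-admin snapshot listener fans out to many server-side
// streams via an in-process EventEmitter. Collection and event names are
// hypothetical.
const admin = require('firebase-admin');
const { EventEmitter } = require('events');

admin.initializeApp(); // assumes GOOGLE_APPLICATION_CREDENTIALS is set

const bus = new EventEmitter();
bus.setMaxListeners(0); // many in-process subscribers, one Firestore listener

// A single listener for the whole collection counts once against the limit.
admin.firestore().collection('resources').onSnapshot((snap) => {
  snap.docChanges().forEach((change) => {
    bus.emit(change.doc.id, change.doc.data());
  });
});

// Each connected user subscribes in-process instead of opening a listener.
function subscribe(resourceId, send) {
  bus.on(resourceId, send);
  return () => bus.off(resourceId, send); // call on disconnect
}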

Scaling nodejs app with pm2

I have an app that receives data from several sources in realtime using logins and passwords. After data is received, it's stored in an in-memory store and replaced when new data is available. I also use sessions backed by MongoDB to authenticate user requests. The problem is that I can't scale this app using pm2, since I can use only one connection to my data source per login/password pair.
Is there a way to use a different login/password for each cluster, or to get the cluster ID inside the app?
Are memory values/sessions shared between clusters, or are they separate? Thank you.
So if I understood this question, you have a Node.js app that connects to a 3rd party using HTTP or another protocol, and since you only have a single credential, you cannot connect to said 3rd party from more than one instance. To answer your question: yes, it is possible to set up your clusters to use a unique user/password combination; the tricky part is how to assign these credentials to each cluster (assuming you don't want to hard-code them). You'd have to do this assignment when the servers start up, perhaps using a data store to hold the credentials and introducing some sort of locking mechanism for each credential (so that each credential is unique to a particular instance).
If I were in your shoes, however, I would create a new server whose sole job is to fetch this "realtime data" and store it somewhere available to the whole cluster, such as Redis or some persistent store. That server would be standalone, just getting the data. You can also attach a RESTful API to it, so that your other servers can communicate with it via HTTP or a message queue (again, Redis would work fine there as well).
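Regarding "get cluster ID inside app": pm2 sets a per-instance NODE_APP_INSTANCE environment variable in cluster mode, so a sketch like this (the credential pool and connect function are hypothetical) could pin one credential to each instance:

// Pick a credential based on pm2's instance index.
// In cluster mode pm2 sets NODE_APP_INSTANCE to 0, 1, 2, ...
const instanceId = parseInt(process.env.NODE_APP_INSTANCE || '0', 10);

// Hypothetical credential pool, e.g. loaded from config or a vault.
const credentials = [
  { user: 'feed-user-1', pass: process.env.FEED_PASS_1 },
  { user: 'feed-user-2', pass: process.env.FEED_PASS_2 },
  { user: 'feed-user-3', pass: process.env.FEED_PASS_3 },
];

const myCredential = credentials[instanceId % credentials.length];

// Stand-in for your app's real connect call to the data source.
function connectToDataSource({ user, pass }) {
  console.log('connecting to the realtime source as', user);
}

connectToDataSource(myCredential);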
'Realtime' is vague; are you using WebSockets? If HTTP requests are being made often enough, that could also be considered 'realtime'.
Possibly your problem is like something we encountered scaling SocketStream (WebSockets) apps, where the persistent connection requires the same requests to be routed to the same process. (There are other network topologies/architectures which don't require this, but that's another topic.)
You'll need to run pm2 in fork mode with one process only, plus a solution to make sessions sticky, e.g.:
https://www.npmjs.com/package/sticky-session
I have some example code but need to find it (it's been over a year since I deployed it). Basically you wind up using pm2 just for its 'always-on' feature; the sticky-session module handles the Node clusterisation. I may post an example later.

MongoDB document streaming with HTTP response?

After an all-day research into node.js realtime frameworks/wrappers (derby.js, Meteor, Socket.IO...), I realised that the more old-fashioned (sorry) approach of a RESTful API fits all my needs.
One of the reasons I thought I had to use an ongoing socket connection was that I want to stream my MongoDB documents from the database instead of loading them all into memory on the server. I think this is the recommended way because it minimizes the use of server resources.
But here is the problem:
Does simple document query streaming work with the ordinary HTTP request/response model, or do we have to establish an ongoing socket connection to stream all documents to the client?
Note: I only have to load the documents on an ajax call - there is no need for new documents to be pushed to the client (so really no need to be realtime).
Is there anything special to consider?
You can stream the results of the query using the standard HTTP request/response APIs.
The general sequence of calls is:
res.writeHead(<header content>)
res.write(<data>)
...
res.write(<data>)
res.end();
But you make those calls asynchronously, driven by the streaming events from your query.
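For example, a minimal sketch of that event-driven pattern with the official mongodb Node.js driver (the connection URL, database, and collection names are placeholders):

const http = require('http');
const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017'); // assumed URL

http.createServer(async (req, res) => {
  await client.connect(); // idempotent in the v4+ driver
  const cursor = client.db('app').collection('docs').find({});

  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.write('[');
  let first = true;

  const stream = cursor.stream();
  stream.on('data', (doc) => {
    // One document per event - nothing is buffered whole in server memory.
    res.write((first ? '' : ',') + JSON.stringify(doc));
    first = false;
  });
  stream.on('end', () => {
    res.write(']');
    res.end();
  });
}).listen(3000);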

How are Node.js+Socket.io+MongoDB webapps truly asynchronous?

I have a good old-style LAMP webapp. A week ago I needed to add a push notification mechanism to it.
Therefore, what I did was add node.js + socket.io on the server and poll the MySQL database every 10 seconds from node.js to check whether there were new items; if so, I would send them to the client(s) with socket.io.
I was pretty happy with the result, even if that is not a proper realtime notification (as there is a lag of up to 10 secs).
Now, I am about to build a new webapp which will need push notifications, too. I am wondering whether to go with the same approach as the first one (that I believe is more stable and mature) or to go totally Node.js, without PHP and Apache. As for the database, I have already decided to go for MongoDB.
Finally, my question is: if I go for Node.js+Socket.io+MongoDB will I get a truly near-real-time webapp? I mean, as soon as a new record is inserted into MongoDB, will there be some sort of event triggered that I can catch via node.js, do some checking on it and, if relevant, send the notification to the client? Or will there be anyway some sort of polling on the db server-side and lag, as with my first LAMP webapp?
A related question: can you build a realtime webapp on MySQL without doing any polling, as I did with my first app? Or do you need MongoDB (or Redis)?
I hope this question is not too silly - sorry, I am just starting with Node.js and co.
Thanks.
I understand your problem because I switched to node.js from php/apache/mysql too.
Generally node.js is stable; modules and your own scripts are the main sources of errors.
Realtime has nothing to do with the database; it's all about the client and server. You can query as much data as you want in your requests and push it to the other clients.
Choosing node.js is very wise, but it's harder to implement.
When you insert a new record into your db, the event is the request itself; you can emit a push event along with the database query, something like:
// Please note this is not real code, just an example of the idea
app.get('/query', function (request, response) {
  // Query your database
  db.query('SELECT * FROM users', function (err, rows) {
    // Push notification to dan
    socket.emit('database_query_executed', 'to_dan', rows);
    // End request
    response.end('success');
  });
});
Of course you can use MySQL! Or any database you want; as I said, realtime has nothing to do with the database, because the database sits in the middle of the process and is totally optional.
If you want to use node.js for push notifications and php/apache for MySQL, then you will need to make two requests, one to each server, something like:
// this is javascript
ajax('http://node.yoursite.com/push', node_options)
ajax('http://php.yoursite.com/mysql_query', php_options)
Or, if you want just one request, or you want to use a form, you can call your PHP and, inside PHP, create an HTTP or net request to node.js, something like:
// this is php
new HttpRequest('http://node.yoursite.com/push', HttpRequest::METH_GET);
Using:
A regular MongoDB Collection as the Store,
A MongoDB Capped Collection with Tailable Cursors as the Queue,
A Node worker with Socket.IO watching the Queue as the Worker,
A Node server to serve the page with the Socket.IO client, and to receive POSTed data (or however else the data gets added) as the Server
It goes like:
The new data gets sent to the Server,
The Server puts the data in the Store,
The Server adds the data's ObjectID to the Queue,
The Queue will send the newly arrived ObjectID to the open Tailable Cursor on the Worker,
The Worker goes and gets the actual data in the ObjectID from the Store,
The Worker emits the data through the socket,
The client receives the data from the socket.
This is 'push' from the initial addition of the data all the way to receipt at the client - no polling, so as real-time as you can get given the processing time at each step.
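A hedged sketch of the Worker step, assuming the modern mongodb and socket.io packages; the collection names, the 'new-data' event, and the dataId field are assumptions, not from the original answer:

const { MongoClient } = require('mongodb');
const { Server } = require('socket.io');

const io = new Server(8080); // Socket.IO server the clients connect to

async function runWorker() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const db = client.db('app');
  const queue = db.collection('queue');   // capped collection (the Queue)
  const store = db.collection('store');   // regular collection (the Store)

  // Tailable + awaitData keeps the cursor open, waiting for new inserts.
  // (The capped collection must already contain a document, or the
  // tailable cursor dies immediately.)
  const cursor = queue.find({}, { tailable: true, awaitData: true });
  for await (const item of cursor) {
    // The queue entry only carries the ObjectID; fetch the real document.
    const doc = await store.findOne({ _id: item.dataId });
    io.emit('new-data', doc); // push to all connected clients
  }
}

runWorker();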
Re: triggers in MongoDB - please see this answer: https://stackoverflow.com/a/12405093/1651408
There are much more convenient triggers in MySQL, but to call Node.js from them would require a bit of work with MySQL UDFs (user-defined functions), for instance pushing data through a Unix socket. Please note that this is necessary only when other applications (besides your Node.js process) are updating the database, and be sure to choose InnoDB as storage in this case (row- vs. table-level locking).
I can see no big problem with your technology choice of Socket.IO; even if client-side WebSockets aren't supported, you'll fall back (gracefully, I hope) to polling.
Finally, your question is not silly at all, since push technology is definitely superior to a flood of polling requests - it scales better. EDIT: However, I would not describe either technology as real-time.
Another EDIT: for a quite well-known and successful setup of this kind please read this: http://blog.fogcreek.com/the-trello-tech-stack/
Have you discovered Chole? It works separately from your web server and interfaces with it using HTTP POSTs. That way you can code your web app any way you want.
Actually, using push technology like Socket.IO helps you to use the server's resources efficiently, and it also bridges old and modern browsers by making a websocket or websocket-like connection.
Polling every 10 seconds means an HTTP request each time, which is expensive, especially when many users are present.
Unlike polling, push technology is relatively cheap: each user's client opens a dedicated socket (i.e. a websocket) to listen for the server's push notifications.
And usually your client-side JavaScript performs some action when a push notification is received.
Using your LAMP stack plus Socket.IO on a different port (other than 80) would be good enough to implement what you need.
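For instance, the client side of such a push setup can be as small as this sketch (the event name and port are assumptions, and the Socket.IO client script is presumed loaded):

// Listen for pushed records instead of polling every 10 seconds.
var socket = io('http://yoursite.com:8080'); // Socket.IO on a non-80 port
socket.on('new-data', function (record) {
  // update the page with the freshly pushed record
  console.log('push received:', record);
});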
But using Node.js + MongoDB + Socket.IO actually helps you manage your server's resources much more efficiently, because all three are non-blocking by nature.
If you understand the non-blocking concept correctly and implement your app appropriately, an identical app - same features, but in a different language with a different database - would be able to handle a lot more requests than the typical LAMP stack.
[Figure: a well-known chart comparing the thread-based (Apache) and non-blocking (Nginx) ways of handling concurrency.]
MySQL is a great database, and I believe you won't need joins or transactions for realtime notifications.
MongoDB does not have those two features unless you implement them yourself. Because it lacks them, and because of some characteristics of its own, MongoDB can store and fetch data much faster than traditional SQL databases, so switching from MySQL to MongoDB will decrease the time it takes to insert and fetch data.
With JS you can open a socket to your server (not in old browsers); the server runs an ad-hoc program (on an ad-hoc port, so you need permission to open a port and run a program on your server) that sends data to and from the client in (almost) realtime, without the HTTP protocol overhead. Old browsers will just fall back to a polling mechanism.
I can't see another way to do this (though there are probably ready-made frameworks that already do it).
