Tracking currently active users in node.js

I am building an application using node.js and socket.io. I would like to create a table of users who are actively browsing the site at any given moment, which will update dynamically.
I am setting a cookie to give each browser a unique ID, and I have a MySQL database of all users (whether online or not); however, I'm not sure how best to use these two pieces of information to determine who is, and who isn't, actively browsing right now.
The simplest way would seem to be to store the cookie & socket IDs in an array, but I have read that global variables (which presumably this would have to be) are generally bad, and to be avoided.
Alternatively I could create a new database table, where IDs are inserted and deleted when a socket connects/disconnects; but I'm not sure whether this would be overkill.
Is one of these methods any better than the other, or is there a way of tracking this information which I haven't thought of yet?

You can keep track of active users in memory without it being a global variable. It can simply be a module level variable. This is one of the advantages of the nodejs module system.
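For example, a minimal module-level store might look like this (the file name is illustrative). Node caches modules, so every file that requires this one shares the same Map without anything being global:

// activeUsers.js
const activeUsers = new Map();
module.exports = activeUsers;

// anywhere else in the app
const activeUsers = require('./activeUsers');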
The reasons to put it in a database instead of memory are:
You have multiple servers so you need a centralized place to put the data
You want the data stored persistently so if the server is restarted (normally or abnormally) you will have the recent data
The reasons for not putting it directly in a database:
It's a significant load of new database operations since you have to update the data on every single incoming request.
You can sometimes get the persistence without directly using a database by logging the access to a log file and then running cron jobs that parse the logs and do bulk addition of data to the database. This has a downside in that it's not as easy to query live data (since the most recent data is sitting in log files and hasn't been parsed into the database yet).
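A minimal sketch of that logging approach (the file name is illustrative, and req.cookies assumes the cookie-parser middleware is installed):

const fs = require("fs");

// append one JSON line per request; a separate cron job can parse the file
// later and bulk-insert the data into the database
app.use((req, res, next) => {
    const line = JSON.stringify({ userId: req.cookies.userID, t: Date.now() }) + "\n";
    fs.appendFile("access.log", line, err => {
        if (err) console.error("access log write failed:", err);
    });
    next();
});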
For an in-memory store, you could do something like this:
// middleware that keeps track of user access
let userAccessMap = new Map();
app.use((req, res, next) => {
    // get userId from the cookie (substitute your own cookie logic here;
    // req.cookies assumes the cookie-parser middleware is installed)
    let id = req.cookies.userID;
    let lastAccess = Date.now();
    // if you want to keep track of more than just lastAccess,
    // you can store an object of data here instead of just the lastAccess time.
    // To update it, you would get the previous object, update some properties
    // in it, and then set it back in the userAccessMap
    userAccessMap.set(id, lastAccess);
    next();
});
// routinely clean up the userAccessMap to remove old access times
// so it doesn't just grow forever
const cleanupFrequency = 30 * 60 * 1000; // run cleanup every 30 minutes
const cleanupTarget = 24 * 60 * 60 * 1000; // clean out users who haven't been here in the last day
setInterval(() => {
    let now = Date.now();
    for (let [id, lastAccess] of userAccessMap.entries()) {
        if (now - lastAccess > cleanupTarget) {
            // delete users who haven't been here in a long time
            userAccessMap.delete(id);
        }
    }
}, cleanupFrequency);
// Then, create some sort of administrative interface (probably with some sort of access protection)
// that gives you access to the user access info.
// This might even live in a separate web server on a separate port that isn't open to the general public
app.get("/userAccessData", (req, res) => {
    // perhaps convert the ids to human-readable user names by looking them up
    // also may want to sort the data by lastAccess
    res.json(Array.from(userAccessMap));
});
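Since the question mentions socket.io, the same idea can be driven by connect/disconnect events rather than HTTP middleware. A minimal sketch, assuming the client passes its cookie ID in the handshake (the auth field name here is an assumption, not something socket.io sets for you):

const activeUsers = new Map(); // module-level, shared across connections

io.on("connection", socket => {
    // assumes the client connected with io(url, { auth: { userID } })
    const id = socket.handshake.auth.userID;
    activeUsers.set(id, socket.id);
    io.emit("activeUsers", Array.from(activeUsers.keys())); // update everyone's table
    socket.on("disconnect", () => {
        activeUsers.delete(id);
        io.emit("activeUsers", Array.from(activeUsers.keys()));
    });
});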

Related

How to capture only the fields modified by user

I am trying to build a logging mechanism to log changes done to a record. I am currently logging the previous and the new record. However, as the site is very busy, I expect the logfile to grow seriously huge. To avoid this, I plan to capture only the modified fields.
Is there a way to capture only the modifications done to a record (in React), so my {request.body} will have fewer fields?
My server side is built with Node.js and the client side is React.
One approach you might want to consider is to add an onChange (universal) or onChangeText (React Native) listener to the text field and store the form updates in local state/variables.
Finally, when a user makes an action (submit, etc.) you can send the updated data to the logging module.
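A minimal sketch of that idea in React (component, field, and prop names are illustrative); only fields the user actually touched end up in the changes object:

import { useState } from "react";

function EditForm({ record, onSave }) {
    // holds only the fields the user actually edited
    const [changes, setChanges] = useState({});

    const handleChange = e =>
        setChanges(prev => ({ ...prev, [e.target.name]: e.target.value }));

    const handleSubmit = e => {
        e.preventDefault();
        onSave(changes); // submit just the modified fields
    };

    return (
        <form onSubmit={handleSubmit}>
            <input name="title" defaultValue={record.title} onChange={handleChange} />
            <input name="price" defaultValue={record.price} onChange={handleChange} />
            <button type="submit">Save</button>
        </form>
    );
}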
The best way I found, and what works for me, is this:
On the API server side, where I handle the update request, before hitting the database I compute the difference between the previous record and {request.body} using lodash, and send the result to my database update function.
var _ = require('lodash');

const difference = (object, base) => {
    function changes(object, base) {
        return _.transform(object, function (result, value, key) {
            if (!_.isEqual(value, base[key])) {
                result[key] = (_.isObject(value) && _.isObject(base[key])) ? changes(value, base[key]) : value;
            }
        });
    }
    return changes(object, base);
};

module.exports = difference;
I saved the above code in a file named diff.js and included it in my server-side file.
It worked well.
Thanks for giving the idea...
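For illustration, usage in an update route might look like this (the route, model, and toObject() call are hypothetical):

const difference = require("./diff");

app.put("/records/:id", async (req, res) => {
    const previous = await Record.findById(req.params.id); // hypothetical model lookup
    const changedFields = difference(req.body, previous.toObject());
    console.log("changed fields:", changedFields);          // log only what changed
    await Record.updateOne({ _id: req.params.id }, { $set: changedFields });
    res.sendStatus(204);
});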

Mongo Change Streams running multiple times (kind of): Node app running multiple instances

My Node app uses Mongo change streams, and the app runs 3+ instances in production (more eventually, so this will become more of an issue as it grows). So, when a change comes in, the change stream functionality runs as many times as there are processes.
How to set things up so that the change stream only runs once?
Here's what I've got:
const options = { fullDocument: "updateLookup" };
const sitesFilter = [
    {
        $match: {
            $and: [
                { "updateDescription.updatedFields.sites": { $exists: true } },
                { operationType: "update" }
            ]
        }
    }
];
const sitesStream = Client.watch(sitesFilter, options);
// Start listening to site stream
sitesStream.on("change", async change => {
    console.log("in site change stream", change);
    console.log(
        "in site change stream, update desc",
        change.updateDescription
    );
    // Do work...
    console.log("site change stream done.");
    return;
});
This can easily be done with MongoDB query operators alone. You can add a modulo query on the ID field where the divisor is the number of your app instances (N). The remainder is then an element of {0, 1, 2, ..., N-1}. If your app instances are numbered in ascending order from zero to N-1, you can write the filter like this:
const filter = [
    {
        "$match": {
            "$and": [
                // Other filters
                { "_id": { "$mod": [<number of instances>, <this instance's id>] } }
            ]
        }
    }
];
Doing this with strong guarantees is difficult but not impossible. I wrote about the details of one solution here: https://www.alechenninger.com/2020/05/building-kafka-like-message-queue-with.html
The examples are in Java but the important part is the algorithm.
It comes down to a few techniques:
Each process attempts to obtain a lock
Each lock (or each change) has an associated fencing token
Processing each change must be idempotent
While processing the change, the token is used to ensure ordered, effectively-once updates.
More details in the blog post.
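As a rough illustration of the lock-plus-fencing-token idea in MongoDB terms (collection and field names are assumptions, not taken from the blog post):

// lease-style lock: one instance holds it at a time, and every successful
// acquisition bumps a fencing token that gets attached to all resulting writes
async function acquireLock(db, name, ttlMs) {
    const now = Date.now();
    try {
        const res = await db.collection("locks").findOneAndUpdate(
            { _id: name, expiresAt: { $lte: now } },  // only a free (expired) lock matches
            { $set: { expiresAt: now + ttlMs }, $inc: { token: 1 } },
            { upsert: true, returnDocument: "after" } // mongodb driver v4/v5 result shape
        );
        return res.value; // { _id, expiresAt, token }
    } catch (err) {
        if (err.code === 11000) return null; // someone else currently holds the lock
        throw err;
    }
}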
It sounds like you need a way to partition updates between instances. Have you looked into Apache Kafka? Basically what you would do is have a single application that writes the change data to a partitioned Kafka Topic and have your node application be a Kafka consumer. This would ensure only one application instance ever receives an update.
Depending on your partitioning strategy, you could even ensure that updates for the same record always go to the same node app (if your application needs to maintain its own state). Otherwise, you can spread out the updates in a round robin fashion.
The biggest benefit to using Kafka is that you can add and remove instances without having to adjust configurations. For example, you could start one instance and it would handle all updates. Then, as soon as you start another instance, they each start handling half of the load. You can continue this pattern for as many instances as there are partitions (and you can configure the topic to have 1000s of partitions if you want), that is the power of the Kafka consumer group. Scaling down works in the reverse.
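A minimal consumer sketch with the kafkajs client (broker address, topic, and group id are illustrative):

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "change-worker", brokers: ["localhost:9092"] });
const consumer = kafka.consumer({ groupId: "site-change-workers" });

async function run() {
    await consumer.connect();
    await consumer.subscribe({ topics: ["site-changes"] });
    // the consumer group guarantees each partition is read by exactly one member
    await consumer.run({
        eachMessage: async ({ message }) => {
            const change = JSON.parse(message.value.toString());
            // Do work...
        }
    });
}

run().catch(console.error);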
While the Kafka option sounded interesting, it was a lot of infrastructure work on a platform I'm not familiar with, so I decided to go with something a little closer to home for me: sending an MQTT message to a little stand-alone app, and letting the MQTT server monitor messages for uniqueness.
siteStream.on("change", async change => {
    console.log("in site change stream");
    const mqttClient = mqtt.connect("mqtt://localhost:1883");
    const id = JSON.stringify(change._id._data);
    // You'll want to push more than just the change stream id obviously...
    mqttClient.on("connect", function() {
        mqttClient.publish("myTopic", id);
        mqttClient.end();
    });
});
I'm still working out the final version of the MQTT server, but the method to evaluate uniqueness of messages will probably store an array of change stream IDs in application memory, as there is no need to persist them, and evaluate whether to proceed any further based on whether that change stream ID has been seen before.
var mqtt = require("mqtt");
var client = mqtt.connect("mqtt://localhost:1883");
var seen = [];

client.on("connect", function() {
    client.subscribe("myTopic");
});

client.on("message", function(topic, message) {
    var context = message.toString().replace(/"/g, "");
    if (seen.indexOf(context) < 0) {
        seen.push(context);
        // Do stuff
    }
});
This doesn't include security, etc., but you get the idea.
What about having a field in the DB called status, updated with findOneAndUpdate based on the event received from the change stream? Say you get two events at the same time from the change stream. The first event will update the status to "started"; the other will throw an error because the status is already "started", so the second event will not process any business logic.
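A minimal sketch of that claim-before-processing idea; here a plain insert against the unique _id index gives the same atomic effect as the findOneAndUpdate described above (the collection name is illustrative):

sitesStream.on("change", async change => {
    try {
        // the first instance to insert this change id wins; the unique
        // _id index makes every other instance's insert fail
        await db.collection("processedChanges").insertOne({
            _id: change._id._data,
            status: "started"
        });
    } catch (err) {
        if (err.code === 11000) return; // another instance already claimed it
        throw err;
    }
    // ...business logic runs only on the claiming instance...
});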
I'm not claiming these are rock-solid, production-grade solutions, but I believe something like this could work.
Solution 1
applying Read-Modify-Write:
Add a version field to the document; all newly created docs have version=0
Receive a ChangeStream event
Read the document that needs to be updated
Perform the update on the model
Increment version
Update the document where both id and version match, otherwise discard the change (see the sketch below)
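A minimal sketch of that read-modify-write loop (collection and callback names are illustrative):

async function applyChange(collection, id, modify) {
    const doc = await collection.findOne({ _id: id });
    const updated = modify(doc); // perform the update on the model
    const res = await collection.updateOne(
        { _id: id, version: doc.version }, // matches only if no other replica won the race
        { $set: { ...updated, version: doc.version + 1 } }
    );
    if (res.modifiedCount === 0) {
        // another replica already applied this change; discard it
    }
}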
Yes, it creates 2 * n_application_replicas useless queries, so there is another option
Solution 2
Create a collection of ResumeTokens in Mongo which stores a collection -> token mapping
In the changeStream handler code, after a successful write, update the ResumeToken in that collection
Create a feature toggle that will disable reading the ChangeStream in your application
Configure only a single instance of your application to be the "reader"
In case of "reader" failure you might either enable reading on another node, or redeploy the "reader" node
As a result: there can be any number of non-reader replicas without any useless queries; a sketch of the resume-token bookkeeping follows
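A minimal sketch, assuming the official mongodb driver and illustrative collection names:

const tokens = db.collection("resumeTokens");

async function startReader() {
    // resume where the previous reader left off, if a token was saved
    const saved = await tokens.findOne({ _id: "sites" });
    const stream = db.collection("sites")
        .watch(sitesFilter, saved ? { resumeAfter: saved.token } : {});
    stream.on("change", async change => {
        // ...process the change...
        // change._id is the stream's resume token; persist it after a successful write
        await tokens.updateOne(
            { _id: "sites" },
            { $set: { token: change._id } },
            { upsert: true }
        );
    });
}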

How can I use Node.js to get the value of a child under random USERid and keys

It's a simple concept, yet finding the right answer is ridiculously stressful. I want to use Firebase Functions with Node.js to pull the "itemExpires" date from the Firebase database. The parent node is the userId (a random string) and each item is stored under a key node (another random string). Here's what the Firebase database looks like:
firebase-database-name
  + 82hfijcbwjbjfbergjagbk_USERID
    + "My Stuff"
      + gnjfgsdkfjsdf_ITEMkey
        -- "item name": whatever
        -- "itemExpires": 05-01-2017
        -- "itemType": whatever too
      + an3jeejwiag_ITEMkey
        -- "item name": whatever
        -- "itemExpires": 06-01-2017
        -- "itemType": whatever too
      + zzzndjgabblsbl_ITEMkey
        -- "item name": whatever
        -- "itemExpires": 07-01-2017
        -- "itemType": whatever too
I'm not asking for someone to write the code; a good reference will do. But there are so many ways to call data, and all I'm finding are ways to query a structured tree, not one with random IDs and keys.
Basically, my goal here is to run a third-party cron job through Firebase Functions that runs through each item entry and checks the expiration date against today's date. This is the wall I'm up against.
Bradley, I'm still not entirely clear on what you want to do. I suppose you intend to have multiple users (not just one as in the example) with multiple items, and to compare the current date against the expiration date of all items for every user at a specified time (using cron). There are some considerations you should take into account here:
Do you really need cron? (Or can you solve your problem more easily and natively with a plain JavaScript setInterval()?)
How often are you going to check your entire database, and how big is that database?
OK. So to explain: the first consideration is just a thought, and the logic behind it should be pretty obvious. The second consideration takes some explaining. Since I believe your Firebase data will NOT be static and will constantly change, you need a way to get those changes into your node script.
If you do not intend to run your scheduled task too often and the database is not a mammoth (taking ages to load), you could just do the following:
const firebase = require('firebase-admin');
const serviceAccount = require('yourServiceAccountPath.json');

firebase.initializeApp({
    credential: firebase.credential.cert(serviceAccount),
    databaseURL: "yourDatabaseURL"
});

setInterval(function(){ // or cron schedule
    firebase.database().ref().once('value', function(snapshot){
        let allYourUsers = snapshot.val();
        // iterate through them all
        // and do what you gotta do
    });
}, 10000); // your interval in milliseconds
However, this approach loads your whole database every time you want to check the items. If you have other data in the database, consider keeping users under a separate path and loading just that path. This approach is not recommended if your users are quite numerous and/or you want them checked very often. If that is the case and your data does not change very often, you could consider this alternative:
Here you use the on function, so your data is updated whenever it is edited, and you keep the checking part separate, like so:
const firebase = require('firebase-admin');
const serviceAccount = require('yourServiceAccountPath.json');

firebase.initializeApp({
    credential: firebase.credential.cert(serviceAccount),
    databaseURL: "yourDatabaseURL"
});

const databaseRef = firebase.database().ref();
let allYourUsers;
let allYourUsersStaticCopy;

databaseRef.on('value', function(snapshot){
    allYourUsers = snapshot.val();
});

setInterval(function(){ // or cron schedule
    if (allYourUsers) { // to ensure that data has loaded at least once
                        // (startup considerations)
        // shallow copy, so the set of users can't change mid-iteration
        allYourUsersStaticCopy = Object.assign({}, allYourUsers);
        // iterate through the static copy in order to avoid
        // your data changing while you are accessing it
        // and do what you gotta do
    }
}, 10000); // your interval in milliseconds
The upside with the second piece of code is that your data is loaded every time there is a change and not every time your check runs. If however your data changes very often (additions,deletions and edits) this approach might not be optimum.
In the case that your script runs often enough, the database is big enough, and the changes are frequent enough that none of the above is efficient, you might want to consider the optimum solution: load your users once and then attach child_added, child_removed, and child_changed listeners to update your existing users object. That way you receive only the changes, not a whole snapshot. You can read about these listeners in the Firebase docs; a rough sketch follows.
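A minimal sketch of that incremental approach (reusing the databaseRef from the previous snippet):

let users = {};

// receive only deltas instead of a whole snapshot on every change
databaseRef.on('child_added', function(snap) { users[snap.key] = snap.val(); });
databaseRef.on('child_changed', function(snap) { users[snap.key] = snap.val(); });
databaseRef.on('child_removed', function(snap) { delete users[snap.key]; });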
Hope this long post helps!
Assuming you have Firebase set up properly within your node project, you can do a one time read for your ITEMkey entries. Something like this:
var db = admin.database();
var ref = db.ref("82hfijcbwjbjfbergjagbk_USERID").child("My Stuff");
ref.once("value", function(snapshot) {
    var contents = snapshot.val();
    // Data returned here will be an object with all children
    // nodes under "My Stuff". You can access it by calling
    // snapshot.val() like I did above.
});
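To tie this back to the expiration check in the question, a hedged sketch using snapshot.forEach (it assumes the stored "itemExpires" strings parse with new Date(); adjust for your actual date format):

ref.once("value", function(snapshot) {
    var now = new Date();
    snapshot.forEach(function(itemSnap) {
        var item = itemSnap.val();
        if (new Date(item.itemExpires) < now) {
            console.log("expired:", itemSnap.key, item["item name"]);
        }
    });
});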

Keep Object In Memory Between Requests with SailsJS/ Express

I'm building a server using SailsJS (a framework built on top of Express) and I need to keep an object in memory between requests. I would like to do this because loading it to/from a database takes way too long. Any ideas how I could do this?
Here's my code:
var params = req.params.all();

Network.findOne({ id: params.id }, function(err, network) {
    if (network) {
        var synapticNetwork = synaptic.Network.fromJSON(network.jsonValue);
        if (synapticNetwork) { ...
Specifically, the fromJSON() function takes way too long and I would rather keep the synapticNetwork object in memory while the server is running (aka. load it when the server starts and just save periodically).
There are plenty of libraries out there for caching purposes, one of which is node-cache, as you've mentioned. Most of them share a similar API; here, for example, is memory-cache:
var cache = require('memory-cache');
// now just use the cache
cache.put('foo', 'bar');
console.log(cache.get('foo'))
You can also implement your own module and just require it wherever you need:
var cache = {};

module.exports = {
    put: function(key, item) {
        cache[key] = item;
    },
    get: function(key) {
        return cache[key];
    }
};
There are a lot of potential solutions. The first and most obvious one is using some session middleware for express. Most web frameworks should have some sort of session solution.
https://github.com/expressjs/session
The next option would be to use a caching utility like what Vsevolod suggested. It accomplishes pretty much the same thing as session, except if the data needs to be tied to a user/session then you'll have to store some kind of identifier in the session and use that to retrieve from the cache. Which I think is a bit redundant if that's your use-case.
There are also utilities that will expand your session middle-ware and persist objects in session to a database or other kinds of data stores, so that session information isn't lost even after server restarts. You still get the speed of an in-memory store, but backed by a database in case the in-memory store gets blown away.
Another option is to use Redis. You still have to serialize/deserialize your objects, but Redis is an in-memory data store and is super quick to write to and read from.
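A minimal sketch of the Redis approach with the node-redis v4 client (the key naming and loadFromDb callback are illustrative); as noted, you still serialize/deserialize, but reads come from a fast in-memory store instead of the database:

const { createClient } = require("redis"); // node-redis v4 API

const client = createClient(); // defaults to localhost:6379

async function getNetworkJSON(id, loadFromDb) {
    if (!client.isOpen) await client.connect();
    const key = "network:" + id;
    let json = await client.get(key);
    if (!json) {
        json = JSON.stringify(await loadFromDb(id)); // fall back to the database once
        await client.set(key, json);
    }
    return JSON.parse(json); // still needs synaptic.Network.fromJSON() afterwards
}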

Meteor publish method

I just started with Meteor.js, and I'm struggling with its publish method. Below is one publish method.
//Server side
Meteor.publish('topPostsWithTopComments', function() {
    var topPostsCursor = Posts.find({}, {sort: {score: -1}, limit: 30});
    var userIds = topPostsCursor.map(function(p) { return p.userId; });
    return [
        topPostsCursor,
        Meteor.users.find({'_id': {$in: userIds}})
    ];
});
// Client side
Meteor.subscribe('topPostsWithTopComments');
Now I don't get how I can use the published data on the client. I mean, I want to use the data provided by topPostsWithTopComments.
Problem is detailed below
When a new post enters the top 30 list, two things need to happen:
The server needs to send the new post to the client.
The server needs to send that post’s author to the client.
Meteor is observing the Posts cursor (topPostsCursor) returned in the array, and so will send the new post down as soon as it's added, ensuring the client will receive the new post straight away.
However, consider the Meteor.users cursor returned alongside it. Even if the cursor itself is reactive, it's now using an outdated value for the userIds array (which is a plain old non-reactive variable), which means its result set will be out of date as well.
This is why, as far as that cursor is concerned, there is no need to re-run the query, and Meteor will happily continue to publish the same 30 authors for the original 30 top posts ad infinitum.
So unless the whole code of the publication runs again (to construct a new list of userIds), the cursor is no longer going to return the correct information.
Basically what I need is:
If any change happens in Posts, then it should have the updated users list, without querying the users collection again. I found some useful mrt modules.
Please share your views!
-Neelesh
When you publish data on the server you're just publishing what the client is allowed to query. This is for security. After you subscribe to your publication you still need to query what the publication returned.
if (Meteor.isClient) {
    Meteor.subscribe('topPostsWithTopComments');
    // This returns all the records published with topPostsWithTopComments from the Posts Collection
    var posts = Posts.find({});
}
If you wanted to only publish posts that the current user owns you would want to filter them out in the publish method on the server and not on the client.
I think @Will Brock already answered your question, but maybe it becomes clearer with an abstract example.
Let's construct two collections named collectiona and collectionb.
// server and client
CollectionA = new Meteor.Collection('collectiona');
CollectionB = new Meteor.Collection('collectionb');
On the server you could now call Meteor.publish with 'collectiona' and 'collectionb' separately to publish both record sets to the client. This way the client could then also separately subscribe to them.
But instead you can also publish multiple record sets in a single call to Meteor.publish by returning multiple cursors in an array. Just like in the standard publishing procedure you can of course define what is being sent down to the client. Like so:
if (Meteor.isServer) {
    Meteor.publish('collectionAandB', function() {
        // constrain records from 'collectiona': limit number of documents to one
        var onlyOneFromCollectionA = CollectionA.find({}, {limit: 1});
        // all cursors in the array are published
        return [
            onlyOneFromCollectionA,
            CollectionB.find()
        ];
    });
}
Now on the client there is no need to subscribe to 'collectiona' and 'collectionb' separately. Instead you can simply subscribe to 'collectionAandB':
if (Meteor.isClient) {
    Meteor.subscribe('collectionAandB', function () {
        // callback to use collection A and B on the client once
        // they are ready
        // only one document of collection A will be available here
        console.log(CollectionA.find().fetch());
        // all documents from collection B will be available here
        console.log(CollectionB.find().fetch());
    });
}
So I think what you need to understand is that no array containing the two cursors is sent to the client. Returning an array of cursors from the function passed to Meteor.publish merely tells Meteor to publish all cursors contained in the array. You still need to query the individual records using your collection handles on the client (see @Will Brock's answer).
