Meteor publish method - node.js

I just started with Meteor.js, and I'm struggling with its publish method. Below is one publish method.
// Server side
Meteor.publish('topPostsWithTopComments', function() {
  var topPostsCursor = Posts.find({}, {sort: {score: -1}, limit: 30});
  var userIds = topPostsCursor.map(function(p) { return p.userId });
  return [
    topPostsCursor,
    Meteor.users.find({'_id': {$in: userIds}})
  ];
});

// Client side
Meteor.subscribe('topPostsWithTopComments');
Now I don't understand how I can use the published data on the client. That is, I want to use the data that topPostsWithTopComments provides.
The problem is detailed below.
When a new post enters the top 30 list, two things need to happen:
The server needs to send the new post to the client.
The server needs to send that post’s author to the client.
Meteor is observing the Posts cursor (topPostsCursor), and so will send the new post down as soon as it's added, ensuring the client receives the new post straight away.
However, consider the Meteor.users cursor in the returned array. Even if the cursor itself is reactive, it's now using an outdated value for the userIds array (which is a plain old non-reactive variable), which means its result set will be out of date as well.
As far as that cursor is concerned, there is no need to re-run the query, so Meteor will happily continue to publish the same 30 authors for the original 30 top posts ad infinitum.
So unless the whole code of the publication runs again (to construct a new list of userIds), the cursor is no longer going to return the correct information.
Basically what I need is:
if anything changes in Posts, the publication should have the updated users list, without querying the users collection again. I found some useful mrt modules:
link1 | link2 | link3
Please share your views!
-Neelesh

When you publish data on the server you're just publishing what the client is allowed to query. This is for security. After you subscribe to your publication you still need to query what the publication returned.
if (Meteor.isClient) {
  Meteor.subscribe('topPostsWithTopComments');

  // This returns all the records published with topPostsWithTopComments from the Posts collection
  var posts = Posts.find({});
}
If you wanted to publish only the posts that the current user owns, you would filter them in the publish method on the server, not on the client.
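A minimal sketch of that server-side filter (assuming posts store their owner's id in a userId field, as in the question):

if (Meteor.isServer) {
  Meteor.publish('myPosts', function() {
    // inside a publish function, this.userId is the logged-in user,
    // so only that user's posts are ever sent down the wire
    return Posts.find({ userId: this.userId });
  });
}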

I think @Will Brock has already answered your question, but maybe it becomes clearer with an abstract example.
Let's construct two collections named collectiona and collectionb.
// server and client
CollectionA = new Meteor.Collection('collectiona');
CollectionB = new Meteor.Collection('collectionb');
On the server you could now call Meteor.publish with 'collectiona' and 'collectionb' separately to publish both record sets to the client. This way the client could then also separately subscribe to them.
But instead you can also publish multiple record sets in a single call to Meteor.publish by returning multiple cursors in an array. Just like in the standard publishing procedure you can of course define what is being sent down to the client. Like so:
if (Meteor.isServer) {
  Meteor.publish('collectionAandB', function() {
    // constrain records from 'collectiona': limit number of documents to one
    var onlyOneFromCollectionA = CollectionA.find({}, {limit: 1});

    // all cursors in the array are published
    return [
      onlyOneFromCollectionA,
      CollectionB.find()
    ];
  });
}
Now on the client there is no need to subscribe to 'collectiona' and 'collectionb' separately. Instead you can simply subscribe to 'collectionAandB':
if (Meteor.isClient) {
  Meteor.subscribe('collectionAandB', function () {
    // callback to use collection A and B on the client once
    // they are ready

    // only one document of collection A will be available here
    console.log(CollectionA.find().fetch());

    // all documents from collection B will be available here
    console.log(CollectionB.find().fetch());
  });
}
So what you need to understand is that no array containing the two cursors is ever sent to the client. Returning an array of cursors from the function passed to Meteor.publish merely tells Meteor to publish all the cursors in the array. You still need to query the individual records through your collection handles on the client (see @Will Brock's answer).
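As for the stale-userIds problem in the original question, the usual fix is a reactive join. Below is a sketch with the reywood:publish-composite package (this assumes that package's API; it is not part of core Meteor):

Meteor.publishComposite('topPostsWithAuthors', {
  find: function() {
    // re-run reactively whenever the top-30 set changes
    return Posts.find({}, { sort: { score: -1 }, limit: 30 });
  },
  children: [{
    find: function(post) {
      // publish each top post's author alongside it
      return Meteor.users.find({ _id: post.userId });
    }
  }]
});

Each time the top-30 cursor changes, the child find functions re-run, so the published authors stay in sync with the posts.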

Related

Firebase doc changes

Thanks for your help. I am new to Firebase, and I am designing an application with Node.js. What I want is that every time a change is detected in a document, a function is invoked that creates or updates the filesystem according to the new structure of the data in the Firebase document. Everything works fine, but the problem I have is that if the document is updated with 2 or more attributes, the makeBotFileSystem function is invoked the same number of times, which causes problems for me: performance issues, and file-overwriting issues, since what I do is generate or update multiple files.
I would like to wait until all the information in the document has finished updating instead of reacting attribute by attribute. Is there any way? This is my code:
let botRef = firebasebotservice.db.collection('bot');
botRef.onSnapshot(querySnapshot => {
  querySnapshot.docChanges().forEach(change => {
    if (change.type === 'modified') {
      console.log('bot-changes ' + change.doc.id);
      const botData = change.doc.data();
      botData.botId = change.doc.id;
      // HERE I CREATE OR UPDATE THE FILESYSTEM STRUCTURE ACCORDING TO THE DATA CHANGES
      fsbotservice.makeBotFileSystem(botData);
    }
  });
});
The onSnapshot function will notify you any time a document changes. If property changes are committed one by one instead of updating the document all at once, then you will receive multiple snapshots.
One way to partially solve the multiple-snapshot issue is to change the code that updates the document so that it commits all property changes in a single operation, so you only receive one snapshot.
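For example, committing several attributes in one update() call (a sketch reusing the question's firebasebotservice.db handle; the document id and field names are made up) produces a single 'modified' snapshot:

// one atomic write -> one snapshot on the listener side
await firebasebotservice.db.collection('bot').doc(botId).update({
  name: 'new name',        // placeholder field
  intents: updatedIntents  // placeholder field
});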
Nonetheless, you should design the function triggered by the snapshot so that it can handle multiple document changes without breaking. Document updates will happen whether they arrive as single or multiple property changes, so your code should be able to handle them. IMHO the problem is the filesystem update rather than how many snapshots are received.
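If you can't change the writer, debouncing on the listener side is one way to coalesce a burst of snapshots for the same document. A sketch (the 500 ms window is arbitrary):

const pending = new Map(); // doc id -> timer

botRef.onSnapshot(querySnapshot => {
  querySnapshot.docChanges().forEach(change => {
    if (change.type !== 'modified') return;
    const id = change.doc.id;
    // restart the timer on every snapshot; only the last state
    // within the window triggers the filesystem build
    clearTimeout(pending.get(id));
    pending.set(id, setTimeout(() => {
      pending.delete(id);
      const botData = change.doc.data();
      botData.botId = id;
      fsbotservice.makeBotFileSystem(botData);
    }, 500));
  });
});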
You should use the docChanges() method like this:
db.collection("cities").onSnapshot(querySnapshot => {
  let changes = querySnapshot.docChanges();
  for (let change of changes) {
    const data = change.doc.data();
    console.log(data);
  }
});

Firestore: get document back after adding it / updating it without additional network calls

Is it possible to get document back after adding it / updating it without additional network calls with Firestore, similar to MongoDB?
I find it stupid to first make a call to add / update a document and then make an additional call to get it.
As you have probably seen in the documentation of the Node.js (and JavaScript) SDKs, this is not possible, neither with the methods of a DocumentReference nor with those of a CollectionReference.
More precisely, the set() and update() methods of a DocumentReference both return a Promise containing void, while the CollectionReference's add() method returns a Promise containing a DocumentReference.
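So with the Node.js SDK, reading the stored document back is necessarily a second round trip, e.g. (a sketch; 'cities' is a made-up collection):

// add() resolves with a DocumentReference, not the document's data
const ref = await db.collection('cities').add({ name: 'Tokyo' });

// a second call is needed to see what was actually stored
const snapshot = await ref.get();
console.log(snapshot.id, snapshot.data());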
Side Note (in line with answer from darrinm below): It is interesting to note that with the Firestore REST API, when you create a document, you get back (i.e. through the API endpoint response) a Document object.
When you add a document to Cloud Firestore, the server can affect the data that is stored. A few ways this may happen:
If your data contains a marker for a server-side timestamp, the server will expand that marker into the actual timestamp.
If your data is not permitted according to your server-side security rules, the server will reject the write operation.
Since the server affects the contents of the Document, the client can't simply return the data that it already has as the new document. If you just want to show the data that you sent to the server in your client, you can of course do so by simply reusing the object you passed into setData(...)/addDocument(data: ...).
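A sketch of that reuse (showCity is a placeholder; this only works when no field is server-generated):

const data = { name: 'Tokyo' };
const ref = await db.collection('cities').add(data);

// no extra read: render from the object we already have
showCity({ id: ref.id, ...data });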
This appears to be an arbitrary limitation of the Firestore JavaScript API. The Firestore REST API returns the updated document on the same call.
https://firebase.google.com/docs/firestore/reference/rest/v1beta1/projects.databases.documents/patch
I did this to get the ID of a newly created document, and then used it in something else.
Future<DocumentReference<Object>> addNewData() async {
  final FirebaseFirestore _firestore = FirebaseFirestore.instance;
  final CollectionReference _userCollection = _firestore.collection('users');

  return await _userCollection
      .add({'data': 'value'})
      .whenComplete(() {
        // Show success notification
      })
      .catchError((e) {
        // Show error notification
      });
}
And here I obtain the ID:
await addNewData().then((document) async {
  // Get the ID
  print('ID of created document: ${document.id}');
});
I hope it helps.

Mongo Change Streams running multiple times (kind of): Node app running multiple instances

My Node app uses Mongo change streams, and the app runs 3+ instances in production (more eventually, so this will become more of an issue as it grows). So, when a change comes in, the change stream functionality runs as many times as there are processes.
How to set things up so that the change stream only runs once?
Here's what I've got:
const options = { fullDocument: "updateLookup" };
const sitesFilter = [
  {
    $match: {
      $and: [
        { "updateDescription.updatedFields.sites": { $exists: true } },
        { operationType: "update" }
      ]
    }
  }
];
const sitesStream = Client.watch(sitesFilter, options);
// Start listening to the site stream
sitesStream.on("change", async change => {
  console.log("in site change stream", change);
  console.log("in site change stream, update desc", change.updateDescription);

  // Do work...

  console.log("site change stream done.");
  return;
});
It can easily be done with only MongoDB query operators. You can add a modulo query on the ID field, where the divisor is the number of your app instances (N). The remainder is then an element of {0, 1, 2, ..., N-1}. If your app instances are numbered in ascending order from zero to N-1, you can write the filter like this:
const filter = [
  {
    "$match": {
      "$and": [
        // Other filters
        { "_id": { "$mod": [<number of instances>, <this instance's id>] } }
      ]
    }
  }
];
Doing this with strong guarantees is difficult but not impossible. I wrote about the details of one solution here: https://www.alechenninger.com/2020/05/building-kafka-like-message-queue-with.html
The examples are in Java but the important part is the algorithm.
It comes down to a few techniques:
Each process attempts to obtain a lock
Each lock (or each change) has an associated fencing token
Processing each change must be idempotent
While processing the change, the token is used to ensure ordered, effectively-once updates.
More details in the blog post.
It sounds like you need a way to partition updates between instances. Have you looked into Apache Kafka? Basically, you would have a single application that writes the change data to a partitioned Kafka topic, and your Node application would be a Kafka consumer. This would ensure that only one application instance ever receives a given update.
Depending on your partitioning strategy, you could even ensure that updates for the same record always go to the same node app (if your application needs to maintain its own state). Otherwise, you can spread out the updates in a round robin fashion.
The biggest benefit to using Kafka is that you can add and remove instances without having to adjust configurations. For example, you could start one instance and it would handle all updates. Then, as soon as you start another instance, they each start handling half of the load. You can continue this pattern for as many instances as there are partitions (and you can configure the topic to have 1000s of partitions if you want), that is the power of the Kafka consumer group. Scaling down works in the reverse.
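For illustration, a minimal consumer sketch with the kafkajs package (broker address, topic, and group names are made up); instances that share a groupId split the topic's partitions between them, so each change is delivered to exactly one instance:

const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'site-worker', brokers: ['localhost:9092'] });
const consumer = kafka.consumer({ groupId: 'site-changes' });

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'site-changes' });
  await consumer.run({
    // each partition is owned by exactly one consumer in the group
    eachMessage: async ({ message }) => {
      const change = JSON.parse(message.value.toString());
      // Do work...
    }
  });
}

run().catch(console.error);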
While the Kafka option sounded interesting, it was a lot of infrastructure work on a platform I'm not familiar with, so I decided to go with something a little closer to home for me: sending an MQTT message to a little standalone app, and letting the MQTT server monitor messages for uniqueness.
sitesStream.on("change", async change => {
  console.log("in site change stream");
  const mqttClient = mqtt.connect("mqtt://localhost:1883");
  const id = JSON.stringify(change._id._data);

  // You'll want to push more than just the change stream id, obviously...
  mqttClient.on("connect", function() {
    mqttClient.publish("myTopic", id);
    mqttClient.end();
  });
});
I'm still working out the final version of the MQTT server, but the method for evaluating the uniqueness of messages will probably store an array of change stream IDs in application memory (there is no need to persist them) and decide whether to proceed based on whether that change stream ID has been seen before.
var mqtt = require("mqtt");
var client = mqtt.connect("mqtt://localhost:1883");
var seen = [];

client.on("connect", function() {
  client.subscribe("myTopic");
});

client.on("message", function(topic, message) {
  var context = message.toString().replace(/"/g, "");
  if (seen.indexOf(context) < 0) {
    seen.push(context);
    // Do stuff
  }
});
This doesn't include security, etc., but you get the idea.
What about having a field in the DB called status, updated with findOneAndUpdate based on the event received from the change stream? Say you get two events at the same time from the change stream: the first event will update the status to 'start', and the other will throw an error because the status is already 'start'. So the second event will not process any business logic.
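A sketch of that claim with the Node driver (the changeStatus collection is hypothetical):

try {
  // the upsert inserts a status doc the first time; a concurrent
  // attempt fails with duplicate-key error 11000 and is skipped
  await db.collection('changeStatus').findOneAndUpdate(
    { _id: change._id._data, status: { $ne: 'start' } },
    { $set: { status: 'start' } },
    { upsert: true }
  );
  // this instance owns the event: run the business logic here
} catch (err) {
  if (err.code !== 11000) throw err; // another instance got it first
}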
I'm not claiming these are rock-solid, production-grade solutions, but I believe something like this could work.
Solution 1
Applying read-modify-write:
Add version field to the document, all the created docs have version=0
Receive ChangeStream event
Read the document that needs to be updated
Perform the update on the model
Increment version
Update the document where both id and version match, otherwise discard the change
Yes, it creates 2 * n_application_replicas useless queries, so there is another option
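For reference, a sketch of Solution 1's version-checked write (Node driver; applyBusinessLogic is a placeholder for steps 4-5):

// read the current document (step 3)
const doc = await collection.findOne({ _id: change.documentKey._id });
const updated = applyBusinessLogic(doc); // placeholder for your model update

// the write only matches if no other replica has bumped the version
const res = await collection.updateOne(
  { _id: doc._id, version: doc.version },
  { $set: { ...updated, version: doc.version + 1 } }
);
if (res.modifiedCount === 0) {
  // another instance already processed this change: discard it
}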
Solution 2
Create a collection of ResumeTokens in Mongo that stores a collection -> token mapping
In the changeStream handler code, after successful write, update ResumeToken in the collection
Create a feature toggle that will disable reading ChangeStream in your application
Configure only a single instance of your application to be a "reader"
In case of "reader" failure you might either enable reading on another node, or redeploy the "reader" node.
As a result: there might be an infinite amount of non-reader replicas and there won't be any useless queries

How do I manage groups/rooms with node WebSockets?

TL;DR below.
I am currently developing a React/Redux SPA that is driven by real-time data. I've decided to use ws instead of socket.io, since socket.io feels a bit high-level for what I'm doing; I'd rather manage sockets myself.
In saying that, I'm struggling to find a way to manage the separation of updates/messages per view/route. Since I'm using client-side routing, doing it per Express route won't really work...
Messages between the server and client via WebSockets are JSON, with actions like GET_ITEMS, then a response of GET_ITEMS_SUCCESS with an array of 'items', and ..._ERROR responses for errors. This is all fine, since it's just a 1-to-1 transaction. The problem arises when broadcasting (1-to-all) to all relevant clients when the server receives an update.
So, I assume it's best practice to limit these broadcasts to the clients that are viewing/want the data. When viewing, for example, the Item page, there is no point broadcasting updates to the User data, since that is only used on the User page.
I haven't been able to find any common practices for dealing with this sort of situation, just a few small, outdated/barely-used wrappers for ws that add a few basic join/leave functions but don't offer much flexibility in implementation.
What I think MIGHT work is to have an object/array for each 'group'/'room', which stores the clients that are currently listening for updates from a given section. A user would then send an INIT_LISTEN action with a category param, e.g. ITEM for updates and other actions related to items.
TL;DR
What my question really boils down to is: how do I store a reference to a single socket (the ws client object? a ws client ID?), and can I then store it in an object/array to iterate through, like below?
const ClientRooms = {
  Items: {
    /* ...ws client object(s), plus the rest of the client state */
  }
};
or
const ClientRooms = {
  Items: ["xyz"] /* array of ws ids */
};
I have a ping/pong heartbeat function to keep clients active and prevent silent connection failures/disconnections. I can't find whether ws.terminate() still fires the ws close event, so that I can iterate through the 'group'/'room' object/array to find and remove instances of that client.
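For what it's worth, a minimal sketch of such a registry with the ws package (wss is assumed to be a WebSocket.Server; all names here are illustrative). In current ws versions, terminate() does emit the 'close' event, so a single close handler can cover both normal disconnects and heartbeat kills:

const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

const rooms = new Map(); // category -> Set of ws clients

wss.on('connection', ws => {
  // remove the socket from every room on disconnect;
  // 'close' also fires after ws.terminate() kills a dead client
  ws.on('close', () => rooms.forEach(set => set.delete(ws)));
});

function join(category, ws) {
  if (!rooms.has(category)) rooms.set(category, new Set());
  rooms.get(category).add(ws);
}

function broadcast(category, action) {
  const set = rooms.get(category);
  if (!set) return;
  set.forEach(client => {
    if (client.readyState === WebSocket.OPEN) {
      client.send(JSON.stringify(action));
    }
  });
}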

Cancel previous MongoDB operation from the same client

I have a MongoDB collection of 3257477 cities, and I'm using Mongoose on NodeJS to access it. I'm making requests to it repeatedly (once per 500ms). Requests are usually answered very quickly. However, when I make a bad typo the query takes a long time and requests start to pile up until the initial request is answered. Here are some logs I collected of requests and responses:
21:48:50 started query for "new"
21:48:50 finished query for "new"
21:48:52 started query for "newj ljl" // blockage
21:48:54 started query for "newj"
21:48:55 started query for "new"
21:48:57 started query for "new ye"
21:48:59 started query for "new york"
21:49:08 finished query for "newj ljl" // blockage removed, quick queries flood in
21:49:08 finished query for "new"
21:49:08 finished query for "new york"
21:49:08 finished query for "new ye"
21:49:23 finished query for "newj"
I'm able to cancel the requests made by the client so I'm not worried about queries coming back in the wrong order. And I'm not interested in how to make that query faster at this point, since queries for actual correct spellings are quick.
I'm wondering how a new request can cancel an old request that was made by the same client. In other words "newj ljl" gets canceled when "newj" arrives, "newj" gets canceled when "new" arrives, and so on. If it's just going to be thrown out, why tie up the database?
Is there a proper way to do this?
Update:
I'm aware of db.currentOp().inprog and I'm thinking I can use the client property of the documents within that array to know whether it's a repeat request, but I can't quite figure out how to access that from Mongoose. I'm also not sure when to do that, or how I know which request was spawned from this client (and therefore which to cancel). I'd like an actual code example using Mongoose, or the native NodeJS MongoDB driver if possible!
Here's some sample code to go off of:
models.City.find({ ... })
  .exec(function (err, cities) {
  });
Below is what I came up with to solve the issue.
I can easily do db.currentOp().inprog and db.killOp() from the Mongo shell, but I really need this to happen automatically, when it needs to, from Mongoose. Since you can reference the MongoDB driver using require('mongoose').connection.db, you can execute those commands by doing "queries" on the following collections:
db.collection('$cmd.sys.inprog');
db.collection('$cmd.sys.killop');
The full solution:
var db = require('mongoose').connection.db,
    // get the client IP address
    ip = request.headers['x-forwarded-for'] ||
         request.connection.remoteAddress ||
         request.socket.remoteAddress ||
         request.connection.socket.remoteAddress;

// same thing as db.currentOp().inprog
db.collection('$cmd.sys.inprog').findOne(function (err, data) {
  if (err) throw err;

  data.inprog.filter(function (op) {
    // get the operation's client IP address without the port
    return ip == op.client.split(':')[0];
  }).forEach(function (op) {
    // same thing as db.killOp()
    db.collection('$cmd.sys.killop')
      .findOne({ 'op': op.opid }, function (err, data) {
        if (err) throw err;
      });
  });

  // start the new cities query
  models.City.find({ ... })
    .exec(function (err, cities) {
    });
});
Helpful links:
https://groups.google.com/forum/#!topic/mongodb-user/1wFp7AqWnM4
drop database with mongoose
How to determine a user's IP address in node
You can try using db.killOp()
http://docs.mongodb.org/manual/reference/method/db.killOp/#db.killOp
UPDATE: You can get the list of current operations from db.currentOp() and identify the operation to be cancelled by matching fields like op, query and client
http://docs.mongodb.org/manual/reference/method/db.currentOp/#db.currentOp
You can definitely do this with killop, and the above solution looks like it could work for the problem as stated. However, I think it may be worthwhile to dig a bit deeper.
The fact that you have a noticeably slow query when you've got a query that's going to return no results seems unusual. That reeks of a full collection scan. The questions to ask are, first, do you have indices set up, and second, are you querying with a general regex? MongoDB doesn't really handle regex searches like { "name" : /.*new york.*/ } particularly well.
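For example (a sketch assuming an index on the name field; callback is a placeholder): an anchored, case-sensitive prefix regex can use that index, while an unanchored one must check every value:

// can walk a { name: 1 } index: only keys starting with "new" are scanned
models.City.find({ name: /^new/ }).exec(callback);

// unanchored: every indexed value (or document) must be examined
models.City.find({ name: /new/ }).exec(callback);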
Also, the whole "send an HTTP request every time the user hits a key" approach is simple and elegant, but it causes some unnecessary server load. Perhaps a search button, or a client-side timeout where you only send a request if the user hasn't hit a key for one second, could help alleviate the need for the killOp approach.
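A sketch of that client-side debounce (searchInput and sendSearchRequest are placeholders):

let timer;
searchInput.addEventListener('keyup', function () {
  clearTimeout(timer);
  // only fire a request once the user has paused typing for one second
  timer = setTimeout(function () {
    sendSearchRequest(searchInput.value);
  }, 1000);
});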
