Are MongoDB queries client-side operations? - node.js

Let's say I have a document:
{ "_id" : ObjectId("544946347db27ca99e20a95f"), "nameArray": [{"id":1 , first_name: "foo"}] }
Now I need to push an array into nameArray using $push. How does the document get updated in that case? Is the document retrieved on the client, updated there, and the changes then reflected to the MongoDB database server? Or is the entire operation carried out in the MongoDB database?

What you are asking here is whether MongoDB operations are client-side operations. The short answer is no.
In MongoDB, a query targets a specific collection of documents, as mentioned in the documentation, and a collection is a group of MongoDB documents that exists within a single database. Collections are the equivalent of tables in an RDBMS. So if a query targets a specific collection, it is performed at the database level, and thus server-side. The same applies to data modification and aggregation operations.
Sometimes your operations may involve client-side processing because MongoDB doesn't provide a way to achieve what you want out of the box. Generally speaking, you only need that type of processing when you want to modify your documents' structure in the collection or change your fields' types. In such situations, you will need to retrieve your documents, perform your modifications on the client, and write them back using bulk operations.

See the documentation:
Your array is inserted into the existing array as one element. If the array does not exist, it is created. If the target is not an array, the operation fails.
Nothing there says anything like "retrieving the element to the client and updating it there", so the operation is done completely on the database server side. I don't know of any operation that works the way you described, unless you chain a query, a modification of the item in your client, and an update. But those are separate operations and not one single command.
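To make this concrete, here is a minimal sketch of the update from the question. The helper names (pushAsOneElement, pushEachElement) and the sample values are hypothetical; $push and $each are MongoDB update operators. The helpers only build the update document. The driver sends that document to the server, and the server modifies the array there.

```javascript
// Hypothetical helpers that only build update documents; nothing here
// reads or modifies the stored document. The server applies the change.

// $push with an array value appends the whole array as ONE element.
function pushAsOneElement(field, values) {
  return { $push: { [field]: values } };
}

// $push combined with $each appends every item of the array individually.
function pushEachElement(field, values) {
  return { $push: { [field]: { $each: values } } };
}

// Made-up sample data in the shape used by the question.
const newNames = [
  { id: 2, first_name: "bar" },
  { id: 3, first_name: "baz" },
];

const asOne = pushAsOneElement("nameArray", newNames);
const asMany = pushEachElement("nameArray", newNames);

console.log(JSON.stringify(asOne));
console.log(JSON.stringify(asMany));

// Assumed usage with the Node.js driver (`collection` and `id` are placeholders):
//   await collection.updateOne({ _id: id }, asMany);
```

In both cases the client never sees the existing nameArray; only the small update document travels over the wire.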

Related

NodeJS MongoDB locks on documents

I am using the mongodb driver and am concerned about possible concurrency issues that could duplicate objects. From reading a few questions and answers on Stack Overflow, I believe that write operations are atomic, but this may not solve my concurrency problem. Let's say there are two concurrent calls to doSomeAndDelete with the same id: the operations in HERE might take some time, but only one of these two calls should be able to handle result. How can I implement a lock?
async function doSomeAndDelete(id){
    const result = await myCollection.findOne({ _id : id });
    /*Some operations on result [HERE]*/
    if(/*conditions*/)
        await myCollection.deleteOne({ _id : id});
}
For deletion, only one of the operations will succeed and delete the document, while the other one will not delete anything because the document no longer exists. That assumes the _id will not be reused.
In general, write operations on a single document are atomic, but a read followed by a write is not. So if you have multiple threads reading and then writing a document, you might want to use MongoDB transactions or some form of optimistic locking. For example, you can use an ObjectId field in your documents as a version id, and use a new value for each update. When you read and then update a document, you validate that the field still has the value you obtained from the read, meaning the record has not been modified since you read it.
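A minimal sketch of that optimistic locking idea follows. The field name version and the helper names are assumptions made for illustration; updateOne and matchedCount are part of the Node.js driver. The version check and the version rotation happen in one atomic single-document write.

```javascript
// Hypothetical helper: build a version-checked update.
// `version` is an assumed field name; `newVersion` would typically be a
// freshly generated value such as `new ObjectId()`.
function buildVersionedUpdate(docAsRead, changes, newVersion) {
  return {
    // Match only if nobody replaced the version since we read the document.
    filter: { _id: docAsRead._id, version: docAsRead.version },
    // Apply the changes and rotate the version in the same write.
    update: { $set: { ...changes, version: newVersion } },
  };
}

// Returns true if our write was applied, false if another writer got there first.
// `collection` is assumed to be a Node.js driver Collection.
async function updateIfUnchanged(collection, docAsRead, changes, newVersion) {
  const { filter, update } = buildVersionedUpdate(docAsRead, changes, newVersion);
  const result = await collection.updateOne(filter, update);
  return result.matchedCount === 1;
}

// Made-up sample document, using strings instead of ObjectIds for brevity.
const plan = buildVersionedUpdate(
  { _id: 1, version: "v1", status: "new" },
  { status: "done" },
  "v2"
);
console.log(JSON.stringify(plan));
```

When updateIfUnchanged returns false, the caller can re-read the document and retry, or give up. The same filter can be passed to deleteOne, whose result exposes deletedCount, so a caller can tell whether it was the one that removed the document.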

MongoDB unnormalized data and the change stream

I have an application in which most of the collections are read far more often than they are written, so I denormalized the data in them. Now I need to keep the duplicated data in sync. For some collections I used jobs to sync the data, but that is not good enough, because in some cases I need the data to be synced in real time.
For example:
Let's say I have an orders collection and a users collection.
Orders have the user email (for search):
{
_id:ObjectId(),
user_email:'test#email.email'
....
}
Now, whenever I change the user email in users, I want to change it in orders as well.
I found that MongoDB has change streams, which look like a pretty awesome feature. I have played with them a bit and they give me the results I need to update my other collections. My questions: does anyone use them in production? Can I trust the stream to always deliver the updated data so that I can update the other collections? How does it affect DB performance if I have many streams open? Also, I use the Node.js MongoDB driver; does that have any effect?
I haven't worked with change streams yet, but these cases are very common and can easily be solved by building a more normalized schema.
A core rule of normalization is "don't repeat data", so you would save the email in the users collection only.
The orders collection won't have the email field but will have a user_id for joining with the users collection, using the $lookup aggregation stage:
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
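As a rough sketch of that join, the pipeline below looks up the user for each order and then filters by the user's email. The helper name and the email field on users are assumptions; user_id, orders, and users come from the discussion above, and $lookup, $unwind, and $match are standard aggregation stages.

```javascript
// Hypothetical helper: build a pipeline that joins orders with users
// and keeps only the orders whose user has the given email.
function ordersByUserEmailPipeline(email) {
  return [
    {
      $lookup: {
        from: "users",          // collection to join with
        localField: "user_id",  // field on the order
        foreignField: "_id",    // field on the user
        as: "user",             // joined users land in this array field
      },
    },
    { $unwind: "$user" },       // one joined user per order
    { $match: { "user.email": email } },
  ];
}

const pipeline = ordersByUserEmailPipeline("someone@example.com");
console.log(JSON.stringify(pipeline, null, 2));

// Assumed usage with the Node.js driver (`db` is a placeholder):
//   const orders = await db.collection("orders").aggregate(pipeline).toArray();
```

For a pure search by email it is often cheaper to find the user first and then query orders by user_id, since that query can use an index on orders.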

In a nested MongoDB call, how do I ensure atomicity?

Is it possible to atomically update/remove two documents in MongoDB by calling a new update/remove call from within the first update's callback? In the case below, I want to remove the second document from the collection, but only if the update to the first document succeeds:
db.collection.update(conditions1, {$set: set}, function (err, result) {
    db.collection.remove(conditions2, function (err, doc_num) {
        db.close();
    });
});
I'm coming across the $isolated query operator, but from what I understand in the documentation, this operator is used for performing a read/write lock on a single query which affects multiple documents, not on performing a read/write lock on one document after performing an update on another document through the first document update's callback, which is what I want to try and accomplish.
No, it's not possible. As documented here, a lock would be acquired for a single query and not for a whole transaction.
You can overcome the atomicity problem by using this.
As Amir said, it's not possible, but you can mimic the behavior in MongoDB by following the two-phase commit pattern. That link also explains how to perform rollback-like operations.

MongoDb + Mongoose QueryStream - Following document changes

I'm trying to make use of Mongoose and its QueryStream in a scheduling application, but maybe I'm misunderstanding how it works. I've read the question "Mongoose QueryStream new results" here on SO and it seems I'm correct, but could someone please explain:
If I'm filtering a query like so -
Model.find().stream()
when I add or change something that matches the .find(), it should throw a data event, correct? Or am I completely wrong in my understanding of this issue?
For example, I'm trying to look at some data like so:
Events.find({'title':/^word/}).stream();
I'm changing titles in the mongodb console, and not seeing any changes.
Can anyone explain why?
Your understanding is indeed incorrect as a stream is just an output stream of the current query response and not something that "listens for new data" by itself. The returned result here is basically just a node streaming interface, which is an optional choice as opposed to a "cursor", or indeed the direct translation to an array as mongoose methods do by default.
So a "stream" does not just "follow" anything. It is really just another way of dealing with the normal results of a query, but in a way that does not "slurp" all of the results into memory at once. Instead, it uses event listeners to process each result as it is fetched from the server cursor.
What you are in fact talking about is a "tailable cursor", or some variant thereof. In basic MongoDB operations, a "tailable cursor" can be implemented on a capped collection. This is a special type of collection with specific rules, so it might not suit your purposes. They are intended for "insert only" operations which is typically suited to event queues.
On a model that uses a capped collection (and only where a capped collection has been set), you implement it like this:
var query = Events.find({'title':/^word/}).sort({ "$natural": -1}).limit(1);
var stream = query.tailable({ "awaitdata": true}).stream();
// fires on data received
stream.on("data",function(data) {
    console.log(data);
});
The "awaitdata" there is just as an important option as the "tailable" option itself, as it is the main thing that tells the query cursor to remain "active" and "tail" the additions to the collection that meet the query conditions. But your collection must be "capped" for this to work.
An alternative and more advanced approach is to do something like the Meteor distribution does, where the "capped collection" being tailed is in fact the MongoDB oplog. This requires a replica set configuration. However, just as Meteor does out of the box, there is nothing wrong with having a single node act as a replica set by itself. It's just not wise to do so in production.
This is more advanced than a simple answer can cover, but the basic concept is that since the "oplog" is a capped collection, you are able to "tail" it for all write operations on the database. This event data is then inspected to determine details such as whether the collection you want to watch has been written to. That data can then be used to query the new information and do something like return the updated or new results to a client via a websocket or similar.
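The inspection step can be sketched as a small filter over oplog entries. The helper name, the namespace app.events, and the sample entries are made up for illustration; the op and ns fields are the oplog's operation code and "database.collection" namespace. This is only the filtering logic, not a full oplog tailer.

```javascript
// Oplog operation codes for the writes we care about.
const WRITE_OPS = new Map([
  ["i", "insert"],
  ["u", "update"],
  ["d", "delete"],
]);

// Hypothetical helper: decide whether an oplog entry is a write to the
// watched namespace ("database.collection") and describe it if so.
function describeWrite(entry, watchedNamespace) {
  if (entry.ns !== watchedNamespace) return null;
  const kind = WRITE_OPS.get(entry.op);
  if (!kind) return null; // e.g. "n" (no-op) or "c" (command)
  return { kind, namespace: entry.ns };
}

// Sample entries shaped like oplog documents (values are made up).
const entries = [
  { op: "i", ns: "app.events", o: { _id: 1, title: "word one" } },
  { op: "n", ns: "", o: { msg: "periodic noop" } },
  { op: "u", ns: "app.users", o: { $set: { name: "x" } }, o2: { _id: 7 } },
];

const seen = entries
  .map((entry) => describeWrite(entry, "app.events"))
  .filter(Boolean);
console.log(seen);
```

In a real tailer, each non-null result would trigger a fresh query against the watched collection, and the results would be pushed to clients.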
But a stream in itself is just a stream. To "follow" the changes on a collection you either need to implement it as capped, or consider implementing a process based on watching the changes in the oplog as described.

How to account for a failed write or add process in Mongodb

So I've been trying to wrap my head around this one for weeks, but I just can't seem to figure it out. MongoDB isn't equipped to deal with rollbacks as we typically understand them, i.e. when a client adds information to the database, like a username for example, but quits in the middle of the registration process. Now the DB is left with some "hanging" information that isn't associated with anything. How can MongoDB handle that? Or if no one can answer that question, maybe someone can point me to a source or example that can? Thanks.
Before version 4.0, MongoDB did not support multi-document transactions, so you couldn't perform atomic multi-statement transactions to ensure consistency. Without them, you can only perform an atomic operation on a single document at a time. When dealing with NoSQL databases you need to validate your data as much as you can, because they seldom complain about anything. There are some workarounds or patterns to achieve SQL-like transactions. For example, in your case, you can store the user's information in a temporary collection, check data validity, and store it in the users collection afterwards.
This should be straightforward, but things get more complicated when we deal with multiple documents. In this case, you need to create a designated collection for transactions. For instance,
transaction collection
{
id: ..,
state : "new_transaction",
value1 : values From document_1 before updating document_1,
value2 : values From document_2 before updating document_2
}
// update document 1
// update document 2
Oops, something went wrong while updating document 1 or 2? No worries, we can still restore the old values from the transaction collection.
This pattern, known as compensation, mimics the transactional behavior of SQL.
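A minimal sketch of the bookkeeping behind this pattern is shown below. The helper names and the sample documents are hypothetical. The snapshot is a shallow copy, and the restore step uses $set, so it puts old values back but does not remove fields that a failed update may have added.

```javascript
// Hypothetical helper: record the state of both documents before updating them.
// The copies are shallow, which is enough for flat documents like the samples.
function snapshot(id, doc1, doc2) {
  return {
    id,
    state: "new_transaction",
    value1: { ...doc1 },
    value2: { ...doc2 },
  };
}

// Hypothetical helper: turn a transaction record into the updates that
// restore the old values (the compensation step).
function buildRestoreOps(record) {
  return [record.value1, record.value2].map(({ _id, ...oldFields }) => ({
    filter: { _id },
    update: { $set: oldFields },
  }));
}

// Made-up sample documents.
const doc1 = { _id: "a", balance: 100 };
const doc2 = { _id: "b", balance: 50 };

const record = snapshot("t1", doc1, doc2);

// Pretend the update of document 1 went through and the update of document 2 failed.
doc1.balance = 70;

const restoreOps = buildRestoreOps(record);
console.log(JSON.stringify(restoreOps));

// Assumed usage with the Node.js driver (`collection` is a placeholder):
//   for (const { filter, update } of restoreOps) {
//     await collection.updateOne(filter, update);
//   }
```

The record itself would be inserted into the transaction collection before the first update, so that a crashed process can find it and compensate later.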

Resources