MongoDB - Use results of multiple fetch queries in one update query - node.js

I am building a platform for students to take mock tests on. Once a test is complete, results are generated for each student relative to the other students who attempted that test.
The report contains multiple parameters, e.g. overall rank, rank within their batch, and aggregate figures such as the average marks scored on the test.
To get each of these values I need to run a separate query against the database, and then I have to update 1. the result of the current user who attempted the test, and 2. the results of everyone else (everyone's rank changes on each new attempt).
So I need to run several read queries to gather the data, then 2-3 update queries to set the new data.
Given that MongoDB calls are asynchronous, I can't find a way to gather all of that data in one place to be updated.
One way is to nest each query inside the callback of the previous one, but I feel there should be a better way than that.

Maybe you could use Promise.all().
Example:
const initialQueries = [];
initialQueries.push(/* some promise(s), e.g. a find().toArray() call */);
Promise.all(initialQueries).then(results => {
  // All initialQueries have resolved; results holds their values in order
  updateQueries();
}).catch(err => {
  // At least one query failed
});

Use db.collection.bulkWrite.
It allows multiple document insertions, updates (by _id or a custom filter), and even deletions, all in a single call.
const ObjectID = require('mongodb').ObjectID;
db.collection('tests').bulkWrite([
  { updateOne: {
      "filter": { "_id": ObjectID("5d400af131602bf3fa09da3a") },
      "update": { $set: { "score": 20 } }
    }
  },
  { updateOne: {
      "filter": { "_id": ObjectID("5d233e7831602bf3fa996557") },
      "update": { $set: { "score": 15 } }
    }
  }
]);
bulkWrite is new in MongoDB server version 3.2.
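Combining the two answers, the gathered results can be written back in one round trip instead of one update per student. A sketch (inside an async function), using the same hypothetical 'results' collection as above:

const attempts = await db.collection('results')
  .find({ testId }).sort({ score: -1 }).toArray();
// One request carries every rank update.
await db.collection('results').bulkWrite(
  attempts.map((attempt, i) => ({
    updateOne: {
      filter: { _id: attempt._id },
      update: { $set: { rank: i + 1 } }
    }
  }))
);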

Related

How to improve the performance of a query in MongoDB?

I have a collection in MongoDB with more than 5 million documents. Whenever a document is created in this collection, I have to check whether a document with the same title already exists; if it does, the new one must not be added to the database.
Example: here is my MongoDB document:
{
  "_id": ObjectId("3a434sa3242424sdsdw"),
  "title": "Lost in space",
  "desc": "this is description"
}
So whenever a new document is about to be created, I want to check whether the same title already exists in any document, and only add it to the database if it does not.
Currently I am using a findOne query to check the title and inserting only when nothing is found, but this is causing a performance problem: the process takes too much time. Please suggest a better approach.
async function addToDB(data) {
  let result = await db.collection('testCol').findOne({ title: data.title });
  if (result == null) {
    await db.collection('testCol').insertOne(data);
  } else {
    console.log("already exists in db");
  }
}
You can halve the network round trips. At the moment you execute two queries: one to find, then one to insert. You can combine them into a single query as below.
db.collection.update(
  <query>,
  { $setOnInsert: { <field1>: <value1>, ... } },
  { upsert: true }
)
With upsert: true and $setOnInsert, nothing is written if a matching document already exists.
db.test.update(
  { "key1": "1" },
  { $setOnInsert: { "key": "2" } },
  { upsert: true }
)
This looks for a document where key1 is "1". If one is found, the write is skipped; if not, a document is inserted using the data provided in the $setOnInsert object.
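Applied to the original function, the same pattern might look like this (a sketch: updateOne with upsert replaces the findOne/insertOne pair, and upsertedCount reports whether an insert happened):

async function addToDB(data) {
  // One round trip: insert only if no document with this title exists yet.
  const result = await db.collection('testCol').updateOne(
    { title: data.title },
    { $setOnInsert: data },
    { upsert: true }
  );
  if (result.upsertedCount === 0) {
    console.log("already exists in db");
  }
}

A unique index on title is also worth considering, since it lets the server enforce uniqueness even under concurrent inserts.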

Query/Find MongoDB documents with series of multiple conditions

I have a User schema with basic fields, which include interests and location co-ordinates.
I need to handle a POST request with a specific user id and return matching results.
app.post('/api/users/search/:id', function(req, res) {
  // find all the documents whose search is enabled.
  // among the documents returned above, find those that have at least
  // 3 common interests (req.body.interests) with the user with ':id'
  // -----OR-----
  // find the documents that stay within 'req.body.distance' of the
  // location of the user with ':id'
  // Something like this:
  return User
    .find({ isBuddyEnabled: true })
    .find({ "interests": { "$all": req.body.interests } })
    .find({ "_id": req.params.id }, geoLib.distance([[req.body.gcordinates], []]));
});
Basically I need to perform a find inside a find, or a query inside a query.
As per the comments in your code, you want to combine multiple conditions in a single find query so that a document is returned when either branch is satisfied. You can use $or and $and to achieve this. A sample with conditions similar to yours is given below.
find({
  $or: [
    { isBuddyEnabled: true },
    { "interests": { "$all": req.body.interests } },
    { $and: [
        { "_id": req.params.id },
        { geoLib.distance...rest_of_the_condition }
      ]
    }
  ]
});
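If the intent is "search enabled AND (shared interests OR nearby)", the geo branch can use $geoWithin, which (unlike $near) is allowed inside $or. A sketch, assuming location.loc holds a [lng, lat] pair with a 2dsphere index and req.body.distance is in kilometers; 6378.1 is Earth's radius in km, used because $centerSphere takes its radius in radians:

User.find({
  isBuddyEnabled: true,
  $or: [
    { interests: { $all: req.body.interests } },
    { 'location.loc': {
        $geoWithin: {
          $centerSphere: [req.body.gcordinates, req.body.distance / 6378.1]
        }
      }
    }
  ]
});

Note that $all requires every listed interest to match; counting "at least 3 common interests" would need an aggregation expression (e.g. $setIntersection plus $size) instead.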

Storing a complex Query within MongoDb Document [duplicate]

This is the case: a webshop in which I want to configure which items should be listed in the shop, based on a set of parameters.
I want this to be configurable, because that allows me to experiment with different parameters and change their values easily.
I have a Product collection that I want to query based on multiple parameters.
A couple of these are found here:
within product:
"delivery" : {
"maximum_delivery_days" : 30,
"average_delivery_days" : 10,
"source" : 1,
"filling_rate" : 85,
"stock" : 0
}
but also other parameters exist.
An example of such query to decide whether or not to include a product could be:
"$or" : [
{
"delivery.stock" : 1
},
{
"$or" : [
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 60
}
},
{
"delivery.filling_rate" : {
"$gt" : 90
}
}
]
},
{
"$and" : [
{
"delivery.maximum_delivery_days" : {
"$lt" : 40
}
},
{
"delivery.filling_rate" : {
"$gt" : 80
}
}
]
},
{
"$and" : [
{
"delivery.delivery_days" : {
"$lt" : 25
}
},
{
"delivery.filling_rate" : {
"$gt" : 70
}
}
]
}
]
}
]
Now to make this configurable, I need to be able to handle boolean logic, parameters and values.
So, since such a query is itself JSON, I got the idea to store it in Mongo and have my Java app retrieve it.
The next step is using it as a filter (e.g. in find, or whatever) and working on the corresponding selection of products.
The advantage of this approach is that I can actually analyse the data and the effectiveness of the query outside of my program.
I would store it by name in the database. E.g.
{
  "name": "query1",
  "query": { the thing printed above starting with "$or"... }
}
using:
db.queries.insert({
  "name" : "query1",
  "query": { the thing printed above starting with "$or"... }
})
Which results in:
2016-03-27T14:43:37.265+0200 E QUERY Error: field names cannot start with $ [$or]
at Error (<anonymous>)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:161:19)
at DBCollection._validateForStorage (src/mongo/shell/collection.js:165:18)
at insert (src/mongo/shell/bulk_api.js:646:20)
at DBCollection.insert (src/mongo/shell/collection.js:243:18)
at (shell):1:12 at src/mongo/shell/collection.js:161
But I CAN store it using Robomongo, just not always. Obviously I am doing something wrong, but I have no idea what it is.
If it fails and I create a brand new collection and try again, it succeeds. Weird stuff that goes beyond what I can comprehend.
And when I try updating values in the stored "query", the changes never go through. Not even sometimes.
I can however create a new object and discard the previous one, so there is a workaround.
db.queries.update(
  { "name": "query1" },
  { "$set": {
      ... update goes here ...
    }
  }
)
doing this results in:
WriteResult({
  "nMatched" : 0,
  "nUpserted" : 0,
  "nModified" : 0,
  "writeError" : {
    "code" : 52,
    "errmsg" : "The dollar ($) prefixed field '$or' in 'action.$or' is not valid for storage."
  }
})
which seems pretty close to the other message above.
Needless to say, I am pretty clueless about what is going on here, so I hope some of the wizards here are able to shed some light on the matter.
I think the error message contains the important info you need to consider:
QUERY Error: field names cannot start with $
Since you are trying to store a query (or part of one) in a document, you'll end up with attribute names that contain mongo operator keywords (such as $or, $ne, $gt). The mongo documentation actually covers this exact scenario (emphasis added):
Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $)...
I wouldn't trust 3rd party applications such as Robomongo in these instances. I suggest debugging/testing this issue directly in the mongo shell.
My suggestion would be to store an escaped version of the query in your document, so as not to interfere with reserved operator keywords. You can use JSON.stringify(my_obj) to encode the partial query into a string, and then parse/decode it when you retrieve it later with JSON.parse(escaped_query_string_from_db).
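In the mongo shell, that round trip looks roughly like this (a minimal sketch; db.queries and db.products are assumed names based on the question):

db.queries.insert({
  "name": "query1",
  // stored as a plain string, so no $-prefixed field names reach the server
  "query": JSON.stringify({ "$or": [ { "delivery.stock": 1 } /* ... */ ] })
});
var stored = db.queries.findOne({ "name": "query1" });
db.products.find(JSON.parse(stored.query));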
Your approach of storing the query as a JSON object in MongoDB is not viable.
You could potentially store your query logic and fields in MongoDB, but you have to have an external app build the query with the proper MongoDB syntax.
MongoDB queries contain operators, and some of those include special characters.
There are rules for MongoDB field names, and those rules do not allow such characters.
Look here: https://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names
The probable reason you can sometimes create the doc successfully using Robomongo is that Robomongo transforms your query into a string and escapes the special characters before sending it to MongoDB.
This also explains why your attempts to update never work: you did not create the document you intended, but rather something stored as a string, so your update conditions are probably not matching any docs.
I see two problems with your approach.
In the following query
db.queries.insert({
  "name" : "query1",
  "query": { the thing printed above starting with "$or"... }
})
valid JSON expects key/value pairs, and here in "query" you are storing an object without a key. You have two options: either store the query as text, or create another key inside the curly braces.
The second problem is that you are storing query values without wrapping them in quotes; all string values must be wrapped in quotes.
So your final document should appear as:
db.queries.insert({
  "name" : "query1",
  "query": 'the thing printed above starting with "$or"... '
})
Now try it; it should work.
Obviously my attempt to store a query in mongo the way I did was foolish, as became clear from the answers from both #bigdatakid and #lix. So what I finally did was this: I altered the field names to comply with the mongo requirements.
E.g. instead of $or I used _$or, and instead of a . inside a name I used #, both of which I replace in my Java code.
This way I can still easily try and test the queries outside of my program. In my Java program I just change the names back and use the query, using just 2 lines of code. It simply works now. Thanks guys for the suggestions you made.
// Restore the real operator and field names, then parse back into a query object:
String documentAsString = query.toJson().replaceAll("_\\$", "\\$").replaceAll("#", ".");
Object q = JSON.parse(documentAsString);
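The same unescaping would work in Node.js if the stored queries are read there as well. A sketch, assuming the placeholder scheme described above:

// Convert the storage-safe form back into a runnable query:
// _$or -> $or, and # -> . in field names.
// ('$$' inserts a literal '$' in a replacement string.)
const raw = JSON.stringify(storedQuery)
  .replace(/_\$/g, '$$')
  .replace(/#/g, '.');
const runnableQuery = JSON.parse(raw);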

How do I $inc an entire object full of properties without building the query with a loop?

I have a collection of documents for the form:
{ name:String, groceries:{ apples:Number, cherries:Number, prunes:Number } }
Now, on every update I have to increment each element in "groceries" by some positive and/or negative value. Which keys exist, and how many, is not important; I just added some examples.
I could do:
var dataToBeIncremented = stuff;
var incOps = {};
for (var index in dataToBeIncremented) {
  incOps["groceries." + index] = dataToBeIncremented[index];
}
then
db.update({ _id: targetID }, { $inc: incOps })
however, I might have thousands of grocery elements, and running this loop on every update feels ugly and unoptimized.
I would like to know how to avoid this, or why it can't be optimized.
Actually there is no way to avoid it, because there is no command that increments all the values inside a subdocument.
So the only way is to do something like you have done:
{
  "$inc": {
    "groceries.apples": 1,
    "groceries.cherries": 1,
    "groceries.prunes": 1
  }
}
Because you do not know exactly which fields exist, you need to find them beforehand and build the $inc statement from them. One good thing about these updates: no matter how many elements you have, you still need only 2 queries (one to find what to update, and one to actually perform the update).
I was also thinking about how to achieve better results with a different schema, but apparently you have to cope with what you have.
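A minimal sketch of that two-query pattern with the Node.js driver; the 'people' collection name and the increment values are illustrative assumptions:

async function incrementAllGroceries(db, targetID) {
  // Query 1: fetch the subdocument to learn its keys.
  const doc = await db.collection('people').findOne(
    { _id: targetID },
    { projection: { groceries: 1 } }
  );
  // Build one $inc covering every key found.
  const inc = {};
  for (const key of Object.keys(doc.groceries)) {
    inc['groceries.' + key] = 1;
  }
  // Query 2: apply all increments in a single update.
  await db.collection('people').updateOne({ _id: targetID }, { $inc: inc });
}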

Mongoose update multiple geospatial index with no limit

I have some Mongoose Models with geospatial indexes:
var User = new Schema({
  "name": String,
  "location": {
    "id": String,
    "name": String,
    "loc": { type: Array, index: '2d' }
  }
});
I'm trying to update all items that are in an area - for instance:
User.update(
  { "location.loc": { "$near": [-122.4192, 37.7793], "$maxDistance": 0.4 } },
  { "foo": "bar" },
  { "multi": true },
  function(err) {
    console.log("done!");
  }
);
However, this appears to only update the first 100 records. Looking at the docs, it appears there is a native limit on finds against geospatial indexes that applies when you don't set a limit explicitly.
(From the docs: use limit() to specify a maximum number of points to return; a default limit of 100 applies if unspecified.)
This appears to also apply to updates, regardless of the multi flag, which is a giant drag. If I apply an update, it only updates the first 100.
Right now the only way I can think of to get around this is to do something hideous like this:
Model.find(
  { "location.loc": { "$near": [-122.4192, 37.7793], "$maxDistance": 0.4 } },
  { limit: 0 },
  function(err, results) {
    var ids = results.map(function(r) { return r._id; });
    Model.update({ "_id": { $in: ids } }, { "foo": "bar" }, { multi: true }, function() {
      console.log("I have enjoyed crippling your server.");
    });
  }
);
While I'm not even entirely sure that would work (and it could be mildly optimized by selecting only the _id), I'd really like to avoid keeping an array of n ids in memory, as that number could get very large.
Edit:
The above hack doesn't even work; it looks like a find with {limit:0} still returns 100 results. So, in an act of sheer desperation and frustration, I have written a recursive method to paginate through ids, then return them so I can update using the above method. I have added the method as an answer below, but not accepted it, in hopes that someone will find a better way.
This is a problem in mongo server core as far as I can tell, so mongoose and node-mongodb-native are not to blame. However, this is really stupid, as geospatial indexes are one of the few reasons to use mongo over some other more robust NoSQL stores.
Is there a way to achieve this? Even in node-mongodb-native, or the mongo shell, I can't seem to find a way to set (or in this case, remove by setting to 0) a limit on an update.
I'd love to see this issue fixed, but I can't figure out a way to set a limit on an update, and after extensive research, it doesn't appear to be possible. In addition, the hack in the question doesn't even work, I still only get 100 records with a find and limit set to 0.
Until this is fixed in mongo, here's how I'm getting around it: (!!WARNING: UGLY HACKS AHEAD:!!)
var getIdsPaginated = function(query, batch, callback) {
  // set a default batch if it isn't passed.
  if (!callback) {
    callback = batch;
    batch = 10000;
  }
  // define our array and a find method we can call recursively.
  var all = [],
      find = function(skip) {
        // skip defaults to 0
        skip = skip || 0;
        this.find(query, ['_id'], { limit: batch, skip: skip }, function(err, items) {
          if (err) {
            // if an error is thrown, call back with it and how far we got in the array.
            callback(err, all);
          } else if (items && items.length) {
            // if we returned any items, grab their ids and put them in the 'all' array
            var ids = items.map(function(i) { return i._id.toString(); });
            all = all.concat(ids);
            // recurse
            find.call(this, skip + batch);
          } else {
            // we have recursed and not returned any ids. This means we have them all.
            callback(err, all);
          }
        }.bind(this));
      };
  // start the recursion
  find.call(this);
};
This method will return a giant array of _ids. Because they are already indexed, it's actually pretty fast, but it's still calling the db many more times than is necessary. When this method calls back, you can do an update with the ids, like this:
Model.update({ _id: { $in: ids } }, { 'foo': 'bar' }, { multi: true }, function(err) { console.log('hooray, more than 100 records updated.'); });
This isn't the most elegant way to solve this problem. You can tune its efficiency by setting the batch size based on expected results, but obviously the ability to simply call update (or find, for that matter) on $near queries without a limit would really help.
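One more avenue that may be worth testing (an assumption on my part, not from the original answer): the $geoWithin operator (called $within before MongoDB 2.4) does not apply the 100-document default that $near does, so the multi update can run directly against a circle instead of paginating ids. With a legacy '2d' index, $center takes its radius in the same flat coordinate units as $maxDistance:

User.update(
  { 'location.loc': {
      // $geoWithin is allowed where $near is limited; radius in coordinate units
      $geoWithin: { $center: [[-122.4192, 37.7793], 0.4] }
    }
  },
  { foo: 'bar' },
  { multi: true },
  function(err) {
    console.log('done!');
  }
);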
