MongoDB query fast in mongo shell but slow/times out in nodejs - node.js

The following query takes approximately 54ms in the mongo shell, but times out after a minute in the nodejs driver:
db.posts.find(
    {
        parentId: 10,
        modifiedGmt: {
            "$gte": new Date("2017-01-01T00:00:00.000z")
        },
        postType: {
            "$in": ["type1", "type2"]
        }
    },
    {
        _id: 1,
        parentId: 1,
        postId: 1,
        url: 1,
        modifiedGmt: 1,
        authors: 1,
        taxonomy: 1,
        postType: 1
    }
).sort({modifiedGmt: 1}).limit(2400)
Explain shows that the query is using existing indexes. If I drop the limit to something very low like 10, it won't time out but it will take far, far too long. I'm not really sure where to go with this. It's a large collection but with the indexes in place and the limit sub-10000 I don't see why it would be so slow.
Any ideas?

I suspect that the .sort() is not able to use your index.
I would recommend reading this page: sort-operator-and-performance.
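If that turns out to be the problem, a compound index that covers both the equality filters and the sort key may help. This is only a sketch, assuming the field names from the question; verify against your own data with explain():

// Hypothetical compound index: equality field first, the $in field next,
// the sort field last. With a small $in list MongoDB can often still use
// the index to satisfy the sort (no in-memory SORT stage).
db.posts.createIndex({ parentId: 1, postType: 1, modifiedGmt: 1 })

// Check that the winning plan no longer contains a SORT stage:
db.posts.find({ parentId: 10, postType: { $in: ["type1", "type2"] } })
    .sort({ modifiedGmt: 1 })
    .explain("executionStats")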

Related

Mongo DB: How do I query by both Id and date

I am trying to query a MongoDB database by two parameters using Mongoose. I need to query by who the document was created by, and also by a date inside a subdocument array called events. I want to bring back all documents within a timeframe.
My query looks like this.
var earliest = new Date(2018,0,3);
var latest = new Date(2018,0,4);
Goal.find({ createdBy: userId, 'events.date': { $gte: earliest, $lte: latest } })
    .exec(function(err, doc) { /* do stuff */ });
The document below is what was returned. I get everything in my database back, and my date range query isn't taken into account. I'm new to MongoDB and I don't know what I am doing wrong.
[
    {
        _id: "5a4dac123f37dd3818950493",
        goalName: "My First Goal",
        createdBy: "5a4dab8c3f37dd3818950492",
        __v: 0,
        events:
        [
            {
                _id: "5a4dac123f37dd3818950494",
                eventText: "Test Goal",
                eventType: "multiDay",
                date: "2018-01-03T00:00:00.000Z",
                eventLength: 7,
                completed: false
            },
            {
                _id: "5a4dac123f37dd3818950495",
                eventText: "Test Goal",
                eventType: "multiDay",
                date: "2018-01-04T00:00:00.000Z",
                eventLength: 7,
                completed: false
            },
            {
                _id: "5a4dac123f37dd3818950496",
                eventText: "Test Goal",
                eventType: "multiDay",
                date: "2018-01-05T00:00:00.000Z",
                eventLength: 7,
                completed: false
            }
        ],
        startDate: "2018-01-04T00:00:00.000Z",
        createdOn: "2018-01-04T00:00:00.000Z"
    }
]
There is a difference between matching documents and matching "elements of an array". Your document already contains the whole array, even the values that don't match your array filter criteria. But since the document as a whole matches your criteria, the whole document is returned (with all the array entries).
If you just want the matching "elements", then use .aggregate() instead. An example of how to use aggregate for such a task is available at Mongodb find inside sub array.
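For reference, here is a minimal sketch of that approach, assuming the Goal model and the earliest/latest dates from the question, using $match plus $project with $filter to keep only the events inside the range. Note that .aggregate() does not apply schema casting, so userId may need to be converted to an ObjectId first.

// Sketch only: field names taken from the question.
Goal.aggregate([
    { $match: { createdBy: userId, 'events.date': { $gte: earliest, $lte: latest } } },
    { $project: {
        goalName: 1,
        createdBy: 1,
        events: {
            $filter: {
                input: '$events',
                as: 'event',
                cond: { $and: [
                    { $gte: ['$$event.date', earliest] },
                    { $lte: ['$$event.date', latest] }
                ] }
            }
        }
    } }
]).exec(function(err, docs) {
    // docs contain only the events whose date falls inside the range
});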

MongoDB schema schedule publish

I want to run a routine every 10 minutes that publishes posts built from data obtained from an API.
The cron routine and the rest work perfectly, but the problem comes when I try to get the article whose date is closest to expiration and which has not been published, or has been published as few times as possible.
I have tried creating an incremental field with the number of times it has been published, updating the publication date, and so on, but I can't get the formula right; I always get the same result, the same article.
Schema
"_id": item.ID,
"title": item.title,
"images": [ item.firstImage ],
"url": "none",
"expired": "2017-04-30T22:00:00+03:00",
"lastPublish": "2017-04-30T20:41:02+03:00",
"publish": 0
Code
db.find({}).sort({ 'expired': 1, 'lastPublish': 1 }).limit(1).exec(function(err, doc) {
    //db.find({}).sort({ 'publish': 1, 'expired': 1 }).limit(1).exec(function(err, doc) {
    db.update({ '_id': doc._id }, { '$set': { 'lastPublish': new Date(), '$inc': { 'publish': 1 } } });
});
I need to make a sort of circular queue: publish the posts that expire earliest first (their dates differ by hours or days) and, when the queue is exhausted, start over. But I always publish the same first post, and I'm running out of ideas.
I'd better step away for a few hours and come back to it later, when I've cleared my mind.
Thanks!
Solved! Running the following routine works properly. I don't know why it didn't work yesterday, apart from the missing selection of the array element ([0]).
db.find({}).sort({ 'publish': 1, 'expired': 1 }).limit(1).exec(function(err, doc) {
    db.update({ '_id': doc[0]._id }, { '$inc': { 'publish': 1 }, '$set': { 'lastPublish': new Date() } });
});
Thanks!
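As an aside, the find-then-update pair can be collapsed into a single atomic call if the underlying driver supports it. A rough sketch, assuming the official MongoDB Node.js driver (or Mongoose), which is not necessarily the same db wrapper used above:

// Sketch only: picks the least-published, soonest-expiring post and
// bumps its counters in one atomic operation.
collection.findOneAndUpdate(
    {},
    { $inc: { publish: 1 }, $set: { lastPublish: new Date() } },
    { sort: { publish: 1, expired: 1 }, returnDocument: 'after' },
    function(err, result) {
        // result.value is the post that was just selected for publication
        // (older driver versions use returnOriginal: false instead of returnDocument)
    }
);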

How to find a document by its position/index in the array?

I need to retrieve, let's say, the documents at positions 1, 5 and 8 in a MongoDB database using Mongoose.
Is it possible at all to get a document by its position in a collection? If so, could you show how to do that?
I need something like this:
var someDocs = MyModel.find({<collectionIndex>: [1, 5, 8]}, function(err, docs) {
//do something with the three documents
})
I tried the following to see what indexes are used in the collection, but I get a 'getIndexes is not a function' error:
var indexes = MyModel.getIndexes();
Appreciate any help.
If by position 5, for example, you mean the literal 5th element in your collection, that isn't the best way to go. A Mongo collection is usually in the order in which you inserted the elements, but it may not always be. See the answer here and check the docs on natural order: https://stackoverflow.com/a/33018164/7531267.
In your case, you might have a unique id on each record that you can query by.
Assuming the [1, 5, 8] you mentioned are the ids, something like this should do it:
var someDocs = MyModel.find({ $or: [{ id: 1 }, { id: 5 }, { id: 8 }] }, function(err, docs) {
    // do something with the three documents
});
You can also read about $in to replace $or and clean up the query a bit.
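For example, the same query with $in (keeping the hypothetical id field from above) might look like this:

var someDocs = MyModel.find({ id: { $in: [1, 5, 8] } }, function(err, docs) {
    // do something with the three documents
});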
Assume you have this document in a users collection:
{
    _id: ObjectId('...'),
    name: 'wrq',
    friends: ['A', 'B', 'C']
}
The code below searches for the first and third friends of user 'wrq':
db.users.aggregate(
    [
        {
            $match: {
                name: 'wrq'
            }
        },
        {
            $project: {
                friend1: {
                    $arrayElemAt: ["$friends", 0]
                },
                friend3: {
                    $arrayElemAt: ["$friends", 2]
                }
            }
        }
    ]
)

Combine multiple query with one single $in query and specify limit for each array field?

I am using Mongoose with Node.js for MongoDB. I need to make 20 parallel find queries against my database, each with a limit of 4 documents, as shown below; only brand_id changes for each brand.
areamodel.find({ brand_id: brand_id }, { '_id': 1 }, { limit: 4 }, function(err, docs) {
    if (err) {
        console.log(err);
    } else {
        console.log('fetched');
    }
});
Now, to run all these queries in parallel, I thought about putting all 20 brand_id values into an array of strings and then using a single $in query to get the results, but I don't know how to specify the limit of 4 for every matched value of the array.
I wrote the code below with aggregation, but I don't know where to specify a limit for each element of my array.
var brand_ids = ["brandid1", "brandid2", "brandid3", "brandid4", "brandid5", "brandid6", "brandid7", "brandid8", "brandid9", "brandid10", "brandid11", "brandid12", "brandid13", "brandid14", "brandid15", "brandid16", "brandid17", "brandid18", "brandid19", "brandid20"];
areamodel.aggregate(
    { $project: { _id: 1 } },
    { $match: { 'brand_id': { $in: brand_ids } } },
    function(err, docs) {
        if (err) {
            console.error(err);
        } else {
        }
    }
);
Can anyone please tell me how I can solve my problem using only one query?
UPDATE: Why I don't think $group will be helpful for me.
Suppose my brand_ids array contains these strings
brand_ids = ["id1", "id2", "id3", "id4", "id5"]
and my database has the documents below
{
    "brand_id": "id1",
    "name": "Levis",
    "loc": "india"
},
{
    "brand_id": "id1",
    "name": "Levis",
    "loc": "america"
},
{
    "brand_id": "id2",
    "name": "Lee",
    "loc": "india"
},
{
    "brand_id": "id2",
    "name": "Lee",
    "loc": "america"
}
Desired JSON output
{
    "name": "Levis"
},
{
    "name": "Lee"
}
For the above example, suppose I have 25,000 documents with "name" as "Levis" and 25,000 documents where "name" is "Lee"; if I use $group, then all 50,000 documents will be queried and grouped by "name".
But with the solution I want, once the first document with "Levis" and the first with "Lee" are found, I wouldn't have to look through the remaining thousands of documents.
Update: I think if anyone can tell me the following, then I can probably get to my solution.
Consider a case where I have 1,000 total documents in my MongoDB, and suppose that out of those 1,000, 100 will pass my match query.
Now, if I apply a limit of 4 to this query, will it take the same time to execute as the query without any limit, or not?
Why I am thinking about this case:
Because if my query takes the same time, then I don't think $group will increase my time, as all documents will be queried anyway.
But if the query with the limit takes less time than the query without it, then:
If I can apply a limit of 4 on each array element, my question will be solved.
If I cannot apply a limit on each array element, then I don't think $group will be useful, as in that case I have to scan all the documents to get the results.
FINAL UPDATE: As I read in the answer below and in the MongoDB docs, using $limit does not change the time taken by the query itself; it only reduces the amount of data sent over the network. So if anyone can tell me how to apply a limit on array fields (using $group or anything else), my problem will be solved.
mongodb: will limit() increase query speed?
Solution
My thinking about MongoDB was quite wrong: I thought adding a limit to a query decreases the time it takes, but that is not the case, which is why I spent so many days before trying the answer that Gregory NEUT and JohnnyHK suggested. Thanks a lot to both of you; I would have found the solution on day one if I had known about this. I really appreciate the help.
I propose using the $group aggregation stage to group all the documents you got from $match by brand_id, and then limiting each group with $slice.
Have a look at this Stack Overflow post:
db.collection.aggregate([
    {
        $sort: {
            created: -1
        }
    },
    {
        $group: {
            _id: '$city',
            title: {
                $push: '$title'
            }
        }
    },
    {
        $project: {
            _id: 0,
            city: '$_id',
            mostRecentTitle: {
                $slice: ['$title', 0, 2]
            }
        }
    }
])
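Adapted to the question's collection, the same idea might look roughly like this. It is only a sketch (it assumes you want up to 4 _id values per brand_id), and note that $push collects every matching _id before $slice trims each group:

areamodel.aggregate([
    { $match: { brand_id: { $in: brand_ids } } },
    { $group: { _id: '$brand_id', ids: { $push: '$_id' } } },
    { $project: { _id: 0, brand_id: '$_id', ids: { $slice: ['$ids', 4] } } }
], function(err, docs) {
    // docs: one entry per brand_id, each with at most 4 ids
});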
I propose using distinct, since that will return all different brand names in your collection. (I assume this is what you are trying to achieve?)
db.runCommand ( { distinct: "areamodel", key: "name" } )
MongoDB docs
In Mongoose I think it is: areamodel.db.db.command({ distinct: "areamodel", key: "name" }) (untested).
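Mongoose also exposes this directly on the model, so a sketch along these lines should work too:

// Model.distinct(field, [conditions], [callback]) is part of the Mongoose API.
areamodel.distinct('name', function(err, names) {
    // names is an array of the distinct name values
});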

Select last N records from MongoDB using node.js

Note: I have seen the other question and tried answers from it with no luck.
I have a collection in MongoDB:
{ _id: 1, category_id: 1, time_added: 1234567890, name: "abc" }
I need to find the last 10 entries with category_id equal to 10. Sounds simple?
collection.find({ 'category_id': 10 }, {}, { _id: -1, limit : 10}, function (e, d) {});
But running this gives me the first 10 records instead of the last 10. It looks like the driver gives priority to "limit" rather than "sorting"... I also tried with $natural and sorting on time_added. Whenever I run the same query from the command line, I get what I need. Here is what I type into the command line:
collection.find({ 'category_id': 10 }).sort({_id: -1}).limit(10)
What am I doing wrong? Is there an alternative way to do this with node.js?
It turns out the Node.js driver accepts chained calls in the same way the command line interface does, and every function takes an optional callback as its last argument. So this code runs and returns the correct results:
collection.find({ 'category_id': 10 }).sort({_id: -1}).limit(10, function (e, d) {})
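For what it's worth, with recent versions of the official mongodb driver the more conventional form is to resolve the cursor explicitly, e.g. with toArray (a sketch; newer driver releases drop the callback form in favour of promises):

collection.find({ 'category_id': 10 })
    .sort({ _id: -1 })
    .limit(10)
    .toArray(function(err, docs) {
        // docs holds the 10 most recent entries for category_id 10
    });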
