I know that it is bad practice to use skip to implement pagination, because when your data gets large skip starts to consume a lot of memory. One way around this is to page using the natural order of the _id field:
//Page 1
db.users.find().limit(pageSize);
//Find the id of the last document in this page
last_id = ...
//Page 2
users = db.users.find({ '_id': { '$gt': last_id } }).limit(pageSize);
The problem is that I'm new to Mongo and don't know the best way to get this last_id.
The concept you are talking about can be called "forward paging". The reason for the name is that, unlike the .skip() and .limit() modifiers, it cannot be used to "go back" to a previous page or to "skip" to a specific page, at least not without a great deal of effort to store "seen" or "discovered" pages. So if that type of "links to page" paging is what you want, you are best off sticking with the .skip() and .limit() approach, despite the performance drawbacks.
If only "moving forward" is a viable option for you, then here is the basic concept:
db.junk.find().limit(3)
{ "_id" : ObjectId("54c03f0c2f63310180151877"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f63310180151878"), "a" : 4, "b" : 4 }
{ "_id" : ObjectId("54c03f0c2f63310180151879"), "a" : 10, "b" : 10 }
Of course that's your first page with a limit of 3 items. Consider that now with code iterating the cursor:
var lastSeen = null;
var cursor = db.junk.find().limit(3);
while (cursor.hasNext()) {
  var doc = cursor.next();
  printjson(doc);
  if (!cursor.hasNext())
    lastSeen = doc._id;
}
That iterates the cursor and does something with each document. When the last item in the cursor is reached, lastSeen is set to the _id of that document:
ObjectId("54c03f0c2f63310180151879")
In subsequent iterations you just feed that _id value, which you keep somewhere (in the session or wherever), to the query:
var cursor = db.junk.find({ "_id": { "$gt": lastSeen } }).limit(3);
while (cursor.hasNext()) {
  var doc = cursor.next();
  printjson(doc);
  if (!cursor.hasNext())
    lastSeen = doc._id;
}
{ "_id" : ObjectId("54c03f0c2f6331018015187a"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f6331018015187b"), "a" : 6, "b" : 6 }
{ "_id" : ObjectId("54c03f0c2f6331018015187c"), "a" : 7, "b" : 7 }
And the process repeats over and over until no more results can be obtained.
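The loop above can be tried without a database. The following is a minimal sketch that simulates the same forward paging against an in-memory array. The docs array and the fetchPage helper are hypothetical stand-ins, and the filter and slice only mimic find({ _id: { $gt: lastSeen } }).sort({ _id: 1 }).limit(pageSize). The explicit sort on _id is an addition here, on the assumption that you want the page order to be guaranteed rather than left to the default.

```javascript
// Hypothetical in-memory stand-in for a collection with seven documents.
const docs = [1, 2, 3, 4, 5, 6, 7].map(n => ({ _id: n, a: n }));

// Mimics find({ _id: { $gt: lastSeen } }).sort({ _id: 1 }).limit(pageSize).
function fetchPage(lastSeen, pageSize) {
  const page = docs
    .filter(d => lastSeen === null || d._id > lastSeen)
    .sort((x, y) => x._id - y._id)
    .slice(0, pageSize);
  // Remember the _id of the last document on this page for the next call.
  const next = page.length ? page[page.length - 1]._id : lastSeen;
  return { page: page, lastSeen: next };
}

const pages = [];
let lastSeen = null;
while (true) {
  const result = fetchPage(lastSeen, 3);
  if (result.page.length === 0) break; // no more results
  pages.push(result.page.map(d => d._id));
  lastSeen = result.lastSeen;
}
console.log(pages); // [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7 ] ]
```

Only the last _id has to be carried between requests, which is why this approach cannot jump to an arbitrary page.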
That's the basic process for a natural order such as _id. For something else it gets a bit more complex. Consider the following:
{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
{ "_id": 1, "rank": 3 }
{ "_id": 3, "rank": 2 }
To split that into two pages sorted by rank, you essentially need to know what you have "already seen" and exclude those results. So looking at a first page:
var lastSeen = null;
var seenIds = [];
var cursor = db.junk.find().sort({ "rank": -1 }).limit(2);
while (cursor.hasNext()) {
  var doc = cursor.next();
  printjson(doc);
  // A new rank value means the ids collected for the previous rank are no longer needed
  if ( lastSeen != null && doc.rank != lastSeen )
    seenIds = [];
  seenIds.push(doc._id);
  // Track the rank of every document so the check above always compares against the previous one
  lastSeen = doc.rank;
}
{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
On the next iteration you want the "rank" score to be less than or equal to the lastSeen value, while also excluding the documents already seen. You do this with the $nin operator:
var cursor = db.junk.find(
  { "_id": { "$nin": seenIds }, "rank": { "$lte": lastSeen } }
).sort({ "rank": -1 }).limit(2);
while (cursor.hasNext()) {
  var doc = cursor.next();
  printjson(doc);
  // A new rank value means the ids collected for the previous rank are no longer needed
  if ( lastSeen != null && doc.rank != lastSeen )
    seenIds = [];
  seenIds.push(doc._id);
  // Track the rank of every document so the check above always compares against the previous one
  lastSeen = doc.rank;
}
{ "_id": 1, "rank": 3 }
{ "_id": 3, "rank": 2 }
How many "seenIds" you actually hold on to depends on how "granular" your results are, that is, how often the sorted value is likely to change. In this case you can check whether the current "rank" score differs from the lastSeen value and, if so, discard the present seenIds content so it does not grow too much.
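To see how lastSeen and seenIds evolve across pages, here is a minimal in-memory sketch of the same idea using the four sample documents. The docs array and the fetchRankPage helper are hypothetical, and the filters only mimic the $nin and $lte conditions. The sketch updates lastSeen on every document, which is an assumption made so that seenIds always holds every id seen for the current rank.

```javascript
// Hypothetical in-memory stand-in for the sample collection.
const docs = [
  { _id: 4, rank: 3 },
  { _id: 8, rank: 3 },
  { _id: 1, rank: 3 },
  { _id: 3, rank: 2 }
];

// Mimics find({ _id: { $nin: seenIds }, rank: { $lte: lastSeen } })
//   .sort({ rank: -1 }).limit(pageSize), then returns the updated paging state.
function fetchRankPage(state, pageSize) {
  const page = docs
    .filter(d => state.lastSeen === null || d.rank <= state.lastSeen)
    .filter(d => !state.seenIds.includes(d._id))
    .sort((x, y) => y.rank - x.rank)
    .slice(0, pageSize);

  let lastSeen = state.lastSeen;
  let seenIds = state.seenIds.slice();
  for (const doc of page) {
    // A new rank value makes the ids of the previous rank irrelevant.
    if (lastSeen !== null && doc.rank !== lastSeen) seenIds = [];
    seenIds.push(doc._id);
    lastSeen = doc.rank;
  }
  return { page: page, state: { lastSeen: lastSeen, seenIds: seenIds } };
}

const first = fetchRankPage({ lastSeen: null, seenIds: [] }, 2);
const second = fetchRankPage(first.state, 2);
const third = fetchRankPage(second.state, 2);
console.log(first.page.map(d => d._id));  // [ 4, 8 ]
console.log(second.page.map(d => d._id)); // [ 1, 3 ]
console.log(second.state);                // { lastSeen: 2, seenIds: [ 3 ] }
console.log(third.page.length);           // 0
```

After the second page the ids for rank 3 have been discarded, so the state that must be kept between requests stays small.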
Those are the basic concepts of "forward paging" for you to practice and learn.
The simplest way to implement pagination in MongoDB is with skip and limit:
// Pagination
const page = parseInt(req.query.page, 10) || 1;
const limit = parseInt(req.query.limit, 10) || 25;
const startIndex = (page - 1) * limit;
const endIndex = page * limit;
query = query.skip(startIndex).limit(limit);
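The snippet above computes endIndex but never uses it. One plausible use, and this is an assumption since the original does not show it, is deciding whether a next page exists. The sketch below wraps the same arithmetic in a hypothetical paginationMeta helper; total stands for a count of the matching documents that you would obtain separately.

```javascript
// Hypothetical helper: derives skip/limit values and the neighbouring page numbers.
function paginationMeta(pageParam, limitParam, total) {
  // Fall back to defaults for missing or invalid input, and never go below 1.
  const page = Math.max(parseInt(pageParam, 10) || 1, 1);
  const limit = Math.max(parseInt(limitParam, 10) || 25, 1);
  const startIndex = (page - 1) * limit;
  const endIndex = page * limit;
  return {
    skip: startIndex,
    limit: limit,
    next: endIndex < total ? page + 1 : null, // null when this is the last page
    prev: startIndex > 0 ? page - 1 : null    // null when this is the first page
  };
}

const meta = paginationMeta("2", undefined, 60);
console.log(meta); // { skip: 25, limit: 25, next: 3, prev: 1 }
```

The skip and limit values can then be passed to query.skip(...).limit(...) exactly as in the snippet above.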
Basically I have documents with a field called "Difficulty Level", and the value of this field is between 1 and 10 for each document.
I have to select 10 or 20 random documents such that the selection contains at least one document for each difficulty level from 1 to 10. That is, there should be at least one document with "Difficulty level" : 1, one with "Difficulty level" : 2, one with "Difficulty level" : 3, and so on up to "Difficulty level" : 10.
So, how can I select documents randomly with this condition fulfilled?
Thanks
I tried the $rand operator for selecting random documents, but couldn't find a solution that satisfies this condition.
If I've understood correctly you can try something like this:
The goal is to build a query like the example below.
It gets two random documents using $sample, one for level 1 and another for level 2. Using $facet you can run both sub-pipelines and get multiple results from one query.
db.collection.aggregate([
  {
    "$facet": {
      "difficulty_level_1": [
        { "$match": { "difficulty_level": 1 } },
        { "$sample": { "size": 1 } }
      ],
      "difficulty_level_2": [
        { "$match": { "difficulty_level": 2 } },
        { "$sample": { "size": 1 } }
      ]
    }
  }
])
So the point is to build this query dynamically. You can use JS to create the query object and pass it to the Mongo call.
const random = Math.floor((Math.random()*10)+1) // Or whatever to get the random number
let query = {"$facet":{}}
for(let i = 1 ; i <= random; i++){
  const difficulty_level = `difficulty_level_${i}`
  query["$facet"][difficulty_level] = [
    { $match: { difficulty_level: i }},
    { $sample: { size: 1 }}
  ]
}
console.log(query) // This output can be used in mongoplayground and it works!
// To use the query you can use something like this (or however you call the DB)
this.db.aggregate([query])
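Note that the loop above stops at a random level, so higher levels can be left out, while the question asks for every level from 1 to 10. As a hedged variation on the same technique, the sketch below builds one $facet branch per level. The buildFacetStage helper and its perLevel parameter are hypothetical names, and the field name difficulty_level is taken from the example above.

```javascript
// Hypothetical helper: one $facet branch per difficulty level,
// each sampling perLevel random documents of that level.
function buildFacetStage(levels, perLevel) {
  const facet = {};
  for (const level of levels) {
    facet[`difficulty_level_${level}`] = [
      { $match: { difficulty_level: level } },
      { $sample: { size: perLevel } }
    ];
  }
  return { $facet: facet };
}

const levels = Array.from({ length: 10 }, (_, i) => i + 1); // 1..10
const stage = buildFacetStage(levels, 2); // 10 levels x 2 = up to 20 documents
console.log(Object.keys(stage.$facet).length); // 10
```

With perLevel set to 1 this yields up to 10 documents, and with 2 up to 20, spread evenly across levels. If the extra documents should be distributed randomly instead, that would need additional logic not shown here.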
I am new to Node.js. I want the result below, but only the last value is stored.
"child_skills" : [
"Nodejs",
"Android",
"Javascript"
],
My Node.js method:
export function create(req, res) {
  return JobCategories.create(req.body)
    .then((JobCategoryInstance) => {
      var childskill = [];
      for (var i = 0; i < JobCategoryInstance.child_categories.length; i++) {
        childskill = JobCategoryInstance.child_categories[i].child_categoryname;
      }
      EngineerSkills.create({ skill_name: JobCategoryInstance.category_name, child_skills: childskill });
      return JobCategoryInstance;
    })
    .then(respondWithResult(res, 201))
    .catch(handleError(res));
}
My result is below. Why does it contain only the last value?
{
"_id" : ObjectId("58c2d5019caa49199854872e"),
"skill_name" : "soft",
"date_updated" : ISODate("2017-03-10T16:32:01.437Z"),
"child_skills" : [
"Javascript"
],
"__v" : 0
}
Instead of adding elements to the array, you are substituting the array with one of the elements every time.
Change this:
childskill = JobCategoryInstance.child_categories[i].child_categoryname;
to this:
childskill.push(JobCategoryInstance.child_categories[i].child_categoryname);
if you want new elements to be added to the array every time.
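As an alternative to pushing inside a loop, the same array can be built with Array.prototype.map. This is a small sketch on sample data shaped like the JobCategoryInstance in the question; the values are illustrative only.

```javascript
// Sample input shaped like the JobCategoryInstance in the question (illustrative values).
const jobCategory = {
  category_name: "soft",
  child_categories: [
    { child_categoryname: "Nodejs" },
    { child_categoryname: "Android" },
    { child_categoryname: "Javascript" }
  ]
};

// map returns a new array with one entry per child category.
const childSkills = jobCategory.child_categories.map(c => c.child_categoryname);
console.log(childSkills); // [ 'Nodejs', 'Android', 'Javascript' ]
```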
I have some product data where some products don't have the key "images.cover".
Now when I try to print all the data, it shows the error
Cannot read property 'cover' of undefined.
So I want to set var cover = ''; if the images.cover key is not present, and otherwise use the images.cover value. I'm using Node.js and MongoDB.
From the error message:
Cannot read property 'cover' of undefined
you can narrow the source of the error on the problem product document down to one of three cases:
the document doesn't have an images field (hence the undefined object),
the images field may be null, or
the cover key may not be present.
Let's consider a minimal test case where a sample collection has documents covering these cases, plus one with the cover key set (note that the sample documents name the field image rather than images):
db.product.insert([
/* 0 */
{
"_id" : 1,
"image" : {
"cover" : "test1",
"url" : "url1"
}
},
/* 1 */
{
"_id" : 2,
"image" : {
"url" : "url2"
}
},
/* 2 */
{
"_id" : 3
},
/* 3 */
{
"_id" : 4,
"image" : {
"cover" : null,
"url" : "url4"
}
}
]);
In the mongo shell you can check whether a key is present either by using native JavaScript checks or by using MongoDB's $exists operator. For the former, you could try:
var covers = [];
db.product.find().forEach(function (doc) {
  var cover = "";
  // Only read cover when the image sub-document exists and actually has the key
  if (doc.image !== undefined && doc.image !== null && doc.image.cover !== undefined) {
    cover = doc.image.cover;
  }
  covers.push(cover);
});
print(covers); /* will print to mongo shell:
{
"0" : "test1",
"1" : "",
"2" : "",
"3" : null
}
*/
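In Node.js 14 or later, the same guard can be written more compactly with optional chaining and nullish coalescing. The sketch below uses a hypothetical coverOf helper on plain objects shaped like the sample documents. Unlike the loop above, it also maps a null cover to an empty string, which is an assumption about what you want.

```javascript
// Hypothetical helper: returns the cover value, or "" when image or cover is missing or null.
function coverOf(doc) {
  return doc.image?.cover ?? "";
}

// Plain objects shaped like the sample documents above.
const products = [
  { _id: 1, image: { cover: "test1", url: "url1" } },
  { _id: 2, image: { url: "url2" } },
  { _id: 3 },
  { _id: 4, image: { cover: null, url: "url4" } }
];

const covers = products.map(coverOf);
console.log(covers); // [ 'test1', '', '', '' ]
```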
The $exists operator with its value set to true matches documents that contain the field, including documents where the field value is null. This route is probably not going to work in your case, since you would also like to assign a cover value for the unmatched documents:
var covers = [];
db.product.find({ "image.cover": { $exists: true } }).forEach(function (doc) {
  covers.push(doc.image.cover);
});
print(covers); /* this will print to mongo shell:
{
"0" : "test1",
"1" : null
}
*/
I have an array that contains items where BOTH the ID and the money value are duplicated. Is there a way to remove one of the duplicate array items?
userName: "abc",
_id: 10239201141,
rounds:
[{
  "roundId": "foo",
  "money": "123"
}, // Keep one of these
{ // Keep one of these
  "roundId": "foo",
  "money": "123"
},
{
  "roundId": "foo",
  "money": "321" // Not a duplicate.
}]
I'd like to remove one of the first two, and keep the third because the id and money are not duplicated in the array.
Thank you in advance!
Edit: I found:
db.users.ensureIndex({'rounds.roundId':1, 'rounds.money':1}, {unique:true, dropDups:true})
This doesn't help me. Can someone help me? I spent hours trying to figure this out.
The thing is, I ran my node.js website on two machines so it was pushing the same data twice. Knowing this, the duplicate data should be 1 index away. I made a simple for loop that can detect if there is duplicate data in my situation, how could I implement this with mongodb so it removes an array object AT that array index?
for (var i in data){
  var tempRounds = data[i]['rounds'];
  for (var ii in data[i]['rounds']){
    var currentArrayItem = data[i]['rounds'][ii - 1];
    if (tempRounds[ii - 1]) {
      if (currentArrayItem.roundId == tempRounds[ii - 1].roundId && currentArrayItem.money == tempRounds[ii - 1].money) {
        console.log("Found a match");
      }
    }
  }
}
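Use the aggregation framework to compute a deduplicated version of each document (the example uses a collection named test and an array field named stats; substitute your own names, such as users and rounds):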
db.test.aggregate([
  { "$unwind" : "$stats" },
  { "$group" : { "_id" : "$_id", "stats" : { "$addToSet" : "$stats" } } }, // use $first to add in other document fields here
  { "$out" : "some_other_collection_name" }
])
Use $out to put the results in another collection, since aggregation cannot update documents. You can use db.collection.renameCollection with dropTarget to replace the old collection with the new deduplicated one. Be sure you're doing the right thing before you scrap the old data, though.
Warnings:
1: This does not preserve the order of elements in the stats array. If you need to preserve order, you will have to retrieve each document from the database, manually deduplicate the array client-side, then update the document in the database.
2: The following two objects won't be considered duplicates of each other:
{ "id" : "foo", "price" : 123 }
{ "price" : 123, "id" : "foo" }
If you think you have mixed key orders, use a $project to enforce a key order between the $unwind stage and the $group stage:
{ "$project" : { "stats" : { "id_" : "$stats.id", "price_" : "$stats.price" } } }
Make sure to change id -> id_ and price -> price_ in the rest of the pipeline and rename them back to id and price at the end, or rename them in another $project after the swap. I discovered that, if you do not give different names to the fields in the project, it doesn't reorder them, even though key order is meaningful in an object in MongoDB:
> db.test.drop()
> db.test.insert({ "a" : { "x" : 1, "y" : 2 } })
> db.test.aggregate([
{ "$project" : { "_id" : 0, "a" : { "y" : "$a.y", "x" : "$a.x" } } }
])
{ "a" : { "x" : 1, "y" : 2 } }
> db.test.aggregate([
{ "$project" : { "_id" : 0, "a" : { "y_" : "$a.y", "x_" : "$a.x" } } }
])
{ "a" : { "y_" : 2, "x_" : 1 } }
Since the key order is meaningful, I'd consider this a bug, but it's easy to work around.
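For the client-side route mentioned in the first warning, here is a minimal sketch of an order-preserving deduplication. The dedupeByFields helper is hypothetical. It compares items by an explicit list of field values, so it does not depend on key order, which also sidesteps the second warning. The field names roundId and money are taken from the question.

```javascript
// Hypothetical helper: keeps the first occurrence of each combination of field values,
// preserving the original array order.
function dedupeByFields(items, fields) {
  const seen = new Set();
  return items.filter(item => {
    // Build the comparison key from the values in a fixed field order.
    const key = JSON.stringify(fields.map(f => item[f]));
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const rounds = [
  { roundId: "foo", money: "123" },
  { money: "123", roundId: "foo" }, // same values, different key order
  { roundId: "foo", money: "321" }
];

const deduped = dedupeByFields(rounds, ["roundId", "money"]);
console.log(deduped); // [ { roundId: 'foo', money: '123' }, { roundId: 'foo', money: '321' } ]
```

The cleaned array could then be written back to the document with an update that uses $set on the array field.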
Mongo docs state:
The Mongo multikey feature can automatically index arrays of values.
That's nice. But how about sorting based on multikeys? More specifically, how to sort a collection according to array match percentage?
For example, I have a pattern [ 'fruit', 'citrus' ] and a collection, that looks like this:
{
title: 'Apples',
tags: [ 'fruit' ]
},
{
title: 'Oranges',
tags: [ 'fruit', 'citrus' ]
},
{
title: 'Potato',
tags: [ 'vegetable' ]
}
Now, I want to sort the collection according to match percentage of each entry to the tags pattern. Oranges must come first, apples second and potatoes last.
What's the most efficient and easy way to do it?
As of MongoDB 2.1 a similar computation can be done using the aggregation framework. The syntax is something like
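db.fruits.aggregate([
  {$match : {tags : {$in : ["fruit", "citrus"]}}},
  {$unwind : "$tags"},
  // Keep only the unwound tags that are in the pattern, so non-matching tags are not counted
  {$match : {tags : {$in : ["fruit", "citrus"]}}},
  {$group : {_id : "$title", numTagMatches : {$sum : 1}}},
  {$sort : {numTagMatches : -1}}
])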
which returns
{
"_id" : "Oranges",
"numTagMatches" : 2
},
{
"_id" : "Apples",
"numTagMatches" : 1
}
This should be much faster than the map-reduce method for two reasons. First, the implementation is native C++ rather than JavaScript. Second, "$match" filters out the items which don't match at all (if this is not what you want, you can leave out the "$match" filtering, and change the "$sum" part to add either 1 or 0 depending on whether the tag is equal to "fruit" or "citrus" or neither).
The only caveat here is that mongo 2.1 isn't recommended for production yet. If you're running in production you'll need to wait for 2.2. But if you're just experimenting on your own you can play around with 2.1, as the aggregation framework should be more performant.
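The variant mentioned above, where "$match" is left out and "$sum" adds 1 or 0 per tag, could look roughly like the sketch below. It assumes a server version that supports the $in aggregation expression (3.4 or later, as far as I know), which is much newer than the versions discussed here. The buildTagScorePipeline helper is a hypothetical name.

```javascript
// Hypothetical helper: scores every document by how many of its tags are in the pattern.
// Documents with no matching tag get a score of 0 instead of being filtered out.
function buildTagScorePipeline(pattern) {
  return [
    { $unwind: "$tags" },
    {
      $group: {
        _id: "$title",
        // Add 1 for each unwound tag found in the pattern, 0 otherwise.
        numTagMatches: { $sum: { $cond: [{ $in: ["$tags", pattern] }, 1, 0] } }
      }
    },
    { $sort: { numTagMatches: -1 } }
  ];
}

const pipeline = buildTagScorePipeline(["fruit", "citrus"]);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array would be passed to db.fruits.aggregate(pipeline). Note that $unwind drops documents whose tags array is empty or missing, so those would still not appear in the output.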
Note: The following explanation is required for Mongo 2.0 and earlier. For later versions you should consider the new aggregation framework.
We do something similar while trying to fuzzy-match input sentences, which we index. You can use map-reduce to emit the object ID every time you get a match and then sum the matches up. You'll then need to load the results into your client and sort by the highest value first.
db.plants.mapReduce(
  function () {
    // Count how many of the target terms appear in this document's tags
    var matches = 0;
    for (var i = 0; i < targetTerms.length; i++) {
      var term = targetTerms[i];
      for (var j = 0; j < this.tags.length; j++) {
        matches += Number(term === this.tags[j]);
      }
    }
    emit(this._id, matches);
  },
  function (key, values) {
    // Sum the match counts emitted for the same key
    var result = 0;
    for (var i = 0; i < values.length; i++) {
      result += values[i];
    }
    return result;
  },
  {
    out: { inline: 1 },
    scope: {
      targetTerms: [ 'fruit', 'citrus' ]
    }
  }
);
You pass your ['fruit', 'citrus'] input values using the scope parameter in the map-reduce call, as {targetTerms: ['fruit', 'citrus']}, so that they are available in the map function above.
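The client-side sorting step mentioned above is not shown, so here is a minimal sketch of it. The response object is illustrative data shaped like an inline map-reduce result, where each entry has an _id and a value. The string ids are placeholders for the real object IDs.

```javascript
// Illustrative data shaped like an inline map-reduce response.
const response = {
  results: [
    { _id: "apples", value: 1 },
    { _id: "oranges", value: 2 },
    { _id: "potato", value: 0 }
  ]
};

// Copy the array before sorting so the original response is left untouched,
// then order by match count, highest first.
const ranked = response.results.slice().sort((x, y) => y.value - x.value);
console.log(ranked.map(r => r._id)); // [ 'oranges', 'apples', 'potato' ]
```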