MongoDB performance issues with geoNear - node.js

Hey, I have a MongoDB database and I'm using Node.js with Mongoose.
I have a collection whose Mongoose schema looks like this:
{
location2d: {
type: [Number], //[<longitude>, <latitude>]
index: '2dsphere'
},
name: String,
owner: {type: Schema.Types.ObjectId, ref: 'Player', index: true},
}
This collection is quite big (500'000 documents). When I do a simple nearest find query, it runs quite fast ~10 ms.
But when I do something like this:
this.find({owner: {$ne:null}})
.where('location2d')
.near({center: [center.lon, center.lat], maxDistance: range / 6380000, spherical: true})
.limit(10)
.select('owner location2d')
.exec()
it takes a very long time, about 60 seconds! Just adding {owner: {$ne:null}} to the find method multiplies the query time by 6000.
What am I doing wrong? How can I improve this?
When I do a search by owner it's fast, when I do a search by proximity it's fast but when I combine both it's unbelievably slow.
Any clue?

OK, I found a solution, a little bit dirty but very fast.
1st: create a new field called ownerIdAsInt, which is the owner's Mongo ID parsed as an integer: document.ownerIdAsInt = parseInt(document.owner.toString(), 16). If owner is null, set that field to 0. (A 24-digit hex ID is larger than Number.MAX_SAFE_INTEGER, so the parsed value is only approximate; that is fine here because the field is only compared against 0.)
2nd: define a compound index with {ownerIdAsInt: 1, location2d: "2dsphere"}.
Your schema should look like this:
var schema = new Schema({
location2d: {
type: [Number], //[<longitude>, <latitude>]
index: '2dsphere'
},
name: String,
owner: {type: Schema.Types.ObjectId, ref: 'Player', index: true},
ownerIdAsInt: Number
});
schema.index({ownerIdAsInt: 1, location2d: "2dsphere"});
and the query now is:
this.find({ownerIdAsInt: {$gt: 0},
location2d: {$nearSphere: [center.lon, center.lat],
$maxDistance: range / 6380000}})
.limit(10)
.select('owner location2d')
.exec()
Results now take ~20 ms. Much faster!
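As a rough sketch of the two pieces this workaround relies on, the helpers below derive the ownerIdAsInt value and build the filter object passed to find(). The function names (ownerIdToNumber, metersToRadians, buildNearOwnedFilter) are made up for illustration and are not part of Mongoose; 6380000 is the Earth radius in meters used in the query above.

```javascript
// Earth radius in meters, the same constant the query above divides by.
const EARTH_RADIUS_METERS = 6380000;

// Convert a distance in meters to radians on the sphere.
function metersToRadians(meters) {
  return meters / EARTH_RADIUS_METERS;
}

// Value to store in ownerIdAsInt: 0 when there is no owner, otherwise the
// hex id parsed as a number (approximate, but always greater than 0).
function ownerIdToNumber(owner) {
  if (owner === null || owner === undefined) {
    return 0;
  }
  return parseInt(owner.toString(), 16);
}

// Filter object matching the compound index {ownerIdAsInt: 1, location2d: "2dsphere"}.
function buildNearOwnedFilter(center, rangeMeters) {
  return {
    ownerIdAsInt: { $gt: 0 },
    location2d: {
      $nearSphere: [center.lon, center.lat],
      $maxDistance: metersToRadians(rangeMeters)
    }
  };
}
```

The resulting object can be passed to this.find(...) in place of the inline literal, followed by the same .limit(10).select('owner location2d').exec() chain.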

Related

Bad performance on a sorting request

I have a very simple query in a NodeJS / Mongoose application:
const blocks = await Block
.find({
content: ObjectId(internalId),
})
.sort({ position: 1, _id: 1 })
with the schema:
const BlockSchema = mongoose.Schema({
id: String,
(...)
content: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Domain',
index: true
},
concept: {
type: mongoose.Schema.Types.ObjectId,
ref: 'ConceptDetails',
index: true
},
conceptDetails: {
type: mongoose.Schema.Types.ObjectId,
ref: 'ConceptDetails',
index: true
},
creator: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
}
});
const Block = mongoose.model(
'Block',
BlockSchema
);
The performance of this simple query was really bad with real data (around 900ms) so I added the following index:
db.blocks.createIndex({ position: 1, _id: 1 });
It improves the performance (around 330ms) but I expected to have something better for a request like that. FYI I have 13100 block items in the database.
Is there something else I can do to improve performance?
Thanks for your help!
This is because your find clause filters by content, which makes the { position: 1, _id: 1 } index unusable for this query. You can check this with explain(). In the query plan from my local replication of your scenario, the stage is COLLSCAN, which indicates that the index is not used.
What can we do?
We can build another compound index that includes the content field to speed up the query. Make sure content comes before your sort fields position and _id so the index can be utilized:
db.blocks.createIndex({ content: 1, position: 1, _id: 1 })
You can check the query plan again: it changes to IXSCAN, which means the new index is used. You can then expect a faster query thanks to the index scan.
You can check out the official MongoDB documentation for more details on query coverage and optimization.
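To make the COLLSCAN versus IXSCAN check concrete, here is a minimal sketch that walks the winningPlan of an explain() result and lists its stages. The helper names (collectStages, usesIndexScan) are hypothetical, and the two sample plans are hand-written, heavily trimmed shapes rather than real server output; the exact explain format can differ between MongoDB versions.

```javascript
// Recursively collect the stage names of a winningPlan from explain() output.
function collectStages(plan) {
  if (!plan) {
    return [];
  }
  const children = [];
  if (plan.inputStage) {
    children.push(plan.inputStage);
  }
  if (Array.isArray(plan.inputStages)) {
    children.push(...plan.inputStages);
  }
  return [plan.stage].concat(...children.map(collectStages));
}

// True when the plan scans an index and never falls back to a collection scan.
function usesIndexScan(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  return stages.includes('IXSCAN') && !stages.includes('COLLSCAN');
}

// Hand-written sample shapes (assumptions for illustration only).
const planWithoutIndex = {
  queryPlanner: { winningPlan: { stage: 'SORT', inputStage: { stage: 'COLLSCAN' } } }
};
const planWithIndex = {
  queryPlanner: { winningPlan: { stage: 'FETCH', inputStage: { stage: 'IXSCAN' } } }
};

console.log(usesIndexScan(planWithoutIndex)); // false
console.log(usesIndexScan(planWithIndex));    // true
```

In the application, the object to inspect would come from calling .explain() on the same find().sort() query.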

Mongoose populate query return all results with skip/limit

I have the following method in a little node/express app :
async getAll(req, res) {
const movies = await movieModel
.find()
.populate({path: 'genres', select: 'name'})
.skip(0)
.limit(15);
return res.send(movies);
};
With the following schema :
const MovieSchema = new mongoose.Schema({
externalId: { required: true, type: Number },
title: { required: true, type: String },
genres: [{ ref: "Genre", type: mongoose.Schema.Types.ObjectId }],
releaseDate: {type: Date},
originalLanguage: {type : String},
originalTitle: {type : String},
posterPath: {type : String},
backdropPath: {type : String},
overview: {type: String},
comments: [{ ref: "Comment", type: mongoose.Schema.Types.ObjectId }],
votes: [VoteSchema]
}, {timestamps: true});
MovieSchema.virtual("averageNote").get(function () {
let avg = 0;
if (this.votes.length == 0) {
return '-';
}
this.votes.forEach(vote => {
avg += vote.note;
});
avg = avg / this.votes.length;
return avg.toFixed(2);
});
MovieSchema.set("toJSON", {
transform: (doc, ret) => {
ret.id = ret._id;
delete ret._id;
delete ret.__v;
},
virtuals: true,
getters: true
});
However, the query always returns all document entries.
I also tried adding exec() at the end of the query, and .populate({path: 'genres', select: 'name', options: {skip: 0, limit: 15} }), but without result.
I tried another, simpler schema and skip/limit worked just fine, so the issue probably comes from my schema, but I can't figure out where the problem is.
I also tried with the virtual field commented out, but limit and skip were still not applied.
My guess is that it comes from votes: [VoteSchema] since it's the first time I've used this, but it was recommended by my teacher because using ref isn't recommended in a non-relational database. Furthermore, in order to calculate averageNote as a virtual field, I have no other choice.
EDIT: I just tried again with votes: [{ ref: "Vote", type: mongoose.Schema.Types.ObjectId }] and I still can't limit or skip.
Node version : 10.15.1
MongoDB version : 4.0.6
Mongoose version : 5.3.1
Let me know if I should add any other information.
This is really about how .populate() works and why the order of "chained methods" here is important. But in brief:
const movies = await movieModel
.find()
.skip(0)
.limit(15)
.populate({path: 'genres', select: 'name'}) // alternately .populate('genres','name')
.exec()
The problem is that .populate() really just runs another query against the database to "emulate" a join. It has nothing to do with the original .find(): all populate() does is take the results from the query and use certain values to "look up" documents in another collection with that other query. Importantly, its results come last.
The .skip() and .limit() methods, on the other hand, are cursor modifiers and directly part of the underlying MongoDB driver. They belong to the .find() and as such need to be in sequence.
The MongoDB driver part of the builder is forgiving in that:
.find().limit(15).skip(0)
is also acceptable because the options are passed in "all at once"; however, it's good practice to think of it as skip, then limit, in that order.
Overall, the populate() method must be the last thing on the chain after any cursor modifiers such as limit() or skip().
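For paging, the skip and limit values are usually derived from a page number. A minimal sketch, assuming a 1-based page parameter and a default page size of 15; pageToSkipLimit is a hypothetical helper, not a Mongoose function:

```javascript
// Turn a 1-based page number and a page size into skip/limit values.
// Invalid or missing input falls back to page 1 and a page size of 15.
function pageToSkipLimit(page, pageSize) {
  const safePage = Math.max(1, Math.floor(Number(page)) || 1);
  const safeSize = Math.max(1, Math.floor(Number(pageSize)) || 15);
  return { skip: (safePage - 1) * safeSize, limit: safeSize };
}

console.log(pageToSkipLimit(1, 15)); // { skip: 0, limit: 15 }
console.log(pageToSkipLimit(3, 15)); // { skip: 30, limit: 15 }
```

The two values then go into the chain in the order described above: movieModel.find().skip(skip).limit(limit).populate({path: 'genres', select: 'name'}).exec().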

Push element into nested array mongoose nodejs

I am trying to push a new element into an array. I use mongoose on my express/nodejs based API. Here is the code for mongoose:
Serie.updateOne({'seasons.episodes.videos._id': data._id}, {$push: {'seasons.episodes.videos.$.reports': data.details}},
function(err) {
if (err) {
res.status(500).send()
console.log(err)
}
else res.status(200).send()
})
As for my series model, it looks like this:
const serieSchema = new mongoose.Schema({
name: {type: String, unique:true, index: true, text: true},
customID: {type: Number, required: true, unique: true, index: true},
featured: {type: Boolean, default: false, index: true},
seasons: [{
number: Number,
episodes: [{number: Number, videos: [
{
_id: ObjectId,
provider: String,
reports: [{
title: {type: String, required: true},
description: String
}],
quality: {type: String, index: true, lowercase: true},
language: {type: String, index: true, lowercase: true},
}
]}]
}],
});
When I execute my code, I get MongoDB error code 16837, which says "cannot use the part (seasons of seasons.episodes.videos.0.reports) to traverse the element (my element here on json)"
I've tried many other queries to solve this problem but none worked, I hope someone could figure this out.
In your query you're using the positional operator ($) to locate one particular video by _id, and then you want to push one item to reports.
The problem is that MongoDB doesn't know which video you're trying to update because the path you specified (seasons.episodes.videos.$.reports) contains two other arrays (seasons and episodes).
As the documentation states, you can't use this operator more than once:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
This limitation complicates your situation. You can still update your reports, but you need to know the exact indexes of the outer arrays. So the following update would be a working example:
db.movies.update({'seasons.episodes.videos._id': data._id}, {$push: {'seasons.0.episodes.0.videos.$.reports': data.details}})
Alternatively, you can update a bigger part of this document in Node.js, or rethink your schema design keeping the technology's limitations in mind.
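One way to obtain the exact indexes is to load the document first and resolve them in Node.js. The sketch below goes one step further than the example above and resolves all three indexes, so the resulting path contains no positional operator at all. findReportsPath and buildPushReportUpdate are hypothetical helpers, and the sample document is invented for illustration.

```javascript
// Locate a video by _id inside seasons -> episodes -> videos and return the
// fully indexed dotted path to its reports array, or null when not found.
function findReportsPath(serie, videoId) {
  const wanted = String(videoId);
  const seasons = serie.seasons || [];
  for (let s = 0; s < seasons.length; s++) {
    const episodes = seasons[s].episodes || [];
    for (let e = 0; e < episodes.length; e++) {
      const videos = episodes[e].videos || [];
      for (let v = 0; v < videos.length; v++) {
        // Compare as strings so ObjectId values and plain strings both work.
        if (String(videos[v]._id) === wanted) {
          return `seasons.${s}.episodes.${e}.videos.${v}.reports`;
        }
      }
    }
  }
  return null;
}

// Build the $push update document for the located video.
function buildPushReportUpdate(serie, videoId, report) {
  const path = findReportsPath(serie, videoId);
  return path === null ? null : { $push: { [path]: report } };
}

// Invented sample document with the same nesting as the schema above.
const sampleSerie = {
  seasons: [
    { number: 1, episodes: [{ number: 1, videos: [{ _id: 'a1', reports: [] }] }] },
    { number: 2, episodes: [
      { number: 1, videos: [] },
      { number: 2, videos: [{ _id: 'b1', reports: [] }, { _id: 'b2', reports: [] }] }
    ] }
  ]
};

console.log(findReportsPath(sampleSerie, 'b2'));
// seasons.1.episodes.1.videos.1.reports
```

The update document can then be sent with Serie.updateOne({_id: serie._id}, update). Note that the indexes are only valid as long as nobody reorders the arrays between the read and the write.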

Mongoose find() returning irrelevant documents

I have a schema like this one:
var WorkSchema = new mongoose.Schema({
works: [{
name:String,
times:[{
day:String,
timeOfDay:[{
startTime: Number,
endTime: Number
}],
date: Date
}],
locations:[String]
}],
worker: {
id: {
type: mongoose.Schema.Types.ObjectId,
ref: "User"
},
username: String
}
},{timestamps: true}
);
I have saved a lot of documents with this schema. Now I want to find the documents that have name: 'idle'. I'm using Worker.find({'works.name': req.body.name}), but it returns irrelevant documents instead of exactly the ones I want. In MongoDB Compass, however, this exact filter finds the desired documents.
How do I find these values in Mongoose?

Return Mongo document using Mongoose where subdocument does NOT exist?

Help! I'm losing my mind. I need to simply return a Mongo document, using Mongoose, IF a sub document does not exist.
My schemas:
// Define the subdocument schema first so it exists when userSchema references it.
var deactiveSchema = new mongoose.Schema({
when : { type: Date, default: Date.now, required: true },
who : { type: Schema.Types.ObjectId, required: true, ref: 'User' }
});
var userSchema = new mongoose.Schema({
email: {type: String, unique: true, lowercase: true},
password: {type: String, select: false},
displayName: String,
picture: String,
facebook: String,
deactivation: deactiveSchema
});
My goal is to lookup a user by their facebook ID if they have not been deactivated.
If they have been deactivated, then a deactivation subdocument will exist. Of course, to save space, if they are active then a deactivation will not exist.
On a side note, I'm also worried about how to properly construct the index on this logic.
I'd post snippets but every attempt has been wrong. =(
You can use the $exists operator on the model (here called User):
User.find({deactivation:{$exists:false}}).exec(function(err,document){
});
or match on null, which also matches documents where the field is missing:
User.find({deactivation:null}).exec(function(err,document){
});
Since you are retiring data rather than deleting it, I'd go with one of two approaches:
Flag for retired (Recommended)
add to your schema:
retired: {
type: Boolean,
required: true
}
and add an index for this query:
userSchema.index({facebook: 1, retired: 1})
and query:
User.find({facebook: facebookId, retired: false}, callback)
Query for existence
User.find().exists("deactivation", false).exec(callback)
The latter will be slower, but if you really don't want to change anything, it will work. I'd recommend taking some time to read through the indexing section of the mongo docs.
Mongoose has many options for defining queries with conditions and a couple of styles for writing queries:
Condition object
var id = "somefacebookid";
var condition = {
facebook : id,
deactivation: { $exists : false } // the user has not been deactivated
};
user.findOne(condition, function (e, doc) {
// if not e, do something with doc
})
http://mongoosejs.com/docs/queries.html
Query builder
Alternatively, you may want to use the query builder syntax if you are looking for something closer to SQL. e.g.:
var id = "somefacebookid";
users
.find({ facebook : id })
.where('deactivation').exists(false)
.limit(1)
.exec(callback);
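To keep the "active user" rule in one place, the condition can be built by a small function. This is only a sketch: buildActiveUserCondition and isActiveUser are hypothetical names, and isActiveUser is an in-memory equivalent for plain objects, handy for checking the logic without a database.

```javascript
// Condition for "user with this facebook id who has no deactivation subdocument".
function buildActiveUserCondition(facebookId) {
  return { facebook: facebookId, deactivation: { $exists: false } };
}

// The same rule applied to a plain object: the deactivation key must be absent.
function isActiveUser(user, facebookId) {
  return user.facebook === facebookId && !('deactivation' in user);
}

const active = { facebook: 'somefacebookid' };
const retired = { facebook: 'somefacebookid', deactivation: { when: new Date() } };

console.log(isActiveUser(active, 'somefacebookid'));  // true
console.log(isActiveUser(retired, 'somefacebookid')); // false
```

The condition object can be passed to findOne() as in the condition-object style above.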

Resources