Indexing with Mongoose - node.js

I was reading the mongoose docs about indexing and want to find out whether there is a difference between field level indexing and schema level indexing. They mention that "defining indexes at the schema level is necessary when creating compound indexes." Are there any other reasons why I might choose one over the other, or is it just a preference?
const animalSchema = new Schema({
    name: String,
    type: String,
    tags: { type: [String], index: true } // field level
});
animalSchema.index({ name: 1, type: -1 }); // schema level

When developing your indexing strategy you should have a deep understanding of your application’s queries. Before you build indexes, map out the types of queries you will run so that you can build indexes that reference those fields.
For example, if a query filters on a single field such as name, a field-level index on name is enough. If a query filters on both name and tags, the two fields should be indexed together in a compound index, and a compound index has to be declared at the schema level.
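
One way to reason about which queries a compound index can serve is the index-prefix rule: a query can use a compound index when it filters on the leading field(s) of that index. The sketch below is a simplified model of that rule in plain JavaScript, not MongoDB's actual query planner; `usableIndexPrefix` is a hypothetical helper name.

```javascript
// Simplified model of the index-prefix rule (an illustration, not the real planner).
// indexSpec is the object passed to schema.index(), e.g. { name: 1, type: -1 }.
function usableIndexPrefix(indexSpec, queryFields) {
  const wanted = new Set(queryFields);
  const prefix = [];
  // Walk the index keys in declaration order and stop at the first one
  // the query does not filter on.
  for (const field of Object.keys(indexSpec)) {
    if (!wanted.has(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const compound = { name: 1, type: -1 };
console.log(usableIndexPrefix(compound, ['name', 'type'])); // [ 'name', 'type' ]
console.log(usableIndexPrefix(compound, ['name']));         // [ 'name' ]
console.log(usableIndexPrefix(compound, ['type']));         // []
```

Under this model, a query on type alone gets no help from the { name: 1, type: -1 } index, which is why the set of queries should be mapped out before the indexes are chosen.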

Related

Mongoose Query with Dynamic Key that also contains the "and" operator

So I'm trying to query my MongoDB database using mongoose to fetch documents that have a specific family AND a specific analysis ID at the same time. Here is an example of the document structure:
_id: ObjectId("62b2fb397fda9ba6fe24aa5c")
day: 1
family: "AUTOMOTIVE"
prediction: -233.99999999999892
analysis: ObjectId("629c86fc67cfee013c5bf147")
The problem I face in this case is that the name of the key of the family field is set dynamically and could therefore have any other name such as "product_family", "category", etc. This is why, in order to fetch documents with a dynamic key name, I have to use the where() and equals() operators like so:
// Get the key of the field that is set dynamically.
let dynamicKey = req.body.dynamicKey;
// Perform a query using a dynamic key.
documents = await Model.find().where(dynamicKey).equals(req.body.value);
HOWEVER, my goal here is NOT to just fetch all the documents with the dynamic key, but rather to fetch the documents that have BOTH the dynamic key name AND ALSO a specific analysis Id.
Had the family field NOT been dynamic, I could have simply used a query like so:
documents = await Model.find({
$and: [{analysis: req.body.analysis_id}, {family: req.body.value}]
});
but this does not seem possible in this case since the keys inside the find() operator are mere text strings and not variables. I also tried using the following queries with no luck:
documents = await Model.find().where(dynamicKey).equals(req.body.value).where('analysis').equals(req.body.analysis_id);
Can somebody please help?
As @rickhg12hs mentioned in the comments, part of the answer is to use the [] brackets to specify your dynamic key like so:
await Model.find({[dynamicKey]: req.body.value, analysis: req.body.analysis_id});
I also found out that another query that can work is this:
await Model.find({analysis:req.body.analysis_id}).where(dynamicKey).equals(req.body.value);
However, it seems that for either of these solutions to work you also need to set your schema's strict mode to "false", since we are working with a dynamic key value.
Example:
var predictionSchema = new mongoose.Schema({
    day: {
        type: Number,
        required: true
    },
    prediction: {
        type: Number,
        required: true
    },
    analysis: {
        type: mongoose.Schema.Types.ObjectId,
        ref: 'Analysis', // Reference the Analysis Schema
        required: true
    }
}, { strict: false });
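
The computed-key form can be wrapped in a small helper. This is only a sketch: `buildFilter` is a hypothetical name, and the check that rejects keys starting with `$` is an added assumption, made because the key comes straight from `req.body`.

```javascript
// Hypothetical helper: builds a filter matching a dynamically named field
// AND a fixed `analysis` field. Conditions in one filter object are ANDed.
function buildFilter(dynamicKey, value, analysisId) {
  // Assumption: the key is user input, so reject empty or operator-like keys.
  if (typeof dynamicKey !== 'string' || dynamicKey === '' || dynamicKey.startsWith('$')) {
    throw new TypeError('dynamicKey must be a plain field name');
  }
  // [dynamicKey] is a computed property name: the variable's value becomes the key.
  return { [dynamicKey]: value, analysis: analysisId };
}

const filter = buildFilter('family', 'AUTOMOTIVE', '629c86fc67cfee013c5bf147');
console.log(filter);
```

With the model from the question, the result would be passed along as `await Model.find(filter)`.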

One to many relation in Dynamodb Node js (Dynamoose)

I am using DynamoDB with Node.js for my reservation system, with Dynamoose as the ORM. I have two tables, Table and Reservation. To create a relation between them, I added a tableId attribute to Reservation whose type is the Table model, as described in the Dynamoose docs. Using document.populate I am able to get the Table data through the tableId attribute of a Reservation. But how can I retrieve all Reservations for a Table? (Table and Reservation have a one-to-many relation.)
These are my Model:
Table Model:
const tableSchema = new Schema ({
    tableId: {
        type: String,
        required: true,
        unique: true,
        hashKey: true
    },
    name: {
        type: String,
        default: null
    },
});
Reservation Model:
const reservationSchema = new Schema ({
    id: {
        type: Number,
        required: true,
        unique: true,
        hashKey: true
    },
    tableId: table, // as per doc, attribute of Table (Model) type
    date: {
        type: String
    }
});
This is how I retrieve table data from reservation model
reservationModel.scan().exec()
    .then(posts => {
        return posts.populate({
            path: 'tableId',
            model: 'Space'
        });
    })
    .then(populatedPosts => {
        console.log('pp',populatedPosts);
        return {
            allData: {
                message: "Executedddd succesfully",
                data: populatedPosts
            }
        }
    })
Can anyone please help me retrieve all Reservation data for a Table?
As of v2.8.2, Dynamoose does not support this. Dynamoose is focused on one directional simple relationships. This is partly due to the fact that we discourage use of model.populate. It is important to note that model.populate does another completely separate request to DynamoDB. This increases the latency and decreases the performance of your application.
DynamoDB truly requires a shift in how you think about modeling your data compared to SQL. I recommend watching AWS re:Invent 2019: Data modeling with Amazon DynamoDB (CMY304) for a great explanation of how you can model your data in DynamoDB in a highly efficient manner.
At some point Dynamoose might add support for this, but it's really hard to say if we will.
If you truly want to do this, I'd recommend adding a global index to your tableId property in your reservation schema. Then you can run something like the following:
async function code(id) {
    const table = await tableModel.get(id);
    // This will be an array of `reservation` entries where `"tableId"=id`.
    // Remember, it is required you add an index on `tableId` for this to work.
    const reservations = await reservationModel.query("tableId").eq(id).exec();
}
Remember, this will cause multiple calls to DynamoDB and isn't as efficient. I'd highly recommend watching the video I linked above to get more information about how to model your data in a more efficient manner.
Finally, I'd like to point out that your unique: true code does nothing. As seen in the Dynamoose Attribute Settings Documentation, unique is not a valid setting. In your case since you don't have a rangeKey, it's not possible for two items to have the same hashKey, so technically it's already a unique property based on that. However it is important to note that you can overwrite existing items when creating an item. You can set overwrite to false for document.save or Model.create to prevent that behavior and throw an error instead of overwriting your document.

Non-existing field in Mongodb document appears in mongoose findById() result

I'm somewhat new to Mongoose and I came across a behaviour that I find strange. The document returned by Mongoose has fields that are not present in the actual MongoDB document, and they seem to be added by Mongoose based on the schema.
I use a schema similar to this (this one is simplified) :
const ProfessionalSchema = new mongoose.Schema({
    product: {
        details: [{
            _id: false,
            id: String, // UUID
            name: String,
            prestations: [{
                _id: false,
                id: String, // UUID
                name: String,
                price: Number,
            }],
        }],
    },
    [...]
My document as shown in Mongodb with mongo CLI utility doesn't have a product field.
What I don't understand is why the result of Professional.findById().exec() returns a document with a product:{details[]} field. I expect not to have that field in the Mongoose returned result, since it is not present in the original MongoDb document.
The Mongoose documentation found at https://mongoosejs.com/docs/guide.html (Schema and Model paragraph) didn't help.
My business logic would require that field not to be present, instead of being forced by the schema. Is this achievable?
Try taking a look at the default option. You could e.g. default your product to null and then, in your business logic, handle the "product is null" case rather than the "product field does not exist" case.
As for why this is happening, it's because you're dealing with a schema. If the field doesn't exist on the document, it's going to be auto-populated. The whole point of a schema is to ensure consistency of your document structure.

Mongoose search for an object's value in referenced property (subdocument)

I have two schemas:
var ShelfSchema = new Schema({
    ...
    tags: [{
        type: Schema.Types.ObjectId,
        ref: 'Tag'
    }]
});
var TagSchema = new Schema({
    name: {
        type: String,
        unique: true,
        required: true
    }
});
I would like to search for all Shelves where the tags array has a tag with a specific value.
I have tried using:
modelShelf.find({'tags.name': 'mytag'})...
but it does not work. It always returns an empty array.
Any idea?
Looking at db each Shelf instance links only the objectID of the tags.
I have used references because I need to work also with Tag(s) entities.
In MongoDB you essentially can't do this directly, as queries target a single collection at a time. New features were recently added that allow a kind of join when using the aggregation framework, but for your needs that is not necessary.
From your schemas I see that the tags' names are unique so you can first fetch your desired tag with something like
modelTag.find({name: 'mytag'})
in order to get the tag's ID and then query your shelf collection for this tag ID
modelShelf.find({tags: tagId})
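
Put together, the two steps might look like the sketch below. The function name and the choice to pass the models in as parameters are illustrative assumptions. It uses findOne rather than find so that a single tag document (or null) comes back instead of an array.

```javascript
// Sketch: find shelves by tag name using two queries, one per collection.
// `Tag` and `Shelf` are assumed to be the Mongoose models built from
// TagSchema and ShelfSchema.
async function findShelvesByTagName(Tag, Shelf, tagName) {
  // Step 1: resolve the unique tag name to its document.
  const tag = await Tag.findOne({ name: tagName }).exec();
  if (!tag) return []; // no such tag, so no shelf can reference it
  // Step 2: a scalar condition on an array field matches any element of the array.
  return Shelf.find({ tags: tag._id }).exec();
}
```

A call would then look like `const shelves = await findShelvesByTagName(modelTag, modelShelf, 'mytag');`.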

MongoDB/Mongoose index make query faster or slow it down?

I have an article model like this:
var ArticleSchema = new Schema({
    type: String
    ,title: String
    ,content: String
    ,hashtags: [String]
    ,comments: [{
        type: Schema.ObjectId
        ,ref: 'Comment'
    }]
    ,replies: [{
        type: Schema.ObjectId
        ,ref: 'Reply'
    }]
    ,status: String
    ,statusMeta: {
        createdBy: {
            type: Schema.ObjectId
            ,ref: 'User'
        }
        ,createdDate: Date
        ,updatedBy: {
            type: Schema.ObjectId
            ,ref: 'User'
        }
        ,updatedDate: Date
        ,deletedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,deletedDate: Date
        ,undeletedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,undeletedDate: Date
        ,bannedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,bannedDate: Date
        ,unbannedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,unbannedDate: Date
    }
}, {minimize: false})
When a user creates or modifies the article, I will create hashtags
ArticleSchema.pre('save', true, function(next, done) {
    var self = this
    if (self.isModified('content')) {
        self.hashtags = helper.listHashtagsInText(self.content)
    }
    done()
    return next()
})
For example, if a user writes "Hi, #greeting, i love #friday", I will store ['greeting', 'friday'] in the hashtags list.
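
The helper.listHashtagsInText function itself is not shown here. A minimal stand-in, assuming a hashtag is `#` followed by letters, digits, or underscores and that duplicates should be dropped, could look like this:

```javascript
// Hypothetical stand-in for helper.listHashtagsInText (the real helper is not shown).
function listHashtagsInText(text) {
  const tags = new Set(); // a Set drops repeated hashtags
  for (const match of text.matchAll(/#(\w+)/g)) {
    tags.add(match[1]); // capture group 1 is the tag without the leading '#'
  }
  return [...tags];
}

console.log(listHashtagsInText('Hi, #greeting, i love #friday'));
// [ 'greeting', 'friday' ]
```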
I am thinking about creating an index for hashtags to make queries on hashtags faster. But in the mongoose manual, I found this:
When your application starts up, Mongoose automatically calls
ensureIndex for each defined index in your schema. Mongoose will call
ensureIndex for each index sequentially, and emit an 'index' event on
the model when all the ensureIndex calls succeeded or when there was
an error. While nice for development, it is recommended this behavior
be disabled in production since index creation can cause a significant
performance impact. Disable the behavior by setting the autoIndex
option of your schema to false.
http://mongoosejs.com/docs/guide.html
So is indexing faster or slower for mongoDB/Mongoose?
Also, even if I create an index like
hashtags: { type: [String], index: true }
How can I make use of the index in my query? Or will it just magically become faster for normal queries like:
Article.find({hashtags: 'friday'})
You are reading it wrong
You are misreading the intent of the quoted block as to what .ensureIndex() (now deprecated, but still called by mongoose code) actually does in this context.
In mongoose, you define an index either at the schema or model level as is appropriate to your design. What mongoose "automatically" does for you is that on connection it inspects each registered model and then calls the appropriate .ensureIndex() methods for the index definitions provided.
What does this actually do?
Well, in most cases, once you have already started up your application before and the .ensureIndexes() method has run, the answer is Absolutely Nothing. That is a bit of an overstatement, but it more or less rings true.
Because the index definition has already been created on the server collection, a subsequent call does not do anything. That is, it does not drop the index and "re-create" it. So the real cost is basically nothing, once the index itself has been created.
Creating indexes
So since mongoose is just a layer on top of the standard API, the createIndex() method contains all the details of what is happening.
There are some details to consider here, such as that an index build can happen in the "background". While this is less intrusive to your application, it does come at its own cost: notably, the index size from "background" generation will be larger than if you built it in the foreground, which blocks other operations.
Also all indexes come at a cost, notably in terms of disk usage as well as an additional cost of writing the additional information outside of the collection data itself.
The advantage of an index is that it is much faster to "search" for values contained within an index than to seek through the whole collection and match the possible conditions.
These are the basic "trade-offs" associated with indexes.
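
To make the trade-off concrete, here is a toy in-memory model. It is an illustration only and not how MongoDB stores indexes internally; the sample documents and the function names are made up for the example.

```javascript
// Toy model of an index on an array field such as `hashtags`.
const articles = [
  { _id: 1, hashtags: ['greeting', 'friday'] },
  { _id: 2, hashtags: ['monday'] },
  { _id: 3, hashtags: ['friday'] },
];

// Without an index: every document is inspected (a collection scan).
function scan(docs, tag) {
  return docs.filter((doc) => doc.hashtags.includes(tag)).map((doc) => doc._id);
}

// Building the index costs time and memory up front,
// and every later insert would have to update it as well.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const tag of doc.hashtags) {
      if (!index.has(tag)) index.set(tag, []);
      index.get(tag).push(doc._id);
    }
  }
  return index;
}

// With the index: a direct lookup instead of touching every document.
function lookup(index, tag) {
  return index.get(tag) ?? [];
}

const hashtagIndex = buildIndex(articles);
console.log(scan(articles, 'friday'));       // [ 1, 3 ]
console.log(lookup(hashtagIndex, 'friday')); // [ 1, 3 ]
```

In Mongoose the query itself does not change once an index exists: Article.find({hashtags: 'friday'}) is written the same way, and the server decides whether an index can be used to answer it.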
Deployment Pattern
Back to the quoted block from the documentation, there is a real intent behind this advice.
It is typical in deployment patterns and particularly with data migrations to do things in this order:
1. Populate data to relevant collections/tables
2. Enable indexes on the collection/table data relevant to your needs
This is because there is a cost involved with index creation and, as mentioned earlier, it is desirable to get the optimum size from the index build. It also avoids the overhead of writing an index entry on every document insertion while you are doing this "load" in bulk.
So that is what indexes are for, those are the costs and benefits and the message in the mongoose documentation is explained.
In general though, I suggest reading up on Database Indexes for what they are and what they do. Think of walking into a library to find a book. There is a card index there at the entrance. Do you walk around the library to find the book you want? Or do you look it up in the card index to find where it is? That index took someone time to create and also keep it updated, but it saves "you" the time of walking around the whole library just so you can find your book.
