Implementing manual linkage/reference in mongoose - node.js

http://docs.mongodb.org/manual/reference/database-references/#DatabaseReferences-SimpleDirect%2FManualLinking
For nearly every case where you want to store a relationship between two documents, use manual references. The references are simple to create and your application can resolve references as needed.
As the MongoDB reference documentation indicates, it seems more reasonable to use manual linkage/references rather than a DBRef like this:
stories : [{ type: Schema.ObjectId, ref: 'Story' }]
Implementing relations via DBRef seems quite simple. However, I could not find a reliable resource on how to implement manual references most efficiently in a schema. Proposals:
stories : [{ type: Schema.ObjectId }] OR
stories : [{ type: Number }] OR
stories : [{ type: String }]
How should the manual reference be implemented? An example of insertion would be much appreciated as well.

Implementing this will depend on which library you use in which environment.
Here is a nice example for mongoose in node.js:
https://mongoosejs.com/docs/populate.html
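A minimal sketch of what a manual reference amounts to, independent of any particular library: the parent document stores only the _id values of the related documents, and the application does a second lookup when it needs them. Plain Maps stand in for the collections here, and newId, insert, and resolveStories are hypothetical helpers, not Mongoose API. In Mongoose the same idea would presumably be an ObjectId field followed by a second query such as find({ _id: { $in: ids } }).

```javascript
const crypto = require('node:crypto');

// In-memory stand-ins for two collections (assumption: a real app would use models).
const stories = new Map(); // _id -> story document
const people = new Map();  // _id -> person document

// Hypothetical helper: a 24-hex-character id, similar in shape to an ObjectId.
function newId() {
  return crypto.randomBytes(12).toString('hex');
}

// Hypothetical helper: store a document under a freshly generated _id.
function insert(collection, doc) {
  const stored = { _id: newId(), ...doc };
  collection.set(stored._id, stored);
  return stored;
}

// Insertion: save the stories first, then keep only their _id values on the person.
const s1 = insert(stories, { title: 'First story' });
const s2 = insert(stories, { title: 'Second story' });
const author = insert(people, { name: 'Ann', stories: [s1._id, s2._id] });

// Resolution: a second lookup by _id, done by the application when it needs the data.
function resolveStories(person) {
  return person.stories.map((id) => stories.get(id)).filter(Boolean);
}

console.log(resolveStories(author).map((s) => s.title));
```

The person document never embeds the stories, so a story can be edited in one place; the price is the extra lookup.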

Related

Should I create New Schema Model file for every route OR use already created Schema?

Suppose I have a User Schema which has around 30 fields, and other 3 schemas also.
UserSchema.js
user_schema = new Schema({
user_id: { type: String},
.........//30 properties
});
ctrs_schema = new Schema({
.........10 properties
});
ids_schema = new Schema({
.........5 properties
});
comments_schema = new Schema({
.........10 properties
});
Now I am writing a route that changes the gender of the user. To do that I can use UserSchema.js, but that loads all of the schemas into my route. If I instead created a new file containing a single schema with only two fields, the other schemas would not be loaded into memory for that route.
UserGenderSchema.js
gender_schema = new Schema({
user_id: { type: String},
gender: { type: String}
});
I know both approaches have pros and cons. For the single-file approach:
Pros -
I have to edit only a single file if I need to change something for any field.
Cons -
All schemas are loaded for all routes, which is unnecessary. Memory wastage.
Will either approach use less memory?
Can anyone please tell me which architecture is better, or what you use in your project and why?
Thanks
It's better to keep user-related fields in a single schema. MongoDB is non-relational by design and gains its performance by avoiding relational structures. If you create a schema for each field and then add a reference in each one pointing back to the user, you are using MongoDB to build a heavily relational structure, which it does not handle well. Later, when you want to show all of a user's information, update several user fields at once, or show more user information in one of your routes, you will run into serious performance issues. In conclusion, the cost of loading the whole schema to touch a single field is small compared with the cost of breaking up your data structure.

Get a tree with MongoDB query (mongoose)

Is there any easy way to get a tree in mongoose? Let's consider following example:
categorySchema = new mongoose.Schema({
name : {type: String, required: true},
parent : {type: Schema.Types.ObjectId, ref: 'Category'}
})
I have a category that can have a lot of children (sub-categories), and these children can have a lot of children (sub-sub-categories) and so on...
While it's quite easy to build and modify such a structure, it's quite hard for me to retrieve it.
I did it in a very bad way, scanning the collection recursively, but I guess there is a better way to do it.
In SQL this is quite easy to achieve, and I am wondering if there is a good way to do it in MongoDB.
Note: This schema may be changed if needed.
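One possible approach, not taken from the thread: fetch every category in a single query and assemble the tree in memory, which avoids the recursive scans. The sketch below assumes the flat list has already been loaded (with Mongoose, presumably via something like Category.find().lean()); buildTree and the sample data are hypothetical.

```javascript
// Build a nested tree from a flat list of { _id, name, parent } records.
// Two linear passes: index every node by id, then attach each node to its parent.
function buildTree(categories) {
  const nodes = new Map();
  for (const c of categories) {
    // String() lets ObjectId-like values and plain strings share one key space.
    nodes.set(String(c._id), { _id: c._id, name: c.name, children: [] });
  }

  const roots = [];
  for (const c of categories) {
    const node = nodes.get(String(c._id));
    const parent = c.parent == null ? undefined : nodes.get(String(c.parent));
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node); // no parent, or the parent is missing from the list
    }
  }
  return roots;
}

// Hypothetical sample data shaped like the categorySchema above.
const flat = [
  { _id: 'a', name: 'Electronics', parent: null },
  { _id: 'b', name: 'Phones', parent: 'a' },
  { _id: 'c', name: 'Android', parent: 'b' },
  { _id: 'd', name: 'Books', parent: null },
];

const tree = buildTree(flat);
console.log(JSON.stringify(tree, null, 2));
```

This trades one full collection read for the repeated per-level queries, which is reasonable only while the category collection stays small enough to load at once.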

How to handle creation of referenced documents in Mongoose and Node?

I have a Node.js API method that creates a Track document and then creates a Task document (which has a DBRef to the Track to work on) to actually convert it and upload it to S3 (the specific operations don't really matter). I use Mongoose to perform operations on my MongoDB.
So my question is: is there any way to avoid having "stalled" tracks, where a Track document has been created but there is no Task for it? Creating a Task without a Track is fine, but in that case we don't have a DBRef until we create the Track document and then update the Task document, which may fail due to a server restart or something similar.
In a traditional RDBMS this kind of problem is easy to solve, since I would do it all in a single transaction.
Thanks in advance.
Update:
My schemas look like this:
var Track = new Schema({
name: String,
...
});
var Task = new Schema({
track: {type: Schema.Types.ObjectId, ref: 'Track'},
created: {type: Date, default: Date.now}
});

Difference between MongoDB and Mongoose

I wanted to use the MongoDB database, but I noticed that there seem to be two different databases, each with its own website and installation method: mongodb and mongoose. So I ended up asking myself: "Which one do I use?"
To answer this question, I'm asking the community to explain the differences between the two and, if possible, their pros and cons, because they look very similar to me.
I assume you already know that MongoDB is a NoSQL database system which stores data in the form of BSON documents. Your question, however, is about the packages for Node.js.
In terms of Node.js, mongodb is the native driver for interacting with a MongoDB instance and mongoose is an object modeling tool for MongoDB.
mongoose is built on top of the mongodb driver to provide programmers with a way to model their data.
EDIT:
I do not want to comment on which is better, as this would make this answer opinionated. However I will list some advantages and disadvantages of using both approaches.
Using mongoose, a user can define the schema for the documents in a particular collection. It provides a lot of convenience in the creation and management of data in MongoDB. On the downside, learning mongoose can take some time, and it has some limitations in handling schemas that are quite complex.
However, if your collection schema is unpredictable, or you want a Mongo-shell like experience inside Node.js, then go ahead and use the mongodb driver. It is the simplest to pick up. The downside here is that you will have to write larger amounts of code for validating the data, and the risk of errors is higher.
Mongo is a NoSQL database.
If you don't want to use an ODM for your data models, you can use the native driver: https://github.com/mongodb/node-mongodb-native.
Mongoose is an ODM (object document mapper) that lets you access MongoDB data with easily understandable queries.
Mongoose acts as an abstraction layer over your database model.
One more difference I found is that it is fairly easy to connect to multiple databases with the mongodb native driver, while in mongoose you have to use workarounds, which still have some drawbacks.
So if you want to build a multitenant application, go for the mongodb native driver.
From the first answer,
"Using Mongoose, a user can define the schema for the documents in a particular collection. It provides a lot of convenience in the creation and management of data in MongoDB."
You can now also define a schema with the MongoDB native driver using
##For new collection
db.createCollection("recipes", {
validator: { $jsonSchema: {
<<Validation Rules>>
}
}
})
##For existing collection
db.runCommand({
collMod: "recipes",
validator: { $jsonSchema: {
<<Validation Rules>>
}
}
})
##full example
db.createCollection("recipes", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "servings", "ingredients"],
additionalProperties: false,
properties: {
_id: {},
name: {
bsonType: "string",
description: "'name' is required and is a string"
},
servings: {
bsonType: ["int", "double"],
minimum: 0,
description:
"'servings' is required and must be an integer with a minimum of zero."
},
cooking_method: {
enum: [
"broil",
"grill",
"roast",
"bake",
"saute",
"pan-fry",
"deep-fry",
"poach",
"simmer",
"boil",
"steam",
"braise",
"stew"
],
description:
"'cooking_method' is optional but, if used, must be one of the listed options."
},
ingredients: {
bsonType: ["array"],
minItems: 1,
maxItems: 50,
items: {
bsonType: ["object"],
required: ["quantity", "measure", "ingredient"],
additionalProperties: false,
description: "'ingredients' must contain the stated fields.",
properties: {
quantity: {
bsonType: ["int", "double", "decimal"],
description:
"'quantity' is required and is of double or decimal type"
},
measure: {
enum: ["tsp", "Tbsp", "cup", "ounce", "pound", "each"],
description:
"'measure' is required and can only be one of the given enum values"
},
ingredient: {
bsonType: "string",
description: "'ingredient' is required and is a string"
},
format: {
bsonType: "string",
description:
"'format' is an optional field of type string, e.g. chopped or diced"
}
}
}
}
}
}
}
});
##insert example
db.recipes.insertOne({
name: "Chocolate Sponge Cake Filling",
servings: 4,
ingredients: [
{
quantity: 7,
measure: "ounce",
ingredient: "bittersweet chocolate",
format: "chopped"
},
{ quantity: 2, measure: "cup", ingredient: "heavy cream" }
]
});
mongodb and Mongoose are two different ways to interact with a MongoDB database from Node.js.
Mongoose: an object data modeling (ODM) library that provides a rigorous modeling environment for your data. Used to interact with MongoDB, it makes life easier by providing convenience in managing data.
mongodb: the native Node.js driver for interacting with MongoDB.
The mongodb driver is likely not a great choice for new developers.
Mongoose, on the other hand, as an ODM can be a better choice for newbies.
If you are planning to use these components along with your proprietary code, please refer to the information below.
Mongodb:
1. It's a database.
2. This component is governed by the Affero General Public License (AGPL).
3. If you link this component with your proprietary code, you have to release your entire source code publicly, because of its viral effect (as with the GPL, LGPL, etc.).
4. If you are hosting your application in the cloud, (2) applies, and you also have to release your installation information to the end users.
Mongoose:
1. It's an object modeling tool.
2. This component is governed by the MIT license.
3. You are allowed to use this component along with proprietary code, without any restrictions.
4. Shipping your application using any media or host is allowed.
Mongoose is built on top of the mongodb driver, which is more low-level. Mongoose provides an abstraction that makes it easy to define a schema and to query. On the performance side, however, the mongodb driver is best.
Mongodb and Mongoose are two completely different things!
Mongodb is the database itself, while Mongoose is an object modeling tool for Mongodb
EDIT: As pointed out MongoDB is the npm package, thanks!
mongodb is the official MongoDB Node.js driver, which allows Node.js applications to connect to MongoDB and work with data.
Mongoose, on the other hand, is another library built on top of the mongodb driver. It is easier to understand and use. If you are a beginner, Mongoose is the better one to start with.

Denormalization with Mongoose: How to synchronize changes

What is the best way to propagate updates when you have a denormalized Schema? Should it be all done in the same function?
I have a schema like so:
var Authors = new Schema({
...
name: {type: String, required:true},
period: {type: Schema.Types.ObjectId, ref:'Periods'},
quotes: [{type: Schema.Types.ObjectId, ref: 'Quotes'}],
active: Boolean,
...
})
Then:
var Periods = new Schema({
...
name: {type: String, required:true},
authors: [{type: Schema.Types.ObjectId, ref:'Authors'}],
active: Boolean,
...
})
Now say I want to denormalize Authors, since the period field will always just use the name of the period (which is unique, there can't be two periods with the same name). Say then that I turn my schema into this:
var Authors = new Schema({
...
name: {type: String, required:true},
period: String, //no longer a ref
active: Boolean,
...
})
Now Mongoose doesn't know anymore that the period field is connected to the Period schema. So it's up to me to update the field when the name of a period changes. I created a service module that offers an interface like this:
exports.updatePeriod = function(id, changes) {...}
Within this function I go through the changes to update the period document that needs to be updated. So here's my question. Should I, then, update all authors within this method? Because then the method would have to know about the Author schema and any other schema that uses period, creating a lot of coupling between these entities. Is there a better way?
Perhaps I can emit an event that a period has been updated and all the schemas that have denormalized period references can observe it, is that a better solution? I'm not quite sure how to approach this issue.
Ok, while I wait for a better answer than my own, I will try to post what I have been doing so far.
Pre/Post Middleware
The first thing I tried was to use the pre/post middlewares to synchronize documents that referenced each other. (For instance, if you have Author and Quote, and an Author has an array of the type: quotes: [{type: Schema.Types.ObjectId, ref:'Quotes'}], then whenever a Quote is deleted, you'd have to remove its _id from the array. Or if the Author is removed, you may want all his quotes removed).
This approach has an important advantage: if you define each Schema in its own file, you can define the middleware there and have it all neatly organized. Whenever you look at the schema, right below you can see what it does, how its changes affect other entities, etc:
var Quote = new Schema({
//fields in schema
})
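//it's quite clear what happens when you remove an entity
Quote.pre('remove', function(next) {
//remove this quote's _id from the quotes array of every Author that holds it
Author.update(
{ quotes: this._id },
{ $pull: { quotes: this._id } },
{ multi: true },
function(err) { next(err) }
)
})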
The main disadvantage, however, is that these hooks are not executed when you call update or any of the Model's static updating/removing functions. Instead, you need to retrieve the document and then call save() or remove() on it.
Another smaller disadvantage is that Quote now needs to be aware of anyone that references it, so that it can update them whenever a Quote is updated or removed. So let's say that a Period has a list of quotes, and Author has a list of quotes as well: Quote will need to know about both of them to update them.
The reason the hooks are skipped is that these static functions send atomic queries to the database directly. While this is nice, I hate the inconsistency between using save() and Model.update(...). Somebody else, or you in the future, may accidentally use the static update functions, and your middleware won't be triggered, giving you headaches that are hard to get rid of.
NodeJS Event Mechanisms
What I am currently doing is not really optimal, but it offers me enough benefits to outweigh the cons (or so I believe; if anyone cares to give me some feedback, that'd be great). I created a service that wraps around a model, say AuthorService, that extends events.EventEmitter and is a constructor function that looks roughly like this:
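var events = require('events')
var util = require('util')
function AuthorService() {
events.EventEmitter.call(this) //initialize the emitter state
var self = this
this.create = function() {...}
this.update = function() {
...
self.emit('AuthorUpdated', before, after)
...
}
}
util.inherits(AuthorService, events.EventEmitter)
module.exports = new AuthorService()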
The advantages:
Any interested function can register to the Service
events and be notified. That way, for instance, when a Quote is
updated, the AuthorService can listen to it and update the Authors
accordingly. (Note 1)
Quote doesn't need to be aware of all the documents that reference it, the Service simply triggers the QuoteUpdated event and all the documents that need to perform operations when this happens will do so.
Note 1: As long as this service is used whenever anyone needs to interact with mongoose.
The disadvantages:
Added boilerplate code, using a service instead of mongoose directly.
Now it isn't exactly obvious what functions get called when you
trigger the event.
You decouple producer and consumer at the cost of legibility (since
you just emit('EventName', args), it's not immediately obvious
which Services are listening to this event)
Another disadvantage is that someone can retrieve a Model from the Service and call save(), in which case the events won't be triggered, though I'm sure this could be addressed with some kind of hybrid between these two solutions.
I am very open to suggestions in this field (which is why I posted this question in the first place).
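To make the event-based approach concrete, here is a minimal sketch applied to the denormalized period name from the question. It uses Node's built-in events module with in-memory data in place of Mongoose models; PeriodService, the 'PeriodUpdated' event name, and the sample records are assumptions for illustration only.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical in-memory stand-in for the Authors collection,
// where period is a denormalized copy of the period name.
const authors = [
  { name: 'Homer', period: 'Archaic' },
  { name: 'Plato', period: 'Classical' },
];

// Producer: wraps period updates and announces them, knowing nothing about authors.
class PeriodService extends EventEmitter {
  constructor() {
    super();
    this.periods = new Map([['p1', { name: 'Archaic' }]]);
  }

  update(id, changes) {
    const before = { ...this.periods.get(id) };
    const after = { ...before, ...changes };
    this.periods.set(id, after);
    this.emit('PeriodUpdated', before, after); // listeners run synchronously
    return after;
  }
}

const periodService = new PeriodService();

// Consumer: the author side subscribes and keeps its denormalized copies in sync.
periodService.on('PeriodUpdated', (before, after) => {
  if (before.name === after.name) return;
  for (const author of authors) {
    if (author.period === before.name) author.period = after.name;
  }
});

periodService.update('p1', { name: 'Early Archaic' });
console.log(authors);
```

The period side stays unaware of which entities copy its name; each consumer registers its own listener, at the cost noted above that the set of listeners is not visible at the emit call.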
I'm gonna speak more from an architectural point of view than a coding point of view since, when it comes right down to it, you can pretty much achieve anything with enough lines of code.
As far as I've been able to understand, your main concern has been keeping consistency across your database, mainly removing documents when their references are removed and vice-versa.
So in this case, rather than wrapping the whole functionality in extra code I'd suggest going for atomic Actions, where an Action is a method you define yourself that performs a complete removal of an entity from the DB (both document and reference).
So for example when you wanna remove an author's quote, you do something like removing the Quote document from the DB and then removing the reference from the Author document.
This sort of architecture ensures that each of these Actions performs a single task and performs it well, without having to tap into events (emitting, consuming) or any other stuff. It's a self-contained method for performing its own unique task.
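A rough sketch of such an Action, using in-memory data so that it runs on its own. removeQuote and the sample records are hypothetical; with Mongoose the two steps would presumably be a delete on the Quote model followed by a $pull update on Author. Note that the two steps are grouped in one function here but are not atomic at the database level.

```javascript
// Hypothetical in-memory stand-ins for the Quotes and Authors collections.
const quotes = new Map([
  ['q1', { _id: 'q1', text: 'Know thyself' }],
  ['q2', { _id: 'q2', text: 'All is flux' }],
]);
const authors = [
  { name: 'A', quotes: ['q1', 'q2'] },
  { name: 'B', quotes: ['q2'] },
];

// Action: one self-contained function that removes both the document
// and every reference to it, so callers never do half of the job.
function removeQuote(quoteId) {
  const existed = quotes.delete(quoteId); // step 1: remove the document
  for (const author of authors) {
    // step 2: remove the reference from every author that holds it
    author.quotes = author.quotes.filter((id) => id !== quoteId);
  }
  return existed;
}

const removed = removeQuote('q2');
console.log(removed, authors);
```

Callers only ever invoke removeQuote, so the knowledge of who references a quote lives in one place instead of in hooks or listeners.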

Resources