How to properly delete an orphaned reference in MongoDB? - node.js

So, I am building a small blog-like project in Node, and I am running into an issue with orphaned database references. I have two models in separate files that reference each other.
Here are my models:
// ./models/user
var UserSchema = mongoose.Schema({
name: String,
posts: [{type: mongoose.SchemaTypes.ObjectId, ref:'Post'}]
});
// ./models/post
var PostSchema = mongoose.Schema({
title:String,
post_body: String,
posted_by: mongoose.SchemaTypes.ObjectId
});
My question is when you delete say, a Post, how would you delete the reference in the User's post array? My thinking was I could create a middleware to run before the delete route and delete the reference in the User's post array before I actually delete the post. Would that be considered the best way to go about it? I found a post on Stack that uses a 'pre' function in the schema like this:
// ./models/post
PostSchema.pre('remove', function(next){
this.model('User').remove({posts: this._id}, next);
});
Here is the actual Stack Overflow post: Automatically remove referencing objects on deletion in MongoDB. I could not get this to work, though. I did, however, implement a custom middleware to delete the references, but I feel it might not be best practice. Any tips or advice would be greatly appreciated. Thanks!

You don't want .remove() here but you want .update() with $pull instead:
// Hook the removal of the post itself; inside, `this` is the post document being removed
PostSchema.pre('remove',function(next) {
this.model('User').update(
{ },
{ "$pull": { "posts": this._id } },
{ "multi": true },
next
);
})
That's the correct operation to remove something from an array. The "multi" option makes sure that the post is removed from all User objects that reference it, though it is probably only one document anyway.
The .remove() method is for "removing" whole documents. The .update() method makes "changes" to documents.
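To make the difference concrete, here is a small in-memory sketch that mimics both operations on plain arrays. Nothing here touches MongoDB; the data and the helper names pullPostRef and removeUsersReferencing are made up for illustration only.

```javascript
// In-memory stand-ins for the users collection (hypothetical data).
const users = [
  { name: 'ann', posts: ['p1', 'p2'] },
  { name: 'bob', posts: ['p2', 'p3'] },
];

// Mimics update({}, { $pull: { posts: postId } }, { multi: true }):
// every user is kept, only the matching array entries are dropped.
function pullPostRef(docs, postId) {
  return docs.map((u) => ({ ...u, posts: u.posts.filter((id) => id !== postId) }));
}

// Mimics remove({ posts: postId }):
// whole users that reference the post are deleted.
function removeUsersReferencing(docs, postId) {
  return docs.filter((u) => !u.posts.includes(postId));
}

const afterPull = pullPostRef(users, 'p2');
const afterRemove = removeUsersReferencing(users, 'p2');

console.log(afterPull);
console.log(afterRemove);
```

With the pull-style operation both users survive and only the reference disappears; with the remove-style operation every user that ever referenced the post is gone, which is almost certainly not what you want.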

Related

Easy way to reference Documents in Mongoose

In my application I have a User collection. Many of my other collections, for example my Post collection, have an Author (an author contains ONLY the user._id and the user.name), since I normally only need the _id and the name to display, e.g., my posts in the UI.
This works fine and seems like a good approach, since now every time I deal with posts I don't have to load the whole user object from the database. I only load post.author.userId and post.author.name.
Now my problem: a user changes his or her name. Obviously all my Author objects scattered around in my database still have the old name.
Questions:
Is my approach solid, or should I only reference the userId everywhere I need it?
If I went for that solution, I'd remove my Author model and would need to make a User database call every time I want to display the current user's name.
If I leave my Author as is, what would be a good way to implement a solution for situations like the user.name change?
I could write a service which checks every model which has Authors of the current user._id and updates them of course, but this sounds very tedious. Although I'm not sure there's a better solution.
Any pro tips on how I should deal with problems like this in the future?
Yes, sometimes it is good to structure the database in a modular style, but you shouldn't keep a separate collection for user/author data like this.
If you use Mongoose, you can store a reference instead and use populate to get the user data.
For example, I would model user, author, and post like this:
var UserSchema = new mongoose.Schema({
type: { type: String, default: "user", enum: ["user", "author"], required: true },
name: { type: String },
// Author specific values
joinedAt: { type: Date }
});
var User = mongoose.model("User", UserSchema);
var PostSchema = new mongoose.Schema({
author: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
content: { type: String }
});
var Post = mongoose.model("Post", PostSchema);
In this style, Post is a separate model and is saved on its own. If you want to query a post including the author's name, you can use Mongoose's populate.
Post.findOne().populate("author").exec(function(err, post) {
if (err) {
// do error handling
return;
}
if (post) {
console.log(post.author.type); // e.g. author
}
});
One solution is to save only the id in the Author collection, with a ref to the User collection, and populate each time to get the user's name from the User collection.
var User = {
name: String,
//other fields
}
var Author = {
userId: {
type: String,
ref: "User"
}
}
Another solution is to update all names in the Author collection whenever a name is updated in the User collection.
I think the first solution is better.
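As a rough illustration of why the first solution copes with renames, here is an in-memory sketch of resolving the name at read time. It only imitates what populate does conceptually; the Map, the sample data, and the helper withAuthor are hypothetical and no database is involved.

```javascript
// Hypothetical in-memory collections; ids are plain strings here.
const usersById = new Map([
  ['u1', { _id: 'u1', name: 'Alice' }],
]);
const posts = [{ _id: 'p1', title: 'Hello', author: 'u1' }];

// Rough equivalent of populating `author`:
// swap the stored id for the referenced user document.
function withAuthor(post) {
  return { ...post, author: usersById.get(post.author) || null };
}

// The user renames themselves once, in one place.
usersById.get('u1').name = 'Alicia';

const shown = posts.map(withAuthor);
console.log(shown[0].author.name);
```

Because the post only stores the id, there is no stale copy of the name to chase after a rename. The price is the extra lookup on every read.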

Mongoose: How to populate 2 level deep population without populating fields of first level? in mongodb

Here is my Mongoose Schema:
var SchemaA = new Schema({
field1: String,
.......
fieldB : { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
field1: String,
.......
fieldC : { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
field1: String,
.......
.......
.......
});
When I query SchemaA with find, I want to get the fields of SchemaA along with those of SchemaB and SchemaC, the same way a join operation works in an SQL database.
This is my approach:
SchemaA.find({})
.populate('fieldB')
.exec(function (err, result){
SchemaB.populate(result.fieldC,{path:'fieldB'},function(err, result){
.............................
});
});
The above code works perfectly, but the problem is:
I want the fields of SchemaC through SchemaA, and I don't want to populate the fields of SchemaB.
The reason for not wanting the properties of SchemaB is that the extra population slows the query unnecessarily.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (eg. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
.populate({
path: 'fieldB',
select: 'fieldC',
options: { lean: true }
}).exec(function (err, result) {
// not sure how you are populating "result" in your example, as it should be an array,
// but you said your code works... so I'll let you figure out what goes here.
});
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and an Author schema, I would likely save the author's first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try to use it on data that doesn't change very often, like people's names. On the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
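A minimal sketch of such an "update all the names" function, run against plain in-memory objects instead of a real collection. The sample books and the function name renameAuthorEverywhere are invented for the example; against MongoDB this would be a multi-document update rather than a loop.

```javascript
// Hypothetical denormalized books: each carries a copy of the author's name.
const books = [
  { title: 'A', author: { _id: 'a1', name: 'Jane Doe' } },
  { title: 'B', author: { _id: 'a2', name: 'John Roe' } },
  { title: 'C', author: { _id: 'a1', name: 'Jane Doe' } },
];

// Rewrites every embedded copy of the name and reports how many were touched.
function renameAuthorEverywhere(docs, authorId, newName) {
  let touched = 0;
  for (const doc of docs) {
    if (doc.author._id === authorId) {
      doc.author.name = newName;
      touched += 1;
    }
  }
  return touched;
}

const touched = renameAuthorEverywhere(books, 'a1', 'Jane Smith');
console.log(touched, books.map((b) => b.author.name));
```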
SchemaA.find({})
.populate({
path: "fieldB",
populate:{path:"fieldC"}
}).exec(function (err, result) {
//this is how you can get all key value pair of SchemaA, SchemaB and SchemaC
//example: result.fieldB.fieldC._id(key of SchemaC)
});
Why not add a ref to SchemaC on SchemaA? The way you currently have it, there is no way to bridge from SchemaA to SchemaC without SchemaB, unless you populate SchemaB with no data other than a ref to SchemaC.
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes .populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field, just like when using select().
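To picture what restricting the fields buys you, here is an in-memory imitation of walking A to C while keeping only fieldC from B. It is a sketch with made-up data and a hypothetical helper populateCThroughB, not Mongoose code. Note that B still has to be read to learn which C document to fetch; only the amount of B data returned shrinks.

```javascript
// Hypothetical in-memory collections keyed by _id.
const bDocs = { b1: { _id: 'b1', field1: 'big B payload', fieldC: 'c1' } };
const cDocs = { c1: { _id: 'c1', field1: 'C data' } };
const aDocs = [{ _id: 'a1', field1: 'A data', fieldB: 'b1' }];

// Follows A.fieldB -> B.fieldC, but copies only `fieldC` out of B.
function populateCThroughB(a) {
  const b = bDocs[a.fieldB];
  return { ...a, fieldB: { fieldC: cDocs[b.fieldC] } };
}

const result = aDocs.map(populateCThroughB);
console.log(JSON.stringify(result[0]));
```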
I think this is not possible, because when a document in A refers to a document in B, and that document refers to another document in C, how can the document in A know which document in C to refer to without any help from B?

Adding fields to model which derived from Mongoose schema

I have a Mongoose schema that looks like this:
ManifestSchema = new Schema({
entries: [{
order_id: String,
line_item: {}, // <-- resolved at run time
address: {},// <-- resolved at run time
added_at: Number,
stop: Number,
}]
}, {collection: 'manifests', strict: true });
and somewhere in the code I have this:
Q.ninvoke(Manifests.findById(req.params.id), 'exec')
.then(function(manifest)
{
// ... so many things, like resolving the address and the item information
entry.line_item = item;
entry.address = order.delivery.address;
})
The issue I faced is that, without address and line_item defined in the schema, the values I resolved at run time weren't returned to the user, because they weren't in the schema. So I added them, which caused another unwanted behavior: when I saved the object back, both address and line_item were saved with the manifest object, something I would like to avoid.
Is there any way to add fields to the schema at run time without saving them on the way back?
I tried using 'virtuals' in mongoose, but they don't really provide what I need, because I don't create the model from a schema; it is returned from the database.
Call toObject() on your manifest Mongoose instance to create a plain JavaScript copy that you can add extra fields to for the user response without affecting the doc you need to save:
Q.ninvoke(Manifests.findById(req.params.id), 'exec')
.then(function(manifest)
{
var manifestResponse = manifest.toObject();
// ... so many things, like resolving the address and the item information
entry.line_item = item;
entry.address = order.delivery.address;
})
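The idea can be tried without a database. The sketch below uses a hypothetical FakeDoc class whose toObject() returns a detached plain copy, which is the property of Mongoose's toObject() this answer relies on; the data and the address value are made up.

```javascript
// Hypothetical stand-in for a Mongoose document:
// toObject() returns a detached plain copy of the stored data.
class FakeDoc {
  constructor(data) { this.data = data; }
  toObject() { return JSON.parse(JSON.stringify(this.data)); }
}

const manifest = new FakeDoc({ entries: [{ order_id: 'o1', stop: 1 }] });

// Decorate the copy for the response, not the document itself.
const manifestResponse = manifest.toObject();
manifestResponse.entries.forEach((entry) => {
  entry.address = { city: 'Example City' }; // resolved at run time (made-up value)
});

console.log(manifest.data.entries[0].address);         // the doc to save is untouched
console.log(manifestResponse.entries[0].address.city); // the response carries the extra field
```

The key point is to attach the resolved fields to the entries of the copy and send the copy to the user, while saving only the original document.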

Making a mongoose model aware that it is nested

Consider the blog/comment schemas where nesting is appropriate (even if you disagree):
var CommentSchema = new Schema({ name: String, body: String });
var BlogPostSchema = new Schema({ title: String, comments: [CommentSchema] });
I understand how to add, update, delete comments for a blog post, but all of these methods require the save() method to be called on the parent blog post document:
blog_post.comments.push( new Comment({...}) );
blog_post.save();
I would like to be able to make the Comment schema aware that it is nested inside of another schema so that I can call save() on a comment document and it's smart enough to update the parent blog post. In my app logic, I already know the blog post id, so I would like to do something like this:
CommentSchema.virtual('blog_post_id');
CommentSchema.pre('save', function (next) {
var comment = this;
if( !comment.blog_post_id ) throw new Error('Need a blog post id');
BlogModel.findById( comment.blog_post_id, function(err, post) {
post.comments.push( comment );
post.save(next);
});
});
var comment = new Comment({ blog_post_id: 123, name: 'Joe', body: 'foo' });
comment.save();
The above works, but I still end up with a top-level Comments collection separate from the blog posts (this is just how mongoose works, I accept that).
Question: How do I prevent Mongoose from creating a separate "Comments" collection? In the pre-save method I would like to call next() without any write operations taking place afterwards. Any thoughts?
This has earned me the Tumbleweed badge... hooray!?!?
So I have written a lot of code which accomplished the above. I don't want to release it until I have done more testing. But if anybody is interested in this, please let me know by posting here. I will gladly hand over what I have (which is going into production soon).
Right now my code doesn't support deep nesting... meaning you can only work with "simple" nesting similar to the blog/comments example above. I have the architecture in place to handle more complex nesting in the future, but I don't have the time to test right now (darn deadlines). Here are some of the big points about my solution so far:
All operations require the parent document's id (this makes sense once you start using it)
find, findOne, save, and remove directly on a nested model
findById doesn't (can't) work - well it maybe could work but would require searching the entire collection, which is slow. Must use findOne + parent id instead (see examples).
Super fast - uses projection for finding, and saves using Model.update() on the parent model (which is really fast).
All middleware still executes (pre/post and validation)
None of the findAndUpdate/Remove methods work [yet?]
Setup
// setup the "nestedSchema" plugin
var nestedSchema = require("./plugins/nestedSchema");
CommentSchema.plugin(nestedSchema, {
path: 'comments',
ownerModelPath: './BlogPostModel',
ownerIdFieldName: 'blogpost_id'
});
Examples - take note that the parent's blogpost_id is ALWAYS used - this is a requirement which makes it stay fast (callbacks and error handling removed for brevity):
// create a new comment
var comment = new CommentModel({
blogpost_id: [id],
name: 'Joe Schmoe',
body: 'The content of the comment'
});
comment.save();
// use findOne in lieu of findById
CommentModel.findOne({blogpost_id: [id], _id: [id]}, function( err, comment ) {
comment.set('body', 'This comment has been updated directly!');
comment.save();
});
// find all hateful comments and remove
CommentModel.find({blogpost_id: [id], body: /sucks|stupid|dumb/gi}).remove();
Using the mongoose-relationship plugin from https://www.npmjs.org/package/mongoose-relationship
it is possible to make your documents aware of their relations.
Corresponding references are updated by the plugin when adding/removing documents.
There is a good example on the github page: https://github.com/sabymike/mongoose-relationship

Get related items in keystone

Working on a project in KeystoneJS and I'm having trouble figuring out the mongoose relationship bit.
According to the keystone docs, let's say we have the following models: User and Post. Now a post has a relationship to a user, so I'll write:
Post.add({
author: { type: Types.Relationship, ref: 'User' }
});
and then:
User.relationship({ path: 'posts', ref: 'Post', refPath: 'author' });
Now, I want to be able to see all posts related to that User without having to query for both a User and Posts. For example, if I queried for a user object I would like to be able to do user.posts and have access to those related posts. Can you do this with mongoose/keystone?
As far as I understand, keystone's List Relationship has nothing to do with mongoose and querying. Instead, it is used by keystone's admin UI to build out the relationship queries before rendering them in the view. That said, I would not count on User.relationship(...) to solve your problem, although you still want it for the admin UI purpose I just mentioned.
The following should work fine based on your schema, but only populates the relationship on the one side:
var keystone = require('keystone');
keystone.list('Post').model.findOne().populate('author').exec(function (err, doc) {
console.log(doc.author.name); // Joe
console.log(doc.populated('author')); // '1234af9876b024c680d111a1' -> _id
});
You could also try this the other way, however...
keystone.list('User').model.findOne().populate('posts').exec(function (err, doc) {
doc.posts.forEach(function (post) {
console.log(post);
});
});
...mongoose expects that this definition is added to the Schema. This relationship is added by including this line in your User list file:
User.schema.add({ posts: { type: Types.Relationship, ref: 'Post', many: true } })
After reading the keystone docs, this seems to be logically equivalent to the pure mongoose way, User.schema.add({ posts: [{ type: Schema.Types.ObjectId, ref: 'Post' }] });. But now you are maintaining the relationship on both lists. Instead, you may want to add a method to your keystone list.
User.schema.methods.posts = function(done){
return keystone.list('Post').model.find()
.where('author', this.id )
.exec(done);
};
By adding a method to your User list, you avoid persisting an array of ObjectIds on the User document that point back to the Post documents. I know this requires a second query, but one of these two options looks to be your best bet.
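Here is the same trade-off in miniature, with plain objects instead of keystone lists. The sample posts and the factory makeUser are hypothetical. The point is only that the user holds no array of post ids, and the related posts are found by filtering on author when asked for.

```javascript
// Hypothetical in-memory posts; each stores only its author's id.
const allPosts = [
  { title: 'First', author: 'u1' },
  { title: 'Second', author: 'u2' },
  { title: 'Third', author: 'u1' },
];

// Stand-in for a user with a posts() method: it filters by author on demand
// instead of keeping an array of post ids on the user itself.
function makeUser(id, name) {
  return {
    id,
    name,
    posts() { return allPosts.filter((p) => p.author === this.id); },
  };
}

const joe = makeUser('u1', 'Joe');
console.log(joe.posts().map((p) => p.title));
```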
I found this on their github, https://github.com/Automattic/mongoose/issues/1888. Check it for context, but it basically says to use the keystone populateRelated() method. I tested it and it does work.
// if you've got a single document you want to populate a relationship on, it's neater
Category.model.findOne().exec(function(err, category) {
category.populateRelated('posts', function(err) {
// posts is populated
});
});
I'm aware the question is old, but this should be out there for future reference.

Resources