Easy way to reference Documents in Mongoose - node.js

In my application I have a User Collection. Many of my other collections have an Author (an author contains ONLY the user._id and the user.name), for example my Post Collection. Since I normally only need the _id and the name to display e.g. my posts on the UI.
This works fine, and seems like a good approach, since now everytime I deal with posts I don`t have to load the whole user Object from the database - I can only load my post.author.userId/post.author.name.
Now my problem: A user changes his or her name. Obviously all my Author Objects scattered around in my database still have the old author.
Questions:
is my approuch solid, or should I only reference the userId everywhere I need it?
If I'd go for this solution I'd remove my Author Model and would need to make a User database call everytime I want to display the current Users`s name.
If I leave my Author as is, what would be a good way to implement a solution for situations like the user.name change?
I could write a service which checks every model which has Authors of the current user._id and updates them of course, but this sounds very tedious. Although I'm not sure there's a better solution.
Any pro tipps on how I should deal with problems like this in the future?

Yes, sometime database are good to recorded at modular style. But You shouldn't do separating collection for user/author such as
At that time if you use mongoose as driver you can use populate to get user schema data.
Example, I modeling user, author, post that.
var UserSchema = new mongoose.Schema({
type: { type: String, default: "user", enum: ["user", "author"], required: true },
name: { type: String },
// Author specific values
joinedAt: { type: Date }
});
var User = mongoose.model("User", UserSchema);
var PostSchema = new mongoose.Schema({
author: { type: mongoose.Scheam.Types.ObjectId, ref: "User" },
content: { type: String }
});
var Post = mongoose.model("Post", PostSchema);
In this style, Post are separated model and have to save like that. Something like if you want to query a post including author's name, you can use populate at mongoose.
Post.findOne().populate("author").exce(function(err, post) {
if(err)
// do error handling
if(post){
console.log(post.author.type) // author
}
});

One solution is save only id in Author collection, using Ref on the User collection, and populate each time to get user's name from the User collection.
var User = {
name: String,
//other fields
}
var Author = {
userId: {
type: String,
ref: "User"
}
}
Another solution is when updating name in User collection, update all names in Author collection.
I think first solution will be better.

Related

Storing form data in MongoDB (Design Question)

Trying to design a Form Entry web app and i've rarely used MongoDB before.
Wondering if this is the best practice for storing form (document) data inside a collection.
const mongoose = require('mongoose');
// Create Schema and Model
const documentSchema = mongoose.Schema({
nps: [{ // New Promotion Submission
documentId: Number,
orgid: Number,
documentFields: [{ // Form Fields
id: Number,
dateTimeSubmitted: Date,
title: String,
productDescription: String,
productUnitSize: Number,
productCartonQty: Number
}]
}]
})
const documents = mongoose.model('documents', documentSchema);
module.exports = documents;
This is absolutely fine design, couple of things to look at:
Make sure you introduce validation on your schema fields, mirror the same validation pattern on the frontend form fields also.
Be consistent with your naming: if you use camelCase in documentId make sure to also origId
Convention says you name a model in singular form, i.e. "Document" not "documents".
If you're going to re-use the documentFields schema anywhere else in other models, make sure to store it as a separate schema and import as needed.

mongoose own populate with custom query

I'm trying to create a custom query method in mongoose - similar to the populate()-function of mongoose. I've the following two simple schemas:
const mongoose = require('mongoose')
const bookSchema = new mongoose.Schema({
title: String,
author: {type: mongoose.Schema.Types.ObjectId, required: true, ref: 'Author'}
}, {versionKey: false})
const authorSchema = new mongoose.Schema({
name: String
}, {versionKey: false})
Now, I want retrieve authors information and furthermore the books written by the author. As far as I know, mongoose provides custom queries, hence my idea was to write a custom query function like:
authorSchema.query.populateBooks = function () {
// semi-code:
return forAll(foundAuthors).findAll(books)
}
Now, to get all authors and all books, I can simply run:
authorModel.find({}).populateBooks(console.log)
This should result in something like this:
[ {name: "Author 1", books: ["Book 1", "Book 2"]}, {name: "Author 2", books: ["Book 3"] } ]
Unfortunately, it doesn't work because I don't know how I can access the list of authors selected previously in my populateBooks function. What I need in my custom query function is the collection of the previous-selected documents.
For example, authorModel.find({}) already returns a list of authors. In populateBooks() I need to iterate through this list to find all books for all authors. Anyone know how I can access this collection or if it's even possible?
populate: "Population is the process of automatically replacing the specified paths in the document with document(s) from other collection(s)" (from the docs i linked).
Based on your question, you're not looking for population. yours is a simple query (the following code is to achieve the example result you gave at the end. note that your books field had a value of an array of strings, I'm assuming those were the titles). Also, do note that the following code will work with the models you've already provided, but this is a bad implementation that i recommend against - for multiple reasons: efficiency, elegance, potential errors (for e.g, authors with identical names), see note after code:
Author.find({}, function(err, foundAuthors){
if (err){
console.log(err); //handle error
}
//now all authors are objects in foundAuthors
//but if you had certain search parameters, foundAuthors only includes those authors
var completeList = [];
for (i=0;i<foundAuthors.length;i++){
completeList.push({name: foundAuthors[i].name, books: []});
}
Book.find({}).populate("author").exec(function(err, foundBooks){
if (err){
console.log(err); //handle err
}
for (j=0;j<foundBooks.length;j++){
for (k=0;k<completeList.length;k++){
if (completeList[k].name === foundBooks[j].author.name){
completeList[k].books.push(foundBooks[j].title);
}
}
}
//at this point, completeList has exactly the result you asked for
});
});
However, as i stated, i recommend against this implementation, this was based on the code you already provided without changing it.
I recommend changing your author schema to include a books property:
var AuthorSchema = new mongoose.Schema({
books: [{
type: mongoose.Schema.Types.ObjectId,
ref: "Book"
}]
//all your other code for the schema
});
And add all books to their respective authors. This way, all you would need to do to get an array of objects, each of which contains an author and all of his books is one query:
Author.find({}).populate("books").exec(function(err, foundAuthors){
//if there's no err, then foundAuthors is an array of authors with their books
});
That is far simpler, more efficient, more elegant and more effective than the earlier possible solution i gave, based on your already existing code without changing it.

Query Parse.com migrated database pointer relationship with Mongoose

Context
So we have migrated from Parse.com to an hosted MongoDB database. Now I have to write a script that queries our database directly (not using Parse).
I'm using nodejs / mongoose and am able to retrieve these documents.
Problem
Here is my schema so far:
var StorySchema = new mongoose.Schema({
_id: String,
genre: String
});
var ActivitySchema = new mongoose.Schema({
_id: String,
action: String,
_p_story: String /* Also tried: { type: mongoose.Schema.Types.ObjectId, ref: 'Story' } and { type: String, ref: 'Story' }*/,
});
I would like to write a query that fetches theses documents with the related Story (stored as a pointer).
Activity
.find({
action: 'read',
})
.exec(function(error, activities) {
activities.forEach(function(activity) {
// I would like to use activity._p_story or whatever the mean to access the story here
});
});
Question
Is there a way to have the fetched activities populated with their story, given that the _p_story field contains Story$ before the object id?
Thanks!
One option I have been looking at is the ability to create a custom data type for each pointer. The unfortunate side is Parse treats these as 'belongsTo' relationships and but does not store the 'hasMany' relationship that Mongoose wants for populate(). But once this is in place you can easily do loops to get the relational data. Not ideal but works and is what populate is really doing under the hood anyways.
PointerTypeClass.js -> This would work for populating the opposite direction.
var Pointer = function(mongoose) {
function PointerId(key, options) {
mongoose.SchemaType.call(this, key, options, 'PointerId');
}
PointerId.prototype = Object.create(mongoose.SchemaType.prototype);
PointerId.prototype.cast = function(val) {
return 'Pointer$' + val;
}
return PointerId;
}
module.exports = Pointer;
Also be sure mongoose knows about the new type by doing mongoose.Schema.Types.PointerId = require('./types/PointerTypeClass')(mongoose);
Lastly. If you are willing to write some cloudcode you could create the array of ids for your populate to know about the objects. Basically in your Object.beforeSave you would update the array of the id for the relationship. Hope this helps.

Mongoose: How to populate 2 level deep population without populating fields of first level? in mongodb

Here is my Mongoose Schema:
var SchemaA = new Schema({
field1: String,
.......
fieldB : { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
field1: String,
.......
fieldC : { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
field1: String,
.......
.......
.......
});
While i access schemaA using find query, i want to have fields/property
of SchemaA along with SchemaB and SchemaC in the same way as we apply join operation in SQL database.
This is my approach:
SchemaA.find({})
.populate('fieldB')
.exec(function (err, result){
SchemaB.populate(result.fieldC,{path:'fieldB'},function(err, result){
.............................
});
});
The above code is working perfectly, but the problem is:
I want to have information/properties/fields of SchemaC through SchemaA, and i don't want to populate fields/properties of SchemaB.
The reason for not wanting to get the properties of SchemaB is, extra population will slows the query unnecessary.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (eg. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
.populate({
name: 'fieldB',
select: 'fieldC',
options: { lean: true }
}).exec(function (err, result) {
// not sure how you are populating "result" in your example, as it should be an array,
// but you said your code works... so I'll let you figure out what goes here.
});
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and Author schema, I would likely save the authors first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try and use it on data that doesn't change very often - like people's names. In the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
SchemaA.find({})
.populate({
path: "fieldB",
populate:{path:"fieldC"}
}).exec(function (err, result) {
//this is how you can get all key value pair of SchemaA, SchemaB and SchemaC
//example: result.fieldB.fieldC._id(key of SchemaC)
});
why not add a ref to SchemaC on SchemaA? there will be no way to bridge to SchemaC from SchemaA if there is no SchemaB the way you currently have it unless you populate SchemaB with no other data than a ref to SchemaC
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field just like when using select().
I think this is not possible.Because,when a document in A referring a document in B and that document is referring another document in C, how can document in A know which document to refer from C without any help from B.

Mongoose: populate() / DBref or data duplication?

I have two collections:
Users
Uploads
Each upload has a User associated with it and I need to know their details when an Upload is viewed. Is it best practice to duplicate this data inside the the Uploads record, or use populate() to pull in these details from the Users collection referenced by _id?
OPTION 1
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
_user: { type: Schema.ObjectId, ref: 'users'},
title: { type: String },
});
OPTION 2
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
user: {
name: { type: String },
email: { type: String },
avatar: { type: String },
//...etc
},
title: { type: String },
});
With 'Option 2' if any of the data in the Users collection changes I will have to update this across all associated Upload records. With 'Option 1' on the other hand I can just chill out and let populate() ensure the latest User data is always shown.
Is the overhead of using populate() significant? What is the best practice in this common scenario?
If You need to query on your Users, keep users alone. If You need to query on your uploads, keep uploads alone.
Another question you should ask yourself is: Every time i need this data, do I need the embedded objects (and vice-versa)? How many time this data will be updated? How many times this data will be read?
Think about a friendship request:
Each time you need the request you need the user which made the request, then embed the request inside the user document.
You will be able to create an index on the embedded object too, and your search will be mono query / fast / consistent.
Just a link to my previous reply on a similar question:
Mongo DB relations between objects
I think this post will be right for you http://www.mongodb.org/display/DOCS/Schema+Design
Use Cases
Customer / Order / Order Line-Item
Orders should be a collection. customers a collection. line-items should be an array of line-items embedded in the order object.
Blogging system.
Posts should be a collection. post author might be a separate collection, or simply a field within posts if only an email address. comments should be embedded objects within a post for performance.
Schema Design Basics
Kyle Banker, 10gen
http://www.10gen.com/presentation/mongosf2011/schemabasics
Indexing & Query Optimization
Alvin Richards, Senior Director of Enterprise Engineering
http://www.10gen.com/presentation/mongosf-2011/mongodb-indexing-query-optimization
**These 2 videos are the bests on mongoddb ever seen imho*
Populate() is just a query. So the overhead is whatever the query is, which is a find() on your model.
Also, best practice for MongoDB is to embed what you can. It will result in a faster query. It sounds like you'd be duplicating a ton of data though, which puts relations(linking) at a good spot.
"Linking" is just putting an ObjectId in a field from another model.
Here is the Mongo Best Practices http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesign-SummaryofBestPractices
Linking/DBRefs http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-SimpleDirect%2FManualLinking

Resources