I am trying to create an array of nested objects. I am following an example from a book that does the following:
// Creates the Schema for the Phone object
var Phone = new Schema({
  number: { type: Number, required: false },
  ...
  personId: { type: Schema.Types.ObjectId }
});
// Creates the Schema for the Person object
var Person = new Schema({
  name: { type: String },
  phones: [Phone]
});
var Person = mongoose.model('Person', Person);
This works just fine for storing multiple phone numbers for a person. However, I am not sure if there is a good/fast way to get a Phone object by _id. Since Phone is not a Mongoose model, you cannot go directly to Phone.findOne({...}). Right now I am stuck with getting a person by _id, then looping over that person's phones and checking whether the id matches.
Then I stumbled upon this link:
http://mongoosejs.com/docs/populate.html
Is one way more right than the other? Currently when I delete a person his/her phones go away as well. Not really sure that works with 'populate', seems like I would need to delete Person and Phones.
Anyone want to attempt to explain the differences?
Thanks in advance
The general rule is that if you need to independently query Phones, then you should keep them in a separate collection and use populate to look them up from People when needed. Otherwise, embedding them is typically a better choice as it simplifies updates and deletion.
When using an embedded approach like you are now, note that Mongoose arrays provide an id method you can use to more easily look up an element by its _id value.
var phone = person.phones.id(id);
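For intuition, here is roughly the loop from the question that the id method saves you from writing by hand. This is only a plain-JavaScript sketch with no Mongoose involved: the person object, its ids, and the findPhoneById helper are all made up for illustration, and it is not meant to describe how Mongoose implements the lookup internally.

```javascript
// Plain-JavaScript stand-in for an embedded document; no Mongoose involved.
// The ids and numbers below are made up for illustration.
const person = {
  _id: 'p1',
  name: 'Ann',
  phones: [
    { _id: 'ph1', number: 5550100 },
    { _id: 'ph2', number: 5550101 },
  ],
};

// Hypothetical helper: scan the embedded array for a matching _id.
// Ids are compared as strings so that an id object and its string form
// would still match.
function findPhoneById(owner, id) {
  const wanted = String(id);
  return owner.phones.find((phone) => String(phone._id) === wanted) || null;
}

const phone = findPhoneById(person, 'ph2');
console.log(phone ? phone.number : 'not found');
```

Note that this only works once you already have the person in hand, which is the limitation the question describes.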
Related
I'm learning MongoDB and I have the following question: in one schema I have a reference to another model, where I'm storing the ids of books. I have a books model where I have a reference to other books, saving their ids.
I will insert the ids in 'similarBooks' manually. But the ids of the books will always be in the format of
ObjectId("1234").
If a user clicks on the name of a book, a query will be made with findById. However, the ids I inserted manually are just strings, not ObjectId("id"), so it wouldn't find the book. What is the best way to handle this? Do I take the id in my query (the one that's just a string) and convert it to ObjectId("id"), or do I convert it to an ObjectId before inserting it instead of inserting a plain string? If so, how? So far I have just been adding data for these models in Studio 3T.
The same question applies to writing tests. If I have ids stored as strings, do I convert them to ObjectId?
Thank you!
const bookSchema = new mongoose.Schema({
title: {
type: String,
required: true
},
similarBooks: {
name: {
type: [String] //would be only 2
},
id: {
type: [String] //would be only 2
}
}
...
})
There is an answer on the advantage of saving an id as an ObjectId instead of a string here. Mainly it saves space.
MongoDb: Benefit of using ObjectID vs a string containing an Id?
So to answer your question, I would always convert the string id to an ObjectId before adding it to your similarBooks array.
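Before converting, it can help to check that a hand-typed id is even convertible. An ObjectId is 12 bytes, so its string form is 24 hexadecimal characters, which means something like "1234" is not a valid one. Below is a minimal sketch of such a check in plain JavaScript; the isObjectIdString helper and the sample ids are made up for illustration. The conversion itself would then be done with your driver's ObjectId type (with Mongoose, typically new mongoose.Types.ObjectId(str)), which is not shown here.

```javascript
// Hypothetical helper: true when the value is a 24-character hex string,
// the textual form of a 12-byte ObjectId.
function isObjectIdString(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

// Made-up ids for illustration.
const candidates = ['507f1f77bcf86cd799439011', '1234', 'not-an-id'];
const usable = candidates.filter(isObjectIdString);
console.log(usable);
```

The same check is handy in tests, where fixture ids are usually typed by hand.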
Let's say I have for example:
const Item = Schema({
name: String,
value: Number
})
const Player = Schema({
name: String,
objectInventory: [Item],
petInventory: [Item]
})
Would the items somehow get mixed up? Is this safe? Are all the items unique and know where they belong to? I don't want to write Player.objectInventory and get pets in there. I'm sorry if this seems like common sense but I had that doubt.
Yes, one schema can contain two arrays of the same document type, and the items will not get mixed up. Mongoose is nothing more than another layer on top of the database that helps you with schemas. So in your case, each property (e.g. objectInventory and petInventory) holds its own ids, and when you populate them Mongoose will make the correct queries to return the results.
In my application I have a User collection. Many of my other collections, for example my Post collection, have an Author (an author contains ONLY the user._id and the user.name), since I normally only need the _id and the name to display e.g. my posts in the UI.
This works fine and seems like a good approach, since every time I deal with posts I don't have to load the whole user object from the database. I can just load post.author.userId/post.author.name.
Now my problem: a user changes his or her name. Obviously all the Author objects scattered around my database still have the old name.
Questions:
Is my approach solid, or should I only reference the userId everywhere I need it?
If I went for that solution, I'd remove my Author model and would need to make a User database call every time I want to display the current user's name.
If I leave my Author as is, what would be a good way to handle situations like the user.name change?
I could of course write a service that checks every model that has Authors with the current user._id and updates them, but this sounds very tedious. I'm not sure there's a better solution, though.
Any pro tips on how I should deal with problems like this in the future?
Yes, sometimes it is good to keep data in a modular style. But you shouldn't keep a separate collection for user/author data like this.
If you use Mongoose as your driver, you can use populate to pull in the user's data.
For example, I would model user, author, and post like this:
var UserSchema = new mongoose.Schema({
type: { type: String, default: "user", enum: ["user", "author"], required: true },
name: { type: String },
// Author specific values
joinedAt: { type: Date }
});
var User = mongoose.model("User", UserSchema);
var PostSchema = new mongoose.Schema({
author: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
content: { type: String }
});
var Post = mongoose.model("Post", PostSchema);
In this style, Post is a separate model and has to be saved as such. If you want to query a post along with its author's name, you can use Mongoose's populate.
Post.findOne().populate("author").exec(function(err, post) {
  if (err) {
    // do error handling
    return;
  }
  if (post) {
    console.log(post.author.type); // author
  }
});
One solution is to save only the id in the Author collection, with a ref to the User collection, and populate each time to get the user's name from the User collection.
var User = {
name: String,
//other fields
}
var Author = {
userId: {
type: String,
ref: "User"
}
}
Another solution is to update all the names in the Author collection whenever a name is updated in the User collection.
I think the first solution is better.
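To make the second solution concrete, here is a rough sketch of the fan-out it implies, written against in-memory arrays rather than a real database. The users and posts arrays, their contents, and the renameUser helper are made up for illustration. Against real collections you would more likely issue one bulk update per collection (for example with Mongoose's Model.updateMany) than loop in application code.

```javascript
// In-memory stand-ins for the User and Post collections (made-up data).
const users = [{ _id: 'u1', name: 'Old Name' }];
const posts = [
  { _id: 'p1', author: { userId: 'u1', name: 'Old Name' } },
  { _id: 'p2', author: { userId: 'u2', name: 'Someone Else' } },
  { _id: 'p3', author: { userId: 'u1', name: 'Old Name' } },
];

// Hypothetical helper: rename the user, then refresh every embedded copy.
// Returns how many embedded author objects were updated.
function renameUser(userId, newName) {
  const user = users.find((u) => u._id === userId);
  if (!user) return 0;
  user.name = newName;

  let updated = 0;
  for (const post of posts) {
    if (post.author.userId === userId) {
      post.author.name = newName;
      updated += 1;
    }
  }
  return updated;
}

const changed = renameUser('u1', 'New Name');
console.log(changed);
```

The cost is visible here: every collection that embeds an author needs its own pass, which is the tedium the question worries about.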
Here is my Mongoose Schema:
var SchemaA = new Schema({
field1: String,
.......
fieldB : { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
field1: String,
.......
fieldC : { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
field1: String,
.......
.......
.......
});
When I access SchemaA using a find query, I want to get the fields/properties
of SchemaA along with those of SchemaB and SchemaC, the same way a join works in an SQL database.
This is my approach:
SchemaA.find({})
.populate('fieldB')
.exec(function (err, result){
SchemaB.populate(result.fieldC,{path:'fieldB'},function(err, result){
.............................
});
});
The above code is working perfectly, but the problem is:
I want to get the information/properties/fields of SchemaC through SchemaA, and I don't want to populate the fields/properties of SchemaB.
The reason I don't want the properties of SchemaB is that the extra population will slow the query down unnecessarily.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (eg. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
.populate({
path: 'fieldB',
select: 'fieldC',
options: { lean: true }
}).exec(function (err, result) {
// not sure how you are populating "result" in your example, as it should be an array,
// but you said your code works... so I'll let you figure out what goes here.
});
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and an Author schema, I would likely save the author's first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try to use it on data that doesn't change very often, like people's names. On the rare occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
SchemaA.find({})
.populate({
path: "fieldB",
populate:{path:"fieldC"}
}).exec(function (err, result) {
//this is how you can get all key value pair of SchemaA, SchemaB and SchemaC
//example: result.fieldB.fieldC._id(key of SchemaC)
});
Why not add a ref to SchemaC on SchemaA? The way you currently have it, there is no way to bridge from SchemaA to SchemaC without SchemaB, unless you populate SchemaB with no other data than the ref to SchemaC.
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field just like when using select().
I think this is not possible. When a document in A refers to a document in B, and that document refers to another document in C, how can the document in A know which document in C to refer to without any help from B?
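The answers above agree on one point: B has to be consulted, and the only thing you can trim is how much of B you read. A small in-memory sketch of that idea follows, using Maps as stand-ins for the collections. The data and the resolveCThroughB helper are made up for illustration, and this is not Mongoose code.

```javascript
// In-memory stand-ins for the B and C collections, keyed by _id (made-up data).
const collectionB = new Map([
  ['b1', { _id: 'b1', field1: 'B one', fieldC: 'c1' }],
]);
const collectionC = new Map([
  ['c1', { _id: 'c1', field1: 'C one' }],
]);
const docA = { _id: 'a1', field1: 'A one', fieldB: 'b1' };

// Hypothetical helper: follow A -> B -> C, reading only fieldC from B.
function resolveCThroughB(a) {
  const b = collectionB.get(a.fieldB);
  if (!b) return null;
  const c = collectionC.get(b.fieldC);
  if (!c) return null;
  // Keep B as a thin link: only its reference to C is carried along.
  return { ...a, fieldB: { fieldC: c } };
}

const joined = resolveCThroughB(docA);
console.log(joined.fieldB.fieldC.field1);
```

The result has the same shape as result.fieldB.fieldC in the nested populate answer, minus B's other fields.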
I have two collections:
Users
Uploads
Each upload has a User associated with it and I need to know their details when an Upload is viewed. Is it best practice to duplicate this data inside the Uploads record, or use populate() to pull in these details from the Users collection referenced by _id?
OPTION 1
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
_user: { type: Schema.ObjectId, ref: 'users'},
title: { type: String },
});
OPTION 2
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
user: {
name: { type: String },
email: { type: String },
avatar: { type: String },
//...etc
},
title: { type: String },
});
With 'Option 2' if any of the data in the Users collection changes I will have to update this across all associated Upload records. With 'Option 1' on the other hand I can just chill out and let populate() ensure the latest User data is always shown.
Is the overhead of using populate() significant? What is the best practice in this common scenario?
If you need to query your users, keep users on their own. If you need to query your uploads, keep uploads on their own.
Another question to ask yourself: every time I need this data, do I need the embedded objects (and vice versa)? How many times will this data be updated? How many times will it be read?
Think about a friendship request:
Each time you need the request, you also need the user who made it, so embed the request inside the user document.
You will be able to create an index on the embedded object too, and your search will be a single query: fast and consistent.
Just a link to my previous reply on a similar question:
Mongo DB relations between objects
I think this post will be right for you http://www.mongodb.org/display/DOCS/Schema+Design
Use Cases
Customer / Order / Order Line-Item
Orders should be a collection. Customers should be a collection. Line-items should be an array of line-items embedded in the order object.
Blogging system.
Posts should be a collection. Post author might be a separate collection, or simply a field within posts if only an email address. Comments should be embedded objects within a post for performance.
Schema Design Basics
Kyle Banker, 10gen
http://www.10gen.com/presentation/mongosf2011/schemabasics
Indexing & Query Optimization
Alvin Richards, Senior Director of Enterprise Engineering
http://www.10gen.com/presentation/mongosf-2011/mongodb-indexing-query-optimization
These 2 videos are the best on MongoDB I have ever seen, imho.
populate() is just a query. So the overhead is whatever that query costs, which is a find() on your model.
Also, best practice for MongoDB is to embed what you can. It will result in a faster query. It sounds like you'd be duplicating a ton of data though, which puts relations(linking) at a good spot.
"Linking" is just putting an ObjectId in a field from another model.
Here is the Mongo Best Practices http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesign-SummaryofBestPractices
Linking/DBRefs http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-SimpleDirect%2FManualLinking
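To illustrate the point that populate() is just a query, here is a rough in-memory sketch of what linking plus a follow-up lookup amounts to. The usersById map, the uploads array, and the populateUsers helper are made up for illustration. This is not Mongoose's implementation, and the sketch assumes the follow-up lookup is batched over the distinct ids rather than run once per upload.

```javascript
// In-memory stand-ins for the Users and Uploads collections (made-up data).
const usersById = new Map([
  ['u1', { _id: 'u1', name: 'Ann', email: 'ann@example.com' }],
  ['u2', { _id: 'u2', name: 'Bob', email: 'bob@example.com' }],
]);
const uploads = [
  { _id: 'up1', _user: 'u1', title: 'First' },
  { _id: 'up2', _user: 'u2', title: 'Second' },
  { _id: 'up3', _user: 'u1', title: 'Third' },
];

// Hypothetical helper: replace each reference with the referenced document.
// The distinct ids are collected first, so each user is looked up once.
function populateUsers(docs) {
  const ids = [...new Set(docs.map((doc) => doc._user))];
  const found = new Map(ids.map((id) => [id, usersById.get(id) || null]));
  return docs.map((doc) => ({ ...doc, _user: found.get(doc._user) }));
}

const populated = populateUsers(uploads);
console.log(populated.map((doc) => `${doc.title} by ${doc._user.name}`));
```

With 'Option 1' this lookup happens on every read, while with 'Option 2' the read is free and the cost moves to whenever a user's details change.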