My post document looks like the following:
{
_id: ...,
type: 'post',
title: ...,
description: ...,
author: 'user_id'
}
And another user document:
{
_id: 'user_id',
type: 'user',
name: ...,
}
How do I fetch the post and the linked user document given that I only know post id?
Embedding the user document inside the post document doesn't seem like a good solution: if the user changes their name or other details, I will have to update every post.
Another solution would be to include a posts array in the user document and use two emits in the view document. But with frequent posts and high number of posts, this looks a little inefficient.
You mentioned "linked documents" as if you were referencing this feature in CouchDB, but it doesn't appear like you meant it that way.
It turns out, this is totally supported. Your document structure doesn't need to change at all, you can use a map function like this:
function (doc) {
if (doc.type === 'post') {
emit(doc._id)
emit(doc._id, { _id: doc.author })
}
}
By emitting an object with an _id property as the value, it allows CouchDB to look up a different document (in this case, the user document) than the original when you add include_docs=true on your view query. This allows you to fetch an entire collection of related documents in a single query! I'd reference the documentation I linked to earlier for a complete example. (the rest of their docs are great too!)
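As a rough sketch of the query side: the helper below builds the view URL and splits the returned rows into the post and its author. The database name `blog`, design document `posts` and view name `by_post` are hypothetical, so substitute your own.

```javascript
// Hypothetical names: database "blog", design doc "posts", view "by_post".
function buildViewUrl(baseUrl, postId) {
  const params = new URLSearchParams({
    key: JSON.stringify(postId), // view keys must be JSON-encoded
    include_docs: 'true'
  });
  return baseUrl + '/blog/_design/posts/_view/by_post?' + params.toString();
}

// Split the rows of the view response into the post and its author.
// With include_docs=true, each row carries the looked-up document in `row.doc`.
function splitPostAndAuthor(rows) {
  const result = { post: null, author: null };
  rows.forEach(function (row) {
    if (!row.doc) return; // linked document missing or deleted
    if (row.doc.type === 'post') result.post = row.doc;
    if (row.doc.type === 'user') result.author = row.doc;
  });
  return result;
}

// Usage (needs a running CouchDB and a runtime with global fetch):
// const res = await fetch(buildViewUrl('http://localhost:5984', 'post_id'));
// const { post, author } = splitPostAndAuthor((await res.json()).rows);
```

Both rows share the post's `_id` as their key, so a single `key=` lookup returns the pair.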
I'm new to Mongoose and I need to use the watch() method on a collection.
When I want to catch inserts, there are no problems.
However, when I want to retrieve the changes made by an update, Mongoose sometimes seems to change the names of my fields, and I don't know why.
registration.watch().on('change', data => {
if (data.operationType === "update") {
console.log(data.updateDescription.updatedFields);
}
});
My registration collection is made up of persons who can accept or decline an invitation, and a person can change their answer. So an update basically removes the person from one array and puts them in the other.
The only problem I have is that the name of my array sometimes "changes":
{
__v: 100,
accepted: [
{
_id: 5faa76d048dd6e0017e631d4,
user: 5faa752848dd6e0017e631d2
},
{
_id: 5faa9ab06048a20017774610,
user: 5fa8fabc60260ec31606d71e
},
],
'declined.1': { _id: 5faf037a141f030017863484, user: 5faa74de48dd6e0017e631d0 },
For example, here my field declined has changed to "declined.1". Why is this happening, and how can I avoid it? Or at least, how can I get the full declined array in this situation?
When you update a document in MongoDB, it only writes the deltas to the operations log, which is what the watch function pulls from.
The dot notation declined.1 means index 1 of the declined array. The change document you provided would be expected from pushing a new object onto the declined array. Essentially, it is saving space by not repeating all of the array elements that didn't change.
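If you want to work with those dotted keys directly, a small helper can split each key into the top-level field and the path inside it. This is only a sketch; `summarizeUpdatedFields` is a made-up name and not part of Mongoose or the MongoDB driver.

```javascript
// Break each key of updateDescription.updatedFields into its top-level
// field name and the remaining path, e.g. 'declined.1' -> declined, ['1'].
function summarizeUpdatedFields(updatedFields) {
  return Object.keys(updatedFields).map(function (key) {
    const parts = key.split('.');
    return {
      field: parts[0],       // top-level field, e.g. 'declined'
      path: parts.slice(1),  // position inside it, e.g. ['1'] for index 1
      value: updatedFields[key]
    };
  });
}
```

An entry with an empty `path` means the whole field was written, while a non-empty `path` means only that element changed.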
If you need to retrieve the entire document, you could set the fullDocument option to 'updateLookup' when calling watch(). See http://mongodb.github.io/node-mongodb-native/3.0/api/Collection.html#watch
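A minimal sketch of that option in use, assuming `registration` is the Mongoose model from the question. The helper names are made up for illustration. Note that the looked-up document is the current version at lookup time, so it may already include later changes.

```javascript
// Pull the complete `declined` array out of a change event.
// `fullDocument` is only present on update events when the stream was
// opened with { fullDocument: 'updateLookup' }.
function declinedFromChange(change) {
  if (change.operationType !== 'update' || !change.fullDocument) return null;
  return change.fullDocument.declined;
}

// Open the change stream with the lookup enabled and forward the
// full `declined` array to the caller on every update.
function watchRegistrations(registration, onDeclined) {
  return registration
    .watch([], { fullDocument: 'updateLookup' })
    .on('change', function (change) {
      const declined = declinedFromChange(change);
      if (declined) onDeclined(declined);
    });
}

// Usage:
// watchRegistrations(registration, function (declined) { console.log(declined); });
```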
I’m using Mongoose version 4.6.8 and MongoLab (MLab). I have a Mongoose schema called “Group” that has a collection of User subdocuments called “teachers”:
var GroupSchema = new Schema({
//…more properties here…//
teachers: [{
type: Schema.ObjectId,
ref: 'User'
}]
});
This is a document from the “groups” collection on MongoLab:
{
//…more properties here…//
"teachers": [
{
"$oid": "5799a9c759feea9c208c004c"
}
]
}
And this is a document from the “users” collection on MongoLab:
{
//…more properties here…//
"username": "bob"
}
But if I want to get a list of Groups that have a particular teacher (User) with the username of “bob”, this doesn’t work (the list of groups is empty):
Group.find({"teachers.username": "bob"}).exec(callback);
This also returns no items:
Group.find().where('teachers.username').equals('bob').exec(callback);
How can I achieve this?
Without some more knowledge of your set up (specifically whether you want anybody named Bob or a specific Bob whose id you could pick up first) - this might be some help although I think it would require you to flatten your teachers array to just their ID's, not single-key objects.
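// bobId stands in for the _id of the Bob you are looking for
User.findById(bobId, function(err, user){
if (err || !user) return // handle the error or the missing user here
Group.find({}, function(err, groups){
if (err) return // handle the error here
// keep only the groups whose teachers array contains this user;
// indexOf returns -1 (not false) when the id is absent
var t = groups.filter(function(g){
return g['teachers'].indexOf(user.id) !== -1
})
// Do something with t
})
})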
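Loading every group and filtering in memory gets expensive as the collection grows. A possible alternative, sketched below, is to look the user up by username and then let MongoDB match the id against the teachers array. The function name is made up; `User` and `Group` are the models from the question, passed in as arguments, and the promise-style calls assume a Mongoose version whose queries can be awaited.

```javascript
// Two-step lookup: resolve the username to an _id, then query groups by it.
async function findGroupsByTeacherUsername(User, Group, username) {
  const user = await User.findOne({ username: username });
  if (!user) return []; // nobody with that username, so no groups
  // Matching an array field against a single value finds documents
  // whose array contains that value.
  return Group.find({ teachers: user._id });
}

// Usage:
// findGroupsByTeacherUsername(User, Group, 'bob').then(function (groups) { /* ... */ });
```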
You can use populate to do that.
Try this:
Group.find({})
.populate({
path : 'teachers' ,
match : { username : "bob" }
})
.exec(callback);
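populate will populate the teachers field (the given path), and match restricts which teachers are populated to those with the username bob. Note that match filters the populated teachers, not the groups: every group is still returned, so you have to drop the groups whose teachers came back empty yourself.
For more information on Mongoose populate options, please read the Mongoose populate documentation.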
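A small sketch of the post-filtering step, to be applied to the result of the populate query. The function name is made up; the null check is there in case non-matching references come back as null rather than being left out.

```javascript
// Keep only the groups that still have at least one populated teacher
// after populate({ path: 'teachers', match: ... }) has run.
function groupsWithMatchingTeacher(groups) {
  return groups.filter(function (group) {
    return group.teachers.some(function (teacher) {
      return teacher != null; // skip null/undefined placeholders
    });
  });
}

// Usage inside the exec callback:
// .exec(function (err, groups) { callback(err, err ? [] : groupsWithMatchingTeacher(groups)); });
```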
I think the solution in this case is to get a teacher’s groups through the User module instead of my first inclination which was to go through the Groups module. This makes sense because it is in line with how modern APIs represent a one-to-many relationship.
As an example, in Behance’s API, an endpoint for a user’s projects is:
GET /v2/users/user/projects
And a request to this endpoint (where the User’s username is “matiascorea”) would look like this:
https://api.behance.net/v2/users/matiascorea/projects?client_id=1234567890
So in my case, instead of finding the groups by teacher, I would need to simply find the User (teacher) by username, populate the teacher’s groups, and use them:
User.findOne({username: 'bob'})
.populate('groups')
.exec(callback);
And the API call for this would be:
GET /api/users/user/groups
And a request to this endpoint would look like this:
https://example.com/api/users/bob/groups
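A sketch of what the handler behind that endpoint might look like, written in Express style. It assumes the User schema has a `groups` array of references (which populate('groups') implies) and uses the callback form of exec to match the Mongoose 4.x code in this thread. The function name and the error messages are made up.

```javascript
// Build a handler for GET /api/users/:username/groups.
// `User` is the Mongoose model, passed in so the handler is easy to test.
function makeUserGroupsHandler(User) {
  return function (req, res) {
    User.findOne({ username: req.params.username })
      .populate('groups')
      .exec(function (err, user) {
        if (err) return res.status(500).send({ error: 'lookup failed' });
        if (!user) return res.status(404).send({ error: 'user not found' });
        return res.send(user.groups);
      });
  };
}

// Usage with an Express app:
// app.get('/api/users/:username/groups', makeUserGroupsHandler(User));
```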
Here is my Mongoose Schema:
var SchemaA = new Schema({
field1: String,
.......
fieldB : { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
field1: String,
.......
fieldC : { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
field1: String,
.......
.......
.......
});
When I query SchemaA with find, I want to get the fields/properties
of SchemaA along with those of SchemaB and SchemaC, the same way a join works in a SQL database.
This is my approach:
SchemaA.find({})
.populate('fieldB')
.exec(function (err, result){
SchemaB.populate(result.fieldC,{path:'fieldB'},function(err, result){
.............................
});
});
The above code is working perfectly, but the problem is:
I want to have the information/properties/fields of SchemaC through SchemaA, and I don't want to populate the fields/properties of SchemaB.
The reason for not wanting the properties of SchemaB is that the extra population will slow the query down unnecessarily.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (eg. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
.populate({
path: 'fieldB',
select: 'fieldC',
options: { lean: true }
}).exec(function (err, result) {
// not sure how you are populating "result" in your example, as it should be an array,
// but you said your code works... so I'll let you figure out what goes here.
});
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and an Author schema, I would likely save the author's first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try to use it on data that doesn't change very often, like people's names. On the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
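As a sketch of such an update function: it assumes each book stores the copied name under `author.firstName` / `author.lastName` next to `author._id` (a hypothetical layout) and that your Mongoose version provides updateOne and updateMany. The two writes are not atomic, so books can briefly show the old name.

```javascript
// Rename an author in the Authors collection and in every Book that
// carries a denormalized copy of the name.
async function renameAuthor(Author, Book, authorId, firstName, lastName) {
  // 1. Update the canonical profile.
  await Author.updateOne(
    { _id: authorId },
    { $set: { firstName: firstName, lastName: lastName } }
  );
  // 2. Update the copies embedded in the books.
  return Book.updateMany(
    { 'author._id': authorId },
    { $set: { 'author.firstName': firstName, 'author.lastName': lastName } }
  );
}

// Usage:
// renameAuthor(Author, Book, someAuthorId, 'Jane', 'Doe').then(function (result) { /* ... */ });
```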
SchemaA.find({})
.populate({
path: "fieldB",
populate:{path:"fieldC"}
}).exec(function (err, result) {
//this is how you can get all key value pair of SchemaA, SchemaB and SchemaC
//example: result.fieldB.fieldC._id(key of SchemaC)
});
Why not add a ref to SchemaC on SchemaA? The way you currently have it, there is no way to bridge from SchemaA to SchemaC without going through SchemaB, unless you populate SchemaB with no other data than its ref to SchemaC.
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field just like when using select().
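Field selection can also be combined with a nested populate, so that SchemaB contributes nothing but its fieldC reference. This is an untested sketch that assumes a Mongoose version with nested populate support; the variable name is made up.

```javascript
// Populate options: load fieldB, keep only its fieldC reference,
// then populate that reference with the SchemaC document.
var populateBThroughToC = {
  path: 'fieldB',
  select: 'fieldC',              // skip SchemaB's other fields
  populate: { path: 'fieldC' }   // follow the reference into SchemaC
};

// Usage:
// SchemaA.find({}).populate(populateBThroughToC).exec(function (err, docs) {
//   // docs[i].fieldB.fieldC is the SchemaC document
// });
```

SchemaB still has to be queried, since it is the only place the link to SchemaC is stored, but only the one field is fetched.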
I think this is not possible. When a document in A refers to a document in B, and that document refers to another document in C, how can the document in A know which document in C to refer to without any help from B?
I have two models in my app: Item and Comment. An Item can have many Comments, and a Comment instance contains a reference to an Item instance with key 'item', to keep track of the relationship.
Now I have to send a JSON list of all Items with their Comment count when user requests on a particular URL.
function(req, res){
return Item.find()
.exec(function(err, items) {
return res.send(items);
});
};
I am not sure how I can "populate" the comment count onto the items. This seems to be a common problem, and I tend to think there should be a nicer way of doing this than brute force.
So please share your thoughts. How would you "populate" the Comment count to the Items?
Check the MongoDB documentation and look for the findAndModify() method. With it you can atomically update a document, e.g. add a comment and increment the comment counter at the same time.
findAndModify
The findAndModify command atomically modifies and returns a single document. By default, the returned document does not include the modifications made on the update. To return the document with the modifications made on the update, use the new option.
Example
Use the update option, with update operators $inc for the counter, and $addToSet for adding the actual comment to an embedded array of comments.
db.runCommand(
{
findAndModify: "item",
query: { name: "MyItem", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $inc: { commentCount: 1 },
$addToSet: {comments: new_comment} }
}
)
See:
MongoDB: findAndModify
MongoDB: Update Operators
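From Mongoose, the closest equivalent is probably findOneAndUpdate. The sketch below assumes the Item schema has an embedded `comments` array and a `commentCount` number, as in the shell example above, and uses $push instead of $addToSet so that the counter and the array length stay in step even when two comments are identical. The function name is made up.

```javascript
// Atomically append a comment and bump the counter on one item.
// { new: true } asks for the document as it looks after the update.
function addComment(Item, itemId, newComment) {
  return Item.findOneAndUpdate(
    { _id: itemId },
    { $inc: { commentCount: 1 }, $push: { comments: newComment } },
    { new: true }
  );
}

// Usage:
// addComment(Item, someItemId, { text: 'Nice!' }).then(function (item) { /* ... */ });
```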
I did some research on this issue and came up with the following results. First, the MongoDB docs suggest:
In general, use embedded data models when:
you have “contains” relationships between entities.
you have one-to-many relationships where the “many” objects always appear with or are viewed in the context of their parent documents.
So in my situation, it makes much more sense if Comments are embedded into Items, instead of having independent existence.
Nevertheless, I was curious to know the solution without changing my data model. As mentioned in MongoDB docs:
Referencing provides more flexibility than embedding; however, to
resolve the references, client-side applications must issue follow-up
queries. In other words, using references requires more roundtrips to
the server.
Since multiple roundtrips are acceptable, I came up with the following solution:
var showList = function(req, res){
// first DB roundtrip: fetch all items
return Item.find()
.exec(function(err, items) {
// second DB roundtrip: fetch comment counts grouped by item ids
// the pipeline is passed as an array of stages
Comment.aggregate([{
$group: {
_id: '$item',
count: {
$sum: 1
}
}
}], function(err, agg){
// iterate over comment count groups (yes, that little dash is underscore.js)
_.each(agg, function( itr ){
// for each aggregated group, search for corresponding item and put commentCount in it
var item = _.find(items, function( item ){
return item._id.toString() == itr._id.toString();
});
if ( item ) {
item.set('commentCount', itr.count);
}
});
// send items to the client in JSON format
return res.send(items);
})
});
};
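The merge step inside the second callback can also be written without underscore.js. The sketch below indexes the aggregated counts by item id first, so each item is matched with a single lookup instead of a search through the whole list. It returns plain objects, and items that have no comments get a count of 0. The function name is made up.

```javascript
// Combine the items with the per-item counts produced by the $group stage.
// `agg` is an array of { _id: <item id>, count: <number> }.
function attachCommentCounts(items, agg) {
  const countsByItemId = new Map();
  agg.forEach(function (group) {
    countsByItemId.set(String(group._id), group.count);
  });
  return items.map(function (item) {
    // Mongoose documents expose toObject(); plain objects are copied as-is.
    const plain = typeof item.toObject === 'function'
      ? item.toObject()
      : Object.assign({}, item);
    plain.commentCount = countsByItemId.get(String(plain._id)) || 0;
    return plain;
  });
}

// Usage inside the aggregate callback:
// return res.send(attachCommentCounts(items, agg));
```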
Agree? Disagree? Please enlighten me with your comments!
If you have a better answer, please post here, I'll accept it if I find it worthy.
I have two collections:
Users
Uploads
Each upload has a User associated with it and I need to know their details when an Upload is viewed. Is it best practice to duplicate this data inside the Uploads record, or use populate() to pull in these details from the Users collection referenced by _id?
OPTION 1
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
_user: { type: Schema.ObjectId, ref: 'users'},
title: { type: String },
});
OPTION 2
var UploadSchema = new Schema({
_id: { type: Schema.ObjectId },
user: {
name: { type: String },
email: { type: String },
avatar: { type: String },
//...etc
},
title: { type: String },
});
With 'Option 2' if any of the data in the Users collection changes I will have to update this across all associated Upload records. With 'Option 1' on the other hand I can just chill out and let populate() ensure the latest User data is always shown.
Is the overhead of using populate() significant? What is the best practice in this common scenario?
If you need to query on your users, keep users as their own collection. If you need to query on your uploads, keep uploads as their own collection.
Other questions you should ask yourself: every time I need this data, do I also need the embedded objects (and vice versa)? How often will this data be updated? How often will it be read?
Think about a friendship request:
Each time you need the request, you also need the user who made it, so embed the request inside the user document.
You will be able to create an index on the embedded object too, and your search will be a single query: fast and consistent.
Just a link to my previous reply on a similar question:
Mongo DB relations between objects
I think this post will be right for you http://www.mongodb.org/display/DOCS/Schema+Design
Use Cases
Customer / Order / Order Line-Item
Orders should be a collection. customers a collection. line-items should be an array of line-items embedded in the order object.
Blogging system.
Posts should be a collection. post author might be a separate collection, or simply a field within posts if only an email address. comments should be embedded objects within a post for performance.
Schema Design Basics
Kyle Banker, 10gen
http://www.10gen.com/presentation/mongosf2011/schemabasics
Indexing & Query Optimization
Alvin Richards, Senior Director of Enterprise Engineering
http://www.10gen.com/presentation/mongosf-2011/mongodb-indexing-query-optimization
These two videos are the best I have ever seen on MongoDB, IMHO.
Populate() is just a query. So the overhead is whatever the query is, which is a find() on your model.
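To make that concrete, here is roughly what populating `_user` by hand amounts to: one extra find with $in over the referenced ids, then stitching the results back onto the uploads. This is a simplified sketch of the idea, not Mongoose's actual implementation; it assumes `uploads` are plain (lean) objects, and the function name is made up.

```javascript
// Manually resolve the `_user` reference on a list of uploads.
async function attachUsers(uploads, User) {
  // One query for all referenced users, however many uploads there are.
  const userIds = uploads.map(function (upload) { return upload._user; });
  const users = await User.find({ _id: { $in: userIds } });

  // Index the users by id so each upload is matched with one lookup.
  const usersById = new Map(users.map(function (user) {
    return [String(user._id), user];
  }));

  // Return copies of the uploads with the id replaced by the user (or null).
  return uploads.map(function (upload) {
    return Object.assign({}, upload, {
      _user: usersById.get(String(upload._user)) || null
    });
  });
}
```

So the overhead is one additional indexed lookup on _id per populated path, not one query per upload.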
Also, best practice for MongoDB is to embed what you can. It will result in a faster query. It sounds like you'd be duplicating a ton of data though, which puts relations(linking) at a good spot.
"Linking" is just putting an ObjectId in a field from another model.
Here is the Mongo Best Practices http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesign-SummaryofBestPractices
Linking/DBRefs http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-SimpleDirect%2FManualLinking