Mongoose/ MongoDB Database Design - node.js

I always have a certain fixed structure in my model (GroupName) and a dynamic part of 1-x (Members).
Group1
    GroupName
    Member 1
    Member 2
Group2
    GroupName
    Member 1
Group3
    GroupName
    Member 1
    Member 2
    Member 3
Is it better to use two tables and connect them later via ids like this:
Groups:
    Group1
        GroupName
        GroupId
    Group2
        GroupName
        GroupId
Members:
    Member 1
        GroupId
    Member 2
        GroupId
or to use Schema.Types.Mixed (or anything else)? And how would I do it the second way?
I will always use them in combination later. My gut feeling is to choose the first method:
http://blog.mongolab.com/2013/04/thinking-about-arrays-in-mongodb/
EDIT:
But even with the second method I have the issue that one member can belong to multiple groups, and I don't want to store him twice. Each group is unique and exists only once.
But I'm new to MongoDB, so I want to learn which option is best and why.
EDIT II:
I have chosen to divide it into two documents. Is this implementation of the schemas correct?
var mongoose = require('mongoose');
// define the schema for group model
var groupSchema = mongoose.Schema({
  href: {
    type: String,
    required: true,
    unique: true
  },
  title: String,
  members: [id: Schema.Types.ObjectId, name: String]
});
// create the model for users and expose it to our app
module.exports = mongoose.model('group', groupSchema);
&&
var mongoose = require('mongoose');
// define the schema for member model
var memberSchema = mongoose.Schema({
  id: {
    type: Schema.Types.ObjectId,
    required: true,
    unique: true
  },
  amount: String,
  name: String
});
// create the model for users and expose it to our app
module.exports = mongoose.model('member', memberSchema);

There is an excellent post on the MongoDB blog which tells us about the various ways a schema can be designed based on the model relationships.
I believe the best schema for you would be to make use of embedded arrays with the member IDs.
// Group
{
  _id: '1234',
  name: 'some group',
  members: [
    'abcd',
    'efgh'
  ]
}
EDIT
There is a correction needed in the schema:
var mongoose = require('mongoose');
var Schema = mongoose.Schema; // Schema.Types.ObjectId below needs this
// define the schema for group model
var groupSchema = new Schema({
  href: {
    type: String,
    required: true,
    unique: true
  },
  title: String,
  // The array element is an object, so it needs to be enclosed in braces
  members: [{ id: Schema.Types.ObjectId, name: String }]
});
// create the model for groups and expose it to our app
module.exports = mongoose.model('group', groupSchema);

I don't know what your documents contain or whether members is a growing array, for example whether Group1 can have 1-n members at any given moment. If that is the case, you should go with option 2. Try something like:
{gId: 1, mId: 5}
That design is best suited to a social graph. Your Group documents will have a fixed size, which is good for memory, and you can easily get all the members of a group (just don't forget to index gId and mId).
If each group has a fixed number of members (or one that does not grow and shrink too much), then go with option 1.
There is a great post by the MongoDB team (with source code) that talks about this design:
Socialite
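To make option 2 concrete, here is a minimal sketch in plain JavaScript. The in-memory arrays are hypothetical stand-ins for three collections (groups, members, and one small link document per pair, shaped like {gId: 1, mId: 5}); the sample names and both helper functions are assumptions made for illustration. In MongoDB, each filter would be a query on the indexed gId or mId field.

```javascript
// In-memory stand-ins for three collections (hypothetical sample data).
const groups = [
  { gId: 1, name: 'Group1' },
  { gId: 2, name: 'Group2' },
];
const members = [
  { mId: 5, name: 'Member 1' },
  { mId: 6, name: 'Member 2' },
];
// One small document per (group, member) pair, as in {gId: 1, mId: 5}.
const memberships = [
  { gId: 1, mId: 5 },
  { gId: 1, mId: 6 },
  { gId: 2, mId: 5 }, // Member 1 is in two groups but stored only once
];

// All members of one group: filter the links by gId, then resolve each mId.
function membersOfGroup(gId) {
  const ids = new Set(memberships.filter((m) => m.gId === gId).map((m) => m.mId));
  return members.filter((m) => ids.has(m.mId));
}

// All groups of one member: the same lookup in the other direction.
function groupsOfMember(mId) {
  const ids = new Set(memberships.filter((m) => m.mId === mId).map((m) => m.gId));
  return groups.filter((g) => ids.has(g.gId));
}

console.log(membersOfGroup(1).map((m) => m.name)); // [ 'Member 1', 'Member 2' ]
console.log(groupsOfMember(5).map((g) => g.name)); // [ 'Group1', 'Group2' ]
```

Because the link documents carry only two ids, neither the group nor the member document grows as memberships are added, and a member who belongs to several groups is still stored once.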

Related

Mongoose multi ref specifying the model

I have the following schema:
var user = Schema({
  id: Number,
  name: String,
  surname: String,
  role: { type: Schema.Types.ObjectId, ref: "" }, // member or crew
  property: Number
});
var member = Schema({
  cod_id: Number,
  aa: String,
  bb: String,
});
var crew = Schema({
  cod_id: Number,
  cc: String,
  dd: String,
});
Member and crew are both users, but they have different attributes.
The only attributes they share are name, surname, role and property.
I would like to know whether the role attribute in user can refer to either the member model or the crew model, depending on the user.
The motivation is to keep the property attribute in a single model instead of putting it in both member and crew. Otherwise every search needs two queries, one on the member model and one on the crew model, and I would have to hope there are no duplicates.
Can you give me some advice?

Mongoose Many to many relations

I'm new to mongoDB and Mongoose, and I have some problems with relations.
My schema has 3 tables (User / Person / Family), you can see it below.
var mongoose = require('mongoose'),
  Schema = mongoose.Schema;
var userSchema = Schema({
  _id: Schema.Types.ObjectId,
  email: String,
  person: [{ type: Schema.Types.ObjectId, ref: 'Person' }] // A user is linked to 1 person
});
var personSchema = Schema({
  _id: Schema.Types.ObjectId,
  name: String,
  user: [{ type: Schema.Types.ObjectId, ref: 'User' }], // A person is linked to 1 user
  families: [{ type: Schema.Types.ObjectId, ref: 'Family' }] // A person has 1,n families
});
var familySchema = Schema({
  _id: Schema.Types.ObjectId,
  name: String,
  persons: [{ type: Schema.Types.ObjectId, ref: 'Person' }] // A family has 0,n persons
});
var User = mongoose.model('User', userSchema);
var Person = mongoose.model('Person', personSchema);
var Family = mongoose.model('Family', familySchema);
I don't know if my schema is good. Is the person parameter required in my userSchema? The information will be duplicated: userSchema will hold the person ID and personSchema will hold the user ID.
If I understand correctly, these duplicated values are useful for my queries? But if the information is duplicated, do I need to execute two queries to update both tables?
For example, say I have a person with a family (the families parameter in personSchema), and the family contains this person (the persons parameter in familySchema). What would the queries be to remove or update these entries?
Thanks
IMHO, your schema is fine as long as it meets your needs without being bloated.
"Person" seems to be the only type of user and the entity that connects to all the other entities. As long as that is the case, you can remove the person parameter from userSchema, since you can reach the user information from the person. If another entity existed, say "Aliens" with their own unique families, it would be better to add both the alien and the person parameter to the User schema to distinguish the types of users. If you still want to keep the parameter, make the following change as well, because you are declaring an array although the relation appears to be one-to-one:
var userSchema = Schema({
  _id: Schema.Types.ObjectId,
  email: String,
  person: { type: Schema.Types.ObjectId, ref: 'Person' } // A user is linked to 1 person; the [] has been removed
});
var personSchema = Schema({
  _id: Schema.Types.ObjectId,
  name: String,
  user: { type: Schema.Types.ObjectId, ref: 'User' }, // removed [] here too
  families: [{ type: Schema.Types.ObjectId, ref: 'Family' }]
});
Yes, you will need to update both entities, Person and Family, if you want to keep them consistent. But it can be done in one request/mutation.
The order of the updates depends on the flow of your business logic. Let's say "Homer" is a Person who is a new member of the Simpson Family.
In that case you would add "Homer" to the Family collection (table) and then push the ObjectId of the Simpson family to the Person entity.
I have added a sample of adding Homer to the Simpson family below. Hope this helps :)
addNewFamilyMember: async (_, { personID, familyID }) => {
  // Load both sides first so nothing is written if either document is missing
  const family = await Family.findById(familyID);
  const person = await Person.findById(personID);
  if (!family) {
    throw new Error('Family not found!');
  }
  if (!person) {
    throw new Error('Person not found!');
  }
  // "$addToSet" inserts into the array only if the value is not there already,
  // so "Homer" is added to the Simpson family exactly once
  const updatedFamily = await Family.findByIdAndUpdate(
    familyID,
    { $addToSet: { persons: personID } },
    { new: true }
  );
  // Mirror the link on the other side: add the Simpson family's ObjectId to "Homer"
  await Person.findByIdAndUpdate(personID, { $addToSet: { families: familyID } });
  return updatedFamily;
}
For an update, it depends on what you want to change. For example, to change the name "Homer" you only have to update the Person collection: the Family collection stores a reference to Homer's ObjectId, so it always points to the updated document.
For a deletion, the approach also depends on the scenario: removing a person document, removing only the person reference from the family, or removing the family reference from the person.
Let's say you want to delete a person. You would look up the person by personId and, since person.families gives you that person's families, remove the personId from each of those families as well. You could then remove the associated user too, since the same person object holds the reference to the user.
To sum up, it depends on your choice of action and on how much cleanup you want in your schema. The process above would be different if you took a different approach.
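The deletion steps described above could look roughly like the following sketch. It assumes the revised schema (person.user is a single reference, person.families is an array of Family ids). The function name removePerson and the models parameter are hypothetical; passing the models in is just a choice that makes the sketch easy to try out. findById, updateMany and deleteOne are standard Mongoose model methods.

```javascript
// Delete a person and clean up every reference to it.
// `models` is expected to hold the Person, Family and User Mongoose models.
async function removePerson(models, personId) {
  const { Person, Family, User } = models;

  const person = await Person.findById(personId);
  if (!person) {
    throw new Error('Person not found');
  }

  // Remove the personId from the persons array of each family the person belongs to.
  await Family.updateMany(
    { _id: { $in: person.families } },
    { $pull: { persons: personId } }
  );

  // Remove the linked user, if there is one.
  if (person.user) {
    await User.deleteOne({ _id: person.user });
  }

  // Finally remove the person document itself.
  await Person.deleteOne({ _id: personId });
  return person;
}
```

These are separate writes, so a failure halfway through can leave dangling references. If that matters for your data, the same steps would need to run inside a transaction or be followed by a cleanup job.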

Best way to structure my mongoose schema: embedded array , populate, subdocument?

Here is my current Schema
Brand:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var BrandSchema = new mongoose.Schema({
  name: { type: String, lowercase: true, unique: true, required: true },
  photo: { type: String, trim: true },
  email: { type: String, lowercase: true },
  year: { type: Number },
  timestamp: { type: Date, default: Date.now },
  description: { type: String },
  location: { },
  social: {
    website: { type: String },
    facebook: { type: String },
    twitter: { type: String },
    instagram: { type: String }
  }
});
Style:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var StyleSchema = new mongoose.Schema({
  name: { type: String, lowercase: true, required: true },
});
Product:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var ProductSchema = new mongoose.Schema({
  name: { type: String, lowercase: true, required: true },
  brandId: { type: mongoose.Schema.ObjectId, ref: 'Brand' },
  styleId: { type: mongoose.Schema.ObjectId, ref: 'Style' },
  year: { type: Number },
  avgRating: { type: Number }
});
Post:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var PostSchema = new mongoose.Schema({
  rating: { type: Number },
  upVote: { type: Number },
  brandId: { type: mongoose.Schema.ObjectId, ref: 'Brand' },
  comment: { type: String },
  productId: { type: mongoose.Schema.ObjectId, ref: 'Style' },
  styleId: { type: mongoose.Schema.ObjectId, ref: 'Style' },
  photo: { type: String }
});
I'm currently making use of the mongoose populate feature:
exports.productsByBrand = function(req, res) {
  Product.find({product: req.params.id}).populate('style').exec(function(err, products) {
    res.send({products: products});
  });
};
This works. However, being a noob, I've started reading about performance issues with mongoose populate, since it's really just adding an additional query.
For my Post especially, it seems that could be taxing. The intent is for the Post to be a live Twitter/Instagram-like feed. That could mean a lot of queries, which could greatly slow my app down.
Also, I want to be able to search products / posts / brands by fields at some point.
Should I consider nesting / embedding this data (products nested / embedded in brands)?
What's the most efficient schema design, or would my setup be alright given what I've specified I want to use it for?
User story:
There will be an Admin User.
The admin will be able to add the Brand with the specific fields in the Brand Schema.
Brands will have associated Products, each Product will have a Style / category.
Search:
Users will be able to search Brands by name and location (i'm looking into doing this with angular filtering / tags).
Users will be able to search Products by fields (name, style, etc).
Users will be able to search Post by Brand Product and Style.
Post:
Users will be able to Post into a feed. When making a Post, they will choose a Brand and a Product to associate the Post with. The Post will display the Brand name, Product name, and Style -- along with newly entered Post fields (photo, comment, and rating).
Other users can click on the Brand name to link to the Brand show page. They can click on the Product name to link to a Product show page.
Product show page:
Will show Product fields from the above Schema -- including associated Style name from Style schema. It will also display Post pertaining to the specific Product.
Brand show page:
Will simply show Brand fields and associated products.
My main worry is the Post, which will have to populate / query for the Brand , Product, and Style within a feed.
Again, I'm contemplating if I should embed the Products within the Brand -- then would I be able to associate the Brand Product and Style with the Post for later queries? Or, possibly $lookup or other aggregate features.
MongoDB does not support joins in ordinary queries (the $lookup aggregation stage being the main exception), so mongoose populate resolves references on the application side with additional queries. With MongoDB you need to design your data so that:
most of your queries do not need to touch multiple collections.
after getting data from a query, you do not need to transform it much.
Consider the entities involved, and their relations:
Brand is brand. Doesn't depend on anything else.
Every Product belongs to a Brand.
Every Product is associated with a Style.
Every Post is associated with a Product.
Indirectly, every Post is associated to a Brand and Style, via product.
Now about the use cases:
Refer: If you are looking up one entity by id, then fetching 1-2 related entities is not really a big overhead.
List: This is when you have to return a large set of objects and each object needs an additional query to get its associated objects. This is a performance issue. It is usually reduced by processing the result set one "page" at a time, say 20 records per request. Let's suppose you query 20 products (using skip and limit). From those 20 products you extract two id arrays, one of the referenced styles and one of the referenced brands. You run 2 additional queries using $in:[ids], get the brand and style objects, and place them in the result set. That's 3 queries per page. Users can request the next page as they scroll down, and so on.
Search: You want to search for products, but also want to specify brand name and style name. Sadly, product model only holds ids for style and brand. Same issue with searching Posts with brand and product. Popular solution is to maintain a separate "search index", a sort of table, that stores data exactly the way it will be searched for, with all searchable fields (like brand name, style name) at one place. Maintaining such search collections in mongodb manually can be a pain. This is where ElasticSearch comes in. Since you are already using mongoose, you can simply add mongoosastic to your models. ElasticSearch's search capabilities are far greater than a DB Storage engine will offer you.
Extra Speed: There is still some room for speeding things up: Caching. Attach mongoose-redis-cache and have frequent repeated queries served, in-memory from Redis, reducing load on mongodb.
Twitter like Feeds: Now if all Posts are public then listing them up for users in chronological order is a trivial query. However things change when you introduce "social networking" features. Then you need to list "activity feeds" of friends and followers. There's some wisdom about social inboxes and Fan-out lists in mongodb blog.
Moral of the story is that not all use cases have only "db schema query" solutions. Scalability is one of such cases. That's why other tools exist.
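The "List" case above (one query for the page plus two batched $in lookups) can be sketched as follows. The function name attachBrandAndStyle and the two lookup parameters are hypothetical; each lookup is assumed to be an async function that takes an array of ids and returns the matching documents. The products are assumed to be plain objects with the brandId and styleId fields from the question's Product schema.

```javascript
// Attach brand and style documents to one page of products using two batched lookups.
// `findBrandsByIds` and `findStylesByIds` are hypothetical async functions that take
// an array of ids and return the matching documents (each with an `_id`).
async function attachBrandAndStyle(products, findBrandsByIds, findStylesByIds) {
  // Collect each referenced id once, so a brand shared by many products is fetched once.
  const brandIds = [...new Set(products.map((p) => String(p.brandId)))];
  const styleIds = [...new Set(products.map((p) => String(p.styleId)))];

  // Two extra queries in total, regardless of how many products are on the page.
  const [brands, styles] = await Promise.all([
    findBrandsByIds(brandIds),
    findStylesByIds(styleIds),
  ]);

  // Index the results by id so each product is resolved without another query.
  const brandById = new Map(brands.map((b) => [String(b._id), b]));
  const styleById = new Map(styles.map((s) => [String(s._id), s]));

  return products.map((p) => ({
    ...p,
    brand: brandById.get(String(p.brandId)) || null,
    style: styleById.get(String(p.styleId)) || null,
  }));
}
```

With Mongoose, the page itself could come from something like Product.find().skip(page * 20).limit(20).lean(), and a lookup could be (ids) => Brand.find({ _id: { $in: ids } }).lean(). The lean() calls matter here because the sketch spreads each product into a plain object.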

Is there a way to populate with a projection in Mongoose?

Say I have a collection that contains a field that references documents from another collection like follows:
ClassEnrollment
_id | student | class
---------------------
and classes in the Class collection have the following schema:
_id | className | teacher | building | time | days | classNumber | description
------------------------------------------------------------------------------
If I have a set of 3000 classes I want to populate on the server I might do something like ClassEnrollment.populate(listOfClassEnrollments, {path: 'class'});
In my situation, I don't want the majority of the class fields though, just the name. If I get the list of 3000 classes from the db with all fields, I end up taking a performance hit in the form of network latency (these 3000 classes have to be transferred from the hosted db to the server, which might be 50 MB of raw data if the descriptions are long)
Is there a way to populate the list of class enrollments with just the name through an option to populate (behind the scenes I imagine it would work like a projection, so the db just responds with the class name and _id instead of all the class information)?
You can use the select option in your populate call to do this:
ClassEnrollment.populate(listOfClassEnrollments, {path: 'class', select: 'className'});
To specify multiple fields, use a space-separated list:
ClassEnrollment.populate(
  listOfClassEnrollments,
  {path: 'class', select: 'className classNumber'}
);
Let's say we have very simple user and video schemas.
1) USER SCHEMA
import mongoose from "mongoose";
const { Schema, model } = mongoose;
const UserSchema = new Schema({
  name: String,
  email: String,
  password: String,
});
export default model("User", UserSchema);
2) VIDEOS SCHEMA
import mongoose from "mongoose";
const { Schema, model } = mongoose;
const { ObjectId } = Schema.Types;
const VideoSchema = new Schema({
  videoOwnerId: { type: ObjectId, ref: "User", required: true },
  title: { type: String, required: true },
  desc: { type: String, required: true },
});
export default model("Video", VideoSchema);
Then I want to find all videos by a specific user in the Videos collection and, at the same time, all information about this user (a user document from the Users collection), with a projection applied to it (selecting specific fields).
3) Somewhere in our code (maybe in a controller)
const videos = await Video.find({ videoOwnerId: "someId214121" }).populate("videoOwnerId", "-password");
So to populate with a projection, you call populate("videoOwnerId", "-password"), where the first argument is the field you want to populate and the second argument is the projection.
To get a document with all fields except password, for example:
populate("videoOwnerId", "-password")
To get only the specific fields you want (a string with the fields separated by whitespace):
populate("videoOwnerId", "name email")
populate("videoOwnerId", "name email")
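To show what the two projection strings above mean, here is a small plain-JavaScript illustration on an ordinary object. The function applyProjection is hypothetical and is not how Mongoose implements projections (Mongoose passes the projection to the database query); it only mimics the visible effect, assuming the usual rule that _id is kept unless it is excluded explicitly.

```javascript
// Illustration only: mimic the effect of a populate projection string on a plain object.
function applyProjection(doc, projection) {
  const fields = projection.split(/\s+/).filter(Boolean);
  const excluded = fields.filter((f) => f.startsWith('-')).map((f) => f.slice(1));
  const included = fields.filter((f) => !f.startsWith('-'));

  if (included.length > 0) {
    // Inclusion: keep only the listed fields; _id stays unless excluded with "-_id".
    const result = {};
    if (!excluded.includes('_id') && '_id' in doc) {
      result._id = doc._id;
    }
    for (const f of included) {
      if (f in doc) {
        result[f] = doc[f];
      }
    }
    return result;
  }

  // Exclusion: keep everything except the fields prefixed with "-".
  const result = { ...doc };
  for (const f of excluded) {
    delete result[f];
  }
  return result;
}

const user = { _id: 'u1', name: 'Ann', email: 'ann@example.com', password: 'secret' };
console.log(applyProjection(user, '-password')); // everything except password
console.log(applyProjection(user, 'name email')); // _id, name and email only
```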

Mongoose Populate Returning null due to Schema design?

I'm stuck with mongoose populate returning null. My situation is very similar to another question where it seems to be working just fine, with perhaps one important difference:
The model I'm referencing only exists as a subdocument of another model.
Example:
// The model I want to populate
Currency = new Schema({
  code: String,
  rate: Number
});
// The set of currencies is defined for each Tenant
// A currency belongs to one tenant, one tenant can have multiple currencies
Tenant = new Schema({
  name: String,
  currencies: [Currency]
});
Product = new Schema({
  Name: String,
  _currency: {type: ObjectId, ref: 'Currency'},
});
Customer = new Schema({
  tenant: {type: ObjectId, ref: 'Tenant'},
  products: [ Product ]
});
Then I export the models and use them in one of my routes where what I would like to do is something like
CustomerModel.find({}).populate('products._currency').exec(function(err, docs) {
  // docs[0].products[0]._currency is null (but has an ObjectId if not using populate)
})
This returns null for any given product._currency, but if I don't populate I get the correct ObjectId ref, which corresponds to the ObjectId of a currency embedded in a tenant.
I suspect I need Currency to be a stand-alone schema for this to work, i.e. not just embedded in Tenant, but that would mean I get a lot of schemas referencing each other.
Do you know if this is the case, or should my set-up work?
If this is the case, I guess I just have to bite the bullet and have a multitude of collections referencing each other?
Any help or guidance appreciated!

Resources