MongoDB performances $ref vs embedded - node.js

I recently started a project using MongoDB and Node.js to build a RESTful web service. MongoDB is very new to me, and coming from the relational database world I have a lot of questions.
Here is my problem:
The goal is to build a sort of content management system with social features: a user can post topics, which can then be shared and commented on.
I see two ways to do this: either reference the topics posted by a user, or embed the topics in the user document.
So basically I can have these two schemas:
var UserSchema = new Schema({
username: {
type: String,
unique: true,
required: true
},
password: {
type: String,
required: true
},
name: {
type: String
},
first_name: String,
phone: String,
topics: [Topic.schema]
});
var TopicSchema = new Schema({
_creator: {
type: String,
ref: 'User'
},
description: String,
comments: [Comments.schema],
shared_with: [{
type: Schema.ObjectId,
ref: 'User'
}] //[{ type: String, ref: 'User'}]
});
var CommentSchema = new Schema({
_creator: {
type: String,
required: true
},
text: {
type: String,
required: true
},
});
and
var UserSchema = new Schema({
username: {
type: String,
unique: true,
required: true
},
password: {
type: String,
required: true
},
name: {
type: String
},
first_name: String,
phone: String,
topics: [{ type: Schema.ObjectId, ref: 'Topics'}]
});
var TopicSchema = new Schema({
_creator: {
type: String,
ref: 'User'
},
description: String,
comments: [Comments.schema],
shared_with: [{
type: Schema.ObjectId,
ref: 'User'
}] //[{ type: String, ref: 'User'}]
});
var CommentSchema = new Schema({
_creator: {
type: String,
required: true
},
text: {
type: String,
required: true
},
});
So the first schema uses one collection of user documents, while the second uses one collection for users and one for topics. The second implies, for example, two find queries to retrieve a user and their topics, but it also makes it easier to query the topics directly.
Here is the query I use to retrieve a specific topic, with some user info, using the first schema:
User.aggregate([
{$match: {
"topics._id":{$in:[mongoose.Types.ObjectId('56158c314861d2e60d000003')]}
}},
{ $unwind:"$topics" },
{$match: {
"topics._id":{$in:[mongoose.Types.ObjectId('56158c314861d2e60d000003')]}
}},
{ $group: {
_id: {
_id:"$_id",
name:"$name",
first_name:"$first_name"
},
topics:{ "$push": "$topics"}
}}
]);
So the question is: what do you think? Which schema is the better one in your opinion?
Thanks in advance.

Better solution: using a reference to get topics posted by a user
For this use case, you need to consider MongoDB's BSON document size limit (16MB), which applies regardless of storage engine. Putting the user, topics, and comments in one document allows the document to grow without bound. If each topic is a page of text (about 1KB), then each user could have about 16,000 topics before the limit is reached. That seems huge, but what happens if you decide to put images, videos, or sounds in the topic as the product matures? Converting from an embedded to a normalized schema later would be a lot more work than a simple design choice today.
Similarly, if the comments could grow to cause a topic to exceed the 16MB limit, they should be in a separate collection. Unlikely? Probably. But if you are writing something that will become, say, the Huffington Post - check out comments on their popular articles.
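The 16,000-topic estimate above is simple division, and the same arithmetic shows how quickly the headroom shrinks once topics carry media. This is a rough sketch: `topicsPerDocument` is a hypothetical helper, the 200 KB image size is an arbitrary illustration, and the calculation ignores the BSON overhead of field names and of the user's own fields, so real capacity is somewhat lower.

```javascript
// MongoDB caps a single BSON document at 16 MiB.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: how many embedded items of a given average size
// fit in one document, ignoring field-name and other BSON overhead.
function topicsPerDocument(avgTopicBytes) {
  return Math.floor(MAX_BSON_BYTES / avgTopicBytes);
}

console.log(topicsPerDocument(1024));       // 16384 topics of ~1 KB of text
console.log(topicsPerDocument(200 * 1024)); // 81 topics once each carries ~200 KB
```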
Here is mongo's advice on data model design
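With the reference-based schema, a topic can be read straight from its own collection instead of being unwound out of a user document. The sketch below assumes `Topic` and `User` are the Mongoose models compiled from the second pair of schemas in the question, and that `_creator` stores the user's `_id`; the helper names are made up for illustration.

```javascript
// Hypothetical helpers; `Topic` and `User` are Mongoose models passed in by the caller.

// One topic by id: a single lookup on the topics collection.
function findTopic(Topic, topicId) {
  return Topic.findById(topicId);
}

// All topics posted by one user (assumes `_creator` holds the user's _id).
function findTopicsByCreator(Topic, userId) {
  return Topic.find({ _creator: userId });
}

// The second query the question mentions: a few user fields for a topic's creator.
function findCreator(User, topic) {
  return User.findById(topic._creator).select('name first_name');
}
```

Each helper returns a Mongoose query, so it can be awaited or finished with `.exec()`. An index on `_creator` would probably be needed to keep `findTopicsByCreator` from scanning the whole collection.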

Related

Is this an efficient nosql database design idea if I am storing an unbounded array of references in a document?

I am learning nosql database design by developing a nodejs backend for a social network type application.
User schema in my backend is this:
userSchema = new mongoose.Schema({
name: {
type: String,
required: true
},
email: {
type: String,
required: true
},
hashed_password: {
type: String,
required: true
},
isVerified: {
type:Boolean,
default: false
},
emailVerificationToken: String,
emailVerificationTokenExpires: Date,
followers: [{ type: ObjectId, ref: 'User'}],
following: [{ type: ObjectId, ref: 'User'}],
pins: [{type: ObjectId, ref: 'Pin'}],
bio: String,
website: String
});
I am storing an array of references to followers, which is an unbounded array. I know that the maximum document size is 16 MB. I want to develop a backend that scales to millions of users. Should I have a separate schema to store the followers and following data, or is the current database design fine?
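One option that is often suggested for unbounded relationships like this is to store each follow as its own small document instead of growing arrays on the user. The sketch below is an illustration under assumptions, not a benchmarked design: the `Follow` model, its field names, and the `defineFollowModel` helper are all hypothetical, and `mongoose` is passed in by the caller.

```javascript
// Hypothetical edge collection: one document per "A follows B" relationship.
function defineFollowModel(mongoose) {
  const ObjectId = mongoose.Schema.Types.ObjectId;

  const followSchema = new mongoose.Schema({
    follower: { type: ObjectId, ref: 'User', required: true }, // who follows
    followee: { type: ObjectId, ref: 'User', required: true }  // who is followed
  });

  // Prevents duplicate follows and serves "whom does X follow?" queries.
  followSchema.index({ follower: 1, followee: 1 }, { unique: true });
  // Serves "who follows X?" queries.
  followSchema.index({ followee: 1 });

  return mongoose.model('Follow', followSchema);
}
```

With this layout, a follower count becomes a query such as `Follow.countDocuments({ followee: userId })`, and the user document stays small no matter how popular the account becomes.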

Right way to store Schemas on Mongoose?

I started learning some NodeJS, and how to make a REST API from Academind on YouTube, and learned what a relational and non-relational database is, etc.
With MongoDB, writes are rather cheap, so I want to minimize the number of reads that I do. At the moment I am trying to work out how to build an API for an app similar to Discord, although it's just for fun.
Is this the right way to make a Schema?
const mongoose = require('mongoose')
const userSchema = mongoose.Schema({
_id: mongoose.Schema.Types.ObjectId,
name: { type: String, required: true, unique: true},
email: { type: String, required: true },
password: { type: String, required: true }, // TODO: Hashing, etc
guilds: [{
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true},
channels: [{
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true},
// Only the X most recent messages
messages: [{
_id: mongoose.Schema.Types.ObjectId,
message: {type: String, required: true},
user: {
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
}
}]
}],
// Only an X amount of users
users: [{
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
}]
}]
})
module.exports = mongoose.model('User', userSchema)
And then for the Guilds,
const mongoose = require('mongoose')
const guildSchema = mongoose.Schema({
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true},
channels: [{
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true},
// Only an X amount of messages
messages: [{
_id: mongoose.Schema.Types.ObjectId,
message: {type: String, required: true},
user: {
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
}
}]
}],
// All the users
users: [{
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
}]
})
module.exports = mongoose.model('Guild', guildSchema)
Channel Schema
const mongoose = require('mongoose')
const channelSchema = mongoose.Schema({
_id: mongoose.Schema.Types.ObjectId,
name: { type: String, required: true },
guild: {
_id: mongoose.Schema.Types.ObjectId,
name: { type: String, required: true },
channels: [{
_id: mongoose.Schema.Types.ObjectId,
name: { type: String, required: true }
}],
// The users of the guild, or just the channel?
// Could add a users object outside of the guild object
users: [{
_id: mongoose.Schema.Types.ObjectId,
name: { type: String, required: true }
}]
}
})
module.exports = mongoose.model('Channel', channelSchema)
And finally for the messages
const mongoose = require('mongoose')
const messageSchema = mongoose.Schema({
_id: mongoose.Schema.Types.ObjectId,
user: {
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
},
message: {type: String, required: true},
channel: {
guild: {
_id: mongoose.Schema.Types.ObjectId,
name: {type: String, required: true}
// Store more data for each message?
}
}
})
module.exports = mongoose.model('Message', messageSchema)
I am not sure if this is what a non-relational schema should look like. If it's not, how should I store the data that I need?
Also, let's say that I POST a message on channel X of guild Y, with users A, B and C: how would I update all the entries to add the message?
I've only used User.find({_id: id}).exec().then().catch() so far, so I am not sure how to update them.
Thanks in advance!
The messages collection should be on its own; do not embed it into any other collection. It is not a good idea to embed data that will grow without limit.
The idea of storing the last 5 messages in another collection looks painful to implement.
Embedding denormalised data from all collections into the users collection looks like a problem as soon as you have to update guilds, channels, or guild users.
You may embed channels into guilds. Channels will not grow without limit: there should be a reasonable number, fewer than 100 channels per guild, and they are probably always used together with the guild they belong to. If not, consider not embedding channels into guilds.
The power of MongoDB is that you can build a schema that reflects how your app uses its data. I would recommend starting with normalized data. When problems with creating, reading, updating, or deleting data occur, make the appropriate changes in your Mongoose schema to solve them. Premature optimization will only hurt in the long run.
As always, the answer depends on the details. Since I do not know all of them, I would recommend the three-part article by William Zola, Lead Technical Support Engineer at MongoDB: part 1, part 2, part 3.
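The advice above could translate into something like the following sketch: channels embedded in the guild, messages in their own collection pointing back at the channel. Everything here is an assumption for illustration (the `defineChatModels` helper, the `channel` and `createdAt` fields, the index); `mongoose` is passed in by the caller.

```javascript
// Hypothetical layout following the answer: embed bounded data, reference unbounded data.
function defineChatModels(mongoose) {
  const ObjectId = mongoose.Schema.Types.ObjectId;

  // Channels are few per guild, so they live inside the guild document.
  const guildSchema = new mongoose.Schema({
    name: { type: String, required: true },
    channels: [{ name: { type: String, required: true } }]
  });

  // Messages grow without limit, so each one is its own document.
  const messageSchema = new mongoose.Schema({
    channel: { type: ObjectId, required: true }, // _id of the embedded channel
    user: { type: ObjectId, ref: 'User', required: true },
    message: { type: String, required: true },
    createdAt: { type: Date, default: Date.now }
  });

  // Supports "latest messages in channel X" without scanning other channels.
  messageSchema.index({ channel: 1, createdAt: -1 });

  return {
    Guild: mongoose.model('Guild', guildSchema),
    Message: mongoose.model('Message', messageSchema)
  };
}
```

Posting a message then touches one collection only, for example `Message.create({ channel, user, message })`, and the recent history of a channel is a query such as `Message.find({ channel }).sort({ createdAt: -1 }).limit(50)`. No user or guild document has to be updated.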

Mongodb, mongoose, Schema structure. get a collection into a field of other collection

I have a "Questions" schema with about a dozen questions in it. I can add and delete those questions, and I need this collection reflected in a field of another collection, "User", with one additional field (nested in options).
Question Schema:
var QuestionSchema = new mongoose.Schema({
key: { type: String, required: true },
label: { type: String, required: true },
name: { type: String, required: true },
page: { type: String, required: true },
type: { type: String, required: true },
options: [{
key: {type: String, required: true},
value: {type: String, required: true}
}],
});
User Schema:
var UserSchema = new mongoose.Schema({
Name: { type: String, required: true },
Email: { type: String, required: true, unique: true },
Password: { type: String, required: true },
//this is where I need to reflect a Questions collection on each user,
//so that it will look something like this//
Questions: [{
key: {type: String, required: true},
//here can be all other fields from Questions collection, that is not a problem
options: [{
key: {type: String, required: true},
value: {type: String, required: true},
counter: {type: Number, default: 0} //this is the additional field
}]
}],
//
Notifications: [{
Title: { type: String },
Data: { type: String },
Created: { type: Date, default: Date.now }
}]
});
I can't figure out how to do that.
I have another collection of users, say User2, who will answer the questions from the Questions collection. I need to keep track, on the User schema (not User2; there I just save questions and answers), of how many times each option of a question is chosen.
A Questions entry can look like this:
{
key: 'Haveyouseenthismovie',
label: 'Have you seen this movie?',
name: 'Have you seen this movie?',
page: '1',
type: 'dropdown',
options: [{
key: 'yes',
value: 'yes'
}, {
key: 'no',
value: 'no'
}]
}
I want it to work like that (a collection reflected in a field of each User) so that I don't have to check whether the question is already in the User document and add it if not, then check whether the option I need exists, add it if not, and finally increment it (the option being the one the user selected for that question in the Questions schema). That looks like a pain. So I figured it would be better if that field reflected the collection, and I would just increment the option I need on the question I need.
Please help me figure this out. I don't have enough practice with Mongo, so I struggle with it sometimes :)
I don't think there is a way to reflect a collection in another document the way you seem to want.
As I understand it, the following options are available to you:
Embed the entire question document inside the User documents in the User collection.
Just maintain the '_id' of the question document in the User document in the User collection.
Please read about data modelling concepts and maintaining relationships between documents on the MongoDB documentation page: https://docs.mongodb.com/manual/applications/data-models-relationships/
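If the questions are embedded in each user document (the first option), the counter the question asks about can be incremented in place with MongoDB's filtered positional operator. A minimal sketch, assuming the `Questions`, `options`, and `counter` field names from the User schema above; `buildCounterIncrement` is a hypothetical helper, and `arrayFilters` requires MongoDB 3.6 or later.

```javascript
// Hypothetical helper: builds the arguments for User.updateOne(...) that bump
// the counter of one option inside one embedded question.
function buildCounterIncrement(userId, questionKey, optionKey) {
  return {
    filter: { _id: userId },
    // $[q] and $[o] are placeholders resolved by the arrayFilters below.
    update: { $inc: { 'Questions.$[q].options.$[o].counter': 1 } },
    options: {
      arrayFilters: [{ 'q.key': questionKey }, { 'o.key': optionKey }]
    }
  };
}

// Usage (assuming `User` is the compiled Mongoose model):
// const { filter, update, options } = buildCounterIncrement(id, 'Haveyouseenthismovie', 'yes');
// await User.updateOne(filter, update, options);
```

The update only touches options that already exist in the user's `Questions` array; it does not add a missing question or option, so those would still have to be copied in when a question is created.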

Dont know how to populate Mongoose Query

I have a problem with a mongoose population and I don't know what I should do.
I got two schemas:
var userSchema = new mongoose.Schema({
username: { type: String, required: true, unique: true },
password: { type: String, required: true },
mods: [{ type: mongoose.Schema.Types.ObjectId, ref: 'users'}]
});
var dataSchema = mongoose.Schema({
title: { type: String, required: true, unique: true },
description: { type: String, required: true },
owner: {type: mongoose.Schema.Types.ObjectId, required: true}
});
So one user can have several data packages.
Some users are moderated by other users.
What is the query for a moderator that lists all of his own data packages plus those of the users he moderates?
You see that I have a SQL background and there's definitely another way to do it with MongoDB.
Thanks for your help!
I don't clearly understand which queries you need, but first you need to set the ref property on the 'owner' field in dataSchema. As for population, it looks like this:
//if you use callback
users.find({/*your query*/}).populate('mods')
.exec((err, result)=>{/*your code*/});
//if you use promise
users.find({/*your query*/}).populate('mods').exec()
.then(result=>{/*your code*/})
.catch(err=>{throw err});
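For the query the question actually asks about (a moderator's own packages plus those of the users they moderate), one possible approach is an `$in` filter on `owner`. This sketch assumes that `mods` on a user lists the ids of the users that user moderates, which the question does not state explicitly; `buildPackageFilter` is a hypothetical helper.

```javascript
// Hypothetical helper: filter for every data package owned by the moderator
// or by any user listed in the moderator's `mods` array (assumed meaning).
function buildPackageFilter(moderator) {
  const ownerIds = [moderator._id, ...(moderator.mods || [])];
  return { owner: { $in: ownerIds } };
}

// Usage (assuming `users` and `data` are the compiled Mongoose models, and
// that `owner` has been given a `ref` so it can be populated):
// const moderator = await users.findById(moderatorId);
// const packages = await data.find(buildPackageFilter(moderator)).populate('owner');
```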

Tinder like application MongoDB database schema

I am developing an app like Tinder to experiment with MongoDB.
I am wondering about the database schema.
The main idea is that a user can "like" many users. No matter how much the number of "liked" profiles grows, it is very unlikely to hit the 16MB document size ceiling, so in my design the "liked" profiles are embedded inside the user's own profile.
Below is a sample of my users schema using Mongoose:
var UserSchema = mongoose.Schema({
fullName: {
type: String,
trim: true
},
phone: {
type: String,
trim: true,
required: true,
},
gender: {
type: String,
enum: ['male', 'female'],
},
age: {
type: Number,
required: true
},
favorites: []
});
On the other hand, a user might be "disliked" by many users.
A user should not see, on his next profile search, the profiles of users who "disliked" him. So in my design I created a collection that holds the ID of the user who "disliked" and the ID of the user being "disliked".
Below is a sample of my blocked schema using Mongoose:
var BlockedSchema = mongoose.Schema({
BlockerUserId: {
type: String,
required: true
},
BlockedUserId: {
type: String,
required: true
}
});
Do you think this is a good approach? And which indexes need to be created?
Best,
You can manage dislikes in the user collection itself; you don't need a new collection.
var UserSchema = mongoose.Schema({
fullName: {
type: String,
trim: true
},
phone: {
type: String,
trim: true,
required: true,
},
gender: {
type: String,
enum: ['male', 'female'],
},
age: {
type: Number,
required: true
},
favorites: [],
dislike: []
});
and search like
var current_user_id = userdata._id;
db.users.find({dislike:{$ne:current_user_id}})
The above code is only a rough sketch, but it should give you the idea.
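Written against Mongoose rather than the mongo shell, that search might look like the following sketch. It assumes `dislike` on each user holds the ids of the users that user has disliked, and that `User` is the compiled model; `findVisibleProfiles` is a hypothetical name, and excluding the user's own profile is an extra condition not present in the answer.

```javascript
// Hypothetical helper: profiles the current user may see, i.e. everyone
// except themselves and the users who have disliked them.
function findVisibleProfiles(User, currentUserId) {
  return User.find({
    _id: { $ne: currentUserId },    // do not show the user their own profile
    dislike: { $ne: currentUserId } // on an array field, $ne means "does not contain"
  });
}
```

Whether an index on `dislike` helps would need measuring on real data, since `$ne` conditions tend not to be very selective.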

Resources