Setting up a complex comment model in Node.js and Mongoose

I am setting up a comment model where users can post comments on a reference (such as a project) and can also reply. The complication comes with the reply part: I want users to be able to reply to comments or to others' replies, and I am lost on how to set up my model for that.
How should I set up my model to capture that data in my replies?
Any other suggestions would also be appreciated.
Here is the model I am currently setting up:
const mongoose = require('mongoose')
const commentSchema = new mongoose.Schema({
owner: {
type: mongoose.Schema.Types.ObjectId,
required: true,
ref: 'User'
},
reference: {
type: mongoose.Schema.Types.ObjectId,
required: false,
ref: 'Project', // 'Project' || null always evaluates to 'Project'
default: null // false cannot be cast to an ObjectId
},
body: {
type: String,
required: true,
trim: true
},
reply: {
owner: {
type: mongoose.Schema.Types.ObjectId,
required: false,
ref: 'User'
},
body: {
type: String,
required: true
}
}
}, {
timestamps: true
})
const Comment = mongoose.model('Comment', commentSchema)
module.exports = Comment

If you are thinking about a model where we have
some post
  >commentA
    >replyA-a
      >replyA-a-a
        >replyA-a-a-a
    >replyA-b
  >commentB
  >commentC
I would aggregate everything for the corresponding entity:
Comment {
user,
body,
replies: [Comment] // pattern composite
}
EntityComment { // only persist this one
reference: { id, type: post|topic|whatever },
comment: [Comment]
}
Pros and cons:
an entityComment can grow big (is this problematic?)
no need for multiple fetches, everything is there
easy to "hide" some comments and just show their count (array length)
If the entityComment record becomes too big (the maximum document size is 16MB, so that is unlikely to be the limit, but the payload may be slow to load), then
we can think of saving each comment separately (using replies: [{ type: ObjectId, ref: 'Comment' }])
but maybe a better idea is to use a reference for the body (body: { type: ObjectId, ref: 'CommentBody' })
The reason is that the body is likely the culprit (data-size wise), and this would allow us to
keep everything nested in entityComment
delay the fetch to only the bodies we are interested in (not the whole hierarchy)
There are tradeoffs:
it is fine for reads
it is simpler for writes (just update/delete a single comment)
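The nested (composite) structure described above can be sketched in plain JavaScript, without Mongoose, to show how replies to replies fit in one tree. This is only an illustration under assumptions: the helper names (makeComment, findComment, addComment, countReplies), the numeric ids, and the comments property name are all hypothetical and not part of the answer.

```javascript
// Hypothetical in-memory shape mirroring the Comment / EntityComment sketch.
// In a real app this object would be the single persisted EntityComment document.
const entityComment = {
  reference: { id: 'post-1', type: 'post' },
  comments: [],
};

let nextId = 1; // stand-in for ObjectId generation

// Build a comment node; replies holds nested comments (composite pattern).
function makeComment(user, body) {
  return { id: nextId++, user, body, replies: [] };
}

// Depth-first search for a comment or reply anywhere in the tree.
function findComment(comments, id) {
  for (const comment of comments) {
    if (comment.id === id) return comment;
    const found = findComment(comment.replies, id);
    if (found) return found;
  }
  return null;
}

// Add a top-level comment when parentId is null, otherwise a reply to parentId.
function addComment(entity, parentId, user, body) {
  const comment = makeComment(user, body);
  if (parentId === null) {
    entity.comments.push(comment);
    return comment;
  }
  const parent = findComment(entity.comments, parentId);
  if (!parent) throw new Error(`No comment with id ${parentId}`);
  parent.replies.push(comment);
  return comment;
}

// Count every reply below a comment, at any depth.
function countReplies(comment) {
  return comment.replies.reduce((sum, reply) => sum + 1 + countReplies(reply), 0);
}

const a = addComment(entityComment, null, 'alice', 'commentA');
const aa = addComment(entityComment, a.id, 'bob', 'replyA-a');
addComment(entityComment, aa.id, 'carol', 'replyA-a-a');
addComment(entityComment, a.id, 'dave', 'replyA-b');
console.log(countReplies(a)); // 3
```

Because a reply is just another comment node, replying to a reply needs no extra schema, only the id of the parent.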

Related

Mongoose aggregate and append

I have a MongoDB database (latest version) that I am accessing with Mongoose (v6.5.4).
The project is using a discriminator pattern to keep all documents in the same collection.
There are many instances where I need to join documents.
Set up:
// Models:
const UserSchema = new Schema<IUser>(
{
firstName: {
type: String,
required: true,
},
lastName: {
type: String,
required: true,
},
email: {
type: String,
required: true,
unique: true,
},
});
// There are other similar models to <Team>
const TeamSchema = new Schema<ITeam>(
{
name: {
type: String,
required: true,
},
userIds: {
type: [Schema.Types.ObjectId],
required: true,
ref: "User",
default: [],
},
});
Problem:
I can use populate to return collections of Teams with userIds as an array of user objects.
Where I am stuck is querying for an array of users with an added teams[] field.
I've been trying aggregate with no success. I can loop over the users collection and return a list of Teams, but this feels wrong and expensive in terms of read units (the production database is on a pay-as-you-go service).
As data models go there is not much going for it, but it is an existing solution.
Can anyone advise?
I was being stupid. The from field in my lookup was wrong.
It should have been 'teams' (the collection name), not 'Team', which is the model name.
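For reference, a minimal sketch of what such a lookup stage might look like, assuming the teams live in a collection named 'teams' as the fix above indicates. The pipeline is written as plain data so it can be inspected on its own; the variable name usersWithTeamsPipeline is hypothetical.

```javascript
// Sketch of a $lookup that attaches teams to each user.
// 'from' takes the MongoDB collection name, not the Mongoose model name 'Team'.
const usersWithTeamsPipeline = [
  {
    $lookup: {
      from: 'teams',           // collection name (Team.collection.name in Mongoose)
      localField: '_id',       // the user's _id
      foreignField: 'userIds', // array of user ids stored on each team
      as: 'teams',             // new array field added to each user
    },
  },
];

// Hypothetical usage with the User model from the question:
// const users = await User.aggregate(usersWithTeamsPipeline);
console.log(JSON.stringify(usersWithTeamsPipeline, null, 2));
```

Reading the name from Team.collection.name rather than hard-coding it is one way to avoid this kind of mismatch.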

Mongoose Virtual - Count references in another model that are in local array of references

I have 2 models Comment and Report.
const mongoose = require('mongoose');
const CommentSchema = new mongoose.Schema(
{
content: {
type: String,
trim: true,
maxLength: 2048,
},
createdAt: {
type: Date,
default: Date.now,
},
parent: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Comment',
required: false,
},
replies: [
{
type: mongoose.Schema.Types.ObjectId,
ref: 'Comment',
},
],
isReply: {
type: Boolean,
default: false,
},
},
{ toJSON: { virtuals: true }, toObject: { virtuals: true } }
);
CommentSchema.virtual('reportCount', {
ref: 'Report',
localField: '_id',
foreignField: 'comment',
justOne: false,
count: true,
});
CommentSchema.virtual('reportReplyCount', {
ref: 'Report',
localField: 'replies',
foreignField: 'comment',
justOne: false,
count: true,
});
module.exports = mongoose.model('Comment', CommentSchema);
Comment has a replies field, which is an array of references pointing to the Comment model. A User can report a comment, and when that happens a new Report document is stored in the Report collection, containing a reference to that comment and a reference to the User. I have 2 virtual properties in the Comment schema: reportCount (shows the number of reports for that comment) and reportReplyCount (shows the number of reports on the comment's replies). reportCount works flawlessly, but reportReplyCount does not. When I create a comment and then replies to that comment, it shows the number of replies instead of the number of reports. I googled but could not find anything similar.
const mongoose = require('mongoose');
const ReportSchema = new mongoose.Schema({
description: {
type: String,
trim: true,
required: true,
maxLength: 100,
},
reporter: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
createdAt: {
type: Date,
default: Date.now,
},
comment: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Comment',
required: true,
},
});
module.exports = mongoose.model('Report', ReportSchema);
I don't know what you're trying to do exactly, but I've looked around and there doesn't seem to be any existing solution for this. Virtuals are one way of solving the problem, but I haven't seen an answer that uses it in this scenario.
You could try to create a new virtual called reportReplyCount that shows the number of reports on replies. Then use aggregate on Comment and replace reportCount with the new virtual. You can use something like the following:
CommentSchema.virtual('reportReplyCount', {
ref: 'Report', // reference to Report model
localField: 'replies', // matches field Comment Schema has named 'replies'
foreignField: 'comment', // matches field Report Schema has named 'comment' (foreign key in Report model)
justOne: false, // this is going to return all related documents, not just one (just like reportCount)
count: true, // set it to true so it returns a number instead of an array of documents
});
CommentSchema.methods = { ... }
CommentSchema.statics = { ... }
module.exports = mongoose.model('Comment', CommentSchema);
I would avoid using virtuals in your case if you can find another solution for your problem.
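One non-virtual alternative, sketched here as an aggregation pipeline, is to look up the reports for all reply ids and take the size of the result. This is an untested sketch under assumptions: the collection name 'reports' and the intermediate field name replyReports are guesses, not something the question confirms.

```javascript
// Sketch: count reports whose `comment` field matches any id in `replies`.
const reportReplyCountPipeline = [
  {
    $lookup: {
      from: 'reports',         // assumed collection name for the Report model
      localField: 'replies',   // array of reply comment ids on the Comment
      foreignField: 'comment', // comment reference on each Report
      as: 'replyReports',      // hypothetical temporary field
    },
  },
  // number of matched reports, not number of replies
  { $addFields: { reportReplyCount: { $size: '$replyReports' } } },
  // drop the temporary array so only the count is returned
  { $project: { replyReports: 0 } },
];

// Hypothetical usage with the Comment model from the question:
// const comments = await Comment.aggregate(reportReplyCountPipeline);
console.log(JSON.stringify(reportReplyCountPipeline, null, 2));
```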
As a side note, I have seen developers create a new model to act as the virtual, like this:
const mongoose = require('mongoose');
const ReportSchema = new mongoose.Schema({
description: {
type: String,
trim: true,
required: true,
maxLength: 100,
},
reporter: { // reference to User model (foreign key)
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
});
module.exports = mongoose.model('Report', ReportSchema);
// Now you need an instance of that new Schema called VirtualReport. The schema must follow the same format as the "real" Report's schema did above but with a few extra parameters that refer to the virtual and its definition (as in how it will behave).
const VirtualReportSchema = new mongoose.Schema({ ... }, { _id : false });
module.exports = mongoose.model('VirtualReport', VirtualReportSchema);
Then all you need to do is, in your schema that has the virtual:
// Now you can use VirtualReport like any other model. It will work just like Report but it won't get stored in the database.
CommentSchema.virtual('reportReplyCount', {
ref: 'VirtualReport', // reference to VirtualReport model
localField: 'replies', // matches field Comment Schema has named 'replies'
foreignField: 'comment', // matches field VirtualReport Schema has named 'comment' (foreign key in VirtualReport model)
justOne: false, // this is going to return all related documents, not just one (just like reportCount)
count: true, // set it to true so it returns a number instead of an array of documents
});
CommentSchema.methods = { ... }
CommentSchema.statics = { ... }
module.exports = mongoose.model('Comment', CommentSchema);
But please note that the virtual's definition ("how it will behave") must contain _id property set to false (otherwise an error will be thrown). This is because when virtuals are used in subdocuments and a user references them via dot notation (e.g., commentToBeDeleted[parent].reportReplyCount), dot notation tries to access _id property of the virtual. If it's set to false, dot notation won't be able to find that virtual and you'll get an error. So don't forget to set _id property to false!
BTW, this question was asked here. It's rather unfortunate that the question was asked on Stack Overflow instead of MongoDB's own docs where a link is provided to the explanation for virtuals (well there is also a comment about "justOne" but at least it refers directly to documentation).

Retweet schema in MongoDB

What is the best way to model a retweet schema in MongoDB? It is important that I have the createdAt times of both the original message and the retweet because of pagination: I use createdAt as the cursor for a GraphQL query.
I also need a flag for whether the message itself is a retweet or an original, and id references to the original message, the original user, and the reposting user.
I came up with 2 solutions. The first is to keep the ids of reposters and createdAt in an array in the Message model. The downside is that I have to generate the timeline every time, and for subscriptions it's not clear what message to push to the client.
The second is to treat a retweet as a message of its own. I have createdAt and reposterId in place, but I have a lot of replication: if I were to add a like to a message, I would have to push it into the array of every single retweet.
I could use help with this. What is the most efficient way to do it in MongoDB?
First way:
import mongoose from 'mongoose';
const messageSchema = new mongoose.Schema(
{
text: {
type: mongoose.Schema.Types.String,
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
reposts: [
{
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
createdAt: { type: Date, default: Date.now },
},
],
},
{
timestamps: true,
},
);
const Message = mongoose.model('Message', messageSchema);
Second way:
import mongoose from 'mongoose';
const messageSchema = new mongoose.Schema(
{
text: {
type: mongoose.Schema.Types.String,
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
isReposted: {
type: mongoose.Schema.Types.Boolean,
default: false,
},
repost: {
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
originalMessageId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Message',
},
},
},
{
timestamps: true,
},
);
const Message = mongoose.model('Message', messageSchema);
export default Message;
Option 2 is the better choice here. I'm operating with the assumption that this is a Twitter re-tweet or Facebook share like functionality. You refer to this functionality as both retweet and repost so I'll stick to "repost" here.
Option 1 creates an efficiency problem where, to find the reposts for a user, the db needs to iterate over the reposts arrays of all the documents in the message collection to ensure it found all of the reposterIds. Storing ids in Mongo arrays in collection X referencing collection Y is great if you want to traverse from X to Y. It's not as nice if you want to traverse from Y to X.
With option 2, you can specify a more classic one-to-many relationship between messages and reposts that will be simpler and more efficient to query. Reposts and non-repost messages alike will ultimately be placed into the message collection in the order the user made them, making organization easier. Option 2 also makes it easy to allow reposting users to add text of their own to the repost, where it can be displayed alongside the repost in the view this feeds into. This is popular on Facebook, where people add context to the things they share.
My one question is, why are three fields being used to track reposts in Option 2?
isReposted, repost.reposterId and repost.originalMessageId provide redundant data. All that you should need is an originalMessageId field that, if not null, contains the _id of the original message and, if null, signifies that the message is not itself a repost. The reposter is already the userId of the repost document, and if you really need it, the userId of the original message's creator can be found in that message when you query for it.
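A small plain-JavaScript sketch of the single-field idea, using hypothetical in-memory messages instead of real Message documents. The names isRepost and buildTimeline and the sample data are made up for illustration; in practice the sort and the lookup of the original would be done by the database query.

```javascript
// Hypothetical in-memory messages; in MongoDB these would be Message documents.
const messages = [
  { _id: 'm1', userId: 'u1', text: 'hello', originalMessageId: null, createdAt: new Date('2023-01-01T10:00:00Z') },
  { _id: 'm2', userId: 'u2', text: '', originalMessageId: 'm1', createdAt: new Date('2023-01-02T09:00:00Z') },
  { _id: 'm3', userId: 'u2', text: 'my own post', originalMessageId: null, createdAt: new Date('2023-01-01T12:00:00Z') },
];

// A message is a repost exactly when it points at an original message.
const isRepost = (message) => message.originalMessageId != null;

// Newest first, with each repost resolved to the message it points at.
function buildTimeline(all) {
  const byId = new Map(all.map((m) => [m._id, m]));
  return [...all]
    .sort((a, b) => b.createdAt - a.createdAt)
    .map((m) => ({
      ...m,
      isRepost: isRepost(m),
      original: isRepost(m) ? byId.get(m.originalMessageId) : null,
    }));
}

const timeline = buildTimeline(messages);
console.log(timeline.map((m) => m._id)); // [ 'm2', 'm3', 'm1' ]
```

Each repost keeps its own createdAt for cursor pagination, while likes can stay on the original message only.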
Hope this helps!

Mongoose populate from references in another collection

I feel like this has to have been asked before and yet, I can't seem to find an answer. I have the following Mongoose models:
Album Model
const AlbumSchema = new Schema({
name: {
type: String,
required: true,
trim: true
},
user: {
type: ObjectId,
ref: 'User',
required: false
},
photosCount: {
type: Number,
default: 0
},
videosCount: {
type: Number,
default: 0
}
})
Media Model
const MediaSchema = new Schema({
type: {
type: String,
enum: ['image', 'video'],
required: true
},
user: {
type: ObjectId,
ref: 'User',
required: false
},
url: {
type: String,
required: true,
trim: true
},
width: {
type: Number,
required: true
},
height: {
type: Number,
required: true
},
albums: [{
type: ObjectId,
ref: 'Album'
}]
})
Whenever an album is fetched, I would like there to be a thumbnail property on the album which holds the most recently added media object.
I don't want to add a set of pointers to Media on the Album model because an album can potentially have tens of thousands of media. It makes sense to me that the media should hold references to the albums it's in and not the other way around.
The Mongoose docs say:
It is debatable that we really want two sets of pointers as they may
get out of sync. Instead we could skip populating and directly find()
the stories we are interested in.
Story
.find({ _creator: aaron._id })
.exec(function (err, stories) {
if (err) return handleError(err);
console.log('The stories are an array: ', stories);
})
I'm not exactly sure how to apply that in this particular context, or if it would even make sense to do so. I feel like that would be a bit ugly inside of a model, but I'm new to Mongoose and MongoDB in general, so I'm not sure what the best practice is for a scenario like this.
To reiterate, I want a way to get the latest media for an arbitrary album, and I do not want to store references in both collections if it can be avoided (for the reason I outlined above). I want the media object most recently added to a given album to reside within a thumbnail property of that album, whether it's a single album or an array of albums.
UPDATE
I thought about adding this to the AlbumSchema:
thumbnail: {
type: ObjectId,
ref: 'Media'
}
and then updating with a post save hook in my media model:
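MediaSchema.pre('save', function (next) {
// isNew is already false by the time post('save') runs, so remember it here
this.wasNew = this.isNew
next()
})
MediaSchema.post('save', function (media, next) {
if (!media.wasNew || !media.albums.length) {
return next()
}
const Album = mongoose.model('Album')
// albums is an array on MediaSchema, so update every album this media belongs to
Album.updateMany({ _id: { $in: media.albums } }, { thumbnail: media._id })
.then(() => next(), next)
})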
This seems ugly though, and if this media is later deleted, the system would have to find all albums whose thumbnail points to that media and update them to point to the next most recent media.
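One way to avoid storing a thumbnail pointer at all might be to compute it at read time with an aggregation on albums. The following is an untested sketch under assumptions: the Media model's collection is assumed to be named 'media', every media document is assumed to have an albums array, and "most recently added" is approximated by _id order.

```javascript
// Sketch: attach the newest media of each album as `thumbnail`.
const albumsWithThumbnailPipeline = [
  {
    $lookup: {
      from: 'media', // assumed collection name for the Media model
      let: { albumId: '$_id' },
      pipeline: [
        // keep only media whose albums array contains this album
        { $match: { $expr: { $in: ['$$albumId', '$albums'] } } },
        // ObjectIds embed a creation timestamp, so _id descending is newest first
        { $sort: { _id: -1 } },
        { $limit: 1 },
      ],
      as: 'thumbnail',
    },
  },
  // turn the one-element array into a single object; albums without media keep no thumbnail
  { $unwind: { path: '$thumbnail', preserveNullAndEmptyArrays: true } },
];

// Hypothetical usage:
// const albums = await Album.aggregate(albumsWithThumbnailPipeline);
// For a single album, a plain query may be simpler:
// const thumbnail = await Media.findOne({ albums: albumId }).sort({ _id: -1 });
console.log(JSON.stringify(albumsWithThumbnailPipeline, null, 2));
```

Nothing has to be repaired when a media document is deleted, at the cost of doing the lookup on every read.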

mongoose how to manage count in a reference document

So I've got these schemas:
'use strict';
/**
* Module dependencies.
*/
var mongoose = require('mongoose'),
Schema = mongoose.Schema;
/**
* Comment Schema
*/
var CommentSchema = new Schema({
post_id: {
type: Schema.Types.ObjectId,
ref: 'Post',
required: true
},
author:{
type: String,
required: true
},
email:{
type: String,
required: true
},
body: {
type: String,
required: true,
trim: true
},
status: {
type: String,
required: true,
default: 'pending'
},
created: {
type: Date,
required: true,
default: Date.now
},
meta: {
votes: Number
}
});
/**
* Validations
*/
CommentSchema.path('author').validate(function(author) {
return author.length;
}, 'Author cannot be empty');
CommentSchema.path('email').validate(function(email) {
return email.length;
}, 'Email cannot be empty');
CommentSchema.path('email').validate(function(email) {
var emailRegex = /^([\w-\.]+@([\w-]+\.)+[\w-]{2,4})?$/;
return emailRegex.test(email);
}, 'The email is not a valid email');
CommentSchema.path('body').validate(function(body) {
return body.length;
}, 'Body cannot be empty');
mongoose.model('Comment', CommentSchema);
'use strict';
/**
* Module dependencies.
*/
var mongoose = require('mongoose'),
monguurl = require('monguurl'),
Schema = mongoose.Schema;
/**
* Article Schema
*/
var PostSchema = new Schema({
title: {
type: String,
required: true,
trim: true
},
author:{
type: String,
required: true,
default: 'whisher'
},
slug: {
type: String,
index: { unique: true }
},
body: {
type: String,
required: true,
trim: true
},
status: {
type: String,
required: true,
trim: true
},
created: {
type: Date,
required: true,
default: Date.now
},
published: {
type: Date,
required: true
},
categories: {
type: [String],
index: { unique: true }
},
tags: {
type: [String],
required: true,
index: true
},
comment: {
type: Schema.Types.ObjectId,
ref: 'CommentSchema'
},
meta: {
votes: Number
}
});
/**
* Validations
*/
PostSchema.path('title').validate(function(title) {
return title.length;
}, 'Title cannot be empty');
PostSchema.path('body').validate(function(body) {
return body.length;
}, 'Body cannot be empty');
PostSchema.path('status').validate(function(status) {
return /publish|draft/.test(status);
}, 'Is not a valid status');
PostSchema.plugin(monguurl({
source: 'title',
target: 'slug'
}));
mongoose.model('Post', PostSchema);
Through an API, I query Post like this:
exports.all = function(req, res) {
Post.find().sort('-created').exec(function(err, posts) {
if (err) {
res.jsonp(500,{ error: err.message });
} else {
res.jsonp(200,posts);
}
});
};
How can I retrieve how many comments a post has?
I mean I want an extra property on the post object,
like post.ncomments.
The first thing I think of is adding an extra
field to the post schema and updating it whenever a user
adds a comment:
meta: {
votes: Number,
ncomments:Number
}
but it seems quite ugly, I think.
If you want what is likely the most efficient solution, then manually adding a field like number_comments to the Post schema may be the best way to go, especially if you want to act on multiple posts at once (like sorting based on comment count). Even if you used an index to do the count, it's not likely to be as efficient as having the count pre-calculated (and ultimately, there are just more types of queries you can perform when it has been pre-calculated, if you haven't chosen to embed the comments).
var PostSchema = new Schema({
/* others */
number_comments: {
type: Number
}
});
To update the number:
Post.updateOne({ _id: myPostId }, { $inc: { number_comments: 1 } }, /* callback */);
Also, you won't need a comment field in the PostSchema unless you're using it as a "most recent" style field (or some other way where there'd only be one). The fact that you have a Post reference in the Comment schema would be sufficient to find all Comments for a given Post:
Comments.find().where("post_id", myPostId).exec(/* callback */);
You'd want to make sure that the field is indexed. Since you've specified the ref for the field, you can use populate with it, so you might consider renaming the field to "post".
Comments.find().where("post", myPostId).exec(/* callback */);
You'd still only set the post field to the _id of the Post though (and not an actual Post object instance).
You could also choose to embed the comments in the Post. There's some good information on the MongoDB web site about these choices. Note that even if you embedded the comments, you'd need to bring back the entire array just to get the count.
It looks like your Post schema will only allow for a single comment:
// ....
comment: {
type: Schema.Types.ObjectId,
ref: 'CommentSchema'
},
// ....
One consideration is to just store your comments as subdocuments on your posts rather than in their own collection. Will you in general be querying your comments only as they relate to their relevant post, or will you frequently be looking at all comments independent of their post?
If you move the comments to subdocuments, then you'll be able to do something like post.comments.length.
However, if you retain comments as a separate collection (a relational structure in a NoSQL DB; there are sometimes reasons to do this), there isn't an automatic way of doing this. A plain find() in Mongo doesn't do joins, so you'll have to issue a second query. You have a few options in how to do that. One is an instance method on your post instances. You could also just do a manual Comment.countDocuments({ post_id: <> }).
Your proposed solution is perfectly valid too. That strategy is used even in relational databases that can do joins, because it performs better than counting up all the comments each time.
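If the comments stay in their own collection, the "second query" can be a single grouped count for all posts on the page rather than one count per post. A minimal sketch, assuming the post_id field from the question; the helper names commentCountPipeline and attachCommentCounts and the sample data are hypothetical.

```javascript
// Hypothetical helper: build a $group pipeline that counts comments per post.
function commentCountPipeline(postIds) {
  return [
    { $match: { post_id: { $in: postIds } } },
    { $group: { _id: '$post_id', ncomments: { $sum: 1 } } },
  ];
}

// Merge the aggregation result into plain post objects; posts with no comments get 0.
function attachCommentCounts(posts, counts) {
  const byPost = new Map(counts.map((c) => [String(c._id), c.ncomments]));
  return posts.map((post) => ({ ...post, ncomments: byPost.get(String(post._id)) || 0 }));
}

// Stand-in data; with Mongoose the counts would come from
// Comment.aggregate(commentCountPipeline(posts.map((p) => p._id))).
const posts = [{ _id: 'p1', title: 'First' }, { _id: 'p2', title: 'Second' }];
const counts = [{ _id: 'p1', ncomments: 3 }];
const postsWithCounts = attachCommentCounts(posts, counts);
console.log(postsWithCounts);
```

The merge expects plain objects, so posts fetched through Mongoose would need lean() or toObject() first.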
