Mongoose: 3 way document joining

I'm learning Mongoose and need some help. I have 3 collections, and in a single API call I want to create 3 documents that reference each other (the "joins" below):
Users - need to reference chirps
Videos - need to reference chirps
Chirps - need to reference users & videos
Question: I know that I can call model.create() and pass the new document into each callback, then update the respective docs, but I was wondering if there's a cleaner way of doing it?
Sorry if I'm not clear on the question. Please ask me if something doesn't make sense.
Code
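var $oid = mongoose.Schema.Types.ObjectId; // assumed alias; $oid is not defined in the original post
var chirpSchema = new mongoose.Schema({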
date_created: { type: Date, default: Date.now }
, content: { post : String }
, _video: { type: $oid, ref: "video" }
, _author: { type: $oid, ref: "user" }
});
var chirp = mongoose.model('chirp', chirpSchema);
var userSchema = new mongoose.Schema({
date_joined: { type : Date, default: Date.now }
, cookie_id: String
, chirp_library: [{type: $oid, ref: "chirp"}]
});
var user = mongoose.model('user', userSchema);
var videoSchema = new mongoose.Schema({
date_tagged: { type : Date, default: Date.now }
, thumbnail_url : String
, _chirps: [{type: $oid, ref: "chirp" }]
});
var video = mongoose.model('video', videoSchema);

Mongo and other NoSQL databases aren't just interchangeable alternatives to a SQL database. They force you to rethink your design. The goal isn't to define relationships; it's to make the information you need available in fewer queries. Arrays are generally something to avoid in Mongo, especially if they can grow without bound, and based on your naming that seems like a strong possibility. If you keep the rest of your schema and just delete those two arrays from your user and video schemas:
chirp.find({_author: yourUserId}).populate("_author") gives you the same information as user.findOne({_id: yourUserId}) in your current design.
Similarly, chirp.find({_video: yourVideoId}).populate("_video") gives you the same information as video.findOne({_id: yourVideoId}).
The only issue is that .populate() runs for every single chirp you pull. A way around this is to denormalize some (or all) of your author and video documents onto the chirp document. This is how I would likely design it:
var chirpSchema = new mongoose.Schema({
date_created: { type: Date, default: Date.now },
content: {
post : String
},
_author: {
_id: { type: $oid, ref: "user" },
date_joined: { type: Date, default: Date.now }
},
_video: {
_id: { type: $oid, ref: "video" },
date_tagged: { type: Date, default: Date.now },
thumbnail_url: String
}
});
var userSchema = new mongoose.Schema({
date_joined: { type : Date, default: Date.now }
, cookie_id: String
})
var videoSchema = new mongoose.Schema({
date_tagged: { type : Date, default: Date.now }
, thumbnail_url : String
});
It's perfectly OK to have data repeated, as long as it makes your queries more efficient. That being said, you need to strike a balance between reading and writing. If your user information or video information changes regularly, you'll have to update that data on each chirp document. If there is a specific field on your video/author that is changing regularly, you can just leave that field off the chirp document if it's not necessary in queries.
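As a rough illustration of the denormalized design above, the sketch below builds a chirp object that embeds a snapshot of its author and video at write time. Plain objects stand in for Mongoose documents, and the helper name buildChirp and the sample ids are hypothetical; in real code the result would be handed to the chirp model for saving.

```javascript
// Hypothetical helper: copy only the fields that chirp queries need.
function buildChirp(user, video, post) {
  return {
    date_created: new Date(),
    content: { post: post },
    // Snapshot of the author; if the user changes, this copy must be updated too.
    _author: { _id: user._id, date_joined: user.date_joined },
    // Snapshot of the video, so no populate() is needed when reading chirps.
    _video: {
      _id: video._id,
      date_tagged: video.date_tagged,
      thumbnail_url: video.thumbnail_url
    }
  };
}

// Sample data, made up for illustration.
const sampleUser = { _id: 'u1', date_joined: new Date('2020-01-01'), cookie_id: 'abc' };
const sampleVideo = { _id: 'v1', date_tagged: new Date('2020-02-01'), thumbnail_url: 'thumb.png' };

const sampleChirp = buildChirp(sampleUser, sampleVideo, 'hello');
console.log(sampleChirp._author._id, sampleChirp._video.thumbnail_url);
```

Fields that are not needed when listing chirps, such as cookie_id here, are simply left off the snapshot.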

Related

Retweet schema in MongoDB

What is the best way to model a retweet schema in MongoDB? It is important that I have the createdAt times of both the original message and the retweet because of pagination: I use createdAt as the cursor for a GraphQL query.
I also need a flag for whether the message is a retweet or an original, plus id references to the original message, the original user, and the reposting user.
I came up with 2 solutions. The first is to keep the reposters' ids and createdAt times in an array in the Message model. The downside is that I have to generate the timeline every time, and for subscriptions it's not clear which message to push to the client.
The second is to treat a retweet as a message of its own. I have createdAt and reposterId in place, but there is a lot of replication: if I add a like to a message, I have to push it into the array of every single retweet.
I could use some help with this. What is the most efficient way to do it in MongoDB?
First way:
import mongoose from 'mongoose';
const messageSchema = new mongoose.Schema(
{
text: {
type: mongoose.Schema.Types.String,
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
reposts: [
{
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
createdAt: { type: Date, default: Date.now },
},
],
},
{
timestamps: true,
},
);
const Message = mongoose.model('Message', messageSchema);
Second way:
import mongoose from 'mongoose';
const messageSchema = new mongoose.Schema(
{
text: {
type: mongoose.Schema.Types.String,
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
isReposted: {
type: mongoose.Schema.Types.Boolean,
default: false,
},
repost: {
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
originalMessageId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Message',
},
},
},
{
timestamps: true,
},
);
const Message = mongoose.model('Message', messageSchema);
export default Message;

Option 2 is the better choice here. I'm operating with the assumption that this is a Twitter re-tweet or Facebook share like functionality. You refer to this functionality as both retweet and repost so I'll stick to "repost" here.
Option 1 creates an efficiency problem: to find the reposts made by a user, the db needs to go through the reposts array of every document in the Message collection to be sure it has found all of the matching reposterIds. Storing ids in an array in collection X that reference collection Y is great if you want to traverse from X to Y. It's not as nice if you want to traverse from Y to X.
With option 2, you can specify a more classic one-to-many relationship between messages and reposts that will be simpler and more efficient to query. Reposts and non-repost messages alike end up in the Message collection in the order the user made them, which makes organization easier. Option 2 also makes it easy to let reposting users add text of their own to the repost, which can be displayed alongside the repost in the view this feeds into. This is popular on Facebook, where people add context to the things they share.
My one question is: why are three fields being used to track reposts in Option 2?
isReposted, repost.reposterId, and repost.originalMessageId carry redundant data. All you should need is an originalMessageId field: if it is not null, it holds the _id of the original message, and if it is null, the message is not itself a repost. If you really need the userId of the original message's creator, it can be read from that message when you query for it.
Hope this helps!
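To make the single-field suggestion concrete, here is a small in-memory sketch: a message is a repost exactly when originalMessageId is set, and the timeline is paged using createdAt as the cursor. Plain objects stand in for Message documents; isRepost and timelinePage are hypothetical names, and a real implementation would run the equivalent filter and sort in MongoDB rather than in application code.

```javascript
// A message is a repost exactly when it points at an original message.
function isRepost(message) {
  return message.originalMessageId != null;
}

// Return up to `limit` messages older than `cursor`, newest first.
// `cursor` is the createdAt of the last item on the previous page, or null.
function timelinePage(messages, cursor, limit) {
  return messages
    .filter((m) => cursor === null || m.createdAt < cursor)
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, limit);
}

// Made-up sample data: m2 is bob's repost of alice's m1.
const messages = [
  { _id: 'm1', userId: 'alice', text: 'original', originalMessageId: null, createdAt: new Date('2021-01-01') },
  { _id: 'm2', userId: 'bob', text: '', originalMessageId: 'm1', createdAt: new Date('2021-01-03') },
  { _id: 'm3', userId: 'bob', text: 'my own post', originalMessageId: null, createdAt: new Date('2021-01-02') }
];

const firstPage = timelinePage(messages, null, 2);
const secondPage = timelinePage(messages, firstPage[firstPage.length - 1].createdAt, 2);
console.log(firstPage.map((m) => m._id), secondPage.map((m) => m._id));
```

Because each repost is its own message with its own createdAt, originals and reposts sort into one timeline without any special casing.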

Mongoose populate ObjectID from multiple possible collections

I have a mongoose model that looks something like this
var LogSchema = new Schema({
item: {
type: ObjectId,
ref: 'article',
index:true,
},
});
But 'item' could be referenced from multiple collections. Is it possible to do something like this?
var LogSchema = new Schema({
item: {
type: ObjectId,
ref: ['article','image'],
index:true,
},
});
The idea being that 'item' could be a document from the 'article' collection OR the 'image' collection.
Is this possible or do i need to manually populate?

The question is old, but maybe someone else is still looking into similar issues :)
I found this in the Mongoose GitHub issues:
Mongoose 4.x supports using refPath instead of ref:
var schema = new Schema({
name: String,
others: [{ value: { type: mongoose.Schema.Types.ObjectId, refPath: 'others.kind' }, kind: String }]
})
In @CadeEmbery's case it would be:
var logSchema = new Schema({
item: { type: mongoose.Schema.Types.ObjectId, refPath: 'kind' },
kind: String
})
But I didn't try it yet...

First of all, some basics.
The ref option tells Mongoose which collection to get data from when you use populate().
The ref option is not mandatory. When you do not set it, populate() requires you to supply the model dynamically through the model option.
Example:
populate({ path: 'conversation', model: Conversation })
Here you tell Mongoose that the collection behind the ObjectId is Conversation.
It is not possible to give populate or Schema an array of refs.
Other Stack Overflow users have asked about this as well.
Solution 1: Populate both (manual)
Try to populate one; if you get no data, populate the second.
Solution 2: Change your schema
Create two links, and set only one of them.
var LogSchema = new Schema({
itemLink1: {
type: ObjectId,
ref: 'image',
index: true,
},
itemLink2: {
type: ObjectId,
ref: 'article',
index: true,
},
});
var Log = mongoose.model('Log', LogSchema);
// Queries run on the model, not on the schema
Log.find({})
.populate('itemLink1')
.populate('itemLink2')
.exec()

Dynamic References via refPath
Mongoose can also populate from multiple collections based on the value of a property in the document. Let's say you're building a schema for storing comments. A user may comment on either a blog post or a product.
const commentSchema = new Schema({
body: { type: String, required: true },
on: {
type: Schema.Types.ObjectId,
required: true,
// Instead of a hardcoded model name in `ref`, `refPath` means Mongoose
// will look at the `onModel` property to find the right model.
refPath: 'onModel'
},
onModel: {
type: String,
required: true,
enum: ['BlogPost', 'Product']
}
});
const Product = mongoose.model('Product', new Schema({ name: String }));
const BlogPost = mongoose.model('BlogPost', new Schema({ title: String }));
const Comment = mongoose.model('Comment', commentSchema);
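Conceptually, refPath means the model to query is read from the document itself. The in-memory sketch below only illustrates that lookup; it is not how Mongoose implements populate(). The Maps stand in for collections, and resolveRef and the sample ids are hypothetical.

```javascript
// In-memory stand-ins for the two collections, keyed by model name.
const collections = {
  BlogPost: new Map([['p1', { _id: 'p1', title: 'Hello' }]]),
  Product: new Map([['x1', { _id: 'x1', name: 'Widget' }]])
};

// Look up doc[path] in the collection named by doc[refPathField].
function resolveRef(doc, path, refPathField) {
  const modelName = doc[refPathField];
  const collection = collections[modelName];
  if (!collection) {
    throw new Error('Unknown model: ' + modelName);
  }
  // Return a copy with the id replaced by the referenced document (or null).
  return Object.assign({}, doc, { [path]: collection.get(doc[path]) || null });
}

const commentOnPost = { body: 'Nice post', on: 'p1', onModel: 'BlogPost' };
const commentOnProduct = { body: 'Great item', on: 'x1', onModel: 'Product' };

const populatedPost = resolveRef(commentOnPost, 'on', 'onModel');
const populatedProduct = resolveRef(commentOnProduct, 'on', 'onModel');
console.log(populatedPost.on.title, populatedProduct.on.name);
```

The same `on` field ends up holding a blog post for one comment and a product for the other, which is the behavior the question about multiple possible collections was after.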

MongoDB multi-parameter search query

I have the following schema:
var ListingSchema = new Schema({
creatorId : [{ type: Schema.Types.ObjectId, ref: 'User' }],//LISTING CREATOR i.e. specific user
roommatePreference: { //preferred things in roommate
age: {//age preferences if any
early20s: { type: Boolean, default: true },
late20s: { type: Boolean, default: true },
thirtys: { type: Boolean, default: true },
fortysAndOld: { type: Boolean, default: true }
},
gender: {type:String,default:"Male"}
},
roomInfo: {//your own location of which place to rent
address: {type:String,default:"Default"},
city: {type:String,default:"Default"},
state: {type:String,default:"Default"},
zipcode: {type:Number,default:0},
},
location: {//ROOM LOCATION
type: [Number], // [<longitude>, <latitude>]
index: '2d' // create the geospatial index
},
pricing: {//room pricing information
monthlyRent: {type:Number,default:0},
deposit: {type:Number,default:0},
},
availability:{//room availability information
durationOfLease: {
minDuration: {type:Number,default:0},
maxDuration: {type:Number,default:0},
},
moveInDate: { type: Date, default: Date.now }
},
amneties : [{ type: Schema.Types.ObjectId, ref: 'Amnety' }],
rules : [{ type: Schema.Types.ObjectId, ref: 'Rule' }],
photos : [{ type: Schema.Types.ObjectId, ref: 'Media' }],//Array of photos having photo's ids, photos belong to Media class
description: String,//description of room for roomi
status:{type:Boolean,default:true}//STATUS OF ENTRY, BY DEFAULT ACTIVE=TRUE
},
{
timestamps:true
}
);
The application is similar to Airbnb/Roomi: users can offer their rooms/places for rent. Now I want to implement a filter so that a user can find an appropriate room listing.
Here creatorId, rules, and amneties are ref ids pointing to other schemas. I want to write a query that returns listings based on several parameters,
e.g. the user can pass rules, pricing info, some amneties, gender, etc. in the request query.
Which query parameters are present is up to the user.
Is there any way to do something like a nested query for this, the way we would in SQL?

Well, MongoDB is not made to be used as a relational DB.
Instead, I would suggest turning the amenities array into an array of objects, with the amenities embedded inside the Listing schema,
so you can query as follows:
// Schema
ListSchema = mongoose.Schema({
....
amneties: [{aType: 'shower'}]
// or you can make it a simple array of strings:
// amneties: ['shower']
....
})
// query
Listings.find({'amneties.aType' : <some amenity>})
There are no joins in MongoDB. You can still do "joins" with what Mongoose calls populate, but they happen on your application server, and every population requires an extra round trip to the database.
If you still wish to use references to the amneties collection, you should query that collection first and then find the Listing documents that reference the results.
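Because the filter passed to find() is an ordinary object, optional search parameters can be handled by adding a condition only when the corresponding parameter is present. A minimal sketch, assuming the field names from the ListingSchema in the question; the function name buildListingFilter and the query parameter names (gender, minRent, maxRent, amneties, rules) are hypothetical. $gte, $lte, and $all are standard MongoDB query operators.

```javascript
// Build a MongoDB filter from whichever optional parameters were supplied.
function buildListingFilter(query) {
  const filter = { status: true }; // only active listings

  if (query.gender) {
    filter['roommatePreference.gender'] = query.gender;
  }

  // Rent range: either bound may be missing.
  const rent = {};
  if (query.minRent !== undefined) rent.$gte = Number(query.minRent);
  if (query.maxRent !== undefined) rent.$lte = Number(query.maxRent);
  if (Object.keys(rent).length > 0) {
    filter['pricing.monthlyRent'] = rent;
  }

  // Accept a single id or an array of ids; the listing must contain all of them.
  if (query.amneties) filter.amneties = { $all: [].concat(query.amneties) };
  if (query.rules) filter.rules = { $all: [].concat(query.rules) };

  return filter;
}

// Example with made-up values, shaped like an Express req.query object.
const exampleFilter = buildListingFilter({
  gender: 'Female',
  minRent: '500',
  maxRent: '900',
  amneties: 'a1'
});
console.log(JSON.stringify(exampleFilter));
```

The resulting object would then be passed to the Listing model's find(); parameters the user leaves out never appear in the filter, so they do not restrict the results.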

Mongo user document structure with three user types

I'm setting up a Mongo database in Express with Mongoose and I'm trying to decide how to model the users. I've never modeled multiple users in the MEAN stack before and thought I'd reach out for some best-practices - I'm an instructor and need to be able to teach my students best practices. I haven't been able to find a whole lot out there, but perhaps I'm searching for the wrong things.
The app will have 3 user types, student, staff, and admin. Each user type will require some of the same basics - email, password, first and last names, phone, etc. If the user is a student, they will need to provide additional info like their high school name, grade, age, gender, etc, which ideally will be required.
This is what I've come up with so far - a single user model that requires all the basic information, but also has schema set up to allow for the additional information that students will need to include. Then I also have a pre-save hook set up to remove the "studentInfo" subdocument if the user being saved doesn't have a "student" role:
var mongoose = require("mongoose");
var Schema = mongoose.Schema;
var ethnicityList = [
"White",
"Hispanic or Latino",
"Black or African American",
"Native American or American Indian",
"Asian / Pacific Islander",
"Other"
];
var userSchema = new Schema({
firstName: {
type: String,
required: true
},
lastName: {
type: String,
required: true
},
phone: {
type: Number,
required: true
},
email: {
type: String,
required: true,
lowercase: true,
unique: true
},
password: {
type: String,
required: true
},
preferredLocation: {
type: String,
enum: ["provo", "slc", "ogden"]
},
role: {
type: String,
enum: ["student", "staff", "admin"],
required: true
},
studentInfo: {
school: String,
currentGrade: Number,
ethnicity: {
type: String,
enum: ethnicityList
},
gender: {
type: String,
enum: ["male", "female"]
}
}
}, {timestamps: true});
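userSchema.pre("save", function (next) {
var user = this;
if (Object.keys(user.studentInfo).length === 0 && user.role !== "student") {
// Set the path to undefined so the empty subdocument is not saved
user.studentInfo = undefined;
}
// Call next() exactly once, whether or not studentInfo was removed
next();
});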
module.exports = mongoose.model("User", userSchema);
Question 1: Is this an okay way to do this, or would it be better just to create two different models and keep them totally separate?
Question 2: If I am going to be to restrict access to users by their user type, this will be easy to check by the user's role property with the above setup. But if it's better to go with separated models/collections for different user types, how do I check whether its a "Staff" or "Student" who is trying to access a protected resource?
Question 3: It seems like if I do the setup as outlined above, I can't do certain validation on the subdocument - I want to require students to fill out the information in the subdocument, but not staff or admin users. When I set any of the fields to required, it throws an error when they're not included, even though the subdocument itself isn't required. (Which makes sense, but I'm not sure how to get around. Maybe custom validation pre-save as well? I've never written that before so I'm not sure how, but I can look that up if that's the best way.)

Well, here are my two cents.
You would be better off creating separate schema models and then injecting them on an as-needed basis.
For example, if I have a blog schema as follows:
var createdDate = require('../plugins/createdDate');
// define the schema
var schema = mongoose.Schema({
title: { type: String, trim: true }
, body: String
, author: { type: String, ref: 'User' }
})
// add created date property
schema.plugin(createdDate);
Notice that author refers to User, and that the plugin adds an additional created field.
And here is the User schema:
var mongoose = require('mongoose');
var createdDate = require('../plugins/createdDate');
var validEmail = require('../helpers/validate/email');
var schema = mongoose.Schema({
_id: { type: String, lowercase: true, trim: true,validate: validEmail }
, name: { first: String, last: String }
, salt: { type: String, required: true }
, hash: { type: String, required: true }
, created: {type:Date, default: Date.now}
});
// add created date property
schema.plugin(createdDate);
// properties that do not get saved to the db
schema.virtual('fullname').get(function () {
return this.name.first + ' ' + this.name.last;
})
module.exports = mongoose.model('User', schema);
And here is the createdDate plugin, which both the User and blog post schemas use:
// add a "created" property to our documents
module.exports = function (schema) {
schema.add({ created: { type: Date, default: Date.now }})
}
If you want to restrict access based on the user type, you would have to write custom validation, like the one the User schema uses for emails:
var validator = require('email-validator');
module.exports = function (email) {
return validator.validate(email);
}
And then add an if-else based on whatever validations you do.
For questions 2 and 3: yes, custom pre-save validation as well.
Since you are an instructor, I preferred to point out the practices that are in use rather than elaborate on your specific problem.
Hope this helps! :)
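For question 3, one way to require the student fields only for students is a custom check that runs before saving. The sketch below is a plain function over a plain object so it can be read on its own; validateStudentInfo is a hypothetical name, the list of required fields is an assumption based on the question's schema, and in a real app the check would be called from a pre-save or pre-validate hook.

```javascript
// Fields a student must provide (assumed list, taken from the question's schema).
const REQUIRED_STUDENT_FIELDS = ['school', 'currentGrade', 'ethnicity', 'gender'];

// Return a list of error messages; an empty list means the user is valid.
function validateStudentInfo(user) {
  if (user.role !== 'student') {
    return []; // staff and admin users do not need studentInfo
  }
  const info = user.studentInfo || {};
  return REQUIRED_STUDENT_FIELDS
    .filter((field) => info[field] === undefined || info[field] === null || info[field] === '')
    .map((field) => 'studentInfo.' + field + ' is required for students');
}

// Made-up sample users.
const staffErrors = validateStudentInfo({ role: 'staff' });
const studentErrors = validateStudentInfo({
  role: 'student',
  studentInfo: { school: 'Provo High', currentGrade: 11 }
});
console.log(staffErrors.length, studentErrors);
```

Mongoose also accepts a function for a path's required option, which may let the same role-dependent rule be declared directly in the schema instead of in a hook.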

Populate mongoose array of objects

I have a mongoose model that looks like this:
var ModuleSchema = new Schema({
systems: [{
system: {
type: Schema.ObjectId,
ref: 'System'
},
quantity: {
type: Number
}
}]
});
mongoose.model('Module', ModuleSchema);
Basically the ModuleSchema.systems.$.system property will not get populated.
The property belongs to an object in the array of objects. I have tried everything to get it to populate but it just won't happen.
I tried the following syntax for populating but not sure what might be wrong because I still don't get back the populated System property.
Module.findOne({project: pId}).sort('-created')
.populate('systems.system')

It won't work, because system is not a direct property of systems (systems is an array). You need to populate each element by index, like this:
Module.findOne({project: pId}).sort('-created')
.populate('systems.0.system').exec(function (err, doc){})
Module.findOne({project: pId}).sort('-created')
.populate('systems.1.system').exec(function (err, doc){})
So you would need a for loop that iterates over the array to populate all the documents. Otherwise, you should modify your model, which currently looks like this:
var ModuleSchema = new Schema({
systems: [{
system: {
type: Schema.ObjectId,
ref: 'System'
},
quantity: {
type: Number
}
}]
});
Change your model to this and populating becomes easier:
var ModuleSchema = new Schema({
systems: {
system: [{
type: Schema.ObjectId,
ref: 'System'
}],
quantity: {
type: Number
}
}
});
