Sub documents vs Mongoose population - node.js

I have the following scenario:
A user can log in to a website. A user can add or delete a poll (a question with two options). Any user can give their opinion on a poll by selecting one of the options.
Considering the above scenario, I have three models: Users, Polls, and Options. They are as follows, in order of dependency:
Option Schema
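var mongoose = require('mongoose');
var Schema = mongoose.Schema;

// An option belongs to a poll and counts the votes it has received
var optionSchema = new Schema({
  optionName: {
    type: String,
    required: true
  },
  optionCount: {
    type: Number,
    default: 0
  }
});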
Poll Schema
var pollSchema = new Schema({
  question: {
    type: String,
    required: true
  },
  options: [optionSchema]
});
User Schema: parent schema
var usersSchema = new Schema({
  username: {
    type: String,
    required: true
  },
  email: {
    type: String,
    required: true,
    unique: true
  },
  password: String,
  polls: [pollSchema]
});
How do I implement the above relation between those documents? What exactly is Mongoose population? How is it different from subdocuments? Should I go for subdocuments, or should I use Mongoose population?

Since MongoDB does not have joins in the way relational databases do, population works as a kind of hidden join. When you have a User model and populate its Polls, Mongoose will do something like this:
fetch User
fetch related Polls, by ObjectIds which are stored in User document
put fetched Polls documents into User document
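The three steps above can be sketched in plain JavaScript, with in-memory arrays standing in for the two collections. This only illustrates the idea and is not Mongoose's actual implementation; `usersCollection`, `pollsCollection`, and `populatePolls` are hypothetical names, and the ids are plain strings rather than ObjectIds.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const pollsCollection = [
  { _id: 'p1', question: 'Tabs or spaces?' },
  { _id: 'p2', question: 'Cats or dogs?' },
];
const usersCollection = [
  { _id: 'u1', username: 'alice', polls: ['p1', 'p2'] }, // stores ids only
];

// Mimics the three steps: fetch the user, fetch the referenced polls,
// then put the poll documents in place of their ids.
function populatePolls(userId) {
  // Step 1: fetch the user.
  const user = usersCollection.find((u) => u._id === userId);
  if (!user) return null;
  // Step 2: fetch the related polls by the ids stored on the user.
  const polls = pollsCollection.filter((p) => user.polls.includes(p._id));
  // Step 3: return a copy of the user with the poll documents filled in.
  return { ...user, polls };
}

const populated = populatePolls('u1');
console.log(populated.polls.map((p) => p.question));
```

Note that the stored user keeps only the ids; the poll documents are attached to the result that is handed back.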
When you make User the document and Polls a subdocument, all of the data is stored in a single document. On the one hand, this means that Mongoose doesn't need to run two queries to fetch a User's Polls (it only needs to fetch the User document, because the Polls data is already there).
But which is better? It depends on the case.
If your Polls documents will be referenced from other documents (say you need access to Polls from documents User, A, B, and C), it could be better to populate them, but not necessarily. The advantage of populating is that when you need to change some Polls fields, you don't have to change that data in every document that refers to the Polls document, as you would if it were a subdocument. In that case, instead of touching User, A, B, and C, you only update the Polls document. As you can see, that's convenient. I said it's not certain that populating is better in that case because I don't know how you need to retrieve your Polls data. If you store your data the wrong way, you will get performance issues or have trouble fetching it easily.
Subdocuments are the basic way of storing data. They work well when Polls are referred to only by User. There is a performance advantage: Mongoose needs one query instead of the two that population requires, and the update disadvantage mentioned above doesn't apply, because the Polls data is stored in a single place, so there are no other documents to update.
MongoDB was designed with subdocuments in mind. It is, after all, a non-relational database, so in most cases I prefer to use subdocuments. I can't say which way will be better in your case, because I don't know what your full database looks like or how you want to retrieve your data.
There is some useful info in the official documentation:
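To make the update trade-off concrete, here is a small plain-JavaScript sketch that counts how many writes a change to one poll would need in each layout. The data and the helper names (`renameEmbedded`, `renameReferenced`) are made up for illustration, and ordinary objects stand in for database documents.

```javascript
// Embedded: each parent carries its own copy of the poll (hypothetical data).
const embeddedDocs = [
  { name: 'User', polls: [{ _id: 'p1', question: 'Old text' }] },
  { name: 'A', polls: [{ _id: 'p1', question: 'Old text' }] },
];

// Referenced: parents store only the id; the poll lives in one place.
const polls = { p1: { _id: 'p1', question: 'Old text' } };
const referencingDocs = [
  { name: 'User', polls: ['p1'] },
  { name: 'A', polls: ['p1'] },
];

// Embedded update: every parent holding a copy must be rewritten.
function renameEmbedded(docs, pollId, question) {
  let writes = 0;
  for (const doc of docs) {
    for (const poll of doc.polls) {
      if (poll._id === pollId) {
        poll.question = question;
        writes += 1;
      }
    }
  }
  return writes;
}

// Referenced update: a single write, however many parents point at the poll.
function renameReferenced(store, pollId, question) {
  store[pollId].question = question;
  return 1;
}

const embeddedWrites = renameEmbedded(embeddedDocs, 'p1', 'New text');
const referencedWrites = renameReferenced(polls, 'p1', 'New text');
console.log(embeddedWrites, referencedWrites, referencingDocs.length);
```

If a poll is only ever embedded in one parent, the embedded count drops to one as well, which is the case where subdocuments have no update disadvantage.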
http://mongoosejs.com/docs/subdocs.html
http://mongoosejs.com/docs/populate.html
Take a look at those.
Edit
Since I prefer to fetch data easily, care about performance, and know that data redundancy is common in MongoDB, I would choose to store this data as subdocuments.

Related

Can mongoose batch update based on an array of objects that matches the collection?

I am working on a project in Express/Node, and I am utilizing a MongoDB database that has a collection of Course documents that represent a course in my school system that changes in real-time. The Course documents in my database each look like this:
Course Document
{
  courseID: Number,
  restrictions: String,
  status: String
}
My program has to check for changes in the school's course system and apply any changes it sees to my private MongoDB database. To accomplish this, I currently have a script that looks at all the courses in the school system and records them in an array of objects, with each object corresponding to a course.
var allCourses = [
  {
    courseID: 123456,
    restrictions: "A and B",
    status: "OPEN"
  },
  {
    courseID: 678990,
    restrictions: "A",
    status: "FULL"
  }
];
The goal now is to be able to go through my database, and skip the documents that are the same as the corresponding javascript object in the array, and update those that are not.
Obviously, I could just iterate through my array with forEach, and update every single course by filtering by 'courseID' and updating both fields one document at a time, but I can foresee that this would take a large amount of time.
I was wondering if there was a batch update function, similar to the insertMany operation, that can take my array of objects and update my database documents that correspond to an object within the array?
These are helpful links
Trying to do a bulk upsert with Mongoose. What's the cleanest way to do this?
https://docs.mongodb.com/manual/reference/method/db.collection.insertMany/
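One possible shape for such a batch update is to turn the array into a list of updateOne operations and hand the list to bulkWrite in a single call. The sketch below only builds the operations in plain JavaScript, skipping courses that have not changed; `storedCourses` (what one find() would return) and the helper `buildBulkOps` are assumptions made for illustration, as is the model name `Course` in the final comment.

```javascript
// Courses scraped from the school system (same shape as allCourses).
const scrapedCourses = [
  { courseID: 123456, restrictions: 'A and B', status: 'OPEN' },
  { courseID: 678990, restrictions: 'A', status: 'FULL' },
];

// What the database currently holds (hypothetical; in practice one find() call).
const storedCourses = [
  { courseID: 123456, restrictions: 'A and B', status: 'OPEN' },
  { courseID: 678990, restrictions: 'A', status: 'OPEN' },
];

// Build one updateOne operation per course that is new or has changed.
function buildBulkOps(scraped, stored) {
  const storedById = new Map(stored.map((c) => [c.courseID, c]));
  return scraped
    .filter((course) => {
      const current = storedById.get(course.courseID);
      return !current
        || current.restrictions !== course.restrictions
        || current.status !== course.status;
    })
    .map((course) => ({
      updateOne: {
        filter: { courseID: course.courseID },
        update: { $set: { restrictions: course.restrictions, status: course.status } },
        upsert: true,
      },
    }));
}

const ops = buildBulkOps(scrapedCourses, storedCourses);
console.log(JSON.stringify(ops, null, 2));
// With a Mongoose model (assumed here to be called Course), the whole list
// could then be submitted in a single call: await Course.bulkWrite(ops);
```

Comparing in memory first keeps unchanged courses out of the batch, at the cost of one extra read of the collection.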

Compare two collections in MongoDb and remove common

I have three collections in MongoDB
achievements
students
student_achievements
achievements is a list of achievements a student can earn in an academic year, while
the students collection holds the list of students in the school.
student_achievements holds documents that each contain a studentId and an achievementId.
I have an interface where I use a select2 multiselect to allocate one or more achievements from achievements to students from students and save the result to the student_achievements collection. Right now, to do this, I populate select2 with the available achievements from the database. I have also arranged for the system to throw an error if a student is allocated the same achievement again.
What I am trying to achieve is that an achievement already allocated to a student should not be available in the list, that is, it should be removed when fetching the list for that student ID.
What function in MongoDB or its aggregation framework can I use to achieve this, i.e. to compare two collections and remove the common entries?
Perhaps your data-structure could be made different to make the problem easier to solve. MongoDB is a NoSQL schemaless store, don't try to make it be like a relational database.
Perhaps we could do something like this:
var StudentSchema = new Schema({
  name: String,
  achievements: [{ type: Schema.Types.ObjectId, ref: 'Achievement' }]
});
Then you can do something like this which will only add the value if it is unique to the achievements array:
db.student.update(
  { _id: 1 },
  { $addToSet: { achievements: <achievement id> } }
)
If you are using something like Mongoose you can also write your own middleware to remove orphaned docs:
AchievementSchema.post('remove', function (doc) {
  // Remove all references to this achievement (doc._id) from Student documents
});
Also, if you need to verify that the achievement exists before adding it to the set, you can do a findOne query before updating/inserting to verify.
Even with the post remove hook in place, there are certain cases where you can still end up with orphaned relationships. The best thing to do in those situations is to have a regularly run cron task do cleanup when needed. These are some of the trade-offs you encounter when using a NoSQL store.
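With the allocated achievement ids stored on the student as suggested above, the list for the select box could be "every achievement whose _id is not in that array", which maps onto MongoDB's $nin operator. The sketch below builds that filter and, for illustration only, applies the same rule to in-memory data; the sample documents and the helper names are assumptions, and the ids are plain strings rather than ObjectIds.

```javascript
// Hypothetical data: ids are plain strings here instead of ObjectIds.
const achievements = [
  { _id: 'a1', title: 'Perfect attendance' },
  { _id: 'a2', title: 'Science fair winner' },
  { _id: 'a3', title: 'Top of class' },
];
const student = { _id: 's1', name: 'Sam', achievements: ['a2'] };

// Filter document that could be passed to a find() on the achievements collection.
function unallocatedFilter(studentDoc) {
  return { _id: { $nin: studentDoc.achievements } };
}

// In-memory equivalent of running that filter, for illustration only.
function availableAchievements(all, studentDoc) {
  const taken = new Set(studentDoc.achievements);
  return all.filter((a) => !taken.has(a._id));
}

console.log(unallocatedFilter(student));
console.log(availableAchievements(achievements, student).map((a) => a.title));
```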

query filter for mongodb using node js

I have two collections. One is questions, which stores _id, title, options, result, and feedback; the second is child, where I store question_id and score. I want to filter the questions collection by _id, but I don't know how to do this. Is it possible to write a query for this, so that the next time I fetch from the questions collection it returns only the filtered questions? That is, return only those questions from the questions collection whose _id does not match any question_id in the child collection.
This is my first collection, where I store the questions (_id, title, options, result, feedback):
_id:{type:String},
title:{type:String, required:true},
options:{type:Array, required:true},
result:{type:Array, required:true},
feedback:{type:String}
This is my second collection, where I store each attempted question's ID and score:
quiz:[
{
questionId:{
type:mongoose.Schema.Types.ObjectId,
ref: 'Question',
index: true
},
score:{type:Number},
time:{type:String}
}
]
This is not the exact code, just an example I created:
var query = {}; // should exclude every _id that appears as a question_id in the second collection
firstcollection.find(query, function (err, data) {
  // data should hold only the filtered questions,
  // which I then send to the frontend
  res.send(data);
});
The main problem is conceptual: you are trying to work with MongoDB, which is a document store, in RDBMS style. Under community pressure, MongoDB added some minimal join functionality (the $lookup aggregation stage) in recent versions, but that doesn't make it a relational DB.
There is no good way to perform such a query. The idea behind a document store is simple: you have a collection of documents and you query that collection, and only that collection. All links between collections are "virtual" and are provided only by code logic, with no support from the DB engine.
So all you can do with MongoDB is query the first collection for IDs (with an appropriate projection, to fetch IDs only), store the result in an array, and then run a second query against the other collection using this array.
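The two-step approach described above might look roughly like this. In-memory arrays stand in for the two collections so the logic can be followed without a database; `attemptedIds` and `unattemptedQuestions` are hypothetical helper names, and the sample documents are invented.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const questions = [
  { _id: 'q1', title: 'First' },
  { _id: 'q2', title: 'Second' },
  { _id: 'q3', title: 'Third' },
];
const child = {
  quiz: [
    { questionId: 'q1', score: 5 },
    { questionId: 'q3', score: 2 },
  ],
};

// Query 1: read only the ids of the questions already attempted.
function attemptedIds(childDoc) {
  return childDoc.quiz.map((entry) => entry.questionId);
}

// Query 2: fetch the questions whose _id is not in that array.
// Against MongoDB the filter would be { _id: { $nin: ids } }.
function unattemptedQuestions(allQuestions, ids) {
  return allQuestions.filter((q) => !ids.includes(q._id));
}

const ids = attemptedIds(child);
const remaining = unattemptedQuestions(questions, ids);
console.log(remaining.map((q) => q.title));
```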

"Right" way to keep API db tables in sync

It's my first time creating an application solo (back-end, front-end, design) and I could use some guidance on my back-end.
Let's say there is a table of Users, Jobs, and Applications (Job Applications). Right now I have it structured like so:
UserSchema {
// other attributes
_id: String,
application_ids: [String] // array of String id's
}
JobSchema {
// other attributes
_id: String,
application_ids: [String]
}
ApplicationSchema {
// other attributes
_id: String,
applicant_id: String, // the _id of a User
job_id: String // the _id of a Job
}
My current plan is that when a new Application is created (via POST /api/applications, where I send the User's _id and the Job's _id), I set the Application's applicant_id to the User's _id and then update the User's application_ids array as well as the Job's application_ids array.
Question: Am I going about this in a reasonable manner or am I touching too many tables for one POST request? There are other tables/schemas in my application that will follow a similar relationship structure. Then there's the matter of deleting Applications and then having to update application_ids again and etc, etc but that's another matter.
Note: I am using Express.js and Mongoose.js, if that helps
No, you shouldn't do it this way. By storing the ID of the user and job in the application, you can use a query to get all the applications by user or all applications for a given job. No need to touch both.
If you really want to have the relationship on both sides, at least set it up as an ObjectId and use the "ref" declaration. Check out the populate docs in the mongoose docs.
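As a rough illustration of the first suggestion, the sketch below keeps the relationship only on the application and derives "applications for a user" and "applications for a job" by querying. An in-memory array stands in for the applications collection, and the function names are made up for this example.

```javascript
// Hypothetical in-memory applications collection; only the application
// stores the relationship, so creating one touches a single document.
const applications = [];

function createApplication(id, applicantId, jobId) {
  const application = { _id: id, applicant_id: applicantId, job_id: jobId };
  applications.push(application);
  return application;
}

// Equivalent of find({ applicant_id: userId }) on the applications collection.
function applicationsByUser(userId) {
  return applications.filter((a) => a.applicant_id === userId);
}

// Equivalent of find({ job_id: jobId }) on the applications collection.
function applicationsByJob(jobId) {
  return applications.filter((a) => a.job_id === jobId);
}

createApplication('app1', 'user1', 'job1');
createApplication('app2', 'user1', 'job2');
createApplication('app3', 'user2', 'job1');
console.log(applicationsByUser('user1').length, applicationsByJob('job1').length);
```

Deleting an application then also means removing one document, with no arrays on User or Job to keep in sync.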

Mongoose Populate Use or Not Use?

Definitions
I have Post Model in mongoose:
{
sender: ObjectId, // User Id
title : String,
...
}
I want to list my Posts along with their Users' titles.
I have two choices:
1- List Posts > Extract unique Senders > Query for User titles > Replace IDs with titles in the results
One query to list Posts and one query to list unique Users
2- Use the Mongoose populate method, with this in the schema: sender: {type: ObjectId, ref: 'User'},
and use the populated value of sender in the result, e.g. sender.title
Depending on how Mongoose populates values, this may take a different number of queries
Question!:
When Mongoose populates the 'sender' property, what does it do?
I need to use the best option for my project (and a readable one)!
1- Use a new query for each ID
To list 1000 Posts we have 1001 queries, even when users are repeated!
2- Or use a query for each unique ID
To list 1000 Posts from 100 Users we have 101 queries!
3- Or, even better, list the unique User IDs and query them all together (like choice one)
We have only 2 queries! (the best, if possible)
Option 3 - Mongoose will get the posts and then query users exactly once with the $in operator.
Even so, doing it manually can still perform better, because Mongoose blocks the event loop for the time it takes to process both queries, whereas your own code blocks it for shorter, separate periods. You can use the blocked package to benchmark this.
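For comparison, the manual version (choice 1 in the question) could be sketched as below: collect the unique sender ids, fetch those users in one go, and swap the ids for titles. In-memory arrays stand in for the two queries, and the helper names and sample data are assumptions made for this example.

```javascript
// Hypothetical data; ids are strings here instead of ObjectIds.
const posts = [
  { title: 'Hello', sender: 'u1' },
  { title: 'World', sender: 'u2' },
  { title: 'Again', sender: 'u1' },
];
const users = [
  { _id: 'u1', title: 'Alice' },
  { _id: 'u2', title: 'Bob' },
  { _id: 'u3', title: 'Carol' },
];

// Unique sender ids, as would go into { _id: { $in: ids } } for the second query.
function uniqueSenderIds(postList) {
  return [...new Set(postList.map((p) => p.sender))];
}

// Replace each sender id with the matching user title.
function attachSenderTitles(postList, userList) {
  const titleById = new Map(userList.map((u) => [u._id, u.title]));
  return postList.map((p) => ({ ...p, sender: titleById.get(p.sender) }));
}

const senderIds = uniqueSenderIds(posts);
// Stand-in for the second query: only the users whose id is in senderIds.
const matchedUsers = users.filter((u) => senderIds.includes(u._id));
const listed = attachSenderTitles(posts, matchedUsers);
console.log(senderIds, listed);
```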

Resources