MongoDb and Storing Relationships Between Objects

MongoDb and Storing Relationships Between Objects - node.js

I am currently planning the development of an application using Node and I am stuck as to whether or not I should use MongoDb as a databse. Ideally I would like to use it. I understand how it works in general, but what I don't understand is how to reference other objects within a document model.
For example, let's say I have two objects; a User and an Order object.
{
Order : {
Id: 1,
Amount: 23.95
}
}
{
User: {
Id: 1,
Orders: [ ]
}
}
Essentially, a User will place an order, and upon creation of that Order object, I would like for the User object to update the Orders array appropriately.
First of all, I hear alot about MongoDb lacking relational functionality. So would I be able to store a reference to that order in the Orders array, perhaps by ID? Or should I just store a duplicate of the order object into the array?

If I were you, I would have a field named userId in Order to keep a reference to the user creating the order. Because the relation between User and Order is one-to-many, User may have many Order but Order only have one User.

Related

Can mongoose batch update based on an array of objects that matches the collection?

I am working on a project in Express/Node, and I am utilizing a MongoDB database that has a collection of Course documents that represent a course in my school system that changes in real-time. The Course documents in my database each look like this:
Course Document
{
courseID: Number,
restrictions: String,
status: String,
}
My program has to check for changes in the school's course system, and update any changes that it sees and updates my private MongoDB database with the changes. To accomplish this, I currently have a script that looks at all the courses in the school system, and records them in an array of objects, with each object corresponding to a course.
var allCourses =
[
{
courseID: 123456,
restrictions: "A and B",
status: "OPEN"
},
{
courseID: 678990,
restrictions: "A",
status: "FULL",
}
]
The goal now is to be able to go through my database, and skip the documents that are the same as the corresponding javascript object in the array, and update those that are not.
Obviously, I could just iterate through my array with forEach, and update every single course by filtering by 'courseID' and updating both fields one document at a time, but I can foresee that this would take a large amount of time.
I was wondering if there was a batch update function, similar to the insertMany operation, that can take my array of objects and update my database documents that correspond to an object within the array?

These are helpful links
Trying to do a bulk upsert with Mongoose. What's the cleanest way to do this?
https://docs.mongodb.com/manual/reference/method/db.collection.insertMany/

How to Create Relationships Using MongoDB?

I am working on a web application that uses a mongoDB database and express/nodeJS. I want to create a project in which I have users, and users can have posts, which can have many attributes, such as title, creator, and date. I am confused how to do this so that I avoid replication in my database. I tried references by using ids in a list of all the users posts like this idea: [postID1, postID2, postID3, etc...]. The problem is that I want to be able to use query back to all the users posts and display them in an ejs template, but I don't know how to do that. How would I use references? What should I do to make a this modeling system optimal for relationships?
Any help would be greatly appreciated!
Thank you!

This is a classic parent-child relationship, and your problem is that you're storing the relationship in the wrong record :-). The parent should never contain the reference to the children. Instead, each child should have a reference to the parent. Why not the other way around? It's a bit of a historical quirk: it's done that way because a classic relational table can't have multiple values for a single field, which means you can't store multiple child IDs easily in a relational table, whereas since each child will only ever have one parent, it's easy to setup a single field in the child. A Mongo document can have multiple values within a single field by using arrays, but unless you really have a good reason to do so, it's just better to follow the historical paradigm.
How does this apply in your situation? What you're trying to do is to store references to all the children (i.e. the post IDs) as a list in the parent (i.e. an array in the user document). This is not the usual way to do this. Instead, in each child (i.e. in each post), have a field called user_id, and store the userID there.
Next, make sure you create an index on the user_id field.
With that setup, it's easy to take a post and figure out who the user was (just look at the user_id field). And if you want to find all of a user's posts, just do posts.find({user_id: 'XXXX'}). If you have an index on that field, the find will execute quickly.
Storing parent references in the child is almost always better than storing child references in the parent. Even though Mongo is flexible enough to allow you to structure it either way, it's not preferred unless you have a real reason for it.
EDIT
If you do have a valid reason for storing the child references in the parent, then assuming a structure like this:
user = {posts: [postID1, postID2, postID3, ...]}
You can find the user for a specific post by user.find({posts: "XXXX"}). MongoDB is smart enough to know that you're searching for a user in which the post array contains element "XXX". And if you create an index on the posts field, then the query should be pretty quick.

I would like to mention that, there is nothing wrong in Parent containing Child references in NoSQL databases at least. It all depends on what suits your needs.
You have One-to-many relationship between users and post, and you can model your data in following 3 ways
Embedded Data Model
{
user: "username",
post: [
{
title: "Title-1",
creator: "creator",
published_date: ISODate("2010-09-24")
},
{
title: "Title-2",
creator: "creator",
published_date: ISODate("2010-09-24")
}
]
}
Parent containing child references
{
user: "username",
posts: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "Title-1",
creator: "creator",
published_date: ISODate("2010-09-24")
}
{
_id: 234567890,
title: "Title-2",
creator: "creator",
published_date: ISODate("2010-09-24")
}
Child containing parent reference
{
_id: "U123",
name: "username"
}
{
_id: 123456789,
title: "Title-1",
creator: "creator",
published_date: ISODate("2010-09-24"),
user: "U123"
}
{
_id: 23456789,
title: "Title-2",
creator: "creator",
published_date: ISODate("2010-09-24"),
user: "U123"
}
According to the MongoDB docs (I have edited the below paragraph according to your case)
When using references, the growth of the relationships determine where
to store the reference. If the number of posts per user is small
with limited growth, storing the post reference inside the user
document may sometimes be useful. Otherwise, if the number of posts
per user is unbounded, this data model would lead to mutable,
growing arrays.
Reference: https://docs.mongodb.com/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/
Now you have to decide what is best for your project keeping in mind that your model should satisfy all the test cases
Peace,

Compare two collections in MongoDb and remove common

I have three collections in MongoDB
achievements
students
student_achievements
achievements is a list of achievements a students can achieve in an academic year while
students collections hold data list of students in the school.
student_achievements holds documents where each documents contains studentId & achievementId.
I have an interface where i use select2 multiselect to allocate one or more achievements from achievements to students from students and save it to their collection student_achievements, right now to do this i populate select2 with available achievements from database. I have also made an arrangement where if a student is being allocated same achievement again the system throws an error.
what i am trying to achieve is if an achievement is allocated to student that shouldn't be available in the list or removed while fetching the list w.r.t student id,
what function in mongodb or its aggregate framework can i use to achieve this i.e to compare to collections and remove out the common.

Perhaps your data-structure could be made different to make the problem easier to solve. MongoDB is a NoSQL schemaless store, don't try to make it be like a relational database.
Perhaps we could do something like this:
var StudentSchmea = new Schema({
name: String,
achievements: [{ type: Schema.Types.ObjectId, ref: 'Achivement' }]
});
Then you can do something like this which will only add the value if it is unique to the achievements array:
db.student.update(
{ _id: 1 },
{ $addToSet: { achievements: <achivement id> } }
)
If you are using something like Mongoose you can also write your own middleware to remove orphaned docs:
AchivementSchema.post('remove', function(next) {
// Remove all references to achievements in Student schema
});
Also, if you need to verify that the achievement exists before adding it to the set, you can do a findOne query before updating/inserting to verify.
Even with the post remove hook in place, there are certain cases where you will end up with orphaned relationships potentially. The best thing to do for those situations is to have a regularly run cron task to to do cleanup when needed. These are some of the tradeoffs you encounter when using a NoSQL store.

"Right" way to keep API db tables in sync

It's my first time creating an application solo (back-end, front-end, design) and I could use some guidance on my back-end.
Let's say there is a table of Users, Jobs, and Applications (Job Applications). Right now I have it structured like so:
UserSchema {
// other attributes
_id: String,
application_ids: [String] // array of String id's
}
JobSchema {
// other attributes
_id: String,
application_ids: [String]
}
ApplicationSchema {
// other attributes
_id: String,
applicant_id: String, // the _id of a User
job_id: String // the _id of a Job
}
My current plan is like when a new Application is created (via POST /api/applications where I send the User's _id and Job's _id) I would then set the Application's applicant_id to the User's _id and then update the User's application_ids array as well as the Job's application_ids array.
Question: Am I going about this in a reasonable manner or am I touching too many tables for one POST request? There are other tables/schemas in my application that will follow a similar relationship structure. Then there's the matter of deleting Applications and then having to update application_ids again and etc, etc but that's another matter.
Note: I am using Express.js and Mongoose.js, if that helps

No, you shouldn't do it this way. By storing the ID of the user and job in the application, you can use a query to get all the applications by user or all applications for a given job. No need to touch both.
If you really want to have the relationship on both sides, at least set it up as an ObjectId and use the "ref" declaration. Check out the populate docs in the mongoose docs.

Is a type property the correct way to store different data entities in CouchDB?

I'm trying to wrap my head around CouchDB. I'm trying to switch off of MongoDB to CouchDB because I think the concept of views are more appealing to me. In CouchDB it looks like all records are stored in a single database. There is no concept of collections or anything, like in MongoDB. So, when storing different data entities such as users, blog posts, comments, etc, how do you differentiate between them from within your map reduce functions? I was thinking about just using some sort of type property and for each item I'd just have to make sure to specify the type, always. This line of thought was sort of reinforced when I read over the CouchDB cookbook website, in which an example does the same thing.
Is this the most reliable way of doing this, or is there a better method? I was thinking of alternatives, and I think the only other alternative way is to basically embed as much as I can into logical documents. Like, the immediate records inside of the database would all be User records, and each User would have an array of Posts, in which you just add all of the Posts to. The downside here would be that embedded documents wouldn't get their own id properties, correct?

Using type is convenient and fast when creating views. Alternatively you can consider using a part of the JSON document. I.e., instead of defining:
{
type: "user",
firstname: "John",
lastname: "Smith"
}
You would have:
{
user: {
firstname: "John",
lastname: "Smith"
}
}
And then in the view for emitting documents containing user information, instead of using:
function (doc) {
if (doc.type === "user") emit(null, doc);
}
You would write:
function (doc) {
if (doc.user) emit(null, doc);
}
As you can see there is not much difference. As you have already realized 1st approach is the most widely used but second (afaik) is well accepted.
Regarding the question of storing all Posts of one User in one single document. Depends on how you plan to update your document. Remember that you need to write the whole document each time that you update (unless you use attachments). That means that each time a user writes a new Post you need to retrieve the document containing the array of Posts, add/modify one element and update the document. Probably too much (heavy).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string