MongoDB schema: store id as FK or whole document


I am designing a MongoDB structure (actually the model structure in a Node.js app). I will have players and matches collections.
Is it better to store only the ids of the matches a player joined inside each player's object (like an FK in an RDBMS), or to store the whole match object inside the player object?
In the application, one of the actions will be to show the details of a match, and on this view the user will see the players that joined that particular match (their names, countries, etc.). That makes me think that storing the whole Match document inside the Player document is better.
Any tips?

I don't think storing the whole Match document inside the Player document is a good option.
The player document would need to be updated every time the player plays in a match.
You have two main alternatives:
1-) Using child referencing (referencing players in the match).
To implement this with Mongoose models:
Player model:
const mongoose = require("mongoose");
const playerSchema = mongoose.Schema({
name: String,
country: String
});
const Player = mongoose.model("Player", playerSchema);
module.exports = Player;
Match model:
const mongoose = require("mongoose");
const matchSchema = mongoose.Schema({
date: {
type: Date,
default: Date.now // pass the function itself so the default is evaluated for each new document
},
players: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Player"
}
]
});
const Match = mongoose.model("Match", matchSchema);
module.exports = Match;
With these models, a match document will look like this (it references the players' ids):
{
"_id" : ObjectId("5dc419eff6ba790f4404fd07"),
"date" : ISODate("2019-11-07T16:19:39.691+03:00"),
"players" : [
ObjectId("5dc41836985aaa22c0c4d423"),
ObjectId("5dc41847985aaa22c0c4d424"),
ObjectId("5dc4184e985aaa22c0c4d425")
],
"__v" : 0
}
And we can use this route to get match info with all players info:
const Match = require("../models/match");
router.get("/match/:id", async (req, res) => {
const match = await Match.findById(req.params.id).populate("players");
res.send(match);
});
And the result will be like this:
{
  "date": "2019-11-07T13:19:39.691Z",
  "players": [
    {
      "_id": "5dc41836985aaa22c0c4d423",
      "name": "player 1",
      "country": "country 1",
      "__v": 0
    },
    {
      "_id": "5dc41847985aaa22c0c4d424",
      "name": "player 2",
      "country": "country 1",
      "__v": 0
    },
    {
      "_id": "5dc4184e985aaa22c0c4d425",
      "name": "player 3",
      "country": "country 2",
      "__v": 0
    }
  ],
  "_id": "5dc419eff6ba790f4404fd07",
  "__v": 0
}
2-) Embedding players inside the match, while still keeping an independent players collection.
This needs more space than the first option.
A match will look like this in the matches collection:
{
"date": "2019-11-07T13:19:39.691Z",
"players": [
{
"_id": "5dc41836985aaa22c0c4d423",
"name": "player 1",
"country": "country 1",
"__v": 0
},
{
"_id": "5dc41847985aaa22c0c4d424",
"name": "player 2",
"country": "country 1",
"__v": 0
},
{
"_id": "5dc4184e985aaa22c0c4d425",
"name": "player 3",
"country": "country 2",
"__v": 0
}
],
"_id": "5dc419eff6ba790f4404fd07",
"__v": 0
}
This may be a little faster when fetching a match, since there is no need to populate the players:
const Match = require("../models/match");
router.get("/match/:id", async (req, res) => {
const match = await Match.findById(req.params.id);
res.send(match);
});
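To make the difference between the two options concrete, here is a minimal sketch in plain JavaScript of what resolving the references amounts to. The arrays are in-memory stand-ins for the collections, and the ids and the helper name resolvePlayers are made up for illustration; Mongoose's populate does this for you with an additional query.

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const players = [
  { _id: "p1", name: "player 1", country: "country 1" },
  { _id: "p2", name: "player 2", country: "country 1" },
  { _id: "p3", name: "player 3", country: "country 2" }
];
const match = { _id: "m1", date: "2019-11-07T13:19:39.691Z", players: ["p1", "p3"] };

// Replace each stored player id with the full player document.
function resolvePlayers(matchDoc, playerCollection) {
  const byId = new Map(playerCollection.map((p) => [p._id, p]));
  return {
    ...matchDoc,
    players: matchDoc.players
      .map((id) => byId.get(id))
      .filter((p) => p !== undefined) // skip ids with no matching player
  };
}

const populated = resolvePlayers(match, players);
console.log(populated.players.map((p) => p.name)); // [ 'player 1', 'player 3' ]
```

With option 2 this join step disappears, because the match document already holds the player data; the price is that the copies have to be kept up to date.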

The way I see it, the matches collection here is a collection of documents that exist independently and are then connected to the players that participate in the matches. With that said, I would store an array of match keys.
I would suggest a nested document structure only if the nested document can be considered "owned" by the parent document, for example a todo document nested inside a todoList document.

This is a many-to-many relationship.
I am guessing that there will be about 100 players and 100 matches initially.
The design options are embedding or referencing.
(1) Embedding:
The most queried side has the less queried side embedded.
Based on your requirement (show the details of the match, and on this view the user sees the players that joined this particular match and their details), the match side will have the player data embedded.
The result is two collections. The main one is matches. The secondary one is players; it holds all the source data for a player (id, name, dob, country, and other details).
Only a few players are stored per match, and only a subset of each player's data is stored in the matches collection.
This results in duplication of player data. That is fine: what gets duplicated is mostly static info such as name and country. Some of it may still need updates over time, and the application needs to take care of that.
The player data is stored as an array of embedded documents in the matches collection. A possible design:
matches:
  _id
  matchId
  date
  place
  players [ { playerId 1, name1, country1 }, { playerId 2, ... }, ... ]
  outcome
players:
  _id
  name
  dob
  country
  ranking
(2) Referencing:
This will also have two collections: players and matches.
Referencing can go either way: the matches can refer to the players or vice versa.
Based on the requirement, the most queried side will have references of the less queried side; matches will have the player id references. This will be an array of player ids.
matches:
  _id
  matchId
  date
  place
  players [ playerId 1, playerId 2, ... ]
The players collection will have the same data as in the earlier case.
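For the embedding option, the part where "the application needs to take care of" updates could look roughly like the sketch below. It uses plain in-memory objects instead of a real driver, and the helper names toEmbeddedPlayer and refreshEmbeddedPlayer as well as the sample data are assumptions for illustration only.

```javascript
// Fields copied into each match; everything else stays in the players collection.
function toEmbeddedPlayer(player) {
  return { playerId: player._id, name: player.name, country: player.country };
}

// After a player changes, refresh the copies embedded in every match that lists them.
// Returns the number of matches that were touched.
function refreshEmbeddedPlayer(matches, player) {
  let touched = 0;
  for (const m of matches) {
    const index = m.players.findIndex((p) => p.playerId === player._id);
    if (index !== -1) {
      m.players[index] = toEmbeddedPlayer(player);
      touched += 1;
    }
  }
  return touched;
}

// Hypothetical sample data.
const alice = { _id: 1, name: "Alice", dob: "1990-01-01", country: "FR", ranking: 12 };
const bob = { _id: 2, name: "Bob", dob: "1992-05-05", country: "DE", ranking: 30 };
const matches = [
  { matchId: 100, players: [toEmbeddedPlayer(alice), toEmbeddedPlayer(bob)] },
  { matchId: 101, players: [toEmbeddedPlayer(bob)] }
];

alice.country = "BE"; // the source data changes...
const updatedCount = refreshEmbeddedPlayer(matches, alice); // ...so the copies are refreshed
console.log(updatedCount, matches[0].players[0].country); // 1 BE
```

With the referencing option this step is not needed, since a match only holds player ids and the details are always read from the players collection.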

Related

MongoDB Schema design Student Exams

I am trying to learn MongoDB with Node.js and Mongoose and know the basics (I come from a SQL background). I am trying to build a student exam database.
There are multiple exams; each exam has some students who answer MCQ questions, and I need to save their answers.
As I am new to NoSQL, I am looking for models for these requirements (personal learning project).
Here are my example objects, which I need to model:
// Exam Object
{
"ExamID": "1",
"ExamDate": "2021-01-01 09:00:00",
"ExamName": "Midterm Exam"
}
// student Object
{
"StudentID": "1",
"StudentName": "SOME FULL NAME"
}
// Question Object
{
"ExamQuestions":[
{
"QuestionID": "1",
"QuestionText": "SOME Long QUestion Text",
"Correct": "A",
"QuestionOptions":[
{
"Option_Title": "A",
"Option_Value": "This is Option A"
},
{
"Option_Title": "B",
"Option_Value": "This is Option B"
},
{
"Option_Title": "C",
"Option_Value": "This is Option C"
},
{
"Option_Title": "D",
"Option_Value": "This is Option D"
}
]
}
]
}
// students answer to questions Object
{
"Students":[
{
"StudentID":"1",
"Answers":[
{
"QuestionID": "1",
"ANSWER": "A"
},
{
"QuestionID": "2",
"ANSWER": "B"
},
{
"QuestionID": "3",
"ANSWER": "D"
}
]
},
{
"StudentID":"2",
"Answers":[
{
"QuestionID": "1",
"ANSWER": "B"
},
{
"QuestionID": "2",
"ANSWER": "C"
},
{
"QuestionID": "3",
"ANSWER": "A"
}
]
}
]
}

With MongoDB there are two ways I know of to deal with one-to-many relationships: the embedded approach and the referencing approach.
First approach: Embedding
(I will step back from your exam example, because I am not sure how to explain it with that, but I will do my best to break down relationships in MongoDB.)
Say we have a model for a user post (as in a social media app). A post has comments, and one post can have many comments.
So we can have a document like this:
{
  _id: "54844838asd3843aereaf",
  user: "John Doe",
  comments: [
    {
      usercomment: "random text"
    }
  ]
}
When another user comments, with the embedded approach you push the new comment onto that array and update the document, so it holds the new comments alongside the old ones.
Second approach: Referencing
Take the same user post and comment example.
Here you have two models, a user post model and a comments model, and the user post model references the comments model.
First we make a comments model:
const mongoose = require('mongoose');
const commentsModel = mongoose.model(
"commentsModel",
new mongoose.Schema({
commentedUser:{type:String, required:true},
comment:{type:String, required:true}
})
)
Now we need a post model in which each post references the comments model:
const postModel = mongoose.model(
"userPost",
new mongoose.Schema({
user:{type:String, required:true},
post:{type:String, required:true},
comments:[
{
type:mongoose.Schema.Types.ObjectId,
ref:"commentsModel"
// Each entry references commentsModel;
// only the _id of the comment is stored here
}
]
})
)
The procedure then works like this: when you receive a comment from a user, you save it with the comments model.
Then you take the _id that Mongoose gives back and push it onto the post's comments array (the key is to store only the comment's id in that array).
If my explanation did not fully make sense, you can refer to the docs, which are clear:
https://docs.mongodb.com/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/
You can also check out this blog post for more clarity if the docs don't make sense:
https://medium.com/@brandon.lau86/one-to-many-relationships-with-mongodb-and-mongoose-in-node-express-d5c9d23d93c2
A last resort is YouTube; sometimes it is best to hear someone explain it:
https://youtu.be/t_9fgpsO_vM
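The two-step referencing procedure (save the comment, then push its id onto the post) can be sketched without a database. The Maps below are hypothetical in-memory stand-ins for the two collections, and the helper names are made up; with Mongoose you would save a commentsModel document and push its _id onto the post instead.

```javascript
// In-memory stand-ins for the two collections (not the Mongoose API).
const comments = new Map();
const posts = new Map([
  ["post1", { _id: "post1", user: "John Doe", post: "hello", comments: [] }]
]);
let nextCommentId = 1;

// Step 1: save the comment as its own document and get its _id back.
function saveComment(commentedUser, comment) {
  const _id = `c${nextCommentId++}`;
  comments.set(_id, { _id, commentedUser, comment });
  return _id;
}

// Step 2: push only that _id onto the post's comments array.
function addCommentToPost(postId, commentedUser, comment) {
  const post = posts.get(postId);
  if (!post) throw new Error(`unknown post ${postId}`);
  const commentId = saveComment(commentedUser, comment);
  post.comments.push(commentId);
  return commentId;
}

addCommentToPost("post1", "Jane", "random text");
addCommentToPost("post1", "Sam", "another comment");
console.log(posts.get("post1").comments); // [ 'c1', 'c2' ]
```

The post stays small no matter how long the comments are, because it only ever stores ids; the comment bodies live in their own collection.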

Editing/Updating nested objects in documents CouchDB (node.js)

I'm trying to add to (i.e. push onto) an existing array in a CouchDB document.
Any feedback is greatly appreciated.
I have a document called "survey" inside my database called "database1".
It has "surveys", an array of objects that hold the information on each survey.
My goal is to update my "survey" document: not replace the array, but add a new object to the existing array. I've used "nano-couchdb" and "node-couchdb", but could not find a way to do it. I was able to update "surveys", but it replaced the whole thing instead of keeping the existing objects in the array.
1) Using Nano-couchdb:
db.insert({ _id, name }, "survey", function (error, resp) {
  if (!error) {
    console.log("it worked");
  } else {
    console.log("sad panda");
  }
});
2) Using couchdb-node:
couch.update("database1", {
  _id: "survey",
  _rev: "2-29b3a6b2c3a032ed7d02261d9913737f",
  surveys: { _id: name, name: name }
});
These work well for adding new documents to a database, but don't work for adding data to existing documents.
{
"_id": "survey",
"_rev": "2-29b3a6b2c3a032ed7d02261d9913737f",
"surveys": [
{
"_id": "1",
"name": "Chris"
},
{
"_id": "2",
"name": "Bob"
},
{
"_id": "1",
"name": "Nick"
}
]
}
I want my request to work the way
surveys.push({_id:"4",name:"harris"})
would, whenever new data comes in for this document.

Your data model should be improved. In CouchDB it doesn't make much sense to create one huge "surveys" document; store each survey as a separate document instead. If you need all surveys, just create a view for this. If you use CouchDB 2.0, you can also query for survey documents via Mango.
Your documents could look like this:
{
"_id": "survey.1",
"type": "survey",
"name": "Chris"
}
And your map function would look like that:
function (doc) {
if (doc.type === 'survey') emit(doc._id);
}
Assuming you saved this view as 'surveys' in the design doc '_design/documentLists', you can query it via http://localhost:5984/database1/_design/documentLists/_view/surveys.
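If the single-document layout from the question has to stay, appending to the array is a read-modify-write: fetch the document, push onto the array, and write the whole document back with its current _rev. Below is a minimal sketch, assuming `db` is a nano database handle (e.g. obtained with nano.db.use("database1")) from a nano version whose get and insert return promises; the function name appendSurvey is made up.

```javascript
// Append one entry to the "surveys" array of the existing "survey" document.
// `db` is assumed to be a nano database handle; nothing here is specific to
// the question's data beyond the document id "survey".
async function appendSurvey(db, entry) {
  const doc = await db.get("survey"); // the fetched document carries the current _rev
  doc.surveys = (doc.surveys || []).concat(entry); // keep existing entries, add the new one
  return db.insert(doc); // CouchDB rejects the write with a conflict if _rev is stale
}

// Usage (assumption: `db` was created elsewhere):
// appendSurvey(db, { _id: "4", name: "harris" }).then(console.log);
```

If two clients append at the same time, one of the writes fails with a conflict and has to be retried from the get step, which is one more reason to prefer one document per survey.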

one way direction, sails and mongodb

I have a question. I think I am doing something wrong.
I have two models:
TutorType and Student.
Here is my Student:
module.exports = {
attributes: {
tutor1Name:'string',
tutor1Type: {
model: 'tutorType'
},
contact: {
type: 'numeric'
},
}
};
And here is my TutorType:
module.exports = {
attributes: {
libelle: "string"
}
};
I use the predefined blueprints to insert. When I insert a new student,
I get this response:
{
"name": "Luke",
"tutor1Type": "Biological",
"createdAt": "2015-07-13T17:57:12.526Z",
"updatedAt": "2015-07-13T17:57:12.526Z",
"id": "55a3fbf8c9e93bf0266a63a8"
}
Should tutor1Type be an object instead of a string? As it is, I can put in any string I want. I would like to be able to insert only rows that exist in TutorType, with a foreign key on my Student.
What am I doing wrong?
EDIT
I just had a look in my DB.
When I send the ID of the tutorType to my controller, it adds an "Object(hashID)" in the DB. I presume that is good news.
The thing is that I can also insert strings that are not in my tutorList...
I do not understand how integrity works here...
Any ideas?
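MongoDB itself has no foreign-key constraints, so a check like this has to live in the application. One possible sketch, not a confirmed fix for this setup, is a Waterline lifecycle callback on Student that looks the referenced record up before creating. The factory name requireExistingTutorType and the lookup function passed to it are assumptions for illustration.

```javascript
// Build a beforeCreate-style callback that rejects unknown tutor types.
// `findTutorTypeById` is a hypothetical lookup returning a promise (or thenable)
// that resolves to the record, or to undefined when nothing matches.
function requireExistingTutorType(findTutorTypeById) {
  return function beforeCreate(values, cb) {
    // Treat the association as optional: nothing to check when it is absent.
    if (values.tutor1Type == null) return cb();
    Promise.resolve(findTutorTypeById(values.tutor1Type)).then(
      function (tutorType) {
        if (!tutorType) return cb(new Error("Unknown tutor1Type: " + values.tutor1Type));
        cb();
      },
      function (err) { cb(err); } // lookup failures are passed on as errors
    );
  };
}

// Possible wiring in the Student model (assumption: TutorType is the Sails global):
// beforeCreate: requireExistingTutorType(function (id) { return TutorType.findOne(id); })
```

The same callback would also be needed for updates if tutor1Type can change after creation.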

Combine Mongo Output with Node for API

I'm really new to Node, but I currently have a NodeJS / Express open source CMS and would like to output some API data for an app that I am working on. Forgive me if I'm not using the correct terminology; this is new to me.
What I currently have are two collections, locations and tours. The CMS allows me to create a relationship between the two. This simply stores an array of ObjectIDs in the locations record for each associated tour record.
What I want to do is take my API output code (below) and have it output the entire tours array, complete with all the fields (title, description, etc.), with each location record. Currently it only outputs an array of the IDs.
Here is my current code:
var async = require('async'),
keystone = require('keystone');
var Location = keystone.list('Location'),
Tour = keystone.list('Tour');
/**
* List Locations
*/
exports.list = function(req, res) {
Location.model.find(function(err, items) {
if (err) return res.apiError('database error', err);
res.apiResponse({
locations: items
});
});
}
/**
* Get Location by ID
*/
exports.get = function(req, res) {
Location.model.findById(req.params.id).exec(function(err, item) {
if (err) return res.apiError('database error', err);
if (!item) return res.apiError('not found');
res.apiResponse({
location: item
});
});
}
Current API output (truncated):
{
"locations": [
{
"_id": "53a47997ebe91d8a4a26d251",
"slug": "test-location",
"lastModified": "2014-06-20T20:19:14.484Z",
"commonName": "test location",
"__v": 3,
"url": "",
"tours": [
"53a47963ebe91d8a4a26d250"
],
"images": []
}
]
}
What I'm looking for:
{
"locations": [
{
"_id": "53a47997ebe91d8a4a26d251",
"slug": "test-location",
"lastModified": "2014-06-20T20:19:14.484Z",
"commonName": "test location",
"__v": 3,
"url": "",
"tours": [
{
"_id": "53a47963ebe91d8a4a26d250",
"title": "my test tour title",
"url": "url_to_audio_file"
}
],
"images": []
}
]
}
Anyone know if this is possible? Any help would be appreciated! Thanks!

It looks like you have setup your Location model to have a reference to the Tours, defined as an array of Tours. This means that when you store the Tour within your Location, you're not storing the data that represents that Tour, but instead an ID that references the Tour. When you perform the find operation, you're seeing that in the response that you send back to the client.
If this is the case, then you might want to take a look at Mongoose's populate function. This will take those references and populate them fully with the data that they contain.
So for instance, you can change your query to the following:
Location.model.find().populate('tours').exec(function(err, items) {
// items should now contain fully populated tours
});
Let me know if this isn't what you mean and I can try to help further.

The solution provided by @dylants is absolutely correct. However, for it to work you need to have tours declared as a Types.Relationship field in your Location list with the ref option set to Tour.
Check out the Keystone docs on Relationship Fields.
I included the many: true option in my example below, because I assumed this is a one-to-many relationship. If it isn't, you can discard it.
var keystone = require('keystone'),
Types = keystone.Field.Types,
Location = keystone.list('Location');
Location.add({
...
tours: { type: Types.Relationship, ref: 'Tour', many: true },
...
});
The List.relationship() method you mentioned is meant to be used only if you want a list of related documents to automatically appear in the Keystone Admin UI, and not to establish the actual relationship.
Hope this helps.
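The same populate call should carry over to the "get by id" endpoint from the question. Here is a sketch that wraps the handler in a small factory so the list can be passed in; the factory name makeGetLocation is made up, and it assumes `Location` is keystone.list('Location') with tours declared as a Relationship field as described above.

```javascript
// Build the "get location by id" handler with the tours populated.
// `Location` is assumed to be keystone.list('Location').
function makeGetLocation(Location) {
  return function get(req, res) {
    Location.model
      .findById(req.params.id)
      .populate('tours') // swap the stored ids for the full Tour documents
      .exec(function (err, item) {
        if (err) return res.apiError('database error', err);
        if (!item) return res.apiError('not found');
        res.apiResponse({ location: item });
      });
  };
}

// Usage (assumption): exports.get = makeGetLocation(keystone.list('Location'));
```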

Lucene Hierarchical Taxonomy Search

I have a set of documents annotated with hierarchical taxonomy tags,
E.g.
[
{
"id": 1,
"title": "a funny book",
"authors": ["Jean Bon", "Alex Terieur"],
"book_category": "/novel/comedy/new"
},
{
"id": 2,
"title": "a dramatic book",
"authors": ["Alex Terieur"],
"book_category": "/novel/drama"
},
{
"id": 3,
"title": "A hilarious book",
"authors": ["Marc Assin", "Harry Covert"],
"book_category": "/novel/comedy"
},
{
"id": 4,
"title": "A sad story",
"authors": ["Gerard Menvusa", "Alex Terieur"],
"book_category": "/novel/drama"
},
{
"id": 5,
"title": "A very sad story",
"authors": ["Gerard Menvusa", "Alain Terieur"],
"book_category": "/novel"
}]
I need to search books by "book_category". The search must return books that match the query category exactly or partially (within a defined depth threshold) and give them different scores depending on the degree of match.
E.g.: the query "book_category=/novel/comedy" with "depth_threshold=1" must return books with book_category=/novel/comedy (score=100%), and /novel and /novel/comedy/new (score < 100%).
I tried the TopScoreDocCollector in the search, but it returns the books whose book_category at least contains the query category, and gives them all the same score.
How can I get a search that also returns the more general category and gives different match scores to the results?
P.S.: I don't need a faceted search.
Thanks

There is no built-in query that supports this requirement, but you can use a DisjunctionMaxQuery with multiple ConstantScoreQuery instances. The exact category and the more general category can be searched with simple TermQuery instances. For the sub-categories, you can use a MultiTermQuery such as RegexpQuery to match all sub-categories if you don't know them up front. For example:
// the exact category
Query directQuery = new TermQuery(new Term("book_category", "/novel/comedy"));
// regex that matches one level more than your exact category
Query narrowerQuery = new RegexpQuery(new Term("book_category", "/novel/comedy/[^/]+"));
// the more general category
Query broaderQuery = new TermQuery(new Term("book_category", "/novel"));
directQuery = new ConstantScoreQuery(directQuery);
narrowerQuery = new ConstantScoreQuery(narrowerQuery);
broaderQuery = new ConstantScoreQuery(broaderQuery);
// 100% for the exact category
directQuery.setBoost(1.0F);
// 80% for the more specific category
narrowerQuery.setBoost(0.8F);
// 50% for the more general category
broaderQuery.setBoost(0.5F);
DisjunctionMaxQuery query = new DisjunctionMaxQuery(0.0F);
query.add(directQuery);
query.add(narrowerQuery);
query.add(broaderQuery);
This would give a result like:
id=3 title=a hilarious book book_category=/novel/comedy score=1.000000
id=1 title=a funny book book_category=/novel/comedy/new score=0.800000
id=5 title=A very sad story book_category=/novel score=0.500000
For a complete test case, see this gist: https://gist.github.com/knutwalker/7959819

This could be a solution, but I have more than one hierarchical field to query, and I want to use the CategoryPath indexed in the taxonomy.
I'm using the DrillDownQuery:
DrillDownQuery luceneQuery = new DrillDownQuery(searchParams.indexingParams);
luceneQuery.add(new CategoryPath("book_category/novel/comedy", '/'));
luceneQuery.add(new CategoryPath("subject/sub1/sub2", '/'));
This way the search returns the books that match the two category paths and their descendants.
To also retrieve the ancestors, I can start the drill-down from an ancestor of the requested CategoryPath (retrieved from the taxonomy).
The problem is that all the results get the same score.
I want to override the similarity/score function in order to calculate a score based on CategoryPath length, comparing the query CategoryPath with each returned document's CategoryPath (book_category).
E.g.:
if(queryCategoryPath.compareTo(bookCategoryPath)==0){
document.score = 1
}else if(queryCategoryPath.compareTo(bookCategoryPath)==1){
document.score = 0.9
}else if(queryCategoryPath.compareTo(bookCategoryPath)==2){
document.score = 0.8
} and so on.
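The scoring rule itself is independent of Lucene, so here is a sketch of just the logic in plain JavaScript. It is not Lucene API: the function name categoryScore, the 0.1 penalty per level, and the choice to score ancestors and descendants symmetrically are all assumptions. Inside Lucene this logic would still have to be hooked into a custom scorer.

```javascript
// Score a document category against the query category by hierarchy distance.
// Returns 0 when neither path is an ancestor of the other, or when the gap
// between them exceeds the depth threshold.
function categoryScore(queryPath, docPath, depthThreshold, penaltyPerLevel = 0.1) {
  const q = queryPath.split("/").filter(Boolean);
  const d = docPath.split("/").filter(Boolean);
  const shorter = q.length <= d.length ? q : d;
  const longer = q.length <= d.length ? d : q;
  // One path must be a prefix of the other for them to be related.
  const isPrefix = shorter.every((part, i) => part === longer[i]);
  if (!isPrefix) return 0;
  const distance = longer.length - shorter.length;
  if (distance > depthThreshold) return 0;
  return 1 - penaltyPerLevel * distance;
}

console.log(categoryScore("/novel/comedy", "/novel/comedy", 1));     // 1
console.log(categoryScore("/novel/comedy", "/novel/comedy/new", 1)); // 0.9
console.log(categoryScore("/novel/comedy", "/novel", 1));            // 0.9
console.log(categoryScore("/novel/comedy", "/novel/drama", 1));      // 0
```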
