Is it possible for Mongoose to automatically extract schemas from Mongodb? - node.js

I'm still learning MongoDB, Node.js, and Mongoose, so please excuse my ignorance if this question lacks understanding.
I find it somewhat redundant that each MongoDB collection has to be dissected in Mongoose. Specifically, all the fields of each MongoDB collection and their types need to be stated in Mongoose's schema.
So if I have a collection that contains documents sharing the same fields, such as:
> db.people.find()
{ "_id" : ObjectId("1111"), "name" : "Alice", "age": 30 }
{ "_id" : ObjectId("2222"), "name" : "Bob", "age": 25 }
{ "_id" : ObjectId("3333"), "name" : "Charlie", "age": 40 }
The way Mongoose and Node.js connect to this MongoDB collection is:
var mongoose = require('mongoose');
var personSchema = new mongoose.Schema({
name : String,
age : Number
});
mongoose.model("Person", personSchema, 'people');
where the last line contains the collection name as the 3rd parameter (explained here).
Is it possible to have Mongoose automatically extract the schema somehow from a Mongodb collection for a collection that contains documents of identical fields (i.e. they would have the same schema)? So that we don't have to define the schema in Mongoose.

Mongoose does not currently have a way of automatically building a Schema and Model given an example document.
While a simple document to Schema tool could be written and it would handle some cases reasonably well, depending on the nature of the collections and documents in your database, it wouldn't accurately reflect various aspects of the data model.
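To make the "simple document to Schema tool" idea concrete, here is a minimal sketch in plain JavaScript. It is not part of Mongoose: inferType and inferSchemaDefinition are hypothetical helpers, and the type names are plain strings used as labels for illustration. It assumes the sample documents are plain objects, and it marks anything it cannot classify as Mixed.

```javascript
// Hypothetical helper: map a sample value to a schema type label.
function inferType(value) {
  if (Array.isArray(value)) {
    // Use the first element as representative; empty arrays become Mixed.
    return value.length > 0 ? [inferType(value[0])] : ['Mixed'];
  }
  if (value instanceof Date) return 'Date';
  switch (typeof value) {
    case 'string': return 'String';
    case 'number': return 'Number';
    case 'boolean': return 'Boolean';
    default: return 'Mixed'; // null, nested objects, ObjectIds, ...
  }
}

// Hypothetical helper: merge the inferred types of several sample documents.
function inferSchemaDefinition(docs) {
  const definition = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      if (field === '_id') continue; // skip _id, it is not a declared field here
      const type = inferType(value);
      if (!(field in definition)) {
        definition[field] = type;
      } else if (JSON.stringify(definition[field]) !== JSON.stringify(type)) {
        definition[field] = 'Mixed'; // documents disagree on the type
      }
    }
  }
  return definition;
}

const people = [
  { _id: '1111', name: 'Alice', age: 30 },
  { _id: '2222', name: 'Bob', age: 25 },
];
const definition = inferSchemaDefinition(people);
console.log(definition);
```

Note that a sketch like this can only see values. It has no way to know that an ObjectId refers to another collection, which is exactly the limitation described next.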
For example, if you had two collections that were related:
var personSchema = Schema({
_id : Number,
name : String,
age : Number,
stories : [{ type: Schema.Types.ObjectId, ref: 'Story' }]
});
and
var storySchema = Schema({
title : String,
author : String
});
As you can see the stories field is an array of ObjectIds that are associated with the story collection. When stored in the MongoDB collection, it would be something like:
{
"_id" : ObjectId("52a1d3601d02442354276cfd"),
"name" : "Carl",
"age" : 27,
"stories" : [
ObjectId("52a1d33b1d02442354276cfc")
]
}
And stories:
{
"_id" : ObjectId("52a1d33b1d02442354276cfc"),
"title" : "Alice in Wonderland",
"author" : "Lewis Carroll"
}
As you can see, the stories array contains only an ObjectId without storing what it maps to (a document in the stories collection). One functionality of Mongoose that's lost without this connection being established in the schema is populate (reference).
Maybe more importantly, part of the benefit of using Mongoose is having a declared schema. While MongoDB may be "NoSQL" and allows documents to be schema-less, many of the drivers in fact encourage developers to have a schema, as it helps enforce a consistent document structure in a collection. If you're doing "production" development, having a declared rather than inferred schema just seems prudent to me. While you can use a design document, a rigid Schema defined in source code not only documents the design but also helps prevent the schema from being inadvertently changed.
It's quite easy to declare a Schema in Mongoose and it only needs to be done once per application instance.
You can of course use the underlying driver for MongoDB on NodeJS which doesn't have schema support at all.

Related

Non-existing field in Mongodb document appears in mongoose findById() result

I'm somewhat new to Mongoose and I came across a behaviour that I find strange. The document returned by Mongoose has fields that are not present in the actual MongoDB document, and they seem to be added by Mongoose based on the schema.
I use a schema similar to this (this one is simplified) :
const ProfessionalSchema = new mongoose.Schema({
product: {
details: [{
_id: false,
id: String, // UUID
name: String,
prestations: [{
_id: false,
id: String, // UUID
name: String,
price: Number,
}],
}],
},
[...]
My document, as shown in MongoDB with the mongo CLI utility, doesn't have a product field.
What I don't understand is why Professional.findById().exec() returns a document with a product: { details: [] } field. I expect not to have that field in the result returned by Mongoose, since it is not present in the original MongoDB document.
The Mongoose documentation found at https://mongoosejs.com/docs/guide.html (Schema and Model paragraph) didn't help.
My business logic requires that field to be absent rather than forced by the schema. Is this achievable?
Try taking a look at the default option. You could e.g. default your product to null and then, in your business logic, handle the "product is null" case rather than the "product field does not exist" case.
As for why this is happening, it's because you're dealing with a schema. If the field doesn't exist on the document, it's going to be auto-populated. The whole point of a schema is to ensure consistency of your document structure.

Mongoose Inner Join

user collection
user : {
"_id" : md5random,
"nickname" : "j1",
"name" : "jany"
}
user : {
"_id" : md5random,
"nickname" : "j2",
"name" : "jenneffer"
}
friendship collection
friendship : {
"_id" : md5rand,
"nick1" : "j1",
"nick2" : "j2",
"adTime" : date
}
for example SQL
SELECT friendship.adTime, user.name
FROM friendship
INNER JOIN user ON
(user.nickname=friendship.nick1 or user.nickname=friendship.nick2)
Is there any way in MongoDB to get the same result as this SQL query?
I want to get these results, and I know MongoDB does not support this kind of request directly.
So what is the best alternative? Can anybody write an example for me?
MongoDB is a document-oriented database, not a relational one, so it doesn't have a query language like SQL. Therefore you should change your database schema.
Here is an example:
Define your schema. You can't save relations in MongoDB like in SQL databases, so add the friends directly to the user. To add attributes to these relations, you have to create a new model (here: Friend).
var userSchema = mongoose.Schema({
nickname: String,
name: String,
friends: [{type: mongoose.Schema.Types.ObjectId, ref: 'Friend'}]
});
var friendSchema = mongoose.Schema({
user: {type: mongoose.Schema.Types.ObjectId, ref: 'User'},
adTime: Date // same field name as in the question and in the query below
});
var User = mongoose.model('User', userSchema);
var Friend = mongoose.model('Friend', friendSchema);
Your query to get all the adTime values could look like this:
User.find().populate('friends').exec(function(err, users) {
if (err) throw err;
var adTimes = [];
users.forEach(function(user) {
user.friends.forEach(function(friend) {
adTimes.push(friend.adTime);
});
});
response.send(adTimes); // adTimes should contain the adTime of every friend
});
NOTE: The above schema should work, but maybe you should use a relational database (like MySQL) or a graph database (like Neo4j) instead of a document-oriented one like MongoDB.
Use $lookup, which performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. The $lookup stage does an equality match between a field from the input documents and a field from the documents of the “joined” collection.
To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.
For more detail, see the $lookup documentation.
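As a rough sketch of how $lookup could reproduce the SQL from the question, the pipeline below joins friendship to user on nick1 or nick2. It assumes MongoDB 3.6 or later (for the let/pipeline form of $lookup) and that the collections are really named user and friendship as written in the question. The variable name friendshipNamesPipeline and the Friendship model mentioned in the comment are hypothetical.

```javascript
// Aggregation pipeline to run against the friendship collection.
const friendshipNamesPipeline = [
  {
    $lookup: {
      from: 'user', // assumed collection name, taken from the question
      let: { n1: '$nick1', n2: '$nick2' },
      pipeline: [
        {
          // Keep users whose nickname matches either side of the friendship.
          $match: {
            $expr: {
              $or: [
                { $eq: ['$nickname', '$$n1'] },
                { $eq: ['$nickname', '$$n2'] },
              ],
            },
          },
        },
        { $project: { _id: 0, name: 1 } },
      ],
      as: 'users',
    },
  },
  // $unwind drops friendships with no matching user, which turns the
  // left outer join into an inner join, and emits one row per match.
  { $unwind: '$users' },
  { $project: { _id: 0, adTime: 1, name: '$users.name' } },
];

// With a hypothetical Mongoose model for the friendship collection:
// Friendship.aggregate(friendshipNamesPipeline).then(rows => console.log(rows));
console.log(JSON.stringify(friendshipNamesPipeline, null, 2));
```

Each resulting row would carry adTime and name, mirroring the two columns selected in the SQL query.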

Weird mongoose behavior when viewing ObjectIds?

When viewing a sub-document with Robomongo I see something like this:
"views" : [
ObjectId("53a478431275cf0f3d91e27d"),
ObjectId("53a478431275cf0f3d91e27d")
]
But when I pull down the object through Mongoose into node.js, I see something like this:
views:
[ { _bsontype: 'ObjectID',
id: 'T\u001aôj#Ü«m¢©Ö',
viewDate: '2015-07-07T23:21:32.259Z' } ]
Yes, the schema is a little different, and I'm trying to write a script to remediate the data into the new format.
The schema is currently
views: [{view:{type: Schema.Types.ObjectId, ref: 'users'},viewDate:{type: Date, default: Date.now}}],
But
A) Why does the view object look all messed up in the latter, and
B) How can I get what I see in Robomongo? (Answered. See edit)
EDIT: Question B is answered. If I do .lean() to my query, then I'll be able to get it back as a non-mongoose object and it'll look how I expect it to look. So that just leaves question A
I managed to reproduce this.
First, you declared a schema similar to this:
views : { type : Schema.Types.ObjectId, ref : 'users' }
You created and wrote documents to the database using that schema.
Then you changed the schema to your current:
views: [{
view : { type: Schema.Types.ObjectId, ref: 'users' },
viewDate : { type: Date, default: Date.now }
}]
Using that schema, you are reading the documents that you wrote to the database using the first schema.
Those schemas are fundamentally different: the first is stored as a single ObjectId in the database (the term "subdocument" is a bit confusing, because in Mongoose, subdocuments are documents that are stored with their parent document; the method you're using is called "population" in Mongoose-speak), but the second schema makes views an array of documents that have two properties (view, which is stored as an ObjectId, and viewDate, which is a date).
This confuses Mongoose because it tries to apply the second schema to documents that were written using the first schema, and because of that, it's showing the internal representation of an ObjectId object instead of a stringified version of it.
This also explains why .lean() shows the correct results, because that tells Mongoose to return raw documents (as they are stored in the database) instead of trying to convert them according to the schema.
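Since the question mentions a script to remediate the data, here is a minimal sketch of the conversion step in plain JavaScript. It assumes the old data holds either a single id or an array of ids under views, and that it operates on raw documents (for example from .lean()). remediateViews and the fallback date are hypothetical: the original view date is not stored in the old shape, so the caller has to choose one.

```javascript
// Convert a `views` value from the old shape (id or array of ids)
// to the new shape (array of { view, viewDate } objects).
function remediateViews(views, fallbackDate) {
  let list;
  if (Array.isArray(views)) {
    list = views;
  } else if (views === null || views === undefined) {
    list = [];
  } else {
    list = [views]; // a single id stored under the first schema
  }
  return list.map(function (entry) {
    // Entries that already have a `view` property are in the new shape.
    if (entry !== null && typeof entry === 'object' && 'view' in entry) {
      return entry;
    }
    return { view: entry, viewDate: fallbackDate };
  });
}

// Strings stand in for ObjectIds in this example.
const unknownDate = new Date(0);
console.log(remediateViews(['53a478431275cf0f3d91e27d'], unknownDate));
```

The converted array would then be written back with an update on each document. That step depends on your setup and is left out here.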

Modelling reference to embedding document using Mongoose

I am modelling two types of events (events and subevents) in MongoDB like this:
// Define SubeventSchema first so it is not undefined when EventSchema uses it.
var SubeventSchema = mongoose.Schema({
'name' : String
});
var EventSchema = mongoose.Schema({
'name' : String,
'subEvent' : [ SubeventSchema ]
});
Now when I query a subevent I want to be able to also retrieve data about its corresponding superevent, so that some example data retrieved using Mongoose population feature could look like this:
EventModel.findOne({
name : 'Festival'
})
.populate('subEvent')
.exec(function (err, evt) { return evt; });
{
name : 'Festival',
subEvent: [
{ name : 'First Concert' },
{ name : 'Second Concert' }
]
}
EventModel.findOne({
'subEvent.name' : 'First Concert'
}, {
'subEvent.$' : 1
})
.populate('superEvent') // This will not work, this is the actual problem of my question
.exec(function (err, subevt) { return subevt; });
{
name: 'First Concert',
superEvent: {
name: 'Festival'
}
}
A solution I can think of is not to embed but to reference like this:
var EventSchema = mongoose.Schema({
'name' : String,
'subEvent' : [ {
'type' : mongoose.Schema.Types.ObjectId,
'ref' : 'SubeventSchema'
} ]
});
var SubeventSchema = mongoose.Schema({
'name' : String,
'superEvent' : {
'type' : mongoose.Schema.Types.ObjectId,
'ref' : 'EventSchema'
}
});
I am looking for a solution based on the first example using embedded subevents, though. Can this be achieved and in case yes, how?
I think your mental model of document embedding isn't correct. The major misunderstanding (and this is very common) is that you "query a subevent" (query an embedded document). According to your current Event schema, a Subevent is just a document embedded in an Event document. The embedded SubEvent is not a top-level document; it's not a member of any collection in MongoDB. Therefore, you don't query for it. You query for Events (which are the actual collection-level documents in your schema) whose subEvents have certain properties. E.g. one way people translate the query
db.events.find({ "subEvent" : { "name" : "First Concert" } })
into plain English is as "find all the subevents with the name 'First Concert'". This is wrong. The right translation is "find all events that have at least one subevent whose name is 'First Concert'" (the "at least one" part depends on knowledge that subEvent is an array).
Coming back to the specific question, you can hopefully see now that trying to do a populate of a "superevent" on a subevent makes no sense. Your queries return events. The optimal schema, be it subevents embedded in events, one- or two-way references between events and subevents documents in separate collections, or events denormalized into the constituent subevent documents, cannot be determined from the information in the question because the use case is not specified.
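The difference between "querying subevents" and "querying events by their subevents" can be illustrated with a plain JavaScript analogue that uses in-memory arrays instead of MongoDB. The data and the helper names findEventsBySubEventName and findSubEventWithParent are made up for this sketch.

```javascript
const events = [
  { name: 'Festival', subEvent: [{ name: 'First Concert' }, { name: 'Second Concert' }] },
  { name: 'Conference', subEvent: [{ name: 'Keynote' }] },
];

// Analogue of the query: the result is a list of events, not subevents.
function findEventsBySubEventName(allEvents, subEventName) {
  return allEvents.filter(function (event) {
    return event.subEvent.some(function (sub) {
      return sub.name === subEventName;
    });
  });
}

// The "superevent" of an embedded subevent is simply the event that
// contains it, so it can be read off the parent after the query.
function findSubEventWithParent(allEvents, subEventName) {
  for (const event of allEvents) {
    const sub = event.subEvent.find(function (s) {
      return s.name === subEventName;
    });
    if (sub) {
      return { name: sub.name, superEvent: { name: event.name } };
    }
  }
  return null;
}

console.log(findEventsBySubEventName(events, 'First Concert'));
console.log(findSubEventWithParent(events, 'First Concert'));
```

With embedded subevents, the shape asked for in the question can therefore be built in application code from the returned event, without any populate call.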
Perhaps this is a situation where you need to modify your thinking rather than the schema itself. Mongoose .populate() supports the basic ideas of MongoDB "projection", or more commonly referred to as "field selection". So rather than try to model around this, just select the fields you want to populate.
So your second schema form is perfectly valid, just change how you populate:
EventModel.find({}).populate("subEvent", "name").exec(function(err,docs) {
// "subevent" array items only contain "name" now
});
This is actually covered in the Mongoose documentation under the "populate" section.

MongoDB _id is converted to ObjectID automatically, but sometimes it is not.

I am using a wrapper called mongoskin to access MongoDB. mongoskin is a simple wrapper around the MongoDB JavaScript API.
But when I write to MongoDB, sometimes _id is converted to ObjectID and sometimes it is not. The inconsistent behavior causes many problems when I have to compare _id values. For example:
In the following documents in the company collection, "creator" is not converted to ObjectID, but each item in "clients" is converted to ObjectID automatically.
> db.company.find()
{ "_id" : ObjectId("53d4b452f5b25900005cb998"), "name" : "Default Company Co.", "clients" : [ ObjectId("53d4b452f5b25900005cb999"), ObjectId("53d4b452f5b25900005cb99a") ] }
{ "_id" : ObjectId("53d4b452f5b25900005cb999"), "name" : "client company for 777 - updated", "creator" : "53d4b452f5b25900005cb998", "ssn" : "12-123-1234" }
This is the nodejs code I used to assign _id for "creator"
clientCompany.creator = req.session.user.company_id;
This is the nodejs code I used to assign _id for "clients"
var updateObj = {$addToSet: {clients:resultClient._id} };
// update creator company.clients
creatorCompany.update(updateObj, function(err, result) { ...}
When I console.log "req.session.user.company_id" and "resultClient._id", they both look like strings. How come one ends up as an ObjectID in MongoDB? If there is an automatic conversion, how do I make this behavior consistent?
Thanks!
I'm guessing resultClient is the result of a query and req.session.user.company_id a string from your web application? In that case you need to create an ObjectId from the string:
clientCompany.creator = mongoskin.ObjectID(req.session.user.company_id);
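For the comparison problem mentioned in the question, one option is to compare ids by their string form, so it does not matter whether a given value is an ObjectID or a plain string. The sketch below is plain JavaScript with no driver dependency. isObjectIdHex and idsEqual are hypothetical helpers, FakeObjectId only stands in for the driver's ObjectID in this example, and the approach relies on the ObjectID's toString() returning its 24-character hex string.

```javascript
// A 24-character hex string is the textual form of an ObjectID.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

function isObjectIdHex(value) {
  return typeof value === 'string' && OBJECT_ID_HEX.test(value);
}

// Compare two ids regardless of whether they are ObjectIDs or strings.
function idsEqual(a, b) {
  return String(a) === String(b);
}

// Stand-in for the driver's ObjectID, used only to make this example runnable.
class FakeObjectId {
  constructor(hex) {
    this.hex = hex;
  }
  toString() {
    return this.hex;
  }
}

const storedId = new FakeObjectId('53d4b452f5b25900005cb998');
const sessionId = '53d4b452f5b25900005cb998';
console.log(isObjectIdHex(sessionId));
console.log(idsEqual(storedId, sessionId));
```

Checking isObjectIdHex before converting a session value may also help catch malformed ids early, though storing ids consistently as ObjectIDs, as suggested above, avoids the mismatch in the first place.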
