How to create subdocument on mongodb dynamically - node.js

I have a mongodb database with a collection as follows:
var mongoose = require('mongoose');
var journalSchema = mongoose.Schema({
title : String,
journalid: {type:String, index: { unique: true, dropDups: true }},
articleCount : {type:Number, default:1},
});
module.exports = mongoose.model('Journal', journalSchema);
Now that my database is growing, I would like to have an "article count" field per year.
I could make an array as follows, years : [{articleCount : Number}], and fill it by accessing journal.years[X] for a specific year, where X corresponds to 0 for 1997, 1 for 1998, etc.
However, my data is scraped dynamically and I would like to have a function in my express server where articleCount is increased based on the year.
For example:
function updateYear(journalid, year, callback) {
Journal.findOneAndUpdate(
{'journalid':journalid},
{$inc : {'articleCount' : 1}}, // DO SOMETHING WITH year HERE
function() {
callback();
});
}
This does increase the article count but I don't know where to include the "year"...
What would be the fastest way of doing that, given that I have to go through a lot of articles (10 million+) and I would like to get/update the article count for a given year efficiently?
Hope I'm clear enough, Thanks!

Make your array a set of objects with a year and count:
journalYears: [
{
year: String, // or Number
count: Number
}
]
e.g.
journalYears: [
{ year: "2014", count: 25 },
{ year: "2015", count: 15 }
]
Then for your update:
function updateYear(journalId, year, callback) {
Journal.findOneAndUpdate(
{journalid: journalId, "journalYears.year": year}, // the question's schema identifies journals by journalid
{ $inc: { "journalYears.$.count": 1 } },
callback
);
}
The index of the first match from your query is saved in $. It's then used to update that specific element in your array.
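One caveat with the positional update: it only matches when an element for that year already exists in the array. A minimal sketch of one way to handle both cases is to build two filter/update pairs and run the second only when the first matched nothing. The helper name `yearIncrementOps` is hypothetical, and filtering on the question's `journalid` field is an assumption (swap in `_id` if that is what you pass around).

```javascript
// Hypothetical helper: builds the arguments for two update calls.
function yearIncrementOps(journalid, year) {
  return {
    // 1) Increment the element for this year; "$" stands for the index of the
    //    first array element matched by the filter.
    increment: {
      filter: { journalid: journalid, "journalYears.year": year },
      update: { $inc: { "journalYears.$.count": 1 } }
    },
    // 2) Fallback when (1) matched no document: add the year, but only if it
    //    is still absent, so two concurrent writers cannot add it twice.
    create: {
      filter: { journalid: journalid, "journalYears.year": { $ne: year } },
      update: { $push: { journalYears: { year: year, count: 1 } } }
    }
  };
}

const ops = yearIncrementOps("abc123", "2015");
console.log(JSON.stringify(ops.increment.update));
// {"$inc":{"journalYears.$.count":1}}
```

Each pair can be passed to an update call such as `findOneAndUpdate(filter, update, callback)`. An index on `journalid` (already unique in the question's schema) keeps the lookup cheap for a large number of updates.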

Related

How to make query for 3 level nested object in mongodb

I am trying to query and filter documents by comparing specific fields at the third level of nesting. I am not sure how to use $lte or $gte at that depth. For example, for the schema below I want to keep only documents whose maximum delivery time (delivery_rule -> time -> max) is at most a given value, but I can't get it to work with this query:
if (filters.time) {
query = {
...query,
"delivery_rule.time.max": { $lte: filters.time }
};
}
and my schema is :
var VendorSchema = new mongoose.Schema({
...,
delivery_rule: {
...,
time: {
min: {
type: Number,
default: 0
},
mid: {
type: Number,
default: 0
},
max: {
type: Number,
default: 0
}
},
});
module.exports = mongoose.model("Vendor", VendorSchema);
When I run my query with filters.time = 30 in the shell it returns [], even though I have 5 documents with a time of 60.
The query I wrote works correctly. I had confused myself while testing.
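To see why the empty result was expected, here is a small in-memory sketch of what a dot-notation `$lte` filter checks. The helpers `getPath` and `matchesLte` are hypothetical stand-ins written for illustration; they are not part of Mongoose or MongoDB and they ignore array fields.

```javascript
// Resolve a dot-notation path such as "delivery_rule.time.max" on a plain object.
function getPath(obj, path) {
  return path.split(".").reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, obj);
}

// Simplified stand-in for the filter { [path]: { $lte: limit } }.
function matchesLte(doc, path, limit) {
  const value = getPath(doc, path);
  return typeof value === "number" && value <= limit;
}

const vendor = { delivery_rule: { time: { min: 0, mid: 30, max: 60 } } };

console.log(matchesLte(vendor, "delivery_rule.time.max", 30)); // false
console.log(matchesLte(vendor, "delivery_rule.time.max", 60)); // true
```

A document whose `max` is 60 is not "less than or equal to 30", so a filter value of 30 correctly returns nothing, while 60 or higher would return the five documents.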

mongodb date insert like using mongoose

I am inserting bulk records in mongodb. I am using the native DB driver to do this, as the performance is much higher. At other points in my application, I am using mongoose. The problem I am having is that mongoose stores the date as a date type, whereas the native driver just inserts it as the number of milliseconds since 1970. So later queries in mongoose based on that date do not work.
Here's my mongoose schema:
var MySchema = new Schema({
name : { type: String, required: true },
updatedAt : Date
});
And my mongo db mass insert:
var newRec = {
name : entry.Name,
updatedAt : Date.now()
};
newRecords.push(newRec);
MySchema.collection.insert(newRecords, function(err, newRecs) {
res.json(newRecs.ops);
});
This produces in the DB:
{
"_id": {
"$oid": "562818ecf24d540f0053a38d"
},
"name": "Cool Record",
"updatedAt": 12312423512
}
Whereas if it was run through Mongoose it would produce:
{
"_id": {
"$oid": "561fd90285b5e73f5626f74e"
},
"name": "Cool Record",
"updatedAt": {
"$date": "2015-10-20T20:01:17.553Z"
}
}
If going through mongoose, queries like this work well:
MySchema.find({ updatedAt : { $gt: lastSynced }}).exec();
But do not work otherwise.
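Date.now() returns a number: the milliseconds elapsed since 1 January 1970 UTC. While it conceptually represents a date, it isn't actually a Date object (note that Date.now without the parentheses is just a reference to the function):
var x = Date.now();
typeof x;
// "number"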
You need to switch your schema to be:
var MySchema = new Schema({
name : { type: String, required: true },
updatedAt : Number
});
Alternatively, you can insert a real Date object:
var newRec = {
name: entry.Name,
updatedAt: new Date()
}
and keep your schema as it is.
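The difference is easy to check in plain JavaScript. The sketch below also shows one way to normalise records before a bulk insert so that numeric timestamps become Date objects; `withDate` is a hypothetical helper name, not something from the driver or from Mongoose.

```javascript
console.log(typeof Date.now);            // "function"
console.log(typeof Date.now());          // "number"
console.log(new Date() instanceof Date); // true

// Hypothetical helper: make sure updatedAt is a Date before the record is
// handed to the native driver, so it is stored as a date type.
function withDate(record) {
  const value = record.updatedAt;
  return Object.assign({}, record, {
    updatedAt: value instanceof Date ? value : new Date(value)
  });
}

const fixed = withDate({ name: "Cool Record", updatedAt: 1445371277553 });
console.log(fixed.updatedAt.toISOString()); // "2015-10-20T20:01:17.553Z"
```

Records that already went in as numbers would need a one-off conversion along the same lines before date-based queries can match them.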
You may also do it like this:
Give the field the Date type with Date.now as its default, then pass new Date when creating a document to set the date explicitly.
var songSchema = new mongoose.Schema({
name : String,
year : {type : Date, default : Date.now},
singer : String,
});
var song = mongoose.model("song", songSchema);
var xyz= new song({
name : "abcd",
year :new Date, // to set the date first set new Date
singer : "aabbccdd",
});
You can then use the setDate, setMonth and setFullYear methods to adjust the value. The year field is a regular JavaScript Date, so all of the Date methods are available on it. You can find out more in the mongoose documentation.
xyz.save(function(err, songs){
if(err)
winston.error("Something is wrong " + err);
else {
songs.year.setDate(3); // day of the month
songs.year.setMonth(3); // months are zero-based, so 3 is April
songs.year.setFullYear(2015);
winston.info(songs);
}
});
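A short sketch of how the Date setters behave, using the UTC variants so the printed value does not depend on the local time zone. The starting date is an arbitrary choice for illustration.

```javascript
// Start from a fixed date so the result is reproducible.
const year = new Date(Date.UTC(2000, 0, 1));

year.setUTCFullYear(2015);
year.setUTCMonth(3); // months are zero-based: 0 is January, 3 is April
year.setUTCDate(3);  // day of the month, 1-based

console.log(year.toISOString()); // "2015-04-03T00:00:00.000Z"
```

Note that these setters change the Date in place. As far as I know, Mongoose's change tracking does not notice in-place Date changes, so a call like `doc.markModified('year')` is needed before `doc.save()` for the new value to be persisted.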

Find documents with limits from multiple MongoDB collections and as return sorted list using Mongoose

If I have different types of documents, each in their own collections, is there a way to search for posts from all collections and return them as a single list ordered by something like a datestamp?
Further, I need:
To be able to decide how many posts I need in total from all collections
The posts should be ordered by the same criteria - which means the number of posts will be different from each collection
To be able to start collecting with an offset (say, give me 100 posts starting at post no. 201).
If I saved all documents in the same collection this task would be rather easy but would also require a dynamic, largely undocumented schema since each document will be very different except for a few parameters such as the date.
So, is there a way to keep my documents in well defined schemas, each in separate collections but still being able to accomplish the above?
For argument's sake, here's how the schemas could look divided up:
var InstagramPostSchema = new Schema({
date: Date,
imageUrl: String,
...
})
var TwitterPostSchema = new Schema({
date: Date,
message: String,
...
})
And if I made one universal schema it could look like this:
var SocialPostSchema = new Schema({
date: Date,
type: String,
postData: {}
})
What's the preferred way to do this?
The ideal way would be if I could write separate schemas that inherits from a common base schema, but I'm not familiar enough with Mongoose and MongoDB to know if there's a native way to do this.
There is a good way to do this, which is a bit nicer than your final suggestion and has some benefits over it: use discriminators.
The basic idea is that there is a base schema, with common properties or even no properties at all, from which you define your main collection. Each other schema then inherits from it and shares the same collection.
As a basic demonstration:
var async = require('async'),
util = require('util'),
mongoose = require('mongoose'),
Schema = mongoose.Schema;
mongoose.connect('mongodb://localhost/test');
function BaseSchema() {
Schema.apply(this,arguments);
this.add({
date: { type: Date, default: Date.now },
name: { type: String, required: true }
});
}
util.inherits(BaseSchema,Schema);
var socialPostSchema = new BaseSchema();
var instagramPostSchema = new BaseSchema({
imageUrl: { type: String, required: true }
});
var twitterPostSchema = new BaseSchema({
message: { type: String, required: true }
});
var SocialPost = mongoose.model('SocialPost', socialPostSchema ),
InstagramPost = SocialPost.discriminator(
'InstagramPost', instagramPostSchema ),
TwitterPost = SocialPost.discriminator(
'TwitterPost', twitterPostSchema );
async.series(
[
function(callback) {
SocialPost.remove({},callback);
},
function(callback) {
InstagramPost.create({
name: 'My instagram pic',
imageUrl: '/myphoto.png'
},callback);
},
function(callback) {
setTimeout(
function() {
TwitterPost.create({
name: "My tweet",
message: "ham and cheese panini #livingthedream"
},callback);
},
1000
);
},
function(callback) {
SocialPost.find({}).sort({ "date": -1 }).exec(callback);
}
],
function(err,results) {
if (err) throw err;
results.shift();
console.dir(results);
mongoose.disconnect();
}
);
With output:
[ { __v: 0,
name: 'My instagram pic',
imageUrl: '/myphoto.png',
__t: 'InstagramPost',
date: Wed Aug 19 2015 22:53:23 GMT+1000 (AEST),
_id: 55d47c43122e5fe5063e01bc },
{ __v: 0,
name: 'My tweet',
message: 'ham and cheese panini #livingthedream',
__t: 'TwitterPost',
date: Wed Aug 19 2015 22:53:24 GMT+1000 (AEST),
_id: 55d47c44122e5fe5063e01bd },
[ { _id: 55d47c44122e5fe5063e01bd,
name: 'My tweet',
message: 'ham and cheese panini #livingthedream',
__v: 0,
__t: 'TwitterPost',
date: Wed Aug 19 2015 22:53:24 GMT+1000 (AEST) },
{ _id: 55d47c43122e5fe5063e01bc,
name: 'My instagram pic',
imageUrl: '/myphoto.png',
__v: 0,
__t: 'InstagramPost',
date: Wed Aug 19 2015 22:53:23 GMT+1000 (AEST) } ] ]
So the things to notice there are that even though we defined separate models and even separate schemas, all items are in fact in the same collection. As part of the discriminator, each document stored has a __t field depicting its type.
So the really nice things here are:
You can store everything in one collection and query all objects together
You can separate validation rules per schema and/or define things in a "base" so you don't need to write them out multiple times.
The objects "explode" into their own class definitions through the schema attached to the model for each type. This includes any attached methods. So these are first class objects when you create or retrieve the data.
If you wanted to work with just a specific type such as "TwitterPost", then using that model "automatically" filters out anything else but the "twitter" posts from any query operations performed, just by using that model.
Keeping things in the one collection makes a lot of sense, especially if you want to try and aggregate data across the information for different types.
A word of caution is that though you can have completely different objects using this pattern, it is generally wise to have as much in common as makes sense to your operations. This is particularly useful in querying or aggregating across different types.
So where possible, try to convert "legacy imported" data to a more "common" format of fields, and just keep the unique properties that are really required for each object type.
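As a rough mental model of what the discriminator key does, here is an in-memory sketch in plain JavaScript. It is only a simulation: `findByType` is a hypothetical helper, and the sample documents are cut down from the output above.

```javascript
// All document types live in one "collection"; __t records the type.
const posts = [
  { __t: "InstagramPost", name: "My instagram pic", date: new Date("2015-08-19T12:53:23Z") },
  { __t: "TwitterPost", name: "My tweet", date: new Date("2015-08-19T12:53:24Z") }
];

// Roughly what querying through a discriminator model adds: a filter on the key.
function findByType(collection, typeName) {
  return collection.filter(function (doc) {
    return doc.__t === typeName;
  });
}

// Querying through the base model applies no type filter, so one sort
// covers every type at once (newest first here).
const newestFirst = posts.slice().sort(function (a, b) {
  return b.date - a.date;
});

console.log(findByType(posts, "TwitterPost").length); // 1
console.log(newestFirst[0].name);                     // "My tweet"
```

This is why a single sorted, limited and offset query against the base model covers the "one list from all types" requirement without any merging in application code.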
As to the first part of your question where you wanted to query "each collection" with something like different limits and then sort the overall results from each, well you can do that too.
There are various techniques, but keeping to the MongoDB form, there is nedb, which you can use to both store the combined results and "sort" them as well. And all is done in a manner you are used to:
var async = require('async'),
util = require('util'),
mongoose = require('mongoose'),
DataStore = require('nedb'),
Schema = mongoose.Schema;
mongoose.connect('mongodb://localhost/test');
function BaseSchema() {
Schema.apply(this,arguments);
this.add({
date: { type: Date, default: Date.now },
name: { type: String, required: true }
});
}
util.inherits(BaseSchema,Schema);
var socialPostSchema = new BaseSchema();
var instagramPostSchema = new BaseSchema({
imageUrl: { type: String, required: true }
});
var twitterPostSchema = new BaseSchema({
message: { type: String, required: true }
});
var SocialPost = mongoose.model('SocialPost', socialPostSchema ),
InstagramPost = SocialPost.discriminator(
'InstagramPost', instagramPostSchema ),
TwitterPost = SocialPost.discriminator(
'TwitterPost', twitterPostSchema );
async.series(
[
function(callback) {
SocialPost.remove({},callback);
},
function(callback) {
InstagramPost.create({
name: 'My instagram pic',
imageUrl: '/myphoto.png'
},callback);
},
function(callback) {
setTimeout(
function() {
TwitterPost.create({
name: "My tweet",
message: "ham and cheese panini #livingthedream"
},callback);
},
1000
);
},
function(callback) {
var ds = new DataStore();
async.parallel(
[
function(callback) {
InstagramPost.find({}).limit(1).exec(function(err,posts) {
async.each(posts,function(post,callback) {
post = post.toObject();
post.id = post._id.toString();
delete post._id;
ds.insert(post,callback);
},callback);
});
},
function(callback) {
TwitterPost.find({}).limit(1).exec(function(err,posts) {
async.each(posts,function(post,callback) {
post = post.toObject();
post.id = post._id.toString();
delete post._id;
ds.insert(post,callback);
},callback);
});
}
],
function(err) {
if (err) return callback(err);
ds.find({}).sort({ "date": -1 }).exec(callback);
}
);
}
],
function(err,results) {
if (err) throw err;
results.shift();
console.dir(results);
mongoose.disconnect();
}
);
Same output as before with the latest post sorted first, except that this time a query was sent to each model and we just got results from each and combined them.
If you change the query output and writes to the combined model to use "stream" processing, then you even have basically the same memory consumption and likely faster processing of results from parallel queries.
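If pulling in nedb feels heavy, the same combine-and-sort step can be sketched with plain arrays. This swaps nedb for an in-memory merge, so it is a different technique from the listing above. The helper name `mergePage` and the sample data are hypothetical. The sketch assumes each per-collection query was sorted newest first and limited to `offset + limit` documents, which is enough to get the page right even when every item on it comes from a single collection.

```javascript
// Hypothetical helper: each input list holds one collection's results,
// already sorted newest-first and limited to offset + limit documents.
function mergePage(lists, offset, limit) {
  const merged = lists.reduce(function (all, list) {
    return all.concat(list);
  }, []);
  merged.sort(function (a, b) {
    return b.date - a.date; // newest first across all sources
  });
  return merged.slice(offset, offset + limit);
}

const instagram = [
  { type: "instagram", date: new Date("2015-08-19T12:00:03Z") },
  { type: "instagram", date: new Date("2015-08-19T12:00:01Z") }
];
const twitter = [
  { type: "twitter", date: new Date("2015-08-19T12:00:04Z") },
  { type: "twitter", date: new Date("2015-08-19T12:00:02Z") }
];

const page = mergePage([instagram, twitter], 1, 2);
console.log(page.map(function (p) { return p.type; })); // [ 'instagram', 'twitter' ]
```

The cost grows with the offset, since every collection has to return `offset + limit` documents, so deep paging is where the single-collection discriminator approach pays off.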

mongoose subdocument sorting

I have an article schema with a comments subdocument array that contains all the comments I got for that particular article.
What I want to do is select an article by id, populate its author field and also the author field in comments, then sort the comments subdocument by date.
the article schema:
var articleSchema = new Schema({
title: { type: String, default: '', trim: true },
body: { type: String, default: '', trim: true },
author: { type: Schema.ObjectId, ref: 'User' },
comments: [{
body: { type: String, default: '' },
author: { type: Schema.ObjectId, ref: 'User' },
created_at: { type : Date, default : Date.now, get: getCreatedAtDate }
}],
tags: { type: [], get: getTags, set: setTags },
image: {
cdnUri: String,
files: []
},
created_at: { type : Date, default : Date.now, get: getCreatedAtDate }
});
Static method on the article schema (I would love to sort the comments here; can I do that?):
load: function (id, cb) {
this.findOne({ _id: id })
.populate('author', 'email profile')
.populate('comments.author')
.exec(cb);
},
I have to sort it elsewhere:
exports.load = function (req, res, next, id) {
var User = require('../models/User');
Article.load(id, function (err, article) {
var sorted = article.toObject({ getters: true });
sorted.comments = _.sortBy(sorted.comments, 'created_at').reverse();
req.article = sorted;
next();
});
};
I call toObject to convert the document to a plain JavaScript object. I can keep my getters / virtuals, but what about methods?
Anyway, I do the sorting logic on the plain object and I'm done.
I am quite sure there is a much better way of doing this, please let me know.
I could have written this out as a few things, but on consideration "getting the mongoose objects back" seems to be the main consideration.
So there are various things you "could" do. But since you are "populating references" into an Object and then wanting to alter the order of objects in an array there really is only one way to fix this once and for all.
Fix the data in order as you create it
If you want your "comments" array sorted by the date they are "created_at" this even breaks down into multiple possibilities:
The array "should" already be in "insertion" order, so the "latest" comment is last, as you note. But you can also "modify" this in recent ( past couple of years now ) versions of MongoDB with $position as a modifier to $push :
Article.update(
{ "_id": articleId },
{
"$push": { "comments": { "$each": [newComment], "$position": 0 } }
},
function(err,result) {
// other work in here
}
);
This "prepends" the array element to the existing array at the "first" (0) index so it is always at the front.
If "positional" updates do not suit your logic, or you simply "want to be sure", the $sort modifier to $push has been around for even "longer":
Article.update(
{ "_id": articleId },
{
"$push": {
"comments": {
"$each": [newComment],
"$sort": { "created_at": -1 } // field name only, no "$" prefix
}
}
},
function(err,result) {
// other work in here
}
);
And that will "sort" the array elements by the specified property on each modification. You can even do:
Article.update(
{ },
{
"$push": {
"comments": {
"$each": [],
"$sort": { "created_at": -1 } // field name only, no "$" prefix
}
}
},
{ "multi": true },
function(err,result) {
// other work in here
}
);
And that will sort every "comments" array in your entire collection by the specified field in one hit.
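To make the difference between the two modifiers concrete, here is a simplified in-memory stand-in for `$push` with `$each`, `$position` and `$sort`. The function `applyPush` is hypothetical and only handles a single numeric or Date sort field; the real operators run inside MongoDB.

```javascript
// Simplified stand-in for $push with $each, $position and $sort.
function applyPush(array, modifiers) {
  const result = array.slice();
  // Without $position the new elements are appended at the end.
  const position = modifiers.$position === undefined ? result.length : modifiers.$position;
  result.splice.apply(result, [position, 0].concat(modifiers.$each));
  if (modifiers.$sort) {
    const field = Object.keys(modifiers.$sort)[0];
    const direction = modifiers.$sort[field]; // 1 ascending, -1 descending
    result.sort(function (a, b) {
      return (a[field] - b[field]) * direction;
    });
  }
  return result;
}

const comments = [
  { body: "first", created_at: 1 },
  { body: "second", created_at: 2 }
];
const newComment = { body: "third", created_at: 3 };

const prepended = applyPush(comments, { $each: [newComment], $position: 0 });
const sorted = applyPush(comments, { $each: [newComment], $sort: { created_at: -1 } });

console.log(prepended.map(function (c) { return c.body; })); // [ 'third', 'first', 'second' ]
console.log(sorted.map(function (c) { return c.body; }));    // [ 'third', 'second', 'first' ]
```

The prepend only places the new element and leaves existing elements where they were, whereas the sort reorders the whole array. That is why the empty `$each` with `$sort` is useful as a one-off fix for data that is already stored out of order.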
Other solutions are possible using either .aggregate() to sort the array and/or "re-casting" to mongoose objects after you have done that operation or after doing your own .sort() on the plain object.
Both of these really involve creating a separate model object and "schema" with the embedded items including the "referenced" information. So you could work along those lines, but it seems to be unnecessary overhead when you could just sort the data by your "most needed" means in the first place.
The alternate is to make sure that fields like "virtuals" always "serialize" into an object format with .toObject() on call and just live with the fact that all the methods are gone now and work with the properties as presented.
The last is a "sane" approach, but if what you typically use is "created_at" order, then it makes much more sense to "store" your data that way with every operation so when you "retrieve" it, it stays in the order that you are going to use.
You could also use JavaScript's native Array sort method after you've retrieved and populated the results:
// Convert the mongoose doc into a plain ('vanilla') object:
const article = yourArticleDoc.toObject();
// Sorts oldest first; swap the -1 and 1 below for newest first.
article.comments.sort((a, b) => {
const aDate = new Date(a.created_at);
const bDate = new Date(b.created_at);
if (aDate < bDate) return -1;
if (aDate > bDate) return 1;
return 0;
});
As of the current release of MongoDB you must sort the array after database retrieval. But this is easy to do in one line using _.sortBy() from Lodash.
https://lodash.com/docs/4.17.15#sortBy
comments = _.sortBy(sorted.comments, 'created_at').reverse();

Mongoose Birthday query

I store the date of birth (dob) within a user model and I would like to check whether it is the user's birthday today. How do I query for that?
I have started to try, but obviously failed:
// User Model
var UserSchema = new Schema({
name: String,
email: { type: String, lowercase: true },
role: {
type: String,
default: 'user'
},
hashedPassword: String,
dob: { type: Date, default: new Date() },
joinedOn: { type: Date, default: new Date() },
leftOn: { type: Date },
position: { type: String },
provider: String,
salt: String,
google: {},
github: {}
});
// Birthdays
exports.birthdays = function(req, res, next) {
var todayStart = moment().startOf('day');
var todayEnd = moment().endOf('day');
User
.find()
.where('dob').gt(todayStart).lt(todayEnd)
.limit(3)
.sort('dob')
.select('name dob')
.exec(function(err, users){
if(err) return res.send(500, err);
res.json(200, users);
});
};
Presuming that you have birthdays stored as a date object in your documents then they probably look something like this:
{
"name": "Neil",
"dob": ISODate("1971-09-22T00:00:00Z")
}
So it's not just the "day" but the full year as well from the originally selected day of birth. This probably seems like a logical way to store a date of birth and it is a useful date object for many purposes. But how to query on that? Any date derived from the current year is not going to match that value within a range, and other users' data on the same day can occur in different years.
There is some JavaScript date manipulation you can do in order to deal with this though, and also some functionality of MongoDB in the aggregation framework ( or alternately using JavaScript in mapReduce ), to get data out of the date that is useful for matching.
Firstly you can look at the $dayOfYear operator, along with code that derives the same value from the current date to use as a match:
var now = new Date();
var start = Date.UTC(now.getUTCFullYear(), 0, 1); // 1 January, 00:00 UTC
var diff = now - start;
var oneDay = 1000 * 60 * 60 * 24;
var dayOfYear = Math.floor(diff / oneDay) + 1; // $dayOfYear counts from 1
User.aggregate(
[
{ "$project": {
"name": 1,
"dob": 1,
"dayOfYear": { "$dayOfYear": "$dob" }
}},
{ "$match": { "dayOfYear": dayOfYear } }
],
function(err,users) {
// work here
}
)
Now that's all fine, or it seems. But of course what happens when there is a leap year? All days after February 28 are moved forward one. You could account for this in other ways, but how about just using the "day" and "month" to do the match on instead:
var now = new Date();
var day = now.getDate(); // funny method name but that's what it is
var month = now.getMonth() + 1; // numbered 0-11
User.aggregate(
[
{ "$project": {
"name": 1,
"dob": 1,
"day": { "$dayOfMonth": "$dob" },
"month": { "$month": "$dob" }
}},
{ "$match": {
"day": day,
"month": month
}}
],
function(err,users) {
// work here
}
)
And the aggregation operators for $dayOfMonth and $month help there.
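The leap-year problem is easy to reproduce in plain JavaScript. The sketch below uses a hypothetical `dayOfYearUTC` helper that counts from 1, and UTC getters, which are an assumption that matches dates stored at midnight UTC like the sample document.

```javascript
// 1-based day of the year, computed in UTC.
function dayOfYearUTC(date) {
  const start = Date.UTC(date.getUTCFullYear(), 0, 1);
  return Math.floor((date.getTime() - start) / 86400000) + 1;
}

// True when both dates fall on the same day and month, whatever the year.
function sameDayAndMonth(a, b) {
  return a.getUTCDate() === b.getUTCDate() && a.getUTCMonth() === b.getUTCMonth();
}

const born = new Date("1971-09-22T00:00:00Z");  // 1971 is not a leap year
const today = new Date("2024-09-22T00:00:00Z"); // 2024 is a leap year

console.log(dayOfYearUTC(born));           // 265
console.log(dayOfYearUTC(today));          // 266
console.log(sameDayAndMonth(born, today)); // true
```

The same calendar date lands on a different day number once 29 February is in the way, so a day-of-year match would miss this birthday while the day and month comparison finds it.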
So that's all better and you can query for the "birthday" now, but there is something still not really right. Whilst the query will work, it's clearly not really efficient, since what you are doing here is running through all of the documents in the collection and manipulating the existing dates in order to extract the parts to perform a match.
Ideally you don't want to do this, and have the "query" itself target the current "birthdays" in a simple stroke to avoid doing this manipulation to get a match. This is where modelling comes in, and where you should consider adding more data to your document where queries like this are common:
{
"name": "Neil",
"dob": ISODate("1971-09-22T00:00:00Z"),
"day": 22,
"month": 9
}
Then it's easy to query on this as both "day" and "month" fields can also be indexed to further improve performance and avoid scanning the whole collection for matches:
var now = new Date();
var day = now.getDate(); // funny method name but that's what it is
var month = now.getMonth() + 1; // numbered 0-11
User.find({ "day": day, "month": month },function(err,users) {
// work here
})
So those are the considerations. Either accept the manipulation with the aggregation framework ( or possibly mapReduce ), or where you are going to frequently use such a query and/or have many items in the collection then add additional fields to your document schema that can be used as a data point directly in the query itself.
I solved it in the following way:
Added the fields birthday and birthmonth to the user model and added a pre save hook to keep them updated on any changes to the date of birth:
UserSchema
.pre('save', function(next) {
this.birthmonth = this.dob.getMonth() + 1;
this.birthday = this.dob.getDate();
next();
});
Thanks Neil for the inspiration!
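A minimal sketch of the derived fields and the matching filter, written as plain functions so the logic can be checked without a database. `birthdayFields` and `birthdayFilter` are hypothetical names, and the UTC getters are an assumption; the hook above uses the local-time getters, so pick whichever matches how dob is stored.

```javascript
// Hypothetical helper: derive the stored fields from a date of birth.
function birthdayFields(dob) {
  return {
    birthday: dob.getUTCDate(),       // day of the month, 1-31
    birthmonth: dob.getUTCMonth() + 1 // getUTCMonth() is 0-11
  };
}

// Hypothetical helper: filter for "whose birthday falls on this date?".
function birthdayFilter(today) {
  const fields = birthdayFields(today);
  return { birthmonth: fields.birthmonth, birthday: fields.birthday };
}

console.log(birthdayFields(new Date("1971-09-22T00:00:00Z")));
// { birthday: 22, birthmonth: 9 }
console.log(birthdayFilter(new Date("2024-09-22T10:30:00Z")));
// { birthmonth: 9, birthday: 22 }
```

The filter can be passed straight to `User.find(...)`, and a compound index such as `UserSchema.index({ birthmonth: 1, birthday: 1 })` should let it avoid a collection scan. One thing to keep in mind is that, as far as I know, save hooks do not run for update-style queries such as findOneAndUpdate, so a dob changed that way would leave these fields stale.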
