Reporting in mongodb - node.js

I am generating a report in my first Node application, which uses MongoDB and Express. I have three collections: Salary Rule, Leave and Employee. I want to calculate employees' salaries from these collections.
I found PhantomJS for exporting to PDF, and I use an EJS template to generate the HTML.
I build the JSON values with the following steps:
1. Find the salary rule.
2. Find all employees.
3. Find all leaves in the date range.
4. Match employees and leaves by employee id and calculate the salary.
5. Put the resulting JSON into an array and generate the HTML with EJS.
6. Export the HTML to PDF with PhantomJS.
I am worried that this approach could hurt performance and be error-prone. I cannot find any suitable examples of exporting reports with Node and MongoDB.
My questions are:
Is it a bad idea to use MongoDB in this scenario, or is this a normal flow?
Or do I need to change my MongoDB collection schemas?
Leave
var schema = new mongoose.Schema({
date: { type: Date, default: Date.now },
description: String,
type: String, // paid or unpaid
empName : String,
empId : String
});
Employee
var schema = new mongoose.Schema({
id: String,
name: String,
basicSalary: Number,
active: Boolean
});
Salary Rule
var schema = new mongoose.Schema({
totalHoliday: Number,
overtimeFee: Number,
unpaidLeaveFee: Number
});

IMO, exporting your data to a relational database could make the report easier to generate.
But if you still want to do this with MongoDB, you could use mapReduce:
https://docs.mongodb.com/manual/reference/method/db.collection.mapReduce/
Your last two steps stay the same; only the way you get the data changes.
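For reference, the matching step from the question (match employees and leaves by employee id, then calculate the salary) can be sketched in plain JavaScript once the three queries have returned. This is only a sketch: the helper name `buildSalaryReport`, the sample data, and the salary formula (basic salary minus one `unpaidLeaveFee` per unpaid leave) are assumptions, since the question does not state how the salary is derived.

```javascript
// Hypothetical helper; the formula is an assumption, not taken from the question.
function buildSalaryReport(rule, employees, leaves) {
  // Count unpaid leaves per employee id in a single pass over the leaves.
  const unpaidByEmp = new Map();
  for (const leave of leaves) {
    if (leave.type !== 'unpaid') continue;
    unpaidByEmp.set(leave.empId, (unpaidByEmp.get(leave.empId) || 0) + 1);
  }
  // One report row per active employee.
  return employees
    .filter((emp) => emp.active)
    .map((emp) => {
      const unpaidLeaves = unpaidByEmp.get(emp.id) || 0;
      const deduction = unpaidLeaves * rule.unpaidLeaveFee;
      return { id: emp.id, name: emp.name, unpaidLeaves, salary: emp.basicSalary - deduction };
    });
}

// Made-up sample data shaped like the three schemas in the question.
const report = buildSalaryReport(
  { totalHoliday: 10, overtimeFee: 5, unpaidLeaveFee: 20 },
  [
    { id: 'e1', name: 'Ann', basicSalary: 1000, active: true },
    { id: 'e2', name: 'Bob', basicSalary: 800, active: true },
    { id: 'e3', name: 'Cy', basicSalary: 900, active: false },
  ],
  [
    { empId: 'e1', type: 'unpaid', date: new Date('2020-01-03') },
    { empId: 'e1', type: 'paid', date: new Date('2020-01-07') },
    { empId: 'e1', type: 'unpaid', date: new Date('2020-01-10') },
  ]
);
console.log(report);
```

Grouping the leaves into a map first keeps the work to one pass over each array, rather than scanning all leaves once per employee. The resulting array is what would be handed to the EJS template.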

Related

Proper way of updating average rating for a review system using Mongoose

I'm currently learning some backend stuff using an Udemy course and I have an example website that lets you add campgrounds (campground name, picture, description, etc.) and review them. I'm using the Express framework for Node.js, and Mongoose to access the database.
My campground schema looks like:
const campgroundSchema = new mongoose.Schema({
name: String,
image: String,
description: String,
price: String,
comments: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Comment"
}
],
rating: {type: Number, default: 0}
});
And my comment/review schema looks like:
const commentSchema = new mongoose.Schema({
text: String,
rating: {
type: Number,
min: 1,
max: 5,
validate: {validator: Number.isInteger}
},
campground: {type: mongoose.Schema.Types.ObjectId, ref: "Campground"}
});
Campgrounds and Comments also have references to a User but I've left that out for simplicity.
I'm looking to know the best practice for updating and displaying the campground average rating.
The method used by the tutorial I'm following is to recalculate the average rating each time a comment is added, changed, or deleted. Here's how it would work for a new comment:
Campground.findById(campgroundId).populate("comments").exec(function(err, campground) {
Comment.create(newComment, function(err, comment) {
campground.comments.push(comment);
campground.rating = calculateRating(campground.comments);
campground.save();
});
});
"calculateRating" iterates through the comment array, gets the total sum, and returns the sum divided by the number of comments.
My gut instinct tells me that there should be a way to make the "rating" field of Campground perform the functionality of the "calculateRating" function, so that I don't have to update the rating every time a comment is added, changed, or removed. I've been poking around documentation for a while now, but since I'm pretty new to Mongoose and databases in general, I'm a bit lost on how to proceed.
In summary: I want to add functionality to my Campground model so that when I access its rating, it automatically accesses each comment referenced in the comments array, sums up their ratings, and returns the average.
My apologies if any of my terminology is incorrect. Any tips on how I would go about achieving this would be very much appreciated!
Love,
Cal
I think what you are looking for is a virtual property on the document that returns the average rating but is not persisted to the database.
According to the Mongoose docs, virtuals are document properties that you can get and set but that do not get persisted to MongoDB. They are defined on the schema.
You can do this:
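campgroundSchema.virtual('averageRating').get(function() {
    // comments must be populated; plain ObjectIds have no rating field
    if (!this.comments.length) return 0; // avoid dividing by zero when there are no reviews
    const sum = this.comments.reduce((total, comment) => total + comment.rating, 0);
    // toFixed returns a string, so convert back to a number
    return Number((sum / this.comments.length).toFixed(2));
});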
After that, in your view, once you have found the campground or campgrounds with their comments populated, all you need to read is campground.averageRating.
Read more here: https://mongoosejs.com/docs/guide.html#virtuals
Also note that you cannot query on virtual properties, because they are not stored in MongoDB.
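As a rough plain-JavaScript analogy (not Mongoose itself), a class getter behaves much like a virtual: the value is recomputed on every access and never becomes a stored field. The `Campground` class, the sample ratings, and the choice to return 0 for a campground without reviews are assumptions made for illustration.

```javascript
// Plain-JavaScript analogy of a virtual: computed on access, never stored.
class Campground {
  constructor(name, comments) {
    this.name = name;
    this.comments = comments; // assumed to be populated comment objects
  }

  // Mirrors the averageRating virtual: sum the ratings, divide by the count.
  get averageRating() {
    if (this.comments.length === 0) return 0; // assumption: no reviews means 0
    const sum = this.comments.reduce((total, c) => total + c.rating, 0);
    return Number((sum / this.comments.length).toFixed(2));
  }
}

const camp = new Campground('Pine Lake', [{ rating: 4 }, { rating: 5 }, { rating: 5 }]);
console.log(camp.averageRating); // recomputed on each access
console.log(Object.keys(camp));  // only name and comments are stored fields
```

Because the value is derived each time, adding, editing, or removing a comment needs no separate step to update the rating.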

Best way to structure my mongoose schema: embedded array , populate, subdocument?

Here is my current Schema
Brand:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var BrandSchema = new mongoose.Schema({
name: { type: String, lowercase: true , unique: true, required: true },
photo: { type: String , trim: true},
email: { type: String , lowercase: true},
year: { type: Number},
timestamp: { type : Date, default: Date.now },
description: { type: String},
location: { },
social: {
website: {type: String},
facebook: {type: String },
twitter: {type: String },
instagram: {type: String }
}
});
Style:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var StyleSchema = new mongoose.Schema({
name: { type: String, lowercase: true , required: true},
});
Product
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var ProductSchema = new mongoose.Schema({
name: { type: String, lowercase: true , required: true},
brandId : {type: mongoose.Schema.ObjectId, ref: 'Brand'},
styleId: {type: mongoose.Schema.ObjectId, ref: 'Style'},
year: { type: Number },
avgRating: {type: Number}
});
Post:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var PostSchema = new mongoose.Schema({
rating: { type: Number},
upVote: {type: Number},
brandId : {type: mongoose.Schema.ObjectId, ref: 'Brand'},
comment: {type: String},
productId: {type: mongoose.Schema.ObjectId, ref: 'Product'},
styleId: {type: mongoose.Schema.ObjectId, ref: 'Style'},
photo: {type: String}
});
I'm currently making use of the mongoose populate feature:
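exports.productsByBrand = function(req, res){
    // Filter on brandId and populate styleId, the paths defined in ProductSchema
    Product.find({brandId: req.params.id}).populate('styleId').exec(function(err, products){
        if (err) { return res.status(500).send({error: err.message}); }
        res.send({products: products});
    });
};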
This works. However, being a noob, I've started reading about performance issues with Mongoose populate, since it's really just adding an additional query.
For my posts especially, it seems that could be taxing. The intent is for the posts to be a live Twitter/Instagram-like feed. That could mean a lot of queries, which could greatly slow my app down.
Also, I want to be able to search products / posts / brands by fields at some point.
Should I consider nesting / embedding this data (products nested / embedded in brands)?
What's the most efficient schema design, or would my setup be alright given what I've specified I want to use it for?
User story:
There will be an Admin User.
The admin will be able to add the Brand with the specific fields in the Brand Schema.
Brands will have associated Products, each Product will have a Style / category.
Search:
Users will be able to search Brands by name and location (I'm looking into doing this with Angular filtering / tags).
Users will be able to search Products by fields (name, style, etc).
Users will be able to search Post by Brand Product and Style.
Post:
Users will be able to Post into a feed. When making a Post, they will choose a Brand and a Product to associate the Post with. The Post will display the Brand name, Product name, and Style -- along with newly entered Post fields (photo, comment, and rating).
Other users can click on the Brand name to link to the Brand show page. They can click on the Product name to link to a Product show page.
Product show page:
Will show Product fields from the above Schema -- including associated Style name from Style schema. It will also display Post pertaining to the specific Product.
Brand show page:
Will simply show Brand fields and associated products.
My main worry is the Post, which will have to populate / query for the Brand , Product, and Style within a feed.
Again, I'm contemplating if I should embed the Products within the Brand -- then would I be able to associate the Brand Product and Style with the Post for later queries? Or, possibly $lookup or other aggregate features.
MongoDB has no joins in the relational sense (the aggregation $lookup stage is the closest equivalent), so Mongoose populate resolves references with additional queries. With MongoDB you need to design your data so that:
most of your queries do not need to refer to multiple collections.
after getting data from a query, you do not need to transform it much.
Consider the entities involved, and their relations:
Brand is brand. Doesn't depend on anything else.
Every Product belongs to a Brand.
Every Product is associated with a Style.
Every Post is associated with a Product.
Indirectly, every Post is associated to a Brand and Style, via product.
Now about the use cases:
Refer: If you are looking up one entity by id, then fetching 1-2 related entities is not really a big overhead.
List: This is when you have to return a large set of objects and each object needs an additional query to get its associated objects. This is a performance issue. It is usually reduced by processing the result set one "page" at a time, say 20 records per request. Let's suppose you query 20 products (using skip and limit). From those 20 products you extract two id arrays, one of the referenced styles and one of the referenced brands. You run 2 additional queries using $in:[ids], get the brand and style objects, and place them in the result set. That's 3 queries per page. Users can request the next page as they scroll down, and so on.
Search: You want to search for products, but also want to specify brand name and style name. Sadly, product model only holds ids for style and brand. Same issue with searching Posts with brand and product. Popular solution is to maintain a separate "search index", a sort of table, that stores data exactly the way it will be searched for, with all searchable fields (like brand name, style name) at one place. Maintaining such search collections in mongodb manually can be a pain. This is where ElasticSearch comes in. Since you are already using mongoose, you can simply add mongoosastic to your models. ElasticSearch's search capabilities are far greater than a DB Storage engine will offer you.
Extra Speed: There is still some room for speeding things up: Caching. Attach mongoose-redis-cache and have frequent repeated queries served, in-memory from Redis, reducing load on mongodb.
Twitter like Feeds: Now if all Posts are public then listing them up for users in chronological order is a trivial query. However things change when you introduce "social networking" features. Then you need to list "activity feeds" of friends and followers. There's some wisdom about social inboxes and Fan-out lists in mongodb blog.
The moral of the story is that not every use case has a pure "db schema and query" solution. Scalability is one such case. That's why other tools exist.
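The "List" case above (one page of products plus two $in lookups) can be sketched in plain JavaScript. The in-memory `brands` and `styles` arrays and the `findByIds` function are stand-ins for the real collections and for a `find` with `{_id: {$in: ids}}`; all names and data here are hypothetical.

```javascript
// In-memory stand-ins for the Brand and Style collections (hypothetical data).
const brands = [{ _id: 'b1', name: 'acme' }, { _id: 'b2', name: 'globex' }];
const styles = [{ _id: 's1', name: 'casual' }, { _id: 's2', name: 'sport' }];

// Stand-in for collection.find({ _id: { $in: ids } }).
const findByIds = (collection, ids) => collection.filter((doc) => ids.includes(doc._id));

// Unique ids referenced by one page of documents under the given field.
const uniqueIds = (docs, field) => [...new Set(docs.map((doc) => doc[field]))];

function resolvePage(products) {
  // Two extra lookups for the whole page instead of two per product.
  const brandById = new Map(
    findByIds(brands, uniqueIds(products, 'brandId')).map((b) => [b._id, b])
  );
  const styleById = new Map(
    findByIds(styles, uniqueIds(products, 'styleId')).map((s) => [s._id, s])
  );
  // Place the looked-up objects next to the ids in each result row.
  return products.map((p) => ({
    ...p,
    brand: brandById.get(p.brandId),
    style: styleById.get(p.styleId),
  }));
}

const page = resolvePage([
  { name: 'runner', brandId: 'b1', styleId: 's2' },
  { name: 'loafer', brandId: 'b1', styleId: 's1' },
]);
console.log(page.map((p) => `${p.name}: ${p.brand.name} / ${p.style.name}`));
```

With real collections the two `findByIds` calls would be asynchronous queries, but the shape of the work is the same: collect ids, fetch once per referenced collection, then stitch the results together in memory.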

Mongoose: Difference between referencing "Schema.ObjectId" instead of directly using the schema name?

Suppose I have the following MessageSchema model:
var MessageSchema = new Schema({
userId: String,
username: String,
message: String,
date: Date
});
mongoose.model('Message', MessageSchema)
Can someone tell me the difference between the following two implementations of the Meetings model? Thanks.
var Meetings = new Schema({
_id: ObjectId,
name: String,
messages: [MessageSchema],
...
});
var Meetings2 = new Schema({
_id: ObjectId,
name: String,
messages: [{type: Schema.ObjectId, ref: 'Message'}],
...
});
The main difference is that the Meetings model embeds the MessageSchema (denormalization), whilst the Meetings2 model references it (normalization). Which to choose boils down to your model design, and that depends mostly on how you query and update your data. In general, use an embedded schema design if the subdocuments are small and the data does not change frequently. Also consider denormalizing if the Message data grows only by a small amount. The embedded approach gives you optimized reads, and can be faster, because a single query retrieves everything: all the data resides in the same document.
On the other hand, consider referencing if your Message documents are very large, so that they are kept in a separate collection that you can then reference. Another factor that favors referencing is a document that grows by a large amount. Another important consideration is how often the data changes (volatility) versus how often it is read. If it is updated regularly, referencing is a good approach because it keeps writes fast.
You can also use a hybrid of embedding and referencing, i.e. create an array of subdocuments holding the frequently accessed data, each with a reference to the full document for more information.
The general rule of thumb is that if your application's query pattern is well known and data tends to be accessed in only one way, an embedded approach works well. If your application queries data in many ways, or you are unable to anticipate the query patterns, a more normalized, referencing model is more appropriate.
The messages field of Meetings contains an array of Message objects, while the messages field of Meetings2 contains an array of Message ids.
var Meetings2 = new Schema({
...
messages: [{type: Schema.ObjectId, ref: 'Message'}],
...
});
can be written as
var Meetings2 = new Schema({
...
messages: [Schema.ObjectId],
...
});
The ref is just a schema option in Mongoose that tells populate which model to load, making it easier to populate the messages.
In summary: in Meetings you embed the messages in an array, while in Meetings2 you reference them.
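To make the difference concrete, here is a sketch of roughly what the stored data looks like under each design, using plain JavaScript objects. The ids are made-up strings standing in for ObjectIds, and `messageCollection` is a hypothetical stand-in for the separate Message collection.

```javascript
// Hypothetical documents; the ids are made-up strings standing in for ObjectIds.
const embeddedMeeting = {
  _id: 'm1',
  name: 'standup',
  // Meetings: full Message subdocuments live inside the meeting document.
  messages: [{ userId: 'u1', username: 'ann', message: 'hi', date: new Date('2020-01-01') }],
};

const referencedMeeting = {
  _id: 'm2',
  name: 'standup',
  // Meetings2: only the ids are stored; the messages live in their own collection.
  messages: ['msg1'],
};
const messageCollection = [{ _id: 'msg1', userId: 'u1', username: 'ann', message: 'hi' }];

// Reading the first message text: direct access versus a second lookup
// (the lookup is roughly what populate does for you).
const embeddedText = embeddedMeeting.messages[0].message;
const referencedText = messageCollection.find(
  (m) => m._id === referencedMeeting.messages[0]
).message;
console.log(embeddedText, referencedText);
```

The embedded read needs only the meeting document, while the referenced read needs the meeting and then a lookup in the other collection.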

How to define a single reference schema using mongoose

I am new to Node.js and MongoDB. In the code below I define a reference to department in the employee collection, but when I insert or fetch employee data the department always comes back as an array. I want to define the reference as a single value, not multiple.
var employee = new mongoose.Schema({
name: String,
dept: [department]
});
var department = new mongoose.Schema({
dept_name : String,
dept_code : String
})
I want the response data from the employee collection in the format `{"name":"CS",dept:{"id":_id_of_dept}}`.
Please guide me to the correct way to achieve this.
You can only nest using references or arrays, so in this example, you can either create a department record and reference that in the employee document, or have an array of departments for the employee (which I understand is not what you want to do).
To reference the department document from your employee you would use something like:
var department = new mongoose.Schema({
dept_name : String,
dept_code : String
});
mongoose.model('Department', department);
var employee = new mongoose.Schema({
name: String,
dept: { type: mongoose.Schema.Types.ObjectId, ref: 'Department' }
});
However, then you'll need to populate when querying your employee to gather the department data.
But at a high level, looking at what you're trying to accomplish, I'd suggest just directly storing the department within the employee document. Relational databases may often end up with patterns like you've described above, but in document based databases, there's really not a good reason to separate out the department from the employee. Doing something like what I've put below will help you in the future for queries and general access of the data:
var employee = new mongoose.Schema({
name: String,
dept: {
dept_name : String,
dept_code : String
}
});
This answer might be helpful in understanding what I mean.

Querying by date, regardless time part

I want to show my blog posts, paginated by creation date. I will have a page for 5 posts written in 2012-10-01, a page for 11 posts written in 2012-10-03 and no page at all for 2012-10-02 (no posts written)
Each post document is stored with a creation date which is a datetime value, here's a mongoose snippet:
var postSchema = new Schema({
url: String,
creationDate: {type: Date, default: Date.now},
contenuto: String,
});
so it will have something like 2012-10-01 18:45:03... know what I mean.
In my code, I will create a
var searchDate = new Date(yy,mm,dd);
How can I use that for querying the posts collection, without considering the "time part" of creationDate?
I'm not sure this would always work:
Post.find({ creationDate: searchDate })
As per this post:
How do I resolve a year/month/day date to a more specific date with time data in MongoDB?
you can store the date separately (as well as the full date and time) in your schema for easier searching. You could also query a range that covers the whole day:
Post.find({creationDate: {$gte: new Date("2012-10-01"), $lt: new Date("2012-10-02")}})
(updated to use new Date() rather than ISODate() for better compatibility)
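The range bounds can also be built from the year, month and day the question already has. A minimal sketch, assuming the hypothetical helper name `dayRange` and local-time day boundaries (a UTC-based blog would use `Date.UTC` instead):

```javascript
// Build a half-open [start, end) range covering one calendar day in local time.
function dayRange(year, monthIndex, day) {
  const start = new Date(year, monthIndex, day);   // local midnight of that day
  const end = new Date(year, monthIndex, day + 1); // next local midnight; Date rolls over months
  return { $gte: start, $lt: end };
}

const range = dayRange(2012, 9, 1); // months are zero-based: 9 is October
console.log(range);

// Intended use, assuming the Post model from the question:
// Post.find({ creationDate: dayRange(2012, 9, 1) })
```

Using `$gte` on the start and `$lt` on the next midnight includes a post written exactly at midnight and excludes anything from the following day.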

Resources