InsertMany() After Aggregation Pipeline - Not inserting projected fields correctly

I am new to MongoDB, so please bear with me. After I run the aggregation, the projected fields are returned by res.json, but they are not all inserted into the new documents. Some of the projected fields are not in the schema (I'm not sure whether those can be added at all), but others, such as createdAt, are in the schema and still do not make it into the newly created document: createdAt is populated with the current date instead of the projected value.
Here is the code:
Click Schema:
const clickSchema = new mongoose.Schema({
page: String,
clickName: String,
clickProduct: String,
createdAt: { type: Date, default: new Date().toISOString().slice(0, 22).replace('T', ' ')},
})
PIPELINE:
Promise.all(tPromises).then(() => {
return Visit.aggregate([
{
$match: {
page: {
//Random Match Works
}
}
},
{
$lookup: {
from: "clickData",
localField: "clickName",
foreignField: "clickName",
as: "clickInfo"
}
},
{
$unwind: {
path: '$clickInfo'
}
},
{
// Send these fields from TrackingInfo to Merge
$project: {
page: '$page',
created_at: '$createdAt',
clickName: '$clickName',
time: new Date().toISOString().slice(0, 22).replace('T', ' '),
randomField: 'randomVal'
}
},
])
}).then(docs => {
console.log(docs, 'docs')
Click.insertMany(docs)
return res.status(200).json({docs})
}).catch(err => {
console.log('error:', err)
return res.status(500).json({err})
});
I have tried many different ways of doing this, including using a $merge stage in the pipeline instead of Click.insertMany.
However, a $merge in the pipeline doesn't let me res.json() any results back (either because of some promise hell or because the pipeline doesn't return an array; I am not 100% sure why).
I would appreciate any helpful pointers or tips!
Thank you
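One likely cause, sketched here under assumptions rather than confirmed against the poster's setup: Mongoose schemas are strict by default, so keys that are not declared in clickSchema (created_at, time, randomField) are dropped on insert, and because the pipeline projects created_at rather than createdAt, the schema default fills in createdAt. The helper below, toClickDoc (a hypothetical name), renames the projected key to the schema's field name before the documents would be handed to Click.insertMany. It is plain JavaScript with made-up sample data and does not touch the database.

```javascript
// Hypothetical helper: map one projected aggregation result onto the field
// names declared in clickSchema. Keys the schema does not declare are left
// out, and _id is omitted so each Click document would get a fresh one.
function toClickDoc(projected) {
  return {
    page: projected.page,
    clickName: projected.clickName,
    // The pipeline emits created_at, but the schema field is createdAt.
    createdAt: projected.created_at,
  };
}

// Made-up sample of what the $project stage returns.
const docs = [
  {
    page: 'home',
    clickName: 'buy',
    created_at: new Date('2020-01-01T00:00:00Z'),
    randomField: 'randomVal',
  },
];

const clickDocs = docs.map(toClickDoc);
console.log(clickDocs);
```

Inside the promise chain, clickDocs would then be passed on with return Click.insertMany(clickDocs), so that the insert finishes before the response is sent.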

Related

Mongoose SUM get stacked

I'm trying to do a trivial SUM in MongoDB to total the prices for a single client.
My collection:
{"_id":"5d973c71dd93adfbda4c7272","name":"Faktura2019006","clientId":"5d9c87a6b9676069c8b5e15b","expiration":"2019-10-02T01:11:18.965Z","price":999999,"userId":"123"},
{"_id":"5d9e07e0b9676069c8b5e15d","name":"Faktura2019007","clientId":"5d9c87a6b9676069c8b5e15b","expiration":"2019-10-02T01:11:18.965Z","price":888,"userId":"123"}
What I tried:
// invoice.model.js
const mongoose = require("mongoose");
const InvoiceSchema = mongoose.Schema({
_id: String,
name: String,
client: String,
userId: String,
expiration: Date,
price: Number
});
module.exports = mongoose.model("Invoice", InvoiceSchema, "invoice");
and
// invoice.controller.js
const Invoice = require("../models/invoice.model.js");
exports.income = (req, res) => {
console.log("Counting Income");
Invoice.aggregate([
{
$match: {
userId: "123"
}
},
{
$group: {
total: { $sum: ["$price"] }
}
}
]);
};
What happens:
When I open a browser and the code above is called, I get the console log 'Counting Income' in the terminal, but the browser just keeps loading forever and nothing happens.
Most likely I'm just missing some stupid minor thing, but I have been trying to find it for quite a long time without success, so any advice is welcome.

The controller never finishes because you are not ending the response: you need to use the res object to send something back to the caller.
To get the aggregate value, you also need to execute the pipeline.
Also, as someone pointed out in the comments, you need to add _id: null to your $group stage to specify that you are not grouping by any specific field.
Finally, in the $sum operator, you just need to remove the array brackets, since you only want to sum a single field.
Here is the modified code:
// invoice.controller.js
const Invoice = require("../models/invoice.model.js");
exports.income = (req, res) => {
console.log("Counting Income");
Invoice.aggregate([
{
$match: {
userId: "123"
}
},
{
$group: {
_id: null,
total: { $sum: "$price" }
}
}
]).then((response) => {
res.json(response);
});
};
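To make the $match and $group stages concrete, here is a minimal in-memory sketch of what they compute, using the two sample documents from the question. The function name income and the plain-array approach are illustrative assumptions; the real work is done by MongoDB, not by this code.

```javascript
// The two sample documents from the question, reduced to the relevant fields.
const invoices = [
  { name: 'Faktura2019006', price: 999999, userId: '123' },
  { name: 'Faktura2019007', price: 888, userId: '123' },
];

// $match keeps the documents for one user; $group with _id: null folds
// them into a single result whose total is the sum of price.
function income(docs, userId) {
  const matched = docs.filter((doc) => doc.userId === userId);
  if (matched.length === 0) return []; // $group emits nothing for empty input
  const total = matched.reduce((sum, doc) => sum + doc.price, 0);
  return [{ _id: null, total }];
}

console.log(income(invoices, '123')); // [ { _id: null, total: 1000887 } ]
console.log(income(invoices, '1123')); // []
```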
Edit for your comment about when an empty array is returned.
If you want to always return the same type of object, I would control that in the controller. I'm not sure if there is a fancy way to do this with the aggregate pipeline in mongo, but this is what I would do.
Invoice.aggregate([
{
$match: {
userId: "123"
}
},
{
$group: {
_id: null,
total: { $sum: "$price" }
}
},
{
$project: {
_id: 0,
total: "$total"
}
}
]).then((response) => {
if (response.length === 0) {
res.json({ total: 0 });
} else {
// always return the first (and only) value
res.json(response[0]);
}
});
Here, if you find a userId of 123, then you would get this as the return:
{
"total": 1000887
}
But if you change the userId to, say, 1123 which doesn't exist in your db, the result will be:
{
"total": 0
}
This way, your client can always consume the same type of object.
Also, the reason I put the $project pipeline stage in there was to suppress the _id field.

MongoDB: Dynamic Counts

I have two collections. A 'users' collection and an 'events' collection. There is a primary key on the events collection which indicates which user the event belongs to.
I would like to count how many events a user has matching a certain condition.
Currently, I am performing this like:
db.users.find({ usersMatchingACondition }).forEach(user => {
const eventCount = db.events.find({
title: 'An event title that I want to find',
userId: user._id
}).count();
print(`This user has ${eventCount} events`);
});
Ideally what I would like returned is an array or object with the UserID and how many events that user has.
With 10,000 users - this is obviously producing 10,000 queries and I think it could be made a lot more efficient!
I presume this is easy with some kind of aggregate query - but I'm not familiar with the syntax and am struggling to wrap my head around it.
Any help would be greatly appreciated!

You need $lookup to get the events matched by userId. Then you can use $filter to apply your event-level condition, and the $size operator to get the count:
db.users.aggregate([
{
$match: { /* users matching condition */ }
},
{
$lookup:
{
from: 'events',
localField: '_id', // the users collection's primary key
foreignField: 'userId', // field name used in the question's events documents
as: 'user_events'
}
},
{
$addFields: {
user_events: {
$filter: {
input: "$user_events",
cond: {
$eq: [
'$$this.title', 'An event title that I want to find'
]
}
}
}
}
},
{
$project: {
_id: 1,
// other fields you want to retrieve: 1,
totalEvents: { $size: "$user_events" }
}
}
])
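As a rough illustration of what this pipeline returns, the sketch below performs the same steps on in-memory arrays. The sample users and events are hypothetical, and the userId field name is taken from the question's query.

```javascript
// Hypothetical sample data; only the fields used by the pipeline are shown.
const users = [{ _id: 'u1' }, { _id: 'u2' }];
const events = [
  { userId: 'u1', title: 'An event title that I want to find' },
  { userId: 'u1', title: 'Something else' },
  { userId: 'u1', title: 'An event title that I want to find' },
];

function countEvents(userList, eventList, title) {
  return userList.map((user) => {
    // $lookup: collect this user's events
    const userEvents = eventList.filter((e) => e.userId === user._id);
    // $filter + $size: keep the matching titles and count them
    const totalEvents = userEvents.filter((e) => e.title === title).length;
    return { _id: user._id, totalEvents };
  });
}

console.log(countEvents(users, events, 'An event title that I want to find'));
// [ { _id: 'u1', totalEvents: 2 }, { _id: 'u2', totalEvents: 0 } ]
```

Users without any matching event still appear with a count of 0, because $lookup behaves like a left outer join.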

There isn't much optimization that can be done without aggregate, but one small change may help.
Instead of
const eventCount = db.events.find({
title: 'An event title that I want to find',
userId: user._id
}).count();
Do
const eventCount = db.events.count({
title: 'An event title that I want to find',
userId: user._id
});
This may speed up your queries because it avoids going through a find cursor, although the mongo shell documentation describes db.collection.count(query) as equivalent to find(query).count(), so measure before relying on it.
For returning an array you can just initialize an array at the start and push {userid: id, count: eventCount} objects to it.
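A minimal sketch of that last step follows. The countEventsFor parameter stands in for the db.events.count({ title, userId }) call and is a hypothetical stub so that the example runs without a database.

```javascript
// Collect one { userid, count } object per user into an array.
// countEventsFor is a hypothetical stand-in for the per-user count query.
function collectCounts(users, countEventsFor) {
  const results = [];
  users.forEach((user) => {
    results.push({ userid: user._id, count: countEventsFor(user._id) });
  });
  return results;
}

// Made-up counts in place of real query results.
const fakeCounts = { u1: 3, u2: 0 };
const results = collectCounts([{ _id: 'u1' }, { _id: 'u2' }], (id) => fakeCounts[id]);
console.log(results); // [ { userid: 'u1', count: 3 }, { userid: 'u2', count: 0 } ]
```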

mongoose duplicate items getting inserted using $addToSet and $each to push items into the array

I am trying to push an array of objects into a document. I am using $addToSet to try and not insert duplicate data. I want to do a check on applied.studentId. But if I pass the same request twice, then the data is getting inserted. Is there any check on $addToSet and $each that I have to use?
My schema is as follows
jobId: { type: Number},
hiringCompanyId: String,
applied: [{
studentId: String,
firstName:String,
lastName:String,
gender:String,
identityType:String,
identityValue:String,
email:String,
phone:String,
}],
My node code is as follows.
public ApplyForJob(data: JobDto): Promise<{ status: string }> {
let students = data.applied;
let findQuery = {hiringCompanyId: data.hiringCompanyId, jobId: data.companyJobId};
let appliedQuery = {};
if (!isNullOrUndefined(data.applied.length)) {
appliedQuery = {
"$addToSet": {
"applied": {
"$each": data.applied
}
}
};
}
return new Promise((resolve, reject) => {
Jobs.findOneAndUpdate(findQuery, appliedQuery).exec((err, info) => {
if (err) {
reject(new UpdateError('Jobs - Update()', err, Jobs.collection.collectionName));
} else {
console.log(info);
resolve({status: "Success"});
}
})
});
}

After disabling the date field, $addToSet no longer adds duplicate values. As the documentation at https://docs.mongodb.com/manual/reference/operator/update/addToSet/ puts it:
"As such, field order matters and you cannot specify that MongoDB compare only a subset of the fields in the document to determine whether the document is a duplicate of an existing array element."
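Since $addToSet compares whole elements, one possible workaround is to decide in application code what counts as a duplicate. The sketch below assumes the job's current applied array has already been loaded, and filters the incoming students by studentId before they are pushed. The helper name newApplicants and the sample data are hypothetical, and a read followed by a write like this is not atomic.

```javascript
// Hypothetical helper: keep only applicants whose studentId is not
// already present in the job's applied array.
function newApplicants(existing, incoming) {
  const seen = new Set(existing.map((s) => s.studentId));
  return incoming.filter((s) => {
    if (seen.has(s.studentId)) return false;
    seen.add(s.studentId); // also drops repeats within the same request
    return true;
  });
}

const existing = [{ studentId: 'S1', firstName: 'Ann' }];
const incoming = [
  // Same student, but with an extra field, so $addToSet would treat it as new.
  { studentId: 'S1', firstName: 'Ann', email: 'ann@example.com' },
  { studentId: 'S2', firstName: 'Bob' },
];

console.log(newApplicants(existing, incoming)); // [ { studentId: 'S2', firstName: 'Bob' } ]
```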

As Rahul Ganguly correctly mentions, we cannot reliably use $addToSet with JS objects.
One option is to move applied into a separate collection and make the Job schema reference a new Applied model.
Example:
{
jobId: { type: Number },
hiringCompanyId: String,
applied: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Applied'
}]
}

How to join two collections in mongoose

I have two Schema defined as below:
var WorksnapsTimeEntry = BaseSchema.extend({
student: {
type: Schema.ObjectId,
ref: 'Student'
},
timeEntries: {
type: Object
}
});
var StudentSchema = BaseSchema.extend({
firstName: {
type: String,
trim: true,
default: ''
// validate: [validateLocalStrategyProperty, 'Please fill in your first name']
},
lastName: {
type: String,
trim: true,
default: ''
// validate: [validateLocalStrategyProperty, 'Please fill in your last name']
},
displayName: {
type: String,
trim: true
},
municipality: {
type: String
}
});
I would like to loop through each student and show their time entries. So far I have this code, which is obviously not right, as I still don't know how to join the WorksnapsTimeEntry collection.
Student.find({ status: 'student' })
.populate('student')
.exec(function (err, students) {
if (err) {
return res.status(400).send({
message: errorHandler.getErrorMessage(err)
});
}
_.forEach(students, function (student) {
// show student with his time entries....
});
res.json(students);
});
Does anyone know how I can achieve this?

As of MongoDB 3.2, you can use $lookup in an aggregation pipeline to perform a left outer join.
Student.aggregate([{
$lookup: {
from: "worksnapsTimeEntries", // collection name in db
localField: "_id",
foreignField: "student",
as: "worksnapsTimeEntries"
}
}]).exec(function(err, students) {
// students contain WorksnapsTimeEntries
});
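The shape of the result can be sketched in plain JavaScript: every student is kept, and each one gains a worksnapsTimeEntries array that is empty when nothing matches. The sample data and the leftOuterJoin name are hypothetical, and numeric ids are used instead of ObjectIds to keep the example short.

```javascript
// Hypothetical sample data standing in for the two collections.
const students = [
  { _id: 1, firstName: 'Ann' },
  { _id: 2, firstName: 'Bob' },
];
const timeEntries = [
  { student: 1, timeEntries: { minutes: 30 } },
  { student: 1, timeEntries: { minutes: 45 } },
];

// Left outer join: students without entries are kept with an empty array.
function leftOuterJoin(studentList, entryList) {
  return studentList.map((s) => ({
    ...s,
    worksnapsTimeEntries: entryList.filter((e) => e.student === s._id),
  }));
}

const joined = leftOuterJoin(students, timeEntries);
console.log(joined.map((s) => s.worksnapsTimeEntries.length)); // [ 2, 0 ]
```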

You don't want .populate() here but instead you want two queries, where the first matches the Student objects to get the _id values, and the second will use $in to match the respective WorksnapsTimeEntry items for those "students".
Using async.waterfall just to avoid some indentation creep:
async.waterfall(
[
function(callback) {
Student.find({ "status": "student" },{ "_id": 1 },callback);
},
function(students,callback) {
WorksnapsTimeEntry.find({
"student": { "$in": students.map(function(el) {
return el._id
}) }
},callback);
}
],
function(err,results) {
if (err) {
// do something
} else {
// results are the matching entries
}
}
)
If you really must, then you can .populate("student") on the second query to get populated items from the other table.
The reverse case is to query on WorksnapsTimeEntry and return "everything", then filter out any null results from .populate() with a "match" query option:
WorksnapsTimeEntry.find().populate({
"path": "student",
"match": { "status": "student" }
}).exec(function(err,entries) {
// Now client side filter un-matched results
entries = entries.filter(function(entry) {
return entry.student != null;
});
// Anything not populated by the query condition is now removed
});
So that is not a desirable action, since the "database" is not filtering what is likely the bulk of results.
Unless you have a good reason not to do so, then you probably "should" be "embedding" the data instead. That way the properties like "status" are already available on the collection and additional queries are not required.
If you are using a NoSQL solution like MongoDB you should be embracing its concepts, rather than sticking to relational design principles. If you are consistently modelling relationally, then you might as well use a relational database, since you won't be getting any benefit from the solution that has other ways to handle that.

This is a late answer, but it may help other developers.
Verified with
"mongodb": "^3.6.2",
"mongoose": "^5.10.8",
Join two collections in mongoose
ProductModel.find({})
.populate('category', 'name') // select only the name field of the joined category
//.populate('category') // select all category fields
.skip(0).limit(20)
//.sort({ createdAt: -1 })
.exec((err, records) => {
if (err) {
// handle the error, e.g. throw new Error('xyz')
} else {
// return records
}
})
ProductModel Schema
const CustomSchema = new Schema({
category:{
type: Schema.ObjectId,
ref: 'Category'
},
...
}, {timestamps: true, collection: 'products'}); // Schema takes a single options object
module.exports = model('Product',CustomSchema)
Category model schema
const CustomSchema = new Schema({
name: { type: String, required:true },
...
}, {collection: 'categories'});
module.exports = model('Category',CustomSchema)

mongoose subdocument sorting

I have an article schema with a comments subdocument array that contains all the comments I got for this particular article.
What I want to do is select an article by id, populate its author field and also the author field in comments, and then sort the comments subdocuments by date.
the article schema:
var articleSchema = new Schema({
title: { type: String, default: '', trim: true },
body: { type: String, default: '', trim: true },
author: { type: Schema.ObjectId, ref: 'User' },
comments: [{
body: { type: String, default: '' },
author: { type: Schema.ObjectId, ref: 'User' },
created_at: { type : Date, default : Date.now, get: getCreatedAtDate }
}],
tags: { type: [], get: getTags, set: setTags },
image: {
cdnUri: String,
files: []
},
created_at: { type : Date, default : Date.now, get: getCreatedAtDate }
});
Static method on the article schema (I would love to sort the comments here; can I do that?):
load: function (id, cb) {
this.findOne({ _id: id })
.populate('author', 'email profile')
.populate('comments.author')
.exec(cb);
},
I have to sort it elsewhere:
exports.load = function (req, res, next, id) {
var User = require('../models/User');
Article.load(id, function (err, article) {
var sorted = article.toObject({ getters: true });
sorted.comments = _.sortBy(sorted.comments, 'created_at').reverse();
req.article = sorted;
next();
});
};
I call toObject to convert the document to a plain JavaScript object. I can keep my getters and virtuals, but what about methods?
Anyway, I do the sorting on the plain object and I am done.
I am quite sure there is a much better way of doing this, so please let me know.

I could have written this out as a few things, but on consideration "getting the mongoose objects back" seems to be the main consideration.
So there are various things you "could" do. But since you are "populating references" into an Object and then wanting to alter the order of objects in an array there really is only one way to fix this once and for all.
Fix the data in order as you create it
If you want your "comments" array sorted by the date they are "created_at" this even breaks down into multiple possibilities:
It "should" have been added to in "insertion" order, so the "latest" is last as you note, but you can also "modify" this in recent ( past couple of years now ) versions of MongoDB with $position as a modifier to $push :
Article.update(
{ "_id": articleId },
{
"$push": { "comments": { "$each": [newComment], "$position": 0 } }
},
function(err,result) {
// other work in here
}
);
This "prepends" the array element to the existing array at the "first" (0) index so it is always at the front.
If "positional" updates do not fit for logical reasons, or where you just "want to be sure", the $sort modifier to $push has been around for even "longer":
Article.update(
{ "_id": articleId },
{
"$push": {
"comments": {
"$each": [newComment],
"$sort": { "created_at": -1 }
}
}
},
function(err,result) {
// other work in here
}
);
And that will "sort" on the property of the array elements documents that contains the specified value on each modification. You can even do:
Article.update(
{ },
{
"$push": {
"comments": {
"$each": [],
"$sort": { "created_at": -1 }
}
}
},
{ "multi": true },
function(err,result) {
// other work in here
}
);
And that will sort every "comments" array in your entire collection by the specified field in one hit.
Other solutions are possible using either .aggregate() to sort the array and/or "re-casting" to mongoose objects after you have done that operation or after doing your own .sort() on the plain object.
Both of these really involve creating a separate model object and "schema" with the embedded items including the "referenced" information. So you could work along those lines, but it seems to be unnecessary overhead when you could just sort the data by your "most needed" means in the first place.
The alternate is to make sure that fields like "virtuals" always "serialize" into an object format with .toObject() on call and just live with the fact that all the methods are gone now and work with the properties as presented.
The last is a "sane" approach, but if what you typically use is "created_at" order, then it makes much more sense to "store" your data that way with every operation so when you "retrieve" it, it stays in the order that you are going to use.
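For illustration only, the effect of the $position and $sort modifiers to $push described above can be mimicked on an in-memory array. The helper names pushAtFront and pushAndSort and the sample comments are hypothetical. In MongoDB the same work happens on the server as part of the update.

```javascript
// $position: 0 inserts the new element at the front of the array.
function pushAtFront(comments, newComment) {
  return [newComment, ...comments];
}

// $each with $sort: { created_at: -1 } appends, then sorts newest first.
function pushAndSort(comments, newComments) {
  return [...comments, ...newComments].sort(
    (a, b) => b.created_at.getTime() - a.created_at.getTime()
  );
}

// Made-up comments stored in insertion order (oldest first).
const comments = [
  { body: 'first', created_at: new Date('2020-01-01') },
  { body: 'second', created_at: new Date('2020-01-02') },
];
const added = { body: 'third', created_at: new Date('2020-01-03') };

console.log(pushAtFront(comments, added).map((c) => c.body)); // [ 'third', 'first', 'second' ]
console.log(pushAndSort(comments, [added]).map((c) => c.body)); // [ 'third', 'second', 'first' ]
```

Note that $position alone only yields newest-first order if every earlier comment was also prepended, which is why the $sort variant is the one to use when you "want to be sure".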

You could also use JavaScript's native Array sort method after you've retrieved and populated the results:
// Convert the mongoose doc into a plain JavaScript object:
const article = yourArticleDoc.toObject();
article.comments.sort((a, b) => {
const aDate = new Date(a.created_at);
const bDate = new Date(b.created_at);
if (aDate < bDate) return -1;
if (aDate > bDate) return 1;
return 0;
});

As of the current release of MongoDB you must sort the array after database retrieval. But this is easy to do in one line using _.sortBy() from Lodash.
https://lodash.com/docs/4.17.15#sortBy
comments = _.sortBy(sorted.comments, 'created_at').reverse();
