Making MongoDB Update/Write Queries Faster - node.js

I am looking for the most efficient way to store realtime data in MongoDB for a MeanJS web application I am working on.
I have the following example schema:
SomeModel: {
name: {
type: String,
default: '',
required: 'Please Enter Name',
trim: true
},
data: {
type: Schema.Types.Mixed,
default: {}
},
data_keys: {
type: Schema.Types.Mixed,
default: {}
},
websocket_url: {
type: String,
default: '',
},
created: {
type: Date,
default: Date.now
},
user: {
type: Schema.ObjectId,
ref: 'User'
}
}
The 'data' field may hold data like this, but the exact format depends on the 'model' being subscribed to; each model's data may have a slightly different format.
data: {
balance: {
currentBalance: 100,
availableBalance: 80,
/* Additional Account Details */
},
orders: [{
/* Some Array of Order Details */
}],
/* Additional Data Properties */
}
For each 'someModel' object I am trying to connect to a websocket server, subscribe to updates and then write them to the database.
I am trying to use something like this:
some_ws = new WebSocket(someModel.websocket_url);
some_ws.on('message', function incoming(msg) {
var message = JSON.parse(msg);
try {
// Update 'someModel.data' in memory.
Object.keys(message['data']).forEach(function(key) {
someModel.data[key] = message['data'][key];
});
// Write out to Database.
SomeModel
.update({_id: someModel._id}, {data: someModel.data, data_keys: someModel.data_keys})
.exec(function (err, nItems) {
if(err) {
console.log("ERROR Saving SomeModel Data: %s", err);
} else {
// console.log("Saved Data for: %s", someModel.name);
}
});
} catch (exception) {
console.log(clc.red("Exception Caught: %s"), util.inspect(exception));
console.log(clc.cyan("DEBUG:: Message: %s"), util.inspect(message));
}
});
I'm finding I'm getting almost continuous updates from the websocket connection and that the 'update' queries are slowing down the 'read queries' that need to happen in the front end of the application.
I'd like to be able to store the 'current' data for the model in this 'someModel.data' object, and then every minute write a "snapshot" of the data associated with the model at that particular time to a 'model_log' collection:
eg:
model_log schema: {
model: {
type: 'Schema.ObjectId',
ref: 'SomeModel',
},
data: {
/* Model Data */
},
timestamp: {
type: Date,
default: Date.now,
}
}
so I can do: model_log.find({'timestamp': { $gte: startDate, $lte: endDate } });
and get back:
[
{
model: ObjectId('someModelId'),
data: {
someData: someValues,
otherData: otherValues,
},
timestamp: March 15, 2016 12:00 AM,
},
{
model: ObjectId('someModelId'),
data: {
someData: someNewValues,
otherData: otherNewValues,
},
timestamp: March 15, 2016 12:01 AM,
},
...
]
How can I make this more efficient or make these write/update operations faster?
Thanks,

There are a few options:
1. Split the load by sharding the collection.
2. Create a replica set in which a secondary server answers read queries while the primary handles the writes and pushes changes to the secondaries.
3. Using the WiredTiger storage engine, keep the collection in memory (I am unsure whether this is available in the Community edition).
4. Use SSD storage instead of HDD to reduce write latency.
5. Switch to WiredTiger, which locks at the document level instead of the collection level.
Options 3 and 5 can each be tested separately on a dev machine.
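Besides the infrastructure options, the number of writes itself can be reduced in the application. The following is only a minimal sketch, assuming it is acceptable to persist the latest state at a fixed interval instead of on every websocket message. UpdateBuffer and writeBatch are hypothetical names, not part of Mongoose, and the sketch assumes the top-level keys of message.data contain no dots.

```javascript
// Hypothetical helper: coalesces rapid updates per model id and hands them
// to a caller-supplied write function in one batch.
class UpdateBuffer {
  constructor(writeBatch) {
    // writeBatch is an assumption: e.g. ops => SomeModel.bulkWrite(ops)
    this.writeBatch = writeBatch;
    this.pending = new Map(); // modelId -> merged top-level data keys
  }

  // Merge the top-level keys of a message into the pending entry.
  add(modelId, data) {
    const merged = this.pending.get(modelId) || {};
    Object.assign(merged, data);
    this.pending.set(modelId, merged);
  }

  // Turn pending entries into updateOne operations and clear the buffer.
  flush() {
    if (this.pending.size === 0) return [];
    const ops = [];
    for (const [modelId, data] of this.pending) {
      const set = {};
      // Only the keys that changed are written, not the whole data object.
      for (const key of Object.keys(data)) set['data.' + key] = data[key];
      ops.push({ updateOne: { filter: { _id: modelId }, update: { $set: set } } });
    }
    this.pending.clear();
    this.writeBatch(ops);
    return ops;
  }
}
```

In the message handler you would call buffer.add(someModel._id, message.data) and flush from a timer, for example once per second. Whether that delay is acceptable depends on how fresh the front end needs the data to be.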

Related

Find value from sub array within last 30 days using Mongoose

I am trying to locate a certain value in a sub array using Mongoose.js with MongoDB. Below is my Mongoose schema.
const foobarSchema = new mongoose.Schema({
foo: {
type: Array,
required: true
},
comments: {
type: Array,
required: false
},
createdAt: { type: Date, required: true, default: Date.now }
});
The value I am trying to get is inside foo, so in foo I always have one array at place [0] which contains an object that is like the below
{
_id
code
reason
createdAt
}
I'd like to get the value of reason for all records created in the last 30 days. I've looked around on Stack Overflow and haven't found anything I could piece together. Below is my existing but non-working code:
const older_than = moment().subtract(30, 'days').toDate();
Foobar.find({ ...idk.. req.body.reason, createdAt: { $lte: older_than }})
Edit: added a mock document
{
foo: [{
_id: 'abc123',
code: '7a',
reason: 'failure',
createdAt: mongo time code date now
}],
comments: []
}
Current code, half working:
const reason = req.params.reason
const sevenAgo = moment().subtract(7, 'days').toISOString()
Foo.aggregate([
{
$match: {
"foo.createdAt": {
$gte: sevenAgo
},
"foo.reason": {
reason
}
}
},
{
$project: {
reason: {
$arrayElemAt: [
"$foo.reason",
0
]
}
}
}
])
This currently returns an empty array with no query error, which is wrong: it should return at least one document, since a matching record exists in the DB.
Expected mock data:
[
{
code: 7a,
reason: failure
}
{
code: 7a,
reason:failure
}
]
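A possible explanation, offered as a sketch rather than a confirmed fix: aggregation pipelines are not cast against the Mongoose schema, so an ISO string from toISOString() is compared as a string and never matches a stored date, and "foo.reason": { reason } matches an embedded object instead of the string. The helper below (buildReasonPipeline is a hypothetical name) assumes foo.createdAt is stored as a BSON date and builds the pipeline with a real Date and a plain string.

```javascript
// Hypothetical helper: builds the pipeline as plain data so it can be
// inspected before being passed to Foobar.aggregate(...).
function buildReasonPipeline(reason, days, now = new Date()) {
  // Use a Date object, not an ISO string: pipelines are not schema-cast.
  const since = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return [
    {
      $match: {
        // $elemMatch requires both conditions to hold on the same element.
        foo: { $elemMatch: { reason: reason, createdAt: { $gte: since } } }
      }
    },
    {
      $project: {
        _id: 0,
        code: { $arrayElemAt: ['$foo.code', 0] },
        reason: { $arrayElemAt: ['$foo.reason', 0] }
      }
    }
  ];
}
```

It would be used as Foobar.aggregate(buildReasonPipeline(req.params.reason, 30)). If createdAt was saved as a string inside foo, the comparison would need to be adapted accordingly.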

How to grab field value during a MongooseModel.bulkWrite operation?

Context:
I am trying to upsert in bulk an array of data, with an additional computed field: 'status'.
Status should be either :
- 'New' for newly inserted docs;
- 'Removed' for docs present in DB, but inexistent in incoming dataset;
- a percentage explaining the evolution for the field price, comparing the value in DB to the one in incoming dataset.
Implementations:
data.model.ts
import { Document, model, Model, models, Schema } from 'mongoose';
import { IPertinentData } from './site.model';
const dataSchema: Schema = new Schema({
sourceId: { type: String, required: true },
name: { type: String, required: true },
price: { type: Number, required: true },
reference: { type: String, required: true },
lastModified: { type: Date, required: true },
status: { type: Schema.Types.Mixed, required: true }
});
export interface IData extends IPertinentData, Document {}
export const Data: Model<IData> = models.Data || model<IData>('Data', dataSchema);
data.service.ts
import { Data, IPertinentData } from '../models';
export class DataService {
static async test() {
// await Data.deleteMany({});
const data = [
{
sourceId: 'Y',
reference: `y0`,
name: 'y0',
price: 30
},
{
sourceId: 'Y',
reference: 'y1',
name: 'y1',
price: 30
}
];
return Data.bulkWrite(
data.map(function(d) {
let status = '';
// @ts-ignore
console.log('price', this);
// @ts-ignore
if (!this.price) status = 'New';
// @ts-ignore
else if (this.price !== d.price) {
// @ts-ignore
status = (d.price - this.price) / this.price;
}
return {
updateOne: {
filter: { sourceId: d.sourceId, reference: d.reference },
update: {
$set: {
// Set percentage value when current price is greater/lower than new price
// Set status to nothing when new and current prices match
status,
name: d.name,
price: d.price
},
$currentDate: {
lastModified: true
}
},
upsert: true
}
};
}
)
);
}
}
... then in my backend controller, I just call it from a route:
try {
const results = await DataService.test();
return new HttpResponseOK(results);
} catch (error) {
return new HttpResponseInternalServerError(error);
}
Problem:
I've tried a lot of implementation syntaxes, but all failed, either because of type casting, unsupported syntax such as the $ symbol, or restrictions due to the aggregation...
I feel the above solution might be the closest to a working scenario, but I'm missing a way to grab the value of the price field BEFORE the actual computation of status and the replacement with the updated value.
Here the value of this is undefined, while it is supposed to point to the current document.
Questions:
Am I using the correct Mongoose way for a bulk update?
If yes, how do I get the field value?
Environment:
NodeJS 13.x
Mongoose 5.8.1
MongoDB 4.2.1
EUREKA !
Finally found a working syntax, pfeeeew...
...
return Data.bulkWrite(
data.map(d => ({
updateOne: {
filter: { sourceId: d.sourceId, reference: d.reference },
update: [
{
$set: {
lastModified: new Date(), // a Date, matching the schema type (Date.now() returns a number)
name: d.name,
status: {
$switch: {
branches: [
// Set status to 'New' for newly inserted docs
{
case: { $eq: [{ $type: '$price' }, 'missing'] },
then: 'New'
},
// Set percentage value when current price is greater/lower than new price
{
case: { $ne: ['$price', d.price] },
then: {
$divide: [{ $subtract: [d.price, '$price'] }, '$price']
}
}
],
// Set status to nothing when new and current prices match
default: ''
}
}
}
},
{
$set: { price: d.price }
}
],
upsert: true
}
}))
);
...
Explanations:
Several problems were blocking me:
- Field values must be referenced as '$field_name' rather than this.field; 'this' is undefined inside the map callback.
- The $ syntax seems to work only within an aggregation-pipeline update, i.e. using update: [] even if there is only a single $set inside.
- The first condition, used for documents inserted by the upsert, needs to check for the existence of the price field. Only the syntax with the BSON $type worked.
Hope it helps other devs in the same scenario.
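To sanity-check the values the $switch above should produce, the same branching can be mirrored in plain JavaScript. This is only an illustration of the logic, not something MongoDB runs; computeStatus is a hypothetical name and undefined stands in for a missing price field.

```javascript
// Plain-JS mirror of the $switch branches (illustration only).
function computeStatus(currentPrice, newPrice) {
  // Branch 1: no stored price yet, i.e. the upsert is inserting the doc.
  if (currentPrice === undefined) return 'New';
  // Branch 2: price changed, return the relative evolution.
  if (currentPrice !== newPrice) return (newPrice - currentPrice) / currentPrice;
  // Default: prices match, empty status.
  return '';
}

console.log(computeStatus(undefined, 30)); // New
console.log(computeStatus(20, 30));        // 0.5
console.log(computeStatus(30, 30));        // (empty string)
```

The price itself is set in a second pipeline stage in the answer so that the status is computed against the old value first.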

How to change nodejs schema according to my requirement

I have the following data that I want to send as shown below, but my schema does not allow it. When I send only a single {} body it works fine, but I want to send more than one body.
//Request body
{
"title" : "test10",
"sch_start": "2017-04-3",
"sch_end":"2017-04-3"
},
{
"title" : "test11",
"sch_start": "2017-04-4",
"sch_end":"2017-04-4"
}
import mongoose, {Schema} from 'mongoose';
/**
* Model to store Calendar entries of Engineer
*/
var EngineerScheduleSchema = new Schema({
title: { type: String, default: ''},
available: { type: Boolean, default: false },
sch_start: { type:Date, default: Date.now },
sch_end: { type:Date, default: Date.now }
});
export default mongoose.model('engineer_schedule', EngineerScheduleSchema);
// Main Model Mongoose Schema
schedule: [EngineerSchedule.schema]
//API Method
export function updateSchedulecalendar(req, res) {
var responseSchedule;
//Insert
return Engineer.findOneAndUpdate({ _id: req.params.id }, { $addToSet: { schedule: req.body } }, { new: true, upsert: true, setDefaultsOnInsert: true, runValidators: true }).exec()
.then((entity) => {
if (entity) {
responseSchedule = entity.schedule;
return EngineerEvents.emit(engineerEvents.updatedSchedule, req.user, entity);
}
else {
return res.status(404).end();
}
})
.then(()=> { return res.status(200).json(responseSchedule); })
.catch(handleError(res));
}
First, since you are trying to send many schedules, your request body should look like:
[{
"title" : "test10",
"sch_start": "2017-04-3",
"sch_end":"2017-04-3"
},
{
"title" : "test11",
"sch_start": "2017-04-4",
"sch_end":"2017-04-4"
}]
Second, you should take a look at the findOneAndUpdate documentation, because it seems to me that you are sending two schedule objects (test10 and test11) to be applied to the single document specified by req.params.id. That does not make much sense.
If you are looking to update multiple schedules in a single request, maybe you should implement a bulk update. Take a look at the bulk functionality; I would implement something like this:
export function updateSchedulecalendar(req, res) {
var responseSchedule;
var bulk = db.items.initializeUnorderedBulkOp();
// Iterate over every schedule sent by client
for (const schedule of req.body) {
// Generate a new update statement for that specific document
bulk.find( { title: schedule.title } ).update( { $set: { sch_start: schedule.sch_start, ... } } );
}
// Execute all updates
bulk.execute().then(function() {
// Validation
});
}
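If the intent is instead to append several entries to one engineer's schedule array, a different reading of the question, $addToSet can take an $each modifier. A minimal sketch under that assumption (buildScheduleUpdate is a hypothetical helper name):

```javascript
// Hypothetical helper: accepts one schedule object or an array of them and
// builds an update that adds every entry to the schedule array.
function buildScheduleUpdate(body) {
  const entries = Array.isArray(body) ? body : [body];
  // Without $each, an array body would be added as one nested array element.
  return { $addToSet: { schedule: { $each: entries } } };
}

console.log(JSON.stringify(buildScheduleUpdate([
  { title: 'test10', sch_start: '2017-04-3', sch_end: '2017-04-3' },
  { title: 'test11', sch_start: '2017-04-4', sch_end: '2017-04-4' }
])));
```

The result would replace { $addToSet: { schedule: req.body } } in the original findOneAndUpdate call, with the request body sent as a JSON array.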

Add or push new object to nested mongodb document

I can't seem to find an answer to this on Stack Overflow or in the Mongoose docs. How do I add a new object to a nested document?
This is my current schema:
var SessionsSchema = mongoose.Schema({
session: {
sid: String,
dataloop: {
timeStamp: Date,
sensorValues:{
value: Number,
index: Number
}
}
}
});
Upon receiving new data from the client, I need to push it into the existing session document. I've tried both $addToSet and $push, but neither gives me the correct result.
This is the $push:
Sessions.findOneAndUpdate(
{ 'session.sid': sessionID },
{
'$push:': {dataloop:{
timeStamp: datemilli,
sensorValues:{
value: pressure,
index: indexNum,
sessionTime: relativeTime
}
}
}
},
function(err,loop) {
console.log(loop);
}
)
Here is my expected output:
_id:58bb37a7e2950617355fab0d
session:Object
sid:8
dataloop:Object
timeStamp:2017-03-04 16:54:27.057
sensorValues:Object
value:134
index:18
sessionTime:0
dataloop:Object // <----------NEW OBJECT ADDED HERE
timeStamp:2017-03-04 16:54:27.059
sensorValues:Object
value:134
index:18
sessionTime:0
dataloop:Object // <----------ANOTHER NEW OBJECT
timeStamp:2017-03-04 16:54:27.059
sensorValues:Object
value:134
index:18
sessionTime:0
__v:0
If you are willing to change your schema so that dataloop is an array:
var SessionsSchema = mongoose.Schema({
session: {
sid: String,
dataloop: [{
timeStamp: Date,
sensorValues: {
value: Number,
index: Number,
sessionTime: Number // declared because the pushed object includes sessionTime
}
}]
}
});
You could then use $push on session.dataloop to add a new dataloop item:
Sessions.findOneAndUpdate({ 'session.sid': sessionID }, {
'$push': {
'session.dataloop': {
timeStamp: datemilli,
sensorValues: {
value: pressure,
index: indexNum,
sessionTime: relativeTime
}
}
}
},
function(err, loop) {
console.log(loop);
}
)

How to join two collections in mongoose

I have two Schema defined as below:
var WorksnapsTimeEntry = BaseSchema.extend({
student: {
type: Schema.ObjectId,
ref: 'Student'
},
timeEntries: {
type: Object
}
});
var StudentSchema = BaseSchema.extend({
firstName: {
type: String,
trim: true,
default: ''
// validate: [validateLocalStrategyProperty, 'Please fill in your first name']
},
lastName: {
type: String,
trim: true,
default: ''
// validate: [validateLocalStrategyProperty, 'Please fill in your last name']
},
displayName: {
type: String,
trim: true
},
municipality: {
type: String
}
});
I would like to loop through each student and show their time entries. So far I have this code, which is obviously not right, as I still don't know how to join the WorksnapsTimeEntry collection.
Student.find({ status: 'student' })
.populate('student')
.exec(function (err, students) {
if (err) {
return res.status(400).send({
message: errorHandler.getErrorMessage(err)
});
}
_.forEach(students, function (student) {
// show student with his time entries....
});
res.json(students);
});
Does anyone know how to achieve this?
As of MongoDB 3.2, you can use $lookup in an aggregation pipeline to perform a left outer join.
Student.aggregate([{
$lookup: {
from: "worksnapsTimeEntries", // collection name in db
localField: "_id",
foreignField: "student",
as: "worksnapsTimeEntries"
}
}]).exec(function(err, students) {
// students contain WorksnapsTimeEntries
});
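To make the shape of the $lookup result concrete, here is an in-memory illustration of the same left outer join. It does not touch MongoDB; lookupInMemory and the sample documents are made up for the example.

```javascript
// In-memory illustration of what $lookup returns (not a MongoDB call).
function lookupInMemory(students, timeEntries) {
  return students.map(function (student) {
    // Collect every entry whose "student" field points at this student.
    const matches = timeEntries.filter(function (entry) {
      return String(entry.student) === String(student._id);
    });
    // Left outer join: students without entries are kept with an empty array.
    return Object.assign({}, student, { worksnapsTimeEntries: matches });
  });
}

// Hypothetical sample data
const sampleStudents = [
  { _id: 's1', firstName: 'Ann' },
  { _id: 's2', firstName: 'Bob' }
];
const sampleEntries = [
  { _id: 'e1', student: 's1', timeEntries: { minutes: 30 } }
];
console.log(JSON.stringify(lookupInMemory(sampleStudents, sampleEntries), null, 2));
```

Each student comes back with a worksnapsTimeEntries array, which is empty when nothing matched.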
You don't want .populate() here but instead you want two queries, where the first matches the Student objects to get the _id values, and the second will use $in to match the respective WorksnapsTimeEntry items for those "students".
Using async.waterfall just to avoid some indentation creep:
async.waterfall(
[
function(callback) {
Student.find({ "status": "student" },{ "_id": 1 },callback);
},
function(students,callback) {
WorksnapsTimeEntry.find({
"student": { "$in": students.map(function(el) {
return el._id
}) }
},callback);
}
],
function(err,results) {
if (err) {
// do something
} else {
// results are the matching entries
}
}
)
If you really must, then you can .populate("student") on the second query to get populated items from the other table.
The reverse case is to query on WorksnapsTimeEntry and return "everything", then filter out any null results from .populate() with a "match" query option:
WorksnapsTimeEntry.find().populate({
"path": "student",
"match": { "status": "student" }
}).exec(function(err,entries) {
// Now client side filter un-matched results
entries = entries.filter(function(entry) {
return entry.student != null;
});
// Anything not populated by the query condition is now removed
});
So that is not a desirable action, since the "database" is not filtering what is likely the bulk of results.
Unless you have a good reason not to do so, then you probably "should" be "embedding" the data instead. That way the properties like "status" are already available on the collection and additional queries are not required.
If you are using a NoSQL solution like MongoDB, you should be embracing its concepts rather than sticking to relational design principles. If you are consistently modelling relationally, then you might as well use a relational database, since you won't be getting any benefit from a solution that has other ways to handle that.
It is late, but this may help other developers.
Verified with
"mongodb": "^3.6.2",
"mongoose": "^5.10.8",
Join two collections in mongoose
// Build the query first, then run it once with exec()
ProductModel.find({})
.populate('category', 'name') // select only the name field of the joined category
//.populate('category') // select all category fields
.skip(0).limit(20)
//.sort({ createdAt: -1 })
.exec((err, records) => {
if (err) {
// handle the error, e.g. throw new Error('xyz')
} else {
// return records
}
});
ProductModel Schema
const CustomSchema = new Schema({
category:{
type: Schema.ObjectId,
ref: 'Category'
},
...
}, { timestamps: true, collection: 'products' }); // Schema takes a single options object
module.exports = model('Product',CustomSchema)
Category model schema
const CustomSchema = new Schema({
name: { type: String, required:true },
...
}, {collection: 'categories'});
module.exports = model('Category',CustomSchema)
