nodejs bulk update without compromising the performance

I have a Node application with MySQL as the DB. I am working on an endpoint that will update multiple rows, with different data for each. Also, I am using Sequelize as the ORM.
Now I know I can update a row like:
model.update(data).then(() => { res.end('Row Updated'); });
Now my question is: where should I call the update method for the second model, i.e. in the callback passed to then(), or after the first model.update() call?
I mean, which of the following would be best practice?
model1.update(data1).then(() => { console.log('Row 1 Updated'); });
model2.update(data2).then(() => { console.log('Row 2 Updated'); });
**OR**
model1.update(data1).then(() => {
  model2.update(data2).then(() => { console.log('All the rows have been updated'); });
});

Neither of the above is correct, because the promises aren't chained properly, which impedes proper control flow and error handling.
If the queries are independent, they can be performed in parallel:
Promise.all([
  model1.update(data1),
  model2.update(data2)
])
.then(() => {
  console.log('All the rows have been updated');
});
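If, on the other hand, the second update depends on the first, chain them by returning the second promise from then(), so the chain stays flat and a single catch() handles errors from either step:
model1.update(data1)
  .then(() => model2.update(data2))
  .then(() => {
    console.log('All the rows have been updated');
  })
  .catch((err) => {
    console.error('One of the updates failed', err);
  });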

Related

what is the difference between document middleware, model middleware, aggregate middleware, and query middleware?

I am fairly new to MongoDB and Mongoose, and I am really confused about why some middleware works on the document and some works on the query. I am also confused about why some query methods return documents and some return queries. If a query returns a document that is understandable, but why would a query return a query, and what really is it?
Adding to my question: what is a Document function and a Model or Query function, given that both of them have some common methods like updateOne?
Moreover, I have gathered all these doubts from the mongoose documentation.
Tl;dr: the type of middleware most commonly defines what the this variable in a pre/post hook refers to:
| Middleware | 'this' refers to the | Hook methods |
| --- | --- | --- |
| Document | Document | validate, save, remove, updateOne, deleteOne, init |
| Query | Query | count, countDocuments, deleteMany, deleteOne, estimatedDocumentCount, find, findOne, findOneAndDelete, findOneAndRemove, findOneAndReplace, findOneAndUpdate, remove, replaceOne, update, updateOne, updateMany |
| Aggregation | Aggregation object | aggregate |
| Model | Model | insertMany |
Long explanation:
Middlewares are nothing but built-in methods to interact with the database in different ways. However, as there are different ways to interact with the database, each with different advantages or preferred use-cases, they also behave differently from each other, and therefore their middlewares can behave differently, even if they have the same name.
By themselves, middlewares are just shorthands/wrappers for MongoDB's native driver that is used under the hood of mongoose. Therefore, you can usually use all middlewares as if you were using regular methods of objects, without having to care whether it is a Model-, Query-, Aggregation- or Document-middleware, as long as it does what you want it to.
However, there are a couple of use-cases where it is important to differentiate the context in which these methods are being called.
The most prominent use-case is hooks, namely the *.pre() and *.post() hooks. These hooks are methods that you can "inject" into your mongoose setup so that they are executed before or after specific events.
For example:
Let's assume I have the following Schema:
const productSchema = new Schema({
  name: 'String',
  version: {
    type: 'Number',
    default: 0
  }
});
Now, let's say you want the version field to automatically increase by 1 with every save.
The easiest way to do this is to define a hook that takes care of it for us, so we don't have to think about it every time we save an object. If we, for example, use .save() on a document we just created or fetched from the database, we just have to add the following pre-hook to the schema:
productSchema.pre('save', function (next) {
  this.version = this.version + 1; // or this.version += 1;
  next();
});
Now, whenever we call .save() on a document of this schema/model, it will always increment the version before it is actually being saved, even if we only changed the name.
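For illustration, a quick usage sketch (assuming the schema has been compiled into a model named Product, and that this runs inside an async function with an open connection):
const Product = mongoose.model('Product', productSchema);

const product = new Product({ name: 'Toaster' });
await product.save(); // pre('save') hook ran: version is now 1

product.name = 'Turbo Toaster';
await product.save(); // version is now 2, although we only changed the name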
However, what if we don't use .save() or any other document-only middleware, but e.g. a query middleware like findOneAndUpdate() to update an object?
Then we won't be able to use the pre('save') hook, as .save() won't be called. In this case, we'd have to implement a similar hook for findOneAndUpdate().
Here, however, we finally come to the differences between the middlewares, as the findOneAndUpdate() hook won't allow us to do that: it is a query hook, meaning it does not have access to the actual document, but only to the query itself. So if we e.g. only change the name of the product, the following middleware would not work as expected:
productSchema.pre('findOneAndUpdate', function (next) {
  // this.version is undefined on the query, so this would assign NaN
  this.version = this.version + 1;
  next();
});
The reason for this is that the object is updated directly in the database and not first "downloaded" to Node.js, edited, and "uploaded" again. This means that in this hook this refers to the query and not the document, so we don't know what the current state of version is.
If we want to increment the version in a query like this, we need to change the hook so that it adds an $inc operator to the update document instead:
productSchema.pre('findOneAndUpdate', function (next) {
  // add an $inc operator to this query's update document
  this.getUpdate().$inc = { version: 1 };
  next();
});
Alternatively, we could emulate the previous logic by manually fetching the target document in an async hook and deriving the new value from it. This is less efficient, as it always hits the db twice for every update, but it keeps the logic consistent:
productSchema.pre('findOneAndUpdate', async function () {
  // fetch the document this query is about to update
  const productToUpdate = await this.model.findOne(this.getQuery());
  // write the incremented version into the query's update document
  this.set('version', productToUpdate.version + 1);
});
For a more detailed explanation, please check the official documentation, which also has a dedicated paragraph on the problem of colliding method names (e.g. remove() being both a Document and a Query middleware method).

Do I have to write the reverse operation for migration rollback?

I use knex.js and it's a good query builder for PostgreSQL. I haven't found any docs that explain how to do a migration rollback in the right way.
For now, I just write the reverse operation in the down function to roll back the migration. Is this the correct way?
import * as Knex from 'knex';

exports.up = async (knex: Knex): Promise<any> => {
  await knex.schema.raw(`
    ALTER TABLE IF EXISTS "GOOGLE_CHANNEL"
    ADD COLUMN IF NOT EXISTS google_channel_ad_group_cpc_bid INTEGER NOT NULL DEFAULT 0;
  `);
  await knex.schema.raw(`
    UPDATE "GOOGLE_CHANNEL" as gc
    SET google_channel_ad_group_cpc_bid = 7
    FROM "CAMPAIGN_TEMPLATE" as ct
    WHERE ct.campaign_channel_id = gc.campaign_channel_id;
  `);
};

exports.down = async (knex: Knex): Promise<any> => {
  // TODO: migration rollback
  await knex.schema.raw(``);
};
I have two concerns:
If there are a lot of SQL statements in the up function, I have to write a lot of SQL statements in the down function too in order to roll back the migration.
Why doesn't knex.js do the migration rollback for us, without us writing the reverse operations? I mean, knex.js could take a snapshot or record a savepoint of the database.
Yes, to roll back you use the down function of a migration script. When you run knex migrate:rollback, the down function will run. Knex has meta tables in the database that are used to figure out which migrations have run or not.
For example:
exports.up = function (knex, Promise) {
  return knex.schema
    .createTable('role', function (table) {
      table.increments('role_id').primary();
      table.string('title').notNullable().unique();
      table.string('description');
      table.integer('level').notNullable();
    })
    .createTable('user_account', function (table) {
      table.increments('user_id').primary();
      table.integer('role_id').references('role_id').inTable('role').notNullable();
      table.string('username').notNullable().unique();
      table.string('passwordHashed').notNullable();
      table.string('email', 50).notNullable().unique();
    });
};

exports.down = function (knex, Promise) {
  return knex.schema
    .dropTable('user_account')
    .dropTable('role');
};
Here I create two tables in the up function. The user_account table has a foreign key constraint linking it to the role table, which means I have to drop user_account before role in the down function.
In your case, you use an UPDATE statement. In the down function you have to either run a new update with a hard-coded value (the old one from before the migration), or make sure you store the old value in a history table.
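For the specific migration in your question, a minimal down function could simply drop the column again; that reverses both statements of your up function at once, because the updated values live in the column being dropped (a sketch, assuming nothing else depends on the column):
exports.down = async (knex: Knex): Promise<any> => {
  // Dropping the column undoes both the ALTER TABLE and the UPDATE from up()
  await knex.schema.raw(`
    ALTER TABLE IF EXISTS "GOOGLE_CHANNEL"
    DROP COLUMN IF EXISTS google_channel_ad_group_cpc_bid;
  `);
};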
As for your concerns:
Yes, if you add a lot of stuff, you also have to add a lot of code to reverse whatever you are doing. You can skip writing the down scripts, but then you won't be able to roll back. Some (many?) choose to only go forward and never roll back: if they have to fix something, they don't roll back but write a new migration script with the fix.
I would recommend that you create the down functions from the beginning. You can consider skipping them when the time is right. People who don't write down functions usually have to test their migrations more thoroughly in a test or staging environment before deploying to production, to make sure everything works, because they can't roll back after all.
I can't really answer for the Knex creators here. However, what you are describing as a potential solution is basically a backup of the database before a migration is done. After all, a migration does more than just change the layout of the tables, etc. A migration script will typically add or remove new rows as well. You can use the backup approach, but you have to take the backups yourself.
Knex is a fairly simple query builder. If you want the migration scripts to be written for you, you might want to go for a full-blown OR mapper.

How to create multiple documents in Mongoose in one request

How would I go about creating multiple documents with different schemas in one REST API request in Node/Mongoose/Express?
Say, for example, I need to create a user and a site in a single request, e.g. /createUser.
I could of course create the user and then, in the returned promise, create the next record, but what if that second record doesn't pass validation? Then I've created a user without the second record.
User.create(userData)
  .then(user => {
    Site.create(siteData)
      .then(site => {
        // Do something
      })
      .catch(err => {
        console.log(err);
        // If this fails, I'm left with a user created without a site.
      });
  })
  .catch(err => {
    console.log(err);
  });
Is there a good practice to follow when creating multiple documents like this? Should I run manual validation instead before each .create() runs? Any guidance/advice would be very much appreciated!
You have the transaction problem here: you are writing to two different models but want the whole operation to be atomic, and if either write fails you need to roll back. Up until MongoDB 4.0, transactions were not supported, and the workaround for this sort of issue was two-phase commits. Since MongoDB 4.0, we have transactions to cater for such problems.
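For example, a minimal sketch using Mongoose sessions (this assumes MongoDB >= 4.0 running as a replica set, which transactions require, and Mongoose >= 5.2; the helper name is just for illustration):
const mongoose = require('mongoose');

async function createUserAndSite(userData, siteData) {
  const session = await mongoose.startSession();
  session.startTransaction();
  try {
    // Pass the session so both writes belong to the same transaction.
    // Note: create() expects an array of docs when options are passed.
    const [user] = await User.create([userData], { session });
    await Site.create([siteData], { session });
    await session.commitTransaction();
    return user;
  } catch (err) {
    // If the site fails validation, the user insert is rolled back too
    await session.abortTransaction();
    throw err;
  } finally {
    session.endSession();
  }
}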
Hope it helped.

Express with pug, Postgres and proper MVC

I recently started using Node.js + Express.js (generated with pug) + pg-promise for handling the db.
My first target is to obtain data from Postgres (already set up) and display it nicely using render and pug. Let's say it is the user list from a Users table.
From this restful tutorial I have learned how to get data and return it as JSON - it worked.
Based on Mozilla's tutorial I separated my code:
routes/users.js: where for '/' I call the user_controller.user_list method (using router.get)
controllers/userController.js: here I have exported user_list, where I would like to ask the model for data and call render once I have results
queries.js: which is kind of my model? But I'm not sure. It has an API: a connection to the db with promises, and one function for every query I am going to use in the controllers. I believe I should have one model file per table (or any logical entity), but where should I store the pgp connection?
This file is based on the first tutorial I mentioned:
// queries.js (connectionString is set properly to my postgres)
var pgp = require('pg-promise')(options);
var db = pgp(connectionString);

function getUsers(req, res, next) {
  db.any('SELECT (user_id, username) FROM public.users ORDER BY user_id ASC LIMIT 1000')
    .then(function (data) {
      res.json({ data: data });
    })
    .catch(function (err) {
      return next(err);
    });
}

module.exports = {
  getUsers: getUsers
};
Here my problem starts, as most tutorials use mongoose, which is very model-db-schema-friendly, whereas what I have is a simple 'SELECT ...' string that I pass to pg-promise's any() function.
Therefore I have no model class like User.
In userController.js I don't know how to call getUsers() and handle its data. Having getUsers() return a JS object would be nice.
Also: where should I call render? In the controller, or only in
db.any(...).then(function (data) { <--here--> })
Before that, I also tried to embed the whole Postgres handling into the controller, but from db.any() I got this array to handle:
[{ row: '(1,John)' }, { row: '(2,Amy)' }, { row: '(50,Peter)' }]
I didn't know how to go on from there, as I had probably lost my API functionality as well ;-)
I am browsing through multiple tutorials on how to handle MVC, but usually they cover MongoDB and satisfy readers with res.send(), not render().
I am not sure that I understand what your question is exactly about, but since I do not have enough reputation to comment, I'll do my best to help you with your interrogations. :)
First, regarding the queries.js file: it is IMO not exactly a model, but rather a DAO (Data Access Object) file. The DAO comes between your Model (which is actually your database) and your Controller layers. There is usually one DAO file per object (User, Pet, whatever you want) in your data model.
When the data model is rather complex, it can be useful to use an Object Relational Mapping (ORM) such as Mongoose to map your database and execute complex processes on your objects. In such a case, you might need a specific file per object to describe your model and store your queries. But since you don't need an ORM, your DAO can interact directly with your database. That is why you do not have a User.js file.
Regarding the way the db object should be used, I think you should refer directly to pg-promise documentation on the matter.
IMPORTANT: For any given connection, you should only create a single Database object in a separate module, to be shared in your application (see the code example below). If instead you keep creating the Database object dynamically, your application will suffer from loss in performance, and will be getting a warning in a development environment (when NODE_ENV = development).
As a matter of fact, a db object in pg-promise sort of represents the database itself and is actually designed for the simultaneous use of several databases, which does not seem to be your case for the moment.
Finally, when it comes to the render function, I believe it should be in the controller, as your DAO is not supposed to know how the data it has gathered is going to be used.
Modularity is always a time-saving choice on the long-term.
Furthermore, note that you might later need a Business Layer between your DAO and your controller, in order to preprocess and postprocess data you are going to persist or to display. In such a case, if you need for instance to ask for data from your database, you will need to render data after it is processed by the Business layer. If the render is made in the DAO layer, it will not be possible.
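To illustrate the split with the code from the question (the 'user_list' view name is just a placeholder):
// queries.js (DAO): returns the data instead of rendering it
var pgp = require('pg-promise')(options);
var db = pgp(connectionString); // single shared Database object, as the docs require

function getUsers() {
  // Note: no parentheses around the column list; SELECT (user_id, username)
  // makes Postgres return one composite column, which is why you saw
  // rows like { row: '(1,John)' }
  return db.any('SELECT user_id, username FROM public.users ORDER BY user_id ASC LIMIT 1000');
}

module.exports = { getUsers: getUsers };

// controllers/userController.js: the render stays in the controller
var queries = require('../queries');

exports.user_list = function (req, res, next) {
  queries.getUsers()
    .then(function (users) {
      res.render('user_list', { users: users });
    })
    .catch(next);
};
This way queries.js knows nothing about HTTP or views, and the controller decides whether to res.json() or res.render() the data.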
In the link I provided earlier to pg-promise's db object connection, you will also find documentation on the any() method. You might already have looked it up.
It specifically states that it returns
A promise object that represents the query result:
When no rows are returned, it resolves with an empty array.
When 1 or more rows are returned, it resolves with the array of rows.
so your returned data is a JS array, which is already a JS object. If what you need is a JSON string instead, use JSON.stringify(yourArray) to process your data before rendering it in your controller.
But I wonder if Pug is not able to use your data directly.
Also, if you cannot get any data out of your DAO, maybe you should check that your data object is not empty, as such a case is tolerated by the any() method. If you expect your query to always return something, you might want to consider using the many() or the one() methods.
I hope this helps you.

nodejs - sails.js lock query (transactions)

The case:
I'm creating an API with Sails.js; Sails.js uses Waterline as its ORM. The API returns, let's say, photographs, and many users are able to vote for a picture. The pictures will be ordered by number of votes.
Procedure:
When a user votes for a picture, I have to check the number of votes ("SELECT" || picture.findById()) AND after that increment that number by one ("UPDATE" || picture.update).
Problem:
Transactions/locking in Sails.js: these two queries should be executed without another query modifying the picture data between the select and the update of our vote system.
How should we perform locking/transactions in Sails.js (a Node.js framework)?
THANKS
Sails does support transactions.
Here is an example of a transaction in sails.js:
await sails.getDatastore().transaction(async (db) => {
  await Model
    .create({ foo: bar })
    .usingConnection(db);

  if (somethingWentWrong) {
    throw 'error happened - transaction rollback';
  }

  await Model
    .update({ id: 1 })
    .set({ votes: 1 })
    .usingConnection(db);
}).intercept('Error', () => res.serverError());
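Applied to the vote scenario from the question, both queries can run on the same connection inside one transaction, so the read and the increment happen together (a sketch; Picture and pictureId are assumed names):
await sails.getDatastore().transaction(async (db) => {
  // read the current number of votes inside the transaction
  const picture = await Picture.findOne({ id: pictureId }).usingConnection(db);

  // increment it using the same connection
  await Picture.update({ id: pictureId })
    .set({ votes: picture.votes + 1 })
    .usingConnection(db);
});
Note that, depending on the database's isolation level, a plain SELECT inside a transaction may not actually lock the row; if a hard lock is required, a native SELECT ... FOR UPDATE can be sent with sendNativeQuery() on the same connection.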
