Do I have to write the reverse operation for migration rollback?

I use knex.js, which is a good query builder for PostgreSQL. I haven't found any docs that explain how to do a migration rollback the right way.
For now, I just write the reverse of the migration in the down function to roll the migration back. Is this the correct way?
import * as Knex from 'knex';
exports.up = async (knex: Knex): Promise<any> => {
await knex.schema.raw(`
ALTER TABLE IF EXISTS "GOOGLE_CHANNEL"
ADD COLUMN IF NOT EXISTS google_channel_ad_group_cpc_bid INTEGER NOT NULL DEFAULT 0;
`);
await knex.schema.raw(`
UPDATE "GOOGLE_CHANNEL" as gc
SET
google_channel_ad_group_cpc_bid = 7
FROM "CAMPAIGN_TEMPLATE" as ct
WHERE ct.campaign_channel_id = gc.campaign_channel_id;
`);
};
exports.down = async (knex: Knex): Promise<any> => {
// TODO: migration rollback
await knex.schema.raw(``);
};
I have two concerns:
If there are a lot of SQL statements in the up function, I have to write a lot of SQL statements in the down function too in order to roll back the migration.
Why doesn't knex.js roll back the migration for us, without us writing the reverse operation? I mean, knex.js could take a snapshot or record a savepoint of the database.

Yes, to roll back you use the down function of a migration script. When you run knex migrate:rollback, the down function runs. Knex keeps meta tables in the database that it uses to figure out which migrations have and have not been run.
For example:
exports.up = function (knex, Promise) {
return knex.schema
.createTable('role', function (table) {
table.increments('role_id').primary();
table.string('title').notNullable().unique();
table.string('description');
table.integer('level').notNullable();
})
.createTable('user_account', function (table) {
table.increments('user_id').primary();
table.integer('role_id').references('role_id').inTable('role').notNullable();
table.string('username').notNullable().unique();
table.string('passwordHashed').notNullable();
table.string('email', 50).notNullable().unique();
});
};
exports.down = function (knex, Promise) {
return knex.schema
.dropTable('user_account')
.dropTable('role');
};
Here I create two tables in the up function. The user_account has a foreign key constraint, and links with the role table, which means I have to drop the user_account table before the role table in the down function.
In your case, you use an update statement. In the down function you have to either run a new update with a hard-coded value (the old one from before the migration), or make sure you store the old value in a history table.
As for your concerns:
Yes, if you add a lot of stuff, you also have to add a lot of code to reverse whatever you are doing. You can skip writing the down scripts, but then you won't be able to roll back. Some (many?) choose to only go forward and never roll back. If they have to fix something, they don't roll back but write a new migration script with the fix.
I would recommend writing the down functions in the beginning. You can consider skipping them when the time is right. People who don't write down functions usually have to test their migrations more thoroughly in a test or staging environment before deploying to production. This is to make sure the migration works, because they can't roll back afterwards.
I can't really answer for the Knex creators here. However, what you are describing as a potential solution is basically a backup of the database taken before a migration is run. After all, a migration does more than just change the layout of the tables; a migration script will typically add or remove rows as well. You can use the backup approach, but you have to take the backups yourself.
Knex is a fairly simple query builder. If you want the migration scripts to be written for you, you might want to go for a full-blown ORM (object-relational mapper).

Related

Running one time DML/Update scripts using Prisma

Does Prisma support running one-time DML statements, such as UPDATE, automatically?
For example, let's say we want all emails in a table to be lowercase. We make a change in our API so that all future accounts/emails are lowercase, however we want to update EXISTING emails to be lowercase too.
Running npx prisma generate and npx prisma migrate executes DDL to keep your schema in-sync. However, I do not see a place to hold database "patch" files. These files generally are run once in order to update existing records in a database.

Prisma doesn't support running one-time DML statements automatically.
You would need something like a cron job if you want to run a function at specific intervals.
For a one-time change, you can just invoke the function once.
For your particular use case, you could use the function below.
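const { PrismaClient } = require('@prisma/client');
const prisma = new PrismaClient();
async function main() {
  // One-time patch: lowercase every existing email.
  const result = await prisma.$executeRaw`UPDATE "User" SET email=lower(email)`;
  console.log(result); // number of affected rows
}
main().finally(() => prisma.$disconnect());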
You can learn more from the Prisma guide on raw database access.

Validate @VersionColumn value before saving an entity with TypeORM

I'm currently working on saving data in a Postgres DB using TypeORM with the NestJS integration. The data keeps track of a version property using TypeORM's @VersionColumn feature, which increments a number each time save() is called on a repository.
For my feature it is important to check this version number before updating the records.
Important
I know I could technically achieve this by retrieving the record before updating it and checking the versions, but this leaves a small window for errors. If a second user updates the same record in the millisecond between the get and the save (or in a longer gap, if the save is delayed for some reason), the version is incremented and the data in the first call becomes invalid. TypeORM doesn't check the version value, so even if a call carries a lower version than the one in the database, it still saves the data even though it should be treated as out of date.
1: User A checks latest version => TypeORM gives back the latest version: 1
2: User B updates record => TypeORM ups the version: 2
3: User A saves their data with version 1 <-- This needs to validate the versions first.
4: TypeORM overwrites User B's record with User A's data
What I'm looking for is a way to make TypeORM decline step 3 as the latest version in the database is 2 and User A tries to save with version 1.
I've tried using the query builder and update statements to make this work, but the built-in @VersionColumn only increments the version on a save() call from a repository or entity manager.
Besides this, I also got a tip to look into database triggers, but as far as I could find, this feature is not yet supported by TypeORM.
Here is an example of the setup:
async update(entity: Foo): Promise<boolean> {
const value = await this._configurationRepository.save(entity);
if (value === entity) {
return true;
}
return false;
}

In my opinion, something like this is much better served through triggers directly in the Database as it will fix concerns around race conditions as well as making it so that modifications made outside the ORM will also update the version number. Here is a SQL Fiddle demonstrating triggers in action. You'll just need to incorporate it into your schema migrations.
Here is the relevant DDL from the SQL Fiddle example:
CREATE TABLE entity_1
(
id serial PRIMARY KEY,
some_value text,
version int NOT NULL DEFAULT 1
);
CREATE OR REPLACE FUNCTION increment_version() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.version = NEW.version + 1;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
CREATE TRIGGER increment_entity_1_version
BEFORE UPDATE
ON entity_1
FOR EACH ROW
EXECUTE PROCEDURE increment_version();
The same trigger function can be used for any table that has a version column in case this is a pattern you want to use across multiple tables.
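Note that the trigger on its own only increments the version; it does not reject a write that was based on an older version. One way to add that check is to make the UPDATE conditional on the version the caller last read and to treat zero affected rows as a conflict. The sketch below is not a TypeORM API: updateIfCurrent is a hypothetical helper, and it assumes a client whose query(text, values) method resolves to an object with a rowCount property, as node-postgres does.

```javascript
// Hypothetical helper: write only if the row still has the version we read.
// Assumes db.query(text, values) resolves to { rowCount } (node-postgres style)
// and that the increment_version trigger above bumps the version on success.
async function updateIfCurrent(db, id, expectedVersion, someValue) {
  const result = await db.query(
    'UPDATE entity_1 SET some_value = $1 WHERE id = $2 AND version = $3',
    [someValue, id, expectedVersion]
  );
  if (result.rowCount === 0) {
    // Either the row does not exist or someone else updated it first.
    throw new Error(`Stale version ${expectedVersion} for entity_1 id ${id}`);
  }
}
```

In the scenario from the question, User A's save with version 1 would match no row once User B's update has moved the version to 2, so the helper throws instead of overwriting.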

I think you are looking for concurrency control. If so, there is a solution about halfway down this thread: TypeORM concurrency control issue

How can I get the entire updated entry in a $afterUpdate hook in objection models?

I'm using Objection.js as my ORM for a simple rainfall application. I need to be able to dynamically update an entry of one table when an entry in a lower-level table has been updated. To do this I need the whole entry I am updating, so I can use that data to correctly update the dynamically updated entry.
I'm using the $afterUpdate hook for the lower-level table entry. The issue I am having is that when I log this within the $afterUpdate hook function, it only contains the properties for the parts of the entry I want to update. How can I get the entire entry? I'm sure I could get the record by running an additional query against the DB, but I was hoping there would be a way to avoid this. Any help would be appreciated.

I think, as of right now, you can only get the whole model with an extra query.
If you are doing the update with an instance query ($query) you can get the other properties from options.old.
Query:
const user = await User.query().findById(userId);
await user.$query()
.patch({ name: 'Tom Jane' })
Hook:
$afterUpdate(opt, queryContext) {
console.log(opt.old)
}
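When opt.old is available, the full post-update entry can be approximated without a second query by overlaying the patched properties on top of the old ones. The helper below is a hypothetical sketch that works on plain objects; it assumes that no trigger or column default changes other columns during the update.

```javascript
// Hypothetical helper for use inside $afterUpdate(opt, queryContext):
//   const full = mergeUpdated(opt.old, this);
// Returns null when the old values are missing (e.g. not an instance query).
function mergeUpdated(oldValues, patchedValues) {
  if (!oldValues) return null;
  // Later spreads win, so patched properties override the old ones.
  return { ...oldValues, ...patchedValues };
}

const before = { id: 1, name: 'Tom', rainfall: 12 };
console.log(mergeUpdated(before, { name: 'Tom Jane' }));
// { id: 1, name: 'Tom Jane', rainfall: 12 }
```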
Patch
If you don't need to do this in the hook, you might want to use the patch function chained with first().returning('*') to get the whole model in a single query. In PostgreSQL this is more efficient than patchAndFetchById, as stated in the documentation:
Because PostgreSQL (and some others) support returning('*') chaining, you can actually insert a row, or update / patch / delete (an) existing row(s), and receive the affected row(s) as Model instances in a single query, thus improving efficiency. See the examples for more clarity.
const jennifer = await Person
.query()
.patch({firstName: 'Jenn', lastName: 'Lawrence'})
.where('id', 1234)
.returning('*')
.first();
References:
http://vincit.github.io/objection.js/#postgresql-quot-returning-quot-tricks
https://github.com/Vincit/objection.js/issues/185
https://github.com/Vincit/objection.js/issues/695

Automating/Tracking Knex Migrations and Lucid Models

The Situation
I recently started working on a new project using Node.js. I have a background in Python/Django and C#/.NET (not a huge fan of the latter). Node is awesome, but I must say I miss the ease of building models and automating migrations in Django. I am currently using the AdonisJS framework, which leverages Knex. Knex is a powerful library, but the migrations all need to be built manually. Additionally, the AdonisJS ORM that manages the models is independent of Knex (the migration manager). You also do not define field attributes on the models, which can have benefits for doing things dynamically in the front and back end. All things considered, there is a lot of room for human error and miscommunication, and a boatload more typing is required. I know the hot thing these days is to keep it loose and fast, but for this specific project I am looking for a bit more structure than loosely defined models.
Current State
What I have landed on is building a new Class called tableModel and a field class to define the fields within table model. I have already completed this and I am successfully writing the migration files leveraging mustache. I plan on also automatically writing the Models which I shouldn't have a problem with (fingers crossed).
The Problem
Here is where it gets a little tough and where I need help...I need to track what has been added or removed via migration so I can effectively write ups and downs as the tableModels change over time.
So let's say I add a "tableModel" which creates a migration to create table Foo with fields {id (bigint), user_id(int), name(string255)}
Later I want to add a field called description so I would simply add it to my "tableModel" and then run a build command which would build out the migration.
How do I check what has already been created though so I only do an up() for description?
Then I want to remove the name field, so I comment it out in my "tableModel" and run a build migration command. How do I check what has already been migrated and now needs to be added to the down()?
Edit: I would add a remove field to the up and the corresponding roll back to the down.
Bonus Round
Let's say I want to change user_id from an int to a bigint, because who makes a foreign key just an int? How do I check not just what needs to be added to the up and down, but also whether I need to change a property on a field?
Edit: would just write the up. and a corresponding roll back to the down
The Big Question
Basically, how do I detect dirty "tableModel" classes?
Possible Solution?
I am thinking that maybe I should capture some type of registry or snapshot, run the comparison when building the migrations and/or models, and then recapture the snapshot. If this is the route, should I store it in a JSON file, write it to the DB itself, or is there another/better option?
If I create the tableModel instances as constants, could I actually write back to the JS file and capture the snapshot as an attribute? If this is an option, is Node's file system module the way to go, and what's the best way to do this? Node keeps surprising me, so I wouldn't be baffled if any of these are an option.
Help!
If anyone has gone down this path before or knows of any tools I could leverage, I would greatly appreciate it and thank you in advance. Also, if I am headed in a completely wrong direction, then please let me know, I both handle and appreciate all types of feedback.
Example
Something to note, when I define the "tableModel" for a given migration or model, it is an instance of the class, I am not creating an extended class since this is not my orm.
class tableModel {
constructor(tableName, modelName = tableName, fields = []) {
this.tableName = tableName
this.modelName = modelName
this.fields = fields
}
// Bunch of other stuff
}
const fooTableModel = new tableModel('fooTable', 'fooModel', [
  new tableField.stringField('title'),
  new tableField.bigIntField('related_user_id'),
  new tableField.textField('description', 'Testing Default', false, true)
])
which equates to:
tableModel {
tableName: 'fooTable',
modelName: 'fooModel',
fields:
[ stringField {
name: 'title',
type: 'string',
_unique: false,
allow_null: null,
fieldAttributes: {},
default_value: null },
bigIntField {
name: 'related_user_id',
type: 'bigInteger',
_unique: false,
allow_null: null,
fieldAttributes: {},
default_value: 0 },
textField {
name: 'description',
type: 'text',
_unique: false,
allow_null: true,
fieldAttributes: {},
default_value: 'Testing Default' } ] }

You have the up and down notation mixed up. Those are for migrating to the "latest" (runs the up function) and doing rollbacks (runs the down function). Up and down do not relate to adding or dropping table columns.
A migration's up is for any change, and its down is for reversing that change. So if you wanted to drop a column from some table, you write the command in the up, then write the opposite in the down (you'd add it back in...), such that you can "roll back" and the change is effectively reversed. You have to be careful with such things though, as you can put yourself in a situation where you actually lose data.
Want to add a column? Write it in the up, and drop the column in the down.
One of the major points behind the migrations mechanism is to track the state of changes of your database, as time goes forward. So generally, if you created a table in some migration, then a day or so later you realize you need to drop/add columns, you normally don't go back and edit the existing migration, especially if the migration has already been run. You'd just write a new migration to drop/add your column.
Since you're using knex, a couple of "knex" tables get created. By default the one you're looking for is knex_migrations, unless someone specifically changed its name in the settings. This table records every migration that has run against your DB, along with its batch number. From the CLI, assuming you have knex.js installed globally, you can run knex migrate:latest, which runs every migration in your directory that has not yet been run against the target database. It works this out by examining the knex_migrations table. If you roll out a change and don't like it, and assuming you've written the down function properly, you can invoke knex migrate:rollback to reverse the change.
If there are 3 migration files that have NOT yet been run, invoking knex migrate:latest will run all 3 of them under a new batch number, which is 1 higher than the most recent batch number. Conversely, if you invoke knex migrate:rollback, it will find the highest batch number (there could be more than 1 migration in a batch...) and invoke the down function of every file in that batch, effectively rolling back those changes.
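The batch bookkeeping described above can be illustrated with a toy in-memory model. This is not knex's implementation; the class, its method names, and the file names are made up for illustration only.

```javascript
// Toy in-memory model of the batch bookkeeping described above.
// This is NOT knex's code; all names here are illustrative.
class MigrationLog {
  constructor() {
    this.rows = []; // stands in for the knex_migrations table: { name, batch }
  }

  currentBatch() {
    return this.rows.reduce((max, r) => Math.max(max, r.batch), 0);
  }

  // Run every file that has no row yet, all under one new batch number.
  latest(files) {
    const done = new Set(this.rows.map((r) => r.name));
    const pending = files.filter((f) => !done.has(f));
    if (pending.length === 0) return [];
    const batch = this.currentBatch() + 1;
    pending.forEach((name) => this.rows.push({ name, batch }));
    return pending;
  }

  // Undo every migration in the highest batch, newest first.
  rollback() {
    const batch = this.currentBatch();
    const undone = this.rows
      .filter((r) => r.batch === batch)
      .map((r) => r.name)
      .reverse();
    this.rows = this.rows.filter((r) => r.batch !== batch);
    return undone;
  }
}

const log = new MigrationLog();
log.latest(['001_create_foo.js']); // batch 1
log.latest(['001_create_foo.js', '002_a.js', '003_b.js']); // batch 2: two new files
console.log(log.rollback()); // [ '003_b.js', '002_a.js' ]
console.log(log.rows); // [ { name: '001_create_foo.js', batch: 1 } ]
```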
All that said, knex is a "query builder" tool. It's got a ton of helper functions to help build the SQL for you. Personally, I find this to be a major distraction. Why spend hours on end figuring out all the helper functions when I can just crank out raw SQL and run that? So that's what we've done in our system. We use knex.raw('') and write our own DDL and DML. It works great and does exactly what we need it to. We don't need to figure out the magic of the query building.
The short answer is that knex will automatically know what has and has not been run for you (again, via that knex_migrations table it creates for you...).
Things can get weird, though, when git and different branches get involved. I recommend that if you're writing migrations on some branch and you need to go do other work, always remember to roll back any migrations you've run on that branch BEFORE switching branches. Otherwise you will end up in weird DB states that don't match the application code.
I would personally just deal with updating models independently of writing migrations. For example, if I'm adding a description column to some table, then I probably want to manually update the ORM to reflect the change of the new db schema. Generally, I've found trying to use a tool that automagically does that for you (rather, if I change the orm, stuff happens to write all the underlying sql...) usually winds me up in a heap of trouble and I just spend more time trying to un-fudge stuff. But, that's just my 2 cents :)

Here is where it gets a little tough and where I need help...I need to track what has been added or removed via migration so I can effectively write ups and downs as the tableModels change over time.
You could store changes in a DB/txt file and those can act as snapshots. So when you want to rollback to a particular migration, you would find the changes (up/down) made for that mutation and adjust accordingly.
Later I want to add a field called description so I would simply add it to my "tableModel" and then run a build command which would build out the migration. How do I check what has already been created though so I only do an up() for description?
Here you can call the database itself directly and check which fields have already been created. If a field is already there and the attributes are the same, you can either ignore it or stop the transaction altogether.
Bonus Round Let's say I want to change user_id from an int to a bigint, because who makes a foreign key just an int? How do I check not just what needs to be added to the up and down, but also whether I need to change a property on a field?
Again, call the DB itself on the table in question. In MySQL the SQL call would be the following (PostgreSQL has no DESCRIBE statement; there you would query information_schema.columns instead):
describe [table_name];
After reading the end, I think you answered this yourself, but I think capturing these changes would work best in a NoSQL database, since you're using Node, or in Postgres with its JSON field.
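The snapshot comparison discussed above can be sketched as a small diff over field lists. Everything here is hypothetical: diffFields is not part of knex or AdonisJS, and the field objects only mimic the name and type properties shown in the question's output.

```javascript
// Hypothetical snapshot diff: compares the field list saved after the last
// build with the current one and reports what a new migration must cover.
function diffFields(previous, current) {
  const prev = new Map(previous.map((f) => [f.name, f]));
  const curr = new Map(current.map((f) => [f.name, f]));
  return {
    // In the current model but not in the snapshot: add in up, drop in down.
    added: current.filter((f) => !prev.has(f.name)),
    // In the snapshot but no longer in the model: drop in up, re-add in down.
    removed: previous.filter((f) => !curr.has(f.name)),
    // Same name, different type: alter in up, alter back in down.
    changed: current.filter(
      (f) => prev.has(f.name) && prev.get(f.name).type !== f.type
    ),
  };
}

const snapshot = [
  { name: 'user_id', type: 'integer' },
  { name: 'name', type: 'string' },
];
const currentFields = [
  { name: 'user_id', type: 'bigInteger' },
  { name: 'description', type: 'text' },
];
console.log(diffFields(snapshot, currentFields));
```

After a migration is generated from the diff, the current field list would be stored as the new snapshot (in a JSON file or a table), so the next build compares against it.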

Sequelize.js - how to properly use get methods from associations (no sql query on each call)?

I'm using Sequelize.js as my ORM and have a few associations (which ones doesn't actually matter here). My models gain get and set methods from those associations. Like this (from the docs):
var User = sequelize.define('User', {/* ... */})
var Project = sequelize.define('Project', {/* ... */})
// One-way associations
Project.hasOne(User)
/*
...
Furthermore, Project.prototype will gain the methods getUser and setUser
according to the first parameter passed to define.
*/
So now I have Project.getUser(), which returns a Promise. But if I call it twice on the very same object, the SQL query is executed twice.
My question is: am I missing something, or is this expected behavior? I don't want to make an additional query each time I call the same method on this object.
If this is expected, should I use custom getters with member variables that I populate manually and return if present? Or is there something more clever? :)
Update
As from DeBuGGeR's answer - I understand I can use includes when making a query in order to eager load everything, but I simply don't need it, and I can't do it all the time. It's waste of resources and a big overhead if I load my entire DB at the beginning, just to understand (by some criteria) that I won't need it. I want to make additional queries depending on situation. But I also can't afford to destroy all models (DAO objects) that I have and create new ones, with all the info inside them. I should be able to update parts of them, which are missing (from relations).

If you use getUser() it will make the query call; it doesn't give you access to the user as a property. You can manually save the result to project.user or project.users, depending on the association.
But you can try Eager Loading
Project.find({
include: [
{ model: User, as: 'user' } // here you HAVE to specify the same alias as you did in your association
]
}).success(function(project){
project.user // contains the user
});
Also, an example of getUser(). Don't expect it to cache the user automatically, and don't override it with something clever, as that will create side effects. getUser is expected to fetch from the database, and it should!
project.getUser().then(function (user) {
  // user is available and is a Sequelize object
  project.user = user; // keep it on project.user and reuse it for as long as you need
});

The first part of things is clear - every call to get[Association] (for example Project.getUser()) WILL result in database query.
Sequelize does not maintain any kind of state nor cache for the results. You can get user in the Promisified result of the call, but if you want it again - you will have to make another query.
What @DeBuGGeR said about using accessors is also not true: accessors are present only immediately after a query, and are not preserved.
As sometimes this is not ok, you have to implement some kind of caching system by yourself. Here comes the tricky part:
If you want to keep using the same get method, Project.getUser(), you won't be able to, as Sequelize overrides your instanceMethods. For example, if you have the association mentioned above, this won't work:
instanceMethods: {
getUser: function() {
// check if you have it, otherwise make a query
}
}
There are a few possible ways to fix it: either change the Sequelize core a little (to first check whether the method exists), or put some kind of wrapper around those functions.
More details about this can be found here: https://github.com/sequelize/sequelize/issues/3707
Thanks to mickhansen for the cooperation on how to understand what to do :)
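As a rough sketch of the wrapper approach: the helper below caches the promise returned by an association getter per instance, under a separate function name so that Sequelize's own getUser is left untouched. cachedGetter is a hypothetical name, not a Sequelize API, and the cache is never invalidated, so it only suits data that does not change during the lifetime of the instance.

```javascript
// Hypothetical wrapper around association getters such as project.getUser().
// Results are cached per instance in a WeakMap, so instances are not modified
// and cached entries disappear when the instance is garbage collected.
const associationCache = new WeakMap();

function cachedGetter(instance, getterName) {
  let perInstance = associationCache.get(instance);
  if (!perInstance) {
    perInstance = new Map();
    associationCache.set(instance, perInstance);
  }
  if (!perInstance.has(getterName)) {
    // First call: run the real getter (one query) and remember its promise.
    const promise = Promise.resolve(instance[getterName]()).catch((err) => {
      perInstance.delete(getterName); // do not cache failures
      throw err;
    });
    perInstance.set(getterName, promise);
  }
  return perInstance.get(getterName);
}
```

Calling cachedGetter(project, 'getUser') twice on the same project would then trigger a single query, with both calls resolving to the same user object.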
