node-mongo-native migration framework - node.js

I'm working on a node.js server, and using MongoDB with node-mongo-native.
I'm looking for a db migration framework, similar to Rails migrations. Any recommendations?

I'm not aware of a specific native Node.js tool for doing MongoDB migrations, but you do have the option of using tools written in other languages (for example, Mongoid Rails Migrations).
It's worth noting that the approach to schema design and data modelling in MongoDB is different from relational databases. In particular, there is no requirement for a collection to have a consistent or predeclared schema, so many of the traditional migration actions, such as adding and removing columns, are not required.
However, migrations which involve data transformations can still be useful.
If your application is expecting data to be in a certain format (e.g. you want to split a "name" field into "first name" and "last name"), there are several strategies you could use if the idea of using migration tools written in another programming language isn't appealing:
handle data differences in your application logic, so old and new data formats are both acceptable (perhaps "upgrading" records to match a newer format as they are updated)
write a script to do a one-off data migration
contribute MongoDB helpers to node-migrate
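The first strategy above (accepting both formats in application logic) can be sketched as follows. This is a minimal illustration, not part of any library: `upgradeUser` and the `firstName`/`lastName` field names are hypothetical, and splitting on the first whitespace is a simplifying assumption that will not suit every name.

```javascript
// Hypothetical helper: accept both the old shape ({ name }) and the new
// shape ({ firstName, lastName }) and always return the new shape.
function upgradeUser(doc) {
  // Already in the new format: return the document untouched.
  if (doc.firstName !== undefined || doc.lastName !== undefined) {
    return doc;
  }
  // Old format: split "name" at the first whitespace (an assumption).
  const full = (doc.name || "").trim();
  const idx = full.search(/\s/);
  const firstName = idx === -1 ? full : full.slice(0, idx);
  const lastName = idx === -1 ? "" : full.slice(idx).trim();

  // Copy every field except the old "name" field.
  const { name, ...rest } = doc;
  return { ...rest, firstName, lastName };
}
```

Calling a helper like this whenever a record is read, and saving the result the next time the record is updated, upgrades documents gradually without a separate migration run.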

I've just finished writing a basic migration framework based on node-mongo-native: https://github.com/afloyd/mongo-migrate. It will allow you to migrate up & down, as well as migrating up/down to a specific revision number. It was initially based on node-migrate, but obviously needed to be changed a bit to make it work.
The revision history is stored in mongodb and not on the file system like node-migrate, allowing collaboration on the same project using a single database. Otherwise each developer running migrations could cause migrations to run more than once against a database.
The migrations themselves are file-based, which also helps with collaboration on a single project whether or not each developer is using the same database. So when each dev runs the migration, all migration files not already run against their database will be run.
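The idea of migrating up or down to a specific revision can be illustrated with a small in-memory sketch. This is not mongo-migrate's actual code or API: `migrateTo`, the `state` object, and the migration shape (`{ up, down }`) are hypothetical, and the current revision is kept in a plain object where a real tool would read and write it in the database.

```javascript
// Hypothetical sketch: migrations are an ordered array; revision N means
// "migrations[0..N-1] have been applied". state.revision stands in for the
// revision record that a real tool would persist in the database.
function migrateTo(migrations, state, target, ctx) {
  if (target < 0 || target > migrations.length) {
    throw new RangeError("unknown revision " + target);
  }
  // Migrate up: apply each pending migration in order.
  while (state.revision < target) {
    migrations[state.revision].up(ctx);
    state.revision += 1;
  }
  // Migrate down: undo the most recently applied migration first.
  while (state.revision > target) {
    migrations[state.revision - 1].down(ctx);
    state.revision -= 1;
  }
  return state.revision;
}
```

Because the revision lives with the data rather than on one developer's disk, running the same call twice against the same state is a no-op the second time.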
Check out the documentation for more info.

Related

PostgreSQL - Continuous integration

I have a database (PostgreSQL) in my development environment, which I use to develop a GraphQL API in Node.js. When I make modifications to the database, I would like to know how to pass these modifications to staging and then to production automatically, without having to redo all the queries by hand in each environment.
Do you know how to do it?
Thank you
A typical solution is to use something like migrations. You should have a special table that stores information about all applied migrations.
The first migration can just execute an initial script that creates all tables, relations, functions and so on.
The subsequent migrations modify the structure according to changes in your app, and you always know which migrations were applied to a certain DB.
To work with migrations, you should find a suitable package that can create, execute and undo migrations and maybe seeders as well (something like this package).
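The tracking-table idea can be sketched as below. This is an illustration under assumptions, not a specific package's implementation: the table name `schema_migrations`, the function `runPending`, and the migration shape `{ name, sql }` are hypothetical, and `client` is assumed to expose an async `query(text, params)` method that resolves to an object with a `rows` array (the shape used by node-postgres).

```javascript
// Hypothetical sketch: apply every migration that is not yet recorded in
// the tracking table, in the order given, and record each one as it runs.
async function runPending(client, migrations) {
  await client.query(
    "CREATE TABLE IF NOT EXISTS schema_migrations (name text PRIMARY KEY)"
  );
  const res = await client.query("SELECT name FROM schema_migrations");
  const applied = new Set(res.rows.map((row) => row.name));

  const ran = [];
  for (const m of migrations) {
    if (applied.has(m.name)) continue; // already applied to this database
    await client.query("BEGIN");
    try {
      await client.query(m.sql);
      await client.query(
        "INSERT INTO schema_migrations (name) VALUES ($1)",
        [m.name]
      );
      await client.query("COMMIT");
    } catch (err) {
      await client.query("ROLLBACK"); // undo the partially applied migration
      throw err;
    }
    ran.push(m.name);
  }
  return ran; // names of the migrations applied by this call
}
```

Running the same list against development, staging and production brings each database up to date from wherever it currently is. Wrapping each migration in a transaction relies on PostgreSQL's support for transactional DDL.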

How can I switch between a live and a production database without duplicating code?

Here is my situation. I have an extensive REST based API that connects to a MongoDB database using Mongoose. The API is written as a standard "MEAN" stack application.
Currently, when a developer queries the API they're always connecting to the live production database. What I want to do is have an exact duplicate database as a "staging" database, where new data will be added first, vetted over a period of time, and then move to the live database. Then I want developers to be able to query either one simply by modifying their query.
I started looking into this with the Mongoose documentation, and it appears as though the models are tied to the DB connection, and if I want to have multiple connections I also have to have multiple models, one for each connection. This would be a nightmare of WET code and not the path I want to take.
What I want to do is not touch any of my code at all and simply have a switch that changes to the proper database for a given query. So my question is, how can I achieve this? Is it possible? The documentation seems to imply it is not.
Rather than trying to maintain connections to two environments in the same code base, have you considered setting up a stage version of your application? Which database it connects to could be set through an environment variable or some other configuration option.
The developers would still then only have to make a change to query one or the other and you could migrate data from the stage database to production/live database once you have finished your vetting process.
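Selecting the database through configuration could look like the sketch below. The variable names `APP_ENV`, `MONGO_URI_STAGE` and `MONGO_URI_LIVE` and the function `resolveDatabaseUri` are hypothetical; the resulting URI would be passed to whatever connection call the application already uses.

```javascript
// Hypothetical sketch: choose the connection string from the environment so
// the same code base can run as a "stage" or a "live" deployment.
function resolveDatabaseUri(env) {
  const target = env.APP_ENV || "stage"; // default to the non-live database
  const uris = {
    stage: env.MONGO_URI_STAGE,
    live: env.MONGO_URI_LIVE,
  };
  if (!Object.prototype.hasOwnProperty.call(uris, target)) {
    throw new Error("unknown APP_ENV: " + target);
  }
  if (!uris[target]) {
    throw new Error("no connection string configured for " + target);
  }
  return uris[target];
}
```

Each deployment would call `resolveDatabaseUri(process.env)` once at startup, so the models and routes stay identical between the stage and live instances.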

DB migration: Single creation script vs change sets

I am creating a DB schema per customer. So whenever a new customer registers I need to quickly create their schema in runtime.
Option 1
In runtime, use Liquibase (or equivalent) to run all the changesets to generate the latest schema.
Cons:
This is slow, and there can be multiple historical changesets which are no longer relevant (create a table and a year later drop it).
Liquibase is used here at runtime and not just at "migration time". Not sure if this is a good idea.
Standardizing on Liquibase as a means to create the schema will force all developers to use it during development. We try to avoid loading more tools on the developers.
Option 2
After each build we generate a temporary DB using Liquibase changesets. Then from the DB we create a clean schema creation script based on the current snapshot. Then when a new customer comes we just run the clean script, not the full change set history.
Cons:
Next time I run liquibase it will try to run from changeset 1. A workaround might be to include in the generation script the creation of the changeset table and inserting to it the latest changeset.
New schemas are created using one script, while old schemas go through the changeset process. In theory this might produce a different schema. However, the single script was itself generated from the changeset process, so I can't think of an exact case that would cause an error; this is a theoretical problem for now.
What do you think?
I would suggest option #1 for the consistency.
Database updates can be complex, and the less chance for variation the better. That means you should have your developers create the Liquibase changeSets initially to update their databases as they are implementing new features, so they know the changeSets are running as they expect and that those same steps will be run in QA all the way through production. It is an extra tool they need to deal with, but it should be easy to integrate into their standard workflow in a way that is easy for them to use.
Similarly, I usually recommend leaving non-relevant historical changeSets in your changeLog because if you remove them you are deviating from your known-good update path. Databases are fast with most operations, especially on a system with little to no data. If you have specific changeSets that are no longer needed and are excessively expensive for one reason or another you can remove them on a case by case basis, but I would suggest doing that very rarely.
You are right that creating a database snapshot from a liquibase script should be identical to running the changeLog--as long as you include the databasechangelog table in the snapshot. However, sticking with an actual Liquibase update all the way through to production will allow you to use features such as contexts, preconditions and changelog parameters that may be helpful in your case as well.
There are two approaches for database deployment:
Build once deploy many – this approach uses the same principle as native code: compile once and copy the binaries across the environments. From the database point of view, this approach means that the deploy script is generated once and then executed across environments.
Build & Deploy on demand – this approach generates the delta script when needed in order to handle any out of process changes.
If you use the Build & Deploy on demand approach you can generate the delta script for the entire schema or work-item / changeset.

Entity Framework migrations on legacy database

We have several legacy SQL Server databases that we occasionally make schema changes to. We currently have a utility written in C++ that allows users to update their DBs with these schema changes. The utility currently generates dynamic SQL to create all DB objects. I am looking into redoing this and thought EF migrations might be a good way to go. I have read up a bit on the subject and I have a general idea of how it works. But I'm having a bit of a hard time figuring out how I would set it up to replace our current procedure (or if it is even possible).
Currently, a client could be on any one of a number of previous versions. I'm assuming I would have to go back to the oldest possible version and create my model/initial migration from that, then generate incremental migrations for each version change in order to support updates from all versions. Is that a correct assumption? Also, currently our clients could be using SQL Server 2000, 2005, or 2008. Would this have any effect on how I would set things up (or if I even could)?
Further, the goal is to create a utility with a (C# - probably WPF) UI that the user can use to manipulate the migrations (up or down, preferably). I've seen a lot of examples of how to manipulate migrations from the command line within package manager but not a lot of stuff on how to create a utility with a friendly UI for upgrading/downgrading DBs in production.
Also, I have not seen anything that shows how to create stored procedures in a migration (our DBs rely on some stored procedures). I'm assuming that, if nothing else, I can use the Sql() method to generate a SQL query to create a SP. Is that correct? Is there a better way?
I know my questions are a bit non-specific and I apologize for that. But I'm still in the beginning processes of learning this and I'd like to get an idea of whether or not this is a good way to go. Any guidance would be greatly appreciated.
Thanks,
Dennis
Firstly, on SQL Server support, Entity Framework doesn't really support SQL Server 2000. See this question:
EntityFramework SQL Server 2000?
On the question of supporting all the multiple versions, you have the right idea about needing to generate an initial migration for the oldest version first then incrementally altering the model and generating migrations to support the later versions. This will be a pain as the migrations are opinionated about how they represent the model in the database and you will be doing a lot of messing about to end up with a model and a set of migrations that fully represent that. Specific concerns are indexes, column lengths, data types, stored procedures, triggers, functions, partitioning.
The Sql() function gets you around most issues, though also helpful in the migrations are functions like CreateIndex and AlterColumn.
For automating this, the migrations are definitely available as PowerShell cmdlets, which are themselves just .NET objects and so can be called programmatically.
As this question is a year old, I assume you will have made a decision on whether to do this. My opinion is that it is hard to see that it's worth the effort. If you were re-platforming the code base that uses this database to Entity Framework then it would make sense. Otherwise there are bound to be better tools out there for database version management. My first port of call would be Redgate.

SubSonic-based app that connects to multiple databases

I have developed an app that connects to a SQL Server 2005 database, so my DAL objects were generated using information from that DB.
It will also be possible to connect to an Oracle or MySQL DB, all with the same table structures (aside from the normal differences in fields, such as varbinary(max) in SQL Server and BLOB in Oracle, and so on). For this purpose, I already defined multiple connection strings and multiple SubSonic providers for the different DBs the app will run on.
My question is, if I generated my objects using a SQL Server database, should the generated objects work transparently with the other DB's or do I need to generate a different DAL for each database engine I use? Should I be aware of any possible bugs I may encounter while performing these operations?
Thanks in advance for any advice on this issue.
I'm using SubSonic 2.2 by the way....
From what I've been able to test so far, I can't see an easy way to achieve what I'm trying to do.
The ideal situation for me would have been to generate SubSonic objects using SQL Server for example, and just be able to switch dynamically to MySQL by just creating at runtime the correct Provider for it along with its connection string. I got to a point where my app would correctly connect from SQL Server to a MySQL DB, but there's a point where the app fails since SubSonic internally generates queries of the form
SELECT * FROM dbo.MyTable
which MySQL obviously doesn't support. I also noticed queries that enclosed table names in brackets ([]), so it seems that there are a number of factors that would limit the use of one Provider across multiple DB engines.
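To illustrate why the generated SQL ends up engine-specific, here is a small sketch of how the same table reference has to be quoted differently for SQL Server and MySQL. It is written in JavaScript for brevity and is not SubSonic code: `qualifyTable` and the dialect keys are hypothetical names.

```javascript
// Hypothetical sketch: SQL Server quotes identifiers with brackets, MySQL
// with backticks. Each doubles its own closing quote character to escape it.
const quoting = {
  sqlserver: (id) => "[" + id.replace(/\]/g, "]]") + "]",
  mysql: (id) => "`" + id.replace(/`/g, "``") + "`",
};

// Build a table reference for one dialect. The optional prefix is a schema
// in SQL Server (such as "dbo"); in MySQL a prefix names a database instead,
// so callers targeting MySQL would normally omit it.
function qualifyTable(dialect, table, prefix) {
  const quote = quoting[dialect];
  if (!quote) throw new Error("unsupported dialect: " + dialect);
  return prefix ? quote(prefix) + "." + quote(table) : quote(table);
}
```

A query string generated once with the SQL Server rules baked in cannot be reused as-is against MySQL, which matches the failures described above.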
I guess my only other option is to sort it out with multiple generated providers, although I must admit it does not make me comfortable knowing that I'll have N copies of basically the same classes along my project.
I would really love to hear from anyone else if they've had similar experiences. I'll be sure to post my results once I get everything sorted out and working for my project.
Has any of this changed in 3.0? This would definitely be a worthy reason for me to upgrade if life is any easier on this matter...

Resources