Currently, my migration scripts (node-pg-migrate) build the entire database from zero to everything. It is the database of a new system that is going to replace a legacy system, so data needs to be imported from the legacy system into the new database; the two have completely different data structures.
The migration scripts first build the import schema with raw data imported from the legacy system (using its web service). Then all other schemas are created with their tables, functions and everything else. Primarily the data schema, holding data transformed from the import schema into a shape usable by the new system; the api schema, with views and functions exposed through postgREST that work on data from the data schema; and a few more helper schemas.
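Roughly, the layering looks like this (a much simplified sketch; the names are only illustrative, not my real objects):

-- import schema: raw rows as fetched from the legacy web service
CREATE SCHEMA import;
CREATE TABLE import.customers_raw (payload jsonb);

-- data schema: the transformed data the new system works with
CREATE SCHEMA data;
CREATE TABLE data.customers (id integer PRIMARY KEY, name text NOT NULL);

-- api schema: what postgREST exposes, reading from the data schema
CREATE SCHEMA api;
CREATE VIEW api.customers AS SELECT id, name FROM data.customers;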
Now, the data to be imported is not final yet, so I need to re-import often. To do that, I have to migrate all the way down, dropping every other schema on the way, until I reach the migration steps that remove the imported data and drop the import schema. Then I migrate all the way up again to re-import the data and rebuild all the schemas so that I have a working api again.
I'm getting to my question now... I'm quite sure I need to move the import-data scripts out of the migration so I don't have to deconstruct and reconstruct the entire database and all its schemas. Ideally, I want to run import scripts and schema scripts independently. Using node-pg-migrate is really convenient, though, also for importing the data.
How can I use node-pg-migrate with independent lanes of migrations? One for imports (i.e. DML changes) and one for schema changes (i.e. DDL changes).
Related:
Migration scripts for Structure and Data
https://dba.stackexchange.com/questions/64632/how-to-create-a-versioning-ddl-dml-change-script-for-sql-server
Update: I just found out the solution may lie in the area of scope as implemented by node-db-migrate. I'm using node-pg-migrate though...
node-pg-migrate does not support lanes the way e.g. db-migrate does. But in my opinion you can emulate them by using a separate migrations folder and migrations table for each lane:
node-pg-migrate -m migrations/ddl -t pgmigrations_ddl
node-pg-migrate -m migrations/dml -t pgmigrations_dml
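Each lane then keeps its own history, so you can take the dml lane down and up again to refresh the import without touching the schemas managed by the ddl lane. Whether you write the dml migrations in JavaScript via pgm.sql or otherwise, they would only ever contain data statements along these lines (illustrative names; it assumes the import schema already exists via the ddl lane):

-- dml lane, up: (re)load the raw legacy data into the existing import schema
INSERT INTO import.customers_raw (payload)
VALUES ('{"legacy_id": 1, "name": "ACME"}');  -- in practice: COPY or the web-service load

-- dml lane, down: remove the imported rows again; the structure stays in place
TRUNCATE import.customers_raw;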
Related
I'm writing data to Postgres tables from Python with SQLAlchemy and psycopg2, using the if_exists='replace' option in to_sql(). This drops the table, then recreates it. However, if I have a view defined that uses that table, the to_sql() command fails, as Postgres won't drop the table. Is there any way around this other than manually dropping the view first, then recreating it? Thanks.
If you aim to DROP a TABLE that has related objects depending on it, such as a VIEW, you need to use the CASCADE keyword to force dropping the related objects as well (this is a recursive operation).
See PostgreSQL dependency tracking for details:
To ensure the integrity of the entire database structure, PostgreSQL makes sure that you cannot drop objects that other objects still depend on.
By default it is not possible; in fact, creating a VIEW on a table is a convenient way to prevent that TABLE from being dropped accidentally. Anyway, you may also want to read this post on implementing CASCADE behaviour with SQLAlchemy.
It is then still your responsibility to recreate the missing related objects after you have recreated the table. SQLAlchemy seems to have no representation for related views, but there is a package for creating views that may fill the gap to some extent (not tested).
So, it cannot be handled by SQLAlchemy alone. You will instead need a script/function that runs DDL statements to recreate your dependencies (maybe using the above-mentioned package).
If you can recreate them using pure standard SQL (or using that package), then you will not lose the benefit of the SQLAlchemy ORM (at least the ability to abstract the database engine and remain portable to another one).
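For example, the drop/recreate cycle is plain DDL you can keep in such a script (table and view names are only illustrative):

-- drop the table together with every view that depends on it
DROP TABLE IF EXISTS mytable CASCADE;

-- recreate the table (this is the part to_sql() normally does for you)
CREATE TABLE mytable (id integer PRIMARY KEY, name text);

-- recreate the dependent view that CASCADE removed
CREATE VIEW myview AS SELECT id, name FROM mytable;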
Regarding dependency tracking, an easy way to see which related objects would have to be recreated is:
BEGIN;
DROP TABLE mytable CASCADE;
ROLLBACK;
You can also query the pg_depend system catalog, which is very convenient but PostgreSQL specific.
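For example, a query like the following lists the views depending on a given table (PostgreSQL specific; replace mytable with your table name):

-- views depend on tables through their rewrite rules (pg_rewrite)
SELECT DISTINCT dependent.relname AS view_name
FROM pg_depend d
JOIN pg_rewrite r       ON r.oid = d.objid
JOIN pg_class dependent ON dependent.oid = r.ev_class
JOIN pg_class source    ON source.oid = d.refobjid
WHERE d.classid = 'pg_rewrite'::regclass
  AND source.relname = 'mytable'
  AND dependent.relname <> 'mytable';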
I have managed to unit test all the functions that use data from my database.
The problem starts when I want to check the data itself: what happens if the schema of my DB has changed? All the other unit tests use DB stubs and not the real data.
How can I check the schema of the DB? Here I must not mock it, because I want to check the real schema.
Edit: It's important to note that the aforementioned DB is a third-party one. I.e. I have checked all the functionality with mocks, and now I want to check the actual schema of this DB, just to make sure someone didn't change it without mentioning it.
You will ideally write an integration test that round-trips the data to/from your database. You should use a local copy of the database in a clean state, not a production/development or shared database.
If you're interested, I wrote an article on this a while back. It's Java focussed, but the theory holds true for pretty much any language.
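If the third-party database speaks SQL, one concrete way to detect such schema drift inside that test is to snapshot the catalog and compare it against what your code expects; a sketch (adjust the schema name and engine specifics to your database):

-- snapshot the columns your code relies on and compare against an expected list
SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'   -- the third-party schema you depend on
ORDER BY table_name, ordinal_position;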
We use model first for tables and relations and database first for views and stored procedures.
If we change the model we have to:
1) generate the database
2) create the views and procedures
3) add the procedures and the views to the model
4) remap the function calls of the procedures manually
This costs a lot of time because the model changes often or contains errors.
Does anybody know a workaround to automatically integrate the views and procedures into the model?
You could automate the process by creating your own template for generating DDL from SSDL. By default the EF designer uses the SSDLToSQL10.tt file, but you could create your own .tt file that generates DDL better suited to your needs. This should address 1) and 2). Once you have the database, you can then update your model from the database, which should address 3). Finally, to address 4), you could write a Model Generation Extension that tweaks the model the designer builds from the database in the OnAfterModelGenerated/OnAfterModelUpdated method. (Be aware: some of the extension points in the designer are weird, to say the least, and might be confusing/hard to implement.)
Another option you may want to explore is to use Code First and Migrations. With Migrations you can evolve your database instead of constantly creating/deleting it. If you need to, you can use SQL to define a migration, so you have full control over what your database looks like. Code First does not support some of the features supported by Model First/Database First (e.g. TVFs/FunctionImports), so you may want to check first whether what is supported is enough for you.
Database.Persist seems to be index-agnostic. This is okay; I can create my own indices, but the generic SQL migration seems to create and drop tables when adding/removing fields, which has the effect of dropping the indices as well.
Is there a recommended way to make sure they survive database migrations?
Only the SQLite3 backend should be dropping tables; PostgreSQL and MySQL both provide powerful enough ALTER TABLE commands to avoid that. So indices should only be lost with SQLite3. If you're using SQLite3 in production (not really recommended), you have two choices:
Disable automatic migrations and handle the schema yourself.
Add some code after the migrations are run to replace any missing indices.
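For the second option, the statements you run after the migrations can simply be idempotent index definitions, e.g. issued through persistent's rawExecute (a sketch; the index, table and column names are illustrative):

-- safe to run after every migration: a no-op if the index survived
CREATE INDEX IF NOT EXISTS idx_person_name ON person (name);
CREATE INDEX IF NOT EXISTS idx_blog_post_author ON blog_post (author_id);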
I'm developing a web application in Node.js with MongoDB as the back end. What I wanted to know is: what is the generally accepted procedure, if one exists, for creating initial collections and populating them with initial data, such as a whitelist of names or lists of predefined constants?
From what I have seen, MongoDB creates collections implicitly any time data is inserted into the database and the collection being inserted into doesn't already exist. Is it standard to let these implicit insertions take care of collection creation, or do people using MongoDB have setup scripts that build the main structure and insert any required initial data? (For example, when using MySQL I'd have a .sql script that I can run to dump and rebuild/repopulate the database from scratch.)
Thank you for any help.
MHY
If you already have data, this post on SO might be interesting for you. But since Mongo understands JavaScript, you can easily write a script that prepares the data for you.
It's the nature of Mongo to create everything that does not exist. This allows very flexible and agile development, since you are not constrained to types and don't need to check whether table x already exists before working on it. If you need to create collections dynamically, just get them from the database and work with them (no matter whether they exist or not).
If you are looking for a certain object, be sure to check it (that it is not null, or that a certain key exists), because working with null objects may affect your code.
There is absolutely no reason to use setup scripts merely to make collections and databases appear. Both database and collection creation are done lazily.
Remember that MongoDB is a completely schema-free document store, so there's no way to even set up a specific schema in advance.
There are tools supplied with Mongo to dump and restore database content (mongodump/mongorestore).
Now, if your application needs initial data (like configuration parameters or whitelists, as you suggest), it's usually best practice to have your application components set up their own data as needed and offer data migration paths as well.