Postgres can't drop table when view is present - python-3.x

I'm writing data to postgres tables from python with sqlalchemy and psycopg2 using the if_exists='replace' option in to_sql(). This drops the table, then recreates it. However, if I have a view defined that uses that table, the to_sql() command fails, as postgres won't drop the table. Is there anyway around this other than manually dropping the view first, the recreating it? Thanks.

If you aim to DROP a TABLE with related objects depending on it such as VIEW, you need to use CASCADE keyword to force to DROP related objects as well (this is a recursive operation).
See PostgreSQL dependencies tracking for details:
To ensure the integrity of the entire database structure, PostgreSQL
makes sure that you cannot drop objects that other objects still
depend on.
By default it is not feasible, actually creating a VIEW on a table is a convenient way to prevent this TABLE to be dropped accidentally. Anyway, you may also want to read this post to implement CASCADE beahviour with SQLAlchemy.
Then it is still your responsibility to recreate missing related objects after you recreated the table. SQLAlchemy seems to have no representation for related views. But it there is a package to create views and may fill this the gap in some extent (not tested).
So, it cannot be handled by SQLAlchemy alone. You will need instead a script/function that plays DDL statements to recreate your dependencies (maybe using the above mentioned package).
If you can recreate it using pure SQL standard (or using package) then you will not loose the benefit of SQLAlchemy ORM (at least the capability to abstract Database engine and being portable to another one).
About dependencies tracking, an easy way to see what related object should be recreated is:
BEGIN;
DROP TABLE mytable CASCADE;
ROLLBACK;
You can also use the function pg_depend which is very convenient but PostgreSQL specific.

Related

How to update rows in Jooq without Codegen using JSON

I am using Jooq version 3.17.0 and attempting to insert data into a table without codegen.
At the minute, I am designing a system that allows data to be imported into multiple tables (one at a time, and starting with just one), yet I do not want to write specific code for each table and as of now, I haven't had a need for codegen.
The code currently works for importing data via JSON, with json being a String formatted in the 'Jooq' format. This imports data correctly into the database. This also allows us to send json data of table updates from one system to our main system that uses Jooq. Yet it gives me an error when I try to update.
I am using MYSQL as my database.
The original code for insertion is :
Result<Record> convertedJson = dslContext.fetchFromJSON(json);
Loader<Record> res1 = dslContext.loadInto(table(tableName)).loadJSON(json).fields(convertedJson.fields()).execute();
However, if we try to update data by sending in the same json, but with one field changed, jooq gives an error org.jooq.exception.DataAccessException stating that there is a duplicate entry for key.
I tried to use :
Loader<Record> res2 = dslContext.loadInto(table(tableName)).onDuplicateKeyUpdate().loadJSON(json).fields(convertedJson.fields()).execute();
But then this throws an error ON DUPLICATE KEY UPDATE only works on tables with explicit primary keys. Table is not updatable : <tableName> since in LoaderImpl.onDuplicateKeyUpdate():220 since table.getPrimaryKey() is null which technically makes sense since table(tableName) returns a Table that does not know it's fields.
My question is probably two-fold.
Is there a way to have a table that is aware of it's fields without codegen?
Is there a way for me to allow jooq to update rows this way.
My preferences is to steer clear of codegen, unless it's really needed. I probably could switch to codegen if needed, but again I would still need to be able to execute SQL without writing specific code for each table. Using JSON is still very much desired, as that allows me to send data from one application to another for import.
Using code generation
You've run into one of those many reasons why code generation is very helpful with jOOQ. If your various tables are known at compile time, and all you're doing is switch table names, then I would go with generated code, making the lookup of the table dynamic. That would solve the problem easily.
From experience with various similar support cases, I've always recommended this first, because as soon as these kinds of troubles start, it's a good idea to re-think the code generation strategy as you will run into other, similar problems, having to work around the lack of ubiquitously available meta data all the time. There are many other benefits to using the code generator.
Emulating code generation
If for some reason you cannot (e.g. the tables aren't known at compile time) or do not want to use the code generator, then you can do the code generator's work yourself at runtime, by building CustomTable types as documented here.
Using other means of providing meta information
Another way to provide jOOQ with meta data is to use one of various forms of implementing org.jooq.Meta, which include:
Looking up meta data from the JDBC driver's DatabaseMetaData (this can be slow, depending on your schema)
Letting jOOQ interpret some DDL scripts
Using jOOQ's XML representation of the standard SQL INFORMATION_SCHEMA
Using generated code

Does models.Base.metadata.create_all(bind=engine) stay in code after creating tables?

I am working on a project built by some people before me, when I was creating some new tables using sqlalchemy I figured out that the method which creates tables were removed. So, I had to put it in code again to create tables. What I am wondering is if it doesn't need to stay in code after creating tables. is there any problem with keeping it there?
Here is the line code that I am talking about:
models.Base.metadata.create_all(bind=engine)
Metadata.create_all takes a checkfirst keyword argument which determines whether SQLAlchemy should check whether a table already exists before trying to create it. The default value of this argument is True, so once the tables have been created future invocations will have no effect, beyond emitting a few queries.
You can leave the code in place - it will be useful when a developer needs to create a fresh environment.

How to use database views in EF Core 3.0?

I know the question was asked before, but at the time it was, we had EF Core 2.x. The short answer was "no you can't" and obviously, not very helpful.
The other answers involved ugly hacks like changing migration files after they were created by the tool.
I make an application Code First. I have my models created with lot's of foreign keys and database joins in mind.
But here comes the unpleasant surprise (I'm a little new to EF): those joins written in LINQ are pretty slow, as a matter of fact they do not produce database join, but fetch whole tables instead.
Of course it's totally unacceptable, I import an old database with millions of records, with the joins I get results in milliseconds, without I get couple of seconds lags - on my very fast internet connection (in real world scenario it would be much worse).
I need views, and AFAIK EF won't create them for me, is it STILL true for EF 3.0?
Then, what would be the best and the most clean way to create views in SQL and to make entities for them? I mean - considering the situation the database models would change over time, and the database structure would have to be updated.
Well, I would prefer doing my joins not in SQL views, just have queries returned "JOIN" statement results. Especially some not obvious joins. Lets say table B has a column being a foreign key referencing table A. I want to get results from table A joining B for details. With normal SQL JOIN performance.
I checked the database: there is no significant performance difference between "select * from A" and "select * from A join B...". In LINQ - the difference is huge.
I figured out that in Code First database views are redundant.
The "views" can be created as models (ordinary classes) having a field or a property set to joined entity. I use private fields for that purpose. Then I use LINQ Join() to create my view entity. The query may refer ONLY to the fields set to joined entities, nothing else. Such query, if written properly translates clearly to SQL JOIN and works with full speed. In my application it's equivalent of a database view.
Why private fields and not properties, you may ask. Maybe because joined entities are "implementation details", but another reason is my presentation code uses reflection to operate on entity public properties, it's good to have those entities hidden from it. Otherwise I would probably need to use attributes to hide those "columns".
BTW, such views can be ordered with OrderBy(), filtered with Where() at virtually no cost. The constraint is to maintain the collection's IQueryable interface, never refer joined entities indirectly. So even if X refers to A.B, never refer X in a LINQ query, always A.B where A is direct entity reference assigned in the Join() query.
To build dynamic queries at runtime one must use expressions.
This set of properties of EF Core 3.0 allows to build a database application without using SQL, but with the full SQL speed maintained. However, the database / entity structure must be relatively simple to achieve that.

SQL indices with Database.Persist (Yesod web framework)

Database.Persist seems to be index-agnostic. This is okay, I can create my own indices, but generic SQL migration seems to create and drop tables when adding/removing fields. This has the effect of dropping the index as well.
Is there a recommended way to make sure they survive database migrations?
Only the SQLite3 backend should be dropping tables, PostgreSQL and MySQL both provide powerful enough ALTER TABLE commands to avoid that. So indices should only be lost for SQLite3. If you're using SQLite3 in production (not really recommended), you have two choices:
Disable automatic migrations and handle the schema yourself.
Add some code after the migrations are run to replace any missing indices.

Initial DB structure / data for MongoDB + NodeJS web application

I'm developing a web application in Node.js with MongoDB as the back end. What I wanted to know is, what is the generally accepted procedure, if any exists, for creating initial collections and populating them with initial data such as a white list for names or lists of predefined constants.
From what I have seen, MongoDB creates collections implicitly any time data is inserted into the database and the collection being inserted into doesn't already exist. Is it standard to let these implicit insertions take care of collection creation, or do people using MongoDB have scripts setup which build the main structure and insert any required initial data? (For example, when using MySQL I'd have a .sql script which I can run to dump and rebuild /repopulate the database from scratch).
Thank you for any help.
MHY
If you have data, this post on SO might be interresting for you. But since Mongo understands JavaScript, you can easily write a script that prepares the data for you.
It's the nature of Mongo to create everything that does not exist. This allows a very flexible and agile development since you are not constrainted to types or need to check if table x already exists before working on it. If you need to create collections dynamically, just get it from the database and work it if (no matter if it exists or not).
If you are looking for a certain object, be sure to check it (not null or if a certain key exists) because it may affect your code if you work with null objects.
There's is absolutely no reason to use setup scripts merely to make collections and databases appear. Both DB and collection creation is done lazily.
Rember that MongoDB is a completely schema free document store so there's no way to even setup a specific schema in advance.
There are tools available to dump and restore database content supplied with mongo.
Now, if your application needs initial data (like configuration parameters or whitelists like you suggest) it's usually best practice to have your application components set up there own data as needed and offer data migration paths as well.

Resources