I have a python script that supports an --edit-the-database option to invoke the user's preferred editor on a dump of the script's SQLite database. This option is intended to facilitate quick access to parts of the database that the script's other options don't provide access to, particularly during the development of this script.
Once the script has dumped the database's content, launched the editor, and verified that the modified content is still valid, it needs to replace the existing database content.
First it removes all existing content by executing this SQL (using python's sqlite module):
PRAGMA writable_schema = 1;
DELETE FROM sqlite_master WHERE type IN ('table', 'index', 'trigger');
PRAGMA writable_schema = 0;
VACUUM;
and then it loads the new content using the sqlite module's executescript() method:
cursor.executescript(sql_slurped_from_user_modified_dump)
The problem is that these two operations (deleting existing content, loading new content) are not executed atomically: press CTRL-C at the wrong moment and the database content has been lost.
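For concreteness, here is roughly how the two steps might be issued with the standard sqlite3 module (the file name and connection handling are hypothetical; the SQL and the name sql_slurped_from_user_modified_dump are from the question):

import sqlite3

conn = sqlite3.connect("app.db")   # hypothetical path
cursor = conn.cursor()

# Step 1: wipe the existing schema. executescript() commits any pending
# transaction first and then runs the statements as-is, so VACUUM is allowed.
cursor.executescript("""
    PRAGMA writable_schema = 1;
    DELETE FROM sqlite_master WHERE type IN ('table', 'index', 'trigger');
    PRAGMA writable_schema = 0;
    VACUUM;
""")

# Step 2: reload the user-edited dump. A CTRL-C between step 1 and step 2
# leaves the database empty.
cursor.executescript(sql_slurped_from_user_modified_dump)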
If I try to execute those two blocks of code inside a transaction then I get the error:
Error: cannot VACUUM from within a transaction
And if I keep the transaction but remove the VACUUM then I get the error:
Error: table first_table already exists
I have an ugly workaround in place: prior to calling the editor, the script copies the dump file to a safe location, writes a warning message to the user:
WARNING: if anything goes wrong then a backup of the database
can be found in /some/path
and, if the script continues and completes loading the new content, then it deletes the copy of the dump. But this is pretty ugly!
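In outline, the workaround looks something like this (the paths and the helper name are hypothetical):

import os
import shutil

backup_path = "/some/path"                 # safe location for the backup
shutil.copy2(dump_path, backup_path)       # dump_path: wherever the dump was written
print("WARNING: if anything goes wrong then a backup of the database")
print("can be found in " + backup_path)

edit_validate_and_reload()                 # placeholder for the editor/validate/reload steps

os.remove(backup_path)                     # only reached if the reload completed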
I could use DROP TABLE instead of the DELETE FROM sqlite_master ..., but if I am trying to allow the database to be modified in this way then I am allowing that the list of tables itself may change. I.e. if the user adds this to the dump:
CREATE TABLE t3 (n INT);
then a hard-coded list of DROPs like this:
BEGIN TRANSACTION
DROP TABLE t1;
DROP TABLE t2;
DROP INDEX ...
...
cursor.executescript(sql_slurped_from_user_modified_dump)
...
END TRANSACTION;
isn't going to work second time round (because it doesn't delete table t3).
I could use filesystem-atomic operations (i.e. something like: load the modified dump into a new database file; hardlink new file to old file), but that would require the script to close its database connection and reopen it afterwards, which, for reasons beyond the scope of this question, I would prefer not to do.
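For reference, a minimal sketch of that rejected filesystem-level approach, using an atomic rename (os.replace) rather than a hardlink; the file names are hypothetical:

import os
import sqlite3

new_db = "app.db.new"
if os.path.exists(new_db):
    os.remove(new_db)

# Build the new database in a separate file.
new_conn = sqlite3.connect(new_db)
new_conn.executescript(sql_slurped_from_user_modified_dump)
new_conn.close()

# Atomic on POSIX: readers see either the old file or the new one, never a mix.
os.replace(new_db, "app.db")
# ...but the script would then have to close and reopen its own connection.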
Does anybody have any better ideas for atomically replacing the entire content of a database whose list of tables is not predictable?
In case Google leads you here ...
I managed to do the first half of the task (delete existing content inside a single transaction) with something like this pseudocode:
-- Make the order in which tables are dropped irrelevant. Unfortunately, this
-- cannot be done just around the table dropping because it doesn't work inside
-- transactions.
PRAGMA foreign_keys = 0;
BEGIN TRANSACTION;
indexes  = (SELECT name FROM sqlite_master
            WHERE type = 'index'
              AND name NOT LIKE 'sqlite_autoindex_%';)
triggers = (SELECT name FROM sqlite_master
            WHERE type = 'trigger';)
tables   = (SELECT name FROM sqlite_master
            WHERE type = 'table';)

for thing in indexes + triggers + tables:
    DROP thing;
At which point I thought the second half (loading new content in the same transaction) would just be this:
cursor.executescript(sql_slurped_from_user_modified_dump)
END TRANSACTION;
-- Reinstate foreign key constraints.
PRAGMA foreign_keys = 1;
Unfortunately, pressing CTRL-C in the middle of the two blocks resulted in an empty database. The cause? cursor.executescript() does an immediate COMMIT before running the provided SQL. That turns the above code into two transactions!
This isn't the first time I've been caught out by this module's hidden transaction
management, but this time I was motivated to try the apsw module instead. This switch was remarkably easy. The latter half of the code now looks like this:
cursor.execute(sql_slurped_from_user_modified_dump)
END TRANSACTION;
-- Reinstate foreign key constraints.
PRAGMA foreign_keys = 1;
and it works perfectly!
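For completeness, here is a minimal sketch of what the working apsw version might look like. The file name, the broader sqlite_% filter, the identifier quoting, and the assumption that the dump contains no BEGIN/COMMIT statements of its own are mine, not part of the original script:

import apsw

connection = apsw.Connection("app.db")
cursor = connection.cursor()

# FK enforcement can only be toggled outside a transaction.
cursor.execute("PRAGMA foreign_keys = 0")
cursor.execute("BEGIN TRANSACTION")

# Enumerate droppable objects, skipping SQLite's internal ones.
objects = cursor.execute(
    "SELECT type, name FROM sqlite_master"
    " WHERE type IN ('index', 'trigger', 'table')"
    " AND name NOT LIKE 'sqlite_%'").fetchall()

# Drop indexes and triggers before tables (dropping a table drops its
# indexes and triggers with it, so the other order would raise errors).
for type_, name in sorted(objects, key=lambda o: ('index', 'trigger', 'table').index(o[0])):
    cursor.execute('DROP %s "%s"' % (type_.upper(), name.replace('"', '""')))

# Reload the edited dump in the SAME transaction: apsw issues no hidden COMMIT,
# so an interrupt anywhere before END TRANSACTION leaves the old content intact.
cursor.execute(sql_slurped_from_user_modified_dump)

cursor.execute("END TRANSACTION")
cursor.execute("PRAGMA foreign_keys = 1")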
[Question posted by a user on YugabyteDB Community Slack]
I am running YugabyteDB 2.12 single node and would like to know if it is possible to create a temporary table such that it is automatically dropped upon committing the transaction in which it was created.
In “vanilla” PostgreSQL it is possible to specify ON COMMIT DROP option when creating a temporary table. In the YugabyteDB documentation for CREATE TABLE no such option is mentioned, however, when I tried it from ysqlsh it did not complain about the syntax. Here is what I tried from within ysqlsh:
yugabyte=# begin;
BEGIN
yugabyte=# create temp table foo (x int) on commit drop;
CREATE TABLE
yugabyte=# insert into foo (x) values (1);
INSERT 0 1
yugabyte=# select * from foo;
x
---
1
(1 row)
yugabyte=# commit;
ERROR: Illegal state: Transaction for catalog table write operation 'pg_type' not found
The CREATE TABLE documentation for YugabyteDB mentions the following for temporary tables:
Temporary tables are only visible in the current client session or transaction in which they are created and are automatically dropped at the end of the session or transaction.
When I create a temporary table (without the ON COMMIT DROP option), indeed the table is automatically dropped at the end of the session, but it is not automatically dropped upon commit of the transaction. Is there any way that this can be accomplished (apart from manually dropping the table just before the transaction is committed)?
Your input is greatly appreciated.
Thank you
See these two GitHub issues:
#12221: The create table doc section doesn’t mention the ON COMMIT clause for a temp table
and
#7926 CREATE TEMP … ON COMMIT DROP writes data into catalog table outside the DDL transaction
You cannot (yet, through YB-2.13.0.1) use the ON COMMIT DROP feature. But why not use ON COMMIT DELETE ROWS and simply let the temp table remain in place until the session ends?
Saying this raises a question: how do you create the temp table in the first place? Your stated goal implies that you’d need to create it before every use. But why? You could, instead, have dedicated initialization code to create the ON COMMIT DELETE ROWS temp table that you call from the client for this purpose at (but only at) the start of a session.
If you don’t want to have this, then (back to a variant of your present thinking) you could just do this before every intended use of the table:
drop table if exists t;
create temp table t(k int) on commit delete rows;
After all, how else (without dedicated initialization code) would you know whether or not the temp table exists yet?
If you prefer, you could use this logic instead:
do $body$
begin
  if not
    (
      select exists
      (
        select 1 from information_schema.tables
        where
          table_type = 'LOCAL TEMPORARY' and
          table_name = 't'
      )
    )
  then
    create temp table t(k int) on commit delete rows;
  end if;
end;
$body$;
I am trying to execute this query, but user-defined table types (CREATE TYPE) are not supported in Azure SQL Data Warehouse, and I want to use them in stored procedures.
CREATE TYPE DataTypeforCustomerTable AS TABLE(
PersonID int,
Name varchar(255),
LastModifytime datetime
);
GO
CREATE PROCEDURE usp_upsert_customer_table @customer_table DataTypeforCustomerTable READONLY
AS
BEGIN
MERGE customer_table AS target
USING @customer_table AS source
ON (target.PersonID = source.PersonID)
WHEN MATCHED THEN
UPDATE SET Name = source.Name,LastModifytime = source.LastModifytime
WHEN NOT MATCHED THEN
INSERT (PersonID, Name, LastModifytime)
VALUES (source.PersonID, source.Name, source.LastModifytime);
END
GO
CREATE TYPE DataTypeforProjectTable AS TABLE(
Project varchar(255),
Creationtime datetime
);
GO
CREATE PROCEDURE usp_upsert_project_table @project_table DataTypeforProjectTable READONLY
AS
BEGIN
MERGE project_table AS target
USING @project_table AS source
ON (target.Project = source.Project)
WHEN MATCHED THEN
UPDATE SET Creationtime = source.Creationtime
WHEN NOT MATCHED THEN
INSERT (Project, Creationtime)
VALUES (source.Project, source.Creationtime);
END
Is there any alternative way to do this?
You've got a few challenges there, because most of what you're trying to convert is not the way to do things on ASDW.
First, as you point out, CREATE TYPE is not supported, and there is no equivalent alternative.
Next, the code appears to be doing single inserts to a table. That's really bad on ASDW; performance will be dreadful.
Next, there's no MERGE statement (yet) for ASDW. That's because UPDATE is not the best way to handle changing data.
And last, stored procedures work a little differently on ASDW: they're not compiled, but interpreted each time the procedure is called. Stored procedures are great for big chunks of table-level logic, but not recommended for high-volume calls with single-row operations.
I'd need to know more about the use case to make specific recommendations, but in general you need to think in tables rather than rows. In particular, focus on the CREATE TABLE AS SELECT (CTAS) way of handling your ELT.
Here's a good link; it shows how the equivalent of a merge/upsert can be handled using CTAS:
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-develop-ctas#replace-merge-statements
As you'll see, it processes two tables at a time, rather than one row. This means you'll need to review the logic that called your stored procedure example.
If you get your head around doing everything in CTAS, and separately around Distribution, you're well on your way to having a high performance data warehouse.
Temp tables in Azure SQL Data Warehouse behave slightly differently from box-product SQL Server or Azure SQL Database: they exist at the session level. So all you have to do is convert your CREATE TYPE statements to temp tables and split the MERGE out into separate INSERT / UPDATE / DELETE statements as required.
Example:
CREATE TABLE #DataTypeforCustomerTable (
PersonID INT,
Name VARCHAR(255),
LastModifytime DATETIME
)
WITH
(
DISTRIBUTION = HASH( PersonID ),
HEAP
)
GO
CREATE PROCEDURE usp_upsert_customer_table
AS
BEGIN
-- Add records which do not already exist
INSERT INTO customer_table ( PersonID, Name, LastModifytime )
SELECT PersonID, Name, LastModifytime
FROM #DataTypeforCustomerTable AS source
WHERE NOT EXISTS
(
SELECT *
FROM customer_table target
WHERE source.PersonID = target.PersonID
)
...
Simply load the temp table and execute the stored proc. See here for more details on temp table scope.
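As an illustration (not part of the original answer), loading the session-scoped temp table and calling the proc from Python on a single pyodbc connection might look like this; the connection string, the row source, and the row-by-row insert (acceptable for small volumes only) are assumptions:

import pyodbc

conn = pyodbc.connect(asdw_connection_string, autocommit=True)
cursor = conn.cursor()

# The temp table lives for the duration of this session (connection).
cursor.execute("""
    CREATE TABLE #DataTypeforCustomerTable (
        PersonID INT,
        Name VARCHAR(255),
        LastModifytime DATETIME
    )
    WITH ( DISTRIBUTION = HASH( PersonID ), HEAP )
""")

# customer_rows: e.g. [(1, 'Alice', some_datetime), ...]
cursor.executemany(
    "INSERT INTO #DataTypeforCustomerTable (PersonID, Name, LastModifytime) VALUES (?, ?, ?)",
    customer_rows)

# The proc must run on the same connection, or it will not see the temp table.
cursor.execute("EXEC usp_upsert_customer_table")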
If you are altering a large portion of the table then you should consider the CTAS approach to create a new table, then rename it as suggested by Ron.
I have created a non-persistent attribute in my WoActivity table named VDS_COMPLETE. It is a boolean that gets changed by a checkbox in one of my applications.
I am trying to make an automation script in Python to change the status of every task of a work order that has been checked when I save the work order.
I don't know why it isn't working but I'm pretty sure I'm close to the answer...
Do you have an idea why it isn't working? I know that I have code in comments; I have done a few experiments...
from psdi.mbo import MboConstants
from psdi.server import MXServer
mxServer = MXServer.getMXServer()
userInfo = mxServer.getUserInfo(user)
mboSet = mxServer.getMboSet("WORKORDER")
#where1 = "wonum = :wonum"
#mboSet .setWhere(where1)
#mboSet.reset()
workorderSet = mboSet.getMbo(0).getMboSet("WOACTIVITY", "STATUS NOT IN ('FERME' , 'ANNULE' , 'COMPLETE' , 'ATTDOC')")
#where2 = "STATUS NOT IN ('FERME' , 'ANNULE' , 'COMPLETE' , 'ATTDOC')"
#workorderSet.setWhere(where2)
if workorderSet.count() > 0:
    for x in range(0, workorderSet.count()):
        if workorderSet.getString("VDS_COMPLETE") == 1:
            workorder = workorderSet.getMbo(x)
            workorder.changeStatus("COMPLETE", MXServer.getMXServer().getDate(), u"Script d'automatisation", MboConstants.NOACCESSCHECK)
workorderSet.save()
workorderSet.close()
It looks like your two biggest mistakes here are: 1. trying to get your boolean field (VDS_COMPLETE) off the set (the collection of records, i.e. the whole table) instead of off the MBO (an actual record, one entry in the table); and 2. getting your set of data fresh from the database (via that MXServer call), which means using the previously saved data instead of the data set from the screen where the pending changes have actually been made (and remember that non-persistent fields do not get saved to the database).
There are some other problems with this script too: your use of count() in your for loop (or even more than once at all), which is an expensive operation; and the way you are currently (though this may be a result of your debugging) not filtering the work order set before grabbing the first work order (meaning you get a random work order from the table) and then using a dynamic relationship off of that record, instead of using a normal relationship, or skipping the relationship altogether and using just a "where" clause, even though that relationship likely already exists.
Here is a Stack Overflow describing in more detail about relationships and "where" clauses in Maximo: Describe relationship in maximo 7.5
This question also has some more information about getting data from the screen versus new from the database: Adding a new row to another table using java in Maximo
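Putting those points together, a corrected sketch might look like the following, assuming an object launch point on WORKORDER at save time, so that the implicit mbo variable is the work order from the screen (complete with its unsaved, non-persistent changes); the relationship, status and field names are taken from the question:

from psdi.mbo import MboConstants
from psdi.server import MXServer

tasks = mbo.getMboSet("WOACTIVITY")   # normal relationship from the on-screen work order

task = tasks.moveFirst()
while task is not None:
    # Read the non-persistent flag off the individual task record, not off the set.
    if task.getBoolean("VDS_COMPLETE") and task.getString("STATUS") not in ('FERME', 'ANNULE', 'COMPLETE', 'ATTDOC'):
        task.changeStatus("COMPLETE",
                          MXServer.getMXServer().getDate(),
                          "Script d'automatisation",
                          MboConstants.NOACCESSCHECK)
    task = tasks.moveNext()

No explicit save() should be needed, since the script runs as part of the work order's own save.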
A deployment script creates and configures databases, collections, etc. The script includes code to drop the databases before beginning so that testing can proceed normally. After dropping the database and re-adding it:
var graphmodule = require("org/arangodb/general-graph");
var graphList = graphmodule._list();
var dbList = db._listDatabases();
for (var j = 0; j < dbList.length; j++) {
    if (dbList[j] == 'myapp')
        db._dropDatabase('myapp');
}
db._createDatabase('myapp');
db._useDatabase('myapp');
db._create('appcoll'); // Collection already exists error occurs here
The collections that had previously been added to mydb remain in mydb, but they are empty. This isn't exactly a problem for my particular use case since the collections are empty and I had planned to rebuild them anyway, but I'd prefer to have a clean slate for testing and this behavior seems odd.
I've tried closing the shell and restarting the database between the drop and the add, but that didn't resolve the issue.
Is there a way to cleanly remove and re-add a database?
The collections should be dropped when db._dropDatabase() is called.
However, if you run db._dropDatabase('mydb'); directly followed by db._createDatabase('mydb'); and then retrieve the list of collections via db._collections(), this will show the collections from the current database (which is likely the _system database if you were able to run those commands).
That means you are probably looking at the collections in the _system database all the time unless you change the database via db._useDatabase(name);. Does this explain it?
ArangoDB stores additional information for managed graphs; therefore, when working with named graphs, you should use the graph management functions to delete graphs, to make sure nothing remains in the system:
var graph_module = require("org/arangodb/general-graph");
graph_module._drop("social", true);
The current implementation of the graph viewer in the management interface stores your view preferences (like the attribute that should become the label of a graph) in your browser's local storage, so that's out of the reach of these functions.
I have this simple table (just for testing):
create table table
(
key int not null primary key auto_increment,
name varchar(30)
);
Then I execute the following statements:
insert into table values ( null , 'one');// key=1
insert into table values ( null , 'two');// key=2
At this stage all goes well. Then I close the H2 Console, re-open it, and execute this statement:
insert into table values ( null , 'three');// key=33
Finally, here are the results:
I do not know how to solve this problem, if it is a real problem...
The database uses a cache of 32 entries for sequences, and auto-increment is internally implemented as a sequence. If the system crashes without closing the database, at most this many numbers are lost. This is similar to how sequences work in other databases. Sequence values are not guaranteed to be generated without gaps in such cases.
So, did you really close the database? You should - it's not technically a problem if you don't, but closing the database will ensure such strange things will not occur. I can't reproduce the problem if I normally close the database (stop the H2 Console tool). Closing all connections will close the database, and the database is closed if the application is stopped normally (using a shutdown hook).
By the way, what is your exact database URL? It seems you are using jdbc:h2:tcp://... but I can't see the rest of the URL.
Don't close the terminal. The terminal is the parent process of the H2 TCP server; they are not detached. When you just close the terminal, its process closes all child processes, which means an emergency server shutdown.
This happens when a database "thinks" it was forced to close (an accident or emergency, for example), and it's related to the identity cache.
In my case I was facing this issue while learning and experimenting with the H2 database in a Spring Boot application. The solution was to execute the SHUTDOWN; command in the H2 console when finished; after that you can safely stop your Spring Boot application without this tremendous jump in your autogenerated fields.
Personal note: this usually is not a problem if you are creating a new database on every application start, but when you persist the data (for example in a file-based database, as in the properties below) so that it survives restarts, this happens; so close the database safely with the SHUTDOWN command.
spring.datasource.url=jdbc:h2:./src/main/resources/data;DB_CLOSE_ON_EXIT=FALSE;AUTO_RECONNECT=TRUE
spring.jpa.hibernate.ddl-auto=update
References:
Solution: https://stackoverflow.com/a/40135657/10195307
Identity cache: https://www.sqlshack.com/learn-to-avoid-an-identity-jump-issue-identity_cache-with-the-help-of-trace-command-t272/