FOR UPDATE with a psycopg2 cursor for Postgres - python-3.x

We are using a psycopg2 cursor to fetch and process jsonb data, but whenever a new thread or process comes in, it should not fetch and process the same records as the first process or thread.
For that we have tried to use FOR UPDATE, but we just want to know whether we are using the correct syntax or not.
conn = self.dbPool.getconn()
cur = conn.cursor()
sql = """SELECT jsondoc FROM %s WHERE jsondoc #> %s"""
if 'sql' in queryFilter:
    sql += queryFilter['sql']
When we print this query, it is shown as below:
Query: "SELECT jsondoc FROM %s WHERE jsondoc #> %s AND (jsondoc ->> 'claimDate')::float <= 1536613219.0 AND ( jsondoc ->> 'claimstatus' = 'done' OR jsondoc ->> 'claimstatus' = 'failed' ) limit 2 FOR UPDATE"
cur.execute(sql, (AsIs(self.tablename), Json(queryFilter),))
dbResult = cur.fetchall()
Please help us clarify the syntax, and if the syntax is correct, explain how this query locks the records fetched by the first thread.
Thanks,
Sanjay.

If this example query is executed
select *
from my_table
order by id
limit 2
for update; -- wrong
then the two resulting rows are locked until the end of the transaction (i.e. until the next connection.rollback() or connection.commit(), or until the connection is closed). If another transaction tries to run the same query during this time, it will be blocked until the two rows are unlocked. So this is not the behaviour you expect. You should add a SKIP LOCKED clause:
select *
from my_table
order by id
limit 2
for update skip locked; -- correct
With this clause the second transaction will skip the locked rows and return the next two without waiting.
Read about it in the documentation.
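For illustration, a minimal sketch of how this pattern might look from psycopg2 (the connection string, table, and column names are hypothetical):

import psycopg2

conn = psycopg2.connect("dbname=test")  # hypothetical connection parameters

with conn:  # the transaction commits (or rolls back) when this block exits
    with conn.cursor() as cur:
        # Concurrent workers running this get disjoint pairs of rows:
        # rows locked by another transaction are skipped, not waited on.
        cur.execute("""
            SELECT *
            FROM my_table
            ORDER BY id
            LIMIT 2
            FOR UPDATE SKIP LOCKED
        """)
        rows = cur.fetchall()
        # ... process `rows` while they are still locked ...

conn.close()  # or return the connection to your pool

The locks are held only for the duration of the transaction, so the processing of the fetched rows has to happen before the commit.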

Related

Node.js and Oracle DB select query getting empty array in rows

const result = await connection.execute(
    `SELECT * from no_example`, [], { maxRows: 1000 }
);
but in the result I always get empty rows.
If you are inserting rows in another tool or another program, make sure that you COMMIT the data:
SQL> create table t (c number);
Table created.
SQL> insert into t (c) values (1);
1 row created.
SQL> commit;
Commit complete.
If you are inserting using Node.js, look at the autoCommit attribute and the connection.commit() function. Also see the node-oracledb documentation on Transaction Management.
Unrelated to your problem, but you almost certainly shouldn't be using maxRows. By default node-oracledb will return all rows. If you don't want all, then add some kind of WHERE clause or row-limiting clause to your query. If you expect a big number of rows, then use a result set so you can access consecutive batches of rows.

Executing more than one SQL query using psycopg2 in a double with statement

Is it possible to pass more than one query in a double with statement with psycopg2 (the first with opening the connection, the second the cursor)?
E.g. to replace:
import psycopg2

def connector():
    return psycopg2.connect(**DB_DICT_PARAMS)

########

sql_update1 = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s")
sql_update2 = ("UPDATE table SET json_field = %s "
               "WHERE id = %s")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update2, [jsonString, ID])
by:
#(...)
sql_update1 = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s")
sql_update2 = ("UPDATE table SET json_field = %s "
               "WHERE id = %s")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        curs.execute(sql_update2, [jsonString, ID])
What if the second query needs the first one to have completed beforehand, and what if not?
In the shown case, they will definitely update the same record (i.e. row) in the database but not the same fields (i.e. attributes or columns).
Is this authorized precisely because the two SQL statements are committed sequentially, i.e. the first finishes first, and only then is the second executed?
Or is it actually forbidden because they can be executed in parallel, each query without knowing the state of the other at any instant t?
There are no fancy triggers or procedures in the DB. Let's keep it simple for now.
(Please note that I have purposely written two queries here where a single one would have perfectly fit, but that's not always the case, as some computations still have to happen before other results are saved to the same record in the DB.)
If you want them to execute at the same time, simply put them in the same string separated by a semicolon. I'm a little rusty, but I think the following should work:
sql_updates = ("UPDATE table SET array = %s::varchar[], "
               "array_created = true, timestamp = now() AT TIME ZONE 'UTC' "
               "WHERE id = %s;"
               "UPDATE table SET json_field = %s "
               "WHERE id = %s;")

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_updates, [stringArray, ID, jsonString, ID])
Better avoid this:
with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])

with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update2, [jsonString, ID])
Opening a database connection is pretty slow compared to doing a query, so it is much better to reuse it rather than opening a new one for each query. If your program is a script, typically you'd just open the connection at startup and close it at exit.
However, if your program spends a long time waiting between queries, and there will be many instances running, then it would be better to close the connection to not consume valuable RAM on the postgres server for doing nothing. This is common in client/server applications where the client mostly waits for user input. If there are many clients you can also use connection pooling, which offers the best of both worlds at the cost of a bit extra complexity. But if it's just a script, no need to bother with that.
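If pooling is what you eventually need, here is a minimal sketch using psycopg2's built-in pool (the pool sizes are arbitrary):

from psycopg2 import pool

# Keep at least 1 and at most 10 connections open.
db_pool = pool.ThreadedConnectionPool(1, 10, **DB_DICT_PARAMS)

conn = db_pool.getconn()   # borrow a connection from the pool
try:
    with conn.cursor() as curs:
        curs.execute("SELECT 1")
    conn.commit()
finally:
    db_pool.putconn(conn)  # return it to the pool instead of closing it

Returning to your original code, reusing one connection and one cursor looks like this: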
with connector() as conn:
    with conn.cursor() as curs:
        curs.execute(sql_update1, [stringArray, ID])
        curs.execute(sql_update2, [jsonString, ID])
This would be faster. You don't need to build a new cursor; you can reuse the same one. Note that if you don't fetch the results of the first query before reusing the cursor, you won't be able to do so after executing the second query, because a cursor only stores the results of the last query. Since these are updates there are no results anyway, unless you want to check the rowcount to see whether a row was updated.
What if the second query needs the first one to have completed beforehand, and what if not?
No need to care: execute() processes the whole query before returning, so by the time Python gets to the next bit of code, the query is done.
Is this authorized precisely because the two SQL statements are committed sequentially, i.e. the first finishes first, and only then is the second executed?
Yes
Or is it actually forbidden because they can be executed in parallel, each query without knowing the state of the other at any instant t?
If you want to execute several queries in parallel, for example because a query takes a while and you want to execute it while still running other queries, then you need several DB connections and of course one python thread for each because execute() is blocking. It's not used often.
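As a minimal sketch of that parallel setup (reusing the names from your question), each thread gets its own connection, since one connection can only run one query at a time:

import threading
import psycopg2

def run_query(sql, params):
    # One connection per thread.
    conn = psycopg2.connect(**DB_DICT_PARAMS)
    try:
        with conn.cursor() as curs:
            curs.execute(sql, params)
        conn.commit()
    finally:
        conn.close()

t1 = threading.Thread(target=run_query, args=(sql_update1, [stringArray, ID]))
t2 = threading.Thread(target=run_query, args=(sql_update2, [jsonString, ID]))
t1.start(); t2.start()
t1.join(); t2.join()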

How to execute multiple DML statements in a variable sequentially using cx_Oracle

I have a variable SCRIPT which holds two to three DML statements. I want to run them sequentially after connecting to my Oracle DB. I have tried the below, but it fails with this error:
c.execute(SCRIPT)
cx_Oracle.DatabaseError: ORA-00933: SQL command not properly ended
Below is the piece of code I tried.
SCRIPT="""UPDATE IND_AFRO.DRIVER
SET Emp_Id = 1000, update_user_id = 'RIBST-4059'
WHERE Emp_Id IN (SELECT Emp_Id
FROM IND_AFRO.DRIVER Ddq
WHERE NOT EXISTS
(SELECT 1
FROM IND_AFRO_AF.EMPLOYEE
WHERE Emp_Id = Ddq.Emp_Id)
AND Functional_Area_Cd = 'DC');
UPDATE IND_AFRO.APPOINTMENTS
SET Emp_Id = 1000, update_user_id = 'RIBST-4059'
WHERE Emp_Id IN (SELECT Emp_Id
FROM IND_AFRO.APPOINTMENTS Ddq
WHERE NOT EXISTS
(SELECT 1
FROM IND_AFRO_AF.EMP
WHERE Emp_Id = Ddq.Emp_Id));
UPDATE IND_AFRO.ar_application_for_aid a
SET a.EMP_ID = 1000
WHERE NOT EXISTS
(SELECT 1
FROM IND_AFRO_AF.EMP
WHERE emp_id = a.emp_id);"""
conn = cx_Oracle.connect(user=r'SYSTEM', password='ssadmin', dsn=CONNECTION)
c = conn.cursor()
c.execute(SCRIPT)
c.close()
The execute() and executemany() functions only work on one SQL or PL/SQL statement.
You can wrap the three statements in a PL/SQL BEGIN/END block like:
SQL> begin
2 insert into test values(1);
3 update test set a = 2;
4 end;
5 /
PL/SQL procedure successfully completed.
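From cx_Oracle this might look like the following sketch (using the credentials from your question); the whole anonymous block counts as a single PL/SQL statement, so execute() accepts it, and the statements inside the block keep their semicolons:

import cx_Oracle

conn = cx_Oracle.connect(user='SYSTEM', password='ssadmin', dsn=CONNECTION)
c = conn.cursor()
# The BEGIN/END block is one statement; note there is no trailing slash.
c.execute("""
begin
    insert into test values (1);
    update test set a = 2;
end;""")
conn.commit()
c.close()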
Alternatively, you can split up your string into individual statements. If the statements originate from a file, you can write a wrapper that reads the file and executes each statement. This is a lot easier if you restrict the SQL syntax (particularly regarding line terminators). For an example, see https://github.com/oracle/python-cx_Oracle/blob/master/samples/SampleEnv.py#L116
However this means calling execute() more times, which isn't as efficient as the first solution.

Iterate on page of returning execute_values

http://initd.org/psycopg/docs/extras.html
psycopg2.extras.execute_values has a page_size parameter.
I'm doing an INSERT INTO ... ON CONFLICT ... with RETURNING id.
The problem is that cursor.fetchall() gives me back only the last "page", that is, 100 ids (the default page_size).
Without modifying the page_size parameter, is it possible to iterate over the results to get the total number of rows updated?
The best and shortest answer would be to use fetch=True in the parameters, as stated here:
all_ids = psycopg2.extras.execute_values(cur, query, data, template=None, page_size=10000, fetch=True)
# all_ids will contain all affected rows, as a list like [[1], [2], [3], ...]
I ran into the same issue. I work around it by batching my calls to execute_values(). I'll set my_page_size=1000, then iterate over my values, filling argslist until I have my_page_size items. Then I'll call execute_values(cur, sql, argslist, page_size=my_page_size), and iterate over cur to get those ids.
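A rough sketch of that workaround (the surrounding names are made up; the point is that each call passes at most page_size rows, so every RETURNING row can be read from the cursor):

from psycopg2.extras import execute_values

my_page_size = 1000
sql = "INSERT INTO spam (eggs) VALUES %s ON CONFLICT (eggs) DO NOTHING RETURNING id"

all_ids = []
batch = []
for row in rows_to_insert:            # any iterable of row tuples
    batch.append(row)
    if len(batch) == my_page_size:
        execute_values(cur, sql, batch, page_size=my_page_size)
        all_ids.extend(r[0] for r in cur)   # iterate the cursor for the ids
        batch = []
if batch:                             # flush the remainder
    execute_values(cur, sql, batch, page_size=my_page_size)
    all_ids.extend(r[0] for r in cur)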
Without modifying the page_size parameter, is it possible to iterate over the results to get the total number of rows updated?
Yes.
conn = None  # so the finally clause works even if connect() fails
try:
    conn = psycopg2.connect(...)
    cur = conn.cursor()
    query = """
        WITH
        items (eggs) AS (VALUES %s),
        inserted AS (
            INSERT INTO spam (eggs)
            SELECT eggs FROM items
            ON CONFLICT (eggs) DO NOTHING
            RETURNING id
        )
        SELECT id FROM spam
        WHERE eggs IN (SELECT eggs FROM items)
        UNION
        SELECT id FROM inserted
    """
    eggs = (('egg_{}'.format(i % 666),) for i in range(10_000))
    ids = psycopg2.extras.execute_values(cur, query, argslist=eggs, fetch=True)
    conn.commit()
    # Do whatever with `ids`. `len(ids)` I suppose?
finally:
    if conn:
        cur.close()
        conn.close()
I deliberately overbuilt the query to address some gotchas:
WITH items (eggs) AS (VALUES %s) is done to be able to use argslist in two places at once;
RETURNING combined with ON CONFLICT returns only the ids that were actually inserted; conflicting rows are omitted from the INSERT's direct results. All the SELECT ... WHERE ... UNION SELECT mumbo jumbo is there to solve that;
to get all values which you asked for: ids = psycopg2.extras.execute_values(..., fetch=True).
A horrible interface oddity considering that all other cases are done like
cur.execute(...) # or other kind of `execute`
rows = cur.fetchall() # or other kind of `fetch`
So if you want only the number of inserted rows then do
conn = None
try:
    conn = psycopg2.connect(...)
    cur = conn.cursor()
    query = """
        INSERT INTO spam (eggs)
        VALUES %s
        ON CONFLICT (eggs) DO NOTHING
        RETURNING id
    """
    eggs = (('egg_{}'.format(i % 666),) for i in range(10_000))
    ids = psycopg2.extras.execute_values(cur, query, argslist=eggs, fetch=True)
    conn.commit()
    print(len(ids))
finally:
    if conn:
        cur.close()
        conn.close()

SQLite database is locked

How do I sort this one out?
code:
c.execute("INSERT INTO INPUT33 (NAME) VALUES (?);", (name3,))
c.execute("select MAX(rowid) from [input33];")
conn.commit()
for rowid in cursor:break
for elem in rowid:
m = elem
print(m)
c.execute("select MAX(rowid) from [input];")
for rowid in c:break
for elem in rowid:
m = elem
c.execute("DELETE FROM input WHERE rowid = ?", (m,))
conn.commit()
After running this, I get:
sqlite3.OperationalError: database is locked
Taken from the Python docs:
When a database is accessed by multiple connections, and one of the processes modifies the database, the SQLite database is locked until that transaction is committed. The timeout parameter specifies how long the connection should wait for the lock to go away until raising an exception. The default for the timeout parameter is 5.0 (five seconds).
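So either make sure the other connection (or tool) commits its transaction promptly, or raise the timeout when connecting; a minimal sketch (the filename is hypothetical):

import sqlite3

# Wait up to 30 seconds for the other transaction to commit before
# raising "database is locked" (the default is 5 seconds).
conn = sqlite3.connect('mydb.sqlite3', timeout=30.0)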
