I have a PostgreSQL query that starts with DO:
do
$$
DECLARE
    temprow record;
BEGIN
    for temprow in
        select *
        from generate_series(1, 100)
        where generate_series % 2 = 0
    loop
        with cte_input(val) as (select val from (values (temprow.generate_series)) as t(val))
        insert
        into tmp_table(input_value, value_100)
        select cte_input.val as input_value, cte_input.val::float / 100 as value_100
        from cte_input;
        commit;
    end loop;
END
$$ LANGUAGE plpgsql;
How can I run this query with Python and psycopg2?
Is a temporary function the right approach if I need to run this query several times with some dynamic changes?
UPD
Thank you @erwin-brandstetter for the information about COMMIT.
I deleted COMMIT from the query block and added it in the Python code: ps_cursor.execute('COMMIT').
I wrote the code this way:
import concurrent.futures
import psycopg2 as pg
from psycopg2 import pool

features = [(1, name_of_feature_1), ...]  # list of features
list_query = []
for feature in features:
    feature_id = feature[0]
    name_feature = feature[1]
    query = f"""--Feature:{feature_id}
create or replace procedure pg_temp.proc_feature_{feature_id}_values()
language plpgsql
as
$$
DECLARE
temprow record;
BEGIN
for temprow in
select *
from tmp_maternal_sample
where maternal_sample = 1000
loop
insert
into tmp_feature_values(feature_id,
feature_values_array,
maternal_sample)
select feature_id,
array_agg(t_rank.{name_feature}) f_values,
temprow.maternal_sample
from t_rank
....
....
end loop;
end
$$;
call pg_temp.proc_feature_{feature_id}_values();
"""
    list_query.append(query)


def load_query(query):
    # Borrow a connection from the pool and always hand it back afterwards,
    # otherwise the pool is exhausted after maxconn queries.
    ps_connection = threaded_postgreSQL_pool.getconn()
    try:
        print(f"Successfully received connection from connection pool for Query {query[:15]} ")
        ps_cursor = ps_connection.cursor()
        ps_cursor.execute(query)
        ps_cursor.execute('COMMIT')
        ps_cursor.close()
    finally:
        threaded_postgreSQL_pool.putconn(ps_connection)
    result = f'Query {query[:15]} finished'
    print(result)
    return result


threaded_postgreSQL_pool = None  # defined up front so the finally block can test it
try:
    # user, password, host, port and database are defined elsewhere;
    # psycopg2 expects them as keyword arguments.
    threaded_postgreSQL_pool = pool.ThreadedConnectionPool(1, 32, user=user, password=password,
                                                           host=host, port=port, database=database)
    if threaded_postgreSQL_pool:
        print("Connection pool created successfully using ThreadedConnectionPool")
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as executor:
        future_to_sql = {executor.submit(load_query, query): query for query in list_query}
        for future in concurrent.futures.as_completed(future_to_sql):
            sql = future_to_sql[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%s generated an exception: %s' % (sql[:15], exc))
            else:
                print('%s returned: %s' % (sql[:15], data))
except (Exception, pg.DatabaseError) as error:
    print("Error while connecting to PostgreSQL", error)
finally:
    if threaded_postgreSQL_pool:
        threaded_postgreSQL_pool.closeall()  # the parentheses are required to actually call it
        print('Threaded PG connection pool is closed')
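A side note on the pooled version above: every connection taken with getconn() should be handed back with putconn(), otherwise the pool runs dry once more queries are submitted than the pool's maximum size. Below is a minimal sketch of a helper that guarantees this. It assumes a pool object exposing getconn() and putconn(), as psycopg2's pools do; the names borrowed_connection and the two-argument load_query are made up for illustration.

```python
from contextlib import contextmanager


@contextmanager
def borrowed_connection(conn_pool):
    """Yield a connection from conn_pool and always return it afterwards."""
    conn = conn_pool.getconn()
    try:
        yield conn
    finally:
        # Runs on success and on error, so the pool never leaks connections.
        conn_pool.putconn(conn)


def load_query(conn_pool, query):
    """Run one query on a borrowed connection and commit it."""
    with borrowed_connection(conn_pool) as conn:
        cur = conn.cursor()
        try:
            cur.execute(query)
            conn.commit()
        finally:
            cur.close()
    return f"Query {query[:15]} finished"
```

Passing the pool as an argument instead of relying on a module-level variable also makes the worker function easier to test in isolation.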
It's safe to assume Postgres 11 or later, because transaction control (COMMIT) inside a DO statement is only supported since that version. See:
COMMIT works in one plpgsql code block, but not in another?
Your DO statement is convoluted without obvious reason. Simpler:
DO
LANGUAGE plpgsql
$do$
DECLARE
   i int;
BEGIN
   FOR i IN
      SELECT generate_series(2, 100, 2)
   LOOP
      INSERT INTO tmp_table(input_value, value_100)
      VALUES (i, i::float / 100);
      -- COMMIT; -- ?
   END LOOP;
END
$do$;
Which boils down to just this - even including the creation of that temp table:
CREATE TEMP TABLE tmp_table AS
SELECT g AS input_value, g::float / 100 AS value_100
FROM generate_series(2, 100, 2) g;
db<>fiddle here
Some setups (like dbfiddle.uk) still don't allow transaction handling with COMMIT. Not sure you even need that?
Either way, just execute the raw SQL.
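Executing the raw SQL from psycopg2 could look like the sketch below. It assumes that conn is an open psycopg2 connection and that tmp_table already exists; run_sql_block is a name chosen here for illustration. The DO block contains no COMMIT; the commit is issued from Python instead.

```python
DO_BLOCK = """
DO
LANGUAGE plpgsql
$do$
DECLARE
   i int;
BEGIN
   FOR i IN
      SELECT generate_series(2, 100, 2)
   LOOP
      INSERT INTO tmp_table(input_value, value_100)
      VALUES (i, i::float / 100);
   END LOOP;
END
$do$;
"""


def run_sql_block(conn, sql=DO_BLOCK):
    """Execute one SQL string on a DB-API connection and commit it."""
    cur = conn.cursor()
    try:
        cur.execute(sql)  # a DO statement returns no rows, so there is nothing to fetch
    finally:
        cur.close()
    conn.commit()


# Usage (assumption: psycopg2 is installed and the DSN is valid):
# import psycopg2
# conn = psycopg2.connect("dbname=... user=...")
# run_sql_block(conn)
# conn.close()
```

For a statement that changes slightly between runs, building the SQL string in Python and calling the helper repeatedly may be enough, without creating a temporary function at all.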
Related
I am using ThreadPoolExecutor for parallel execution of a function that prints statements and executes SQL. I would like to control the order of the print statements from the function. For example:
import concurrent.futures

def Func(host, sql):
    print('Executing for %s ' % host)
    result = Execute(host, sql)  # Execute connects to the DB and runs the query (helper not shown)
    print(result)

def main():
    sql = 'show databases;'
    hostList = ['abc.com','def.com','ghi.com','jkl.com']
    with concurrent.futures.ThreadPoolExecutor() as executor:
        future = [executor.submit(Func, host, sql) for host in hostList]
Here, for the 4 items in hostList, Func runs in parallel threads, but the output is printed like this:
Executing for abc.com
Executing for def.com
Executing for ghi.com
Executing for jkl.com
then
SQL output 1
SQL output 2
SQL output 3
SQL output 4
I would like the output to be grouped per host instead, like this:
Executing for abc.com
SQL output 1
Executing for def.com
SQL output 1
Executing for ghi.com
SQL output 1
Executing for jkl.com
SQL output 1
If you just want to group your print statements together without reflecting the pause required to execute then you can do the following. Note that if the ONLY thing you are doing is a single print statement then you likely don't need the lock.
import concurrent.futures
import threading
import random
import time

def Func(account, host, sql, lock):
    seconds = random.randint(1, 10)
    time.sleep(seconds)
    result = "Result of executing \"{}\" took {} seconds".format(sql, seconds)
    ## -------------------------------
    ## Probably don't need to lock if you combine these into one statement
    ## -------------------------------
    with lock:
        print('Executing for %s ' % host)
        print("\t{}\n".format(result))
    ## -------------------------------

lock = threading.Lock()
hostList = ['abc.com','def.com','ghi.com','jkl.com']
with concurrent.futures.ThreadPoolExecutor() as executor:
    future = [executor.submit(Func, "acct" , host, "somesql", lock) for host in hostList]
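An alternative that avoids the lock entirely is to let the worker return its result and do all printing in the main thread. The sketch below is one way to do that; run_query is a hypothetical stand-in for the real database call, and the short sleep only simulates work. Iterating over the futures in submission order keeps the output in the same order as hostList.

```python
import concurrent.futures
import time


def run_query(host, sql):
    # Hypothetical stand-in for the real database call.
    time.sleep(0.01)
    return host, "result of {!r} on {}".format(sql, host)


def run_all(hosts, sql):
    """Run the query on all hosts in parallel and return grouped output lines."""
    lines = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(run_query, host, sql) for host in hosts]
        # Walk the futures in submission order; result() blocks until each is done.
        for future in futures:
            host, result = future.result()
            lines.append("Executing for {}".format(host))
            lines.append("\t{}".format(result))
    return lines


hosts = ['abc.com', 'def.com', 'ghi.com', 'jkl.com']
output = run_all(hosts, "show databases;")
print("\n".join(output))
```

The queries still run concurrently; only the printing is serialized, so each host's header is always followed by its own result.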
I hope my formatting is OK, as this is my first time using Stack Overflow.
No matter how I change my code and methods, I keep getting the same error when executing this code:
  File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 83, in __call__
    return bytes(self.params[index])
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "sqlTest.py", line 40, in <module>
    mycursor.execute(sql,val)
  File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 558, in execute
    stmt = RE_PY_PARAM.sub(psub, stmt)
  File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 86, in __call__
    "Not enough parameters for the SQL statement")
mysql.connector.errors.ProgrammingError: Not enough parameters for the SQL statement
This is a section of my main project that logs the current values of certain variables as well as the GPS coordinates and a timestamp.
From what I've seen, the main issue has to do with the database expecting 8 database entries when I should only need 7.
I mainly followed the https://www.w3schools.com/python/python_mysql_insert.asp tutorial, as I am not very familiar with using Python and MySQL together.
#Initialize MySQL database connection
mydb = mysql.connector.connect(
host="themySQLserver.net",
user="myUSername",
passwd="_____",
database="24totheTLE"
)
These variables are normally set by the main program but I manually set them for troubleshooting
top_since_epoch = 4
left_since_epoch = 1
bottom_since_epoch = 5
right_since_epoch = 3
This is the code that calls the python2 script to get the gps data
fullgps = os.popen("python gps.py").read()
gps_split = fullgps.split(";")
gps_split[1] = gps_split[1].rstrip()
s1 = float(gps_split[0])
s2 = float(gps_split[1])
The primary key "LogNum" for my database is set to auto increment and as such I have not mentioned it in my code.
ts = time.time()
timestam = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
mycursor = mydb.cursor()
sql = "INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,GPSLong,GPSLat) VALUES (%s, %s, %s, %s, %s, %s, %s)"
val = [timestam,top_since_epoch,left_since_epoch,bottom_since_epoch,s2,s1]
mycursor.execute(sql,val)
mydb.commit()
print(mycursor.rowcount, "record inserted.")
Thanks to anyone who replies.
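Finally found the answer.
The problem was that the code I used to pass the date-time information was not set up to interact with "mysql.connector". This was solved by updating the code to work with the MySQLdb libraries. Note that the working code below also passes seven values in val: the original list omitted right_since_epoch and held only six values for the seven %s placeholders, which on its own is enough to trigger "Not enough parameters for the SQL statement".
This code worked for me:
#Initialize MySQL database connection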
mydb = MySQLdb.connect(
host="sql24.cpt1.host-h.net",
user="stonesp_1",
passwd="______",
database="24totheTLE"
)
print("5 minutes")
fullgps = os.popen("python gps.py").read()
gps_split = fullgps.split(";")
s1 = float(gps_split[0])
s2 = float(gps_split[1])
print(s1)
print(s2)
ts = time.time()
timecur = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
print(timecur)
mycursor = mydb.cursor()
sql = "INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,GPSLong,GPSLat) VALUES (%s,%s,%s,%s,%s,%s,%s)"
val = [timecur,top_since_epoch,left_since_epoch,right_since_epoch,bottom_since_epoch,s2,s1]
mycursor.execute(sql,val)
mydb.commit()
print(mycursor.rowcount, "record inserted.")
mydb.close()
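A quick sanity check for this class of error is to compare the number of placeholders with the number of values before calling execute(). The helper below is only a sketch: check_param_count is a made-up name, the sample values are placeholders, and the naive count does not handle a literal %s inside a quoted string or an escaped %%.

```python
def check_param_count(sql, params):
    """Raise ValueError if the number of %s placeholders and values differ."""
    expected = sql.count("%s")
    if expected != len(params):
        raise ValueError(
            "statement has {} placeholders but {} values were given".format(
                expected, len(params)))
    return expected


sql = ("INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,"
       "GPSLong,GPSLat) VALUES (%s, %s, %s, %s, %s, %s, %s)")

# Sample values only: six entries, as in the failing version (CarsRight missing).
val_short = ["2019-01-01 00:00:00", 4, 1, 5, 0.0, 0.0]
# Seven entries, as in the working version.
val_full = ["2019-01-01 00:00:00", 4, 1, 3, 5, 0.0, 0.0]

try:
    check_param_count(sql, val_short)
except ValueError as exc:
    message = str(exc)
    print(message)

print(check_param_count(sql, val_full))
```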
I am using a named cursor to fetch 200K+ rows, and using the attribute, 'withhold=True', this way I can iterate by fetching many (50K) at a time - but my cursor is not persisting...
Here is the error / stacktrace
Traceback (most recent call last):
  File "/home/me/code/etl/etl.py", line 179, in main
    _pg_data = _fetch(_some)
psycopg2.ProgrammingError: named cursor isn't valid anymore

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/me/code/etl/etl.py", line 330, in <module>
    main()
  File "/home/me/code/etl/etl.py", line 271, in main
    logging.error(Fore.LIGHTRED_EX + e + Fore.RESET, exc_info=True)
TypeError: must be str, not ProgrammingError
Here is my code
from colorama import Fore
from datetime import datetime
import argparse, logging, psycopg2, pyodbc, sys, time, yaml
import _classes.Utils as cpu
def main():
    _cfg_path = "/home/me/code/etl/db.yml"
    with open(_cfg_path, 'r') as _ymlfile:
        _cfg = yaml.load(_ymlfile, Loader=yaml.CLoader)
    # create a connection to the database
    _conn = psycopg2.connect("host={0} dbname={1} user={2} password={3} port={4}".format(_cfg['local_postgres']['host'], _cfg['local_postgres']['db'],
                                                                                         _cfg['local_postgres']['user'], _cfg['local_postgres']['passwd'],
                                                                                         _cfg['local_postgres']['port']))
    _curs_pgsql = _conn.cursor()
    _curs_pgsql.callproc('usp_outbound', ['curs'])
    _curs2_pgsql = _conn.cursor('curs', withhold=True)
    _push_date = datetime.now().strftime("%Y-%m-%d")
    _some = 50000
    _fetch = _curs2_pgsql.fetchmany
    while True:
        _pg_data = _fetch(_some)
        if not _pg_data:
            break
        for _row in _pg_data:
            _params = ()
            _sql = "INSERT INTO dbo.tbl VALUES (?, ?, ?)"
            _params = (_row[0], _row[1], _row[2])
            # ...insert into destination database
            # ...now update source database and set the push and push date flags
            _curs_pgsql.execute("UPDATE products SET pushed = TRUE, pushed_date = (%s) WHERE id = (%s)", (_push_date, _row[2],))
            _conn.commit()
    if _conn:
        # close cursor / close the communication with the PostgreSQL database server
        _curs2_pgsql.close()
        _curs_pgsql.close()
        _conn.close()
Clearly I am missing something with my named cursor and how it's supposed to be defined...
According to the documentation -
Set the value before calling execute() or use the connection.cursor() withhold parameter, otherwise the value will have no effect.
... ... ...
Trying to fetch from a named cursor after a commit() or to create a named cursor when the connection is in autocommit mode will result in an exception. It is possible to create a WITH HOLD cursor by specifying a True value for the withhold parameter to cursor() or by setting the withhold attribute to True before calling execute() on the cursor.
What am I missing?
The code seems reasonable. It's hard to say what the stored procedure (usp_outbound) might be doing though. Your named cursor is being created after the procedure - is that creating it? Is something happening there that might close it? Perhaps the stored procedure needs a WITH HOLD?
Try reorganizing the code to something like this and see if it helps (or you get an error that provides a hint).
with psycopg2.connect("your_connection_string") as _conn:
    # first (unnamed) cursor
    _curs_pgsql = _conn.cursor()
    with _conn.cursor(name='curs', withhold=True) as _curs2_pgsql:
        _curs2_pgsql.itersize = 50000  # fetch rows from database in batches of this value
        _push_date = datetime.now().strftime("%Y-%m-%d")
        _curs_pgsql.callproc('usp_outbound', ['curs'])
        for _row in _curs2_pgsql:
            _sql = "INSERT INTO dbo.tbl VALUES (?, ?, ?)"
            _params = (_row[0], _row[1], _row[2])
            # ...insert into destination database
            # ...now update source database and set the push and push date flags
            _curs_pgsql.execute("UPDATE products SET pushed = TRUE, pushed_date = (%s) WHERE id = (%s)", (_push_date, _row[2],))
    # if there are no errors, connection will automatically
    # commit() here at end of with block (but remain open)
_conn.close()
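The underlying pattern (read in batches, defer the commit until reading has finished) can be tried out without a PostgreSQL server. The sketch below uses the standard-library sqlite3 module purely as a stand-in, since it follows the same DB-API; it does not exercise server-side or WITH HOLD cursors, and the table layout and the iter_batches helper are invented for illustration.

```python
import sqlite3


def iter_batches(cursor, size):
    """Yield lists of up to `size` rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:
            return
        yield rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, pushed INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO products (id) VALUES (?)", [(i,) for i in range(1, 11)])
conn.commit()

read_cur = conn.cursor()
write_cur = conn.cursor()
read_cur.execute("SELECT id FROM products ORDER BY id")

batch_sizes = []
pushed_ids = []
for batch in iter_batches(read_cur, 4):
    batch_sizes.append(len(batch))
    # ...insert the batch into the destination database here...
    pushed_ids.extend(row[0] for row in batch)

# Flag the rows only after reading is complete, then commit once.
write_cur.executemany("UPDATE products SET pushed = 1 WHERE id = ?",
                      [(i,) for i in pushed_ids])
conn.commit()

pushed_count = conn.execute(
    "SELECT count(*) FROM products WHERE pushed = 1").fetchone()[0]
print(batch_sizes, pushed_count)
```

Because no commit happens while the read cursor is still being consumed, the question of whether the cursor survives a commit does not arise in this arrangement.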
Does anyone know how to connect to an AS/400 (iSeries) system from Python and call AS/400 programs with parameters?
For example, how can I create a library on the AS/400 from Python? I want to run "CRTLIB LIB(TEST)" from a Python script.
I am able to connect to the DB2 database through the pyodbc package.
Here is my code to connect to the DB2 database:
import pyodbc
connection = pyodbc.connect(
driver='{iSeries Access ODBC Driver}',
system='ip/hostname',
uid='username',
pwd='password')
c1 = connection.cursor()
c1.execute('select * from libname.filename')
for row in c1:
    print(row)
If your IBM i is set up to allow it, you can call the QCMDEXC stored procedure using CALL in your SQL. For example,
c1.execute("call qcmdexc('crtlib lib(test)')")
The QCMDEXC stored procedure lives in QSYS2 (the actual program object is QSYS2/QCMDEXC1) and does much the same as the familiar program of the same name that lives in QSYS, but the stored procedure is specifically meant to be called via SQL.
Of course, for this example to work, your connection profile has to have the proper authority to create libraries.
It's also possible that your IBM i isn't set up to allow this. I don't know exactly what goes into enabling this functionality, but where I work, we have one partition where the example shown above completes normally, and another partition where I get this instead:
pyodbc.Error: ('HY000', '[HY000] [IBM][System i Access ODBC Driver][DB2 for i5/OS]SQL0901 - SQL system error. (-901) (SQLExecDirectW)')
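If the command text is assembled at run time, any single quote inside it has to be doubled so that it survives as part of the SQL string literal. A small sketch of such a helper follows; qcmdexc_sql is a made-up name, and it assumes the one-argument form of the call shown above works on your system.

```python
def qcmdexc_sql(command):
    """Wrap a CL command in a CALL QCMDEXC statement, doubling single quotes."""
    escaped = command.replace("'", "''")
    return "call qcmdexc('{}')".format(escaped)


print(qcmdexc_sql("crtlib lib(test)"))
print(qcmdexc_sql("crtlib lib(test) text('demo')"))

# Usage with the pyodbc cursor from the question:
# c1.execute(qcmdexc_sql("crtlib lib(test)"))
```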
This gist shows how to connect to an AS/400 via pyodbc:
https://gist.github.com/BietteMaxime/6cfd5b2dc2624c094575
A few notes: in this example, SYSTEM is the DSN you have set up for the AS/400, and it is used in the pyodbc.connect statement. You could also switch to SERVER and PORT with these modifications:
import pyodbc


class CommitMode:
    NONE = 0  # Commit immediate (*NONE) --> QSQCLIPKGN
    CS = 1  # Read committed (*CS) --> QSQCLIPKGS
    CHG = 2  # Read uncommitted (*CHG) --> QSQCLIPKGC
    ALL = 3  # Repeatable read (*ALL) --> QSQCLIPKGA
    RR = 4  # Serializable (*RR) --> QSQCLIPKGL


class ConnectionType:
    READ_WRITE = 0  # Read/Write (all SQL statements allowed)
    READ_CALL = 1  # Read/Call (SELECT and CALL statements allowed)
    READ_ONLY = 2  # Read-only (SELECT statements only)


def connstr(server, port, commit_mode=None, connection_type=None):
    _connstr = 'DRIVER=iSeries Access ODBC Driver;SERVER={server};PORT={port};SIGNON=4;CCSID=1208;TRANSLATE=1;'.format(
        server=server,
        port=port,
    )
    if commit_mode is not None:
        _connstr = _connstr + 'CommitMode=' + str(commit_mode) + ';'
    if connection_type is not None:
        _connstr = _connstr + 'ConnectionType=' + str(connection_type) + ';'
    return _connstr


def main():
    with pyodbc.connect(connstr('myas400.server.com', '8471', CommitMode.CHG, ConnectionType.READ_ONLY)) as db:
        cursor = db.cursor()
        cursor.execute(
            """
            SELECT * FROM IASP.LIB.FILE
            """
        )
        for row in cursor:
            print(' '.join(map(str, row)))


if __name__ == '__main__':
    main()
I cleaned up some PEP-8 as well. Good luck!
I'd like to write a script that runs several SQL commands in a loop. Everything works fine so far, except for deletes.
Script:
#!bin/python3.2
# script to remove batches of obsolete stuff from the tracking DB
#
import sys
import getpass
import platform
import cx_Oracle
# print some infos
print("Python version")
print(("Python version: " + platform.python_version()))
print("cx_Oracle version: " + cx_Oracle.version)
print("Oracle client: " + str(cx_Oracle.clientversion()).replace(', ','.'))
dbconn = cx_Oracle.connect('xxxx','yyyy', '1.2.3.4:1521/xxxRAC')
print ("Oracle DB version: " + dbconn.version)
print ("Oracle client encoding: " + dbconn.encoding)
cleanupAdTaKvpQuery = "delete from TABLE1 where TABLE2_ID < 320745354908598 and rownum <= 5"
getOldRowsQuery = "select count(*) from TABLE2 where ID < 320745354908598"
dbconn.begin()
cursor = dbconn.cursor()
cursor.execute(getOldRowsQuery)
rowCnt = cursor.fetchall()
print("# rows (select before delete): " + str(rowCnt))
try:
    cursor.execute(cleanupAdTaKvpQuery)
    rows = cursor.rowcount
except:
    print("Cleanup Failed.")
cursor.execute(getOldRowsQuery)
rowCnt = cursor.fetchall()
print("# rows (select after delete): " + str(rowCnt))
try:
    dbconn.commit
    print("Success!")
except:
    print("Commit failed " + arg)
dbconn.close
print("# of affected rows:" + str(rows))
As you can see in the output, the script runs fine: the results (see rowCnt) are valid and make sense, and no errors or exceptions are raised. Still, the deletes do not seem to be persisted.
Output:
Python version
Python version: 3.2.3
cx_Oracle version: 5.2
Oracle client: (11.2.0.3.0)
Oracle DB version: 11.2.0.3.0
Oracle client encoding: US-ASCII
# rows (select before delete): [(198865,)]
# rows (select after delete): [(198860,)] <--- the result above decreased by 5!
Success!
# of rows:5
(ayemac_ora_cleanup)marcel#mw-ws:~/scripts/python/virt-envs/ayemac_ora_cleanup$
What am I missing or doing wrong? I tried to debug it with several additional select statements, trying to catch exceptions, etc...
Any help is appreciated! Thank you!
UPDATE:
Fixed, thanks for the hint with the missing brackets!
You are missing the brackets in
dbconn.commit()
Without them, the statement does not raise an exception; it only references the method and does nothing. The same goes for dbconn.close().
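The effect can be reproduced without an Oracle database. The sketch below uses the standard-library sqlite3 module as a stand-in (an assumption made only so the example is runnable anywhere); its in_transaction attribute shows whether a transaction is still open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")  # implicitly opens a transaction

before = conn.in_transaction

conn.commit  # only looks up the bound method; nothing is called, nothing is raised
after_reference = conn.in_transaction

conn.commit()  # actually commits
after_call = conn.in_transaction

print(before, after_reference, after_call)
conn.close()
```

Linters such as pylint or flake8 plugins typically flag a bare attribute access like this as a statement with no effect, which is a cheap way to catch the mistake early.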