psycopg2 named cursor withhold=True - python-3.x

I am using a named cursor to fetch 200K+ rows with the withhold=True attribute so that I can iterate over them by fetching many (50K) at a time - but my cursor is not persisting...
Here is the error / stacktrace:

Traceback (most recent call last):
  File "/home/me/code/etl/etl.py", line 179, in main
    _pg_data = _fetch(_some)
psycopg2.ProgrammingError: named cursor isn't valid anymore

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/me/code/etl/etl.py", line 330, in <module>
    main()
  File "/home/me/code/etl/etl.py", line 271, in main
    logging.error(Fore.LIGHTRED_EX + e + Fore.RESET, exc_info=True)
TypeError: must be str, not ProgrammingError
Here is my code:

from colorama import Fore
from datetime import datetime
import argparse, logging, psycopg2, pyodbc, sys, time, yaml
import _classes.Utils as cpu

def main():
    _cfg_path = "/home/me/code/etl/db.yml"
    with open(_cfg_path, 'r') as _ymlfile:
        _cfg = yaml.load(_ymlfile, Loader=yaml.CLoader)

    # create a connection to the database
    _conn = psycopg2.connect("host={0} dbname={1} user={2} password={3} port={4}".format(
        _cfg['local_postgres']['host'], _cfg['local_postgres']['db'],
        _cfg['local_postgres']['user'], _cfg['local_postgres']['passwd'],
        _cfg['local_postgres']['port']))

    _curs_pgsql = _conn.cursor()
    _curs_pgsql.callproc('usp_outbound', ['curs'])
    _curs2_pgsql = _conn.cursor('curs', withhold=True)

    _push_date = datetime.now().strftime("%Y-%m-%d")
    _some = 50000
    _fetch = _curs2_pgsql.fetchmany

    while True:
        _pg_data = _fetch(_some)
        if not _pg_data:
            break
        for _row in _pg_data:
            _params = ()
            _sql = "INSERT INTO dbo.tbl VALUES (?, ?, ?)"
            _params = (_row[0], _row[1], _row[2])
            # ...insert into destination database
            # ...now update source database and set the push and push date flags
            _curs_pgsql.execute("UPDATE products SET pushed = TRUE, pushed_date = (%s) WHERE id = (%s)", (_push_date, _row[2],))
            _conn.commit()

    if _conn:
        # close cursor / close the communication with the PostgreSQL database server
        _curs2_pgsql.close()
        _curs_pgsql.close()
        _conn.close()
Clearly I am missing something with my named cursor and how it's supposed to be defined...
According to the documentation -
Set the value before calling execute() or use the connection.cursor() withhold parameter, otherwise the value will have no effect.
... ... ...
Trying to fetch from a named cursor after a commit() or to create a named cursor when the connection is in autocommit mode will result in an exception. It is possible to create a WITH HOLD cursor by specifying a True value for the withhold parameter to cursor() or by setting the withhold attribute to True before calling execute() on the cursor.
What am I missing?

The code seems reasonable. It's hard to say what the stored procedure (usp_outbound) might be doing though. Your named cursor is being created after the procedure - is that creating it? Is something happening there that might close it? Perhaps the stored procedure needs a WITH HOLD?
Try reorganizing the code to something like this and see if it helps (or you get an error that provides a hint).
with psycopg2.connect("your_connection_string") as _conn:
    # first (unnamed) cursor
    _curs_pgsql = _conn.cursor()
    with _conn.cursor(name='curs', withhold=True) as _curs2_pgsql:
        _curs2_pgsql.itersize = 50000  # fetch rows from the database in batches of this size
        _push_date = datetime.now().strftime("%Y-%m-%d")
        _curs_pgsql.callproc('usp_outbound', ['curs'])
        for _row in _curs2_pgsql:
            _sql = "INSERT INTO dbo.tbl VALUES (?, ?, ?)"
            _params = (_row[0], _row[1], _row[2])
            # ...insert into destination database
            # ...now update source database and set the push and push date flags
            _curs_pgsql.execute("UPDATE products SET pushed = TRUE, pushed_date = (%s) WHERE id = (%s)", (_push_date, _row[2],))
    # if there are no errors, the connection will automatically
    # commit() here at the end of the with block (but remain open)
_conn.close()
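For reference, the documentation quoted in the question describes two equivalent ways to get a WITH HOLD cursor out of psycopg2. A minimal sketch of both, assuming conn is an open connection and the query and cursor names are placeholders:

# Option 1: ask for WITH HOLD when creating the named cursor
curs = conn.cursor(name='curs1', withhold=True)

# Option 2: set the attribute before the cursor executes anything
curs = conn.cursor(name='curs2')
curs.withhold = True

curs.execute("SELECT * FROM some_table")  # placeholder query
conn.commit()                             # a WITH HOLD cursor survives this
rows = curs.fetchmany(50000)

Note that both options only apply when psycopg2 itself declares the cursor. A cursor declared inside a stored procedure, as in the question, would need WITH HOLD in its own DECLARE statement on the server side.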

Related

Python sqlite3: how to check if connection is an in-memory database?

Using Python's sqlite3 library, how can we determine if a connection belongs to an in-memory database?
import sqlite3

conn = sqlite3.connect(':memory:')

def is_in_memory_connection(conn):
    # How to check if `conn` is an in-memory connection?
Is it possible to check the filename of an on-disk database? If so, I would presume that it would return None for an in-memory database.
This is the closest I could come up with:
import sqlite3

def is_in_memory_connection(conn):
    local_cursor = conn.cursor()
    local_cursor.execute('pragma database_list')
    rows = local_cursor.fetchall()
    print(rows[0][2])
    return rows[0][2] == ''

#database = 'test.sqlite'
database = ':memory:'
conn = sqlite3.connect(database)
result = is_in_memory_connection(conn)
print(result)
If you have an in-memory database, database_list will show the equivalent of this:
sqlite> pragma database_list;
seq name file
--- ---- ----
0 main
If you are opening a file that's on disk, it'll show the path of the file, equivalent to this:
sqlite> pragma database_list;
seq name file
--- ---- --------------------------
0 main /home/testing/test.sqlite
Taking advantage of this, you could call pragma database_list to show the file. If the path is empty, the database is not associated with a file.
https://sqlite.org/pragma.html#pragma_database_list
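A slightly more defensive variant of the same check, matching the 'main' database by name instead of by row position (the helper below is a sketch along those lines, not from the original answer):

import sqlite3

def is_in_memory_connection(conn):
    # 'pragma database_list' yields (seq, name, file) rows; the 'main'
    # database reports an empty file path when it lives in memory.
    for seq, name, filename in conn.execute('pragma database_list'):
        if name == 'main':
            return filename == ''
    return False

print(is_in_memory_connection(sqlite3.connect(':memory:')))     # True
print(is_in_memory_connection(sqlite3.connect('test.sqlite')))  # False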

Unable to insert into mySQL database using python, variables and auto_increment

I hope my formatting is OK, as this is my first time using Stack Overflow.
No matter how I change my code and methods, I keep getting the same error when executing this code:
File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py",
line 83, in call
return bytes(self.params[index]) IndexError: tuple index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "sqlTest.py", line 40, in
mycursor.execute(sql,val) File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 558,
in execute
stmt = RE_PY_PARAM.sub(psub, stmt) File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 86,
in call "Not enough parameters for the SQL statement")
mysql.connector.errors.ProgrammingError: Not enough parameters for the
SQL statement
This is a section of my main project that logs the current values of certain variables as well as the GPS coordinates and a timestamp.
From what I've seen, the main issue has to do with the database expecting 8 entries when I should only need 7.
I mainly followed https://www.w3schools.com/python/python_mysql_insert.asp tutorial as I am not super familiar with using python and mySQL together.
#Initialize mySQL database connection
mydb = mysql.connector.connect(
    host="themySQLserver.net",
    user="myUSername",
    passwd="_____",
    database="24totheTLE"
)
These variables are normally set by the main program but I manually set them for troubleshooting
top_since_epoch = 4
left_since_epoch = 1
bottom_since_epoch = 5
right_since_epoch = 3
This is the code that calls the Python 2 script to get the GPS data:
fullgps = os.popen("python gps.py").read()
gps_split = fullgps.split(";")
gps_split[1] = gps_split[1].rstrip()
s1 = float(gps_split[0])
s2 = float(gps_split[1])
The primary key "LogNum" for my database is set to auto increment and as such I have not mentioned it in my code.
ts = time.time()
timestam = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
mycursor = mydb.cursor()
sql = "INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,GPSLong,GPSLat) VALUES (%s, %s, %s, %s, %s, %s, %s)"
val = [timestam,top_since_epoch,left_since_epoch,bottom_since_epoch,s2,s1]
mycursor.execute(sql,val)
mydb.commit()
print(mycursor.rowcount, "record inserted.")
Thanks to anyone who replies.
Finally found the answer.
The problem was that the code I used to pass the date-time information was not set up to interact with mysql.connector. This was solved by updating the code to work with the MySQLdb library.
This code worked for me:
#Initialize mySQL database connection
mydb = MySQLdb.connect(
    host="sql24.cpt1.host-h.net",
    user="stonesp_1",
    passwd="______",
    database="24totheTLE"
)
print("5 minutes")
fullgps = os.popen("python gps.py").read()
gps_split = fullgps.split(";")
s1 = float(gps_split[0])
s2 = float(gps_split[1])
print(s1)
print(s2)
ts = time.time()
timecur = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
print(timecur)
mycursor = mydb.cursor()
sql = "INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,GPSLong,GPSLat) VALUES (%s,%s,%s,%s,%s,%s,%s)"
val = [timecur,top_since_epoch,left_since_epoch,right_since_epoch,bottom_since_epoch,s2,s1]
mycursor.execute(sql,val)
mydb.commit()
print(mycursor.rowcount, "record inserted.")
mydb.close()
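For what it's worth, the placeholder count also lines up with the original error: the failing version's val list had six items (right_since_epoch was missing) against seven %s markers, which is exactly what "Not enough parameters for the SQL statement" complains about. A minimal guard against that mismatch, reusing the variables from the snippet above (the assert is an illustrative addition, not part of the original post):

sql = "INSERT INTO records (TimeStamp,CarsTop,CarsLeft,CarsRight,CarsBottom,GPSLong,GPSLat) VALUES (%s,%s,%s,%s,%s,%s,%s)"
val = [timecur, top_since_epoch, left_since_epoch, right_since_epoch, bottom_since_epoch, s2, s1]

# Fail fast with a readable message instead of a deep connector traceback
# when the value list and the %s placeholders drift out of sync.
assert len(val) == sql.count("%s"), "expected %d parameters, got %d" % (sql.count("%s"), len(val))

mycursor.execute(sql, val)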

How do I fix this 'no such table' error in my database code?

I am trying to run a database for storing certain data. Unfortunately, when I run my code in the terminal, it throws an error stating "no such table: analytics_sessions". What must I do/implement within my code in order to fix this issue?
I already have connected this code to a database file. Nothing is inside the file but it does exist.
This is my python3 code:
from ._db import db_connect
import contextlib

#
# Game sessions -- measuring how long players play each of our games
# for at a stretch.
#

def post_game_session_info(game_id, start_datetime, duration_sec, user_id):
    # TODO: Procure user_ids so we can link against the 'users' table.
    # TODO: Add client time!
    # TODO: Ensure datetime is stored in UTC.
    with db_connect() as model_db_connection:
        with contextlib.closing(model_db_connection.cursor()) as cursor:
            cursor.execute(
                "insert into analytics_sessions (game_id, start_server_date_time, duration_sec, user_id) "
                "values (?,?,?,?)",
                (game_id, str(start_datetime), duration_sec, user_id)
            )
This is the terminal error message that is thrown when I try running the code:
File "/Users/elliotfayman/Documents/GitHub/whale-beta/model/analytics.py", line 19, in post_game_session_info
(game_id, str(start_datetime), duration_sec, user_id)
sqlite3.OperationalError: no such table: analytics_sessions
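The error itself is explicit: the analytics_sessions table has never been created in the database file the code connects to. A minimal sketch of creating it up front, assuming db_connect() returns a standard sqlite3 connection; the column types are guesses inferred from the insert and the real schema may differ:

from ._db import db_connect

def ensure_analytics_schema():
    # Hypothetical helper: column types below are guesses based on the
    # values the insert binds; adjust them to match the real schema.
    with db_connect() as connection:
        connection.execute(
            "create table if not exists analytics_sessions ("
            "  game_id integer,"
            "  start_server_date_time text,"
            "  duration_sec real,"
            "  user_id integer"
            ")"
        )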

MariaDB, pypyodbc, "Unknown prepared statement handler" executing "SELECT" query on table loaded with "LOAD DATA LOCAL INFILE."

Python 3.4.3, MariaDB 10.0.21, MariaDB ODBC Connector 1.0.0, pypyodbc 1.3.3, all 64-bit on 64-bit Windows 7.
I've got a python script that's supposed to create a table, populate it with data from a fixed-width file, and then run a SELECT statement against it. All simple stuff. My script looks something like this:
import pypyodbc

def do_stuff(name, password, filepath):
    db = pypyodbc.connect(driver = "{MariaDB ODBC 1.0 Driver}",
                          server = "localhost", uid = name,
                          pwd = password, autocommit = True)
    cursor = db.cursor()
    cursor.execute("CREATE TABLE `foo`.`bar` (`col1` INT);")
    cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE `foo`.`bar` (@row) SET col1 = SUBSTR(@row,1,1)" % filepath.replace("\\", "\\\\"))
    for row in cursor.execute("SELECT * FROM `foo`.`bar`"):
        print(row)
    db.close()

do_stuff("root", "password", r"C:\\Users\\laj\\Desktop\\test.txt")
It grabs the first character from each line in the text file and sticks it in the sole column in the table. When the "SELECT" statement comes around, however, I get hit with the following error:
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    do_stuff("root", "oag123", r"C:\\Users\\laj\\Desktop\\test.txt")
  File "test.py", line 21, in do_stuff
    for row in cursor.execute("SELECT * FROM `foo`.`bar`"):
  File "C:\Python34\lib\site-packages\pypyodbc-1.3.3-py3.4.egg\pypyodbc.py", line 1605, in execute
  File "C:\Python34\lib\site-packages\pypyodbc-1.3.3-py3.4.egg\pypyodbc.py", line 1631, in execdirect
  File "C:\Python34\lib\site-packages\pypyodbc-1.3.3-py3.4.egg\pypyodbc.py", line 986, in check_success
  File "C:\Python34\lib\site-packages\pypyodbc-1.3.3-py3.4.egg\pypyodbc.py", line 964, in ctrl_err
pypyodbc.Error: ('HY000', '[HY000] Unknown prepared statement handler (5) given to mysqld_stmt_reset')
What really gets me, though, is that I can get rid of the error simply by closing and reopening the database connection in between populating the table and executing the "SELECT," like so:
import pypyodbc

def do_stuff(name, password, filepath):
    db = pypyodbc.connect(driver = "{MariaDB ODBC 1.0 Driver}",
                          server = "localhost", uid = name,
                          pwd = password, autocommit = True)
    cursor = db.cursor()
    cursor.execute("CREATE TABLE `foo`.`bar` (`col1` INT);")
    cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE `foo`.`bar` (@row) SET col1 = SUBSTR(@row,1,1)" % filepath.replace("\\", "\\\\"))
    db.close()
    db = pypyodbc.connect(driver = "{MariaDB ODBC 1.0 Driver}",
                          server = "localhost", uid = name,
                          pwd = password, autocommit = True)
    cursor = db.cursor()
    for row in cursor.execute("SELECT * FROM `foo`.`bar`"):
        print(row)
    db.close()

do_stuff("root", "password", r"C:\\Users\\laj\\Desktop\\test.txt")
Unfortunately, this isn't actually a valid solution to my problem. Not only is it something I shouldn't have to do, but it also doesn't help when it comes to temporary tables because they just get dropped during the disconnect phase of that "fix." Any insight would be great, this is driving me up a wall.
execute does not return what you think:

cursor.execute("SELECT * FROM `foo`.`bar`")
rows = cursor.fetchall()
for row in rows:
    print(row)
Turned out to be a pypyodbc problem. Installed pyodbc, imported it as pypyodbc, and everything worked as it should.
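A minimal sketch of that swap, assuming pyodbc is installed; because pyodbc exposes the same connect()/cursor() API this script uses, aliasing the import leaves the rest of the code untouched:

# pip install pyodbc
import pyodbc as pypyodbc  # drop-in alias; no other changes needed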

Why doesn't psycopg2 allow us to open multiple server-side cursors in the same connection?

I am curious why psycopg2 doesn't allow opening multiple server-side cursors (http://initd.org/psycopg/docs/usage.html#server-side-cursors) in the same connection. I ran into this problem recently and had to work around it by replacing the second cursor with a client-side cursor. But I still want to know if there is any way to do that.
For example, I have these 2 tables on Amazon Redshift:
CREATE TABLE tbl_account (
    acctid varchar(100),
    regist_day date
);

CREATE TABLE tbl_my_artist (
    user_id varchar(100),
    artist_id bigint
);
INSERT INTO tbl_account
(acctid, regist_day)
VALUES
('TEST0000000001', DATE '2014-11-23'),
('TEST0000000002', DATE '2014-11-23'),
('TEST0000000003', DATE '2014-11-23'),
('TEST0000000004', DATE '2014-11-23'),
('TEST0000000005', DATE '2014-11-25'),
('TEST0000000006', DATE '2014-11-25'),
('TEST0000000007', DATE '2014-11-25'),
('TEST0000000008', DATE '2014-11-25'),
('TEST0000000009', DATE '2014-11-26'),
('TEST0000000010', DATE '2014-11-26'),
('TEST0000000011', DATE '2014-11-24'),
('TEST0000000012', DATE '2014-11-24')
;
INSERT INTO tbl_my_artist
(user_id, artist_id)
VALUES
('TEST0000000001', 2000011247),
('TEST0000000001', 2000157208),
('TEST0000000001', 2000002648),
('TEST0000000002', 2000383724),
('TEST0000000003', 2000002546),
('TEST0000000003', 2000417262),
('TEST0000000004', 2000076873),
('TEST0000000004', 2000417266),
('TEST0000000005', 2000077991),
('TEST0000000005', 2000424268),
('TEST0000000005', 2000168784),
('TEST0000000006', 2000284581),
('TEST0000000007', 2000284581),
('TEST0000000007', 2000000642),
('TEST0000000008', 2000268783),
('TEST0000000008', 2000284581),
('TEST0000000009', 2000088635),
('TEST0000000009', 2000427808),
('TEST0000000010', 2000374095),
('TEST0000000010', 2000081797),
('TEST0000000011', 2000420006),
('TEST0000000012', 2000115887)
;
I want to select from those 2 tables, then do something with query result.
I use 2 server-side cursors because I need 2 nested loops in my query. I want to use server-side cursor because the result can be very huge.
I use fetchmany() instead of fetchall() because I'm running on a single-node cluster.
Here is my code:
import psycopg2
from psycopg2.extras import DictCursor

conn = psycopg2.connect('connection parameters')

cur1 = conn.cursor(name='cursor1', cursor_factory=DictCursor)
cur2 = conn.cursor(name='cursor2', cursor_factory=DictCursor)

cur1.execute("""SELECT acctid, regist_day FROM tbl_account
                WHERE regist_day <= '2014-11-25'
                ORDER BY 1""")

for record1 in cur1.fetchmany(50):
    cur2.execute("""SELECT user_id, artist_id FROM tbl_my_artist
                    WHERE user_id = '%s'
                    ORDER BY 1""" % (record1["acctid"]))
    for record2 in cur2.fetchmany(50):
        print '(acctid, artist_id, regist_day): (%s, %s, %s)' % (
            record1["acctid"], record2["artist_id"], record1["regist_day"])
        # do something with these values

conn.close()
When running, I got an error:
Traceback (most recent call last):
  File "C:\Users\MLD1\Desktop\demo_cursor.py", line 20, in <module>
    for record2 in cur2.fetchmany(50):
  File "C:\Python27\lib\site-packages\psycopg2\extras.py", line 72, in fetchmany
    res = super(DictCursorBase, self).fetchmany(size)
InternalError: opening multiple cursors from within the same client connection is not allowed.
That error occurred at line 20, when I tried to fetch results from the second cursor.
An answer four years later, but it is possible to have more than one cursor open from the same connection. (It may be that the library was updated to fix the problem above.)
The caveat is that you are only allowed to call execute() once on a named cursor, so if you reuse one of the cursors in the fetchmany loop you'd need to either remove the name or create another "anonymous" cursor.
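A minimal sketch of that pattern, adapted from the question's code; the unique inner cursor names and the parametrized query are illustrative adjustments, not part of the original answer:

import psycopg2
from psycopg2.extras import DictCursor

conn = psycopg2.connect('connection parameters')

cur1 = conn.cursor(name='cursor1', cursor_factory=DictCursor)
cur1.execute("SELECT acctid, regist_day FROM tbl_account ORDER BY 1")

for i, record1 in enumerate(cur1.fetchmany(50)):
    # A named cursor only allows a single execute(), so declare a fresh
    # server-side cursor with a unique name for each inner query.
    cur2 = conn.cursor(name='cursor2_%d' % i, cursor_factory=DictCursor)
    cur2.execute("SELECT user_id, artist_id FROM tbl_my_artist "
                 "WHERE user_id = %s ORDER BY 1", (record1["acctid"],))
    for record2 in cur2.fetchmany(50):
        print(record1["acctid"], record2["artist_id"], record1["regist_day"])
    cur2.close()

conn.close()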
