I've queried data from an Oracle database using cursor.execute() with a relatively simple SELECT query, and the execute itself works.
But when I try to fetch the data, Python crashes.
The same thing happens with fetchall(), fetchmany() and fetchone().
When the query first broke with fetchmany(), I looped through fetchone() instead; it worked for the first two rows and then broke at the third.
I'm guessing it is because there's too much data in the third row.
Is there any way to work around this issue and pull the data?
(Please ignore the broken indentation; I could not copy the code properly from my phone.)
EDIT:
I removed four columns with type "ROWID". There was no issue after that; I was easily able to fetch 100 rows in one go.
To confirm my suspicion I went ahead and created another copy with only those ROWID columns, and it crashed as expected.
So is there a known issue with the ROWID type?
Test data for the table:
Insert into TEST_FOR_CX_ORACLE (Z$OEX0_LINES,Z$OEX0_ORDER_INVOICES,Z$OEX0_ORDERS,Z$ITEM_ROWID) values ('ABoeqvAEyAAB0HOAAM','AAAL0DAEzAAClz7AAN','AAAVeuABHAAA4vdAAH','ABoeo+AIVAAE6dKAAQ');
Insert into TEST_FOR_CX_ORACLE (Z$OEX0_LINES,Z$OEX0_ORDER_INVOICES,Z$OEX0_ORDERS,Z$ITEM_ROWID) values ('ABoeqvABQAABKo6AAI','AAAL0DAEzAAClz7AAO','AAAVeuABHAAA4vdAAH','ABoeo+AIVAAE6dKAAQ');
Insert into TEST_FOR_CX_ORACLE (Z$OEX0_LINES,Z$OEX0_ORDER_INVOICES,Z$OEX0_ORDERS,Z$ITEM_ROWID) values ('ABoeqvABQAABKo6AAG','AAAL0DAEzAAClz7AAP','AAAVeuABHAAA4vdAAH','ABoeo+AHIAAN+OIAAM');
Insert into TEST_FOR_CX_ORACLE (Z$OEX0_LINES,Z$OEX0_ORDER_INVOICES,Z$OEX0_ORDERS,Z$ITEM_ROWID) values ('ABoeqvAEyAAB0HOAAK','AAAL0DAEzAACl0EAAC','AAAVeuABHAAA4vdAAH','ABoeo+AHIAAN+OIAAM');
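The CREATE TABLE statement was not posted; purely as an illustration, a plausible reconstruction in Python, assuming all four columns use the ROWID type as described above (connection parameters are placeholders):

# Hypothetical reconstruction of the test table: the question says the crash
# only happens with ROWID columns, so all four columns are assumed to be ROWID.
import cx_Oracle

connection = cx_Oracle.connect(user="user", password="password", dsn="dsn")  # placeholders
cursor = connection.cursor()
cursor.execute("""
    CREATE TABLE test_for_cx_oracle (
        z$oex0_lines          ROWID,
        z$oex0_order_invoices ROWID,
        z$oex0_orders         ROWID,
        z$item_rowid          ROWID
    )""")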
Script:
from cx_Oracle import makedsn, connect, Cursor
from pandas import read_sql_table, DataFrame, Series
from time import time

def create_conn(host_link, port, service_name, user_name, password):
    dsn = makedsn(host_link, port, service_name=service_name)
    return connect(user=user_name, password=password, dsn=dsn)

def initiate_connection(conn):
    try:
        dbconnection = create_conn(*conn)
        print('Connected to ' + conn[2] + ' !')
    except Exception as e:
        print(e)
        dbconnection = None
    return dbconnection

def execute_query(query, conn):
    dbconnection = initiate_connection(conn)
    try:
        cursor = dbconnection.cursor()
        print('Cursor Created!')
        return cursor.execute(query)
    except Exception as e:
        print(e)
        return None

start_time = time()
query = '''SELECT * FROM test_for_cx_oracle'''

try:
    cx_read_query = execute_query(query, ecspat_c)
    time_after_execute_query = time()
    print('Query Executed')
    columns = [i[0] for i in cx_read_query.description]
    time_after_getting_columns = time()
except Exception as e:
    print(e)

print(time_after_execute_query - start_time, time_after_getting_columns - time_after_execute_query)
Unfortunately, this is a bug in the Oracle Client libraries. You will see it if you attempt to fetch the same rowid value multiple times in consecutive rows. If you avoid that situation all is well. You can also set the environment variable ORA_OCI_NO_OPTIMIZED_FETCH to the value 1 before you run the query to avoid the problem.
This has been reported earlier here: https://github.com/oracle/python-cx_Oracle/issues/120
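If you go the environment-variable route, a minimal sketch, assuming the variable is set before the first connection is created so the Oracle Client libraries see it (connection parameters are placeholders):

import os

# The variable must be visible to the Oracle Client libraries, so set it
# before the first connection is created (or export it in the shell instead).
os.environ["ORA_OCI_NO_OPTIMIZED_FETCH"] = "1"

import cx_Oracle

connection = cx_Oracle.connect(user="user", password="password", dsn="dsn")  # placeholders
cursor = connection.cursor()
cursor.execute("SELECT * FROM test_for_cx_oracle")
print(cursor.fetchall())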
I'm using Scrapy to collect data and then saving it to Postgres. I have one table named auto_records that I want to completely replace each time data is scraped. It seems like it should not be too difficult, but I'm seeing some weird behavior.
class AutoRecordsPipeline(object):
    def open_spider(self, spider):
        hostname = 'localhost'
        username = 'postgres'
        password = 'xxxxxxx'
        database = 'autos'
        self.connection = psycopg2.connect(host=hostname, user=username, password=password, dbname=database)
        self.cur = self.connection.cursor()
        self.cur.execute("DELETE FROM auto_records")  # MOVED HERE AS PER COMMENTS

    def close_spider(self, spider):
        self.cur.close()
        self.connection.close()

    def process_item(self, item, spider):
        try:
            # self.cur.execute("VACUUM FULL auto_records")
            # self.cur.execute("DELETE FROM auto_records")
            self.cur.execute("INSERT INTO auto_records(make,year,color,miles) VALUES (%s,%s,%s,%s)",
                             (item['make'], item['year'], item['color'], item['miles']))
        except psycopg2.IntegrityError:
            self.connection.rollback()  # was self.conn, which is never defined
        else:
            self.connection.commit()
        return item
Initially I tried VACUUM FULL (commented out above), but I got the error psycopg2.errors.ActiveSqlTransaction: VACUUM cannot run inside a transaction block, so I then tried the current DELETE statement (see the autocommit sketch after the output below). What I see now when I print the table to the console like so
autorecs = pd.read_sql_query('SELECT * FROM "auto_records"', con=engine)
print('from autos db auto_records',autorecs)
is just the last record that gets scraped
from autos db auto_records id make year color miles
0 30 Chevrolet 2019 blue 30157
I don't understand where the 0 30 comes from; it seems like it should be 0 1 or 29 30, since there are 30 records in total. If I comment out the DELETE statement, I get too many records because of the INSERT INTO statement. I don't know whether it has to do with the Scrapy pipeline or something else, but I'm hoping someone has an idea as to what the real issue is... thanks
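For reference, the ActiveSqlTransaction error mentioned above happens because VACUUM cannot run inside the transaction psycopg2 opens implicitly; a minimal sketch of running it with autocommit enabled (connection parameters are the same placeholders as in the pipeline):

import psycopg2

# VACUUM must run outside a transaction block, so enable autocommit
# on the connection before executing it.
conn = psycopg2.connect(host='localhost', user='postgres', password='xxxxxxx', dbname='autos')
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("VACUUM FULL auto_records")
conn.close()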
I'm trying to update customer ("cliente") information in an SQLite database, but there's just one value that won't update. When I try to update the column 'fechaDeuda' (date of debt), it just won't do it. The weird thing is: I've written a function to handle all updates and it works fine in all the other cases. Even when it is called in the 'Problematic line' (see code below), the function executes every statement of the database update and does not throw any exception, but the value in the database still doesn't change.
Here's the code where I update three values on the same row:
try:
    table = 'clientes'
    _cliente = sqGenericSelectOne(table, None, 'id', generic.id)  # This function retrieves one row based on the arguments passed
    clienteID = _cliente[0]
    fecha = date_time.split(' - ')[0]
    sqUpdateOne(table, 'fechaDeuda', fecha, 'id', clienteID)  #### Problematic line (!!!)
    nuevoSaldo = float(_cliente[9]) - _pago
    if nuevoSaldo < 0:
        nuevoSaldo = 0
    sqUpdateOne(table, 'saldo', nuevoSaldo, 'id', clienteID)  # This line works fine
    nuevoTotalPagado = float(_cliente[11]) + _pago
    sqUpdateOne(table, 'totalPagado', nuevoTotalPagado, 'id', clienteID)  # This one works fine too
except Exception as e:
    error(0, e)
**Edit (for some reason this got erased when I posted):** Here's the 'sqUpdateOne' code:
# Sqlite database value updater
def sqUpdateOne(table, column, newValue, refColumn, refValue):
    try:
        with conn:
            c.execute(f'''UPDATE {table} SET {column} = {newValue} WHERE {refColumn} = {refValue}''')
        print()
        return
    except Exception as e:
        error(3, e, table)
        return
And here's how the database table 'clientes' is created, in case it's necessary. The table works fine for everything else. I keep track of it using https://sqliteonline.com/
table = 'clientes'
try:
    c.execute(f'''CREATE TABLE {table} (
        id integer,
        nombre text,
        apellido text,
        valoracion real,
        direccion text,
        tel_1 text,
        tel_2 text,
        correo text,
        cantServicios integer,
        saldo real,
        fechaDeuda text,
        totalPagado real,
        descripcion text)''')
    state(2)
except Exception as e:
    error(2, e, table)
Edit: I'm adding a screenshot of sqUpdateOne when it is called to update 'fecha', so you can see the values it receives (again, it works perfectly fine in all other cases):
And here's a screenshot of the database's content before trying to update it (it remains the same afterwards).
Thanks in advance!
I think you are getting an exception when you execute sqUpdateOne() for the column fechaDeuda because the query string is not quite correct. As it's a text column, you need to enclose the value in quotes.
So change how you are setting the variable fecha to wrap it in quotes:
fecha = f'"{date_time.split(" - ")[0]}"'
As a side note: you could also remove the explicit return statements in sqUpdateOne(), since the function returns anyway after those lines.
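A more robust alternative to quoting the value yourself (just a sketch of the question's helper, not how it currently works) is to pass the new value as a bound parameter, which handles quoting for any column type:

# Sketch of sqUpdateOne using a bound parameter for the new value.
# Table and column names still come from trusted code, since identifiers
# cannot be bound as parameters in sqlite3.
def sqUpdateOne(table, column, newValue, refColumn, refValue):
    try:
        with conn:
            c.execute(f'UPDATE {table} SET {column} = ? WHERE {refColumn} = ?',
                      (newValue, refValue))
    except Exception as e:
        error(3, e, table)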
I am using Google Cloud Functions to connect to a Google BigQuery dataset and update some rows. The cloud function is written in Python 3.
I need help figuring out how to get the result message or the number of updated/changed rows whenever I run an UPDATE DML statement through the function. Any ideas?
from google.cloud import bigquery

def my_update_function(context, data):
    BQ = bigquery.Client()
    query_job = BQ.query("Update table set etc...")
    rows = query_job.result()
    return (rows)
I understand that rows always comes back as an _EmptyRowIterator object. Is there any way I can get the result or a result message? The documentation says I have to get it from a BigQuery job method, but I can't seem to figure it out.
I think you are looking for QueryJob.num_dml_affected_rows. It contains the number of rows affected by an UPDATE or any other DML statement. If you use it in your code instead of rows in the return statement, you will get the number as an int, or you can build a message like:
return("Number of updated rows: " + str(query_job.num_dml_affected_rows))
I hope it will help :)
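Put together with the question's function, a minimal sketch (the UPDATE statement is a placeholder):

from google.cloud import bigquery

def my_update_function(context, data):
    BQ = bigquery.Client()
    query_job = BQ.query("UPDATE mydataset.mytable SET name = 'etc' WHERE id = 1")  # placeholder DML
    query_job.result()  # wait for the job to finish
    # num_dml_affected_rows is populated once the job completes
    return "Number of updated rows: " + str(query_job.num_dml_affected_rows)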
It seems there is no mention of rows returned in the BigQuery Python DB-API documentation: https://googleapis.dev/python/bigquery/latest/reference.html
I decided to work around this issue by running a SELECT statement first to check whether there are any matches for the WHERE clause in the UPDATE statement.
Example:
from google.cloud.bigquery import dbapi as bq

def my_update_function(context, data):
    try:
        bq_connection = bq.connect()
        bq_cursor = bq_connection.cursor()
        bq_cursor.execute("select * from table where ID = 1")
        results = bq_cursor.fetchone()
        if results is None:
            print("Row not found.")
        else:
            bq_cursor.execute("UPDATE table set name = 'etc' where ID = 1")
            bq_connection.commit()
        bq_connection.close()
    except Exception as e:
        db_error = str(e)
I have been using psycopg2 to manage items in my PostgreSQL database. Recently someone suggested that I could improve my database transactions by using asyncio and asyncpg in my code. I have looked around Stack Overflow and read through the documentation for examples. I have been able to create tables and insert records, but I haven't been able to get the execution feedback that I desire.
For example in my psycopg2 code, I can verify that a table exists or doesn't exist prior to inserting records.
def table_exists(self, verify_table_existence, name):
    '''Verifies the existence of a table within the PostgreSQL database'''
    try:
        self.cursor.execute(verify_table_existence, name)
        answer = self.cursor.fetchone()[0]
        if answer == True:
            print('The table - {} - exists'.format(name))
            return True
        else:
            print('The table - {} - does NOT exist'.format(name))
            return False
    except Exception as error:
        logger.info('An error has occurred while trying to verify the existence of the table {}'.format(name))
        logger.info('Error message: {}'.format(error))  # fixed: .format() was outside the call
        sys.exit(1)
I haven't been able to get the same feedback using asyncpg. How do I accomplish this?
import asyncpg
import asyncio

async def main():
    conn = await asyncpg.connect('postgresql://postgres:mypassword@localhost:5432/mydatabase')
    answer = await conn.fetch('''
        SELECT EXISTS (
            SELECT 1
            FROM pg_tables
            WHERE schemaname = 'public'
            AND tablename = 'test01'
        ); ''')
    await conn.close()

    #####################
    # the fetch returns
    # [<Record exists=True>]
    # but prints 'The table does NOT exist'
    #####################

    if answer == True:
        print('The table exists')
    else:
        print('The table does NOT exist')

asyncio.get_event_loop().run_until_complete(main())
You used fetchone()[0] with psycopg2, but just fetch(...) with asyncpg. The former will retrieve the first column of the first row, while the latter will retrieve a whole list of rows. Being a list, it doesn't compare as equal to True.
To fetch a single value from a single row, use something like answer = await conn.fetchval(...).
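Applied to the question's code, a minimal sketch (same placeholder DSN and table name as above):

import asyncio
import asyncpg

async def main():
    conn = await asyncpg.connect('postgresql://postgres:mypassword@localhost:5432/mydatabase')
    # fetchval returns the first column of the first row, like fetchone()[0] in psycopg2
    answer = await conn.fetchval('''
        SELECT EXISTS (
            SELECT 1
            FROM pg_tables
            WHERE schemaname = 'public'
            AND tablename = 'test01'
        )''')
    await conn.close()
    if answer:
        print('The table exists')
    else:
        print('The table does NOT exist')

asyncio.get_event_loop().run_until_complete(main())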
I tried to have my except: clause handle the case where an INSERT violates the UNIQUE constraint, but I ended up with an unexpected error.
The PostgreSQL table already contains the row that I insert with
db.insert("The News","AparnaKumar",1995,234569654)
but the code works fine when inserting rows that don't already exist.
import psycopg2

class database:
    def __init__(self):
        self.con = psycopg2.connect("dbname='book_store' user='postgres' password='5283' host='localhost' port='5432' ")
        self.cur = self.con.cursor()
        self.cur.execute("CREATE TABLE if not exists books(id SERIAL PRIMARY KEY,title TEXT NOT NULL UNIQUE,author TEXT NOT NULL,year integer NOT NULL,isbn integer NOT NULL UNIQUE)")
        self.con.commit()

    def insert(self, title, author, year, isbn):
        try:
            self.cur.execute("INSERT INTO books(title,author,year,isbn) VALUES(%s,%s,%s,%s)", (title, author, year, isbn))
            self.con.commit()
        except:
            print("already exists..")

    def view(self):
        self.cur.execute("SELECT * FROM books")
        rows = self.cur.fetchall()
        print(rows)

    def search(self, title=None, author=None, year=None, isbn=None):
        self.cur.execute("SELECT * FROM books WHERE title=%s or author=%s or year=%s or isbn=%s", (title, author, year, isbn))
        row = self.cur.fetchall()
        print(row)

db = database()
db.insert("The News", "AparnaKumar", 1995, 234569654)
db.view()
db.search(year=1995)
You can either modify your Python function to roll back the transaction, or modify the SQL so that it does not insert a new row if a conflict occurs (PostgreSQL 9.5+):
option 1:
def insert(self, title, author, year, isbn):
    try:
        self.cur.execute("INSERT INTO books(title,author,year,isbn) VALUES(%s,%s,%s,%s)", (title, author, year, isbn))
        self.con.commit()
    except:
        self.con.rollback()
        print("already exists..")
option 2 (works for postgresql versions 9.5 or later):
def insert(self, title, author, year, isbn):
    try:
        self.cur.execute("INSERT INTO books(title,author,year,isbn) VALUES(%s,%s,%s,%s) ON CONFLICT DO NOTHING", (title, author, year, isbn))
        self.con.commit()
    except:
        print("already exists..")
Use a SAVEPOINT.
See the second question in the psycopg FAQ list.
If you want to keep a transaction open, you should create a SAVEPOINT and ROLLBACK to that SAVEPOINT when the exception occurs. But since you use a catch-all except (too broad), this will also happen on other errors, so it can get a bit out of control.
Perhaps a very simple solution is to check with a SELECT whether the record you're about to insert already exists.
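For the SAVEPOINT route, a minimal sketch applied to the question's insert method (the savepoint name is arbitrary, and the except is narrowed to IntegrityError):

def insert(self, title, author, year, isbn):
    # Create a savepoint so a failed INSERT only rolls back this statement,
    # not the whole open transaction.
    self.cur.execute("SAVEPOINT before_insert")
    try:
        self.cur.execute("INSERT INTO books(title,author,year,isbn) VALUES(%s,%s,%s,%s)",
                         (title, author, year, isbn))
    except psycopg2.IntegrityError:
        self.cur.execute("ROLLBACK TO SAVEPOINT before_insert")
        print("already exists..")
    else:
        self.cur.execute("RELEASE SAVEPOINT before_insert")
        self.con.commit()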