WHERE ... IN clause not formatting in psycopg2 (python-3.x)

I have been having trouble using WHERE $VARIABLE IN clauses in psycopg2:
from app.commons.database import conn
from psycopg2 import sql
from psycopg2.extras import DictCursor
query = '''
    SELECT
        *
    FROM
        {}.{}
    WHERE
        {} in %s
'''.format(
    sql.Identifier('information_schema'),
    sql.Identifier('tables'),
    sql.Identifier('table_schema')
)

data = (
    'information_schema',
    'pg_catalog'
)

with conn.cursor(cursor_factory=DictCursor) as cursor:
    cursor.execute(query, data)
    print(cursor.fetchall())
raises
TypeError: not all arguments converted during string formatting
I've read the seemingly hundreds of posts on this same topic, and the overwhelming answer has been: "you need to use tuples when submitting data as the second argument to cursor.execute". I've been doing that and still can't seem to determine where the gap is.

Check out the psycopg2 documentation on lists adaptation.
You are getting that error because your query contains a single %s placeholder while data holds two values, so not all of the arguments can be converted. Pass the whole list as one parameter and compare against it with = ANY instead. Note also that sql.Identifier only composes correctly inside sql.SQL(...).format(...); plain str.format embeds the object's repr in the query text. Try changing to this:
from app.commons.database import conn
from psycopg2 import sql
from psycopg2.extras import DictCursor

query = sql.SQL('''
    SELECT
        *
    FROM
        {}.{}
    WHERE
        {} = ANY(%s)
''').format(
    sql.Identifier('information_schema'),
    sql.Identifier('tables'),
    sql.Identifier('table_schema')
)

data = [
    'information_schema',
    'pg_catalog'
]  # a list now, instead of a tuple

with conn.cursor(cursor_factory=DictCursor) as cursor:
    cursor.execute(query, (data,))  # a tuple, containing your list
    print(cursor.fetchall())
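If you would rather keep the IN syntax, psycopg2 also adapts a Python tuple to a parenthesized value list, so the original tuple can be passed as a single parameter instead; a minimal sketch of that alternative (query_with_in is just a name for this variant):

query_with_in = sql.SQL('''
    SELECT * FROM {}.{} WHERE {} IN %s
''').format(
    sql.Identifier('information_schema'),
    sql.Identifier('tables'),
    sql.Identifier('table_schema')
)
data = ('information_schema', 'pg_catalog')  # a tuple, adapted to ('...', '...')
with conn.cursor(cursor_factory=DictCursor) as cursor:
    cursor.execute(query_with_in, (data,))  # the tuple itself is the one parameter
    print(cursor.fetchall())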

Related

syntax to guard against SQL-injection of named identifiers

I'm reading the psycopg2 documentation and wondering how to parametrize SQL identifiers, such as table names, by name. Here is an example:
import psycopg2
from psycopg2 import sql

conn = psycopg2.connect()
cursor = conn.cursor()
cursor.execute(
    "SELECT * FROM %(my_table)s LIMIT %(my_limit)s;",
    vars={
        "my_limit": 42,  # parametrizing literals works fine
        "my_table": sql.Identifier("foo"),  # how to do the same with named identifiers?
    }
)
psycopg2.ProgrammingError: can't adapt type 'Identifier'
I know I could use positional parameters (%s or {}), but I would like the query to mix identifiers and literals using a single named mapping.
This did it for me:

import psycopg2
from psycopg2 import sql

conn = psycopg2.connect()
cursor = conn.cursor()
cursor.execute(
    sql.SQL(
        "SELECT * FROM {my_table} LIMIT {my_limit};"
    ).format(
        my_limit=sql.Literal(42),
        my_table=sql.Identifier("foo"),
    ).as_string(conn)
)
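As a variation, execute accepts the composed object directly (no .as_string needed), and you can leave the values as real query parameters so that only identifiers are formatted; a short sketch of that mix:

import psycopg2
from psycopg2 import sql

conn = psycopg2.connect()
cursor = conn.cursor()
# Only the identifier is composed into the query text;
# the limit stays a bound parameter in a named mapping.
cursor.execute(
    sql.SQL("SELECT * FROM {my_table} LIMIT %(my_limit)s;").format(
        my_table=sql.Identifier("foo"),
    ),
    {"my_limit": 42},
)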

Getting SQLCODE=-104 on binding a parameter for DB2 query in Python

Assuming the data.xlsx looks like this:
Column_Name | Table_Name
CUST_ID_1 | Table_1
CUST_ID_2 | Table_2
Here are the SQLs that I'm trying to generate by using the bind_param for db2 in Python:
SELECT CUST_ID_1 FROM TABLE_1 WHERE CUST_ID_1 = 12345
SELECT CUST_ID_2 FROM TABLE_2 WHERE CUST_ID_2 = 12345
And this is how I'm trying to generate this query:
import ibm_db
import pandas as pd

validate_sql = "SELECT ? FROM ? WHERE ?=12345"
validate_stmt = ibm_db.prepare(conn, validate_sql)
df = pd.read_excel("data.xlsx", sheet_name='Sheet1')
for i in df.index:
    ibm_db.bind_param(validate_stmt, 1, df['Column_Name'][i])
    ibm_db.bind_param(validate_stmt, 2, df['Table_Name'][i])
    ibm_db.bind_param(validate_stmt, 3, df['Column_Name'][i])
    ibm_db.execute(validate_stmt)
    validation_result = ibm_db.fetch_both(validate_stmt)
    while validation_result != False:
        print(validation_result[0])
        validation_result = ibm_db.fetch_both(validate_stmt)
When I try to execute this code, I'm hitting a SQLCODE=-104 error.
Any idea how the syntax should be for parameter binding?
Thanks,
Ganesh
There are two major errors:
1. You can't use a parameter marker for a table or column name (the 2nd and 3rd parameters).
2. You must specify the data type of a parameter marker if it can't be inferred from the query (the 1st parameter), using something like CAST(? AS desired-data-type). But that is just for your information, since here you are trying to use the marker as a column name, which is not possible, as described in point 1.
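Since identifiers cannot be bound, one common workaround is to interpolate the column and table names into the SQL text and bind only the value; a minimal sketch, reusing the conn from the question and assuming the names in data.xlsx come from a trusted source (otherwise they must be validated first):

import ibm_db
import pandas as pd

df = pd.read_excel("data.xlsx", sheet_name='Sheet1')
for i in df.index:
    col = df['Column_Name'][i]
    tab = df['Table_Name'][i]
    # Identifiers go into the SQL text; only the value is a parameter marker.
    validate_stmt = ibm_db.prepare(conn, f"SELECT {col} FROM {tab} WHERE {col} = ?")
    ibm_db.execute(validate_stmt, (12345,))
    validation_result = ibm_db.fetch_both(validate_stmt)
    while validation_result != False:
        print(validation_result[0])
        validation_result = ibm_db.fetch_both(validate_stmt)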

PYODBC - Type Error: the first argument to execute must be a string or unicode query

I've been trying to connect to our ERP's ODBC source using pyodbc. Although I got the connection syntax correct, the only error I'm getting at this point is: TypeError: the first argument to execute must be a string or unicode query.
I've tried adding .decode('utf-8').
import pyodbc
import pandas as pd

conn = pyodbc.connect(
    'DRIVER={SQL Server};'
    'SERVER=192.168.1.30;'
    'DATABASE=Datamart;'
    'Trusted_Connection=yes;')
cursor = conn.cursor()
for row in cursor.tables(tableType='TABLE'):
    print(row)

sql = """SELECT * FROM ETL.Dim_FC_UPS_Interface_Detail"""
cursor.execute(row, sql)
df = pd.read_sql(sql, conn)
df.head()
I think the ordering of your arguments to the pyodbc cursor's execute function is off a bit: the query string has to come first, but you are passing row first. See the docs.
cursor = conn.cursor()
sql = """SELECT * FROM ETL.Dim_FC_UPS_Interface_Detail"""
cursor.execute(sql)
for row in cursor:
    print(row)
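And since the end goal appears to be a DataFrame anyway, the cursor can be skipped entirely; a minimal sketch using the connection string from the question:

import pandas as pd
import pyodbc

conn = pyodbc.connect(
    'DRIVER={SQL Server};'
    'SERVER=192.168.1.30;'
    'DATABASE=Datamart;'
    'Trusted_Connection=yes;')
# read_sql runs the query and builds the DataFrame in one step.
df = pd.read_sql("SELECT * FROM ETL.Dim_FC_UPS_Interface_Detail", conn)
print(df.head())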

Binding Teradata Query in python not returning anything

I am trying to automate some routine DB queries via Python and was testing SQL parameterization:
import teradata
import pyodbc
import sys
import pandas as pd
from pandas import DataFrame
import warnings

warnings.filterwarnings('ignore')

udaExec = teradata.UdaExec(appName="HelloWorld", version="1.0",
                           logConsole=False)
session = udaExec.connect(method="odbc", system="db",
                          username="username", password="password")

t = 'user_id'  # dynamic column to be selected
cursor = session.cursor()
# The query below returned only the user_id column:
# >>> sw_overall1
#            0
# 0    user_id
sw_overall1 = cursor.execute("""select distinct ? from
    table""", (t,)).fetchall()
sw_overall1 = DataFrame(sw_overall1)

cursor = session.cursor()
# The query below returned the correct result.
sw_overall2 = cursor.execute("""select distinct user_id from
    table""").fetchall()
Am I doing the binding incorrectly? Without binding, I get the correct output.
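As with the DB2 case above, a ? marker binds a value rather than an identifier, so the first query selects the constant string 'user_id' on every row and distinct collapses it to a single row. A sketch of the usual workaround, assuming the column name in t comes from a trusted source:

cursor = session.cursor()
# Interpolate the (trusted) column name into the query text;
# parameter markers can only carry values, not identifiers.
sw_overall1 = cursor.execute(
    "select distinct {} from table".format(t)
).fetchall()
sw_overall1 = DataFrame(sw_overall1)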

Querying from Microsoft SQL to a Pandas Dataframe

I am trying to write a program in Python 3 that runs a query on a table in Microsoft SQL Server and puts the results into a pandas DataFrame.
My first try was the code below, but for some reason I don't understand, the columns do not appear in the order I listed them in the query, and the order and labels they end up with change between runs, breaking the rest of my program:
import pandas as pd
import pyodbc

result_port_mapl = []

# Use pyodbc to connect to the SQL database
con_string = 'DRIVER={SQL Server};SERVER=' + <server> + ';DATABASE=' + <database>
cnxn = pyodbc.connect(con_string)
cursor = cnxn.cursor()

# Run SQL query
cursor.execute("""
    SELECT <field1>, <field2>, <field3>
    FROM result
""")

# Put data into a list
for row in cursor.fetchall():
    temp_list = [row[2], row[1], row[0]]
    result_port_mapl.append(temp_list)

# Make the list of results into a DataFrame with column names
## FOR SOME REASON HERE row[1] AND row[0] DO NOT CONSISTENTLY APPEAR IN THE
## SAME ORDER AND SO THEY ARE MISLABELLED
result_port_map = pd.DataFrame(result_port_mapl, columns={'<field1>', '<field2>', '<field3>'})
I have also tried the following code:

import pandas as pd
import pyodbc

# Use pyodbc to connect to the SQL database
con_string = 'DRIVER={SQL Server};SERVER=' + <server> + ';DATABASE=' + <database>
cnxn = pyodbc.connect(con_string)
cursor = cnxn.cursor()

# Run SQL query
cursor.execute("""
    SELECT <field1>, <field2>, <field3>
    FROM result
""")

# Put data into a DataFrame
# This becomes one column with a list in it containing the three
# columns divided by a comma
result_port_map = pd.DataFrame(cursor.fetchall())

# Get column headers
# This gives the error "AttributeError: 'pyodbc.Cursor' object has no
# attribute 'keys'"
result_port_map.columns = cursor.keys()
If anyone could explain why either of those errors is happening, or suggest a more efficient way to do this, it would be greatly appreciated.
Thanks
Why not just use read_sql? Like:
import pandas as pd
import pyodbc

con_string = 'DRIVER={SQL Server};SERVER=' + <server> + ';DATABASE=' + <database>
cnxn = pyodbc.connect(con_string)

query = """
    SELECT <field1>, <field2>, <field3>
    FROM result
"""
result_port_map = pd.read_sql(query, cnxn)
result_port_map.columns.tolist()
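If you do want to stay with a raw cursor, pyodbc exposes the column names through cursor.description (there is no cursor.keys()), and the shuffled labels in your first attempt come from passing a set literal {...} as columns, since Python sets have no defined order. A sketch of that route:

cursor.execute("""
    SELECT <field1>, <field2>, <field3>
    FROM result
""")
# cursor.description is a sequence of (name, type_code, ...) tuples;
# the first element of each is the column name, in SELECT order.
columns = [col[0] for col in cursor.description]
result_port_map = pd.DataFrame.from_records(cursor.fetchall(), columns=columns)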
