Creating a PostgreSQL table using psycopg2 in Python - python-3.x

I am trying to connect to a remote PostgreSQL database using the psycopg2 library in Python. To be clear, I can already do this using psql.exe, but that is not what I want to do here. So far, I have verified that I can connect and use my cursor to perform a simple query on an existing table:
import psycopg2
conn = psycopg2.connect(dbname='mydb', user='postgres', password='mypassword', host='www.mydbserver.com', port='5432', sslmode='require')
cur = conn.cursor()
cur.execute('SELECT * FROM existing_schema.existing_table')
one = cur.fetchone()
print(one)
This connects to an existing schema and table and selects everything. I then fetch the first row from cur and print it. Example output: ('090010100001', '09001', None, 'NO', None, 'NO'). Now, I want to create a new table using this same method. I have already created a new schema called test within mydb. My plan is to copy CSV data to the table later, but for now, I just want to create the blank table. Here's what I have tried:
cur.execute("""
    CREATE TABLE test.new_table
    (
        region TEXT,
        state TEXT,
        tier TEXT,
        v_detailed DOUBLE PRECISION,
        v_approx DOUBLE PRECISION,
        v_unmapped DOUBLE PRECISION,
        v_total DOUBLE PRECISION,
        a_detailed DOUBLE PRECISION,
        a_approx DOUBLE PRECISION,
        a_unmapped DOUBLE PRECISION,
        a_total DOUBLE PRECISION
    )
""")
conn.commit()
When I ran the above in a Jupyter Notebook, I assumed it would be a rather quick process. However, it seemed to get stuck and just ran and ran (the process had not completed after 30+ minutes). Eventually, it threw an error: OperationalError: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request. Surely it shouldn't take that long to run this simple statement? What might I be doing wrong here?
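When a simple DDL statement hangs like this, it is often waiting on a lock held by another session, for example an earlier transaction that was never committed. A minimal diagnostic sketch (not from the original post): open a second connection with the same credentials and inspect pg_stat_activity, which on PostgreSQL 9.6+ includes the wait_event columns:
import psycopg2

# Open a second connection and look for sessions that are blocked, or idle
# in an open transaction, in the same database.
diag = psycopg2.connect(dbname='mydb', user='postgres', password='mypassword',
                        host='www.mydbserver.com', port='5432', sslmode='require')
dcur = diag.cursor()
dcur.execute("""
    SELECT pid, state, wait_event_type, wait_event, query
    FROM pg_stat_activity
    WHERE datname = 'mydb'
""")
for row in dcur.fetchall():
    print(row)
diag.close()
A session reported as 'idle in transaction' against the same table is a likely culprit.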

OK, it turned out there was an issue with how I was using the .copy_from() method in psycopg2; that was what was blocking the CREATE TABLE. This is how I overcame it:
conn = psycopg2.connect(dbname='mydb', user='postgres', password='mypassword', host='www.mydbserver.com', port='5432', sslmode='require')
cur = conn.cursor()
cur.execute("""
    CREATE TABLE test.new_table
    (
        region TEXT,
        state TEXT,
        tier TEXT,
        v_detailed DOUBLE PRECISION,
        v_approx DOUBLE PRECISION,
        v_unmapped DOUBLE PRECISION,
        v_total DOUBLE PRECISION,
        a_detailed DOUBLE PRECISION,
        a_approx DOUBLE PRECISION,
        a_unmapped DOUBLE PRECISION,
        a_total DOUBLE PRECISION
    )
""")
conn.commit()
with open(output_file, 'r') as f:
    next(f)  # Skip the header row.
    # You must set the search_path to the desired schema beforehand.
    cur.execute('SET search_path TO test, public')
    tbl = 'region_report_%s' % (report_type)
    cur.copy_from(f, tbl, sep=',')
conn.commit()
conn.close()
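As an alternative sketch (assuming the same conn, cur, and output_file as above): psycopg2's copy_expert() runs an explicit COPY ... FROM STDIN, which accepts a schema-qualified table name and can skip the CSV header itself, so the SET search_path and next(f) steps are not needed:
with open(output_file, 'r') as f:
    # HEADER makes COPY skip the first row; the table name can be
    # schema-qualified directly in the SQL.
    cur.copy_expert("COPY test.new_table FROM STDIN WITH (FORMAT csv, HEADER)", f)
conn.commit()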

Related

mariadb python - executemany using SELECT

I'm trying to insert many rows into a table in a MariaDB database.
To do this I want to use executemany() to increase speed.
The inserted row depends on another table, which is found with a SELECT.
I have found statements saying that a SELECT doesn't work in an executemany().
Are there other ways to solve this problem?
import mariadb
connection = mariadb.connect(host=HOST, port=PORT, user=USER, password=PASSWORD, database=DATABASE)
cursor = connection.cursor()
query="""INSERT INTO [db].[table1] ([col1], [col2] ,[col3])
VALUES ((SELECT [colX] from [db].[table2] WHERE [colY]=? and
[colZ]=(SELECT [colM] from [db].[table3] WHERE [colN]=?)),?,?)
ON DUPLICATE KEY UPDATE
[col2]= ?,
[col3] =?;"""
values = [input_tuplets]
When running the code I get the same value for [col1] (the SELECT statement) for every row; it corresponds to the values from the first tuple.
If a SELECT doesn't work in an executemany(), is there another workaround for what I'm trying to do?
Thanks a lot!
I think I could read out the tables needed,
do the search in Python,
and use executemany() to insert all the data.
It will require two more queries (to read the two tables) but should be OK when it comes to calculation time.
Thanks for your first question on Stack Overflow, which identified a bug in MariaDB Server.
Here is a simple script to reproduce the problem:
CREATE TABLE t1 (a int);
CREATE TABLE t2 LIKE t1;
INSERT INTO t2 VALUES (1),(2);
Python:
>>> cursor.executemany("INSERT INTO t1 VALUES \
((SELECT a FROM t2 WHERE a=?))", [(1,),(2,)])
>>> cursor.execute("SELECT a FROM t1")
>>> cursor.fetchall()
[(1,), (1,)]
I have filed an issue in the MariaDB bug tracking system.
As a workaround, I would suggest reading the country table once into an array (according to Wikipedia there are 195 different countries) and use these values instead of a subquery.
e.g.
countries = {}
cursor.execute("SELECT country, id FROM countries")
for row in cursor:
    countries[row[0]] = row[1]
and then in executemany
cursor.executemany("INSERT INTO region (region, id_country) VALUES ('south', ?)", [(countries["fra"],), (countries["ger"],)])
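Applied to the original three-table insert, the same idea could look roughly like this sketch; the table and column names are the placeholders from the question, and the assumed shape of input_tuplets (colY value, colN value, col2 value, col3 value) is for illustration only:
import mariadb

connection = mariadb.connect(host=HOST, port=PORT, user=USER,
                             password=PASSWORD, database=DATABASE)
cursor = connection.cursor()

# Read the two lookup tables once into dictionaries.
cursor.execute("SELECT colY, colZ, colX FROM table2")
colx_by_key = {(coly, colz): colx for (coly, colz, colx) in cursor}
cursor.execute("SELECT colN, colM FROM table3")
colm_by_coln = dict(cursor)

# Resolve the nested SELECTs in Python, then insert with a plain executemany().
# input_tuplets is assumed to hold (colY, colN, col2, col3) per row.
rows = [(colx_by_key[(coly, colm_by_coln[coln])], col2, col3, col2, col3)
        for (coly, coln, col2, col3) in input_tuplets]
cursor.executemany(
    "INSERT INTO table1 (col1, col2, col3) VALUES (?, ?, ?) "
    "ON DUPLICATE KEY UPDATE col2 = ?, col3 = ?",
    rows)
connection.commit()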

Special characters are shown as question marks when downloading the data from SQL Server using pyodbc

I'm new to Microsoft SQL Server. I'm using the pyodbc module in Python to insert data into and download data from a SQL Server database table. I've observed that special characters are stored as "?" (question marks).
String I'm loading to DB: 'TV 40″'.
String in the downloaded data from DB: 'TV 40?'
Here is the code to insert the data:
def insert_record(table_name, op_rec):
    #import pdb;pdb.set_trace()
    #try:
    cursor, cnxn = create_conn()
    op_rec = remove_none(op_rec)
    if check_table_exists(cursor, table_name):
        pass
    else:
        fields = tuple(list(op_rec.keys()))
        create_table(cursor, table_name, fields)
        #print("Table created: " + table_name)
    row = list(op_rec.values())
    query = """INSERT INTO %s VALUES (""" % table_name
    query_data = add_parameters(query, tuple(row))
    query_data += ")"
    cursor.execute(query_data)
    cursor.commit()
    cursor.close()
op_rec in the above code is a dictionary.
I've tried changing the character sets of the database using the following methods, but none worked for me:
pyodbc doesn't correctly deal with unicode data
PYODBC corrupts utf8 data (reading from MYSQL information_schema DB)
When I tried the above solutions, I got: Incorrect syntax near '='. (102)
Can you please help with this issue?
Thank you in advance
I'm storing a string, not Unicode characters, so I'm not using nvarchar.
In Python 3, all strings are Unicode.
varchar columns use the single-byte character set (SBCS) defined for the database (or the table, if one is declared specifically for that table). If the (Unicode) character does not exist in the SBCS for that database/table then it is replaced with a question mark.
import pyodbc
cnxn = pyodbc.connect("DSN=mssql_199;UID=scott;PWD=tiger^5HHH")
crsr = cnxn.cursor()
print(
    crsr.execute(
        "SELECT collation_name FROM sys.databases WHERE name='test'"
    ).fetchval()
)
# SQL_Latin1_General_CP1_CI_AS
print(crsr.execute("SELECT CAST('TV 40″' AS varchar(2000))").fetchval())
# TV 40?
TL;DR - You can't store a character in a varchar column if the character set does not support it.
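A minimal sketch of the corresponding fix, reusing the crsr from the example above: declare the column (or cast) as nvarchar and write the literal as Unicode (N'...'), and the character survives the round trip:
# Casting to nvarchar with an N'...' Unicode literal preserves the character.
print(crsr.execute("SELECT CAST(N'TV 40″' AS nvarchar(2000))").fetchval())
# TV 40″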

Inserting Timestamp Into Snowflake Using Python 3.8

I have an empty table defined in Snowflake as:
CREATE OR REPLACE TABLE db1.schema1.table(
    ACCOUNT_ID NUMBER NOT NULL PRIMARY KEY,
    PREDICTED_PROBABILITY FLOAT,
    TIME_PREDICTED TIMESTAMP
);
It creates the correct table, which has been checked using the DESC command in SQL. Then, using the Snowflake Python connector, we are trying to execute the following query:
insert_query = f'INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) VALUES ({accountId}, {risk_score},{ct});'
ctx.cursor().execute(insert_query)
Just before this query the variables are defined. The main challenge is getting the current timestamp written into Snowflake. Here the value of ct is defined as:
import datetime
ct = datetime.datetime.now()
print(ct)
2021-04-30 21:54:41.676406
But when we try to execute this INSERT query we get the following error message:
ProgrammingError: 001003 (42000): SQL compilation error:
syntax error line 1 at position 157 unexpected '21'.
Can I kindly get some help on how to format the datetime value here? Help is appreciated.
In addition to the answer @Lukasz provided, you could also think about defining current_timestamp() as the default for the TIME_PREDICTED column:
CREATE OR REPLACE TABLE db1.schema1.table(
    ACCOUNT_ID NUMBER NOT NULL PRIMARY KEY,
    PREDICTED_PROBABILITY FLOAT,
    TIME_PREDICTED TIMESTAMP DEFAULT current_timestamp
);
And then just insert ACCOUNT_ID and PREDICTED_PROBABILITY:
insert_query = f'INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY) VALUES ({accountId}, {risk_score});'
ctx.cursor().execute(insert_query)
It will automatically assign the insert time to TIME_PREDICTED.
An educated guess: when performing the insert with
insert_query = f'INSERT INTO ...(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) VALUES ({accountId}, {risk_score},{ct});'
it is string interpolation: ct is provided as the string representation of a datetime, which does not match the TIMESTAMP data type, hence the error.
I would suggest using proper variable binding instead:
ctx.cursor().execute(
    "INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES "
    "(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) "
    "VALUES(:1, :2, :3)",
    (accountId,
     risk_score,
     ("TIMESTAMP_LTZ", ct))
)
Avoid SQL Injection Attacks
Avoid binding data using Python’s formatting function because you risk SQL injection. For example:
# Binding data (UNSAFE EXAMPLE)
con.cursor().execute(
    "INSERT INTO testtable(col1, col2) "
    "VALUES({col1}, '{col2}')".format(
        col1=789,
        col2='test string3')
)
Instead, store the values in variables, check those values (for example, by looking for suspicious semicolons inside strings), and then bind the parameters using qmark or numeric binding style.
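For completeness, a minimal sketch of the qmark style with the Snowflake connector, reusing the testtable example above (the paramstyle must be selected before the connection is created):
import snowflake.connector

# qmark binding must be chosen before connecting.
snowflake.connector.paramstyle = 'qmark'

# ... create `con` with snowflake.connector.connect(...) as usual ...

con.cursor().execute(
    "INSERT INTO testtable(col1, col2) VALUES (?, ?)",
    (789, 'test string3')
)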
You forgot to place quotes before and after {ct}. The code should be:
insert_query = "INSERT INTO DATA_LAKE.CUSTOMER.ACT_PREDICTED_PROBABILITIES(ACCOUNT_ID, PREDICTED_PROBABILITY, TIME_PREDICTED) VALUES ({accountId}, {risk_score},'{ct}');".format(accountId=accountId,risk_score=risk_score,ct=ct)
ctx.cursor().execute(insert_query)

Getting a syntax error in a parameterized insert query in Python PostgreSQL

I am trying to send data to PostgreSQL; the data is a tuple of strings, i.e. (time, price).
The problem is that when I send the data using a simple query (not parameterized), it works fine!
The following simple query works perfectly:
cur.execute("INSERT INTO paxos (date,price) VALUES ('2020-04-09 14:39:58.145804', '$1,664.08');")
But those values aren't fixed, so I want to store them in variables and use a parameterized query to send the data. However, the parameterized query isn't working for me. Here is the parameterized query:
cur.execute("INSERT INTO paxos (date, price) values (?, ?)",(time, price))
Here is the complete function I am trying to implement:
def insert_data(time, price):
    con = psycopg2.connect(database="", user="", password="", host="", port="5432")
    print("Database opened successfully")
    cur = con.cursor()
    data_tuple = (time, price)
    cur.execute("insert into paxos (date, price) values (?, ?)", (time, price))
    con.commit()
    print("Record inserted successfully")
    con.close()

insert_data("2020-04-09 14:39:58.145804", "$1,664.08")
Here is the error message:
It seems like this is a Python syntax error. I think you should use the format method of the string object (see the code snippet below).
I can't test it right now, but according to some old code of mine, I always "built" a query string first and then passed the string object to the cursor. Try something like this:
artist = "Aphex Twin"
title = "Windowlicker"
query = '''SELECT EXISTS(SELECT * FROM tracks
                         WHERE artist ILIKE \'{}\' AND
                         title ILIKE \'{}\')'''.format(artist, title)
cursor.execute(query)
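For what it's worth, psycopg2 also supports true parameter binding, which avoids the injection risk of format(); its placeholder style is %s rather than ? (which is what trips up the query in the question). A minimal sketch against the OP's table:
cur.execute(
    "INSERT INTO paxos (date, price) VALUES (%s, %s)",  # psycopg2 uses %s, not ?
    (time, price)
)
con.commit()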

UcanAccess retrieve stored query sql

I'm trying to retrieve the SQL that makes up a stored query inside an Access database.
I'm using a combination of UCanAccess 4.0.2, jaydebeapi, and the UCanAccess console. The ultimate goal is to be able to do the following from a Python script with no user intervention.
When UCanAccess loads, it successfully loads the query:
Please, enter the full path to the access file (.mdb or .accdb): /Users/.../SnohomishRiverEstuaryHydrology_RAW.accdb
Loaded Tables:
Sensor Data, Sensor Details, Site Details
Loaded Queries:
Jeff_Test
Loaded Procedures:
Loaded Indexes:
Primary Key on Sensor Data Columns: (ID)
, Primary Key on Sensor Details Columns: (ID)
, Primary Key on Site Details Columns: (ID)
, Index on Sensor Details Columns: (SiteID)
, Index on Site Details Columns: (SiteID)
UCanAccess>
When I run, from the UCanAccess console, a query like
SELECT * FROM JEFF_TEST;
I get the expected results of the query.
I tried things including this monstrous query from inside a Python script, even using the sysSchema=True option (from here: http://www.sqlquery.com/Microsoft_Access_useful_queries.html):
SELECT DISTINCT MSysObjects.Name,
    IIf([Flags]=0,"Select",
    IIf([Flags]=16,"Crosstab",
    IIf([Flags]=32,"Delete",
    IIf([Flags]=48,"Update",
    IIf([Flags]=64,"Append",
    IIf([Flags]=128,"Union",[Flags])))))) AS Type
FROM MSysObjects INNER JOIN MSysQueries ON MSysObjects.Id = MSysQueries.ObjectId;
But I get an "object not found" or "insufficient privileges" error.
At this point, I've tried mdbtools and can successfully retrieve metadata and data from Access. I just need to get the queries out too.
If anyone can point me in the right direction, I'd appreciate it. Windows is not a viable option.
Cheers, Seth
***********************************
* SOLUTION
***********************************
from jpype import *
startJVM(getDefaultJVMPath(), "-ea", "-Djava.class.path=/Users/seth.urion/local/access/UCanAccess-4.0.2-bin/ucanaccess-4.0.2.jar:/Users/seth.urion/local/access/UCanAccess-4.0.2-bin/lib/commons-lang-2.6.jar:/Users/seth.urion/local/access/UCanAccess-4.0.2-bin/lib/commons-logging-1.1.1.jar:/Users/seth.urion/local/access/UCanAccess-4.0.2-bin/lib/hsqldb.jar:/Users/seth.urion/local/access/UCanAccess-4.0.2-bin/lib/jackcess-2.1.6.jar")
conn = java.sql.DriverManager.getConnection("jdbc:ucanaccess:///Users/seth.urion/PycharmProjects/pyAccess/FE_Hall_2010_2016_SnohomishRiverEstuaryHydrology_RAW.accdb")
for query in conn.getDbIO().getQueries():
    print(query.getName())
    print(query.toSQLString())
If you can find a satisfactory way to call Java methods from within Python then you could use the Jackcess Query#toSQLString() method to extract the SQL for a saved query. For example, I just got this to work under Jython:
from java.sql import DriverManager

def get_query_sql(conn, query_name):
    sql = ''
    for query in conn.getDbIO().getQueries():
        if query.getName() == query_name:
            sql = query.toSQLString()
            break
    return sql

# usage example
if __name__ == '__main__':
    conn = DriverManager.getConnection("jdbc:ucanaccess:///home/gord/UCanAccessTest.accdb")
    query_name = 'Jeff_Test'
    query_sql = get_query_sql(conn, query_name)
    if query_sql == '':
        print '(Query not found.)'
    else:
        print 'SQL for query [%s]:' % (query_name)
        print
        print query_sql
    conn.close()
producing
SQL for query [Jeff_Test]:
SELECT Invoice.InvoiceNumber, Invoice.InvoiceDate
FROM Invoice
WHERE (((Invoice.InvoiceNumber)>1));
