Python pandas into Azure SQL, bulk insert - python-3.x

How can I arrange a bulk insert of a Python dataframe into the corresponding Azure SQL table?
I see that INSERT works with individual records:
INSERT INTO XX ([Field1]) VALUES (value1);
How can I insert the entire content of the dataframe into the Azure table?
Thanks

According to my test, we can also use to_sql to insert data into Azure SQL. For example:
from urllib.parse import quote_plus
import numpy as np
import pandas as pd
from sqlalchemy import create_engine, event
import pyodbc
# Azure SQL connection string
conn = 'Driver={ODBC Driver 17 for SQL Server};Server=tcp:<server name>.database.windows.net,1433;Database=<db name>;Uid=<user name>;Pwd=<password>;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;'
quoted = quote_plus(conn)
engine = create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))

# enable pyodbc's fast_executemany so rows are sent in batches
@event.listens_for(engine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
    print("FUNC call")
    if executemany:
        cursor.fast_executemany = True

# insert
table_name = 'Sales'
# For the test, I use a csv file to create the dataframe
df = pd.read_csv(r'D:\data.csv')
df.to_sql(table_name, engine, index=False, if_exists='replace', schema='dbo')

# test after inserting
query = 'SELECT * FROM {table}'.format(table=table_name)
dfsql = pd.read_sql(query, engine)
print(dfsql)
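As a side note, for very large dataframes it can help to batch the write; to_sql accepts a chunksize parameter (the value below is only an illustrative guess to tune, not a recommendation):

# Write 10,000 rows per batch instead of one huge executemany call;
# the batch size here is an assumption, adjust it for your table and network
df.to_sql(table_name, engine, index=False, if_exists='replace',
          schema='dbo', chunksize=10000)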

Related

retrieve the columns of SAP HANA tables

I want to get the columns of multiple tables in an SAP HANA database. I am using hdbcli and it gives this error:
hdbcli.dbapi.Error: (362, 'invalid schema name: INFORMATION_SCHEMA: line 1 col 15
Python code:
from hdbcli import dbapi
import pandas as pd

conn = dbapi.connect(
    address="example.hana.trial-us10.hanacloud.ondemand.com",
    port=443,
    user='DBADMIN',
    password='example#xxxxxx'
)

tables = ['table1', 'table2', 'table3']
for table in tables:
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='{table}'")
    print(f"Table '{table}' description:")
    print([column[3] for column in cursor.fetchall()])
    cursor.close()
conn.close()
I need some help to proceed. Thanks
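For what it is worth, SAP HANA does not expose an INFORMATION_SCHEMA; its catalog views live in the SYS schema, with SYS.TABLE_COLUMNS holding column metadata. A minimal sketch of the same loop against that view, reusing the connection from the question (note that HANA stores unquoted identifiers upper-cased, hence the upper()):

tables = ['table1', 'table2', 'table3']
for table in tables:
    cursor = conn.cursor()
    # SYS.TABLE_COLUMNS is HANA's catalog view for column metadata;
    # hdbcli accepts qmark-style (?) placeholders
    cursor.execute(
        "SELECT COLUMN_NAME FROM SYS.TABLE_COLUMNS WHERE TABLE_NAME = ?",
        (table.upper(),)
    )
    print(f"Table '{table}' columns:")
    print([row[0] for row in cursor.fetchall()])
    cursor.close()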

How to load data from a connection string with vaex package?

If I have a table on my server and I produce a connection string to it, how can I load it into a dataframe using Vaex?
Here is what I am doing but with Pandas:
from sqlalchemy import create_engine, text
import pandas as pd
import pymysql

def connect_to_data(driver='mysql+pymysql://', conn_string=''):
    try:
        conn = create_engine(driver + conn_string)
        print("MySQL Connection Successful!")
    except Exception as err:
        print("MySQL Connection Failed!")
        print(err)
    return conn

# Connect to the db:
conn_string = 'xxxxxxxx'
conn = connect_to_data(conn_string=conn_string)

# Get all requests from the db:
query = '''SELECT * FROM table_name'''
result = conn.execute(text(query))

# Desired dataframe:
df = pd.read_sql_query(query, conn)
How can I do the same with Vaex (because of its high performance)?
For now at least, you can't do it directly, but vaex can easily read a pandas dataframe, so you can do the following:
import vaex

# Following your example..
pandas_df = pd.read_sql_query(query, conn)
df = vaex.from_pandas(pandas_df)
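If the table is too big to pull in one go, a hedged variant reads it in chunks and stitches them together on the vaex side (the chunk size is an illustrative assumption):

import pandas as pd
import vaex

# pd.read_sql_query returns an iterator of dataframes when chunksize is set;
# vaex.concat combines the converted pieces into one vaex dataframe
chunks = [vaex.from_pandas(chunk)
          for chunk in pd.read_sql_query(query, conn, chunksize=100000)]
df = vaex.concat(chunks)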

How can I write a function to convert sql query to a dataframe

With a lot of struggle I have somehow written a function that fetches the tables and the data in them, but I got stuck converting the query result to a DataFrame. Any help, guys!
I have tried it this way; any suggestions are welcome and I will learn from them!
import pandas as pd
import pymysql.cursors

class Db_Conn(object):
    def __init__(self):
        connection = pymysql.connect(host='****',
                                     user='****',
                                     password='*****',
                                     db='******',
                                     charset='utf8mb4',
                                     cursorclass=pymysql.cursors.DictCursor)
        self.connection = connection

    def fetchtables(self, query):
        with self.connection.cursor() as cursor:
            if cursor.execute(query):
                for table_name in cursor:
                    print(table_name)
            else:
                a = cursor.fetchall()
                print(a)
I struggled with this part... I want this function to convert an SQL query to a DataFrame:
    def dataframes(self, query):
        with self.connection.cursor() as cursor:
            a = pd.read_sql(query, cursor)
            print(a)
I have created the object as:
db1 = Db_Conn()
# db2 = db1.fetchtables('show tables')
# db3 = db1.fetchtables('select * from **')
df1 = db1.dataframes('select * from ****')
Just use read_sql() like below:
def _get_data(self):
    df = pd.read_sql("select col1,col2 from table_name", connection)
    return df
It will return a dataframe.
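Applied to the class from the question, the fix is to hand read_sql the connection itself rather than a cursor; a minimal sketch of a drop-in replacement for the dataframes method (pandas also accepts a SQLAlchemy engine here, which it prefers for non-sqlite drivers):

    def dataframes(self, query):
        # pd.read_sql expects a connection (or SQLAlchemy engine), not a cursor
        return pd.read_sql(query, self.connection)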
This is how I do it.
import pyodbc
import pandas as pd
cnxn = pyodbc.connect("Driver={SQL Server};SERVER=your_server_name;Database=your_db_name;")
df = pd.read_sql('SELECT * FROM Orders', cnxn)

"No such table" error while loading the .db file in python

I'm trying to read a .db file in Python code, but I get a "no such table" error. However, I can see the table when I import the file into a MySQL DB.
import sqlite3
import pandas as pd

con = None

def getConnection():
    databaseFile = "test.db"
    global con
    if con is None:
        con = sqlite3.connect(databaseFile)
    return con

def queryExec():
    con = getConnection()
    result = pd.read_sql_query("select * from Movie;", con)
    print(result)

queryExec()
I even tried using the absolute path of the .db file, but no luck.
Assuming you're trying to read data from a SQLite database file, here is a simpler way to do it:
import sqlite3
import pandas as pd

con = sqlite3.connect("test.db")
with con:
    df = pd.read_sql("select * from Movie", con)
    print(df)
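If the error persists, check which tables the file actually contains; an empty result usually means the path resolved to a different (often freshly created, empty) database file. A small diagnostic sketch:

import sqlite3

con = sqlite3.connect("test.db")
# sqlite_master is SQLite's built-in catalog of schema objects
tables = con.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)  # an empty list means this file contains no tables at all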

insert using pandas to_sql() missing data into clickhouse db

It's my first time using sqlalchemy and pandas to insert some data into a clickhouse db.
When I insert the data using the ClickHouse CLI it works fine, but when I do the same thing through sqlalchemy, one row goes missing and I don't know why.
Have I done something wrong?
import pandas as pd
from sqlalchemy import create_engine, MetaData
# make_session comes from the ClickHouse SQLAlchemy dialect in use;
# df and uri are defined elsewhere

# created the dataframe
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
metadata.reflect(bind=engine)
conn = engine.connect()
df.to_sql('test', conn, if_exists='append', index=False)
Let's try this way:
import pandas as pd
from infi.clickhouse_orm.engines import Memory
from infi.clickhouse_orm.fields import UInt16Field, StringField
from infi.clickhouse_orm.models import Model
from sqlalchemy import create_engine

# define the ClickHouse table schema
class Test_Humans(Model):
    year = UInt16Field()
    first_name = StringField()
    engine = Memory()

engine = create_engine('clickhouse://default:@localhost/test')

# create table
with engine.connect() as conn:
    conn.connection.create_table(Test_Humans)  # https://github.com/Infinidat/infi.clickhouse_orm/blob/master/src/infi/clickhouse_orm/database.py#L142

pdf = pd.DataFrame.from_records([
    {'year': 1994, 'first_name': 'Vova'},
    {'year': 1995, 'first_name': 'Anja'},
    {'year': 1996, 'first_name': 'Vasja'},
    {'year': 1997, 'first_name': 'Petja'},
    # ! sqlalchemy-clickhouse ignores the last item so add a fake one
    {}
])
pdf.to_sql('test_humans', engine, if_exists='append', index=False)
Take into account that sqlalchemy-clickhouse ignores the last item, so add a fake one (see the source code and related issue 10).
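To verify the workaround, it may be worth reading the row count back after the insert; a quick hedged check against the example table above:

# The count should match the four real rows; the empty dict is the
# sacrificial item that sqlalchemy-clickhouse drops
count = pd.read_sql('SELECT count(*) AS n FROM test_humans', engine)
print(count)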
