How to execute a query with simple vertical partitioning in SQLAlchemy using Python (python-3.x)

I am trying to work with multiple databases and schemas using simple vertical partitioning in SQLAlchemy and Python.
I have created two database engines and configured them successfully with sessionmaker():
Session = sessionmaker()
Session.configure(binds={BaseA: engine1, BaseB: engine2})
The required SQL query is generated successfully:
driverssql = session.query(drivers)
But when I execute the above query to fetch the results, I get the following error:
resultset = session.execute(driverssql)
sqlalchemy.exc.UnboundExecutionError: Could not locate a bind configured on SQL expression or this Session
How can I associate the correct engine with the execute statement?

I see two options here:
You can create two sessionmakers and use each one with its own engine (see the sketch below).
You can choose the necessary engine when executing a query:
engine1 = create_engine(first_db)
engine2 = create_engine(second_db)
session.execute(driverssql.statement, bind=engine1)
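A minimal sketch of the first variant, with illustrative connection URLs and the drivers model assumed to be mapped on BaseA: one sessionmaker per engine, and each session only queries the models that live on its engine.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine1 = create_engine("postgresql://user:pass@host1/db_a")  # holds the BaseA tables
engine2 = create_engine("postgresql://user:pass@host2/db_b")  # holds the BaseB tables

SessionA = sessionmaker(bind=engine1)  # sessions for BaseA models
SessionB = sessionmaker(bind=engine2)  # sessions for BaseB models

session_a = SessionA()
resultset = session_a.query(drivers).all()  # runs on engine1, no UnboundExecutionError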

Related

Cosmos DB spatial query using Spark

I would like to query a Cosmos DB collection using a spatial query, specifically the ST_DISTANCE query. This query works as intended using the azure-cosmos Python SDK.
I am looking to use this query via Apache Spark for a more complex query pattern. However, using the ST_DISTANCE query in a SQL cell in a notebook results in the following error:
Error in SQL statement: AnalysisException: Undefined function: 'ST_DISTANCE'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.
The notebook is initialized as follows:
# Configure Catalog Api to be used
spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", cosmosEndpoint)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", cosmosMasterKey)
from pyspark.sql.functions import col
df = spark.read.format("cosmos.oltp").options(**cfg)\
.option("spark.cosmos.read.inferSchema.enabled", "true")\
.load()
df.createOrReplaceTempView("outlets")
%sql
SELECT * FROM outlets f WHERE ST_DISTANCE(f.boundary, POINT(0,0)) < 600
Based on what I understand from the Cosmos DB Spark connector GitHub repo[1], not all Cosmos DB filter queries are supported via the connector (yet?). So ST_DISTANCE and the other filter functions in the spatial family aren't going to work, as those aren't predicates that Spark natively supports pushing down to the database.
I found something that helps get past this issue, at least temporarily. The query config[2] allows sending a custom query directly to Cosmos DB; a temporary view can be built over the result and queried. This will not work for all use cases, but it solved my case, where I need a single view with the distance filtering already done. The rest can be handled via Spark SQL.
Refer to spark.cosmos.read.customQuery[2] in the sample below.
outlets_cfg = {
"spark.cosmos.accountEndpoint" : cosmosEndpoint,
"spark.cosmos.accountKey" : cosmosMasterKey,
"spark.cosmos.database" : cosmosDatabaseName,
"spark.cosmos.container" : cosmosContainerName,
"spark.cosmos.read.customQuery" : "SELECT * FROM c WHERE ST_DISTANCE(c.location,{\"type\":\"Point\",\"coordinates\": [12.832489, 18.9553242]}) < 1000"
}
df = spark.read.format("cosmos.oltp").options(**outlets_cfg)\
.option("spark.cosmos.read.inferSchema.enabled", "true")\
.load()
df.createOrReplaceTempView("outlets")
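With the distance filter already applied by Cosmos DB via the custom query, the remaining logic can run as ordinary Spark SQL over the temporary view. A minimal sketch (any further columns or filters you add here are illustrative, not from the original container):
nearby = spark.sql("SELECT * FROM outlets")
nearby.show()  # rows were already limited to ST_DISTANCE(...) < 1000 by the custom query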
[1] https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3-1_2-12/
[2] https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md#query-config

How to execute Snowflake Stored Procedure from Python?

I have created a stored procedure in Snowflake which executes fine in the Snowflake UI and also from the server using SnowSQL. Now I want to execute the procedure from a Python program. Here are the steps I followed:
Establish the connection to Snowflake (successfully able to connect):
cs = ctx.cursor()
Used the appropriate role, warehouse, database and schema.
Tried to execute the procedure like this:
cs.execute("call test_proc('value1', 'value2')")
x = cs.fetchall()
print(x)
But I am getting an error:
snowflake.connector.errors.ProgrammingError: 002140 (42601): SQL compilation error: Unknown function test_proc
Can you please help me resolve this problem?
Thanks,
When connecting to Snowflake using the Python connector, you can define the DATABASE/SCHEMA:
import snowflake.connector

conn = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    warehouse=WAREHOUSE,
    database=DATABASE,
    schema=SCHEMA
)
Once you have it set up, you can call your stored procedure without using a fully-qualified name:
cs.execute("call test_proc('value1', 'value2')")
An alternative way is:
Using the Database, Schema, and Warehouse
Specify the database and schema in which you want to create tables. Also specify the warehouse that will provide resources for executing DML statements and queries.
For example, to use the database testdb, schema testschema and warehouse tiny_warehouse (created earlier):
conn.cursor().execute("USE WAREHOUSE tiny_warehouse_mg")
conn.cursor().execute("USE DATABASE testdb_mg")
conn.cursor().execute("USE SCHEMA testdb_mg.testschema_mg")
Actually, I had to use a command like this:
cs.execute("call yourdbname.schemaname.test_proc('value1', 'value2')")
and it is working as expected.
Thanks
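Putting the pieces together, a minimal sketch of the whole flow; the connection parameters and procedure name are placeholders, and the fully qualified name is what avoids the "Unknown function" error when the session context doesn't match the procedure's database/schema:
import snowflake.connector

ctx = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    warehouse=WAREHOUSE,
    database=DATABASE,
    schema=SCHEMA,
)
cs = ctx.cursor()
try:
    # Fully qualified call works regardless of the current database/schema.
    cs.execute("call yourdbname.schemaname.test_proc('value1', 'value2')")
    print(cs.fetchall())
finally:
    cs.close()
    ctx.close()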

Multiple Aggregations Across Partition Keys Only Work From the Web-Based Azure Data Explorer But Not From the Python Client Library

I'm working with the Azure Cosmos DB Core (SQL) API. Using the web-based Data Explorer, I'm able to successfully run a query that has multiple aggregations across multiple partitions. Yet when I attempt to run such a query from my shell using the Azure Cosmos client library for Python (pip install azure-cosmos==4.0.0), I get an error message. I've tried two variations of the query, one including the partition key and one not. Both queries returned the same error message.
container = database.get_container_client('some_container')
query1 = "select c.fmonth, c.fquarter, c.fyear, sum(c.revenue) as actual_revenue__sum, sum(c.predicted_revenue_m1) as predicted_revenue__sum from c where c.fyear=2020 group by c.fmonth, c.fquarter, c.fyear"
query2 = "select c.fmonth, c.fquarter, c.fyear, sum(c.revenue) as actual_revenue__sum, sum(c.predicted_revenue_m1) as predicted_revenue__sum from c where c.date_start >='2020-01-01' and c.date_start < '2021-01-01' group by c.fmonth, c.fquarter, c.fyear"
res = container.query_items(query1, enable_cross_partition_query=True)
Error Message Returned:
CosmosHttpResponseError: (BadRequest) Message: {"Errors":["Cross partition query only supports 'VALUE ' for aggregates."]}
ActivityId: 4961b99e-7032-4eac-ae84-2c8cab03a496, Microsoft.Azure.Documents.Common/2.11.0
The query errors out for aggregates on multiple partitions when enable_cross_partition_query is set to True but no VALUE keyword is present.
If you want to use aggregates on multiple partitions, use SELECT VALUE COUNT(1) FROM c; this will work.
But the Python SDK doesn't support GROUP BY for now.
Refer to this document: the .NET SDK and JS SDK support GROUP BY; other SDKs will support it later.
Hope this can help you.
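A minimal sketch of the VALUE-style workaround under that limitation, reusing the container client and fields from the question (the single-aggregate query shape is an assumption about what you need):
# Single aggregate wrapped in VALUE: supported cross-partition by the Python SDK.
value_query = "SELECT VALUE SUM(c.revenue) FROM c WHERE c.fyear = 2020"
total_revenue = list(container.query_items(
    value_query,
    enable_cross_partition_query=True,
))[0]
print(total_revenue)  # the GROUP BY breakdown has to be done client-side for now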

How to fetch raw SQL insert/update from the SQLAlchemy ORM

I was trying to dump my PostgreSQL database, created via SQLAlchemy, using a Python script. I have successfully created the database, and all the data get inserted via web parsing into the ORM models I have mapped. But when I try to take a dump of all my insert queries using this:
import logging
from sqlalchemy import Table, MetaData
from sqlalchemy.dialects import postgresql

tab = Table(table.__tablename__, MetaData())
x = tab.insert().compile(
    dialect=postgresql.dialect(),
    compile_kwargs={"literal_binds": True},
)
logging.info(f"{x}")
I am adding values using the ORM like this:
for value in vertex_type_values:
    data = table(
        Type=value["type"],
        Name=value["name"],
        SizeX=value["size_x"],
        SizeY=value["size_y"],
        SizeZ=value["size_z"],
    )
    session.add(data)
session.commit()
Here table is the model which I have designed and imported from my local library, and vertex_type_values is the data I have extracted and yielded in my script.
I am getting the output:
INSERT INTO <tablename> DEFAULT VALUES
So my question is: how do I get rid of DEFAULT VALUES and get the actual values, so that I can directly replay the insert commands if my DB ever crashes? I need the raw SQL for the insert commands.
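For what it's worth, a minimal sketch of the likely cause: tab.insert() carries no values, so it compiles to DEFAULT VALUES. Compiling an insert built from the mapped table with the same row values (reusing the table model and value dict from above) gives literal_binds something to render:
from sqlalchemy.dialects import postgresql

stmt = table.__table__.insert().values(
    Type=value["type"],
    Name=value["name"],
    SizeX=value["size_x"],
    SizeY=value["size_y"],
    SizeZ=value["size_z"],
)
# Renders e.g. INSERT INTO <tablename> (...) VALUES ('...', ...) with literal values.
print(stmt.compile(dialect=postgresql.dialect(),
                   compile_kwargs={"literal_binds": True}))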

How can I print actual query generated by SQLAlchemy?

I'm trying to log all SQLAlchemy queries to the console while parsing the query and filling in the parameters (e.g. translating :param_1 to 123). I managed to find this answer on SO that does just that. The issue I'm running into is that parameters don't always get translated.
Here is the event I'm latching onto:
@event.listens_for(Engine, 'after_execute', named=True)
def after_cursor_execute(**kw):
    conn = kw['conn']
    params = kw['params']
    result = kw['result']
    stmt = kw['clauseelement']
    multiparams = kw['multiparams']
    print(literalquery(stmt))
Running this query will fail to translate my parameters; instead, I'll see :param_1 in the output:
Model.query.get(123)
It yields a CompileError exception with the message Bind parameter '%(38287064 param)s' without a renderable value not allowed here.
However, this query will translate :param_1 to 123 as I would expect:
db.session.query(Model).filter(Model.id == 123).first()
Is there any way to translate any and all queries that are run using SQLAlchemy?
FWIW, I'm targeting SQL Server using the pyodbc driver.
If you set up the logging framework, you can get the SQL statements logged by setting the sqlalchemy.engine logger at INFO level, e.g.:
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
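As a usage note, the same logging can also be switched on per engine; a minimal sketch with a placeholder connection URL:
from sqlalchemy import create_engine

# echo=True enables the same sqlalchemy.engine INFO logging for this engine only;
# echo="debug" raises it to DEBUG, which also logs result rows.
engine = create_engine("mssql+pyodbc://user:pass@my_dsn", echo=True)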
