Python: sqlalchemy - map result only

I have some SQL to run that is not single-table based. Below is one example (on SQLite):
SELECT C.REGION METRICSCOPE, C.METRIC METRICOPTION, ROUND(1.0*C.COUNT/T.COUNT, 4) Percentage, T.COUNT COUNT
FROM
(SELECT REGION, $metric METRIC, COUNT(*) COUNT
FROM TICKET T, USER U
WHERE T.ASSIGNEDTO = U.USERNAME
AND ASOF BETWEEN '$startDate' AND '$endDate'
GROUP BY REGION, $metric ) C,
(SELECT REGION, COUNT(*) COUNT
FROM TICKET T, USER U
WHERE ASOF BETWEEN '$startDate' AND '$endDate'
AND T.ASSIGNEDTO = U.USERNAME
GROUP BY region) T
WHERE C.REGION = T.REGION
I want to run the SQL, map the result to a class, then jsonify the class objects and return them to my webpage.
It seems to me that SQLAlchemy uses table-based mapping (each class needs to define a __tablename__), which is not suitable for my case.
Is it possible to map the result only? I would appreciate an example.
Thanks
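Yes. One way is to skip table-based mapping entirely: execute the raw SQL with SQLAlchemy's text() construct and map each row onto a plain dataclass. A minimal sketch, where the engine URL, the concrete STATUS column standing in for $metric, and the example dates are all assumptions chosen purely for illustration:
from dataclasses import dataclass, asdict
from sqlalchemy import create_engine, text

@dataclass
class RegionMetric:
    metricscope: str
    metricoption: str
    percentage: float
    count: int

engine = create_engine('sqlite:///tickets.db')  # URL assumed for illustration

sql = text("""
    SELECT C.REGION METRICSCOPE, C.METRIC METRICOPTION,
           ROUND(1.0*C.COUNT/T.COUNT, 4) PERCENTAGE, T.COUNT COUNT
    FROM (SELECT REGION, STATUS METRIC, COUNT(*) COUNT
          FROM TICKET T, USER U
          WHERE T.ASSIGNEDTO = U.USERNAME
            AND ASOF BETWEEN :startDate AND :endDate
          GROUP BY REGION, STATUS) C,
         (SELECT REGION, COUNT(*) COUNT
          FROM TICKET T, USER U
          WHERE ASOF BETWEEN :startDate AND :endDate
            AND T.ASSIGNEDTO = U.USERNAME
          GROUP BY REGION) T
    WHERE C.REGION = T.REGION
""")

with engine.connect() as conn:
    rows = conn.execute(sql, {"startDate": "2020-01-01", "endDate": "2020-12-31"})
    results = [RegionMetric(*row) for row in rows]

# In a Flask view you could then: return jsonify([asdict(r) for r in results])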

Related

ATHENA/PRESTO complex query with multiple unnested tables

I would like to create a join over several tables.
table login: I would like to retrieve all the data from login
table logging: calculating the Nb_of_sessions for each db & for a specific event type, per user
table meeting: calculating the Nb_of_meetings for each db & per user
table live: calculating the Nb_of_live for each db & per user
I have those queries with the right results :
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid;
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid;
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id;
But when I try to put it all together, it seems I retrieve bad data (only one db is retrieved) and it does not seem efficient.
select a1.db.id,a._id as userid,a.firstname,a.lastname,count(rl._id) as nb_chat
FROM
"logins"."login" a,
"loggings"."logging" b,
"meetings"."meeting" c,
"lives"."live" d,
UNNEST(dbs) AS a1 (db),
UNNEST(users) AS r1 (user)
where a._id = b.userid AND a._id = c.userid AND a._id = r1.user._id
group by 1,2,3,4
Do you have an idea?
Regards.
The easiest way is to use WITH to structure the subqueries and then reference them.
From the documentation on WITH:
You can use WITH to flatten nested queries, or to simplify subqueries.
The WITH clause precedes the SELECT list in a query and defines one or
more subqueries for use within the SELECT query.
Each subquery defines a temporary table, similar to a view definition,
which you can reference in the FROM clause. The tables are used only
when the query runs.
Since you already have working subqueries, the following should work (note that db.id is aliased to dbid in the logins CTE so the final joins can reference l.dbid):
with logins as
(
SELECT db.id AS dbid, _id AS userid, firstname, lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
)
,visits as
(
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by dbid, userid
)
,meetings as
(
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid
)
,chats as
(
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id
)
select *
from logins l
left join visits v
on l.dbid = v.dbid
and l.userid = v.userid
left join meetings m
on l.dbid = m.dbid
and l.userid = m.userid
left join chats c
on l.dbid = c.dbid
and l.userid = c.userid;

How to apply DISTINCT on only date part of datetime field in sqlalchemy python?

I need to query my database and return the result by applying Distinct on only date part of datetime field.
My code is:
@blueprint.route('/<field_id>/timeline', methods=['GET'])
@blueprint.response(field_timeline_paged_schema)
def get_field_timeline(
    field_id,
    page=1,
    size=10,
    order_by=['capture_datetime desc'],
    **kwargs
):
    session = flask.g.session
    field = fetch_field(session, parse_uuid(field_id))
    if field:
        query = session.query(
            func.distinct(cast(Capture.capture_datetime, Date)),
            Capture.capture_datetime.label('event_date'),
            Capture.tags['visibility'].label('visibility')
        ).filter(Capture.field_id == parse_uuid(field_id))
        return paginate(
            query=query,
            order_by=order_by,
            page=page,
            size=size
        )
However this returns the following error:
(psycopg2.errors.InvalidColumnReference) for SELECT DISTINCT, ORDER BY expressions must appear in select list
The resulting query is:
SELECT distinct(CAST(tenant_resson.capture.capture_datetime AS DATE)) AS distinct_1, CAST(tenant_resson.capture.capture_datetime AS DATE) AS event_date, tenant_resson.capture.tags -> %(tags_1)s AS visibility
FROM tenant_resson.capture
WHERE tenant_resson.capture.field_id = %(field_id_1)s
Error is:
Query error - {'error': ProgrammingError('(psycopg2.errors.InvalidColumnReference) SELECT DISTINCT ON expressions must match initial ORDER BY expressions\nLINE 2: FROM (SELECT DISTINCT ON (CAST(tenant_resson.capture.capture...\n ^\n',)
How to resolve this issue? Cast is not working for order_by.
I am not familiar with SQLAlchemy, but the resulting query below works as you expect. Please note the DISTINCT ON.
Maybe there is a way in SQLAlchemy to execute non-trivial parameterized queries? That would give you the extra benefit of being able to test and optimize the query upfront.
SELECT DISTINCT ON (CAST(tenant_resson.capture.capture_datetime AS DATE))
CAST(tenant_resson.capture.capture_datetime AS DATE) AS event_date,
tenant_resson.capture.tags -> %(tags_1)s AS visibility
FROM tenant_resson.capture
WHERE tenant_resson.capture.field_id = %(field_id_1)s;
You can order by event_date if your business logic needs.
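If it helps, a sketch of executing that hand-written statement from SQLAlchemy with bound parameters (assuming SQLAlchemy 1.4+ for .mappings(); session and field_id as in the question):
from sqlalchemy import text

stmt = text("""
    SELECT DISTINCT ON (CAST(capture_datetime AS DATE))
           CAST(capture_datetime AS DATE) AS event_date,
           tags -> :tag_key AS visibility
    FROM tenant_resson.capture
    WHERE field_id = :field_id
    ORDER BY CAST(capture_datetime AS DATE)  -- DISTINCT ON needs a matching leading ORDER BY
""")
rows = session.execute(stmt, {"tag_key": "visibility", "field_id": field_id}).mappings().all()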
The query posted by @Stefanov.sm is correct. In SQLAlchemy terms it would be
query = (
    session.query(
        Capture.capture_datetime.label('event_date'),
        Capture.tags['visibility'].label('visibility')
    ).distinct(cast(Capture.capture_datetime, Date))
    .filter(Capture.field_id == parse_uuid(field_id))
)
See the docs for more information
I needed to add order_by to my query: Postgres requires the SELECT DISTINCT ON expressions to match the leftmost ORDER BY expressions. Now it works fine.
query = session.query(
    cast(Capture.capture_datetime, Date).label('event_date'),
    Capture.tags['visibility'].label('visibility')
).filter(Capture.field_id == parse_uuid(field_id)) \
 .distinct(cast(Capture.capture_datetime, Date)) \
 .order_by(cast(Capture.capture_datetime, Date).desc())

Python/Peewee query with fn.MAX and alias results in "no such attribute"

I have a peewee query that looks like this:
toptx24h = (Transaction
            .select(fn.MAX(Transaction.amount).alias('amount'), User.user_name)
            .join(User, on=(User.wallet_address == Transaction.source_address))
            .where(Transaction.created > past_dt)
            .limit(1))
My understanding is this should be equivalent to:
select MAX(t.amount) as amount, u.user_name from transaction t inner join user u on u.wallet_address = t.source_address where t.created > past_dt limit 1
My question is: how do I access the resulting user_name and amount?
When I try this, I get an error saying top has no attribute named amount:
for top in toptx24h:
    top.amount  # No such attribute 'amount'
I'm just wondering how I can access the amount and user_name from the select query.
Thanks
I think you need a GROUP BY clause to ensure you're grouping by User.username.
I wrote some test code and confirmed it's working:
with self.database.atomic():
    charlie = TUser.create(username='charlie')
    huey = TUser.create(username='huey')
    data = (
        (charlie, 10.),
        (charlie, 20.),
        (charlie, 30.),
        (huey, 1.5),
        (huey, 2.5))
    for user, amount in data:
        Transaction.create(user=user, amount=amount)

amount = fn.MAX(Transaction.amount).alias('amount')
query = (Transaction
         .select(amount, TUser.username)
         .join(TUser)
         .group_by(TUser.username)
         .order_by(TUser.username))

with self.assertQueryCount(1):
    data = [(txn.amount, txn.user.username) for txn in query]
    self.assertEqual(data, [
        (30., 'charlie'),
        (2.5, 'huey')])
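Applied to the original query, the sketch below adds the GROUP BY (model and column names taken from the question; untested). Iterating with .dicts() avoids any ambiguity about which model instance carries the joined user_name:
toptx24h = (Transaction
            .select(fn.MAX(Transaction.amount).alias('amount'), User.user_name)
            .join(User, on=(User.wallet_address == Transaction.source_address))
            .where(Transaction.created > past_dt)
            .group_by(User.user_name)
            .order_by(fn.MAX(Transaction.amount).desc())
            .limit(1))

for top in toptx24h.dicts():
    print(top['amount'], top['user_name'])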

SQLAlchemy: Referencing labels in SELECT subqueries

I'm trying to figure out how to replicate the below query in SQLAlchemy
SELECT c.company_id AS company_id,
       (SELECT policy_id FROM associative_table at WHERE at.company_id = c.company_id) AS policy_id_ref,
       (SELECT `default` FROM policy p WHERE p.policy_id = policy_id_ref) AS `default`
FROM company c;
Note that this is a stripped down, basic example of what I'm really dealing with. The actual schema supports data and relationship versioning that requires the subqueries to include additional conditions, sorting, and limiting, making it impractical (if not impossible) for them to be joins.
The crux of the problem is in how the second subquery relies on policy_id_ref -- the value obtained from the first subquery. In SQLAlchemy, this is effectively what I have now:
ct = aliased(classes.company)
at = aliased(classes.associative_table)
pt = aliased(classes.policy)

policy_id_ref = session.query(at.policy_id).\
    filter(at.company_id == ct.company_id).\
    label('policy_id_ref')

policy_default = session.query(pt.default).\
    filter(pt.id == 'policy_id_ref').\
    label('default')

query = session.query(ct.company_id, policy_id_ref, policy_default)
The pull from the "company" table works fine as does the first subquery that retrieves the "policy_id_ref" column. The problem is the second subquery that has to reference that "policy_id_ref" column. I don't know how to write its filter in such a way that it literally renders "policy_id_ref" in the resulting query, to match the label of the first subquery.
Suggestions?
Thanks in advance
You can write your query as
select(
Companies.company_id,
AssociativeTable.policy_id.label('policy_id_ref'),
Policy.default.label('policy_default'),
).select_from(
Companies,
).join(
AssociativeTable,
AssociativeTable.company_id == Companies.company_id,
).join(
Policy,
AssociativeTable.policy_id == Policy.id
)
but if you need to reference a label from a subquery, use literal_column:
from sqlalchemy import func, select, literal_column
from sqlalchemy.dialects.postgresql import JSONB

session.query(
    func.array_agg(
        literal_column('batch_info'),
        type_=JSONB  # the result type goes in type_, not as a positional argument
    ).label('history')
).select_from(
    select(
        func.jsonb_build_object(
            'batch_id', AccountingQueueBatch.id,
            'batch_label', AccountingQueueBatch.label,
        ).label('batch_info')
    ).select_from(
        AccountingQueueBatch,
    )
)
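Applied to the original question, the second subquery would then become (a sketch using the aliases defined above):
from sqlalchemy import literal_column

policy_default = session.query(pt.default).\
    filter(pt.id == literal_column('policy_id_ref')).\
    label('default')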

Count null columns as zeros with oracle

I am running a query with Oracle:
SELECT
c.customer_number,
COUNT(DISTINCT o.ORDER_NUMBER),
COUNT(DISTINCT q.QUOTE_NUMBER)
FROM
Customer c
JOIN Orders o on c.customer_number = o.party_number
JOIN Quote q on c.customer_number = q.account_number
GROUP BY
c.customer_number
This works beautifully and I can get the customer and their order and quote counts.
However, not all customers have orders or quotes but I still want their data. When I use LEFT JOIN I get this error from Oracle:
ORA-24347: Warning of a NULL column in an aggregate function
Seemingly this error is caused by the eventual COUNT(NULL) for customers that are missing orders and/or quotes.
How can I get a COUNT of null values to come out to 0 in this query?
I can do COUNT(DISTINCT NVL(o.ORDER_NUMBER, 0)) but then the counts will come out to 1 if orders/quotes are missing which is no good. Using NVL(o.ORDER_NUMBER, NULL) has the same problem.
Try using inline views:
SELECT
    c.customer_number,
    NVL(o.order_count, 0) AS order_count,
    NVL(q.quote_count, 0) AS quote_count
FROM
customer c,
( SELECT
party_number,
COUNT(DISTINCT order_number) AS order_count
FROM
orders
GROUP BY
party_number
) o,
( SELECT
account_number,
COUNT(DISTINCT quote_number) AS quote_count
FROM
quote
GROUP BY
account_number
) q
WHERE 1=1
AND c.customer_number = o.party_number (+)
AND c.customer_number = q.account_number (+)
;
Sorry, but I'm not working with any databases right now to test this, or to test whatever the ANSI SQL version might be. Just going on memory.
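For reference, an untested ANSI-join sketch of the same idea; COALESCE turns the missing counts into the zeros the question asks for:
SELECT
    c.customer_number,
    COALESCE(o.order_count, 0) AS order_count,
    COALESCE(q.quote_count, 0) AS quote_count
FROM customer c
LEFT JOIN (
    SELECT party_number, COUNT(DISTINCT order_number) AS order_count
    FROM orders
    GROUP BY party_number
) o ON c.customer_number = o.party_number
LEFT JOIN (
    SELECT account_number, COUNT(DISTINCT quote_number) AS quote_count
    FROM quote
    GROUP BY account_number
) q ON c.customer_number = q.account_number;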
