How to make an aggregation of another aggregation

How to make an aggregation of another aggregation - struct

I have a table of projects, a table of housing units, and a table of quotes. I want one row per project that lists all the housing units of the project and, for each housing unit, all of its quotes.
Instead I get one row per housing unit, because the query forces me to group by housing unit.
This is the code:
select
  projet.id,
  struct(
    logement.id as identifiant_logement,
    logement.living_area as superficie,
    array_agg(struct(
      coalesce(devisdetail.id, 0) as identifiant_devis,
      coalesce(devisdetail.price_incl_tax, 0) as price_incl_tax
    )) as logements
  ) as projets
from
  EXTERNAL_QUERY("app-oslo-prod.eu.oslo-prod-conn", "SELECT * FROM CORE.mission;") mission
  left join EXTERNAL_QUERY("app-oslo-prod.eu.oslo-prod-conn", "SELECT * FROM CORE.PROJECT;") projet
    on projet.mission_id = mission.id
  left join EXTERNAL_QUERY("app-oslo-prod.eu.oslo-prod-conn", "SELECT * FROM CORE.ACCOMMODATION;") logement
    on logement.project_id = projet.id
  left join EXTERNAL_QUERY("app-oslo-prod.eu.oslo-prod-conn", "SELECT * FROM CORE.QUOTE_ACCOMMODATION") devisdetail
    on logement.id = devisdetail.accommodation_id
where projet.id = 1194 or projet.id = 1195
group by projet.id, logement.id, logement.living_area
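One common way to get a single row per project is to aggregate in two steps: collapse the quotes into one value per housing unit in a subquery, then aggregate those per-housing values per project. The sketch below only illustrates that shape. It uses Python's built-in sqlite3 module as a stand-in engine, the table names, column names and sample rows are made up, and group_concat stands in for BigQuery's array_agg(struct(...)), so it is not a drop-in replacement for the query above.

```python
import sqlite3

# Hypothetical, simplified schema and data (not the real CORE tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY);
    CREATE TABLE accommodation (id INTEGER PRIMARY KEY, project_id INTEGER, living_area REAL);
    CREATE TABLE quote (id INTEGER PRIMARY KEY, accommodation_id INTEGER, price_incl_tax REAL);
    INSERT INTO project VALUES (1194), (1195);
    INSERT INTO accommodation VALUES (10, 1194, 55.0), (11, 1194, 72.5), (20, 1195, 40.0);
    INSERT INTO quote VALUES (100, 10, 1500.0), (101, 10, 900.0), (102, 11, 300.0);
""")

rows = conn.execute("""
    SELECT p.id AS project_id,
           COUNT(h.housing_id) AS nb_housing,
           -- second aggregation: one value per project
           group_concat(h.housing_id || '[' || h.quote_ids || ']', ' ') AS housing
    FROM project p
    LEFT JOIN (
        -- first aggregation: one row per housing unit, quotes collapsed
        SELECT a.id AS housing_id,
               a.project_id,
               COALESCE(group_concat(q.id), '') AS quote_ids
        FROM accommodation a
        LEFT JOIN quote q ON q.accommodation_id = a.id
        GROUP BY a.id, a.project_id
    ) h ON h.project_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for project_id, nb_housing, housing in rows:
    print(project_id, nb_housing, housing)
```

The outer query groups only by the project, because the housing-level grouping already happened inside the subquery. In BigQuery the same structure would presumably use array_agg(struct(...)) at both levels.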

Related

ATHENA/PRESTO complex query with multiple unnested tables

I would like to create a join over several tables.
Table login: I would like to retrieve all the data from login.
Table logging: calculate Nb_of_sessions for each db and for a specific event type, by user.
Table meeting: calculate Nb_of_meetings for each db and each user.
Table live: calculate Nb_of_live for each db and each user.
I have these queries, which return the right results:
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db);
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid;
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid;
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id;
But when I try to put it all together, I seem to get bad data (only one db is returned), and the query does not seem efficient.
select a1.db.id,a._id as userid,a.firstname,a.lastname,count(rl._id) as nb_chat
FROM
"logins"."login" a,
"loggings"."logging" b,
"meetings"."meeting" c,
"lives"."live" d,
UNNEST(dbs) AS a1 (db),
UNNEST(users) AS r1 (user)
where a._id = b.userid AND a._id = c.userid AND a._id = r1.user._id
group by 1,2,3,4
Do you have any ideas?
Regards.

The easiest way is to use WITH to structure the subqueries and then reference them.
From the WITH clause reference:
You can use WITH to flatten nested queries, or to simplify subqueries.
The WITH clause precedes the SELECT list in a query and defines one or
more subqueries for use within the SELECT query.
Each subquery defines a temporary table, similar to a view definition,
which you can reference in the FROM clause. The tables are used only
when the query runs.
Since you already have working subqueries, the following should work:
with logins as
(
SELECT db.id AS dbid,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
)
,visits as
(
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by dbid,userid
)
,meetings as
(
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid
)
,chats as
(
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id
)
select *
from logins l
left join visits v
on l.dbid = v.dbid
and l.userid = v.userid
left join meetings m
on l.dbid = m.dbid
and l.userid = m.userid
left join chats c
on l.dbid = c.dbid
and l.userid = c.userid;
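To see why aggregating inside each WITH subquery before joining matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Athena/Presto. The tables are hypothetical, already-flattened versions of the ones in the question (no UNNEST), and the sample rows are invented. Each subquery returns at most one row per (dbid, userid), so the left joins cannot multiply rows or inflate the counts.

```python
import sqlite3

# Hypothetical flattened tables; the real ones hold nested arrays.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE login (dbid TEXT, userid TEXT, firstname TEXT);
    CREATE TABLE logging (dbid TEXT, userid TEXT, sessionid TEXT);
    CREATE TABLE meeting (dbid TEXT, userid TEXT, interviewer TEXT);
    INSERT INTO login VALUES ('db1', 'u1', 'Ann'), ('db2', 'u1', 'Ann'), ('db1', 'u2', 'Bob');
    INSERT INTO logging VALUES ('db1', 'u1', 's1'), ('db1', 'u1', 's2'), ('db2', 'u1', 's3');
    INSERT INTO meeting VALUES ('db1', 'u1', 'x'), ('db1', 'u1', 'y'), ('db1', 'u1', 'z');
""")

rows = conn.execute("""
    WITH visits AS (
        SELECT dbid, userid, COUNT(DISTINCT sessionid) AS no_of_visits
        FROM logging
        GROUP BY dbid, userid
    ),
    meetings AS (
        SELECT dbid, userid, COUNT(*) AS nb_interviews
        FROM meeting
        GROUP BY dbid, userid
    )
    SELECT l.dbid, l.userid,
           COALESCE(v.no_of_visits, 0) AS no_of_visits,
           COALESCE(m.nb_interviews, 0) AS nb_interviews
    FROM login l
    LEFT JOIN visits v ON l.dbid = v.dbid AND l.userid = v.userid
    LEFT JOIN meetings m ON l.dbid = m.dbid AND l.userid = m.userid
    ORDER BY l.dbid, l.userid
""").fetchall()

for row in rows:
    print(row)
```

Users without visits or meetings still appear, because the joins start from login and are left joins. COALESCE turns their missing counts into 0.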

SQL Oracle Sub-query

I am having an issue getting this subquery to run. I am using Toad Data Point with Oracle and I get a syntax error. I have tried several different ways with no luck. I am new to subqueries.
Select *
from FINC.VNDR_ITEM_M as M
where M.ACCT_DOC_NBR = A.ACCT_DOC_NBR
(SELECT A.CLIENT_ID,
A.SRC_SYS_ID,
A.CO_CD,
A.ACCT_NBR,
A.CLR_DT,
A.ASGN_NBR,
A.FISCAL_YR,
A.ACCT_DOC_NBR,
A.LINE_ITEM_NBR,
A.MFR_PART_NBR,
A.POST_DT,
A.DRCR_IND,
A.DOC_CRNCY_AMT,
A.CRNCY_CD,
A.BSL_DT
FROM FINC.VNDR_ITEM_F A
WHERE A.CLR_DT IN (SELECT MAX(B.CLR_DT)
FROM FINC.VNDR_ITEM_F AS B
where (B.ACCT_DOC_NBR = A.ACCT_DOC_NBR and B.FISCAL_YR=A.FISCAL_YR and B.LINE_ITEM_NBR = A.LINE_ITEM_NBR and B.SRC_SYS_ID =A.SRC_SYS_ID and B.POST_DT=A.POST_DT and B.CO_CD=A.CO_CD)
and (B.CO_CD >='1000' and B.CO_CD <= '3000' or B.CO_CD ='7090') and (B.POST_DT Between to_date ('08/01/2018','mm/dd/yyyy')
AND to_date ('08/31/2018', 'mm/dd/yyyy')) and (B.SRC_SYS_ID ='15399') and (B.FISCAL_YR ='2018'))
GROUP BY
A.CLIENT_ID,
A.SRC_SYS_ID,
A.CO_CD,
A.ACCT_NBR,
A.CLR_DT,
A.ASGN_NBR,
A.FISCAL_YR,
A.ACCT_DOC_NBR,
A.LINE_ITEM_NBR,
A.MFR_PART_NBR,
A.POST_DT,
A.DRCR_IND,
A.DOC_CRNCY_AMT,
A.CRNCY_CD,
A.BSL_DT)

Your syntax is broken: you put the subquery at the very end. Right now it looks like this:
select *
from dual as m
where a.dummy = m.dummy
(select dummy from dual)
It is in the wrong place, not joined, and not aliased. What you should probably do is:
select *
from dual m
join (select dummy from dual) a on a.dummy = m.dummy
You also have some redundant brackets, but that is a minor flaw. Full code (I cannot test it without data access):
select *
from FINC.VNDR_ITEM_M M
join (SELECT A.CLIENT_ID, A.SRC_SYS_ID, A.CO_CD, A.ACCT_NBR, A.CLR_DT, A.ASGN_NBR,
A.FISCAL_YR, A.ACCT_DOC_NBR, A.LINE_ITEM_NBR, A.MFR_PART_NBR, A.POST_DT,
A.DRCR_IND, A.DOC_CRNCY_AMT, A.CRNCY_CD, A.BSL_DT
FROM FINC.VNDR_ITEM_F A
WHERE A.CLR_DT IN (SELECT MAX(B.CLR_DT)
FROM FINC.VNDR_ITEM_F B
where B.ACCT_DOC_NBR = A.ACCT_DOC_NBR
and B.FISCAL_YR=A.FISCAL_YR
and B.LINE_ITEM_NBR = A.LINE_ITEM_NBR
and B.SRC_SYS_ID =A.SRC_SYS_ID
and B.POST_DT=A.POST_DT
and B.CO_CD=A.CO_CD
and (('1000'<=B.CO_CD and B.CO_CD<='3000') or B.CO_CD='7090')
and B.POST_DT Between to_date ('08/01/2018', 'mm/dd/yyyy')
AND to_date ('08/31/2018', 'mm/dd/yyyy')
and B.SRC_SYS_ID ='15399' and B.FISCAL_YR ='2018')
GROUP BY A.CLIENT_ID, A.SRC_SYS_ID, A.CO_CD, A.ACCT_NBR, A.CLR_DT, A.ASGN_NBR,
A.FISCAL_YR, A.ACCT_DOC_NBR, A.LINE_ITEM_NBR, A.MFR_PART_NBR, A.POST_DT,
A.DRCR_IND, A.DOC_CRNCY_AMT, A.CRNCY_CD, A.BSL_DT) A
on M.ACCT_DOC_NBR = A.ACCT_DOC_NBR and M.CO_CD=A.CO_CD;
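The shape of that fix (a subquery moved into the FROM clause, given an alias, and joined on explicit keys) can be tried on a toy dataset. This is only a sketch using Python's built-in sqlite3 module rather than Oracle. The tables item_m and item_f, their columns and the sample rows are invented stand-ins for FINC.VNDR_ITEM_M and FINC.VNDR_ITEM_F.

```python
import sqlite3

# Invented miniature tables standing in for the FINC tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_m (doc_nbr TEXT, co_cd TEXT, vendor TEXT);
    CREATE TABLE item_f (doc_nbr TEXT, co_cd TEXT, clr_dt TEXT, amt REAL);
    INSERT INTO item_m VALUES ('D1', '1000', 'Acme'), ('D2', '7090', 'Globex');
    INSERT INTO item_f VALUES
        ('D1', '1000', '2018-08-05', 10.0),
        ('D1', '1000', '2018-08-20', 25.0),
        ('D2', '7090', '2018-08-11', 40.0);
""")

rows = conn.execute("""
    SELECT m.doc_nbr, m.vendor, a.clr_dt, a.amt
    FROM item_m m
    JOIN (
        -- derived table: keep only the latest clearing date per document
        SELECT f.doc_nbr, f.co_cd, f.clr_dt, f.amt
        FROM item_f f
        WHERE f.clr_dt = (SELECT MAX(b.clr_dt)
                          FROM item_f b
                          WHERE b.doc_nbr = f.doc_nbr
                            AND b.co_cd = f.co_cd)
    ) a ON a.doc_nbr = m.doc_nbr AND a.co_cd = m.co_cd
    ORDER BY m.doc_nbr
""").fetchall()

for row in rows:
    print(row)
```

The alias a follows the closing parenthesis directly, without AS, which is the form Oracle expects for table aliases.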

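You need to add an alias to the subselect (a derived table in Standard SQL). Note that Oracle does not accept the AS keyword before a table alias, so write the alias directly after the closing parenthesis:
select *
from
( select .......
) dt
join ....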

How to get sub query columns in main query with WHERE EXISTS in PostgreSQL?

I am stuck with a query whose JOINs take a long time. I want to use WHERE EXISTS in place of JOIN, since EXISTS runs faster for me.
I have modified the query and it filters as expected, but I am not able to use the subquery's columns in my main query.
Here is my query:
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
WHERE EXISTS
(SELECT t.term_id,t.abbreviation
FROM terms t
WHERE (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id)
AND EXISTS
(SELECT st.grade_level,
st.sid
FROM students st
WHERE (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id)
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id
And I am getting this error:
ERROR: missing FROM-clause entry for table "st"
SQL state: 42P01
Character: 29
I have used only one column here as a sample, but I need to use more fields from the subquery.
My main question is how to achieve this with EXISTS, or with an alternative that performs better.
I am using the pg module, since my back end is Node.js.
UPDATE
Query with JOIN
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
LEFT JOIN terms t ON (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id
LEFT JOIN students st ON (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id
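The error comes from scoping: an EXISTS subquery only decides whether an outer row is kept, and its aliases are not visible outside its parentheses. A small sketch with Python's built-in sqlite3 module (a stand-in for PostgreSQL, with invented, simplified tables) shows both the failure and the join that makes the columns available. The exact error text differs between engines.

```python
import sqlite3

# Invented, simplified versions of the schedules and students tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedules (student_id TEXT, school_id TEXT);
    CREATE TABLE students (sid TEXT, grade_level TEXT);
    INSERT INTO schedules VALUES ('s1', 'A'), ('s2', 'A'), ('s3', 'B'), ('s9', 'B');
    INSERT INTO students VALUES ('s1', '3'), ('s2', '5'), ('s3', '4');
""")

# EXISTS only filters outer rows; the alias st does not exist outside it.
try:
    conn.execute("""
        SELECT MAX(st.grade_level)
        FROM schedules sch
        WHERE EXISTS (SELECT 1 FROM students st WHERE st.sid = sch.student_id)
    """)
    exists_error = None
except sqlite3.OperationalError as exc:
    exists_error = str(exc)

# A join exposes the student columns to the outer SELECT.
rows = conn.execute("""
    SELECT sch.school_id, MAX(CAST(st.grade_level AS INTEGER)) AS grades
    FROM schedules sch
    JOIN students st ON st.sid = sch.student_id
    GROUP BY sch.school_id
    ORDER BY sch.school_id
""").fetchall()

print(exists_error)
print(rows)
```

So EXISTS fits tables that are only used as a filter (terms in the query above, where no column of t is selected), while any table whose columns appear in the SELECT list has to be joined or read through a subquery in the FROM clause.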

Self-taught, syntax error I can't find: this query works in my system but not embedded in Excel against the same database

I am new to this (very new, and self-teaching). I have a query that draws from multiple tables in my computer system and gets all the appraised values and sales values for a subdivision. In my system the query runs fine, but when I try to run it embedded in an Excel sheet, it gives an error saying there is no column name for 2 c and 3 c. When I put punctuation around the column names, it says there is a syntax error with the alias "as c" at the bottom. I have been awake too long. What am I doing wrong?
select distinct pv.prop_id, ac.file_as_name,
'sale_type' , 'deed_date' , 'sale_date' , 'sale_type' , 'sale_price' ,
(pv.land_hstd_val + pv.land_non_hstd_val + pv.ag_market + pv.timber_market)as land_val,
(pv.imprv_hstd_val + pv.imprv_non_hstd_val)as imprv_val,
pv.market, pv.abs_subdv_cd
from property_val pv with (nolock)
inner join prop_supp_assoc psa with (nolock) on
pv.prop_id = psa.prop_id
and pv.prop_val_yr = psa.owner_tax_yr
and pv.sup_num = psa.sup_num
inner join property p with (nolock) on
pv.prop_id = p.prop_id
inner join owner o with (nolock) on
pv.prop_id = o.prop_id
and pv.prop_val_yr = o.owner_tax_yr
and pv.sup_num = o.sup_num
inner join account ac with (nolock) on
o.owner_id = ac.acct_id
left outer join
(select cop.prop_id,
convert(varchar(20), co.deed_dt, 101)as deed_date,
convert(varchar(20), s.sl_dt, 101)as sale_date,
s.sl_price as sale_price, s.sl_type_cd as sale_type
from chg_of_owner_prop_assoc cop with (nolock)
inner join chg_of_owner co with (nolock) on
co.chg_of_owner_id = cop.chg_of_owner_id
inner join sale s with (nolock) on
co.chg_of_owner_id = s.chg_of_owner_id
where cop.seq_num = 0) as c
on c.prop_id = pv.prop_id
where pv.prop_val_yr = 2016
and(pv.prop_inactive_dt is null or udi_parent ='t')
and pv.abs_subdv_cd in('s3579')
order by pv.abs_subdv_cd, pv.prop_id

Is it SQL Server? Try surrounding column names with square brackets instead of quotes.
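A quick way to see the difference between a quoted string and a column reference is the sketch below. It uses Python's built-in sqlite3 module as a stand-in (SQLite accepts SQL Server style square brackets for compatibility), with an invented one-row sale table. A name in single quotes is a string constant, so every row returns the text itself and the result column has no real name, while the bracketed or alias-qualified form reads the column.

```python
import sqlite3

# Invented one-row table; only the quoting behaviour matters here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (prop_id INTEGER, sale_type TEXT, sale_price REAL);
    INSERT INTO sale VALUES (1, 'WD', 125000.0);
""")

# Single quotes: a string literal, not a column.
literal = conn.execute("SELECT 'sale_type' FROM sale").fetchone()[0]
# Square brackets: an identifier, so the column value is returned.
bracketed = conn.execute("SELECT [sale_type] FROM sale").fetchone()[0]
# Qualifying with the derived-table alias also reads the column.
qualified = conn.execute("SELECT c.sale_type FROM sale AS c").fetchone()[0]

print(literal, bracketed, qualified)
```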

Need to fetch n percentage of rows in u-sql query

I need help writing a U-SQL query that fetches the top n percent of rows. I have a dataset from which I need to take the total row count and then take the top 3% of rows based on col1. The code I have written is:
@count = SELECT Convert.ToInt32(COUNT(*)) AS cnt FROM @telData;
@count1=SELECT cnt/100 AS cnt1 FROM @count;
DECLARE @cnt int=SELECT Convert.ToInt32(cnt1*3) FROM @count1;
@EngineFailureData=
SELECT vin,accelerator_pedal_position,enginefailure=1
FROM @telData
ORDER BY accelerator_pedal_position DESC
FETCH @cnt ROWS;
@telData is my base dataset. Thanks for your help.

Some comments first:
FETCH currently only takes literals as arguments (https://msdn.microsoft.com/en-us/library/azure/mt621321.aspx)
@var = SELECT ... will assign the name @var to the rowset expression that starts with the SELECT. U-SQL (currently) does not provide you with stateful scalar variable assignment from query results. Instead you would use a CROSS JOIN or other JOIN to join the scalar value in.
Now to the solution:
To get the percentage, take a look at the ROW_NUMBER() and PERCENT_RANK() functions. For example, the following shows you how to use either to answer your question. Given the simpler code for PERCENT_RANK() (no need for the MAX() and CROSS JOIN), I would suggest that solution.
DECLARE @percentage double = 0.25; // 25%
@data = SELECT *
FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20)
) AS T(pos);
@data =
SELECT PERCENT_RANK() OVER(ORDER BY pos) AS p_rank,
ROW_NUMBER() OVER(ORDER BY pos) AS r_no,
pos
FROM @data;
@cut_off =
SELECT ((double) MAX(r_no)) * (1.0 - @percentage) AS max_r
FROM @data;
@r1 =
SELECT *
FROM @data CROSS JOIN @cut_off
WHERE ((double) r_no) > max_r;
@r2 =
SELECT *
FROM @data
WHERE p_rank >= 1.0 - @percentage;
OUTPUT @r1
TO "/output/top_perc1.csv"
ORDER BY p_rank DESC
USING Outputters.Csv();
OUTPUT @r2
TO "/output/top_perc2.csv"
ORDER BY p_rank DESC
USING Outputters.Csv();
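To check the arithmetic behind the two cut-offs, here is a small worked sketch in plain Python rather than U-SQL. The helper percent_rank is a hypothetical re-implementation of the usual definition, (rank - 1) / (n - 1) with ties sharing the rank of their first occurrence, and is not a U-SQL API. It runs on the same 20 values and the same 25% threshold as the script above.

```python
def percent_rank(values):
    """Return (value, row_number, percent_rank) triples in ascending order.

    percent_rank is (rank - 1) / (n - 1); tied values share the rank of
    their first occurrence, and a single-row input gets 0.0.
    """
    ordered = sorted(values)
    n = len(ordered)
    out = []
    rank = 0
    for i, v in enumerate(ordered, start=1):
        if i == 1 or v != ordered[i - 2]:
            rank = i  # a new distinct value starts a new rank
        p = (rank - 1) / (n - 1) if n > 1 else 0.0
        out.append((v, i, p))
    return out


percentage = 0.25
data = percent_rank(range(1, 21))

# ROW_NUMBER variant: keep rows whose row number exceeds n * (1 - percentage).
max_r = len(data) * (1.0 - percentage)
r1 = [v for v, r_no, _ in data if r_no > max_r]

# PERCENT_RANK variant: keep rows whose percent rank is at least 1 - percentage.
r2 = [v for v, _, p in data if p >= 1.0 - percentage]

print(r1)
print(r2)
```

On this input both variants keep the values 16 through 20. With tied values or other row counts the two cut-offs can select slightly different rows, since PERCENT_RANK gives equal values the same rank while ROW_NUMBER does not.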
