Understanding Sub-Queries: Where Exists and IN - subquery

When do you use "Where Exists" and when do you use "IN" in a subquery? Is the "where exists" clause associated consider a correlated query?
I'm struggling to understand the different types of sub queries and when yo use them. When would I use a sub query in Select, From, or Where.
Also, why use a sub query that perform the same function as joins?
Select *
From t_employees as e
Where EXISTS(
Select *
From t_dept_manager as d
Where e.emp_no = d.emp_no);
VS
Select *
From t_employees
Where emp_no IN (
Select emp_no
From t_dept_manager);

Related

ATHENA/PRESTO complex query with multiple unnested tables

i have i would like to create a join over several tables.
table login : I would like to retrieve all the data from login
table logging : calculating the Nb_of_sessions for each db & for each a specific event type by user
table meeting : calculating the Nb_of_meetings for each db & for each user
table live : calculating the Nb_of_live for each db & for each user
I have those queries with the right results :
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid;
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid;
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id;
But when i begin to try put it all together, it seems i retrieve bad data (i have only on db retrieved) and it seems not efficient.
select a1.db.id,a._id as userid,a.firstname,a.lastname,count(rl._id) as nb_chat
FROM
"logins"."login" a,
"loggings"."logging" b,
"meetings"."meeting" c,
"lives"."live" d,
UNNEST(dbs) AS a1 (db),
UNNEST(users) AS r1 (user)
where a._id = b.userid AND a._id = c.userid AND a._id = r1.user._id
group by 1,2,3,4
Do you have an idea ?
Regards.
The easiest way is to work with with to structure the subquery and then reference them.
with parameter reference:
You can use WITH to flatten nested queries, or to simplify subqueries.
The WITH clause precedes the SELECT list in a query and defines one or
more subqueries for use within the SELECT query.
Each subquery defines a temporary table, similar to a view definition,
which you can reference in the FROM clause. The tables are used only
when the query runs.
Since you already have working sub queries, the following should work:
with logins as
(
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
)
,visits as
(
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid
)
,meetings as
(
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid
)
,chats as
(
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id
)
select *
from logins l
left join visits v
on l.dbid = v.dbid
and l.userid = v.userid
left join meetings m
on l.dbid = m.dbid
and l.userid = m.userid
left join chats c
on l.dbid = c.dbid
and l.userid = c.userid;

RedShift Correlated Sub-query

Need your help. I am trying to convert below SQL query into RedShift, but getting error message "Invalid operation: This type of correlated subquery pattern is not supported yet"
SELECT
Comp_Key,
Comp_Reading_Key,
Row_Num,
Prev_Reading_Date,
( SELECT MAX(X) FROM (
SELECT CAST(dateadd(day, 1, Prev_Reading_Date) AS DATE) AS X
UNION ALL
SELECT dim_date.calendar_date
) a
) as start_dt
FROM stage5
JOIN dim_date ON calendar_date BETWEEN '2020-04-01' and '2020-04-15'
WHERE Comp_Key =50906055
The same query works fine in SQL Server. Could you please help me to run it in RedShift?
Regards,
Kiru
Kiru - you need to convert the correlated query into a join structure. Not knowing the data content of your tables and the exact expected out put I'm just guessing but here's a swag:
SELECT
Comp_Key,
Comp_Reading_Key,
Row_Num,
Prev_Reading_Date,
Max_X
FROM stage5
JOIN dim_date ON calendar_date BETWEEN '2020-04-01' and '2020-04-15'
JOIN ( SELECT MAX(X) as Max_X, MAX(calendar_date) as date FROM (
SELECT CAST(dateadd(day, 1, Prev_Reading_Date) AS DATE) AS X FROM stage5
cross join
SELECT dim_date.calendar_date from dim_date
) a
) as start_dt ON a.date = dim_date.calendar_date
WHERE Comp_Key =50906055
This is just a starting guess but might get you started.
However, you are likely better off rewriting this query to use window functions as they are the fastest way to perform these types of looping queries in Redshift.
Thanks Bill. It won't work in RedShift as it still has correalted sub-query.
However I have modified query in another method and it works fine.
I am closing ticket.

How can I store the value of a subquery into a variable in Hana Studio?

I would like to know how can I store the value of subquery to use it in an operation after it recieve the value. For example:
Select IDTruck
, TruckPrice = (select "TruckPrice" from "Table1" where ("TruckID" = '123'))
, TruckUnit = (select "TruckUnit" from "Table2" )
, TruckPrice * TruckUnit as "PriceTotal"
from Table3
I just want to store the value and then use it in the operation so I don't have to do the select again.
I'm not sure why it should be necessary to store the values in variables for usage in your case. I think the calculation can be done also by joining just the data (assuming that table3 contains a reference to table1 and table2).
Your example above would also not work, because TruckPrice and TruckUnits are no atomar results.
So please try to refactor your statement to use joins.

Is it possible to chain subsequent queries's where clauses in Dapper based on the results of a previous query in the same connection?

Is it possible to use .QueryMultiple (or some other method) in Dapper, and use the results of each former query to be used in the where clause of the next query, without having to do each query individually, get the id, and then .Query again, get the id and so on.
For example,
string sqlString = #"select tableA_id from tableA where tableA_lastname = #lastname;
select tableB_id from tableB WHERE tableB_id = tableA_id";
db.QueryMultiple.(sqlString, new {lastname = "smith"});
Is something like this possible with Dapper or do I need a view or stored procedure to accomplish this? I can use multiple joins for one SQL statement, but in my real query there are 7 joins, and I didn't think I should return 7 objects.
Right now I'm just using object.
You can store every previous query in table parameter and then first perform select from the parameter and query for next, for example:
DECLARE #TableA AS Table(
tableA_id INT
-- ... all other columns you need..
)
INSERT #TableA
SELECT tableA_id
FROM tableA
WHERE tableA_lastname = #lastname
SELECT *
FROM #TableA
SELECT tableB_id
FROM tableB
JOIN tableA ON tableB_id = tableA_id

Hive multiple subqueries

I'm using Hive 0.9.0 and I'm trying to execute query i.e.
`SELECT a.id, b.user FROM (SELECT...FROM a_table) a, (SELECT...FROM b_table) b WHERE a.date = b.date;`
but it returns error "loop (...)+ does not match input....".
Does Hive support multiple subqueries in FROM just like Oracle DB?
Multiple subqueries allowed in hive.
I tested with below code,it works.
select * from (select id from test where id>10) a
join (select id from test where id>20) b on a.id=b.id;
Please post your exact code so that I can give relevant solution.
join subqueries is supported Absolutely.
I think the key problem is that u use SELECT...FROM.
The correct syntax is SELECT * FROM
SELECT a.id, b.user
FROM
(SELECT * FROM a_table) a
JOIN (SELECT * FROM b_table) b ON a.date = b.date;
If you want to obtain the full Cartesian product before applying the WHERE
clause, instead:
SELECT a.id, b.user FROM (SELECT...FROM a_table) a, (SELECT...FROM b_table) b WHERE a.date = b.date;
you should use 'join' in the middle, i.e.
SELECT a.id, b.user FROM (SELECT...FROM a_table) a join (SELECT...FROM b_table) b WHERE a.date = b.date;
above is not admissible in strict mode.

Resources