SQL query for SQL Server Compact Edition 3.5 - GROUP BY issue - sql-server-ce-3.5

SELECT BabyInformation.* , t1.*
FROM BabyInformation
LEFT JOIN
(SELECT * FROM BabyData
GROUP BY BabyID
ORDER By Date DESC ) AS t1 ON BabyInformation.BabyID=t1.BabyID
This is my query. I want to get the one most recent BabyData tuple based on date.
The BabyInformation should left join with babyData but one row per baby...
I tried TOP(1) but this worked only for the first baby

Here is one way to do it, there are other ways which can be faster, but I believe this one to be the clearest for a beginner.
SELECT BabyInformation.*, BabyData.*
FROM BabyInformation
JOIN
(SELECT BabyID, Max(Date) as maxDate FROM BabyData
GROUP BY BabyID
) AS t1
ON BabyInformation.BabyID=t1.BabyID
Join BabyData ON BabyData.BabyID = t1.BabyID and BabyData.Date = t1.maxDate

This should do it:
SELECT bi.* , bd.*
FROM BabyInformation [bi]
LEFT JOIN BabyData [bd]
on bd.BabyDataId = (select top 1 sub.BabyDataId from BabyData [sub] where sub.BabyId = bi.BabyId order by sub.Date desc)
I've assumed that there is a column called 'BabyDataId' in the BabyData table.

Related

Error Snowflake - Unsupported subquery type cannot be evaluated

I am facing an error in snowflake saying "Unsupported subquery type cannot be evaluated" after for example executing the below statement. How should write this statement to avoid this error?
select A
from (
select b
, c
FROM test_table
) ;
The outer query column list needs to be within the column list of the subquery. example: select b from (select b,c from test_table);
ignoring "columns" the query you have shown will never trigger this error.
You would get it from this form though:
select A.*
from tableA as A
where a.x = (select b.y FROM test_table as b where b.z = a.z)
this form assuming there is only 1 b.y per b.z can be turned into a inner join like
select A.*
from tableA as A
join test_table as b
on b.z = a.z and a.x = b.y
other forms of this pattern do the likes of max(b.y) and those can be made into a sub-select like:
select A.*
from tableA as A
join (
select c.z, max(c.y) from test_table as c group by 1
) as b
on b.z = a.z and a.x = b.y
but the general pattern is, in other databases there is no "cost" to do row-by-row queries, where-as Snowflake is more optimal with pre-building tables of similar data, and then equi-joining those results together. So both the "how-to-write" example pivot from a for-each-row thinking to a build the set of all possible answers, and then join that. This allows for the most parallel processing of the data possible. And while it means you the develop need to understand your data to get he best performance out of it, in general if you are doing large scale data processing, you should be understanding your data. So this costs, is rather acceptable imho.
If you are trying to Match Two Attributes on the Subquery.
Use like below:
If both need to matched:
select * from Table WHERE a IN ( select b FROM test_table ) AND a IN ( select c FROM test_table )
If any one need to matched:
select * from Table WHERE a IN ( select b FROM test_table ) OR a IN ( select c FROM test_table )

Correct way to get the last value for a field in Apache Spark or Databricks Using SQL (Correct behavior of last and last_value)?

What is the correct behavior of the last and last_value functions in Apache Spark/Databricks SQL. The way I'm reading the documentation (here: https://docs.databricks.com/spark/2.x/spark-sql/language-manual/functions.html) it sounds like it should return the last value of what ever is in the expression.
So if I have a select statement that does something like
select
person,
last(team)
from
(select * from person_team order by date_joined)
group by person
I should get the last team a person joined, yes/no?
The actual query I'm running is shown below. It is returning a different number each time I execute the query.
select count(distinct patient_id) from (
select
patient_id,
org_patient_id,
last_value(data_lot) data_lot
from
(select * from my_table order by data_lot)
where 1=1
and org = 'my_org'
group by 1,2
order by 1,2
)
where data_lot in ('2021-01','2021-02')
;
What is the correct way to get the last value for a given field (for either the team example or my specific example)?
--- EDIT -------------------
I'm thinking collect_set might be useful here, but I get the error shown when I try to run this:
select
patient_id,
last_value(collect_set(data_lot)) data_lot
from
covid.demo
group by patient_id
;
Error in SQL statement: AnalysisException: It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
Aggregate [patient_id#89338], [patient_id#89338, last_value(collect_set(data_lot#89342, 0, 0), false) AS data_lot#91848]
+- SubqueryAlias spark_catalog.covid.demo
The posts shown below discusses how to get max values (not the same as last in a list ordered by a different field, I want the last team a player joined, the player may have joined the Reds, the A's, the Zebras, and the Yankees, in that order timewise, I'm looking for the Yankees) and these posts get to the solution procedurally using python/r. I'd like to do this in SQL.
Getting last value of group in Spark
Find maximum row per group in Spark DataFrame
--- SECOND EDIT -------------------
I ended up using something like this based upon the accepted answer.
select
row_number() over (order by provided_date, data_lot) as row_num,
demo.*
from demo
You can assign row numbers based on an ordering on data_lots if you want to get its last value:
select count(distinct patient_id) from (
select * from (
select *,
row_number() over (partition by patient_id, org_patient_id, org order by data_lots desc) as rn
from my_table
where org = 'my_org'
)
where rn = 1
)
where data_lot in ('2021-01','2021-02');

Mulitple tables join in Hive getting error - Both left and right aliases encountered in join

I am trying to join 3 tables. Following are the table details.
I am expecting following results
Here is my query and getting error as "both left and right aliases encountered in join 'id'".
This was due to joining 3rd table with 1st and 2nd table(last full join statement).
select coalesce(a.id,b.id,c.id) as id,
ref1,ref2,ref3
from v_cmo_test1 a
FULL JOIN v_cmo_test2 b on (a.id = b.id)
FULL JOIN v_cmo_test3 c on (c.id in (a.id,b.id))
If I am using below query, id 3 is repeating in the table which I don't want.
select coalesce(a.id,b.id,c.id) as id,
ref1,ref2,ref3
from v_cmo_test1 a
FULL JOIN v_cmo_test2 b on a.id = b.id
FULL JOIN v_cmo_test3 c on c.id = a.id
Could any one help me on how to achieve the expected results and really appreciate for your help.
Thanks, Babu
This is a very tricky requirement. data is incorrect because you are using test1 as driver, outer joins arent working properly. And this can occur with other tables. So, i am joining two tables at a time to achieve what you want.
select coalesce(inner_sq.id,c.id) as id,ref1,ref2,ref3
from
(select coalesce(a.id,b.id,c.id) as id,ref1,ref2
from v_cmo_test1 a
FULL JOIN v_cmo_test2 b on a.id = b.id
) inner_sq
FULL JOIN v_cmo_test3 c on c.id = inner_sq.id
Inner_sq query output -
1,bab,kim
2,xxx,yyy
3,,mmm
When you full join above with test3, you should get your output.

Left join in subquery

I am trying to do a left join on my main table using this code
select distinct VBen.BENF_NO_INDIV_BEN_BANLS as benbanls,
VBen.BENF_COD_SEXE AS Sexe,
VBen.BENF_DAT_NAISS AS DatNaiss,
VBen.BENF_DAT_DECES AS DatDec,
A.date_ch as date_chsld
from PROD.V_FICH_ID_BEN_CM AS VBen
left join (select distinct VAss.BENF_NO_INDIV_BEN_BANLS as benbanls,
vass.BENF_DD_ADMIS_ASSU_MED as date_ch
from Prod.V_ADMIS_ASSU_MED_PLAN_PRIOR_CM as vass ) as A
on VBen.BENF_NO_INDIV_BEN_BANLS =A. benbanls
where Vben.BENF_DAT_NAISS>'2016-04-01' or Vben.BENF_DAT_DECES>'2011-04-01'
The problem is that the query result is a table with of number of rows greater than the main table with the same where 'condition'. I don't understand what I am missing
Thanks for your help
Why is it a problem?
The results simply indicate you have a 1:M (one to many) relationship between VBen:Vass(A)
If you don't have a 1:M relationship and it should be 1:1 then...
you're missing join criteria between the tables.
you should be getting a min/max on your date instead of all dates per benbanls
To better understand and answer we would need to know what VBen and Vass actually represent; but to put simply, you have multiple VASS(A) per VBEN
To illustrate with an example: Think about Order_Header and Order_Line tables...
Order_header contains (order_Number PK)
Order_line contains (Order_Number, Order_Line PK)
An order can have multiple lines, each line could have it's own ship date several items may have gone out on the same shipment/day. where some that were backordered went out on a different day. In this situation, an order would still have multiple lines even though we distinct order_number and shipmentdate in a subquery. I would guess your situation is similar.
so 1 in base table * 2 rows in derived/lines table gives us 2 records
1 < 2 which is the situation you have now; and that to me is perfectly fine and expected if it's a 1:M relationship.
Maybe you need to do a min or max on date instead of a distinct?
If not you're missing join criteria to make a 1:1 relationship
maybe your expectation is just flawed.
The below will give you a 1:1 relationship but I'm not sure it's what you're after.
SELECT distinct VBen.BENF_NO_INDIV_BEN_BANLS as benbanls,
VBen.BENF_COD_SEXE AS Sexe,
VBen.BENF_DAT_NAISS AS DatNaiss,
VBen.BENF_DAT_DECES AS DatDec,
A.date_ch as date_chsld
FROM PROD.V_FICH_ID_BEN_CM AS VBen
LEFT JOIN (SELECT VAss.BENF_NO_INDIV_BEN_BANLS as benbanls,
Max(vass.BENF_DD_ADMIS_ASSU_MED) as date_ch
FROM Prod.V_ADMIS_ASSU_MED_PLAN_PRIOR_CM as vass
GROUP BY VAss.BENF_NO_INDIV_BEN_BANLS) as A
on VBen.BENF_NO_INDIV_BEN_BANLS = A. benbanls
WHERE (Vben.BENF_DAT_NAISS>'2016-04-01'
or Vben.BENF_DAT_DECES>'2011-04-01)
It is likely that there is more than one counterpart in the detail table of a record on the main table.
I try your scenario on my db get a correct result.
In my DB:
select distinct p.PollId as PollId,
p.Title AS Title,
p.InsertDate AS DatDec,
ps.date_ch as date_chsld
from dbo.Poll AS p
left join (select distinct pSt.PollId as pollId,
Max(pSt.InsertDate) as date_ch
from dbo.PollStore as pSt
Group by pSt.PollId ) as ps
on p.PollId =ps.pollId
As Your Query like this :
select distinct VBen.BENF_NO_INDIV_BEN_BANLS as benbanls,
VBen.BENF_COD_SEXE AS Sexe,
VBen.BENF_DAT_NAISS AS DatNaiss,
VBen.BENF_DAT_DECES AS DatDec,
A.date_ch as date_chsld
please try this query
from PROD.V_FICH_ID_BEN_CM AS VBen
left join (select distinct VAss.BENF_NO_INDIV_BEN_BANLS as benbanls,
Max(vass.BENF_DD_ADMIS_ASSU_MED) as date_ch
from Prod.V_ADMIS_ASSU_MED_PLAN_PRIOR_CM Group by VAss.BENF_NO_INDIV_BEN_BANLS as vass ) as A
on VBen.BENF_NO_INDIV_BEN_BANLS =A. benbanls
where Vben.BENF_DAT_NAISS>'2016-04-01' or Vben.BENF_DAT_DECES>'2011-04-01'

How can we construct a query containing Union with JoinSqlBuilder in SeviceStack OrmLite

I have a query like this where I need to union two tables and then join it with a third one
SELECT *
FROM (SELECT * FROM Table1 where col2_id = 1
UNION ALL
SELECT * FROM Table2 where col2_id = 1) AS UnionList
INNER JOIN Table3 t3 ON t3.id = UnionList.col3_id
I am trying to use JoinSqlBuilder in ServiceStack OrmLite to build this query. I am able to create the join part but I am confused about how to create the Union part. Did not find any option/function which union's two table.

Resources