Missing Column Assignment for 'featurename' - presto

I am trying to write a CASE statement that runs a subquery to check whether a record exists in the access table. If the person has access, the score should be anything between 0 and 100, which COALESCE takes care of; if not, the value should be NULL. But the query fails with the error: Missing column Assignment for 'id'.
My query:
SELECT
CASE
WHEN EXISTS (SELECT * FROM hive_dsn.db.access AS aa WHERE ub.id=aa.id)
THEN COALESCE(ub.score*100,0)
ELSE
NULL
END AS UNUSED
FROM hive_dsn.db.unused_output AS ub;
Basically, I do not understand the error message. What is this error saying, and how can I resolve it?
Thanks in advance.

I figured out that the WHERE condition is comparing columns rather than values, without any JOIN.
I rewrote the query as follows:
SELECT
CASE
WHEN EXISTS (SELECT ub.personid FROM hive_dsn.db.access AS aa
JOIN hive_dsn.db.unused_output AS ub ON ub.id=aa.id)
THEN COALESCE(ub.score*100,0)
ELSE NULL
END AS UNUSED
FROM hive_dsn.db.unused_output AS ub;
This JOIN condition resolved the issue.
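One thing worth checking with this rewrite: the subquery declares its own ub alias, so it no longer refers to the row of the outer query, and EXISTS becomes true for every outer row as soon as any id matches. The sketch below compares the two forms on SQLite (standing in for Presto, which is an assumption; the table contents are made up for illustration).

```python
import sqlite3

# In-memory SQLite stands in for Presto/Hive; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE access (id INTEGER);
    CREATE TABLE unused_output (id INTEGER, score REAL);
    INSERT INTO access VALUES (1);                       -- only id 1 has access
    INSERT INTO unused_output VALUES (1, 0.5), (2, 0.25);
""")

# Correlated form: the subquery looks at the id of the current outer row.
correlated = conn.execute("""
    SELECT ub.id,
           CASE WHEN EXISTS (SELECT 1 FROM access AS aa WHERE aa.id = ub.id)
                THEN COALESCE(ub.score * 100, 0)
           END AS unused
    FROM unused_output AS ub
    ORDER BY ub.id
""").fetchall()

# Rewritten form: the inner "ub" shadows the outer one, so the subquery
# is evaluated independently of the current outer row.
shadowed = conn.execute("""
    SELECT ub.id,
           CASE WHEN EXISTS (SELECT ub.id FROM access AS aa
                             JOIN unused_output AS ub ON ub.id = aa.id)
                THEN COALESCE(ub.score * 100, 0)
           END AS unused
    FROM unused_output AS ub
    ORDER BY ub.id
""").fetchall()

print(correlated)  # id 2 has no access, so its value is NULL
print(shadowed)    # id 2 gets a score anyway
```

If the two result sets differ on your data as they do here, the rewritten query is not enforcing the access check per person.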

Instead of a correlated subquery, this can be achieved using a join.
Here is a sample:
SELECT
CASE
WHEN aa.id IS NULL THEN NULL
ELSE COALESCE(ub.score*100,0)
END AS UNUSED
FROM hive_dsn.db.unused_output AS ub
left join hive_dsn.db.access AS aa
on ub.id=aa.id ;
Note: If the tables have a 1:n relationship, use DISTINCT in the SELECT statement.
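As a rough illustration of the LEFT JOIN approach and of the 1:n note, here is a minimal sketch on SQLite (an assumption; the original runs on Presto) with made-up rows in which id 1 appears twice in the access table.

```python
import sqlite3

# In-memory SQLite stands in for Presto/Hive; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE access (id INTEGER);
    CREATE TABLE unused_output (id INTEGER, score REAL);
    INSERT INTO access VALUES (1), (1), (3);             -- id 1 is listed twice (1:n)
    INSERT INTO unused_output VALUES (1, 0.5), (2, 0.25), (3, NULL);
""")

QUERY = """
    SELECT {distinct} ub.id,
           CASE WHEN aa.id IS NULL THEN NULL
                ELSE COALESCE(ub.score * 100, 0)
           END AS unused
    FROM unused_output AS ub
    LEFT JOIN access AS aa ON ub.id = aa.id
    ORDER BY ub.id
"""

# Without DISTINCT, the duplicated access row duplicates the output row.
plain = conn.execute(QUERY.format(distinct="")).fetchall()
# With DISTINCT, each (id, unused) pair appears once.
deduped = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(plain)
print(deduped)
```

Rows with no match in access (id 2) come back as NULL, and a NULL score with access (id 3) becomes 0 through COALESCE.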

Related

Correct way to get the last value for a field in Apache Spark or Databricks Using SQL (Correct behavior of last and last_value)?

What is the correct behavior of the last and last_value functions in Apache Spark/Databricks SQL? The way I'm reading the documentation (here: https://docs.databricks.com/spark/2.x/spark-sql/language-manual/functions.html), it sounds like it should return the last value of whatever is in the expression.
So if I have a select statement that does something like
select
person,
last(team)
from
(select * from person_team order by date_joined)
group by person
I should get the last team a person joined, yes/no?
The actual query I'm running is shown below. It is returning a different number each time I execute the query.
select count(distinct patient_id) from (
select
patient_id,
org_patient_id,
last_value(data_lot) data_lot
from
(select * from my_table order by data_lot)
where 1=1
and org = 'my_org'
group by 1,2
order by 1,2
)
where data_lot in ('2021-01','2021-02')
;
What is the correct way to get the last value for a given field (for either the team example or my specific example)?
--- EDIT -------------------
I'm thinking collect_set might be useful here, but I get the error shown when I try to run this:
select
patient_id,
last_value(collect_set(data_lot)) data_lot
from
covid.demo
group by patient_id
;
Error in SQL statement: AnalysisException: It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
Aggregate [patient_id#89338], [patient_id#89338, last_value(collect_set(data_lot#89342, 0, 0), false) AS data_lot#91848]
+- SubqueryAlias spark_catalog.covid.demo
The posts shown below discuss how to get max values, which is not the same as the last value in a list ordered by a different field. I want the last team a player joined: the player may have joined the Reds, the A's, the Zebras, and the Yankees, in that order timewise, and I'm looking for the Yankees. These posts also get to the solution procedurally using Python/R; I'd like to do this in SQL.
Getting last value of group in Spark
Find maximum row per group in Spark DataFrame
--- SECOND EDIT -------------------
I ended up using something like this based upon the accepted answer.
select
row_number() over (order by provided_date, data_lot) as row_num,
demo.*
from demo
You can assign row numbers based on an ordering on data_lot if you want to get its last value:
select count(distinct patient_id) from (
select * from (
select *,
row_number() over (partition by patient_id, org_patient_id, org order by data_lot desc) as rn
from my_table
where org = 'my_org'
)
from my_table
where org = 'my_org'
)
where rn = 1
)
where data_lot in ('2021-01','2021-02');
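The same row_number() pattern applies to the team example from the question. The sketch below runs it on SQLite (an assumption, since the question targets Spark SQL; it needs a SQLite build with window functions, 3.25 or later) with made-up rows. Spark does not guarantee that the ordering of a subquery survives a later GROUP BY, which is a likely reason the last_value result changes between runs; putting the ordering inside the window avoids relying on it.

```python
import sqlite3

# In-memory SQLite stands in for Spark SQL; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_team (person TEXT, team TEXT, date_joined TEXT);
    INSERT INTO person_team VALUES
        ('ann', 'Reds',    '2019-04-01'),
        ('ann', 'Zebras',  '2020-06-15'),
        ('ann', 'Yankees', '2021-02-01'),
        ('bob', 'Zebras',  '2021-03-10'),
        ('bob', 'Reds',    '2018-09-30');
""")

# Number each person's rows from newest to oldest, then keep the newest.
last_team = conn.execute("""
    SELECT person, team
    FROM (
        SELECT person, team,
               ROW_NUMBER() OVER (PARTITION BY person
                                  ORDER BY date_joined DESC) AS rn
        FROM person_team
    ) AS ranked
    WHERE rn = 1
    ORDER BY person
""").fetchall()

print(last_team)
```

For ann this returns Yankees, the most recently joined team, whereas max(team) would return Zebras because it compares the team names themselves.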

How to debug "Each GROUP BY expression must contain at least one column that is not an outer reference error"

Since SSRS doesn't allow filters on aggregates, I found some code which helped me come up with the below query. However, when I run it I get:
Each GROUP BY expression must contain at least one column that is not an outer reference
I have searched everywhere but can't find how to fix this. I've even removed the two extra tables from the query so there were no joins at all. I need to not return any order where the total of the lines on the order is less than $500 and greater than 0.
SELECT
tdsls041_sales_order_lines.company,
tdsls041_sales_order_lines.order_number,
tdsls041_sales_order_lines.amount,
tdsls041_sales_order_lines.item,
tdsls041_sales_order_lines.container
FROM
tdsls041_sales_order_lines AS tdsls041_sales_order_lines
WHERE
(tdsls041_sales_order_lines.company = 610) AND
(tdsls041_sales_order_lines.order_number IN
(SELECT
tdsls041_sales_order_lines.order_number
FROM
tdsls041_sales_order_lines AS tdsls041_sales_order_lines_1
GROUP BY
tdsls041_sales_order_lines.order_number
HAVING
(SUM(tdsls041_sales_order_lines.amount) <= 500) OR
SUM(tdsls041_sales_order_lines.amount) > 0))
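SQL Server is complaining because every column inside the subquery, including the GROUP BY expression, is qualified with tdsls041_sales_order_lines, which is the alias of the outer query; the subquery's own alias is tdsls041_sales_order_lines_1. The GROUP BY therefore contains only outer references. On top of that, you want to use IN, which needs a list of order numbers.
One way to fix it is to give the subquery its own alias, compute the aggregate there, and then add another layer to select just the order numbers from that.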
SELECT T1.company, T1.order_number, T1.amount, T1.item, T1.container
FROM tdsls041_sales_order_lines AS T1
WHERE (T1.company = 610) AND (T1.order_number IN
(SELECT order_number FROM
(SELECT TSOL.order_number, SUM(TSOL.amount) AS TTL
FROM tdsls041_sales_order_lines AS TSOL
GROUP BY TSOL.order_number
HAVING (SUM(TSOL.amount) <= 500) OR
SUM(TSOL.amount) > 0) AS T2) )
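For comparison, here is a minimal sketch of the same idea with the subquery columns qualified by the subquery's own alias. It runs on SQLite rather than SQL Server, uses a simplified hypothetical table, and the HAVING condition is my reading of the stated requirement (drop orders whose total is above 0 and below 500), which differs from the condition in the queries above.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (company INTEGER, order_number INTEGER, amount REAL);
    INSERT INTO order_lines VALUES
        (610, 1, 200), (610, 1, 100),   -- order 1 totals 300: excluded
        (610, 2, 400), (610, 2, 300),   -- order 2 totals 700: kept
        (610, 3, 0),                    -- order 3 totals 0: kept
        (999, 4, 900);                  -- different company: filtered out
""")

# The subquery uses only its own alias (tsol), so GROUP BY has no outer references.
rows = conn.execute("""
    SELECT t1.order_number, t1.amount
    FROM order_lines AS t1
    WHERE t1.company = 610
      AND t1.order_number IN (
          SELECT tsol.order_number
          FROM order_lines AS tsol
          GROUP BY tsol.order_number
          HAVING NOT (SUM(tsol.amount) > 0 AND SUM(tsol.amount) < 500)
      )
    ORDER BY t1.order_number, t1.amount
""").fetchall()

print(rows)
```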
You can filter on aggregates in charts and tables. You have to put the aggregate filter on the group instead of on the table itself (Group Properties->Filters tab).

PostgreSQL: how to take query results and change them on the fly before saving to file

I want to know how to take query results and change them into human-readable text.
For example:
Select backup_set.id, backup_set.status, backup_set.type
from backset.table
Then take the backup_set.type result, which is usually a number such as 1, 2, 3, 4, 5, and change it to something like SUSPENDED, SCHEDULED, etc. But I don't want to change the data in the table, just the output.
This can be done using a CASE statement.
Select backup_set.id, backup_set.status,
case backup_set.type
when 1 then 'SUSPENDED'
when 2 then 'SCHEDULED'
else 'UNKNOWN'
end as "type"
from backset.table
But it would be better if you stored that information in another table and made the type column a foreign key to that lookup table. Then you can use a simple join to retrieve the description.
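A minimal sketch of the lookup-table idea, run on SQLite for convenience (the question is about PostgreSQL). The backup_type table, its description column, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE backup_type (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE backup_set (
        id     INTEGER PRIMARY KEY,
        status TEXT,
        type   INTEGER REFERENCES backup_type (id)   -- foreign key to the lookup table
    );
    INSERT INTO backup_type VALUES (1, 'SUSPENDED'), (2, 'SCHEDULED');
    INSERT INTO backup_set VALUES (10, 'ok', 1), (11, 'ok', 2);
""")

# The join replaces the CASE expression: descriptions live in data, not in the query.
rows = conn.execute("""
    SELECT bs.id, bs.status, bt.description AS type
    FROM backup_set AS bs
    JOIN backup_type AS bt ON bt.id = bs.type
    ORDER BY bs.id
""").fetchall()

print(rows)
```

Adding a new type then means inserting one row into the lookup table instead of editing every query that contains the CASE expression.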
Edit
If you want to replace UNKNOWN with the actual numeric value, you just need to put that into the ELSE part. You need to cast the number to a text value though:
Select backup_set.id, backup_set.status,
case backup_set.type
when 1 then 'SUSPENDED'
when 2 then 'SCHEDULED'
else backup_set.type::text
end as "type"
from backset.table
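The CASE expression with a text fallback can be tried out as follows. This sketch uses SQLite with hypothetical rows, and it writes the cast as CAST(type AS TEXT) because the ::text shorthand is specific to PostgreSQL.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE backup_set (id INTEGER PRIMARY KEY, status TEXT, type INTEGER);
    INSERT INTO backup_set VALUES (1, 'ok', 1), (2, 'ok', 2), (3, 'ok', 5);
""")

# Known codes are translated; unknown codes fall through as their own digits.
rows = conn.execute("""
    SELECT id,
           CASE type
               WHEN 1 THEN 'SUSPENDED'
               WHEN 2 THEN 'SCHEDULED'
               ELSE CAST(type AS TEXT)
           END AS type_label
    FROM backup_set
    ORDER BY id
""").fetchall()

print(rows)
```

The cast keeps every branch of the CASE the same type, which is why the ELSE branch cannot simply return the number.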

Oracle compare string

I have a strange issue in Oracle that I have never seen before.
I have the following INSERT ... SELECT statement:
insert into table2
(
key_field,value
)
SELECT
key_field, CASE WHEN type ='S' THEN 100 ELSE 1 END
FROM view1
WHERE key_field=1
When I run just the SELECT part, I get 100 for the second field;
however, if I run the INSERT ... SELECT statement, I get 1 in table2.
If I include the field type in the insert, then I get 100.
If I use "CASE WHEN type like 'S' THEN 100 ELSE 1 END", then I get 100 in table2 (the correct answer).
Does anyone have any idea why the first INSERT ... SELECT statement is not working?
Thank you!

Use a subquery to return columns from different tables

I want a subquery which returns columns from different tables.
For example, I am writing the code in a way similar to the one below:
USE Northwind
SELECT *,
(SELECT OrderID FROM dbo.Orders OI WHERE OI.OrderID IN
(SELECT OI.OrderID FROM [dbo].[Order Details] OD WHERE OD.UnitPrice=P.UnitPrice)) AS 'ColumName'
FROM Products P
ERROR: Msg 512, Level 16, State 1, Line 1 Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
What's the mistake in this code?
Please reply soon.
Saradhi
SELECT OrderID FROM dbo.Orders OI WHERE OI.OrderID IN (SELECT OI.OrderID FROM [dbo].[Order Details] OD WHERE OD.UnitPrice=P.UnitPrice)
This subquery is returning more than one OrderID, while it should return only one. Check whether your data is correct.
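A subquery in the SELECT list may produce at most one value per outer row, so one way to investigate is to count how many order IDs match each product, and, if several are expected, to use a join that returns one row per match. The sketch below does both on SQLite with simplified hypothetical tables; note that SQLite itself does not raise Msg 512 (it silently takes the first row), so this only illustrates the diagnosis and the join alternative, not the SQL Server error.

```python
import sqlite3

# In-memory SQLite with simplified stand-ins for the Northwind tables (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (ProductID INTEGER, UnitPrice REAL);
    CREATE TABLE order_details (OrderID INTEGER, UnitPrice REAL);
    INSERT INTO products VALUES (1, 10.0), (2, 20.0);
    INSERT INTO order_details VALUES (100, 10.0), (101, 10.0), (102, 20.0);
""")

# Diagnosis: how many distinct orders match each product's unit price?
counts = conn.execute("""
    SELECT p.ProductID,
           (SELECT COUNT(DISTINCT od.OrderID)
            FROM order_details AS od
            WHERE od.UnitPrice = p.UnitPrice) AS matching_orders
    FROM products AS p
    ORDER BY p.ProductID
""").fetchall()

# Alternative: a join returns one row per (product, order) match.
pairs = conn.execute("""
    SELECT p.ProductID, od.OrderID
    FROM products AS p
    JOIN order_details AS od ON od.UnitPrice = p.UnitPrice
    ORDER BY p.ProductID, od.OrderID
""").fetchall()

print(counts)
print(pairs)
```

Any product with a count above 1 is a row for which the original scalar subquery would fail on SQL Server.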
