How do I optimize a MariaDB query with subqueries in the FROM clause?

Imagine these two tables.
Table A
ID  col1  col2  col3
1   foo   baz   bar
2   ofo   zba   rba
3   oof   abz   abr
Table B
A_ID  field_name  field_value
1     first       Jon
1     last        Doe
2     first       Adam
2     last        Smith
etc.
Now I would like to have a query. My current one looks like this:
SELECT
    a.id,
    a.col1,
    a.col2,
    (SELECT field_value FROM B WHERE A_ID = a.id AND field_name = 'first') AS first_name,
    (SELECT field_value FROM B WHERE A_ID = a.id AND field_name = 'last') AS last_name
FROM A a
WHERE (SELECT COUNT(*) FROM B WHERE A_ID = a.id) = 2;
This query works. What I would like to achieve is something like this:
SELECT
    a.id,
    a.col1,
    a.col2,
    (SELECT field_value FROM b WHERE b.field_name = 'first') AS first_name,
    (SELECT field_value FROM b WHERE b.field_name = 'last') AS last_name
FROM
    A a,
    (SELECT field_value, field_name FROM B WHERE A_ID = a.id) b
WHERE (SELECT COUNT(*) FROM b) = 2;
What would a correct version of this approach look like? Is there any other way to avoid querying table B multiple times?
Thank you!

I would replace your correlated subqueries with joins:
SELECT
    a.id,
    a.col1,
    a.col2,
    b1.field_value AS fv1,
    b2.field_value AS fv2
FROM A a
LEFT JOIN B b1
    ON a.id = b1.A_ID AND b1.field_name = 'first'
LEFT JOIN B b2
    ON a.id = b2.A_ID AND b2.field_name = 'last';
This answer assumes that each left join from a given A record matches at most one record in the B table. That is a requirement anyway for your correlated subqueries, since each of them must return a single value.
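As a rough check of the join rewrite, the sketch below loads the sample rows from the question into an in-memory SQLite database (an assumption: SQLite via Python's sqlite3 stands in for MariaDB here) and compares the original correlated query with two join variants. The inner-join variant is not part of the answer; it reproduces the COUNT(*) = 2 filter only under the assumption that every A row has at most one 'first' and one 'last' entry and no other field names.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE B (A_ID INTEGER, field_name TEXT, field_value TEXT);
    INSERT INTO A VALUES (1,'foo','baz','bar'), (2,'ofo','zba','rba'), (3,'oof','abz','abr');
    INSERT INTO B VALUES (1,'first','Jon'), (1,'last','Doe'),
                         (2,'first','Adam'), (2,'last','Smith');
""")

# The question's current query: three correlated subqueries against B.
rows_correlated = conn.execute("""
    SELECT a.id, a.col1, a.col2,
        (SELECT field_value FROM B WHERE A_ID = a.id AND field_name = 'first') AS first_name,
        (SELECT field_value FROM B WHERE A_ID = a.id AND field_name = 'last') AS last_name
    FROM A a
    WHERE (SELECT COUNT(*) FROM B WHERE A_ID = a.id) = 2
    ORDER BY a.id
""").fetchall()

# The answer's LEFT JOIN form: keeps A rows that have no B entries.
rows_left = conn.execute("""
    SELECT a.id, a.col1, a.col2, b1.field_value, b2.field_value
    FROM A a
    LEFT JOIN B b1 ON a.id = b1.A_ID AND b1.field_name = 'first'
    LEFT JOIN B b2 ON a.id = b2.A_ID AND b2.field_name = 'last'
    ORDER BY a.id
""").fetchall()

# Hypothetical inner-join variant: drops A rows missing either field.
rows_inner = conn.execute("""
    SELECT a.id, a.col1, a.col2, b1.field_value, b2.field_value
    FROM A a
    JOIN B b1 ON a.id = b1.A_ID AND b1.field_name = 'first'
    JOIN B b2 ON a.id = b2.A_ID AND b2.field_name = 'last'
    ORDER BY a.id
""").fetchall()

print(rows_correlated)
print(rows_left)
print(rows_inner)
```

On this sample data the LEFT JOIN form keeps row 3 with NULL names, while the inner-join form drops it and matches the original filtered query.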

Related

Cognos 11 - filter between query subjects

Given Table A with columns: ColA1, ColA2, ColA3
And a Table B with columns: ColB1
I want to restrict the data that can be returned from Table A based on data in Table B, like:
ColA1 not in ColB1
Ideally, there would be some way to incorporate SQL queries with SELECT statements in the filter.
What you want is:
SELECT a.ColA1
     , a.ColA2
     , a.ColA3
FROM TableA a
    LEFT OUTER JOIN TableB b ON b.ColB1 = a.ColA1
WHERE b.ColB1 IS NULL
So...
Query1 contains ColA1, ColA2, and ColA3 from TableA.
Query2 contains ColB1 from TableB.
Query3
joins Query1 and Query2 on ColA1 1..1 = 0..1 ColB1
Data Items: ColA1, ColA2, ColA3
Filter: ColB1 IS NULL
NOT EXISTS is probably what you are looking for.
Try something like this:
SELECT * FROM TableA AS T1
WHERE NOT EXISTS
    (SELECT * FROM TableB AS T2
     WHERE T1.key1 = T2.key1 AND T1.key2 = T2.key2)
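To see that the LEFT JOIN ... IS NULL form and the NOT EXISTS form select the same rows, here is a minimal sketch. It assumes an in-memory SQLite database (through Python's sqlite3) as a stand-in for the Cognos data source, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ColA1 INTEGER, ColA2 TEXT, ColA3 TEXT);
    CREATE TABLE TableB (ColB1 INTEGER);
    INSERT INTO TableA VALUES (1,'a','x'), (2,'b','y'), (3,'c','z');
    INSERT INTO TableB VALUES (2);
""")

# Anti-join via LEFT OUTER JOIN: keep A rows with no match in B.
left_join = conn.execute("""
    SELECT a.ColA1, a.ColA2, a.ColA3
    FROM TableA a
    LEFT OUTER JOIN TableB b ON b.ColB1 = a.ColA1
    WHERE b.ColB1 IS NULL
    ORDER BY a.ColA1
""").fetchall()

# Anti-join via NOT EXISTS with a correlated subquery.
not_exists = conn.execute("""
    SELECT a.ColA1, a.ColA2, a.ColA3
    FROM TableA a
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.ColB1 = a.ColA1)
    ORDER BY a.ColA1
""").fetchall()

print(left_join)
print(not_exists)
```

Both queries return the TableA rows whose ColA1 value does not appear in TableB.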

Databricks SQL: Why is subquery in left join causing error msg

I am attempting to use a subquery in a left join condition, but am getting an error message that reads: "Error in SQL statement: AnalysisException: Table or view not found: TableD;" and points to the FROM TableD D2 statement in my subquery.
SELECT D1.Code, D1.Description, C.InstanceKey
FROM TableA A
    INNER JOIN TableB B
        ON A.Key = B.Key
    INNER JOIN TableC C
        ON B.DetailKey = C.DetailKey
    LEFT JOIN TableD D1
        ON C.InstanceKey = D1.InstanceKey
        AND D1.RankCnt = (SELECT MIN(D2.RankCnt)
                          FROM TableD D2
                          WHERE C.InstanceKey = D2.InstanceKey);
If I remove the subquery and hardcode D1.RankCnt = [anyValidRankCnt], the query runs without issue.
This question has also been posted on the Databricks Community Forum at https://forums.databricks.com/questions/14588/why-is-subquery-in-left-join-causing-error-msg.html.
I'm not sure if that particular type of correlated subquery is supported in Spark at this time, although I was able to rewrite it in a couple of different ways, including using ROW_NUMBER. Please check these queries are semantically equivalent to yours with your data:
%sql
-- Rewrite 1: CTE
WITH cte AS
(
    SELECT D1.Code, D1.Description, C.InstanceKey,
           ROW_NUMBER() OVER (PARTITION BY C.InstanceKey ORDER BY D1.RankCnt) xrank
    FROM TableA A
        INNER JOIN TableB B
            ON A.Key = B.Key
        INNER JOIN TableC C
            ON B.DetailKey = C.DetailKey
        LEFT JOIN TableD D1
            ON C.InstanceKey = D1.InstanceKey
)
SELECT *
FROM cte
WHERE xrank = 1;
-- Rewrite 2: subquery
SELECT x.Code, x.Description, C.InstanceKey
FROM TableA A
    INNER JOIN TableB B
        ON A.Key = B.Key
    INNER JOIN TableC C
        ON B.DetailKey = C.DetailKey
    LEFT JOIN
    (
        SELECT D1.InstanceKey, D1.Code, D1.Description, D1.RankCnt
        FROM TableD D1
            INNER JOIN
            (
                SELECT InstanceKey, MIN(RankCnt) RankCnt
                FROM TableD
                GROUP BY InstanceKey
            ) D2 ON D1.InstanceKey = D2.InstanceKey
                AND D1.RankCnt = D2.RankCnt
    ) x
        ON C.InstanceKey = x.InstanceKey;
-- Rewrite 3: UNION ALL
SELECT D1.Code, D1.Description, C.InstanceKey
FROM TableA A
    INNER JOIN TableB B
        ON A.Key = B.Key
    INNER JOIN TableC C
        ON B.DetailKey = C.DetailKey
    INNER JOIN TableD D1
        ON C.InstanceKey = D1.InstanceKey
    INNER JOIN
    (
        SELECT D2.InstanceKey, MIN(D2.RankCnt) RankCnt
        FROM TableD D2
        GROUP BY D2.InstanceKey
    ) x ON C.InstanceKey = x.InstanceKey
        AND D1.RankCnt = x.RankCnt
UNION ALL
SELECT NULL AS Code, NULL AS Description, C.InstanceKey
FROM TableA A
    INNER JOIN TableB B
        ON A.Key = B.Key
    INNER JOIN TableC C
        ON B.DetailKey = C.DetailKey
WHERE NOT EXISTS
(
    SELECT *
    FROM TableD D1
    WHERE C.InstanceKey = D1.InstanceKey
);
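The derived-table idea behind Rewrite 2 can be tried on a few rows. The sketch below is a simplification under stated assumptions: it uses an in-memory SQLite database through Python's sqlite3 instead of Spark, it leaves out the TableA and TableB joins, and the sample rows are invented. It only illustrates that a LEFT JOIN onto a "minimum RankCnt per InstanceKey" subquery keeps unmatched TableC rows with NULLs.

```python
import sqlite3

# In-memory SQLite database; hypothetical rows, TableA/TableB omitted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableC (InstanceKey INTEGER);
    CREATE TABLE TableD (InstanceKey INTEGER, RankCnt INTEGER, Code TEXT, Description TEXT);
    INSERT INTO TableC VALUES (10), (20), (30);
    INSERT INTO TableD VALUES
        (10, 2, 'B', 'second'),
        (10, 1, 'A', 'first'),
        (20, 5, 'C', 'only');
""")

# LEFT JOIN onto a derived table that holds only the lowest-ranked
# TableD row for each InstanceKey.
rows = conn.execute("""
    SELECT x.Code, x.Description, C.InstanceKey
    FROM TableC C
    LEFT JOIN (
        SELECT D1.InstanceKey, D1.Code, D1.Description
        FROM TableD D1
        INNER JOIN (
            SELECT InstanceKey, MIN(RankCnt) AS RankCnt
            FROM TableD
            GROUP BY InstanceKey
        ) D2 ON D1.InstanceKey = D2.InstanceKey
            AND D1.RankCnt = D2.RankCnt
    ) x ON C.InstanceKey = x.InstanceKey
    ORDER BY C.InstanceKey
""").fetchall()

print(rows)
```

InstanceKey 30 has no TableD row, so it comes back with NULL Code and Description, which is the behavior the original LEFT JOIN was after.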

DELETE FROM (SELECT ...) SAP HANA

How come this does not work and what is a workaround?
DELETE FROM
    (SELECT
        PKID
      , a
      , b)
WHERE a > 1
There is a Syntax Error at "(".
DELETE FROM (TABLE) where a > 1 gives the same syntax error.
I need to delete specific rows that are flagged using a rank function in my select statement.
I have now put a table name immediately after DELETE FROM and moved the restrictions into the WHERE clause of the DELETE, using a small series of self-joins of the table:
DELETE FROM TABLE1
WHERE x IN
    (SELECT A.x
     FROM (SELECT r0.x, r1.y AS y1, r2.y AS y2,
                  DENSE_RANK() OVER (PARTITION BY r1.y, r2.y ORDER BY r0.x) AS RANK
           FROM TABLE2 r0
           INNER JOIN TABLE1 r1 ON r0.x = r1.x
           INNER JOIN TABLE1 r2 ON r0.x = r2.x
           WHERE r1.y = foo AND r2.y = bar
          ) AS A
     WHERE A.RANK > 1
    )
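The general shape of this workaround (name the table in the DELETE, then pick the rows to remove with a ranked subquery in the WHERE clause) can be sketched as follows. This is an illustration under assumptions, not a tested SAP HANA statement: it runs against an in-memory SQLite database through Python's sqlite3, which needs SQLite 3.25 or newer for window functions, and the table t with its rows is hypothetical.

```python
import sqlite3

# In-memory SQLite database; table t and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (PKID INTEGER PRIMARY KEY, a TEXT, b TEXT);
    INSERT INTO t VALUES (1,'x','p'), (2,'x','p'), (3,'y','q'), (4,'x','p');
""")

# Rank rows inside each (a, b) group, then delete every row whose rank
# is greater than 1, so only the lowest PKID per group survives.
conn.execute("""
    DELETE FROM t
    WHERE PKID IN (
        SELECT PKID
        FROM (
            SELECT PKID,
                   DENSE_RANK() OVER (PARTITION BY a, b ORDER BY PKID) AS rnk
            FROM t
        ) AS ranked
        WHERE rnk > 1
    )
""")

remaining = conn.execute("SELECT PKID, a, b FROM t ORDER BY PKID").fetchall()
print(remaining)
```

The ranking stays in a subquery and the DELETE itself targets a plain table, which avoids the DELETE FROM (SELECT ...) form that caused the syntax error.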

Typed query for INNER JOIN (SELECT DISTINCT)?

Is it possible to create a typed query that produces the following SQL?
SELECT A.*
FROM schema1.Table1 A
INNER JOIN (SELECT DISTINCT column1, column2 FROM schema1.Table2) B ON A.column1 = B.column1
You can't join a sub select with a typed API. The easiest way to implement this would be to use a CustomJoin, e.g.:
var table1 = db.GetTableName<Table1>();
var q = db.From<Table1>()
    .CustomJoin($@"INNER JOIN
        (SELECT DISTINCT column1, column2 FROM schema1.Table2) B
        ON {table1}.column1 = B.column1");

How to write a correlated subquery for the below one?

Can anyone help me write a correlated subquery equivalent to the non-correlated subquery below?
SELECT e.emp_no, e.last_name
FROM employees e, dept_emp d
WHERE e.emp_no = d.emp_no
  AND d.dept_no = (SELECT dept_no FROM dept_emp WHERE emp_no =
      (SELECT emp_no FROM employees WHERE first_name = 'Margareta' AND last_name = 'Markovitch'));
This is the answer for the above query:
SELECT e.emp_no, e.last_name
FROM employees e
WHERE EXISTS
    (SELECT 1
     FROM dept_emp d
     INNER JOIN dept_emp m ON m.dept_no = d.dept_no
     INNER JOIN employees t ON t.emp_no = m.emp_no
     WHERE d.emp_no = e.emp_no
       AND t.first_name = 'Margareta' AND t.last_name = 'Markovitch');
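A quick way to check a correlated rewrite is to run it next to the non-correlated query on a few rows. The sketch below does that for one possible correlated EXISTS form. It assumes an in-memory SQLite database through Python's sqlite3 rather than the original database, a cut-down schema, and invented employees (Ann Lee and Bob Ray are hypothetical), with each employee in exactly one department.

```python
import sqlite3

# In-memory SQLite database with a cut-down, hypothetical data set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT);
    INSERT INTO employees VALUES
        (1, 'Margareta', 'Markovitch'),
        (2, 'Ann', 'Lee'),
        (3, 'Bob', 'Ray');
    INSERT INTO dept_emp VALUES (1, 'd001'), (2, 'd001'), (3, 'd002');
""")

# Non-correlated form from the question: everyone in Margareta's department.
non_correlated = conn.execute("""
    SELECT e.emp_no, e.last_name
    FROM employees e, dept_emp d
    WHERE e.emp_no = d.emp_no
      AND d.dept_no = (SELECT dept_no FROM dept_emp WHERE emp_no =
          (SELECT emp_no FROM employees
           WHERE first_name = 'Margareta' AND last_name = 'Markovitch'))
    ORDER BY e.emp_no
""").fetchall()

# Correlated form: the subquery refers to the outer row through e.emp_no.
correlated = conn.execute("""
    SELECT e.emp_no, e.last_name
    FROM employees e
    WHERE EXISTS (
        SELECT 1
        FROM dept_emp d
        JOIN dept_emp m ON m.dept_no = d.dept_no
        JOIN employees t ON t.emp_no = m.emp_no
        WHERE d.emp_no = e.emp_no
          AND t.first_name = 'Margareta' AND t.last_name = 'Markovitch'
    )
    ORDER BY e.emp_no
""").fetchall()

print(non_correlated)
print(correlated)
```

Both queries return Margareta and the colleague who shares her department. If an employee could belong to several departments, the two forms may differ in duplicate rows, so that case would need a separate check.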
