How to include 'PARTITION BY' in the TD15 PIVOT function? - pivot

Right now I have a query like this:
SELECT a, b,
SUM (CASE WHEN measure_name = 'ABC' THEN measure_qty END) OVER (PARTITION BY a, b ) AS ABCPIVOT
FROM data_app.work_test
TD15 now supports PIVOT directly.
How do I include this PARTITION BY in the PIVOT function?

Related

Error Snowflake - Unsupported subquery type cannot be evaluated

I am facing an error in Snowflake saying "Unsupported subquery type cannot be evaluated" after executing, for example, the statement below. How should I write this statement to avoid the error?
select A
from (
select b
, c
FROM test_table
) ;
The outer query's column list must be drawn from the subquery's column list. Example: select b from (select b,c from test_table);
Ignoring the column names, the query you have shown will never trigger this error.
You would get it from this form, though:
select A.*
from tableA as A
where a.x = (select b.y FROM test_table as b where b.z = a.z)
This form, assuming there is only one b.y per b.z, can be turned into an inner join like this:
select A.*
from tableA as A
join test_table as b
on b.z = a.z and a.x = b.y
Other forms of this pattern use an aggregate such as max(b.y), and those can be rewritten as a join against a sub-select like this:
select A.*
from tableA as A
join (
select c.z, max(c.y) as y from test_table as c group by 1
) as b
on b.z = a.z and a.x = b.y
The general pattern is this: in other databases there is no apparent "cost" to row-by-row (correlated) queries, whereas Snowflake works best when you pre-build tables of similar data and then equi-join those results together. So both rewrites above pivot from for-each-row thinking to building the set of all possible answers and then joining against it, which allows the most parallel processing of the data. It does mean that you, the developer, need to understand your data to get the best performance out of it, but if you are doing large-scale data processing you should understand your data anyway. So this cost is rather acceptable, imho.
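As a rough illustration of that rewrite, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite accepts both forms, so it does not reproduce the Snowflake error; it only shows that the correlated form and the pre-aggregated join return the same rows. The sample rows are made up for the example.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (x INTEGER, z INTEGER);
    CREATE TABLE test_table (y INTEGER, z INTEGER);
    INSERT INTO tableA VALUES (10, 1), (20, 2), (99, 3);
    INSERT INTO test_table VALUES (5, 1), (10, 1), (20, 2), (30, 3);
""")

# Row-by-row form: a correlated scalar subquery evaluated per outer row.
correlated = conn.execute("""
    SELECT a.x, a.z
    FROM tableA AS a
    WHERE a.x = (SELECT MAX(b.y) FROM test_table AS b WHERE b.z = a.z)
    ORDER BY a.z
""").fetchall()

# Set-based form: compute max(y) per z once, then equi-join on z and y.
joined = conn.execute("""
    SELECT a.x, a.z
    FROM tableA AS a
    JOIN (
        SELECT c.z, MAX(c.y) AS y
        FROM test_table AS c
        GROUP BY c.z
    ) AS b
      ON b.z = a.z AND a.x = b.y
    ORDER BY a.z
""").fetchall()

print(correlated)
print(joined)
```

Note that the aggregate needs an alias (y here) so the join condition can refer to it.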
If you are trying to match two attributes against the subquery, use something like the following.
If both need to match:
select * from Table WHERE a IN ( select b FROM test_table ) AND a IN ( select c FROM test_table )
If either one needs to match:
select * from Table WHERE a IN ( select b FROM test_table ) OR a IN ( select c FROM test_table )

How to do compare/subtract records

Table A has 20 records and table B shows 19 records. How do I find the one record that is missing from table B? How can I compare/subtract the records of these two tables to find that record? I am running the query in Apache Superset.
The exact answer depends on which column(s) define whether two records are the same. Assuming you wanted to use some primary key column for the comparison, you could try:
SELECT a.*
FROM TableA a
WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.pk = a.pk);
If you wanted to use more than one column to compare records from the two tables, then you would just add logic to the exists clause, e.g. for three columns:
WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.col1 = a.col1 AND
b.col2 = a.col2 AND
b.col3 = a.col3)
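To see the anti-join in action, here is a small sketch run against an in-memory SQLite database via Python's sqlite3 module. The table contents and the name column are invented for the example. It also shows EXCEPT, a different technique that compares whole rows rather than a key; whether the database behind your Superset connection supports EXCEPT is an assumption you would need to check.

```python
import sqlite3

# Hypothetical data: TableB is missing the row with pk = 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE TableB (pk INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO TableA VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO TableB VALUES (1, 'one'), (3, 'three');
""")

# Anti-join: rows of TableA with no matching pk in TableB.
missing = conn.execute("""
    SELECT a.*
    FROM TableA a
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.pk = a.pk)
""").fetchall()

# Set subtraction: rows of TableA that do not appear, column for column, in TableB.
missing_except = conn.execute("""
    SELECT pk, name FROM TableA
    EXCEPT
    SELECT pk, name FROM TableB
""").fetchall()

print(missing)
print(missing_except)
```

The two queries agree here because the rows that share a pk are identical in every column. If a row existed in both tables with different values, EXCEPT would report it while the NOT EXISTS on pk would not.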

DELETE FROM (SELECT ...) SAP HANA

How come this does not work and what is a workaround?
DELETE FROM
(SELECT
PKID
, a
, b)
Where a > 1
There is a Syntax Error at "(".
DELETE FROM (TABLE) where a > 1 gives the same syntax error.
I need to delete specific rows that are flagged using a rank function in my select statement.
I have now put a table immediately after the DELETE FROM and put WHERE restrictions on the DELETE and in a small series of self-joins of the table.
DELETE FROM TABLE1
WHERE x IN
(SELECT A.x
FROM (SELECT r0.x, r1.y AS y1, r2.y AS y2, DENSE_RANK() OVER (PARTITION BY r1.y, r2.y ORDER BY r0.x) AS RANK
FROM TABLE2 r0
INNER JOIN TABLE1 r1 on r0.x = r1.x
INNER JOIN TABLE1 r2 on r0.x = r2.x
WHERE r1.y = foo and r2.y = bar
) AS A
WHERE A.RANK > 1
)
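The same shape (name the table directly in DELETE FROM and push the ranking into an IN subquery) can be tried out in a simplified, single-table form. This is only a sketch on SQLite through Python's sqlite3 module, not SAP HANA; it assumes an SQLite build with window-function support (3.25 or later), and the table contents and the alias rnk are made up for illustration.

```python
import sqlite3

# Hypothetical data: several rows share the same y value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pkid INTEGER PRIMARY KEY, y TEXT);
    INSERT INTO table1 VALUES
        (1, 'foo'), (2, 'foo'), (3, 'bar'), (4, 'foo'), (5, 'bar');
""")

# Rank rows within each y group, then delete every row ranked after the first.
conn.execute("""
    DELETE FROM table1
    WHERE pkid IN (
        SELECT ranked.pkid
        FROM (
            SELECT pkid,
                   DENSE_RANK() OVER (PARTITION BY y ORDER BY pkid) AS rnk
            FROM table1
        ) AS ranked
        WHERE ranked.rnk > 1
    )
""")

remaining = conn.execute("SELECT pkid, y FROM table1 ORDER BY pkid").fetchall()
print(remaining)
```

Only the lowest pkid of each y group survives, because the ranking is ordered by pkid within the group.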

How do I select all rows for a clustering column in cassandra?

I have a partition key: A
Clustering columns: B, C
I do understand I can query like this
Select * from table where A = ?
Select * from table where A = ? and B = ?
Select * from table where A = ? and B = ? and C = ?
In certain cases, I want the B value to be any value in that column.
Is there a way I can query like the following?
Select * from table where A = ? and B = 'any value' and C = ?
Option 1:
In Cassandra, you should design your data model to suit your queries. The proper way to support your fourth query (querying by A and C without necessarily knowing the B value) is therefore to create a new table for that specific query. This table will be pretty much the same, except that the clustering columns will be in a slightly different order:
PRIMARY KEY (A, C, B)
Now this query will work:
Select * from table where A = ? and C = ?
Option 2:
Alternatively you can create a materialized view, with a different clustering order. Now Cassandra will keep the MV in sync with your table data.
create materialized view mv_acbd as
select A, B, C, D
from TABLE1
where A is not null and B is not null and C is not null
primary key (A, C, B);
Now the query against this MV will work like a charm
Select * from mv_acbd where A = ? and C = ?
Option 3:
Not the best, but you could use the following query with your table as it is
Select * from table where A = ? and C = ? ALLOW FILTERING
Relying on ALLOW FILTERING is never a good idea, and is certainly not something that you should do in a production cluster. In this particular case the scan stays within a single partition, and performance may vary depending on how many rows (clustering key combinations) each partition holds in your use case.

SubSelect MDX Query as filtered list of main query

Hi all,
I want to write an MDX query equivalent to this SQL:
select a, b, sum(x)
from table1
where b = "True" and a in (select distinct c from table2 where c is not null and d="True")
group by a,b
I tried something like this:
SELECT
NON EMPTY { [Measures].[X] } ON COLUMNS,
NON EMPTY { [A].[Name].[Name]
*[B].[Name].[Name].&[True]
} ON ROWS
FROM
(
SELECT
{ ([A].[Name].[Name] ) } ON 0
FROM
( SELECT (
{EXCEPT([C].[Name].ALLMEMBERS, [C].[Name].[ALL].UNKNOWNMEMBER) }) ON COLUMNS
FROM
( SELECT (
{ [D].[Name].&[True] } ) ON COLUMNS
FROM [CUBE]))
)
But it returns the sum of x from the subquery.
What should it look like?
Does X's measure group have a relationship with the D dimension? If so, the following code should just work:
Select
[Measures].[X] on 0,
Non Empty [A].[Name].[Name].Members * [B].[Name].&[True] on 1
From [CUBE]
Where ([D].[Name].&[True])
If you have a many-to-many relationship, you need an extra measure (say Y):
Select
[Measures].[X] on 0,
Non Empty NonEmpty([A].[Name].[Name].Members,[Measures].[Y]) * [B].[Name].&[True] on 1
From [CUBE]
Where ([D].[Name].&[True])
