Cassandra selecting by reverse order of clustering order - cassandra

I want to select rows in both ASC and DESC order, but Cassandra's data order is fixed.
I use ScyllaDB.
An example scenario of my problem:
I have a table :
CREATE TABLE tbl (A text, B text, C text, PRIMARY KEY (A, B, C));
After inserting data, my table looks like this:
Now I want to select the top 1 (or x) item relative to row (A - B - 3),
and after that select the bottom 1 (or x) item relative to row (A - B - 3).
The clustering order on C is ASC and it is fixed!
Selecting the bottom 1 item works:
SELECT * FROM tbl WHERE A='A' AND B='B' AND C > '3' LIMIT 1 ;
but selecting the top item relative to (A - B - 3) is my problem:
SELECT * FROM tbl WHERE A='A' AND B='B' AND C < '3' ???
With the fixed ASC order this returns rows starting from the smallest C, not the rows just below '3'. Is there any solution for selecting the top item in Cassandra?
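One approach that may fit here (a sketch, assuming the schema above): CQL lets you reverse the clustering order per query with ORDER BY, provided the clustering columns are listed in the exact reverse of their declared order. Because B is fixed by an equality restriction, reversing it changes nothing, and the rows with the largest C below '3' come back first:
SELECT * FROM tbl
WHERE A = 'A' AND B = 'B' AND C < '3'
ORDER BY B DESC, C DESC   -- reverse of the declared (B ASC, C ASC) clustering order
LIMIT 1;
Use LIMIT x for the top x items. Reversed reads are supported by both Cassandra and ScyllaDB, though they can be slower than reads in the natural clustering order.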

Related

Error Snowflake - Unsupported subquery type cannot be evaluated

I am facing an error in Snowflake saying "Unsupported subquery type cannot be evaluated" after executing, for example, the statement below. How should I write this statement to avoid the error?
select A
from (
select b
, c
FROM test_table
) ;
The outer query's column list needs to be a subset of the subquery's column list. Example: select b from (select b, c from test_table);
Setting aside the column-list issue, the query you have shown will never trigger this error.
You would get it from this form though:
select A.*
from tableA as A
where a.x = (select b.y FROM test_table as b where b.z = a.z)
This form, assuming there is only one b.y per b.z, can be turned into an inner join like:
select A.*
from tableA as A
join test_table as b
on b.z = a.z and a.x = b.y
Other forms of this pattern do the likes of max(b.y), and those can be made into a sub-select like:
select A.*
from tableA as A
join (
select c.z, max(c.y) as y from test_table as c group by 1
) as b
on b.z = a.z and a.x = b.y
But the general pattern is: in other databases there is no "cost" to doing row-by-row queries, whereas Snowflake is more efficient when you pre-build tables of similar data and then equi-join those results together. So the "how to write it" examples pivot from for-each-row thinking to building the set of all possible answers and then joining against it. This allows for the most parallel processing of the data possible. And while it means you, the developer, need to understand your data to get the best performance out of it, in general if you are doing large-scale data processing you should understand your data anyway, so this cost is rather acceptable imho.
If you are trying to match two attributes in the subquery, use something like the below.
If both need to match:
select * from Table WHERE a IN ( select b FROM test_table ) AND a IN ( select c FROM test_table )
If either one needs to match:
select * from Table WHERE a IN ( select b FROM test_table ) OR a IN ( select c FROM test_table )

Insert new rows, continue existing rowset row_number count

I'm attempting to perform some sort of upsert operation in U-SQL, where I pull data every day from a file and compare it with yesterday's data, which is stored in a table in Data Lake Storage.
I have created an ID column in the Data Lake table using ROW_NUMBER(), and it is this "counter" I wish to continue when appending new rows to the old dataset. E.g.
Last inserted row in DL table could look like this:
ID | Column1 | Column2
---+------------+---------
10 | SomeValue | 1
I want the next rows to have the following ascending IDs:
11 | SomeValue | 1
12 | SomeValue | 1
How would I go about making sure that the next X rows continue the ID count incrementally, such that each new row's ID is 1 more than the last?
You could use ROW_NUMBER and then add it to the max value from the original table (i.e. using CROSS JOIN and MAX). A simple demo of the technique:
DECLARE @outputFile string = @"\output\output.csv";
@originalInput =
    SELECT *
    FROM ( VALUES
        ( 10, "SomeValue 1", 1 )
    ) AS x ( id, column1, column2 );
@newInput =
    SELECT *
    FROM ( VALUES
        ( "SomeValue 2", 2 ),
        ( "SomeValue 3", 3 )
    ) AS x ( column1, column2 );
@output =
    SELECT id, column1, column2
    FROM @originalInput
    UNION ALL
    SELECT (int)(x.id + ROW_NUMBER() OVER()) AS id, column1, column2
    FROM @newInput
    CROSS JOIN ( SELECT MAX(id) AS id FROM @originalInput ) AS x;
OUTPUT @output
TO @outputFile
USING Outputters.Csv(outputHeader:true);
My results:
You will have to be careful if the original table is empty and add some additional conditions / null checks, but I'll leave that up to you.
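For completeness, one hedged way to handle the empty-table case (a sketch only, assuming id is an int as in the demo above, and using a hypothetical @seededIds / @maxId rowset): seed the MAX aggregate with a 0 row so it never runs over an empty input.
@seededIds =
    SELECT id FROM @originalInput
    UNION ALL
    // seed row: guarantees at least one row even when @originalInput is empty
    SELECT 0 AS id FROM ( VALUES ( 1 ) ) AS seed ( dummy );
@maxId =
    SELECT MAX(id) AS id
    FROM @seededIds;
@output =
    SELECT id, column1, column2
    FROM @originalInput
    UNION ALL
    SELECT (int)(x.id + ROW_NUMBER() OVER()) AS id, column1, column2
    FROM @newInput
    CROSS JOIN @maxId AS x;
With the seed row, new IDs start at 1 when the original table is empty and at MAX(id) + 1 otherwise.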

DELETE FROM (SELECT ...) SAP HANA

How come this does not work and what is a workaround?
DELETE FROM
(SELECT
PKID
, a
, b)
Where a > 1
There is a Syntax Error at "(".
DELETE FROM (TABLE) where a > 1 gives the same syntax error.
I need to delete specific rows that are flagged using a rank function in my select statement.
I have now put a table name immediately after the DELETE FROM, and put the WHERE restrictions on the DELETE itself together with a small series of self-joins of the table.
DELETE FROM TABLE1
WHERE x IN
    (SELECT A.x
     FROM (SELECT r0.x AS x,
                  r1.y AS y1,
                  r2.y AS y2,
                  DENSE_RANK() OVER (PARTITION BY r1.y, r2.y ORDER BY r0.x) AS rnk
           FROM TABLE2 r0
           INNER JOIN TABLE1 r1 ON r0.x = r1.x
           INNER JOIN TABLE1 r2 ON r0.x = r2.x
           WHERE r1.y = foo AND r2.y = bar
          ) AS A
     WHERE A.rnk > 1
    );

SubSelect MDX Query as filtered list of main query

Hi all
I want to write an MDX query equivalent to this SQL:
select a, b, sum(x)
from table1
where b = "True" and a in (select distinct c from table2 where c is not null and d="True")
group by a,b
I tried something like this:
SELECT
NON EMPTY { [Measures].[X] } ON COLUMNS,
NON EMPTY { [A].[Name].[Name]
*[B].[Name].[Name].&[True]
} ON ROWS
FROM
(
SELECT
{ ([A].[Name].[Name] ) } ON 0
FROM
( SELECT (
{EXCEPT([C].[Name].ALLMEMBERS, [C].[Name].[ALL].UNKNOWNMEMBER) }) ON COLUMNS
FROM
( SELECT (
{ [D].[Name].&[True] } ) ON COLUMNS
FROM [CUBE]))
)
But it returns the sum of x from the subquery.
What should it look like?
Does X's measure group have a relationship with the D dimension? If so, the following code should just work:
Select
[Measures].[X] on 0,
Non Empty [A].[Name].[Name].Members * [B].[Name].&[True] on 1
From [CUBE]
Where ([D].[Name].&[True])
If you have a many-to-many relationship, you need an extra measure (say Y):
Select
[Measures].[X] on 0,
Non Empty NonEmpty([A].[Name].[Name].Members,[Measures].[Y]) * [B].[Name].&[True] on 1
From [CUBE]
Where ([D].[Name].&[True])

Select one column (with multiple rows) 5 times from the same table with different dates in the where clause

The DB records all user activity daily. I am trying to compile a summary report to display the total number of actions per day per user. The problem is I want to stack the results next to each other. I have referred to the following Stack Overflow questions:
mysql Select one column twice from the same table with different dates in the where clause
Select two columns from same table with different WHERE conditions
but I still continue to get the "subquery returns more than one row" error (#1242). All help is appreciated. Thank you.
This is my query, just for 2 days to start with.
SELECT LOGGEDIN_USER AS EnquiryHero,
    ( SELECT COUNT(user_id) FROM applications
      WHERE DATE_TIME LIKE "2016-08-24%" GROUP BY user_id ) AS Day1,
    ( SELECT COUNT(user_id) FROM applications
      WHERE DATE_TIME LIKE "2016-08-25%" GROUP BY user_id ) AS Day2
FROM applications WHERE DATE_TIME LIKE "2016-08-24%" GROUP BY user_id;
--
SELECT user_id,
    SUM( IF( the_day = '2016-08-24', ct, 0 )) AS `2016-08-24`,
    SUM( IF( the_day = '2016-08-25', ct, 0 )) AS `2016-08-25`,
    SUM( IF( the_day = '2016-08-26', ct, 0 )) AS `2016-08-26`,
    SUM( IF( the_day = '2016-08-27', ct, 0 )) AS `2016-08-27`
FROM ( SELECT user_id, DATE(date_time) AS the_day, loggedin_user, COUNT(*) AS ct
       FROM applications GROUP BY 1, 2 ) AS x
GROUP BY user_id;
First focus on getting the data; then focus on "pivoting" the data.
SELECT user_id,
DATE(`date_time`) AS the_day,
COUNT(*) AS ct
FROM applications
GROUP BY 1, 2;
See if that gives you the data desired; then look at how to "pivot". See the extra tag I added.
Then pivot
SELECT user_id,
    SUM(IF(the_day = '2016-08-24', ct, 0)) AS `2016-08-24`,
    SUM(IF(the_day = '2016-08-25', ct, 0)) AS `2016-08-25`,
    SUM(IF(the_day = '2016-08-26', ct, 0)) AS `2016-08-26`,
    SUM(IF(the_day = '2016-08-27', ct, 0)) AS `2016-08-27`,
...
FROM (
the query above
) AS x
GROUP BY user_id;
