I created a table like this:
CREATE TABLE sm.data (
did int,
tid int,
ts timestamp,
aval text,
dval decimal,
PRIMARY KEY (did, tid, ts)
) WITH CLUSTERING ORDER BY (tid ASC, ts DESC);
Before, all my SELECT queries used ts DESC, so this worked fine. Now I also need to SELECT with ts ASC in some cases. How do I accomplish that? Thank you
You can simply use ORDER BY ts ASC
Example:
SELECT * FROM data WHERE did = ? and tid = ? ORDER BY ts ASC
If you do this select:
select * from data where did=1 and tid=2 order by ts asc;
you will end up with an error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Order by currently only support the ordering of columns following their declared order in the PRIMARY KEY"
I have tested it against my local Cassandra DB.
I would suggest altering the order of the primary key columns. The reason is this:
"Querying compound primary keys and sorting results: ORDER BY clauses can select a single column only. That column has to be the second column in a compound PRIMARY KEY."
CREATE TABLE data2 (
did int,
tid int,
ts timestamp,
aval text,
dval decimal,
PRIMARY KEY (did, ts, tid)
) WITH CLUSTERING ORDER BY (ts DESC, tid ASC);
Now we are free to choose the ordering for ts:
cassandra@cqlsh:airline> SELECT * FROM data2 WHERE did = 1 and ts=2 order by ts DESC;
 did | ts | tid | aval | dval
-----+----+-----+------+------
(0 rows)
cassandra@cqlsh:airline> SELECT * FROM data2 WHERE did = 1 and ts=2 order by ts ASC;
 did | ts | tid | aval | dval
-----+----+-----+------+------
(0 rows)
Another way would be to create either a new table or a materialized view; the latter leads to data duplication behind the scenes anyway.
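For example, here is a hedged sketch of such a view on the original table (the view name is made up); it keeps the same primary key but clusters ts ascending, so reads against it come back oldest-first:
CREATE MATERIALIZED VIEW sm.data_by_ts_asc AS
SELECT * FROM sm.data
WHERE did IS NOT NULL AND tid IS NOT NULL AND ts IS NOT NULL
PRIMARY KEY (did, tid, ts)
WITH CLUSTERING ORDER BY (tid ASC, ts ASC);
SELECT * FROM sm.data_by_ts_asc WHERE did = ? AND tid = ?; -- rows come back with ts ascending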
Hope that's clear enough.
Related
I am not able to perform a GROUP BY on a primary key column. I am using Cassandra 3.10. When I group by, I get the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Group by currently only support groups of columns following their declared order in the Primary Key"
My column is part of the primary key, yet I am still facing the problem.
My schema is:
Table trends{
name text,
price int,
quantity int,
code text,
code_name text,
cluster_id text
uitime timeuuid,
primary key((name,price),code,uitime))
with clustering order by (code DESC, uitime DESC)
And the command that I run is: select sum(quantity) from trends group by code;
For starters your schema is invalid. You cannot set clustering order on code because it is the partition key. The order is going to be determined by the hash of it (unless using byte order partitioner - but don't do that).
The query and behavior you are talking about do work, though. For example, you can run:
> SELECT keyspace_name, sum(partitions_count) AS approx_partitions FROM system.size_estimates GROUP BY keyspace_name;
keyspace_name | approx_partitions
--------------------+-------------------
system_auth | 128
basic | 4936508
keyspace1 | 870
system_distributed | 0
system_traces | 0
where the schema is:
CREATE TABLE system.size_estimates (
keyspace_name text,
table_name text,
range_start text,
range_end text,
mean_partition_size bigint,
partitions_count bigint,
PRIMARY KEY ((keyspace_name), table_name, range_start, range_end)
) WITH CLUSTERING ORDER BY (table_name ASC, range_start ASC, range_end ASC)
Perhaps the pseudo-schema you provided differs from the actual one. Can you provide the output of DESCRIBE TABLE xxxxx in your question?
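Going by the error message itself, GROUP BY has to follow the columns in their declared primary key order. So, assuming the key really is ((name, price), code, uitime), a sketch like this should be accepted:
SELECT name, price, code, sum(quantity) FROM trends GROUP BY name, price, code;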
This is the query I used to create the table:
CREATE TABLE test.comments (msguuid timeuuid, page text, userid text, username text, msg text, timestamp int, PRIMARY KEY (timestamp, msguuid));
then I create a materialized view:
CREATE MATERIALIZED VIEW test.comments_by_page AS
SELECT *
FROM test.comments
WHERE page IS NOT NULL AND msguuid IS NOT NULL
PRIMARY KEY (page, timestamp, msguuid)
WITH CLUSTERING ORDER BY (msguuid DESC);
I want to get the last 50 rows sorted by timestamp in ascending order.
This is the query I'm trying:
SELECT * FROM test.comments_by_page WHERE page = 'test' AND timestamp < 1496707057 ORDER BY timestamp ASC LIMIT 50;
which then gives this error: InvalidRequest: code=2200 [Invalid query] message="Order by currently only support the ordering of columns following their declared order in the PRIMARY KEY"
How can I accomplish this?
Materialized view rules are basically the same as those for "standard" tables: if you want a specific order, you must specify it in the clustering key.
So you have to put your timestamp column into the clustering section.
The clustering order clause should list the view's clustering columns in their declared order, with the direction you want on timestamp:
WITH CLUSTERING ORDER BY (timestamp ASC, msguuid DESC)
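Assuming the base table from the question, a sketch of the full view definition under that change would be (note that every primary key column of the view, timestamp included, needs an IS NOT NULL restriction):
CREATE MATERIALIZED VIEW test.comments_by_page AS
SELECT *
FROM test.comments
WHERE page IS NOT NULL AND timestamp IS NOT NULL AND msguuid IS NOT NULL
PRIMARY KEY (page, timestamp, msguuid)
WITH CLUSTERING ORDER BY (timestamp ASC, msguuid DESC);
With that in place, SELECT * FROM test.comments_by_page WHERE page = 'test' AND timestamp < 1496707057 LIMIT 50; comes back ordered by timestamp ascending without an explicit ORDER BY.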
I am having an issue filtering by a column of type frozen&lt;list&lt;bigint&gt;&gt; that is part of a clustering key sorted in DESC order.
Context
This is the definition of my keyspace and tables
CREATE KEYSPACE hello WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE hello.table1 (
fn bigint,
et smallint,
st frozen<list<bigint>>,
tn bigint,
ts timestamp,
PRIMARY KEY ((fn, et), st, tn));
CREATE TABLE hello.table2 (
fn bigint,
et smallint,
st frozen<list<bigint>>,
tn bigint,
ts timestamp,
PRIMARY KEY ((fn, et), st, tn)) WITH CLUSTERING ORDER BY (st DESC, tn DESC);
What works...
Inserting records into table1 works fine:
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[23],1,0);
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[24],1,0);
INSERT INTO hello.table1(fn, et,st,tn, ts) VALUES ( 1,1,[25],1,0);
Selecting records and specifying a frozen column in the where clause also works fine
select * from hello.table1 where fn=1 and et=1 and st=[23];
What does not work...
Inserting records into table2 does not work:
INSERT INTO hello.table2(fn, et,st,tn, ts) VALUES ( 1,1,[23],1,0);
And if I already have records inserted (from my application), selecting records and specifying a frozen column in the where clause also does not work
select * from hello.table2 where fn=1 and et=1 and st=[23];
I'm trying to model some time series data in Cassandra which I had been able to do with the older thrift client but CQL seems to be throwing me off.
I want to add a NEW column to my row IF a specific column value matches.
My table definition is:
CREATE TABLE TestTable (
key int,
base uuid,
ts int, // Timestamp (column name)
val text, // Timestamp value (column value)
PRIMARY KEY (key, ts)
) WITH CLUSTERING ORDER BY (ts DESC);
What I'm guessing it'd look like is:
Row | UUID | TS | TS | TS
----|------|----|----|----
1   | id1  | 1  | 2  | 3
2   | id2  | 1  | 5  | 6
So essentially, I can have a bunch of Timestamps for a given row and a SINGLE UUID for a row.
The UUID needs to be updated for each new insert of a TS column.
So inserts in a row work just fine:
insert into TestTable(key, base, ts, val) values (1, dfb63886-91a4-11e6-ae22-56b6b6499611, 50, 'one')
But I'm failing to figure out a way, using CQL, to INSERT a new column in a row using Cassandra transactions (CAS).
This one fails:
insert into TestTable(key, base, ts, val) values (1, dfb63886-91a4-11e6-ae22-56b6b6499611, 70, 'four') if base = dfb63886-91a4-11e6-ae22-56b6b6499611;
with the error:
SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message="line 1:106 mismatched input 'base' expecting K_NOT (..., 70, 'four') if [base] =...)">
And the query:
update TestTable set val = 'four', ts=70 where key = 1 if base = dfb63886-91a4-11e6-ae22-56b6b6499611;
fails with the error:
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY part ts found in SET part"
I'm trying to figure out how to model the data properly so that I only have one UUID per row and can have multiple columns without having to explicitly define them during table creation, since it can vary quite a bit.
IIRC, it was easy doing this with the thrift client but using that isn't an option =/
There is a nice tutorial regarding time series data here.
In a nutshell, your composite key will be your unique identifier (like the UUID you were proposing) and a timestamp, so you will be able to add as many events/values as you need associated with a UUID:
CREATE TABLE IF NOT EXISTS TestTable (
base uuid,
ts timestamp, // Timestamp (column name)
value text, // Timestamp value (column value)
PRIMARY KEY (base, ts)
) WITH CLUSTERING ORDER BY (ts DESC);
Added values will share the same UUID with different times:
INSERT INTO TestTable (base, ts, value)
VALUES (467286c5-7d13-40c2-92d0-73434ee8970c, dateof(now()), 'abc');
INSERT INTO TestTable (base, ts, value)
VALUES (467286c5-7d13-40c2-92d0-73434ee8970c, dateof(now()), 'def');
cqlsh:test> SELECT * FROM TestTable WHERE base = 467286c5-7d13-40c2-92d0-73434ee8970c;
base | ts | value
--------------------------------------+---------------------------------+-------
467286c5-7d13-40c2-92d0-73434ee8970c | 2016-10-14 04:13:42.779000+0000 | def
467286c5-7d13-40c2-92d0-73434ee8970c | 2016-10-14 04:12:50.551000+0000 | abc
(2 rows)
Updates can be made to any of the columns except the ones used as keys. The errors shown for your statements were caused by the IF clause and by trying to SET ts, which is part of the composite key.
INSERT INTO TestTable (base, ts, value)
VALUES (ffb0bb8e-3d67-4203-8c53-046a21992e52, dateof(now()), 'bananas');
SELECT * FROM TestTable WHERE base = ffb0bb8e-3d67-4203-8c53-046a21992e52 AND ts < dateof(now());
base | ts | value
--------------------------------------+---------------------------------+---------
ffb0bb8e-3d67-4203-8c53-046a21992e52 | 2016-10-14 04:17:26.421000+0000 | bananas
(1 rows)
UPDATE TestTable SET value = 'apples' WHERE base = ffb0bb8e-3d67-4203-8c53-046a21992e52 AND ts = '2016-10-14 04:17:26.421+0000';
SELECT * FROM TestTable WHERE base = ffb0bb8e-3d67-4203-8c53-046a21992e52 AND ts < dateof(now());
base | ts | value
--------------------------------------+---------------------------------+---------
ffb0bb8e-3d67-4203-8c53-046a21992e52 | 2016-10-14 04:17:26.421000+0000 | apples
(1 rows)
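As a side note on the IF attempts from the question: lightweight transactions can only test non-key columns, or use IF NOT EXISTS. A hedged sketch against the TestTable above (the literal values are made up; the timestamp is the one from the example rows):
-- insert only if no row exists yet for this (base, ts) pair
INSERT INTO TestTable (base, ts, value)
VALUES (467286c5-7d13-40c2-92d0-73434ee8970c, dateof(now()), 'ghi')
IF NOT EXISTS;
-- conditionally update a non-key column; the full primary key is still required in WHERE
UPDATE TestTable SET value = 'pears'
WHERE base = ffb0bb8e-3d67-4203-8c53-046a21992e52 AND ts = '2016-10-14 04:17:26.421+0000'
IF value = 'apples';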
Below is the table.
CREATE TABLE threadpool(
threadtype int,
threadid bigint,
jobcount bigint,
valid boolean,
PRIMARY KEY (threadtype, jobcount, threadid)
);
I want to run the two queries below on this table.
SELECT * FROM threadpool WHERE threadtype = 1 ORDER BY jobcount ASC LIMIT 1;
UPDATE threadpool SET valid = false WHERE threadtype = 1 and threadid = 4;
The second query fails with the below reason.
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY column "threadid" cannot be restricted (preceding column "jobcount" is either not restricted or by a non-EQ relation)"
Can anybody please help me model the data to support both of the above queries?
Your described data model can't work, because:
- only values of datatype counter can be incremented using a CQL statement,
- counter tables can only have counter columns beside the PK (so no regular column like valid), and
- you cannot sort by counter values.
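For illustration, a minimal counter-table sketch (hypothetical table name) that shows those constraints in action:
CREATE TABLE threadpool_counts (
threadtype int,
threadid bigint,
jobcount counter, -- every non-key column must be a counter, so valid boolean cannot live here
PRIMARY KEY (threadtype, threadid)
);
-- incrementing is the only way to change a counter
UPDATE threadpool_counts SET jobcount = jobcount + 1 WHERE threadtype = 1 AND threadid = 4;
-- jobcount cannot be a clustering column, so ORDER BY jobcount is not possible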