Cassandra compound clustering key and queries with ordering - cassandra

We use cassandra wide rows heavily to store per user time-series as they are perfect for that use-case. Let's assume we have a table:
create table user_events (
user_id text,
timestmp timestamp,
event text,
primary key((user_id), timestmp));
What if clashes on timestamp may happen (same user can emit two different events with the same timestamp). What is the best way to tweak this schema to resolve that assuming we have an ordering for all events present (have a sequence int for each event).
If I modify schema the following way:
create table user_events (
user_id text,
timestmp timestamp,
seq int,
event text,
primary key((user_id), timestmp, seq));
I won’t be able to do WHERE user_id = ? ORDER BY timestmp ASC, seq ASC – cassandra does not allow that.

I won’t be able to do WHERE user_id = ? ORDER BY timestmp ASC, seq ASC – cassandra does not allow that.
You might be seeing an error because you are repeating ASC. This should work:
WHERE user_id = ? ORDER BY timestmp,seq ASC
Also, as long as you have defined your primary key as PRIMARY KEY((user_id),timestmp,seq)) you don't even need to specify ORDER BY x[,y] ASC. It will cluster the data on disk in that order, and thus return it to you already sorted in that order. ORDER BY should only be necessary when you want to put your results in descending order (or whatever the opposite of how you have it defined is).
What if clashes on timestamp may happen?
I think your extra seq column should be sufficient, depending on how you plan on inserting the data. If you are setting the timestmp from the client, then you should be ok. However, look what happens when I (using your second table) INSERT rows while creating the timestamp two different ways.
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('Mal',dateof(now()),1,'commanding');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('Wash',dateof(now()),1,'piloting');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),1,'freaking out');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),3,'being weird');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River',dateof(now()),2,'killing reavers');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',1,'freaking out');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',3,'being weird');
INSERT INTO user_events(user_id,timestmp,seq,event) VALUES ('River','2015-01-13 13:14-0600',2,'killing reavers');
Querying that data by a user_id of "River" yields:
aploetz#cqlsh:stackoverflow> SELECT * FROM user_events WHERE user_id='River';
user_id | timestmp | seq | event
---------+--------------------------+-----+-----------------
River | 2015-01-13 13:14:00-0600 | 1 | freaking out
River | 2015-01-13 13:14:00-0600 | 2 | killing reavers
River | 2015-01-13 13:14:00-0600 | 3 | being weird
River | 2015-01-14 12:58:41-0600 | 1 | freaking out
River | 2015-01-14 12:58:57-0600 | 3 | being weird
River | 2015-01-14 12:58:57-0600 | 2 | killing reavers
(6 rows)
Notice that using the now() function to generate a timeuuid, and then converting that to a timestamp with dateof() causes the two rows with the timestmp "2015-01-14 12:58:57-0600" to appear to be the same. But they are not the same, as you can tell by the seq column.
So just a bit of caution on using/generating timestamps. They might look the same, but they may not be stored as the same value. Just to be on the safe side, I would use a timeuuid instead.

Related

Duplicate rows/columns for the same primary key in Cassandra

I have a table/columnfamily in Cassandra 3.7 with sensordata.
CREATE TABLE test.sensor_data (
house_id int,
sensor_id int,
time_bucket int,
sensor_time timestamp,
sensor_reading map<int, float>,
PRIMARY KEY ((house_id, sensor_id, time_bucket), sensor_time)
)
Now when I select from this table I find duplicates for the same primary key, something I thought was impossible.
cqlsh:test> select * from sensor_data;
house_id | sensor_id | time_bucket | sensor_time | sensor_reading
----------+-----------+-------------+---------------------------------+----------------
1 | 2 | 3 | 2016-01-02 03:04:05.000000+0000 | {1: 101}
1 | 2 | 3 | 2016-01-02 03:04:05.000000+0000 | {1: 101}
I think part of the problem is that this data has both been written "live" using java and Datastax java driver, and it has been loaded together with historic data from another source using sstableloader.
Regardless, this shouldn't be possible.
I have no way of connecting with the legacy cassandra-cli to this cluster, perhaps that would have told me something that I can't see using cqlsh.
So, the questions are:
* Is there anyway this could happen under known circumstances?
* Can I read more raw data using cqlsh? Specifically write time of these two rows. the writetime()-function can't operate on primary keys or collections, and that is all I have.
Thanks.
Update:
This is what I've tried, from comments, answers and other sources
* selecting using blobAsBigInt gives the same big integer for all identical rows
* connecting using cassandra-cli, after enabling thrift, is possible but reading the table isn't. It's not supported after 3.x
* dumping out using sstabledump is ongoing but expected to take another week or two ;)
I don't expect to see nanoseconds in a timestamp field and additionally i'm of the impression they're fully not supported? Try this:
SELECT house_id, sensor_id, time_bucket, blobAsBigint(sensor_time) FROM test.sensor_data;
I WAS able to replicate it doing by inserting the rows via an integer:
INSERT INTO sensor_data(house_id, sensor_id, time_bucket, sensor_time) VALUES (1,2,4,1451692800000);
INSERT INTO sensor_data(house_id, sensor_id, time_bucket, sensor_time) VALUES (1,2,4,1451692800001);
This makes sense because I would suspect one of your drivers is using a bigint to insert the timestamp, and one is likely actually using the datetime.
Tried playing with both timezones and bigints to reproduce this... seems like only bigint is reproducable
house_id | sensor_id | time_bucket | sensor_time | sensor_reading
----------+-----------+-------------+--------------------------+----------------
1 | 2 | 3 | 2016-01-02 00:00:00+0000 | null
1 | 2 | 4 | 2016-01-01 23:00:00+0000 | null
1 | 2 | 4 | 2016-01-02 00:00:00+0000 | null
1 | 2 | 4 | 2016-01-02 00:00:00+0000 | null
1 | 2 | 4 | 2016-01-02 01:01:00+0000 | null
edit: Tried some shenanigans using bigint in place of datetime insert, managed to reproduce...
Adding some observations on top of what Nick mentioned,
Cassandra Primary key = one or combination of {Partition key(s) + Clustering key(s)}
Keeping in mind the concepts of partition keys used within angular brackets which can be simple (one key) or composite (multiple keys) for unique identification and clustering keys to sort data, the below have been observed.
Query using select: sufficient to query using all the partition key(s) provided, additionally can query using clustering key(s) but in the same order in which they have been mentioned in primary key during table creation.
Update using set or update: the update statement needs to have search/condition clauses which not only include all the partition key(s) but also all the clustering key(s)
Answering the question - Is there anyway this could happen under known circumstances?
Yes, it is possible when same data is inserted from different sources.
To explain further, incase one tries to insert data from code (API etc) into Cassandra and then tries inserting the same data from DataStax Studio/any tool used to perform direct querying, a duplicate record is inserted.
Incase the same data is being pushed multiple times either from code alone or querying tool alone or from another source used to do the same operation multiple times, the data behaves idempotently and is not inserted again.
The possible explanation could be the way the underlying storage engine computes internal indexes or hashes to identify a row pertaining to set of columns (since column based).
Note:
The above information of duplicacy incase same data is pushed from different sources has been observed, tested and validated.
Language used: C#
Framework: .NET Core 3
"sensor_time" is part of the primary key. It is not in "Partition Key", but is "Clustering Column". this is why you get two "rows".
However, in the disk table, both "visual rows" are stored on single Cassandra row. In reality, they are just different columns and CQL just pretend they are two "visual rows".
Clarification - I did not worked with Cassandra for a while so I might not use correct terms. When i say "visual rows", I mean what CQL result shows.
Update
You can create following experiment (please ignore and fix any syntax errors I will do).
This suppose to do table with composite primary key:
"state" is "Partition Key" and
"city" is "Clustering Column".
create table cities(
state int,
city int,
name text,
primary key((state), city)
);
insert into cities(state, city, name)values(1, 1, 'New York');
insert into cities(state, city, name)values(1, 2, 'Corona');
select * from cities where state = 1;
this will return something like:
1, 1, New York
1, 2, Corona
But on the disk this will be stored on single row like this:
+-------+-----------------+-----------------+
| state | city = 1 | city = 2 |
| +-----------------+-----------------+
| | city | name | city | name |
+-------+------+----------+------+----------+
| 1 | 1 | New York | 2 | Corona |
+-------+------+----------+------+----------+
When you have such composite primary key you can select or delete on it, e.g.
select * from cities where state = 1;
delete from cities where state = 1;
In the question, primary key is defined as:
PRIMARY KEY ((house_id, sensor_id, time_bucket), sensor_time)
this means
"house_id", "sensor_id", "time_bucket" is "Partition Key" and
"sensor_time" is the "Clustering Column".
So when you select, the real row is spitted and show as if there are several rows.
Update
http://www.planetcassandra.org/blog/primary-keys-in-cql/
The PRIMARY KEY definition is made up of two parts: the Partition Key
and the Clustering Columns. The first part maps to the storage engine
row key, while the second is used to group columns in a row. In the
storage engine the columns are grouped by prefixing their name with
the value of the clustering columns. This is a standard design pattern
when using the Thrift API. But now CQL takes care of transposing the
clustering column values to and from the non key fields in the table.
Then read the explanations in "The Composite Enchilada".

As I know in range queries, Cassandra retrieves result ordered by culstring key. Can I change this behavior in my query?

I'm trying to store and retrieve last active sensors by this schema:
CREATE TABLE last_signals (
section bigint,
sensor bigint,
time bigint,
PRIMARY KEY (section, sensor)
);
Row of this table will be updated every few seconds and in the result hot sensors will remain in memtable. But what will happen when I get a run a query like this:
SELECT * FROM last_signals
WHERE section = ? AND time > ?
Limit ?
ALLOW FILTERING;
And the result will be something like this (Ordered by clustering key):
sect | sens | time
------+------+------
1 | 1 | 4
1 | 2 | 3
1 | 4 | 2
1 | 5 | 9
The first Question: Is this result guaranteed to be the same in all version? (I'm using 3.7) and the next one is that how I can change this behavior (with query option, modeling or etc.). Indeed I need to get last writes first without considering clustring-keys order. I think in this case my reads will be much faster.
I don't think there is any way to guarantee order besides using clustering keys. Thus your ALLOW FILTERING query is potentially costly and may even time out. You could consider the following schema:
CREATE TABLE last_signals_by_time (
section bigint,
sensor bigint,
time bigint,
dummy bool,
PRIMARY KEY ((section, sensor), time)
) WITH CLUSTERING ORDER BY (time DESC);
Instead of updates do inserts with TTL so that you do not have to clean up old entries manually. (The dummy field is needed in order for TTL to work)
And then just run your read queries per section/sensors in parallel:
SELECT * FROM last_signals_by_time
WHERE section = ? AND sensor = ?
LIMIT 1;

Cassandra Data Model for Sensor Data - Value | Timestamp

I'm new to Cassandra and I'm trying to define a data model that fits my requirements.
I have a sensor that collects one value every millisecond and I have to store those data in Cassandra. The queries that I want to perform are:
1) Give me all the sensor values from - to these timestamp values
2) Tell me when this range of values was recorded
I'm not sure if there exist a common schema that can satisfy both queries because I want to perform range queries on both values. For the first query I should use something like:
CREATE TABLE foo (
value text,
timestamp timestamp,
PRIMARY KEY (value, timestamp));
but then for the second query I need the opposite since I can't do range queries on the partition key without using a token that restricts the timestamp:
CREATE TABLE foo (
value text,
timestamp timestamp,
PRIMARY KEY (timestamp, value));
So do I need two tables for this? Or there exist another way?
Thanks
PS: I need to be as fast as possible while reading
I have a sensor that collects one value every millisecond and I have to store those data in Cassandra.
The main problem I see here, is that you're going to run into Cassandra's limit of 2 billion col values per partition fairly quickly. DataStax's Patrick McFadin has a good example for weather station data (Getting Started with Time Series Data Modeling) that seems to fit here. If I apply it to your model, it looks something like this:
CREATE TABLE fooByTime (
sensor_id text,
day text,
timestamp timestamp,
value text,
PRIMARY KEY ((sensor_id,day),timestamp)
);
This will partition on both sensor_id and day, while sorting rows within the partition by timestamp. So you could query like:
> SELECT * FROM fooByTime WHERE sensor_id='5' AND day='20151002'
AND timestamp > '2015-10-02 00:00:00' AND timestamp < '2015-10-02 19:00:00';
sensor_id | day | timestamp | value
-----------+----------+--------------------------+-------
5 | 20151002 | 2015-10-02 13:39:22-0500 | 24
5 | 20151002 | 2015-10-02 13:49:22-0500 | 23
And yes, the way to model in Cassandra, is to have one table for each query pattern. So your second table where you want to range query on value might look something like this:
CREATE TABLE fooByValues (
sensor_id text,
day text,
timestamp timestamp,
value text,
PRIMARY KEY ((sensor_id,day),value)
);
And that would support queries like:
> SELECT * FROm foobyvalues WHERE sensor_id='5'
AND day='20151002' AND value > '20' AND value < '25';
sensor_id | day | value | timestamp
-----------+----------+-------+--------------------------
5 | 20151002 | 22 | 2015-10-02 14:49:22-0500
5 | 20151002 | 23 | 2015-10-02 13:49:22-0500
5 | 20151002 | 24 | 2015-10-02 13:39:22-0500

How to delete a record in Cassandra?

I have a table like this:
CREATE TABLE mytable (
user_id int,
device_id ascii,
record_time timestamp,
timestamp timeuuid,
info_1 text,
info_2 int,
PRIMARY KEY (user_id, device_id, record_time, timestamp)
);
When I ask Cassandra to delete a record (an entry in the columnfamily) like this:
DELETE from my_table where user_id = X and device_id = Y and record_time = Z and timestamp = XX;
it returns without an error, but when I query again the record is still there. Now if I try to delete a whole row like this:
DELETE from my_table where user_id = X
It works and removes the whole row, and querying again immediately doesn't return any more data from that row.
What I am doing wrong? How you can remove a record in Cassandra?
Thanks
Ok, here is my theory as to what is going on. You have to be careful with timestamps, because they will store data down to the millisecond. But, they will only display data to the second. Take this sample table for example:
aploetz#cqlsh:stackoverflow> SELECT id, datetime FROM data;
id | datetime
--------+--------------------------
B25881 | 2015-02-16 12:00:03-0600
B26354 | 2015-02-16 12:00:03-0600
(2 rows)
The datetimes (of type timestamp) are equal, right? Nope:
aploetz#cqlsh:stackoverflow> SELECT id, blobAsBigint(timestampAsBlob(datetime)),
datetime FROM data;
id | blobAsBigint(timestampAsBlob(datetime)) | datetime
--------+-----------------------------------------+--------------------------
B25881 | 1424109603000 | 2015-02-16 12:00:03-0600
B26354 | 1424109603234 | 2015-02-16 12:00:03-0600
(2 rows)
As you are finding out, this becomes problematic when you use timestamps as part of your PRIMARY KEY. It is possible that your timestamp is storing more precision than it is showing you. And thus, you will need to provide that hidden precision if you will be successful in deleting that single row.
Anyway, you have a couple of options here. One, find a way to ensure that you are not entering more precision than necessary into your record_time. Or, you could define record_time as a timeuuid.
Again, it's a theory. I could be totally wrong, but I have seen people do this a few times. Usually it happens when they insert timestamp data using dateof(now()) like this:
INSERT INTO table (key, time, data) VALUES (1,dateof(now()),'blah blah');
CREATE TABLE worker_login_table (
worker_id text,
logged_in_time timestamp,
PRIMARY KEY (worker_id, logged_in_time)
);
INSERT INTO worker_login_table (worker_id, logged_in_time)
VALUES ("worker_1",toTimestamp(now()));
after 1 hour executed the above insert statement once again
select * from worker_login_table;
worker_id| logged_in_time
----------+--------------------------
worker_1 | 2019-10-23 12:00:03+0000
worker_1 | 2015-10-23 13:00:03+0000
(2 rows)
Query the table to get absolute timestamp
select worker_id, blobAsBigint(timestampAsBlob(logged_in_time )), logged_in_time from worker_login_table;
worker_id | blobAsBigint(timestampAsBlob(logged_in_time)) | logged_in_time
--------+-----------------------------------------+--------------------------
worker_1 | 1524109603000 | 2019-10-23 12:00:03+0000
worker_1 | 1524209403234 | 2019-10-23 13:00:03+0000
(2 rows)
The below command will not delete the entry from Cassandra as the precise value of timestamp is required to delete the entry
DELETE from worker_login_table where worker_id='worker_1' and logged_in_time ='2019-10-23 12:00:03+0000';
By using the timestamp from blob we can delete the entry from Cassandra
DELETE from worker_login_table where worker_id='worker_1' and logged_in_time ='1524209403234';

Query results not ordered despite WITH CLUSTERING ORDER BY

I am storing posts from all users in table. I want to retrieve post from all users the user is following.
CREATE TABLE posts (
userid int,
time timestamp,
id uuid,
content text,
PRIMARY KEY (userid, time)
)WITH CLUSTERING ORDER BY (time DESC)
I have the data about who all user follows in another table
CREATE TABLE follow (
userid int,
who_follow_me set<int>,
who_i_follow set<int>,
PRIMARY KEY ((userid))
)
I am making query like
select * from posts where userid in(1,2,3,4....n);
2 questions:
why I still get data in random order, though CLUSTERING ORDER BY is specified in posts. ?
Is model correct to satisfy the query optimally (user can have n number of followers)?
I am using Cassandra 2.0.10.
"why I still get data in random order, though CLUSTERING ORDER BY is specified in posts?"
This is because ORDER BY only works for rows within a particular partitioning key. So in your case, if you wanted to see all of the posts for a specific user like this:
SELECT * FROM posts WHERE userid=1;
That return your results ordered by time, as all of the rows within the userid=1 partitioning key would be clustered by it.
"Is model correct to satisfy the query optimally (user can have n number of followers)?"
It will work, as long as you don't care about getting the results ordered by timestamp. To be able to query posts for all users ordered by time, you would need to come up with a different partitioning key. Without knowing too much about your application, you could use a column like GROUP (for instance) and partition on that.
So let's say that you evenly assign all of your users to eight groups: A, B, C, D, E, F, G and H. Let's say your table design changed like this:
CREATE TABLE posts (
group text,
userid int,
time timestamp,
id uuid,
content text,
PRIMARY KEY (group, time, userid)
)WITH CLUSTERING ORDER BY (time DESC)
You could then query all posts for all users for group B like this:
SELECT * FROM posts WHERE group='B';
That would give you all of the posts for all of the users in group B, ordered by time. So basically, for your query to order the posts appropriately by time, you need to partition your post data on something other than userid.
EDIT:
PRIMARY KEY (userid, follows)) WITH CLUSTERING ORDER BY (created DESC);
That's not going to work. In fact, that should produce the following error:
code=2200 [Invalid query] message="Missing CLUSTERING ORDER for column follows"
And even if you did add follows to your CLUSTERING ORDER clause, you would see this:
code=2200 [Invalid query] message="Only clustering key columns can be defined in CLUSTERING ORDER directive"
The CLUSTERING ORDER clause can only be used on the clustering column(s), which in this case, is only the follows column. Alter your PRIMARY KEY definition to cluster on follows (ASC) and created (DESC). I have tested this, and inserted some sample data, and can see that this query works:
aploetz#cqlsh:stackoverflow> SELECT * FROM posts WHERE userid=2 AND follows=1;
userid | follows | created | id
--------+---------+--------------------------+--------------------------------------
2 | 1 | 2015-01-25 13:27:00-0600 | 559cda12-8fe7-45d3-9a61-7ddd2119fcda
2 | 1 | 2015-01-25 13:26:00-0600 | 64b390ba-a323-4c71-baa8-e247a8bc9cdf
2 | 1 | 2015-01-25 13:24:00-0600 | 1b325b66-8ae5-4a2e-a33d-ee9b5ad464b4
(3 rows)
Although, if you want to query by just userid you can see posts from all of your followers. But in that case, the posts will only be ordered within each followerid, like this:
aploetz#cqlsh:stackoverflow> SELECT * FROM posts WHERE userid=2;
userid | follows | created | id
--------+---------+--------------------------+--------------------------------------
2 | 0 | 2015-01-25 13:28:00-0600 | 94da27d0-e91f-4c1f-88f2-5a4bbc4a0096
2 | 0 | 2015-01-25 13:23:00-0600 | 798053d3-f1c4-4c1d-a79d-d0faff10a5fb
2 | 1 | 2015-01-25 13:27:00-0600 | 559cda12-8fe7-45d3-9a61-7ddd2119fcda
2 | 1 | 2015-01-25 13:26:00-0600 | 64b390ba-a323-4c71-baa8-e247a8bc9cdf
2 | 1 | 2015-01-25 13:24:00-0600 | 1b325b66-8ae5-4a2e-a33d-ee9b5ad464b4
(5 rows)
This is my new schema,
CREATE TABLE posts(id uuid,
userid int,
follows int,
created timestamp,
PRIMARY KEY (userid, follows)) WITH CLUSTERING ORDER BY (created DESC);
Here userid represents who posted it and follows represents userid for his one of the follower. Say user x follows 10 other people , i am making 10+1 inserts. Definitely there is too much data duplication. However now its easier to get timeline for one of the user with following query
select * from posts where follows=?

Resources