How does Cassandra read counter columns from sstables?

I am trying to convert sstables to JSON using the sstable2json utility. It works fine, but for counter columns it gives a very long string value.
My create table statement :
CREATE TABLE counters1
(value counter,
name varchar,
surname varchar,
PRIMARY KEY (name, surname)
);
Sample data: a single row with name = 'hari', surname = 'ram', and a counter value of 1.
Now, after converting to JSON, what I get is:
[ {"key": "hari",
"cells": [["ram:value","0001800086d46a8fd6cb484e9257a02ddd14fe0600000000000000010000000000000001",1452867057744000,"c",-9223372036854775808]]} ]
Q1) Is there a way to get a meaningful value out of this string? (0001800086d46a8fd6cb484e9257a02ddd14fe0600000000000000010000000000000001)
Q2) How does Cassandra read from the same sstable and display "1"?
Thanks

Counters changed a lot in 2.1; see http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters, which also has a great explanation of pre-2.1 counters (what you are looking at). The counter context in the sstable is mostly made up of tuples of counter id (a timeuuid), shard logical clock, and shard value (a 16-byte id and two longs). That is what sstable2json is displaying. There is a little more in the header, which describes a local/global element index. Check out https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/context/CounterContext.java#L675 for more details.
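For illustration, the blob from the question reads cleanly against that layout (byte boundaries added by hand, so treat this as a hedged decoding):
0001 - header length: one header element follows
8000 - header element: marks shard 0 in the local/global index
86d46a8fd6cb484e9257a02ddd14fe06 - counter id (timeuuid, 16 bytes)
0000000000000001 - shard logical clock (1)
0000000000000001 - shard value (1)
The value CQL reports is the sum of the shard values, here a single shard holding 1, which is the "1" asked about in Q2.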
But I would recommend moving to 2.1 counters to avoid some issues and gain a little more simplicity. It is going to be pretty non-trivial to rebuild your counter value from the sstables manually, though.

Related

Regarding Cassandra's (sloppy, still confusing) documentation on keys, partitions

I have a high-write table I'm moving from Oracle to Cassandra. In Oracle the PK is a (int: clientId, id: UUID). There are about 10 billion rows. Right off the bat I run into this nonsensical warning:
https://docs.datastax.com/en/cql/3.3/cql/cql_using/useWhenIndex.html :
"If you create an index on a high-cardinality column, which has many distinct values, a query between the fields will incur many seeks for very few results. In the table with a billion songs, looking up songs by writer (a value that is typically unique for each song) instead of by their artist, is likely to be very inefficient. It would probably be more efficient to manually maintain the table as a form of an index instead of using the Cassandra built-in index."
Not only does this seem to defeat efficient find-by-PK, it fails to define what it means to "query between the fields" and what the difference is between a built-in index, a secondary index, and the primary-key-plus-clustering subphrases in a CREATE TABLE command. A junk description. This is 2019. Shouldn't this be fixed by now?
AFAIK it's misleading anyway:
CREATE TABLE dev.record (
clientid int,
id uuid,
version int,
payload text,
PRIMARY KEY (clientid, id, version)
) WITH CLUSTERING ORDER BY (id ASC, version DESC);
insert into record (id,version,clientid,payload) values
(d5ca94dd-1001-4c51-9854-554256a5b9f9,3,1001,'');
insert into record (id,version,clientid,payload) values
(d5ca94dd-1002-4c51-9854-554256a5b9e5,0,1002,'');
The token on clientid indeed shows they're in different partitions as expected.
Turning to the big point: if one were looking for a single row given the clientId AND the UUID, and Cassandra allowed you to skip specifying the clientId so that it wouldn't know which node(s) to search, then sure, that find could be slow. But it doesn't allow that:
select * from record where id=d5ca94dd-1002-4c51-9854-554256a5b9e5;
InvalidRequest: ... despite the performance unpredictability,
use ALLOW FILTERING"
And ditto with other variations that exclude clientid. So shouldn't we conclude that Cassandra handles high-cardinality table searches that return "very few results" just fine?
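For contrast, the same lookup with the partition key included is the one Cassandra serves efficiently (a sketch against the schema above):
select * from record where clientid=1002 and id=d5ca94dd-1002-4c51-9854-554256a5b9e5;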
Anything that requires reading the entire contents of the database won't work, which is the case with scanning on id, since any of your clientid partitions may contain one. Walking through potentially thousands of sstables per host, and through each partition of each of those to check, will not work. If you are having a hard time with the data model and not totally getting the difference between partition keys and clustering keys, I would recommend walking through some introductory classes (i.e. DataStax Academy), YouTube videos, or a book before designing your schema. This is not a relational database, and designing around your data instead of your queries will get you into trouble. When moving from Oracle you should not just copy your tables over and move the data, or it will not work well.
The clustering key defines the order in which the data within a partition is stored on disk, which is what the documentation is referring to as a "built-in index". Each sstable has an index component that contains the partition key locations for that sstable. This also includes an index of the clustering keys for each partition every 64kb (by default, at least) that can be searched. The clustering keys that exist between each of these indexed points are unknown, so they all have to be checked. A long time ago there was a bloom filter of clustering keys kept as well, but it helped in such rare cases compared to its overhead that it was removed in 2.0.
Secondary indexes are difficult to scale well, which is where the warning about cardinality comes from. I would strongly recommend just denormalizing the data and not using an index in any form, as large scatter-gather queries across a distributed system are going to have availability and performance issues. If you really need one, check out http://www.doanduyhai.com/blog/?p=13191 to try to get the data model right (not worth it in my opinion).
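As a concrete illustration of that denormalization (a sketch only; the table name record_by_id is made up here), the idea is to maintain a second table keyed by id so the lookup never scans partitions:
CREATE TABLE dev.record_by_id (
id uuid,
version int,
clientid int,
payload text,
PRIMARY KEY (id, version)
) WITH CLUSTERING ORDER BY (version DESC);
-- the application writes to both tables; lookups by id alone become single-partition reads:
SELECT * FROM dev.record_by_id WHERE id = d5ca94dd-1002-4c51-9854-554256a5b9e5;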

How to avoid Cassandra tombstones when inserting NULL values

My problem is that Cassandra creates tombstones when inserting NULL values.
From what I understand, Cassandra doesn't support NULLs, and when NULL is inserted it simply deletes the respective column. On one hand this is very space-efficient; on the other hand it creates tombstones, which degrade read performance.
This goes against the NoSQL philosophy: Cassandra is saving space but degrading read performance. In the NoSQL world space is cheap, whereas performance matters. I believe this is the philosophy behind saving tables in denormalized form.
I would like Cassandra to use the same technique for inserting NULL as for any other value: use timestamping and, during compaction, preserve the latest entry, even if that entry is NULL (or we could call it "unset").
Is there any tweak in the Cassandra config, or any approach, that would let me achieve upserts with NULLs without creating tombstones?
I came across this issue; however, it only allows ignoring NULL values.
My use case:
I have a stream of events, each identified by a causeID. I receive many events with the same causeID, and I want to store only the latest event for a given causeID (using upsert). The properties of an event may change from NULL to a specific value, but also from a specific value to NULL. Unfortunately the latter case generates tombstones and degrades read performance.
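For context, a minimal sketch of that upsert pattern (all names here are hypothetical):
CREATE TABLE latest_event_by_cause (
cause_id uuid,
event_time timestamp,
payload text,
PRIMARY KEY (cause_id)
);
-- every incoming event is a blind upsert; the write with the latest timestamp wins
INSERT INTO latest_event_by_cause (cause_id, event_time, payload) VALUES (?, ?, ?);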
Update
It seems there is no way I can avoid tombstones. Could you advise me on techniques to minimize them (e.g. setting gc_grace_seconds to a very low value)? What are the risks, and what should be done when a node goes down for a period longer than gc_grace_seconds?
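For reference, the tweak being asked about is a per-table setting (the table name is the hypothetical one from the sketch above; 3600 is an illustrative value):
ALTER TABLE latest_event_by_cause WITH gc_grace_seconds = 3600;
-- risk: if a node stays down longer than gc_grace_seconds and then rejoins,
-- tombstones it missed may already be compacted away and deleted data can resurrect,
-- so such a node should be repaired or rebuilt rather than simply restarted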
You can't insert NULL into Cassandra: it has a special meaning there, and it leads to the creation of the tombstones that you observe. If you want to treat NULL as a special value, why not solve this problem on the application side: when you get a null status, insert some special value that couldn't otherwise occur in your table, and when you read the data back, check for that special value and output null to the requester.
When we insert or update rows using null for values that are not specified, even though our intention is just to leave the value empty, Cassandra represents it as a tombstone, causing unnecessary overhead that degrades performance.
To avoid such tombstones on save operations, Cassandra has the concept of an unset parameter value.
So you can do the following to unset a field value while saving, avoiding the tombstone overhead, in a few different setups:
1). If you are using express-cassandra then :
const user = new models.instance.User({
user_id: 1235,
user_name: models.datatypes.unset // this will not create a tombstone when user_name should be empty or null
});
user.save(function(err){
// user_name value is not set and does not create any unnecessary tombstone overhead
});
2). If you are writing a raw Cassandra query, then when you know a field (say colC) will be null, simply don't include it in the query:
insert into my_table(id,colA,colB) values(idVal,valA,valB) -- colC omitted
3). If you are using the Node.js driver, you can even pass undefined on insert or update, which will avoid the tombstone overhead. For example:
const query = 'INSERT INTO my_table (id, colC) VALUES (?, ?)';
client.execute(query, [ id, undefined ]);
4). If you are using the C# driver:
// Prepare once in your application lifetime
var ps = session.Prepare("INSERT INTO my_table (id, colC) VALUES (?, ?)");
// Bind the unset value in a prepared statement
session.Execute(ps.Bind(id, Unset.Value));
For more detail on express-cassandra, read the subtopic "Null and unset values" of
https://express-cassandra.readthedocs.io/en/latest/datatypes/#cassandra-to-javascript-datatypes
For more detail on Node.js driver unset feature refer datastax https://docs.datastax.com/en/developer/nodejs-driver/4.6/features/datatypes/nulls/
For more detail on Csharp driver unset feature refer datastax https://docs.datastax.com/en/developer/csharp-driver/3.16/features/datatypes/nulls-unset/
NOTE: I tested this with the Node.js driver against Cassandra 4.0, but the unset feature has been available since Cassandra 2.2.
Hope this will help you or somebody else.
Thanks!
You cannot avoid tombstones if you explicitly mention NULL in your INSERT. C* does not do a read before a write, which is what makes writes so fast; it simply writes a tombstone so the old value is shadowed later (the latest update wins by timestamp comparison). If you want to avoid tombstones (which is recommended), you have to prepare different combinations of queries that check each field for NULL before adding it to the INSERT. If you have very few fields to check, it is easy to add some IF-ELSE statements; but if there are lots of them, the code becomes bigger and less readable. In short, you cannot insert NULL without impacting read performance later.

Inserting null values into Cassandra

I don't think the other answers address the original question, which is how to overwrite a non-null value in Cassandra with null without creating a tombstone. The nearest is Alex Ott's suggestion to use some special value other than null.
However, with a little bit of trickery you can insert an explicit null into Cassandra by exploiting a FROZEN tuple or user-defined type. The FROZEN keyword effectively serialises the user defined type and stores the serialised representation in the column. Crucially, the serialised representation of a UDT containing null values is not itself null.
> CREATE TYPE test_type(value INT);
> CREATE TABLE test(pk INT, cl INT, data FROZEN<test_type>, PRIMARY KEY (pk, cl));
> INSERT INTO test (pk, cl, data) VALUES (0, 0, {value: 15});
> INSERT INTO test (pk, cl, data) VALUES (0, 0, {value: null});
> INSERT INTO test (pk, cl) VALUES (0, 1);
> SELECT * FROM test;
pk | cl | data
----+----+---------------
0 | 0 | {value: null}
0 | 1 | null
(2 rows)
Here we wrote 15, then overwrote it with null, and finally added a second row to demonstrate that there is a difference between an unset cell and a cell containing a frozen UDT that itself contains null.
Of course the downside of this approach is that in your application you have to delve into the UDT for the actual value.
On the other hand, if you combine several columns into the UDT you do save a little overhead in Cassandra. (But you can't then read or write them individually. You also can't remove fields, though you can add new ones.)

Select older versions of data after update in Cassandra

This is my use-case.
I have inserted a row of data in Cassandra with the following query:
INSERT INTO TableWide1 (UID, TimeStampCol, Value, DateCol) VALUES ('id1','2016-03-24 17:54:36',45,'2015-03-24 00:00:00');
I update one row to have a new value.
update TableWide1 set Value = 46 where uid = 'id1' and datecol='2015-03-24 00:00:00' and timestampcol='2016-03-24 17:54:36';
Now, I would like to see all versions of this data in Cassandra. I know this is pretty straightforward in HBase, but is it even possible in Cassandra?
I explored a bit using writetime(), but it just gives the write time of the newly updated data, and it cannot be used in a WHERE clause either.
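For example, against the schema below, a query like this returns only the latest value and its write time, not the history (a sketch of what was tried):
SELECT Value, writetime(Value) FROM TableWide1 WHERE UID = 'id1' AND DateCol = '2015-03-24 00:00:00' AND TimeStampCol = '2016-03-24 17:54:36';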
This is how my schema looks:
CREATE TABLE TableWide1(
UID varchar,
TimeStampCol timestamp,
Value double,
DateCol timestamp,
PRIMARY KEY ((UID,DateCol), TimeStampCol)
);
So is this technically possible, given the fact that the old data still exists in Cassandra?
If your partitions won't get too wide, you could drop the time component from the partition key:
CREATE TABLE table_wide (
UID varchar,
TimeStampCol timestamp,
Value double,
PRIMARY KEY ((UID), TimeStampCol)
);
That's generally bad though, since eventually you will hit the limits of a partition.
But really, you had it right. You won't be able to do it in a single statement, but under the covers you can't stream the entire set over in one go anyway; it has to be paged through. So you can just iterate through the results one day at a time. If your dataset has days with no data and you don't want to waste reads, you can keep an additional table around to mark which days have data, as shown below:
CREATE TABLE table_wide_partition_list (
UID varchar,
DateCol timestamp,
PRIMARY KEY (UID, DateCol)
);
And make one query to it first.
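A sketch of that two-step read against the question's schema (the dates are illustrative):
-- 1. find which days actually have data for this UID
SELECT DateCol FROM table_wide_partition_list WHERE UID = 'id1';
-- 2. then issue one single-partition query per returned day
SELECT * FROM TableWide1 WHERE UID = 'id1' AND DateCol = '2015-03-24 00:00:00';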
Really, if you want HBase-like behavior for scans, you are probably looking for more of an OLAP-style workload than normal C* usage. For that, it is currently almost universally recommended to use Spark with Cassandra.
Cassandra does not retain old data when a value is updated. The old value is simply superseded by the newer write (the one with the latest timestamp) and is discarded when compaction happens.
HBase was not made for real-time applications serving hot data to application servers, though things have improved since its early days. People use HBase mainly because they already have a Hadoop cluster.
Another noticeable and important difference is that Cassandra is very fast at retrieving single or multiple records by key, but not at range scans, because data is distributed by hashed key. HBase, on the other hand, stores data in sorted order and is an ideal candidate for range queries.
Anyway, since Cassandra doesn't retain old data, you cannot retrieve it.

How to get a range of data from Cassandra

[cqlsh 5.0.1 | Cassandra 2.1.0 | CQL spec 3.2.0 | Native protocol v3]
table:
CREATE TABLE dc.event (
id timeuuid PRIMARY KEY,
name text
) WITH bloom_filter_fp_chance = 0.01;
How do I get a time range of data from Cassandra?
For example, when I try select * from event where id > maxTimeuuid('2014-11-01 00:05+0000') and id < minTimeuuid('2014-11-02 10:00+0000'), as seen here: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/timeuuid_functions_r.html
I get the following error: 'code=2200 [Invalid query] message="Only EQ and IN relation are supported on the partition key (unless you use the token() function)"'
Can I keep timeuuid as primary key and meet the requirement?
Thanks
Can I keep timeuuid as primary key and meet the requirement?
Not really, no. From http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
WHERE clauses can include greater-than and less-than comparisons,
but for a given partition key, the conditions on the clustering column
are restricted to the filters that allow Cassandra to select a
contiguous ordering of rows.
You could try adding "ALLOW FILTERING" to your query... but I doubt that would work. And I don't know of a good way (and neither do I believe there is a good way) to tokenize the timeuuids. I'm about 99% sure the ordering from the partitioner would yield unexpected, bad results, even though the query itself would execute and appear correct until you dug into it.
As an aside, you should really check out a similar question that was asked about a year ago: time series data, selecting range with maxTimeuuid/minTimeuuid in cassandra
Short answer: no. Long answer: you can do something similar. E.g.:
CREATE TABLE dc.event (
event_time timestamp,
id timeuuid,
name text,
PRIMARY KEY(event_time, id)
) WITH bloom_filter_fp_chance = 0.01;
The timestamp would presumably be truncated so that it only reflects a whole day (or hour, or minute, depending on the velocity of your data). Your WHERE clause would then include an IN restriction listing the timestamp buckets covered by your timeuuid range.
If you use an appropriate chunking factor (how much you truncate your timestamp), you may even answer some of the questions you're looking for without using a range of timeuuids, just a simple where clause.
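A hedged example of the IN-plus-range query described above, against the table just defined and assuming day-sized buckets (the timestamps are illustrative):
SELECT * FROM dc.event
WHERE event_time IN ('2014-11-01 00:00+0000', '2014-11-02 00:00+0000')
AND id > maxTimeuuid('2014-11-01 00:05+0000')
AND id < minTimeuuid('2014-11-02 10:00+0000');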
Essentially this gives you the leeway to make the kind of query you're looking for while respecting Cassandra's restrictions. As Raedwald pointed out, you can't use the partition key in continuous ranges because of the underpinning nature of Cassandra as a large hash map. That being said, Cassandra is well known to do some incredibly powerful things with time-series data.
Take a look at how Newts is doing time series for ranges. The author has a great set of slides and a talk describing the data model to get precisely what you seem to be looking for. https://github.com/OpenNMS/newts/
Cassandra cannot do this kind of query because Cassandra is a key-value store implemented as a giant hash map, not a relational database. Just as with an in-memory hash map, the only way to find the keys within a sub-range is to iterate through all the keys. That can be expensive enough for an in-memory hash map, but for Cassandra it would be crippling.
Yes, you can do it by using Spark with Scala and the spark-cassandra-connector!
I think you should keep the number of partitions down by bucketing the keys as 'YYYY-MM-dd hh:00+0000' and filtering on dates and hours only.
Then you could use something like:
case class TableKey(id: String) // the bucketed partition keys to fetch
val dates = Array("2014-11-02 10:00+0000", "2014-11-02 11:00+0000", "2014-11-02 12:00+0000")
val selected_data = sc.parallelize(dates).map(x => TableKey(x)).joinWithCassandraTable("dc", "event")
And there you have your selected-data RDD, which you can collect:
val data = selected_data.collect
I had similar problem...

Cassandra or HBase?

I have a requirement, where I want to store the following:
Mac Address // PKEY
TimeStamp // PKEY
LocationID
ownerName
Signal Strength
The insertion logic is as follows:
Store the above statistics for each active device (MacAddress) once every hour at each location (LocationID)
The entries are created at the end of each hour, so the primary key will always be MAC+TimeStamp
There are no updates, only insertions
The queries which can be performed are as follows:
Give me all the entries for last 'N' hours Where MacAddress = "...."
Give me all the entries for last 'N' hours Where LocationID IN (locID1, locID2, ..);
Needless to say, there are billions of entries, and I want to use either HBase or Cassandra. I've tried to explore, and it seems that Cassandra may not be the correct choice.
The reason is that if I have the following in Cassandra:
< < RowKey > MacAddress:TimeStamp > >
+ LocationID
+ OwnerName
+ Signal Strength
Both queries will scan the whole database, right? Even if I add an index on LocationID, it will only help the second query to some extent, because there is no index on the timestamp. (I believe searching on timestamp alone is not fast, as the MacAddress:TimeStamp composite key would not allow us to search only on timestamp; instead, a full scan would happen. Is that correct?)
I'm stuck here big time, and any insight into whether we should opt for HBase or Cassandra would really help.
The right way to model this with Cassandra is to use a table partitioned by mac address, ordered by timestamp, and indexed on location id. See the Cassandra data model documentation, especially the section on clustering [predefined sorting]. None of your queries will require a full table scan.
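A minimal CQL sketch of that model (names and types are illustrative, not prescriptive):
CREATE TABLE stats_by_device (
mac_address text,
time_stamp timestamp,
location_id int,
owner_name text,
signal_strength double,
PRIMARY KEY (mac_address, time_stamp)
) WITH CLUSTERING ORDER BY (time_stamp DESC);
CREATE INDEX ON stats_by_device (location_id);
-- "last N hours for a given MacAddress" is a single-partition slice:
SELECT * FROM stats_by_device
WHERE mac_address = '00:11:22:33:44:55'
AND time_stamp > '2016-03-25 00:00:00';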
You have to remember that NoSQL systems like Cassandra allow horizontal scaling and make it a lot easier to shard data. By developing a shard strategy (identifying the shard key, etc.) you can dramatically reduce the size of the data on a single instance and make queries doable, even over massive data sets.
Either one would work for this query:
Give me all the entries for last 'N' hours Where MacAddress = "...."
In Cassandra you would want to use an ordered partitioner so you can do easy scans; that way you would not have to scan the entire table. (I'm a little rusty on Cassandra.)
In HBase the data is always ordered by rowkey, so the scan becomes easy: you just set a start and a stop rowkey. Conceptually it would be:
scan.setStartRow(Bytes.toBytes(mac + ":" + timestamp));
scan.setStopRow(Bytes.toBytes(mac + ":" + endtimestamp));
And then it would only scan over the rows for the given mac address in the given time period, which is only a small subset of the data.
This query is much harder:
Give me all the entries for last 'N' hours Where LocationID IN
(locID1, locID2, ..);
Cassandra does have secondary indexes, so it seems like it would be "easy", but I don't know how much data it would scan through; I haven't looked at Cassandra since it added secondary indexes.
In HBase you'd have to scan the entire table or create a second table. I would recommend creating a second table where the rowkey is < location:timestamp > and duplicating the data there. Then you'd use that table to look up the data by location using a scan with start and stop keys set.
