Order of results in Cassandra

I have two questions about query results in Cassandra.
When I make a "full" select of a table in Cassandra (i.e. select * from table), is it guaranteed that the results will be returned in increasing order of partition tokens?
For instance, having the following table:
create table users(id int, name text, primary key(id));
Is it guaranteed that the following query will return the results with increasing values in the token column?
select token(id), id from users;
If so, is it also guaranteed if the data is distributed to multiple nodes in the cluster?
If the answer to the above question is 'yes', is it still valid if we use a secondary index? For instance, if we had the following index:
create index on users(name);
and we query the table by using the index:
select token(id), id from users where name = 'xyz';
is there any guarantee regarding the order of results?
The motivation for the above questions is whether the token is the right thing to use in order to implement paging and/or resuming of interrupted long-running "data exports".
EDIT: There are multiple resources on the net that state that the order matches the token order (e.g. the description of partitioner results, or this DataStax page):
Without a partition key specified in the WHERE clause, the actual order of the result set then becomes dependent on the hashed values of userid.
However, the order of results is not specified in the official Cassandra documentation, e.g. that of the SELECT statement.

Is it guaranteed that the following query will return the results with increasing values in the token column?
Yes, it is.
If so, is it also guaranteed if the data is distributed to multiple nodes in the cluster?
Yes. Data distribution is orthogonal to the ordering of the retrieved data; there is no relationship between the two.
If the answer to the above question is 'yes', is it still valid if we use secondary index?
Yes. Even if you query data using a secondary index (be it SASI or the native implementation), the returned results will always be sorted in token order. The technical explanation is given in my blog post here: http://www.doanduyhai.com/blog/?p=13191#cluster_read_path
That is the main reason why SASI is not a good fit if you want the search to return data ordered by some column value. Only a real search engine integration (like DataStax Enterprise Search) can give you the correct ordering, because it bypasses the cluster read path layer.
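To make the paging/resuming idea from the question concrete, here is a minimal sketch in Python. It does not talk to Cassandra: the token() function is a stand-in hash (not Murmur3), and scan() only emulates "SELECT token(id), id FROM users WHERE token(id) > ? LIMIT ?" over an in-memory list. All names are hypothetical.

```python
import hashlib

def token(key: int) -> int:
    """Stand-in for the partitioner: maps a key to a signed 64-bit value.
    This is NOT Cassandra's Murmur3 hash; only 'hash, then order by hash' matters here."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def scan(ids, after=None, limit=None):
    """Emulates: SELECT token(id), id FROM users [WHERE token(id) > after] [LIMIT limit]."""
    rows = sorted((token(i), i) for i in ids)  # results come back in token order
    if after is not None:
        rows = [r for r in rows if r[0] > after]
    return rows if limit is None else rows[:limit]

def export(ids, page_size):
    """Pages through the table, remembering only the last token seen."""
    out, last = [], None
    while True:
        page = scan(ids, after=last, limit=page_size)
        if not page:
            return out
        out.extend(page)
        last = page[-1][0]  # the value to persist in order to resume later

users = range(1, 101)
full = scan(users)
paged = export(users, page_size=7)
print(paged == full)
```

This works as written only because the example table has a single-column primary key, so each token corresponds to one row. With clustering columns, several rows share a token, and resuming with "token > last" alone would skip the rest of a partially read partition.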

Related

Regarding Cassandra's (sloppy, still confusing) documentation on keys, partitions

I have a high-write table I'm moving from Oracle to Cassandra. In Oracle the PK is (clientId: int, id: UUID). There are about 10 billion rows. Right off the bat I ran into this nonsensical warning:
https://docs.datastax.com/en/cql/3.3/cql/cql_using/useWhenIndex.html :
"If you create an index on a high-cardinality column, which has many distinct values, a query between the fields will incur many seeks for very few results. In the table with a billion songs, looking up songs by writer (a value that is typically unique for each song) instead of by their artist, is likely to be very inefficient. It would probably be more efficient to manually maintain the table as a form of an index instead of using the Cassandra built-in index."
Not only does this seem to defeat an efficient find by PK, it also fails to define what it means to "query between the fields" and what the difference is between a built-in index, a secondary index, and the primary key and clustering clauses in a CREATE TABLE command. A junk description. This is 2019. Shouldn't this be fixed by now?
AFAIK it's misleading anyway:
CREATE TABLE dev.record (
clientid int,
id uuid,
version int,
payload text,
PRIMARY KEY (clientid, id, version)
) WITH CLUSTERING ORDER BY (id ASC, version DESC);
insert into record (id,version,clientid,payload) values
(d5ca94dd-1001-4c51-9854-554256a5b9f9,3,1001,'');
insert into record (id,version,clientid,payload) values
(d5ca94dd-1002-4c51-9854-554256a5b9e5,0,1002,'');
The token on clientid indeed shows they're in different partitions as expected.
Turning to the big point. If one were looking for a single row given the clientId and UUID, AND Cassandra allowed you to skip specifying the clientId so that it wouldn't know which node(s) to search, then sure, that find could be slow. But it doesn't allow that:
select * from record where id=
d5ca94dd-1002-4c51-9854-554256a5b9e5;
InvalidRequest: ... despite the performance unpredictability,
use ALLOW FILTERING"
And ditto with other variations that exclude clientid. So shouldn't we conclude that Cassandra handles searches on high-cardinality tables that return "very few results" just fine?
Anything that requires reading the entire contents of the database won't work, which is the case when scanning on id, since any of your clientid partitions may contain a matching row. Walking through potentially thousands of sstables per host, and through each partition in each of those to check, will not work. If you are having a hard time with the data model and don't fully get the difference between partition keys and clustering keys, I would recommend walking through some introductory classes (e.g. DataStax Academy), YouTube videos, or a book before designing your schema. This is not a relational database, and designing around your data instead of your queries will get you into trouble. When moving from Oracle, you should not just copy your tables over and move the data, or it will not work as well.
The clustering key determines the order in which the data for a partition is stored on disk, which is what the documentation is referring to as the "built-in index". Each sstable has an index component that contains the partition key locations for that sstable. This also includes an index of the clustering keys for each partition every 64 KB (by default, at least) that can be searched. The clustering keys that exist between each of these indexed points are unknown, so they all have to be checked. A long time ago a bloom filter of clustering keys was kept as well, but the cases where it helped were so rare compared to the overhead that it was removed in 2.0.
Secondary indexes are difficult to scale well, which is where the warning about cardinality comes from. I would strongly recommend just denormalizing the data and not using an index in any form, as large scatter-gather queries across a distributed system are going to have availability and performance issues. If you really need one, check out http://www.doanduyhai.com/blog/?p=13191 to try to get the data model right (not worth it in my opinion).

Secondary index on a low-cardinality clustering column

Using Cassandra as db:
Say we have this schema
primary_key((id1),id2,type) with index on type, because we want to query by id1 and id2.
Will a query like
SELECT * FROM my_table WHERE id1=xxx AND type='some type'
perform well?
I wonder if we have to create and manage another table for this situation?
The way you are planning to use the secondary index is ideal (which is rare). Here is why:
You specify the partition key (id1) in your query. This ensures that only the relevant partition (node) will be queried, instead of hitting all the nodes in the cluster (which is not scalable).
You are (presumably) indexing an attribute of low cardinality (I can imagine you have maybe a few hundred types?), which is the sweet spot when using secondary indexes.
Overall, your data model should perform well and scale. Yet, if you are looking for optimal performance, I would suggest you use an additional table ((id1), type, id2).
Final note: if you have a limited number of types, you might consider using solely ((id1), type, id2) as a single table. When querying by id1-id2, just issue a few parallel queries, one for each possible value of type.
The final decision needs to take into account your target latency, the disk usage (duplicating table with a different primary key is sometimes too expensive), and the frequency of each of your queries.
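As a rough illustration of the "few parallel queries" idea, here is a sketch in Python that uses an in-memory dictionary as a stand-in for a table keyed by ((id1), type, id2). The TYPES list, the sample rows, and the function names are all hypothetical; a real application would send each query through its driver instead of calling query().

```python
from concurrent.futures import ThreadPoolExecutor

TYPES = ["a", "b", "c"]  # hypothetical: the small, closed set of possible types

# In-memory stand-in for a table with PRIMARY KEY ((id1), type, id2)
table = {
    (1, "a"): [{"id2": 10, "payload": "x"}, {"id2": 11, "payload": "y"}],
    (1, "b"): [{"id2": 10, "payload": "z"}],
    (2, "a"): [{"id2": 10, "payload": "w"}],
}

def query(id1, type_, id2):
    """Emulates: SELECT * FROM my_table WHERE id1 = ? AND type = ? AND id2 = ?"""
    rows = table.get((id1, type_), [])
    return [dict(row, id1=id1, type=type_) for row in rows if row["id2"] == id2]

def find_by_ids(id1, id2):
    """Runs one query per possible type in parallel and merges the results."""
    with ThreadPoolExecutor(max_workers=len(TYPES)) as pool:
        results = pool.map(lambda t: query(id1, t, id2), TYPES)
    return [row for rows in results for row in rows]

rows = find_by_ids(1, 10)
print([(r["type"], r["payload"]) for r in rows])
```

The cost of this approach grows with the number of types, which is why it only makes sense when that number is small.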

Understanding the Token Function in Cassandra

Hello, I was reading the Cassandra documentation on the token function.
I am trying to implement pagination for a Cassandra table, and I am unable to understand the lines highlighted. The documentation speaks about the difference between k > 42 and TOKEN(k) > TOKEN(42), but I am not able to understand the "token-based comparison".
Looking forward to a detailed explanation of what the token function does when it is part of a WHERE clause.
To decide which partition your data belongs in, C* performs some calculations on the PARTITION KEY of every row. Specifically, on each node, rows are sorted by the token generated by the partitioner (and each partition has its data sorted by the clustering key). Different partitioners perform different types of calculations.
While the Murmur3Partitioner calculates the MurmurHash of the partition key, the ByteOrderedPartitioner uses the raw data bytes of the partition key itself: when you use the Murmur3Partitioner, your rows are sorted by their hashes, while when you use the ByteOrderedPartitioner, your rows are sorted directly by their raw values.
As an example, assume you have a table like this:
CREATE TABLE test (
username text,
...
PRIMARY KEY (username)
);
And assume you're trying to locate where the rows corresponding to the usernames abcd, abce and abcf are stored. The hex representations of these strings are 0x61626364, 0x61626365 and 0x61626366 respectively. Assuming we apply this MH3 implementation (x86, 32-bit for simplicity, no optional seed) to the three strings, we get 0x43ED676A, 0xE297E8AA and 0x87E62668 respectively. So, in the case of MH3, the tokens of the strings will be these 3 values, while in the case of the BOP the tokens will be the raw data values themselves: 0x61626364, 0x61626365 and 0x61626366.
Now you can see that storing data sorted by token produces different results when different partitioners are used. A SELECT * FROM test; query would return rows in a different order. This can (but should not) be a problem if you have data already sorted by raw value and you need to retrieve it in the same order, because when you use MH3 the order is completely unrelated to your data.
Back to the question, the TOKEN function allows you to filter directly by the tokens of your data instead of your data. The documentation says:
ordering with the TOKEN function does not always provide the expected
results. Use the TOKEN function to express a conditional relation on a
partition key column. In this case, the query returns rows based on
the token of the partition key rather than on the value.
As an example, you could issue:
SELECT * FROM test WHERE TOKEN(username) <= TOKEN('abcf');
and you'd get, guess what, the abcd and abcf rows! This is because order sometimes matters, as in the case of the pagination you're trying to do, which will be handled flawlessly for you by any available C* driver (e.g. the Java driver).
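The example can be checked with a short Python sketch. The hash values are copied from the text above rather than recomputed, and they are compared as unsigned integers, which is an assumption made to match the result stated in this answer (real Murmur3Partitioner tokens are signed 64-bit values). The function names are hypothetical.

```python
# Tokens copied from the answer above (32-bit MurmurHash3 variant, compared as unsigned).
MH3_TOKENS = {"abcd": 0x43ED676A, "abce": 0xE297E8AA, "abcf": 0x87E62668}

def bop_token(username: str) -> int:
    """ByteOrderedPartitioner-style token: the raw bytes of the key itself."""
    return int.from_bytes(username.encode("ascii"), "big")

def select_all(token_of):
    """Emulates SELECT * FROM test: rows come back sorted by token."""
    return sorted(MH3_TOKENS, key=token_of)

def select_where_token_le(token_of, bound):
    """Emulates SELECT * FROM test WHERE TOKEN(username) <= TOKEN(bound)."""
    return [u for u in select_all(token_of) if token_of(u) <= token_of(bound)]

mh3_order = select_all(MH3_TOKENS.get)
bop_order = select_all(bop_token)
mh3_filtered = select_where_token_le(MH3_TOKENS.get, "abcf")
bop_filtered = select_where_token_le(bop_token, "abcf")

print(mh3_order)     # ['abcd', 'abcf', 'abce']
print(bop_order)     # ['abcd', 'abce', 'abcf']
print(mh3_filtered)  # ['abcd', 'abcf']
print(bop_filtered)  # ['abcd', 'abce', 'abcf']
```

With the hashed tokens, abce sorts last, so it falls outside the TOKEN(username) <= TOKEN('abcf') range even though its raw value lies between the other two.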
That said, the recommended partitioner for new clusters is Murmur3Partitioner; you can check the documentation for the pros and cons of each partitioner. Please note that the partitioner is a cluster-wide setting, and once set you cannot change it without pushing all of your data into another cluster.
Make your choice carefully.
Cassandra data is partitioned based on the token of a row's partition key. The token is generated using a hash function. The TOKEN function returns the value that would be produced by applying that hash function to its arguments.
That said, almost all drivers now page automatically by default.

maximum secondary indexes on a columnfamily

Is it a performance issue if we have two or more secondary indexes on a columnfamily? I have orderid, city and shipmenttype. So I thought I would create the primary key on orderid and secondary indexes on city and shipmenttype, and use a combination of the secondary index columns while querying. Is that bad modelling?
Consider the data that will be placed in the secondary index. Looking at the docs, you want to avoid columns with high cardinality. If your city and shipment type values vary greatly (or conversely, too similarly) then a secondary index may not be the right fit.
Look into potentially maintaining a separate table with this information. This would behave as a manual index of sorts, but have the additional benefit of behaving as you expect a Cassandra table should. When you create or update records, be sure to update this index table. Writes are cheap; performing multiple writes over the course of updating a record is not unheard of.
When looking at your access patterns will you be using the partition key as part of the WHERE clause or just the secondary indexes?
If you're performing a query against the secondary indexes along with the partition key you will achieve better performance than when you just query with secondary indexes.
For example, with WHERE orderid = 'foo' AND shipmenttype = 'bar' the request will only be sent to nodes responsible for the partition where foo is stored. Then the secondary index will be consulted for shipmenttype = 'bar' and your results will be returned.
When you run a query with just WHERE shipmenttype = 'bar' the query is sent to all nodes in the cluster before the secondary indexes are consulted for looking up rows. This is less than ideal.
Additionally, should you query against multiple secondary indexes with a single request, you must use ALLOW FILTERING. This will only consult ONE secondary index during your request, usually the more specific of the indexes referenced. This will cause a performance hit, as all records returned from checking the first index will have to be checked against the other values listed in your WHERE clause.
If you do use a secondary index, always strive to include the partition key in the query. Secondly, do NOT use multiple secondary indexes when querying a table; this will cause a major performance hit.
Ultimately your performance is determined by how you construct your queries against the partition and secondary indexes.
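The "manual index" table suggested above can be sketched in Python with two dictionaries standing in for two Cassandra tables. The table names (orders, orders_by_city) and the helper functions are hypothetical; the point is only that every write to the main table is accompanied by a write to the lookup table.

```python
from collections import defaultdict

orders = {}                         # stand-in for: orders, PRIMARY KEY (orderid)
orders_by_city = defaultdict(dict)  # stand-in for: orders_by_city, PRIMARY KEY ((city), orderid)

def save_order(orderid, city, shipmenttype):
    """Writes the order and keeps the manual index table in sync."""
    old = orders.get(orderid)
    if old and old["city"] != city:
        # The order moved to another city: remove the stale index entry.
        orders_by_city[old["city"]].pop(orderid, None)
    row = {"orderid": orderid, "city": city, "shipmenttype": shipmenttype}
    orders[orderid] = row
    orders_by_city[city][orderid] = row

def find_by_city(city, shipmenttype=None):
    """Single-partition lookup by city, optionally filtered by shipment type."""
    rows = orders_by_city.get(city, {}).values()
    return sorted(r["orderid"] for r in rows if shipmenttype in (None, r["shipmenttype"]))

save_order("o1", "Paris", "air")
save_order("o2", "Paris", "ground")
save_order("o1", "Lyon", "air")  # update: o1 moves from Paris to Lyon

print(find_by_city("Paris"))        # ['o2']
print(find_by_city("Lyon", "air"))  # ['o1']
```

In a real cluster the two writes are not atomic by default, so the application has to decide how to handle a failure between them; that concern is outside this sketch.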

Is a read with one secondary index faster than a read with multiple in cassandra?

I have a structure in which I want a user to see other users' feeds.
One way of doing it is to fan out an action to the feeds of all interested parties.
That would result in a query like select from feeds where userid=
Otherwise, I could avoid writing so much data and, since I am already doing a read, I could do:
select from feeds where userid IN (list of friends).
Is the second one slower? I don't have the application yet to test this with a lot of data/clustering. As the application is big, writing code to test a single node is not worth it, so I ask for your knowledge.
If your title is correct, and userid is a secondary index, then running a SELECT/WHERE/IN is not even possible. The WHERE/IN clause only works with primary key values. When you use it on a column with a secondary index, you will see something like this:
Bad Request: IN predicates on non-primary-key columns (columnName) is not yet supported
Also, the DataStax CQL3 documentation for SELECT has a section worth reading about using IN:
When not to use IN
The recommendations about when not to use an index apply to using IN
in the WHERE clause. Under most conditions, using IN in the WHERE
clause is not recommended. Using IN can degrade performance because
usually many nodes must be queried. For example, in a single, local
data center cluster with 30 nodes, a replication factor of 3, and a
consistency level of LOCAL_QUORUM, a single key query goes out to two
nodes, but if the query uses the IN condition, the number of nodes
being queried are most likely even higher, up to 20 nodes depending on
where the keys fall in the token range.
As for your first query, it's hard to speculate about performance without knowing about the cardinality of userid in the feeds table. If userid is unique or has a very high number of possible values, then that query will not perform well. On the other hand, if each userid can have several "feeds," then it might do ok.
Remember, Cassandra data modeling is about building your data structures for the expected queries. Sometimes, if you have 3 different queries for the same data, the best plan may be to store that same, redundant data in 3 different tables. And that's ok to do.
I would tackle this problem by writing a table geared toward that specific query. Based on what you have mentioned, I would build it like this:
CREATE TABLE feedsByUserId (
userid UUID,
feedid UUID,
action text,
PRIMARY KEY (userid, feedid));
With a composite primary key that has userid as the partition key, you will then be able to run the SELECT/WHERE/IN query mentioned above and achieve the expected results. Of course, I am assuming that the addition of feedid will make the entire key unique. If that is not the case, then you may need to add an additional field to the PRIMARY KEY. My example also assumes that userid and feedid are version-4 UUIDs. If that is not the case, adjust their types accordingly.
