Scylla token range select result ordering - cql

Is it guaranteed that rows returned for a token range CQL SELECT query are ordered by token value?
From the article https://www.scylladb.com/2017/02/13/efficient-full-table-scans-with-scylla-1-6/:
ScyllaDB orders partitions by a function of the partition key, known as the partitioner, and also as the token function
I'd like to have it confirmed that it's guaranteed (by a specification) because I'd like to implement efficient "group by partitioning key" without having to read the whole result set into memory. I'm using the latest Java driver for Scylla (not for C*) if that makes any difference.

Yes, it is guaranteed (I am the author of that article).
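Since rows come back in token order, all rows of one partition arrive next to each other, so a "group by partition key" only needs to hold one partition in memory at a time. Below is a minimal sketch of that idea in Python; the same logic should carry over to the row iterator of the Java driver. The row layout `(partition_key, clustering_key, value)` and the name `group_by_partition_key` are made up for illustration, and the list stands in for a paged result set.

```python
from itertools import groupby
from operator import itemgetter

def group_by_partition_key(rows, key=itemgetter(0)):
    """Yield (partition_key, rows_of_that_partition), one partition at a time.

    Relies on rows of the same partition being adjacent in the stream,
    which is what the token ordering provides.
    """
    for pk, group in groupby(rows, key=key):
        yield pk, list(group)

# Hypothetical rows, in the order a token range query might deliver them.
rows = iter([
    ("user-7", 1, "a"),
    ("user-7", 2, "b"),
    ("user-3", 1, "c"),
    ("user-9", 1, "d"),
    ("user-9", 2, "e"),
])

grouped = [(pk, len(part)) for pk, part in group_by_partition_key(rows)]
print(grouped)
```

Only the rows of the partition currently being consumed are materialized; the rest of the result set stays in the iterator.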

Related

How to read only all row keys in cassandra efficiently...

Accessing all rows from all nodes in Cassandra would be inefficient. Is there a way to access index.db, which already has the row keys? Is something of this sort supported out of the box in Cassandra?
There is no way to get all keys with one request without reaching every node in the cluster. There is however paging built-in in most Cassandra drivers. For example in the Java driver: https://docs.datastax.com/en/developer/java-driver/3.3/manual/paging/
This puts less stress on each node, as each request fetches only a limited amount of data. Each subsequent request continues from where the last one stopped, so you will eventually touch every result of the query you're making.
Edit: This is probably what you want: How can I get the primary keys of all records in Cassandra?
One possible option could be querying all the token ranges.
For example,
SELECT DISTINCT <partn_col_name> FROM <table_name> WHERE token(<partn_col_name>) >= <from_token_range> AND token(<partn_col_name>) < <to_token_range>
With the above query, you can get all the partition keys within the given token range. Adjust the token ranges depending on execution time.
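As a rough sketch of "querying all the token ranges", the snippet below splits the token space into equal sub-ranges and builds one query per sub-range. It assumes the Murmur3Partitioner, whose tokens are signed 64-bit integers. The helper names, the table `ks.users` and the column `id` are hypothetical. It uses inclusive upper bounds (`<=`) rather than `<` so that the last range can end exactly at the maximum token.

```python
MIN_TOKEN = -2**63      # lower bound of the Murmur3Partitioner token space
MAX_TOKEN = 2**63 - 1   # upper bound

def split_token_ranges(n):
    """Split the full token space into n contiguous (start, end) pairs, both inclusive."""
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 possible tokens
    bounds = [MIN_TOKEN + (total * i) // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]

def build_queries(table, pk_col, n):
    """Build one SELECT DISTINCT statement per sub-range."""
    template = ("SELECT DISTINCT {pk} FROM {table} "
                "WHERE token({pk}) >= {lo} AND token({pk}) <= {hi}")
    return [template.format(pk=pk_col, table=table, lo=lo, hi=hi)
            for lo, hi in split_token_ranges(n)]

ranges = split_token_ranges(4)
queries = build_queries("ks.users", "id", 4)
for q in queries:
    print(q)
```

Increasing `n` makes each query cheaper, which is one way to "adjust token ranges depending on execution time".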

Order of results in Cassandra

I have two questions about query results in Cassandra.
When I make a "full" select of a table in Cassandra (ie. select * from table) is it guaranteed that the results will be returned in increasing order of partition tokens?
For instance, having the following table:
create table users(id int, name text, primary key(id));
Is it guaranteed that the following query will return the results with increasing values in the token column?
select token(id), id from users;
If so, is it also guaranteed if the data is distributed to multiple nodes in the cluster?
If the answer to the above question is 'yes', is it still valid if we use a secondary index? For instance, if we had the following index:
create index on users(name);
and we query the table by using the index:
select token(id), id from users where name = 'xyz';
is there any guarantee regarding the order of results?
The motivation for the above questions is to find out whether the token is the right thing to use in order to implement paging and/or resuming of broken longer "data exports".
EDIT: There are multiple resources on the net that state that the order matches the token order (eg. in description of partitioner results or this Datastax page):
Without a partition key specified in the WHERE clause, the actual order of the result set then becomes dependent on the hashed values of userid.
However the order of results is not specified in official Cassandra documentation, eg. of SELECT statement.
Is it guaranteed that the following query will return the results with increasing values in the token column?
Yes, it is.
If so, is it also guaranteed if the data is distributed to multiple nodes in the cluster?
The data distribution is orthogonal to the ordering of the retrieved data; there is no relationship between the two.
If the answer to the above question is 'yes', is it still valid if we use secondary index?
Yes. Even if you query data using a secondary index (be it SASI or the native implementation), the returned results will always be sorted in token order. Why? The technical explanation is given in my blog post here: http://www.doanduyhai.com/blog/?p=13191#cluster_read_path
That is the main reason why SASI is not a good fit if you want the search to return data ordered by some column values. Only a real search engine integration (like Datastax Enterprise Search) can give you the correct ordering, because it bypasses the cluster read path layer.
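To illustrate the "resume a broken export" use case from the question, here is a small in-memory sketch. The token values are invented, and `fetch_after` merely stands in for a query of the form `SELECT token(id), id FROM users WHERE token(id) > <last_token>`. It assumes one row per partition, as in the `users` table above; with several rows per partition, a checkpoint on the token alone could skip the remaining rows of the partition that was being read.

```python
# Hypothetical (token, id) pairs, in the token order the cluster would return them.
TABLE = sorted([(-900, 4), (-120, 1), (35, 3), (410, 2), (877, 5)])

def fetch_after(last_token):
    """Stand-in for a token-filtered SELECT; no filter on the very first run."""
    return (row for row in TABLE if last_token is None or row[0] > last_token)

def export(checkpoint=None, fail_after=None):
    """Export ids in token order and return (exported_ids, checkpoint).

    fail_after mimics a broken run by stopping after that many rows.
    """
    exported = []
    for n, (token, id_) in enumerate(fetch_after(checkpoint)):
        if fail_after is not None and n == fail_after:
            break
        exported.append(id_)
        checkpoint = token  # remember the last token that was fully processed
    return exported, checkpoint

first, checkpoint = export(fail_after=2)   # run that breaks after two rows
rest, _ = export(checkpoint=checkpoint)    # resumed run
print(first + rest)
```

The resumed run picks up right after the last exported token, so no row is exported twice and none is skipped in this simplified setting.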

Understanding the Token Function in Cassandra

Hello, I was reading the Cassandra documentation on the token function.
I am trying to implement pagination for a Cassandra table, and I am unable to understand the highlighted lines. The documentation describes the difference between k > 42 and TOKEN(k) > TOKEN(42), but I do not understand the "token-based comparison".
I am looking forward to a detailed explanation of what the token function does when it is part of a WHERE clause.
To decide in which partition it should put your data, C* performs a calculation on the PARTITION KEY of every row. Specifically, on each node, rows are sorted by the token generated by the partitioner (and within each partition, rows are sorted by the clustering key). Different partitioners perform different types of calculations.
While the Murmur3Partitioner calculates the MurmurHash of the partition key, the ByteOrderedPartitioner uses the raw data bytes of the partition key itself: with the Murmur3Partitioner, your rows are sorted by their hashes, while with the ByteOrderedPartitioner, they are sorted directly by their raw values.
As an example, assume you have a table like this:
CREATE TABLE test (
username text,
...
PRIMARY KEY (username)
);
And assume you're trying to locate where the rows corresponding to the usernames abcd, abce and abcf are stored. The hex representations of these strings are 0x61626364, 0x61626365 and 0x61626366 respectively. Assuming we apply this MH3 implementation (x86, 32-bit for simplicity, no optional seed) to the three strings, we get 0x43ED676A, 0xE297E8AA and 0x87E62668 respectively. So, in the case of MH3, the tokens of the strings will be these 3 values, while in the case of the BOP the tokens will be the raw data values themselves: 0x61626364, 0x61626365 and 0x61626366.
Now you can see that storing data sorted by token produces different results when different partitioners are used. A SELECT * FROM test; query would return rows in a different order. This can (but should not) be a problem if you have data already sorted by raw value and you need to retrieve it in the same order, because when you use MH3 the order is completely unrelated to your data.
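The difference in ordering can be checked with the values quoted above. This is only an illustrative sketch: the hashes are the simplified 32-bit numbers from this answer, compared as unsigned integers, whereas real Murmur3Partitioner tokens are signed 64-bit values.

```python
# Values quoted in the answer: raw bytes and simplified 32-bit MH3 hash per username.
raw_bytes = {"abcd": 0x61626364, "abce": 0x61626365, "abcf": 0x61626366}
mh3_hash = {"abcd": 0x43ED676A, "abce": 0xE297E8AA, "abcf": 0x87E62668}

# Sorting by token under each partitioner gives the order SELECT * would use.
bop_order = sorted(raw_bytes, key=raw_bytes.get)  # ByteOrderedPartitioner
mh3_order = sorted(mh3_hash, key=mh3_hash.get)    # Murmur3Partitioner (simplified)

print("BOP:", bop_order)
print("MH3:", mh3_order)
```

With the byte-ordered tokens the usernames come back alphabetically, while with the hashed tokens abcf comes before abce.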
Back to the question, the TOKEN function allows you to filter directly by the tokens of your data instead of your data. The documentation says:
ordering with the TOKEN function does not always provide the expected
results. Use the TOKEN function to express a conditional relation on a
partition key column. In this case, the query returns rows based on
the token of the partition key rather than on the value.
As an example, you could issue:
SELECT * FROM test WHERE TOKEN(username) <= TOKEN('abcf');
and you'd get, guess what, the abcd and abcf rows! This is because order sometimes matters, as in the case of the pagination you're trying to do, which any available C* driver (e.g. the Java driver) will handle flawlessly for you.
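A small sketch of why abce is missing from that result, again using the simplified 32-bit hashes quoted earlier as stand-in tokens (an assumption for illustration only, since real tokens are signed 64-bit values). It contrasts what a comparison on the value itself would select with what the token comparison selects.

```python
# Simplified 32-bit MH3 hashes from the answer, used here as stand-in tokens.
mh3_token = {"abcd": 0x43ED676A, "abce": 0xE297E8AA, "abcf": 0x87E62668}

# Comparison on the value itself: username <= 'abcf'
by_value = sorted(u for u in mh3_token if u <= "abcf")

# Comparison on the token: TOKEN(username) <= TOKEN('abcf')
by_token = sorted(
    (u for u in mh3_token if mh3_token[u] <= mh3_token["abcf"]),
    key=mh3_token.get,
)

print(by_value)
print(by_token)
```

The token of abce is larger than the token of abcf, so the token-based condition excludes it even though abce sorts before abcf as a string.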
That said, the recommended partitioner for new clusters is Murmur3Partitioner; you can check the documentation for the pros and cons of each partitioner. Please note that the partitioner is a cluster-wide setting, and once set you cannot change it without pushing all of your data into another cluster.
Make your choice carefully.
Cassandra data is partitioned based on the token of the row's partition key. The token is generated using a hash function. The TOKEN function returns the value that would be produced by applying that hash function to its arguments.
That said, almost all drivers now page automatically by default.

Cassandra: Controlling which node receives data

My understanding of Cassandra's recommended clustering approach is to ensure that each node in the cluster receives an equal distribution of data, by hashing a document's unique Id. My question is if there is a way to change this and define a custom key for "intelligently" routing a document to a specific node in the cluster?
In my scenario, I have data which relates to a specific entity (think client-project-task-item). Across all my data, I will have enough items to require some horizontal scaling; however, each search will always relate to a given client-project-task, for which the data set is only of moderate size.
Is there a way to create this type of partitioning / routing (different names I've seen for the same thing) logic in Cassandra?
Thanks; Brent
The clustering approach in Cassandra is not just about an equal distribution of data. It also ensures that all read/write operations are distributed across the cluster, which makes these operations faster. In addition, you will most likely have a replication factor greater than 1 to ensure data redundancy, so that a node failure does not result in data loss.
Back to your question and to your own answer. If you use the same partition key for the data, Cassandra's partitioning guarantees that the primary replica of that data is stored on the same node. Moreover, the data is stored in the same partition (a "wide row" in the old terminology).
I think - http://www.datastax.com/documentation/cql/3.0/share/glossary/gloss_partition_key.html - is the answer I'm looking for
The first column declared in the PRIMARY KEY definition, or in the case of a compound key, multiple columns can declare those columns that form the primary key.

Why does the CassandraRDD compute method execute a bunch of token range searches?

So I was wondering (and can't figure out through looking at the code) exactly why the datastax cassandra driver does a bunch of token range searches.
For example,
http://pastebin.com/3gux40vU
The code that we use is
rdd.select("bucket").collect().foreach(println)
It happens for any select that we do, regardless of whether or not we call collect(). The table drop_me_soon is a temporary table with the schema bucket int PRIMARY KEY. It has one single entry of 0. In particular, it seems like the code
val rowIterator = tokenRanges.iterator.flatMap(fetchTokenRange(session, _))
Causes it to do all of the token range searches, but I could be wrong. Could anyone here shed some light?
The Spark driver performs a full scan over the complete token range. It queries the system.peers table to learn which host owns which set of token ranges, and from that the location and replica placement. It then maps Cassandra’s token ranges to Spark’s partitions. This is a many-to-one mapping in the case of v-nodes and one-to-one otherwise.
It then schedules the computation of each partition by running a token range query on the available workers according to the mapping computed above. For each replica, it prefers the worker running on the replica itself.
In your case, Cassandra really does not know how many rows there are across all the nodes. If you are using a single Cassandra node, you still have v-nodes, which split the token range into 256 splits. So it will always scan all of the token range splits to get the result.
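The effect can be mimicked with a toy model: one simulated range query per token range, regardless of how many rows exist. This is not the connector's actual code. The evenly spaced ranges are a simplification (real v-node tokens are assigned randomly), and the token chosen for the single row is invented.

```python
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1

def vnode_ranges(num_vnodes):
    """Evenly spaced inclusive token ranges covering the whole token space."""
    step = 2**64 // num_vnodes
    starts = [MIN_TOKEN + i * step for i in range(num_vnodes)]
    ends = [s - 1 for s in starts[1:]] + [MAX_TOKEN]
    return list(zip(starts, ends))

def scan(rows_by_token, ranges):
    """Run one simulated range query per token range; return (rows found, queries issued)."""
    found, queries = [], 0
    for lo, hi in ranges:
        queries += 1  # a query is issued even if the range turns out to be empty
        found += [value for token, value in rows_by_token if lo <= token <= hi]
    return found, queries

table = [(1234567890, 0)]  # hypothetical token for the single row bucket = 0
found, queries = scan(table, vnode_ranges(256))
print(found, queries)
```

Even though the table holds a single row, all 256 ranges are queried, because the scanner cannot know in advance which range contains data.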

Resources