As per this DataStax doc, atomicity in Cassandra is:
In Cassandra, a write is atomic at the partition-level, meaning inserting or updating columns in a row is treated as one write operation.
Whereas according to this other DataStax doc, atomicity in Cassandra is:
In Cassandra, a write operation is atomic at the partition level, meaning the insertions or updates of two or more rows in the same partition are treated as one write operation.
My confusion is whether atomicity is seen on a single-row basis, or whether it can include multiple rows of a table at the partition level.
I assume it is a combination of both, depending on the type of query we are executing in Cassandra.
For example:
If I have an insert query, it will always insert one row into a partition, so Cassandra ensures that this row is inserted successfully at the partition level.
But if I have an update query whose WHERE clause qualifies multiple rows, then the update being atomic at the partition level means either all qualified rows will be updated or none will be.
Is my understanding correct?
"row" and "partition" get conflated since previously row meant partition and now row means a part of a partition.
They are atomic at the partition level. Keep in mind that's in reference to a single replica, so one or multiple rows in a batch containing 5 columns are all updated in a single operation on that one replica (there is no cross-node isolation). If you're setting (key, value) VALUES ('abc', 'def'), you will never see just the key set and not the value. However, you might make a read where only one replica has it set while another does not, meaning that depending on your replication factor and the consistency level requested, you will see either the whole thing or nothing. This applies to multiple rows within a partition as well, but you cannot update 2 rows with a single UPDATE statement; you need a batch (logged or unlogged).
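For instance, a minimal sketch of such a multi-row update (the user_events table, its columns, and the values are hypothetical, purely for illustration):

BEGIN UNLOGGED BATCH
-- both rows live in the same partition (user_id = 'abc'),
-- so the batch is applied as a single mutation on each replica
UPDATE user_events SET status = 'done' WHERE user_id = 'abc' AND event_id = 1;
UPDATE user_events SET status = 'done' WHERE user_id = 'abc' AND event_id = 2;
APPLY BATCH;

Because both statements target the same partition, an unlogged batch suffices here; a logged batch is what Cassandra offers when the statements span multiple partitions.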
Related
I understand that two tables with the same partition key columns and values have the same token generated. Does that mean that all the cells of this partition in both tables are actually in the same partition? How does Cassandra store data internally?
E.g.:
CREATE TABLE table1 (emp_id int PRIMARY KEY, name text, role text);
CREATE TABLE table2 (emp_id int PRIMARY KEY, name text, role text);
INSERT INTO table1(emp_id, name, role) VALUES (1, 'sahil', 'MTS');
INSERT INTO table2(emp_id, name, role) VALUES (1, 'sahil', 'MTS');
SELECT token(emp_id) FROM table1 WHERE token(emp_id) = token(1);
system.token(emp_id)
----------------------
7447223576279188802
SELECT token(emp_id) FROM table2 WHERE token(emp_id) = token(1);
system.token(emp_id)
----------------------
7447223576279188802
For your example, because both tables have the same partition key, identical values will map to the same token. On insert, the hash function is applied to the partition key to determine which replicas will get the data. If you use the Murmur3 partitioner (the default), you get a consistent token value, i.e. for the same partition key definition and value, the result is always the same. You can reference this page for background:
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archDataDistributeHashing.html
Rows (items of data) that have the same table and the same partition key are said to be in the same partition. The most important consequence of being in the same partition is that data in the same partition is guaranteed to be co-located: handled by the same replica nodes and, in ScyllaDB, even by the same CPU. This allows efficiently scanning a partition: all the partition's data can be read from a single node, and Cassandra doesn't need to go back and forth between replicas to read the various pieces of the partition and combine them. This is also what allows a node holding the partition's full data to keep it sorted by the clustering key: a process called compaction merges different pieces of a sorted partition (these are sstables, or sorted string tables) into a bigger sorted partition.
When you have two different tables in the same keyspace, and use the same partition key in both, they are not stored physically on disk together - because each table has its own set of sstables (files on disk), so in that sense they are not "in the same partition". However, the co-location property which I mentioned earlier still holds (if the two tables are in the same keyspace): Two identically-keyed partitions in the two tables will be stored on exactly the same node. Why is this important/useful? Usually it isn't. One place where this knowledge can become useful is that it can be used in some situations to achieve atomic batch write to both tables at once, utilizing the fact that all replicas will see both writes together, whereas usually two writes to two tables go to different nodes at different times.
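As a concrete illustration using the two tables from the question above, a logged batch like the following (a sketch, assuming both tables live in the same keyspace) is delivered to the same replica set, because the identical partition key values hash to the identical token:

BEGIN BATCH
-- same partition key value in both tables, so both mutations
-- go to the same replica nodes and are seen together
INSERT INTO table1 (emp_id, name, role) VALUES (1, 'sahil', 'MTS');
INSERT INTO table2 (emp_id, name, role) VALUES (1, 'sahil', 'MTS');
APPLY BATCH;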
I am trying to execute 3 conditional inserts to different tables inside a batch by using the Cassandra cpp-driver:
BEGIN BATCH
INSERT INTO table1 (...) VALUES (...) IF NOT EXISTS;
INSERT INTO table2 (...) VALUES (...) IF NOT EXISTS;
INSERT INTO table3 (...) VALUES (...) IF NOT EXISTS;
APPLY BATCH;
But I am getting the following error:
Batch with conditions cannot span multiple tables
If the above is not possible in Cassandra, what is the alternative to perform multiple conditional inserts as a transaction and ensure that all succeed or all fail?
I'm afraid there are no alternatives. Conditional statements in a BATCH are limited to a single table only, and I don't think there's room for that to change in the future.
This is due to how Cassandra works internally: a batch containing a conditional update (this is called a lightweight transaction) can only be used in one partition, because it is based on Paxos, and the Paxos implementation works at the partition level only. Moreover, with multiple conditional statements in the same BATCH, all the conditions must be satisfied for the batch to succeed; if even one conditional update fails, the entire batch fails.
You can read more about BATCH statements in the documentation.
You'd basically pay a performance hit for the conditional update, and another performance hit for the batched operation, and C* stops you before you get that far.
It seems to me you designed this RDBMS-style. A NoSQL alternative (I don't know whether it can be applied to your use case, though) is to denormalize your data into a fourth table that combines the other three, and then issue a single conditional write to that fourth table.
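A minimal sketch of that idea (the table name and columns here are hypothetical; in practice table4 would carry the union of the columns of the three original tables):

CREATE TABLE table4 (
  id int PRIMARY KEY,
  t1_data text,   -- columns that used to live in table1
  t2_data text,   -- columns that used to live in table2
  t3_data text    -- columns that used to live in table3
);

INSERT INTO table4 (id, t1_data, t2_data, t3_data)
VALUES (1, 'a', 'b', 'c')
IF NOT EXISTS;

Since the conditional write now touches a single partition of a single table, it is a plain lightweight transaction and either succeeds or fails as a whole.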
Little problem here with Cassandra. Basically my data has a status (INITIALIZED, PERFORMED, ENDED, ...), and I have different scheduled tasks that query this data based on the status with an IN clause. So one scheduler works with the data that is INITIALIZED, one with the PERFORMED, some with both, etc.
Once the data is retrieved, it is processed and the status changes accordingly (INITIALIZED -> PERFORMED -> ENDED).
The problem: in order to be able to use the IN clause, the status has to be among the primary key columns of my table. But when I update the status... it creates a new record in my table, since the UPSERT doesn't find any data with the given primary key...
How do I solve that?
Instead of including the status column in your primary key columns, you can create a secondary index on the column. However, the IN clause is not (yet) supported for secondary index columns. But as you have a very limited number of values to look up, you could use equality conditions in your WHERE clause and then merge the results client-side (see the sketch after the list below).
Beware that using secondary indexes comes at a cost. Check out "when not to use an index". In your case these points may apply:
- On a frequently updated or deleted column. See "Problems using an index on a frequently updated or deleted column" below.
- To look for a row in a large partition unless narrowly queried. See "Problems using an index to look for a row in a large partition unless narrowly queried" below.
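A minimal sketch of the index-plus-equality approach (the tasks table and its columns are hypothetical):

CREATE TABLE tasks (
  task_id uuid PRIMARY KEY,
  status text,
  payload text
);

CREATE INDEX tasks_status_idx ON tasks (status);

-- one equality query per status value instead of a single IN query:
SELECT * FROM tasks WHERE status = 'INITIALIZED';
SELECT * FROM tasks WHERE status = 'PERFORMED';

A scheduler interested in both statuses runs both queries and merges the two result sets in application code.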
We have a table with 15 million records, and ours is a 10-node Cassandra cluster. We have a column which has close to 20 repeatable values. Is it advisable to build a secondary index on this column?
Assuming a completely uniform distribution on that column, each column value would map to 750,000 rows. Now, while the DataStax doc on When To Use An Index states that...
built-in indexes are best on a table having many rows that contain the indexed value.
750,000 rows certainly qualifies as "many." But even given that, remember that you're also talking about 14,250,000 rows that Cassandra has to ignore when fulfilling your query.
Also, unless you have an RF of 10 (and I doubt that you would with 10 nodes), you are going to incur network time as Cassandra works between all of the different nodes required to fulfill your query. For 750,000 rows, that's probably going to time out.
The only way I think this could be efficient, would be to first restrict your query by a partition key. Using the secondary index while also restricting with a partition key will help Cassandra find your rows more quickly. Even so, with a dataset that big, I would re-evaluate your data model and try to figure out a different table to fulfill that query without requiring a secondary index.
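For example, a minimal sketch of what "restrict by partition key first" looks like (the employees table, its keys, and the values are hypothetical):

CREATE TABLE employees (
  dept_id text,
  emp_id int,
  role text,
  PRIMARY KEY (dept_id, emp_id)
);

CREATE INDEX employees_role_idx ON employees (role);

-- the partition key restriction confines the index lookup to one
-- partition's replicas instead of fanning out across the whole cluster:
SELECT * FROM employees WHERE dept_id = 'sales' AND role = 'MTS';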
Consider a table 'X' with each row consisting of six attributes. Suppose 'X' is already filled with N rows and the setup contains r replicas in the cluster. Now if I perform an update to only one column of a row, then only the updated column will be propagated to its corresponding replicas (along with a key identifier). Is my understanding correct, or will the whole row be propagated to its replica nodes?
Thanks,
Chethan
You are correct, only the one new column will be sent to replicas. Cassandra is designed to do writes with no disk seeks, so it cannot do reads to propagate writes. (The exception is for counters and some operations on collections, where reads are made on the coordinator for the update. But even then, the column propagated is only the column being updated.)
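To make that concrete, a minimal sketch (the column names, key, and values are hypothetical, standing in for the six-attribute table 'X' from the question):

UPDATE X SET attr3 = 'new-value' WHERE id = 42;
-- the mutation shipped to the r replicas carries only the partition key,
-- the attr3 cell, and its write timestamp; the other five attributes
-- are neither read nor re-sent.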