Cassandra Delete query with If condition not working - cassandra

I have a Cassandra table and want to delete a row, but only if one column has a specific value.
Even though Cassandra claims that the delete succeeded (it returned "applied: true"), the row is still present.
Let's create the table and insert some data:
CREATE TABLE IF NOT EXISTS test
(
id uuid PRIMARY KEY,
recipient text,
message text
);
INSERT INTO test (id, recipient, message)
VALUES (7ee055ee-b5dd-4bfd-b184-614d51e268d5, 'felix', 'foo');
INSERT INTO test (id, recipient, message)
VALUES (86c9d632-dc24-4635-8277-c987c78bd242, 'andrew', 'bar');
Now I want to delete one message, but only if the user who requests the deletion (in this case felix) is the recipient and thus has permissions to do so:
cqlsh:service_message> DELETE FROM test WHERE id=7ee055ee-b5dd-4bfd-b184-614d51e268d5 IF recipient='felix';
 [applied]
-----------
      True
So I would now think that the query did succeed, but if we have a look at the table we'll see that the message still exists.
cqlsh:service_message> SELECT * FROM test;
 id                                   | message | recipient
--------------------------------------+---------+-----------
 86c9d632-dc24-4635-8277-c987c78bd242 |     bar |    andrew
 7ee055ee-b5dd-4bfd-b184-614d51e268d5 |     foo |     felix
(2 rows)
Some additional information:
cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4
cqlsh> DESCRIBE KEYSPACE service_message
CREATE KEYSPACE service_message WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE service_message.test (
id uuid PRIMARY KEY,
message text,
recipient text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

INSERT and UPDATE statements using the IF clause support lightweight transactions.
From the DataStax docs on CQL: https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertLWT.html
I'm pretty sure deletes are not supported. If you want to effectively delete your information, you could set the values of the cells to null in an UPDATE statement. Whether you delete or set nulls, you are still creating tombstones.
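As a rough sketch of the UPDATE approach, using the test table and the id from the question. This only illustrates the syntax; whether it behaves differently from the conditional DELETE on that cluster is an assumption I have not verified.

```cql
-- Conditionally clear the non-key cells of the row.
-- id is the primary key, so it cannot be set to null.
UPDATE test
SET message = null, recipient = null
WHERE id = 7ee055ee-b5dd-4bfd-b184-614d51e268d5
IF recipient = 'felix';

-- Check what is left of the row afterwards.
SELECT * FROM test WHERE id = 7ee055ee-b5dd-4bfd-b184-614d51e268d5;
```

Like the DELETE in the question, the conditional UPDATE reports an [applied] column, so the result can be compared directly.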

Related

Cassandra queries perform a full table scan if no rows exist for a specific partition key

I have a very large table like
CREATE TABLE IF NOT EXISTS profile (
account_id text,
user_id uuid,
user_data text,
creation_date timestamp,
update_date timestamp,
PRIMARY KEY ((account_id, user_id))
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': '10'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
The following query runs a full table scan if the table has no rows matching the partial partition key (account_id = 'D-F-8CX7PGX'):
SELECT * FROM profile WHERE account_id = 'D-F-8CX7PGX' AND user_id = 123e4567-e89b-12d3-a456-426614174000;
I expected Cassandra to return quickly with no rows found instead of scanning the full table.
Someone suggested that inserting a dummy row with (account_id = 'D-F-8CX7PGX' AND user_id = 00000000-0000-0000-0000-000000000000) could avoid the full table scan, but I don't understand why that would be needed.
Has anyone encountered a similar issue?
A single partition query does not do a full table scan.
Since the partition key is (account_id, user_id) and your query filters on a single partition, Cassandra will attempt to retrieve the partition from the relevant replica(s) without scanning the whole table. Cheers!
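One way to check this yourself is request tracing in cqlsh. A minimal sketch, assuming the profile table from the question and a uuid literal written without quotes:

```cql
-- cqlsh command: print a trace for every following request.
TRACING ON;

-- Both partition key columns are restricted by equality,
-- so this query addresses exactly one partition.
SELECT * FROM profile
WHERE account_id = 'D-F-8CX7PGX'
  AND user_id = 123e4567-e89b-12d3-a456-426614174000;

-- cqlsh command: stop tracing again.
TRACING OFF;
```

The trace lists the steps taken by the coordinator and the replicas, which should make it visible whether a single partition was read or a range was scanned.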

Problems performing an update on Cassandra having a compound partitioning key

I have this table in Cassandra:
CREATE TABLE wear_dealer.product_color_size_stock (
productcode text,
colorcode text,
sizecode text,
ean text,
shortdescription text,
stock int,
PRIMARY KEY (productcode, colorcode, sizecode)
) WITH CLUSTERING ORDER BY (colorcode ASC, sizecode ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX product_color_size_stock_stock_idx ON wear_dealer.product_color_size_stock (stock);
How can I update shortdescription when I only have the value for productcode?
When I perform this query:
cqlsh:wear_dealer> update seasons_product_color_size
set shortdescription ='AAA'
where productcode='RUNTS';
I get the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Some partition key parts are missing: seasoncode"
Is there any strategy to overcome this?
Many thanks in advance!
Unfortunately, CQL does not allow writes for a partial key. Remember that Cassandra treats INSERTs and UPDATEs the same. So when this:
UPDATE seasons_product_color_size
SET shortdescription ='AAA'
WHERE productcode='RUNTS';
returns "Some partition key parts are missing: seasoncode",
it's saying that Cassandra doesn't know which node to write the data to, because part of the partition key is missing. In SQL, the database would just iterate through all rows in the table and update those matching your WHERE clause. But Cassandra is specifically designed not to allow operations like that.
For this query you will need to figure out the missing seasoncodes separately, and UPDATE each row individually.
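A sketch of that two-step approach. It assumes the product_color_size_stock schema shown in the question (the seasons_product_color_size table named in the error is not shown), and 'BLUE' and 'M' are hypothetical key values standing in for whatever the SELECT returns.

```cql
-- Step 1: look up the remaining key columns for the product.
SELECT colorcode, sizecode
FROM wear_dealer.product_color_size_stock
WHERE productcode = 'RUNTS';

-- Step 2: issue one UPDATE per returned row, naming the full primary key.
-- 'BLUE' and 'M' are hypothetical values taken from step 1.
UPDATE wear_dealer.product_color_size_stock
SET shortdescription = 'AAA'
WHERE productcode = 'RUNTS'
  AND colorcode = 'BLUE'
  AND sizecode = 'M';
```

Step 2 is repeated for each (colorcode, sizecode) pair from step 1, typically from application code.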
Cassandra supports writes based on the partition key. Since you supplied only part of the partition key, you cannot update with that.
UPDATE seasons_product_color_size SET shortdescription ='AAA' WHERE productcode='RUNTS' and sizecode=10

How to use both ORDER BY and IN together in a query in Cassandra?

I use Cassandra 3.0.5.
I'm having a problem using ORDER BY and IN together.
Schema:
CREATE TABLE my_status.user_status_updates (
username text,
id timeuuid,
body text,
PRIMARY KEY (username, id))
WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
Query:
SELECT username, id, UNIXTIMESTAMPOF(id), body
FROM user_status_updates
WHERE username IN ('carol', 'dave')
ORDER BY id DESC
LIMIT 2;
InvalidRequest: code=2200 [Invalid query] message="Cannot page queries with both ORDER BY and a IN restriction on the partition key; you must either remove the ORDER BY or the IN and sort client side, or disable paging for this query"
I'm sure I've seen people query this without errors, so I know there is a way to get around this. What do I need to do to make this query work, or is it inefficient to query both ORDER BY and IN together?
You've set the Clustering Key to be ordered ASC, but are requesting it be ordered DESC in your query. These are at odds and are counter-productive. If you change the Clustering Key to DESC, then you won't need the ORDER BY clause in the query. If you truly do need to have the Clustering Key be ASC for other queries, then I would suggest a second table with it being DESC. Design your tables for what the query will require. Hope that helps.
Adam
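A minimal sketch of the second-table idea. The table name user_status_updates_newest_first is hypothetical, and the application would have to write each status update to both tables.

```cql
-- Same columns as the original table, but stored newest-first within each partition.
CREATE TABLE my_status.user_status_updates_newest_first (
    username text,
    id timeuuid,
    body text,
    PRIMARY KEY (username, id)
) WITH CLUSTERING ORDER BY (id DESC);

-- One query per user; no ORDER BY is needed because of the clustering order.
SELECT username, id, body
FROM my_status.user_status_updates_newest_first
WHERE username = 'carol'
LIMIT 2;

SELECT username, id, body
FROM my_status.user_status_updates_newest_first
WHERE username = 'dave'
LIMIT 2;
```

Each query returns that user's two newest updates; picking the two newest overall is then a small merge on the client side.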
Using both IN and ORDER BY will require turning off paging with the PAGING OFF command in cqlsh.
Running PAGING OFF at the cqlsh> prompt is the answer.
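For illustration, a cqlsh session along these lines, using the query from the question with the table name fully qualified:

```cql
-- cqlsh command: disable paging for the following queries.
PAGING OFF;

SELECT username, id, UNIXTIMESTAMPOF(id), body
FROM my_status.user_status_updates
WHERE username IN ('carol', 'dave')
ORDER BY id DESC
LIMIT 2;

-- cqlsh command: turn paging back on afterwards.
PAGING ON;
```

PAGING is a cqlsh command, not part of CQL itself, so an application using a driver would need that driver's own way of disabling paging for the query.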

Select distinct gives incorrect values even if performed on primary key Cassandra

I'm running Cassandra version 2.1.2 and cqlsh 5.0.1.
Here is the table weather.log; weather is the keyspace, queried at consistency level ONE.
I have 2 nodes configured.
CREATE KEYSPACE weather WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east': '1'} AND durable_writes = true;
CREATE TABLE weather.log (
ip inet,
ts timestamp,
city text,
country text,
PRIMARY KEY (ip, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
When we run the query
select distinct ip from weather.log;
we get inconsistent, wrong results. One run returns 99 rows and the next returns 1600, whereas the actual number should be more than 2000.
I have also tried this query with the consistency level set to ALL. It didn't work.
Why is this happening? I need to get all the keys. How can I get all the primary keys?
It looks like you might be affected by CASSANDRA-8940. I'd suggest upgrading to the latest 2.1.x release and verifying whether the issue is fixed for you.
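A small sketch of how the check could look from cqlsh after upgrading, assuming the weather.log table from the question:

```cql
-- Run against each node: the Cassandra version that node is running.
SELECT release_version FROM system.local;

-- cqlsh command: require every replica to respond.
CONSISTENCY ALL;

-- Re-run the query a few times and compare the row counts between runs.
SELECT DISTINCT ip FROM weather.log;
```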

Commenting Cassandra's keyspace, table, column

In Oracle it is possible to add a comment about a table, view, materialized view, or column to the data dictionary, e.g.
COMMENT ON COLUMN employees.job_id
IS 'abbreviated job title';
I found this particularly useful as a tester when trying to understand the ideas behind names that are not necessarily self-explanatory, especially in large databases (over 200 tables).
Is there such a feature in Cassandra?
You can use the WITH comment option:
cqlsh:d2> create table employee (id int primary key, name text) with comment = 'Employee id and name';
cqlsh:d2> desc table employee;
CREATE TABLE d2.employee (
id int PRIMARY KEY,
name text
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'Employee id and name'
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
Cassandra documentation
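The comment is an ordinary table option, so it can also be changed on an existing table. A short sketch using the d2.employee table from above (the comment text itself is just an example):

```cql
-- Replace the comment on an existing table.
ALTER TABLE d2.employee
WITH comment = 'Employee id and name, one row per employee';

-- cqlsh command: show the table definition, including the comment.
DESC TABLE d2.employee;
```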
