Using multiple TTL values in a Cassandra table

What are the disadvantages of using multiple TTL values (one at the table level and another on specific rows to override the TTL for those rows) in a Cassandra table? Will it result in incomplete data cleanup?
Since TWCS is being used, we may never get SSTables in which all rows' tombstones have expired, so whole SSTables could never be dropped.
What are the best ways to overcome this issue? We are thinking of segregating the data so that all rows in a given table share a single TTL. Any other thoughts, please? (v3.11.5 being used)

Related

Is it possible to set default time to live for existing Cassandra table and also apply this TTL to all the existing records in the table with CQL

My requirement is to clear out records after 5 days. I have already created a table, but didn't set this time-to-live configuration at the table level. Now I want to set it up for the table and also apply it to the existing records in the table.
You can set a default TTL for the table, but there is no way to go back and change a record without a TTL to have a TTL (i.e. the default TTL will apply a TTL value to a record that does not have one during insert time). In your case, after the default TTL is set at the table level, you will have to find and delete any rows in the table with code/cql/etc, manually, after they're considered "stale". This will create tombstones, and if there is an "overwhelming" number of rows that have tombstones, you might see performance issues and failures. Compaction will clean them up eventually, or you can clean them up yourself with a manual compaction (and possibly re-split again if the single generated sstable is large).
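For example, a minimal sketch of that approach (keyspace, table, and key values are hypothetical) would be to set the table default and then explicitly delete the rows that were written before the default existed:
-- apply a default TTL of 5 days (432000 seconds) to writes made from now on
ALTER TABLE my_keyspace.my_table WITH default_time_to_live = 432000;
-- existing rows keep no TTL, so stale ones must be removed explicitly
DELETE FROM my_keyspace.my_table WHERE id = 'some-stale-key';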
If this table is an INSERT ONLY type of table that will always have TTLs, you may want to consider TWCS. It will reduce the compaction workload significantly, and it also offers you other options to clean up data that does not have TTLs.
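If you go the TWCS route, a table definition along these lines (names and window settings are purely illustrative) pairs the default TTL with time-windowed compaction:
CREATE TABLE my_keyspace.events (
    id uuid PRIMARY KEY,
    payload text
) WITH default_time_to_live = 432000
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': 1};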
Hopefully this helps.
-Jim

Cassandra simple primary key queries

We would like to create a Cassandra table with a simple primary key that consists of a UUID column.
The table will look like:
CREATE TABLE simple_table (
    id UUID PRIMARY KEY,
    col1 text,
    col2 text,
    col3 UUID
);
This table will potentially store a few billion rows, and the rows should expire after some time (a few months) using the TTL feature.
I have a few questions regarding the efficiency of this table:
What is the efficiency of a query against this table using the primary key? Meaning, how does Cassandra find a specific row after resolving which partition it resides in?
Considering that the rows will expire and create many tombstones, how will this affect reads and writes to this table? Let's say we expire the data after 180 days; if I am not mistaken, the ratio of tombstones would be 10/180 ≈ 0.056 (where 10 is gc_grace_seconds expressed in days).
In your case, the primary key is equal to the partition key, so you have so-called "skinny" partitions consisting of one row. If you remove data, then instead of the data inside the partition you'll have only a tombstone, and that's not a problem. If the data is expired, it will simply be removed during compaction - gc_grace_period isn't applied here - it's required only when you explicitly remove the data, because the tombstone must be kept so other nodes can "catch up" with changes if they weren't able to receive the delete operation. You can find more details about data deletion in the following document.
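As a minimal illustration using the simple_table from the question, a point lookup touches exactly one partition and at most one row:
-- the coordinator routes straight to the partition owning this id;
-- at most one live row (or one expired cell) has to be read
SELECT col1, col2, col3 FROM simple_table WHERE id = 123e4567-e89b-12d3-a456-426614174000;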
Problems with tombstones arise when you have many (thousands of) rows inside the same partition, for example if you use several clustering keys. When such data is deleted, a tombstone is generated and has to be skipped when we read data inside the partition.
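For contrast, here is a sketch of a "wide" partition layout (hypothetical table and values) where a range read has to step over the tombstones left by deletes:
CREATE TABLE events_by_user (
    user_id text,
    event_time timeuuid,
    payload text,
    PRIMARY KEY (user_id, event_time)
);
-- each delete leaves a tombstone inside the 'u1' partition
DELETE FROM events_by_user WHERE user_id = 'u1' AND event_time = 50554d6e-29bb-11e5-b345-feff819cdc9f;
-- this scan must read past every tombstone in that partition
SELECT * FROM events_by_user WHERE user_id = 'u1';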
P.S. Have you seen this blog post that explains how deletions happen?
After reading the blog (and the comments) that @Alex referred me to, I concluded that tombstones are created for expired rows due to the default_time_to_live of the table.
Those tombstones will be cleaned up only after gc_grace_seconds has passed. See this Stack Overflow question.
Regarding my first question, this DataStax page describes it pretty well.

Cassandra failure during read query

I have a Cassandra table with ~500 columns and primary key ((userId, version, shredId), rowId), where shredId is used to distribute data evenly into different partitions. The table also has a default TTL of 2 days to expire data, as the data is used for real-time aggregation. The compaction strategy is TimeWindowCompactionStrategy.
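A sketch of such a schema (the value columns here stand in for the ~500 real ones; the table name is hypothetical) might look like:
CREATE TABLE input_table (
    userId text,
    version int,
    shredId int,
    rowId timeuuid,
    metric1 double,
    metric2 double,   -- ... plus several hundred more value columns
    PRIMARY KEY ((userId, version, shredId), rowId)
) WITH default_time_to_live = 172800   -- 2 days
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};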
The workflow is:
write data to input table (with consistency EACH_QUORUM)
run Spark aggregation (on rows with the same userId and version)
write aggregated data to output table.
But I'm getting "Cassandra failure during read query" when the size of the data gets large; more specifically, once there are more than 210 rows in one partition, read queries fail.
How can I tune my database and change properties to fix this?
After investigation and research, the issue was caused by null values being inserted for some empty columns. This creates a large number of tombstones and eventually times out the query.
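To illustrate with the hypothetical schema sketched above: binding an explicit null writes a tombstone cell, whereas simply omitting the column does not, so leaving empty columns out of the statement avoids the problem:
-- binding an explicit null for metric2 writes a tombstone cell
INSERT INTO input_table (userId, version, shredId, rowId, metric1, metric2)
VALUES ('u1', 1, 3, now(), 1.5, null);
-- omitting metric2 entirely leaves it unset and writes no tombstone
INSERT INTO input_table (userId, version, shredId, rowId, metric1)
VALUES ('u1', 1, 3, now(), 1.5);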

Using default TTL columns but high number of tombstones in Cassandra

I use Cassandra 3.0.12.
And I have a Cassandra column family, i.e. a CQL table, with the following schema:
CREATE TABLE win30 (
    cust_id text,
    tid timeuuid,
    info text,
    PRIMARY KEY (cust_id, tid)
) WITH CLUSTERING ORDER BY (tid DESC)
  AND compaction = {'class': 'DateTieredCompactionStrategy', 'max_sstable_age_days': 31};
alter table win30 with default_time_to_live = '2592000';
I have set the default_time_to_live property for the entire table, but when I query the table,
select * from win30 order by tid desc limit 9999
Cassandra warns that
Read xx live rows and xxxx tombstone for query xxxxxx (see tombstone_warn_threshold).
According to this doc How is data deleted,
Cassandra allows you to set a default_time_to_live property for an
entire table. Columns and rows marked with regular TTLs are processed
as described above; but when a record exceeds the table-level TTL,
Cassandra deletes it immediately, without tombstoning or compaction.
"but when a record exceeds the table-level TTL,Cassandra deletes it immediately, without tombstoning or compaction."
Why Cassandra still WARN for tombstone since I have set a default_time_to_live?
I insert data using some CQL like, without using TTL.
insert into win30 (cust_id, tid, info ) values ('123', now(), 'sometext');
a similar question but it does not use default_time_to_live
And it seems that I could set the unchecked_tombstone_compaction to true?
Another question: I select data with the same ordering as the CLUSTERING ORDER, so why does Cassandra hit so many tombstones?
Why does Cassandra still warn about tombstones when I have set a default_time_to_live?
The way TTL works in Cassandra is that once a record has expired, it is marked as a tombstone (the same process as deleting a record). So instead of manually running a purge job as in the RDBMS world, Cassandra lets you clean up old records based on their TTL. But it still goes through the same process as a DELETE, hence the tombstones. Since your TTL value is '2592000' (30 days), anything older than 30 days in the table gets expired (marked as a tombstone, i.e. deleted).
Now the reason for the warning is that your SELECT statement is looking for records that are alive (non-deleted), and the warning message tells you how many tombstoned (expired/deleted) records were encountered in the process. So while trying to serve 9999 live records, the query hit X tombstones along the way.
Since the TTL is set at the table level, any record inserted into this table will have a default TTL of 30 days.
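You can verify this on a freshly inserted row; a small check along these lines should show the remaining TTL counting down from 2592000:
-- inspect the TTL that the table default applied to the info column
SELECT cust_id, TTL(info) FROM win30 WHERE cust_id = '123' LIMIT 1;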
Here is the documentation reference, in case you want to read more.
After the number of seconds since the column's creation exceeds the TTL value, TTL data is considered expired and is included in results. Expired data is marked with a tombstone on the next read on the read path, but it remains for a maximum of gc_grace_seconds.
Above reference is from this link
And it seems that I could set the unchecked_tombstone_compaction to true?
It's not related to the warning you are getting. You could think about reducing the gc_grace_seconds value (default 10 days) to get rid of tombstones quicker, but there is a reason for this value to be 10 days.
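If you do decide to tune these knobs, the changes are plain ALTER TABLE statements (the values here are examples only; lowering gc_grace_seconds is safe only if repairs run more often than the new window):
-- shorten the tombstone grace period from 10 days to 3 days (example value)
ALTER TABLE win30 WITH gc_grace_seconds = 259200;
-- allow single-SSTable tombstone compactions, as mentioned in the question
ALTER TABLE win30 WITH compaction = {'class': 'DateTieredCompactionStrategy',
                                     'max_sstable_age_days': 31,
                                     'unchecked_tombstone_compaction': 'true'};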
Note that DateTieredCompactionStrategy is deprecated; once you upgrade to Apache Cassandra 3.11 or DSE 5.1.2, there is TimeWindowCompactionStrategy, which does a better job of handling tombstones.
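After such an upgrade, switching the table over is again a single ALTER statement (the window size is an illustrative choice, not a recommendation):
ALTER TABLE win30 WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                                     'compaction_window_unit': 'DAYS',
                                     'compaction_window_size': 1};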

Overwrite row in cassandra with INSERT, will it cause tombstone?

Writing data to Cassandra without causing it to create tombstones is vital in our case, due to the amount of data and the required speed. Currently we only write a row once and have never needed to update it again, only to fetch the data again.
Now there is a case where we actually need to write data and then complete it with more data that becomes available after a while.
It can be done by either:
overwriting all of the data in the row again using INSERT (all data is available), or
performing an UPDATE with only the new data.
What is the best way to do it, bearing in mind that speed matters and that not creating tombstones is important?
Tombstones will only be created when deleting data or using TTL values.
Cassandra aligns very well with your described use case. Incrementally adding data will work for both INSERT and UPDATE statements. Cassandra will store the data in different locations when you add data over time for the same partition key. Periodically running compactions will merge the data for a single key again to optimize access and free disk space. This happens based on the timestamps of the written values but does not create any new tombstones.
You can learn more about how Cassandra stores data e.g. here.
It would be more efficient to do an update to add new or changed data. There is no need to rewrite the old data that isn't changing and it would be inefficient to make Cassandra rewrite it.
When you do an insert or update, Cassandra keeps a timestamp for the modify time for each column. When you do a read, Cassandra collects all the writes for that key from in memory, from on disk, and from other replicas depending on the consistency setting. It will then merge the column data so that the newest value is used for each column.
When data is compacted on disk, if there are separate updates for different columns of a row, those will be combined into a single row in the compacted data.
You don't need to worry about creating tombstones by doing an update unless you are using an update to set a TTL (Time To Live) value. In your application it sounds like you never delete data, so you will never have any tombstones.
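A minimal sketch of the two options (table and columns are hypothetical), where the UPDATE touches only the columns that arrived later:
-- initial write
INSERT INTO orders (order_id, status, total) VALUES (42, 'open', 99.50);
-- later, complete the row: only the new/changed columns are written,
-- and they are merged with the earlier write by timestamp at read/compaction time
UPDATE orders SET status = 'closed', completed_at = toTimestamp(now()) WHERE order_id = 42;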
