I am always inserting data with PRIMARY KEY ((site_name, date), time, id). While site_name and date can be the same, the time field and the id (uuid) are always different, so I only ever add new data. Data is inserted with a TTL (currently 3 days). So, as I don't delete or update, can I disable compaction, considering the TTL is there? Would it affect anything? Also, as no record is deleted, can I set gc_grace_seconds to 0? I want to put as little load on the servers as possible. Much appreciated if anyone can help.
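For reference, here is a minimal sketch of the table and write pattern described above (column types and the payload column are assumptions, not from the original schema):
create table site_data (
    site_name text,
    date text,
    time timestamp,
    id uuid,
    payload text,
    primary key ((site_name, date), time, id)
);
-- every write carries the 3-day TTL (259200 seconds); rows are never updated or deleted
insert into site_data (site_name, date, time, id, payload)
values ('site-1', '2015-02-01', '2015-02-01 10:00:00', uuid(), 'sample')
using ttl 259200;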
TTLs create tombstones. As such, compaction is required. If your data is time series data, you might consider the new date tiered compaction: http://www.datastax.com/dev/blog/datetieredcompactionstrategy .
If you use TTLs and set gc_grace to 0, you're asking for trouble unless your cluster is a single-node one. The grace period is the amount of time to wait before collecting tombstones. If it's 0, it won't wait. This may sound good, but in reality it means the "deletion" might not propagate across the cluster, and the deleted data may reappear (because other nodes may still have it, and the last present value will "win"). This type of data is called zombie data. Zombies are bad. Don't feed the zombies.
You can disable auto compaction: http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsDisableAutoCompaction.html . But I doubt you'll gain much from this; again, look at date tiered compaction.
You can permanently disable autocompaction on tables (column families) individually, like this (CQL), where 'class' is whatever compaction strategy the table already uses, e.g. SizeTieredCompactionStrategy:
alter table <tablename> with compaction = { 'class':'SizeTieredCompactionStrategy', 'enabled':'false' };
The 'enabled':'false' permanently disables autocompaction on that table, but you can still run a manual compaction whenever you like using the 'nodetool compact' command.
You can set gc_grace_seconds to 0, but I wouldn't turn off compaction. That said, if you never delete or update, I think you might be able to turn off compaction.
Edit:
Optimizations in C* from 2.0 onwards for exactly this case:
https://issues.apache.org/jira/browse/CASSANDRA-4917
About TTL, tombstones and GC Grace
http://mail-archives.apache.org/mod_mbox/cassandra-user/201307.mbox/%3CCALY91SNy=cxvcGJh6Cp171GnyXv+EURe4uadso1Kgb4AyFo09g#mail.gmail.com%3E
We are using TWCS for time series data with a default TTL of 30 days and a compaction window size of 1 day.
Unfortunately, there are cases when the incoming data rate gets higher and there is not much disk space left to write it. At the same time, due to budget constraints, adding new nodes to the cluster is not an option. Currently we resort to manually deleting old sstables, but that is error prone.
What is the best way, in the TWCS case, to make Cassandra delete, say, all records that are older than a certain date? I mean not creating tombstones in a new sstable, but actually deleting old records from disk to free up space.
Of course, I can reduce the TTL, but that will affect only new records (so it will help only in the long run, not immediately), and when there is not much incoming data, records will be stored for a shorter period than they could be.
Basically, that's the intent of TTLs: to automatically remove the old data. An explicit deletion always creates a tombstone, and that won't work well with TWCS. So right now the solution would be to stop the node, remove old files to free space, start the node, and repeat on all nodes. But you're doing that already.
I am facing a problem with Cassandra compaction on a table that stores event data. These events are generated by sensors and have an associated TTL. By default each event has a TTL of 1 day. A few events have a different TTL, such as 7/10/30 days, which is a business requirement. A few events can have a TTL of 5 years if the event needs to be kept. More than 98% of rows have a TTL of 1 day.
Although minor compaction is triggered from time to time, disk usage is constantly increasing. This is because of how the SizeTiered compaction strategy works, i.e. it chooses sstables of similar size for compaction. This creates a few huge sstables which aren't compacted for a long time. The presence of a few large sstables increases the average SSTable size, so compaction runs less frequently. It looks like STCS is not the right choice. In a load-test environment, I added data to the tables and switched to the leveled compaction strategy. With LCS, disk space was reclaimed up to a certain point and then disk usage stayed constant. CPU usage was also lower compared to STCS. However, the time window compaction strategy looks more promising as it works well for time-series TTLed data. I am going to test TWCS with my dataset. Meanwhile, I am trying to find answers to a few questions for which I either didn't find an answer or what I found was not clear to me.
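For what it's worth, the LCS switch mentioned above was along these lines (the table name and sstable size value here are illustrative):
alter table event_data with compaction = { 'class':'LeveledCompactionStrategy', 'sstable_size_in_mb':'160' };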
In my use case, an event is added to the table with an associated TTL. Then there are 5 more updates on the same event within the next minute. Updates are not made on a single column; instead the complete row is re-written with a new TTL (which is the same for all columns). This new TTL is likely to be slightly less than the previous TTL. For example, an event is created with a TTL of 86400 seconds. If it is updated after 5 seconds, the new TTL would be 86395. A further update would be with a new TTL slightly less than 86395. After 4-5 updates, no further updates are made to more than 99% of rows. 1% of rows would be re-written with a TTL of 5 years.
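To make the write pattern concrete (table and column names are hypothetical):
-- initial event: full row written with a 1-day TTL
insert into event_data (event_id, status, detail) values ('evt-1', 'created', 'd1') using ttl 86400;
-- 5 seconds later: the whole row is rewritten with a slightly smaller TTL
insert into event_data (event_id, status, detail) values ('evt-1', 'updated', 'd2') using ttl 86395;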
From what I read, TWCS is for data inserted with an immutable TTL. Does this mean I should not use TWCS?
Also, out-of-order writes are not well handled by TWCS. If an event is created at 10 AM on 5th Sep with a 1-day TTL and the same event row is re-written with a TTL of 5 years on 10th or 12th Sep, would that be an out-of-order write? I suppose out-of-order would be when I am setting the timestamp on data while adding it to the DB, or something that would be caused by read repair.
Any guidance/suggestion will be appreciated!
NOTE: I am using Cassandra 2.2.8, so I'll be building the jar for TWCS and then using it.
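Once the jar is in place, the switch would look roughly like this; on 2.2.8 with an external jar the 'class' value has to be the jar's fully qualified strategy class name rather than the short built-in name used below, and the 1-day window is only an example matching the dominant 1-day TTL:
alter table event_data with compaction = { 'class':'TimeWindowCompactionStrategy', 'compaction_window_unit':'DAYS', 'compaction_window_size':'1' };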
TWCS is a great option under certain circumstances. Here are the things to keep in mind:
1) One of the big benefits of TWCS is that merging/reconciliation among sstables does not occur. The oldest one is simply "lopped" off. Because of that, you don't want to have rows/cells span multiple "buckets/windows".
For example, if you insert a single column during one window and then in the next window you insert a different column (i.e. an update of the same row but a different column at a later period of time), instead of compaction creating a single row with both columns, TWCS would lop one of the columns off (the oldest). Actually, I am not sure if TWCS even allows this to occur, but I was giving you an example of what would happen if it did. In this example, I believe TWCS will disallow the removal of either sstable until both windows expire. Not 100% sure though. Either way, avoid this scenario (illustrated below).
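As a purely hypothetical illustration of that scenario (schema invented for the example), the two statements below would put cells of the same row into sstables belonging to different windows:
-- written during window 1
insert into readings (id, col_a) values (1, 'a');
-- written hours later, during window 2: same row, different column
update readings set col_b = 'b' where id = 1;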
2) TWCS has similar problems when out-of-order writes occur (overlap). There is a great article by The Last Pickle that explains this:
https://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
Overlap can occur by repair or from an old compaction (i.e. if you were using STCS and then switched to TWCS, some of the sstables may overlap).
If there is overlap, say, between 2 sstables, you have to wait for both sstables to completely expire before TWCS can remove either of them, and when it does, both will be removed.
If you avoid both scenarios described above, TWCS is very efficient due to the nature of how it cleans things up - no more merging sstables. Simply remove the oldest window.
When you do set up TWCS, you have to remember that the oldest window gets removed after the TTLs expire and GC Grace passes as well - don't forget to add that part. Having a varying TTL number among rows, as you have described, may delay windows from getting removed. If you want to see what is either blocking TWCS from removing a sstable or what the sstables look like, you can use sstableexpiredblockers or the script in the above mentioned URL (which is essentially sstablemetadata with some fancy scripting).
Hopefully that helps.
-Jim
We are using Cassandra 3.10 with a 6-node cluster.
Lately, we noticed that our data volume has increased drastically, by approximately 4 GB per day on each node.
We want to implement a more aggressive retention policy in which we change the compaction to TWCS with a 1-hour window size and set a TTL of a few days; this can be achieved via the table properties (roughly as sketched below).
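The table-properties change being described would look roughly like this (the table name and the exact TTL value are placeholders):
alter table raw_events with compaction = { 'class':'TimeWindowCompactionStrategy', 'compaction_window_unit':'HOURS', 'compaction_window_size':'1' } and default_time_to_live = 259200;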
Since the ETL should be a slow process in order to lighten the Cassandra workload, it is possible that it will not finish extracting all the data before the TTL expires, so I wanted to know: is there a way for the ETL process to set TTL=0 on an entire SSTable once it is done extracting it?
TTL=0 is read as a tombstone. When it is next compacted, it would either be written out as a tombstone or purged, depending on your gc_grace. Other than the overhead of writing the tombstones, it might be easier just to do a delete, or to create sstables that contain the necessary tombstones, than to rewrite all the existing sstables. Whether it is more efficient to use range or point tombstones will depend on your version and schema.
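For illustration, assuming a hypothetical time-series schema with primary key ((source, day), ts), a point tombstone covers a single row while a range tombstone covers a whole slice of the partition:
-- point delete: one row
delete from raw_events where source = 's1' and day = '2018-06-01' and ts = '2018-06-01 10:00:00';
-- range delete: one tombstone covering everything in the partition before noon
delete from raw_events where source = 's1' and day = '2018-06-01' and ts < '2018-06-01 12:00:00';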
An option that might be easiest is to use a different compaction strategy altogether, or a custom one like https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy. You can then just purge data that has already been processed during compaction. This still depends quite a bit on your schema and on how hard it would be to mark what has been processed or not.
You should set TTL 0 at both the table and query level as well. Once the TTL expires, data will be converted to tombstones. Based on the gc_grace_seconds value, the next compaction will clear all the tombstones. You may also run a major compaction to clear tombstones, but that is not generally recommended in Cassandra; it depends on the compaction strategy, and with STCS at least 50% free disk is required to run a healthy compaction.
I am using the awesome Cassandra DB (3.7.0) and I have a question about tombstones.
I have a table called raw_data. This table has a default TTL of 1 hour. This table gets new data every second. Then another processor reads one row and removes that row.
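For reference, the table is roughly of this shape (the payload column is made up):
create table raw_data (
    id uuid primary key,
    payload text
) with default_time_to_live = 3600;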
It seems like this raw_data table becomes slow at reading and writing after several days of running.
Is this because the deleted rows are staying around as tombstones? This table already has a TTL of 1 hour. Should I set gc_grace_seconds to something less than 10 days (the default value) to remove tombstones more quickly? (By the way, I am running a single-node DB.)
Thank you in advance.
Deleting your data is one way to end up with tombstone problems. TTLs are the other.
It is pretty normal for a Cassandra cluster to become slower and slower after each delete, and your cluster will eventually refuse to read data from this table.
Setting gc_grace_seconds to less than the default 10 days is only one part of the equation. The other part is the compaction strategy you use: in order to actually remove tombstones, a compaction is needed.
I'd change my mind about my single-node cluster and I'd go with the minimum standard 3 nodes with RF=3. Then I'd design my project around something that doesn't explicitly delete data. If you absolutely need to delete data, make sure that C* runs compaction periodically and removes tombstones (or force C* to run compactions), and make sure to have plenty of IOPS, because compaction is very IO intensive.
In short, tombstones are used by Cassandra to mark data as deleted, and to replicate that deletion to the other nodes so the deleted data doesn't reappear. These tombstones are kept in Cassandra until gc_grace_seconds has passed. Creating more tombstones might slow down your table. As you are using a single-node Cassandra, you don't have to replicate anything to other nodes, so you can lower your gc_grace_seconds to 1 day without any ill effect. In the future, if you plan to add new nodes or data centers, change this gc_grace_seconds back.
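A minimal example of that change (assuming the raw_data table above):
alter table raw_data with gc_grace_seconds = 86400;  -- 1 day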
I've got an append-only Cassandra table where every entry has a TTL. Using SizeTieredCompactionStrategy, the table seems to grow unbounded. Is there a way to ensure that sstables are checked for tombstoned columns more often?
Tombstones will be effectively deleted after the grace period (search for gc_grace_seconds) expires and a compaction occurs. You should check the table's gc_grace_seconds setting (the default is 864000 seconds, i.e. 10 days) and change it to something suitable for your needs. Beware that lowering this value has its own drawbacks, e.g. rows "resurrecting" if a node is down for more than the grace period.
Check the datastax documentation about deletes in Cassandra.
If you have high volumes of input data and do not modify it, you may be better off using DateTieredCompactionStrategy, as it will be more efficient at processing and removing tombstones.
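A sketch of such a switch (the table name and option values are illustrative; base_time_seconds and max_sstable_age_days are standard DTCS options):
alter table append_only_data with compaction = { 'class':'DateTieredCompactionStrategy', 'base_time_seconds':'3600', 'max_sstable_age_days':'10' };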