Problem with high Maximum tombstones per slice? - cassandra

I'm seeing the stats below for one of my tables when running nodetool cfstats:
Maximum tombstones per slice (last five minutes): 23571
Per the DataStax docs:
Maximum number of tombstones scanned by single key queries during the
last five minutes
All my other tables have low numbers like 1 or 2. Should I be worried? Should I try to lower the tombstone creation?

Tombstones can impact read performance if they reside in frequently read tables, so you should revisit the data modelling. You can also lower the value of gc_grace_seconds so that tombstones are cleared sooner, instead of waiting for the default of 10 days.
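For example, a minimal sketch of lowering gc_grace_seconds with the Python driver; the contact point, keyspace and table names are placeholders:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # hypothetical contact point
session = cluster.connect('my_keyspace')  # hypothetical keyspace

# Lower gc_grace_seconds from the default 864000 (10 days) to, say, 86400 (1 day).
session.execute("ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 86400")

Keep in mind that gc_grace_seconds should stay longer than your repair interval, otherwise deleted data can be resurrected.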

Related

How to decide the number of executors for 1 billion rows in Spark

We have a table with 1.355 billion rows.
The table has 20 columns.
We want to join this table with another table that has more or less the same number of rows.
How do we decide the value for spark.conf.set("spark.sql.shuffle.partitions", ?)?
How do we decide the number of executors and their resource allocation?
How do we estimate how much memory those 1.355 billion rows will take?
As #samkart says, you have to experiment to figure out the best parameters, since they depend on the size and nature of your data. The Spark tuning guide would be helpful.
Here are some things that you may want to tweak (a configuration sketch follows the list):
spark.executor.cores is 1 by default (on YARN), but you should look to increase it to improve parallelism. A common rule of thumb is 5 cores per executor.
spark.files.maxPartitionBytes determines the amount of data per partition while reading, and hence the initial number of partitions. You can tweak this depending on the data size; the default is 128 MB, matching the typical HDFS block size.
spark.sql.shuffle.partitions is 200 by default, but tweak it depending on the data size and the number of cores. This blog would be helpful.
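A minimal PySpark sketch of where these settings go; the numbers are placeholders you would tune by experiment, not recommendations:

from pyspark.sql import SparkSession

# Placeholder values -- tune for your data size and cluster.
spark = (
    SparkSession.builder
    .appName("billion-row-join")
    .config("spark.executor.instances", "20")       # number of executors
    .config("spark.executor.cores", "5")            # cores per executor
    .config("spark.executor.memory", "16g")         # heap per executor
    .config("spark.sql.shuffle.partitions", "2000") # partitions for the join shuffle
    .getOrCreate()
)

# spark.sql.shuffle.partitions can also be changed at runtime, as in the question:
spark.conf.set("spark.sql.shuffle.partitions", "2000")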

TTL Remover on Cassandra Data

I have a scenario where the application has inserted data into a Cassandra table with a TTL of 5 days. I also have gc_grace_seconds set to 5 days, so that tombstones get evicted as soon as compaction kicks in.
Now I have a scenario where, for one table, I need to keep data for 60 days. I have changed the application writes to use a TTL of 60 days for new data, but I'm looking for a solution to change the TTL of the existing data (from 5 days to 60 days).
I have tried Instaclustr/TTLRemover, but for some reason the code didn't work for us.
We are using Apache Cassandra 3.11.3.
Just to provide clarity on the parameters:
default_time_to_live: TTL (Time To Live) in seconds, where zero is disabled. The maximum configurable value is 630720000 (20 years). If the value is greater than zero, TTL is enabled for the entire table and an expiration timestamp is added to each column. A new TTL timestamp is calculated each time the data is updated and the row is removed after all the data expires.
Default value: 0 (disabled).
gc_grace_seconds: Seconds after data is marked with a tombstone (deletion marker) before it is eligible for garbage collection. Default value: 864000 (10 days). The default value allows time for Cassandra to maximize consistency prior to deletion.
Note: Tombstoned records within the grace period are excluded from hints or batched mutations.
In your case you can set the TTL and gc_grace_seconds to 60 days so that new data expires after 60 days. But since your existing data was already written with a TTL of 5 days, it will not pick up the new TTL and will be deleted after 5 days. As far as I know, there is no way to update the TTL of existing data in place.
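You can check what TTL the existing rows still carry with the TTL() function; a small sketch, assuming a hypothetical table my_keyspace.my_table with a regular column payload:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # hypothetical contact point
session = cluster.connect('my_keyspace')

# TTL() returns the remaining seconds before the column value expires (None if no TTL is set).
rows = session.execute("SELECT id, TTL(payload) AS ttl_left FROM my_keyspace.my_table LIMIT 10")
for row in rows:
    print(row.id, row.ttl_left)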
We can set TTL in two ways: 1) per query, 2) on the table.
Just update the TTL on that table, or re-insert the data with the new TTL value.
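A minimal sketch of both options with the Python driver; keyspace, table and column names are hypothetical, and 60 days is 5184000 seconds:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # hypothetical contact point
session = cluster.connect('my_keyspace')

# Option 2: set a default TTL on the table (applies only to data written afterwards).
session.execute("ALTER TABLE my_keyspace.my_table WITH default_time_to_live = 5184000")

# Option 1: re-insert existing rows with an explicit TTL, which resets their expiry.
some_id, some_payload = 42, 'example'
session.execute(
    "INSERT INTO my_keyspace.my_table (id, payload) VALUES (%s, %s) USING TTL 5184000",
    (some_id, some_payload),
)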

Will cassandra full compaction remove expired data in files generated by previous compaction?

I have a table with TTL=10 days in Cassandra, and I usually run a full compaction every Monday and Thursday.
I noticed that after Thursday's compaction, Cassandra did not touch/compact the files generated on Monday.
Why is that? Is it possible that the file generated on Monday is too big? How can I fix it? BTW, I use SizeTieredCompactionStrategy.
When you say you do a "full compaction" what exactly are you doing to trigger this?
In general, SizeTieredCompactionStrategy will only compact a set number of similarly sized SSTables. This means that if your SSTable from Monday (SSTable 1) is, say, X MB and min_threshold on the table is set to 4, then it would require 4 SSTables of ~X MB in that size tier before SSTable 1 is compacted again. So if you generate a new compacted SSTable of ~X MB every 3 days, it would take 9 days before the original one was compacted again.
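To make that arithmetic concrete, a tiny sketch with placeholder numbers; min_threshold and the rate at which similarly sized SSTables appear are assumptions:

# Placeholder numbers illustrating the SizeTiered reasoning above.
min_threshold = 4             # similarly sized SSTables needed before the bucket is compacted
days_per_similar_sstable = 3  # assumed rate at which a new ~X MB SSTable appears

# Monday's SSTable already counts as one, so (min_threshold - 1) more are needed.
days_until_recompacted = (min_threshold - 1) * days_per_similar_sstable
print(days_until_recompacted)  # -> 9 days before Monday's SSTable is compacted again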

Does Cassandra's limit of 2 billion cells per partition include the replicated rows (cells) from other nodes?

Cassandra allows up to 2 billion cells per partition. If I have a 2-node cluster with a replication factor of 2, does that mean the 2 billion cells take into account the rows redundantly saved from the other node?
No, the replication factor does not affect this limit. The limitation is not 2 billion/RF.
HTH, Cheers,
Carlo

Max. size of wide rows?

Theoretically, Cassandra allows up to 2 billion columns in a wide row.
I have heard that in reality up to 50,000 columns/50 MB per row is fine; 50,000-100,000 columns/100 MB is OK but requires some tuning; and that one should never go above 100,000 columns/100 MB per row, the reason being that this puts pressure on the heap.
Is there some truth to this?
In Cassandra, the maximum number of cells (rows x columns) in a single partition is 2 billion.
Additionally, a single column value may not be larger than 2GB, but in practice, "single digits of MB" is a more reasonable limit, since there is no streaming or random access of blob values.
Partitions greater than 100 MB can cause significant pressure on the heap.
One of our tables on Cassandra 1.2 went past the 100 MB per row limit due to new write patterns, and we saw significant pressure on both compactions and our caches. Btw, we had rows of several hundred MB.
One approach is to redesign and migrate the table to one or more better-designed tables that keep your wide rows under that limit. If that is not an option, I suggest tuning your Cassandra so that both the compaction and cache configs can deal with your wide rows effectively.
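As an illustration of the redesign route, a hedged sketch of bucketing a wide row across several partitions; the schema, names and bucket count are hypothetical, not your actual model:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # hypothetical contact point
session = cluster.connect('my_keyspace')

# Adding a bucket to the partition key spreads one logical wide row over several partitions.
session.execute("""
    CREATE TABLE IF NOT EXISTS my_keyspace.events_bucketed (
        entity_id text,
        bucket    int,
        event_id  timeuuid,
        payload   text,
        PRIMARY KEY ((entity_id, bucket), event_id)
    )
""")

Writes choose the bucket deterministically (for example by a stable hash of a clustering value modulo the bucket count, or by time window), and reads either target a known bucket or fan out across all buckets.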
Some interesting links to things to tune:
Cassandra Performance Tuning
in_memory_compaction_limit_in_mb
