How to delete Cassandra snapshots which are older than 1 month

All Cassandra snapshots can be deleted with:
nodetool -h localhost -p 7199 clearsnapshot
and there is another form of the command to delete one specific snapshot:
nodetool clearsnapshot -t snapshot_name
However, I would like to delete all snapshots that were created more than 1 month ago. Is there any way to do this?

If you have not specified a name for the snapshot, Cassandra assigns the creation timestamp as the name by default, i.e. the number of milliseconds since the Unix epoch. One way to achieve this is therefore to calculate the epoch time 30 days ago (one month); any snapshot whose name is a number lower than that is older than the threshold you specified.
Another way to do this is to customize the name of the snapshot (-t) with the date in a human-readable format, which will make it easier to choose the snapshots to delete.
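A minimal shell sketch of the first approach (assuming GNU date and default, timestamp-named snapshots; the awk filter over the listsnapshots output may need adjusting to your nodetool version):

#!/usr/bin/env bash
# Cutoff: 30 days ago, converted to milliseconds to match default snapshot names.
cutoff_ms=$(( $(date -d '30 days ago' +%s) * 1000 ))

# Keep only purely numeric snapshot names from "nodetool listsnapshots".
for snap in $(nodetool listsnapshots | awk '$1 ~ /^[0-9]+$/ {print $1}' | sort -u); do
  if [ "$snap" -lt "$cutoff_ms" ]; then
    echo "Deleting snapshot $snap"
    nodetool clearsnapshot -t "$snap"
  fi
done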

Related

Table in Cassandra

How would you tell if a table in Cassandra is active or not?
I have done some research and from what I see, it looks like you can only tell if a table is present or not in Cassandra.
But is there a way to tell if the table is active in Cassandra?
Alex is spot-on. Most enterprises use something like Prometheus or Grafana to visualize JMX metrics, and track reads or writes per second. That's an easy way to see if the table is active.
Otherwise, you can always run nodetool tablestats keyspace.table. In the last 10 lines or so of the output, look for:
Average live cells per slice (last five minutes):
Maximum live cells per slice (last five minutes):
If you have any current read activity, these will be something other than zero or NaN (not a number).
For example, you can query different table-level metrics via JMX to see whether a specific table is used or not. You will need to detect changes between observations, maybe based on the LiveDiskSpaceUsed metric, or something like that.
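As a rough illustration of that change-between-observations idea, here is a shell sketch that uses nodetool tablestats counters instead of raw JMX (my_keyspace.my_table is a placeholder):

# Take two observations of the table's local read/write counts, 5 minutes apart.
before=$(nodetool tablestats my_keyspace.my_table | grep -E 'Local (read|write) count')
sleep 300
after=$(nodetool tablestats my_keyspace.my_table | grep -E 'Local (read|write) count')

if [ "$before" = "$after" ]; then
  echo "No reads or writes observed in the last 5 minutes"
else
  echo "Table is active:"
  diff <(echo "$before") <(echo "$after")
fi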

Too many Tombstones in Cassandra

I have a table named 'holder' which has a single partition that receives around 60K entries every hour.
I have another table named 'holderhistory' which has the 'date' as its partition key, so every day's records from the 'holder' table are copied to 'holderhistory'.
There will be a job running in the application
i) which collects all the older entries in the holder table and copies them to the holderhistory table
ii) then deletes the older entries from the holder table
NOW the issue is - there will be too many tombstones created in the holder table.
By default the tombstones are cleared only after gc_grace_seconds, which is 10 days (864000 seconds).
But I don't want to keep the tombstones for more than 3 hours, so:
1) Is it good to set gc_grace_seconds to 3 hours?
2) Or is it better to set default_time_to_live to 3 hours?
Which is the best solution for getting rid of the tombstones?
Also, what are the consequences of reducing gc_grace_seconds from 10 days to 3 hours? Where will we see the impact?
Any help is appreciated.
If you set gc_grace_seconds too low and the recovery time of a node is longer than gc_grace_seconds, then once that node comes back online it will mistakenly think that all of the nodes that received the delete actually missed a write, and it will start repairing all of the other nodes, resurrecting the deleted data. I would recommend using default_time_to_live and giving it a try.
To answer your particular case: as the table 'holder' contains only one partition, you can delete the whole partition with a single "delete by partition key" statement, effectively creating a single tombstone.
If you delete the partition once a day, you'll end up with 1 tombstone per day... that's quite acceptable.
1) With gc_grace_seconds equal to 3 hours, and if RF > 1, you are not guaranteed to recover consistently from a node failure longer than 3 hours.
2) With default_time_to_live equal to 3 hours, each record will be deleted (by creating a tombstone) 3 hours after insertion.
So you could keep the default gc_grace_seconds of 10 days, and take care to delete your daily records with something like DELETE FROM table WHERE PartitionKey = X, as in the sketch below.
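For illustration, a cqlsh sketch of both options (the keyspace, partition key column and key value are placeholders; 10800 seconds = 3 hours):

# Option from the answer above: drop a whole day's data with a single
# partition-level tombstone.
cqlsh -e "DELETE FROM my_keyspace.holder WHERE partitionkey = 'X';"

# Alternative from 2): let rows expire on their own via a table-level TTL.
cqlsh -e "ALTER TABLE my_keyspace.holder WITH default_time_to_live = 10800;"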
EDIT: answering your comment about hinted handoff...
Let's say RF = 3, gc_grace_seconds = 3h and a node goes down. The 2 other replicas continue to receive mutations (insert, update, delete), but they can't replicate them to the offline node. In that case, hints are stored on disk temporarily, to be sent later if the dead node comes back.
But a hint expires after gc_grace_seconds, after which it will never be sent.
Now if you delete a row, it will generate a tombstone in the SSTables of the 2 replicas and a hint on the coordinator node. After 3 hours, the tombstones are removed from the online nodes by compaction, and the hint expires.
Later, when your dead node comes back, it still has the row, and it can't know that this row has been deleted because no hint and no tombstone exist on the replicas any more... thus it's a zombie row.
You might also find this support blog article useful:
https://academy.datastax.com/support-blog/cleaning-tombstones-datastax-dse-and-apache-cassandra

Repartition to avoid large number of small files

Currently I have an ETL job that reads a few tables, performs certain transformations and writes them back to the daily table.
I use the following query in Spark SQL:
"INSERT INTO dbname.tablename PARTITION(year_month)
SELECT * from Spark_temp_table "
The target table into which all these records are being inserted is partitioned at a year X month level. The records generated on a daily basis are not that many, hence I am partitioning at the year X month level.
However, when I check the partition, it has a small ~50MB file for each day my code runs (the code has to run daily), and eventually I will end up having around 30 files in my partition, totalling ~1500MB.
I want to know if there is a way for me to create just one file (or maybe 2-3 files, as per block size restrictions) in one partition while I append my records on a daily basis.
The way I think I can do it is to read everything from the concerned partition into my Spark dataframe, append the latest records and repartition it before writing back. How do I ensure I only read data from the concerned partition, and that only that partition is overwritten with a smaller number of files?
You can use the DISTRIBUTE BY clause to control how the records will be distributed into files inside each partition.
To have a single file per partition, you can use DISTRIBUTE BY year, month
and to have 3 files per partition, you can use DISTRIBUTE BY year, month, day % 3
The full query:
INSERT INTO dbname.tablename
PARTITION(year_month)
SELECT * from Spark_temp_table
DISTRIBUTE BY year, month, day % 3
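For the "only that partition is overwritten" part of the question, one possible variant (not part of the answer above) is Spark's dynamic partition overwrite. A sketch via the spark-sql CLI, assuming a persisted staging table stands in for Spark_temp_table and that year, month and day columns exist in it:

spark-sql -e "
  SET spark.sql.sources.partitionOverwriteMode=dynamic;  -- rewrite only touched partitions
  SET hive.exec.dynamic.partition.mode=nonstrict;        -- allow a fully dynamic partition spec
  INSERT OVERWRITE TABLE dbname.tablename PARTITION(year_month)
  SELECT * FROM dbname.staging_table
  DISTRIBUTE BY year, month, day % 3;"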

Timestamp and deletion of updated records

I think the insertion/update process occurs like this (correct me if I'm wrong).
Cassandra doesn't delete or update a row in place when you insert a new one matching the same primary key.
Instead, it inserts a new row with a more recent timestamp.
When there is a request for this row, the one with the more recent timestamp wins.
During the compaction process, old rows are evicted; only the one with the latest timestamp stays in the new SSTable.
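A quick cqlsh sketch of that last-write-wins behaviour (keyspace and table names are placeholders, and the keyspace is assumed to already exist):

cqlsh -e "
  CREATE TABLE IF NOT EXISTS my_ks.events (id int PRIMARY KEY, value text);
  INSERT INTO my_ks.events (id, value) VALUES (1, 'first');
  INSERT INTO my_ks.events (id, value) VALUES (1, 'second');
  SELECT id, value, WRITETIME(value) FROM my_ks.events WHERE id = 1;"
# Only one row comes back, holding 'second' and the later write timestamp;
# the older version stays on disk only until compaction merges the SSTables.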
So knowing that, is it preferable to avoid updates when we can?
In my dataset, I have data sorted by date; I update it several times hourly, but once a day is done, the data for that day won't change anymore.
Real-time behaviour is not really important for this app, so wouldn't it be better to use an alternate storage/cache (or just aggregate the data until I have a complete day)?
I think that would greatly reduce the impact of compaction (disk usage and available space).

Cron job to move many rows between MySQL tables

I have a main database that stores up to 5,000 new rows per day.
I want to have a second database that only contains the latest 30 days worth of data at any time.
Therefore I plan to set up a cron job that regularly dumps the rows older than 30 days and copies the new ones.
What would be the best design for the copying part?
Copying on the fly with MySQL alone
A MySQL export to a txt file, then a MySQL import, then deleting the temporary file
A PHP script that iterates through the rows and copies them one by one
I want robustness and a minimal amount of CPU/memory usage.
The quickest and most robust way is to perform the transfer directly in MySQL. Here are the steps involved:
First, create the second table:
CREATE TABLE IF NOT EXISTS second.last30days LIKE main_table;
Next, insert the records that are 30 days old or newer:
INSERT INTO second.last30days
SELECT * FROM main_table
WHERE created >= CURDATE() - INTERVAL 30 DAY
ORDER BY created;
Lastly, delete the records older than 30 days:
DELETE FROM second.last30days
WHERE created < CURDATE() - INTERVAL 30 DAY;
It would be advisable not to run the INSERT and DELETE statements at the same time.
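As a sketch, the cron job itself could simply wrap those two statements with the mysql client (credentials are assumed to come from ~/.my.cnf, and main_db is a placeholder for the main database's name; INSERT IGNORE relies on last30days inheriting main_table's primary key, so re-runs skip already-copied rows):

#!/usr/bin/env bash
# Copy the last 30 days of rows into the second database.
mysql -e "INSERT IGNORE INTO second.last30days
          SELECT * FROM main_db.main_table
          WHERE created >= CURDATE() - INTERVAL 30 DAY;"

# Then, separately, trim anything older than 30 days from the copy.
mysql -e "DELETE FROM second.last30days
          WHERE created < CURDATE() - INTERVAL 30 DAY;"

# Example crontab entry, running the script daily at 01:00:
# 0 1 * * * /usr/local/bin/rotate_last30days.sh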
If the databases are both hosted on the same server, just use an INSERT ... SELECT statement.
That way you minimize everything: 1 query, and done.
See the MySQL 5.0 Reference Manual, 13.2.5.1 INSERT ... SELECT Syntax.
