Restore Cassandra cluster data after accidentally dropping a table - cassandra

As you know, a Cassandra cluster has replication to prevent data loss even if some nodes in the cluster go down. But in the case where an admin accidentally drops a table with a large amount of data, and that command has already been executed by all the replicas in the cluster, does this mean the table is lost and cannot be restored? Is there any suggestion for coping with this kind of disaster with short server downtime?

From the Cassandra docs:
auto_snapshot
(Default: true) Enable or disable whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised. If you set this to false, you will lose data on truncation or drop.
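If you want to confirm this on your own nodes, the setting lives in cassandra.yaml. A minimal check, assuming a typical package-install path (adjust the path for your installation):
# Verify that auto_snapshot has not been disabled (path varies by install)
grep -n '^auto_snapshot' /etc/cassandra/cassandra.yaml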

If the administrator has deleted the data and the delete has already been replicated to all the nodes, it is difficult to recover the data without a consistent backup.
Maybe, considering that deletes in Cassandra are not executed instantly, you can recover the data. When you delete data, Cassandra replaces it with a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
See http://wiki.apache.org/cassandra/DistributedDeletes
Columns marked with a tombstone exist for a configured time period (defined by the gc_grace_seconds value set on the column family), and then are permanently deleted by the compaction process after that time has expired. The default value is 10 days.
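To see how long that window is for your own table, you can check its gc_grace_seconds. A minimal sketch, using a hypothetical keyspace and table name:
# Show the table definition and pick out gc_grace_seconds
# (my_keyspace.my_table is a placeholder)
cqlsh -e "DESCRIBE TABLE my_keyspace.my_table;" | grep gc_grace_seconds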
Following the explanation in About Deletes, maybe if you shut down some of the nodes, wait until compaction succeeds and the data is completely deleted from the SSTables, and then turn the nodes on again, the data could appear again. But this will only happen if you don't run periodic repair operations on the nodes.
I have never tried this before; it is only an idea that came to me while reading the Cassandra documentation.

Step-1: I created a table using the command below
CREATE TABLE Cricket (
PlayerID uuid,
LastName varchar,
FirstName varchar,
City varchar,
State varchar,
PRIMARY KEY (PlayerID));
Step-2: Insert 3 records using the commands below
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Pendulkar', 'Sachin', 'Mumbai','Maharastra');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Vholi', 'Virat', 'Delhi','New Delhi');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Sharma', 'Rohit', 'Berhampur','Odisha');
Step-3: I accidentally dropped the Cricket table
drop table Cricket;
Step-4: Recover that table using the auto snapshot backup. Note: auto_snapshot (Default: true) enables or disables whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised.
Step-5: Find the snapshot locations and files
cassandra#node1:~/data/students_details$ cd cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 0
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
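If you are not sure which table directory holds the snapshot of the dropped table, a quick way to locate it is a find over the data directory (a sketch, using the data path from this example):
# Dropped-table snapshots are created under <table-dir>/snapshots/dropped-<timestamp>-<table>
find /home/cassandra/data/students_details -type d -name 'dropped-*'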
Step-6: You will find one .cql file in that snapshot location containing the table's DDL.
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ ls -lrth
total 44K
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:06 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:06 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:06 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:06 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:06 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:06 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:06 schema.cql
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:06 manifest.json
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ more schema.cql
CREATE TABLE IF NOT EXISTS students_details.cricket (
playerid uuid PRIMARY KEY,
city text,
firstname text,
lastname text,
state text)
WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
AND bloom_filter_fp_chance = 0.01
AND dclocal_read_repair_chance = 0.1
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND min_index_interval = 128
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE'
AND comment = ''
AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
AND cdc = false
AND extensions = { };
Step-7: Log in to the database and create the table using that DDL.
apiadmin#cqlsh:coopersdev> use students_details;
apiadmin#cqlsh:students_details> CREATE TABLE IF NOT EXISTS students_details.cricket (
... playerid uuid PRIMARY KEY,
... city text,
... firstname text,
... lastname text,
... state text)
... WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
... AND bloom_filter_fp_chance = 0.01
... AND dclocal_read_repair_chance = 0.1
... AND crc_check_chance = 1.0
... AND default_time_to_live = 0
... AND gc_grace_seconds = 864000
... AND min_index_interval = 128
... AND max_index_interval = 2048
... AND memtable_flush_period_in_ms = 0
... AND read_repair_chance = 0.0
... AND speculative_retry = '99PERCENTILE'
... AND comment = ''
... AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
... AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
... AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
... AND cdc = false
... AND extensions = { };
apiadmin#cqlsh:students_details>
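Instead of pasting the DDL interactively, you can also feed the saved schema.cql straight to cqlsh. A sketch, assuming the same host and credentials that are used with sstableloader later in this walkthrough:
# Recreate the dropped table from the snapshot's schema file
cqlsh -u cassandra -p cassandra -f /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket/schema.cql 10.213.61.21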
Step-8: Copy all the files from the snapshot folder to the existing cricket table folder
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ pwd
/home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cp * /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cd /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 44K
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:11 schema.cql
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:11 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:11 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:11 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:11 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:11 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:11 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:11 manifest.json
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$
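As an alternative to streaming the data back with sstableloader in the next step, on a single node you can also have Cassandra pick up the copied SSTables in place. A sketch, using the keyspace and table from this example:
# Load the newly copied SSTables without streaming
nodetool refresh -- students_details cricket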
Step-9: Restore the table data using sstableloader with the command below
cassandra#node1:~$ sstableloader -d 10.213.61.21 -username cassandra --password cassandra /home/cassandra/data/students_details/cricket-d3576f60960f11ea947b39646348bb4f/snapshots
Established connection to initial hosts
Opening sstables and calculating sections to stream
Summary statistics:
Connections per host : 1
Total files transferred : 0
Total bytes transferred : 0.000KiB
Total duration : 2920 ms
Average transfer rate : 0.000KiB/s
Peak transfer rate : 0.000KiB/s
Step-10: Table restored successfully. Please verify:
playerid | city | firstname | lastname | state
--------------------------------------+-----------+-----------+-----------+------------
d7b12c90-960f-11ea-947b-39646348bb4f | Berhampur | Rohit | Sharma | Odisha
d7594890-960f-11ea-947b-39646348bb4f | Delhi | Virat | Vholi | New Delhi
d7588540-960f-11ea-947b-39646348bb4f | Mumbai | Sachin | Pendulkar | Maharastra

Related

Hive can't find partitioned data written by Spark Structured Streaming

I have a spark structured streaming job, writing data to IBM Cloud Object Storage (S3):
dataDf.
writeStream.
format("parquet").
trigger(Trigger.ProcessingTime(trigger_time_ms)).
option("checkpointLocation", s"${s3Url}/checkpoint").
option("path", s"${s3Url}/data").
option("spark.sql.hive.convertMetastoreParquet", false).
partitionBy("InvoiceYear", "InvoiceMonth", "InvoiceDay", "InvoiceHour").
start()
I can see the data using the hdfs CLI:
[clsadmin#xxxxx ~]$ hdfs dfs -ls s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0 | head
Found 616 items
-rw-rw-rw- 1 clsadmin clsadmin 38085 2018-09-25 01:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 45874 2018-09-25 00:31 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-28ff873e-8a9c-4128-9188-c7b763c5b4ae.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 5124 2018-09-25 01:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-5f768960-4b29-4bce-8f31-2ca9f0d42cb5.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 40154 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-70abc027-1f88-4259-a223-21c4153e2a85.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41282 2018-09-25 00:50 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-873a1caa-3ecc-424a-8b7c-0b2dc1885de4.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41241 2018-09-25 00:40 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-88b617bf-e35c-4f24-acec-274497b1fd31.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 3114 2018-09-25 00:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-deae2a19-1719-4dfa-afb6-33b57f2d73bb.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 38877 2018-09-25 00:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-e07429a2-43dc-4e5b-8fe7-c55ec68783b3.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 39060 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00001-1553da20-14d0-4c06-ae87-45d22914edba.c000.snappy.parquet
However, when I try to query the data:
hive> select * from invoiceitems limit 5;
OK
Time taken: 2.392 seconds
My table DDL looks like this:
CREATE EXTERNAL TABLE `invoiceitems`(
`invoiceno` int,
`stockcode` int,
`description` string,
`quantity` int,
`invoicedate` bigint,
`unitprice` double,
`customerid` int,
`country` string,
`lineno` int,
`invoicetime` string,
`storeid` int,
`transactionid` string,
`invoicedatestring` string)
PARTITIONED BY (
`invoiceyear` int,
`invoicemonth` int,
`invoiceday` int,
`invoicehour` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3a://streaming-data-landing-zone-partitioned/data'
I've also tried with the correct case for column/partition names - this doesn't work either.
Any ideas why my query isn't finding the data?
UPDATE 1:
I have tried setting the location to a directory containing the data without partitions and this still doesn't work, so I'm wondering if it is a data formatting issue?
CREATE EXTERNAL TABLE `invoiceitems`(
`InvoiceNo` int,
`StockCode` int,
`Description` string,
`Quantity` int,
`InvoiceDate` bigint,
`UnitPrice` double,
`CustomerID` int,
`Country` string,
`LineNo` int,
`InvoiceTime` string,
`StoreID` int,
`TransactionID` string,
`InvoiceDateString` string)
PARTITIONED BY (
`InvoiceYear` int,
`InvoiceMonth` int,
`InvoiceDay` int,
`InvoiceHour` int)
STORED AS PARQUET
LOCATION
's3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/';
hive> Select * from invoiceitems limit 5;
OK
Time taken: 2.066 seconds
Read from a Snappy-compressed Parquet file
The data is in Snappy-compressed Parquet file format:
s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
So set the 'PARQUET.COMPRESS'='SNAPPY' table property in the CREATE TABLE DDL statement. You can alternatively set parquet.compression=SNAPPY in the "Custom hive-site settings" section in Ambari for either IOP or HDP.
Here is an example of using the table property during a table creation statement in Hive:
hive> CREATE TABLE inv_hive_parquet(
trans_id int, product varchar(50), trans_dt date
)
PARTITIONED BY (
year int)
STORED AS PARQUET
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');
Update partition metadata in the external table
Also, for an external partitioned table, we need to update the partition metadata whenever any external job (the Spark job in this case) writes partitions to the data folder directly, because Hive will not be aware of these partitions unless they are explicitly registered.
That can be done with either:
ALTER TABLE inv_hive_parquet RECOVER PARTITIONS;
//or
MSCK REPAIR TABLE inv_hive_parquet;
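For the table in this question, a minimal sketch of refreshing the metadata from the shell and confirming that Hive now sees the partition directories (table name taken from the question):
# Register partition directories written by the streaming job,
# then list what Hive now knows about
hive -e "MSCK REPAIR TABLE invoiceitems; SHOW PARTITIONS invoiceitems;"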

Cassandra tombstones not deleted a month after actual record TTL

Running into an issue with DSE 4.7.
The tombstones are not being deleted even after compactions, cleanup, rebuild_index, and repair. Records have a 15-day TTL.
sstablemetadata output suggests that there are 90% tombstones
Any ideas?
sstablemetadata output
SSTable: ./abcd-abcd-ka-478675
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.010000
Minimum timestamp: 1527521280829593
Maximum timestamp: 1527596173976435
SSTable max local deletion time: 1528892173
Compression ratio: 0.36967428395684393
Estimated droppable tombstones: 0.9073013816277629
SSTable Level: 0
Repaired at: 0
ReplayPosition(segmentId=1520529283052, position=4626679)
Estimated tombstone drop times:%n
1528817679: 18318196
1528818619: 20753822
1528819513: 24176310
.
.
.
Count Row Size Cell Count
1 0 0
2 0 1752560
3 0 0
4 0 6355421
5 0 0
6 0 687302
7 0 0
8 0 529613
10 0 444801
12 0 410107
14 0 456011
17 0 1347893
20 0 184960
24 0 152814
.
.
.
770 1347893 137
924 184960 109
1109 220403 68
1331 121620 86
1597 2044030 102
1916 185601 195
2299 184816 158273
2759 868754 0
3311 62795 0
3973 1668 0
4768 2143 0
5722 1812541 0
6866 828 0
.
.
.
Ancestors: [476190, 474027, 475201, 478160]
Estimated cardinality: 20059264
Cassandra marks TTL data with a tombstone after the requested amount of time has expired. A tombstone exists for gc_grace_seconds. After data is marked with a tombstone, the data is automatically removed during the normal compaction process.
You can try running a major compaction to evict the tombstones.
Tombstones get deleted during normal compaction, but you may still sometimes find stale tombstone data (even in production). One reason could be that one of the nodes in the cluster was down, so the tombstoned data did not get deleted because of that node. Also, inserting null values sometimes creates tombstone data.
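A hedged sketch of the major-compaction route: force a compaction on the affected table, then re-check the droppable-tombstone estimate on the resulting SSTable (keyspace, table, and SSTable path are placeholders):
# Major compaction on one table, then inspect the resulting SSTable
nodetool compact <keyspace> <table>
sstablemetadata /path/to/data/<keyspace>/<table>/*-Data.db | grep -i 'droppable tombstones'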

Cassandra TWCS Merges SSTables in the Same Bucket

I created the following table on Cassandra 3.11 for storing metrics using the TimeWindowCompactionStrategy:
CREATE TABLE metrics.my_test (
metric_name text,
metric_week text,
metric_time timestamp,
tags map<text, text>,
value double,
PRIMARY KEY ((metric_name, metric_week), metric_time)
) WITH CLUSTERING ORDER BY (metric_time DESC)
AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'MINUTES'}
AND default_time_to_live = 7776000
AND gc_grace_seconds = 60;
Following the blog post on TLP about TWCS, I thought I'd be able to issue a compaction and only SSTables in the same bucket (1-minute window) would be compacted together. However, it seems this is not true, and everything gets compacted together. Before compaction:
# for f in *Data.db; do ls -l $f && java -jar /root/sstable-tools-3.11.0-alpha11.jar describe $f | grep timestamp; done
-rw-r--r-- 1 cassandra cassandra 1431 Mar 22 17:29 mc-10-big-Data.db
Minimum timestamp: 1521739701309280 (03/22/2018 17:28:21)
Maximum timestamp: 1521739777814859 (03/22/2018 17:29:37)
-rw-r--r-- 1 cassandra cassandra 619 Mar 22 17:30 mc-11-big-Data.db
Minimum timestamp: 1521739787241285 (03/22/2018 17:29:47)
Maximum timestamp: 1521739810545148 (03/22/2018 17:30:10)
-rw-r--r-- 1 cassandra cassandra 654 Mar 22 17:20 mc-1-big-Data.db
Minimum timestamp: 1521739189529560 (03/22/2018 17:19:49)
Maximum timestamp: 1521739216248636 (03/22/2018 17:20:16)
-rw-r--r-- 1 cassandra cassandra 1154 Mar 22 17:21 mc-2-big-Data.db
Minimum timestamp: 1521739217033715 (03/22/2018 17:20:17)
Maximum timestamp: 1521739277579629 (03/22/2018 17:21:17)
-rw-r--r-- 1 cassandra cassandra 855 Mar 22 17:22 mc-3-big-Data.db
Minimum timestamp: 1521739283859916 (03/22/2018 17:21:23)
Maximum timestamp: 1521739326037634 (03/22/2018 17:22:06)
-rw-r--r-- 1 cassandra cassandra 1047 Mar 22 17:23 mc-4-big-Data.db
Minimum timestamp: 1521739327868930 (03/22/2018 17:22:07)
Maximum timestamp: 1521739387131847 (03/22/2018 17:23:07)
-rw-r--r-- 1 cassandra cassandra 1288 Mar 22 17:24 mc-5-big-Data.db
Minimum timestamp: 1521739391318240 (03/22/2018 17:23:11)
Maximum timestamp: 1521739459713561 (03/22/2018 17:24:19)
-rw-r--r-- 1 cassandra cassandra 767 Mar 22 17:25 mc-6-big-Data.db
Minimum timestamp: 1521739461284097 (03/22/2018 17:24:21)
Maximum timestamp: 1521739505132186 (03/22/2018 17:25:05)
-rw-r--r-- 1 cassandra cassandra 1216 Mar 22 17:26 mc-7-big-Data.db
Minimum timestamp: 1521739507504019 (03/22/2018 17:25:07)
Maximum timestamp: 1521739583459167 (03/22/2018 17:26:23)
-rw-r--r-- 1 cassandra cassandra 749 Mar 22 17:27 mc-8-big-Data.db
Minimum timestamp: 1521739587644109 (03/22/2018 17:26:27)
Maximum timestamp: 1521739625351120 (03/22/2018 17:27:05)
-rw-r--r-- 1 cassandra cassandra 1259 Mar 22 17:28 mc-9-big-Data.db
Minimum timestamp: 1521739627983733 (03/22/2018 17:27:07)
Maximum timestamp: 1521739698691870 (03/22/2018 17:28:18)
After issuing nodetool compact metrics my_test:
# for f in *Data.db; do ls -l $f && java -jar /root/sstable-tools-3.11.0-alpha11.jar describe $f | grep timestamp; done
-rw-r--r-- 1 cassandra cassandra 8677 Mar 22 17:30 mc-12-big-Data.db
Minimum timestamp: 1521739189529561 (03/22/2018 17:19:49)
Maximum timestamp: 1521739810545148 (03/22/2018 17:30:10)
It's clear to see that SSTables from multiple time windows were merged together, as the only SSTable after the compaction covers 17:19:49 to 17:30:10.
What can I do to prevent this from happening? I have a large-ish (12 nodes, ~550GB/node) table implemented with TWCS, but it has multiple overlapping SSTables. I'd like to compact out any tombstones and merge those overlapping SSTables; however, I'm worried I'll be left with a single 550GB SSTable per node. My concern is that a single SSTable that large will be slow when doing reads... is that a valid concern?
Don't manually issue nodetool compact; that explicitly merges everything together into one SSTable.
TWCS behaves like STCS within the current time window until the window is done, then compacts that window down. A 1-minute window is extremely aggressive and probably not something that will realistically work, since data will be delivered across window boundaries. Flushes can (and likely will) be more than 1 minute apart, so the data won't even be in SSTables by the time the window passes, meaning almost everything is out of window. Some overlapping SSTables are OK, so don't worry too much about it, but you will need a larger window than 1 minute. I'd be careful with anything less than 1 day.
Especially with a 1-week partition key and a 3-month TTL, you would have tens of thousands of SSTables, which isn't maintainable for streaming. Repairs will simply break.
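If you do move to a wider window, here is a sketch of switching the example table to a one-day window (tune the unit and size to your own TTL and write pattern):
# Switch the table from 1-minute to 1-day compaction windows
cqlsh -e "ALTER TABLE metrics.my_test WITH compaction = {
  'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': '1'};"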

What does rows_merged mean in compactionhistory?

When I issue
$ nodetool compactionhistory
I get
. . . compacted_at bytes_in bytes_out rows_merged
. . . 1404936947592 8096 7211 {1:3, 3:1}
What does {1:3, 3:1} mean? The only documentation I can find is this which states
the number of partitions merged
which does not explain why multiple values and what the colon means.
So basically it means {sstables: rows}. For example, {1:3, 3:1} means 3 rows were taken from a single sstable (1:3) and 1 row was merged from 3 sstables (3:1), all to make the one sstable produced by that compaction operation.
I tried it out myself so here's an example, I hope this helps:
create keyspace and table:
cqlsh> create keyspace space1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> create TABLE space1.tb1 ( key text, val1 text, primary KEY (key));
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key1','111');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key2','222');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key3','333');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key4','444');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key5','555');
cqlsh> exit
Now we flush to create the sstable
$ nodetool flush space1
We see that only one SSTable has been created:
$ sudo ls -lR /var/lib/cassandra/data/space1
/var/lib/cassandra/data/space1:
total 4
drwxr-xr-x. 2 cassandra cassandra 4096 Feb 3 12:51 tb1
/var/lib/cassandra/data/space1/tb1:
total 32
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 12:51 space1-tb1-jb-1-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 146 Feb 3 12:51 space1-tb1-jb-1-Data.db
-rw-r--r--. 1 cassandra cassandra 24 Feb 3 12:51 space1-tb1-jb-1-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 12:51 space1-tb1-jb-1-Index.db
-rw-r--r--. 1 cassandra cassandra 4389 Feb 3 12:51 space1-tb1-jb-1-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 12:51 space1-tb1-jb-1-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 12:51 space1-tb1-jb-1-TOC.txt
Checking with sstable2json, we see our data:
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-1-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]
At this point 'nodetool compactionhistory' shows nothing for this table, but let's run compact anyway to see what we get (scroll right):
$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id keyspace_name columnfamily_name compacted_at bytes_in bytes_out rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b space1 tb1 1422968305305 146 146 {1:5}
Now let's delete two rows and flush:
cqlsh> delete from space1.tb1 where key='key1';
cqlsh> delete from space1.tb1 where key='key2';
cqlsh> exit
$ nodetool flush space1
$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
[sudo] password for datastax:
total 64
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 12:58 space1-tb1-jb-2-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 146 Feb 3 12:58 space1-tb1-jb-2-Data.db
-rw-r--r--. 1 cassandra cassandra 336 Feb 3 12:58 space1-tb1-jb-2-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 12:58 space1-tb1-jb-2-Index.db
-rw-r--r--. 1 cassandra cassandra 4393 Feb 3 12:58 space1-tb1-jb-2-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 12:58 space1-tb1-jb-2-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 12:58 space1-tb1-jb-2-TOC.txt
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 13:02 space1-tb1-jb-3-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 49 Feb 3 13:02 space1-tb1-jb-3-Data.db
-rw-r--r--. 1 cassandra cassandra 16 Feb 3 13:02 space1-tb1-jb-3-Filter.db
-rw-r--r--. 1 cassandra cassandra 36 Feb 3 13:02 space1-tb1-jb-3-Index.db
-rw-r--r--. 1 cassandra cassandra 4413 Feb 3 13:02 space1-tb1-jb-3-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 13:02 space1-tb1-jb-3-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 13:02 space1-tb1-jb-3-TOC.txt
Let's check the tables' contents:
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-2-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-3-Data.db
[
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]
Now let's compact:
$ nodetool compact space1
Only one SSTable now, as expected:
$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
total 32
-rw-r--r--. 1 cassandra cassandra 43 Feb 3 13:05 space1-tb1-jb-4-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 133 Feb 3 13:05 space1-tb1-jb-4-Data.db
-rw-r--r--. 1 cassandra cassandra 656 Feb 3 13:05 space1-tb1-jb-4-Filter.db
-rw-r--r--. 1 cassandra cassandra 90 Feb 3 13:05 space1-tb1-jb-4-Index.db
-rw-r--r--. 1 cassandra cassandra 4429 Feb 3 13:05 space1-tb1-jb-4-Statistics.db
-rw-r--r--. 1 cassandra cassandra 80 Feb 3 13:05 space1-tb1-jb-4-Summary.db
-rw-r--r--. 1 cassandra cassandra 79 Feb 3 13:05 space1-tb1-jb-4-TOC.txt
Now let's check the contents of the new SSTable; we can see the tombstones:
$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-4-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]
Finally, let's check the compaction history (scroll right):
$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id keyspace_name columnfamily_name compacted_at bytes_in bytes_out rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b space1 tb1 1422968305305 146 146 {1:5}
46112600-aba5-11e4-9f73-351725b0ac5b space1 tb1 1422968706144 195 133 {1:3, 2:2}

How can I restore Cassandra snapshots?

I'm building a backup and restore process for a Cassandra database so that it's ready when I need it, and so that I understand the details in order to build something that will work for production. I'm following Datastax's instructions here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html.
As a start, I'm seeding the database on a dev box then attempting to make the backup/restore work. Here's the backup script:
#!/bin/bash
cd /opt/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
Yes, tar is not the most efficient way, perhaps, but I'm just trying to get something working right now. I've checked the tar, and all the files are there.
Once the database is backed up, I shut down Cassandra and my app, then rm -rf /var/lib/cassandra/ to simulate a complete loss.
Now to restore the database. Restoration "Method 2" from http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html is more compatible with my schema-creation component than Method 1.
So, Method 2/Step 1, "Recreate the schema": Restart Cassandra, then my app. The app is built to re-recreate the schema on startup when necessary. Once it's up, there's a working Cassandra node with a schema for the app, but no data.
Method 2/Step 2 "Restore the snapshot": They give three alternatives, the first of which is to use sstableloader, documented at http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html. The folder structure that the loader requires is nothing like the folder structure created by the snapshot tool, so everything has to be moved into place. Before going to all that trouble, I'll just try it out on one table:
>./bin/sstableloader makeyourcase/users
Error: Could not find or load main class org.apache.cassandra.tools.BulkLoader
Hmmm, well, that's not going to work. BulkLoader is in ./lib/apache-cassandra-2.0.9.jar, but the loader doesn't seem to be set up to work out of the box. Rather than debug the tool, let's move on to the second alternative, copying the snapshot directory into the makeyourcase/users/snapshots/ directory. This should be easy, since we're throwing the snapshot directory right back where it came from, so tar xzf after_seeding.tgz should do the trick:
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh should restore the data:
cd /opt/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase users
This runs without complaint. Note that you have to run this for each and every table, so you have to generate the list of tables first. But, before we do that, note that there's something interesting in the Cassandra logs:
INFO 14:32:26,319 Loading new SSTables for makeyourcase/users...
INFO 14:32:26,326 No new SSTables were found for makeyourcase/users
So, we put the snapshot back, but Cassandra didn't find it. I also tried moving the snapshot directory under the existing SSTables directory, and copying the old SSTable files into the existing directory, with the same error in the log. Cassandra doesn't log where it expects to find them, just that it can't find them. The docs say to put them into a directory named data/keyspace/table_name-UUID, but there is no such directory. There is one named data/makeyourcase/users/snapshots/1408820504987-users/, but putting the snapshot dir there, or the individual files, didn't work.
The third alternative, the "Node restart method" doesn't look suitable for a multi-node production environment, so I didn't try that.
Edit:
Just to make this perfectly explicit for the next person, here are the preliminary, working backup and restore scripts that apply the accepted answer.
myc_backup.sh:
#!/bin/bash
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
myc_restore.sh:
#!/bin/bash
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
cd ~/bootstrap/apache-cassandra-2.0.9
TABLE_LIST=`./bin/nodetool cfstats makeyourcase | grep "Table: " | sed -e 's+^.*: ++'`
for TABLE in $TABLE_LIST; do
echo "Restore table ${TABLE}"
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
if [ -d "snapshots/after_seeding" ]; then
cp snapshots/after_seeding/* .
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase ${TABLE}
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
rm -rf snapshots/after_seeding
echo " Table ${TABLE} restored."
else
echo " >>> Nothing to restore."
fi
done
Added more details:
You can run the snapshot for your particular keyspace using:
$ nodetool snapshot <mykeyspace> -t <SnapshotDirectoryName>
This will create the snapshot files inside the snapshots directory in data.
When you delete your data, make sure you don't delete the snapshots folder or you will not be able to restore it (unless you are moving it to another location / machine.)
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ ls
mykeyspace-mytable-jb-2-CompressionInfo.db mykeyspace-mytable-jb-2-Statistics.db
mykeyspace-mytable-jb-2-Data.db mykeyspace-mytable-jb-2-Filter.db mykeyspace-mytable-jb-2-Index.db
mykeyspace-mytable-jb-2-Summary.db mykeyspace-mytable-jb-2-TOC.txt snapshots
$ rm *
rm: cannot remove `snapshots': Is a directory
Once you are ready to restore, copy back the snapshot data into the keyspace/table directory (one for each table):
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ sudo cp snapshots/<SnapshotDirectoryName>/* .
You mentioned:
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh should restore the data:
I think the issue is that you are restoring the Snapshot data into the snapshot directory. It should go right in the table directory. Everything else seems right, let me know.
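In other words, a sketch of the fix being described, using the keyspace, table, and snapshot name from the question:
# Copy the snapshot files into the table directory itself, then load them
cd /var/lib/cassandra/data/makeyourcase/users
cp snapshots/after_seeding/* .
cd /opt/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase users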
The docs say to put them into a directory named
data/keyspace/table_name-UUID, but there is no such directory.
You don't have this UUID directory because you are using Cassandra 2.0; the UUID-suffixed directory naming started with Cassandra 2.2.

Resources