Hive can't find partitioned data written by Spark Structured Streaming - apache-spark

I have a spark structured streaming job, writing data to IBM Cloud Object Storage (S3):
dataDf
  .writeStream
  .format("parquet")
  .trigger(Trigger.ProcessingTime(trigger_time_ms))
  .option("checkpointLocation", s"${s3Url}/checkpoint")
  .option("path", s"${s3Url}/data")
  .option("spark.sql.hive.convertMetastoreParquet", false)
  .partitionBy("InvoiceYear", "InvoiceMonth", "InvoiceDay", "InvoiceHour")
  .start()
I can see the data using the hdfs CLI:
[clsadmin@xxxxx ~]$ hdfs dfs -ls s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0 | head
Found 616 items
-rw-rw-rw- 1 clsadmin clsadmin 38085 2018-09-25 01:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 45874 2018-09-25 00:31 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-28ff873e-8a9c-4128-9188-c7b763c5b4ae.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 5124 2018-09-25 01:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-5f768960-4b29-4bce-8f31-2ca9f0d42cb5.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 40154 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-70abc027-1f88-4259-a223-21c4153e2a85.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41282 2018-09-25 00:50 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-873a1caa-3ecc-424a-8b7c-0b2dc1885de4.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 41241 2018-09-25 00:40 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-88b617bf-e35c-4f24-acec-274497b1fd31.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 3114 2018-09-25 00:01 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-deae2a19-1719-4dfa-afb6-33b57f2d73bb.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 38877 2018-09-25 00:10 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-e07429a2-43dc-4e5b-8fe7-c55ec68783b3.c000.snappy.parquet
-rw-rw-rw- 1 clsadmin clsadmin 39060 2018-09-25 00:20 s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00001-1553da20-14d0-4c06-ae87-45d22914edba.c000.snappy.parquet
However, when I try to query the data:
hive> select * from invoiceitems limit 5;
OK
Time taken: 2.392 seconds
My table DDL looks like this:
CREATE EXTERNAL TABLE `invoiceitems`(
`invoiceno` int,
`stockcode` int,
`description` string,
`quantity` int,
`invoicedate` bigint,
`unitprice` double,
`customerid` int,
`country` string,
`lineno` int,
`invoicetime` string,
`storeid` int,
`transactionid` string,
`invoicedatestring` string)
PARTITIONED BY (
`invoiceyear` int,
`invoicemonth` int,
`invoiceday` int,
`invoicehour` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3a://streaming-data-landing-zone-partitioned/data'
I've also tried with the correct case for column/partition names - this doesn't work either.
Any ideas why my query isn't finding the data?
UPDATE 1:
I have tried setting the location to a directory containing the data without partitions and this still doesn't work, so I'm wondering if it is a data formatting issue?
CREATE EXTERNAL TABLE `invoiceitems`(
`InvoiceNo` int,
`StockCode` int,
`Description` string,
`Quantity` int,
`InvoiceDate` bigint,
`UnitPrice` double,
`CustomerID` int,
`Country` string,
`LineNo` int,
`InvoiceTime` string,
`StoreID` int,
`TransactionID` string,
`InvoiceDateString` string)
PARTITIONED BY (
`InvoiceYear` int,
`InvoiceMonth` int,
`InvoiceDay` int,
`InvoiceHour` int)
STORED AS PARQUET
LOCATION
's3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/';
hive> Select * from invoiceitems limit 5;
OK
Time taken: 2.066 seconds

Reading from a Snappy-compressed Parquet file
The data is in Snappy-compressed Parquet format:
s3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0/part-00000-1e1dda99-bec2-447c-9bd7-bedb1944f4a9.c000.snappy.parquet
So set the 'PARQUET.COMPRESS'='SNAPPY' table property in the CREATE TABLE DDL statement. Alternatively, you can set parquet.compression=SNAPPY in the "Custom hive-site settings" section in Ambari, for either IOP or HDP.
Here is an example of using the table property during a table creation statement in Hive:
hive> CREATE TABLE inv_hive_parquet(
trans_id int, product varchar(50), trans_dt date
)
PARTITIONED BY (
year int)
STORED AS PARQUET
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');
Update partition metadata in the external table
Also, for an external partitioned table, we need to update the partition metadata whenever an external job (the Spark job in this case) writes partitions directly to the data folder, because Hive will not be aware of these partitions unless they are explicitly added.
That can be done with either:
ALTER TABLE inv_hive_parquet RECOVER PARTITIONS;
-- or
MSCK REPAIR TABLE inv_hive_parquet;
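Applied to the table from the question, a minimal sketch (table name, partition columns, and path are taken from the DDL and the s3a listing above; adjust if your layout differs):
MSCK REPAIR TABLE invoiceitems;
-- or register a single partition explicitly
ALTER TABLE invoiceitems ADD IF NOT EXISTS
  PARTITION (invoiceyear=2018, invoicemonth=9, invoiceday=25, invoicehour=0)
  LOCATION 's3a://streaming-data-landing-zone-partitioned/data/InvoiceYear=2018/InvoiceMonth=9/InvoiceDay=25/InvoiceHour=0';
After that, SHOW PARTITIONS invoiceitems should list the hours Spark has written, and the original SELECT should return rows.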

Related

How do I reliably get the correct timestamp for 0001-01-01 from Hive 3 and Spark 3?

I have a very basic CSV of New Year dates all the way from 1970-01-01 00:00:00 to 0000-01-01 00:00:00, which I've made available to Hive as the external table test.ny(dt string). The time zone on all machines is Europe/Moscow.
When I create a parquet table in Hive 2:
create table test.ny2 stored as parquet
as
select
dt,
unix_timestamp(dt||' 00:00:00') dt2,
cast(dt as timestamp) dt3
from test.ny --this is my csv
I am able to access it via spark-sql if I set spark.sql.legacy.parquet.int96RebaseModeInRead=LEGACY. All dt3 values are read correctly as YYYY-01-01 00:00:00
However, when I access the same table via Hive 3, I get a discrepancy at
dt                   dt2           dt3
1901-01-01 00:00:00  -2177461817   1901-01-01 00:00:00.000
1900-01-01 00:00:00  -2208999600   1899-12-31 23:30:17.000
which can be explained by the tzdb being applied incorrectly in Hive 2, and another one at the very end:
dt                   dt2            dt3
0003-01-01 00:00:00  -62072708400   0002-12-29 23:30:17.000
0002-01-01 00:00:00  -62104244400   0001-12-29 23:30:17.000
0001-01-01 00:00:00  -62135780400   0001-12-29 23:30:17.000
0000-01-01 00:00:00  -62167402800   0002-12-29 23:30:17.000
That's not all. When I recreate the same table in Hive 3.1.3 from scratch:
create table test.ny3 stored as parquet
as
select
dt,
unix_timestamp(dt||' 00:00:00') dt2,
cast(dt as timestamp) dt3
from test.ny --this is my csv
I get the second error when I select it in Hive!
dt                   dt2            dt3
0003-01-01 00:00:00  -62072697600   0003-01-01 00:00:00.000
0002-01-01 00:00:00  -62104233600   0002-01-01 00:00:00.000
0001-01-01 00:00:00  -62135769600   0002-01-01 00:00:00.000
0000-01-01 00:00:00  -62167392000   0002-01-01 00:00:00.000
I also cannot select the data I want via spark-sql, no matter what mode I use, LEGACY (which is understandable):
dt                   dt2            dt3
0003-01-01 00:00:00  -62072697600   0003-01-03 00:29:43
0002-01-01 00:00:00  -62104233600   0002-01-03 00:29:43
0001-01-01 00:00:00  -62135769600   0001-01-03 00:29:43
0000-01-01 00:00:00  -62167392000   0001-01-03 00:29:43
or CORRECTED (which it gets almost right):
dt                   dt2            dt3
0003-01-01 00:00:00  -62072697600   0003-01-01 00:00:00
0002-01-01 00:00:00  -62104233600   0002-01-01 00:00:00
0001-01-01 00:00:00  -62135769600   0001-01-01 00:00:00
0000-01-01 00:00:00  -62167392000   0001-01-01 00:00:00   -- notice the year!
Question 1: what's with Hive 3 failing to correctly process existing timestamps around years 0000 and 0001?
Question 2: how can I read both old (written by Hive 2) and new (written by Hive 3) tables in the same Spark session?
how do I get Hive 3 to use the old calendar and tzdb logic, so I can read all tables in LEGACY mode, or
how do I correct Hive 2 tables in Hive 3 to use the new calendar and tzdb logic, so I can read all tables in Spark in CORRECTED mode?
Hive 3 doesn't support legacy timestamps; that support was only introduced, in some form, in the 4.x versions.
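On the Spark side, the rebase mode is a session configuration, so it can at least be switched per query rather than once for the whole session. A minimal spark-sql sketch, assuming the table names from the question (this only controls how Spark rebases the INT96 values on read; it does not change what Hive 3 itself returns):
-- sketch: toggle the rebase mode between reads in the same spark-sql session
SET spark.sql.legacy.parquet.int96RebaseModeInRead=LEGACY;    -- table written by Hive 2
SELECT dt, dt2, dt3 FROM test.ny2;
SET spark.sql.legacy.parquet.int96RebaseModeInRead=CORRECTED; -- table written by Hive 3
SELECT dt, dt2, dt3 FROM test.ny3;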

Cassandra TWCS Merges SSTables in the Same Bucket

I created the following table on Cassandra 3.11 for storing metrics using the TimeWindowCompactionStrategy:
CREATE TABLE metrics.my_test (
metric_name text,
metric_week text,
metric_time timestamp,
tags map<text, text>,
value double,
PRIMARY KEY ((metric_name, metric_week), metric_time)
) WITH CLUSTERING ORDER BY (metric_time DESC)
AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'MINUTES'}
AND default_time_to_live = 7776000
AND gc_grace_seconds = 60;
Following the TLP blog post about TWCS, I thought that if I issued a compaction, SSTables would not be compacted together across buckets (1-minute windows). However, it seems as though this is not true, and everything gets compacted together. Before compaction:
# for f in *Data.db; do ls -l $f && java -jar /root/sstable-tools-3.11.0-alpha11.jar describe $f | grep timestamp; done
-rw-r--r-- 1 cassandra cassandra 1431 Mar 22 17:29 mc-10-big-Data.db
Minimum timestamp: 1521739701309280 (03/22/2018 17:28:21)
Maximum timestamp: 1521739777814859 (03/22/2018 17:29:37)
-rw-r--r-- 1 cassandra cassandra 619 Mar 22 17:30 mc-11-big-Data.db
Minimum timestamp: 1521739787241285 (03/22/2018 17:29:47)
Maximum timestamp: 1521739810545148 (03/22/2018 17:30:10)
-rw-r--r-- 1 cassandra cassandra 654 Mar 22 17:20 mc-1-big-Data.db
Minimum timestamp: 1521739189529560 (03/22/2018 17:19:49)
Maximum timestamp: 1521739216248636 (03/22/2018 17:20:16)
-rw-r--r-- 1 cassandra cassandra 1154 Mar 22 17:21 mc-2-big-Data.db
Minimum timestamp: 1521739217033715 (03/22/2018 17:20:17)
Maximum timestamp: 1521739277579629 (03/22/2018 17:21:17)
-rw-r--r-- 1 cassandra cassandra 855 Mar 22 17:22 mc-3-big-Data.db
Minimum timestamp: 1521739283859916 (03/22/2018 17:21:23)
Maximum timestamp: 1521739326037634 (03/22/2018 17:22:06)
-rw-r--r-- 1 cassandra cassandra 1047 Mar 22 17:23 mc-4-big-Data.db
Minimum timestamp: 1521739327868930 (03/22/2018 17:22:07)
Maximum timestamp: 1521739387131847 (03/22/2018 17:23:07)
-rw-r--r-- 1 cassandra cassandra 1288 Mar 22 17:24 mc-5-big-Data.db
Minimum timestamp: 1521739391318240 (03/22/2018 17:23:11)
Maximum timestamp: 1521739459713561 (03/22/2018 17:24:19)
-rw-r--r-- 1 cassandra cassandra 767 Mar 22 17:25 mc-6-big-Data.db
Minimum timestamp: 1521739461284097 (03/22/2018 17:24:21)
Maximum timestamp: 1521739505132186 (03/22/2018 17:25:05)
-rw-r--r-- 1 cassandra cassandra 1216 Mar 22 17:26 mc-7-big-Data.db
Minimum timestamp: 1521739507504019 (03/22/2018 17:25:07)
Maximum timestamp: 1521739583459167 (03/22/2018 17:26:23)
-rw-r--r-- 1 cassandra cassandra 749 Mar 22 17:27 mc-8-big-Data.db
Minimum timestamp: 1521739587644109 (03/22/2018 17:26:27)
Maximum timestamp: 1521739625351120 (03/22/2018 17:27:05)
-rw-r--r-- 1 cassandra cassandra 1259 Mar 22 17:28 mc-9-big-Data.db
Minimum timestamp: 1521739627983733 (03/22/2018 17:27:07)
Maximum timestamp: 1521739698691870 (03/22/2018 17:28:18)
After issuing nodetool compact metrics my_test:
# for f in *Data.db; do ls -l $f && java -jar /root/sstable-tools-3.11.0-alpha11.jar describe $f | grep timestamp; done
-rw-r--r-- 1 cassandra cassandra 8677 Mar 22 17:30 mc-12-big-Data.db
Minimum timestamp: 1521739189529561 (03/22/2018 17:19:49)
Maximum timestamp: 1521739810545148 (03/22/2018 17:30:10)
It's clear that SSTables from multiple time windows were merged together, as the only SSTable left after the compaction covers 17:19:49 to 17:30:10.
What can I do to prevent this from happening? I have a large-ish table (12 nodes, ~550GB/node) implemented with TWCS, but it has multiple overlapping SSTables. I'd like to purge any tombstones and merge those overlapping SSTables; however, I'm worried I'll be left with a single 550GB SSTable per node. My concern is that a single SSTable that large will be slow to read... is that a valid concern?
Don't manually issue nodetool compact; that explicitly merges everything together into one SSTable.
TWCS behaves like STCS within the current time window until the window is done, then compacts that window down. A 1-minute window is extremely aggressive and probably not something that will realistically work, since data will be delivered across window boundaries. Flushes can be (and likely are) more than 1 minute apart, so the data won't even be in SSTables by the time the window passes, meaning almost everything ends up out of its window. Some overlapping SSTables are OK, so don't worry too much about that, but you will need a larger window than 1 minute; I'd be careful of anything less than 1 day.
Especially with a 1-week partition key and a 3-month TTL, you would have tens of thousands of SSTables, which isn't maintainable for streaming. Repairs will simply break.
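If you do widen the window, a sketch of the change (1 day is picked purely to illustrate the advice above; the table name comes from the question):
ALTER TABLE metrics.my_test
  WITH compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
                     'compaction_window_size': '1',
                     'compaction_window_unit': 'DAYS'};
Note that changing the compaction options only affects how future compactions bucket the data; it does not split the already merged SSTable back apart.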

Spark sql partition pruning for hive partitioned table

When I run a query, Spark does not seem to push the predicate down to prune the specific Hive table partition.
Setting "spark.sql.orc.filterPushdown" to "true" didn't help.
The Spark version is 1.6 and the Hive version is 1.2, and the Hive table is partitioned by date in ORC format.
val sc = new SparkContext(new SparkConf())
var hql = new org.apache.spark.sql.hive.HiveContext(sc)
hql.setConf("spark.sql.orc.filterPushdown", "true")
hql.sql("""
SELECT i.*,
from_unixtime(unix_timestamp('20170220','yyyyMMdd'),"yyyy-MM-dd'T'HH:mm:ssZ") bounce_date
FROM
(SELECT country,
device_id,
os_name,
app_ver
FROM jpl_band_orc
WHERE yyyymmdd='20170220'
AND scene_id='app_intro'
AND action_id='scene_enter'
AND classifier='app_intro'
GROUP BY country, device_id, os_name, app_ver ) i
LEFT JOIN
(SELECT device_id
FROM jpl_band_orc
WHERE yyyymmdd='20170220'
AND scene_id='band_list'
AND action_id='scene_enter') s
ON i.device_id = s.device_id
WHERE s.device_id is null
""")
This is the SHOW CREATE TABLE output:
CREATE TABLE `jpl_band_orc`(
  ... many fields ...
)
PARTITIONED BY (
  `yyyymmdd` string)
CLUSTERED BY (
  ac_hash)
SORTED BY (
  ac_hash ASC)
INTO 256 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'BLAH~BLAH~/jpl_band_orc'
TBLPROPERTIES (
  'orc.bloom.filter.columns'='ac_hash,action_id,classifier',
  'orc.bloom.filter.fpp'='0.05',
  'orc.compress'='SNAPPY',
  'orc.row.index.stride'='30000',
  'orc.stripe.size'='268435456',
  'transient_lastDdlTime'='1464922691')
Spark job output
17/02/22 17:05:32 INFO HadoopFsRelation: Listing leaf files and directories in parallel under:
hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160604, hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160608,
...
hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160620, hdfs://banda/apps/hive/warehouse/jpl_band_orc/yyyymmdd=20160621,
It finally ended with an OOM:
Exception in thread "qtp1779914089-88" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.HashMap$KeySet.iterator(HashMap.java:912)
at java.util.HashSet.iterator(HashSet.java:172)
at sun.nio.ch.Util$2.iterator(Util.java:243)
at org.spark-project.jetty.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:600)
at org.spark-project.jetty.io.nio.SelectorManager$1.run(SelectorManager.java:290)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.String.substring(String.java:1969)
at java.net.URI$Parser.substring(URI.java:2869)
It seems that Spark read all the partitions and then the OOM occurred.
How can I check exactly whether the partitions were pruned or not?
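One way to check, as a sketch (not from the original post): run EXPLAIN EXTENDED on the inner query through the same HiveContext (hql.sql(...)) and look at which partition directories show up in the physical plan; if pruning works, only the yyyymmdd=20170220 path should be listed rather than every partition.
EXPLAIN EXTENDED
SELECT country, device_id, os_name, app_ver
FROM jpl_band_orc
WHERE yyyymmdd='20170220'
  AND scene_id='app_intro'
  AND action_id='scene_enter'
  AND classifier='app_intro'
GROUP BY country, device_id, os_name, app_ver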

How can I restore Cassandra snapshots?

I'm building a backup and restore process for a Cassandra database so that it's ready when I need it, and so that I understand the details in order to build something that will work for production. I'm following Datastax's instructions here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html.
As a start, I'm seeding the database on a dev box then attempting to make the backup/restore work. Here's the backup script:
#!/bin/bash
cd /opt/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
Yes, tar is not the most efficient way, perhaps, but I'm just trying to get something working right now. I've checked the tar, and all the files are there.
Once the database is backed up, I shut down Cassandra and my app, then rm -rf /var/lib/cassandra/ to simulate a complete loss.
Now to restore the database. Restoration "Method 2" from http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html is more compatible with my schema-creation component than Method 1.
So, Method 2/Step 1, "Recreate the schema": Restart Cassandra, then my app. The app is built to re-recreate the schema on startup when necessary. Once it's up, there's a working Cassandra node with a schema for the app, but no data.
Method 2/Step 2 "Restore the snapshot": They give three alternatives, the first of which is to use sstableloader, documented at http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html. The folder structure that the loader requires is nothing like the folder structure created by the snapshot tool, so everything has to be moved into place. Before going to all that trouble, I'll just try it out on one table:
>./bin/sstableloader makeyourcase/users
Error: Could not find or load main class org.apache.cassandra.tools.BulkLoader
Hmmm, well, that's not going to work. BulkLoader is in ./lib/apache-cassandra-2.0.9.jar, but the loader doesn't seem to be set up to work out of the box. Rather than debug the tool, let's move on to the second alternative, copying the snapshot directory into the makeyourcase/users/snapshots/ directory. This should be easy, since we're throwing the snapshot directory right back where it came from, so tar xzf after_seeding.tgz should do the trick:
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh should restore the data:
cd /opt/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase users
This runs without complaint. Note that you have to run this for each and every table, so you have to generate the list of tables first. But, before we do that, note that there's something interesting in the Cassandra logs:
INFO 14:32:26,319 Loading new SSTables for makeyourcase/users...
INFO 14:32:26,326 No new SSTables were found for makeyourcase/users
So, we put the snapshot back, but Cassandra didn't find it. I also tried moving the snapshot directory under the existing SSTables directory, and copying the old SSTable files into the existing directory, with the same error in the log. Cassandra doesn't log where it expects to find them, just that it can't find them. The docs say to put them into a directory named data/keyspace/table_name-UUID, but there is no such directory. There is one named data/makeyourcase/users/snapshots/1408820504987-users/, but putting the snapshot dir there, or the individual files, didn't work.
The third alternative, the "Node restart method" doesn't look suitable for a multi-node production environment, so I didn't try that.
Edit:
Just to make this perfectly explicit for the next person, here are the preliminary, working backup and restore scripts that apply the accepted answer.
myc_backup.sh:
#!/bin/bash
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool clearsnapshot -t after_seeding makeyourcase
./bin/nodetool snapshot -t after_seeding makeyourcase
cd /var/lib/
tar czf after_seeding.tgz cassandra/data/makeyourcase/*/snapshots/after_seeding
myc_restore.sh:
#!/bin/bash
cd /var/lib/
tar xzf after_seeding.tgz
chmod -R u+rwx cassandra/data/makeyourcase
cd ~/bootstrap/apache-cassandra-2.0.9
TABLE_LIST=`./bin/nodetool cfstats makeyourcase | grep "Table: " | sed -e 's+^.*: ++'`
for TABLE in $TABLE_LIST; do
echo "Restore table ${TABLE}"
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
if [ -d "snapshots/after_seeding" ]; then
cp snapshots/after_seeding/* .
cd ~/bootstrap/apache-cassandra-2.0.9
./bin/nodetool refresh -- makeyourcase ${TABLE}
cd /var/lib/cassandra/data/makeyourcase/${TABLE}
rm -rf snapshots/after_seeding
echo " Table ${TABLE} restored."
else
echo " >>> Nothing to restore."
fi
done
Added more details:
You can run the snapshot for your particular keyspace using:
$ nodetool snapshot <mykeyspace> -t <SnapshotDirectoryName>
This will create the snapshot files inside the snapshots directory in data.
When you delete your data, make sure you don't delete the snapshots folder or you will not be able to restore it (unless you are moving it to another location / machine.)
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ ls
mykeyspace-mytable-jb-2-CompressionInfo.db mykeyspace-mytable-jb-2-Statistics.db
mykeyspace-mytable-jb-2-Data.db mykeyspace-mytable-jb-2-Filter.db mykeyspace-mytable-jb-2-Index.db
mykeyspace-mytable-jb-2-Summary.db mykeyspace-mytable-jb-2-TOC.txt snapshots
$ rm *
rm: cannot remove `snapshots': Is a directory
Once you are ready to restore, copy back the snapshot data into the keyspace/table directory (one for each table):
$ pwd
/var/lib/cassandra/data/mykeyspace/mytable
$ sudo cp snapshots/<SnapshotDirectoryName>/* .
You mentioned:
and that puts the snapshot directories back under their respective 'snapshots' directories, and a refresh should restore the data:
I think the issue is that you are restoring the Snapshot data into the snapshot directory. It should go right in the table directory. Everything else seems right, let me know.
The docs say to put them into a directory named
data/keyspace/table_name-UUID, but there is no such directory.
You don't have this UUID directory because you are using Cassandra 2.0; this UUID naming started with Cassandra 2.2.
Step-1: I created one table by using the below command
CREATE TABLE Cricket (
PlayerID uuid,
LastName varchar,
FirstName varchar,
City varchar,
State varchar,
PRIMARY KEY (PlayerID));
Step-2: Insert 3 records by using below command
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Pendulkar', 'Sachin', 'Mumbai','Maharastra');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Vholi', 'Virat', 'Delhi','New Delhi');
INSERT INTO Cricket (PlayerID, LastName, FirstName, City, State)
VALUES (now(), 'Sharma', 'Rohit', 'Berhampur','Odisha');
Step-3: Accidentally, I dropped the Cricket table
drop table Cricket;
Step-4: Recover that table by using the automatic snapshot backup.
Note: auto_snapshot (Default: true) enables or disables whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent data loss, using the default setting is strongly advised.
Step-5: Find the snapshot locations and files
cassandra#node1:~/data/students_details$ cd cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 0
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
Step-6: You will find one .cql file in that snapshot location, which contains the table's DDL.
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ ls -lrth
total 44K
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:06 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:06 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:06 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:06 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:06 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:06 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:06 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:06 schema.cql
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:06 manifest.json
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$
more schema.cql
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ more schema.cql
CREATE TABLE IF NOT EXISTS students_details.cricket (
playerid uuid PRIMARY KEY,
city text,
firstname text,
lastname text,
state text)
WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
AND bloom_filter_fp_chance = 0.01
AND dclocal_read_repair_chance = 0.1
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND min_index_interval = 128
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE'
AND comment = ''
AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
AND cdc = false
AND extensions = { };
Step-7: Log in to the database and create the table using that DDL.
apiadmin#cqlsh:coopersdev> use students_details;
apiadmin#cqlsh:students_details> CREATE TABLE IF NOT EXISTS students_details.cricket (
... playerid uuid PRIMARY KEY,
... city text,
... firstname text,
... lastname text,
... state text)
... WITH ID = 88128dc0-960d-11ea-947b-39646348bb4f
... AND bloom_filter_fp_chance = 0.01
... AND dclocal_read_repair_chance = 0.1
... AND crc_check_chance = 1.0
... AND default_time_to_live = 0
... AND gc_grace_seconds = 864000
... AND min_index_interval = 128
... AND max_index_interval = 2048
... AND memtable_flush_period_in_ms = 0
... AND read_repair_chance = 0.0
... AND speculative_retry = '99PERCENTILE'
... AND comment = ''
... AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
... AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
... AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
... AND cdc = false
... AND extensions = { };
apiadmin#cqlsh:students_details>
Step-8: Copy all the files in the snapshot folder to the existing cricket table folder.
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ pwd
/home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cp * /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f/snapshots/dropped-1589479603749-cricket$ cd /home/cassandra/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$ ls -lrth
total 44K
drwxrwxr-x 2 cassandra cassandra 6 May 14 18:05 backups
drwxrwxr-x 3 cassandra cassandra 43 May 14 18:06 snapshots
-rw-rw-r-- 1 cassandra cassandra 891 May 14 18:11 schema.cql
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-TOC.txt
-rw-rw-r-- 1 cassandra cassandra 92 May 14 18:11 md-1-big-Summary.db
-rw-rw-r-- 1 cassandra cassandra 4.7K May 14 18:11 md-1-big-Statistics.db
-rw-rw-r-- 1 cassandra cassandra 61 May 14 18:11 md-1-big-Index.db
-rw-rw-r-- 1 cassandra cassandra 16 May 14 18:11 md-1-big-Filter.db
-rw-rw-r-- 1 cassandra cassandra 9 May 14 18:11 md-1-big-Digest.crc32
-rw-rw-r-- 1 cassandra cassandra 179 May 14 18:11 md-1-big-Data.db
-rw-rw-r-- 1 cassandra cassandra 43 May 14 18:11 md-1-big-CompressionInfo.db
-rw-rw-r-- 1 cassandra cassandra 31 May 14 18:11 manifest.json
cassandra#node1:~/data/students_details/cricket-88128dc0960d11ea947b39646348bb4f$
Step-9: Start restoring the table data with sstableloader by using the below command.
cassandra#node1:~$ sstableloader -d 10.213.61.21 -username cassandra --password cassandra /home/cassandra/data/students_details/cricket-d3576f60960f11ea947b39646348bb4f/snapshots
Established connection to initial hosts
Opening sstables and calculating sections to stream
Summary statistics:
Connections per host : 1
Total files transferred : 0
Total bytes transferred : 0.000KiB
Total duration : 2920 ms
Average transfer rate : 0.000KiB/s
Peak transfer rate : 0.000KiB/s
Step-10: Table restored successfully. Please verify:
playerid | city | firstname | lastname | state
--------------------------------------+-----------+-----------+-----------+------------
d7b12c90-960f-11ea-947b-39646348bb4f | Berhampur | Rohit | Sharma | Odisha
d7594890-960f-11ea-947b-39646348bb4f | Delhi | Virat | Vholi | New Delhi
d7588540-960f-11ea-947b-39646348bb4f | Mumbai | Sachin | Pendulkar | Maharastra

Restore Cassandra cluster data when a table is accidentally dropped

As you know, a Cassandra cluster has replication to prevent data loss even if some nodes in the cluster go down. But if an admin accidentally drops a table with a big amount of data, and that command has already been executed by all the replicas in the cluster, does this mean you have lost that table and cannot restore it? Is there any suggestion for coping with this kind of disaster with short server downtime?
From cassandra docs:
auto_snapshot
(Default: true ) Enable or disable whether a snapshot is taken of the data before keyspace truncation or dropping of tables. To prevent
data loss, using the default setting is strongly advised. If you set
to false, you will lose data on truncation or drop.
If the administrator has deleted the data and the deletion has been replicated to all the nodes, it is difficult to recover the data without a consistent backup.
Considering that deletes in Cassandra are not executed instantly, maybe you can recover the data. When you delete data, Cassandra replaces it with a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
See http://wiki.apache.org/cassandra/DistributedDeletes
Columns marked with a tombstone exist for a configured time period (defined by the gc_grace_seconds value set on the column family), and then are permanently deleted by the compaction process after that time has expired. The default value is 10 days.
Following the explanation in About Deletes: maybe if you shut down some of the nodes, wait until compaction succeeds and the data is completely deleted from the SSTables, and then turn the nodes on again, the data could appear again. But this will only happen if you don't run periodic repair operations on the nodes.
I have never tried this before, it is only an idea that comes to me reading the cassandra documentation.

Resources