is 'nodetool repair -pr -par' a full repair? - cassandra

We run repair -pr on every DSC 2.1.15 node within gc_grace, like this:
nodetool -h localhost repair -pr -par mykeyspc
But in the log it says full=true:
[2017-02-12 00:00:01,683] Starting repair command #11, repairing 256 ranges for
keyspace mykeyspc (parallelism=PARALLEL, full=true)
I would have expected -pr not to run a full repair. How should I read this log?

It means full as in "not incremental". You can think of it as fully repairing the data in those ranges, not just the unrepaired data. The argument naming is confusing. The -pr flag means it only repairs the node's primary ranges, so you still need to run it on each node.
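Since -pr only covers the primary ranges of the node it is run against, the command has to be repeated for every node. A minimal sketch of such a loop, assuming hypothetical host names (node1 to node3) and that JMX on each node is reachable from where the script runs; if it is not, the same command could be run over ssh on each node instead. By default it only prints the commands:

```shell
#!/bin/bash
# Hypothetical node list; the keyspace name is the one from the question.
NODES="node1 node2 node3"
KEYSPACE=mykeyspc

# Print (DRY_RUN=1, the default) or run the primary-range repair,
# one node after the other.
repair_all_nodes() {
  local node
  for node in $NODES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "nodetool -h $node repair -pr -par $KEYSPACE"
    else
      nodetool -h "$node" repair -pr -par "$KEYSPACE"
    fi
  done
}

repair_all_nodes
```

Running it with DRY_RUN=0 set in the environment executes the repairs instead of printing them.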

Related

cassandra - Take snapshot to different location

I want to take a Cassandra backup every hour and move it to a shared location.
Cassandra takes the snapshot in the default location. How can I take the snapshot in /opt/backup instead?
You can't (with snapshots).
nodetool snapshot -t <tag> <keyspace> is a quite simple tool - it just creates hard links for every file in your keyspace data directories to snapshots/<tag>.
Since these are hard links, they have to be on the same filesystem. The benefit of hard links is that a snapshot is fast and initially consumes no additional disk space (when sstables are later compacted or deleted, the files remain in the snapshot).
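The hard-link behaviour can be tried without Cassandra. The following is a small sketch in a scratch directory; the file and directory names are made up for illustration and only stand in for an sstable and its snapshot:

```shell
#!/bin/bash
# Scratch directory standing in for a table's data directory.
WORK=$(mktemp -d)
mkdir -p "$WORK/table/snapshots/demo"

# Stand-in for an sstable file.
echo "sstable contents" > "$WORK/table/data.db"

# A hard link is a second name for the same file; no data is copied.
ln "$WORK/table/data.db" "$WORK/table/snapshots/demo/data.db"

# The second column of ls -l is the link count, now 2.
LINKS=$(ls -l "$WORK/table/data.db" | awk '{print $2}')
echo "link count: $LINKS"

# Removing the original name leaves the snapshot name readable.
rm "$WORK/table/data.db"
SURVIVOR=$(cat "$WORK/table/snapshots/demo/data.db")
echo "snapshot still holds: $SURVIVOR"

# Clean up the scratch directory.
rm -rf "$WORK"
```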
If you want those backups in a different location, use -t <tag> when creating the snapshot and then copy the tagged files there. I made up a demo with demosnapshot and a simple script (not fully elaborated, but it shows the idea):
$ cat cassandrabackup.sh
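#!/bin/bash
# Tag the snapshot with the current date and time
TAG=$(date +%Y%m%d%H%M%S)
# Keep each node's backup in its own directory
BACKUP_LOC=/tmp/backup/$(hostname)
KEYSPACE=demokeyspace
echo creating snapshot $TAG
# $KEYSPACE is left unquoted on purpose: if it is empty, all keyspaces are snapshotted
nodetool snapshot -t "$TAG" $KEYSPACE
echo sync to backup location $BACKUP_LOC
# Copy only the files of this snapshot, keeping their paths relative to /var/lib/cassandra
find /var/lib/cassandra -type f -path "*snapshots/$TAG*" -printf %P\\0 | rsync -avP --files-from=- --from0 /var/lib/cassandra/ "$BACKUP_LOC"
echo removing snapshot $TAG
nodetool clearsnapshot -t "$TAG"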
The script creates a snapshot with a specific tag (date and time), rsyncs the contents to a backup location and then removes the snapshot. If KEYSPACE is not defined, all keyspaces are backed up.
Result is like this:
$ ./cassandrabackup.sh
creating snapshot 20170823132936
Requested creating snapshot(s) for [demokeyspace] with snapshot name [20170823132936] and options {skipFlush=false}
Snapshot directory: 20170823132936
sync to backup location /tmp/backup/host1.domain.tld
building file list ...
6 files to consider
data1/
data1/demokeyspace/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/manifest.json
13 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/6)
sent 305 bytes received 50 bytes 710.00 bytes/sec
total size is 13 speedup is 0.04
removing snapshot 20170823132936
Requested clearing snapshot(s) for [all keyspaces] with snapshot name [20170823132936]
$ find /tmp/backup/
/tmp/backup/
/tmp/backup/host1.domain.tld
/tmp/backup/host1.domain.tld/data2
/tmp/backup/host1.domain.tld/data2/demokeyspace
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823125951
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823125951/manifest.json
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823130014
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823130014/manifest.json
/tmp/backup/host1.domain.tld/data1
/tmp/backup/host1.domain.tld/data1/demokeyspace
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/manifest.json
$
I made this mistake myself in the past, so: include the hostname in the backup path ;)
Apart from that, there is also an incremental backup feature in Cassandra:
http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupIncremental.html

How do I disable incremental repair?

I have a cluster which I am considering enabling incremental repair on. If anything goes wrong I'd like to disable incremental repair on every node. How do I do that?
Stop the node and use sstablerepairedset to clear the repaired time on each sstable, so that they will all be candidates for future compactions. The tool only makes changes when it is given the --really-set flag.
find '/path/cassandra/data/keyspace/table/' -iname "*Data.db*" > sstables.txt
sudo -u cassandra sstablerepairedset --really-set --is-unrepaired -f sstables.txt
Then just go back to using repair with no -inc, or in later versions use the -full flag.

Unable to clear snapshot using Nodetool. Snapshot is never deleted

When I run nodetool clearsnapshot I get the normal "Requested clearing snapshot(s)" message, but the snapshot is never removed. What can I do to troubleshoot why this is occurring? Is it acceptable for me to just manually remove the snapshot directories from the tablespace directories as a workaround for this?
nodetool clearsnapshot 1472489985541
Requested clearing snapshot(s) for [1472489985541]
nodetool listsnapshots | awk '{print $1}' | grep ^1 | sort -u
1472489985541
1473165734236
1473690660090
1474296554367
Is it acceptable for me to just manually remove the snapshot directories from the tablespace directories as a workaround for this?
Yes, you can always safely remove the snapshot directories manually. They are just hard links to the actual SSTables.
In order to delete a snapshot from all keyspaces using the snapshot name, you must specify the -t flag in your clearsnapshot command. Without -t, the argument is treated as a keyspace name rather than a snapshot name.
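For the manual route, it helps to list the directories before removing anything. A rough sketch, assuming the default data directory /var/lib/cassandra/data (adjust the second argument if yours differs); the function name find_snapshot_dirs is made up for this example:

```shell
#!/bin/bash
# List every snapshot directory with the given name below a data directory.
# $1: snapshot name, $2: data directory (assumed default shown below).
find_snapshot_dirs() {
  local tag=$1
  local data_dir=${2:-/var/lib/cassandra/data}
  find "$data_dir" -type d -path "*/snapshots/$tag"
}

# When the script is called with arguments, pass them straight through.
if [ "$#" -gt 0 ]; then
  find_snapshot_dirs "$@"
fi
```

Called with the snapshot name from the question (1472489985541) as its first argument, this prints one directory per table that still holds that snapshot. Review the list before deleting those directories.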

To find Cassandra disk space usage

I am using JConsole to monitor Cassandra. I can get values such as how much load each keyspace has.
I want to find out the disk space usage of each node in a cluster remotely.
Is there any way to do so?
A shell script can do the trick:
for i in node1_ip node2_ip ... nodeN_ip
do
ssh user@$i "du -sh /var/lib/cassandra/data" >> /tmp/disk_usage.txt
done
Replace /var/lib/cassandra/data if your data folder is located somewhere else.

How to take backup of keyspace in cassandra by command?

I want to take a backup of a keyspace in Cassandra from the command line.
Use the nodetool command. Something like:
nodetool -h localhost -p 7199 snapshot mykeyspace
Take a look at the documentation here:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html
