Un-glusterize a disk? - glusterfs

How can I get back a bare disk (with the data intact) that was used as a brick in a simple two-node replica GlusterFS cluster?
Would removing the .glusterfs directory be sufficient or are the files themselves somehow tied to GlusterFS?

In addition to removing the .glusterfs directory, you would also need to remove the various extended attributes which gluster sets on each of the files/directories in the brick.
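For illustration, a rough sketch of that cleanup on one brick (the volume name myvol and the brick path /bricks/brick1 are placeholders; trusted.gfid and trusted.glusterfs.volume-id are the attributes gluster normally sets, but verify with getfattr on your own brick first):

# stop and delete the volume so gluster no longer references the brick
gluster volume stop myvol
gluster volume delete myvol

# list which trusted.* attributes are actually present on a file
getfattr -d -m . -e hex /bricks/brick1/path/to/file

# recursively strip the gluster attributes, then drop the .glusterfs tree
find /bricks/brick1 -exec setfattr -x trusted.gfid {} \; 2>/dev/null
find /bricks/brick1 -exec setfattr -x trusted.glusterfs.volume-id {} \; 2>/dev/null
rm -rf /bricks/brick1/.glusterfs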


Can glusterfs volume be created out of directory instead of partition?

I want to create a volume using GlusterFS. Can a glusterfs volume be created out of a directory instead of a partition?
Yes, that should work. I pretty much use it that way during development/testing:
gluster volume create testvol replica 3 myhost:/home/ravi/bricks/brick{1..6} force
Unless you want to use features like snapshots, which require the bricks to be thinly provisioned LVM volumes.
Might I also add that if you place multiple bricks of different distribute subvolumes on the same underlying file system, things like df and quotas might not always work as intended.

Cassandra - avoid nodetool cleanup

If we have added new nodes to a C* ring, do we need to run "nodetool cleanup" to get rid of the data that has now been assigned elsewhere? Or is this going to happen anyway during normal compactions?
During normal compactions, does C* remove data that no longer belongs on this node, or do we need to run "nodetool cleanup" for that? Asking because "cleanup" takes forever and crashes the node before finishing.
If we need to run "nodetool cleanup", is there a way to find out which nodes now have data they should no longer own? (i.e. data that now belongs on the new nodes, but is still present on the old nodes because no one removed it. This is the data that "nodetool cleanup" would remove.) We have RF=3 and two data centers, each of which has a complete copy of the data. I assume we need to run cleanup on all nodes in the data center where we have added nodes, because each row on the new nodes used to be on another node (primary), plus two copies (replicas) on two other nodes.
If you are on Apache Cassandra 1.2 or newer, cleanup checks the metadata on the files so that it only does something if it needs to. So you are safe to just run it on every node, and only those nodes with extra data will do anything. The data will not be removed during the normal compaction process; you have to call cleanup to remove it.
What I found helpful is to compare how much space each node occupies in its data folder (for me it was /var/lib/cassandra/data). Some things like snapshots might differ between the nodes, but when you see that newer nodes use much less disk space than older ones, it might be because the older ones never had a cleanup after the new ones were added. While you are there, you can also check what the biggest .db file is and whether your storage has enough free space for another file of that size. Cleanup seems to copy the data of the .db files into new ones, minus the data that is now on other nodes, so you might need that extra space while it runs.
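As a rough illustration of the checks described above (paths assume the default /var/lib/cassandra layout; adjust to your own data_file_directories):

# compare the load reported for each node across the ring
nodetool status

# on each node, see how much space each keyspace occupies on disk
du -sh /var/lib/cassandra/data/*

# list the largest data files, to estimate the headroom cleanup may need while rewriting them
find /var/lib/cassandra/data -name '*-Data.db' -printf '%s %p\n' | sort -n | tail -5

# once there is enough free space, remove the data this node no longer owns
nodetool cleanup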

Cassandra multiple disk per node setup

Intro
I have a Cassandra 1.2 cluster, and all the nodes have SSDs. Now I want to add more disks to the existing nodes, but I want to be able to choose which tables are stored on which disks.
Problem
For example, node 1 will have 3 SSDs and 1 regular disk drive, and I want all the column families except one (let's call it the "discord" table) to be stored on the SSDs only; the final table, "discord", needs to be stored on the regular disk.
According to the documentation this should be possible; however, the only way of doing it that I can see is:
Setting up Cassandra to use multiple data_files_directories in cassandra.yaml.
Creating the tables.
Creating a link from the data directory on each SSD to the directory on the hard disk where I want to store the column family.
Question
Is this the only way of doing it, or is there a simpler way of configuring a node to work like this?
You can set multiple directories using the data_file_directories property, but the data is distributed over those folders internally by Cassandra. You cannot decide which keyspace or column family goes to which directory.
So symbolic links are the way to go, in my opinion.
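A minimal sketch of the symlink step, assuming the default data path, an init-script install, and a keyspace called mykeyspace holding the "discord" table (on Cassandra 1.2 the table directory is simply <keyspace>/<table>):

# flush memtables and stop the node before touching its data files
nodetool drain
service cassandra stop

# move the table's directory to the regular disk and link it back into place
mv /var/lib/cassandra/data/mykeyspace/discord /mnt/hdd/cassandra/discord
ln -s /mnt/hdd/cassandra/discord /var/lib/cassandra/data/mykeyspace/discord

service cassandra start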

moving Cassandra snapshots to a different disk/server/datacenter

I have a Cassandra 1.2.6 cluster running in datacenter A; each node has a solid state drive with somewhat limited space (approx. 50% of disk space is free).
Now I need to implement some way of taking automatic backups of each node. Ideally I want a way of moving all of the cluster's data files to a different disk (standard, cheaper disks), or even to a different server in the same datacenter A, and possibly moving all the data once in a while to a datacenter B in a different location.
From what I've read, I can use snapshots on each node to get the files to copy with whatever tool I want, and in that case I have the option to move the data to a different disk/server/datacenter.
My question is: since each of my nodes is about 50% full, will taking a snapshot consume all that space, or will the hard links consume far less space than I anticipate? If so, is there a better way of doing this, maybe with an existing tool, or does everything have to be custom made when it comes to this type of backup in Cassandra?
Thanks in advance!
A hard link just creates a new directory entry for the same file (http://en.wikipedia.org/wiki/Hard_link). So a snapshot takes up effectively zero space, but you'll want to clean it up after you're done copying it off to whatever your archive is, because when the "original" sstable is deleted (typically post-compaction), space won't be reclaimed as long as the snapshot reference is still there.
My impression is that tablesnap is the most popular tool for automating backups to s3. It also supports Cassandra incremental backups. If you want more control over where you're backing up to, DataStax OpsCenter supports running a custom script when it takes snapshots.
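For reference, a hedged sketch of that snapshot/copy/clean cycle using plain rsync (the tag name nightly and the host backuphost are made up; tablesnap or an OpsCenter post-snapshot script would replace the copy step):

# create hard-linked snapshots of every keyspace under .../snapshots/nightly
nodetool snapshot -t nightly

# copy the snapshot directories to cheaper storage, preserving the keyspace/table layout
rsync -a --relative /var/lib/cassandra/data/./*/*/snapshots/nightly/ backuphost:/backups/$(hostname)/

# remove the hard links so disk space can be reclaimed once the original sstables are compacted away
nodetool clearsnapshot -t nightly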

How does cassandra split keyspace data when multiple directories are configured?

I have configured two separate data directories in the cassandra.yaml file, as given below:
data_file_directories:
- E:/Cassandra/data/var/lib/cassandra/data
- K:/Cassandra/data/var/lib/cassandra/data
When I create a keyspace and insert data, the keyspace directory gets created in both directories and the data gets scattered between them. What I want to know is how Cassandra splits the data between multiple directories, and what the rule behind this is.
You are using the JBOD feature of Cassandra when you add multiple entries under data_file_directories. Data is spread evenly over the configured drives proportionate to their available space.
This also lets you take advantage of the disk_failure_policy setting. You can read about the details here:
http://www.datastax.com/dev/blog/handling-disk-failures-in-cassandra-1-2
In short, you can configure Cassandra to keep going, doing what it can, if a disk becomes full or fails completely. This has advantages over RAID 0 (where you would effectively have the same capacity as JBOD) in that you do not have to restore the whole data set from backup (or run a full repair) but just run a repair for the missing data. On the other hand, RAID 0 provides higher throughput (depending on how well you tune the RAID array to match the filesystem and drive geometry).
If you have the resources for a fault-tolerant/more performant RAID setup (RAID 10, for example), you may want to just use a single directory for simplicity. Most deployments are starting to lean toward the density route, though, using JBOD rather than system-level tolerance.
You can read about the thought process behind the development of this issue here:
https://issues.apache.org/jira/browse/CASSANDRA-4292
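For concreteness, the relevant cassandra.yaml settings look roughly like this (the paths are illustrative; best_effort keeps the node serving from the remaining disks, while stop shuts the node down when a disk fails):

data_file_directories:
- /mnt/disk1/cassandra/data
- /mnt/disk2/cassandra/data
disk_failure_policy: best_effort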
From what I am able to guess about how the keyspace is split between multiple data directories: based on the available space and load of each directory, SSTables of the same column family are written to different data directories.
