I need a GlusterFS architecture that lets me store large files (bigger than a single brick) in volumes. I'm not going to use the striped volume type because it has performance issues and makes my volumes slower.
You can look at the sharding feature in GlusterFS.
Check the documentation and try it out:
Because sharding distributes files across the bricks in a volume, it lets you store files with a larger aggregate size than any individual brick in the volume.
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/administration_guide/#sect-Managing_Sharding
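To get a feel for what sharding does to one large file, here is a rough sketch that computes how many fixed-size pieces a file would be split into. The function name and the 64 MiB block size are assumptions for illustration only; check the shard block size actually configured on your volume.

```python
def shard_count(file_size_bytes: int, shard_block_size_bytes: int) -> int:
    """Number of fixed-size pieces needed to hold a file of the given size."""
    if file_size_bytes < 0:
        raise ValueError("file size must not be negative")
    if shard_block_size_bytes <= 0:
        raise ValueError("shard block size must be positive")
    # Ceiling division using integers only, so large sizes stay exact.
    return -(-file_size_bytes // shard_block_size_bytes)


MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumed example: a 200 GiB file with an assumed 64 MiB shard block size.
pieces = shard_count(200 * GIB, 64 * MIB)
print(pieces)
```

Because each piece is placed independently, no single brick has to hold the whole file, which is the property the quoted documentation describes.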
I want to create a volume using GlusterFS. Can a GlusterFS volume be created from a directory instead of a partition?
Yes, that should work. I pretty much use it that way during development/testing:
gluster volume create testvol replica 3 myhost:/home/ravi/bricks/brick{1..6} force
The exception is if you want to use features like snapshots, which require bricks on thinly provisioned LVM volumes.
I'd also add that if you place bricks from different distribute subvolumes in the same folder, things like df and quotas might not always work as intended.
Intro
I have a Cassandra 1.2 cluster in which all the nodes have SSDs. Now I want to add more disks to the existing nodes, but I want to be able to choose which tables are stored on which disks.
Problem
For example, node 1 will have 3 SSDs and 1 regular disk drive. I want all the column families except one (let's call it the "discord" table) to be stored on the SSDs only; the "discord" table needs to be stored on the regular disk.
According to the documentation this should be possible; however, the only way of doing it that I can see is:
Setting up Cassandra to use multiple data_file_directories in cassandra.yaml.
Creating the tables.
Creating a link from the data directory on each SSD to the directory on the hard disk where I want to store the column family.
Question
Is this the only way of doing it? Or is there a simpler way of configuring a node to work like this?
You can set multiple directories using the data_file_directories property, but Cassandra distributes the data across those directories internally. You cannot decide which keyspace or column family goes to which directory.
So symbolic links are the way to go, in my opinion.
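A minimal sketch of the symbolic-link approach, run against throwaway temporary directories rather than a real node. The keyspace name "myks", the "ssd" and "hdd" directory names, and the helper function are all hypothetical, and the sketch assumes the node is stopped while a table directory is moved.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(table_dir: str, target_dir: str) -> None:
    """Move table_dir to target_dir and leave a symlink at the old path."""
    # target_dir must not exist yet, so the move is a plain rename/copy.
    shutil.move(table_dir, target_dir)
    os.symlink(target_dir, table_dir, target_is_directory=True)


# Stand-ins for an SSD data directory and a directory on the regular disk.
root = tempfile.mkdtemp()
ssd_table = os.path.join(root, "ssd", "myks", "discord")
hdd_table = os.path.join(root, "hdd", "myks", "discord")
os.makedirs(ssd_table)
os.makedirs(os.path.dirname(hdd_table))

# A placeholder file standing in for the table's data files.
with open(os.path.join(ssd_table, "data.db"), "w") as f:
    f.write("placeholder")

relocate_with_symlink(ssd_table, hdd_table)
print(os.path.islink(ssd_table), os.path.realpath(ssd_table))
```

After the move, anything that opens the old path transparently reads and writes on the other disk.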
I'm considering switching to FreeNAS at the same time I'm acquiring some new disks for my home server. The end configuration will have a 1.5TB drive (currently the largest disk in the set) and two 3TB drives.
The "obvious" way to structure this (to me) would be to create partitions on the 3TB drives equal in size to the full 1.5TB drive, then RAID-Z those partitions together for 3TB of redundant storage. The remainder of the 3TB drives could be mirrored together for another 1.5TB of redundant storage. This seems like it gives me no wasted space, and a full 4.5TB of redundant storage to work with.
The problem is that I can't find anything that would let me treat these two segments as a single pool. I don't really care if any given data is written to parity vs. mirrored space, so long as it's all resilient to a single disk failure.
Am I stuck with two virtual spaces and allocating data between them, or is there a ZFS option I'm not finding that would let me pool the whole thing?
Technically you should be able to build a pool with two vdevs -- one with RAID-Z with 3 partitions and another a mirror with 2 partitions.
Something like this should work:
zpool create tank raidz da0p0 da1p0 da2p0 mirror da0p1 da1p1
That said, you don't want to do that, for performance reasons. Reads and writes will be distributed across all vdevs and, as a result, across all your partitions for every chunk of data ZFS needs to write out. In the end, your 3TB hard drives will have to seek back and forth to access data on different partitions each time ZFS writes out a transaction group. Once data is written, similar seeks will be needed to read data that's not in the ARC yet. At 10-20ms per seek, performance will be rather terrible.
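The capacity figures from the question can be checked with some simple arithmetic. This is only a rough sketch: the helper names are made up for illustration, and it ignores metadata, padding and other overhead, so real usable space will be somewhat lower.

```python
def raidz1_usable_tb(member_sizes_tb):
    """Approximate usable space of single-parity RAID-Z: (n - 1) * smallest member."""
    return (len(member_sizes_tb) - 1) * min(member_sizes_tb)


def mirror_usable_tb(member_sizes_tb):
    """Approximate usable space of a mirror: the smallest member."""
    return min(member_sizes_tb)


# The 1.5TB drive plus a 1.5TB partition on each of the two 3TB drives.
raidz_tb = raidz1_usable_tb([1.5, 1.5, 1.5])
# The remaining 1.5TB on each of the two 3TB drives.
mirror_tb = mirror_usable_tb([1.5, 1.5])
total_tb = raidz_tb + mirror_tb
print(raidz_tb, mirror_tb, total_tb)
```

The numbers match the 3TB plus 1.5TB layout described in the question; the performance concern above is what argues against it, not the capacity.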
I have a Cassandra 1.2.6 cluster running in datacenter A. Each node has a solid state drive with somewhat limited space (approx. 50% of the disk space is free).
Now I need to implement some way of taking automatic backups of each node. Ideally I want a way of moving all of the cluster's data files to a different disk (standard, cheaper disks), or even to a different server in the same datacenter A, and possibly moving all the data once in a while to a datacenter B in a different location.
From what I've read, I can use snapshots on each node to get the files and copy them using whatever tool I want, and in this case I have the option to move the data to a different disk/server/datacenter.
My question is: since each of my nodes is about 50% full, will taking a snapshot consume all of the remaining space, or will the hard links consume far less space than I anticipate? Also, is there a better way of doing this, maybe with a ready-made tool, or does everything have to be custom made when it comes to this type of backup in Cassandra?
Thanks in advance!
A hard link just creates a new directory entry for the same file (http://en.wikipedia.org/wiki/Hard_link). So a snapshot takes up effectively zero space, but you'll want to clean it up after you're done copying it off to whatever your archive is, because when the "original" sstable is deleted (typically post-compaction), space won't be reclaimed as long as the snapshot reference is still there.
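The hard-link behaviour can be seen with a small experiment in a temporary directory. The file names are hypothetical stand-ins for an SSTable and its snapshot entry, and the sketch assumes a filesystem that supports hard links.

```python
import os
import tempfile

root = tempfile.mkdtemp()
sstable = os.path.join(root, "sstable.db")    # hypothetical "original" file
snapshot = os.path.join(root, "snapshot.db")  # hypothetical snapshot entry

with open(sstable, "wb") as f:
    f.write(b"x" * 4096)

# A hard link is a second directory entry pointing at the same file data.
os.link(sstable, snapshot)
same_inode = os.stat(sstable).st_ino == os.stat(snapshot).st_ino
links_before = os.stat(snapshot).st_nlink

# Deleting the "original" (as happens after compaction) does not free the
# data, because the snapshot entry still references it.
os.remove(sstable)
size_after = os.path.getsize(snapshot)
links_after = os.stat(snapshot).st_nlink
print(same_inode, links_before, size_after, links_after)
```

The space is only reclaimed once the last remaining link, here the snapshot entry, is removed as well.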
My impression is that tablesnap is the most popular tool for automating backups to S3. It also supports Cassandra incremental backups. If you want more control over where you're backing up to, DataStax OpsCenter supports running a custom script when it takes snapshots.
I have configured three separate data directories in cassandra.yaml file as given below:
data_file_directories:
- E:/Cassandra/data/var/lib/cassandra/data
- K:/Cassandra/data/var/lib/cassandra/data
When I create a keyspace and insert data, the keyspace gets created in both directories and the data is scattered between them. What I want to know is how Cassandra splits the data between multiple directories, and what the rule behind this is.
You are using the JBOD feature of Cassandra when you add multiple entries under data_file_directories. Data is spread evenly over the configured drives proportionate to their available space.
This also lets you take advantage of the disk_failure_policy setting. You can read about the details here:
http://www.datastax.com/dev/blog/handling-disk-failures-in-cassandra-1-2
In short, you can configure Cassandra to keep going, doing what it can if the disk becomes full or fails completely. This has advantages over RAID0 (where you would effectively have the same capacity as JBOD) in that you do not have to replace the whole data set from backup (or full repair) but just run a repair for the missing data. On the other hand, RAID0 provides higher throughput (depending how well you know how to tune RAID arrays to match filesystem and drive geometry).
If you have the resources for a fault-tolerant or more performant RAID setup (like RAID10, for example), you may want to just use a single directory for simplicity. Most deployments are starting to lean towards the density route, though, using JBOD rather than systems-level tolerance.
You can read about the thought process behind the development of this feature here:
https://issues.apache.org/jira/browse/CASSANDRA-4292
I can more or less guess how a keyspace is split between multiple data directories: based on the maximum available space and the load on each directory, SSTables of the same column family are written to different data directories.
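As a very rough illustration of that guess, the sketch below picks, for each new file, whichever configured directory currently reports the most free space. This is not Cassandra's actual code: the function names are hypothetical, and it deliberately ignores the load on each directory.

```python
import shutil


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the filesystem that contains path."""
    return shutil.disk_usage(path).free


def pick_data_directory(directories, free_space=free_bytes):
    """Return the directory that currently reports the most free space."""
    if not directories:
        raise ValueError("at least one data directory is required")
    return max(directories, key=free_space)


# Usage with made-up free-space numbers instead of real disks.
fake_free = {
    "E:/Cassandra/data/var/lib/cassandra/data": 100,
    "K:/Cassandra/data/var/lib/cassandra/data": 300,
}
chosen = pick_data_directory(list(fake_free), free_space=fake_free.get)
print(chosen)
```

Repeating a choice like this for every new SSTable would be enough to explain why files of one column family end up in more than one directory.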