Cassandra directory setup of replica datacenter

I have a Cassandra cluster and plan to add a new datacenter to replicate data. There will be no writes to this datacenter, only reads.
My questions are:
In this case, is it still recommended to have separate drives for the commit log and data?
If I know that my cluster will receive data only via hints (and lots of them), should I create a separate disk for the hints? I did not find any mention of this.

In this case, is it still recommended to have separate drives for the commit log and data?
The whole idea of putting your commit log on a separate mount point goes back to spinning disks being a chokepoint for I/O. If your cluster's nodes are backed by SSDs, you shouldn't need to do that.
If I know that my cluster will receive data only via hints (and lots of them), should I create a separate disk for the hints?
Hints only build up when a node is down. When your writes happen, the coordinator (guided by the snitch) handles propagation to all of the required replicas. So no, I wouldn't worry about putting your hints directory on a separate mount point, either.
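For reference, all three locations are plain settings in cassandra.yaml, so if you ever do decide to split any of them out, it is just a matter of pointing the setting at a different mount point. A minimal sketch, where the /mnt/... paths are hypothetical examples, not defaults:

commitlog_directory: /mnt/commitlog/commitlog
data_file_directories:
    - /mnt/data/data
hints_directory: /mnt/hints/hints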

Related

How to make Cassandra nodes have the same data?

I have two computers, each one being a Cassandra node, and they communicate well with each other.
From what I understand, Cassandra will replicate the data between them but will always query certain portions from only one of them.
I would like, though, to have the data copied to both nodes so they hold the same data, but have each node read only its local data. Is that possible?
The background reason is that the application on each node keeps generating and downloading a lot of data, and at the same time both are doing some CPU-intensive tasks. What happens is that one node saves the data and suddenly can't find it anymore, because it was saved on the other node, which is too busy to reply with that data.
Technically, you just need to change the replication factor to the number of nodes, and set your application to always read from the local node using a whitelist load-balancing policy, as in the sketch below. But it may not help you, because if your nodes are very busy, replication of data from the other node may also not happen, so the query will fail as well. Or the replication will add additional overhead, making the situation much worse.
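As a concrete sketch of the first step, assuming a keyspace named my_ks that uses SimpleStrategy (both assumptions; adjust to your schema), you raise the replication factor to the node count and then repair so the pre-existing data is streamed to both nodes:

cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"
nodetool repair my_ks

After that each node holds a full copy, so a read at consistency level ONE coordinated by the local node can be served from local data.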
You need to rethink your approach - typically you should separate application nodes from database nodes, so that application processes don't affect database processes.

Apache Ignite 2.9.0: Recover from lost partitions

We have set up an Apache Ignite 2.9.0 cluster with native persistence, deployed with Kubernetes in Azure, with 4 nodes. To update some cache configuration, we restarted all the Ignite nodes. After the restart, running any SQL query on one particular table results in a restart of 2 Ignite nodes, and after that we see a lost partitions exception.
If we try to restart all nodes to recover from the lost partitions, everything is fine until we run any SQL query on that table, after which 2 nodes restart and we get the lost partitions exception again.
Is there any way we can recover from the lost partitions and overcome this problem? We also want to understand why it's occurring; we could not find any logs related to this.
When all owners of a partition have left the grid, the partition is considered lost; you might think of this as a special internal marker. Depending on the PartitionLossPolicy, Ignite might ignore this fact and allow cache operations, or disallow them to protect data consistency.
If you use native persistence, then most likely there was no physical data loss, and all you need is to tell Ignite that you are aware of the situation, that all data are in place, and that it's safe to remove the "lost" mark from the partitions.
I think the simplest way to handle this is to use the control script from within a pod:
control.sh --cache reset_lost_partitions cacheName1,cacheName2,...
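Since the nodes here run in Kubernetes, that usually means wrapping the command in kubectl exec; the pod name ignite-0 and the install path below are assumptions about your StatefulSet and image, not fixed values:

kubectl exec -it ignite-0 -- /opt/ignite/apache-ignite/bin/control.sh --cache reset_lost_partitions cacheName1,cacheName2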
More details:
https://ignite.apache.org/docs/latest/configuring-caches/partition-loss-policy#handling-partition-loss

Can we back up only one availability zone for an AZ-replicated Cassandra cluster?

Since my Cassandra cluster is replicated across three availability zones, I would like to back up only one availability zone to lower the backup costs. I have also experimented with restoring nodes in a single availability zone and got back most of my data in a test environment. I would like to know if there are any drawbacks to this approach before deploying it in production. Is anyone following this approach in your production clusters?
Note: As I back up at regular intervals, I know that I may lose updates that were acknowledged by a quorum on the other two AZs' nodes at the time of the snapshot, but that's not a problem.
You can back up only a specific DC, or even specific nodes.
AFAIK, the only drawback is whether your data is consistent/up-to-date, and since you can afford to lose some data, it shouldn't be a problem. And if you are, for example, performing writes at the ALL consistency level, the data should be up-to-date on all nodes.
BUT you must be sure that your data is indeed replicated across the availability zones, by configuring rack/DC properties or by using an EC2 snitch that supports multiple AZs.
EDIT:
Global Snapshot
Running nodetool snapshot only takes a snapshot on a single node at a time. This only creates a partial backup of your entire data. You will want to run nodetool snapshot on all of the nodes in your cluster. But it's best to run them at the exact same time, so that you don't have fragmented data from a time perspective. You can do this a couple of different ways. The first is to use a parallel ssh program to execute the nodetool snapshot command at the same time. The second is to create a cron job on each of the nodes to run at the same time. The second assumes that your nodes' clocks are in sync, which Cassandra relies on as well.
Link to the page:
http://datascale.io/backing-up-cassandra-data/
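To make that concrete, either approach from the quote is a one-liner; hosts.txt, the snapshot tag, and the keyspace name below are placeholders:

# Option 1: parallel ssh, same snapshot tag on every node at once
pssh -h hosts.txt -i "nodetool snapshot -t nightly my_keyspace"
# Option 2: identical crontab entry on every node (02:00 daily; clocks must be in sync)
0 2 * * * nodetool snapshot -t nightly my_keyspace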

Is it possible to recover a Cassandra node without a snapshot?

Offsite backups for Cassandra seem like a challenging thing. You basically have to make yet another copy of ALL your data, including the copies that exist due to the replication factor. Snapshots make backups easy when you don't mind storing them on the same disk that your node already uses. I'm curious - in the event of a catastrophic failure of this disk, is it possible to recover the node using the nodes that the data was replicated to?
Yes, you can restore data on a crashed node using the procedure in the documentation - Replacing a dead node or dead seed node. That page is for Cassandra 3.x; pick your Cassandra version from the drop-down menu at the top of the page.
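The heart of that procedure is starting a brand-new node that announces it is taking over the dead node's token ranges, which makes the surviving replicas stream their data to it. A minimal sketch, where 10.0.0.5 stands in for the dead node's address:

# On the replacement node, in cassandra-env.sh, before the first start:
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.5"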
But please note that you still need to do backups if your data is valuable. If you are using AWS, you can use this project to back up Cassandra to S3 storage.
If you are looking for offsite or off-host backups, you can also look at OpsCenter from DataStax or Talena software (my company). Both give you the ability to back up your database locally or to S3. As you might expect, you also get the ability to restore data in case of hardware failures, user errors, or logical corruption, which the replicas will not protect you against.
Yes, it is possible. Just execute "nodetool repair" in a terminal on the node with the missing data. It can take a lot of time. I would also recommend running a repair operation on each node every month to keep your data fully replicated, because Cassandra does not repair data automatically (for example, after a node goes down).
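A minimal sketch of that monthly routine (the keyspace name is a placeholder); the -pr flag restricts each run to the node's primary token ranges, so repairing every node in turn covers the whole ring without duplicate work:

nodetool repair -pr my_keyspace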

Should I use a different HDD for the Cassandra commit log and data?

I started learning Apache Cassandra. In conf/cassandra.yaml I noticed the commit log setting's comment, as follows:
commit log. when running on magnetic HDD, this should be a
separate spindle than the data directories.
If not set, the default directory is $CASSANDRA_HOME/data/commitlog.
Does that mean I should store the commit log on a different HDD than the data?
If yes, what's the reason behind this? And what will happen if I don't comply?
Thanks.
That is a recommendation from the days of spinning disks. Due to its log-based storage engine, Cassandra is very dependent on disk I/O. So it was recommended to have your commit log and data directories on separate disks to avoid a potential bottleneck (latency) caused by heavy disk activity.
If you are using solid state drives (SSDs, and with Cassandra you really should be) then you don't need to worry about this.
NOTE: This is also why using a NAS or SAN with Cassandra is considered to be an anti-pattern.
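If you want to see that contention for yourself on a shared disk, a simple way is to watch device utilization under write load; this is just standard iostat, nothing Cassandra-specific:

iostat -x 5    # watch %util and await climb as commit log appends compete with memtable flushes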
