Cassandra token assignment for multiple DC cluster

I am a little confused about token assignment for multiple DCs. When running nodetool ring, we can see that all the tokens are/need to be different, even for nodes in different DCs. Does that mean all nodes in the cluster form only one ring, or do each DC's nodes form a ring within their own DC?

That's right, the token range spans the entire cluster, so there will only be one primary node responsible for any piece of data.
Managing data across multiple datacenters is handled by specifying the desired replication strategy, e.g. NetworkTopologyStrategy.
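As a toy illustration of the single-ring idea, here is a minimal Python sketch; all token values and node names are made up (real Murmur3 tokens span -2^63 to 2^63-1), but it shows that nodes from both DCs sit on one ring and the primary replica is found by walking it clockwise:

```python
from bisect import bisect_left

# Hypothetical 6-node cluster: three nodes in each of two DCs.
# All tokens live on ONE ring, regardless of datacenter.
ring = sorted([
    (-9000, "dc1-node1"), (-6000, "dc2-node1"), (-3000, "dc1-node2"),
    (0, "dc2-node2"), (3000, "dc1-node3"), (6000, "dc2-node3"),
])
tokens = [t for t, _ in ring]

def primary_replica(partition_token):
    """The primary replica is the first node clockwise whose token is
    >= the partition's token, wrapping around at the end of the ring."""
    i = bisect_left(tokens, partition_token)
    return ring[i % len(ring)][1]

print(primary_replica(-7000))  # dc2-node1
print(primary_replica(7000))   # wraps around -> dc1-node1
```

Note that the primary replica for a given token can land in either DC; it is the replication strategy, not the ring itself, that guarantees copies in each datacenter.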

Each DC's nodes form a ring of their own. With multiple DCs, each DC has its own distinct partition ranges, independent of the others. As alec-collier mentioned, NetworkTopologyStrategy will figure out the replicas for a partition within each DC.

Related

How to generate tokens for murmur3Partitioner for multiple nodes?

I have a Cassandra Cluster with multiple datacenters and nodes
How can I generate token values for a multi-datacenter Cassandra cluster using Murmur3Partitioner?
I found a site which generates tokens for Murmur3Partitioner and RandomPartitioner, but only for 1 DC:
https://www.geroba.com/cassandra/cassandra-token-calculator/
I will update here if I find a better link.
https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configGenTokens.html
You just need to make sure that the token ranges are spread evenly across the nodes in each DC.
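The even spacing described in that DataStax page can be computed directly. A small sketch, assuming the standard Murmur3 token range; the per-DC offset of 100 is just an illustrative convention to keep every token in the cluster unique:

```python
def murmur3_tokens(node_count, dc_offset=0):
    """Evenly spaced initial_token values for Murmur3Partitioner
    (token range is -2**63 .. 2**63 - 1). Offsetting each DC by a
    small constant keeps tokens unique across the whole cluster."""
    return [((2**64 // node_count) * i) - 2**63 + dc_offset
            for i in range(node_count)]

# Hypothetical 2-DC cluster, 4 nodes per DC; offset the second DC by 100.
dc1 = murmur3_tokens(4, dc_offset=0)
dc2 = murmur3_tokens(4, dc_offset=100)
print(dc1)  # [-9223372036854775808, -4611686018427387904, 0, 4611686018427387904]
print(dc2)  # the same four values, each shifted by 100
```

Each node in a DC would then get one of that DC's values as its initial_token in cassandra.yaml.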

Creating Cassandra sub-clusters

I need to create K overlapping Cassandra clusters on N machines (K>>N). Each cluster can have between 1 and N nodes. I know that one way of doing this is to create a separate process (or Docker container) for each cluster a node is a member of.
My question, however, is: can I change Cassandra to allow the creation of sub-clusters? That is, there would be only one Cassandra instance running on each node, but I would be able to control data replication and data placement so that, within a sub-cluster, I could for example do a QUORUM write.
No, it's not possible to define sub-clusters as you describe - each Cassandra process belongs to exactly one cluster.
But Cassandra has the notion of a datacenter, which defines where a machine resides, and the keyspace, which defines how data is replicated between datacenters and nodes. Consistency levels such as QUORUM are evaluated against the keyspace's replication configuration.
In your case I would think in that direction - define datacenters, create the necessary keyspaces, and set up the correct replication factors for those keyspaces.
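For example, each "sub-cluster" could be modeled as a keyspace replicated only to the datacenters it should span. The datacenter and keyspace names below are hypothetical, and the replication factors are just for illustration:

```sql
-- Sub-cluster 1: data lives only in dc_a
CREATE KEYSPACE subcluster_1
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_a': 3};

-- Sub-cluster 2: data spans dc_a and dc_b
CREATE KEYSPACE subcluster_2
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_a': 3, 'dc_b': 2};
```

A QUORUM or LOCAL_QUORUM operation against subcluster_1 is then computed only over the replicas that keyspace defines.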

Working of vnodes in Cassandra

Can someone explain the working of vnodes allocation in Cassandra?
If we have a cluster of N nodes and a new node is added how are token ranges allocated to this new node?
Rebalancing a cluster is automatically accomplished when adding or removing nodes. When a node joins the cluster, it assumes responsibility for an even portion of data from the other nodes in the cluster. If a node fails, the load is spread evenly across other nodes in the cluster.
Here is some reading that might help you better understand how vnodes work and how ranges are being allocated - Virtual nodes in Cassandra 1.2
As I said above, Cassandra automatically handles the calculation of token ranges for each node in the cluster in proportion to their num_tokens value. Token assignments for vnodes are calculated by the org.apache.cassandra.dht.tokenallocator.ReplicationAwareTokenAllocator class.
When a new node joins the cluster, it will insert its own tokens into the ring and take over some ranges from the existing nodes.
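Here is a rough sketch of that range hand-off, using one token per node for simplicity (the token values are made up; with vnodes the same thing happens num_tokens times per joining node):

```python
def ranges(tokens):
    """Each node owns the range (previous_token, its_token],
    wrapping around the ring for the lowest token."""
    ts = sorted(tokens)
    return {t: (ts[i - 1], t) for i, t in enumerate(ts)}

before = ranges([-6000, 0, 6000])
# The node with token 0 owns (-6000, 0]

after = ranges([-6000, -3000, 0, 6000])  # a new node joins with token -3000
# The new node now owns (-6000, -3000]; the node at 0 keeps only (-3000, 0]
print(before[0], after[-3000], after[0])
```

The existing node streams the data for the surrendered sub-range to the newcomer; no other node's ranges are touched.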

What are the differences between a node, a cluster and a datacenter in a cassandra nosql database?

I am trying to duplicate data in a Cassandra NoSQL database for a school project using DataStax OpsCenter. From what I have read, there are three keywords: cluster, node, and datacenter. From what I have understood, the data in a node can be duplicated in another node that exists in another cluster, and all the nodes that contain the same (duplicated) data compose a datacenter. Is that right?
If it is not, what is the difference?
The hierarchy of elements in Cassandra is:
Cluster
Data center(s)
Rack(s)
Server(s)
Node (more accurately, a vnode)
A Cluster is a collection of Data Centers.
A Data Center is a collection of Racks.
A Rack is a collection of Servers.
A Server contains 256 virtual nodes (or vnodes) by default.
A vnode is the data storage layer within a server.
Note: A server is the Cassandra software. A server is installed on a machine, where a machine is either a physical server, an EC2 instance, or similar.
Now to specifically address your questions.
An individual unit of data is called a partition. And yes, partitions are replicated across multiple nodes. Each copy of the partition is called a replica.
In a multi-data center cluster, the replication is per data center. For example, if you have a data center in San Francisco named dc-sf and another in New York named dc-ny then you can control the number of replicas per data center.
As an example, you could set dc-sf to have 3 replicas and dc-ny to have 2 replicas.
Those numbers are called the replication factor. You would specifically say dc-sf has a replication factor of 3, and dc-ny has a replication factor of 2. In simple terms, dc-sf would have 3 copies of the data spread across three vnodes, while dc-ny would have 2 copies of the data spread across two vnodes.
While each server has 256 vnodes by default, Cassandra is smart enough to pick vnodes that exist on different physical servers.
To summarize:
Data is replicated across multiple virtual nodes (each server contains 256 vnodes by default)
Each copy of the data is called a replica
The unit of data is called a partition
Replication is controlled per data center
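The per-datacenter bookkeeping above can be sketched in a few lines; the replication factors come from the dc-sf / dc-ny example, while the node counts are assumptions for illustration:

```python
# Per-DC replication factors from the dc-sf / dc-ny example above.
replication = {"dc-sf": 3, "dc-ny": 2}
nodes_per_dc = {"dc-sf": 5, "dc-ny": 4}  # assumed cluster sizes

# Every partition gets RF copies in each DC, so cluster-wide:
total_replicas = sum(replication.values())  # 5

# Sanity check: a DC cannot hold more replicas than it has nodes.
for dc, rf in replication.items():
    assert rf <= nodes_per_dc[dc], f"RF {rf} exceeds node count in {dc}"

print(total_replicas)  # 5
```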
A node is a single machine that runs Cassandra. A collection of nodes holding similar data are grouped in what is known as a "ring" or cluster.
Sometimes if you have a lot of data, or if you are serving data in different geographical areas, it makes sense to group the nodes of your cluster into different data centers. A good use case of this is an e-commerce website which may have many frequent customers on the east coast and the west coast. That way your customers on the east coast connect to your east coast DC (for faster performance), but ultimately have access to the same dataset (both DCs are in the same cluster) as the west coast customers.
More information on this can be found here: About Apache Cassandra- How does Cassandra work?
And all the nodes that contains the same (duplicated) data compose a datacenter. Is that right?
Close, but not necessarily. The level of data duplication you have is determined by your replication factor, which is set on a per-keyspace basis. For instance, let's say that I have 3 nodes in my single DC, all storing 600GB of product data. My products keyspace definition might look like this:
CREATE KEYSPACE products
WITH replication = {'class': 'NetworkTopologyStrategy', 'MyDC': '3'};
This will ensure that my product data is replicated equally to all 3 nodes. The size of my total dataset is 600GB, duplicated on all 3 nodes.
But let's say that we're rolling out a new, fairly large product line, and I estimate that we're going to have another 300GB of data coming, which may start pushing the max capacity of our hard drives. If we can't afford to upgrade all of our hard drives right now, I can alter the replication factor like this:
ALTER KEYSPACE products
WITH replication = {'class': 'NetworkTopologyStrategy', 'MyDC': '2'};
This will create 2 copies of all of our data, and store it in our current cluster of 3 nodes. The size of our dataset is now 900GB, but since there are only two copies of it (each node is essentially responsible for 2/3 of the data) our size on-disk is still 600GB. The drawback here is that (assuming I read and write at a consistency level of ONE) I can only afford to suffer a loss of 1 node. Whereas with 3 nodes and a RF of 3 (again reading and writing at consistency ONE), I could lose 2 nodes and still serve requests.
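The disk-usage arithmetic in that example is just dataset size times replication factor, spread over the node count:

```python
def disk_per_node(dataset_gb, rf, nodes):
    """Total stored data = dataset size x replication factor,
    divided evenly across the nodes."""
    return dataset_gb * rf / nodes

# Original situation: 600 GB dataset, RF 3, 3 nodes -> 600 GB per node
print(disk_per_node(600, 3, 3))  # 600.0

# After growth and lowering RF: 900 GB dataset, RF 2, 3 nodes -> still 600 GB per node
print(disk_per_node(900, 2, 3))  # 600.0
```

In practice the spread is only approximately even, since it depends on token balance, but the proportionality holds.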
Edit 20181128
When I make a network request, am I making it against the server or the node? Or, if I make a request against the server, does it then route it and read from the node, or something else?
So real quick explanation: server == node
As far as making a request against the nodes in your cluster, that behavior is actually dictated from the driver on the application side. In fact, the driver maintains a copy of the current network topology, as it reads the cluster gossip similar to how the nodes do.
On the application side, you can set a load balancing policy. Specifically, the TokenAwareLoadBalancingPolicy class will examine the partition key of each request, figure out which node(s) has the data, and send the request directly there.
For the other load balancing policies, or for queries where a single partition key cannot be determined, the request will be sent to a single node. This node will act as a "coordinator." This chosen node will handle the routing of requests to the nodes responsible for them, as well as the compilation/returning of any result sets.
Node:
A machine which stores some portion of your entire database. This may include data replicated from other nodes as well as its own data. What data it is responsible for is determined by its token ranges and the replication strategy of the keyspace holding the data.
Datacenter:
A logical grouping of nodes which can be separated from other nodes. A common use case is AWS-EAST vs AWS-WEST. The NetworkTopologyStrategy replication strategy is used to specify how many replicas of a keyspace should exist in any given datacenter. This is how Cassandra users achieve cross-DC replication. In addition, there are consistency levels (the LOCAL_* levels) that require acknowledgement only from replicas within the coordinator's datacenter.
Cluster
The sum total of all the machines in your database including all datacenters. There is no cross-cluster replication.
As per below documents:-
https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/architecture/archIntro.html
Node
Where you store your data. It is the basic infrastructure component of Cassandra.
Datacenter
A collection of related nodes. A datacenter can be a physical datacenter or virtual datacenter. Different workloads should use separate datacenters, either physical or virtual. Replication is set by datacenter. Using separate datacenters prevents Cassandra transactions from being impacted by other workloads and keeps requests close to each other for lower latency. Depending on the replication factor, data can be written to multiple datacenters. Datacenters must never span physical locations.
Cluster
A cluster contains one or more datacenters. It can span physical locations.

How to separate ring from cluster in cassandra

We have a Cassandra DSE cluster with 10 nodes in a Cassandra ring and 10 nodes in a Hadoop ring. The application writes data to the Cassandra ring, and Cassandra replicates the data to the Hadoop ring.
We want to separate the two rings and make them two different clusters, with the application writing data to both clusters at the same time.
How do we separate the cluster? Is that possible?
We have ~600GB of data in the cluster and we cannot delete it.
You should test this first, but this basic procedure should work. It will need some tweaking if you have counters.
Set your application writing to both DCs using LOCAL_QUORUM.
Run repair on the whole cluster. This is to ensure each DC has a copy of the data.
Isolate the clusters so the two DCs can't talk to each other, probably using a firewall.
Assuming your DCs are DC1 and DC2, change your replication factor to be DC2:0 on DC1 and DC1:0 on DC2.
On each DC, run 'nodetool removenode' for each node in the other DC. This will just remove the DOWN nodes from the ring but won't have any effect on the data, because the other DC now has a replication factor of zero.
This should work with zero data loss.
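As a rough command-line sketch of steps 2, 4, and 5 (the keyspace name my_ks, the RF of 3, and the host ID are placeholders; run the ALTER on each side after the DCs are isolated, dropping the *other* DC to 0):

```shell
# Step 2: repair the whole cluster so each DC holds a full copy of the data
nodetool repair -full my_ks

# Step 4 (on DC1): drop DC2's replicas; do the mirror image on DC2
cqlsh -e "ALTER KEYSPACE my_ks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 0};"

# Step 5: remove the now-unreachable nodes of the other DC from the ring
nodetool removenode <host-id-of-dc2-node>
```

The host ID argument for removenode comes from 'nodetool status', which lists the Host ID of every node, including DOWN ones.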
