Cassandra: Who creates and distributes virtual nodes among nodes - a leader?

In Cassandra, virtual nodes are created and distributed among nodes as described in http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2. But who carries out that process - creating the virtual nodes and distributing them among peers? Is there some sort of leader? How does it work?
Also, when a node joins, virtual nodes are redistributed, and there are many more similar actions. Who performs all of those?
Edit: Is it that when a node joins, it takes over some of the virtual nodes from the existing cluster, thus eliminating the need for a leader?

The new node retrieves information about the cluster from the seed nodes.
The new node takes its share of the cluster based on the num_tokens parameter (by default the token ranges are distributed evenly among all existing nodes) and bootstraps its portion of the data.
The rest of the cluster becomes aware of the change through "gossiping" - the gossip protocol.
Apart from the seed nodes, there's no need for any "master" in the process.
Old nodes will not delete partitions automatically; you need to run nodetool cleanup on the old nodes after adding a new node.
Here's a good article about it:
http://cassandra.apache.org/doc/latest/operating/topo_changes.html
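As a minimal sketch (cluster name, IP addresses and token count are just example values), the relevant cassandra.yaml settings on the joining node look roughly like this; once bootstrap finishes, nodetool cleanup is run on the old nodes:

cluster_name: 'MyCluster'          # must match the existing cluster
num_tokens: 256                    # how many virtual nodes (token ranges) this node takes
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1,10.0.0.2" # existing nodes used to learn the cluster topology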

Related

CouchDB cluster database distribution

We have a CouchDB cluster with 24 nodes, with q=8 and n=3 as the default cluster settings, and 100 databases already created. If we added 24 more nodes to the cluster and started creating new databases, would they be created on the new nodes, or not necessarily? How does CouchDB decide where to put new databases?
We are running all nodes on 2.3.1.
By default CouchDB will assign the shards for databases across all the nodes in the cluster randomly, so your new databases will have shards on both the new nodes and the old ones. It will spread the shards across as many nodes as possible, and guarantee that two replicas of a shard are never co-located on the same node.
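For example (host and credentials are placeholders), you can override q and n per database at creation time and then inspect where the shards actually landed:

curl -X PUT 'http://admin:password@localhost:5984/newdb?q=8&n=3'
curl 'http://admin:password@localhost:5984/newdb/_shards'   # shows which nodes host each shard range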
If you wanted to have shards hosted only on the new nodes (e.g. because the old ones are filling up) you could take advantage of the “placement” feature. There are two steps involved:
Walk through the documents in the _nodes database and set a "zone" attribute on each document, e.g. "zone": "old" for your old nodes and "zone": "new" for the new ones (see the sketch after this answer).
Define a [cluster] placement config setting that tells the server to place 3 copies of each shard in your new zone:
[cluster]
placement = new:3
You could also use this feature for other purposes; for example, by splitting your nodes up into zones based on rack location, availability zone, etc. to ensure that replicas have maximum isolation from each other. You can read more about the setting here:
http://docs.couchdb.org/en/2.3.1/cluster/databases.html#placing-a-database-on-specific-nodes
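As a rough sketch of the first step (the node name and revision are placeholders), a node document in the _nodes database would end up looking something like:

{
  "_id": "couchdb@node25.example.com",
  "_rev": "1-abc123",
  "zone": "new"
}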

How cassandra improve performance by adding nodes?

I'm going to build an Apache Cassandra 3.11.x cluster with 44 nodes. Each application server will host one cluster node so that the application does reads/writes locally.
I have a couple of questions in mind; kindly answer if possible.
1. How many server IPs should be mentioned in the seed node parameter?
2. How does HA work when all the mentioned seed nodes go down?
3. What is the disadvantage of mentioning all the server IPs in the seed node parameter?
4. How does Cassandra scale with respect to data, other than via the primary key and tunable consistency? My assumption is that the replication factor can improve HA but not performance,
so how does performance increase by adding more nodes?
5. Is there any sharding mechanism in Cassandra?
Answers are in order:
1. It's recommended to point to at least 2 seed nodes per DC.
2. A seed/contact node is used only for the initial bootstrap - when your program reaches any of the listed nodes, it "learns" the topology of the whole cluster, and the driver then listens for node status changes and adjusts its list of available hosts. So even if the seed node(s) go down after the connection is established, the driver will still be able to reach the other nodes.
3. It's usually harder to maintain - you need to keep the driver's configuration and the list of nodes in sync.
4. When you have RF > 1, Cassandra may read or write data from/to any replica; the consistency level regulates how many nodes must acknowledge a read or write operation. When you add a new node, part of the data is redistributed to it, and if you have chosen your partition key well, the new node starts receiving requests in parallel with the old ones - that is how adding nodes improves performance.
5. The partition key determines which replica(s) will hold the data associated with it - you can think of a partition as a shard (see the sketch below). But you need to be careful when choosing the partition key - it's easy to create partitions that are too big, or partitions that are "hot" (receiving most of the operations in the cluster - for example, if you use the date as the partition key and always read and write today's data).
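As an illustrative sketch (the table and column names are made up), the partition key is the first component of the primary key; here sensor_id spreads rows and requests across the nodes, whereas keying only on the current date would funnel all traffic into a single hot partition:

CREATE TABLE readings (
    sensor_id uuid,          -- partition key: decides which replicas own the row
    reading_time timestamp,  -- clustering column: orders rows within the partition
    value double,
    PRIMARY KEY (sensor_id, reading_time)
);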
P.S. I would recommend reading the DataStax Architecture guide - it contains a lot of information about Cassandra as well.

What happens when all seed nodes in Cassandra are down? Can new nodes join the cluster at that point?

What happens when all seed nodes in Cassandra are down? Can new nodes join the cluster at that point ?
This is from the Cassandra docs:
The ring can operate or boot without a seed; however, you will not be able to add new nodes to the cluster. It is recommended to configure multiple seeds in production system.
Here is the link http://cassandra.apache.org/doc/latest/faq/index.html#does-single-seed-mean-single-point-of-failure
The seed nodes are the initial point of contact for bootstrapping nodes. If you have a cluster of, say, 10 nodes, you might ideally have 3 of them as seeds. Once the bootstrapping node contacts a seed, it will start to gossip with the other nodes.
There is nothing special about a seed node in terms of its functionality; it operates exactly the same as the other nodes (seed nodes do gossip more often though - see the doc link below).
So if by chance your 3 seed nodes were down, you could just add the IP of any other node in the cluster under the seeds: parameter in the new node's cassandra.yaml file, and you'll still be able to bootstrap (see the sketch below).
It is of course nice to have all nodes using the same seed list, for the sake of configuration consistency.
https://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__seed_provider
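As a minimal sketch (the IP address is a placeholder for any live node in the cluster), the relevant part of the new node's cassandra.yaml would be:

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.7"   # any reachable, already-joined node works as the contact point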

Ability to write to a particular cassandra node

Is there a way to write to a particular node using the DataStax driver?
For example, I have three nodes in datacenter 1 and three nodes in datacenter 2.
Existing
If I build the cluster with any one of them as the seed, all the nodes get detected by the DataStax Java driver. So in this case, if I insert data using the driver, it automatically chooses one of the nodes and uses it as the coordinator (preferably in the local data center).
Requirement
I want a way to contact a node in datacenter 2 and hand the coordinator job to one of the nodes in datacenter 2.
Why I need this
I am trying to use the trigger functionality from datacenter 2 alone. Since triggers are handled by the coordinator, I want the coordinator to be selected from datacenter 2 so that datacenter 1 doesn't have to do this operation.
You may be able to use the DCAwareRoundRobinPolicy load balancing policy to achieve this by creating the policy such that DC2 is considered the "local" DC.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("dc2"));
In the above example, remote (non-DC2) nodes will be ignored.
There is also a new WhiteListPolicy in driver version 2.0.2 that wraps another load balancing policy and restricts the nodes to a specific list you provide.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new WhiteListPolicy(new DCAwareRoundRobinPolicy("dc2"), whiteList));
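For completeness, whiteList in the snippet above is just a collection of socket addresses for the DC2 nodes; a rough sketch (addresses and port are hypothetical) could be:

// uses java.net.InetSocketAddress and java.util.Arrays; 9042 is the default native protocol port
Collection<InetSocketAddress> whiteList = Arrays.asList(
        new InetSocketAddress("192.168.2.11", 9042),
        new InetSocketAddress("192.168.2.12", 9042));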
For multi-DC scenarios, Cassandra provides EACH and LOCAL consistency levels (e.g. EACH_QUORUM, LOCAL_QUORUM), where EACH requires a successful acknowledgement in each DC and LOCAL only in the local one.
If I understood correctly, what you are trying to achieve is DC failover in your application. This is not good practice. Let's assume your application is hosted in DC1 alongside Cassandra. If DC1 goes down, your entire application is unavailable. If DC2 goes down, your application can still write with a LOCAL consistency level, and C* will replicate the changes when DC2 comes back.
If you want to achieve HA, you need to deploy the application in each DC, use a LOCAL_* consistency level, and do failover at the DNS level (e.g. using AWS Route 53).
See data consistency docs and this blog post for more info about consistency levels for multiple DCs.
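As a small sketch (the keyspace, table, and session setup are assumed), the per-statement way to request a LOCAL consistency level with the Java driver looks roughly like this:

// 'session' obtained from cluster.connect(); ks.users is a made-up table
Statement stmt = new SimpleStatement("INSERT INTO ks.users (id, name) VALUES (1, 'alice')")
        .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);  // only local-DC replicas must acknowledge
session.execute(stmt);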

Adding a new node to existing cluster

Is it possible to add a new node to an existing cluster in Cassandra 1.2 without running nodetool cleanup on each individual node once data has been added?
It probably isn't, but I need to ask because I'm trying to create an application where each user's machine is a server, allowing for endless scaling.
Any advice would be appreciated.
Yes, it is possible. But you should be aware of the side-effects of not doing so.
nodetool cleanup purges keys that are no longer allocated to that node. According to the Apache docs, these keys count against the allocated data for that node, which can cause the auto bootstrap process for the next node to not properly balance the ring. So depending on how you are bringing new user machines into the ring, this may or may not be a problem.
Also keep in mind that nodetool cleanup only needs to be run on the nodes that lost token ranges to the new node - i.e. the adjacent nodes, not all nodes in the cluster.
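For example (the keyspace name is a placeholder), after the new node has finished bootstrapping you would run on each affected node:

nodetool cleanup my_keyspace   # rewrites SSTables, dropping data the node no longer owns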
