I'm having trouble understanding the purpose of the gossip protocol in Apache Cassandra. Why does Cassandra use gossip as its P2P communication protocol?
Given that Apache Cassandra uses the gossip protocol, what is it used for? Is it only for exchanging node state in the cluster, to find out whether a node is UP or DOWN? Or is it also used to exchange node information such as memory usage and disk capacity?
Gossip is used to broadcast members' state around the cluster. Part of the information exchanged:
status
health
tokens
schema version
addresses
data size
Note: there might be other details that I have missed. Another resource that you can consult is https://wiki.apache.org/cassandra/ArchitectureGossip
The gossip protocol is the internal communication technique that nodes in a cluster use to talk to each other. Gossip is an efficient, lightweight, reliable inter-node broadcast protocol for diffusing data. It is a decentralized, "epidemic", fault-tolerant, peer-to-peer communication protocol. Cassandra uses gossip for peer discovery and metadata propagation.
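To make the "epidemic" idea concrete, here is a minimal simulation sketch. It is not Cassandra's implementation: the function name `gossip_rounds`, the `fanout` parameter, and the push-only model are assumptions chosen for illustration. It only shows why information spreading from peer to peer reaches every node in a small number of rounds.

```python
import random


def gossip_rounds(num_nodes, fanout=1, seed=0):
    """Count the rounds needed for one piece of state to reach every node.

    Toy push-gossip model (an assumption, not Cassandra's algorithm):
    in each round, every node that already knows the state tells
    `fanout` randomly chosen other nodes.
    """
    rng = random.Random(seed)
    informed = {0}  # node 0 is the only one that knows the state at first
    rounds = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in sorted(informed):
            others = [n for n in range(num_nodes) if n != node]
            newly_informed.update(rng.sample(others, fanout))
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, "nodes ->", gossip_rounds(size), "rounds")
```

In this model the number of informed nodes can at most double per round, so the round count grows roughly with the logarithm of the cluster size rather than with the size itself.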
Typically, port 9042 is given along with the Cassandra cluster's seed hosts so that clients can connect to the cluster and perform CRUD operations.
Does the Cassandra Java client driver have any knowledge of port 7000 (used for peer communication) after the client establishes a connection with the cluster?
Thanks,
Deepak
The Java driver doesn't make use of the internode communication port 7000 because it doesn't need to participate in gossip with the nodes in the cluster.
Instead, the Java driver establishes a control connection with one of the nodes the first time it connects to the cluster. The driver uses the control connection (1) to query the system tables to discover the cluster's topology and schema, and (2) to listen for topology and schema changes.
It is through point (2) that the driver recognises when nodes are added to or decommissioned from the cluster, and finds out when the schema has been updated. This is why the driver doesn't need to gossip with the nodes.
For more information, see Control connection for the Java driver. Cheers!
No, the Java driver doesn't know and should not know about in-cluster node-to-node communications. Why should it, anyway?
I have four Cassandra nodes deployed, and a Java application that acts as a client to the Cassandra cluster. I want to see if I can use different network interfaces for inter-node communication and for data transfer.
Can you shed some light on this?
Yes, you can do this. For inter-node communication you can specify the IP or interface via listen_address (or listen_interface, but not both together), and for client-to-Cassandra communication via rpc_address (or rpc_interface).
Depending on the topology of your cluster, you may also need to set broadcast_address and broadcast_rpc_address.
I can't understand the difference between Snitch and Gossip in Cassandra, and I can't find even one source which has discussed the subject, let alone providing a good answer. Seems to me that Snitch and Gossip are both inter-node communication protocols; so why do we need 2 of them?
I know that Gossip helps a node to get information from bootstrap nodes, but that doesn't really explain the difference since when a node starts, it needs to learn about the data centers and racks as well which is supposed to be the domain of the Snitch.
Gossip is a protocol, and the snitch is a component that uses it. The snitch does a little more than gossip: it applies heuristics such as identifying data centers and racks, while gossip is the convenient tool that carries this information. Almost all gossip does is spread state around the cluster, following rules that make sure all necessary nodes are covered, and collect technical data such as IP addresses and health. The snitch uses this information to do something more. One of its features is identifying different data centers by analyzing the IP addresses received. Other components then use this information for further decisions such as replica placement. So this functionality was given a separate name to identify it; it is really all about layering the functionality.
Some relevant information also can be found here: https://books.google.ru/books?id=h36CCwAAQBAJ&pg=PT21&lpg=PT21&dq=snitch+gossip&source=bl&ots=fjxy_z78Gj&sig=KpqdkKaREIo2YAWyJj3yMZCyNn4&hl=ru&sa=X&ved=0ahUKEwiUktS8q8zWAhWIQZoKHTViD0U4ChDoAQhUMAc#v=onepage&q=snitch%20gossip&f=false
And here is a more detailed snitch definition (but in scylla): https://github.com/scylladb/scylla/wiki/Snitches
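As an illustration of "identifying data centers by analyzing IPs", here is a small sketch of the convention used by Cassandra's RackInferringSnitch, where the second octet of a node's IP address is taken as the data center and the third octet as the rack. The function names `infer_location` and `group_by_dc` are hypothetical and exist only for this example; other snitches obtain the same information from files or cloud APIs instead.

```python
from collections import defaultdict


def infer_location(ip):
    """Return (datacenter, rack) inferred from an IPv4 address.

    Follows the rack-inferring convention: second octet = data center,
    third octet = rack. Raises ValueError for anything that is not a
    dotted quad.
    """
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return octets[1], octets[2]


def group_by_dc(ips):
    """Group node addresses by their inferred data center."""
    topology = defaultdict(list)
    for ip in ips:
        dc, _rack = infer_location(ip)
        topology[dc].append(ip)
    return dict(topology)


if __name__ == "__main__":
    nodes = ["10.20.30.40", "10.20.31.41", "10.21.30.42"]
    print(group_by_dc(nodes))
```

The point is the layering: gossip only delivers the addresses, while the snitch turns them into a data center and rack topology that replica placement can use.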
Gossip is used to identify the state of machines (are they in the cluster, up/down/joining/leaving).
The snitches help map ownership to an actual machine, and route queries (given these 10 nodes in the cluster, which of the 10 own the data for a given key).
Different snitches can help assign data in different ways - the simple snitch just places all instances into datacenter1/rack1, and uses the simple distributed hashtable / naive partitioner placement. The property file snitch lets you create a file that has all of the instances, and maps the instance to a datacenter/rack, ensuring that replicas always exist on different racks (and datacenters, as defined by the replication strategy).
The gossiping-property-file-snitch and the ec2 snitches are somewhat like the property file snitch in that they're rack/topology aware, but they read the local instance topology information (either from a file or from the ec2 apis) and then gossip it to others, so each node is responsible for broadcasting its own topology information (through gossip).
Gossip is an epidemic protocol that spreads through the cluster. It transmits cluster metadata, i.e. the state of the cluster.
The following information is shared as part of gossip:
Generation: when the node booted
Version: a counter that increases as the node's state changes
Application state:
Status: normal/joining/leaving
DC: data center location
Rack: rack of this node
Schema: schema version on the node
Load: amount of data on disk on the node
Severity: the pressure on the system from an I/O standpoint
etc.
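The generation and version fields are what let a node decide whose information is fresher when two views of the cluster are compared. The sketch below assumes a simplified model in which a higher generation wins, and within the same generation a higher version wins. The names `EndpointState`, `is_newer`, and `merge` are hypothetical, and real Cassandra tracks versions in more detail than this.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointState:
    generation: int  # set when the node boots
    version: int     # grows as the node's state changes
    app_state: dict = field(default_factory=dict)  # e.g. STATUS, DC, RACK


def is_newer(remote, local):
    """True if `remote` is fresher: later boot first, then higher version."""
    return (remote.generation, remote.version) > (local.generation, local.version)


def merge(local_view, remote_view):
    """Update local_view in place with any fresher entries from remote_view."""
    for endpoint, remote in remote_view.items():
        local = local_view.get(endpoint)
        if local is None or is_newer(remote, local):
            local_view[endpoint] = remote
    return local_view


if __name__ == "__main__":
    mine = {"10.0.0.1": EndpointState(100, 5, {"STATUS": "NORMAL"})}
    theirs = {
        "10.0.0.1": EndpointState(100, 7, {"STATUS": "LEAVING"}),
        "10.0.0.2": EndpointState(90, 1, {"STATUS": "JOINING"}),
    }
    print(merge(mine, theirs))
```

Because a restarted node comes back with a higher generation, its fresh state replaces whatever older state other nodes still hold for it, even if the old version counter was higher.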
In other words, the snitch helps map IPs to racks and data centers. It creates a topology by grouping nodes, which helps determine where data is read from. When a read request comes in, it reaches the coordinator node; the consistency level of the read request and the read_repair_chance for that column family decide how the snitch steps in. Only one node sends back the full requested data (the others return digests), and it is up to the snitch to determine which one.
I am new to Cassandra and learning it.
My question is how communication between nodes works in Cassandra:
1. Basic communication: failure detection and so on
2. Data transmission from node to node and to clients
3. Any other type of communication
The answer to the first one is the gossip protocol: http://www.datastax.com/resources/faq
But I am curious about the protocol and methodology Cassandra uses to transfer data from one node to another or to a client.
Communication between nodes is through gossip, as you stated.
Failure detection is also done through gossip: each node tracks the gossip messages arriving from the other nodes. If messages from a node stop arriving for long enough, that node is considered dead. The sensitivity is configurable in cassandra.yaml; look for the phi_convict_threshold setting.
I am not sure what Cassandra uses for data transfer; most probably simple layers built over TCP. One of the major features of Cassandra is that you don't have to worry about how it handles replication; you only have to think about the strategy.
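The phi convict threshold mentioned above belongs to an accrual failure detector: instead of counting missed messages, it turns "how long since the last heartbeat, relative to the usual gap" into a suspicion value phi and convicts the node once phi crosses the threshold. The following is a simplified sketch under the assumption that heartbeat gaps are exponentially distributed. The class name `PhiAccrualDetector` and its methods are hypothetical, and the default of 8 matches the documented default of phi_convict_threshold.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Toy accrual failure detector for a single remote node."""

    def __init__(self, threshold=8.0, window=1000):
        self.threshold = threshold
        self.intervals = deque(maxlen=window)  # recent gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: -log10 of the chance that the node is merely late."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # With exponential gaps, P(gap > elapsed) = exp(-elapsed / mean).
        return elapsed / (mean * math.log(10))

    def is_alive(self, now):
        return self.phi(now) < self.threshold


if __name__ == "__main__":
    detector = PhiAccrualDetector()
    for t in range(11):  # one heartbeat per second, from t=0 to t=10
        detector.heartbeat(float(t))
    print(detector.phi(11.0), detector.is_alive(11.0))
    print(detector.phi(30.0), detector.is_alive(30.0))
```

Raising the threshold makes the detector more tolerant of slow or jittery networks, at the cost of taking longer to notice a node that is really down.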
Cassandra inter-node communication is separate from communication between nodes and clients.
Gossip is used so that nodes are aware of failures (the client is not involved).
Data transmission needs to be split in two: nodes communicate and send data to each other over the storage_port (see cassandra.yaml; default port 7000), while clients connect to port 9042 (or 9160 for old Thrift clients) and communicate using a Cassandra-specific binary protocol specified here: https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v3.spec
Other communication you might care about is JMX, which nodetool uses.
More details here: http://www.datastax.com/documentation/cassandra/2.1/cassandra/security/secureFireWall_r.html
It's my understanding that Cassandra's snitch protocol enables all nodes to maintain a picture of the network topology (i.e., reachability) for all nodes in the cluster. The application I'm working on also needs to know the state of the network topology of the nodes on which Cassandra is running. Is the topological information that is computed in Cassandra exposed in any way via an API, or do I have to re-create this topology myself? I'd prefer not to have to re-invent the wheel to support this functionality. What's my best option for achieving this?