Does the Cassandra Java driver have knowledge of port 7000? - cassandra

Typically, port 9042 is provided along with the Cassandra cluster seed hosts to connect to the cluster and perform CRUD operations.
Does the Cassandra Java client driver have any knowledge of port 7000 (used for peer communication) after the client establishes a connection with the cluster?
Thanks,
Deepak

The Java driver doesn't make use of the internode communication port 7000 because it doesn't need to participate in gossip with the nodes in the cluster.
Instead, the Java driver establishes a control connection with one of the nodes the first time it connects to the cluster. The driver uses the control connection (1) to query the system tables to discover the cluster's topology and schema, and (2) to listen for topology and schema changes.
It is through point (2) that the driver recognises when nodes are added to or decommissioned from the cluster, and finds out when the schema has been updated. This is the reason the driver doesn't need to gossip with the nodes.
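To make point (1) concrete, here is a simplified illustration (not the driver's actual code) of turning rows from the system.peers table into client endpoints. The sample rows, the Node class and the nodes_from_peer_rows helper are hypothetical, and the sketch assumes every node listens for clients on the same port as the contact point. Note that nothing in it involves port 7000.

```python
from dataclasses import dataclass

NATIVE_PORT = 9042  # client port from the question; assumed identical on every node


@dataclass(frozen=True)
class Node:
    address: str
    port: int
    datacenter: str
    rack: str


def nodes_from_peer_rows(rows, native_port=NATIVE_PORT):
    """Turn system.peers-style rows into client endpoints.

    Rows without an rpc_address are skipped, since a client cannot reach them.
    """
    nodes = []
    for row in rows:
        rpc_address = row.get("rpc_address")
        if not rpc_address:
            continue
        nodes.append(Node(rpc_address, native_port, row["data_center"], row["rack"]))
    return nodes


# Hypothetical rows, shaped like the result of:
#   SELECT peer, rpc_address, data_center, rack FROM system.peers
sample_rows = [
    {"peer": "10.0.0.2", "rpc_address": "10.0.0.2", "data_center": "dc1", "rack": "r1"},
    {"peer": "10.0.0.3", "rpc_address": "10.0.0.3", "data_center": "dc1", "rack": "r2"},
    {"peer": "10.0.0.4", "rpc_address": None, "data_center": "dc1", "rack": "r2"},
]

for node in nodes_from_peer_rows(sample_rows):
    print(f"{node.address}:{node.port} ({node.datacenter}/{node.rack})")
```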
For more information, see Control connection for the Java driver. Cheers!

No, the Java driver doesn't know and should not know about in-cluster node-to-node communications. Why should it, anyway?

Related

How to use Datastax java driver for knowing a node is down in the Cassandra cluster?

How can we use the DataStax Java driver to find out which nodes are down in a Cassandra cluster? Is the driver's metadata updated continuously, or do we have to register listeners?
The driver is kept informed of node status by the cluster (it does not take part in gossip itself). If a node is down, the driver will know and will not route traffic to it. There is no need to engineer anything yourself.

How to use different network interface for Cassandra inter-node communication?

I have four Cassandra nodes deployed, and a Java application that acts as a client to the Cassandra cluster. I want to know whether I can use different network interfaces for inter-node communication and for client data transfer.
Can you shed some light on this?
Yes, you can do this. For inter-node communication, specify the IP or interface via listen_address (or listen_interface, but not both) in cassandra.yaml; for client-to-Cassandra communication, use rpc_address (or rpc_interface).
You may also need to set broadcast_address and broadcast_rpc_address, depending on the topology of your cluster.

how to connect to cassandra cluster using cqlsh or command prompt?

In other databases, we connect to the cluster through a load balancer IP. How do we connect to a Cassandra cluster from the command line? What socket is used? Is it always a single node and IP?
What if I connect to node1 and node1 goes down? Will the connection automatically move to node2 or node3?
You have several options. The easiest is the Cassandra Query Language Shell (cqlsh), a Python-based CQL interpreter for interacting with Cassandra. It usually comes with every Cassandra installation, under the /bin folder of the installation directory. If you have SSH access to one of the nodes Cassandra is running on, this is an easy option (you avoid any issues with firewalls blocking incoming connections to your cluster).
You can also use cqlsh to access the cluster remotely:
cqlsh node_ip 9042
but this will require cqlsh to be present on your machine.
In general, Cassandra uses an initial set of contact nodes and a gossip protocol to discover the cluster's composition. A node will be assigned as the coordinator for your query. You need not worry about some seed nodes being down, provided that at least one is up and running.
Another option for accessing the cluster remotely is DataStax DevCenter, which is a free-to-use graphical interface for executing CQL queries.
Hope this helps

what is the use of gossip protocol in apache cassandra?

I have trouble understanding the purpose of the gossip protocol in Apache Cassandra. Why does Cassandra use gossip as its P2P communication protocol?
Is it only used to exchange node states in the cluster, to find out whether a node is UP or DOWN? Or is it also used to exchange node information such as memory usage and disk capacity?
Gossip is used to broadcast members' state around the cluster. Part of the information exchanged:
status
health
tokens
schema version
addresses
data size
Note: there might be other details that I have missed. Another resource that you can consult is https://wiki.apache.org/cassandra/ArchitectureGossip
The Gossip protocol is the internal communication technique for nodes in a cluster to talk to each other. Gossip is an efficient, lightweight, reliable inter-nodal broadcast protocol for diffusing data. It's decentralized, "epidemic", fault tolerant and a peer-to-peer communication protocol. Cassandra uses gossiping for peer discovery and metadata propagation.
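The way state spreads "epidemically" can be sketched with a toy model. This is only an illustration under simplifying assumptions, not Cassandra's implementation: each node holds a view mapping endpoint to a (generation, version, status) tuple, and the merge and gossip_round helpers are hypothetical. Real gossip picks peers at random; fixed pairs are used here so the run is repeatable.

```python
def merge(local, remote):
    """Merge a peer's view into ours, keeping the newer entry per endpoint.

    Each view maps endpoint -> (generation, version, status); a larger
    (generation, version) pair is treated as more recent.
    """
    merged = dict(local)
    for endpoint, state in remote.items():
        if endpoint not in merged or state[:2] > merged[endpoint][:2]:
            merged[endpoint] = state
    return merged


def gossip_round(views, pairs):
    """Exchange state between the given (a, b) node pairs, in both directions."""
    for a, b in pairs:
        combined = merge(views[a], views[b])
        views[a] = dict(combined)
        views[b] = dict(combined)


# Initially every node only knows about itself.
views = {
    "n1": {"n1": (1, 5, "NORMAL")},
    "n2": {"n2": (1, 3, "NORMAL")},
    "n3": {"n3": (1, 7, "NORMAL")},
}

gossip_round(views, [("n1", "n2")])
gossip_round(views, [("n2", "n3")])
gossip_round(views, [("n3", "n1")])

for name, view in sorted(views.items()):
    print(name, "knows about", sorted(view))
```

After three pairwise exchanges every node knows about every other node, even though n1 and n3 only talked to each other in the last round.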

Data transmission between nodes and client cassandra

I am new to Cassandra and learning it.
So my question is: how is communication done between nodes in Cassandra?
1. Basic communication: failure detection and so on
2. Data transmission from node to node and to the client
3. Any other type of communication
The answer to the first one is the gossip protocol: http://www.datastax.com/resources/faq
But I am a little curious about the protocol and methodology Cassandra uses to transfer data from one node to another or to a client.
Communication between nodes is through gossip, as you stated.
Failure detection is also driven by gossip: each node tracks the arrival of gossip heartbeats from the other nodes and derives a suspicion level (phi) for each of them. When the phi value for a node exceeds a threshold, the node is considered dead. The threshold is configurable in the cassandra.yaml file; look for phi_convict_threshold.
I am not sure what Cassandra uses for data transfer; most probably it is a simple layer built over TCP. One of the major features of Cassandra is that you don't have to worry about how it handles replication; you only have to think about the strategy.
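The phi-based failure detection mentioned above can be sketched roughly as follows. This is a minimal model, not Cassandra's code: it assumes heartbeat gaps follow an exponential distribution, so phi is the elapsed time since the last heartbeat divided by (mean interval × ln 10). The PhiFailureDetector class and its method names are hypothetical; the default threshold of 8 matches the documented default of phi_convict_threshold.

```python
import math
from collections import deque


class PhiFailureDetector:
    """Toy accrual failure detector for a single remote node."""

    def __init__(self, threshold=8.0, window=1000):
        self.threshold = threshold
        self.intervals = deque(maxlen=window)  # recent gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: -log10 of the chance the node is merely late."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        return elapsed / (mean * math.log(10))

    def is_alive(self, now):
        return self.phi(now) <= self.threshold


detector = PhiFailureDetector()
for t in (0.0, 1.0, 2.0, 3.0):  # one heartbeat per second
    detector.heartbeat(t)

print(detector.is_alive(4.0))   # shortly after the last heartbeat
print(detector.is_alive(25.0))  # after a long silence
```

A higher threshold makes the detector slower to convict a node, which helps on networks with irregular latency.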
Cassandra's inter-node communication is separate from communication between nodes and clients.
Gossip is used so that nodes are aware of failures (the client is not involved).
This needs to be split: nodes communicate and send data to each other over the storage_port (see cassandra.yaml; the default port is 7000), while clients connect to port 9042 (or 9160 for old Thrift clients) and communicate using Cassandra's own binary protocol, specified here: https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v3.spec
Other communication you might care about is JMX, which nodetool uses.
More details here: http://www.datastax.com/documentation/cassandra/2.1/cassandra/security/secureFireWall_r.html
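To give a feel for the client-side binary protocol, here is a small sketch that builds and parses the 9-byte frame header described in the v3 spec linked above (version, flags, stream id, opcode, body length). It only constructs bytes and does not open a connection; the build_frame and parse_header helpers are hypothetical names for illustration.

```python
import struct

PROTOCOL_VERSION = 0x03  # request direction, native protocol v3
OPCODE_OPTIONS = 0x05    # asks the server which options it supports; empty body

# version (1 byte), flags (1), stream id (2, signed), opcode (1), body length (4)
HEADER = struct.Struct(">BBhBi")


def build_frame(opcode, stream_id=0, body=b"", flags=0x00):
    """Serialize a request frame: header followed by the body."""
    return HEADER.pack(PROTOCOL_VERSION, flags, stream_id, opcode, len(body)) + body


def parse_header(data):
    """Decode the header fields from the first 9 bytes of a frame."""
    version, flags, stream_id, opcode, length = HEADER.unpack(data[:HEADER.size])
    return {
        "is_response": bool(version & 0x80),  # high bit marks server responses
        "version": version & 0x7F,
        "flags": flags,
        "stream": stream_id,
        "opcode": opcode,
        "length": length,
    }


frame = build_frame(OPCODE_OPTIONS, stream_id=1)
print(frame.hex())
print(parse_header(frame))
```

A client would send such a frame to port 9042; nothing in this protocol is spoken on the storage_port.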
