cassandra-stress: how to install and set it up outside the Cassandra cluster

I am about to use a simple Cassandra cluster (3 nodes, x.x.x.104-106). I'm on CentOS 7, so I installed Cassandra 3.0 from the DataStax repository.
I read on a forum that it is better to install cassandra-stress outside the cluster; otherwise it consumes CPU on the node.
Could you please help me install it?
I tried copying cassandra-stress.sh on its own, but it depends on some Cassandra files (probably created during installation).
So I decided to install the whole of Cassandra on a separate server in the same network. Now I'm struggling with the correct setup for running the cassandra-stress tool against the Cassandra cluster.
In cassandra.yaml I set the cluster name, listen_address to the public IP, rpc_address to the loopback address, and seeds to the cluster nodes (x.x.x.104-106)... but in general it does not make sense to set this up, since I don't want to create another node in the Cassandra cluster.
Could you please help me?
Edit: Maybe using something like this might be the correct way?
cassandra-stress user profile=/usr/cassandra/stress-file.yaml ops(insert=1,books=1) n=10000 -node x.x.x.104,x.x.x.105,x.x.x.106 -port native= ?
telnet [cassandra_node_ip_address] 7000 works fine.

If your Cassandra cluster is running with the proper ports open (by default 9042 for clients and 7199 for JMX), and you have a Cassandra installation directory on a different machine, you should be able to run cassandra-stress from outside the cluster simply by passing the -node option with the IP of one of the nodes in your cluster (say x.x.x.104). For example,
$CASSANDRA_HOME/tools/bin/cassandra-stress write -node x.x.x.104
should work. You can see more options with
$CASSANDRA_HOME/tools/bin/cassandra-stress help
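Before starting a stress run, it can help to confirm from the stress machine that the ports mentioned above are actually reachable. The following is a minimal sketch using only the Python standard library; the function names port_is_open and check_node are made up for this illustration, and a successful TCP connection only shows that the port is open, not that Cassandra is healthy.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def check_node(host, ports=(9042, 7199)):
    """Map each port to whether it accepted a TCP connection from this machine.

    The defaults are the client (9042) and JMX (7199) ports named in the answer.
    """
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_node with the address of one cluster node returns a dictionary such as {9042: True, 7199: False}, which points at the port that still needs to be opened in the firewall.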

On every node:
in cassandra.yaml, set rpc_address to the node's IP address
in cassandra-env.sh, set LOCAL_JMX=no and set the JMX option com.sun.management.jmxremote.authenticate=false
open port 7199 in the firewall
restart the firewall and Cassandra
On the cassandra-stress server:
cassandra-stress user profile=/usr/cassandra/stress-books.yaml ops\(insert=1,books=1\) n=10000 -node 172.16.20.104,172.16.20.105,172.16.20.106 -port native=9042 thrift=9160 jmx=7199
Note: with this configuration, JMX communication is not secured.
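If the stress run is launched from a script, the command can be assembled as an argument list so that the parentheses in ops(...) do not need shell escaping. This is only a sketch: build_stress_command is a hypothetical helper, it covers just the user-profile form shown in this answer, and it leaves out the thrift and jmx port settings.

```python
import shlex

def build_stress_command(profile, ops, n, nodes, native_port=9042):
    """Return the cassandra-stress argument list for a user-profile run.

    ops maps operation names from the profile to their weights,
    for example {"insert": 1, "books": 1}.
    """
    ops_spec = ",".join(f"{name}={weight}" for name, weight in ops.items())
    return [
        "cassandra-stress", "user",
        f"profile={profile}",
        f"ops({ops_spec})",
        f"n={n}",
        "-node", ",".join(nodes),
        "-port", f"native={native_port}",
    ]

cmd = build_stress_command(
    profile="/usr/cassandra/stress-books.yaml",
    ops={"insert": 1, "books": 1},
    n=10000,
    nodes=["172.16.20.104", "172.16.20.105", "172.16.20.106"],
)
# shlex.join shows the shell-quoted form, handy for logging or copy-paste.
print(shlex.join(cmd))
```

The list can be passed directly to subprocess.run(cmd, check=True) on a machine where cassandra-stress is on the PATH.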

Related

how to run cassandra repair/compact from one node in a cluster

I want to run repair/compact operations from one Cassandra cluster node instead of scheduling them on every node in the cluster.
I am using Cassandra 3.
"nodetool -h NODEIP repair keyspace" does not work if I specify another node in the cluster. The command only works for the local node on which I run it. Please suggest a way to run repair/compaction for all nodes from one node in the cluster.
Thanks
By default, JMX authentication is disabled and JMX is accessible only from localhost. Because nodetool uses JMX to communicate with Cassandra, it only works against the local node unless remote JMX access is enabled, which is normally done together with JMX authentication.
See this DataStax page on how to enable JMX authentication.
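Once remote JMX access is enabled on every node, running repair from a single machine comes down to calling nodetool once per host. A minimal sketch, assuming nodetool is on the PATH; nodetool_repair_commands is a hypothetical helper, and the host addresses and keyspace name in the usage note are placeholders.

```python
def nodetool_repair_commands(hosts, keyspace, jmx_port=7199,
                             username=None, password=None):
    """Build one 'nodetool repair' argument list per host.

    If JMX authentication is enabled, pass both username and password;
    they are forwarded through nodetool's -u and -pw options.
    """
    if (username is None) != (password is None):
        raise ValueError("username and password must be given together")
    commands = []
    for host in hosts:
        cmd = ["nodetool", "-h", host, "-p", str(jmx_port)]
        if username is not None:
            cmd += ["-u", username, "-pw", password]
        cmd += ["repair", keyspace]
        commands.append(cmd)
    return commands
```

Each list can be run in turn with subprocess.run(cmd, check=True), for example for hosts ["10.0.0.1", "10.0.0.2"] and keyspace "my_keyspace". Running the nodes one after another keeps only one repair active at a time.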

how to connect to cassandra cluster using cqlsh or command prompt?

In other databases, we connect to the cluster through a load-balancer IP. How do we connect to a Cassandra cluster from the command line? What socket is used? Is it always a single node and IP?
What if I connect to node1 and node1 goes down? Will the connection automatically move to node2 or node3?
You have several options. The easiest one is to use the Cassandra Query Language Shell (cqlsh), a Python-based CQL interpreter for interacting with Cassandra. It usually comes with every Cassandra installation, under the /bin folder of the installation directory. If you have SSH access to one of the nodes Cassandra is running on, this can be an easy option (you avoid any issues with firewalls blocking incoming connections to your cluster).
You can also use cqlsh to access the cluster remotely:
cqlsh node_ip 9042
but this requires cqlsh to be installed on your machine.
In general, Cassandra uses an initial set of contact nodes and a gossip protocol to learn the cluster composition. One node is assigned as the coordinator for your query. You need not worry about some seed nodes being down, provided that at least one is up and running.
Another option for accessing the cluster remotely is DataStax DevCenter, which is a free-to-use graphical interface for executing CQL queries.
Hope this helps

Cassandra multinode cluster setup issue

I am trying to set up a Cassandra multinode cluster on CentOS 7 with OpenJDK.
I have 2 Nodes:
node1 10.99.189.49
node2 10.99.189.50
So far I have done the following:
Downloaded the Cassandra tarball from the PlanetCassandra site.
Extracted it in the Documents folder.
Created all the necessary directories (data/saved_cache, data/commitlog, data/data) as mentioned in the YAML file.
I have also made 3 changes in my conf/cassandra.yaml file, as follows:
On node 10.99.189.49:
seeds: "10.99.189.49"
listen_address: 10.99.189.49
rpc_address: 10.99.189.49
On node 10.99.189.50:
seeds: "10.99.189.49"
listen_address: 10.99.189.50
rpc_address: 10.99.189.50
Now I run cassandra on node 10.99.189.49
and then I run cassandra on the other node.
Cassandra starts normally on both the nodes
BUT
when I do:
bin/nodetool status
I can see only one node in it.
Can anyone point out what I am doing wrong or missing?
So I started adding tips in the comments, and for my 3rd time around I thought I'd start putting them all together in an actual answer.
DataStax does a pretty good job documenting how this should work. Make sure that you've gone through these docs (specifically the first one) and that you're following all the steps:
Initializing a multiple node cluster (single data center)
Adding nodes to an existing cluster
In addition to everything you have mentioned above, make sure that the cluster_name is the same on each node.
I find it easier to make this work using the GossipingPropertyFileSnitch. Set that in your cassandra.yaml on each node:
endpoint_snitch: GossipingPropertyFileSnitch
Then make sure that each of your nodes is specifying the same default data center in the cassandra-rackdc.properties file:
dc=DC1
Get your first node (.49) up-and-running. Verify it with nodetool status.
Also verify that you have opened the necessary ports in your firewall. From .49, try telnetting to the other node on the ports that Cassandra requires. I recommend 7000, as that is the port for non-SSL inter-node communication.
telnet 10.99.189.50 7000
Once you're sure all that works and everything is configured properly, then bring up .50. I remember reading that you should wait at least 2 minutes before bringing up another node, so do that just to be on the safe side. Tail the logs to make sure it handshakes with the other node, or to see any errors:
tail -f /var/log/cassandra/system.log
Notes: Your log location may vary. I'm assuming you're running 2.2. If you are using a different version of Cassandra, please indicate it.
Hope this helps!
On both nodes, use
seeds: "10.99.189.49,10.99.189.50"
and then restart Cassandra on both nodes.
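To keep the per-node settings from these answers consistent, they can be generated rather than edited by hand. The sketch below prints only the values to set; it is not a drop-in cassandra.yaml, because in the real file the seeds entry sits nested under seed_provider. The helper name node_settings and the cluster name used in the usage note are assumptions for illustration.

```python
def node_settings(cluster_name, seeds, node_ip,
                  snitch="GossipingPropertyFileSnitch"):
    """Return the cassandra.yaml values discussed above for one node, as text.

    cluster_name, seeds and snitch must be identical on every node;
    only listen_address and rpc_address differ per node.
    """
    seed_list = ",".join(seeds)
    lines = [
        f"cluster_name: '{cluster_name}'",
        f'seeds: "{seed_list}"',
        f"listen_address: {node_ip}",
        f"rpc_address: {node_ip}",
        f"endpoint_snitch: {snitch}",
    ]
    return "\n".join(lines)
```

For example, node_settings("Test Cluster", ["10.99.189.49"], "10.99.189.50") gives the values for the second node, with the first node as the only seed.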

How Can I run more than one cassandra server in single machine and form one cluster ring?

I would like to know whether there is any way to run multiple Cassandra servers on a single machine, so that all the servers on that machine form one ring (cluster).
There's always a way!
There is an excellent tool available, ccm, that lets you configure a multi-node cluster locally, but it is currently not supported on Windows. When you build a cluster and start it, ccm configures the ring for you. You can check the ring with ./nodetool -h 127.0.0.1 -p 7100 ring after it has started.
Just a side note: the ccm tool starts the cluster as a background process.

install multi-node cassandra in windows

Is there a detailed step-by-step document covering multi-node Cassandra installation on Windows? I read some documents and blogs and tried on Windows 7 workstations and Windows 2008 servers, but I was not able to establish a connection from the 2nd node to the 1st node.
When I was setting up my first cluster on windows I found this blogpost to be excellent. It covers many aspects of the setup including:
Firewall / Networking issues.
Running Cassandra as a service.
Monitoring and maintenance.
If you want to create a complete setup using just Cassandra, have a look at this blog.
But to set up a multi-node cluster, you basically need to have the correct ports open on your servers. As for configuration, you will have essentially identical cassandra.yaml files across all your nodes, with the same seeds list. The only two fields that need to change are listen_address and possibly rpc_address, although you could simply listen on all interfaces for rpc_address by setting it to:
rpc_address: 0.0.0.0
