Astyanax Cassandra driver removes host unexpectedly during batch inserts - cassandra

Astyanax 1.56.37 connecting to Cassandra 1.2.6 running on Debian:
When performing a number of inserts in quick succession against a Cassandra cluster containing only one node, located at 10.10.1.141, I see the following in the console at seemingly random points:
- AddHost: 127.0.0.1
- RemoveHost: 10.10.1.141
Every attempt to connect to this keyspace after this happens fails with the same message.
Here is my configuration:
AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
.forCluster("Titan Cluster")
.forKeyspace(keyspaceName)
.withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
.setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE)
.setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
.setTargetCassandraVersion("1.2")
)
.withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
.setPort(9160)
.setMaxConnsPerHost(50)
.setSeeds("10.10.1.141:9160")
.setConnectTimeout(2000)
.setSocketTimeout(30000)
.setMaxTimeoutWhenExhausted(10000)
.setMaxTimeoutCount(3)
.setTimeoutWindow(10000)
.setLatencyAwareBadnessThreshold(10)
.setLatencyAwareUpdateInterval(1000)
.setLatencyAwareResetInterval(10000)
.setLatencyAwareWindowSize(100)
)
.withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
.buildKeyspace(ThriftFamilyFactory.getInstance());
context.start();
The connection fails on subsequent attempts at context.start().

I faced the same issue, with Cassandra and the application (the Cassandra client) running on different machines.
AddHost: 10.10.1.141
AddHost: 127.0.0.1
RemoveHost: 10.10.1.141
When I checked my Cassandra ring status, I noticed that Cassandra was running with the address 127.0.0.1 instead of 10.10.1.141:
root@10.10.1.141:/opt/dsc-cassandra$ bin/nodetool ring
Address Rack Status State Load Owns Token
127.0.0.1 rack1 Up Normal 169.87 KB 100.00% -9217929600007243236
127.0.0.1 rack1 Up Normal 169.87 KB 100.00% -9140762708880451456
127.0.0.1 rack1 Up Normal 169.87 KB 100.00% -8952943573583903866
127.0.0.1 rack1 Up Normal 169.87 KB 100.00% -8891950316930533160
In conf/cassandra.yaml, I had specified the hostname instead of the IP address for listen_address. Cassandra resolved the hostname to localhost (127.0.0.1) instead of the actual IP (10.10.1.141).
After changing listen_address to the actual IP, the client established the connection successfully:
listen_address: 10.10.1.141
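After making the change, restart Cassandra and re-check the ring; the Address column should now show the real IP. A sketch, assuming Cassandra is installed as a service (use your install's start script otherwise):
$ sudo service cassandra restart
$ bin/nodetool ring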

I was running Cassandra in VirtualBox on Windows, so the IP was something like 192.168.0.14, and for me, using NodeDiscoveryType.NONE prevented the disconnections:
AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
.forCluster(clusterName)
.forKeyspace(keyspaceName)
.withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
.setDiscoveryType(NodeDiscoveryType.NONE)
)
.withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
.setPort(9160)
.setMaxConnsPerHost(3)
.setSeeds("192.168.0.14:9160")
)
.withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
.buildKeyspace(ThriftFamilyFactory.getInstance());
context.start();
Keyspace keyspace = context.getClient();
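For context, here is a minimal sketch of the kind of rapid-fire batch insert the question describes, assuming a hypothetical column family named "MyColumnFamily" with string row keys and column names (ColumnFamily, StringSerializer, MutationBatch, and ConnectionException come from the com.netflix.astyanax packages):
ColumnFamily<String, String> cf = new ColumnFamily<String, String>(
        "MyColumnFamily",        // hypothetical column family name
        StringSerializer.get(),  // row key serializer
        StringSerializer.get()); // column name serializer
MutationBatch batch = keyspace.prepareMutationBatch();
batch.withRow(cf, "rowKey1")
        .putColumn("column1", "value1")
        .putColumn("column2", "value2");
try {
    batch.execute();
} catch (ConnectionException e) {
    // once the pool has removed its only host, this typically fails
    // (e.g. with NoAvailableHostsException) until the context is rebuilt
    e.printStackTrace();
}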

Related

How to get rid of a node that shows as null after being replaced?

I had a 3-node cluster (Cassandra 3.9); one node went dead.
I built a new node from scratch and "replaced" the dead node using the information from this page: https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsReplaceNode.html.
It looked like the replacement went ok.
I added two more nodes to strengthen the cluster.
A few days have passed, and the dead node is still visible and marked as down on 3 of the 5 nodes in nodetool status:
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.9 16 GiB 256 35.0% 76223d4c-9d9f-417f-be27-cebb791cddcc rack1
UN 192.168.1.12 16.09 GiB 256 34.0% 719601e2-54a6-440e-a379-c9cf2dc20564 rack1
UN 192.168.1.14 14.16 GiB 256 32.6% d8017a03-7e4e-47b7-89b9-cd9ec472d74f rack1
UN 192.168.1.17 15.4 GiB 256 34.1% fa238b21-1db1-47dc-bfb7-beedc6c9967a rack1
DN 192.168.1.18 24.3 GiB 256 33.7% null rack1
UN 192.168.1.22 19.06 GiB 256 30.7% 09d24557-4e98-44c3-8c9d-53c4c31066e1 rack1
Its host ID is null, so I cannot use nodetool removenode. Moreover,
nodetool assassinate 192.168.1.18 fails with:
error: null
-- StackTrace --
java.lang.NullPointerException
And in system.log:
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:39:38,595 Gossiper.java:585 - Sleeping for 30000ms to ensure /192.168.1.18 does not change
INFO [CompactionExecutor:547] 2019-03-27 17:39:38,669 AutoSavingCache.java:393 - Saved KeyCache (27316 items) in 163 ms
INFO [IndexSummaryManager:1] 2019-03-27 17:40:03,620 IndexSummaryRedistribution.java:75 - Redistributing index summaries
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,597 Gossiper.java:1029 - InetAddress /192.168.1.18 is now DOWN
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,599 StorageService.java:2324 - Removing tokens [-1061369577393671924,...]
In system.peers, the dead node shows up and has the same host ID as the replacing node:
cqlsh> select peer, host_id from system.peers;
peer | host_id
--------------+--------------------------------------
192.168.1.18 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
192.168.1.22 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
192.168.1.9 | 76223d4c-9d9f-417f-be27-cebb791cddcc
192.168.1.14 | d8017a03-7e4e-47b7-89b9-cd9ec472d74f
192.168.1.12 | 719601e2-54a6-440e-a379-c9cf2dc20564
The dead node and the replacing node have different tokens in system.peers.
So my questions are:
Could you explain what is wrong?
How can I fix this and get rid of this dead node?
From system.peers you have the host ID, so try nodetool removenode or nodetool assassinate with that host ID:
peer | host_id
--------------+--------------------------------------
192.168.1.18 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
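A sketch of those commands, run from any live node (removenode takes the host ID; removenode force can finish a stuck removal):
nodetool removenode 09d24557-4e98-44c3-8c9d-53c4c31066e1
nodetool removenode status
nodetool removenode force
Keep in mind the caveat from the question: this ID is also the live node 192.168.1.22's host ID, so check nodetool status afterwards to confirm the right endpoint was removed.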

Cassandra multinode cluster setup issue (for example, 3 nodes)

I keep getting the error below when I try to run a single-node or multi-node Cassandra cluster.
A single-node cluster with the default config works fine; however, nodetool status shows 127.0.0.1 as the IP address.
After changing listen_address: 192.168.1.143 (this is my IP address) in the cassandra.yaml file, I get the error below.
Exception (java.lang.RuntimeException) encountered during startup: Unable to gossip with any peers
java.lang.RuntimeException: Unable to gossip with any peers
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1443)
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:547)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:804)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:664)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:613)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:379)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691)
Well, after trying different approaches, I was finally able to resolve it and run both single-node and 3-node clusters.
Below are the configuration changes you need to make in the cassandra.yaml file on each node.
First Node
--------------
listen_address: 192.168.1.143 (This should be your server/node IP)
seeds: "192.168.1.143" (For your first node please mention your node IP address)
Second Node
---------------
listen_address: 192.168.1.144 (This should be your server/node IP)
seeds: "192.168.1.143" (specify your first node IP, additionally, you can also mention current IP address ,192.168.1.144)
Third Node
---------------
listen_address: 192.168.1.145 (This should be your server/node IP)
seeds: "192.168.1.143" (specify your first/second node IP, additionally, you can also mention current IP address ,192.168.1.145)
After starting Cassandra on all 3 servers, nodetool status returned the following:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.143 258.83 KiB 256 100.0% 7b3a0644-c8dd-4a47-9186-0237f3725941 rack1
UN 192.168.1.144 309.71 KiB 256 100.0% e7a11a60-d795-47ee-8d21-7cc21b4cbdca rack1
UN 192.168.1.145 309.71 KiB 256 100.0% b2a4545a-f279-r5h7-2fy6-23dk8fg5c8kq rack1
Hope this helps!!
Yes, when joining a Cassandra cluster for the first time, you should start the seed node first, then the other nodes.
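As a sketch, assuming Cassandra is installed as a service on each box:
# on the seed node (192.168.1.143) first
sudo service cassandra start
# wait until nodetool status shows it as UN, then on 192.168.1.144
sudo service cassandra start
# and finally on 192.168.1.145
sudo service cassandra start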

Cassandra is not running as a service

The system is Ubuntu 14.04.1 x86_64 with 200 GB of disk and 8 GB of memory. Everything was done as both root and a regular user. We installed Cassandra 3.6.0 from DataStax using the following commands (following the instructions from this website: http://docs.datastax.com/en/cassandra/3.x/cassandra/install/installDeb.html):
$ apt-get update
$ apt-get install datastax-ddc
However, Cassandra does not start as a service.
root@e7:~# nodetool status
nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused'.
root@e7:~# service cassandra start
root@e7:~# service cassandra status
* Cassandra is not running
We can start Cassandra manually using the command:
$ cassandra -R -f
...
INFO 18:45:02 Starting listening for CQL clients on /127.0.0.1:9042 (unencrypted)...
INFO 18:45:02 Binding thrift service to /127.0.0.1:9160
INFO 18:45:02 Listening for thrift clients...
INFO 18:45:12 Scheduling approximate time-check task with a precision of 10 milliseconds
root@e7:~# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 153.45 KiB 256 100.0% 28ba16df-1e4c-4a40-a786-ebee140364bf rack1
However, we need to start Cassandra as a service. Any suggestions on how to fix the problem?
Try using http://docs.datastax.com/en/cassandra/3.0/cassandra/install/installDeb.html instead.
This is more stable, and I have tried it.
I think the ports are not open.
Try opening the following ports:
Cassandra inter-node ports
Port number Description
7000 Cassandra inter-node cluster communication.
7001 Cassandra SSL inter-node cluster communication.
7199 Cassandra JMX monitoring port.
Cassandra client port
Port number Description
9042 Cassandra client port.
9160 Cassandra client port (Thrift).
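If a firewall is blocking them, then on Ubuntu (as in this question) with ufw active, they could be opened like this, for example:
sudo ufw allow 7000/tcp
sudo ufw allow 7001/tcp
sudo ufw allow 7199/tcp
sudo ufw allow 9042/tcp
sudo ufw allow 9160/tcp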
Also, what type of snitch is defined in the cassandra.yaml file?

Cassandra 3.4 on VirtualBox not starting

I am using macOS. I created 3 VMs with VirtualBox and installed the CentOS 7 minimal version on each of them.
Then I installed Cassandra on each of the boxes. After installation it started fine, and both cqlsh and the nodetool status command worked.
But then, when I tried to link the nodes to each other by editing the cassandra.yaml file, it started showing:
('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
I've edited the cassandra.yaml file as follows:
cluster_name: 'Home Cluster'
num_tokens: 256
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
- seeds: "192.168.56.102,192.168.56.103"
storage_port: 7000
listen_address: 192.168.56.102
rpc_address: 192.168.56.102
rpc_port: 9160
endpoint_snitch: SimpleSnitch
My /etc/hosts file contains:
192.168.56.102 node01
192.168.56.103 node02
192.168.56.104 node03
Please tell me what I'm doing wrong. My Cassandra cluster is not working.
Solution: I got the solution from AKKI. The problem was endpoint_snitch. I set endpoint_snitch: GossipingPropertyFileSnitch and that fixed it. My output is now as follows:
[root@dbnode2 ~]# nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.56.101 107.38 KB 256 62.5% 0526a2e1-e6ce-4bb4-abeb-b9e33f72510a rack1
UN 192.168.56.102 106.85 KB 256 73.0% 0b7b76c2-27e8-490f-8274-571d00e60c20 rack1
UN 192.168.56.103 83.1 KB 256 64.5% 6c8d80ec-adbb-4be1-b255-f7a0b63e95c2 rack1
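For reference, GossipingPropertyFileSnitch reads each node's datacenter and rack from conf/cassandra-rackdc.properties; the dc1/rack1 values in the output above match that file's defaults:
# conf/cassandra-rackdc.properties
dc=dc1
rack=rack1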
I had faced a similar problem, and I tried the following solution:
In cassandra.yaml, check that you have
start_rpc: true
Changed my endpoint snitch to
endpoint_snitch: GossipingPropertyFileSnitch
Opened all the ports Cassandra uses on my CentOS
Cassandra inter-node ports
Port number Description
7000 Cassandra inter-node cluster communication.
7001 Cassandra SSL inter-node cluster communication.
7199 Cassandra JMX monitoring port.
Cassandra client port
Port number Description
9042 Cassandra client port.
9160 Cassandra client port (Thrift).
Command to open a port on CentOS 7 (find the equivalent for your OS):
sudo firewall-cmd --zone=public --add-port=9042/tcp --permanent
sudo firewall-cmd --reload
Then restart your systems.
Also, it seems that you are changing the cassandra.yaml file after starting Cassandra.
Make sure you edit the cassandra.yaml file on all nodes before starting Cassandra.
Also remember to start the seed node first.

setting up cassandra multi node cluster: 'Nodes have the same token 0'

I'm trying to set up a Cassandra multi-node cluster on my computer just to test, but it doesn't seem to work... The Cassandra version is 1.1 and it runs on Ubuntu.
First of all, I've modified the cassandra.yaml file for each node as follows:
node0
initial_token: 0
seeds: "127.0.0.1"
listen_address: 127.0.0.1
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
node1
same as node0 except for:
initial_token: 28356863910078205288614550619314017621 (obtained using the Cassandra token generator)
listen_address: 127.0.0.2
After that, I started the seed node 127.0.0.1 first and, once it was up, I started the other node 127.0.0.2. I got the following:
[...]
INFO 06:09:27,146 Listening for thrift clients...
INFO 06:09:27,909 Node /127.0.0.1 is now part of the cluster
INFO 06:09:27,911 InetAddress /127.0.0.1 is now UP
INFO 06:09:27,913 Nodes /127.0.0.1 and /127.0.0.2 have the same token 0. Ignoring /127.0.0.1
Running nodetool -h localhost ring shows:
Address: 127.0.0.2
DC: datacenter1
Rack: rack1
Status: Up
State: Normal
Load: 11,21 KB
Owns: 100,00%
Token: 0
As you can see, only the information for the second node is shown, owning 100% of the ring. Indeed, the token is initialized to 0 instead of the value I defined in its cassandra.yaml file.
The gossip Info is:
/127.0.0.2
LOAD:25559.0
STATUS:NORMAL,0
SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
RELEASE_VERSION:1.1.6-SNAPSHOT
RPC_ADDRESS:0.0.0.0
/127.0.0.1
LOAD:29859.0
STATUS:NORMAL,0
SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
RELEASE_VERSION:1.1.6-SNAPSHOT
RPC_ADDRESS:0.0.0.0
Does anyone know what is happening and how I can fix it?
Thank you so much in advance!!
initial_token is only checked at first startup, when it is written to a system table. Delete the system table files and restart.
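A sketch of that, assuming the default data directory from cassandra.yaml (adjust the path if each of your local nodes uses its own directories; this wipes locally stored cluster metadata, which is acceptable on a throwaway test cluster):
$ sudo service cassandra stop
$ sudo rm -rf /var/lib/cassandra/data/system
$ sudo service cassandra start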
