Cassandra load is high on one of the nodes

I have an 8-node Cassandra cluster (Cassandra 2.0.8). When I run nodetool status, I see the following. I am a newbie and wondering why the load on one of the nodes (that node is my initial seed node) is so much higher than on the others.
I also noticed that when I try to push data into a Cassandra table (column family) using PIG, that one node runs at very high CPU (95%+) while the others do not (20-30%).
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN xxx.xxx.xx.xxx 15.55 MB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 36.89 MB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 3.77 GB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 1.04 GB 256 56.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 43.49 MB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 40.36 MB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 43.69 MB 256 6.2% ------------------------------------ rack1
UN xxx.xxx.xx.xxx 40.23 MB 256 6.2% ------------------------------------ rack1
Any help is appreciated. Thank you.

You mentioned that you are pushing data through PIG. If so, are you using Cassandra's Hadoop support?
If yes, it is likely that your splits are causing this.
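For reference, two checks that may help. The note in your output says the ownership figures ignore topology unless a keyspace is named, so per-keyspace effective ownership can be seen with (my_keyspace is a placeholder):
nodetool status my_keyspace
And if the Pig job goes through Cassandra's Hadoop integration, the split size controls how reads fan out across the nodes. A hedged sketch, assuming the standard cassandra.input.split.size Hadoop property is honoured by your storage handler (some versions expose this via an environment variable or a URL parameter instead) and that the default is on the order of 64k rows per split:
SET cassandra.input.split.size '16384';  -- in the Pig script: fewer rows per split, more mappers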

Related

Cassandra multi dc replication nodetool rebuild issue and schema mismatch

Team,
I have a Cassandra cluster of 6 nodes and I've added a new datacenter to the existing setup of 5 nodes. I've followed all the steps, but I get the error below when I run nodetool rebuild on the new DC's nodes.
nodetool rebuild -- datacenter1
nodetool: Unable to find sufficient sources for streaming range (389537640310009811,390720100939923035] in keyspace system_distributed
See 'nodetool help' or 'nodetool help <command>'.
Nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.21.201.205 1.75 GiB 256 16.5% ff0accd4-c33a-4984-967f-3ec763fe5414 rack1
UN 172.21.201.45 1.55 GiB 256 17.0% d3ac5afa-d561-43ee-89e2-db1d20c59b38 rack1
UN 172.21.201.44 2.37 GiB 256 17.0% 73d8e6c6-0aa3-4a91-80fc-8c7068c78a64 rack1
UN 172.21.201.207 1020.15 MiB 256 16.0% 5751ea7d-b339-43b3-bcfe-89fcbc60dea0 rack1
UN 172.21.201.46 1.64 GiB 256 17.0% 1c1afbfc-6a4b-40f0-a4c3-1eaa543eb2d5 rack1
UN 172.21.201.206 1.13 GiB 256 17.2% b11bfef9-e708-45cc-9ab3-e52983834096 rack1
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.41.6.155 983.91 KiB 256 17.3% bf7244bb-70dc-4d91-8131-cbe4886f09e7 rack1
UN 10.41.6.157 946.36 KiB 256 15.5% 5499e7cc-db23-4163-8f0c-8f437f61bd6f rack1
UN 10.41.6.156 1.14 MiB 256 15.3% f27e94a6-7e1c-4177-9f88-36d821a7808d rack1
UN 10.41.6.159 659.3 KiB 256 17.3% 453e97df-5b83-4798-9e5e-a13bbb33acee rack1
UN 10.41.6.158 909.51 KiB 256 18.2% a4bc046a-e2ef-4fd4-9ab7-18be642a4d5a rack1
UN 10.41.6.160 1.08 MiB 256 15.5% 267cf9d0-cd55-4186-a737-998443125b19 rack1
nodetool describecluster output:
#nodetool describecluster
Cluster Information:
Name: Recommendation
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
ea63e099-37c5-3d7b-9ace-32f4c833653d: [10.41.6.155, 10.41.6.157, 10.41.6.156, 10.41.6.159, 10.41.6.158, 10.41.6.160]
fd507b64-3070-3ffd-8217-f45f2f188dfc: [172.21.201.205, 172.21.201.45, 172.21.201.44, 172.21.201.207, 172.21.201.206, 172.21.201.46]
OLD DC Cassandra version - 3.1.1
NEW DC Cassandra version - 3.11.4
Can someone quickly help me fix this issue?
Changing the replication strategy of the system_distributed keyspace to NetworkTopologyStrategy, so that it includes both DCs, worked for me.
Command executed
ALTER KEYSPACE system_distributed WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 2, 'dc1' : 2};
Output
system_distributed | True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'datacenter1': '2', 'dc1': '2'}
After further research, I learned that the Cassandra version difference between the two DCs caused this issue. Once I matched the Cassandra versions on both DCs, the schema mismatch was resolved.
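Once the keyspace replication (and ideally the Cassandra versions) are aligned, the rebuild can be retried. A minimal sketch, run on each node of the new DC, streaming from the original DC as before:
nodetool rebuild -- datacenter1
Optionally, once streaming finishes, repair the altered keyspace:
nodetool repair system_distributed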

Cassandra partitioned despite having connectivity and matching schema

I have a 3-node Cassandra (3.0.15) cluster where the nodes seem to have partitioned. The nodetool status and nodetool describecluster outputs are inconsistent when seen from different nodes. Gossip information (nodetool gossipinfo) seems to be up to date on all nodes. There are no errors in the logs other than read timeouts, which seem to be due to the partitioning.
Attempts to resolve this that didn't work:
Rolling restart
Full cluster restart
Disable/enable gossip on each node
Node1 (192.168.2.247):
$ nodetool describecluster
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
59ffe9aa-aca7-34b3-8c5e-b736d221b922: [192.168.2.248, 192.168.2.247]
UNREACHABLE: [192.168.2.249]
$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.2.248 210.23 GB 256 67.7% f6593c49-6739-4b8c-a8d9-8321915a660d rack1
DN 192.168.2.249 190.9 GB 256 68.6% 6ac211c2-bab1-4e2b-bc84-18d911e005d0 rack1
UN 192.168.2.247 157.82 GB 256 63.7% eb462857-cbfb-4e41-8be9-5d241d273b81 rack1
Node2 (192.168.2.248):
$ nodetool describecluster
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
59ffe9aa-aca7-34b3-8c5e-b736d221b922: [192.168.2.248, 192.168.2.249, 192.168.2.247]
$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.2.248 210.24 GB 256 67.7% f6593c49-6739-4b8c-a8d9-8321915a660d rack1
UN 192.168.2.249 190.9 GB 256 68.6% 6ac211c2-bab1-4e2b-bc84-18d911e005d0 rack1
UN 192.168.2.247 157.82 GB 256 63.7% eb462857-cbfb-4e41-8be9-5d241d273b81 rack1
Node3 (192.168.2.249):
$ nodetool describecluster
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
59ffe9aa-aca7-34b3-8c5e-b736d221b922: [192.168.2.248]
UNREACHABLE: [192.168.2.249, 192.168.2.247]
$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.2.248 210.24 GB 256 67.7% f6593c49-6739-4b8c-a8d9-8321915a660d rack1
UN 192.168.2.249 190.92 GB 256 68.6% 6ac211c2-bab1-4e2b-bc84-18d911e005d0 rack1
UN 192.168.2.247 157.84 GB 256 63.7% eb462857-cbfb-4e41-8be9-5d241d273b81 rack1

How to get rid of a node that shows as null after being replaced?

I had a 3-node cluster (Cassandra 3.9); one node went dead.
I built a new node from scratch and "replaced" the dead node using the information from this page https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsReplaceNode.html.
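For reference, that procedure essentially comes down to starting the fresh node with the replace-address option. A sketch, assuming it is passed via JVM options in cassandra-env.sh, where 192.168.1.18 is the dead node's address:
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=192.168.1.18"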
It looked like the replacement went ok.
I added two more nodes to strengthen the cluster.
A few days have passed and the dead node is still visible and marked as "down" on 3 of 5 nodes in nodetool status:
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.9 16 GiB 256 35.0% 76223d4c-9d9f-417f-be27-cebb791cddcc rack1
UN 192.168.1.12 16.09 GiB 256 34.0% 719601e2-54a6-440e-a379-c9cf2dc20564 rack1
UN 192.168.1.14 14.16 GiB 256 32.6% d8017a03-7e4e-47b7-89b9-cd9ec472d74f rack1
UN 192.168.1.17 15.4 GiB 256 34.1% fa238b21-1db1-47dc-bfb7-beedc6c9967a rack1
DN 192.168.1.18 24.3 GiB 256 33.7% null rack1
UN 192.168.1.22 19.06 GiB 256 30.7% 09d24557-4e98-44c3-8c9d-53c4c31066e1 rack1
Its host ID is null, so I cannot use nodetool removenode. Moreover,
nodetool assassinate 192.168.1.18 fails with:
error: null
-- StackTrace --
java.lang.NullPointerException
And in system.log:
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:39:38,595 Gossiper.java:585 - Sleeping for 30000ms to ensure /192.168.1.18 does not change
INFO [CompactionExecutor:547] 2019-03-27 17:39:38,669 AutoSavingCache.java:393 - Saved KeyCache (27316 items) in 163 ms
INFO [IndexSummaryManager:1] 2019-03-27 17:40:03,620 IndexSummaryRedistribution.java:75 - Redistributing index summaries
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,597 Gossiper.java:1029 - InetAddress /192.168.1.18 is now DOWN
INFO [RMI TCP Connection(16)-127.0.0.1] 2019-03-27 17:40:08,599 StorageService.java:2324 - Removing tokens [-1061369577393671924,...]
In system.peers, the dead node still shows up and has the same ID as the replacing node:
cqlsh> select peer, host_id from system.peers;
peer | host_id
--------------+--------------------------------------
192.168.1.18 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
192.168.1.22 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
192.168.1.9 | 76223d4c-9d9f-417f-be27-cebb791cddcc
192.168.1.14 | d8017a03-7e4e-47b7-89b9-cd9ec472d74f
192.168.1.12 | 719601e2-54a6-440e-a379-c9cf2dc20564
Dead node and replacing node have different tokens in system.peers.
So my questions are:
Could you explain what is wrong?
How can I fix this and get rid of this dead node?
From system.peers you have the host ID, so try nodetool removenode or nodetool assassinate with that host ID.
peer | host_id
--------------+--------------------------------------
192.168.1.18 | 09d24557-4e98-44c3-8c9d-53c4c31066e1
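In command form that would be the following sketch, using the host ID from the system.peers output above; note that, per your own output, this same ID belongs to the live node 192.168.1.22, so verify which node it resolves to before running it:
nodetool removenode 09d24557-4e98-44c3-8c9d-53c4c31066e1
nodetool removenode status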

Cassandra UNREACHABLE node

I have a 3-node Cassandra v3.9 cluster with a replication factor of 2.
10.0.0.11,10.0.0.12,10.0.0.13
10.0.0.11 and 10.0.0.12 are the seed nodes.
What could be the possible reasons for the following error in /etc/cassandra/conf/debug.log?
The error is:
DEBUG [RMI TCP Connection(174)-127.0.0.1] 2017-07-12 04:47:49,002 StorageProxy.java:2254 - Hosts not in agreement. Didn't get a response from everybody: 10.0.0.13
UPDATE1:
At the time of the error here are some statistics from all the servers
[user1#ip-10-0-0-11 ~]$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.0.12 2.55 MiB 256 67.2% 83a68750-2238-4a6e-87be-03a3d7246824 rack1
UN 10.0.0.11 1.78 GiB 256 70.6% 052fda9d-0474-4dfb-b2f8-0c5cbec15266 rack1
UN 10.0.0.13 1.78 GiB 256 62.2% 86438dc9-77e0-43b2-a672-5b2e7cf216bf rack1
[user1#ip-10-0-0-11 ~]$ nodetool describecluster
Cluster Information:
Name: PiedmontCluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
3c8d9e82-c688-3d16-a3e9-b84894168283: [10.0.0.12, 10.0.0.11]
UNREACHABLE: [10.0.0.13]
[pnm#ip-10-0-0-13 ~]$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.0.12 2.55 MiB 256 67.2% 83a68750-2238-4a6e-87be-03a3d7246824 rack1
UN 10.0.0.11 1.78 GiB 256 70.6% 052fda9d-0474-4dfb-b2f8-0c5cbec15266 rack1
UN 10.0.0.13 1.78 GiB 256 62.2% 86438dc9-77e0-43b2-a672-5b2e7cf216bf rack1
[pnm#ip-10-0-0-13 ~]$ nodetool describecluster
Cluster Information:
Name: PiedmontCluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
3c8d9e82-c688-3d16-a3e9-b84894168283: [10.0.0.12, 10.0.0.13]
UNREACHABLE: [10.0.0.11]
[user1#ip-10-0-0-12 ~]$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.0.12 2.55 MiB 256 67.2% 83a68750-2238-4a6e-87be-03a3d7246824 rack1
UN 10.0.0.11 1.78 GiB 256 70.6% 052fda9d-0474-4dfb-b2f8-0c5cbec15266 rack1
UN 10.0.0.13 1.78 GiB 256 62.2% 86438dc9-77e0-43b2-a672-5b2e7cf216bf rack1
[user1#ip-10-0-0-12 ~]$ nodetool describecluster
Cluster Information:
Name: PiedmontCluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
3c8d9e82-c688-3d16-a3e9-b84894168283: [10.0.0.12, 10.0.0.11, 10.0.0.13]
The above-mentioned error is in /var/log/cassandra/debug.log on 10.0.0.11.
The error in /var/log/cassandra/debug.log on 10.0.0.13 is:
DEBUG [RMI TCP Connection(4)-127.0.0.1] 2017-07-13 02:31:23,846 StorageProxy.java:2254 - Hosts not in agreement. Didn't get a response from everybody: 10.0.0.11
ERROR [MessagingService-Incoming-/10.0.0.11] 2017-07-13 02:35:04,982 CassandraDaemon.java:226 - Exception in thread Thread[MessagingService-Incoming-/10.0.0.11,5,main]
java.lang.ArrayIndexOutOfBoundsException: 4
at org.apache.cassandra.db.filter.AbstractClusteringIndexFilter$FilterSerializer.deserialize(AbstractClusteringIndexFilter.java:74) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.db.SinglePartitionReadCommand$Deserializer.deserialize(SinglePartitionReadCommand.java:1041) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:696) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:626) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:114) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:190) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178) ~[apache-cassandra-3.9.0.jar:3.9.0]
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92) ~[apache-cassandra-3.9.0.jar:3.9.0]
There is no error in /var/log/cassandra/debug.log on 10.0.0.12.
Remember 10.0.0.11 & 10.0.0.12 are seed nodes
Thanks

Fully removing a decommissioned Cassandra node

Running Cassandra 1.0, I am shrinking a ring from 5 nodes down to 4. To do that, I ran nodetool decommission on the node I want to remove, then stopped Cassandra on that host and used nodetool move and nodetool cleanup to update the tokens on the remaining 4 nodes and rebalance the cluster.
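In command form, the sequence was roughly the following (token values are placeholders):
nodetool -h <node C> decommission
# stop Cassandra on node C, then on each remaining node:
nodetool -h <node> move <new_token>
nodetool -h <node> cleanup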
My seed nodes are A and B. The node I removed is C.
That seemed to work fine for 6-7 days, but now one of my four nodes thinks the decommissioned node is still part of the ring.
Why did this happen, and what's the proper way to fully remove the decommissioned node from the ring?
Here's the output of nodetool ring on the one node that still thinks the decommissioned node is part of the ring:
Address DC Rack Status State Load Owns Token
127605887595351923798765477786913079296
xx.x.xxx.xx datacenter1 rack1 Up Normal 616.17 MB 25.00% 0
xx.xxx.xxx.xxx datacenter1 rack1 Up Normal 1.17 GB 25.00% 42535295865117307932921825928971026432
xx.xxx.xx.xxx datacenter1 rack1 Down Normal ? 9.08% 57981914123659253974350789668785134662
xx.xx.xx.xxx datacenter1 rack1 Up Normal 531.99 MB 15.92% 85070591730234615865843651857942052864
xx.xxx.xxx.xx datacenter1 rack1 Up Normal 659.92 MB 25.00% 127605887595351923798765477786913079296
Here's the output of nodetool ring on the other 3 nodes:
Address DC Rack Status State Load Owns Token
127605887595351923798765477786913079296
xx.x.xxx.xx datacenter1 rack1 Up Normal 616.17 MB 25.00% 0
xx.xxx.xxx.xxx datacenter1 rack1 Up Normal 1.17 GB 25.00% 42535295865117307932921825928971026432
xx.xx.xx.xxx datacenter1 rack1 Up Normal 531.99 MB 25.00% 85070591730234615865843651857942052864
xx.xxx.xxx.xx datacenter1 rack1 Up Normal 659.92 MB 25.00% 127605887595351923798765477786913079296
UPDATE:
I tried to remove the node using nodetool removetoken on node B, which is the one that still claims node C is in the ring. That command ran for 5 hours and didn't seem to do anything. The only change is that the state of Node C is "Leaving" now when I run nodetool ring on Node B.
I was able to remove the decommissioned node using nodetool removetoken, but I had to use the force option.
Here's the output of my commands:
iowalker:~$ nodetool -h `hostname` removetoken 57981914123659253974350789668785134662
<waited 5 hours, the node was still there>
iowalker:~$ nodetool -h `hostname` removetoken status
RemovalStatus: Removing token (57981914123659253974350789668785134662). Waiting for replication confirmation from [/xx.xxx.xxx.xx,/xx.x.xxx.xx,/xx.xx.xx.xxx].
iowalker:~$ nodetool -h `hostname` removetoken force
RemovalStatus: Removing token (57981914123659253974350789668785134662). Waiting for replication confirmation from [/xx.xxx.xxx.xx,/xx.x.xxx.xx,/xx.xx.xx.xxx].
iowalker:~$ nodetool -h `hostname` removetoken status
RemovalStatus: No token removals in process.
With Cassandra 2.0, you need to run nodetool decommission on the node to be removed.
In your case, check whether the decommissioned node's entry has been removed from cassandra-topology.properties.
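A sketch of what to look for there, assuming the default IP=DC:rack format used by the property file snitches; the decommissioned node's line should no longer be present:
# cassandra-topology.properties
xx.xxx.xx.xxx=datacenter1:rack1   <- remove this line if node C still appears here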
