OpsCenter is having trouble connecting with the agents - Azure

I am trying to set up a two-node cluster in Cassandra. As far as I can tell, my nodes connect fine. When I run nodetool status it shows both nodes in the same data center and the same rack. I can also run cqlsh on either node and query data; the second node can see data from the first node, etc.
I have my first node set as the seed node in both cassandra.yaml and the cluster config file.
To rule out any potential security issues, I flushed my iptables rules and allowed all traffic on all ports for both nodes. They are also on the same virtual network.
iptables -F
iptables -P INPUT ACCEPT
When I start OpsCenter on either machine it sees both nodes, but it only has information on the node I am viewing OpsCenter from. It can tell whether the other node is up or down, but I am not able to view any detailed information. It sometimes initially says "2 Agents Connected", but after a while it says one agent failed to connect. It keeps prompting me to install the agent on the other node even though it's already there.
The opscenterd.log doesn't reveal much. There don't appear to be any errors, but I do see INFO: Nodes with agents that appear to no longer be running.
I am not sure what else to check as everything but OpsCenter seems to be working fine.

You should install OpsCenter on a single node rather than on all nodes. The OpsCenter GUI will then prompt you to install the agent on each of the nodes in the cluster. Use nodetool status or nodetool ring to make sure that the cluster is functioning properly and all nodes are Up and Normal (status = UN).
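For a healthy two-node cluster, the output should look roughly like this (the addresses, load figures, and host IDs below are placeholders):

$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load    Tokens  Owns   Host ID    Rack
UN  10.0.0.4  1.2 MB  256     50.0%  <host-id>  rack1
UN  10.0.0.5  1.1 MB  256     50.0%  <host-id>  rack1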
In each agent's address.yaml file you can set stomp_interface to the IP address of the OpsCenter server to force the agents to connect to the correct address.
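For example, on each node something along these lines should point the agent at the OpsCenter server and pick up the change (the IP is a placeholder, and the address.yaml path assumes a package install of the agent):

echo 'stomp_interface: "10.0.0.4"' | sudo tee -a /var/lib/datastax-agent/conf/address.yaml
sudo service datastax-agent restart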

Related

While restarting one node, other nodes show as down in the Cassandra cluster

Whenever I restart any Cassandra node in my cluster, after a few minutes other nodes show as down, and sometimes they hang as well. We then need to restart those nodes to bring the services back up.
During a restart the cluster seems unstable, and one node after another shows stress and DN status. The JVM and nodetool services are running fine, but when we describe the cluster it shows nodes as unreachable.
We don't have much traffic or load in our environment. Can you please give me any suggestions?
Cassandra version is 3.11.2
Do you see any errors or warnings in your system.log after restarting the node?
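For example, something like this should surface them (the log path assumes a package install):

grep -E 'ERROR|WARN' /var/log/cassandra/system.log | tail -n 50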

OpsCenter nodes list shows all node names as localhost

I have a brand new cluster running DSE 5.0.3 with OpsCenter 6.0.3. I used Lifecycle Manager to create a six-node cluster, adding the nodes' IPs to the Nodes list and installing DSE on each node that way. The cluster seems fine and healthy, but the Nodes section, under the LIST tab, shows every node's name as localhost. If I click on a node it shows "localhost - x.x.x.x" (x.x.x.x being the actual node IP). How do I make them show their actual hostnames in OpsCenter? Where does this name come from?
Thanks!
The hostnames in OpsCenter are reported by the agent running on each node in the cluster. In this case each agent is reporting its hostname as localhost. Fixing that configuration and restarting the agents should resolve the issue.
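For example, on each node you could check and correct the hostname roughly like this (the hostname is a placeholder, and hostnamectl assumes a systemd-based distribution):

hostname                                          # if this prints "localhost", the agent reports that
sudo hostnamectl set-hostname node1.example.com   # placeholder hostname
sudo service datastax-agent restart               # pick up the new hostname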

Icinga2 cluster node local checks not executing

I am using an Icinga2 2.3.2 cluster HA setup with three nodes in the same zone and the IDO database on a separate server. All are CentOS 6.5. I installed Icinga Web 2 on the active master.
I configured four local checks for each node, including the cluster health check, as described in the documentation. I installed Icinga Classic UI on all three nodes because I am not able to see the local checks configured for the nodes in Icinga Web 2.
Configs are syncing, checks are executing, and all three nodes are connected to each other. But the local checks specific to each node are not running properly, as verified in the Classic UI.
a. All local checks are executed only once whenever:
- one of the nodes is disconnected or reconnected
- configuration changes are made on the master and icinga2 is reloaded
b. After that, only one check keeps running properly on one node; the remaining ones do not.
I have attached screenshots of the Classic UI on all nodes.
Please help me fix this. Thanks in advance.

Node actions apply to only 1 node

I just installed a three-node Cassandra (2.0.11) community cluster with a single seed node. I installed OpsCenter (5.0.2) on the seed node and everything is working fairly well. The only issue I am having is that any node actions I perform (stop, start, compact, etc.) apply only to the seed node. Even if I choose a different node on the ring or list, the action always happens on the seed node.
I watched the OpsCenter logs and can see requests for /ops/compact/ip_address, and the IP address is the correct node that I chose, but the action always runs on the seed instance.
All agents have been installed on all the nodes and the cluster is fully operational. I can run nodetool compact on each node and see the compaction progress in OpsCenter.
I have each node configured to listen on an internal address and have verified that the RPC server is reachable on the network. I have also tried adding the cluster using a non-seed node, but all actions continue to run on the seed node.
I posted the answer above, but I'll explain in more detail for anyone else with this issue.
I changed rpc_address and listen_address in cassandra.yaml in order to listen on a private IP address. I restarted Cassandra and the cluster communicated fine. However, the datastax-agent was still reporting 127.0.0.1 to OpsCenter as the RPC address; I found this out by enabling trace logging in OpsCenter.
If you modify anything in cassandra.yaml, make sure you also restart the datastax-agent, as it apparently caches the data.
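Roughly the sequence that fixed it for me (the IP is a placeholder, and the paths assume a package install):

# in /etc/cassandra/cassandra.yaml, set both addresses to the private IP:
#   listen_address: 10.0.0.11
#   rpc_address: 10.0.0.11
sudo service cassandra restart
sudo service datastax-agent restart   # the agent caches the old rpc_address until restarted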

Apache Cassandra node not joining cluster ring

I have a four-node Apache Cassandra community 1.2 cluster in a single datacenter with one seed node.
The configuration in cassandra.yaml is similar on all nodes.
I am facing the following issues; please help.
1] Although the fourth node isn't listed in the nodetool ring or status output, system.log shows that only this node isn't communicating with the other nodes via the gossip protocol.
However, both the JMX and Telnet ports are enabled, and the listen/seed addresses are configured properly.
2] Although OpsCenter is able to recognize all four nodes, the agents are not getting installed from OpsCenter.
However, the same JVM version is installed and JAVA_HOME is set on all four nodes.
I further observed that the problematic node runs 64-bit Ubuntu while the other nodes run 32-bit Ubuntu; could that be the reason?
Which version of Cassandra are you using? I reported a similar kind of bug in Cassandra 1.2.4 and was told to move to a later version.
Are you using the GossipingPropertyFileSnitch? If so, your problem should be solved by making sure the cassandra-topology.properties files are up to date on all nodes.
If all of that is fine, check your TCP-level connections via netstat and tcpdump. If connections are being dropped at the application layer, consider a rolling restart.
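For example (7000 is the default inter-node storage/gossip port; the interface and peer IP are placeholders):

netstat -an | grep 7000                             # established connections on the storage port
sudo tcpdump -i eth0 host 10.0.0.12 and port 7000   # watch inter-node traffic to one peer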
Your description is quite sparse; my assumption is that your server-level configuration might be wrong.
I would suggest checking that cassandra-topology.properties and cassandra-rackdc.properties are consistent across all nodes.
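One quick way to compare them is to checksum the files on every node (hostnames are placeholders; this assumes SSH access and config files under /etc/cassandra):

for h in node1 node2 node3 node4; do
  ssh "$h" md5sum /etc/cassandra/cassandra-topology.properties /etc/cassandra/cassandra-rackdc.properties
done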
