Icinga2 cluster node local checks not executing - linux

I am using Icinga2-2.3.2 cluster HA setup with three nodes in the same zone and database in a seperate server for idodb. All are Cent OS 6.5. Installed IcingaWeb2 in the active master.
Configured four local checks for each node including cluster health check as described in the documentation. Installed Icinga Classi UI in all three nodes, beacuse I am not able to see the local checks configured for nodes in Icinga Web2.
Configs are syncing, checks are executing & all three nodes are connected among them. But the configured local checks, specific to the node alone are not happening properly and verified it in the classic ui.
a. All local checks are executed only one time whenever
- one of the node is disconnected or reconnected
- configuration changes done in the master and reload icinga2
b. But after that, only one check is hapenning properly in one node and the remaining are not.
I have attached the screenshot of all node classic ui.
Please help me to fix and Thanks in advance.

Related

Does "spring data cassandra" have client side loadbalancing?

I'm operating project using spring-boot, spring-data-cassandra.
When I setup that project, I set cassandra properties by ip and port.
(referred by https://www.baeldung.com/spring-data-cassandra-tutorial)
When set it up like this, If I had 3 cassandra nodes and 1 cassandra node died, I think project should fail to connect with cassandra at a 33% probability.
But my project was fine even though 1 cassandra node was dead. (just have some error on one's deathbed)
Do It happen to have A function in spring-data-cassandra like client-side-loadbalancing?
If they have that function, Where can I see that code??
I tried to find that code but failed.
Please give me a little clue.
Spring Data Cassandra relies on the functionality of the DataStax Java driver that is responsible for making everything works. This includes:
establishing the initial connection to the cluster. This is where the contact points play their role. After driver is connected to any of points, it reads information about the whole cluster and establishes connections to all nodes (by default)
establishing the control connection that is used to receive notifications about changes in the cluster - nodes going up & down, changes in schema, etc. If node goes down or up, this information is used to modify the list of the active nodes
providing the load balancing of requests based on the replication, and nodes availability - if the node is down, it's excluded from list of candidates, so we don't send queries to node that is known to be down

Thingsboard cluster setup

Building a Thingsboard cluster
I need help setting up a Thingsboard cluster, the documentation online is very limited.
The cluster will contain 2 Zookeeper nodes and 4 Thingsboard nodes with Cassandra DB.
Should Zookeeper be installed separately?
A step-by-step guide would be much appreciated!
I cannot provide you detailed step-by-step instructions to setup a ThingsBoard cluster. I can point you into the right direction by sharing the different documents you need to do so.
Bottom line, the following tasks must be completed:
Install and configure a ZooKeeper ensemble.
Check the ZooKeeper documentation for further installation details. Keep in mind that you need at least three different ZK-nodes in a clustered environment and that you always need an odd number of ZK nodes (3,5,7,...). It is a very very very bad idea to build a cluster consisting out of two ZK-nodes, check split brain condition that might appear under these circumstances! Basically you setup the number of individual nodes you wish to use and change the configuration file to enable the different nodes as an ensemble. This is documented quite well in the ZK-docs.
Install and configure a Cassandra cluster.
Again you will setup the number of individual nodes you need for your Cassandra cluster and modify the individual configuration files to convert them into a Cassandra cluster. Check Cassandra documentation for details. Be sure to check proper configuration using the nodetool status command as described at the end of the document. All your nodes should be up and running.
Install and configure a ThingsBoard cluster.
Use the instructions provided with ThingsBoard single node setup.
Install Java
Skip External database installation
ThingsBoard service installation
Configure ThingsBoard to use the external database - Cassandra
Go to Cluster setup and apply the configuration steps depicted (ZK, Cassandra and RPC). Keep in mind to point to ALL members of your ZK, Cassandra cluster. You can also use IP-addresses instead of host names.
Return to single node setup and run the installation script at ONE NODE only!
Start ThingsBoard service
If everything went well, you should be able to access your ThingsBoard nodes directly using the URL http://[NODE_IP]:8080. You can verify proper cluster operation by creating a tenant on one node and check its presence on another node.
I don't know if using an even number of ThingsBoard nodes is a good idea. The documentation does not mention anything about this.
One final remark, you could/should consider putting a proxy in front of your ThingsBoard cluster to provide load balancing to your web clients and improve user experience. This way you shouldn't share the individual host addresses with your users and you will prevent node overloading due to the fact that everybody is using the same web-address to access your dashboard(s). You could also proxy your MQTT broker to provide load balancing as well.
Good luck in setting up your cluster!
Zookeeper needs at least 3 nodes to run in a cluster mode. Each node voting and the valid replica count to gain the QUORUM is 3.

OpsCenter is having trouble connecting with the agents

I am trying to setup a two node cluster in Cassandra. I am able to get my nodes to connect fine as far as I can tell. When I run nodetool status it shows both my nodes in the same data center and same rack. I can also run cqlsh on either node and query data. The second node can see data from the first node, etc.
I have my first node as the seed node both in the Cassandra.yaml and the cluster config file.
To avoid any potential security issues, I flushed my iptable and allowed all on all ports for both nodes. They are also on the same virtual network.
iptables -P INPUT ACCEPT
When I start OpsCenter on either machine it sees both nodes but only has information on the node I am viewing OpsCenter on. It can tell if the other node is up/down, but I am not able to view any detailed information. It sometimes initially says 2 Agents Connected but after awhile it says 1 agent failed to connect. It keeps prompting me to install OpsCenter on the other node although it's already there.
The OpsCenterd.log doesn't reveal much. There don't appear to be any errors but I see INFO: Nodes with agents that appear to no longer be running .
I am not sure what else to check as everything but OpsCenter seems to be working fine.
You should install Opscenter on a single node rather than all nodes. The opscenter gui will then prompt you to install the agent on each of the nodes in the cluster. Use nodetool status or nodetool ring to make sure that the cluster is properly functioning and all nodes are Up and functioning Normally. (status = UN)
In address.yaml file you can set stomp_address equal to the ip address of the opscenter server to force the agents to the correct address.

Icinga2 Cluster HA - Service stalled in "pending"

I am using Icinga2-2.3.2 cluster HA setup with three nodes in the same zone and database in a seperate server for idodb. All are Cent OS 6.5. Installed IcingaWeb2 in the active master.
Configured four local checks for each node including cluster health check as described in the documentation. Installed Icinga Classi UI in all three nodes, beacuse I am not able to see the local checks configured for nodes in Icinga Web2.
Configs are syncing, checks are executing & all three nodes are connected among them. But the status data are not syncing sometime for Icinga Classic UI.
Whenever the config changes in the master and reload, the config is syncing. After sometime when I checked all three nodes classic ui, some no of hosts & services are stalled in "pending" state in one or two nodes with different nos.
But all are ok in config master classic ui and even in the Icinga Web2 everything is ok. Above is one sceanrio, sometimes the local checks are also stalled in pending state.
I have attached the screenshot for reference.
Please help me to fix and Thanks in advance.
I was facing issue "service status Pending reload" in nagios 3, just adding contact group e.g. Linux support (email group) under Contact groups for service notification resolved my issue.

I could not submit a job to the executing node in condor apart from the central manager

I have a condor pool which consist of 4 dedicated machine one is set as a centeral manager, submitting, and executing node while the other three is set to be executing nodes I used CentOS 5.4 as an OS for all the machines. My problem is when I submitted a job from the central manager it works just on the central manager so when I specify in the JDL file that the job should run in any machine apart from the central manager the job stay in hold and does not run. When I type condor_status all nodes appear. I keep the daemon MASTER, STARTD in the daemon list for the executing nodes. Does any one come across this problem?
There's not enough information to answer your question, but the first thing to do is to run condor_q -analyze <jobid> and see what it tells you. See the Condor manual Section 2.6.5: Why is the job not running?
One possible cause is that you're not telling Condor to transfer your input/output files for you, and your nodes have different "filesystem domains", so Condor is unable to find a host which shares a common filesystem with your submit host.

Resources