No single point of failure with Thingsboard and Cassandra

I already have some experience with Thingsboard setup, but until now I have only deployed it in the standalone scenario: one Thingsboard instance with Postgres (hybrid setup) and one Cassandra.
What I want to do is create a No Single Point of Failure installation.
My idea is to use HAProxy to switch between two Thingsboard instances and to have two Cassandra instances holding exactly the same data.
Is it possible? If yes, then how?
https://pasteboard.co/JJNhbON.png
A simple diagram of what i want to do.
Thanks in advance!

Personally, I'd simplify the storage layer by deploying multiple DCs for Cassandra with each Thingsboard instance routing traffic to its own Cassandra DC. With this design, you won't have to worry about having to keep two distinct Cassandra clusters in sync.
The HA proxy can simply divert traffic to a Thingsboard instance that is operational. Cheers!

No single point of failure configuration is:
Zookeeper cluster - 3 nodes
Kafka cluster - 3 nodes
Cassandra cluster- 3 nodes
PostgreSQL cluster - 2 nodes (master/slave) + PgPool 1 node
Redis cluster- 3-6 nodes
Thingsboard - 2 nodes (may be used as monolith or microservices)
Load balancer (HAproxy or other)
That will bring real fault tolerance and the ability to scale horizontally. You can spin up a Kubernetes cluster to manage it easily. If you don't have a heavy load, you can run the services on shared CPU resources with Docker Compose or Kubernetes. At least 3 physical machines in separate racks are required.

Related

Thingsboard cluster setup

Building a Thingsboard cluster
I need help setting up a Thingsboard cluster; the documentation online is very limited.
The cluster will contain 2 Zookeeper nodes and 4 Thingsboard nodes with Cassandra DB.
Should Zookeeper be installed separately?
A step-by-step guide would be much appreciated!
I cannot provide you with detailed step-by-step instructions to set up a ThingsBoard cluster. I can point you in the right direction by sharing the different documents you need to do so.
Bottom line, the following tasks must be completed:
Install and configure a ZooKeeper ensemble.
Check the ZooKeeper documentation for further installation details. Keep in mind that you need at least three different ZK nodes in a clustered environment and that you always need an odd number of ZK nodes (3, 5, 7, ...). It is a very, very bad idea to build a cluster out of only two ZK nodes; read up on the split-brain condition that might appear under these circumstances! Basically, you set up the number of individual nodes you wish to use and change the configuration file so that the nodes form an ensemble. This is documented quite well in the ZK docs.
Install and configure a Cassandra cluster.
Again, you set up the number of individual nodes you need for your Cassandra cluster and modify the individual configuration files to join them into a Cassandra cluster. Check the Cassandra documentation for details. Be sure to verify the configuration using the nodetool status command as described at the end of the document. All your nodes should be up and running.
Install and configure a ThingsBoard cluster.
Use the instructions provided with ThingsBoard single node setup.
Install Java
Skip External database installation
ThingsBoard service installation
Configure ThingsBoard to use the external database - Cassandra
Go to Cluster setup and apply the configuration steps depicted (ZK, Cassandra and RPC). Keep in mind to point to ALL members of your ZK, Cassandra cluster. You can also use IP-addresses instead of host names.
Return to single node setup and run the installation script on ONE NODE only!
Start ThingsBoard service
If everything went well, you should be able to access your ThingsBoard nodes directly using the URL http://[NODE_IP]:8080. You can verify proper cluster operation by creating a tenant on one node and checking its presence on another node.
I don't know if using an even number of ThingsBoard nodes is a good idea. The documentation does not mention anything about this.
One final remark, you could/should consider putting a proxy in front of your ThingsBoard cluster to provide load balancing to your web clients and improve user experience. This way you shouldn't share the individual host addresses with your users and you will prevent node overloading due to the fact that everybody is using the same web-address to access your dashboard(s). You could also proxy your MQTT broker to provide load balancing as well.
Good luck in setting up your cluster!
ZooKeeper needs at least 3 nodes to run in cluster mode. Every node votes, and 3 is the smallest ensemble that can still reach a quorum (a majority of 2 nodes) after one node fails.
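The sizing advice above comes down to majority arithmetic, which can be sketched in a few lines. This is only an illustration: the class and method names (QuorumMath, quorumSize, toleratedFailures) are hypothetical and not part of ZooKeeper. The sketch assumes a plain majority quorum in which every node has one vote.

```java
// Illustrative sketch only: majority-quorum arithmetic for an ensemble of voting nodes.
// QuorumMath and its methods are hypothetical names, not a ZooKeeper API.
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of nodes that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must contain at least one node");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of nodes that may fail while a majority can still be formed. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        // Print the trade-off for small ensembles.
        for (int n = 1; n <= 5; n++) {
            System.out.printf("nodes=%d quorum=%d toleratedFailures=%d%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
    }
}
```

Under this assumption, two nodes tolerate no failure at all, and four nodes tolerate no more failures than three, which is why odd ensemble sizes are the usual recommendation.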

Preventing Cassandra Node from Being Overwhelmed

When in Java, I create a Cassandra cluster builder, I provide a list of multiple Cassandra nodes as shown below:
Cluster cluster = Cluster.builder().addContactPoints(host1, host2, host3, host4).build();
But from what I understand, the connector connects only to the first host in the list that is available, and that host becomes my connection point to the Cassandra cluster.
Now, my question is: if my Java application reads/writes a huge amount of data from/to Cassandra, doesn't it overwhelm the node that it is connected to?
Is there a way to configure my connection such that it uses multiple nodes of Cassandra for its reads/writes? What is the common practice?
It uses the contact point to find the rest of the nodes in the cluster, then creates a pool of connections to all the hosts and balances the requests among them. It doesn't only connect to the hosts you provide unless you use the whitelist load balancing policy or a custom one.
If you're worried about overwhelming nodes, use the RoundRobinPolicy (or the DCAwareRoundRobinPolicy if you have multiple DCs) and it will distribute requests evenly among all of them. If you have hot spots in your data and use the TokenAwarePolicy, the load may be uneven, but you shouldn't need to worry about it.
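To make the round-robin idea concrete, here is a minimal, driver-independent sketch. It is not the driver's implementation: RoundRobinSketch, pickCoordinator and distribute are hypothetical names, and the host list stands in for the nodes the driver discovers from the contact points.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Conceptual sketch only: rotates the coordinator across all known hosts.
// This is NOT the DataStax driver's code; all names here are hypothetical.
final class RoundRobinSketch {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
    }

    /** Returns the host that would coordinate the next request. */
    String pickCoordinator() {
        // floorMod keeps the index valid even if the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }

    /** Counts how many of the given requests each host would coordinate. */
    static Map<String, Integer> distribute(List<String> hosts, int requests) {
        RoundRobinSketch policy = new RoundRobinSketch(hosts);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < requests; i++) {
            counts.merge(policy.pickCoordinator(), 1, Integer::sum);
        }
        return counts;
    }
}
```

With four hosts and eight requests, each host coordinates two of them, so no single node receives all the traffic from the application.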

How to setup Spark with a multi node Cassandra cluster?

First of all, I am not using the DSE Cassandra. I am building this on my own and using Microsoft Azure to host the servers.
I have a 2-node Cassandra cluster, I've managed to set up Spark on a single node but I couldn't find any online resources about setting it up on a multi-node cluster.
This is not a duplicate of how to setup spark Cassandra multi node cluster?
To set it up on a single node, I've followed this tutorial "Setup Spark with Cassandra Connector".
You have two high level tasks here:
setup Spark (single node or cluster);
setup Cassandra (single node or cluster);
These tasks are different and not related (unless we are talking about data locality).
How to set up Spark in a cluster is described here: Architecture overview.
Generally there are two deployment types: standalone, where you set up Spark on the hosts directly, or a task scheduler (Yarn, Mesos). Choose based on your requirements.
As you built everything by yourself, I suppose you will use the Standalone installation. The difference from a single node is network communication. By default Spark runs on localhost; more commonly it uses the FQDN, so you should configure it in /etc/hosts and check it with hostname -f, or try IPs.
Take a look at this page, which contains all necessary ports for nodes communication. All ports should be open and available between nodes.
Be aware that by default Spark uses TorrentBroadcastFactory with random ports.
For Cassandra see these docs: 1, 2, tutorials 3, etc.
You will likely need 4. You could also run Cassandra inside Mesos using Docker containers.
P.S. If data locality matters in your case, you will have to come up with your own solution, because neither Mesos nor Yarn handles running Spark jobs on partitioned data close to the Cassandra partitions.

What is meant by a node in cassandra?

I am new to Cassandra and I want to install it. So far I've read a small article on it.
But there is one thing that I do not understand, and it is the meaning of 'node'.
Can anyone tell me what a 'node' is, what it is for, and how many nodes we can have in one cluster ?
A node is the storage layer within a server.
Newer versions of Cassandra use virtual nodes, or vnodes. There are 256 vnodes per server by default.
A vnode is essentially the storage layer.
machine: a physical server, EC2 instance, etc.
server: an installation of Cassandra. Each machine has one installation of Cassandra. The Cassandra server runs core processes such as the snitch, the partitioner, etc.
vnode: The storage layer in a Cassandra server. There are 256 vnodes per server by default.
Helpful tip:
Where you will get confused is that Cassandra terminology (in older blog posts, YouTube videos, and so on) had been used inconsistently. In older versions of Cassandra, each machine had one Cassandra server installed, and each server contained one node. Due to the 1-to-1-to-1 relationship between machine-server-node in old versions of Cassandra people previously used the terms machine, server and node interchangeably.
Cassandra is a distributed database management system designed to handle large amounts of data across many commodity servers. Like all other distributed database systems, it provides high availability with no single point of failure.
You may get some ideas from the paragraph above. Generally, when we talk about Cassandra, we mean a Cassandra cluster, not a single PC. A node in a cluster is just a fully functional machine that is connected to the other nodes in the cluster through a high-speed internal network. All nodes work together to make sure that even if one of them fails due to an unexpected error, the cluster as a whole can still provide service.
All nodes in a Cassandra cluster are the same. There is no concept of master or slave nodes. There are multiple reasons for this design, and you can Google it for more details if you want.
Theoretically, you can have as many nodes as you want in a Cassandra cluster. For example, at the Cassandra Summit in 2014 Apple reported running 75,000 nodes.
Of course you can try Cassandra on one machine. It still works with just one node in the cluster.
What is meant by a node in cassandra?
Cassandra Node is a place where data is stored.
Data center is a collection of related nodes.
A cluster is a component which contains one or more data centers.
In other words, it is a collection of multiple Cassandra nodes which communicate with each other to perform a set of operations.
In Cassandra, each node is independent and at the same time interconnected to other nodes.
All the nodes in a cluster play the same role.
Every node in a cluster can accept read and write requests, regardless of where the data is actually located in the cluster.
In the case of failure of one node, Read/Write requests can be served from other nodes in the network.
If you're looking to understand Cassandra terminology, then the following post is a good reference:
http://exponential.io/blog/2015/01/08/cassandra-terminology/

Ability to write to a particular cassandra node

Is there a possibility to write to a particular node using datastax driver?
For example, I have three nodes in datacenter 1 and three nodes in datacenter 2.
Existing
If I build up the cluster with any one of them as seed, all the nodes will be detected by the DataStax Java driver. So, in this case, if I insert data using the driver, it will automatically choose one of the nodes and proceed with it as the coordinator (preferably in the local data center).
Requirement
I want a way to contact any node in datacenter 2 and hand over the coordinator job to one of the nodes in datacenter 2.
Why i need this
I am trying to use the trigger functionality from datacenter 2 alone. Since triggers are handled by the coordinator, I want the coordinator to be selected from datacenter 2 so that datacenter 1 doesn't have to do this operation.
You may be able to use the DCAwareRoundRobinPolicy load balancing policy to achieve this by creating the policy such that DC2 is considered the "local" DC.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("dc2"));
In the above example, remote (non-DC2) nodes will be ignored.
There is also a new WhiteListPolicy in driver version 2.0.2 that wraps another load balancing policy and restricts the nodes to a specific list you provide.
Cluster.Builder builder = Cluster.builder().withLoadBalancingPolicy(new WhiteListPolicy(new DCAwareRoundRobinPolicy("dc2"), whiteList));
For multi-DC scenarios Cassandra provides EACH and LOCAL consistency levels (for example EACH_QUORUM and LOCAL_QUORUM), where EACH acknowledges a successful operation in each DC and LOCAL only in the local one.
If I understood correctly, what you are trying to achieve is DC failover in your application. This is not a good practice. Let's assume your application is hosted in DC1 alongside Cassandra. If DC1 goes down, your entire application is unavailable. If DC2 goes down, your application can still write with a LOCAL CL and C* will replicate the changes when DC2 is back.
If you want to achieve HA, you need to deploy application in each DC, use CL=LOCAL_X and finally do failover on DNS level (e.g. using AWS Route53).
See data consistency docs and this blog post for more info about consistency levels for multiple DCs.
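As a rough illustration of the difference between the LOCAL and EACH levels, the sketch below counts how many replica acknowledgements a request would wait for under LOCAL_QUORUM and EACH_QUORUM. The class name MultiDcAcks, the data center names dc1 and dc2, and the replication factor of 3 per DC are assumptions made for the example. It uses the usual quorum formula of (replication factor / 2) + 1 per data center.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: acknowledgement counts for multi-DC consistency levels.
// MultiDcAcks, "dc1", "dc2" and the replication factors are hypothetical.
final class MultiDcAcks {
    private MultiDcAcks() {}

    /** Quorum for a single data center: a majority of its replicas. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** LOCAL_QUORUM: a quorum of replicas in the coordinator's own DC only. */
    static int localQuorumAcks(Map<String, Integer> rfByDc, String localDc) {
        Integer rf = rfByDc.get(localDc);
        if (rf == null) {
            throw new IllegalArgumentException("unknown data center: " + localDc);
        }
        return quorum(rf);
    }

    /** EACH_QUORUM: a quorum of replicas in every DC, so every DC must be reachable. */
    static int eachQuorumAcks(Map<String, Integer> rfByDc) {
        int total = 0;
        for (int rf : rfByDc.values()) {
            total += quorum(rf);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> rfByDc = new LinkedHashMap<>();
        rfByDc.put("dc1", 3);
        rfByDc.put("dc2", 3);
        System.out.println("LOCAL_QUORUM acks in dc1: " + localQuorumAcks(rfByDc, "dc1"));
        System.out.println("EACH_QUORUM acks in total: " + eachQuorumAcks(rfByDc));
    }
}
```

Because LOCAL_QUORUM only waits for replicas in the local DC, an application in DC1 can keep writing while DC2 is down, whereas an EACH_QUORUM write would fail until DC2 returns.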

Resources