Connecting to Couchbase Server Cluster - node.js

I've been using Couchbase for my database solution and so far it looks very good.
I'm confused however with connecting to a Cluster. A Cluster is just a group of nodes so when you use the API to connect to a Cluster what do you use as the IP? Do you just use one of the nodes in the Cluster? Does it matter which one?
I'm personally using the Node.js API.

Technically all you need is just one node in the list. As soon as it connects to that one, it will get the cluster map of the entire cluster and know all of the rest of the nodes. No it does not matter which node.
That being said, best practice is to have at least 3 nodes of the cluster listed in the connection string or better yet if the SDK you are using supports it, use a DNS SRV record with at least 3 nodes in there. With three nodes in the list if for some reason (e.g. server failure or maintenance) one of the nodes is unavailable, you can still bootstrap an application server to get that cluster map with one of the other nodes in the list.

I asked this question a few months ago on couchbase forums and the author of the node.js module answered that you should use "some" of them
like :
cluster.openBucket("couchbase://server1,server2,server3", function(err) {});
if you have server4 and 5 are added , they will be automatically added to the cluster as soon as they are available in the cluster.
Check here for details : https://forums.couchbase.com/t/couchnode-connection-to-cluster/6281

Related

Does "spring data cassandra" have client side loadbalancing?

I'm operating project using spring-boot, spring-data-cassandra.
When I setup that project, I set cassandra properties by ip and port.
(referred by https://www.baeldung.com/spring-data-cassandra-tutorial)
When set it up like this, If I had 3 cassandra nodes and 1 cassandra node died, I think project should fail to connect with cassandra at a 33% probability.
But my project was fine even though 1 cassandra node was dead. (just have some error on one's deathbed)
Do It happen to have A function in spring-data-cassandra like client-side-loadbalancing?
If they have that function, Where can I see that code??
I tried to find that code but failed.
Please give me a little clue.
Spring Data Cassandra relies on the functionality of the DataStax Java driver that is responsible for making everything works. This includes:
establishing the initial connection to the cluster. This is where the contact points play their role. After driver is connected to any of points, it reads information about the whole cluster and establishes connections to all nodes (by default)
establishing the control connection that is used to receive notifications about changes in the cluster - nodes going up & down, changes in schema, etc. If node goes down or up, this information is used to modify the list of the active nodes
providing the load balancing of requests based on the replication, and nodes availability - if the node is down, it's excluded from list of candidates, so we don't send queries to node that is known to be down

Thingsboard cluster setup

Building a Thingsboard cluster
I need help setting up a Thingsboard cluster, the documentation online is very limited.
The cluster will contain 2 Zookeeper nodes and 4 Thingsboard nodes with Cassandra DB.
Should Zookeeper be installed separately?
A step-by-step guide would be much appreciated!
I cannot provide you detailed step-by-step instructions to setup a ThingsBoard cluster. I can point you into the right direction by sharing the different documents you need to do so.
Bottom line, the following tasks must be completed:
Install and configure a ZooKeeper ensemble.
Check the ZooKeeper documentation for further installation details. Keep in mind that you need at least three different ZK-nodes in a clustered environment and that you always need an odd number of ZK nodes (3,5,7,...). It is a very very very bad idea to build a cluster consisting out of two ZK-nodes, check split brain condition that might appear under these circumstances! Basically you setup the number of individual nodes you wish to use and change the configuration file to enable the different nodes as an ensemble. This is documented quite well in the ZK-docs.
Install and configure a Cassandra cluster.
Again you will setup the number of individual nodes you need for your Cassandra cluster and modify the individual configuration files to convert them into a Cassandra cluster. Check Cassandra documentation for details. Be sure to check proper configuration using the nodetool status command as described at the end of the document. All your nodes should be up and running.
Install and configure a ThingsBoard cluster.
Use the instructions provided with ThingsBoard single node setup.
Install Java
Skip External database installation
ThingsBoard service installation
Configure ThingsBoard to use the external database - Cassandra
Go to Cluster setup and apply the configuration steps depicted (ZK, Cassandra and RPC). Keep in mind to point to ALL members of your ZK, Cassandra cluster. You can also use IP-addresses instead of host names.
Return to single node setup and run the installation script at ONE NODE only!
Start ThingsBoard service
If everything went well, you should be able to access your ThingsBoard nodes directly using the URL http://[NODE_IP]:8080. You can verify proper cluster operation by creating a tenant on one node and check its presence on another node.
I don't know if using an even number of ThingsBoard nodes is a good idea. The documentation does not mention anything about this.
One final remark, you could/should consider putting a proxy in front of your ThingsBoard cluster to provide load balancing to your web clients and improve user experience. This way you shouldn't share the individual host addresses with your users and you will prevent node overloading due to the fact that everybody is using the same web-address to access your dashboard(s). You could also proxy your MQTT broker to provide load balancing as well.
Good luck in setting up your cluster!
Zookeeper needs at least 3 nodes to run in a cluster mode. Each node voting and the valid replica count to gain the QUORUM is 3.

Is there a way to remove a rogue node from our hazelcast cluster?

We are currently running a hazelcast cluster using it to communicate information on a queue to be picked up by a single node in the cluster. We are vulnerable however to a "rogue" node that joins the cluster but without the right version of software to handle the request in a way that's proper.
Is there a way proactively remove rogue nodes of this nature in a way that prevents them from actively re-joining the cluster? I haven't been able to see a way from the documentation.
It looks like you are using default hazelcast xml. You better need to have a custom hazelcast xml with updated Group credentials.

How to run Couchbase on multiple server or multiple AWS instances?

I am trying to evaluate couchbase`s performance on multiple nodes. I have a Client that generates data for me based on some schema(for 1 node currently, local). But I want to know how I can horizontally scale Couchbase and how it works. Like If I have multiple machines or AWS instances or Windows Azure how can I configure Couchbase to shard the data and than I can evaluate its performance for multiple nodes. Any suggestions and details as to how I can do this?
I am not (yet) familiar with Azure but you can find a very good white paper about Couchbase on AWS:
Running Couchbase on AWS
Let's talk about the cluster itself, you just need to
install Couchbase on multiple nodes
create a "cluster" on one of then
then you simply have to add other nodes to the cluster and rebalance.
I have created an Ansible script that use exactly the steps to create a cluster from command line, see
Create a Couchbase cluster with Ansible
Once you have done that your application will leverage all the nodes automatically, and you can add/remove nodes as you need.
Finally if you want to learn more about Couchbase architecture, how sharding, failover, data consistency, indexing work, I am inviting your to look at this white paper:
Couchbase Server: An Architectural Overview

Set cluster name when using Cassandra CQL/JDBC driver

I'm using the Cassandra CQL/JDBC driver I got from google code but it doesn't seem to let me provide a cluster name - is there a way?
I'm using cluster names to ensure I don't run commands against a live system, it has a different cluster name to my dev systems.
Edit: Just to clarify, I have two totally separate Cassandra clusters, one live and one for test. They have different cluster names to ensure that I don't accidentally run test code meant for the test cluster on the live cluster. Therefore any client I need to use must let me set a cluster name. Hector does this.
There is no inbuilt protection for checking cluster names for Cassandra clients. It is built to ensure nodes from different clusters don't try and join together but not to ensure clients connect to the right cluster. It would be possible to add this checking to a client though (since the cluster name is exposed to the client) but I'm not aware of any clients doing this.
I'd strongly recommend firewalling off your different environments to avoid this kind of mistake. If that isn't possible, you should choose different ports to avoid confusion. Change this with the 'rpc_port' setting in cassandra.yaml.
You'd have to mirror the data on two different clusters. You cant access the same cluster with different names.
To rename your cluster (from the default 'Test Cluster') you edit the cassandra configuration file found in location/of/cassandra/conf/cassandra.yaml. Its the top line, if you need more details look at the datastax configuration documentation and explanation.

Resources