MongoDB Replica Connections from NodeJS

My NodeJS client is able to connect to the MongoDB primary server and interact with it, as per requirements.
I use the following code to build a Server Object
var dbServer = new Server(
    host, // primary server IP address
    port,
    {
        auto_reconnect: true,
        poolSize: poolSize
    });
and the following code to create a DB Object:
var db = new Db(
    'MyDB',
    dbServer,
    { w: 1 }
);
I was under the impression that when the primary goes down, the client will automatically figure out that it now needs to talk to one of the secondaries, which will be elected to be a primary.
But when I manually kill the primary server, one of the secondary servers does become the primary (as can be observed from its mongo shell and the fact that it now responds to mongo shell commands), but the client doesn't automatically talk to it. How do I configure the NodeJS client to automatically switch to the new primary?
Do I need to specify all 3 server addresses somewhere? But that doesn't seem like a good solution, as once the primary is back online, its IP address will be different from what it originally was.
I feel that I am missing something very basic, please enlighten me :)
Thank You,
Gary

Well, your understanding is partly there, but there are some problems. The general premise of supplying more than a single server in the connection is that, should that server address be unavailable at the time of connection, something else from the "seed list" will be chosen in order to establish the connection. This removes a single point of failure, such as the "Primary" being unavailable at that time.
When this is a "replica set", the driver will discover the members once connected and then "automatically" switch to the new "Primary" as that member is elected. So this does require that your replica set is actually capable of electing a new "Primary" in order to switch the connection. Additionally, this is not "instantaneous", so there can be a delay before the new "Primary" is promoted and able to accept operations.
Your "auto_reconnect" setting is also not doing what you think it is doing. All this manages is that if a connection "error" occurs, the driver will "automatically" retry the connection without throwing an exception. What you likely really want is to handle this yourself, as you could otherwise end up infinitely retrying a connection that just cannot be made. So good code would take this into account and manage the "re-connect" attempts itself, with some reasonable handling and logging.
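A minimal sketch of managing the re-connect attempts yourself, with a bounded number of retries and a delay between attempts (the helper name, retry count, and delay are assumptions; `connectFn` would wrap your actual connection call):

```javascript
// Generic retry helper: attempts an async connect operation up to
// maxRetries times, waiting delayMs between attempts, instead of
// retrying forever the way auto_reconnect would.
function connectWithRetry(connectFn, maxRetries, delayMs) {
    return new Promise(function (resolve, reject) {
        var attempt = 0;
        function tryOnce() {
            attempt += 1;
            connectFn().then(resolve, function (err) {
                if (attempt >= maxRetries) {
                    // Give up after a bounded number of attempts
                    return reject(err);
                }
                console.warn('Connect attempt ' + attempt + ' failed, retrying');
                setTimeout(tryOnce, delayMs);
            });
        }
        tryOnce();
    });
}
```

A real implementation might also use exponential backoff rather than a fixed delay, and log the underlying error on each failed attempt.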
Your final point on IP addresses is generally addressed by using hostnames that resolve to an IP address, where those hostnames never change regardless of what they resolve to. This is as important for the driver as it is for the replica set itself: if the server members are looking for another member by an IP address that changes, they do not know what to look for.
So the driver will "fail over" or otherwise select a new available "Primary", but only within the same tolerances within which the servers can communicate with each other. You should seed your connections, as you cannot guarantee which node is the "Primary" when you connect. Finally, you should use hostnames instead of IP addresses if the latter are subject to change.
The driver will "self discover", but again it is only using the configuration available to the replica set in order to do so. If that configuration is invalid for the replica set, then it is invalid for the driver as well.
Example:
MongoClient.connect("mongodb://member1,member2,member3/database", function(err, db) {
    // connected via whichever seed members were reachable
})
Or otherwise with an array of Server objects instead.

Related

How to add a new independent IP to a running Hazelcast cluster without restarting existed node?

Hazelcast cluster running in different hosts IP1, IP2...
hazelcast.xml configure the TCP-IP members
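The tcp-ip member list in hazelcast.xml looks roughly like this (the ports and surrounding structure here are assumptions based on Hazelcast's standard XML layout, not the asker's actual file):

```xml
<hazelcast>
  <network>
    <join>
      <multicast enabled="false"/>
      <tcp-ip enabled="true">
        <member>IP1:5701</member>
        <member>IP2:5701</member>
      </tcp-ip>
    </join>
  </network>
</hazelcast>
```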
Now I want to expand the cluster to support more services.
I installed a new Hazelcast member on a new host, IP3.
How can I add the new IP3 to the existing cluster without restarting IP1 and IP2?
The members section in the TCP-IP configuration is there for finding the cluster.
You list some places where cluster members may be. The process starting tries those locations, and if it gets a response the response includes the locations of all cluster members.
When scaling up you frequently won't know the location in advance. The TCP list is one solution, but there's other ways if running on the cloud, etc.
For your specific question: you don't need to add IP3 to your XML. Or you can, and it will be picked up the next time the processes are restarted.
If you're new to Hazelcast, why not join the community slack
Usually, you don't need to change anything: just starting a new member (IP3) with IP1 and IP2 listed will work. The third member will join the cluster.
Here is how it looks under the hood (simplified):
the newly started member tries to contact addresses in its member list; after a successful connection, it sends a "Who Is Master" request and reads the master node address from the response;
the new node makes a connection to the master address (if not established already) and asks it to join the cluster;
if the join is successful, the master replies with an updated member list (cluster view) and it also sends the updated list to all other cluster members;
There were significant improvements in handling corner cases for these "incomplete address configuration" issues in Hazelcast 5.2. So if you are on an older version I strongly recommend switching to an up-to-date one.
If for any reason the default behavior is not sufficient in your case, you can also use the /hazelcast/rest/config/tcp-ip/member-list REST endpoint to make member-list changes. The endpoint was also introduced in 5.2. Find details in the documentation: https://docs.hazelcast.com/hazelcast/5.2/network-partitioning/split-brain-recovery#eliminating-unsuccessful-cluster-merges

How to detect MongoDB reconnection for a replica set

If I try to register for the event 'reconnect' in a MongoDB replicaset:
db.on('reconnect', () => console.log('Reconnected'));
I receive a deprecation warning as:
DeprecationWarning: The `reconnect` event is no longer supported by the unified topology
How can I handle the case of a lost MongoDB connection (all servers in the replica set down), while being notified of server availability status (when at least one server becomes available again)?
Assume this is handled in a Node app with the native MongoDB driver.
Thanks in advance.
If we take a look at the spec regarding the unified topology, we can find the following section:
The unified topology is the first step in a paradigm shift away from a
concept of “connecting” to a MongoDB deployment using a connect
method. Consider for a moment what it means to be connected to a
replica set: do we trigger this state when connected to a primary? A
primary and one secondary? When connected to all known nodes? It’s
unclear whether its possible to answer this without introducing
something like a ReadPreference parameter to the connect method. At
this point “connecting” is just one half of “operation execution” -
you pass a ReadPreference in, and await a selectable server for the
operation, now we’re connected!
There are, however, new events you can listen to and that might be useful for your usecase -> see this for more information.
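As a sketch of using those events, the unified topology emits SDAM monitoring events such as `serverHeartbeatSucceeded` and `serverHeartbeatFailed` on the client. The helper below and its notification logic are assumptions; `client` is anything that emits these events (e.g. a `MongoClient`):

```javascript
// Tracks availability from the driver's heartbeat monitoring events and
// invokes onChange(true/false) whenever the observed state flips.
function trackAvailability(client, onChange) {
    var available = false;
    function setAvailable(next) {
        if (next !== available) {
            available = next;
            onChange(next); // true = a server answered a heartbeat
        }
    }
    client.on('serverHeartbeatSucceeded', function () { setAvailable(true); });
    client.on('serverHeartbeatFailed', function () { setAvailable(false); });
    return function isAvailable() { return available; };
}
```

Note this is deliberately simplistic: one failed heartbeat from one member does not mean every server is down. A fuller implementation would track per-server state, for example from the `topologyDescriptionChanged` event.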

how does mongodb replica set work with nodejs-mongoose?

Tech stack used: nodejs, mongoose, mongodb
I'm working on a product that handles many DB requests. During the beginning of every month the DB requests are high due to heavy read/write activity (bulk data processing). The number of records in each collection targeted by these read/write requests is quite high. Read volume is high, but write volume is not.
So the CPU utilization on the instance on which mongodb is running reaches the danger zone (above 90%) during these times. The only thing that gets me through these times is HOPE (yes, hoping that the instance will not crash).
Rather than scaling vertically, I'm looking for solutions to scale horizontally (not a revolutionary thought). I looked at replica sets and sharding. This question is only related to replica sets.
I went through the documents and I feel like my understanding of replica sets is not really the way they might work.
I have configured my replica set with the below configuration. I simply want to add one more instance, because as per my current understanding, if I add one more instance then my database can handle more read requests by distributing the load, which could reduce CPU utilization by at least 30% on the primary node. Is this understanding correct or wrong? Please share your thoughts.
var configuration = {
    _id: "testReplicaDB",
    members: [
        { _id: 0, host: "localhost:12017" },
        { _id: 1, host: "localhost:12018", arbiterOnly: true, buildIndexes: false },
        { _id: 2, host: "localhost:12019" }
    ]
}
When I brought up the replica set with the above config and ran my nodejs-mongoose code, I ran into this issue. The resolution they propose is to change the above config into
var configuration = {
    _id: "testReplicaDB",
    members: [
        { _id: 0, host: "validdomain.com:12017" },
        { _id: 1, host: "validdomain.com:12018", arbiterOnly: true, buildIndexes: false },
        { _id: 2, host: "validdomain.com:12019" }
    ]
}
Question 1 (related to the code written in the nodejs project with the mongoose library (for handling the DB), which connects to the replicaSet)
const URI = `mongodb://167.99.21.9:12017,167.99.21.9:12019/${DB}`;
I have to specify both URIs of my mongodb instances in the mongoose connection URI string.
When I look at my nodejs-mongoose code that will connect to the replicaSet, I have many doubts about how it might handle multiple nodes.
How does mongoose know which IP is the primaryNode?
Let's assume 167.99.21.9:12019 is the primaryNode and rs.slaveOk(false) is set on the secondaryReplica, so the secondaryNode cannot serve read requests.
In this situation, does mongoose send the request to the first URI (167.99.21.9:12017), which then redirects to the primaryNode, or will the request come back to mongoose, which will then send another request to 167.99.21.9:12019?
Question 2
This docLink mentions that data redundancy enables handling high read requests. Let's assume read is enabled for the secondaryNode, and
let's assume the case where mongoose sends a request to the primaryNode while the primaryNode is being bombarded with read/write requests but the secondaryNode is free (doing nothing). Will mongodb automatically redirect the request to the secondaryNode, or will this request fail and come back to mongoose, so that the burden is on mongoose to send another request to the next available node?
Can mongoose automatically know which node in the replicaSet is free?
Question 3
Assuming both 167.99.21.9:12017 & 167.99.21.9:12019 instances are available for read requests with ReadPreference.secondaryPreferred or ReadPreference.nearest, will the load get distributed when the secondaryNode gets bombarded with read requests and the primaryNode is at, say, 20% utilization? Or is my understanding wrong? Can the replicaSet act as a load balancer? If not, how do I make it balance the load?
Question 4
var configuration = {
    _id: "testReplicaDB",
    members: [
        { _id: 0, host: "validdomain.com:12017" },
        { _id: 1, host: "validdomain.com:12018", arbiterOnly: true, buildIndexes: false },
        { _id: 2, host: "validdomain.com:12019" }
    ]
}
You can see the DNS names in the configuration. Does this mean that when the primaryNode redirects a request to the secondaryNode, DNS resolution will happen, and then, using the IP that corresponds to the secondaryNode, the request will be redirected there? Is my understanding correct or wrong? (If it is correct, this is going to fire up another set of questions.)
:|
I could've missed many details while reading the docs. This is my last hope of getting answers, so please share if you know the answers to any of these.
if this is the case, then how does mongoose know which ip is the primaryReplicaset?
There is no "primary replica set", there can be however a primary in a replica set.
Each MongoDB driver queries all of the hosts specified in the connection string to discover the members of the replica set (in case one or more of the hosts is unavailable for whatever reason). When any member of the replica set responds, it does so with the full list of current members of the replica set. The driver then knows what the replica set members are, and which of them is currently primary (if any).
secondaryReplica cannot serve readRequests
This is not at all true. Any data-bearing node can fulfill read requests, IF the application provided a suitable read preference.
In this situation, does mongoose trigger to the first uri(167.99.21.9:12017) and this instance would redirect to the primaryReplicaset or will the request comeback to mongoose and then mongoose will trigger another request to the 167.99.21.9:12019 ?
mongoose does not directly talk to the database. It uses the driver (node driver for MongoDB) to do so. The driver has connections to all replica set members, and sends the requests to the appropriate node.
For example, if you specified a primary read preference, the driver would send that query to the primary if one exists. If you specified a secondary read preference, the driver would send that query to a secondary if one exists.
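A simplified sketch of how that per-preference server selection works inside the driver (the member shape and the fallback logic are assumptions that loosely follow the driver's server-selection rules, not the driver's actual code):

```javascript
// Picks a member for a read, given a read preference.
// members: array of { host, type: 'primary' | 'secondary' }
function selectServer(members, readPreference) {
    var primary = members.find(function (m) { return m.type === 'primary'; });
    var secondaries = members.filter(function (m) { return m.type === 'secondary'; });
    switch (readPreference) {
        case 'primary':
            return primary || null; // no primary -> selection fails
        case 'secondary':
            return secondaries[0] || null;
        case 'primaryPreferred':
            return primary || secondaries[0] || null;
        case 'secondaryPreferred':
            return secondaries[0] || primary || null;
        default:
            return null;
    }
}
```

The real driver additionally filters by latency window, tags, and staleness before choosing among eligible members.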
i'm assuming that when both 167.99.21.9:12017 & 167.99.21.9:12019 instances are available for read requests with ReadPreference.SecondaryPreferred or ReadPreference.nearest
Correct, any node can fulfill those.
the load could get distributed across
Yes and no. In general replicas may have stale data. If you require current data, you must read from the primary. If you do not require current data, you may read from secondaries.
how to make it balance the load?
You can make your application balance the load by using secondary or nearest reads, assuming it is OK for your application to receive stale data.
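As a sketch, enabling that kind of load distribution can be as simple as putting a read preference into the connection string the application uses (the hosts, database name, and helper function here are illustrative assumptions):

```javascript
// Builds a replica-set connection string with extra options such as a
// read preference, so the driver may route reads to secondaries.
function buildUri(hosts, db, options) {
    var query = Object.keys(options)
        .map(function (k) { return k + '=' + options[k]; })
        .join('&');
    return 'mongodb://' + hosts.join(',') + '/' + db + (query ? '?' + query : '');
}

var uri = buildUri(
    ['167.99.21.9:12017', '167.99.21.9:12019'],
    'mydb',
    { replicaSet: 'testReplicaDB', readPreference: 'secondaryPreferred' }
);
// uri: "mongodb://167.99.21.9:12017,167.99.21.9:12019/mydb?replicaSet=testReplicaDB&readPreference=secondaryPreferred"
```

Mongoose also lets you set a read preference per query (e.g. `Model.find().read('secondaryPreferred')`), which is useful when only some queries tolerate stale data.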
if mongoose triggers a request to primaryReplica and primaryReplica is bombarded with read/write requests and secondaryReplica is free(doing nothing) , then will mongodb automatically redirect the request to secondaryReplica?
No, a primary read will not be changed to a secondary read.
Especially in the scenario you are describing, the secondary is likely to be stale, thus a secondary read is likely to produce wrong results.
can mongoose automatically know which replica is free?
mongoose does not track deployment state, the driver is responsible for this. There is limited support in drivers for choosing a "less loaded" node, although this is measured based on network latency and not CPU/memory/disk load and only applies to the nearest read preference.

mongodb failover connection

I have a nodejs app that connects to mongodb.
Mongodb allows replica set client connections to provide a level of resilience.
e.g. "mongodb://localhost:50000,localhost:50001/myproject?replicaSet=foo": the client first connects to localhost:50000 and if that dies it switches to localhost:50001.
This is fine, but if one of the two mongod instances is dead when the application starts, the application dies with a can't-connect error.
The only solution I can think of is to reformat the URL so it excludes the inactive instance, but I would like to avoid this ...
Any ideas?
Thanks
A replica set works fine when you have an odd number of servers, because MongoDB replica sets use elections between nodes to decide which server will be the primary.
You can add a new node to your replica set just for voting. It's called an "ARBITER".
You can learn more about replica set elections on this page: https://docs.mongodb.com/manual/core/replica-set-elections/#replica-set-elections
As Rafael mentioned, a replica set needs an odd number of members to be able to function properly when some of the members are offline. There are more details in the Replica Set Elections docs page, but the most relevant is:
If a majority of the replica set is inaccessible or unavailable to the current primary, the primary will step down and become a secondary. The replica set cannot accept writes after this occurs, but remaining members can continue to serve read queries if such queries are configured to run on secondaries.
The node driver by default requires a Primary to be online to be able to connect to the replica set, and will output an error you observed when you try to connect to a replica set without a Primary.
This default behaviour can be changed by setting connectWithNoPrimary to true. However, to be able to run queries, you should also set a suitable readPreference (which also defaults to primary). For example:
var MongoClient = require('mongodb').MongoClient

var conn = MongoClient.connect('mongodb://localhost:27017,localhost:27018,localhost:27019/test',
    {
        replicaSet: 'replset',
        connectWithNoPrimary: true,
        readPreference: 'primaryPreferred'
    }).catch(console.log)
More information about the connection options can be found in the Node Driver URI Connection Settings page

Connect to ElastiCache cluster via Node.js

I'm confused as to how to connect to AWS's ElastiCache Redis via Node.js. I've successfully managed to connect to the primary host (001) via the node_redis NPM, but I'm unable to use the clustering ability of ioredis because apparently ElastiCache doesn't implement the CLUSTER commands.
I figured that there must be another way, but the AWS SDK for Node only has commands for managing ElastiCache, not for actually connecting to it.
Without using CLUSTER, I'm concerned that my app won't be able to fail over if the master node fails, since I can't fall back to the other clusters. I also get errors from my Redis client, Error: READONLY You can't write against a read only slave. when the master switches, which I'm not sure how to handle gracefully.
Am I overthinking this? I am finding very little information about using ElastiCache Redis clusters with Node.js.
I was overthinking this.
Q: What options does Amazon ElastiCache for Redis provide for node failures?
Amazon ElastiCache for Redis will repair the node by acquiring new service resources, and will then redirect the node's existing DNS name to point to the new service resources. Thus, the DNS name for a Redis node remains constant, but the IP address of a Redis node can change over time.
If you have a replication group with one or more read replicas and Multi-AZ is enabled, then in case of primary node failure ElastiCache will automatically detect the failure, select a replica and promote it to become the new primary. It will also propagate the DNS so that you can continue to use the primary endpoint and after the promotion it will point to the newly promoted primary. For more details see the Multi-AZ section of this FAQ.
When the Redis replication option is selected with Multi-AZ disabled, in case of primary node failure you will be given the option to initiate a failover to a read replica node. The failover target can be in the same zone or another zone. To failback to the original zone, promote the read replica in the original zone to be the primary.
You may choose to architect your application to force the Redis client library to reconnect to the repaired Redis server node. This can help as some Redis libraries will stop using a server indefinitely when they encounter communication errors or timeouts.
The solution is to connect to the primary master node only, without using any clustering on the client side. When the master fails, the slave is promoted and the DNS is updated so that the slave will become the primary node, without the host needing to change on the client's side.
To prevent temporary connectivity errors when the failover happens, you can add some configuration to ioredis:
var client = new Redis(port, host, {
    retryStrategy: function (times) {
        log.warn('Lost Redis connection, reattempting');
        return Math.min(times * 2, 2000);
    },
    reconnectOnError: function (err) {
        var targetError = 'READONLY';
        if (err.message.slice(0, targetError.length) === targetError) {
            // When a slave is promoted, we might get temporary errors saying
            // READONLY You can't write against a read only slave. Attempt to
            // reconnect if this happens.
            log.warn('ElastiCache returned a READONLY error, reconnecting');
            return 2; // `1` means reconnect, `2` means reconnect and resend
                      // the failed command
        }
    }
});
