Amazon RDS Multi-AZ Failover

I'm considering the RDS Oracle product with Multi-AZ. One thing I can't find: if your primary instance goes away and you fail over to the secondary instance, do you ever go back to the primary? Or does the secondary become the primary, and another instance (perhaps your old primary) then become the secondary? Does RDS automatically start a secondary instance for you and ensure that it is kept up to date? Or are you just running on one instance until you intervene manually?
Also - what type of latency hit can I expect to take given the synchronous updates?
Thanks...

After a failover, a "recovery" action is performed on the old primary server: it will either be recovered or replaced. The algorithm that decides which is not made public by AWS.
Once the failed node is recovered or replaced, a "sync" action with the current primary node kicks in and the data is synchronized.
From then on, the recovered node (the old primary) remains the secondary node until another failover happens.

I got this answered by Faisal Khan on AWS re:Post.
https://repost.aws/questions/QU4DYhqh2yQGGmjE_x0ylBYg/what-happens-after-failover-in-rds
Yes, upon RDS Multi-AZ failover the failed primary instance is brought back up as the new standby instance to reinstate the high availability of your database.
To elaborate:
The failover process is typically completed within 60-120 seconds: the standby instance is promoted to be the new primary, allowing you to resume your DB activities in the shortest amount of time.
In the background, the failed primary instance is diagnosed by the RDS internal health-monitoring system and remediation actions are taken based on the detected fault. The remediation may range from simply rebooting the faulty instance to replacing the underlying hardware, depending on the fault. Once the old primary node is recovered, it is brought back up as the new standby instance, restoring your database's high availability.
The recovery time of the failed node varies with the type of fault and the recovery process applied. It also depends largely on the DB workload at the time of the crash, as RDS performs data recovery and rolls back any uncommitted transactions, eliminating data inconsistencies across the nodes while providing you a single-box experience.
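A practical note for the application side: during a Multi-AZ failover the instance endpoint stays the same (its DNS CNAME is simply repointed to the promoted standby), so clients do not need a new connection string, only to reconnect. Below is a minimal, hedged sketch of such a reconnect loop in Node.js; connectToDatabase() is a hypothetical helper standing in for whatever driver you use (for RDS Oracle that might be node-oracledb).
// Minimal sketch: reconnect to the unchanged RDS endpoint after a Multi-AZ failover.
// connectToDatabase() is a hypothetical helper wrapping your actual driver and is
// assumed to throw if the connection attempt fails.
const ENDPOINT = 'mydb.xxxxxxxx.us-east-1.rds.amazonaws.com'; // same before and after failover
async function connectWithRetry(maxAttempts = 20) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await connectToDatabase(ENDPOINT); // hypothetical helper
    } catch (err) {
      // Failover typically completes within 60-120 seconds, so back off and retry.
      const delayMs = Math.min(attempt * 5000, 30000);
      console.warn(`Connect attempt ${attempt} failed (${err.message}); retrying in ${delayMs} ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('Could not reconnect to the RDS endpoint after failover');
}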
References:
https://aws.amazon.com/blogs/database/amazon-rds-under-the-hood-multi-az/
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.MultiAZSingleStandby.html#Concepts.MultiAZ.Failover

Related

Which MongoDB scaling strategy (Sharding, Replication) is suitable for concurrent connections?

Consider this scenario:
I have multiple devclouds (remote workplaces for developers); they are all virtual machines running on the same bare-metal server.
Until now, each devcloud has run its own MongoDB container in Docker, so the number of MongoDB containers adds up to over 50 instances across the devclouds.
The problem is that while 50 instances are running at the same time, only about 5 people are actually performing read/write operations against their own instances, so the other 45 running instances waste the server's resources.
Should I instead run a single MongoDB cluster (a combined set of MongoDB instances) for everyone, so that they all connect to one endpoint only (via the internal network) and resources aren't wasted?
I am considering sharding, but the concern is: if one node is taken down (one VM shuts down), is that OK for availability (redundancy)?
I am pretty new to sharding and replication and look forward to your solutions. Thank you.
If each developer expects to have full control over their database deployment, you can't combine the deployments. Otherwise one developer can delete all data in the deployment, etc.
If each developer expects to have access to one database, you can deploy a single replica set serving all developers and assign one database per developer (via authentication; see the sketch below).
Sharding in the MongoDB sense (a sharded cluster) is not really going to help in this scenario, since an application generally uses all of the shards. You can of course "shard manually" by setting up multiple replica sets.
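To make the one-database-per-developer idea concrete, here is a minimal sketch (the user name and database name are made up for the example) of creating a developer account that owns only its own database; it assumes authentication is enabled on the deployment and that you run this in mongosh as a user with the userAdminAnyDatabase role.
// One database per developer: "alice" becomes dbOwner of "devcloud_alice" only.
db.getSiblingDB("devcloud_alice").createUser({
  user: "alice",
  pwd: passwordPrompt(),   // prompt for the password instead of hard-coding it
  roles: [{ role: "dbOwner", db: "devcloud_alice" }]
});
With authorization enabled, each developer authenticated this way can read and write only their own database, so a single replica set can serve everyone without developers interfering with each other's data.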

How do you handle time based events in a cluster?

I have a Node.js application that runs in a cluster; therefore, there are many instances of the app running simultaneously and accepting requests from the load balancer.
Consider that I have a notion of a "subscription" in my app, and each subscription is stored in the central database with dateStart and dateEnd fields. For each subscription I need to send notifications reminding clients about subscription expiration (e.g. 14, 7 and 3 days before expiration). I will also need to mark a subscription as expired and perform some additional logic when the time comes.
What are the best practices for handling such time-based events in multi-instance applications?
I could make my application run an expiration routine, e.g. every five minutes, but then I would have to deal with concurrency issues, because every instance would try to do so and we don't want notifications to be sent twice.
I refactored the scheduled jobs for one of our systems when we clustered it a few years ago; it was a similar issue to what you are describing.
I created a cluster-aware scheduled-job monitor and used the DB to ensure only one was operating at any given time. Each instance generated its own unique GUID at startup and used it as an ID. At startup, they all look at the DB to see if a primary is running, based on a table recording the ID, start time and last run. A primary is considered running if the recorded last run is within a specified time window. If a primary is running, the rest stay up as backups and check at a given interval, ready to take over if the primary were to die. If the primary dies, the instance that takes over marks the record with its ID and updates the times, then looks for jobs in other tables, which would be similar to your subscriptions. The primary continues to look for jobs at a configurable interval until it dies or is restarted.
During testing, I was able to spin up 50+ instances of the monitor, all of which constantly attempted to become primary. Only one would ever take over, and during testing I would manually kill the primary and watch the others all vie for primary, but only one would prevail. This approach relies on the DB allowing only one of the instances to update the record, by using qualified updates conditioned on the prior values in the record (a sketch of that follows below).
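A minimal sketch of that qualified-update pattern, assuming the central database is MongoDB (any store with atomic conditional updates works the same way) and inventing the collection and field names for the example:
// `db` is an already connected Db instance from the official 'mongodb' driver;
// the collection and field names below are made up for the example.
const { randomUUID } = require('crypto');
const INSTANCE_ID = randomUUID();        // unique ID generated at startup
const HEARTBEAT_MS = 30 * 1000;          // how often the primary renews its claim
const STALE_MS = 3 * HEARTBEAT_MS;       // after this, the primary is considered dead
async function tryBecomeOrStayPrimary(db) {
  const leaders = db.collection('scheduler_leader');
  // Make sure the single leader record exists (ignore the duplicate-key race).
  await leaders.insertOne({ _id: 'leader', ownerId: null, lastRun: new Date(0) })
    .catch((err) => { if (err.code !== 11000) throw err; });
  // Qualified update: succeeds only if we already own the record
  // or the current owner's lastRun is stale.
  const result = await leaders.updateOne(
    { _id: 'leader',
      $or: [
        { ownerId: INSTANCE_ID },
        { lastRun: { $lt: new Date(Date.now() - STALE_MS) } },
      ] },
    { $set: { ownerId: INSTANCE_ID, lastRun: new Date() } }
  );
  return result.matchedCount === 1;      // true => this instance is the primary
}
// Every instance runs this; only the current primary processes the jobs.
function startSchedulerLoop(db) {
  setInterval(async () => {
    if (await tryBecomeOrStayPrimary(db)) {
      // ...find due subscriptions here and send the expiration notifications...
    }
  }, HEARTBEAT_MS);
}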

Make Node/MEANjs Highly Available

I'm probably opening up a can of worms with regard to how many hundreds of directions can be taken with this, but I want high availability / disaster recovery for my MEANjs servers.
Right now, I have 3 servers:
MongoDB
App (Grunt'ing the main application; this is the front-end server)
A third server for other processing on the back-end
So at the moment, if I reboot my MongoDB server (or more realistically, it crashes for some reason), I suddenly see this in my App server terminal:
MongoDB connection error: Error: failed to connect to [172.30.3.30:27017]
[nodemon] app crashed - waiting for file changes before starting...
After MongoDB is back online, nothing happens on the app server until I re-grunt.
What's the best practice for this situation? You can see in the error that I'm using nodemon to monitor changes to the app. I bet that upon init I could get my MongoDB server to update a file on the app server within nodemon's view to force a restart? Or is there some other tool I can use for this? Or should I be handling my connections to the db server more gracefully so the app doesn't "crash"?
Is there a way to redirect to a secondary MongoDB in case the primary isn't available? This would be more apt for HA/DR-type stuff.
I would like to start with a side note: Given the description in the question and the comments to it, I am not convinced that using AWS is a wise option. A PaaS provider like Heroku, OpenShift or AppFog seems to be more suitable, especially when combined with a MongoDB service provider. Running MongoDB on EBS can be quite a challenge when you are new to MongoDB. And pretty expensive, too, as soon as you need provisioned IOPS.
Note: In the following paragraphs, I have simplified a few things for the sake of comprehensibility.
If you insist on running it on your own, however, you have an option. MongoDB itself comes with means of automatic, transparent failover, called a replica set.
A minimal replica set consists of two data-bearing nodes and a so-called arbiter. Write operations go only to the node currently elected "primary", and reads do, too, unless you explicitly allow or request reads to be performed on the current "secondary". The secondary constantly syncs from the primary. If the current primary goes down for some reason, the former secondary is elected primary.
The arbiter is there so that there is always a quorum (qualified majority would be an equivalent term) of members that can elect the current secondary to be the new primary. This quorum is mainly important for edge cases, but since you cannot rule out those edge cases, an odd number of voting members is a hard requirement for a MongoDB replica set (setting aside some special cases).
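For concreteness, a minimal replica set like the one described above can be initiated from the shell roughly like this; the host names and the set name rs0 are placeholders, and all three processes must already be running with --replSet rs0.
// Run once in the mongo shell against one of the data-bearing members.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.internal:27017" },                       // data-bearing
    { _id: 1, host: "db2.example.internal:27017" },                       // data-bearing
    { _id: 2, host: "arbiter.example.internal:27017", arbiterOnly: true } // arbiter, no data
  ]
});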
The beauty of this is that almost all drivers, and the Node.js driver for sure, are replica-set aware and handle the failover procedure pretty gracefully. They simply send the reads and writes to the new primary, without any changes needed anywhere else.
You only need to deal with a few cases during the failover process itself. Without going into much detail, you basically check for certain errors in the corresponding callbacks and redo the operation, if you encounter one of those errors and redoing the operation is feasible.
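As an illustration of both points, here is a minimal sketch using the official Node.js MongoDB driver; the host names, set name, database and collection are placeholders matching the example above, and the retry policy is deliberately simplistic.
// Minimal sketch with the official 'mongodb' Node.js driver. The seed list plus the
// replicaSet name let the driver discover the current primary and follow it across failovers.
const { MongoClient } = require('mongodb');
const uri =
  'mongodb://db1.example.internal:27017,db2.example.internal:27017/myapp' +
  '?replicaSet=rs0&retryWrites=true';
const client = new MongoClient(uri);
// Redo an operation a few times if it fails during the election window.
async function withRetry(operation, attempts = 3) {
  for (let i = 1; i <= attempts; i++) {
    try {
      return await operation();
    } catch (err) {
      if (i === attempts) throw err; // give up after the last attempt
      await new Promise((resolve) => setTimeout(resolve, 2000));
    }
  }
}
async function main() {
  await client.connect();
  const things = client.db('myapp').collection('things');
  await withRetry(() => things.insertOne({ createdAt: new Date() }));
}
main().catch(console.error);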
As you might have noticed, the third member, the arbiter, does not hold any data. It is a very lightweight process and can basically run on the cheapest instance you can find.
So you get data replication and automatic, transparent failover with relative ease, at the cost of the cheapest VM you can find, since you would need two data-bearing nodes anyway if you used any other means.

Distributed Lock Manager with Azure SQL database

We have a Web API using an Azure SQL database. The database model has Customers and Managers. Customers can add appointments. We can't allow overlapping appointments from two or more Customers for the same Manager. Because we are working in a distributed environment (multiple instances of the web server can insert records into the database at the same time), there is a possibility that invalid appointments will be saved. As an example, Customer 1 wants an appointment between 10:00 and 10:30, and Customer 2 wants an appointment between 10:15 and 10:45. If both requests arrive at the same time, the validation code in the Web API will not catch the error. That's why we need something like a distributed lock manager. We have read about Redlock from Redis and about ZooKeeper. My question is: is Redlock or ZooKeeper a good choice for our use case, or is there some better solution?
If we were to use Redlock, we would go with Azure Redis Cache because we already use Azure to host our Web API. We plan to identify the shared resource (the resource we want to lock) by using ManagerId + Date. This would result in a lock for a Manager on one date, so it would be possible to have other locks for the same Manager on other dates. We plan to use one instance of Azure Redis Cache; is this safe enough?
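The question doesn't say which stack the Web API runs on, so purely as an illustration of the ManagerId + Date scheme, here is a sketch in Node.js using ioredis and the node-redlock package (the v4-style lock()/unlock() API); treat the exact API, host and TTL values as assumptions and check the current package documentation.
// Sketch only: lock key = ManagerId + Date, as described above.
const Redis = require('ioredis');
const Redlock = require('redlock');
const client = new Redis({
  host: 'your-cache.redis.cache.windows.net',   // placeholder Azure Redis host
  port: 6380,
  password: process.env.REDIS_KEY,
  tls: {},                                      // Azure Redis requires TLS
});
const redlock = new Redlock([client], { retryCount: 10, retryDelay: 200 });
async function bookAppointment(managerId, date, insertAppointmentFn) {
  const resource = `locks:appointments:${managerId}:${date}`; // e.g. locks:appointments:42:2024-05-01
  const ttlMs = 10000;                                        // must comfortably outlive the DB work
  const lock = await redlock.lock(resource, ttlMs);
  try {
    await insertAppointmentFn(); // validate overlaps and insert while holding the lock
  } finally {
    await lock.unlock();
  }
}
Note that, as the answer below argues, the TTL has to exceed the worst-case duration of the validation plus insert, and that is exactly the part that is hard to guarantee.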
Q1: Is Redlock or ZooKeeper a good choice for our use case, or is there some better solution?
I consider Redlock not the best choice for your use case because:
a) Its guarantee holds only for a specific amount of time (the TTL) set before performing the DB operation. If for some reason (ask your DevOps folks for some unbelievable ones, and also check "How to do distributed locking") the DB operation takes longer than the TTL, you lose the guarantee of lock validity (see "lock validity time" in the official documentation). You could use a large TTL (minutes), or you could try to extend its validity from another thread that monitors the duration of the DB operation, but this gets incredibly complicated. On the other hand, with ZooKeeper (ZK) your lock is there until you remove it or the process dies. It could happen that your DB operation hangs, which would leave the lock hanging as well, but these kinds of problems are easily spotted by DevOps tools, which will kill the hanging process and thereby free the ZK lock (there is also the option of a monitoring process that does this faster and in a way more specific to your business).
b) While trying to lock, the processes must "fight" to win the lock; the "fighting" means they wait and then retry acquiring it. This can cause the retry count to be exceeded, which results in a failure to acquire the lock. This seems to me a less important issue, but with ZK the solution is far better: there is no "fight"; all processes queue up and wait their turn to get the lock (check the ZK lock recipe).
c) Redlock is based on time measurements, which is incredibly tricky; check at least the paragraph containing "feeling smug" in "How to do distributed locking" (and the Conclusion, too), then think again about how large that TTL value would have to be in order to be confident in your Redlock (time-based) locking.
For these reasons I consider Redlock a risky solution, while ZooKeeper is a good solution for your use case. I don't know of another distributed locking solution that fits your case better, but other distributed locking solutions do exist; just check, for example, "Apache ZooKeeper vs. etcd3".
Q2: We plan to use one instance of Azure Redis Cache; is this safe enough?
It could be safe enough for your use case, because the TTL seems to be predictable (if we really trust the time measurement; see the warning below), but only if the replica taking over from a failed master can be delayed (I am not sure this is possible; you should check the Redis configuration capabilities). If you lose the master before a lock has been synchronized to the replica, another process could simply acquire the same lock. Redlock recommends using delayed restarts (check "Performance, crash-recovery and fsync" in the official documentation) with a period of at least one TTL. If, for the reasons in Q1 (a) and (c), your TTL is very long, your system may be unable to lock for an unacceptably long period (because the single Redis master you have must be replaced by the replica in a delayed fashion).
PS: I stress again: read Martin Kleppmann's opinion on Redlock, where you'll find some unbelievable reasons a DB operation can be delayed (search for "before reaching the storage service") and equally unbelievable reasons not to rely on time measurement when locking (as well as an interesting argument against using Redlock at all).

Changing service tiers or performance level and database downtime

I have identified that we may need to scale to the next service tier at some point soon (Standard to Premium).
For others interested, this article provides great guidelines for analysing your SQL Database.
My question: Is there any downtime while scaling to a different service tier or performance level?
It depends on your definition of "downtime". I have changed performance levels many times. Going from Standard to Premium, we experienced many errors. Here are a few samples:
System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) ---> System.ComponentModel.Win32Exception (0x80004005): An existing connection was forcibly closed by the remote host.
System.Data.SqlClient.SqlException (0x80131904): The ALTER DATABASE command is in process. Please wait at least five minutes before logging into database '...', in order for the command to complete. Some system catalogs may be out of date until the command completes. If you have altered the database name, use the NEW database name for future activity.
System.Data.SqlClient.SqlException (0x80131904): The service has encountered an error processing your request. Please try again. Error code 40174. A severe error occurred on the current command. The results, if any, should be discarded.
System.Data.DataException: Unable to commit the transaction. The underlying connection is not open or has not been initialized.
My advice is to change performance levels off hours or during maintenance periods if possible.
There is no downtime when changing tiers; I have done it a few times. The change is not immediate, though: it can take at least five minutes, but during that time the database operates as normal.
As above, it depends on your definition of downtime. There is a brief period as the tier switches when transactions may be rolled back.
From the 'Scaling up or scaling down...' section of this page: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-service-tiers
Note that changing the service tier and/or performance level of a database creates a replica of the original database at the new performance level, and then switches connections over to the replica. No data is lost during this process but during the brief moment when we switch over to the replica, connections to the database are disabled, so some transactions in flight may be rolled back. This window varies, but is on average under 4 seconds, and in more than 99% of cases is less than 30 seconds. Very infrequently, especially if there are large numbers of transactions in flight at the moment connections are disabled, this window may be longer.
Since "in-flight transaction" usually refers to a transaction that is running when a connection is broken, it seems that either connections may be broken mid-transaction, or, transactions operating across multiple connections might fail and be rolled back if one the connections is denied during the switch. If the latter, then simple transactions may not often be affected during the switch. If the former, then busy databases will almost certain see some impact.
There is no downtime when changing TIERS, but there IS downtime when changing billing models. You literally have to back up your databases, spin up new databases on servers in the new billing model, and restore them. You then have to change all your database references in apps or websites. If you want to change tiers FROM a billing tier that is no longer supported, you WILL need to migrate to the new billing model first. We learned this the hard way. Microsoft doesn't make it easy either; it's not a push-button operation.
