GitLab Disaster Recovery DB read-only issue

We are setting up GitLab DR and have replicated a secondary (standby) PostgreSQL database from the primary.
But we are getting read-only errors: the standby is in read-only mode, and we can't make it read-write because it is a secondary.
What is the standard way to proceed with a DR setup for GitLab?

It depends on what you are setting up for GitLab DR: Geo has been the official solution since Nov. 2017 (GitLab 10.4).
If you are using Geo (Premium or Ultimate GitLab instances only), the documentation states:
The GitLab primary node where the write operations happen will connect to the primary database server, and the secondary nodes which are read-only will connect to the secondary database servers (which are also read-only).
Note:
In database documentation, you may see "primary" being referenced as "master"
and "secondary" as either "slave" or "standby" server (read-only).
We recommend using PostgreSQL replication slots to ensure that the primary retains all the data necessary for the secondaries to recover.
So make sure those replication slots are properly configured.
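As a quick sanity check, here is a minimal sketch (not part of GitLab's own tooling) that connects to the primary with psycopg2 and verifies that the physical replication slot the standby uses actually exists, creating it if it is missing. The host, credentials and slot name are placeholders for your own setup:

```python
# Minimal sketch: check for (and if needed create) the physical replication
# slot on the primary. DSN and slot name below are placeholders.
import psycopg2

PRIMARY_DSN = "host=primary.example.com dbname=gitlabhq_production user=gitlab_replicator"
SLOT_NAME = "geo_secondary"  # hypothetical slot name

conn = psycopg2.connect(PRIMARY_DSN)
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute(
        "SELECT slot_name, active, restart_lsn FROM pg_replication_slots WHERE slot_name = %s",
        (SLOT_NAME,),
    )
    row = cur.fetchone()
    if row is None:
        # Create a physical slot so the primary retains the WAL the secondary needs.
        cur.execute("SELECT pg_create_physical_replication_slot(%s)", (SLOT_NAME,))
        print(f"Created replication slot {SLOT_NAME}")
    else:
        print(f"Slot {row[0]} exists, active={row[1]}, restart_lsn={row[2]}")
conn.close()
```

An inactive slot, or one whose restart_lsn is far behind, is usually the first thing to investigate when the secondary falls out of sync.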

Related

Point-in-time recovery in Greenplum Database

We have recently set up Greenplum. Now the major concern is to define a strategy for PITR. Postgres provides PITR capability, but I am a little confused about how it will work in Greenplum, since each segment has its own log directory and config file.
We recently introduced the concept of a named restore point to serve as a building block for PITR in Greenplum. To use this, you call the catalog function gp_create_restore_point(), which internally creates a cluster-wide consistency point across all the segments. The function returns the restore point location for each segment and the master. Using these restore points, you can configure recovery.conf in your PITR cluster.
To demonstrate how Greenplum named restore points work, a new test
directory src/test/gpdb_pitr has been added. The test showcases WAL
Archiving in conjunction with the named restore points to do
Point-In-Time Recovery.
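For illustration, here is a rough sketch of creating a named restore point from the master, assuming the gp_create_restore_point() function described above is available in your Greenplum build; the connection details and the restore point name are placeholders:

```python
# Rough sketch: create a cluster-wide named restore point from the master.
# Assumes gp_create_restore_point() exists in this Greenplum build.
import psycopg2

MASTER_DSN = "host=gp-master.example.com port=5432 dbname=postgres user=gpadmin"

conn = psycopg2.connect(MASTER_DSN)
conn.autocommit = True
with conn.cursor() as cur:
    # Returns one restore point location per segment plus the master.
    cur.execute("SELECT * FROM gp_create_restore_point('nightly_backup_2019_01_01')")
    for row in cur.fetchall():
        print(row)
conn.close()

# Each returned location can then be referenced from recovery.conf on the
# PITR cluster, e.g. recovery_target_name = 'nightly_backup_2019_01_01'.
```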
If you are interested in the details, please refer to the following two commits, which discuss this functionality in depth:
https://github.com/greenplum-db/gpdb/commit/47896cc89b4935199aa7d97043f2b7572a71042b
https://github.com/greenplum-db/gpdb/commit/40e0fd9ce6c7da3921f0b12e55118320204f0f6d

How to perform backup and restore of a JanusGraph database which is backed by Apache Cassandra?

I'm having trouble figuring out how to take a backup of a JanusGraph database that uses Apache Cassandra as its persistent storage.
I'm looking for the correct methodology for performing backup and restore tasks. I'm very new to this concept and have no idea how to do it. It would be highly appreciated if someone could explain the correct approach or point me to the right documentation to safely execute these tasks.
Thanks a lot for your time.
Cassandra can be backed up a few ways. One way is called a "snapshot", which you trigger via the "nodetool snapshot" command. Cassandra creates a "snapshots" sub-directory, if it doesn't already exist, under each table being backed up (each table has its own directory where it stores its data), and then creates a specific directory for this particular occurrence of the snapshot (you can either name it via a "nodetool snapshot" parameter or let it default). Cassandra then creates hard links to all of the SSTables that exist for that table, looping through each table and keyspace depending on your "nodetool snapshot" parameters. It's very fast, since creating the links takes almost no time.
You have to run this command on each node in the Cassandra cluster to back up all of the data; each node's data is backed up to the local host. I know DSE, and possibly Apache Cassandra, are adding functionality to back up to object storage as well (I don't know if this is an OpsCenter-only capability or if it can be done via the snapshot command too). You will also have to watch space consumption, as there is no process that cleans these snapshots up.
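For illustration, here is a small sketch that drives the snapshot from Python on a single node; the keyspace and tag names are placeholders, and you would still need to run the equivalent on every node in the cluster:

```python
# Illustrative sketch only: take a named snapshot of one keyspace on the
# local node by shelling out to nodetool. Keyspace and tag are placeholders.
import subprocess
import datetime

KEYSPACE = "janusgraph"  # hypothetical keyspace name
TAG = "backup_" + datetime.date.today().isoformat()

# Equivalent to running: nodetool snapshot -t <tag> <keyspace>
subprocess.run(["nodetool", "snapshot", "-t", TAG, KEYSPACE], check=True)

# Snapshots accumulate under each table's data directory
# (<data_dir>/<keyspace>/<table>/snapshots/<tag>) and are never removed
# automatically, so clear old ones when no longer needed, e.g.:
# subprocess.run(["nodetool", "clearsnapshot", "-t", TAG, KEYSPACE], check=True)
```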
Like many database systems, you can also purchase/use third-party software to perform backups (e.g. Cohesity (formerly Talena), Rubrik, etc.). We use one such product in our environments and it works well (graphical interface, easy-to-use point-in-time recovery, etc.). They also offer easy-to-use "refresh" capabilities (e.g. refreshing your PT environment from, say, production backups).
Those are probably the two best options.
Good luck.

Implementing database failover in Azure Service Fabric

My company's application experienced database connection issues this morning, resulting in me having to fail over to our secondary database. Within our Azure App Services this was an easy step of changing the connection string in the configuration; however, I could not find an easy way of changing these settings on our Service Fabric services without redeploying.
I'm considering options to allow failover at runtime for these services to a secondary database, but don't know what the 'best practices' would be. A couple of options I have:
I could create a DNS entry for our database server that I manage, and then just switch it to the new server name when I need to fail over.
I could have some sort of REST API to call on my App Services that would return whether or not to go to the secondary database.
Any other ideas? I'd like to make failover to the secondary as seamless as possible so it can be done quickly.
Have you considered putting both your primary and secondary database connection strings into your application's config and writing some code that automatically switches between them if it detects a problem? Both of the options you presented put a human in the path, which means your users are going to experience downtime until the human fixes the problem (maybe the human is asleep, or on vacation, or on vacation and asleep).
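To illustrate the idea only (your Service Fabric service is presumably C#, so treat this as a sketch in Python rather than a drop-in implementation), the fallback pattern looks roughly like this; the connection strings and the open_connection helper are placeholders:

```python
# Sketch of the pattern: try the primary connection string first and fall
# back to the secondary when the primary is unreachable.
import logging

PRIMARY = "Server=primary.db.example.com;Database=app;..."      # placeholder
SECONDARY = "Server=secondary.db.example.com;Database=app;..."  # placeholder

def open_connection(connection_string):
    """Stand-in for the real client call your service uses."""
    raise NotImplementedError

def get_connection():
    for name, conn_str in (("primary", PRIMARY), ("secondary", SECONDARY)):
        try:
            conn = open_connection(conn_str)
            logging.info("Connected to %s database", name)
            return conn
        except Exception:
            logging.exception("Failed to connect to %s database", name)
    raise RuntimeError("Neither primary nor secondary database is reachable")
```

The key design point is that the decision is made inside the service on every connection attempt, so no human, config push, or DNS change sits in the failover path.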
In Service Fabric, Application (and system) upgrades are always rolling upgrades. Rolling upgrades have the advantage of preventing global outages. For example, suppose at some point you updated your config with the wrong connection string. A global config change might be quick and easy, but now you have a global outage and some upset customers. A rolling upgrade would have caught the error in the first upgrade domain and then rolled back, so only a fraction of your application would have been affected.
You can do a config-only rolling upgrade. This is where you make a change to your config package and then create a differential upgrade package so that only the config changes go out and your service process doesn't have to restart.
Just to post an update to my issue here: SQL Azure now has automatic failover groups, which are described in the Azure documentation.

Relocating live MongoDB data to another drive

I have MongoDB running on a Windows Azure Linux VM.
The database is located on the system drive and I wish to move it to another drive since there is not enough space there.
I found this post:
Changing MongoDB data store directory
There seems to be a fine solution suggested there, yet another person mentioned something about copying the files.
My database is live and receiving data all the time. How can I do this while losing as little data as possible?
Thanks,
First, if this is a production system you really need to be running this as a replica set. Running production databases on singleton MongoDB instances is not a best practice. I would consider 2 full members plus 1 arbiter the minimum production setup.
If you want to go the replica set route, you can first convert this instance to a replica set:
http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
This should involve minimal downtime.
Then add 2 new instances with the correct storage setup. After they sync you will have a full 3-member set. You can then fail over to one of the new instances and remove the bad instance from the replica set. Finally, I'd add an arbiter instance to get back up to 3 members of the replica set while keeping costs down.
If, on the other hand, you do not want to run a replica set, I'd shut down mongod on this instance, copy the files over to the new directory structure on an appropriate volume, change the config to point to it (either by changing dbpath or using a symlink), and then start it up again. Downtime will largely be a factor of the size of your existing database, so the sooner you do this the better.
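As a sketch of the copy step only (mongod must be fully stopped first, and the paths below are placeholders for your actual volumes):

```python
# Rough sketch of the copy step. Stop mongod completely before running this.
import shutil

OLD_DBPATH = "/var/lib/mongodb"    # current dbpath on the system drive (placeholder)
NEW_DBPATH = "/datadrive/mongodb"  # directory on the new, larger volume (placeholder)

# Copies data files recursively; file modes and timestamps are preserved,
# but re-check ownership for the mongod user afterwards.
shutil.copytree(OLD_DBPATH, NEW_DBPATH)

# After this, point mongod at the new location (dbpath in mongod.conf, or a
# symlink from the old path to the new one) and start it again.
```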
However - I will stress this again - if you are looking for little to no downtime with MongoDB, you need to use a replica set.

CouchDB delete and replication

I'm having a problem with the replication of my CouchDB databases.
I have a remote database which gathers measurement data and replicates it to a central server.
On the server, I add some extra parameters to these documents.
Sometimes, a measurement goes wrong and I just want to delete it.
I do that on the central server and want to replicate it to the remote database.
Since I updated the document on the central server, there is a new revision which isn't synced to the remote.
If I want to delete that measurement, CouchDB deletes the latest revision.
Replicating this to the remote doesn't delete the documents on the remote.
(Probably because it doesn't sync the latest revision first, it just wants to delete the latest revision, which isn't on the remote yet).
Replicating the database to the remote before I delete the document fixes this issue.
But sometimes, the remote host is unreachable. I want to be able to delete the document on the central database and make sure that once the remote comes online, it also deletes the document. Is there a way to do this with default couchdb commands?
You could configure continuous replication so that your remote listens for changes on the central database. If it goes offline and comes back online, restart the continuous replication.
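For example, assuming a CouchDB version with the persistent _replicator database, something along these lines run against the remote would keep pulling changes (including deletions) from the central server whenever it is reachable; the URLs, credentials and database names are placeholders:

```python
# Minimal sketch: create a continuous pull replication on the remote node
# via its _replicator database. All hosts/credentials are placeholders.
import requests

REMOTE = "http://admin:password@remote.example.com:5984"

doc = {
    "_id": "pull-measurements-from-central",
    "source": "http://admin:password@central.example.com:5984/measurements",
    "target": "measurements",   # local database on the remote
    "continuous": True,
}

resp = requests.post(f"{REMOTE}/_replicator", json=doc)
resp.raise_for_status()
print(resp.json())
```

Because the replication is stored as a document rather than started ad hoc, it is easy to check or recreate after the remote comes back online.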
