Recommended ways to deal with database migrations while doing a swap using deployment slots - azure

I am trying to understand the use of deployment slots for hosting my web app using the Azure app service.
I am particularly confused with the ideal ways to deal with the database while the swap is performed.
While maintaining two database versions seems like a solution, it adds the complexity of maintaining data across multiple databases to make them consistent.
What are the recommended ways for dealing with database schema and migrations while using blue/green deployments and in particular deployment slots?

Ideally your staging and production slots would share the same database, so it would not be an issue.
But if you have more slots, then you would do better to also use separate databases and handle migrations during the release phase.

We've worked through various solutions to this problem for a few years. There's not a toolset that provides a magic bullet for all cases. There are a few solutions:
Smaller databases/trivial changes
If it is possible to execute a migration script on a database that will complete in a second or two, and you can have an easy fallback script, you can execute the script concurrently with the swap. This can also be automated. But it's a higher stress situation and not one I'd recommend. This can even be done with EF Migrations.
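For the EF Migrations case, a minimal sketch of running the pending migrations from a console app or release step just before the swap could look like this (assuming EF6 and the code-based migrations Configuration class that Enable-Migrations generates in your project):

```csharp
// Minimal sketch (EF6): apply pending migrations right before triggering the swap.
// "Configuration" is the migrations configuration class generated by
// Enable-Migrations in your own project; adjust the namespace accordingly.
using System;
using System.Data.Entity.Migrations;

public static class MigrationRunner
{
    public static void Main()
    {
        var migrator = new DbMigrator(new Configuration());

        // Applies all pending migrations. Falling back is the same call with a
        // target, e.g. migrator.Update("NameOfPreviousMigration").
        migrator.Update();

        Console.WriteLine("Database is now at the latest migration.");
    }
}
```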
Carefully ensure database compatibility between versions
Since we're dealing with a few hundred GB of data that cannot go down, we've just made it a rule that the database has to work with both versions of our application. It's not as awful or impossible as it sounds. For example, net new tables and fields can oftentimes be added before you even perform the swap. We test rollback between versions as part of our QA. If some fields need to be dropped, we wait until after the new version has been deployed and burned in, then run another script to perform the drops after we're sure we won't need rollback. We'll create new stored procedures when one needs to be upgraded so that the new version has its own. Example: sp_foo and sp_foo2.
We've had a lot of success with this strategy.
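To make the additive pattern concrete, here is a rough sketch as an EF6 code-based migration (a plain SQL script works just as well); the table, column, and procedure bodies are placeholders, with sp_foo/sp_foo2 following the naming from above:

```csharp
using System.Data.Entity.Migrations;

// Rough sketch of an additive, both-versions-compatible change; table, column
// and procedure bodies are placeholders.
public partial class AdditiveChange : DbMigration
{
    public override void Up()
    {
        // New nullable column: the old application version simply ignores it.
        AddColumn("dbo.Orders", "ExternalReference", c => c.String(nullable: true));

        // Versioned stored procedure: the new version calls sp_foo2, while the
        // old version keeps calling sp_foo, which is left untouched.
        Sql(@"CREATE PROCEDURE dbo.sp_foo2 AS
              SELECT Id, Total, ExternalReference FROM dbo.Orders");
    }

    public override void Down()
    {
        // Rollback only removes what this migration added.
        Sql("DROP PROCEDURE dbo.sp_foo2");
        DropColumn("dbo.Orders", "ExternalReference");
    }
}
```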

Slots are a feature of App Service, not of databases. If you want to use a specific database with a specific slot, you set up the slot as described here:
https://learn.microsoft.com/en-us/azure/app-service/deploy-staging-slots
When you swap slots, app configuration/settings are swapped as well, except for settings marked as slot settings. In App Settings you can therefore have two DB connection strings, each pinned to its own slot by enabling the slot setting. This is shown in the example here as well: https://learn.microsoft.com/en-us/azure/app-service/deploy-staging-slots#swap-two-slots
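As a small illustration of why this works: the application only ever asks for a connection string by name, and the slot-sticky value decides which database that name resolves to in each slot (the "DefaultConnection" name below is just an assumed placeholder):

```csharp
// Minimal sketch: the app reads a named connection string; App Service
// overrides the value per slot at runtime, and because the setting is marked
// as a slot setting it does NOT move with the app during a swap.
using System.Configuration;

public static class Db
{
    public static string GetConnectionString()
    {
        // Staging slot gets the staging DB value, production slot gets the
        // production DB value, with no code change.
        return ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;
    }
}
```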

Related

Microservices on GCP

I am looking to use GCP for a micro-services application. After comparing AWS and GCP I have decided to go with Google because one major requirement for the project is to schedule tasks to run in the future (Cloud Tasks) which AWS does not seem to offer an equivalent of.
I am planning on containerizing my services and deploying to GCP using Cloud Run with a Redis cluster running as well for caching.
I understand that you cannot have multiple Firestore instances running in one project. Does this mean that all of my services will be using the same database?
I was looking to follow a model (possible on AWS) where each service had its own database instance that it reached out to.
Is this pattern possible on GCP?
Firestore indeed is for the moment limited to a single database instance per project. For performance that is usually not a problem, but for isolation such as your use-case, that can indeed be a reason to look elsewhere.
Firebase's original Realtime Database does allow multiple instances per project, and recently added a REST API for provisioning database instances. Since each Google Cloud Project can be toggled to also be a Firebase project, you could consider that.
Does this mean that all of my services will be using the same database?
I don't know all the details of your case. Do you think that you can deploy a "microservice" per project? It is not ideal, especially if they are to communicate using PubSub, but it may be an option. In that case every "microservice" may get its own Firestore if that is a requirement.
I don't think one should consider GCP projects as some kind of "hard boundary". For me they are just another level of granularity, in addition to folders, etc.
There might be some benefits to a "one microservice - one project" approach as well. For example, less interdependent lifecycles, better (more fine-grained) security, and maybe simpler development workflows...

Web application deployment approaches

Currently, our product is a web application with SQL Server as DBMS, ASP.NET backend, and classic HTML/JavaScript/CSS frontend. The product is actively developed and each month we have to deploy a new version of it to production.
During this deployment, we update all the components listed above (apply some SQL scripts, update binaries and client files), but we deploy only the delta (the set of files that were changed since the last release). This has some benefits, e.g. we do not reset custom data/configs/client adjustments.
Now we are going to move to clouds like Azure, AWS, etc., adjust the product architecture to work with Docker/Kubernetes, and provide the product as SaaS.
And now the question itself: "Which approach to deployment is recommended in the clouds?" Can we keep applying only the delta, or do we have to reorganize the process to always deploy from scratch?
If there are some Internet resources I have missed, please share.
This question is extremely broad but maybe some clarification could steer you in the right direction anyway:
Source code deployments (like applying deltas) and container deployments are two very different directions, in the sense that the tooling you invest in during the entire SDLC can differ substantially. Some testing pipelines/products focus heavily (or exclusively) on working with one or the other. There will be tools that can handle both, of course.
They also differ in the problems they're attempting to solve and come with their own pros and cons:
Source Code Deployments/Apply Diffs:
Good for small teams and quick deployments, as they're simple to understand and set up.
Starts to introduce risk when you need to upgrade the host OS or application dependencies
Starts to introduce risk when the hosts in production begin to drift (have more differing files than expected) more dramatically over time
Slack has a good write up of their experience here.
Container deployments
Provides isolation from the application (developer space) and the Host OS (sysadmin/ops space). This usually means they can work with each other independently.
Gives an "artifact" that won't change between deployments, ie the container tagged v1 will always be the same unless you do something really funky. You can't really guarantee this
The practice of isolating stateless components makes autoscaling those components very easy, and you can eventually spend more time on the harder ones (usually stateful).
Introduces a new abstraction with new concerns that your team will have to mature into. Testing pipelines, dev tooling, and monitoring/logging architectures might all need to be adjusted over time, and that comes with cost and risk.
Stateful containers are hardly a solved problem (i.e. shoving an existing database into a container can be a surprising challenge).
In order to work with Kubernetes, you need to have a containerized application. That doesn't mean you need to containerize your entire product overnight. Splitting out the front end to deploy with CloudFront/S3 and containerizing a stateless app will get your feet wet.
Some books that talk about devops philosophies (in which this transition plays a part)
The Devops Handbook
Accelerate
Effective Devops
SRE book

Blue-green deployments on Azure for multi-tenancy

Let us say blue and green app services share the same database instances and you can use slots for swapping the applications. How would you handle the schema-breaking changes, as some users may be about to post a request that would not work with the new schema?
From my understanding, it looks like you always have to write backward-compatible code that works in both cases to handle the schema changes, which does not seem ideal to me.
There is no magic bullet for rolling back Azure SQL schema changes. Unfortunately you will have to create a script to deploy new updates to your database, and one to roll the changes back if things do not go well, should you choose to do so. There is a tool called Elastic Jobs that will help you execute scripts across one or more databases in an elastic pool.
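As one possible sketch (assuming EF6 code-based migrations, with placeholder migration names), both the deploy script and the rollback script can be generated ahead of time so they can be reviewed and, if needed, distributed across databases:

```csharp
// Minimal sketch (EF6): script the forward migration and its rollback up
// front. "OldMigration"/"NewMigration" are placeholders for real migration
// names, and Configuration is the project's own migrations configuration.
using System;
using System.Data.Entity.Migrations;
using System.Data.Entity.Migrations.Infrastructure;

public static class ScriptGenerator
{
    public static void Main()
    {
        var migrator = new DbMigrator(new Configuration());
        var scripter = new MigratorScriptingDecorator(migrator);

        // SQL to upgrade from the currently deployed migration to the new one.
        string deployScript = scripter.ScriptUpdate("OldMigration", "NewMigration");

        // SQL to roll back again if the release goes badly.
        string rollbackScript = scripter.ScriptUpdate("NewMigration", "OldMigration");

        Console.WriteLine(deployScript);
        Console.WriteLine(rollbackScript);
    }
}
```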

Azure Web Site Migrations & Concurrency

I have two Azure Websites set up - one that serves the client application with no database, another with a database and WebApi solution that the client gets data from.
I'm about to add a new table to the database and populate it with data using a temporary Seed method that I only plan on running once. I'm not sure what the best way to go about it is though.
Right now I have the database initializer set to MigrateDatabaseToLatestVersion and I've tested this update locally several times. Everything seems good to go but the update / seed method takes about 6 minutes to run. I have some questions about concurrency while migrating:
What happens when someone performs CRUD operations against the database while business logic and tables are being updated in this 6-minute window? I mean - the time between when I hit "publish" from VS, and when the new bits are actually deployed. What if the seed method modifies every entry in another table, and a user adds some data mid-seed that doesn't get hit by this critical update? Should I lock the site while doing it just in case (far from ideal...)?
Any general guidance on this process would be fantastic.
Operations like creating a new table or adding new columns should have only minimal impact on the performance and be transparent, especially if the application applies the recommended pattern of dealing with transient faults (for instance by leveraging the Enterprise Library).
Mass updates or reindexing could cause contention and affect the application's performance or even cause errors. Depending on the case, transient fault handling could work around that as well.
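As an aside, if you are on EF6 there is a built-in retry policy for transient SQL Azure faults that can stand in for the Enterprise Library block; a minimal sketch (the class is picked up by convention when it lives in the same assembly as the DbContext):

```csharp
// Minimal sketch: code-based EF6 configuration enabling the built-in
// SQL Azure execution strategy, which retries operations that fail with
// known transient error codes using exponential backoff.
using System.Data.Entity;
using System.Data.Entity.SqlServer;

public class AppDbConfiguration : DbConfiguration
{
    public AppDbConfiguration()
    {
        SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy());
    }
}
```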
Concurrent modifications to data that is being upgraded could cause problems that would be more difficult to deal with. These are some possible approaches:
Maintenance window
The simplest and safest approach is to take the application offline, back up the database, upgrade the database, update the application, test, and bring the application back online.
Read-only mode
This approach avoids making the application completely unavailable, by keeping it online but disabling any feature that changes the database. The users can still query and view data while the application is updated.
Staged upgrade
This approach is based on carefully planned sequences of changes to the database structure and data and to the application code so that at any given stage the application version that is online is compatible with the current database structure.
For example, let's suppose we need to introduce a "date of last purchase" field to a customer record. This sequence could be used:
Add the new field to the customer record in the database (without updating the application). Set the new field's default value to NULL.
Update the application so that for each new sale, the date of last purchase field is updated. For old sales the field is left unchanged, and the application at this point does not query or show the new field.
Execute a batch job on the database to update this field for all customers where it is still NULL. A delay could be introduced between updates so that the system is not overloaded (a sketch of such a batch job follows this list).
Update the application to start querying and showing the new information.
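A rough sketch of the batch job in step 3 (table and column names are hypothetical): update in small chunks and pause between batches so the live system is not overloaded while users keep working.

```csharp
// Minimal sketch: backfill the new nullable column in batches of 500 rows,
// sleeping between batches to keep load on the production database low.
using System;
using System.Data.SqlClient;
using System.Threading;

public static class BackfillLastPurchaseDate
{
    public static void Run(string connectionString)
    {
        const string batchSql = @"
            UPDATE TOP (500) c
            SET    c.LastPurchaseDate = s.LastSaleDate
            FROM   dbo.Customers c
            JOIN   (SELECT CustomerId, MAX(SaleDate) AS LastSaleDate
                    FROM dbo.Sales GROUP BY CustomerId) s
                   ON s.CustomerId = c.Id
            WHERE  c.LastPurchaseDate IS NULL;";

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            int affected;
            do
            {
                using (var command = new SqlCommand(batchSql, connection))
                {
                    affected = command.ExecuteNonQuery();
                }
                // Pause between batches so user traffic is not starved.
                Thread.Sleep(TimeSpan.FromSeconds(2));
            }
            while (affected > 0);
        }
    }
}
```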
There are several variations of this approach, such as the concept of "expansion scripts" and "contraction scripts" described in Zero-Downtime Database Deployment. This could be used along with feature toggles to change the application's behavior dynamically as the upgrade stages are executed.
New columns could be added to records to indicate that they have been converted. The application logic could be adapted to deal with records in the old version and in the new version concurrently.
The Entity Framework may impose some additional limitations on the options, because it generates the SQL statements on behalf of the application, so you would have to take that into consideration when planning the stages.
Staging environment
Changing the production database structure and executing mass data changes is risky business, especially when it must be done in a specific sequence while data is being entered and changed by users. Your options to revert mistakes can be severely limited.
It would be necessary to do extensive testing and simulation in a separate staging environment before executing the upgrade procedures on the production environment.
I agree with the maintenance window idea from Fernando. But here is the approach I would take given your question.
Make sure your database is backed up before doing anything (I am assuming it's SQL Azure)
Put up a maintenance page on the Client Application
Run the migration against your database via Visual Studio (I am assuming you are doing this through the console) or via a unit test
Publish the website/web api websites
Verify your changes.
The main thing about working with the seed method via Entity Framework is that it's easy to get wrong, and without a proper backup, running it against production can get you in trouble real fast. I would probably run it through your test database/environment first (if you have one) to verify that what you want to happen is actually happening.
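For reference, a minimal sketch of the MigrateDatabaseToLatestVersion setup mentioned in the question (the context and configuration classes are placeholders standing in for the project's own); with this initializer the migration and Seed run on the first request that touches the context, which is exactly why the maintenance page should go up before publishing:

```csharp
using System.Data.Entity;
using System.Data.Entity.Migrations;

// Placeholder context and migrations configuration, standing in for the ones
// already present in the project.
public class AppDbContext : DbContext { }
public class Configuration : DbMigrationsConfiguration<AppDbContext> { }

public static class DbBootstrap
{
    public static void Configure()
    {
        // Typically called from Application_Start in Global.asax: pending
        // migrations (and the Seed method) run on first use of the context.
        Database.SetInitializer(
            new MigrateDatabaseToLatestVersion<AppDbContext, Configuration>());
    }
}
```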

mvc-mini-profiler - working with a load balanced web role (azure et al)

I believe that the mvc mini profiler is a bit of a 'God-send'.
I have incorporated it in a new MVC project which is targeting the Azure platform.
My question is - how to handle profiling across server (role instance) barriers?
Is this is even possible?
I don't understand why you would need to profile these apps any differently. You want to profile how your app behaves on the production server - go ahead and do it.
A single request will still be executed on a single instance, and you'll get the data from that same instance. If you want to profile services located on a different physical tier as well, that would require different approaches; involving communication through internal endpoints which I'm sure the mini profiler doesn't support out of the box. However, the modification shouldn't be that complicated.
However, if you do want to profile physically separated tiers, I would go about it in a different way: profile each tier independently, because that's how I would go about optimizing it. If you wrap the call to your other tier in a profiler statement, you can see where the problem lies and still be able to solve it.
By default the mvc-mini-profiler stores and delivers its results using HttpRuntime.Cache. This is going to cause some problems in a multi-instance environment.
If you are using multiple instances, then some ways you might be able to make this work are:
to change the Http Cache to an AppFabric Cache implementation (or some MemCached implementation)
to use an alternative Storage strategy for your profile results (the code includes SqlServerStorage as an example?)
Obviously, whichever strategy you choose will require more time/resources than just the single instance implementation.
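For the second option, a minimal sketch using the older StackExchange.Profiling API (MiniProfiler.Settings, which is what the mvc-mini-profiler of that era exposed): persist results to SQL Server so whichever instance serves the next request can still return them.

```csharp
// Minimal sketch: swap the default HttpRuntime.Cache-backed storage for a
// shared SQL Server database (the schema creation script ships with the
// MiniProfiler source), so results survive load-balanced requests.
using StackExchange.Profiling;
using StackExchange.Profiling.Storage;

public static class ProfilerConfig
{
    public static void Configure(string connectionString)
    {
        MiniProfiler.Settings.Storage = new SqlServerStorage(connectionString);
    }
}
```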
