Blue/Green Deployment with AWS ECS - Terraform

We are using ECS Fargate containers to deploy all of our services (~10) and want to follow blue/green deployment.
We have deployed all the services under the BLUE flag, with target groups pointing to the services.
In CI/CD, new target groups are created with slightly different forward rules to allow testing without any issues.
Now my system is running with two kinds of target groups, services, and task definitions -
tg_blue, service_blue, task_blue → pointing to the old containers and serving live traffic
tg_green, service_green, task_green → pointing to the new containers and not receiving any traffic
All of the above steps are done in Terraform.
Now I want to switch the traffic, and here I am stuck: how do I switch the traffic, and what will the next deployment look like?

I would go for an AWS-native solution if there are no important reasons against it. I have CodeDeploy in mind; it switches between target groups automatically.
Without CodeDeploy, you need to implement weighted balancing between the two target groups and adjust the weights yourself, which is extra work.
The whole flow is explained quite well in this YT video.
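If you stay with plain Terraform instead of CodeDeploy, the weighted-balancing approach could look roughly like the sketch below. It assumes an existing ALB (aws_lb.main) plus the two target groups from the question; all resource and variable names are illustrative. Flipping the weights (e.g. 100/0 to 0/100) and running terraform apply is the traffic switch, and the next deployment reuses the same pair with the colors' roles reversed.

```hcl
# Hypothetical HTTP listener that splits traffic between the blue and
# green target groups; the cutover is done by changing the two weights.
resource "aws_lb_listener" "main" {
  load_balancer_arn = aws_lb.main.arn # assumed existing ALB
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.tg_blue.arn
        weight = var.blue_weight # e.g. 100 before the switch, 0 after
      }

      target_group {
        arn    = aws_lb_target_group.tg_green.arn
        weight = var.green_weight # e.g. 0 before the switch, 100 after
      }
    }
  }
}
```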

Related

Azure edge layered deployment not reapplied when base deployment modified

We are using deployments for our IoT devices and managing these using deployment templates. I am in the process of migrating our deployments to a layered approach, where we use a base deployment with all required containers and then apply a layer that is dependent on the type of product.
I have noticed that a layer does not get re-applied when the base deployment is changed. Notice the bad crop of the screenshot, but it says 3 devices are targeted, yet the layer does not get applied to them after the base deployment has been updated.
When I re-apply the layer after changing the base deployment, everything works as it should.
Just because I change my base deployment, I don't want to drop the containers defined in the layer.
The documentation on layered deployments says nothing about this, and I can reproduce this consistently.
What is the intended behavior? Doesn't this break the purpose of layered deployments?
I have also noticed that our stack becomes extremely slow when using layered deployments. Rolling back to a "monolithic" deployment template for each product makes everything snappy again. We are using routes in the edgeHub, and some of these routes point to a container that is deployed as a layer. I don't know if that is an issue, but it is still very slow even after this container has been deployed. The system works, but with extreme delays.
The documentation I linked clearly states:
Any layered deployments targeting a device must have a higher priority than the automatic deployment for that device.
So now the automatic deployment has priority 0 and the layers have priority 1, and it all works.

Trigger CodeDeploy in GitLab?

I am working on a CI/CD pipeline on AWS. Given the requirements, I have to use GitLab as the repository and blue/green deployment as the deployment method for ECS Fargate. I would like to use CodeDeploy (preset in the CloudFormation template) and trigger it on each commit pushed to GitLab. I cannot use CodePipeline in my region, so CodePipeline does not work for me.
I have read so many docs and webpages related to ECS Fargate and B/G deployment, but not much of the information helps. Does anyone have related experience?
If your goal is zero downtime, ECS already comes packaged that way by default, though not in what I'd call blue/green deployment but rather a rolling upgrade. You'll have the ability to control the percentage of healthy instances, ensuring no downtime, with ECS draining connections from the old tasks and provisioning new tasks with the new version.
Your application must be able to handle this 'duality' of versions, e.g. on the data layer, in the UX, etc.
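For reference, those rolling-upgrade knobs live directly on the ECS service. A minimal sketch, assuming the cluster, task definition, and networking resources exist elsewhere (all names here are placeholders):

```hcl
resource "aws_ecs_service" "app" {
  name            = "app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 4
  launch_type     = "FARGATE"

  # Rolling upgrade: keep at least half the tasks healthy while ECS
  # drains old tasks and starts replacements running the new version.
  deployment_minimum_healthy_percent = 50
  deployment_maximum_percent         = 200

  network_configuration {
    subnets         = var.private_subnet_ids      # assumed variable
    security_groups = [aws_security_group.app.id] # assumed resource
  }
}
```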
If blue/green is an essential requirement, you'll have to leverage CodeDeploy and an ALB with ECS. Without going into implementation details, here's the highlight of it:
You have two sets of task definitions and target groups (tied to one ALB).
CodeDeploy deploys the new task definition, which is tied to the green target group, leaving blue as is.
Test your green deployment by configuring a test listener to the new target group.
When testing is complete, switch all/incremental traffic from blue to green (ALB rules/weighted targets).
Repeat the same process on the next update, except you'll be going from green back to blue.
Parts of what I've described are handled by CodeDeploy, but hopefully this gives you an idea of the solution architecture and hence how to automate it: ECS B/G.
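To make the highlights above concrete, here is a hedged Terraform sketch of the CodeDeploy side of an ECS blue/green setup. The listener, target group, ECS, and IAM resources it references are assumptions standing in for your own:

```hcl
resource "aws_codedeploy_app" "ecs" {
  compute_platform = "ECS"
  name             = "my-ecs-app" # placeholder name
}

resource "aws_codedeploy_deployment_group" "ecs" {
  app_name               = aws_codedeploy_app.ecs.name
  deployment_group_name  = "my-ecs-dg"
  service_role_arn       = aws_iam_role.codedeploy.arn # assumed role
  deployment_config_name = "CodeDeployDefault.ECSAllAtOnce"

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  ecs_service {
    cluster_name = aws_ecs_cluster.main.name
    service_name = aws_ecs_service.app.name
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout = "CONTINUE_DEPLOYMENT"
    }
    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5
    }
  }

  load_balancer_info {
    target_group_pair_info {
      # The production listener serves live traffic; the test listener
      # lets you verify green before the switch.
      prod_traffic_route {
        listener_arns = [aws_lb_listener.prod.arn]
      }
      test_traffic_route {
        listener_arns = [aws_lb_listener.test.arn]
      }
      target_group {
        name = aws_lb_target_group.blue.name
      }
      target_group {
        name = aws_lb_target_group.green.name
      }
    }
  }
}
```

Each deployment then flips which target group is live; CodeDeploy handles the rerouting and the termination of the old task set.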

Correct container technology on Azure for a long-running service

I want to run a Docker container that hosts a server that is going to be long-running (e.g. 24x7).
Initially I looked at Azure Container Instances (ACI), and whilst these seem to fit the bill perfectly, I've been advised they're not designed for long-running containers; they can also prove quite expensive to run all the time compared to a basic VM.
So I've been looking at what else I could run this as:
AKS - seems overkill for just one Docker container
App Service for Containers - my container doesn't have an HTTP endpoint, so I believe I will have issues with things like health checks
VM - this all seems a bit manual, as I'd really like not to deal with VM maintenance, and I'm also unsure I can use CI/CD techniques to build / spin up and down / do releases on a VM image (we're using Terraform to deploy infra)
Are there any best practice guides on this? I've tried searching but I'm not finding anything relevant, so I'm assuming I'm missing some key term to get going with this!
TIA
ACI is not designed for long-running (uninterrupted) processes; have a look here.
The recommendation is to use AKS, where you can fully manage the lifecycle of your machines, or just use VMs.
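If you go the AKS route with Terraform, a single-node cluster can be as small as the sketch below; names and sizes are illustrative, not a production setup:

```hcl
resource "azurerm_kubernetes_cluster" "main" {
  name                = "long-running-aks"                   # placeholder
  location            = azurerm_resource_group.main.location # assumed RG
  resource_group_name = azurerm_resource_group.main.name
  dns_prefix          = "longrunning"

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_B2s" # small node for a single container
  }

  identity {
    type = "SystemAssigned"
  }
}
```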

How to ensure that the ASG (Auto Scaling Group) replaces existing instances with every change in the Launch Configuration

The infrastructure is provisioned using Terraform code.
In our AWS environment, a new AMI is created for every commit made to the repository. Now we want to have autoscaling configured for the web servers behind an ALB using this new AMI.
How can we make sure that the ASG replaces existing instances with every change in the launch configuration? I believe that once you change the LC, only the instances created by scaling in/out are launched using the new AMI, and the existing ones are not replaced.
Also, do you have any idea how we can programmatically (via Terraform) get how many servers are running at any point in time, in the case of auto-scaling?
Any help is highly appreciated here.
Thanks!
For the most part this is pretty straightforward, and there are already a dozen implementations around the web.
The tricky part is to set the 'create_before_destroy' lifecycle field on both the LC and the ASG. You should also refer to the LC in your ASG resource; that way, once your LC is changed, you trigger a workflow that creates a new ASG that replaces your current one.
Very Good Documented Example
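A minimal sketch of that pattern, with the variable names as assumptions:

```hcl
resource "aws_launch_configuration" "web" {
  name_prefix   = "web-"
  image_id      = var.ami_id # the new AMI baked per commit
  instance_type = "t3.micro"

  # Create the replacement LC before destroying the old one.
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "web" {
  # Interpolating the LC name into the ASG name forces a new ASG
  # whenever the LC changes, replacing the existing instances.
  name                 = "web-${aws_launch_configuration.web.name}"
  launch_configuration = aws_launch_configuration.web.name
  min_size             = 2
  max_size             = 6
  vpc_zone_identifier  = var.subnet_ids                # assumed variable
  target_group_arns    = [aws_lb_target_group.web.arn] # assumed TG

  lifecycle {
    create_before_destroy = true
  }
}
```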
Also, do you have any idea how we can programmatically (via Terraform) get how many servers are running at any point in time, in the case of auto-scaling?
This depends on the context. If you have a static number, it's easy: you could define it in your module and stick with it. If it's about passing the previous ASG value, the way to do it is again described in the guide above :) You need to write a custom external handler to find out how many instances are running around your target groups at any given moment. There might of course be a new AWS API addition that lets you query the health-check property of all your target groups and get their total sum (I'm not aware of one). Then again, you might add some custom rules for scaling policies.
External Handler
Side note: in the example, the deployment happens with an ELB.
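For reading the current size back into Terraform, one option is the aws_autoscaling_group data source. Note that it reports the desired capacity, not a live health-checked count; the group name below assumes the ASG from the previous sketch:

```hcl
data "aws_autoscaling_group" "web" {
  name = aws_autoscaling_group.web.name
}

output "current_desired_capacity" {
  # Desired capacity at plan/apply time, not a real-time instance count.
  value = data.aws_autoscaling_group.web.desired_capacity
}
```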

Microservices on docker - architecture

I am building a microservices project using Docker.
One of my microservices is a listener that should get data from a varying number of sources.
What I'm trying to achieve is the ability to start and stop getting data from sources dynamically.
For example, in this drawing, I have 3 sources connected to 3 containers.
My problem starts when I need to create another container instance for a new source that becomes available. In this example, let's say source #4 is now available and I need to get its data (I know when a new source becomes available), but I want it to be scaled automatically (with source #4's information for listening).
I came up with two solutions, each of which has advantages and disadvantages:
1) Create a pool of a large number of containers running the listener service, and every time a new source is available, send a message (using RabbitMQ, but I think that's less relevant) to an available container to start getting data.
With this solution I'm a little afraid of the memory consumption of the containers running for no reason - but it is not a very complex solution.
2) Whenever a new source becomes available, create a new container (with different environment variables).
With this solution I have a problem creating the container.
At the moment I have achieved this, but the service that starts the containers (let's call it the manager) is just a regular Node.js application that executes commands on the same server - and I need it to be inside a container as well.
So the problem here is that I couldn't manage to create an SSH connection from the main container to create my new container.
I am not quite sure that either of my solutions is on the right track, and I would really appreciate any suggestions for my problem.
Your question is a bit unclear, but if you just want to scale a service horizontally, you should look into a container orchestration technology that allows you to do that - for example, Kubernetes. I recommend reading the introduction.
All you would need to do to add additional service containers is update the number of desired replicas in the Deployment configuration. For more information, read this.
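Since the rest of this stack is already managed with Terraform, a sketch of such a Deployment using the Terraform kubernetes provider might look like this (the image and all names are placeholders); scaling out is then just a matter of raising the replica count and applying:

```hcl
resource "kubernetes_deployment" "listener" {
  metadata {
    name = "listener"
  }

  spec {
    replicas = var.listener_replicas # raise this to add listener containers

    selector {
      match_labels = {
        app = "listener"
      }
    }

    template {
      metadata {
        labels = {
          app = "listener"
        }
      }

      spec {
        container {
          name  = "listener"
          image = "example/listener:latest" # placeholder image
        }
      }
    }
  }
}
```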
Using Kubernetes (or k8s for short) you will benefit from deployment automation, self-healing, and service discovery, as well as load-balancing capabilities, in addition to horizontal scalability.
There are other orchestration alternatives, too (e.g. Docker Swarm), but I would recommend looking into Kubernetes first.
Let me know if that solves your issue or if you have additional requirements that weren't so clear in your original question.
Links for your follow up questions:
1 - Run kubectl commands inside container
2 - Kubernetes autoscaling based on custom metrics
3 - Env variables in Pods
