cloud run deployment pattern when new images are pushed, if services are created via terraform, is it avoidable? - terraform

I have a set of cloud run services created/maintained via terraform cloud.
When I create a new version, a github actions workflow pushes a new image to gcr.io.
Now in a normal scenario, I'd call:
gcloud run deploy auth-service --image gcr.io/riu-production/auth-service:latest
And a new version would be up. If I do this and the resource is managed by terraform, on the next run, terraform apply will fail saying it can't create that cloud run service due to a service with that name already existing. So it drifts apart in state and terraform no longer recognizes it.
A simple solution is to connect the pipeline to terraform cloud and run terraform apply -auto-approve for deployment purposes. That should work.
The problem with that is I really realy don't want to apply terraform commands in a pipeline, for now.
And the biggest one is I really would like to keep terraform out of the deployment process altogether.
Is there any way to force cloud run to take that new image for a service without messing up the terraform infrastructure?
Cloud run configs:
resource "google_cloud_run_service" "auth-service" {
name = "auth-service"
location = var.gcp_region
project = var.gcp_project
template {
spec {
service_account_name = module.cloudrun-sa.email
containers {
image = "gcr.io/${var.gcp_project}/auth-service:latest"
}
}
}
traffic {
percent = 100
latest_revision = true
}
}

In theory yes it should be possible ...
But I would recommend against that, you should be doing terraform apply on every deployment to guarantee the infrastructure is as expected.
Here are some things you can try:
Keep track of when it changes and use the import on that resource:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloud_run_service#import
Look into lifecycle ignore, you can ignore the attribute that triggers the change:
https://www.terraform.io/language/meta-arguments/lifecycle#ignore_changes

Related

terraform CLI can't find remote cloud workspaces when using multiple tags, attempts to create them

I have a 'development' - 'staging' and 'production' workspace within the terraform cloud organisation.
I'm attempting to interact with them as per documentation here.
Particularly this:
If you associate the directory with multiple workspaces (using
workspace tags), you can use the terraform workspace commands to
select which remote workspace to use.
Locally, I also have the exact same three terraform workspaces created.
Screenshots:
local workspaces:
remote workspaces and tags:
100% same organisation I was able to interact with the workspaces if I hardcode the workspace value instead of using tags.
My terraform backend cloud definitions:
terraform {
cloud {
organization = "<<myorgname>>"
workspaces {
tags = ["development", "staging", "production"]
}
}
}
When I run a simple terraform init I get greeted by:
No workspaces found.
There are no workspaces with the configured tags (development, production, staging)
in your Terraform Cloud organization. To finish initializing, Terraform needs at
least one workspace available.
Terraform can create a properly tagged workspace for you now. Please enter a
name to create a new Terraform Cloud workspace.
I've been going through the documentation here that goes into CLI-driven runs with this context but I can't figure out the right way to do this.
What I want is:
run a terraform plan or terraform apply whilst in the locally-selected development workspace
and then:
the cloud terraform to perform a run on the remote development workspace.
If I just go ahead and write 'development' as a name, it will then apply all 3 tags in the static definition to the remote 'development' workspace, thus defeating the whole purpose of using tags instead of a name.
What's the right way of doing this?
That is true, however there is also this part of documentation [1]:
tags - (Optional) A set of Terraform Cloud workspace tags. You will be able to use this working directory with any workspaces that have all of the specified tags, and can use the terraform workspace commands to switch between them or create new workspaces. New workspaces will automatically have the specified tags. This option conflicts with name.
EDIT: As mentioned in the comments, in order for local workspaces to be usable in Terraform Cloud as well (i.e., to be able to apply the code in Terraform Cloud), there has to be a "common" or a "main" tag across all workspaces created in the Terraform Cloud.
[1] https://www.terraform.io/cli/cloud/settings#arguments

Switching Terraform cloud workspaces in GitHub Actions/Terraform CLI

We're in the middle of working on a small proof of concept project which will deploy infrastructure to Azure using Terraform. Our Terraform source is held in GitHub and we've using Terraform cloud as the backend to store our state, secrets etc.
Within Terraform cloud we've created two workspaces, one for the staging environment and one for the production environment.
So far we've used the guide on the Terraform docs to develop a GitHub action which triggers on a push to the main branch and deploys our infrastructure to the staging environment. This all works great and we can see our state held in Terraform cloud.
The next hurdle is to promote our changes into the production environment.
Unfortunately we've hit a brick wall trying to figure out how to dynamically change the Terraform cloud workspace within the GitHub action so it's operating on production and not staging. I've spent most of the day looking into this with little joy.
For reference the Terraform backend is currently configured as follows:
terraform {
backend "remote" {
organization = "terraform-organisation-name"
workspaces {
name = "staging-workspace-name"
}
}
}
The action itself does an init and then and apply.
Obviously with the workspace name hardcoded this will only work on staging. Ultimately the questions comes down to how to parameterise or dynamically change the Terraform cloud workspace from the command line?
I feel I'm missing something fundamental and any help or suggestions would be greatly appreciated.

Regarding terraform script behaviour

I am using Terraform scripts to create azure services, I am having some doubts regarding Terraform,
1) If I have one environment let say dev in azure having some azure resources how can I copy all the resources to new environment lest say prod using terraform script.
2)what are the impact of re-run the terraform file with additional azure resources, what it will do.
3)What if I want to create an app service with the same name from Terraform script that already present in the azure will it update the resource or do nothing after terraform execution completed.
Please feel free to answer the question, it will be great help.
To answer your questions:
You could create a new workspace with terraform workspace new and copy all configuration files (.tf) to the new environment, then run terraform init, plan, apply.
The terraform will compare the content in your current state file with your configuration file, then update the new attributes or creating new resources other than re-creating the existing resources.
You could run terraform import to import existing infrastructure into Terraform. For referencing existing resources in the portal, you can use data sources.

Terraform Create a New EBS Snapshot on each Terraform apply

I am trying to use Terraform as part of my continuous deployment pipeline. I am using Terraform to create a snapshot of my production EBS volume (for backup purposes) prior to executing any other pipeline tasks.
I can get terraform to take the Snapshot, however the issue is Terraform will not create a new snapshot on each run. Instead it detects there is already an existing snapshot and does nothing.
For example.
Terraform Apply Execution 1 - Snapshot successfully taken.
Terraform Apply Execution 2 - No snapshot taken.
The code I am using for Terraform is provided below.
provider "aws" {
access_key = "..."
secret_key = "..."
region = "..."
}
resource "aws_ebs_snapshot" "example_snapshot" {
volume_id = "vol-xyz"
tags = {
Name = "continuous_deployment_backup"
}
}
Does anyone know how I can force Terraform to create a new EBS snapshot each time it is run?
To avoid any repetitive and manual tasks if you are working on a continuous deployment pipeline, an option could be to run CloudWatch Events rules according to a schedule automating Amazon EBS Snapshots.
You can check it out here in this tutorial suggested by AWS in its CloudWatch Documentation.
You can use Amazon Data Lifecycle Manager (Amazon DLM) to automate the creation, retention, and deletion of snapshots taken to back up your Amazon EBS volumes as well, always using terraform through the aws_dlm_lifecycle_policy resource for instance.

Terraform and Updates

Being able to capture infrastructure in a single Terraform file has obvious benefits. However, I am not clear in my mind how - once, for example, a virtual machine has been created - subsequent updates are handled.
So, to provide a specific scenario. Suppose that using Terraform we set up an Azure vm with SQL Server 2014. Then, after a month we decide that we should like to update that vm with the latest service pack for SQL Server 2014 that has just been released.
Is the recommended practice that we update the Terraform configuration file and re-apply it?
I have to disagree with the other two responses. Terraform can handle infrastructure updates just fine. The key thing to understand, however, is that Terraform largely follows an immutable infrastructure paradigm, which means that to "update" a resource, you delete the old resource and create a new one to replace it. This is much like functional programming, where variables are immutable, and to "update" something, you actually create a new variable.
The typical pattern with Terraform is to use it to deploy a server image, such as an Virtual Machine (VM) Image (e.g. an Amazon Machine Image (AMI)) or a Container Image (e.g. a Docker Image). When you want to "update" something, you create a new version of your image, deploy that onto a new server, and undeploy the old server.
Here's an example of how that works:
Imagine that you're building a Ruby on Rails app. You get the app working in dev and it's time to deploy to prod. The first step is to package the app as an AMI. You could do this using a tool like Packer. Now you have an AMI with id ami-1234.
Here is a Terraform template you could use to deploy this AMI on a server (an EC2 Instance) in AWS with an Elastic IP Address attached to it:
resource "aws_instance" "example" {
ami = "ami-1234"
instance_type = "t2.micro"
}
resource "aws_eip" "example" {
instance = "${aws_instance.example.id}"
}
When you run terraform apply, Terraform deploys the server, attaches an IP address to it, and now when users visit that IP, they will see v1 of your Rails app.
Some time later, you update your Rails app and want to deploy the new version, v2. To do that, you build a new AMI (i.e. you run Packer again) to get an ami with ID "ami-5678". You update your Terraform templates accordingly:
resource "aws_instance" "example" {
ami = "ami-5678"
instance_type = "t2.micro"
}
When you run terraform apply, Terraform undeploys the old server (which it can find because Terraform records the state of your infrastructure), deploys a new server with the new AMI, and now users will see v2 of your code at that same IP.
Of course, there is one problem here: in between the time when Terraform undeploys v1 and when it deploys v2, your users would see downtime. To work around that, you could use Terraform's create_before_destroy lifecycle setting:
resource "aws_instance" "example" {
ami = "ami-5678"
instance_type = "t2.micro"
lifecycle {
create_before_destroy = true
}
}
With create_before_destroy set to true, Terraform will create the replacement server first, switch the IP to it, and then remove the old server. This allows you to do zero-downtime deployment with immutable infrastructure (note: zero-downtime deployment works better with a load balancer that can do health checks than a simple IP address, especially if your server takes a long time to boot).
For more information on this, check out the book Terraform: Up & Running. The code samples for the book include an example of a zero-downtime deployment with a cluster of servers and a load balancer: https://github.com/brikis98/terraform-up-and-running-code
Terraform is an infrastructure provision tool, th configuration/deployment tools will be:
chef
saltstack
ansible
etc.,
As I am working with chef, so basically, I provision the server instance by terraform, then terraform (terraform provisioner) handles the control to chef for system configuration and deployment.
For the moment, terraform cannot delete the node/client in chef server, so after you terraform destroy, you need remove them by yourself.
Terraform isn't best placed for this sort of task. Terraform is an infrastructure management tool, not configuration management.
You should use tools such as chef, puppet, and ansible to deal with the configuration of the system.
If you must use terraform for this task; you could create a template_file resource and place in the configuration required to install the SQL server, and how to upgrade if a different version is presented. Reference: here
Put that code inside a provisioner under the null_resource resource. reference: here.
The trigger for this could be the variable containing the SQL version. So, when you present a different version of SQL it'll execute that provisioner on each instance to upgrade the versions.

Resources