Terraform: updating one of many ECS services/tasks - terraform

Happy Friday! Hoping someone can help me with this issue or point out the flaws in my thinking.
$ terraform --version
Terraform v0.12.7
+ provider.aws v2.25.0
+ provider.template v2.1.2
Preface
This is my first time using Terraform. We have an existing AWS ECS/Fargate environment up and running in a 'test' environment. We recently (i.e. after setting up the test env) started using Terraform for IaC purposes.
Current Config
The environment has a single ECS cluster. We're using Fargate, but I'm not sure that matters for this question. The cluster has several services, and each service has a single task (Docker image) associated with it, so they can be scaled individually. Each Docker image has its own repo.
What I'm trying to do
So with Terraform I was hoping to be able to create, update, and destroy the environment. Creating/destroying seems fairly straightforward; however, I'm hitting a roadblock with updating.
As I said, each task has its own repo. When a pull request is made against a repo, our CI platform (CircleCI, if that matters) builds the new Docker image, tags it, and pushes it. Then we use an API call to trigger a build of the Terraform repo, passing the name of the service/task that was updated.
Problem
The problem we're facing: when looping over the services (described below), I can't figure out how to get Terraform to either ignore the services that are not being updated, or to build the correct container_definitions for the aws_ecs_task_definition, specifically with the current image tag (we don't use the latest tag). So I'm trying to figure out how to look up the latest container information (tag), or just tell Terraform to skip the unmodified tasks.
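For reference, the CI-triggered Terraform run would end up being invoked with something like this (a sketch; the variable names match the script below, the values are made up):
# Hypothetical CI invocation: pass the service that changed and the image
# tag that CircleCI just pushed (variable names are from the script below).
terraform apply \
  -var 'ecs_svc_name_2_update=sub-service1' \
  -var 'ecs_svc_image_tag=1.2.3'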
Terraform Script
Here is a stripped-down version of what I have tried. This lives in a module file called ecs.tf; var.ecs_svc_names is a list of the service names. I have removed some elements that I don't think pertain to this issue, since including them would make this very large.
CAVEATS
I have not run the Terraform 'script' as shown below due to the issues I am asking about, so my syntax may be a bit off. Sorry if that is the case; hopefully this still shows what I'm trying to do...
ecs.tf
/* ecs_svc_names is passed in, but here is its definition:
variable "ecs_svc_names" {
  type        = list(string)
  description = "This is a list/array of the images/services that are contained in the cluster"
  default = [
    "main",
    "sub-service1",
    "sub-service2",
  ]
}
*/
locals {
  numberOfServices = length(var.ecs_svc_names)
}
resource "aws_ecs_cluster" "ecs_cluster" {
name = "${var.env_type}-ecs-cluster"
}
// Create the service objects (1 for each service)
resource "aws_ecs_service" "ecs-service" {
// How many services are being created
count = local.numberOfServices
name = var.ecs_svc_names[count.index]
cluster = aws_ecs_cluster.ecs_cluster.id
definition[count.index].family}:${max(aws_ecs_task_definition.ecs-task-definition[count.index].revision, data.aws_ecs_task_definition.ecs-task-def.revision)}"
desired_count = 1
launch_type = "FARGATE"
// stuff removed
}
resource "aws_ecs_task_definition" "ecs-task-definition" {
// How many tasks. There is a 1-1 relationship between tasks and services
count = local.numberOfServices
family = var.ecs_svc_names[count.index]
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
// cpu/memory stuff removed
task_role_arn = var.ecs_task_role_arn
container_definitions = data.template_file.ecs_containers_json[count.index].rendered
}
data.tf
locals {
  numberOfServices = length(var.ecs_svc_names)
}

data "aws_ecs_task_definition" "ecs-task-def" {
  // How many services are being created, 1-1 relationship between tasks and services
  count           = local.numberOfServices
  task_definition = aws_ecs_task_definition.ecs-task-definition[count.index].family
  depends_on = [
    aws_ecs_task_definition.ecs-task-definition,
  ]
}
data "template_file" "ecs_containers_json" {
// How many tasks. There is a 1-1 relationship between tasks and services
count = local.numberOfServices
template = file("${path.module}/container.json.template")
vars = {
// vars removed
image = aws_ecs_task_definition.ecs-task-definition[count.index].family
// This is where I hit the road-block, how do I get the current docker tag from Terraform?
tag = var.ecs_svc_name_2_update == var.ecs_svc_names[count.index]
? var.ecs_svc_image_tag
: data.aws_ecs_task_definition.ecs-task-def[count.index].
}
I didn't post the JSON template; if you need it, I can provide it...
Thank you

You need to pass the updated image attribute in the container definition of the new task definition revision.
You can use a data source to read the container definition of the task revision currently used by the service and pass it back into Terraform. You may follow the code below.
data "template_file" "example" {
template = "${file("${path.module}/example.json")}"
vars {
image = "${data.aws_ecs_container_definition.example.image}"
}
}
resource "aws_ecs_task_definition" "example" {
family = "${var.project_name}-${var.environment_name}-example"
container_definitions = "${data.template_file.example.rendered}"
cpu = 192
memory = 512
}
data "aws_ecs_container_definition" "example" {
task_definition = "${var.project_name}-${var.environment_name}-example"
container_name = "example"
}
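If you only need the tag rather than the full image reference, one option (a sketch, not part of the original answer; the names reuse the example above) is to split the image string on ":":
locals {
  // The data source returns the full image reference, e.g.
  // "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:v1.2.3",
  // so the tag is the element after the last ":".
  image_parts = split(":", data.aws_ecs_container_definition.example.image)
  current_tag = element(local.image_parts, length(local.image_parts) - 1)
}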

Related

Terraform Import Resources and Looping Over Those Resources

I am new to Terraform and looking to use it to manage a Snowflake environment with the "chanzuckerberg/snowflake" provider. I am specifically looking to leverage it for managing an RBAC model for roles within Snowflake.
The scenario: I have about 60 databases in Snowflake, which would equate to one resource each in Terraform. We will then create 3 roles (reader, writer, all privileges) for each database, and expand our roles from there.
The first question is: can I leverage map or object variables to define all database names and their attributes and import them using a for_each within a single resource, or do I need to create a resource for each database and then import them individually?
The second question is: what would be the best approach for creating the 3 roles per database? Is there a way to iterate over all the resources of type snowflake_database and create the 3 roles? Based on the research I have done, I was imagining the use of modules, variables, and resources.
Any help in understanding how this can be accomplished would be much appreciated. I understand the basics of Terraform, but this is a bit complex for a newbie like me to visualize well enough to implement. Thanks all!
Update:
This is what my project looks like and the error I am receiving is below it.
variables.tf:
variable "databases" {
type = list(object(
{
name = string
comment = string
retention_days = number
}))
}
databases.auto.tfvars:
databases = [
  {
    name           = "TEST_DB1"
    comment        = "Testing state."
    retention_days = 90
  },
  {
    name           = "TEST_DB2"
    comment        = ""
    retention_days = 1
  }
]
main.tf:
terraform {
  required_providers {
    snowflake = {
      source  = "chanzuckerberg/snowflake"
      version = "0.25.25"
    }
  }
}

provider "snowflake" {
  username = "user"
  account  = "my_account"
  region   = "my_region"
  password = "pwd"
  role     = "some_role"
}

resource "snowflake_database" "sf_database" {
  for_each = { for idx, db in var.databases : idx => db }

  name                        = each.value.name
  comment                     = each.value.comment
  data_retention_time_in_days = each.value.retention_days
}
To import the resource I run:
terraform import snowflake_database.sf_databases["TEST_DB1"] db_test_db1
I am left with this error:
Error: resource address "snowflake_database.sf_databases["TEST_DB1"]" does not exist in the configuration.

Before importing this resource, please create its configuration in the root module. For example:

resource "snowflake_database" "sf_databases" {
  # (resource arguments)
}
You should be able to define the databases using for_each and refer to the actual resources with bracket syntax in the import command. Something like:
terraform import snowflake_database.resource_id_using_for_each[foreachkey]
You could then create three snowflake_role and three snowflake_database_grant definitions using for_each over the same map of databases used for the database definitions.
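A sketch of that idea (hypothetical names; it assumes the provider's snowflake_role and snowflake_database_grant resources, and keys the databases by name rather than index):
locals {
  # One entry per (database, role kind) pair, e.g. "TEST_DB1_READER".
  db_roles = {
    for pair in setproduct([for db in var.databases : db.name], ["READER", "WRITER", "ALL"]) :
    "${pair[0]}_${pair[1]}" => { database = pair[0], kind = pair[1] }
  }
}

resource "snowflake_role" "db_role" {
  for_each = local.db_roles
  name     = each.key
}

# Grant each role usage on its database; the per-kind privileges (reader,
# writer, all) would be expanded along the same lines.
resource "snowflake_database_grant" "db_usage" {
  for_each      = local.db_roles
  database_name = each.value.database
  privilege     = "USAGE"
  roles         = [snowflake_role.db_role[each.key].name]
}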
I had this exact same problem, and in the end the solution was quite simple: you just need to wrap the resource address in single quotes, so the shell doesn't strip the double quotes or interpret the brackets.
So instead of
terraform import snowflake_database.sf_databases["TEST_DB1"] db_test_db1
do
terraform import 'snowflake_database.sf_databases["TEST_DB1"]' db_test_db1
This took way too long to figure out!

Terraform: ignore changes to a certain environment variable

I have an AWS Lambda function I created using Terraform. Code changes are auto-deployed from our CI server, and the commit SHA is passed as an environment variable (GIT_COMMIT_HASH), so the function changes outside of Terraform's scope (because people were asking...).
This has worked well so far. But now I wanted to update the function's Node version, and Terraform tries to reset the env var to its initial value of "unknown".
I tried to use the ignore_changes block but couldn't get Terraform to ignore the changes made elsewhere...
resource "aws_lambda_function" "test" {
filename = data.archive_file.helloworld.output_path
function_name = "TestName_${var.environment}"
role = aws_iam_role.test.arn
handler = "src/index.handler"
runtime = "nodejs14.x"
timeout = 1
memory_size = 128
environment {
variables = {
GIT_COMMIT_HASH = "unknown"
}
}
lifecycle {
ignore_changes = [
environment.0.variables["GIT_COMMIT_HASH"],
]
}
}
Is this possible? How should I reference the variable?
** edit **
Plan output looks like this:
# aws_lambda_function.test will be updated in-place
~ resource "aws_lambda_function" "test" {
      # ... removed some lines
      source_code_size = 48012865
      tags             = {}
      timeout          = 1
      version          = "12"

    ~ environment {
        ~ variables = {
            ~ "GIT_COMMIT_HASH" = "b7a77d0" -> "unknown"
          }
      }

      tracing_config {
          mode = "PassThrough"
      }
  }
I tried to replicate the issue, and in my tests it works exactly as expected. I can only suspect that you are using an old version of TF where this issue occurs. There have been numerous GitHub issues reported regarding the limitations of ignore_changes, for example here, here, or here.
I performed tests using Terraform v0.15.3 with aws provider v3.31.0, and I can confirm that ignore_changes works as it should. Since this is a TF-internal problem, the only way to rectify it, to the best of my knowledge, is to upgrade your TF.
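If you do upgrade, you could also pin a minimum Terraform version so the configuration refuses to run under an older release where ignore_changes misbehaved (a sketch, not from the original answer):
terraform {
  # ignore_changes on individual map keys was unreliable in some older
  # releases; require at least the version the tests above were run with.
  required_version = ">= 0.15.3"
}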

Terraform resource property dependent on creation of resource

Often I've found myself in the scenario where I want to create a resource with Terraform and set, for example, an environment variable on that resource which is only known at a later stage, once the resource has been created.
Let's say I want to create a google_cloud_run_service and want to set an environment variable in the container that represents the URL from which the app can be reached:
resource "google_cloud_run_service" "test_app" {
name = "test-app"
location = var.region
template {
spec {
containers {
image = "gcr.io/myimage:latest"
env {
name = "CURRENT_HOST"
value = google_cloud_run_service.test_app.status[0].url
}
}
}
}
}
This, however, is not allowed, as the resource would refer to itself and the value is not known until the service is created. Is there a way to accomplish this?

Terraform & OpenStack - Zero downtime flavor change

I'm using openstack_compute_instance_v2 to create instances in OpenStack. There is a lifecycle setting create_before_destroy = true present, and it works just fine when I, for example, change the volume size, where instances need to be replaced.
But when I change the flavor, which can be done using OpenStack's resize-instance operation, Terraform does just that and doesn't care about HA at all: all instances in the cluster are unavailable for 20-30 seconds before the resize finishes.
How can I change this behaviour?
Something like the serial setting from Ansible, or some other option, would come in handy. But I can't find anything.
I'd take any solution that would allow me to say "at least half of the instances need to be online at all times".
Terraform version: 0.12.20.
TF plan: https://pastebin.com/ECfWYYX3
The OpenStack Terraform provider knows that it can update the flavor by using a resize API call instead of having to destroy the instance and recreate it.
Unfortunately, there's currently no lifecycle option that forces a mutable attribute to go through a destroy/create (or, with the create_before_destroy lifecycle customisation, create/destroy) cycle, so you can't easily force Terraform to replace the instance instead.
One option in these circumstances is to find a parameter that can't be modified in place (these are noted by the ForceNew flag on the schema in the underlying provider source code for the resource) and then have a change in the mutable parameter also cascade a change to the immutable parameter.
A common example is replacing an AWS autoscaling group when the launch template (which is mutable, unlike the immutable launch configurations) changes, so you can roll the changes out immediately instead of waiting for the ASG to slowly replace the instances over time. A simple example would look something like this:
variable "ami_id" {
default = "ami-123456"
}
resource "random_pet" "ami_random_name" {
keepers = {
# Generate a new pet name each time we switch to a new AMI id
ami_id = var.ami_id
}
}
resource "aws_launch_template" "example" {
name_prefix = "example-"
image_id = var.ami_id
instance_type = "t2.small"
vpc_security_group_ids = ["sg-123456"]
}
resource "aws_autoscaling_group" "example" {
name = "${aws_launch_template.example.name}-${random_pet.ami_random_name.id}"
vpc_zone_identifier = ["subnet-123456"]
min_size = 1
max_size = 3
launch_template {
id = aws_launch_template.example.id
version = "$Latest"
}
lifecycle {
create_before_destroy = true
}
}
In the above example, a change to the AMI triggers a new random pet name, which changes the ASG name; since the name is an immutable field, this triggers replacing the ASG. Because the ASG has the create_before_destroy lifecycle customisation, Terraform will create a new ASG, wait for the minimum number of instances to pass EC2 health checks, and only then destroy the old ASG.
For your case, you can also use the name parameter on the openstack_compute_instance_v2 resource, as that is an immutable field as well. So a basic example might look like this:
variable "flavor_name" {
default = "FLAVOR_1"
}
resource "random_pet" "flavor_random_name" {
keepers = {
# Generate a new pet name each time we switch to a new flavor
flavor_name = var.flavor_name
}
}
resource "openstack_compute_instance_v2" "example" {
name = "example-${random_pet.flavor_random_name}"
image_id = "ad091b52-742f-469e-8f3c-fd81cadf0743"
flavor_name = var.flavor_name
key_pair = "my_key_pair_name"
security_groups = ["default"]
metadata = {
this = "that"
}
network {
name = "my_network"
}
}
So. At first I started digging into how, as #ydaetskcoR proposed, to use a random instance name.
Name wasn't an option, both because in OpenStack it is a mutable parameter and because I have a fixed naming scheme that I can't change.
I started looking for other parameters that I could modify to force the instance to be recreated instead of modified, and I found out about personality:
https://www.terraform.io/docs/providers/openstack/r/compute_instance_v2.html#instance-with-personality
But it didn't work either, mainly because personality no longer seems to be supported:
The use of personality files is deprecated starting with the 2.57 microversion. Use metadata and user_data to customize a server instance.
https://docs.openstack.org/api-ref/compute/
I'm not sure whether Terraform doesn't support it or there are other issues. But I went with user_data. I already use user_data in the compute instance module, so adding some flavor data there shouldn't be an issue.
So, within user_data I added the following:
user_data = "runcmd:\n - echo ${var.host["flavor"]} > /tmp/tf_flavor"
No need for random pet names, and no need to change instance names; just change their "personality" by writing the flavor name somewhere. This does force the instance to be recreated when the flavor changes.
So, instead of simply:
# module.instance.openstack_compute_instance_v2.server[0] will be updated in-place
~ resource "openstack_compute_instance_v2" "server" {
I now have:
-/+ destroy and then create replacement
+/- create replacement and then destroy

Terraform will perform the following actions:

# module.instance.openstack_compute_instance_v2.server[0] must be replaced
+/- resource "openstack_compute_instance_v2" "server" {

Does Terraform execute in parallel in multi-cloud deployments?

I want to create VMs in different cloud providers (e.g. GCP, AWS, Azure) from a single Terraform script.
So I wanted to know: will Terraform create the VM instances in all the public clouds in parallel?
Terraform builds a directed acyclic graph (also referred to as a DAG) to understand the dependencies between things. If something isn't dependent on something else, then it will be executed in parallel, up to the number of concurrent operations specified by the -parallelism flag, which defaults to 10.
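For example, you can raise or lower that limit per run:
# Allow up to 20 concurrent resource operations for this apply.
terraform apply -parallelism=20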
If things are completely separate across multiple providers (you're just creating the same stack in n cloud providers), then it will be comfortably parallel across those stacks.
However, I'd recommend against applying multiple environments/cloud providers in one operation like this because of blast-radius issues, and in general I'd err toward minimising how much changes in one operation.
If you have cross-provider dependencies, then Terraform is great for handling this, but it still relies on building that DAG so it can understand your dependencies.
For example, you might want to create an instance in GCP and resolve its IP address via DNS, but use AWS's Route53 for all your DNS. For this you could use something like this:
resource "google_compute_instance" "test" {
name = "test"
machine_type = "n1-standard-1"
zone = "us-central1-a"
tags = ["foo", "bar"]
boot_disk {
initialize_params {
image = "debian-cloud/debian-9"
}
}
// Local SSD disk
scratch_disk {
}
network_interface {
network = "default"
access_config {
// Ephemeral IP
}
}
metadata = {
foo = "bar"
}
metadata_startup_script = "echo hi > /test.txt"
service_account {
scopes = ["userinfo-email", "compute-ro", "storage-ro"]
}
}
data "aws_route53_zone" "example" {
name = "example.com."
}
resource "aws_route53_record" "www" {
zone_id = "${data.aws_route53_zone.example.zone_id}"
name = "www.${data.aws_route53_zone.example.name}"
type = "A"
ttl = "300"
records = ["${google_compute_instance.test.network_interface.0.access_config.0.nat_ip}"]
}
This builds a graph in which aws_route53_record.www depends on both the data.aws_route53_zone.example data source and the google_compute_instance.test resource, so Terraform knows that both of these must complete before it can start work on the Route53 record.
