Rolling update of ASG using launch template - terraform

When I update the AMI associated with an aws_launch_template, Terraform creates a new version of the launch template as expected and also updates the aws_autoscaling_group to point to the new version of the launch template.
However, no rolling update is performed to replace the existing instances with new instances based on the new AMI; I have to terminate the existing instances manually, after which the ASG brings up new instances using the new AMI.
What changes do I have to make to my config to get Terraform to perform a rolling update?
Existing code is as follows:
resource "aws_launch_template" "this" {
name_prefix = "my-launch-template-"
image_id = var.ami_id
instance_type = "t3.small"
key_name = "testing"
vpc_security_group_ids = [ aws_security_group.this.id ]
lifecycle {
create_before_destroy = true
}
}
resource "aws_autoscaling_group" "this" {
name_prefix = "my-asg-"
vpc_zone_identifier = var.subnet_ids
target_group_arns = var.target_group_arns
health_check_type = "ELB"
health_check_grace_period = 300
default_cooldown = 10
min_size = 4
max_size = 4
desired_capacity = 4
launch_template {
id = aws_launch_template.this.id
version = aws_launch_template.this.latest_version
}
lifecycle {
create_before_destroy = true
}
}

I recently worked on that exact same scenario.
We used the random_pet resource to generate a human-readable random name that is tied to the AMI, so it changes whenever the AMI changes.
resource "random_pet" "ami_random_name" {
keepers = {
# Generate a new pet name every time we change the AMI
ami_id = var.ami_id
}
}
You can then use that random_pet id in an argument that forces recreation of your Auto Scaling group.
For example, with name_prefix:
resource "aws_autoscaling_group" "this" {
name_prefix = "my-asg-${random_pet.ami_random_name.id}"
vpc_zone_identifier = var.subnet_ids
target_group_arns = var.target_group_arns
health_check_type = "ELB"
health_check_grace_period = 300
default_cooldown = 10
min_size = 4
max_size = 4
desired_capacity = 4
launch_template {
id = aws_launch_template.this.id
version = aws_launch_template.this.latest_version
}
lifecycle {
create_before_destroy = true
}
}

ASG instance refresh is another option: it replaces all of the old instances with new instances based on the latest version of the launch template (make sure the ASG is configured to use launch template version $Latest). Other benefits are:
You can set a warm-up time before instances receive traffic (useful if your bootstrap installation takes a while).
You can specify what percentage of the ASG's instances to replace in parallel, to speed things up.
Below is the Terraform code block, which goes inside the aws_autoscaling_group resource; see the AWS documentation on instance refresh for more about the feature.
instance_refresh {
  strategy = "Rolling"
  preferences {
    min_healthy_percentage = 50
  }
  triggers = ["tag"]
}
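For context, here is a minimal sketch of how that block sits inside the ASG from the question (the unchanged arguments are elided). A refresh is started automatically whenever the group's launch template or launch configuration changes; the triggers list only adds extra conditions such as tag changes, and instance_warmup is the optional warm-up time mentioned above.

resource "aws_autoscaling_group" "this" {
  # ... name_prefix, sizes, health checks and the other arguments from the question ...

  launch_template {
    id      = aws_launch_template.this.id
    version = aws_launch_template.this.latest_version
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
      # Optional: seconds a new instance must be healthy before it counts as replaced.
      instance_warmup = 300
    }
    # Optional extra trigger; launch template changes already start a refresh.
    triggers = ["tag"]
  }
}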

Related

How to dynamically attach multiple volumes to multiple instances via terraform in openstack?

I have written a Terraform module (v0.14) that can be used to provision multiple instances in OpenStack with the correct availability_zone, network port and flavor, and that can create 1 local disk or 1 blockstorage_volume based on boolean input variables, and so forth. Now I have received a feature request to dynamically add multiple blockstorage_volumes (network/shared storage volumes) to multiple instances.
Example of how everything dynamically works for 1 blockstorage_volume on multiple instances:
# Dynamically create this resource ONLY if boolean "volume_storage" = true
resource "openstack_blockstorage_volume_v3" "shared_storage" {
  for_each          = var.volume_storage ? var.nodes : {}
  name              = "${each.value}-${each.key}.${var.domain}"
  size              = var.volume_s
  availability_zone = each.key
  volume_type       = var.volume_type
  image_id          = var.os_image
}
resource "openstack_compute_instance_v2" "instance" {
for_each = var.nodes
name = "${each.value}-${each.key}.${var.domain}"
flavor_name = var.flavor
availability_zone = each.key
key_pair = var.key_pair
image_id = (var.volume_storage == true ? "" : var.os_image)
config_drive = true
#Dynamically create this parameter ONLY if boolean "volume_storage" = true
dynamic "block_device" {
for_each = var.volume_storage ? var.volume : {}
content {
uuid = openstack_blockstorage_volume_v3.shared_storage[each.key].id
source_type = "volume"
destination_type = "volume"
delete_on_termination = var.volume_delete_termination
}
}
user_data = data.template_file.cloud-init[each.key].rendered
scheduler_hints {
group = openstack_compute_servergroup_v2.servergroup.id
}
network {
port = openstack_networking_port_v2.network-port[each.key].id
}
}
So let's say that I now have 2 instances and I want to dynamically add 2 extra blockstorage_volumes to each instance. My first idea was to add 2 extra dynamic resources as a try-out:
# Dynamically create this resource if boolean "multiple_volume_storage" = true
resource "openstack_blockstorage_volume_v3" "multiple_shared_storage" {
  for_each          = var.multiple_volume_storage ? var.multiple_volumes : {}
  name              = each.value
  size              = var.volume_s
  availability_zone = each.key
  volume_type       = var.volume_type
  image_id          = var.os_image
}
Example of 2 extra blockstorage_volumes defined in a .tf file:
variable "multiple_volumes" {
type = map(any)
default = {
dc1 = "/volume/mysql"
dc2 = "/volume/postgres"
}
}
Example of 2 instances defined in a .tf file:
nodes = {
  dc1 = "app-stage"
  dc2 = "app-stage"
}
Here I try to dynamically attach 2 extra blockstorage_volumes to each instance:
resource "openstack_compute_volume_attach_v2" "attach_multiple_shared_storage" {
for_each = var.multiple_volume_storage ? var.multiple_volumes : {}
instance_id = openstack_compute_instance_v2.instance[each.key].id
volume_id = openstack_blockstorage_volume_v3.multiple_shared_storage[each.key].id
}
The openstack_compute_instance_v2.instance[each.key] reference is obviously not correct, since it now only attaches 1 extra blockstorage_volume per instance. Is there a clean/elegant way to solve this? So basically: attach all of the volumes given in variable "multiple_volumes" to every single instance that is defined in var.nodes.
Kind regards,
Jonas
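One possible approach (a rough sketch, not from the original module, assuming the Terraform 0.12+ syntax already in use) is to build the node/volume cross product with setproduct and drive both the volume and the attachment resources from that combined map. The key format and the volume naming below are illustrative choices:

locals {
  # One entry per (node, volume) combination, e.g. "dc1-dc2" for node dc1 and volume dc2.
  node_volume_pairs = {
    for pair in setproduct(keys(var.nodes), keys(var.multiple_volumes)) :
    "${pair[0]}-${pair[1]}" => {
      node_key   = pair[0]
      volume_key = pair[1]
    }
  }
}

resource "openstack_blockstorage_volume_v3" "multiple_shared_storage" {
  for_each = var.multiple_volume_storage ? local.node_volume_pairs : {}

  # Name combines the instance name, its key and the volume label; adjust to taste.
  name              = "${var.nodes[each.value.node_key]}-${each.value.node_key}-${each.value.volume_key}"
  size              = var.volume_s
  availability_zone = each.value.node_key
  volume_type       = var.volume_type
  # image_id omitted here so these are plain data volumes.
}

resource "openstack_compute_volume_attach_v2" "attach_multiple_shared_storage" {
  for_each    = var.multiple_volume_storage ? local.node_volume_pairs : {}
  instance_id = openstack_compute_instance_v2.instance[each.value.node_key].id
  volume_id   = openstack_blockstorage_volume_v3.multiple_shared_storage[each.key].id
}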

Will "aws_ami" with most_recent=true impact future updates?

Given the config below, what happens if I run the apply command against the infrastructure after Amazon has rolled out a new version of the AMI?
Will the test instance be destroyed and recreated?
So the scenario is:
terraform init
terraform apply
wait N months
terraform plan (or apply)
Am I going to see a "forced" recreation of the EC2 instance that was created N months ago using the older version of the AMI, which was the most recent version back then?
data "aws_ami" "amazon-linux-2" {
most_recent = true
filter {
name = "owner-alias"
values = ["amazon"]
}
filter {
name = "name"
values = ["amzn2-ami-hvm*"]
}
}
resource "aws_instance" "test" {
depends_on = ["aws_internet_gateway.test"]
ami = "${data.aws_ami.amazon-linux-2.id}"
associate_public_ip_address = true
iam_instance_profile = "${aws_iam_instance_profile.test.id}"
instance_type = "t2.micro"
key_name = "bflad-20180605"
vpc_security_group_ids = ["${aws_security_group.test.id}"]
subnet_id = "${aws_subnet.test.id}"
}
Will "aws_ami" with most_recent=true impact future updates?
@ydeatskoR and @sogyals429 have the right answer. To be more concrete:
resource "aws_instance" "test" {
# ... (all the stuff at the top)
lifecycle {
ignore_changes = [
ami,
]
}
}
note: docs moved to: https://www.terraform.io/docs/language/meta-arguments/lifecycle.html#ignore_changes
Yes, as per what @ydaetskcoR said, you can have a look at the ignore_changes lifecycle argument and then it would not recreate the instances: https://www.terraform.io/docs/configuration/resources.html#ignore_changes

Terraform resource recreation dynamic AWS RDS instance counts

I have a question relating to AWS RDS cluster and instance creation.
Environment
We recently experimented with:
Terraform v0.11.11
provider.aws v1.41.0
Background
Creating some AWS RDS databases. Our mission was that in some environments (e.g. staging) we may run fewer instances than in others (e.g. production). With this in mind, and not wanting to have totally different Terraform files per environment, we instead decided to specify the database resources just once and use a variable for the number of instances, which is set in our staging.tf and production.tf files respectively.
Potentially one more "quirk" of our setup is that the VPC in which the subnets exist is not defined in Terraform; the VPC already existed via manual creation in the AWS console, so it is provided as a data source, and the subnets for the RDS are specified in Terraform. Again this is dynamic, in the sense that in some environments we might have 3 subnets (1 in each AZ), whereas in others we might have only 2. To achieve this we used iteration, as shown below:
Structure
|- /environments
   |- /staging
      - staging.tf
   |- /production
      - production.tf
|- /resources
   - database.tf
Example Environment Variables File
dev.tf
terraform {
  backend "s3" {
    bucket         = "my-bucket-dev"
    key            = "terraform"
    region         = "eu-west-1"
    encrypt        = "true"
    acl            = "private"
    dynamodb_table = "terraform-state-locking"
  }

  required_version = "~> 0.11.8"
}

provider "aws" {
  access_key          = "${var.access_key}"
  secret_key          = "${var.secret_key}"
  region              = "${var.region}"
  version             = "~> 1.33"
  allowed_account_ids = ["XXX"]
}

module "main" {
  source = "../../resources"

  vpc_name                        = "test"
  test_db_name                    = "terraform-test-db-dev"
  test_db_instance_count          = 1
  test_db_backup_retention_period = 7
  test_db_backup_window           = "00:57-01:27"
  test_db_maintenance_window      = "tue:04:40-tue:05:10"
  test_db_subnet_count            = 2
  test_db_subnet_cidr_blocks      = ["10.2.4.0/24", "10.2.5.0/24"]
}
We came to this module-based structure for environment isolation mainly due to these discussions:
https://github.com/hashicorp/terraform/issues/18632#issuecomment-412247266
https://github.com/hashicorp/terraform/issues/13700
https://www.terraform.io/docs/state/workspaces.html#when-to-use-multiple-workspaces
Our Issue
Initial resource creation works fine: our subnets are created and the database cluster starts up.
Our issues start the next time we run terraform plan or terraform apply (with no changes to the files), at which point we see interesting things like:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

-/+ module.main.aws_rds_cluster.test_db (new resource required)
      id:                            "terraform-test-db-dev" => <computed> (forces new resource)
      availability_zones.#:          "3" => "1" (forces new resource)
      availability_zones.1924028850: "eu-west-1b" => "" (forces new resource)
      availability_zones.3953592328: "eu-west-1a" => "eu-west-1a"
      availability_zones.94988580:   "eu-west-1c" => "" (forces new resource)
and
-/+ module.main.aws_rds_cluster_instance.test_db (new resource required)
      id:                 "terraform-test-db-dev" => <computed> (forces new resource)
      cluster_identifier: "terraform-test-db-dev" => "${aws_rds_cluster.test_db.id}" (forces new resource)
Something about the way we are approaching this appears to be causing Terraform to believe that the resource has changed to such an extent that it must destroy the existing resource and create a brand new one.
Config
variable "aws_availability_zones" {
description = "Run the EC2 Instances in these Availability Zones"
type = "list"
default = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
}
variable "test_db_name" {
description = "Name of the RDS instance, must be unique per region and is provided by the module config"
}
variable "test_db_subnet_count" {
description = "Number of subnets to create, is provided by the module config"
}
resource "aws_security_group" "test_db_service" {
name = "${var.test_db_service_user_name}"
vpc_id = "${data.aws_vpc.vpc.id}"
}
resource "aws_security_group" "test_db" {
name = "${var.test_db_name}"
vpc_id = "${data.aws_vpc.vpc.id}"
}
resource "aws_security_group_rule" "test_db_ingress_app_server" {
security_group_id = "${aws_security_group.test_db.id}"
...
source_security_group_id = "${aws_security_group.test_db_service.id}"
}
variable "test_db_subnet_cidr_blocks" {
description = "Cidr block allocated to the subnets"
type = "list"
}
resource "aws_subnet" "test_db" {
count = "${var.test_db_subnet_count}"
vpc_id = "${data.aws_vpc.vpc.id}"
cidr_block = "${element(var.test_db_subnet_cidr_blocks, count.index)}"
availability_zone = "${element(var.aws_availability_zones, count.index)}"
}
resource "aws_db_subnet_group" "test_db" {
name = "${var.test_db_name}"
subnet_ids = ["${aws_subnet.test_db.*.id}"]
}
variable "test_db_backup_retention_period" {
description = "Number of days to keep the backup, is provided by the module config"
}
variable "test_db_backup_window" {
description = "Window during which the backup is done, is provided by the module config"
}
variable "test_db_maintenance_window" {
description = "Window during which the maintenance is done, is provided by the module config"
}
data "aws_secretsmanager_secret" "test_db_master_password" {
name = "terraform/db/test-db/root-password"
}
data "aws_secretsmanager_secret_version" "test_db_master_password" {
secret_id = "${data.aws_secretsmanager_secret.test_db_master_password.id}"
}
data "aws_iam_role" "rds-monitoring-role" {
name = "rds-monitoring-role"
}
resource "aws_rds_cluster" "test_db" {
cluster_identifier = "${var.test_db_name}"
engine = "aurora-mysql"
engine_version = "5.7.12"
# can only request to deploy in AZ's where there is a subnet in the subnet group.
availability_zones = "${slice(var.aws_availability_zones, 0, var.test_db_instance_count)}"
database_name = "${var.test_db_schema_name}"
master_username = "root"
master_password = "${data.aws_secretsmanager_secret_version.test_db_master_password.secret_string}"
preferred_backup_window = "${var.test_db_backup_window}"
preferred_maintenance_window = "${var.test_db_maintenance_window}"
backup_retention_period = "${var.test_db_backup_retention_period}"
db_subnet_group_name = "${aws_db_subnet_group.test_db.name}"
storage_encrypted = true
kms_key_id = "${data.aws_kms_key.kms_rds_key.arn}"
deletion_protection = true
enabled_cloudwatch_logs_exports = ["audit", "error", "general", "slowquery"]
vpc_security_group_ids = ["${aws_security_group.test_db.id}"]
final_snapshot_identifier = "test-db-final-snapshot"
}
variable "test_db_instance_count" {
description = "Number of instances to create, is provided by the module config"
}
resource "aws_rds_cluster_instance" "test_db" {
count = "${var.test_db_instance_count}"
identifier = "${var.test_db_name}"
cluster_identifier = "${aws_rds_cluster.test_db.id}"
availability_zone = "${element(var.aws_availability_zones, count.index)}"
instance_class = "db.t2.small"
db_subnet_group_name = "${aws_db_subnet_group.test_db.name}"
monitoring_interval = 60
engine = "aurora-mysql"
engine_version = "5.7.12"
monitoring_role_arn = "${data.aws_iam_role.rds-monitoring-role.arn}"
tags {
Name = "test_db-${count.index}"
}
}
My question is: is there a way to achieve this so that Terraform does not try to recreate the resources, i.e. ensure that the availability zones of the cluster and the ID of the instance do not change each time we run Terraform?
It turns out that simply removing the explicit availability zone definitions from the aws_rds_cluster and aws_rds_cluster_instance makes this issue go away, and everything so far appears to work as expected. See also https://github.com/terraform-providers/terraform-provider-aws/issues/7307#issuecomment-457441633
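For illustration, a trimmed sketch of the two resources with the explicit placement removed; the AZs then follow the subnets in the db_subnet_group instead of being tracked as a list that Terraform keeps trying to shrink. (The -${count.index} identifier suffix is an extra tweak, not part of the original config, so instance identifiers stay unique if the count is ever raised.)

resource "aws_rds_cluster" "test_db" {
  cluster_identifier = "${var.test_db_name}"
  engine             = "aurora-mysql"
  engine_version     = "5.7.12"

  # No explicit availability_zones: placement follows the subnet group.
  db_subnet_group_name      = "${aws_db_subnet_group.test_db.name}"
  master_username           = "root"
  master_password           = "${data.aws_secretsmanager_secret_version.test_db_master_password.secret_string}"
  backup_retention_period   = "${var.test_db_backup_retention_period}"
  vpc_security_group_ids    = ["${aws_security_group.test_db.id}"]
  final_snapshot_identifier = "test-db-final-snapshot"
}

resource "aws_rds_cluster_instance" "test_db" {
  count              = "${var.test_db_instance_count}"
  identifier         = "${var.test_db_name}-${count.index}"
  cluster_identifier = "${aws_rds_cluster.test_db.id}"

  # No explicit availability_zone: RDS picks one from the subnet group.
  instance_class       = "db.t2.small"
  db_subnet_group_name = "${aws_db_subnet_group.test_db.name}"
  engine               = "aurora-mysql"
  engine_version       = "5.7.12"
}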

Delay in creation of launch config in AWS

Using Terraform, I have the following launch config and autoscale group resources defined:
resource "aws_launch_configuration" "lc_name" {
name = "lc_name"
image_id = "ami-035d01348bb6e6070"
instance_type = "m3.large"
security_groups = ["sg-61a0b51b"]
}
####################
# Autoscaling group
####################
resource "aws_autoscaling_group" "as_group_name" {
name = "as_group_name"
launch_configuration = "lc_name"
vpc_zone_identifier = ["subnet-be1088f7","subnet-fa8d6fa1"]
min_size = "1"
max_size = "1"
desired_capacity = "1"
load_balancers = ["${aws_elb.elb_name.name}"]
health_check_type = "EC2"
}
When I run terraform apply, I get:
Error: Error applying plan:
1 error(s) occurred:
aws_autoscaling_group.as_group_name: 1 error(s) occurred:
aws_autoscaling_group.as_group_name: Error creating AutoScaling Group: ValidationError: Launch configuration name not found - A launch configuration with the name: lc_name does not exist
status code: 400, request id: b09191d3-a47c-11e8-8198-198283743bc9
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
If I run apply again, all goes well, strongly implying that there is a delay before the autoscaling group creation code recognizes a new launch configuration. Is there a way to adjust for this delay?
Update:
Per suggestion, I added a dependency:
resource "aws_launch_configuration" "myLaunchConfig" {
name = "myLaunchConfig"
image_id = "ami-01c068891b0d9411a"
instance_type = "m3.large"
security_groups = ["sg-61a0b51b"]
}
resource "aws_autoscaling_group" "myAutoScalingGroup" {
name = "myAutoScalingGroup"
launch_configuration = "myLaunchConfig"
depends_on = ["myLaunchConfig"]
vpc_zone_identifier = ["subnet-be1088f7","subnet-fa8d6fa1"]
min_size = "1"
max_size = "1"
desired_capacity = "1"
load_balancers = ["${aws_elb.myLoadBalancer.name}"]
health_check_type = "EC2"
}
Still getting an error for the same reason, though it looks a bit different:
Error: aws_autoscaling_group.myAutoScalingGroup: resource depends on non-existent resource 'myLaunchConfig'
As far as Terraform can tell there is no relationship between your autoscaling group and your launch configuration, so it is going to try to create them in parallel, leading to the observed race condition that corrects itself on the next apply.
With Terraform you have two different ways of ordering a dependency chain between resources.
You can use the explicit depends_on syntax to force a resource to wait until another resource is created before it, in turn, is created.
In your case this would be something like:
resource "aws_launch_configuration" "lc_name" {
name = "lc_name"
image_id = "ami-035d01348bb6e6070"
instance_type = "m3.large"
security_groups = ["sg-61a0b51b"]
}
####################
# Autoscaling group
####################
resource "aws_autoscaling_group" "as_group_name" {
name = "as_group_name"
launch_configuration = "lc_name"
vpc_zone_identifier = ["subnet-be1088f7", "subnet-fa8d6fa1"]
min_size = "1"
max_size = "1"
desired_capacity = "1"
load_balancers = ["${aws_elb.elb_name.name}"]
health_check_type = "EC2"
depends_on = ["aws_launch_configuration.lc_name"]
}
Or, and this is generally preferable where possible, if you interpolate a value from one resource into another then Terraform will automatically wait until the first resource has been created before creating the second resource.
In your case you would then use something like this:
resource "aws_launch_configuration" "lc_name" {
name = "lc_name"
image_id = "ami-035d01348bb6e6070"
instance_type = "m3.large"
security_groups = ["sg-61a0b51b"]
}
####################
# Autoscaling group
####################
resource "aws_autoscaling_group" "as_group_name" {
name = "as_group_name"
launch_configuration = "${aws_launch_configuration.lc_name.name}"
vpc_zone_identifier = ["subnet-be1088f7", "subnet-fa8d6fa1"]
min_size = "1"
max_size = "1"
desired_capacity = "1"
load_balancers = ["${aws_elb.elb_name.name}"]
health_check_type = "EC2"
}
If you are ever unsure of the order in which Terraform will operate on things, you might want to take a look at the terraform graph command.
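For example, the graph can be rendered with Graphviz to visualise the dependency order:
terraform graph | dot -Tsvg > graph.svg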

Terraform (provider AWS) - Auto Scaling group doesn't take effect on a launch template change

I am unable to make a launch template change take effect on an Auto Scaling group. With launch configurations it works via a small hack, i.e. interpolating the launch configuration name into the ASG resource, but the same approach doesn't work with launch templates.
The ASG uses the latest version to launch new instances, but it doesn't change anything with respect to already-running instances, in spite of a change in the launch template.
I understand that this is sort of expected, but is there any workaround to make launch templates work with an ASG, or do we need to stick with launch configurations?
TF code snippet:
resource "aws_launch_template" "lc_ec2" {
image_id = "${var.ami_id}"
instance_type = "${var.app_instance_type}"
key_name = "${var.orgname}_${var.environ}_kp"
vpc_security_group_ids = ["${aws_security_group.sg_ec2.id}"]
user_data = "${base64encode(var.userdata)}"
block_device_mappings {
device_name = "/dev/xvdv"
ebs {
volume_size = 15
}
}
iam_instance_profile {
name = "${var.orgname}_${var.environ}_profile"
}
lifecycle {
create_before_destroy = true
}
tag_specifications {
resource_type = "instance"
tags = "${merge(map("Name", format("%s-%s-lc-ec2", var.orgname, var.environ)), var.tags)}"
}
tag_specifications {
resource_type = "volume"
tags = "${merge(map("Name", format("%s-%s-lc-ec2-volume", var.orgname, var.environ)), var.tags)}"
}
tags = "${merge(map("Name", format("%s-%s-lc-ec2", var.orgname, var.environ)), var.tags)}"
}
resource "aws_autoscaling_group" "asg_ec2" {
name = "${var.orgname}-${var.environ}-asg-ec2-${aws_launch_template.lc_ec2.name}"
vpc_zone_identifier = ["${data.aws_subnet.private.*.id}"]
min_size = 1
desired_capacity = 1
max_size = 1
target_group_arns = ["${aws_lb_target_group.alb_tg.arn}"]
default_cooldown= 100
health_check_grace_period = 100
termination_policies = ["ClosestToNextInstanceHour", "NewestInstance"]
health_check_type="ELB"
launch_template = {
id = "${aws_launch_template.lc_ec2.id}"
version = "$$Latest"
}
lifecycle {
create_before_destroy = true
}
tags = [
{
key = "Name"
value = "${var.orgname}"
propagate_at_launch = true
},
{
key = "Environ"
value = "${var.environ}"
propagate_at_launch = true
}
]
}
There is one hack to achieve this.
AWS CloudFormation supports rolling updates of an Auto Scaling group.
Since Terraform supports a CloudFormation stack resource, you can define your ASG as a CloudFormation stack with an update policy. However, CloudFormation does not support the $Latest specifier for the launch template version, so you will have to parameterize the version and pass in the value from the latest_version attribute of the launch template resource created in your Terraform configuration.
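A rough sketch of that workaround, reusing the names from the question (the parameter name, template body and rolling-update settings here are illustrative, not from the original answer):

resource "aws_cloudformation_stack" "asg_ec2" {
  name = "${var.orgname}-${var.environ}-asg-ec2"

  # Pass the launch template version in from Terraform, since the template
  # cannot simply reference $Latest.
  parameters = {
    LaunchTemplateVersion = "${aws_launch_template.lc_ec2.latest_version}"
  }

  template_body = <<EOF
Parameters:
  LaunchTemplateVersion:
    Type: String
Resources:
  ASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier: !Split [",", "${join(",", data.aws_subnet.private.*.id)}"]
      MinSize: "1"
      MaxSize: "1"
      DesiredCapacity: "1"
      HealthCheckType: ELB
      HealthCheckGracePeriod: 100
      TargetGroupARNs:
        - "${aws_lb_target_group.alb_tg.arn}"
      LaunchTemplate:
        LaunchTemplateId: "${aws_launch_template.lc_ec2.id}"
        Version: !Ref LaunchTemplateVersion
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: "0"
        MaxBatchSize: "1"
        PauseTime: PT5M
EOF
}

Each time the launch template gets a new version, the stack's LaunchTemplateVersion parameter changes, which updates the ASG inside CloudFormation and lets the AutoScalingRollingUpdate policy replace the instances.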
