How to import existing EC2 with EBS block device? - terraform

I want to use Terraform for my existing AWS infrastructure. Therefore, I imported an existing EC2 instance with EBS block devices. Then I wrote the Terraform code. But Terraform wants to replace all EBS block devices. Is there any way to not replace the EBS block devices?
Import
I imported my AWS EC2 instance as described in Import:
Instances can be imported using the id, e.g.,
$ terraform import aws_instance.web i-12345678
Code
After importing the instance into the state file, I wrote my Terraform code.
This is the part for the EBS block device:
ebs_block_device {
  device_name = "/dev/sdf"
  volume_type = "gp2"
  volume_size = 20

  tags = {
    Name = "my-test-docker"
  }
}
Plan
After I wrote the code, I ran the plan command, see Command: plan.
This is the part for the EBS block device:
      - ebs_block_device { # forces replacement
          - delete_on_termination = false -> null
          - device_name           = "/dev/sdf" -> null
          - encrypted             = false -> null
          - iops                  = 100 -> null
          - snapshot_id           = "snap-0e22b434e3106fd51" -> null
          - tags                  = {
              - "Name" = "my-test-docker"
            } -> null
          - throughput            = 0 -> null
          - volume_id             = "vol-065138961fea23bf4" -> null
          - volume_size           = 20 -> null
          - volume_type           = "gp2" -> null
        }
      + ebs_block_device { # forces replacement
          + delete_on_termination = true
          + device_name           = "/dev/sdf"
          + encrypted             = (known after apply)
          + iops                  = (known after apply)
          + kms_key_id            = (known after apply)
          + snapshot_id           = (known after apply)
          + tags                  = {
              + "Name" = "my-test-docker"
            }
          + throughput            = (known after apply)
          + volume_id             = (known after apply)
          + volume_size           = 20
          + volume_type           = "gp2"
        }
Research
I tried to set the volume_id in my code, but Terraform doesn't allow setting this property.
Actually, the EBS block device shouldn't be replaced after creation, see ebs_block_device:
ebs_block_device - (Optional) One or more configuration blocks with additional EBS block devices to attach to the instance. Block device configurations only apply on resource creation. See Block Devices below for details on attributes and drift detection. When accessing this as an attribute reference, it is a set of objects.
Background
One of the reasons for using ebs_block_device instead of aws_ebs_volume with aws_volume_attachment was the issue aws_volume_attachment Error waiting for Volume.
Another reason was that EBS volumes are not attached and available at boot time for the user data script to mount them, see:
AWS Attach/Format device with CLOUDINIT
fs_setup/disk_setup: option to wait for the device to exist before continuing
ebs_block_device still works fine for new (not imported) resources.
The formatting and mounting part of my cloud-init user data script:
fs_setup:
  - label: docker
    filesystem: ext4
    device: /dev/xvdf
    overwrite: false

mounts:
  - [ "/dev/xvdf", "/mnt/docker-data", "auto", "defaults,nofail", "0", "2" ]
Question
Is there any way to give Terraform a hint to match the EBS block device in my code with the EBS block device in the state file?

Rather than using ebs_block_device, you should be using the aws_ebs_volume and aws_volume_attachment resources to import non-root EBS volumes, according to the Terraform docs:
Currently, changes to the ebs_block_device configuration of existing resources cannot be automatically detected by Terraform. To manage changes and attachments of an EBS block to an instance, use the aws_ebs_volume and aws_volume_attachment resources instead. If you use ebs_block_device on an aws_instance, Terraform will assume management over the full set of non-root EBS block devices for the instance, treating additional block devices as drift.
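For the imported instance above, the extra volume could be modeled roughly like this (a sketch only; the resource names are placeholders, and the size, type, device name and tag come from the plan output shown earlier):

resource "aws_ebs_volume" "docker" {
  availability_zone = aws_instance.web.availability_zone
  size              = 20
  type              = "gp2"

  tags = {
    Name = "my-test-docker"
  }
}

resource "aws_volume_attachment" "docker" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.docker.id
  instance_id = aws_instance.web.id
}

The existing objects can then be bound to these addresses instead of being replaced, e.g. terraform import aws_ebs_volume.docker vol-065138961fea23bf4, and (in recent provider versions) the attachment can usually be imported as device_name:volume_id:instance_id, e.g. terraform import aws_volume_attachment.docker /dev/sdf:vol-065138961fea23bf4:i-12345678.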

Related

When to use ebs_block_device?

I want to create an EC2 instance with Terraform. This instance should have some EBS volumes.
In the documentation I read that Terraform provides two ways to create EBS volumes:
ebs_block_device
aws_ebs_volume with aws_volume_attachment
I want to know, when should I use ebs_block_device?
Documentation
Unfortunately the documentation isn't that clear (at least for me) about:
When to use ebs_block_device?
What exactly is the actual behavior?
See Resource: aws_instance:
ebs_block_device - (Optional) One or more configuration blocks with additional EBS block devices to attach to the instance. Block device configurations only apply on resource creation. See Block Devices below for details on attributes and drift detection. When accessing this as an attribute reference, it is a set of objects.
and
Currently, changes to the ebs_block_device configuration of existing resources cannot be automatically detected by Terraform. To manage changes and attachments of an EBS block to an instance, use the aws_ebs_volume and aws_volume_attachment resources instead. If you use ebs_block_device on an aws_instance, Terraform will assume management over the full set of non-root EBS block devices for the instance, treating additional block devices as drift. For this reason, ebs_block_device cannot be mixed with external aws_ebs_volume and aws_volume_attachment resources for a given instance.
Research
I read:
No change when modifying aws_instance.ebs_block_device.volume_size, which says that Terraform doesn't show any changes with plan/apply and doesn't change anything in AWS, although changes were made.
AWS "ebs_block_device.0.volume_id": this field cannot be set, which says that Terraform shows an error while running plan.
Ebs_block_device forcing replacement every terraform apply, which says that Terraform replaces all EBS.
aws_instance dynamic ebs_block_device forces replacement, which says that Terraform replaces all EBS, although no changes were made.
adding ebs_block_device to existing aws_instance forces unneccessary replacement, which says that Terraform replaces the whole EC2 instance with all EBS.
aws_instance dynamic ebs_block_device forces replacement, which says that Terraform replaces the whole EC2 instance with all EBS, although no changes were made.
I know that the issues are about different versions of Terraform and the Terraform AWS provider and some issues are already fixed, but what is the actual intended behavior?
In almost all issues the workaround/recommendation is to use aws_ebs_volume with aws_volume_attachment instead of ebs_block_device.
Question
When should I use ebs_block_device? What is the use case for this feature?
When should I use ebs_block_device?
When you need an additional volume besides the root volume, because:
Unlike the data stored on a local instance store (which persists only as long as that instance is alive), data stored on an Amazon EBS volume can persist independently of the life of the instance.
When you launch an instance, the root device volume contains the image used to boot the instance.
Instances that use Amazon EBS for the root device automatically have an Amazon EBS volume attached. When you launch an Amazon EBS-backed instance, we create an Amazon EBS volume for each Amazon EBS snapshot referenced by the AMI you use.
Here's an example of an EC2 instance with an additional EBS volume.
provider "aws" {
region = "eu-central-1"
}
resource "aws_instance" "example" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t2.micro"
ebs_block_device {
device_name = "/dev/sda1"
volume_size = 8
volume_type = "gp3"
throughput = 125
delete_on_termination = false
}
}
Note: delete_on_termination for root_block_device is set to true by default.
You can read more on AWS block device mapping here.
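For example, if you want to keep the root volume when the instance is terminated, you would override that default explicitly (a minimal sketch, reusing the same AMI as above):

resource "aws_instance" "example" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  root_block_device {
    volume_size           = 8
    volume_type           = "gp3"
    delete_on_termination = false # defaults to true for the root volume
  }
}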
EDIT:
aws_volume_attachment is used when you want to attach an existing EBS volume to an EC2 instance. It helps manage the relationship between the volume and the instance, and ensures that the volume is attached to the desired instance in the desired state.
Here's an example usage:
resource "aws_volume_attachment" "ebs_att" {
device_name = "/dev/sdh"
volume_id = aws_ebs_volume.example.id
instance_id = aws_instance.web.id
}
resource "aws_instance" "web" {
ami = "ami-21f78e11"
availability_zone = "us-west-2a"
instance_type = "t2.micro"
tags = {
Name = "HelloWorld"
}
}
resource "aws_ebs_volume" "example" {
availability_zone = "us-west-2a"
size = 1
}
ebs_block_device, on the other hand, is used when you want to create a new EBS volume and attach it to an EC2 instance at the same time the instance is created.
NOTE:
If you use ebs_block_device on an aws_instance, Terraform will assume management over the full set of non-root EBS block devices for the instance, and treats additional block devices as drift. For this reason, ebs_block_device cannot be mixed with external aws_ebs_volume + aws_volume_attachment resources for a given instance.
Source
I strongly suggest using only the aws_ebs_volume resource. When creating an instance, the root block device will be created automatically. For extra EBS storage, you will want Terraform to manage the volumes independently.
Why?
Basically you have 2 choices to create an instance with 1 extra disk:
resource "aws_instance" "instance" {
ami = "ami-xxxx"
instance_type = "t4g.micro"
#... other arguments ...
ebs_block_device {
volume_size = 10
volume_type = "gp3"
#... other arguments ...
}
}
OR
resource "aws_instance" "instance" {
ami = "ami-xxxx"
instance_type = "t4g.micro"
#... other arguments ...
}
resource "aws_ebs_volume" "volume" {
size = 10
type = "gp3"
}
resource "aws_volume_attachment" "attachment" {
volume_id = aws_ebs_volume.volume.id
instance_id = aws_instance.instance.id
device_name = "/dev/sdb"
}
The first method is more compact, creates fewer Terraform resources, and makes terraform import easier. But if you need to recreate your instance, what will happen? Terraform will remove the instance and redeploy it from scratch with new volumes. If you set delete_on_termination to false, the old volumes will still exist, but they won't be attached to your new instance.
On the contrary, when using dedicated resources, recreating the instance will recreate the attachments (because the instance id changes) and then reattach your existing volumes to the instance, which is what we need 90% of the time.
Also, if at some point you need to manipulate your volumes in the Terraform state (terraform state commands), it will be much easier to do on the individual aws_ebs_volume resources.
Finally, at some point in your Terraform journey, you will want to industrialize your code by adding loops, variables and so on. A common use case is to make the number of volumes variable: you provide a list of volumes and Terraform creates 1, 2 or 10 volumes according to this list.
And for this you also have 2 options:
variable "my_volume" { map(any) }
my_volume = {
"/dev/sdb": {
"size": 10
"type": "gp3"
}
}
resource "aws_instance" "instance" {
ami = "ami-xxxx"
instance_type = "t4g.micro"
#... other arguments ...
dynamic "ebs_block_device" {
for_each = var.my_volumes
content {
volume_size = ebs_block_device.value["size"]
volume_type = ebs_block_device.value["type"]
#... other arguments ...
}
}
}
OR
resource "aws_instance" "instance" {
ami = "ami-xxxx"
instance_type = "t4g.micro"
#... other arguments ...
}
resource "aws_ebs_volume" "volume" {
for_each = var.
size = 10
type = "gp3"
}
resource "aws_volume_attachment" "attachment" {
volume_id = aws_ebs_volume.volume.id
instance_id = aws_instance.instance.id
device_name = "/dev/sdb"
}

How to use terraform depends_on to dictate ordering of resource creation?

I have the following terraform resources in a file
resource "google_project_service" "cloud_resource_manager" {
project = var.tf_project_id
service = "cloudresourcemanager.googleapis.com"
disable_dependent_services = true
}
resource "google_project_service" "artifact_registry" {
project = var.tf_project_id
service = "artifactregistry.googleapis.com"
disable_dependent_services = true
depends_on = [google_project_service.cloud_resource_manager]
}
resource "google_artifact_registry_repository" "el" {
provider = google-beta
project = var.tf_project_id
location = var.region
repository_id = "el"
description = "Repository for extract/load docker images"
format = "DOCKER"
depends_on = [google_project_service.artifact_registry]
}
However, when I run terraform plan, I get this
Terraform will perform the following actions:

  # google_artifact_registry_repository.el will be created
  + resource "google_artifact_registry_repository" "el" {
      + create_time   = (known after apply)
      + description   = "Repository for extract/load docker images"
      + format        = "DOCKER"
      + id            = (known after apply)
      + location      = "us-central1"
      + name          = (known after apply)
      + project       = "backbone-third-party-data"
      + repository_id = "el"
      + update_time   = (known after apply)
    }

  # google_project_iam_member.ingest_sa_roles["cloudscheduler.serviceAgent"] will be created
  + resource "google_project_iam_member" "ingest_sa_roles" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + member  = (known after apply)
      + project = "backbone-third-party-data"
      + role    = "roles/cloudscheduler.serviceAgent"
    }

  # google_project_iam_member.ingest_sa_roles["run.invoker"] will be created
  + resource "google_project_iam_member" "ingest_sa_roles" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + member  = (known after apply)
      + project = <my project id>
      + role    = "roles/run.invoker"
    }

  # google_project_service.artifact_registry will be created
  + resource "google_project_service" "artifact_registry" {
      + disable_dependent_services = true
      + disable_on_destroy         = true
      + id                         = (known after apply)
      + project                    = <my project id>
      + service                    = "artifactregistry.googleapis.com"
    }
See how google_project_service.artifact_registry is created after google_artifact_registry_repository.el. I was hoping that the depends_on in the google_artifact_registry_repository.el resource would make the service be created first. Am I misunderstanding how depends_on works? Or does the order of resources listed by terraform plan not actually reflect the order in which they are created?
Edit: when I run terraform apply it errors out with
Error 403: Cloud Resource Manager API has not been used in project 521986354168 before or it is disabled
Even though it is enabled. I think it's doing this because it's running the artifact registry resource creation before enabling the services?
I don't think that it will be possible to enable this particular API this way, as the google_project_service resource itself depends on the Resource Manager API (and maybe also on the Service Usage API?) being enabled. So you could either enable those manually or use a null_resource with a local-exec provisioner to do it automatically:
resource "null_resource" "enable_cloudresourcesmanager_api" {
provisioner "local-exec" {
command = "gcloud services enable cloudresourcesmanager.googleapis.com cloudresourcemanager.googleapis.com --project ${var.project_id}"
}
}
Another issue you might find is that enabling an API takes some time, depending on the service. So sometimes, even though your resources depend on a resource enabling an API, you will still get the same error message. Then you can just reapply your configuration and, since the API has had time to initialize, the second apply will work. In some cases this is good enough, but if you are building a reusable module you might want to avoid those reapplies. Then you can use a time_sleep resource to wait for API initialization:
resource "time_sleep" "wait_for_cloudresourcemanager_api" {
depends_on = [null_resource.enable_cloudresourcesmanager_api]
# or: depends_on = [google_project_service.some_other_api]
create_duration = "30s"
}
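Wiring that into the configuration from the question could then look roughly like this (a sketch, reusing the google_project_service.artifact_registry resource defined above; the 30s duration is a guess and may need tuning):

resource "time_sleep" "wait_for_artifact_registry_api" {
  depends_on = [google_project_service.artifact_registry]

  create_duration = "30s"
}

resource "google_artifact_registry_repository" "el" {
  provider      = google-beta
  project       = var.tf_project_id
  location      = var.region
  repository_id = "el"
  description   = "Repository for extract/load docker images"
  format        = "DOCKER"

  # Wait until the API has been enabled *and* had time to initialize.
  depends_on = [time_sleep.wait_for_artifact_registry_api]
}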

Terraform keeps destroying existing resource

I'm trying to debug why my Terraform script is not working. For an unknown reason, Terraform keeps destroying my MySQL database and recreating it afterwards.
Below is the output of the execution plan:
  # azurerm_mysql_server.test01 will be destroyed
  - resource "azurerm_mysql_server" "test01" {
      - administrator_login               = "me" -> null
      - auto_grow_enabled                 = true -> null
      - backup_retention_days             = 7 -> null
      - create_mode                       = "Default" -> null
      - fqdn                              = "db-test01.mysql.database.azure.com" -> null
      - geo_redundant_backup_enabled      = false -> null
      - id                                = "/subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/production-rg/providers/Microsoft.DBforMySQL/servers/db-test01" -> null
      - infrastructure_encryption_enabled = false -> null
      - location                          = "westeurope" -> null
      - name                              = "db-test01" -> null
      - public_network_access_enabled     = true -> null
      - resource_group_name               = "production-rg" -> null
      - sku_name                          = "B_Gen5_1" -> null
      - ssl_enforcement                   = "Disabled" -> null
      - ssl_enforcement_enabled           = false -> null
      - ssl_minimal_tls_version_enforced  = "TLSEnforcementDisabled" -> null
      - storage_mb                        = 51200 -> null
      - tags                              = {} -> null
      - version                           = "8.0" -> null

      - storage_profile {
          - auto_grow             = "Enabled" -> null
          - backup_retention_days = 7 -> null
          - geo_redundant_backup  = "Disabled" -> null
          - storage_mb            = 51200 -> null
        }

      - timeouts {}
    }

  # module.databases.module.test.azurerm_mysql_server.test01 will be created
  + resource "azurerm_mysql_server" "test01" {
      + administrator_login               = "me"
      + administrator_login_password      = (sensitive value)
      + auto_grow_enabled                 = true
      + backup_retention_days             = 7
      + create_mode                       = "Default"
      + fqdn                              = (known after apply)
      + geo_redundant_backup_enabled      = false
      + id                                = (known after apply)
      + infrastructure_encryption_enabled = false
      + location                          = "westeurope"
      + name                              = "db-test01"
      + public_network_access_enabled     = true
      + resource_group_name               = "production-rg"
      + sku_name                          = "B_Gen5_1"
      + ssl_enforcement                   = (known after apply)
      + ssl_enforcement_enabled           = false
      + ssl_minimal_tls_version_enforced  = "TLSEnforcementDisabled"
      + storage_mb                        = 51200
      + version                           = "8.0"

      + storage_profile {
          + auto_grow             = (known after apply)
          + backup_retention_days = (known after apply)
          + geo_redundant_backup  = (known after apply)
          + storage_mb            = (known after apply)
        }
    }
As far as I know, everything is exactly the same. To prevent this, I also did a manual terraform import to sync the state with the remote state.
The actual resource as defined in my main.tf:
resource "azurerm_mysql_server" "test01" {
name = "db-test01"
location = "West Europe"
resource_group_name = var.rg
administrator_login = "me"
administrator_login_password = var.root_password
sku_name = "B_Gen5_1"
storage_mb = 51200
version = "8.0"
auto_grow_enabled = true
backup_retention_days = 7
geo_redundant_backup_enabled = false
infrastructure_encryption_enabled = false
public_network_access_enabled = true
ssl_enforcement_enabled = false
}
The other odd thing is that the command below reports that everything is actually in sync:
➜ terraform git:(develop) ✗ terraform plan --refresh-only
azurerm_mysql_server.test01: Refreshing state... [id=/subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/firstklas-production-rg/providers/Microsoft.DBforMySQL/servers/db-test01]
No changes. Your infrastructure still matches the configuration.
After an actual import, the same thing still happens, even though the import reports that the resource is in state:
➜ terraform git:(develop) ✗ terraform import azurerm_mysql_server.test01 /subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/production-rg/providers/Microsoft.DBforMySQL/servers/db-test01
azurerm_mysql_server.test01: Importing from ID "/subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/production-rg/providers/Microsoft.DBforMySQL/servers/db-test01"...
azurerm_mysql_server.test01: Import prepared!
Prepared azurerm_mysql_server for import
azurerm_mysql_server.test01: Refreshing state... [id=/subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/production-rg/providers/Microsoft.DBforMySQL/servers/db-test01]
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.
What can I do to prevent this destroy? Or even figure out why the destroy is actually triggered? This is happening on multiple Azure instances at this point.
NOTE: the subscription ID is spoofed so don't worry
Best,
Pim
Your plan output shows that Terraform is seeing two different resource addresses:
# azurerm_mysql_server.test01 will be destroyed
# module.databases.module.test.azurerm_mysql_server.test01 will be created
Notice that the one to be created is in a nested module, not in the root module.
If your intent is to import this object to the address that is shown as needing to be created above, you'll need to specify this full address in the terraform import command:
terraform import 'module.databases.module.test.azurerm_mysql_server.test01' /subscriptions/8012-4035-b8f3-860f8cb1119e/resourceGroups/production-rg/providers/Microsoft.DBforMySQL/servers/db-test01
The terraform import command tells Terraform to bind an existing remote object to a particular Terraform address, and so when you use it you need to be careful to specify the correct Terraform address to bind to.
In your case, you told Terraform to bind the object to a hypothetical resource "azurerm_mysql_server" "test01" block in the root module, but your configuration has no such block and so when you ran terraform plan Terraform assumed that you wanted to delete that object, because deleting a resource block is how we typically tell Terraform that we intend to delete something.
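If the object is already correctly bound to the root-module address azurerm_mysql_server.test01 in your state (as your earlier import suggests), an alternative to re-importing is to move that existing state entry to the module address, for example:

terraform state mv 'azurerm_mysql_server.test01' 'module.databases.module.test.azurerm_mysql_server.test01'

After that, terraform plan should no longer show a destroy/create pair for this server, because the configuration block and the state entry now share the same address.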
There is a way:
Plan
Apply
terraform state rm "resource_name" (this will eliminate or remove the resource from the current state)
next Apply
This worked perfectly on GCP for creating 2 successive VMs using the same TF script.
The only thing is that we need to write code to get the current resources, store them somewhere, and create the commands (see config: require blocks for upstream dependencies #3). When destroying, we can add them back using terraform state mv "resource_name".
Note: this has a risk, as the very first VM did not get deleted because it is considered out of the scope of Terraform. So your cost may persist, and you need to keep (incremental) backups of your states.
Hope this helps.

GCP Terraform Error waiting for disk to attach: The disk resource disk resource is already in use

I am able to create instances and volumes in GCP using terraform. But when I snapshot the volumes I get an error and the disk won't attach:
Error: Error waiting for disk to attach: The disk resource 'projects/btgcp-iaas-dev/zones/us-east1-d/disks/dev-sql-td-data-disk' is already being used by 'projects/btgcp-iaas-dev/global/snapshots/dev-sql-td-data-disk-volume-snapshot'
on main.tf line 83, in resource "google_compute_attached_disk" "datadiskattach":
83: resource "google_compute_attached_disk" "datadiskattach" {
This is how I have my disk, attach disk and snapshot resources defined:
// Create additional disks
resource "google_compute_disk" "datadisk" {
  name  = var.data_disk_name
  type  = var.data_disk_type
  size  = var.data_disk_size
  zone  = var.zone
  image = data.google_compute_image.sqlserverimage.self_link

  labels = {
    environment  = "dev"
    asv          = "mycompanytools"
    ownercontact = "myuser"
  }

  physical_block_size_bytes = 4096
}

// Attach additional disks
resource "google_compute_attached_disk" "datadiskattach" {
  disk     = google_compute_disk.datadisk.id
  instance = google_compute_instance.default.id
}

// Create a snapshot of the datadisk volume
resource "google_compute_snapshot" "datadisksnapshot" {
  name        = var.datadisk_volume_snapshot_name
  source_disk = google_compute_disk.datadisk.name
  zone        = var.zone

  labels = {
    my_label = var.datadisk_volume_snapshot_label
  }
}
Here is how I am naming them in the variables. The naming seems to be clashing in the error I got:
// Create Data Disk Variables
variable "data_disk_name" {
  description = "Value of the disk name for the data disk for the GCP instance"
  type        = string
  default     = "dev-sql-td-data-disk"
}

// Create Snapshot Variables
variable "datadisk_volume_snapshot_name" {
  description = "Value of the datadisk name for the GCP instance root disk volume"
  type        = string
  default     = "dev-sql-td-data-disk-volume-snapshot"
}
How can I get past this error?
Terraform documentation has suggested that, "When using google_compute_attached_disk you must use lifecycle.ignore_changes = ["attached_disk"] on the google_compute_instance resource that has the disks attached. Otherwise the two resources will fight for control of the attached disk block"
Ref link: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_attached_disk
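A sketch of what that could look like on the instance from the question (the instance arguments shown here are placeholders; only the lifecycle block is the relevant part):

resource "google_compute_instance" "default" {
  name         = "dev-sql-td" # placeholder
  machine_type = "e2-medium"  # placeholder
  zone         = var.zone

  boot_disk {
    initialize_params {
      image = data.google_compute_image.sqlserverimage.self_link
    }
  }

  network_interface {
    network = "default"
  }

  lifecycle {
    # Let google_compute_attached_disk.datadiskattach own the attachment,
    # so the two resources don't fight over the attached_disk block.
    ignore_changes = [attached_disk]
  }
}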
Additionally, one can remove the following, because Terraform will tell you that the resource is already being used.
// Attach additional disks
resource "google_compute_attached_disk" "datadiskattach" {
  disk     = google_compute_disk.datadisk.id
  instance = google_compute_instance.default.id
}
Terraform is asking for the disk you are taking a snapshot from and not another disk as seen in the error. The source disk must be the boot VM disk and not the additional disk that you are attaching.
For simplicity, I will just add
resource "google_compute_snapshot" "snapshot-backup" {
name = "snapshot-name"
source_disk = "boot-os"
# zone = "europe-west3-b"
storage_locations = ["eu", ]
}
Refer here for more information

Restore an instance from snapshot without recreating using terraform

Use case: The intent is to revert an instance to an already taken AWS snapshot of the volume the instance is currently running on.
For this, I thought why not use "terraform import" to bring in the existing state of the instance and then modify the HCL config file to replace ONLY the volume. By this behaviour it is expected to create an AMI from the snapshot and then spawn an instance from that AMI. It works, but it destroys the instance and then recreates it.
I don't expect instance recreation; rather, why not just do the following using Terraform:
stop the instance
detach the current volume
create a volume from the provided snapshot to revert to
attach the created volume to the instance
power on the instance
How do I achieve the above?
The current config file which I am trying after importing the instance:
provider "aws" {
…..
…..
…..
}
resource "aws_ami" "example5554" {
name = "example5554"
virtualization_type = "hvm"
root_device_name = "/dev/sda1"
ebs_block_device {
snapshot_id = "snap-xxxxxxxxxxxxx”
device_name = "/dev/sda1"
volume_type = "gp2"
}
}
resource "aws_instance" "arstest1new" {
ami = "${aws_ami.example5554.id}"
instance_type = "m4.large"
}
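If the volume being reverted is a non-root data volume, one way to get the stop/detach/attach flow without recreating the instance is to manage that volume separately and point it at the snapshot. A rough sketch under that assumption (the resource names and device name are placeholders, and aws_instance.arstest1new is assumed to be the already imported instance):

resource "aws_ebs_volume" "restored" {
  availability_zone = aws_instance.arstest1new.availability_zone
  snapshot_id       = "snap-xxxxxxxxxxxxx" # the snapshot to revert to
  type              = "gp2"
}

resource "aws_volume_attachment" "restored" {
  device_name = "/dev/sdf" # placeholder device name
  volume_id   = aws_ebs_volume.restored.id
  instance_id = aws_instance.arstest1new.id

  # Stops the instance before detaching on replacement/destroy
  # (supported in recent AWS provider versions).
  stop_instance_before_detaching = true
}

Reverting the root volume itself (/dev/sda1) typically still means replacing the instance (as in the AMI approach above) or doing the volume swap outside Terraform.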
