I can't work out how to add a "remote-exec" provisioner to my module so that it copies configuration scripts from the project directory to the VM and executes them. When I add the provisioner I can't get it to target the VM instance correctly, and since the VM has multiple network cards, I would like to target just the primary one.
I have used the following to deploy a Linux VM via Terraform on an on-premises vSphere instance.
provider "vsphere" {
user = var.vsphere_user
password = var.vsphere_password
vsphere_server = var.vsphere_server
# If you have a self-signed cert
allow_unverified_ssl = true
}
This is the sample Linux deployment script, showing the network part, which allows configuring multiple network cards on a VM:
resource "vsphere_virtual_machine" "Linux" {
count = var.is_windows_image ? 0 : var.instances
depends_on = [var.vm_depends_on]
name = "%{if var.vmnameliteral != ""}${var.vmnameliteral}%{else}${var.vmname}${count.index + 1}${var.vmnamesuffix}%{endif}"
........
dynamic "network_interface" {
for_each = keys(var.network) #data.vsphere_network.network[*].id #other option
content {
network_id = data.vsphere_network.network[network_interface.key].id
adapter_type = var.network_type != null ? var.network_type[network_interface.key] : data.vsphere_virtual_machine.template.network_interface_types[0]
}
}
........
//Copy the file to execute
provisioner "file" {
source = var.provisioner_file_source // eg ./scripts/*
destination = var.provisioner_file_destination // eg /tmp/filename
connection {
type = "ssh" // for Linux its SSH
user = var.provisioner_ssh_username
password = var.provisioner_ssh_password
host = self.vsphere_virtual_machine.Linux.*.guest_ip_address
}
}
//Run the script
provisioner "remote-exec" {
inline = [
"chmod +x ${var.provisioner_file_destination}",
"${var.provisioner_file_destination} args",
]
connection {
type = "ssh" // for Linux its SSH
user = var.provisioner_ssh_username
password = var.provisioner_ssh_password
host = self.vsphere_virtual_machine.Linux.*.guest_ip_address
}
}
}
} // end of resource "vsphere_virtual_machine" "Linux"
I have tried a self reference, but self.vsphere_virtual_machine.Linux.*.guest_ip_address just shows the entire array of guest IPs.
Can anyone point me in the right direction, or to a good guide on Terraform modules?
The first issue I notice is that the vsphere_virtual_machine resource doesn't have a guest_ip_address attribute; it is guest_ip_addresses. That does indeed return a list, so you would need to figure out how to select the IP you want from it. I'm not sure if the ordering is predictable in vSphere; if I recall correctly, it isn't.
The simplest approach, without knowing exactly what you're trying to accomplish, would probably be to use default_ip_address, as this returns a single address and selects the IP for the "most likely" scenario.
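For example, a minimal sketch of the connection block using it (note that inside a provisioner you refer to the enclosing resource as self, not by its full resource address):

connection {
  type     = "ssh"
  user     = var.provisioner_ssh_username
  password = var.provisioner_ssh_password
  host     = self.default_ip_address # a single string, so no index is needed
}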
It looks like you're also setting up the host to be multi-homed, which adds complexity. If default_ip_address doesn't give you what you want, you'll need a more complex expression to find your IP. Perhaps you could use the sort function so the ordering is more predictable, or "find" the IP using a for expression, as sketched below.
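For example, a sketch of such a for expression, assuming (hypothetically) that the primary NIC is the only one on a 10.x network:

connection {
  type     = "ssh"
  user     = var.provisioner_ssh_username
  password = var.provisioner_ssh_password
  # take the first guest IP that falls on the assumed management network
  host     = [for ip in self.guest_ip_addresses : ip if substr(ip, 0, 3) == "10."][0]
}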
Regarding building modules: if the above code is in a module, the first thing I would recommend is avoiding count in favor of for_each (a sketch of the pattern follows). The reasoning is explained in HashiCorp's documentation, which is quite good, and the folks at Gruntwork have a blog series that they developed into the book Terraform: Up & Running. I recommend checking both out.
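As a quick illustration of the for_each pattern (a minimal sketch using null_resource just to show the shape; for_each keys instances by a stable name, so adding or removing one entry doesn't renumber the rest the way count does):

variable "vm_names" {
  type    = set(string)
  default = ["web-1", "web-2"] # hypothetical names
}

resource "null_resource" "example" {
  for_each = var.vm_names
  # instances are addressed as null_resource.example["web-1"], etc.
  triggers = { name = each.key }
}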
Related
Use-case:
Create X VMs, each with a public IP assigned.
Research so far has turned up a previous issue from 2017 (https://github.com/hashicorp/terraform/issues/15285) where @apparentlymart discussed this problem.
The fix was that Terraform v0.12 added explicit support for referencing individual resource instances in depends_on.
My Repo: https://github.com/CPWu/terraform_azure_compute
I am attempting to create a module that will create X number of Azure Linux compute instances. The first VM is created perfectly, but on a request for two, the second is not set up properly: the dynamic IP is created but does not get assigned to the NIC associated with the second VM. My understanding is that this is because the count-ed resources are seen as a single node and the IP is not available at the time of NIC creation. I looked into the for_each construct implemented in v0.12.6, but it doesn't look like that will solve my issue.
Update: I can see that all the resources are created, but IP[1] and onward do not get associated with their respective server NICs. I also posted on the HashiCorp community forums, with no response yet.
Quick idea: try changing the depends_on in here:
resource "azurerm_network_interface" "sandbox_nic" {
name = "${var.SERVER_NAME}-${format("%02d",count.index)}-nic"
location = var.AZURE_REGION
resource_group_name = var.RESOURCE_GROUP_NAME
count = var.NODE_COUNT
ip_configuration {
name = "${var.SERVER_NAME}-ip"
subnet_id = azurerm_subnet.sandbox_subnet.id
private_ip_address_allocation = "dynamic"
public_ip_address_id = count.index == 0 ? element(azurerm_public_ip.sandbox_public_ip.*.id,count.index) : null
}
**depends_on = [
azurerm_public_ip.sandbox_public_ip[1],
]**
}
to azurerm_public_ip.sandbox_public_ip, without the index pointer.
The interface might get created faster than the 3rd and 4th IPs and thus not pick them up.
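That is, depend on the resource as a whole rather than on a single instance:

depends_on = [
  azurerm_public_ip.sandbox_public_ip,
]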
I have a very frustrating Terraform issue. I made some changes to my Terraform script, which failed when I applied the plan. I've gone through a bunch of machinations and probably made the situation worse, as I ended up manually deleting a bunch of AWS resources while trying to resolve this.
So now I am unable to use Terraform at all; refresh, plan, and destroy all produce the same error.
The Situation
I have a list of Fargate services, and a set of maps which correlate different features of the Fargate services, such as the "Target Group" for the load balancer (I've provided some code below). The problem appears to be that Terraform is not picking up that these resources have been manually deleted, or is somehow getting confused because they don't exist. At this point, if I run a refresh, plan, or destroy, I get an error stating that a specific list is empty, even though it isn't (or shouldn't be).
In the failed run I added a new service to the list below, along with a new URL (see code below).
Objective
At this point I would settle for destroying the entire environment (it's my dev environment); ideally, however, I want to get the system working such that Terraform will detect the changes and work properly.
Terraform Script is Valid
I have reverted my Terraform scripts back to the last known good version. I have run the good version against our staging environment and it works fine.
Configuration Info
MacOS Mojave 10.14.6 (18G103)
Terraform v0.12.24.
provider.archive v1.3.0
provider.aws v2.57.0
provider.random v2.2.1
provider.template v2.1.2
The Terraform state file is being stored in an S3 bucket, and terraform init --reconfigure has been called.
What I've done
I was originally getting a similar error, but in a different location. After many hours of Googling and trying things (which I didn't write down), I decided to manually remove the AWS resources associated with the problematic code (the ALB, target groups, and security groups).
Example Terraform Script
Unfortunately I can't post the actual script as it is private, but I've posted what I believe are the pertinent parts, with some info redacted. I mention this because any syntax error you might see would be caused by the redaction; as I stated above, the script works fine when run in our staging environment.
globalvars.tf
In the root directory. In the failed Terraform run, I added a new name, edd, to the service_names list (as the first element). In service_name_map_2_url I added the new entry (edd = "edd") as the last entry. I'm not sure whether adding these elements in a different 'order' is the problem, although it really shouldn't be, since I access the map by name and not by index.
variable "service_names" {
type = list(string)
description = "This is a list/array of the images/services for the cluster"
default = [
"alert",
"alert-config"
]
}
variable service_name_map_2_url {
type = map(string)
description = "This map contains the base URL used for the service"
default = {
alert = "alert"
alert-config = "alert-config"
}
}
alb.tf
In modules/alb. In this module we create an ALB and then a target group for each service, which looks like this. The items from globalvars.tf are passed into this module:
locals {
  numberOfServices = length(var.service_names)
}

resource "aws_alb" "orchestration_alb" {
  name            = "orchestration-alb"
  subnets         = var.public_subnet_ids
  security_groups = [var.alb_sg_id]

  tags = {
    environment = var.environment
    group       = var.tag_group_name
    app         = var.tag_app_name
    contact     = var.tag_contact_email
  }
}

resource "aws_alb_target_group" "orchestration_tg" {
  count                = local.numberOfServices
  name                 = "${var.service_names[count.index]}-tg"
  port                 = 80
  protocol             = "HTTP"
  vpc_id               = var.vpc_id
  target_type          = "ip"
  deregistration_delay = 60

  tags = {
    environment = var.environment
    group       = var.tag_group_name
    app         = var.tag_app_name
    contact     = var.tag_contact_email
  }

  health_check {
    path                = "/${var.service_name_map_2_url[var.service_names[count.index]]}/health"
    port                = var.app_port
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 5
    interval            = 30
    timeout             = 5
    matcher             = "200-308"
  }
}
output.tf
This is the output from alb.tf; other things are output as well, but this is the one that matters for this issue:
output "target_group_arn_suffix" {
value = aws_alb_target_group.orchestration_tg.*.arn_suffix
}
cloudwatch.tf
In modules/cloudwatch, I attempt to create a dashboard:
data "template_file" "Dashboard" {
template = file("${path.module}/dashboard.json.template")
vars = {
...
alert-tg = var.target_group_arn_suffix[0]
alert-config-tg = var.target_group_arn_suffix[1]
edd-cluster-name = var.ecs_cluster_name
alb-arn-suffix = var.alb-arn-suffix
}
}
Error
When I run terraform refresh (or plan, or destroy) I get the following error (and the same error for alert-config as well):
Error: Invalid index
on modules/cloudwatch/cloudwatch.tf line 146, in data "template_file" "Dashboard":
146: alert-tg = var.target_group_arn_suffix[0]
|----------------
| var.target_group_arn_suffix is empty list of string
The given key does not identify an element in this collection value.
AWS Environment
I have manually deleted the ALB, dashboard, and all target groups. I would expect (and this has worked in the past) that Terraform would detect this and update its state file appropriately, such that when running a plan it would know it has to create the ALB and target groups.
Thank you
Terraform trusts its state as the single source of truth. Using Terraform in the presence of manual change is possible, but problematic.
If you manually remove infrastructure, you need to run terraform state rm [resource path] on the manually removed resource.
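For example, with the configuration above the commands would look something like this (the exact addresses are hypothetical; list the state first and adjust the module path to match):

terraform state list | grep orchestration
terraform state rm 'module.alb.aws_alb.orchestration_alb'
terraform state rm 'module.alb.aws_alb_target_group.orchestration_tg'

Once the stale entries are removed from state, plan should see those resources as missing and offer to create them again.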
Gruntwork has what they call The Golden Rule of Terraform:
The master branch of the live repository should be a 1:1 representation of what’s actually deployed in production.
When performing terraform plan, if an azurerm_kubernetes_cluster (Azure) resource exists in the state, Terraform prints some information from kube_config that seems sensitive.
Example printout (all the (...) values get printed):
kube_config = [
  {
    client_certificate     = (...)
    client_key             = (...)
    cluster_ca_certificate = (...)
    host                   = (...)
    password               = (...)
  },
]
I'm not exactly sure WHICH of those values are sensitive, but password probably is...right?
On the other hand, terraform does seem to have some knowledge of which values are sensitive, as it does print the client_secret this way:
service_principal {
  client_id     = "(...)"
  client_secret = (sensitive value)
}
So, my questions would be:
Are those values actually sensitive?
If so, is there a way to instruct terraform to mask those values in the plan?
Versions we are using:
provider "azurerm" {
version = "~>1.37.0"
}
The reason this is problematic is that we pipe the plan into a GitHub PR comment.
Thanks
Are those values actually sensitive?
Yes, those values are sensitive data. They are the configuration you need in order to control the AKS cluster; essentially, the AKS credentials. It's also necessary for Terraform to output this data: suppose you only have Terraform and use it to create an AKS cluster. If Terraform did not output the credentials, you could not control your AKS cluster.
If so, is there a way to instruct terraform to mask those values in the plan?
Given the above, you should not worry about the sensitive data being in the Terraform state file; what you need to care about is protecting the state file itself. I suggest you store the Terraform state file in Azure Storage, where you can encrypt it. Follow the steps in Store Terraform state in Azure Storage.
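For example, a minimal backend configuration for that setup might look like this (all names here are hypothetical placeholders):

terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"      # hypothetical resource group
    storage_account_name = "tfstateaccount"  # hypothetical storage account
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}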
Terraform now offers the ability to set variables as sensitive, and outputs as sensitive.
variable example:
variable "user_information" {
type = object({
name = string
address = string
})
sensitive = true
}
output example:
output "db_password" {
value = aws_db_instance.db.password
description = "The password for logging in to the database."
sensitive = true
}
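Applied to the AKS case from the question, a sketch might look like this (azurerm_kubernetes_cluster.example is a hypothetical resource name):

output "kube_config" {
  value     = azurerm_kubernetes_cluster.example.kube_config
  sensitive = true
}

Note that this only redacts the value where it appears as an output; it does not hide the resource's own attributes in a plan diff.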
However, as of July 1, 2021 there is no option to hide plan output for something that isn't derived from a sensitive input.
References:
https://www.hashicorp.com/blog/terraform-0-14-adds-the-ability-to-redact-sensitive-values-in-console-output
https://www.terraform.io/docs/language/values/outputs.html
I have 3 environments for my infrastructure, all of them the same but in different sizes. I understand this is a good use case for Terraform workspaces, and indeed it works well in that regard; but please correct me if this is not the right way to go.
Now my only issue is managing the DNS within the workspaces. I use the Google provider, which handles DNS with two resource types: google_dns_managed_zone, which represents the zone, and google_dns_record_set, one per DNS record.
Note that the record set type needs to have a reference to the managed zone type.
With that in mind, I need to manage the DNS zone from the production environment. I can't share that resource in the other workspaces because I should be able to destroy the dev or staging workspace without destroying the DNS zone.
I try to solve that issue with count, used as a boolean as shown in the code below. I find it pretty hackish, but it's what I have found in the Terraform community; any improvement is welcome.
That allows me to have the zone and the production records (like the MX shown below as an example) present only in the prod workspace.
But then I am stuck when it comes to managing record sets in a specific workspace only. I need that, for example, to create an nginx instance in the dev workspace and automatically create a DNS record set for it, like dev.example.com.
For that I need to access the managed zone resource. As shown below, I use terraform_remote_state to access the resource from the prod workspace. As far as I understand, that works through an output, which you can see below. When I select the prod workspace, I can indeed output the managed zone, and if I then select another workspace, the remote state retrieves the managed zone from prod successfully. But Terraform fails on the output line, since the resource it refers to is only present in the prod workspace; it does not exist in any other workspace and thus can't be output there.
So it's a bit nonsensical, and I don't understand whether there is a better way to achieve this. I did a fair bit of research and asked the community, but could not find an answer. It seems to me that managing DNS is common to all infrastructures and should be well covered. What am I doing wrong, and how should it be done?
locals {
  environment = "${terraform.workspace}"

  dns_zone_managers = {
    "dev"     = "0"
    "staging" = "0"
    "prod"    = "1"
  }

  dns_zone_manager = "${lookup(local.dns_zone_managers, local.environment)}"
}

resource "google_dns_managed_zone" "base_zone" {
  name     = "base_zone"
  dns_name = "example.com."
  count    = "${local.dns_zone_manager}"
}

resource "google_dns_record_set" "mx" {
  name         = "${google_dns_managed_zone.base_zone.dns_name}"
  managed_zone = "${google_dns_managed_zone.base_zone.name}"
  type         = "MX"
  ttl          = 300
  rrdatas = [
    "10 spool.mail.example.com.",
    "50 fb.mail.example.com."
  ]
  count = "${local.dns_zone_manager}"
}

data "terraform_remote_state" "dns" {
  backend   = "local"
  workspace = "prod"
}

output "dns_zone_name" {
  value = "${google_dns_managed_zone.base_zone.*.name[0]}"
}
Then I can introduce record sets in a specific workspace only, using count again and referring to the managed zone through the remote state, like so:
resource "google_dns_record_set" "a" {
name = "dev"
managed_zone = "${data.terraform_remote_state.dns.dns_zone_name}"
type = "A"
ttl = 300
rrdatas = ["1.2.3.4"]
}
As part of a setup, I create TLS certs and store them in S3. Creating the certs is done via an external data source that runs the command to generate them. I then use those outputs to create S3 bucket object resources.
This works very well the first time I run terraform apply. However, if I change any other (non-cert) variable, resource, etc. and rerun, it reruns the external command, which generates a new key/cert pair, uploads it to S3, and breaks everything that already works.
Is there any way to create the resource conditionally? What pattern could I use to make the certs created only if they don't exist?
I did look at storing the generated keys/certs locally, but this is sensitive key material; I do not want it stored on local disk (and there are keys per environment).
Key/cert generation and storage:
data "external" "ca" {
program = ["sh","-c","jq '.root|fromjson' | cfssl gencert -initca -"]
#
query = {root = "${ data.template_file.etcd-ca-csr.rendered }"}
# the result will be saved in
# data.external.etcd-ca.result.key
# data.external.etcd-ca.result.csr
# data.external.etcd-ca.result.cert
}
resource "aws_s3_bucket_object" "ca_cert" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca.pem"
content = "${data.external.ca.result.cert}"
}
resource "aws_s3_bucket_object" "ca_key" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca-key.pem"
content = "${data.external.ca.result.key}"
}
Happy to look at using some form of conditional or entirely different generation pattern.
The reason for this behavior is that external is a data source, and thus Terraform expects it to be read-only and side-effect-free. It re-runs data sources for every plan.
To do this via an external script, it would be necessary to use a resource provisioner to run the script and upload the result to S3, since there is currently no external equivalent for resources (which are allowed to have side effects) and provisioners are side-effect-only (that is, they can't produce results for use elsewhere in the config).
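A rough sketch of that approach (generate-ca.sh and the my-bucket name are hypothetical placeholders; this assumes the AWS CLI is available on the machine running Terraform):

resource "null_resource" "ca" {
  # local-exec runs only when the resource is created, so the certs are
  # not regenerated on every plan the way a data source is re-read
  provisioner "local-exec" {
    command = "./generate-ca.sh && aws s3 cp ca.pem s3://my-bucket/ca.pem"
  }
}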
Another approach, though, would be to use Terraform's built-in TLS provider, which allows creation of certificates within Terraform itself. In this case it looks like you're trying to create a new CA cert and key, which could be done with tls_self_signed_cert like this:
resource "tls_private_key" "ca" {
algorithm = "RSA"
rsa_bits = 2048
}
resource "tls_self_signed_cert" "ca" {
key_algorithm = "RSA"
private_key_pem = "${tls_private_key.ca.private_key_pem}"
# ... subject and validity settings, as appropriate
is_ca_certificate = true
allowed_uses = ["cert_signing"]
}
resource "aws_s3_bucket_object" "ca_cert" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca.pem"
content = "${resource.tls_self_signed_cert.ca.cert_pem}"
}
resource "aws_s3_bucket_object" "ca_key" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca-key.pem"
content = "${resource.tls_self_signed_cert.ca.private_key_pem}"
}
The generated private key will be included in the state for use on future runs, so it's important to ensure that the state is stored securely. Note that this would also be true using the external data source, since data source results are also stored in state. Thus this approach is equivalent from the standpoint of where the secrets get stored.
I wrote more details about using Terraform for TLS certificate management in an article on my website. Its scope is broader than your requirements here, but may be of some interest.