I'm struggling with Terragrunt (I'm still quite new to it).
I can describe my problem using the official Terragrunt example repos:
Looking here (https://github.com/gruntwork-io/terragrunt-infrastructure-live-example/tree/master/prod/us-east-1/prod/webserver-cluster) we can see a terragrunt.hcl that imports the asg-elb-service module from a particular URL (also a Terragrunt example).
Now my point is that everything is fine as long as the module covers all my needs. But, sticking with this example, let's say I want to add something on top of the module (e.g. a listener rule for the ALB, or anything else). Then I would like to rely on the module outputs, and as we can check, the "used" module does expose them: outputs (https://github.com/gruntwork-io/terragrunt-infrastructure-modules-example/blob/master/asg-elb-service/outputs.tf)
But how? Even if I add a .tf file inside my structure - continuing my example, it would be something like:
I'm just not able to "interpolate" those outputs in any way and get access to them from the module :(
Terragrunt is a thin wrapper that just provides some extra tooling for configuration. Terragrunt is used to make managing multiple Terraform modules easier; it takes care of remote state and so on. But it does not extend Terraform modules by adding functionality on top of them.
Coming back to your example, the common approach is to create a new Terraform module, probably on top of the existing one, and add the missing functionality there. You should think of a Terraform module as a function that does a particular job at a certain level of abstraction. With that said, it's completely valid to create modules that use other modules. Consider the following example: you need to provision infrastructure that can send Slack notifications when an AWS CloudWatch alarm is triggered. To simplify things a little, let's imagine the alarm is already created. The missing parts are a Lambda function that will send the notification and an SNS topic that will trigger the Lambda function.
This is something that can be created as a Terraform module, but under the hood it will most probably rely on other Terraform modules (one that provisions the Lambda and another that provisions the SNS topic). Those "internal" modules are at another level of abstraction and you can still reuse them individually in other cases. Pseudo code might look like this:
module "sns_topic" {
source = "git::https://github.com/..."
name = "trigger_lambda_to_send_notification_to_slack"
}
module "labmda_function" {
source = "git::https://github.com/..."
name = "SendMessageToSlack"
...
}
# Invoke Lambda by SNS
resource "aws_sns_topic_subscription" "sns_subscriptions" {
endpoint = module.labmda_function.lambda_endpoint # this is how you reference module output
protocol = "lambda"
topic_arn = module.sns_topic.sns_topic_arn
}
And then you can simply use this new module from Terragrunt.
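As a rough sketch, the terragrunt.hcl for that new wrapper module could look something like this (the repository URL, module path, and input names below are hypothetical):

terraform {
  # Hypothetical repository and module path for the wrapper module
  source = "git::https://github.com/your-org/your-modules.git//sns-to-slack?ref=v0.1.0"
}

include {
  path = find_in_parent_folders()
}

# Hypothetical inputs exposed by the wrapper module
inputs = {
  sns_topic_name       = "trigger_lambda_to_send_notification_to_slack"
  lambda_function_name = "SendMessageToSlack"
}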
Context: implementing a Terraform provider via the TF Provider SDKv2, following the official tutorial.
For example, AWS launch configurations cannot be modified after creation, so all of the schema elements in the corresponding Terraform provider resource aws_launch_configuration are marked as ForceNew: true. This behavior instructs Terraform to first destroy and then recreate the resource if any of the attributes change in the configuration, as opposed to trying to update the existing resource.
The TF tutorial suggests we should add ForceNew: true for every non-updatable field, like:
"base_image": {
Type: schema.TypeString,
Required: true,
ForceNew: true,
},
resource "example_instance" "ex" {
name = "bastion host"
base_image = "ubuntu_17.10" # base_image updates are not supported
}
However, one might run into the following issue.
Consider an "important" resource foo_db_instance (a DB instance that should be deleted / recreated only in exceptional scenarios; see the related unanswered question) that has a name attribute:
resource "foo_db_instance" "ex" {
name = "bar" # name updates are not supported
...
}
However, its underlying API was written in a strange way and doesn't support updates to the name attribute. There are two options:
Follow the approach from the tutorial and add ForceNew: true. Then, if a user doesn't pay attention to the terraform plan output, they might accidentally recreate foo_db_instance.ex when updating the name attribute, which will cause an outage.
Don't follow the approach from the tutorial and don't add ForceNew: true. As a result, terraform plan will not report an error, making it look like the update is possible. However, when running terraform apply the user will run into an error, if we add custom code to resourceUpdate() like this:
func resourceUpdate(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
    if d.HasChanges("name") {
        return diag.Errorf("error updating foo_db_instance: name attribute updates are not supported")
    }
    ...
}
There are two disadvantages of this approach:
the output of terraform plan does not fail
we might need some hack to restore the tf state, e.g. overriding the value with d.Set("name", oldValue)
Which approach should be preferable?
I know there's a prevent_destroy = true lifecycle attribute, but it seems like it won't prevent this specific scenario (it only protects against an accidental terraform destroy).
The most typical answer is to follow your first option, and then allow Terraform to report in its UI that the change requires replacement and allow the user to decide how to proceed.
It is true that if someone does not read the plan output then they can potentially make a change they did not intend to make, but in that case the user is not making use of the specific mechanism that Terraform provides to help users avoid making undesirable changes.
You mentioned prevent_destroy = true, and indeed this is a setting that's relevant to this situation; it is in fact exactly what that option is for: it will cause Terraform to raise an error if the plan includes a "replace" action for the resource annotated with that setting, thereby preventing the user from accepting the plan and thus from destroying the object.
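For illustration, a minimal sketch of that setting applied to the hypothetical foo_db_instance resource from the question:

resource "foo_db_instance" "ex" {
  name = "bar"

  lifecycle {
    # Terraform will refuse to create a plan that would destroy
    # (and therefore replace) this resource.
    prevent_destroy = true
  }
}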
Some users also wrap Terraform in automation which will perform more complicated custom policy checks on the generated plan, either achieving a similar effect as prevent_destroy (blocking the operation altogether) or alternatively just requiring an additional confirmation to help ensure that the operator is aware that something unusual is happening. For example, in Terraform Cloud a programmatic policy can report a "soft failure" which causes an additional confirmation step that might be approvable only by a smaller subset of operators who are better equipped to understand the impact of what's being proposed.
It is in principle possible to write logic in either the CustomizeDiff function (which runs during planning) or the Update function (which runs during the apply step) to return an error in this or any other situation you can write logic for in the Go programming language. Of these two options I would say that CustomizeDiff would be preferable since that would then prevent creating a plan at all, rather than allowing the creation of a plan and then failing partway through the apply step, when some other upstream changes may have already been applied.
However, to do either of these would be inconsistent with the usual behavior users expect for Terraform providers. The intended model is for a Terraform provider to describe the effect of a change as accurately as possible and then allow the operator to make the final decision about whether the proposed change is acceptable, and to cancel the plan and choose another strategy if not.
I am using Terraform version 0.12. I have a requirement to skip resource creation if a resource with the same name already exists.
I did the following for this:
Read the list of custom images,
data "ibm_is_images" "custom_images" {
}
Check if the image already exists,
locals {
  custom_vsi_image = contains([for x in data.ibm_is_images.custom_images.images : "true" if x.visibility == "private" && x.name == var.vnf_vpc_image_name], "true")
}

output "abc" {
  value = "${local.custom_vsi_image}"
}
Create the image only if it does not already exist.
resource "ibm_is_image" "custom_image" {
count = "${local.custom_vsi_image == true ? 0 : 1}"
depends_on = ["data.ibm_is_images.custom_images"]
href = "${local.image_url}"
name = "${var.vnf_vpc_image_name}"
operating_system = "centos-7-amd64"
timeouts {
create = "30m"
delete = "10m"
}
}
This works fine the first time with "terraform apply". It finds that the image does not exist, so it creates the image.
When I run "terraform apply" for the second time, it deletes the resource "custom_image" that was created above. Any idea why it is deleting the resource when it is run for the 2nd time?
Also, how do I create a resource based on some condition (like only when it does not exist)?
In Terraform, you're required to decide explicitly what system is responsible for the management of a particular object, and conversely which systems are just consuming an existing object. There is no way to make that decision dynamically, because that would make the result non-deterministic and -- for objects managed by Terraform -- make it unclear which configuration's terraform destroy would destroy the object.
Indeed, that non-determinism is why you're seeing Terraform in your situation flop between trying to create and then trying to delete the resource: you've told Terraform to only manage that object if it doesn't already exist, and so the first time you run Terraform after it exists Terraform will see that the object is no longer managed and so it will plan to destroy it.
If your goal is to manage everything with Terraform, an important design task is to decide how object dependencies flow within and between Terraform configurations. In your case, it seems like there is a producer/consumer relationship between a system that manages images (which may or may not be a Terraform configuration) and one or more Terraform configurations that consume existing images.
If the images are managed by Terraform then that suggests either that your main Terraform configuration should assume the image does not exist and unconditionally create it -- if your decision is that the image is owned by the same system as what consumes it -- or it should assume that the image does already exist and retrieve the information about it using a data block.
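For the second of those cases, a minimal sketch (assuming the IBM provider offers an ibm_is_image data source that can look an image up by name) might look like this:

# Assume the image already exists and look it up by name.
data "ibm_is_image" "custom_image" {
  name = var.vnf_vpc_image_name
}

# Elsewhere, refer to data.ibm_is_image.custom_image.id instead of creating the image here.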
A possible solution here is to write a separate Terraform configuration that manages the image and then only apply that configuration in situations where that object isn't expected to already exist. Then your configuration that consumes the existing image can just assume it exists without caring about whether it was created by the other Terraform configuration or not.
There's a longer overview of this situation in the Terraform documentation section Module Composition, and in particular the sub-section Conditional Creation of Objects. That guide is focused on interactions between modules in a single configuration, but the same underlying principles apply to dependencies between configurations (via data sources) too.
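As a rough illustration of the pattern that guide describes (the image_id variable here is hypothetical), a module can let its caller decide whether to pass in an existing image or have the module create one:

variable "image_id" {
  type        = string
  default     = null
  description = "ID of an existing image to use. Leave unset to create a new one."
}

resource "ibm_is_image" "custom_image" {
  # Create the image only when the caller did not supply an existing one.
  count = var.image_id == null ? 1 : 0

  href             = local.image_url
  name             = var.vnf_vpc_image_name
  operating_system = "centos-7-amd64"
}

locals {
  # Use the caller-supplied image if given, otherwise the one created above.
  image_id = var.image_id != null ? var.image_id : ibm_is_image.custom_image[0].id
}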
I have a Terraform module that manages AWS GuardDuty.
In the module, an aws_guardduty_detector resource is declared. The resource allows no specification of region, although I need to configure one of these resources for each region in a list. The region used needs to be declared by the provider, apparently(?).
Lack of module for_each seems to be part of the problem, or, at least, module for_each, if it existed, might let me declare the whole module, once for each region.
Thus, I wonder, is it possible to somehow declare a provider, for each region in a list?
Or, short of writing a shell script wrapper, or doing code generation, is there any other clean way to solve this problem that I might not have thought of?
To support similar processes, I have found two approaches to this problem:
1. Declare multiple AWS providers in the Terraform module.
2. Write the module to use a single provider, and then have a separate .tfvars file for each region you want to execute against.
For the first option, it can get messy having multiple AWS providers in one file. You must give each one an alias, and then every time you create a resource you must set the provider property on it so that Terraform knows which regional provider to execute against. Also, if the provider for one of the regions cannot initialize (perhaps the region is down), the entire configuration will not run until you remove it or the region is back up.
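A minimal sketch of that first option, using Terraform 0.12 syntax (the aliases, and GuardDuty as the example resource, are just illustrative):

provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

provider "aws" {
  alias  = "us_west_1"
  region = "us-west-1"
}

# One detector per region; each resource must name its regional provider.
resource "aws_guardduty_detector" "us_east_1" {
  provider = aws.us_east_1
  enable   = true
}

resource "aws_guardduty_detector" "us_west_1" {
  provider = aws.us_west_1
  enable   = true
}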
For the second option, you write the Terraform for whatever resources you need to set up, and then just run the module multiple times, once for each regional .tfvars file.
prod-us-east-1.tfvars
prod-us-west-1.tfvars
prod-eu-west-2.tfvars
My preference is the second option, as the module is simpler and there is less duplication. The only duplication is in the .tfvars files, which should be more manageable.
EDIT: Added some sample .tfvars
prod-us-east-1.tfvars:
region = "us-east-1"
account_id = "0000000000"
tags = {
  env = "prod"
}
dynamodb_read_capacity = 100
dynamodb_write_capacity = 50
prod-us-west-1.tfvars:
region = "us-west-1"
account_id = "0000000000"
tags = {
  env = "prod"
}
dynamodb_read_capacity = 100
dynamodb_write_capacity = 50
We put in these files whatever variables might need to change for the service or feature based on environment and/or region. For instance, in a testing environment the DynamoDB capacity may be lower than in the production environment.
Terraform 0.12.13, azurerm provider 1.35
Some background: I have a set of Azure App Services, hosted on an App Service Plan, in a Resource Group, in an Azure location. I now need to duplicate this stack in a different Azure location and add some additional resources like Traffic Managers and CNAMEs and whatnot in order to implement high availability. Architecturally we have Primary resources, and then a smaller subset of Secondary resources in the secondary region (not everything needs to be duplicated). Not every deployment will require high availability, so I need to be able to instantiate or not instantiate the Secondaries at run-time.
Because I was trying to be a good software engineer, I created modules to instantiate most of this stuff - one for the app services, one for the app service plan, one for the traffic managers, and so on.
The problem I have now is that I'm using the old count + ternary operator trick to control whether the secondary resources get created, and this is breaking because 1) count isn't allowed as a module meta-argument yet and 2) I can't figure out how to pass exported attributes from a resource controlled by the count meta-argument to a module as an input variable.
The following code may make this clearer.
resource "azurerm_resource_group" "appservices_secondary" {
name = "foo-services-ca-${local.secondary_release_stage_name}-${var.pipeline}-rg"
location = local.secondary_location
count = var.enable_high_availability ? 1 : 0
}
# Create the app service plan to host the secondary app services
module "plan_secondary" {
source = "./app_service_plan"
release_stage_name = local.secondary_release_stage_name
# HERE'S THE PROBLEMATIC LINE
appsvc_resource_group_name = azurerm_resource_group.appservices_secondary[0].name
location = local.secondary_location
pipeline = var.pipeline
}
If count resolves to 1 (var.enable_high_availability = true) then everything's fine.
If count resolves to 0 (var.enable_high_availability = false) then terraform plan fails:
Error: Invalid index

  on .terraform\modules\services\secondary.tf line 25, in module "plan_secondary":
  25: appsvc_resource_group_name = azurerm_resource_group.appservices_secondary[0].name
    |----------------
    | azurerm_resource_group.appservices_secondary is empty tuple

The given key does not identify an element in this collection value.
If I change the input variable value to azurerm_resource_group.appservices_secondary.name then it won't pass terraform validate because it recognizes that it needs [count.index].
Is there a simple way to resolve this? I'm increasingly thinking this is a design problem and I should have built the modules with count = [1..2] rather than count = 1 (primary) and count = [0 || 1] (secondary) but that will require me to rewrite all the modules and I'd like to avoid that if there's some clever workaround.
In order to resolve this you can use a conditional expression for appsvc_resource_group_name to provide some alternative value to use when the azurerm_resource_group.appservices_secondary resource has count = 0:
appsvc_resource_group_name = length(azurerm_resource_group.appservices_secondary) > 0 ? azurerm_resource_group.appservices_secondary[0].name : "default-value"
It looks like this other module is not useful in situations where high availability is disabled. In that case, you might want to define the variable as being optional with a default of null so that you can recognize when it isn't set in the module:
variable "appsvc_resource_group_name" {
type = string
default = null
}
Elsewhere in the configuration you can test var.appsvc_resource_group_name != null to see if it's enabled.
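For example (a sketch only; the plan resource and its arguments are hypothetical stand-ins for whatever the module actually manages), the null default lets the module skip its resources when no resource group name was passed in:

resource "azurerm_app_service_plan" "secondary" {
  # Only create the plan when the caller provided a resource group name.
  count = var.appsvc_resource_group_name != null ? 1 : 0

  name                = "example-secondary-plan"
  location            = var.location
  resource_group_name = var.appsvc_resource_group_name

  sku {
    tier = "Standard"
    size = "S1"
  }
}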
When following the module composition patterns I'd likely instead build this as two modules, using one of the following two strategies:
One module for building a "normal" (non-HA) stack and another module for building a HA stack, and then choose which one to use in the root module of each configuration depending on whether a particular configuration needs the normal or HA mode.
Alternatively, if the HA stack is always a superset of the "normal" stack, have one module for the normal stack, and then another module that consumes the outputs of the first and describes the extended resources needed for HA mode.
Here's an example of the second of those approaches, just to illustrate what I mean by it:
module "primary_example" {
source = "./primary_example"
# whatever arguments are needed
}
module "secondary_example" {
source = "./secondary_example"
# Make sure the primary module exports as outputs all of the
# values required to extend to HA mode, and then just pass
# that whole object through to secondary.
primary = module.primary_example
}
In a configuration that doesn't need HA mode you can then omit module "secondary_example".
The module composition patterns are about decomposing the configuration into small pieces that describe one self-contained capability and then letting the root module select from those capabilities whatever subset of them are relevant and connecting them in a suitable way.
In this case, I'm treating non-HA infrastructure as one capability and then HA extensions to that infrastructure as a second capability that depends on the first, connecting them together in a dependency inversion style so that the HA extensions can just assume that a non-HA deployment already exists and that information about it will be passed in by its caller.
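Inside the secondary module, the receiving variable would then be declared as an object whose attributes match the primary module's outputs. A sketch, with placeholder attribute names:

variable "primary" {
  # The attribute names are placeholders for whatever the primary
  # module actually exports as outputs.
  type = object({
    resource_group_name = string
    app_service_plan_id = string
  })
}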
Let's say I want to define a set of resources that depend on each other, where the dependent resources reuse parameters from their ancestors. Something like this:
server { 'my_server':
  path => '/path/to/server/root',
  ...
}

server_module { 'my_module':
  server => Server['my_server'],
  ...
}
The server_module resource both depends on my_server and wants to reuse its configuration, in this case the path where the server is installed. stdlib has functions for doing this, specifically getparam().
Is this the "puppet" way to handle this, or is there a better way to have this kind of dependency?
I don't think there's a standard "puppet way" to do this. If you can get it done using the stdlib and you're happy with it, then by all means do it that way.
Personally, if I have a couple of defined resources that both need the same data, I'll do one of the following:
1) Have a manifest that creates both resources and passes the data both need via parameters. The manifest will have access to all data both resources need, whether shared or not.
2) Have both defined resources look up the data they need in Hiera.
I've been leaning more towards #2 lately.
Dependency is only a matter of declaring it. So your server_module resource would have a "require => Server['my_server']" parameter, or the server resource would have a "before => Server_module['my_module']" parameter.