When to use Terraform Modules from terraform registry and when to use resource - terraform

I am fairly new to terraform. I am trying to understand what is the right way to spin up resources using terraform. I have come across two different ways to do it.
Using terraform modules from terraform registry:
module "vpc" {
source = "terraform-google-modules/network/google"
version = "~> 6.0"
project_id = var.project_id
network_name = var.network_name
routing_mode = var.routing_mode
auto_create_subnetworks = var.auto_create_subnetworks
delete_default_internet_gateway_routes = var.delete_default_internet_gateway_routes
mtu = var.mtu
subnets = var.subnets
firewall_rules = var.firewall_rules
}
Using resources
resource "google_compute_network" "network" {
name = var.network_name
auto_create_subnetworks = var.auto_create_subnetworks
routing_mode = var.routing_mode
project = var.project_id
delete_default_routes_on_create = var.delete_default_internet_gateway_routes
mtu = var.mtu
}
Both these approaches work fine and they give me desired output but I would like to know which one should be used when? When should I use Terraform Module directly and when should I use resource?

As you may read in the documentation, modules are a way of organising Terraform scripts, it makes them more reusable, sharable and publishable.
In general, when working with Terraform you want to have an infrastructure that's portable and that can be transferred and deployed on multiple environments.
It would have been harder to maintain a complex infrastructure where you only use resources that will be sitting in the same codebase.
modules allows you to group one, to multiple resources, so you can break your infrastructure code to reusable and configurable portions of code, where you can enforce certain defaults. These modules can be used in the main infrastructure codebase making it easily readable and maintainable since it abstracts the details of each part of the infrastructure.
And with this abstraction, you can afford using different patterns to deploy your infrastructure on different environments (example: 'dev', 'staging', and 'prod') since you may only pass different (and less) values for your variables, or your can just duplicate your infrastructure folder (dev-infra) for example to create another (stg-infra), the cost of this code duplication is cheaper than duplicating many files in case of using only ressources.
TLDR; as you shared in your example:
module "vpc" {
source = "terraform-google-modules/network/google"
version = "~> 6.0"
project_id = var.project_id
network_name = var.network_name
routing_mode = var.routing_mode
auto_create_subnetworks = var.auto_create_subnetworks
delete_default_internet_gateway_routes = var.delete_default_internet_gateway_routes
mtu = var.mtu
subnets = var.subnets
firewall_rules = var.firewall_rules
}
Does more than creating the VPC on GCP. As you can see in their source-code, it is composed on many other modules, in which they use the google_compute_network resources along others to setup networks, firewall rules, subnets and more.
You may have a very complex VPC network that can be done only by using the module that abstracts the usage of the resources. That same VPC setup might be re-used for other GCP projects for different environments. Or you can even build a module on top of this if you want to enforce some configurations, like you hardcode auto_create_subnetworks = true.
In some organisations, it is recommended to use only the organisation's approved terraform module, where they apply certain patterns, and set some defaults/best-practices.
See more:
https://developer.hashicorp.com/terraform/tutorials/modules/module#what-are-modules-for
https://www.terraform-best-practices.com/key-concepts#composition
https://www.terraform-best-practices.com/examples/terraform
https://developer.hashicorp.com/terraform/language/modules
Hope this helps!

Related

Terraform using both required_providers and provider blocks

I am going through a terraform guide, where the author is spinning up a docker setup using the docker_image and docker_container resources.
In the sample code the main.tf file includes both the required_providers and the provider blocks, as follows:
terraform {
required_providers {
docker = {
source = "kreuzwerker/docker"
}
}
}
provider "docker" {}
Why are they both needed?
Shouldn't terraform be able to understand the need for a docker provider, only by this line?
provider "docker" {}
When considering Terraform providers there are two related notions to think about: the provider itself, and a configuration for the provider.
As an analogy, the provider kreuzwerker/docker here is a bit like a class you're importing from another library, giving it the local name docker. I'll use a pseudo-JavaScript syntax just to make this a bit more concrete:
var docker = require("kreuzwerker/docker");
However, all we have here so far is the class itself. In order to use it we need to create an instance of it, which in Terraform's vernacular is called a "configuration". Again, using pseudo-JavaScript syntax:
var dockerInstance = new docker({});
Terraform's syntax here is decidedly less explicit than this pseudo-JavaScript form, but we can make the distinction more visible by adding a second instance of the provider to the configuration, which in Terraform we do by assigning it a configuration "alias":
provider "docker" {
alias = "example"
host = "ssh://user#remote-host:22"
}
This is like creating a second instance of the provider "class" in our pseudo-JavaScript example:
var dockerInstance2 = new docker({
host: 'ssh://user#remote-host:22'
});
Another variant that shows the distinction is when a module inherits a provider configuration from its calling module. In that case, it's as if the calling module were implicitly passing the provider configuration (instance) into the module, but the child module still needs to import the provider "class" so Terraform can see that we're talking about kreuzwerker/docker as opposed to any other provider that might have the name "docker".
Terraform has some automatic "magic" behaviors that try to make simpler cases implicit, but unfortunately that comes at the cost of making it harder to understand what's going on when things get more complicated. Providers and provider configurations are a particularly hard example of this, because providers have been in the Terraform language for a long time and the current incarnation of the language is trying to stay broadly backward-compatible with the simple uses while still allowing for the newer features like having third-party providers installable from multiple namespaces.
The particularly confusing assumption here is that if you don't declare a particular provider Terraform will create an implicit required_providers declaration assuming that you mean a provider in the hashicorp/ namespace, which makes it seem as though required_providers is only for third-party providers. In fact though, that is largely a backward-compatibility mechanism and so I'd suggest always writing out the required_providers entries, even for the providers in the hashicorp/ namespace, so that less-experienced readers don't need to know about this special backward-compatibility behavior. In your case though, the provider you're using is in a third-party namespace anyway and so the required_providers entry is mandatory.
The source needs to be provided since this isn't one of the "official" HashiCorp providers. There could be multiple providers with the name "docker" in the provider registry, so providing the source is needed in order to tell Terraform exactly which provider to download.

Terraform multiple resources with the same monitoring settings

Multiple resources (S3, lambda, etc.) in AWS are created by different teams via Terraform scripts;
I’ve developed my Terraform scripts for CloudWatch monitoring, they require AWS ARN as input;
Is it possible to use my monitoring terraform scripts by each team without copying them to their repos?
Yes you can let each team make use of your terraform scripts by creating them as a module. Then the teams can load them into their own scripts using the module configuration section If you do not use git, or you have all terraform in one repository, there are other ways to load a module
This is how we make and use a large number of modules to have a consistent setup across all teams. We have a git repository for each module like (module_cloudwatch_monitoring). There is nothing special about defining a module, you only need to have variables and outputs defined.
When a team wants to use that module, they can use the module syntax like:
module "cloudwatch_monitoring" {
source = "git::http://your-git-repository-url.git?ref=latest"
resource_arn = "${aws_s3_bucket.my_bucket.id}"
}
If the module was in a local path in the same repository, you could do something like:
module "cloudwatch_monitoring" {
source = "../../modules/cloudwatch_monitoring"
resource_arn = "${aws_s3_bucket.my_bucket.id}"
}

Can I use variables in the TerraForm main.tf file?

Ok, so I have three .tf-files: main.tf where I state azure as provider, resources.tf where all the my resources are claimed, and variables.tf.
I use variables.tf to store keys used by resources.tf.
However, I want to use variables stored in my variable file to fill in the fields in the backend scope like this:
main.tf:
provider "azurerm" {
version = "=1.5.0"
}
terraform {
backend "azurerm" {
storage_account_name = "${var.sa_name}"
container_name = "${var.c_name}"
key = "${var.key}"
access_key = "${var.access_key}"
}
}
Variables stored in variables.tf like this:
variable "sa_name" {
default = "myStorageAccount"
}
variable "c_name" {
default = "tfstate"
}
variable "key" {
default = "codelab.microsoft.tfstate"
}
variable "access_key" {
default = "weoghwoep489ug40gu ... "
}
I got this when running terraform init:
terraform.backend: configuration cannot contain interpolations
The backend configuration is loaded by Terraform extremely early,
before the core of Terraform can be initialized. This is necessary
because the backend dictates the behavior of that core. The core is
what handles interpolation processing. Because of this, interpolations
cannot be used in backend configuration.
If you'd like to parameterize backend configuration, we recommend
using partial configuration with the "-backend-config" flag to
"terraform init".
Is there a way of solving this? I really want all my keys/secrets in the same file... and not one key in the main which I preferably want to push to git.
Terraform doesn't care much about filenames: it just loads all .tf files in the current directory and processes them. Names like main.tf, variables.tf, and outputs.tf are useful conventions to make it easier for developers to navigate the code, but they won't have much impact on Terraform's behavior.
The reason you're seeing the error is that you're trying to use variables in a backend configuration. Unfortunately, Terraform does not allow any interpolation (any ${...}) in backends. Quoting from the documentation:
Only one backend may be specified and the configuration may not contain interpolations. Terraform will validate this.
So, you have to either hard-code all the values in your backend, or provide a partial configuration and fill in the rest of the configuration via CLI params using an outside tool (e.g., Terragrunt).
There are some important limitations on backend configuration:
A configuration can only provide one backend block.
A backend block cannot refer to named values (like input variables, locals, or data source attributes).
Terraform backends configurations one can see at below link:
https://www.terraform.io/docs/configuration/backend.html

How to manage terraform for multiple repos

I have 2 repos for my project. A Static website and server. I want the website to be hosted by cloudfront and s3 and the server on elasticbeanstalk. I know these resources will need to know about a route53 resource at least to be under the same domain name for cors to work. Among other things such as vpcs and stuff.
So my question is how do I manage terraform with multiple repos.
I'm thinking I could have a seperate infrastructure repo that builds for all repos.
I could also have them seperate and pass in the arns/names/ids as variables (annoying).
You can use terraform remote_state for this. It lets you read the output variables from another terraform state file.
Lets assume you save your state files remotely on s3 and you have your website.tfstate and server.tfstate file. You could output your hosted zone ID of your route53 zone as hosted_zone_id in your website.tfstate and then reference that output variable directly in your server state terraform code.
data "terraform_remote_state" "website" {
backend = "s3"
config {
bucket = "<website_state_bucket>"
region = "<website_bucket_region>"
key = "website.tfstate"
}
}
resource "aws_route53_record" "www" {
zone_id = "${data.terraform_remote_state.website.hosted_zone_id}"
name = "www.example.com"
type = "A"
ttl = "300"
records = ["${aws_eip.lb.public_ip}"]
}
Note, that you can only read output variables from remote states. You cannot access resources directly, as terraform treats other states/modules as black boxes.
Update
As mentioned in the comments, terraform_remote_state is a simple way to share explicitly published variables across multiple states. However, it comes with 2 issues:
Close coupling between code components, i.e., producer of the variable cannot change easily.
It can only be used by terraform, i.e., you cannot easily share those variables across different layers. Configuration tools such as Ansible cannot use .tfstate natively without some additional custom plugin/wrapper.
The recommended HashiCorp way is to use a central config store such as Consul. It comes with more benefits:
Consumer is decoupled from the variable producer.
Explicit publishing of variables (like in terraform_remote_state).
Can be used by other tools.
A more detailed explanation can be found here.
An approach I've used in the past is to have a single repo for all of the Infrastructure.
An alternative is to have 2 separate tf configurations, each using remote state. Config 1 can use output variables to store any arns/ids as necessary.
Config 2 can then have a remote_state data source to query for the relevant arns/ids.
E.g.
# Declare remote state
data "terraform_remote_state" "network" {
backend = "s3"
config {
bucket = "my-terraform-state"
key = "network/terraform.tfstate"
region = "us-east-1"
}
}
You can then use output values using standard interpolation syntax
${data.terraform_remote_state.network.some_id}

How to use multiple Terraform Providers sequentially

How can I get Terraform 0.10.1 to support two different providers without having to run 'terraform init' every time for each provider?
I am trying to use Terraform to
1) Provision an API server with the 'DigitalOcean' provider
2) Subsequently use the 'Docker' provider to spin up my containers
Any suggestions? Do I need to write an orchestrating script that wraps Terraform?
Terraform's current design struggles with creating "multi-layer" architectures in a single configuration, due to the need to pass dynamic settings from one provider to another:
resource "digitalocean_droplet" "example" {
# (settings for a machine running docker)
}
provider "docker" {
host = "tcp://${digitalocean_droplet.example.ipv4_address_private}:2376/"
}
As you saw in the documentation, passing dynamic values into provider configuration doesn't fully work. It does actually partially work if you use it with care, so one way to get this done is to use a config like the above and then solve the "chicken-and-egg" problem by forcing Terraform to create the droplet first:
$ terraform plan -out=tfplan -target=digitalocean_droplet.example
The above will create a plan that only deals with the droplet and any of its dependencies, ignoring the docker resources. Once the Docker droplet is up and running, you can then re-run Terraform as normal to complete the setup, which should then work as expected because the Droplet's ipv4_address_private attribute will then be known. As long as the droplet is never replaced, Terraform can be used as normal after this.
Using -target is fiddly, and so the current recommendation is to split such systems up into multiple configurations, with one for each conceptual "layer". This does, however, require initializing two separate working directories, which you indicated in your question that you didn't want to do. This -target trick allows you to get it done within a single configuration, at the expense of an unconventional workflow to get it initially bootstrapped.
Maybe you can use a provider instance within your resources/module to set up various resources with various providers.
https://www.terraform.io/docs/configuration/providers.html#multiple-provider-instances
The doc talks about multiple instances of same provider but I believe the same should be doable with distinct providers as well.
A little bit late...
Well, got the same Problem. My workaround is to create modules.
First you need a module for your docker Provider with an ip variable:
# File: ./docker/main.tf
variable "ip" {}
provider "docker" {
host = "tcp://${var.ip}:2375/"
}
resource "docker_container" "www" {
provider = "docker"
name = "www"
}
Next one is to load that modul in your root configuration:
# File: .main.tf
module "docker01" {
source = "./docker"
ip = "192.169.10.12"
}
module "docker02" {
source = "./docker"
ip = "192.169.10.12"
}
True, you will create on every node the same container, but in my case that's what i wanted. I currently haven't found a way to configure the hosts with an individual configuration. Maybe nested modules, but that didn't work in the first tries.

Resources