Running 'terragrunt apply' on an EC2 Instance housed in a No Internet Environment - terraform

I have been trying to set up my Terragrunt EC2 environment in a no/very limited internet setting.
Current Setup:
AWS Network Firewall that whitelists domains to allow traffic; most internet traffic is blocked except for a few domains.
EC2 instance where I run the Terragrunt code; it has an instance profile that can assume the roles declared in the providers.
VPC endpoints set up for STS, S3, DynamoDB, CodeArtifact, etc.
All credentials (assumed roles, etc.) work and have been verified.
Remote State and Providers File
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "***"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-southeast-1"
    encrypt        = true
    dynamodb_table = "***"
  }
}
# Dynamically changes the role depending on which account is being modified
generate "providers" {
path = "providers.tf"
if_exists = "overwrite"
contents = <<EOF
provider "aws" {
region = "${local.env_vars.locals.aws_region}"
assume_role {
role_arn = "arn:aws:iam::$***"
endpoints {
sts = "https://sts.ap-southeast-1.amazonaws.com"
s3 = "https://s3.ap-southeast-1.amazonaws.com"
dynamodb = "https://dynamodb.ap-southeast-1.amazonaws.com"
}
}
EOF
}
With Internet (Turning off the firewall):
I am able to run all the terragrunt commands.
Without Internet:
I only allow "registry.terraform.io" through the firewall.
I am able to assume the role listed in providers via aws sts assume-role, and I can list the tables in DynamoDB and the files in the S3 bucket.
I am able to run terragrunt init on my EC2 instance with the instance profile, so I assume Terragrunt is using the correct STS endpoint.
However, when I run terragrunt apply, it hangs at the stage `DEBU[0022] Running command: terraform plan prefix=[***]`.
In CloudTrail I do see that Terragrunt has assumed the username aws-go-sdk-1660077597688447480 for the GetCallerIdentity event, so I think the provider is able to assume the role declared in the providers block.
I tried adding custom endpoints for sts, s3, and dynamodb, but it still hangs.
I suspect that Terraform is still trying to reach the internet when making AWS SDK calls, which is why terragrunt apply gets stuck.
Is there a comprehensive list of endpoints I need to add as custom endpoints, or a list of domains I should whitelist, to be able to run terragrunt apply?

I set the environment variable TF_LOG to debug, and besides the registry.terraform.io domain, I was able to gather these:
github.com
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/go-tfe v1.0.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/hcl/v2 v2.12.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-config-inspect v0.0.0-20210209133302-4fd17a0faac2
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-svchost v0.0.0-20200729002733-f050f53b9734
sts.region.amazonaws.com
resource.region.amazonaws.com
You'll want to add those domains to the whitelist in your firewall settings. Something like *.region.amazonaws.com should do the trick; of course, you can be more restrictive and, rather than use a wildcard, specify the exact service endpoints.
For reference: https://docs.aws.amazon.com/general/latest/gr/rande.html
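If the Network Firewall rules are themselves managed with Terraform, a domain allowlist covering the domains above could look roughly like the sketch below. The rule group name and capacity are placeholders, and the exact domain set depends on which providers and AWS services your plan touches; this is an illustration, not the asker's actual configuration.
resource "aws_networkfirewall_rule_group" "egress_allowlist" {
  capacity = 100                # placeholder capacity
  name     = "terraform-egress" # placeholder name
  type     = "STATEFUL"

  rule_group {
    rules_source {
      rules_source_list {
        generated_rules_type = "ALLOWLIST"
        target_types         = ["HTTP_HOST", "TLS_SNI"]
        targets = [
          "registry.terraform.io",
          "github.com",
          ".ap-southeast-1.amazonaws.com", # leading dot matches the domain and all subdomains (sts, s3, dynamodb, ...)
        ]
      }
    }
  }
}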

Related

Terraform: How to obtain VPCE service name when it was dynamically created

I am trying to obtain (via Terraform) the DNS name of a dynamically created VPC endpoint using a data source, but the problem I am facing is that the service name is not known until the resources have been created. See notes below.
Is there any way of retrieving this information, as a hard-coded service name just doesn't work for automation?
e.g. this will not work because the service_name is dynamic:
resource "aws_transfer_server" "sftp_lambda" {
count = local.vpc_lambda_enabled
domain = "S3"
identity_provider_type = "AWS_LAMBDA"
endpoint_type = "VPC"
protocols = ["SFTP"]
logging_role = var.loggingrole
function = var.lambda_idp_arn[count.index]
endpoint_details = {
security_group_ids = var.securitygroupids
subnet_ids = var.subnet_ids
vpc_id = var.vpc_id
}
tags = {
NAME = "tf-test-transfer-server"
ENV = "test"
}
}
data "aws_vpc_endpoint" "vpce" {
count = local.vpc_lambda_enabled
vpc_id = var.vpc_id
service_name = "com.amazonaws.transfer.server.c-001"
depends_on = [aws_transfer_server.sftp_lambda]
}
output "transfer_server_dnsentry" {
value = data.aws_vpc_endpoint.vpce.0.dns_entry[0].dns_name
}
Note: The VPCE was created automatically from an AWS SFTP transfer server resource that was configured with an endpoint type of VPC (not VPC_ENDPOINT, which is now deprecated). I had no control over the naming of the endpoint service name; it was all created in the background.
Minimum AWS provider version required: 3.69.0.
Here is an example CloudFormation script to set up an SFTP transfer server using Lambda as the IdP.
This will create the VPCE automatically.
So my aim here is to output the DNS name from the auto-created VPC endpoint using Terraform, if at all possible.
example setup in cloudFormation
data source: aws_vpc_endpoint
resource: aws_transfer_server
I had a response from HashiCorp Terraform support on this, and this is what they suggested:
You can get the service name of the SFTP-server-created VPC endpoint from the exported attribute of the vpc_endpoint_service resource [a].
NOTE: There are certain setups that cause AWS to create additional resources outside of what you configured. The AWS SFTP transfer service is one of them. This behavior is outside Terraform's control and is more due to how AWS designed the service.
You can bring that VPC endpoint back under Terraform's control, however, by importing the endpoint AWS creates on your behalf, via the VPCe ID, AFTER the transfer server has been created [b].
If you want more ideas of pulling the service name from your current AWS setup, feel free to check out this example [c].
Hope that helps! Thank you.
[a] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint_service#service_name
[b] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint#import
[c] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_endpoint#gateway-load-balancer-endpoint-type
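As a rough illustration of [b], importing the auto-created endpoint by its VPCe ID might look like this; the resource address and endpoint ID are placeholders.
# Assumes an aws_vpc_endpoint resource block named "sftp" already exists in the configuration.
terraform import aws_vpc_endpoint.sftp vpce-0123456789abcdef0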
There is a way forward with the imports, as I shared earlier, but it is not going to be fully automated, unfortunately.
Optionally, you can use a provisioner [1] and the aws ec2 describe-vpc-endpoint-services --service-names command [2] to get the service names you need (a rough sketch follows the references below).
I'm afraid that's the last workaround I can provide; as explained in our doc here [3], as much as we'd like it to, Terraform isn't able to solve every use case.
[1] https://www.terraform.io/language/resources/provisioners/remote-exec
[2] https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ec2/describe-vpc-endpoint-services.html
[3] https://www.terraform.io/language/resources/provisioners/syntax
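A minimal sketch of that CLI-based workaround follows. The resource name is illustrative, and the dumped output still has to be inspected or post-processed by hand, so this is not fully automated.
resource "null_resource" "vpce_service_lookup" {
  # Wait until the transfer server (and its auto-created endpoint) exists.
  depends_on = [aws_transfer_server.sftp_lambda]

  provisioner "local-exec" {
    # Dump the endpoint service names visible in this account/region for inspection.
    command = "aws ec2 describe-vpc-endpoint-services --query ServiceNames > vpce-service-names.json"
  }
}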
I've finally found the solution:
data "aws_vpc_endpoint" "transfer_server_vpce" {
count = local.is_enabled
vpc_id = var.vpc_id
filter {
name = "vpc-endpoint-id"
values = ["${aws_transfer_server.transfer_server[0].endpoint_details[0].vpc_endpoint_id}"]
}
}
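With that filtered data source in place, the DNS name the question asked for can then be exposed as an output, roughly like this (index 0 assumed, matching the count usage above):
output "transfer_server_dnsentry" {
  value = data.aws_vpc_endpoint.transfer_server_vpce[0].dns_entry[0].dns_name
}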

How to update an existing cloudflare_record in terraform and github actions

I created my project with code from the HashiCorp tutorial "Host a static website with S3 and Cloudflare", but the tutorial didn't mention GitHub Actions. So, when I put my project in GitHub Actions, even though terraform plan and terraform apply run successfully locally, I get errors on terraform apply:
Error: expected DNS record to not already be present but already exists
with cloudflare_record.site_cname ...
with cloudflare_record.www
I have two resources in my main.tf, one for the site domain and one for www, like the following:
resource "cloudflare_record" "site_cname" {
zone_id = data.cloudflare_zones.domain.zones[0].id
name = var.site_domain
value = aws_s3_bucket.site.website_endpoint
type = "CNAME"
ttl = 1
proxied = true
}
resource "cloudflare_record" "www" {
zone_id = data.cloudflare_zones.domain.zones[0].id
name = "www"
value = var.site_domain
type = "CNAME"
ttl = 1
proxied = true
}
If I remove these lines of code from my main.tf and then run terraform apply locally, I get the warning that this will destroy my resource.
Which should I do?
1. Add an allow_overwrite somewhere? I don't see examples of how to use this in the docs, and the ways I've tried to add it generated errors.
2. Remove the lines from main.tf, knowing the GitHub Actions run will destroy cloudflare_record.www and cloudflare_record.site_cname, and knowing I can see my zone ID and CNAME if I log into Cloudflare, so maybe this code isn't necessary after the initial setup?
3. Run terraform import somewhere? If so, where do I find the zone ID and record ID?
4. Or something else?
Where is your Terraform state? Did you store it locally or in a remote location?
That would explain why you don't have any problems locally and why it's trying to recreate the resources in GitHub Actions.
More information about terraform backend (where the state is stored) -> https://www.terraform.io/docs/language/settings/backends/index.html
And how to create one with S3 for example ->
https://www.terraform.io/docs/language/settings/backends/s3.html
It shouldn't be a problem for Terraform to drop and re-create DNS records, but for better results you need to ensure that GitHub Actions has access to the (current) workspace state.
Since Terraform Cloud provides a free plan, there is no reason not to take advantage of it. Just create a workspace through their dashboard, add a "remote" backend configuration to your project, and ensure that GitHub Actions uses a Terraform API token at runtime (you would set it via GitHub repository settings > Secrets).
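A minimal sketch of that "remote" backend block, with the organization and workspace names as placeholders for your own:
terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "your-org" # placeholder: your Terraform Cloud organization

    workspaces {
      name = "your-workspace" # placeholder: the workspace created in the dashboard
    }
  }
}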
You may want to check this example — Terraform Starter Kit
infra/backend.tf
infra/dns-records.tf
scripts/tf.js
Here is how you can pass the Terraform API token from the secrets.TERRAFORM_API_TOKEN GitHub secret to the Terraform CLI:
- env: { TERRAFORM_API_TOKEN: "${{ secrets.TERRAFORM_API_TOKEN }}" }
  run: |
    echo "credentials \"app.terraform.io\" { token = \"$TERRAFORM_API_TOKEN\" }" > ./.terraformrc

Apply a `configMap` to EKS cluster with Terraform

I am trying to apply a configMap to an EKS cluster through Terraform, but I don't see how. There is lots of documentation about this, but I don't see anyone succeeding with it, so I am not sure if this is possible or not.
Currently we control our infrastructure through Terraform. When I create the .kube/config file through the AWS CLI and try to connect to the cluster, I get the Unauthorized error, whose fix is documented here in the AWS docs. According to the docs, we need to edit the aws-auth ConfigMap and add some lines to it, which configures the API server to accept requests from a VM with a certain role. The problem is that only the cluster creator has access to connect to the cluster and make these changes. The cluster creator in this case is Terraform, so what we do is run aws configure, add Terraform's credentials to the VM from which we are trying to connect to the cluster, successfully authenticate against it, add the necessary lines to the ConfigMap, and then revoke the credentials from the VM.
From there on, any user can connect to the cluster from that VM, which is our goal. Now we would like to be able to edit the ConfigMap through a Terraform object instead of going through this whole process. There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS, so it is not able to find the cluster and fails trying to connect to an API server running on localhost.
There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS
It is a different provider, because Terraform should now interact with a different API (Kubernetes API instead of AWS API).
There are data sources for aws_eks_cluster and aws_eks_cluster_auth that can be used to authenticate the kubernetes provider.
The aws_eks_cluster_auth documentation has examples of authenticating the kubernetes provider:
data "aws_eks_cluster" "example" {
name = "example"
}
data "aws_eks_cluster_auth" "example" {
name = "example"
}
provider "kubernetes" {
host = data.aws_eks_cluster.example.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.example.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.example.token
load_config_file = false
}
Another example is how the Cloud Posse AWS EKS module authenticates the kubernetes provider and also manages a ConfigMap.
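For illustration, a kubernetes_config_map resource for aws-auth could look roughly like the sketch below. The role ARN is a placeholder, and note that EKS typically creates aws-auth itself, so the existing ConfigMap may need to be imported into state before Terraform can manage it.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    # Map an IAM role to Kubernetes groups so principals assuming that role can authenticate.
    mapRoles = yamlencode([
      {
        rolearn  = "arn:aws:iam::111122223333:role/my-vm-role" # placeholder ARN
        username = "admin"
        groups   = ["system:masters"]
      }
    ])
  }
}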

Terraform cloud : Import existing resource

I am using terraform cloud to manage the state of the infrastructure provisioned in AWS.
I am trying to use terraform import to import an existing resource that is currently not managed by terraform.
I understand terraform import is a local-only command. I have set up a workspace reference as follows:
terraform {
  required_version = "~> 0.12.0"

  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "foo"

    workspaces {
      name = "bar"
    }
  }
}
The AWS credentials are configured in the remote cloud workspace, but Terraform does not appear to reference the AWS credentials from the workspace; instead it falls back to using the local credentials, which point to a different AWS account. I would like Terraform to use the credentials from the workspace variables when I run terraform import.
When I comment out the locally configured credentials, I get the error:
Error: No valid credential sources found for AWS Provider.
I would have expected terraform to use the credentials configured in the workspace.
Note that Terraform is able to use the credentials correctly when I run the plan/apply commands directly from the cloud console.
Per the backends section of the import docs, plan and apply run in Terraform Cloud whereas import runs locally. Therefore, the import command will not have access to workspace credentials set in Terraform Cloud. From the docs:
In order to use Terraform import with a remote state backend, you may need to set local variables equivalent to the remote workspace variables.
So instead of running the following locally (assuming you've provided access keys to Terraform Cloud):
terraform import aws_instance.myserver i-12345
we should run for example:
export AWS_ACCESS_KEY_ID=abc
export AWS_SECRET_ACCESS_KEY=1234
terraform import aws_instance.myserver i-12345
where the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY have the same permissions as those configured in Terraform Cloud.
Note for AWS SSO users
If you are using AWS SSO and CLI v2, functionality for Terraform to use the SSO credential cache was added per this AWS provider issue. The steps for importing with an SSO profile are:
Ensure you've performed a login and have an active session, e.g. aws sso login --profile my-profile
Make the profile name available to Terraform as an environment variable, e.g. AWS_PROFILE=my-profile terraform import aws_instance.myserver i-12345
If the following error is displayed, ensure you are using a CLI version > 2.1.23:
Error: SSOProviderInvalidToken: the SSO session has expired or is invalid
│ caused by: expected RFC3339 timestamp: parsing time "2021-07-18T23:10:46UTC" as "2006-01-02T15:04:05Z07:00": cannot parse "UTC" as "Z07:00"
Use the terraform_remote_state data source, for example:
data "terraform_remote_state" "test" {
backend = "s3"
config = {
bucket = "BUCKET_NAME"
key = "BUCKET_KEY WHERE YOUR TERRAFORM.TFSTATE FILE IS PRESENT"
region = "CLOUD REGION"
}
}
Now you can reference your provisioned resources.
Example: to get the VPC ID:
data.terraform_remote_state.test.*.outputs.vpc_id
Just make sure the property of the cloud resource you want to reference is exported as an output and therefore stored in the terraform.tfstate file.
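For that to work, the configuration that owns the remote state needs to export the value as an output, roughly like this (the resource name is assumed for illustration):
# In the configuration whose state lives at BUCKET_NAME/BUCKET_KEY:
output "vpc_id" {
  value = aws_vpc.main.id # assumption: a VPC resource named "main"
}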

Terraform profile field usage in AWS provider

I have a $HOME/.aws/credentials file like this:
[config1]
aws_access_key_id=accessKeyId1
aws_secret_access_key=secretAccesskey1
[config2]
aws_access_key_id=accessKeyId2
aws_secret_access_key=secretAccesskey2
So I was expecting that, with this configuration, Terraform would choose the second set of credentials:
terraform {
  backend "s3" {
    bucket  = "myBucket"
    region  = "eu-central-1"
    key     = "path/to/terraform.tfstate"
    encrypt = true
  }
}

provider "aws" {
  profile = "config2"
  region  = "eu-central-1"
}
But when I try terraform init it says it hasn't found any valid credentials:
Initializing the backend...
Error: No valid credential sources found for AWS Provider.
Please see https://terraform.io/docs/providers/aws/index.html for more information on
providing credentials for the AWS Provider
As a workaround, I made config2 the default profile in my credentials file and removed the profile field from the provider block, which works, but I really need to use something like the first approach. What am I missing here?
Unfortunately, you need to provide the IAM credential configuration to the backend configuration as well as to your AWS provider configuration.
The S3 backend configuration takes the same parameters here as the AWS provider, so you can specify the backend configuration like this:
terraform {
  backend "s3" {
    bucket  = "myBucket"
    region  = "eu-central-1"
    key     = "path/to/terraform.tfstate"
    encrypt = true
    profile = "config2"
  }
}

provider "aws" {
  profile = "config2"
  region  = "eu-central-1"
}
There are a few reasons why this needs to be done separately. One is that you can independently use different IAM credentials, accounts, and regions for the S3 bucket and for the resources you will be managing with the AWS provider. You might also want to use S3 as a backend even if you are creating resources in another cloud provider, or not using a cloud provider at all; Terraform can manage resources in a lot of places that don't have a way to store Terraform state. The main reason, though, is that backends are managed by the core Terraform binary rather than by the provider plugins, and backend initialisation happens before pretty much anything else.
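As a variation on the same idea, the backend's profile can also be supplied at init time instead of being hard-coded in the backend block, e.g.:
# Partial backend configuration: leave "profile" out of the backend block and pass it here.
terraform init -backend-config="profile=config2"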
