Has anyone experienced issues with Terraform being throttled and becoming very slow when used with AWS Route 53 records?
I have enabled DEBUG mode and am getting this:
2018-11-30T14:35:08.467Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: 2018/11/30 14:35:08 [DEBUG] [aws-sdk-go] <?xml version="1.0"?>
2018-11-30T14:35:08.467Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: <ErrorResponse xmlns="https://route53.amazonaws.com/doc/2013-04-01/"><Error><Type>Sender</Type><Code>Throttling</Code><Message>Rate exceeded</Message></Error><RequestId>REQUEST_ID</RequestId></ErrorResponse>
2018-11-30T14:35:08.518Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: 2018/11/30 14:35:08 [DEBUG] [aws-sdk-go] DEBUG: Validate Response route53/ListResourceRecordSets failed, will retry, error Throttling: Rate exceeded
Terraform takes more than an hour just to do a simple plan, something which normally takes less than 5 minutes.
My infrastructure is organized like this:
alb.tf:
module "ALB"
{ source = "modules/alb" }
modules/alb/alb.tf:
resource "aws_alb" "ALB"
{ name = "alb"
subnets = var.subnets ...
}
modules/alb/dns.tf:
resource "aws_route53_record" "r53" {
count = "${length(var.cnames_generic)}"
zone_id = "HOSTED_ZONE_ID"
name = "${element(var.cnames_generic_dns, count.index)}.${var.environment}.${var.domain}"
type = "A"
alias {
name = "dualstack.${aws_alb.ALB.dns_name}"
zone_id = "${aws_alb.ALB.zone_id}"
evaluate_target_health = false
}
}
modules/alb/variables.tf:
variable "cnames_generic_dns" {
type = "list"
default = [
"hostname1",
"hostname2",
"hostname3",
"hostname4",
"hostname5",
"hostname6",
"hostname7",
...
"hostname25"
]
}
So I am using modules to organize my Terraform configuration, and inside the modules there are resources (ALB, DNS, ...).
However, it looks like Terraform is describing every single DNS record (CNAME and A records, of which I have ~1000) in the hosted zone, which is causing it to be throttled?
Terraform v0.10.7
Terraform AWS provider version = "~> 1.36.0"
That's a lot of DNS records! And it's partly the reason why the AWS API is throttling you.
First, I'd recommend upgrading your AWS provider. v1.36 is fairly old and there have been more than a few bug fixes since.
(Next, but not absolutely necessary, is to use TF v0.11.x if possible.)
In your AWS Provider block, increase max_retries to at least 10 and experiment with higher values.
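For illustration, a minimal sketch of such a provider block (the region is a placeholder, not taken from the question):

provider "aws" {
  region      = "us-east-1" # placeholder
  max_retries = 10          # raise further if the Throttling: Rate exceeded errors persist
}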
Then, use Terraform's -parallelism flag on plan and apply to limit its concurrency. Try setting it to 5 for starters.
Last, enable Terraform's debug mode to see if it gives you any more useful info.
Hope this helps!
The problem was solved by performing the following actions:
- Since we restructured the DNS records into a single resource that iterates over a variable with count, Terraform was apparently querying all of the DNS records constantly.
- We let Terraform finish its refresh (it took about 4 hours with lots of throttling).
- We manually deleted the DNS records from Route 53 for the workspace we were working in.
- We commented out the Terraform DNS resources so that it would also delete them from the state file.
- We uncommented the DNS resources and ran Terraform again so it recreated them.
- After that, terraform plan ran fine again.
It looks like the throttling with Terraform and AWS Route 53 is completely resolved after upgrading to a newer AWS provider. We updated the TF AWS provider to 1.54.0 like this in our init.tf:
version = "~> 1.54.0"
Here are more details about the issue and suggestions from Hashicorp engineers:
https://github.com/terraform-providers/terraform-provider-aws/issues/7056
I am new to Terraform and I am just working in a lab environment at the moment, so I hope someone can help me and point me in the right direction as to where I am going wrong. I am following this video for reference: https://www.youtube.com/watch?v=SLB_c_ayRMo&t=2336s. I am running Terraform v1.2.9. I run terraform init and everything initialises, but when I run terraform plan I get this message: "No changes. Your infrastructure matches the configuration.
Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed."
This is my code for reference
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.16"
    }
  }

  required_version = ">= 1.2.0"
}

provider "aws" {
  region     = "us-east-1"
  access_key = "access key"
  secret_key = "secretkey"
}

resource "aws_instance" "my-first-server" {
  ami           = "ami-052efd3df9dad4825"
  instance_type = "t2.micro"

  tags = {
    Name = "MyFirstServer"
  }
}
Any help would be greatly appreciated as I continue my learning journey.
Terraform creates resources (for example, in AWS) and tracks them in a state file. The state file is one large object - nothing complex.
If you look in AWS at EC2 in region us-east-1, check whether your instance is there. If it isn't, you should see terraform plan report that it was deleted outside of Terraform.
One way to check whether it's even being tracked properly is to run terraform state list. That command should show you aws_instance.my-first-server if you've planned and applied this configuration in the past.
If you're still stuck, one way to "start fresh", since it's only a learning exercise and not production work, is to delete the entire state file and begin again.
However, I trust Terraform... you must have the instance running somewhere ;)
I have an ECS Fargate cluster set up that currently has 4 tasks of the same app running on it. The desired number of tasks is defined within a variable:
variable "desired_task_count" {
description = "Desired ECS tasks to run in service"
default = 4
}
When I change the default value to any given number, save it and run terraform plan or terraform apply, terraform doesn't see any changes. The tfstate file remains unchanged.
No changes. Your infrastructure matches the configuration.
Terraform has compared your real infrastructure against your configuration and found
no differences, so no changes are needed.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
TFstate:
"desired_count": 4,
If I change any other variable in that exact same variables.tf file, terraform picks up the changes and applies them.
What I tried to do:
Creating a new variable to pass the value - didn't work.
Rebuilding the infrastructure with destroy and then apply - this did work, as it writes a new state file.
TF and provider versions:
terraform --version
Terraform v1.2.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.75.2
Could this be a provider issue? It seems like the problem only occurs with a variable that points to a specific setting in a specific resource.
What else can I check?
SOLVED:
There was a lifecycle block in the ECS service resource that contained a list of changes to ignore. It was there because of autoscaling, which had temporarily been removed from the project.
lifecycle {
  ignore_changes = [task_definition, desired_count]
}
I am experiencing weird behaviour with Terraform. I have been working on an infrastructure and have a backend configured to store my state file in a storage account in Azure. Until yesterday everything was fine; this morning, when I tried to update my infra, the output from terraform plan was weird, as it was trying to create all the resources as new, and when I checked my local state file it was empty.
I tried terraform pull and terraform refresh, but nothing, still the same result. I checked my remote state and all the resources are still declared there.
So I went for plan B: copied and pasted my remote state into my local project and ran Terraform once again, but nothing; it seems that Terraform is ignoring my local state and doesn't want to pull the remote one.
EDIT:
This is the structure of my Terraform backend:
terraform {
  backend "azurerm" {
    resource_group_name  = "<resource-group-name>"
    storage_account_name = "<storage-name>"
    container_name       = "<container-name>"
    key                  = "terraform.tfstate"
  }
}
The weird thing is that I just used Terraform to create 8 resources for another project, and it created everything and updated my backend state without any issue. The problem is only with the old resources.
Any help please?
If you run terraform workspace show, are you in the default workspace?
If you have the tfstate locally but you're not on the correct workspace, Terraform will ignore it: https://www.terraform.io/docs/language/state/workspaces.html#using-workspaces
Also, is it possible to see your backend file structure?
EDIT:
I don't know why it ignores your remote state, but I think your problem is that when you run terraform refresh it ignores your local file because you have a remote config:
Usage: terraform refresh [options]
-state=path - Path to read and write the state file to. Defaults to "terraform.tfstate". Ignored when remote state is used.
-state-out=path - Path to write updated state file. By default, the -state path will be used. Ignored when remote state is used.
Is it possible to see the output of your terraform state pull?
I am intending to use Terraform to stand up my entire monitoring infrastructure in AWS.
So far in my Terraform project I have created the VPC, subnets, and appropriate security groups. I am using modules from the Terraform Registry where possible:
vpc
security-group
iam-role
eks
The issue I am seeing is that after the EKS cluster is deployed, it adds tags to the VPC and subnets that are not known to Terraform. Hence the next time terraform plan is run, it identifies tags that it does not manage and intends to remove them:
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
~ module.vpc.aws_subnet.private[0]
tags.%: "4" => "3"
tags.kubernetes.io/cluster/monitoring: "shared" => ""
~ module.vpc.aws_subnet.private[1]
tags.%: "4" => "3"
tags.kubernetes.io/cluster/monitoring: "shared" => ""
~ module.vpc.aws_vpc.this
tags.%: "4" => "3"
tags.kubernetes.io/cluster/monitoring: "shared" => ""
Plan: 0 to add, 3 to change, 0 to destroy.
------------------------------------------------------------------------
There is an issue open with terraform-provider-aws with a local workaround using bash, but does anyone know how to get Terraform to become aware of these tags or to get them to be ignored by subsequent plans in a robust way?
Just add the tags when you call the module. Notice that the example at https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/1.41.0 shows tags there, and the docs say "A map of tags to add to all resources", so you can add the EKS tags to that map.
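For example, assuming the cluster is named monitoring (as the plan output above suggests) and the terraform-aws-modules/vpc/aws module is used, a sketch of the module call could look like this; the other arguments are omitted:

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  # ... existing VPC arguments ...

  tags = {
    "kubernetes.io/cluster/monitoring" = "shared"
  }
}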
If you controlled the module, you could try to use the ignore_changes argument in the lifecycle block. Something like:
lifecycle {
  ignore_changes = [
    "tags"
  ]
}
It's going to be much trickier with a module that you don't control though.
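A later alternative, if you are able to move to AWS provider v2.60.0 or newer, is the provider-level ignore_tags configuration block, which tells the provider to ignore matching tags on all resources regardless of whether you control the module; a rough sketch:

provider "aws" {
  ignore_tags {
    key_prefixes = ["kubernetes.io/"]
  }
}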
So in the end we chose not to use Terraform to deploy the cluster at all; instead we use eksctl, the community-based tool from Weaveworks.
https://eksctl.io/
It was recommended by an AWS solutions architect when we were at the AWS offices in London for some training.
The config can be stored in source control if needed.
eksctl create cluster -f cluster.yaml
Since EKS does a lot of tagging of infrastructure, our lives are much better now that the state file is not complaining about tags.
I am trying to declare the following Terraform provider:
provider "mysql" {
endpoint = "${aws_db_instance.main.endpoint}:3306"
username = "root"
password = "root"
}
I get the following error:
Error refreshing state: 1 error(s) occurred:
* dial tcp: lookup ${aws_db_instance.main.endpoint}: invalid domain name
It seems that Terraform is not performing interpolation on my endpoint string, yet I don't see anything in the documentation about this -- what gives?
Yes, it does perform interpolation there. There's an example in the docs at https://www.terraform.io/docs/providers/mysql/
# Configure the MySQL provider based on the outcome of
# creating the aws_db_instance.
provider "mysql" {
endpoint = "${aws_db_instance.default.endpoint}"
username = "${aws_db_instance.default.username}"
password = "${aws_db_instance.default.password}"
}
I ran into a similar set of error messages ("connect failed," "invalid domain lookup") and looked into this a bit. I hope this helps you or someone else working across cloud and database providers in Terraform.
This seems to come down to the MySQL provider attempting to establish a database connection as soon as it's initialized, which could be a problem if you're trying to build a database server and configure the database / grants on it as part of the same Terraform run. Providers get initialized based on Terraform finding a resource owned by that provider in your Terraform code, and since this connection attempt happens when the provider gets initialized, you can't work around this with -target=<SPECIFIC RESOURCE>.
The workarounds I can think of would be to have a codebase for setting up the database server and a different codebase for setting up the database grants and suchlike ... or to have Terraform kick off a script that does that work for you (with dynamic parameters, of course!). Either way, you're effectively removing mysql_* resources from your initial Terraform run and that's what fixes this.
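A rough sketch of the split-codebase approach, assuming the first codebase uses an S3 backend and exposes the instance endpoint through an output named endpoint (bucket, key, and region are illustrative; 0.11-era syntax to match the question):

# In the second codebase: read the endpoint from the first codebase's state,
# then configure the MySQL provider from it.
data "terraform_remote_state" "db" {
  backend = "s3"

  config {
    bucket = "my-terraform-state"   # illustrative
    key    = "db/terraform.tfstate" # illustrative
    region = "us-east-1"            # illustrative
  }
}

provider "mysql" {
  endpoint = "${data.terraform_remote_state.db.endpoint}:3306"
  username = "root"
  password = "root"
}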
There are a couple of code changes that probably need to happen here - the Terraform MySQL provider would need to delay connecting to the database until Terraform tells it to run an operation on a resource, and it may be necessary to look at how Terraform handles dependencies across providers. I tried hacking in deferred connection logic just for the mysql_database resource to see if that solved all my problems and Terraform still complained about a dependency loop in the graph.
You can track the MySQL provider issue here:
https://github.com/terraform-providers/terraform-provider-mysql/issues/2
And the comments from before providers were split into their own releasable codebases:
https://github.com/hashicorp/terraform/issues/5687