Terraform AWS OpenSearch using Cognito module circular problem - terraform

Hi, I'm trying to create a Terraform module to deploy AWS OpenSearch using Cognito, but it seems it can't be completed in a single apply!
To create an OpenSearch cluster with Cognito, you need to:
1. Create a Cognito user pool
2. Create a Cognito user pool client app
3. Create a Cognito identity pool (giving it the user pool and client app)
4. Pass the user pool and identity pool to OpenSearch
After the OpenSearch cluster is installed, it creates a new client app that you then have to add to the identity pool!
Does anyone know how to get around doing a terraform deploy -> manual update?
[EDIT]
Added a code snippet that resolves the issue. I didn't attach the starter code that deploys OpenSearch with Cognito, as it's a good few hundred lines of code and seemed redundant.
## Runs after the OpenSearch domain and Cognito have been built to
## add the OpenSearch-created client app to the Cognito identity pool
data "external" "cognito" {
  depends_on = [
    aws_opensearch_domain.this
  ]

  # Uses the AWS CLI and jq to find the user pool client whose name contains
  # "AmazonOpenSearchService", i.e. the client app OpenSearch created itself
  program = ["sh", "-c", "aws cognito-idp list-user-pool-clients --user-pool-id ${aws_cognito_user_pool.cognito-user-pool.id} | jq '.UserPoolClients | .[] | select(.ClientName | contains(\"AmazonOpenSearchService\"))'"]
}
output "cognito" {
value = data.external.cognito.result["ClientId"]
}
resource "aws_cognito_identity_pool" "cognito-identity-pool-opensearch" {
depends_on = [
data.external.cognito
]
identity_pool_name = "opensearch-${var.domain_name}-identity-pool"
allow_unauthenticated_identities = false
cognito_identity_providers {
client_id = data.external.cognito.result["ClientId"]
provider_name = aws_cognito_user_pool.cognito-user-pool.endpoint
server_side_token_check = false
}
}

Although your question should provide some sample code, I happen to know exactly what you're referring to because I've had to deal with it in several projects.
Unless things have changed since I last dealt with this, there is no easy solution and it's a gaping hole in the AWS API and Terraform AWS provider. The workaround I've used is:
1. Create the OpenSearch domain and allow it to create the Cognito user pool client app.
2. Use an external data source to make an AWS CLI call to read the OpenSearch domain, which will get you the details of the client app it created.
3. Use an external data source to update the client app using the AWS CLI and change the necessary settings (a rough sketch of this step follows below).
It sucks, yes.
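For step 3, here is a rough sketch that swaps the external data source for a null_resource with a local-exec provisioner, reusing the resource names from the snippet above; the specific update-user-pool-client flags (callback URL, OAuth settings) are placeholders for whatever settings you actually need to change:
# Sketch only: pushes settings onto the client app that OpenSearch created.
# The flags below are illustrative placeholders; adjust to your needs.
resource "null_resource" "update_opensearch_client_app" {
  triggers = {
    client_id = data.external.cognito.result["ClientId"]
  }

  provisioner "local-exec" {
    command = <<-EOT
      aws cognito-idp update-user-pool-client \
        --user-pool-id ${aws_cognito_user_pool.cognito-user-pool.id} \
        --client-id ${data.external.cognito.result["ClientId"]} \
        --callback-urls "https://${aws_opensearch_domain.this.endpoint}/_dashboards/app/home" \
        --allowed-o-auth-flows code \
        --allowed-o-auth-scopes openid \
        --allowed-o-auth-flows-user-pool-client \
        --supported-identity-providers COGNITO
    EOT
  }
}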

Related

Running 'terragrunt apply' on an EC2 Instance housed in a No Internet Environment

I have been trying to set up my Terragrunt EC2 environment in a no/very limited internet setting.
Current Setup:
An AWS Network Firewall that whitelists domains to allow traffic; most internet traffic is blocked except for a few domains.
An EC2 instance where I run the Terragrunt code; it has an instance profile that can assume the role in the providers block.
VPC endpoints set up for STS, S3, DynamoDB, CodeArtifact, etc.
All credentials (assumed role, etc.) work and have been verified.
Remote State and Providers File
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "***"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-southeast-1"
    encrypt        = true
    dynamodb_table = "***"
  }
}
# Dynamically changes the role depending on which account is being modified
generate "providers" {
  path      = "providers.tf"
  if_exists = "overwrite"
  contents  = <<EOF
provider "aws" {
  region = "${local.env_vars.locals.aws_region}"

  assume_role {
    role_arn = "arn:aws:iam::$***"
  }

  endpoints {
    sts      = "https://sts.ap-southeast-1.amazonaws.com"
    s3       = "https://s3.ap-southeast-1.amazonaws.com"
    dynamodb = "https://dynamodb.ap-southeast-1.amazonaws.com"
  }
}
EOF
}
With Internet (Turning off the firewall):
I am able to run all the terragrunt commands
Without Internet
I only allow "registry.terraform.io" to pass the firewall
I am able to assume the role listed in providers via aws sts assume-role, and I can list the tables in dynamodb and files in the s3 bucket
I am able to run terragrunt init on my EC2 instance with the instance profile, so I assume Terragrunt uses the correct STS endpoint.
However, when I run terragrunt apply, it hangs at the stage `DEBU[0022] Running command: terraform plan prefix=[***]`.
In my CloudTrail I do see that Terragrunt has assumed the username aws-go-sdk-1660077597688447480 for the event GetCallerIdentity, so I think the provider is able to assume the role that was declared in the providers block.
I tried adding custom endpoints for sts, s3, and dynamodb, but it still hangs.
I suspect that terraform is still trying to use the internet when making the AWS SDK calls, which leads to terragrunt apply being stuck.
Is there a comprehensive list of endpoints I need to custom add, or a list of domains I should whitelist to be able to run terragrunt apply?
I set the environment variable TF_LOG to debug, and besides the registry.terraform.io domain, I was able to gather these:
github.com
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/go-tfe v1.0.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/hcl/v2 v2.12.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-config-inspect v0.0.0-20210209133302-4fd17a0faac2
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-svchost v0.0.0-20200729002733-f050f53b9734
sts.region.amazonaws.com
resource.region.amazonaws.com
You'll want to add those domains to your whitelist in the firewall settings. Something like *.region.amazonaws.com should do the trick; of course, you can be more restrictive and, rather than using a wildcard, specify the exact resources you need.
For reference: https://docs.aws.amazon.com/general/latest/gr/rande.html
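If the AWS Network Firewall allow-list is itself managed in Terraform, a rough sketch of adding those domains as a stateful domain allow-list rule group might look like this (the rule group name, capacity, and exact domain list are illustrative assumptions):
# Sketch only: a stateful domain allow-list covering the endpoints Terraform
# and Terragrunt need. Adjust the region, capacity, and domain list as needed.
resource "aws_networkfirewall_rule_group" "terraform_egress_allowlist" {
  name     = "terraform-egress-allowlist"
  type     = "STATEFUL"
  capacity = 100

  rule_group {
    rules_source {
      rules_source_list {
        generated_rules_type = "ALLOWLIST"
        target_types         = ["TLS_SNI", "HTTP_HOST"]
        targets = [
          "registry.terraform.io",
          "github.com",
          ".ap-southeast-1.amazonaws.com", # covers sts/s3/dynamodb etc. in the region
        ]
      }
    }
  }
}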

Apply a `configMap` to EKS cluster with Terraform

I am trying to apply a ConfigMap to an EKS cluster through Terraform, but I don't see how. There is lots of documentation about this, but I don't see anyone succeeding with it, so I am not sure whether this is possible or not.
Currently we control our infrastructure through Terraform. When I create the .kube/config file through the AWS CLI and try to connect to the cluster, I get the Unauthorized error, for which the fix is documented here, in AWS. According to the docs, we need to edit the aws-auth ConfigMap and add some lines to it, which configures the API server to accept requests from a VM with a certain role. The problem is that only the cluster creator has access to connect to the cluster and make these changes. The cluster creator in this case is Terraform, so what we do is run aws configure to add Terraform's credentials to the VM from which we are trying to connect to the cluster, successfully authenticate against the cluster, add the necessary lines to the ConfigMap, and then revoke the credentials from the VM.
From there on, any user can connect to the cluster from that VM, which is our goal. Now we would like to be able to edit the ConfigMap through a Terraform resource instead of going through this whole process. There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS, so it is not able to find the cluster and fails trying to connect to an API server running on localhost.
There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS
It is a different provider, because Terraform should now interact with a different API (Kubernetes API instead of AWS API).
There are data sources for aws_eks_cluster and aws_eks_cluster_auth that can be used to authenticate the kubernetes provider.
The aws_eks_cluster_auth documentation has examples of authenticating the kubernetes provider:
data "aws_eks_cluster" "example" {
name = "example"
}
data "aws_eks_cluster_auth" "example" {
name = "example"
}
provider "kubernetes" {
host = data.aws_eks_cluster.example.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.example.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.example.token
load_config_file = false
}
Another example is how the Cloud Posse AWS EKS module authenticates the kubernetes provider and also manages a ConfigMap.
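Once the kubernetes provider is authenticated this way, a minimal sketch of managing the aws-auth ConfigMap could look like the following; the role ARNs, username, and groups are placeholders:
# Rough sketch: manage the aws-auth ConfigMap so extra IAM roles can reach the
# API server. The role ARNs, username, and groups below are placeholders.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        # Placeholder: the worker node role EKS needs
        rolearn  = "arn:aws:iam::111122223333:role/eks-node-role"
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      },
      {
        # Placeholder: the role used by the VM that should be allowed in
        rolearn  = "arn:aws:iam::111122223333:role/vm-access-role"
        username = "vm-admin"
        groups   = ["system:masters"]
      },
    ])
  }
}
Keep in mind this is only a sketch: if EKS has already created aws-auth (for example for managed node groups), the existing ConfigMap has to be imported into state rather than created from scratch.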

Authenticating to Google Cloud Firestore from GKE with Workload Identity

I'm trying to write a simple backend that will access my Google Cloud Firestore; it lives in Google Kubernetes Engine. On my local machine I'm using the following code to authenticate to Firestore, as detailed in the Google documentation.
if (process.env.NODE_ENV !== 'production') {
  const result = require('dotenv').config()
  // Additional error handling here
}
This pulls the GOOGLE_APPLICATION_CREDENTIALS environment variable and populates it with my google-application-credentials.json, which I got from creating a service account with the "Cloud Datastore User" role.
So, locally, my code runs fine. I can reach my Firestore and do everything I need to. However, the problem arises once I deploy to GKE.
I followed this Google documentation to set up Workload Identity for my cluster. I've created a deployment and verified that the pods are all using the correct IAM service account by running:
kubectl exec -it POD_NAME -c CONTAINER_NAME -n NAMESPACE sh
> gcloud auth list
I was under the impression from the documentation that authentication would be handled for my service as long as the above held true. I'm really not sure why but my Firestore() instance is behaving as if it does not have the necessary credentials to access the Firestore.
In case it helps below is my declaration and implementation of the instance:
const firestore = new Firestore()

const server = new ApolloServer({
  schema: schema,
  dataSources: () => {
    return {
      userDatasource: new UserDatasource(firestore)
    }
  }
})
UPDATE:
In a bout of desperation I decided to tear everything down and re-build it. Following everything step by step, I appear to have either encountered a bug or (more likely) done something mildly wrong the first time. I'm now able to connect to my backend service. However, I'm now getting a different error. Upon sending any request (I'm using GraphQL, but in essence it's any REST call) I get back a 404.
Inspecting the logs yields the following:
'Getting metadata from plugin failed with error: Could not refresh access token: A Not Found error was returned while attempting to retrieve an accesstoken for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have any permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 404'
A cursory search for this issue doesn't seem to return anything related to what I'm trying to accomplish, and so I'm back to square one.
I think your initial assumption was correct! Workload Identity is not functioning properly if you still have to specify scopes. In the Workload article you have linked, scopes are not used.
I've been struggling with the same issue and have identified three ways to get authenticated credentials in the pod.
1. Workload Identity (basically the Workload Identity article above with some deployment details added)
This method is preferred because it allows each pod deployment in a cluster to be granted only the permissions it needs.
Create cluster (note: no scopes or service account defined)
gcloud beta container clusters create {cluster-name} \
  --release-channel regular \
  --identity-namespace {projectID}.svc.id.goog
Then create the k8sServiceAccount, assign roles, and annotate.
gcloud container clusters get-credentials {cluster-name}
kubectl create serviceaccount --namespace default {k8sServiceAccount}
gcloud iam service-accounts add-iam-policy-binding \
  --member serviceAccount:{projectID}.svc.id.goog[default/{k8sServiceAccount}] \
  --role roles/iam.workloadIdentityUser \
  {googleServiceAccount}
kubectl annotate serviceaccount \
  --namespace default \
  {k8sServiceAccount} \
  iam.gke.io/gcp-service-account={googleServiceAccount}
Then I create my deployment, and set the k8sServiceAccount.
(Setting the service account was the part that I was missing)
kubectl create deployment {deployment-name} --image={containerImageURL}
kubectl set serviceaccount deployment {deployment-name} {k8sServiceAccount}
Then expose with a target of 8080
kubectl expose deployment {deployment-name} --name={service-name} --type=LoadBalancer --port 80 --target-port 8080
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
2. Cluster Service Account
This method is not preferred, because all VMs and pods in the cluster will have permissions based on the defined service account.
Create cluster with assigned service account
gcloud beta container clusters create [cluster-name] \
  --release-channel regular \
  --service-account {googleServiceAccount}
The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).
Then deploy and expose as above, but without setting the k8sServiceAccount
3. Scopes
This method is not preferred, because all VMs and pods in the cluster will have permissions based on the scopes defined.
Create cluster with assigned scopes (firestore only requires "cloud-platform", realtime database also requires "userinfo.email")
gcloud beta container clusters create $2 \
  --release-channel regular \
  --scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email
Then deploy and expose as above, but without setting the k8sServiceAccount
The first two methods require a Google Service Account with the appropriate IAM roles assigned. Here are the roles I assigned to get a few Firebase products working:
FireStore: Cloud Datastore User (Datastore)
Realtime Database: Firebase Realtime Database Admin (Firebase Products)
Storage: Storage Object Admin (Cloud Storage)
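Since this thread is otherwise Terraform-heavy, here is a rough sketch of option 1's binding expressed with the google and kubernetes Terraform providers, where the project ID, account names, and namespace are placeholder assumptions:
# Sketch only: the Workload Identity binding from option 1 as Terraform.
# Project ID, account names, and namespace are placeholders.
resource "google_service_account" "app" {
  account_id   = "my-app" # placeholder googleServiceAccount
  display_name = "Workload Identity service account"
}

# Let the Kubernetes service account impersonate the Google service account
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.app.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:my-project.svc.id.goog[default/my-app-ksa]" # placeholders for {projectID} and {k8sServiceAccount}
}

# The annotated Kubernetes service account the deployment's pods should use
resource "kubernetes_service_account" "app" {
  metadata {
    name      = "my-app-ksa" # placeholder k8sServiceAccount
    namespace = "default"
    annotations = {
      "iam.gke.io/gcp-service-account" = google_service_account.app.email
    }
  }
}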
Going to close this question.
Just in case anyone stumbles onto it, here's what fixed it for me.
1.) I re-followed the steps in the Google documentation linked above; this fixed the issue of my pods not launching.
2.) As for my update, I re-created my cluster and gave it the Cloud Datastore permission. I had assumed that the permissions were separate from what Workload Identity needed to function. I was wrong.
I hope this helps someone.

How to make aws_cloudwatch_event_rule with terraform and localstack?

I am using Terraform and LocalStack and trying to create an aws_cloudwatch_event_rule. I get an error:
Error: Updating CloudWatch Event Rule failed: UnrecognizedClientException: The security token included in the request is invalid.
status code: 400, request id: 2d0671b9-cb55-4872-8e8c-82e26f4336cb
I'm not sure why I'm getting this error, because this works to create the resource in AWS but not on LocalStack 🤷‍♂️. Does anybody have any suggestions on how to fix this? Thanks.
It's a large Terraform project so I can't share all the code. This is the relevant section.
resource "aws_cloudwatch_event_rule" "trigger" {
name = "trigger-event"
description = "STUFF"
schedule_expression = "cron(0 */1 * * ? *)"
}
resource "aws_cloudwatch_event_target" "trigger_target" {
rule = "${aws_cloudwatch_event_rule.trigger.name}"
arn = "${trigger.arn}"
}
I realize this is an old question, but I just ran into this problem. I wanted to share what resolved it for me, in case it helps others who end up here. This works for me with terraform 0.12 (should work for 0.13 as well) and AWS provider 3.x.
When you get the The security token included in the request is invalid error, it usually means terraform attempted to perform the operation against real AWS rather than localstack.
The following should resolve the issue with creating CloudWatch Event rules.
1. Make sure you're running the events service in localstack. It's this service, and not cloudwatch, that provides the CloudWatch Events interface. E.g. if you're running localstack from the command line:
SERVICES=cloudwatch,events localstack start
2. Make sure the AWS provider in the terraform config is pointed to localstack. As in step (1), we need to make sure to have a setting specifically for CloudWatch Events. In the AWS provider config, that's cloudwatchevents.
provider "aws" {
version = "~> 3.0"
profile = "<profile used for localstack>"
region = "<region configured for localstack>"
skip_credentials_validation = true
skip_metadata_api_check = true
skip_requesting_account_id = true
endpoints {
# Update the urls below if you've e.g. customized localstack's port
cloudwatch = "http://localhost:4566"
cloudwatchevents = "http://localhost:4566"
iam = "http://localhost:4566"
sts = "http://localhost:4566"
}
}
Now, the terraform apply should successfully run against localstack.
One more gotcha to be aware of is that localstack currently doesn't persist CloudWatch or CloudWatch Events data, even if you enable persistence. So when you kill or restart localstack, any CloudWatch Events rules will be lost.

Terraform profile field usage in AWS provider

I have a $HOME/.aws/credentials file like this:
[config1]
aws_access_key_id=accessKeyId1
aws_secret_access_key=secretAccesskey1
[config2]
aws_access_key_id=accessKeyId2
aws_secret_access_key=secretAccesskey2
So I was expecting that, with this configuration, Terraform would choose the second set of credentials:
terraform {
  backend "s3" {
    bucket  = "myBucket"
    region  = "eu-central-1"
    key     = "path/to/terraform.tfstate"
    encrypt = true
  }
}

provider "aws" {
  profile = "config2"
  region  = "eu-central-1"
}
But when I try terraform init it says it hasn't found any valid credentials:
Initializing the backend...
Error: No valid credential sources found for AWS Provider.
Please see https://terraform.io/docs/providers/aws/index.html for more information on
providing credentials for the AWS Provider
As a workaround, I made config2 the default in my credentials file and removed the profile field from the provider block, which works, but I really need to use something like the first approach. What am I missing here?
Unfortunately you also need to provide the IAM credential configuration to the backend configuration as well as your AWS provider configuration.
The S3 backend configuration takes the same parameters here as the AWS provider so you can specify the backend configuration like this:
terraform {
  backend "s3" {
    bucket  = "myBucket"
    region  = "eu-central-1"
    key     = "path/to/terraform.tfstate"
    encrypt = true
    profile = "config2"
  }
}

provider "aws" {
  profile = "config2"
  region  = "eu-central-1"
}
There are a few reasons this needs to be done separately. One is that you can independently use different IAM credentials, accounts and regions for the S3 bucket and the resources you will be managing with the AWS provider. You might also want to use S3 as a backend even if you are creating resources in another cloud provider, or not using a cloud provider at all; Terraform can manage resources in a lot of places that don't have a way to store Terraform state. The main reason, though, is that backends are actually managed by the core Terraform binary rather than the provider binaries, and backend initialisation happens before pretty much anything else.
