While trying to migrate my backend config to the new state storage with GitLab, I have run into this glorious problem: my state is locked.
I cannot force-unlock the state because the backend needs to be reinitialized.
I cannot force-unlock -force the state either, for the same reason.
I cannot set up the backend with -lock=false because the same credentials that started this entire mess cannot seem to push anything other than toxic lock tokens:
Error: Error copying state from the previous "local" backend to the newly configured
"http" backend:
Failed to upload state: POST http://internal.host/api/v4/projects/14/terraform/state/project-name giving up after 3 attempts
I'm at the end of my patience. I did check whether the chatter in /var/log/gitlab/gitlab-rails/production_json.log delivers anything relevant, and came away no more sure and a little less sane for it.
Is there a sudo pretty-please-with-sugar-on-top-clean-the-fn-lock command that doesn't have any gatekeeping on it?
I have run into the same problem while migrating Terraform state files from S3 to GitLab.
I caused the problem myself: I had a typo in the backend-config unlock_address and pressed Control+C while init was still running.
terraform init did not ask me to migrate the state from S3 to GitLab, but the state got locked and force-unlock would not work in any way.
The solution I came up with:
Configure backend.tf to use the previously used lock_address as the unlock_address and re-initialize Terraform.
terraform plan should work fine now.
Reconfigure backend.tf to continue with the state migration: re-initialize again, this time with the state URLs you actually want, and migrate.
For example, this is the terraform init I used, where the desired address was <TF_State_Name> and the typo was <TF_State_Name_B>.
I interrupted it with Control+C:
terraform init \
-backend-config="address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name>" \
-backend-config="lock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name>/lock" \
-backend-config="unlock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name_B>/lock" \
-backend-config="username=<user>" \
-backend-config="password=<password>" \
-backend-config="lock_method=POST" \
-backend-config="unlock_method=DELETE" \
-backend-config="retry_wait_min=5"
And this is how I reconfigured terraform init in order to bypass the lock.
terraform init \
-backend-config="address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name_B>" \
-backend-config="lock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name_B>/lock" \
-backend-config="unlock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name_B>/lock" \
-backend-config="username=<user>" \
-backend-config="password=<password>" \
-backend-config="lock_method=POST" \
-backend-config="unlock_method=DELETE" \
-backend-config="retry_wait_min=5"
Finally, you should reconfigure to the desired address.
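A sketch of what that final re-initialization could look like; the -migrate-state flag and all placeholder values here are assumptions to adapt to your setup:
terraform init -migrate-state \
-backend-config="address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name>" \
-backend-config="lock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name>/lock" \
-backend-config="unlock_address=https://<gitlab_url>/api/v4/projects/<ProjectID>/terraform/state/<TF_State_Name>/lock" \
-backend-config="username=<user>" \
-backend-config="password=<password>" \
-backend-config="lock_method=POST" \
-backend-config="unlock_method=DELETE" \
-backend-config="retry_wait_min=5"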
Related
My Terraform remote states and locks are configured on S3 and DynamoDB under an AWS account. On a GitLab runner, a plan task crashed, and on the next execution plan the following error pops up:
Error: Error locking state: Error acquiring the state lock: ConditionalCheckFailedException:
The conditional request failed
Lock Info:
ID: <some-hash>
Path: remote-terrform-states/app/terraform.tfstate
Operation: OperationTypePlan
Who: root#runner-abc-project-123-concurrent-0
Version: 0.14.10
Created: 2022-01-01 00:00:00 +0000 UTC
Info: some really nice info
While trying to release this lock in order to run an execution plan again, I get the following error:
terraform force-unlock <some-hash-abc-123>
#output:
Local state cannot be unlocked by another process
How do we release this Terraform lock?
According to reference of terraform command: force-unlock
Manually unlock the state for the defined configuration.
This will not modify your infrastructure. This command removes the
lock on the state for the current configuration. The behavior of this
lock is dependent on the backend being used. Local state files cannot
be unlocked by another process.
Explanation: apparently the execution plan writes the plan output file locally, and that file is applied in the second phase of the Terraform steps, as in the following example:
phase 1: terraform plan -out execution-plan.out
phase 2: terraform apply -input=false execution-plan.out
Make sure the filename is the same in phases 1 and 2.
However, if phase 1 is terminated or accidentally crashes, the lock will be assigned to the local state file and therefore must be removed in DynamoDB itself, not with the terraform force-unlock command.
Solution: locate this specific item in the DynamoDB table that holds the Terraform locks and explicitly remove the locked item. You can do that either in the AWS console or through the API.
For example:
aws dynamodb delete-item \
--table-name terraform-locker-bucket \
--key file://key.json
Contents of key.json. Note that for delete-item the key must be given in DynamoDB attribute-value format and may only contain the table's key attribute, LockID (which here matches the Path from the lock info above):
{
    "LockID": {
        "S": "remote-terrform-states/app/terraform.tfstate"
    }
}
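If you are unsure of the exact LockID value to put into key.json, you can list the keys stored in the lock table first. A minimal sketch, reusing the table name from the example above:
aws dynamodb scan \
--table-name terraform-locker-bucket \
--projection-expression "LockID"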
terraform force-unlock <lock id>
For Terragrunt, in the directory containing the <terragruntfile>.hcl, run
terragrunt force-unlock <lock id>. If that doesn't work, remove terragrunt.lock.hcl and .terragrunt-cache/ and try again.
Also
If you face the issue while destroying:
Adding -lock=false helps you proceed further.
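For example, a sketch of a destroy run with locking disabled (use with care, since it skips the lock entirely):
terraform destroy -lock=false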
I have looked on the Internets but did not find anything close to an answer.
I have the following main.tf :
terraform {
cloud {
organization = "my-organization"
workspaces {
tags = ["app:myapplication"]
}
}
}
I am using Terraform Cloud and I would like to use workspaces in automation.
In order to do so, I first need to run terraform init:
/my/path # terraform init
Initializing Terraform Cloud...
No workspaces found.
There are no workspaces with the configured tags
(app:myapplication) in your Terraform Cloud
organization. To finish initializing, Terraform needs at least one
workspace available.
Terraform can create a properly tagged workspace for you now. Please
enter a name to create a new Terraform Cloud workspace.
Enter a value:
I would like to do something like:
terraform init -workspace=my-workspace
so that the workspace is created if it does not exist. But I cannot find anything of the kind; the only way to create the first workspace seems to be manually.
How can I do that in automation with CI/CD?
[edit]
terraform workspace commands are not available before init
/src/terraform # terraform workspace list
Error: Terraform Cloud initialization required: please run "terraform
init"
Reason: Initial configuration of Terraform Cloud.
Changes to the Terraform Cloud configuration block require
reinitialization, to discover any changes to the available workspaces.
To re-initialize, run: terraform init
Terraform has not yet made changes to your existing configuration or
state.
You would need to use the TF Cloud/TFE API. You are using TF Cloud, but if you are on TFE you can modify the endpoint to target your own installation.
You first need to list the TF Cloud Workspaces:
curl \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/vnd.api+json" \
https://app.terraform.io/api/v2/organizations/my-organization/workspaces
where my-organization is your TF Cloud organization. This returns the workspaces in JSON format. You would then need to parse the JSON and iterate over the maps/hashes/dictionaries of existing TF Cloud workspaces; for each one, the name key nested inside data holds the workspace name. Gather the workspace names and check them against the name of the workspace you want to exist. If the desired workspace does not appear in the list, create the TF Cloud workspace:
curl \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/vnd.api+json" \
--request POST \
--data @payload.json \
https://app.terraform.io/api/v2/organizations/my-organization/workspaces
again substituting your organization and your specific payload. You can then run terraform init successfully with the backend specifying the TF Cloud workspace.
Note that if you are executing this in automation as you specify in the question, then the build agent needs connectivity to TF Cloud.
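Putting the two calls together, a minimal sketch of the check-and-create logic described above. It assumes jq is available on the build agent and that $TOKEN holds a TF Cloud API token, and it reuses the organization, workspace name, and tag from the question. The tag-names attribute in the create payload is also an assumption (check the API docs for your TFC/TFE version), and note that the list endpoint is paginated.
ORG="my-organization"
WS="my-workspace"
# List existing workspaces and look for the desired name
EXISTS=$(curl -s \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/vnd.api+json" \
"https://app.terraform.io/api/v2/organizations/$ORG/workspaces" \
| jq -r --arg ws "$WS" '.data[].attributes.name | select(. == $ws)')
# Create the workspace only if it was not found
if [ -z "$EXISTS" ]; then
  curl -s \
  --header "Authorization: Bearer $TOKEN" \
  --header "Content-Type: application/vnd.api+json" \
  --request POST \
  --data '{"data":{"type":"workspaces","attributes":{"name":"'"$WS"'","tag-names":["app:myapplication"]}}}' \
  "https://app.terraform.io/api/v2/organizations/$ORG/workspaces"
fi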
I will not mark this as the answer, but I finally did this, which looks like a bad trick to me:
export TF_WORKSPACE=myWorkspace
if terraform init -input=false; then echo "already exists"; else (sleep 2; echo $TF_WORKSPACE) | terraform init; fi
terraform apply -auto-approve -var myvar=boo
How to set up the Terraform backend configuration using CLI arguments with the terraform init command
Whenever a configuration's backend changes, you must run terraform init again to validate and configure the backend before you can perform any plans or other operations.
terraform init [options] performs several different initialization steps. After initialization you can run the other commands.
For the backend configuration, you pass the settings to the init command, either as key/value pairs or as a configuration file.
The init settings are passed in this way:
$ terraform init \
-backend-config="address=demo.consul.io" \
-backend-config="path=example_app/terraform_state" \
-backend-config="scheme=https"
I tried to add a custom script to a VM through extensions. I have observed that when the VM is created, a Microsoft.Azure.Extensions.CustomScript extension named "cse-agent" is created by default. So I tried to update the extension, passing the base64-encoded file in the script property:
az vm extension set \
--resource-group test_RG \
--vm-name aks-agentpool \
--name CustomScript \
--subscription ${SUBSCRIPTION_ID} \
--publisher Microsoft.Azure.Extensions \
--settings '{"script": "'"$value"'"}'
$value represents the script file encoded in base64.
Doing that gives me an error:
Deployment failed. Correlation ID: xxxx-xxxx-xxx-xxxxx.
VM has reported a failure when processing extension 'cse-agent'.
Error message: "Enable failed: failed to get configuration: invalid configuration:
'commandToExecute' and 'script' were both specified, but only one is validate at a time"
From the documentation, when the script attribute is present there is no need for commandToExecute. As you can see above, I haven't specified commandToExecute; it is somehow being taken from the previous extension. Is there a way to update the extension without deleting it? It would also be interesting to know what impact deleting the cse-agent extension has.
FYI: I have tried deleting the 'cse-agent' extension from the VM and adding my own extension. It worked.
The cse-agent VM extension is crucial: it manages all of the post-install steps needed for the nodes to be considered valid Kubernetes nodes. Removing this CSE will break the VMs and render your cluster inoperable.
If you are interested in applying changes to nodes in an existing cluster, you could, while not officially supported, leverage the following project:
https://github.com/juan-lee/knode
This allows you to configure the nodes using a DaemonSet, which helps when your node pools have the auto-scaling feature enabled.
For simple alterations of a node's filesystem, a privileged pod with a hostPath mount will also work:
https://dev.to/dannypsnl/privileged-pod-debug-kubernetes-node-5129
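As a shortcut for that hand-written privileged-pod approach, newer kubectl versions ship kubectl debug node/..., which drops you into a container on the node with the node's filesystem mounted under /host. A sketch with a made-up node name:
kubectl debug node/aks-agentpool-12345678-vmss000000 -it --image=busybox
# inside the debug container, switch into the node's root filesystem
chroot /host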
I'm looking to improve my org's Puppet work lifecycle by comparing differences for actual node catalogs. We came across this project which compiles catalogs for nodes and creates a diff for them, but it seems to require an online master.
I need to be able to do what this tool does, albeit without a master - I'd just like to compile a deterministic JSON or YAML blob which describes all of the resources that would be managed by Puppet for a given node and given a set of facts.
Is there a way for me to do this without an online master?
If you have Rspec-puppet set up, there is an easy way to do this. Just add a File.write statement inside one of your it blocks:
require 'spec_helper'
describe 'myclass' do
it {
File.write(
'myclass.json',
PSON.pretty_generate(catalogue)
)
#is_expected.to compile.with_all_deps
}
end
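Running the spec then leaves the catalog JSON next to wherever you invoked it. A sketch, assuming the usual rspec-puppet layout where the file above lives under spec/classes/:
bundle exec rspec spec/classes/myclass_spec.rb
# myclass.json now contains the compiled catalog for the mocked facts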
I have more information in a blog post on this here.
If you can't use Rspec-puppet to do it (recommended), have a look at another blog post I wrote, Compiling a puppet catalog – on a laptop.
I had been looking for this for a long time.
In the end, Vagrant in combination with Puppet brought me the solution.
The only thing you need is to install Puppet.
First you have to set the facts. After that you can compile your catalog:
FACTER_server_role='webstack' \
... \
FACTER_hostname='hostname' \
FACTER_fqdn='hostname.fqdn' \
puppet catalog compile hostname.fqdn \
--modulepath "./modules" \
--hiera_config "./hiera.yaml" \
--environmentpath ./environments/ \
--environment production
If you want a clean JSON file, pipe the output through sed and redirect it to a file:
| sed -n '1!p' > $file
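Put together, a sketch of the full invocation with the sed filter attached. catalog.json is an arbitrary output name, and sed drops the first line of output, which is a log message rather than JSON:
FACTER_server_role='webstack' \
FACTER_hostname='hostname' \
FACTER_fqdn='hostname.fqdn' \
puppet catalog compile hostname.fqdn \
--modulepath "./modules" \
--hiera_config "./hiera.yaml" \
--environmentpath ./environments/ \
--environment production \
| sed -n '1!p' > catalog.json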