I am working on unit testing for Terraform. For some modules, I have to authenticate to AWS to be able to read a Terraform data source. Is there any way to mock or override a data source for something like the one below?
data "aws_region" "current" {
}
Thank you in advance.
Terraform has historically not included any built-in means to mock the behavior of a provider (test-time mocking only arrived later, with the terraform test framework in Terraform 1.7). Module authors generally test their modules using integration testing rather than unit testing, e.g. by writing a testing-only Terraform configuration that calls the module in question with suitable arguments to exercise the behaviors the author wishes to test.
The testing process is then to run terraform apply within that test configuration and observe it making the intended changes. Once you are finished testing you can run terraform destroy to clean up the temporary infrastructure that the test configuration declared.
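As a rough sketch of such a testing-only configuration (the directory layout, the module path, and the name_prefix input and region output are hypothetical, not taken from the question):

```hcl
# test/main.tf: a throwaway root configuration used only for testing.
# Assumes the module under test lives one directory up and exposes a
# hypothetical "name_prefix" input and "region" output.

provider "aws" {
  region = "us-east-1" # any region where temporary test infrastructure is acceptable
}

module "under_test" {
  source = "../"

  name_prefix = "inttest"
}

# Surface a value to inspect after "terraform apply".
output "region" {
  value = module.under_test.region
}
```

From the test directory you would run terraform init, then terraform apply, check the outputs and the created infrastructure, and finish with terraform destroy.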
A typical Terraform module doesn't have much useful behavior in itself and instead is just a wrapper around provider behaviors, so integration testing is often a more practical approach than unit testing in order to achieve confidence that the module will behave as expected in real use.
If you particularly want to unit test a module, I think the best way to achieve your goal within the Terraform language itself is to think about working at the module level of abstraction rather than the resource level of abstraction. You can then use Module Composition techniques, like dependency inversion, so that you can pass your module fake input when you are testing it and real input when it's being used in a "real" configuration. The module itself would therefore no longer depend directly on the aws_region data source.
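A minimal sketch of that dependency inversion, assuming a hypothetical module at modules/example whose only region-dependent logic is building a name (the files below belong to three separate directories):

```hcl
# --- modules/example/main.tf (hypothetical module) ---
# The module receives the region as input instead of reading
# data.aws_region itself.
variable "aws_region" {
  type        = string
  description = "Name of the AWS region the module should act on."
}

# Pure local computation: no provider call is needed to evaluate this.
output "log_bucket_name" {
  value = "logs-${var.aws_region}"
}

# --- real/main.tf (the "real" configuration) ---
data "aws_region" "current" {}

module "example" {
  source = "../modules/example"

  # Attribute name depends on the AWS provider version; I believe newer
  # releases prefer "region" over "name".
  aws_region = data.aws_region.current.name
}

# --- test/main.tf (the test configuration, no AWS access needed) ---
module "example" {
  source     = "../modules/example"
  aws_region = "us-east-1" # fake input
}
```

The test configuration can be planned and applied without credentials because nothing in it touches the AWS provider.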
However, it's unlikely that you'd be able to achieve unit testing in the purest sense with the Terraform language alone unless the module you are testing consists only of local computation (locals and output blocks, and local-compute-only resources) and doesn't interact with any remote systems at all. While you could certainly make a Terraform module that takes an AWS region as an argument, there's little the module could do with that value unless it is also interacting with the AWS provider.
A more extreme alternative would be to write your own aws provider that contains the subset of resource type names you want to test with but whose implementations of them are all fake. You could then use your own fake aws provider instead of the real one when you're running your tests, and thus avoid interacting with real AWS APIs at all.
This path is considerably more work, of course, so I would suggest embarking on it only if the value of unit testing your particular module(s) is high.
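Note that the advice above predates the terraform test command. As far as I know, Terraform 1.7 and later can mock a provider and override a data source from a test file, which may cover the original question directly. A minimal sketch, where the file name and the asserted value are assumptions:

```hcl
# tests/region.tftest.hcl (file name is an assumption)
# Requires Terraform 1.7 or later.

# Replace the real AWS provider with a mock, so no credentials are needed.
mock_provider "aws" {}

run "region_is_overridden" {
  # Supply a fixed value for the data source instead of calling AWS.
  override_data {
    target = data.aws_region.current
    values = {
      name = "us-east-1"
    }
  }

  assert {
    condition     = data.aws_region.current.name == "us-east-1"
    error_message = "Expected the overridden region to be us-east-1."
  }
}
```

Running terraform test in the module directory would pick this file up. Because the provider is mocked, the default apply step inside the run block does not create real infrastructure.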
Another very labour-intensive solution would be to emulate the AWS API on localhost and point the (normal) aws provider at it. LocalStack (https://github.com/localstack/localstack) may be helpful with this; see its Terraform integration notes at https://docs.localstack.cloud/integrations/terraform/.
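A sketch of what redirecting the provider might look like, assuming LocalStack is listening on its default port 4566 on localhost (the list of services under endpoints is only an example; add the ones your module uses):

```hcl
provider "aws" {
  region = "us-east-1"

  # Dummy credentials: the emulator does not check them.
  access_key = "test"
  secret_key = "test"

  # Skip calls that only make sense against real AWS.
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  # Path-style S3 URLs avoid needing per-bucket DNS names on localhost.
  s3_use_path_style = true

  endpoints {
    ec2 = "http://localhost:4566"
    s3  = "http://localhost:4566"
    sts = "http://localhost:4566"
  }
}
```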
I'm using Terraform to deploy Azure resources and now want to deploy across multiple regions.
I'm finding even with Modules I'm repeating code, once for each region.
How should I be writing code for multiple regions? I can't find any best practices.
You could create a list variable and put your regions inside.
Then you could loop over that list with for_each (or count) and create the resource once per region. This approach works only when you really want to have each resource in each region.
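For example, a minimal sketch with azurerm (the region names, resource names, and the network module are made up for illustration):

```hcl
variable "regions" {
  type    = list(string)
  default = ["westeurope", "northeurope"]
}

# One resource group per region, keyed by region name.
resource "azurerm_resource_group" "regional" {
  for_each = toset(var.regions)

  name     = "rg-example-${each.key}"
  location = each.value
}

# The same pattern works for whole modules (Terraform 0.13 and later).
# "./modules/network" and its inputs are hypothetical.
module "network" {
  source   = "./modules/network"
  for_each = toset(var.regions)

  location            = each.value
  resource_group_name = azurerm_resource_group.regional[each.key].name
}
```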
It really depends on your resources. Some resources are reasonably maintained as multi-region within a single module, but this is rare. This would be a case where a module specifically addresses resources in multiple regions, with some kind of unifying logic for those resources. Since regions are typically very independent by design, this is typically an anti-pattern.
Often, it is more sane to use an infrastructure module (or root module, which means the same thing) per region. Some methodologies would have you use a different directory for each region, and again per environment. Yes, you're repeating yourself, but not that much. Your root modules should usually be pretty small and opinionated, serving as a hub for modules and top-level resources to be called.
Yes, you should keep your code DRY, but don't get carried away with it. Some duplication for the sake of organizing resources is totally acceptable.
In the cases where this is truly a problem (large root modules, and/or many regions across many environments), there are tools that can handle this effectively for you. Terragrunt is a fairly effective one, and can template your root modules (including their backend configuration) via a single code location, which is then callable via fairly small files. This can help to deduplicate a codebase like the one I just described.
You may also design your infrastructure modules to be re-usable by defining variables for regional and environmental variances between deployments. Backend configuration is also configurable during Terraform runtime via CLI or environment variable settings. Between these two, you can create infrastructure modules that are capable of being applied in arbitrary environments and regions. I like this better than Terragrunt's approach, because it's much simpler.
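A sketch of such a re-usable root module, assuming an azurerm backend and hypothetical variable names. The backend block is left empty on purpose so that each deployment supplies its own settings at init time:

```hcl
terraform {
  # Partial configuration: the remaining settings are passed to
  # "terraform init" for each environment/region.
  backend "azurerm" {}
}

variable "environment" {
  type        = string
  description = "Deployment environment, e.g. dev or prod."
}

variable "location" {
  type        = string
  description = "Azure region for this deployment."
}

resource "azurerm_resource_group" "main" {
  name     = "rg-${var.environment}-${var.location}"
  location = var.location
}

# Example invocation for one environment/region pair (values are made up):
#   terraform init \
#     -backend-config="resource_group_name=rg-tfstate" \
#     -backend-config="storage_account_name=examplestate" \
#     -backend-config="container_name=tfstate" \
#     -backend-config="key=prod-westeurope.tfstate"
#   terraform apply -var="environment=prod" -var="location=westeurope"
```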
How you call these re-usable modules is up to your orchestration implementation, be that a CI/CD system, Kubernetes, Terraform Enterprise/Cloud, whatever.
Hopefully that helps you to make a decision.
We are working on creating various terraform modules for Azure cloud in our organization. I have a basic doubt on using these modules.
Let's say we have a module for creating resource groups. When we write a module for a storage container, would it be better to use the resource group module inside the storage module itself, or would it be better to let the user's Terraform script handle it by declaring multiple modules? E.g.,
module "resourcegroup" {
…
}
module "storage" {
}
Thanks,
Hound
What you're considering here is a design tradeoff rather than a question with a universal answer. With that said, the Terraform documentation section Module Composition recommends that you use only one level of module nesting where possible, and then have the root module connect the outputs from one module into the inputs of another.
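In a root module, that flat composition might look roughly like this. The module paths and the input and output names are assumptions, since the question does not show them:

```hcl
module "resourcegroup" {
  source = "./modules/resourcegroup"

  name     = "rg-example"
  location = "westeurope"
}

module "storage" {
  source = "./modules/storage"

  # The storage module does not create or look up the resource group
  # itself; the root module wires one module's outputs into the other's
  # inputs.
  resource_group_name = module.resourcegroup.name
  location            = module.resourcegroup.location
}
```

Because the storage module only takes the resource group name and location as inputs, it can also be used with a resource group that was created some other way.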
One situation where you might decide to go against that advice and create multiple levels of nesting is when you want to write a module which intentionally constrains or raises the level of abstraction of another module written by someone else. Modules shared on Terraform Registry are often very general in order to serve various different use-cases, but those modules might also encapsulate some design best-practices for the system in question and so you might choose to wrap one or more of those general modules in a more specific module that more directly meets your use-case, and hopefully in turn make your "wrapper module" easier to use.
However, it's always important to keep in mind that although Terraform modules can in some sense encapsulate complexity, in the case of Terraform they can't truly hide that complexity the way we might expect for libraries in general-purpose languages, because the maintainer of the root module is ultimately responsible for understanding the full consequences of applying a plan, which involves reviewing all of the proposed changes even to resources encapsulated in nested modules.
We have just started a new project in our company which would let dev teams and operators provision cloud infrastructure as self-service. We plan to go with Terraform and publish modules based on business requirements and on enterprise and security compliance (naming conventions, forbidden values, etc.).
Our new Cloud architect suggests we create the following structure.
Resources Modules
Wrappers around single terraform resource. 1:1 mapping
E.g. the resource azurerm_virtual_network would have a module that wraps all the input and output variables and the resource.
Core Modules
Small modules which use resources modules.
At this level we would setup some compliance requirements, enforcing values, etc.
E.g. a module which creates a storage account but enforces some values (HTTPS only, no anonymous access, etc.), sets up diagnostic settings with defaults, restricts allowed locations, etc.
Modules
More complex infrastructure that would use core modules.
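For illustration, a core module of this kind might look roughly like the sketch below. The variable names and the approved region list are made up. The HTTPS-only and anonymous-access settings would be pinned in the same way, but their argument names differ between azurerm provider versions, so they are left out here:

```hcl
variable "name" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "replication_type" {
  type    = string
  default = "LRS"
}

variable "location" {
  type = string

  # Compliance rule: only approved regions are accepted.
  validation {
    condition     = contains(["westeurope", "northeurope"], var.location)
    error_message = "Location must be one of the approved regions."
  }
}

resource "azurerm_storage_account" "this" {
  name                     = var.name
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = var.replication_type

  # Enforced by the module rather than exposed as an input.
  min_tls_version = "TLS1_2"
}
```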
First, Core Modules and Modules make sense to me, since they contain a set of requirements/resources/business cases.
But I think having the Resources modules is a bad idea.
The architect's reasons were:
Avoid duplication of terraform default resources
Since all other modules would use the resource module, if you need to block a property, refactor, or enforce a policy, it would be in a single place.
This is the way he did it in previous jobs
He talked with someone at HashiCorp who approved that structure.
Why I dislike the idea:
It doesn't avoid duplication of code: instead of duplicating a Terraform resource you are duplicating modules, and you get useless information in the state.
It is hard to maintain: we are a small team, and we would not be able to keep up with all the changes coming from Terraform and the providers.
It doesn't help with refactoring.
I think policies/compliance should be enforced at another level. Core/Modules is a good start, but we could leverage tools like terraform-compliance, Azure Policy, etc.
Terraform suggests otherwise.
We had a debate on this and the discussion ended with me giving up because I felt I hadn't enough experience to challenge further.
What would be your take on this?
I would like to use Terraform programmatically, like API/function calls, to create and tear down infrastructure in multiple specific steps, e.g. reserve a couple of EIPs, add an instance to a region, and assign one of the IPs, all in separate steps. Terraform will currently run locally and not on a server.
I would like to know if there is a recommended way/best practices for creating the configuration to support this? So far it seems that my options are:
Properly define input/output, heavily rely on resource separation, modules, the count parameter and interpolation.
Generate the configuration files as JSON, which appears to be less common.
Thanks!
Instead of using Terraform directly, I would recommend a 3rd party build/deploy tool such as Jenkins, Bamboo, Travis CI, etc. to manage the release of your infrastructure managed by Terraform. The reason is that you should treat your Terraform code in exactly the same manner as you would application code (i.e. have a proper build/release pipeline). As an added bonus, these tools come with a standard API that can be used to execute your build and deploy processes.
If you choose not to create a build/deploy pipeline, another option is to use a tool such as RunDeck, which allows you to execute arbitrary commands on a server. It also has the added bonus of an excellent privilege control system that only allows specified users to execute commands. Yet another option could be to upgrade from the Open Source version of Terraform to the Pro/Premium version. This version includes an integrated GUI and an extensive API.
As for best practices for using an API to automate creation/teardown of your infrastructure with Terraform, the best practices are the same regardless of what tools you are using. You mentioned some good practices such as clearly defining input/output and creating a separation of concerns which are excellent practices! Some others I can recommend are:
Create all of your infrastructure code with idempotency in mind.
Use modules to separate the common shared portions of your code. This reduces the number of places that you will have to update code and therefore the number of points of error when pushing an update.
Write your code with scalability in mind from the beginning. It is much simpler to start with this than to adjust later on when it is too late.
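To make the step-by-step idea from the question concrete, here is a sketch that uses variables and count so that each apply only moves one step forward. The variable names, instance type, and AMI input are assumptions. Depending on the AWS provider version, aws_eip may also need an argument marking it as a VPC address:

```hcl
variable "eip_count" {
  type    = number
  default = 2 # must be at least 1 when create_instance is true
}

variable "create_instance" {
  type    = bool
  default = false
}

variable "ami_id" {
  type = string
}

# Step 1: reserve the addresses.
resource "aws_eip" "reserved" {
  count = var.eip_count
}

# Step 2: only created once create_instance is switched on.
resource "aws_instance" "app" {
  count = var.create_instance ? 1 : 0

  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Step 3: attach the first reserved address to the instance.
resource "aws_eip_association" "app" {
  count = var.create_instance ? 1 : 0

  instance_id   = aws_instance.app[0].id
  allocation_id = aws_eip.reserved[0].id
}

output "reserved_ips" {
  value = aws_eip.reserved[*].public_ip
}
```

A first terraform apply with the defaults reserves the addresses. A later apply with -var="create_instance=true" adds the instance and the association without touching the existing EIPs, which is where the idempotency point above pays off.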
Is there a way to unit test code which uses the Cloud Datastore API and is written for the flexible environment? testbed seems to be tied to the standard environment, and it looks like using the emulator will require launching and closing the emulator process, which is usually a flaky way to run unit tests.
We ended up with end-to-end testing (launching the tests against a real db in a dev environment, for example). As we have a tenant-based application, each test run just creates a new tenant and all operations are performed in the scope of that tenant, so there should not be any inconsistency. On the other hand, this solution is pretty slow.
I believe the solution above is simply the easiest one.
Another option would be to split your code into db-dependent parts and a business logic part. In this case you would test only the business logic part and mock the db dependency. But when we investigated this solution, we found that we have a lot of code with one line of db write operation and 1-3 lines of business logic. So splitting such code into different levels would be meaningless for testing and maintenance.
I guess the last option, which is more generic than the previous one, is to mock the db. For each module that uses the db, before testing it you should inject a mocked database index that defines some responses. But in this case it is easy to fall into testing the implementation instead of the behavior, which again means that such testing becomes quite ineffective.
I guess this question is more generally about testing approaches than about datastore itself.