I've been following this particular blog post as a guide.
https://blog.gruntwork.io/how-to-manage-terraform-state-28f5697e68fa
So I understand the need for state files per environment, but is anyone going a step further and using a state file per component? For example, we have modules for our application stack. To keep this example simple, let's say I have two modules:
App1 - contains a LAMP stack
App2 - LAMP stack plus RabbitMQ and Redis
Each environment (dev, uat and prod) would have both app stacks. Would you have a state file for each component, giving you 6 state files?
I've used this method in the past - I had 3 separate components and 3 different environments. The different components had different rates of change and some dependencies between them, so I used a remote state data source to share output variables between them.
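For example, a minimal sketch of that pattern, assuming an azurerm backend and that App1 publishes an output named db_address (all names here are illustrative):

data "terraform_remote_state" "app1" {
  backend = "azurerm"

  config = {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "dev.app1.terraform.tfstate"
  }
}

# App2's configuration can then consume App1's published outputs.
module "app2" {
  source     = "./modules/app2"
  db_address = data.terraform_remote_state.app1.outputs.db_address
}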
The biggest problem with this approach is the management overhead of maintaining the different environments/components.
An alternative could be to use different components in conjunction with workspaces.
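A rough sketch of that approach, where each component keeps a single configuration and terraform.workspace drives the per-environment differences (the module path and variable names are assumptions):

locals {
  environment    = terraform.workspace                   # e.g. "dev", "uat" or "prod"
  instance_count = terraform.workspace == "prod" ? 3 : 1
}

module "app1" {
  source         = "./modules/app1"
  environment    = local.environment
  instance_count = local.instance_count
}

You would then switch environments with terraform workspace new dev / terraform workspace select prod. Note that every workspace stores its state under the same backend, which is exactly why the docs caution against using workspaces for hard isolation.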
Related
Environment Isolation: Dirs v. Workspaces v. Modules
The Terraform docs on Separate Development and Production Environments seem to take two major approaches for handling a "dev/test/stage" type of CI environment, i.e.
Directory separation - seems messy, especially when you potentially have multiple repos
Workspaces + Different Var Files
Except when you look up workspaces, the docs seem to imply that workspaces are NOT a correct solution for isolating environments:
In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments.
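As I read that recommendation, each per-environment root configuration would be little more than this kind of sketch (the storage account name, module path, and values are placeholders):

terraform {
  backend "azurerm" {
    storage_account_name = "tfstatedev"
    container_name       = "tfstate"
    key                  = "dev.terraform.tfstate"
  }
}

module "app" {
  source      = "../../modules/app"
  environment = "dev"
  vm_size     = "Standard_B2s"   # one of the "small differences" between deployments
}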
I would also like to consider using remote state -- e.g. azurerm backend
Best Practice Questions
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
e.g.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "${var.environment}.terraform.tfstate"
  }
}
vs.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "cache_cluster_${var.environment}.terraform.tfstate"
  }
}
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
A re-usable module for your infrastructure would essentially encapsulate the part of your infrastructure that is common to all your "dev/test/stage" environments. So no, you wouldn't have any "dev/test/stage" folders in there.
If, for example, you have an infrastructure that consists of a Kubernetes cluster and a MySQL database, you could have two modules - a 'compute' module that handles the k8s cluster, and a 'storage' module that would handle the DB. These modules go into a /modules subfolder. Your root module (main.tf file in the root of your repo) would then instantiate these modules and pass the appropriate input variables to customize them for each of the "dev/test/stage" environments.
Normally it would be a bit more complex:
Any shared VPC or firewall config might go into a networking module.
Any service accounts that you might automatically create might go into a credentials or iam module.
Any DNS mappings for API endpoints might go into a dns module.
You can then easily pass in variables to customize the behavior for "dev/test/stage" as needed.
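Put together, a root module along those lines might look something like this sketch (the module paths, variable names, and values are purely illustrative, not a prescribed layout):

module "networking" {
  source      = "./modules/networking"
  environment = var.environment
}

module "compute" {
  source      = "./modules/compute"
  environment = var.environment
  node_count  = var.environment == "prod" ? 5 : 1
  subnet_id   = module.networking.subnet_id
}

module "storage" {
  source      = "./modules/storage"
  environment = var.environment
  sku_name    = var.environment == "prod" ? "GP_Gen5_4" : "B_Gen5_1"
}

module "dns" {
  source       = "./modules/dns"
  environment  = var.environment
  api_endpoint = module.compute.api_endpoint
}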
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
Going off the Terraform docs and their recommended separation:
In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls.
You would not share tfstorageaccount. Now take this with a grain of salt and determine your own needs - essentially what you need to take into account is the security and data integrity implications of sharing backends/credentials. For example:
How sensitive is your state? If you have sensitive variables being output to your state, then you might not want your "production" state sitting in the same security perimeter as your "test" state.
Will you ever need to wipe your state or perform destructive actions? If, for example, your state storage provider only versions by folder, then you probably don't want your "dev/test/stage" states sitting next to each other.
I have a stateless app with 3 components, say x, y and z, in the same code base. Each component is selected at runtime by checking an environment variable. I want to deploy it on Kubernetes on GCP using a kind: Deployment YAML config with 3 replica pods. How can I make sure each component has a single pod dedicated to it?
Can it be done in a single deployment file?
As @Ivan Aracki mentioned in the comments, the best way would be to distinguish each application component with its own Deployment object in order to guarantee it a dedicated Pod.
As Ivan suggested above, deploy three Deployments, one each for x, y and z.
You can use the same image for the three Deployments; just pass a different environment variable value to each Deployment to select the specific component. You might have to build some logic into the container start-up script to retrieve the value from the environment variable and start the desired component.
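A minimal sketch of that setup, assuming the start-up script reads a variable named APP_COMPONENT (the image name, label keys, and variable name are assumptions):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-x
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      component: x
  template:
    metadata:
      labels:
        app: myapp
        component: x
    spec:
      containers:
        - name: myapp
          image: gcr.io/my-project/myapp:latest
          env:
            - name: APP_COMPONENT
              value: "x"
---
# myapp-y and myapp-z are identical apart from the name, the component label,
# and the APP_COMPONENT value.

All three Deployments can live in one file separated by ---, so you still apply a single file, but each component gets its own dedicated pod.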
As I understand it, your requirements state that you have a three-process application code base inside one solution. I'm not sure whether the three components you mention are independent processes or just layers (front end, service, DAL, etc.), or even tiers (e.g. a typical 3-tier application with a web front end, API, and backend tier), but let's call them three microservices or services for simplicity...
Whichever the case, the best practices for Docker and Kubernetes-hosted microservices recommend:
One container per process/small app (not a monolith).
Though there can be multiple containers per pod, the suggestion is to keep one per pod - you could possibly have three containers inside one pod.
You can have three pods, one for each of your component apps, provided these apps can be refactored into three separate, independent processes.
Have one YAML file per service and include all related objects inside it, separated by --- on its own line.
Three containers inside a single pod, or three pods (one per service), would be easily accessible to each other.
Hope this helps.
I have created 5 x ARM templates that combined deploy my application. Currently I have separate template/parameter files for the various assets (1 x service bus, 1 x SQL server, 1 x event hub, etc.).
Is this OK, or should I merge them into 1 x template and 1 x parameter file that deploys everything?
Pros & cons? What is best practice here?
It's always advised to have separate JSON files: azuredeploy.json and azuredeploy.parameters.json.
Reason:
azuredeploy.json is the file which actually holds your resources, and the parameters file holds your parameters. You can have one azuredeploy.json file and multiple parameters files. For example, say you have different environments, Dev/Test/Prod; then you have separate azuredeploy-Dev.parameters.json, azuredeploy-Test.parameters.json, and so on and so forth; you get the idea.
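For instance, a Dev parameters file might look roughly like this (the parameter names are placeholders for whatever your azuredeploy.json actually declares):

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "environmentName": { "value": "Dev" },
    "sqlServerSku": { "value": "Basic" },
    "eventHubCapacity": { "value": 1 }
  }
}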
You can either keep separate JSON files - one for the service bus, one for VMs, etc., which helps when you want multiple people to work on separate sections of your resource group - or you can merge them together.
Bottom line: you are the architect; do it as you want, whichever makes your life easier.
You should approach this from the deployment view.
First, answer yourself a few questions:
How do separate resources such as ASB, SQL Server, and Event Hub impact your app? Can your app run independently while all of the above are unavailable?
How often do you plan to deploy? I assume you are going to implement some sort of Continuous deployment.
How often will you provision a new environment?
So, long story short:
Favor whatever gives your app minimal (ideally zero) downtime during deployment/disaster recovery, along with the ability for anyone off the street to take your scripts and have your app running in a reasonable time, say 30 minutes max.
I am installing two different applications on two different nodes, but these two applications have a dependency on each other. The service for application 1 should be started only if application 2 on node 2 is deployed.
Can anybody help with how I can resolve this in my Puppet manifests?
Maybe Puppet is not the right tool for distributed deployments.
You could write a custom fact to detect whether application 2 on node 2 is deployed, and then use this fact as the value for ensure =>, or use exec resources instead of service resources.
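A rough sketch of the custom-fact approach (the fact name app2_deployed and the service name are assumptions; the fact itself would be a small Facter script on node 1 that checks whether application 2 on node 2 is reachable):

service { 'application1':
  ensure => $facts['app2_deployed'] ? {
    'true'  => 'running',
    default => 'stopped',
  },
  enable => true,
}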
Either way it will be a bit of a handcraft. Consider using Fabric or any other tool for distributed deployment orchestration, and use Puppet to keep centralized configuration integrity.
I am new to Chef. I just finished creating a cookbook that deploys a node.js app, configures Nginx, and then starts the app as 1 or more workers that are "load balanced" by Nginx. It works great. I made sure to keep it pretty generic, and all the app level config is done via attributes.
Now I am trying to think about an instance where I have multiple node.js apps running on the same server. For example, the primary API app, and another app that registered itself as a Gearman worker.
How would I go about doing this? Would I simply create another cookbook that is specific to that app, make sure it includes the generic cookbook's recipe, and then do attribute overrides just for that app's recipe?
Or, would it be better if I moved away from using attributes for the app config, and used data_bags instead?
Any help is appreciated.
I would separate the nginx and node.js installation/configuration into separate cookbooks.
If you must have several different applications running on node.js, I think it's OK to add a recipe for every application inside the node.js cookbook and make sure each of them includes the installation of node.js itself.
If you must have several instances of one and the same application/service running, then it is better to use one recipe with different attributes or data bags to introduce the differences among the instances.
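For example, a hypothetical wrapper recipe per app that overrides the generic cookbook's attributes before including its recipe (the cookbook and attribute names are assumptions):

# cookbooks/myapp_api/recipes/default.rb
node.default['nodejs_app']['name']    = 'api'
node.default['nodejs_app']['port']    = 3000
node.default['nodejs_app']['workers'] = 4
include_recipe 'nodejs_app::default'

# cookbooks/myapp_gearman/recipes/default.rb
node.default['nodejs_app']['name']    = 'gearman-worker'
node.default['nodejs_app']['port']    = 3001
node.default['nodejs_app']['workers'] = 1
include_recipe 'nodejs_app::default'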
Any more specific questions?
You should use Roles to manage multiple cookbooks on a server.
I'm not exactly sure of your scenario, but from your description, I would create 3 cookbooks. One that installs nginx, one that installs your app, and one that does node specific configuration and deployment. Bundle these into a role 'app_server' and put the role in the run_list.
This makes your app more composable, and it's easier to change out any of the pieces in the future.
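As a sketch, such an 'app_server' role in the Ruby DSL could look like this (the cookbook names are placeholders for your own):

# roles/app_server.rb
name 'app_server'
description 'nginx plus node.js application server'
run_list(
  'recipe[nginx]',
  'recipe[nodejs]',
  'recipe[myapp_deploy]'
)
default_attributes(
  'myapp' => { 'workers' => 2 }
)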