Terraform is a command-line tool for creating and managing your cloud infrastructure. Infrastructure is expressed in a declarative configuration language, the HashiCorp Configuration Language (HCL), and multiple cloud infrastructure providers are supported.
A key aspect of Terraform's functionality is the state of your infrastructure as Terraform sees it. This state is stored in a backend, and multiple backends are supported. The default backend is `local` and is implemented as a file, usually `terraform.tfstate`. The backend selection is a key decision that has to be made right at the start of adopting Terraform to manage your infrastructure.
Why Non-local Terraform Backends?
Although using the local backend is simple, especially when getting started, at least two problems will show up sooner rather than later:
Storage of secrets
The local state file is usually managed in the same repository as the Terraform code. This means sensitive data may end up stored unencrypted in your repository and on disk.
No locking
Since the state lives as a file in the repository, nothing prevents multiple instances of Terraform, invoked either manually or in a continuous integration pipeline, from attempting to modify your infrastructure at the same time. It's worth noting that this can to some extent be mitigated by enforcing rules such as allowing only one instance of your pipeline to run at any given time.
To solve both problems, Terraform supports non-local backends. When using a remote backend, Terraform doesn't store the state on the local disk; it lives in the backend. This, however, means that if we want the state encrypted at rest and in transit, we have to rely on a backend-specific solution. For example, the documentation describes a possible solution when using the AWS S3 backend.
To solve the second problem, two possible options are AWS S3 and HashiCorp's Consul. Documentation on using S3 (which achieves its locking via DynamoDB) is part of the official Terraform documentation.
Using a Consul Remote Backend
Using `consul` to solve the above problems is the focus of this article. For most of this article, we will look at a getting-started setup where we run the consul server and apply infrastructure changes from our local system. Towards the end, I will briefly touch on how we may adapt this approach to a more realistic scenario, and how we may go about encryption of Terraform state when using consul.
The accompanying git repository contains the configuration and code used in this article, and it may be a good idea to clone it as you work through the article. Linux and OS X are the only operating systems on which the accompanying scripts have been tested. Besides needing to download `consul` and `terraform` (as described next), we will use `pipenv` to run some Python scripts.
Set up Consul
To use `consul` as a remote backend for Terraform with locking, we will make use of consul's key-value store, ACL system, and sessions.
`consul` is distributed as a platform-specific binary zip file. The latest version at the time of this writing is 1.1.0, which can be downloaded from here. Download the platform-specific zip file and unzip it to extract the `consul` binary. It may be a good idea to add the unzip location to the system `PATH` variable or equivalent.
Next, we will start a `consul` development server:

```
$ cd <repository root>
$ consul agent -dev -config-file=./consul/server-config.json
...
```
The `server-config.json` file has the following configuration:

```json
{
  "acl_datacenter": "dc1",
  "acl_master_token": "Arandom$tring",
  "acl_default_policy": "deny",
  "acl_down_policy": "extend-cache"
}
```
The above starts our consul dev server with ACLs enabled, a master (management) token, and a default policy of `deny`. The consul ACL guide explains these settings in detail.
We will leave the `consul` server running in a dedicated terminal session.
Set up Terraform
Similar to `consul`, `terraform` is distributed as a platform-specific binary zip file. At the time of this writing, the latest version is 0.11.7. Download the relevant zip file from here and unzip it somewhere on your filesystem.
I assume that the path where it is unzipped is added to the system `PATH` variable or equivalent, so that it can be invoked from anywhere on the command line without specifying the absolute path.
Initializing the backend
Now that we have `terraform` on our system, let's initialize the backend. The Terraform configuration is as follows:

```
# backend.tf
terraform {
  backend "consul" {
    path = "terraform/state"
    lock = true
  }
}
```
The `path` above specifies the consul key under which we will store the state, and we specify that we want to use locking. Let's now run Terraform and initialize the backend. Since we first need to obtain an ACL token from `consul`, we will use a `bash` script to tie the two steps together:
```
$ cd <repository root>/terraform/configuration
$ ./init.bash
~/work/github.com/amitsaha/terraform-consul-lock-demo/terraform/bootstrap ~/work/github.com/amitsaha/terraform-consul-lock-demo/terraform/configuration
~/work/github.com/amitsaha/terraform-consul-lock-demo/terraform/configuration
Initializing the backend...
Backend configuration changed!
...
```
The above output tells us that we have successfully initialized the `consul` backend with Terraform. The contents of `init.bash` are as follows:
```bash
#!/bin/bash
set -e
pushd ../bootstrap-utils
terraform_token=$(pipenv run python get_session_token.py)
popd
terraform init --backend-config="access_token=$terraform_token"
```
First, we run a Python script to get an ACL token from consul which has permission to:

- Read and write the consul KV store at the path `terraform/state`.
- Create sessions on all nodes.
We then run `terraform init`, providing the token via partial configuration so that we don't have to hardcode the access token.
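The repository's `get_session_token.py` is not reproduced here, but a minimal sketch of what such a script might do looks like the following. It assumes consul's legacy ACL HTTP API (`PUT /v1/acl/create`, current in 1.1.0) and the dev server started earlier; the function names, token name, and exact rule set are illustrative, not the repository's code.

```python
import json
import urllib.parse
import urllib.request

CONSUL_ADDR = "http://localhost:8500"  # dev server address (assumption)
MASTER_TOKEN = "Arandom$tring"         # acl_master_token from server-config.json

# Legacy (pre-1.4) ACL rules: write access to the state key, plus
# permission to create sessions, which state locking needs.
RULES = """
key "terraform/state" {
  policy = "write"
}
session "" {
  policy = "write"
}
"""


def build_acl_request(name: str, rules: str) -> dict:
    """Build the payload for consul's legacy PUT /v1/acl/create endpoint."""
    return {"Name": name, "Type": "client", "Rules": rules}


def create_token() -> str:
    """Ask the consul server for a new ACL token and return its ID."""
    payload = json.dumps(build_acl_request("terraform-backend", RULES)).encode()
    url = "{}/v1/acl/create?token={}".format(
        CONSUL_ADDR, urllib.parse.quote(MASTER_TOKEN))
    req = urllib.request.Request(url, data=payload, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["ID"]
```

Calling `print(create_token())` against the running dev server would print the new token's UUID, which `init.bash` then passes to `terraform init`.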
When you run the above script, you will see logs as follows on the `consul` process console:

```
2018/06/05 22:35:59 [DEBUG] http: Request PUT /v1/acl/create?token=Arandom%24tring (838.515µs) from=127.0.0.1:50425
2018/06/05 22:35:59 [DEBUG] http: Request GET /v1/kv/terraform/state (639.584µs) from=127.0.0.1:50428
```
We can see that an API call was made to create a token (by our Python script) and then a GET query was made (by Terraform) to the consul KV store to query the current state.
The `init` command initializes the backend for us and creates a `.terraform` sub-directory. In it, we will have a `terraform.tfstate` file which looks as follows:

```
$ cat .terraform/terraform.tfstate
{
    "version": 3,
    "serial": 1,
    "lineage": "7d772b24-b269-0638-3832-339be8926025",
    "backend": {
        "type": "consul",
        "config": {
            "access_token": "4925cfa5-1195-4802-74f4-64561e6fa788",
            "lock": true,
            "path": "terraform/state"
        },
        "hash": 14982975079171644367
    },
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {},
            "resources": {},
            "depends_on": []
        }
    ]
}
```
When we perform any subsequent Terraform operation, Terraform will consult the configuration above to interact with the backend. This file does not need to be committed to version control, and it is safe to run the `init` operation more than once.
Managing a consul key-value resource
At this stage, Terraform is initialized and we can now start managing our infrastructure. To keep things simple, we will use Terraform's consul provider to create a key on the local consul server we have running.
The configuration looks as follows:

```
# infrastructure.tf
variable "app1_version_token" {}

resource "consul_keys" "app1_version" {
  datacenter = "dc1"
  token      = "${var.app1_version_token}"

  key {
    path  = "app1/version"
    value = "0.1"
  }
}
```
We will supply the token when running `terraform` via the script `apply_consul_key.bash`:

```bash
#!/bin/bash
set -e
pushd ../configuration-utils
pipenv install
terraform_token=$(pipenv run python get_kv_token.py)
popd
terraform apply -target=consul_keys.app1_version -var "app1_version_token=$terraform_token"
```
Next, we will run the script:

```
$ cd <repository root>/terraform/configuration
$ ./apply_consul_key.bash
...
```
Once the above script runs, it will create the key with the specified value. On the consul server, you will see logs that show API calls to the server for acquiring a lock, reading current state, creating a session, creating the key, and writing the final state.
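Independently of Terraform, we can verify the key through consul's KV HTTP API, which returns values base64-encoded. Here is a small sketch that decodes a response of that shape; the response body below is an illustrative sample, trimmed to the relevant fields:

```python
import base64
import json

# Sample body of GET /v1/kv/app1/version (illustrative, fields trimmed)
response_body = '[{"Key": "app1/version", "Value": "MC4x", "Flags": 0}]'

entries = json.loads(response_body)
value = base64.b64decode(entries[0]["Value"]).decode()
print(value)  # -> 0.1
```

Against the live dev server, the same decoding applies to the body returned by `curl 'http://localhost:8500/v1/kv/app1/version?token=<acl token>'`.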
Demonstration of Locking
Let's modify our configuration, `infrastructure.tf`, to update the value of the key:

```
$ cd <repository root>/terraform/configuration
$ git diff infrastructure.tf
diff --git a/terraform/configuration/infrastructure.tf b/terraform/configuration/infrastructure.tf
index 67bf4ea..ea9490a 100644
--- a/terraform/configuration/infrastructure.tf
+++ b/terraform/configuration/infrastructure.tf
@@ -5,6 +5,6 @@ resource "consul_keys" "app1_version" {
   token = "${var.app1_version_token}"
   key {
     path = "app1/version"
-    value = "0.1"
+    value = "0.2"
   }
 }
```
Run the `apply_consul_key.bash` script in a terminal window. While it waits for us to confirm, run the script in another console. The second invocation exits with an error as follows:
```
Acquiring state lock. This may take a few moments...
Error: Error locking state: Error acquiring the state lock: Lock Info:
  ID:        1494f8fc-fe71-7dbe-7ea8-b21b0d402d2e
  Path:      terraform/state
  Operation: OperationTypeApply
  Who:       vagrant@default-centos-7-latest
  Version:   0.11.7
  Created:   2018-06-08 01:46:24.40623813 +0000 UTC
  Info:      consul session: 19e976ad-0c43-b511-3cd1-bc554f7416e1
```
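Under the hood, the backend holds the lock by acquiring the state key with a consul session: an acquire succeeds only if the key is unlocked or is already held by the same session. The following toy in-memory model illustrates that rule; it is a sketch of the semantics, not consul's actual implementation:

```python
class ToyLockStore:
    """Toy model of consul's session-based KV acquire/release semantics."""

    def __init__(self):
        self.holders = {}  # key -> session id currently holding the lock

    def acquire(self, key, session):
        # Succeeds if unlocked, or if the same session already holds it.
        holder = self.holders.get(key)
        if holder is None or holder == session:
            self.holders[key] = session
            return True
        return False

    def release(self, key, session):
        # Only the holding session may release the lock.
        if self.holders.get(key) == session:
            del self.holders[key]
            return True
        return False


store = ToyLockStore()
print(store.acquire("terraform/state", "session-A"))  # -> True: first apply holds the lock
print(store.acquire("terraform/state", "session-B"))  # -> False: second apply is blocked
store.release("terraform/state", "session-A")
print(store.acquire("terraform/state", "session-B"))  # -> True: lock is free again
```

This is why the second `terraform apply` above fails immediately: its session cannot acquire `terraform/state` while the first apply's session holds it.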
We can use the `-lock-timeout` argument (for example, `terraform apply -lock-timeout=2m`) to specify how long Terraform should keep retrying to acquire the lock before giving up.
Beyond Local Setup
It perhaps makes the most sense to use the `consul` backend for Terraform if you are already using consul in your organization. In such a scenario, your consul server runs at a central location while `terraform` is run as part of a continuous integration and deployment pipeline. Compared to our demo setup, we would do the following things differently:
Managing consul tokens for Terraform
We saw that Terraform needs a consul token that can create sessions and read and write the key at which the state is stored. In our demo, we used the master token to create this token. Ideally, we would have a dedicated token whose only capability is to create tokens for `terraform` with the desired policy.
Encrypt state at rest
Terraform does not store the state file encrypted by default. Even if we use consul ACLs to make sure no undesired entity can access the state, it is still unencrypted at rest.
For both the above scenarios, Vault is worth looking into.
Summary
In this article, we looked at setting up `terraform` with the `consul` backend. If you are already using consul in your infrastructure, this approach is definitely worth looking into.
Although you could keep using the local backend and rely on a CI solution to enforce that only a single instance of Terraform runs at any point in time, using a remote backend with locking is easy enough that there is little reason not to do it.
The repository used for this article is available here.
Resources
The following resources should be helpful for you to learn more: