Spruce, Vault, Concourse & You

January 11, 2016

BOSH makes a whole lot of tasks in the operations / systems management space way easier than ever before. Combine that with tools like Spruce and Genesis, and you have a really powerful paradigm for managing your deployments. Pair that with Concourse and it seems like the sky is the limit!

Then you run into the security problem.

In order for your Concourse pipeline to deploy your BOSH manifests, you need the manifests. No problem; just stick them in a git repository somewhere and you’re good. Except that BOSH manifests are full of all kinds of sensitive information like database credentials, root passwords and IaaS secret keys.

Enter Vault, the secure credentials storage system from HashiCorp. Vault is a really slick piece of technology, and it would be great if we could just integrate it into our Concourse deployment pipelines, right?

Sure. Let’s do that.

It All Starts With Genesis

Genesis is a deployment paradigm for BOSH that builds on top of Spruce and comes with its own embeddable helper script. It sets up new deployment manifest repositories (in git) using a multi-tiered structure of increasingly specific layers of BOSH manifest file overrides (global → site → environment).

You can read more about it here.

When it comes to Vault and secrets, Genesis Just Works™

Genesis will detect that you are running with access to a Vault (by way of the VAULT_ADDR environment variable), and will behave accordingly. Notably, this involves generating two versions of your BOSH manifest — one with credentials in it, to be used for deploying; and one without credentials, suitable for committing to your upstream git repository.

It also features rudimentary Concourse pipeline support for doing basic deployments. This too takes advantage of the Vault integration to pull down secrets during deployment.

Setting Up Vault

Before we can start experimenting with all these neat toys, we’re going to need a spinning Vault instance. For our purposes we’ll use BOSH and a self-contained in-memory storage backend.

Luckily, there is a BOSH release for Vault that we can use!

$ git clone https://github.com/cloudfoundry-community/vault-boshrelease
$ cd vault-boshrelease
$ bosh upload release releases/vault/vault-0.1.3.yml

Then, spin up a deployment. Go ahead. I’ll wait.

Once Vault is up, you should take note of the IP that your vault VM is running on, by running bosh vms. Put that in an environment variable named VAULT_ADDR and export it.

$ export VAULT_ADDR=http://<VAULT-IP-ADDRESS>:8200

Then, you’ll want to follow the Getting Started Guide on the Vault website, to unseal your vault and get access with the root token.
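If you end up scripting any of this, it can help to fail fast when VAULT_ADDR is missing or malformed, rather than getting a confusing error from the Vault CLI later. Here is a minimal sketch; the require_vault_addr function name is made up for illustration and is not part of Vault or Genesis.

```shell
# Hypothetical helper: succeed only if VAULT_ADDR looks like
# http://<host>:<port> or https://<host>:<port>.
require_vault_addr() {
  case "${VAULT_ADDR:-}" in
    http://?*:[0-9]*|https://?*:[0-9]*)
      return 0 ;;
    *)
      echo "VAULT_ADDR must look like http://<VAULT-IP-ADDRESS>:8200" >&2
      return 1 ;;
  esac
}
```

Calling require_vault_addr at the top of a script gives you a readable error message before any Vault command runs.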

Using the App-ID Authentication Backend

Vault is optimized for people. It provides a wealth of authentication tie-ins to systems like LDAP and Active Directory, so that organizations can enforce policy globally, and users have one fewer password to remember. This poses a bit of a problem for our situation, since we want a robot (our Concourse worker) to be able to securely authenticate to the vault without the assistance of a human.

The App-ID backend (documented here) provides us just that. We can configure two tokens, the user-id and app-id, lock them down with a network ACL, and drop those two tokens into our pipeline configuration.

First up, we’ll need to enable the app-id authentication method:

$ vault auth -methods
Path     Type    Description
token/   token   token based credentials
$ vault auth -enable app-id
$ vault auth -methods
Path     Type    Description
app-id/  app-id
token/   token   token based credentials

Next, we’ll configure an app-id token, by writing to the correct backend path, like so:

$ vault write auth/app-id/map/app-id/testing-deployment-pipeline \
              value=root \
              display_name="Testing Deployments pipeline"

There’s a lot going on here, but here are the highlights:

The path auth/app-id/map/app-id/testing-deployment-pipeline is just how the Vault app-id backend needs to be configured. Everything but that last component is literal. The last part is the app-id token itself, in this case testing-deployment-pipeline.
The value=root argument associates the new token with the named access policy (here: "root"). This governs what access is allowed for machines that successfully authenticate with this token.
The display_name=... argument sets the name to be used in CLI output.

An app-id token is useless without an associated user-id token, so let’s make one of those too:

$ vault write auth/app-id/map/user-id/concourse \
        value=testing-deployment-pipeline \
        cidr_block=10.244.0.0/16

That’s a keyboard-ful. Breaking it down:

The path is similar, but differs in two key ways: the last component, concourse, is the new user-id token we are creating, and the component before it is user-id instead of app-id.
The value=... argument associates the user-id token with the app-id token we just created (testing-deployment-pipeline). In Vault parlance, the user-id is now mapped to the app-id, and the two tokens can be used together to authenticate.
The cidr_block=... argument restricts where the user-id token can be used from. Any attempt to authenticate as "concourse" from anywhere outside of the 10.244.x.x network will fail.
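To get a feel for what a cidr_block like 10.244.0.0/16 does and does not cover, here is a small sketch in plain shell arithmetic. It only illustrates the CIDR math; it is not how Vault implements the check, and the ip_to_int and ip_in_cidr names are invented for this example.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 falls inside CIDR block $2 (e.g. 10.244.0.0/16).
ip_in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")     # part before the slash
  bits=${2#*/}                   # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)
```

With these defined, ip_in_cidr 10.244.3.7 10.244.0.0/16 succeeds, while ip_in_cidr 10.245.0.1 10.244.0.0/16 fails, which mirrors which Concourse workers would be allowed to use the "concourse" user-id.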

We’re also going to create a fixed secret, called "secret/handshake", that we will use later to validate authentication during automation runs:

$ vault write secret/handshake knock=knock

You can validate that the secret was saved (and see what the diagnostic output in our pipeline will look like) by running:

$ vault read secret/handshake
Key             Value
lease_duration  2592000
knock           knock

That’s it. Vault is configured!

A Note On Real-World Usage

Configuring security is not a lightweight task, and definitely demands attention to detail and an appreciation for the subtleties of your environment, intended use cases and technology stack. To keep this already very long post short, I’m just using the default root policy that ships with Vault.

Don’t do that in production

The root policy has full access to all secrets in all backends. In real environments, you will want to create a new policy with locked down and very specific access, and use that instead.
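As a rough sketch of what a more locked-down policy might look like, the snippet below writes a read-only policy file. The policy name (deploy-pipeline) and the secret/cloud/* path are assumptions made up for this example, and the policy = "read" syntax is the form used by Vault releases of this era; check the documentation for your Vault version before relying on it.

```shell
# Write a hypothetical read-only policy to a temporary file.
policy_file=$(mktemp)
cat > "$policy_file" <<'EOF'
# Assumed paths: adjust to wherever your deployment's secrets live.
path "secret/handshake" {
  policy = "read"
}

path "secret/cloud/*" {
  policy = "read"
}
EOF

# Assumed next steps (not run here): load the policy under a name of
# your choosing, for example
#   vault policy-write deploy-pipeline "$policy_file"
# and then use value=deploy-pipeline instead of value=root when
# mapping the app-id.
```

The point is simply that the pipeline only ever needs to read a handful of paths, so that is all its policy should grant.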

You’ll also notice that these commands pass secrets in the clear as command line arguments. These have the unfortunate side effect of showing up in the process table, in sudo logs, and in shell history files (for fun, grep vault ~/.*history).

Instead, you should be passing credentials via files, using the @/path/to/file invocation style. Make sure you chmod the files properly before you put your secrets in them!

$ mkdir ~/secrets
$ chmod 0700 ~/secrets
$ touch ~/secrets/key
$ chmod 0600 ~/secrets/key
$ vi ~/secrets/key
...
$ vault write secret/key/stuff @$HOME/secrets/key
$ rm ~/secrets/key

And remember, those are JSON files.
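Here is a minimal sketch of creating such a JSON file so that it is never readable by anyone else, even for an instant. The key names and values are placeholders, and a temporary directory stands in for ~/secrets so the example can be run safely.

```shell
# A private scratch directory (mktemp -d creates it owner-only);
# this stands in for ~/secrets.
secrets_dir=$(mktemp -d)

(
  umask 077                      # files created in this subshell are 0600
  cat > "$secrets_dir/key" <<'EOF'
{
  "username": "admin",
  "password": "replace-me"
}
EOF
)

# Then, as above (not run here):
#   vault write secret/key/stuff @"$secrets_dir/key" && rm "$secrets_dir/key"
```

Setting the umask before the file is created closes the small window between touch and chmod in which the file would otherwise have default permissions.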

Configuring Concourse

I’m going to cheat here and lean heavily on Genesis for generating my pipelines. It creates all the necessary bits and pieces of configuration and scripts, and handles Vault for you.

Assuming you start with a genesis-managed deployment:

$ cd testing-deployments
$ genesis embed
...
$ genesis repipe
...

The genesis embed call stores a copy of your current genesis script in the top-level bin/ directory. The Concourse/Genesis integration pieces rely on this to avoid mixing versions.

When we call genesis repipe, Genesis looks at all of the Concourse configuration fragments in the ci/ directory and assembles them into a single cohesive configuration. (It also calls out to Vault if you set up the templates to pull in secrets like BOSH passwords.) It then takes the final configuration and uploads it to Concourse to configure the pipeline.

When all is said and done, we will have a new pipeline configured in our Concourse installation, all ready to go.

The Docker Task Image

The Docker image that Concourse is going to spin up needs the following utilities:

Spruce
jq (for JSON integration with Vault)
Vault CLI utility

You actually get this for free if you use Genesis to generate your deployment, since it sets up a job to build a custom Docker image, as part of the pipeline itself.

Note: until v0.14.0 of Spruce is cut, you will have to build off of master since the (( vault ...)) operator (described later) is not in v0.13.0 or lower.

How It All Fits Together

Magic!

When you push new commits to master, Concourse will take note and kick off the deployment. It spins up the Docker task image, and does the following:

1. Authenticates to the $VAULT_ADDR using the two tokens (which are themselves passed in as environment variables).
2. Validates the status of the remote Vault by running vault status. (This has the useful side-effect of providing diagnostics when / if the vault is sealed or otherwise unusable.)
3. Reads a fixed secret, secret/handshake, to verify that the authentication succeeded.
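The steps above could be sketched roughly like this. This is not the actual Genesis script; the VAULT_APP_ID and VAULT_USER_ID variable names and both function names are assumptions for illustration. It logs in through Vault's HTTP API (the auth/app-id/login endpoint), pulls the client token out of the response with jq, and then runs the two checks.

```shell
# Build the JSON body for Vault's app-id login endpoint.
app_id_payload() {
  printf '{"app_id":"%s","user_id":"%s"}' "$1" "$2"
}

# Hypothetical preflight check; VAULT_APP_ID and VAULT_USER_ID are
# assumed environment variable names, not something Genesis defines.
vault_preflight() {
  # Step 1: authenticate and capture the client token.
  token=$(curl -s -X POST "$VAULT_ADDR/v1/auth/app-id/login" \
            -d "$(app_id_payload "$VAULT_APP_ID" "$VAULT_USER_ID")" \
          | jq -r '.auth.client_token') || return 1
  [ -n "$token" ] && [ "$token" != "null" ] || return 1
  export VAULT_TOKEN="$token"

  # Step 2: surfaces a sealed or unreachable vault in the job log.
  vault status || return 1

  # Step 3: proves the login actually grants access to secrets.
  vault read secret/handshake || return 1
}
```

A pipeline task would call vault_preflight once, before any manifest generation, and abort the build if it returns non-zero.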

Take particular note of item 3. If you recall, we set that secret/handshake bit up as we were configuring Vault. If you skip that step, the automation will fail because it thinks that it has not successfully authenticated to Vault.

Once it can access the vault and retrieve secrets and credentials, the pipeline runs some bits of Genesis to combine all of those YAML templates together via Spruce.

When Spruce sees an operator like this:

---
meta:
  credentials: (( vault "cloud/admin:password" ))

it does one of two things. In the clean manifest (the one without credentials) it replaces the operator with the literal string "REDACTED". This indicates that there is supposed to be sensitive information there, but that it has been hidden to allow the manifest to be committed to git. In the deployment manifest (the one with credentials) it contacts the vault and asks for the secret/cloud/admin secret, and extracts the ‘password’ key from that, replacing the operator with that value.

With the manifests (plural) generated, the pipeline moves onto the next step, and attempts to bosh deploy the deployment manifest (secrets and all). If that succeeds, it commits the clean manifest, destroys the deployment manifest and pushes the (scrubbed) changes back to origin.

The upshot of this little dance is that credentials live inside the secure confines of the vault, and are only exposed for a small window of time inside the executing Docker container. They do not get committed.
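If you want extra reassurance before anything is pushed, one option is a belt-and-braces check that the clean manifest really does not contain a known secret value. This is a sketch of an additional guard, not something Genesis does for you; the manifest_is_clean name is invented here.

```shell
# Hypothetical guard: succeed only if file $1 exists and does not
# contain the literal string $2 anywhere.
manifest_is_clean() {
  [ -f "$1" ] || return 2        # a missing file is an error, not "clean"
  ! grep -qF -- "$2" "$1"
}
```

A pipeline could run this against the clean manifest with one or two real secret values just before the git commit, and refuse to push if it fails.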

Gaps In The Armor / Mitigation

This is not a perfect solution.

It suffers from a few large gaps in protection. For starters, neither Concourse nor BOSH understands which parts of its configuration / manifests are sensitive, so neither redacts them. They do, however, make it possible to retrieve that configuration:

$ bosh download manifest test-deployment
...
$ echo "--- {}" > empty.yml
$ echo n | fly set-pipeline -p test-deployment-pipelines -c empty.yml
...

This can, if not handled with appropriate caution, completely sidestep all of the protection of a Vault-enabled pipeline.

The bosh deploy command also prints a semantic difference highlighting changes being made to the deployment. If you have rotated passwords inside your vault, the new secrets will be printed in the clear on the next deploy. Concourse compounds this risk by making that output available as part of the job log in its web user interface.

Beyond that, anyone with direct access to the BOSH director, or any of the VMs inside of the BOSH deployments (especially the Concourse VMs) can access passwords that are rendered via job templates into files on-disk.

For these reasons, you must be careful to configure appropriate compensating controls in your environment. This boils down to:

Isolating your BOSH director behind strong firewalls.
Only allowing access to the BOSH director from a secure jump box.
Disallowing any remote shell access to deployed VMs.
Configuring Concourse with HTTPS transport security.
Requiring strong authentication on your Concourse web interface.

Hopefully in the future, the BOSH and Concourse teams will turn their prodigious software engineering talents towards hardening these products to be more security-conscious. There’s currently a pull request out to the Concourse team for the bosh-deployment-image resource that adds the --redact-diff option to the BOSH deploy command, to hide the diff output altogether.

Digging Deeper

Hopefully, you’re all fired up about protecting sensitive credentials without losing the ability to automate your BOSH deployments via Concourse pipelines. To dig a little deeper, check out these resources:

Genesis – Explains how Genesis deployment repositories are structured, and why.
Spruce – Learn how Spruce can make your life tons easier when managing large YAML files with lots of duplication / self-reference.
Vault Documentation (official) – Read up on how to store your credentials safely and securely. The introduction is also pretty helpful.
Vault BOSH Release – For deploying your own copy of Vault, to a BOSH director you own.

Happy Hacking!

Written by:
Ashley Gerwitz

Marketing Manager at Stark & Wayne and Qarik