Building Konvad: A Docker-based Vault-Consul-Nomad Stack


The beginning

I’ve been running a HashiCorp stack for a while now - Vault, Consul and Nomad across a small cluster of machines.

Over time, I’d built up a pretty reasonable setup, but it was always a bit… scattered. Terraform here, manual config there, a few docker-compose files lying around.

The thing is, I’m running all of this on self-hosted hardware. There’s no AWS ECS or Google Cloud Run to fall back on. So I needed a way to bootstrap and manage my orchestrator that was actually manageable.

I wanted to consolidate everything into a proper, reproducible Terraform-based setup that I could actually understand and, more importantly, rebuild when something inevitably went wrong.

The goal was simple:

  • All infrastructure as code
  • Docker-based for easy management on self-hosted machines
  • Proper security with TLS everywhere
  • A clear hierarchy: Vault → Consul → Nomad
  • Easy to tune configs without ssh-ing into boxes

This is how I built konvad-stack.

Why Docker?

Before I get into the actual structure, I should explain why Docker…

When you’re running on self-hosted hardware, you don’t have the nice managed services that cloud providers give you. No ECS, no Cloud Run, no managed K8s. You’ve got bare metal and you need to make it useful.

The thing I like about this approach is I can start with a fresh VM, install Docker, and Terraform handles everything else. All the information about what’s deployed where is in Terraform state - I can see exactly what’s running on each host without having to SSH in and poke around.

Docker gives me:

  • Easy rebuilding - container dies? recreate it. No package installs to clean up
  • Version pinning - build images once, reference by hash
  • All state managed by Terraform - no side-channel config management

The tradeoff is I’m using host network mode to avoid the complexity of overlay networks. But honestly, that’s been fine. The services need to talk to each other anyway, and it makes TLS certs simpler since I don’t need to worry about container IPs.

The hierarchy

The entire stack is built around a strict dependency chain. Nothing works without the layer below it.

┌─────────────────────────────────────────────────────────┐
│                        Nomad                            │
│  (Jobs run here, using Vault secrets + Consul service)  │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                       Consul                            │
│   (Service discovery, KV store, Connect for mesh)       │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                       Vault                             │
│        (PKI, secrets engine, auth backends)             │
└─────────────────────────────────────────────────────────┘

You literally cannot deploy Nomad without Consul, and you cannot deploy Consul without Vault. This is a design choice I made (and though Vault-first may be common, I’ve also heard of setups where Consul is deployed first).

Environment structure

Everything is organised under environments/ with a prod environment that contains separate directories for each stage of deployment:

environments/prod/
├── images/                  # Docker image definitions
├── vault_setup/             # Initial Vault cluster deployment
├── vault_configure/         # Vault PKI, secrets engines, auth
├── consul/                  # Consul server deployment
├── consul_configure/        # Consul datacenter config, roles
├── nomad/                   # Nomad server deployment
├── nomad_configure/         # Nomad datacenter config
├── nomad_client_*/          # Nomad client clusters
└── deployment_roles/        # Non-nomad service deployment permissions

Each directory is a separate Terraform state whose outputs are pulled in by subsequent stages. The separation keeps the blast radius small… which is nice when something breaks. It also provides a boundary for providers - one module creates Vault, then the next configures a Vault provider, which creates a hard dependency at plan time on Vault actually running.

Also means I can destroy and recreate just the Nomad clients without touching the core cluster. Which I’ve had to do. More than once.
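
As a sketch of how the stages chain together - the backend type and paths here are placeholders, not my actual backend:

```hcl
# Hypothetical example: the consul stage pulling outputs from the
# vault_configure stage's state. Backend and paths are assumptions.
data "terraform_remote_state" "vault_configure" {
  backend = "local"

  config = {
    path = "../vault_configure/terraform.tfstate"
  }
}

# The Vault provider now has a hard dependency on Vault being up:
# if the address is unreachable, plan fails before anything changes.
provider "vault" {
  address = data.terraform_remote_state.vault_configure.outputs.vault_cluster.address
}
```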

The deployment order

Stage 1: Images

Before anything else, I need to build the Docker images that will be used across the cluster. These are stored in a separate Terraform state and referenced by other modules.

The images include:

  • Vault server and agent
  • Consul server and agent
  • Nomad server and client

I build these images once and reference them by digest. So the image hash IS the version - no wondering what’s actually running on each host.
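
The digest pinning looks roughly like this - registry and digest values are placeholders:

```hcl
# Hypothetical sketch: once an image is pushed, downstream modules
# reference it by digest rather than tag, so the hash IS the version.
resource "docker_container" "vault" {
  name  = "vault-server"
  image = "registry.example.local/konvad/vault@sha256:3f1b..."
}
```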

Stage 2: Vault Setup

This is where everything begins. The vault_setup module deploys Vault nodes to each host.

Each Vault node gets:

  • A KMS container running side-by-side for auto-unseal (yes, a local KMS… more on that in a bit)
  • A Vault container running in host network mode
  • TLS certificates signed by the existing PKI
  • Auto-unseal configured via the local KMS
  • Local volumes for data, logs, and raft storage

The KMS thing… why run a KMS container when Vault could just use Shamir keys? Well, I wanted auto-unseal without actually depending on AWS KMS or similar. So there’s a local-kms container running on each Vault host that provides a KMS-compatible API.

Vault talks to this local KMS for unseal keys. No manual unseal when a node restarts, no dependency on external services, the unseal keys are stored in the KMS container’s data volume. If the host dies, I just restore the KMS data volume and Vault unseals automatically.
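
Wiring Vault to the local KMS is just the standard awskms seal stanza pointed at a custom endpoint - something like this, where the endpoint, key ID, and dummy credentials are assumptions:

```hcl
# Sketch of the Vault server config: the awskms seal pointed at the
# local-kms sidecar instead of actual AWS. Values are placeholders.
seal "awskms" {
  region     = "eu-west-1"
  endpoint   = "http://127.0.0.1:8080"
  kms_key_id = "alias/vault-unseal"
  access_key = "local"
  secret_key = "local"
}
```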

The thing is, Vault needs to be bootstrapped before anything else can happen. I can’t just deploy all three services at once… Vault must be running, initialised, and unsealed before Consul can even start to configure itself.

Stage 3: Vault Configure

Once Vault is running, I configure:

  • PKI mount for the internal CA
  • Consul secrets engine
  • Nomad secrets engine
  • JWT auth backend for Gitlab CI
  • AppRole auth backend for services
  • VM JWT auth backend for… VMs
  • KV stores for various static secrets

This stage outputs all the connection details that other services will need - addresses, mount paths, role names, etc.

Stage 4: Consul

Consul servers are deployed next, each with a layered container setup.

Container layering

On a Consul server, there are actually two containers:

┌─────────────────────────────────────┐
│        Consul Server Container      │
│  - Listens on 8500/8501             │
│  - Reads config from /consul/config │
│  - Mounts vault-agent volume        │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│   Vault Agent + Consul Template     │
│  - Authenticates to Vault           │
│  - Renders Consul config templates  │
│  - Manages token renewal            │
└─────────────────────────────────────┘

The vault-agent container runs consul-template which:

  • Authenticates to Vault using AppRole
  • Retrieves the Consul server token
  • Retrieves gossip encryption keys
  • Renders the Consul configuration file
  • Restarts Consul when config changes

This pattern is repeated everywhere - no static credentials in config files, everything is dynamically rendered.
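
A minimal sketch of what the sidecar’s consul-template config might look like - the template paths, restart command, and token sink location are assumptions (the Vault token itself comes from the vault-agent auto-auth sink):

```hcl
# consul-template reads the Consul token and gossip key from Vault,
# renders the server config, and bounces Consul when it changes.
vault {
  address                = "https://127.0.0.1:8200"
  vault_agent_token_file = "/vault-agent/token"
}

template {
  source      = "/templates/consul-server.hcl.tpl"
  destination = "/consul/config/consul-server.hcl"
  command     = "docker restart consul-server"
}
```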

Stage 5: Consul Configure

With Consul running, I configure the datacenter:

  • ACL system
  • Gossip encryption
  • TLS certificates for all nodes
  • Consul Connect for service mesh
  • DNS configuration
  • VM JWT authentication for non-containerized workloads

This outputs the datacenter configuration that Nomad will need.

Stage 6: Nomad

Nomad servers follow the same pattern as Consul - a main container with a vault-agent sidecar for dynamic configuration.

Nomad needs:

  • Vault connection details for workload identity
  • Consul connection details for service registration
  • The region configuration
  • TLS certificates

The Nomad servers are configured to use Vault for workload identity and Consul for service discovery, creating a fully integrated stack.
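
The relevant server config stanzas end up looking something like this - addresses, paths, and counts are assumptions, and the exact shape depends on your Nomad version:

```hcl
# Sketch of a Nomad server config wired into Vault (workload identity)
# and Consul (service registration). All values are placeholders.
server {
  enabled          = true
  bootstrap_expect = 3
}

vault {
  enabled = true
  address = "https://vault.svc.example.local:8200"
}

consul {
  address = "127.0.0.1:8501"
  ssl     = true
}

tls {
  http      = true
  rpc       = true
  ca_file   = "/nomad/tls/ca.pem"
  cert_file = "/nomad/tls/server.pem"
  key_file  = "/nomad/tls/server-key.pem"
}
```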

Stage 7: Nomad Clients

Nomad clients are deployed similarly to servers, but with a few differences:

  • They run the Docker driver for workload execution
  • They mount the Docker socket for container management
  • They have cgroup mounts for proper resource accounting
  • Each client cluster can have different configurations (network interfaces, storage, etc.)

The pattern is the same though - main container plus vault-agent sidecar, all configs dynamically rendered.

The config deployment pattern

Since I’m running on self-hosted machines and not in some magical cloud, I need to actually get config files onto the boxes.

I use a pattern that goes like this:

resource "null_resource" "service_config" {
  triggers = {
    config = local.config_content
  }

  connection {
    type         = "ssh"
    user         = var.docker_host.username
    host         = var.docker_host.fqdn
    private_key  = file(var.docker_host.private_key)
    bastion_host = var.docker_host.bastion_host
    bastion_user = var.docker_host.bastion_user
  }

  provisioner "file" {
    content     = local.config_content
    destination = "/service/config/file.hcl"
  }
}

resource "docker_container" "service" {
  # ... container config ...

  lifecycle {
    ignore_changes = [image, log_opts]

    replace_triggered_by = [
      null_resource.service_config,
      null_resource.container_image,
    ]
  }
}

So what happens is:

  1. The null_resource has a trigger based on the config content
  2. When the config changes, Terraform wants to “replace” the null_resource
  3. The file provisioner runs via SSH to upload the new config
  4. The docker_container has replace_triggered_by pointing to the null_resource
  5. When the null_resource is “replaced”, the container is destroyed and recreated

It’s a bit of a hack… okay, it’s absolutely a hack. But it works, and it means I don’t need to run Ansible or some other config management tool. Terraform handles both the config deployment AND the container lifecycle.

The container image trigger handles the image digest issue - when the image name changes to a new digest, the container gets recreated even though Terraform thinks it’s already at the right version.

This pattern is repeated everywhere - Vault, Consul, Nomad, clients, everything. Config changes trigger container restarts via this null_resource dance.

Authentication architecture

The authentication setup… there are a few different methods depending on what we’re talking about.

For infrastructure deployment

Gitlab CI uses JWT auth to authenticate to Vault:

Gitlab CI ──JWT──> Vault ──(AppRole)──> Consul/Nomad tokens

The JWT is signed by Gitlab and verified by Vault. This gives the CI pipeline a Vault token with specific policies that allow it to deploy services.

Each service gets its own JWT role in Vault, bound to its Gitlab project path. So company/service-X can only authenticate as service-X, and company/service-Y can only authenticate as service-Y. No shared credentials, no confusion about which deployment did what.
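
In Terraform, one of those roles might be sketched like this - the backend path and policy names are assumptions, and the claim names follow GitLab’s CI JWT:

```hcl
# Hypothetical: a Vault JWT role that only company/service-X can use.
resource "vault_jwt_auth_backend_role" "service_x" {
  backend    = "gitlab"
  role_name  = "service-x"
  role_type  = "jwt"
  user_claim = "project_path"

  bound_claims = {
    project_path = "company/service-X"
  }

  token_policies = ["service-x-deployment"]
  token_ttl      = 3600
}
```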

For runtime services

Services running in Nomad use workload identity. Nomad signs a JWT for each task that includes metadata like job ID, task name, namespace, etc.

This JWT is used to:

  • Authenticate to Vault for secrets access
  • Authenticate to Consul for service registration

The policies are templated with the JWT metadata, so each task can only access secrets under its own path:

path "service_secrets/data/global/dc1/{{identity.entity.aliases.jwt_backend.metadata.nomad_job_id}}/{{identity.entity.aliases.jwt_backend.metadata.nomad_task}}/*" {
  capabilities = ["read"]
}

This means no long-lived tokens - everything is short-lived and automatically renewed. If a task crashes and restarts, it just gets a new JWT and carries on.
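
On the consuming side, a task picks this up with an empty vault block and a template stanza; a sketch, where the job, task, image, and secret names are placeholders:

```hcl
# Hypothetical jobspec fragment: Nomad signs a JWT for the task, the
# task exchanges it for a Vault token, and the template can only read
# secrets under the task's own templated path.
job "my-service" {
  datacenters = ["dc1"]

  group "app" {
    task "web" {
      driver = "docker"

      config {
        image = "registry.example.local/my-service@sha256:ab12..."
      }

      vault {} # workload identity: no token plumbing needed

      template {
        destination = "secrets/app.env"
        env         = true
        data        = <<-EOT
          API_KEY={{ with secret "service_secrets/data/global/dc1/my-service/web/api" }}{{ .Data.data.api_key }}{{ end }}
        EOT
      }
    }
  }
}
```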

For VMs

For VMs that aren’t part of the container-based setup, I use a JWT-based bootstrap system. A single binary:

  • Retrieves a JWT from the hypervisor metadata
  • Authenticates to Vault
  • Retrieves all required secrets
  • Configures and starts Consul

This removes the need for complex vault-agent + consul-template chains on VMs. The binary does everything and then execs into the Consul process, becoming the Consul agent itself.

Service roles

The service_role module is where all of this comes together for actual deployments.

When I want to deploy a new service, I call this module once with the service name and Gitlab project path. It creates:

  • Vault AppRole for Terraform to authenticate during deployment
  • Vault policies for deployment-time and runtime access
  • Vault token roles for generating Consul and Nomad tokens
  • Consul policies for service registration and intentions
  • Nomad policies for job submission

The module outputs a complex object with everything the deployment Terraform needs:

output "service_role" {
  value = {
    vault_approle_deployment_role_id   = "..."
    vault_approle_deployment_secret_id = "..."
    vault_consul_engine_path           = "consul-dc1"
    vault_consul_role_name             = "my-service"
    vault_nomad_engine_path            = "nomad-global"
    vault_nomad_role_name              = "my-service"
    vault = {
      ca_cert = "..."
      address = "https://vault.svc.example.local:8200"
    }
    consul = {
      address              = "https://consul.svc.example.local:8501"
      datacenter           = "dc1"
      root_cert_public_key = "..."
    }
    nomad = {
      address    = "https://nomad.svc.example.local:4646"
      region     = "global"
      datacenter = "dc1"
    }
  }
}

The deployment Terraform then uses this output to configure all its providers. No hardcoded addresses, no shared credentials - each deployment gets exactly what it needs and nothing more.
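
The provider wiring comes straight off that object - roughly like this, assuming the output has been read into local.service_role:

```hcl
# Sketch: providers configured from the service_role output object.
provider "consul" {
  address    = local.service_role.consul.address
  datacenter = local.service_role.consul.datacenter
  ca_pem     = local.service_role.consul.root_cert_public_key
}

provider "nomad" {
  address = local.service_role.nomad.address
  region  = local.service_role.nomad.region
}
```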

Deployment vs Runtime permissions

The thing that makes this work is the separation between deployment-time and runtime permissions. I wrote about this in SecureVaultConsulNomadDeployments, but the basic idea is:

Deployment permissions (what Terraform needs):

  • Create/update Nomad jobs
  • Register Consul services and intentions
  • Write application secrets to Vault
  • Generate short-lived tokens for the application to use at runtime

Runtime permissions (what the application needs):

  • Read its own secrets from Vault
  • Register its service in Consul
  • Renew its own certificates and tokens

These are completely different. The deployment Terraform needs broad permissions to set everything up, but the application should only be able to access its own stuff.
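
As Vault policies the contrast is stark; a sketch using the path convention from earlier, with the service name as a placeholder:

```hcl
# Deployment policy: Terraform writes the service's secrets.
path "service_secrets/data/global/dc1/my-service/*" {
  capabilities = ["create", "update", "read", "delete"]
}

# Runtime policy: the application only reads its own subtree.
path "service_secrets/data/global/dc1/my-service/*" {
  capabilities = ["read"]
}
```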

The Gitlab JWT bootstrapping flow

The deployment starts with Gitlab CI authenticating to Vault using JWT. Gitlab signs a JWT for each job, Vault verifies it against Gitlab’s public keys, and boom - a Vault token with specific policies.

I use the project_path claim for authentication rather than project_id. The reason is readability… I can see at a glance that “company/service-X” is the correct binding for the service-X role; project_id = 5231, on the other hand, tells me nothing. Plus all these projects are internal, each owner understands the importance of their project name/path, and a project path can’t be stolen unless the original project changes its name.

Once authenticated, the deployment Terraform uses a Vault secret that contains all the configuration it needs:

data "vault_kv_secret_v2" "config" {
  mount = "deployment_secrets_kv"
  name  = "konvad/services/global/dc1/my-service"
}

locals {
  config = merge(
    data.vault_kv_secret_v2.config.data,
    {
      "consul" = jsondecode(data.vault_kv_secret_v2.config.data.consul)
      "nomad"  = jsondecode(data.vault_kv_secret_v2.config.data.nomad)
      "vault"  = jsondecode(data.vault_kv_secret_v2.config.data.vault)
    }
  )
}

This secret contains everything - Vault addresses, Consul endpoints, role names, policy names, domain names. The deployment Terraform doesn’t need any hardcoded values. It just reads this secret and configures its providers.

Granular permissions

Each service gets its own set of policies and roles in Vault, Consul, and Nomad:

Vault:

  • deployment_policy - for Terraform to deploy the service
  • application_policy - for the application to read its secrets
  • token_role - to generate short-lived tokens for the application

Consul:

  • deployment_policy - for Terraform to register services and intentions
  • application_policy - for the application to register itself

Nomad:

  • deployment_policy - for Terraform to submit jobs
  • application_policy - for workload identity

The Harbor container registry gets a robot account for each service too, so Terraform can pull images during deployment without needing a shared credential.

This granularity means that if one service is compromised, the blast radius is limited to that service’s secrets and registration. It can’t touch anything else.

TLS everywhere

Every component communicates over TLS with proper certificate verification. No “trust me bro” HTTP traffic anywhere.

The PKI setup works like this:

  1. There’s an offline root CA (air-gapped, stored… somewhere safe)
  2. Vault has a PKI mount that generates an intermediate CA
  3. This intermediate CA is signed by the root CA (via the vault-adm provider)
  4. All services use this intermediate CA to issue their certificates

So the hierarchy is more like:

┌────────────────────────────────────────┐
│     Offline Root CA (air-gapped)       │
└──────────────────┬─────────────────────┘
                   │
        ┌──────────▼───────────┐
        │ Vault Intermediate CA│
        │   (signed by root)   │
        └──────────┬───────────┘
                   │
     ┌─────────────┼─────────────┐
     │             │             │
┌────▼────┐  ┌────▼────┐  ┌────▼────┐
│ Vault   │  │ Consul  │  │ Nomad   │
│ certs   │  │ certs   │  │ certs   │
└─────────┘  └─────────┘  └─────────┘

Traefik PKI roles

For the Nomad client clusters, there’s a Traefik PKI role that allows generating wildcard certificates for specific domains. This means services running on that cluster can get certificates for things like *.example.local without needing individual certs for every service.

The role is scoped to specific domains that are allowed for that cluster, so one cluster can’t generate certs for another cluster’s services.
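
A sketch of such a role - the mount path, role name, domains, and TTL are assumptions:

```hcl
# Hypothetical: a PKI role for one cluster's Traefik, scoped so it can
# only issue wildcard certs under that cluster's domains.
resource "vault_pki_secret_backend_role" "traefik" {
  backend            = "pki"
  name               = "traefik-cluster-a"
  allowed_domains    = ["example.local"]
  allow_subdomains   = true
  allow_bare_domains = false
  max_ttl            = "720h"
}
```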

Each service gets its own intermediate CA signed by the root. This means:

  • Compromise of one service’s CA doesn’t affect others
  • Each service can have its own certificate policies
  • Revocation is scoped to the affected service

Each service’s certificates include:

  • The service’s main hostname
  • localhost for local connections
  • The service’s IP address
  • Any relevant alt names (like the datacenter domain)

Certificates are passed into containers as environment variables, which is… not ideal. I know, I know. But it works for now and honestly, trying to do volume mounts for certs in Terraform was more pain than it was worth.

The important thing is that every service verifies every other service’s certificates. No skip_verify=true anywhere. If a cert is wrong, things just don’t work. And that’s how it should be.

Terraform module patterns

There are a few patterns in the Terraform that make this whole thing manageable.

Complex output objects

Instead of outputting individual values, I group related outputs into objects. So I can pass one big object instead of wiring up half a dozen variables every time.

output "vault_cluster" {
  description = "Vault cluster configuration"
  value = {
    address                      = local.vault_cluster_address
    ca_cert                      = var.ca_cert
    pki_mount_path               = vault_mount.pki.path
    consul_static_mount_path     = vault_mount.consul_static.path
    service_secrets_mount_path   = vault_mount.service_secrets.path
    gitlab_jwt_auth_backend_path = vault_jwt_auth_backend.gitlab.path
  }
}

When the next stage needs to use Vault, it just passes vault_cluster as one object. Makes the module calls cleaner… and I can add new outputs without breaking every downstream module.
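
On the consuming side there’s just one variable, kept loosely typed so new upstream outputs don’t break existing callers - a sketch:

```hcl
# Hypothetical: one object variable instead of six scalars. Using `any`
# (or object(...) with optional attributes) keeps it forward-compatible
# when new outputs are added upstream.
variable "vault_cluster" {
  description = "Vault cluster configuration from vault_configure"
  type        = any
}
```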

Module composition

Each layer is composed of smaller modules. The consul_server module isn’t one monolith:

module "consul_server" {
  source = "../../../modules/consul/server"

  datacenter    = module.dc1
  vault_cluster = data.terraform_remote_state.vault_configure.outputs.vault_cluster
  root_cert     = module.consul_certificate_authority

  docker_images = data.terraform_remote_state.images.outputs
  docker_host   = local.hosts["banana"]
}

And consul_server itself is composed of:

  • container - the actual Docker container
  • image - the image build
  • vault_approle - the AppRole for consul-template

I can test and reuse pieces independently. If I need a Consul client somewhere else, I just use the consul/client module without having to think about how it works internally.

Provider passing

Different modules need to interact with different Docker hosts, so I use provider passing:

module "vault_node" {
  source = "../../../modules/vault/node"

  docker_host = var.docker_host

  providers = {
    docker          = docker.vault
    vault.vault-adm = vault.vault-adm
  }
}

Each host gets its own Docker provider configured with the right SSH keys and bastion settings. The module doesn’t need to know HOW to connect to the host - it just uses the docker provider it was handed and assumes it’s configured correctly.

I could use the same vault/node module to deploy to a completely different set of hosts just by changing the provider configuration.
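
The per-host providers themselves are plain SSH connections - hostnames, user, and key paths here are placeholders:

```hcl
# Sketch: one Docker provider per host, each with its own SSH/bastion
# settings. Modules receive these via the providers block.
provider "docker" {
  alias    = "vault"
  host     = "ssh://deploy@banana.example.local:22"
  ssh_opts = ["-i", "~/.ssh/id_ed25519", "-o", "ProxyJump=deploy@bastion.example.local"]
}
```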

The Docker-based approach

Everything runs in Docker containers on the hosts. Has pros and cons.

Pros

  • Container dies? recreate it. No messy package installs to clean up
  • Services don’t step on each other. Mostly.
  • Images are built once and referenced by hash. No “what version is running where?”

Cons

  • Using host network mode to avoid overlay networks. This works but feels wrong sometimes.
  • Lots of bind mounts for persistence. I’ve lost track of what’s mounted where more than once.
  • Sometimes harder to get inside containers when things go wrong

The host network thing… it avoids having to deal with Docker overlay networks and makes TLS certificates simpler. But yeah, everything shares the host network namespace, which isn’t ideal from a security perspective.

For a homelab though? It’s fine. For production? I’d probably think harder about it.

What I’d do differently

There are a few things I’d reconsider:

  • Environment variables for certs - should use volume mounts instead. Passing certs as env vars feels wrong even if it works.
  • The number of modules - some are probably too granular. Do I really need separate modules for token vs policy? Probably not.
  • The replace_triggered_by pattern is a bit hacky. But it works, and the alternatives are worse.
  • Having 15+ separate Terraform states is annoying when I need to plan/apply them in order. But keeping them separate has saved me more than once when one stage broke.

But honestly? The stack works. It’s reproducible, it’s secure, and it follows a clear dependency chain. I can rebuild the entire thing from scratch in a few hours if I need to. That’s not nothing.

Summary

So that’s konvad-stack… a Terraform-based deployment of Vault, Consul and Nomad using Docker containers on self-hosted hardware.

The patterns that make it work:

  • Docker containers for easy management
  • null_resource + file provisioner for config deployment
  • replace_triggered_by for container restarts on config changes
  • Complex output objects to reduce module coupling
  • Vault agent + consul-template sidecars for dynamic configuration
  • JWT-based authentication for both CI/CD and workloads
  • TLS everywhere with a proper PKI hierarchy

It’s not perfect, but it’s mine, and it actually works. Which is more than I can say for some of the infrastructure I’ve dealt with over the years…