Get started with HashiCorp Nomad & Consul

From 0 to 100 with HashiCorp's Nomad & Consul, including initial server setup, load balancing, service connection and much more.

Roman Zipp, April 8th, 2023

About this Guide

This post will guide you through the initial setup of Nomad, Consul and Vault.

I will also cover some common additional steps for:

  • AWS CLI (for ECR)

  • Docker (+ Authentication)

1. Prerequisites

CNI Bridge

Nomad uses CNI plugins to configure the network namespace used to secure the Consul service mesh sidecar proxy. All Nomad client nodes using network namespaces must have CNI plugins installed. See the Consul CNI Docs for more information.

See the Nomad Install Docs for more information.

curl -L -o cni-plugins.tgz "https://github.com/containernetworking/plugins/releases/download/v1.0.0/cni-plugins-linux-$( [ $(uname -m) = aarch64 ] && echo arm64 || echo amd64)"-v1.0.0.tgz && \
  sudo mkdir -p /opt/cni/bin && \
  sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
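The architecture switch embedded in the download URL is easier to read and extend when pulled out into a small function. This is only a sketch: `cni_arch`, `CNI_VERSION` and `CNI_URL` are names I made up for illustration, and the function keeps the behaviour of the one-liner above, where anything other than aarch64 is assumed to be amd64.

```shell
# Hypothetical helper: map `uname -m` output to the suffix used in the
# CNI plugin release file names. Mirrors the one-liner above: aarch64
# becomes arm64, everything else is assumed to be amd64.
cni_arch() {
  case "$1" in
    aarch64) echo arm64 ;;
    *)       echo amd64 ;;
  esac
}

CNI_VERSION="v1.0.0"
CNI_URL="https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-$(cni_arch "$(uname -m)")-${CNI_VERSION}.tgz"

# Print the URL that would be passed to curl.
echo "$CNI_URL"
```

Adding another architecture then only means adding one more `case` branch.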

Configure environment values

This script sets the environment variables NOMAD_ADDR, VAULT_ADDR and CONSUL_HTTP_ADDR so we can run CLI commands without passing the address and port every time.

PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)

echo -e "\nexport NOMAD_ADDR=http://$PRIVATE_IP:4646" >> /root/.bashrc
echo -e "export VAULT_ADDR=https://$PRIVATE_IP:8200" >> /root/.bashrc
echo -e "export CONSUL_HTTP_ADDR=http://$PRIVATE_IP:8500" >> /root/.bashrc
source /root/.bashrc
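Because the script appends with `>>`, running it twice leaves duplicate exports in /root/.bashrc. A minimal sketch of an idempotent variant follows; `append_once` is a hypothetical helper name, and the demo writes to a scratch file with an example IP rather than touching the real .bashrc.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup does not pile up
# duplicate exports.
append_once() {
  file="$1"
  line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file instead of /root/.bashrc.
demo_rc="$(mktemp)"
append_once "$demo_rc" "export NOMAD_ADDR=http://10.1.10.1:4646"
append_once "$demo_rc" "export NOMAD_ADDR=http://10.1.10.1:4646"

# Shows how many lines ended up in the file after two identical calls.
wc -l < "$demo_rc"
```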

2. Installation

Get private interface IP

You need to obtain the private IP of your chosen network interface. Check if the command below fits your needs or set the IP address manually.

PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)

echo $PRIVATE_IP
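If the interface name does not match, the pipeline above silently yields an empty string. The following sketch adds a sanity check before the value is used anywhere else. The helper names `extract_ipv4` and `is_ipv4` are my own, the sample line is an assumed example of `ip -o -4 addr` output, and taking only the first address via `head -n 1` is an addition for interfaces that carry more than one.

```shell
# Hypothetical helper: pull the address out of `ip -o -4 addr` output
# read from stdin, same awk/cut steps as above, first match only.
extract_ipv4() {
  awk '{print $4}' | cut -d/ -f1 | head -n 1
}

# Hypothetical helper: succeed only for a dotted-quad string.
# Checks the shape only, not that each octet is <= 255.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Assumed sample of what `ip -o -4 addr list ens18` prints.
sample='2: ens18    inet 10.1.10.1/24 brd 10.1.10.255 scope global ens18'
PRIVATE_IP_DEMO="$(printf '%s\n' "$sample" | extract_ipv4)"

if is_ipv4 "$PRIVATE_IP_DEMO"; then
  echo "using $PRIVATE_IP_DEMO"
else
  echo "no IPv4 address found, set PRIVATE_IP manually" >&2
fi
```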

Install required packages

apt-get update && apt-get upgrade -y
apt-get install -y curl wget gpg gnupg coreutils ca-certificates lsb-release


# AWS CLI
apt-get install -y awscli amazon-ecr-credential-helper

Install Docker

See the official Docker install guide for more information.

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null

apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io

Add HashiCorp's APT repository

See the official install guide if you prefer to use the prebuilt binary.

wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list

Install Nomad, Consul & Vault

apt-get update
apt-get install -y nomad consul vault

systemctl enable nomad
systemctl enable consul
systemctl enable vault
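Before moving on to configuration, it can help to confirm that every binary actually landed on the PATH. This is a small sketch, not part of the official install steps; `missing_bins` is a hypothetical helper name.

```shell
# Hypothetical helper: print every command from the argument list that
# is not found on PATH; prints nothing when all are present.
missing_bins() {
  for bin in "$@"; do
    command -v "$bin" >/dev/null 2>&1 || echo "$bin"
  done
}

missing="$(missing_bins nomad consul vault docker)"
if [ -n "$missing" ]; then
  echo "not installed:" $missing >&2
else
  echo "all binaries found"
fi
```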

3. Configuration

3.1 Docker AWS ECR Authentication

We will create a new /etc/docker/config.json file to provide Nomad with our Docker login credentials. Replace <aws_id> and <aws_region> with your own values.

mkdir -p /etc/docker

cat <<EOT > /etc/docker/config.json
{
    "credHelpers": {
        "public.ecr.aws": "ecr-login",
        "<aws_id>.dkr.ecr.<aws_region>.amazonaws.com": "ecr-login"
    }
}
EOT
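If you provision several machines, filling in the placeholders by hand gets tedious. The sketch below renders the same JSON from two arguments. `render_docker_config` is a hypothetical helper, and the account ID and region in the demo call are made-up placeholders.

```shell
# Hypothetical helper: print the Docker credential-helper config for
# the given AWS account ID ($1) and region ($2).
render_docker_config() {
  aws_id="$1"
  aws_region="$2"
  cat <<EOT
{
    "credHelpers": {
        "public.ecr.aws": "ecr-login",
        "${aws_id}.dkr.ecr.${aws_region}.amazonaws.com": "ecr-login"
    }
}
EOT
}

# Placeholder account ID and region, replace with your own values.
render_docker_config 123456789012 eu-central-1
```

To write the file, redirect the output, for example `render_docker_config <aws_id> <aws_region> > /etc/docker/config.json`.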


3.2 Nomad

Before getting started with Nomad instances, we need to configure some environment values in /etc/nomad.d/nomad.env

mkdir -p /etc/nomad.d

cat <<EOT >> /etc/nomad.d/nomad.env
AWS_ACCESS_KEY_ID=******
AWS_SECRET_ACCESS_KEY=******
AWS_DEFAULT_REGION=<aws_region>

VAULT_ADDR=http://127.0.0.1:8200
VAULT_TOKEN=

CONSUL_HTTP_ADDR=$PRIVATE_IP:8500
CONSUL_CACERT=/etc/consul.d/certs/consul-agent-ca.pem
CONSUL_CLIENT_CERT=/etc/consul.d/certs/dc1-server-consul.pem
CONSUL_CLIENT_KEY=/etc/consul.d/certs/dc1-server-consul-key.pem
CONSUL_HTTP_SSL=false
EOT
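Note that the heredoc above uses an unquoted EOT delimiter, so the shell expands $PRIVATE_IP at the moment the file is written. If the variable is empty in the current shell, the file ends up with CONSUL_HTTP_ADDR=:8500. A minimal guard could look like this; `consul_addr_line` is a hypothetical helper and the IP is an example value.

```shell
# Hypothetical helper: print the Consul address line for nomad.env, or
# fail loudly when no IP was passed in.
consul_addr_line() {
  if [ -z "$1" ]; then
    echo "PRIVATE_IP is empty, set it before writing nomad.env" >&2
    return 1
  fi
  printf 'CONSUL_HTTP_ADDR=%s:8500\n' "$1"
}

# With an address the line is rendered.
consul_addr_line "10.1.10.1"

# With an empty value the helper returns non-zero instead.
consul_addr_line "" || echo "refusing to write an empty address"
```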

We will now configure each Nomad instance as a client, a server, or both. Place the matching configuration below in /etc/nomad.d/nomad.hcl.

rm -f /etc/nomad.d/nomad.hcl && nano /etc/nomad.d/nomad.hcl

Nomad Client

datacenter = "dc1"
data_dir   = "/opt/nomad"
bind_addr  = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"

server {
  enabled = false
}

client {
  enabled = true
  network_interface = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"name\" }}"

  template {
    disable_file_sandbox = true
  }
}

consul {}

plugin "docker" {
  config {
    volumes {
      enabled = true
    }
    auth {
      config = "/etc/docker/config.json"
    }
  }
}

Nomad Client & Server

datacenter = "dc1"
data_dir   = "/opt/nomad"
bind_addr  = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"

server {
  enabled          = true
  bootstrap_expect = 3
}

client {
  enabled = true
  network_interface = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"name\" }}"

  template {
    disable_file_sandbox = true
  }
}

consul {}

plugin "docker" {
  config {
    volumes {
      enabled = true
    }
    auth {
      config = "/etc/docker/config.json"
    }
  }
}

Some values explained

  • The server.bootstrap_expect value defines how many Nomad server instances must be running before the cluster bootstraps and elects a leader.

  • The client.template.disable_file_sandbox option allows templates to read host files from outside the task directory, which lets you bring host files into your job allocs.
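As background for choosing bootstrap_expect: Nomad servers use Raft consensus, where a majority of servers (the quorum) must be reachable to elect a leader. The arithmetic below is general Raft math rather than anything specific to this setup, and the function names are my own.

```shell
# Quorum for a Raft cluster of N servers: floor(N / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a quorum remains.
failure_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(failure_tolerance "$n")"
done
```

With the three servers used in this guide, the quorum is two, so the cluster survives the loss of one server.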

3.3 Consul

Setup TLS & encryption (optional)

To enable internal TLS encryption we need to generate a certificate using the following commands. See the Consul TLS docs for more information.

mkdir -p /etc/consul.d/certs && cd /etc/consul.d/certs

consul keygen
# UyaZRVMUdoNinDtEDxMZFiqpQmjbsIQXUeGYDWgi=

consul tls ca create -domain consul
# ==> Saved consul-agent-ca.pem
# ==> Saved consul-agent-ca-key.pem

You now need to distribute the generated consul-agent-ca.pem certificate to all Consul agents and place it in /etc/consul.d/certs/consul-agent-ca.pem.

Generate agent certificates

On the Consul server that holds the CA key, generate an agent certificate for each Consul agent you want to deploy.

consul tls cert create -server -dc dc1 -domain consul
# ==> WARNING: Server Certificates grants authority to become a
#     server and access all state in the cluster including root keys
#     and all ACL tokens. Do not distribute them to production hosts
#     that are not server nodes. Store them as securely as CA keys.
# ==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
# ==> Saved dc1-server-consul-0.pem
# ==> Saved dc1-server-consul-0-key.pem

Distribute your agent certificate to the respective server

scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1-key.pem .
scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1.pem .

Configure each agent

Again, choose which servers will be used as client or server for your Consul instances.

mkdir -p /etc/consul.d/certs
chown -R consul:consul /etc/consul.d
chown -R consul:consul /opt/consul

rm -f /etc/consul.d/consul.hcl && nano /etc/consul.d/consul.hcl
chmod 640 /etc/consul.d/consul.hcl

# Paste the Consul certificate and key into:
#   /etc/consul.d/certs/dc1-server-consul.pem
#   /etc/consul.d/certs/dc1-server-consul-key.pem

# Bootstrap ACL
consul acl bootstrap

Consul Client

datacenter = "dc1"
data_dir   = "/opt/consul"

bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}"

retry_join = ["<add all of your Consul client & server IP addresses>"] # also include this server's IP

ca_file   = "/etc/consul.d/certs/consul-agent-ca.pem"
cert_file = "/etc/consul.d/certs/dc1-server-consul.pem"
key_file  = "/etc/consul.d/certs/dc1-server-consul-key.pem"

tls {
  grpc {
    use_auto_cert = false
  }
}

ports {
  grpc     = 8502
  grpc_tls = -1
}

connect {
  enabled = true
}

dns_config {
  allow_stale   = true
  node_ttl      = "5s"
  use_cache     = true
  cache_max_age = "5s"
}

log_level  = "info"

Consul Server

datacenter = "dc1"
data_dir   = "/opt/consul"

server           = true
bootstrap_expect = 3    # your server count

bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}"

retry_join = ["<add all of your Consul client & server IP addresses>"] # also include this server's IP

ca_file   = "/etc/consul.d/certs/consul-agent-ca.pem"
cert_file = "/etc/consul.d/certs/dc1-server-consul-0.pem"
key_file  = "/etc/consul.d/certs/dc1-server-consul-0-key.pem"

ui_config {
  enabled = true
}

acl {
  enabled                  = true
  default_policy           = "allow" # change this
  enable_token_persistence = true
}

tls {
  grpc {
    use_auto_cert = false
  }
}

ports {
  grpc     = 8502
  grpc_tls = -1
}

connect {
  enabled = true
}

dns_config {
  allow_stale   = true
  node_ttl      = "5s"
  use_cache     = true
  cache_max_age = "5s"
}

log_level  = "info"
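The retry_join placeholder in both configs expects a quoted, comma-separated list of addresses. If you keep your node IPs in a script anyway, a sketch like the following can render that line. `retry_join_line` is a hypothetical helper, and the three addresses are example values.

```shell
# Hypothetical helper: turn a list of IPs into an HCL retry_join line.
retry_join_line() {
  list=""
  for ip in "$@"; do
    # Prefix with ", " only when the list already has an entry.
    list="${list:+$list, }\"$ip\""
  done
  printf 'retry_join = [%s]\n' "$list"
}

# Example addresses, replace with your own client and server IPs.
retry_join_line 10.1.10.1 10.1.10.2 10.1.10.3
```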

4. Cheat-Sheet

Here are some handy commands I commonly use for debugging.

# nomad cleanup allocation history & summary
nomad system gc
nomad system reconcile summaries

# stream nomad agent logs
nomad monitor

# attach shell to job
nomad alloc exec -task=<task> <alloc> /bin/bash

# show open ports
lsof -i -P -n | grep LISTEN
ss -tulpn

# test tcp connection
nc -z -v -w 2 <host> <port>

# test consul dns
dig @127.0.0.1 -p 8600 _<service-name>._tcp.service.consul

# query container from inside
curl -H "Host: domain.tld" 10.1.10.1:21021

# query some endpoint
curl -H "Host: domain.tld" -X POST <host>/api

# list service instances "address:port"
curl -s http://127.0.0.1:8500/v1/catalog/service/<service-name>|jq -j '.[] | .ServiceAddress,":",.ServicePort,"\n"'

# consul filtering
curl --get http://127.0.0.1:8500/v1/agent/services --data-urlencode 'filter=Service == "<service-name>"'|jq -j '.[] | .Address,":",.Port,"\n"'