Configuration

Using .env file to load required environment variables

Kamal uses dotenv to automatically load environment variables set in the .env file present in the application root. This file can be used to set variables like KAMAL_REGISTRY_PASSWORD or database passwords. Because it holds secrets, you must ensure that .env files are not checked into Git or included in your Dockerfile! The format is plain key-value pairs, one per line:

KAMAL_REGISTRY_PASSWORD=pw
DB_PASSWORD=secret123
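
To make the key-value format concrete, here is a minimal sketch of how such lines map to variables. It is a simplified illustration, not the dotenv gem's actual parser (which also handles quoting and interpolation); parse_env is a hypothetical helper name.

```ruby
# Simplified illustration only: the real dotenv gem handles quoting,
# export prefixes, and interpolation that this sketch ignores.
def parse_env(text)
  text.each_line.with_object({}) do |line, vars|
    line = line.strip
    next if line.empty? || line.start_with?("#") # skip blanks and comments

    key, value = line.split("=", 2) # split on the first "=" only
    vars[key] = value.to_s
  end
end

vars = parse_env("KAMAL_REGISTRY_PASSWORD=pw\nDB_PASSWORD=secret123\n")
puts vars.inspect
```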

Using a generated .env file

1Password as a secret store

If you’re using a centralized secret store, like 1Password, you can create .env.erb as a template which looks up the secrets. Example of a .env.erb file:

<% if (session_token = `op signin --account my-one-password-account --raw`.strip) != "" %># Generated by kamal envify
GITHUB_TOKEN=<%= `gh config get -h github.com oauth_token`.strip %>
KAMAL_REGISTRY_PASSWORD=<%= `op read "op://Vault/Docker Hub/password" -n --session #{session_token}` %>
RAILS_MASTER_KEY=<%= `op read "op://Vault/My App/RAILS_MASTER_SECRET" -n --session #{session_token}` %>
MYSQL_ROOT_PASSWORD=<%= `op read "op://Vault/My App/MYSQL_ROOT_PASSWORD" -n --session #{session_token}` %>
<% else raise ArgumentError, "Session token missing" end %>

This template can safely be checked into Git. Everyone deploying the app can then run kamal envify when they set up the app for the first time, or when passwords change, to get the correct .env file.
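
Conceptually, generating the .env file amounts to rendering the ERB template to text. The sketch below uses Ruby's standard erb library to show the idea; it is not Kamal's implementation. The lookup method and SECRETS hash are hypothetical stand-ins for a secret-store CLI call, so the example runs without one.

```ruby
require "erb"

# Hypothetical stand-in for a secret-store CLI such as `op read`;
# a real template would shell out with backticks instead.
SECRETS = { "registry" => "pw", "rails" => "abc123" }.freeze

def lookup(name)
  SECRETS.fetch(name)
end

TEMPLATE = <<~ERB_TEMPLATE
  # Generated from a template
  KAMAL_REGISTRY_PASSWORD=<%= lookup("registry") %>
  RAILS_MASTER_KEY=<%= lookup("rails") %>
ERB_TEMPLATE

# Evaluate the embedded Ruby and return the resulting .env text.
def render_env(template)
  ERB.new(template).result(binding)
end

puts render_env(TEMPLATE)
```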

If you need separate env variables for different destinations, create a template per destination named .env.<destination>.erb. For example, .env.staging.erb will generate .env.staging when run with kamal envify -d staging.

Note: If you use biometrics with 1Password, you can remove the session_token-related parts in the example and just call op read "op://Vault/Docker Hub/password" -n.

Bitwarden as a secret store

If you are using an open-source secret store like Bitwarden, you can create .env.erb as a template which looks up the secrets.

You can store SOME_SECRET in a secure note in your Bitwarden vault:

$ bw list items --search SOME_SECRET | jq
? Master password: [hidden]

[
  {
    "object": "item",
    "id": "123123123-1232-4224-222f-234234234234",
    "organizationId": null,
    "folderId": null,
    "type": 2,
    "reprompt": 0,
    "name": "SOME_SECRET",
    "notes": "yyy",
    "favorite": false,
    "secureNote": {
      "type": 0
    },
    "collectionIds": [],
    "revisionDate": "2023-02-28T23:54:47.868Z",
    "creationDate": "2022-11-07T03:16:05.828Z",
    "deletedDate": null
  }
]

… then extract the id of SOME_SECRET from the JSON above and use it in the ERB template below.

Example .env.erb file:

<% if (session_token=`bw unlock --raw`.strip) != "" %># Generated by kamal envify
SOME_SECRET=<%= `bw get notes 123123123-1232-4224-222f-234234234234 --session #{session_token}` %>
<% else raise ArgumentError, "Session token missing" end %>

Then everyone deploying the app can run kamal envify, and Kamal will generate the .env file.

Configuring the run directory

Kamal needs to create files on the host for locking and audit logs.

By default these will be created in the .kamal subdirectory of the default SSH directory.

This can be changed with the run_directory option:

run_directory: /var/run/kamal

Using another registry than Docker Hub

The default registry is Docker Hub, but you can change it using registry/server:

registry:
  server: registry.digitalocean.com
  username:
    - DOCKER_REGISTRY_TOKEN
  password:
    - DOCKER_REGISTRY_TOKEN

A reference to secret DOCKER_REGISTRY_TOKEN will look for ENV["DOCKER_REGISTRY_TOKEN"] on the machine running Kamal.

Using AWS ECR as the container registry

AWS ECR’s access token is only valid for 12 hours. To avoid regenerating the token manually every time, you can use ERB in the deploy.yml file to shell out to the aws CLI and obtain the token:

registry:
  server: <your aws account id>.dkr.ecr.<your aws region id>.amazonaws.com
  username: AWS
  password: <%= %x(aws ecr get-login-password) %>

You will need to have the aws CLI installed locally for this to work.
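
The order of operations matters here: the ERB tags are evaluated first, and only the resulting text is parsed as YAML. The following sketch shows that two-step flow with Ruby's standard erb and yaml libraries. The fetch_token method is a hypothetical stand-in for the aws command so the example runs offline, and load_config is an illustrative name, not Kamal's API.

```ruby
require "erb"
require "yaml"

# Hypothetical stand-in for `aws ecr get-login-password`, so the sketch
# runs without the aws CLI or network access.
def fetch_token
  "example-token\n" # command output typically ends with a newline
end

CONFIG = <<~YAML_TEMPLATE
  registry:
    username: AWS
    password: <%= fetch_token %>
YAML_TEMPLATE

# Step 1: render ERB to plain text. Step 2: parse the text as YAML.
def load_config(template)
  YAML.safe_load(ERB.new(template).result(binding))
end

config = load_config(CONFIG)
puts config["registry"]["password"]
```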

Using a different SSH user than root

The default SSH user is root, but you can change it using ssh/user:

ssh:
  user: app

If you are using a non-root user (app in the example above), you need to bootstrap your servers manually before using them with Kamal. On Ubuntu, you’d do:

sudo apt update
sudo apt upgrade -y
sudo apt install -y docker.io curl git
sudo usermod -a -G docker app

Using a proxy SSH host

If you need to connect to your servers through a proxy host, you can use ssh/proxy:

ssh:
  proxy: "192.168.0.1" # defaults to root as the user

Or with a specific user:

ssh:
  proxy: "app@192.168.0.1"

If you need a specific proxy command to connect to the servers:

ssh:
  proxy_command: aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --region=us-east-1 ## ssh via aws ssm

Using a different SSH log level

ssh:
  log_level: debug

Valid levels are debug, info, warn, error and fatal (default).

Using env variables

You can inject env variables into the app containers using env:

env:
  DATABASE_URL: mysql2://db1/hey_production/
  REDIS_URL: redis://redis1:6379/1

Note: Before you can start the containers you need to push the env variables up to the servers.

Using secret env variables

If you have env variables that are secret, you can divide the env block into clear and secret:

env:
  clear:
    DATABASE_URL: mysql2://db1/hey_production/
    REDIS_URL: redis://redis1:6379/1
  secret:
    - DATABASE_PASSWORD
    - REDIS_PASSWORD

The list of secret env variables will be expanded at run time from your local machine, just like with build secrets. So a reference to a secret DATABASE_PASSWORD will look for ENV["DATABASE_PASSWORD"] on the machine running Kamal.

If the referenced secret ENVs are missing, the configuration will be halted with a KeyError exception.
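
The KeyError behaviour matches what Ruby's fetch does for a missing key. Here is a minimal sketch of that lookup; resolve_secrets is a hypothetical helper, and a plain hash is passed in place of ENV so the example is repeatable.

```ruby
# Resolve secret names against a set of environment variables.
# Hash#fetch (like ENV.fetch) raises KeyError for a missing key.
def resolve_secrets(names, env = ENV)
  names.to_h { |name| [name, env.fetch(name)] }
end

env = { "DATABASE_PASSWORD" => "s3cret" }
puts resolve_secrets(["DATABASE_PASSWORD"], env).inspect

begin
  resolve_secrets(["DATABASE_PASSWORD", "REDIS_PASSWORD"], env)
rescue KeyError => e
  puts "Halted: #{e.message}"
end
```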

Note: Marking an ENV as secret currently only redacts its value in the output for Kamal. The ENV is still injected in the clear into the container at runtime.

Using Kamal env variables

The following env variables are set when your container runs:

KAMAL_CONTAINER_NAME: contains the current container name and version

Using volumes

You can add custom volumes into the app containers using volumes:

volumes:
  - "/local/path:/container/path"

Using directories

Directories act in a similar way to volumes, except that Kamal will create the corresponding directory on the host before mounting the volume. For example:

service: kamal-demo
accessories:
  db:
    # ...
    directories:
      - data:/var/lib/mysql

will run mkdir first …

Running /usr/bin/env mkdir -p $PWD/kamal-demo-db/data

and then it will mount the volume …

docker run ... --volume $PWD/kamal-demo-db/data:/var/lib/mysql
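
The path expansion in this example can be sketched as follows. This is an illustration inferred from the output above, not Kamal's code: expand_directory is a hypothetical helper, and leaving absolute host paths untouched is an assumption.

```ruby
# Assumption: a relative host path is placed under
# $PWD/<service>-<accessory>/, as in the mkdir output above.
# Absolute host paths are assumed to be used unchanged.
def expand_directory(service, accessory, mapping)
  host_path, container_path = mapping.split(":", 2)
  unless host_path.start_with?("/")
    host_path = File.join("$PWD", "#{service}-#{accessory}", host_path)
  end
  { mkdir: "mkdir -p #{host_path}", volume: "#{host_path}:#{container_path}" }
end

result = expand_directory("kamal-demo", "db", "data:/var/lib/mysql")
puts result[:mkdir]
puts result[:volume]
```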

Using different roles for servers

If your application uses separate hosts for running jobs or other roles beyond the default web running, you can specify these hosts in a dedicated role with a new entrypoint command like so:

servers:
  web:
    - 192.168.0.1
    - 192.168.0.2
  job:
    hosts:
      - 192.168.0.3
      - 192.168.0.4
    cmd: bin/jobs

Note: By default, Traefik will only be installed and run on the servers in the web role (and on all servers if no roles are defined). If you need Traefik on hosts in roles other than web, add traefik: true:

servers:
  web:
    - 192.168.0.1
    - 192.168.0.2
  web2:
    traefik: true
    hosts:
      - 192.168.0.3
      - 192.168.0.4

Using container labels

You can specialize the default Traefik rules by setting labels on the containers that are being started:

labels:
  traefik.http.routers.hey-web.rule: Host(`app.hey.com`)

Traefik rule names are in the “service-role-destination” format. The role defaults to web if none is specified. If the destination is not specified, it is not included. For example, the above rule would become “traefik.http.routers.hey-web-staging.rule” if it was for the “staging” destination.

Note: The backticks are needed to ensure the rule is passed in correctly and not treated as command substitution by Bash!

This allows you to run multiple applications on the same server sharing the same Traefik instance and port. See doc.traefik.io for a full list of available routing rules.
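
The “service-role-destination” naming described above can be sketched as a small function. The router_rule_label name is hypothetical; only the resulting label keys come from the examples in this section.

```ruby
# Build the label key for a router rule. The destination is left out
# of the name when it is nil; the role falls back to "web".
def router_rule_label(service:, role: "web", destination: nil)
  name = [service, role, destination].compact.join("-")
  "traefik.http.routers.#{name}.rule"
end

puts router_rule_label(service: "hey")
puts router_rule_label(service: "hey", destination: "staging")
```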

The labels can also be applied on a per-role basis:

servers:
  web:
    - 192.168.0.1
    - 192.168.0.2
  job:
    hosts:
      - 192.168.0.3
      - 192.168.0.4
    cmd: bin/jobs
    labels:
      my-label: "50"

Using shell expansion

You can use shell expansion to interpolate values from the host machine into labels with the ${} syntax. Anything within the curly braces will be executed on the host machine and the result will be interpolated into the label.

labels:
  host-machine: "${cat /etc/hostname}"

Note: Any other occurrence of $ will be escaped to prevent unwanted shell expansion!
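
One way to picture this escaping rule is a regular expression that matches every $ not followed by a complete {...} group. The sketch below assumes that "escaped" means prefixing the dollar sign with a backslash; escape_dollars and BARE_DOLLAR are hypothetical names, and this is not necessarily how Kamal implements it.

```ruby
# Matches a dollar sign that does NOT start a ${...} group.
BARE_DOLLAR = /\$(?!\{[^}]*\})/

# Assumption: escaping means putting a backslash before the dollar sign.
def escape_dollars(value)
  value.gsub(BARE_DOLLAR) { "\\$" }
end

puts escape_dollars("${cat /etc/hostname}") # left untouched
puts escape_dollars("cost: $5")             # dollar sign gets a backslash
```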

Using container options

You can specialize the options used to start containers using the options definitions:

servers:
  web:
    - 192.168.0.1
    - 192.168.0.2
  job:
    hosts:
      - 192.168.0.3
      - 192.168.0.4
    cmd: bin/jobs
    options:
      cap-add: true
      cpu-count: 4

That’ll start the job containers with docker run ... --cap-add --cpu-count 4 ....
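
The mapping from option keys to docker run flags can be sketched like this. It is an illustration of the behaviour described above, not Kamal's code: docker_options is a hypothetical name, and repeating the flag for list values is an assumption.

```ruby
# true becomes a bare flag, scalars become "--key value".
# Assumption: a list value repeats the flag once per item.
def docker_options(options)
  options.flat_map do |key, value|
    flag = "--#{key}"
    case value
    when true then [flag]
    when Array then value.flat_map { |item| [flag, item.to_s] }
    else [flag, value.to_s]
    end
  end
end

puts docker_options({ "cap-add" => true, "cpu-count" => 4 }).join(" ")
# prints: --cap-add --cpu-count 4
```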

Setting a minimum version

You can set the minimum Kamal version with:

minimum_version: 0.13.3

Note: versions <= 0.13.2 will ignore this setting.
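
A version check of this kind has to compare versions segment by segment rather than as plain strings (as strings, "0.9.0" sorts after "0.13.3"). Here is a minimal sketch using Gem::Version from RubyGems; version_satisfied? is a hypothetical helper, not Kamal's own check.

```ruby
require "rubygems" # provides Gem::Version; loaded by default in Ruby

# True when the running version is at least the configured minimum.
def version_satisfied?(current, minimum)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

puts version_satisfied?("0.15.0", "0.13.3")
puts version_satisfied?("0.9.0", "0.13.3")
```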

Configuring logging

You can configure the logging driver and options passed to Docker using logging:

logging:
  driver: awslogs
  options:
    awslogs-region: "eu-central-2"
    awslogs-group: "my-app"

If nothing is configured, the default option max-size=10m is used for all containers. The default logging driver of Docker is json-file.

Using a different stop wait time

On a new deploy, each old running container is gracefully shut down with a SIGTERM, and after a grace period of 10 seconds a SIGKILL is sent. You can configure this value via the stop_wait_time option:

stop_wait_time: 30

Using remote builder for native multi-arch

If you’re developing on ARM64 (like Apple Silicon), but you want to deploy on AMD64 (x86 64-bit), you can use multi-architecture images. By default, Kamal will set up a local buildx configuration that does this through QEMU emulation. But this can be quite slow, especially on the first build.

If you want to speed up this process by using a remote AMD64 host to natively build the AMD64 part of the image, while natively building the ARM64 part locally, you can do so using builder options:

builder:
  local:
    arch: arm64
    host: unix:///Users/<%= `whoami`.strip %>/.docker/run/docker.sock
  remote:
    arch: amd64
    host: ssh://app@192.168.0.1

Note: You must have Docker running on the remote host being used as a builder. This instance should only be shared for builds using the same registry and credentials.

Using remote builder for single-arch

If you’re developing on ARM64 (like Apple Silicon), want to deploy on AMD64 (x86 64-bit), but don’t need to run the image locally (or on other ARM64 hosts), you can configure a remote builder that just targets AMD64. This is a bit faster than building with multi-arch, as there’s nothing to build locally.

builder:
  remote:
    arch: amd64
    host: ssh://app@192.168.0.1

Using native builder when multi-arch isn’t needed

If you’re developing on the same architecture as the one you’re deploying on, you can speed up the build by forgoing both multi-arch and remote building:

builder:
  multiarch: false

This is also a good option if you’re running Kamal from a CI server that shares architecture with the deployment servers.

Using a different Dockerfile or context when building

If you need to pass a different Dockerfile or context to the build command (e.g. if you’re using a monorepo or you have different Dockerfiles), you can do so in the builder options:

# Use a different Dockerfile
builder:
  dockerfile: Dockerfile.xyz

# Set context
builder:
  context: ".."

# Set Dockerfile and context
builder:
  dockerfile: "../Dockerfile.xyz"
  context: ".."

Using multistage builder cache

Docker multistage build cache can speed up your builds considerably. Currently Kamal only supports using the GHA cache or the Registry cache:

# Using GHA cache
builder:
  cache:
    type: gha

# Using Registry cache
builder:
  cache:
    type: registry

# Using Registry cache with different cache image
builder:
  cache:
    type: registry
    # default image name is <image>-build-cache
    image: application-cache-image

# Using Registry cache with additional cache-to options
builder:
  cache:
    type: registry
    options: mode=max,image-manifest=true,oci-mediatypes=true

For further insights into build cache optimization, check out the documentation on Docker’s official website: https://docs.docker.com/build/cache/.

Using build secrets for new images

Some images need a secret passed in during build time, like a GITHUB_TOKEN, to give access to private gem repositories. This can be done by having the secret in ENV, then referencing it in the builder configuration:

builder:
  secrets:
    - GITHUB_TOKEN

This build secret can then be referenced in the Dockerfile:

# Copy Gemfiles
COPY Gemfile Gemfile.lock ./

# Install dependencies, including private repositories via access token (then remove bundle cache with exposed GITHUB_TOKEN)
RUN --mount=type=secret,id=GITHUB_TOKEN \
  BUNDLE_GITHUB__COM=x-access-token:$(cat /run/secrets/GITHUB_TOKEN) \
  bundle install && \
  rm -rf /usr/local/bundle/cache

Traefik command arguments

Customize the Traefik command line using args:

traefik:
  args:
    accesslog: true
    accesslog.format: json

This starts the Traefik container with --accesslog=true --accesslog.format=json arguments.
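
Unlike container options, these arguments are joined with an equals sign. A minimal sketch of that conversion, using a hypothetical traefik_args helper:

```ruby
# Each key/value pair becomes a single "--key=value" argument.
def traefik_args(args)
  args.map { |key, value| "--#{key}=#{value}" }
end

puts traefik_args({ "accesslog" => true, "accesslog.format" => "json" }).join(" ")
# prints: --accesslog=true --accesslog.format=json
```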

Traefik host port binding

Traefik binds to port 80 by default. Specify an alternative port using host_port:

traefik:
  host_port: 8080

Alternatively, set publish to false to prevent binding to a host port. This can be useful if you are running Traefik behind a reverse proxy, for example:

traefik:
  publish: false

Traefik version, upgrades, and custom images

Kamal runs the traefik:v2.9 image to track Traefik 2.9.x releases.

To pin Traefik to a specific version or an image published to your registry, specify image:

traefik:
  image: traefik:v2.10.0-rc1

This is useful for downgrading Traefik if there’s an unexpected breaking change in a minor version release, upgrading Traefik to test forthcoming releases, or running your own Traefik-derived image.

Kamal has not been tested for compatibility with Traefik 3 betas. Please do!

Traefik container configuration

Pass additional Docker configuration for the Traefik container using options:

traefik:
  options:
    publish:
      - 8080:8080
    volumes:
      - /tmp/example.json:/tmp/example.json
    memory: 512m

This starts the Traefik container with --volume /tmp/example.json:/tmp/example.json --publish 8080:8080 --memory 512m arguments to docker run.

Traefik container labels

Add labels to the Traefik Docker container:

traefik:
  labels:
    traefik.enable: true
    traefik.http.routers.dashboard.rule: Host(`traefik.example.com`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))
    traefik.http.routers.dashboard.service: api@internal
    traefik.http.routers.dashboard.middlewares: auth
    traefik.http.middlewares.auth.basicauth.users: test:$2y$05$H2o72tMaO.TwY1wNQUV1K.fhjRgLHRDWohFvUZOJHBEtUXNKrqUKi # test:password

This labels the Traefik container with --label traefik.http.routers.dashboard.middlewares=\"auth\" and so on.

Traefik alternate entrypoints

You can configure multiple entrypoints for Traefik like so:

service: myservice

labels:
  traefik.tcp.routers.other.rule: 'HostSNI(`*`)'
  traefik.tcp.routers.other.entrypoints: otherentrypoint
  traefik.tcp.services.other.loadbalancer.server.port: 9000
  traefik.http.routers.myservice.entrypoints: web
  traefik.http.services.myservice.loadbalancer.server.port: 8080

traefik:
  options:
    publish:
      - 9000:9000
  args:
    entrypoints.web.address: ':80'
    entrypoints.otherentrypoint.address: ':9000'

Configuring build args for new images

Build arguments that aren’t secret can also be configured:

builder:
  args:
    RUBY_VERSION: 3.2.0

This build argument can then be used in the Dockerfile:

ARG RUBY_VERSION
FROM ruby:$RUBY_VERSION-slim as base

Using accessories for database, cache, search services

You can manage your accessory services via Kamal as well. Accessories are long-lived services that your app depends on. They are not updated when you deploy.

accessories:
  mysql:
    image: mysql:5.7
    host: 1.1.1.3
    port: 3306
    env:
      clear:
        MYSQL_ROOT_HOST: '%'
      secret:
        - MYSQL_ROOT_PASSWORD
    volumes:
      - /var/lib/mysql:/var/lib/mysql
    options:
      cpus: 4
      memory: "2GB"
  redis:
    image: redis:latest
    roles:
      - web
    port: "36379:6379"
    volumes:
      - /var/lib/redis:/data
  internal-example:
    image: registry.digitalocean.com/user/otherservice:latest
    host: 1.1.1.5
    port: 44444

The hosts that the accessories will run on can be specified by host, hosts, or roles:

  # Single host
  mysql:
    host: 1.1.1.1
  # Multiple hosts
  redis:
    hosts:
      - 1.1.1.1
      - 1.1.1.2
  # By role
  monitoring:
    roles:
      - web
      - jobs

Now run kamal accessory start mysql to start the MySQL server on 1.1.1.3. See kamal accessory for all the commands possible.

Accessory images must be public or tagged in your private registry.

Using Cron

You can use a specific container to run your Cron jobs:

servers:
  cron:
    hosts:
      - 192.168.0.1
    cmd:
      bash -c "cat config/crontab | crontab - && cron -f"

This assumes the Cron settings are stored in config/crontab.

Using a custom healthcheck

Kamal uses Docker healthchecks to check the health of your application during deployment. Traefik uses this same healthcheck status to determine when a container is ready to receive traffic.

The healthcheck defaults to testing the HTTP response to the path /up on port 3000, up to 7 times. You can tailor this behaviour with the healthcheck setting:

healthcheck:
  path: /healthz
  port: 4000
  max_attempts: 7
  interval: 20s

This will ensure your application is configured with a Traefik label for the healthcheck against /healthz and that the pre-deploy healthcheck that Kamal performs is done against the same path on port 4000.

You can also specify a custom healthcheck command, which is useful for non-HTTP services:

healthcheck:
  cmd: /bin/check_health

The top-level healthcheck configuration applies to all services that use Traefik, by default. You can also specialize the configuration at the role level:

servers:
  job:
    hosts: ...
    cmd: bin/jobs
    healthcheck:
      cmd: bin/check

The healthcheck allows for an optional max_attempts setting, which will attempt the healthcheck up to the specified number of times before failing the deploy. This is useful for applications that take a while to start up. The default is 7.
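
The retry behaviour can be pictured as a bounded polling loop. The sketch below is illustrative only: wait_until_healthy is a hypothetical name, the fixed pause between attempts is an assumption rather than Kamal's actual timing, and the sleeper parameter exists purely so the example can run without real delays.

```ruby
# Call the block up to max_attempts times; return the attempt number
# on the first success, or raise once all attempts are used up.
def wait_until_healthy(max_attempts: 7, interval: 1, sleeper: method(:sleep))
  1.upto(max_attempts) do |attempt|
    return attempt if yield
    sleeper.call(interval) if attempt < max_attempts # no pause after the last try
  end
  raise "container not healthy after #{max_attempts} attempts"
end

# Simulate an app that becomes healthy on its third check.
checks = 0
attempt = wait_until_healthy(interval: 0) { (checks += 1) >= 3 }
puts "healthy after attempt #{attempt}"
```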

The HTTP health checks assume that the curl command is available inside the container. If that’s not the case, use the healthcheck’s cmd option to specify an alternative check that the container supports.

When starting a container, the healthcheck will by default only show the last 50 lines of logs. That might not be enough when something goes wrong, so you can set the log_lines parameter to a larger number if required.

Zero-downtime deploy with cord files

We need to stop Traefik from sending requests to old containers before stopping them, otherwise we could get errors. We do this with a cord file.

The file is created in a directory on the host and the directory is mounted into the container. The healthcheck is modified to check for the file.

When we want to shut down the container, we first delete the cord file, then wait for the container to become unhealthy.
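
The idea can be sketched as a health check that passes only while the cord file exists. This is a simplified model using a temporary directory; the healthy? helper and the file name cord are hypothetical, and the real check runs inside the container.

```ruby
require "tmpdir"
require "fileutils"

# Healthy only while the app check passes AND the cord file exists.
def healthy?(cord_path, app_ok: true)
  app_ok && File.exist?(cord_path)
end

Dir.mktmpdir do |dir|
  cord = File.join(dir, "cord") # hypothetical file name
  FileUtils.touch(cord)
  puts healthy?(cord) # cord in place: reports healthy (true)
  FileUtils.rm(cord)  # cut the cord before stopping the container
  puts healthy?(cord) # now unhealthy (false), so traffic can drain
end
```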

By default the directory is mounted to /tmp/kamal-cord. You can change the location with:

healthcheck:
  cord: /var/run/kamal-cord

Or disable the cord (and lose the zero-downtime guarantee) with:

healthcheck:
  cord: false

Custom port for the healthcheck with multiple apps

The healthcheck binds the container’s port to a port on the server. When running multiple applications on the same server and deploying them in parallel, you should specify a different port for each application:

healthcheck:
  exposed_port: 4000 # 3999 is the default one

This allows you to run multiple applications on the same server sharing the same Traefik instance and port.

Using rolling deployments

When deploying to large numbers of hosts, you might prefer not to restart your services on every host at the same time.

Kamal’s default is to boot new containers on all hosts in parallel. But you can control this by configuring boot/limit and boot/wait as options:

service: myservice

boot:
  limit: 10 # Can also specify as a percentage of total hosts, such as "25%"
  wait: 2

When limit is specified, containers will be booted on, at most, limit hosts at once. Kamal will pause for wait seconds between batches.

These settings only apply when booting containers (using kamal deploy, or kamal app boot). For other commands, Kamal continues to run commands in parallel across all hosts.
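
The batching described above can be sketched as follows. This is an illustration, not Kamal's implementation: batch_size and boot_in_batches are hypothetical names, and rounding a percentage down with a minimum of one host is an assumption.

```ruby
# Assumption: a percentage limit rounds down, with a minimum of one host.
def batch_size(limit, host_count)
  size =
    if limit.nil? then host_count
    elsif limit.to_s.end_with?("%") then host_count * limit.to_i / 100
    else limit.to_i
    end
  [size, 1].max
end

# Yield each batch of hosts in turn, pausing between (not after) batches.
def boot_in_batches(hosts, limit: nil, wait: 0, sleeper: method(:sleep))
  batches = hosts.each_slice(batch_size(limit, hosts.size)).to_a
  batches.each_with_index do |batch, index|
    yield batch
    sleeper.call(wait) if wait.positive? && index < batches.size - 1
  end
end

hosts = %w[10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5]
boot_in_batches(hosts, limit: 2) { |batch| puts "booting #{batch.join(', ')}" }
```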

Using custom SSH connection management

Creating SSH connections concurrently can be an issue when deploying to many servers. By default Kamal will limit concurrent connection starts to 30 at a time.

It also sets a long idle timeout of 900 seconds on connections to prevent re-connection storms after a long idle period, like building an image or waiting for CI.

You can configure both of these settings:

sshkit:
  max_concurrent_starts: 10
  pool_idle_timeout: 300

Serving old and new assets during deployments

If there are changes to CSS or JS files, we may get requests for the old versions on the new container and vice-versa.

To avoid 404s we can specify an asset path. Kamal will replace that path in the container with a mapped volume containing both sets of files. This requires that file names change when the contents change (e.g. by including a hash of the contents in the name).

To configure this, set the path to the assets:

asset_path: /rails/public/assets
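
The requirement that file names change with their contents is usually met by fingerprinting. Here is a minimal sketch of that idea using Ruby's standard digest library; fingerprinted_name is a hypothetical helper, and the 8-character SHA-256 prefix is an arbitrary choice for illustration (asset pipelines use their own schemes).

```ruby
require "digest"

# Insert a short content hash before the extension, so a changed file
# gets a new name and old and new versions can sit side by side.
def fingerprinted_name(filename, contents)
  ext = File.extname(filename)
  base = File.basename(filename, ext)
  "#{base}-#{Digest::SHA256.hexdigest(contents)[0, 8]}#{ext}"
end

puts fingerprinted_name("application.css", "body { color: red; }")
puts fingerprinted_name("application.css", "body { color: blue; }")
```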