# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Linderhof is an Ansible-based self-hosting infrastructure stack that deploys email, web server, git hosting, Matrix homeserver, monitoring, and backup services using Docker Compose on Ubuntu servers.
## Common Commands
```bash
# Select a stack (one-time per clone)
echo <stack-name> > .stack && direnv allow
# or: export LINDERHOF_STACK=<stack-name>
# Run all playbooks
ansible-playbook playbooks/site.yml
# Run a specific playbook
ansible-playbook playbooks/mail.yml
# Run specific tags only
ansible-playbook playbooks/site.yml --tags mail,monitoring
# Edit encrypted secrets
ansible-vault edit $LINDERHOF_DIR/group_vars/all/vault.yml
# Encrypt/decrypt vault
ansible-vault encrypt $LINDERHOF_DIR/group_vars/all/vault.yml
ansible-vault decrypt $LINDERHOF_DIR/group_vars/all/vault.yml
```
Note: Inventory and vault password are set via `ANSIBLE_INVENTORY` and `ANSIBLE_VAULT_PASSWORD_FILE` in `.envrc`, driven by `LINDERHOF_STACK`. No extra flags needed once the stack is selected.
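The `.envrc` wiring is roughly the following (a sketch of the behavior described above, not the repo's verbatim file):

```bash
# Sketch — resolves the stack, then points Ansible at the per-stack config
export LINDERHOF_STACK="${LINDERHOF_STACK:-$(cat .stack)}"
export LINDERHOF_DIR="${XDG_CONFIG_HOME:-$HOME/.config}/linderhof/${LINDERHOF_STACK}"
export ANSIBLE_INVENTORY="${LINDERHOF_DIR}/hosts.yml"
export ANSIBLE_VAULT_PASSWORD_FILE="${LINDERHOF_DIR}/vault-pass"
```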
## Architecture
**Deployment Pattern:** Each service is deployed to `/srv/<service>/` on the target host with a `compose.yml` and environment files.
**Standalone Playbooks** (not in `site.yml`):
- `deploy.yml` - Full first-time deployment (chains provision → dns → bootstrap → site)
- `provision.yml` - Provision a cloud VM (Hetzner)
- `dns.yml` - Manage DNS zones/records via Hetzner DNS API
- `bootstrap.yml` - First-time server setup (run once as root before site.yml)
- `dkim_sync.yml` - Fetch DKIM keys from mailserver and publish to DNS (run once after first mail deploy)
- `storage_box.yml` - Create/configure a Hetzner Storage Box for restic backups (run once before enabling restic)
**Full deployment order** (fresh server):
1. `deploy.yml` - runs all steps below in one shot (first-time only — bootstrap connects as root)
2. `dkim_sync.yml` - generate DKIM keys, write to stack config, publish to DNS (run once after mail is up)
**What `deploy.yml` runs internally:**
1. `provision.yml` - create server, auto-writes IP to hosts.yml and config.yml
2. `dns.yml` - create DNS records
3. `bootstrap.yml` - users, SSH hardening, packages, Docker (connects as root)
4. `site.yml` - deploy all services
**Note:** `storage_box.yml` must be run before `deploy.yml` when `enable_restic: true` — Ansible loads group_vars at startup, so the `storagebox.yml` group_vars file it writes must exist before `deploy.yml` begins.
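Concretely, a fresh deployment with restic enabled runs:

```bash
ansible-playbook playbooks/storage_box.yml   # once — writes storage box config before deploy
ansible-playbook playbooks/deploy.yml        # provision → dns → bootstrap → site
ansible-playbook playbooks/dkim_sync.yml     # once — after mail is up
```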
**Playbook Execution Order** (via `site.yml`):
1. `networks.yml` - Pre-create all Docker networks (must run before any service)
2. `nebula.yml` - Overlay network (Nebula)
3. `caddy.yml` - Web server / reverse proxy
4. `mail.yml` - Email (docker-mailserver + rainloop)
5. `forgejo.yml` - Git server
6. `tuwunel.yml` - Matrix homeserver (Tuwunel)
7. `radicale.yml` - CalDAV/CardDAV
8. `monitoring.yml` - Prometheus, Grafana, Loki, Alloy
9. `goaccess.yml` - Web analytics
10. `diun.yml` - Docker image update notifications
11. `restic.yml` - Encrypted backups
12. `fail2ban.yml` - Intrusion prevention
**Mail TLS:** On first deployment, the mail role stops Caddy, runs certbot standalone to acquire a Let's Encrypt cert for `mail_hostname`, then restarts Caddy. Subsequent runs skip this (the cert already exists). Caddy owns port 80, so standalone is the only viable approach without a DNS-challenge plugin.
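A sketch of that first-run sequence as role tasks (the module choices, the `letsencrypt_email` variable, and the `mail_cert` stat register are illustrative assumptions, not the role's actual code):

```yaml
# Illustrative only — assumes the cert path was stat'ed earlier and registered as mail_cert
- name: Stop Caddy so certbot can bind port 80
  community.docker.docker_compose_v2:
    project_src: /srv/caddy
    state: stopped
  when: not mail_cert.stat.exists

- name: Acquire Let's Encrypt certificate (standalone challenge)
  ansible.builtin.command: >-
    certbot certonly --standalone --non-interactive --agree-tos
    -m {{ letsencrypt_email }} -d {{ mail_hostname }}
  when: not mail_cert.stat.exists

- name: Start Caddy again
  community.docker.docker_compose_v2:
    project_src: /srv/caddy
    state: present
  when: not mail_cert.stat.exists
```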
**Role Structure:** Each role in `roles/` contains:
- `tasks/main.yml` - Core provisioning tasks
- `templates/` - Jinja2 templates (compose.yml.j2, config files)
- `handlers/main.yml` - Service restart handlers
- `files/` - Static configuration files
**Configuration** (lives outside the repo in `$XDG_CONFIG_HOME/linderhof/<stack>/`):
- `$LINDERHOF_DIR/hosts.yml` - Host connection info only
- `$LINDERHOF_DIR/group_vars/all/config.yml` - All public configuration
- `$LINDERHOF_DIR/group_vars/all/vault.yml` - All secrets (encrypted)
- `$LINDERHOF_DIR/group_vars/all/dns.yml` - DNS zone definitions
- `$LINDERHOF_DIR/group_vars/all/overrides.yml` - Per-stack variable overrides (optional)
- `$LINDERHOF_DIR/stack.env` - Per-stack shell vars (DOCKER_HOST, etc.)
- `$LINDERHOF_DIR/vault-pass` - Vault encryption key (chmod 600)
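Creating the vault password file by hand looks roughly like this (a sketch — `setup.sh` may already handle it; the `default` stack name fallback is only for illustration):

```bash
# Sketch: generate a random vault password with restrictive permissions
LINDERHOF_DIR="${LINDERHOF_DIR:-${XDG_CONFIG_HOME:-$HOME/.config}/linderhof/${LINDERHOF_STACK:-default}}"
install -d -m 700 "$LINDERHOF_DIR"          # per-stack config dir, owner-only
(umask 177 && openssl rand -base64 32 > "$LINDERHOF_DIR/vault-pass")  # file ends up 600
```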
**Template files** (in the repo, used by `setup.sh`):
- `inventory/group_vars/all/config.yml.setup` - Config template
- `inventory/group_vars/all/vault.yml.setup` - Vault template
- `inventory/group_vars/all/dns.yml.setup` - DNS zones template
**Overriding variables** without editing config.yml — create `overrides.yml`:
```yaml
# Example: override mail hostname during migration
mail_hostname: mail2.example.com
# Example: add extra static sites to Caddy
caddy_sites:
- example.com
- example2.com
```
**Service Toggles:** Set `enable_<service>: false` in config.yml to disable:
- `enable_mail`
- `enable_forgejo`
- `enable_tuwunel`
- `enable_monitoring`
- `enable_goaccess`
- `enable_goaccess_sync`
- `enable_radicale`
- `enable_restic`
- `enable_fail2ban`
- `enable_nebula`
- `enable_diun`
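For example, a stack that only runs mail and monitoring might set, in `config.yml`:

```yaml
# Illustrative fragment — service selection is per-stack
enable_mail: true
enable_monitoring: true
enable_forgejo: false
enable_tuwunel: false
enable_radicale: false
enable_restic: false
enable_nebula: false
```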
**Docker Networks:** All networks are pre-created by the `docker_network` role before any service deploys. Services declare all networks as `external: true` in their `compose.yml.j2` — no service creates its own network. Networks are created conditionally based on `enable_*` flags:
| Network | Created when |
|---------|-------------|
| `caddy` | always |
| `mail` | `enable_mail` |
| `webmail` | `enable_mail` |
| `git` | `enable_forgejo` |
| `monitoring` | `enable_monitoring` |
| `tuwunel` | `enable_tuwunel` |
| `radicale` | `enable_radicale` |
Caddy's `compose.yml.j2` also conditionally declares network references using the same `enable_*` flags so it never references a network that wasn't created.
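In a service's `compose.yml.j2`, that pattern looks roughly like this (service and network names are placeholders, not a template from the repo):

```yaml
# Sketch of the external-network pattern
services:
  myservice:
    image: example/myservice:latest
    networks:
      - caddy       # so the reverse proxy can reach it
      - myservice

networks:
  caddy:
    external: true  # pre-created by the docker_network role
  myservice:
    external: true
```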
**Adding a new service:** Create the network in `docker_network/tasks/main.yml` with the appropriate `when:` condition, declare it `external: true` in the service's compose template, and add it to Caddy's compose template if Caddy needs to reach it.
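The first of those steps would be a task along these lines in `docker_network/tasks/main.yml` (the service name is a placeholder; the repo's actual tasks may use a loop instead):

```yaml
# Sketch — one conditional network-creation task per service
- name: Create myservice network
  community.docker.docker_network:
    name: myservice
    state: present
  when: enable_myservice | bool
```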
## Available Tags
- `bootstrap` - Initial server setup (use `--tags bootstrap`)
- `docker` - Docker installation
- `mail` - Mail server
- `forgejo` - Git server
- `tuwunel` - Matrix homeserver
- `monitoring` - Monitoring stack
- `restic` - Backup configuration
- `fail2ban` - Intrusion prevention
- `nebula` - Overlay network
- `diun` - Docker image update notifications
- `config` - Configuration-only updates