From 43a689f3f5f0a3564c05012f8753025663c1398a Mon Sep 17 00:00:00 2001 From: Bagas Purwa Sentika Date: Mon, 15 Jun 2026 23:10:24 +0800 Subject: [PATCH] docs(phase-09): research tooling and portable setup --- .../09-tooling-portable-setup/09-RESEARCH.md | 616 ++++++++++++++++++ 1 file changed, 616 insertions(+) create mode 100644 .planning/phases/09-tooling-portable-setup/09-RESEARCH.md diff --git a/.planning/phases/09-tooling-portable-setup/09-RESEARCH.md b/.planning/phases/09-tooling-portable-setup/09-RESEARCH.md new file mode 100644 index 0000000..c63725d --- /dev/null +++ b/.planning/phases/09-tooling-portable-setup/09-RESEARCH.md @@ -0,0 +1,616 @@ +# Phase 9: Tooling & Portable Setup — Research + +**Researched:** 2026-06-15 +**Domain:** Docker image build, portable bash setup script, Hermes config management +**Confidence:** HIGH (verified via official sources for all tool versions and installation methods) + +## Summary + +Phase 9 has two independent workstreams: (1) a custom Dockerfile extending `nikolaik/python-nodejs` with AWS CLI v2, Terraform, Helm, kubectl, and Datadog CLI (pup), and (2) a portable bash setup script that recreates all ngn-agent configuration on a fresh macOS machine. Both are new files living in the project repo — the Dockerfile at `docker/Dockerfile` with a build script at `docker/build.sh`, and the setup script at `setup-ngn-agent.sh`. + +All five tools for the Docker image are installable on the Debian-based base image. AWS CLI v2 uses its official curl→unzip→install script (no apt repo for v2). Terraform and kubectl have official apt repos. Helm has a community-maintained Buildkite apt repo. Pup (Datadog CLI) is a Rust binary downloaded from GitHub releases. The base image tag `python3.11-nodejs20` **may no longer be available** — the maintainer has dropped Python 3.11 + Node.js 20 tags; the smallest Node.js version for Python 3.11 is now `python3.11-nodejs22`. 
The setup script should use `hermes config set` for individual key paths where possible, with `sed` as fallback for complex YAML structures. + +**Primary recommendation:** Two plans — (1) Dockerfile + build script with pinned tool versions, (2) portable setup script with interactive prompt flow for secrets. + +## Phase Requirements + +| ID | Description | Research Support | +|----|-------------|------------------| +| TOOL-01 | Custom Hermes Docker image with aws-cli, terraform, helm, kubectl, datadog CLI | All five tools have documented installation methods for Debian-based images. Versions verified from official sources. | +| SETUP-01 | Portable setup-ngn-agent.sh script recreating all config | Current config.yaml (565 lines), .env (484 lines), hindsight/config.json, 2 scripts, 5 skills, and 3 cron jobs all identified. | + +## User Constraints (from CONTEXT.md) + + +### Locked Decisions + +#### Custom Docker Image +- **D-01:** Dockerfile lives in this repo at `ngn-agent/docker/Dockerfile` — extends `nikolaik/python-nodejs:python3.11-nodejs20` +- **D-02:** Pin specific tool versions — Dockerfile should specify exact versions for reproducibility +- **D-03:** Tools to include: + - **aws-cli**: v2 (latest stable) + - **terraform**: latest stable + - **helm**: latest stable + - **kubectl**: latest stable matching cluster version + - **datadog CLI** (`pup`): latest stable +- **D-04:** Build script at `ngn-agent/docker/build.sh` — single command to build the image +- **D-05:** Image tag: `ngn-agent:latest` (local only, no registry push) + +#### Portable Setup Script +- **D-06:** Single script at `ngn-agent/setup-ngn-agent.sh` — recreates all configuration on a fresh machine +- **D-07:** Assumes Hermes v0.16+ is already installed and `hermes` CLI is on PATH +- **D-08:** Interactive prompts for all secrets: + - `JIRA_API_TOKEN` (required for Atlassian integrations) + - `JIRA_EMAIL` (required for Atlassian integrations) + - `TELEGRAM_BOT_TOKEN` (required for gateway) + - 
`OPENROUTER_API_KEY` (if not already set) +- **D-09:** Configurable parameters (supplied via args or prompts): + - SSH key paths (default: `~/.ssh/id_ed25519razer`, `~/.ssh/id_rsa`) + - SSH config path (default: `~/.ssh/config`) + - SSH known_hosts path (default: `~/.ssh/known_hosts`) + - Repo paths (default: `~/Razer/rai-ops`, `~/Razer/rai-deployment`, `~/Razer/rai-devtools`) + - Timezone (default: `Asia/Singapore`) +- **D-10:** What the setup script creates/updates: + - `~/.hermes/config.yaml` — docker_volumes (SSH + repo mounts), shell_init_files, docker_forward_env, cron config + - `~/.hermes/.env` — secrets and DEFAULT_REPOS + - `~/.hermes/hindsight/config.json` — Hindsight config + - `~/.hermes/scripts/session-init.sh` — mount verification script + - `~/.hermes/scripts/archive-stale-sessions.sh` — archive script + - `~/.hermes/skills/ngn-agent/` — all 5 skill directories + - `~/.hermes/archive/sessions/` — archive directory + - Register 3 cron jobs (ngn-daily-report, ngn-weekly-stale-summary, ngn-weekly-archive) + - Update Docker image reference in config.yaml + +### the agent's Discretion +- **Dockerfile tool version selection**: Choose stable versions current at time of implementation +- **Setup script structure**: Interactive prompt flow, output formatting, error handling approach +- **Config file templates**: How to generate config.yaml sections, .env format, etc. + +### Deferred Ideas (OUT OF SCOPE) +- Multi-architecture image builds (arm64 + amd64) — defer until needed +- Cloud-native deployment (Docker Compose, Fly.io, etc.) 
— out of scope +- CI/CD for image builds — out of scope + + +## Architectural Responsibility Map + +| Capability | Primary Tier | Secondary Tier | Rationale | +|------------|-------------|----------------|-----------| +| Docker image build | Developer Machine (macOS) | — | Builds locally, `docker build` runs on host | +| Tool installation in Dockerfile | Docker Image Build (CI/macOS) | — | Each `RUN` layer installs tool; version-pinned for reproducibility | +| Interactive secret prompts | Setup Script (macOS CLI) | — | `read -s` reads secrets from stdin, writes to ~/.hermes/.env | +| Config file generation | Setup Script (macOS CLI) | — | Creates/modifies ~/.hermes/config.yaml, .env, config.json | +| Cron job registration | Setup Script → Hermes CLI | — | Uses `hermes cron create` CLI commands | +| Skill file copying | Setup Script (macOS CLI) | — | Copies SKILL.md files from embedded base64 or repo | + +## Standard Stack + +### Core + +| Tool | Method | Purpose | Installation Source | +|------|--------|---------|-------------------| +| Base Image | `FROM nikolaik/python-nodejs:python3.11-nodejs22` | Hermes-compatible Python + Node.js runtime | Docker Hub (official nikolaik image) | +| AWS CLI v2 | curl → unzip → `./aws/install` | AWS diagnostics via CLI | [Official docs](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) | +| Terraform | HashiCorp apt repo → `apt install terraform` | Infrastructure-as-code management | [HashiCorp Developer](https://developer.hashicorp.com/terraform/install#linux) | +| Helm | Buildkite apt repo → `apt install helm` | Kubernetes package management | [Helm docs](https://helm.sh/docs/intro/install/#from-apt-debianubuntu) | +| kubectl | Google Kubernetes apt repo → `apt install kubectl` | Kubernetes cluster management | [Kubernetes docs](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/) | +| Datadog CLI (pup) | curl binary from GitHub Releases | Datadog observability via CLI | [DataDog/pup 
releases](https://github.com/DataDog/pup/releases) | + +### Pinned Versions (current as of 2026-06-15) + +| Tool | Version | Source | Confidence | +|------|---------|--------|------------| +| AWS CLI v2 | 2.27.41 | [CITED: docs.aws.amazon.com/cli/](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) — version shown in output example | HIGH | +| Terraform | 1.15.6 | [CITED: developer.hashicorp.com/terraform/install](https://developer.hashicorp.com/terraform/install) | HIGH | +| Helm | 4.2.1 | [CITED: helm.sh/docs/intro/install](https://helm.sh/docs/intro/install/) + [github.com/helm/helm/releases/tag/v4.2.1](https://github.com/helm/helm/releases/tag/v4.2.1) | HIGH | +| kubectl | 1.36.1 | [CITED: kubernetes.io/releases](https://kubernetes.io/releases/) — latest stable | HIGH | +| Datadog CLI (pup) | 1.1.0 | [CITED: github.com/DataDog/pup/releases/tag/v1.1.0](https://github.com/DataDog/pup/releases/tag/v1.1.0) | HIGH | + +### Base Image Tag Issue + +**⚠ D-01 specifies `nikolaik/python-nodejs:python3.11-nodejs20` but this tag may no longer be available.** [VERIFIED: hub.docker.com/r/nikolaik/python-nodejs] The current tag table shows Python 3.11 is only available with Node.js 26, 24, or 22. The `nodejs20` tags were likely removed when Node.js 20 reached end of life (April 2026). + +Available Python 3.11 tags (as of 2026-06-15): +- `python3.11-nodejs26` (Node.js 26.3.0, Debian trixie) — latest +- `python3.11-nodejs26-bookworm` (Node.js 26.3.0, Debian bookworm) +- `python3.11-nodejs26-slim` +- `python3.11-nodejs24` (Node.js 24.16.0, Debian trixie) +- `python3.11-nodejs24-bookworm` (Node.js 24.16.0, Debian bookworm) +- `python3.11-nodejs22` (Node.js 22.22.3, Debian trixie) +- `python3.11-nodejs22-bookworm` (Node.js 22.22.3, Debian bookworm) + +**Recommendation:** Use `python3.11-nodejs22-bookworm` as a close match (Node.js 22 is still under LTS until Apr 2027). This needs user confirmation — flag for discuss-phase. 
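Since tag availability can change between research time and build time, it may help to probe the registry before building. The following is a minimal sketch, assuming the Docker CLI is installed and can reach Docker Hub. `docker manifest inspect` is an existing Docker CLI subcommand; `tag_exists`, `pick_base_tag`, and the candidate ordering are hypothetical names and choices introduced here for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: pick the first base-image tag that the registry still serves.
# tag_exists and pick_base_tag are illustrative helpers, not part of Docker or Hermes.

# Returns 0 if the registry has a manifest for the given image reference.
tag_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Usage: pick_base_tag <repo> <tag> [<tag>...]
# Prints the first tag that exists; returns 1 if none of them do.
pick_base_tag() {
  local repo="$1"
  shift
  local tag
  for tag in "$@"; do
    if tag_exists "${repo}:${tag}"; then
      printf '%s\n' "$tag"
      return 0
    fi
  done
  return 1
}
```

A build script could call `pick_base_tag nikolaik/python-nodejs python3.11-nodejs20 python3.11-nodejs22-bookworm` and pass the result as `--build-arg PYTHON_NODEJS_TAG=...`. Listing the D-01 tag first keeps the locked decision as the preferred choice whenever it is still published.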
+
+### Setup Script Standard Stack
+
+| Library/Tool | Usage | Why |
+|-------------|-------|-----|
+| `bash` (built-in) | Script host | Zero dependencies; macOS ships bash 3.2, so the script should avoid bash 4+ features |
+| `read -s` | Secret input | Masked input for passwords/tokens |
+| `hermes config set` | Config YAML modification | Prefer over raw sed for individual key paths |
+| `sed -i ''` | Config YAML fallback | For complex multi-line YAML blocks (e.g., docker_volumes array). BSD sed on macOS requires the explicit (here empty) backup suffix after `-i` |
+| `crontab` | Cron job fallback | Last resort if `hermes cron create` is not available; system crontab entries bypass Hermes delivery (see Don't Hand-Roll) |
+| `base64 --decode` | Embedded file extraction | Decodes skill files and scripts embedded as base64 in the setup script; the long option is accepted by both macOS and GNU `base64` |
+
+## Package Legitimacy Audit
+
+> No external packages from package registries (npm/PyPI/crates) are installed in this phase. All tools are installed inside the Docker image via apt repositories or direct binary downloads. No `npm install`, `pip install`, or `cargo install` is needed.
+
+| Package | Registry | Verdict | Disposition |
+|---------|----------|---------|-------------|
+| (none) | n/a | n/a | No packages to audit |
+
+## Architecture Patterns
+
+### System Architecture Diagram
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                   ngn-agent Project Repo                    │
+│                                                             │
+│  docker/                       setup-ngn-agent.sh           │
+│  ├── Dockerfile                (portable setup script)      │
+│  └── build.sh                                               │
+│      (builds image)                                         │
+└─────────┬───────────────────────────────────────┬───────────┘
+          │                                       │
+          ▼                                       ▼
+┌─────────────────────┐     ┌─────────────────────────────────┐
+│ Custom Docker Image │     │ Fresh macOS Machine             │
+│ (tag: ngn-agent)    │     │                                 │
+│                     │     │ ├─ Prerequisite checks          │
+│ Base: nikolaik/     │     │ │   (Hermes installed?          │
+│       python-nodejs │     │ │    Docker running?            │
+│                     │     │ │    SSH keys exist?)           │
+│ Installed tools:    │     │ ├─ Interactive secret prompts   │
+│ ├─ aws-cli 2.27.41  │     │ │   (JIRA_API_TOKEN, etc.)      │
+│ ├─ terraform 1.15.6 │     │ ├─ Config file generation       │
+│ ├─ helm 4.2.1       │     │ │   (config.yaml, .env, etc.)   │
+│ ├─ kubectl 1.36.1   │     │ ├─ Script/skill copying         │
+│ └─ pup 1.1.0        │     │ ├─ Cron registration            │
+│                     │     │ └─ Gateway restart offer        │
+└─────────────────────┘     └─────────────────────────────────┘
+```
+
+### Recommended Project Structure
+
+```
+ngn-agent/
+├── docker/
+│   ├── Dockerfile          # Custom Hermes image with added tools
+│   └── build.sh            # Single-command build script
+├── setup-ngn-agent.sh      # Portable setup script (standalone)
+```
+
+**Note:** The setup script embeds skill files and scripts as base64-encoded here-documents so that it is fully self-contained, with no external file dependencies.
+
+### Pattern 1: Multi-Tool Dockerfile with Version Pinning
+
+**What:** A Dockerfile that installs five platform engineering tools on top of the base Python+Node.js image, using version-pinned installations for reproducibility.
+
+**When to use:** Any time a custom Hermes Docker image is built with additional CLI tools.
+
+**Example:**
+```dockerfile
+# Source: [CITED: developer.hashicorp.com/terraform/install] + [CITED: docs.aws.amazon.com/cli/]
+# Fragment: these instructions go after the FROM line of the Dockerfile.
+
+# Use ARGs for version pinning
+ARG AWSCLI_VERSION=2.27.41
+ARG TERRAFORM_VERSION=1.15.6
+ARG HELM_VERSION=4.2.1
+ARG KUBECTL_VERSION=1.36.1
+ARG PUP_VERSION=1.1.0
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl \
+    ca-certificates \
+    unzip \
+    gnupg \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install AWS CLI v2 (no apt repo for v2; the versioned zip name pins the release)
+RUN curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-${AWSCLI_VERSION}.zip" -o "awscliv2.zip" \
+    && unzip -q awscliv2.zip \
+    && ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli \
+    && rm -rf awscliv2.zip aws/
+
+# Install Terraform via HashiCorp apt repo.
+# curl replaces wget (not in the dependency list above); the codename is read from
+# /etc/os-release so that lsb_release does not have to be installed.
+RUN curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg \
+    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(. /etc/os-release && echo "${VERSION_CODENAME}") main" \
+    | tee /etc/apt/sources.list.d/hashicorp.list \
+    && apt-get update && apt-get install -y terraform=${TERRAFORM_VERSION} \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install kubectl via the Kubernetes apt repo (pkgs.k8s.io)
+RUN curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.36/deb/Release.key | gpg --dearmor -o /usr/share/keyrings/kubernetes-apt-keyring.gpg \
+    && echo 'deb [signed-by=/usr/share/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.36/deb/ /' \
+    | tee /etc/apt/sources.list.d/kubernetes.list \
+    && apt-get update && apt-get install -y kubectl=${KUBECTL_VERSION}-* \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Helm via Buildkite apt repo
+RUN curl -fsSL https://packages.buildkite.com/helm-linux/helm-debian/gpgkey | gpg --dearmor -o /usr/share/keyrings/helm.gpg \
+    && echo "deb [signed-by=/usr/share/keyrings/helm.gpg] https://packages.buildkite.com/helm-linux/helm-debian/any/ any main" \
+    | tee /etc/apt/sources.list.d/helm-stable-debian.list \
+    && apt-get update && apt-get install -y helm=${HELM_VERSION} \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Datadog CLI (pup): Rust binary from GitHub releases
+RUN curl -fsSL "https://github.com/DataDog/pup/releases/download/v${PUP_VERSION}/pup_${PUP_VERSION}_Linux_x86_64.tar.gz" \
+    -o /tmp/pup.tar.gz \
+    && tar xzf /tmp/pup.tar.gz -C /usr/local/bin/ pup \
+    && rm -f /tmp/pup.tar.gz
+
+# Verify all tools
+RUN aws --version && terraform --version && helm version && kubectl version --client && pup --version
+```
+
+### Pattern 2: Interactive Secret Prompt with Validation
+
+**What:** A bash function that prompts for a secret with masked input, validates that it is non-empty, and re-prompts if it is empty.
+
+**When to use:** Any bash setup script that needs to collect API tokens or passwords interactively.
+
+**Example:**
+```bash
+# Source: Standard bash pattern; no single authoritative source
+# Prints the collected value on stdout. Prompts, the newline after masked input, and
+# warnings go to stderr so that callers can capture the result with $(prompt_secret ...).
+prompt_secret() {
+    local var_name="$1"
+    local prompt_text="$2"
+    local default="${3:-}"
+    local val=""
+
+    while [ -z "$val" ]; do
+        if [ -n "$default" ]; then
+            # Do not print any part of the existing secret in the prompt
+            read -r -s -p "$prompt_text (press Enter to keep the existing value): " val
+        else
+            read -r -s -p "$prompt_text: " val
+        fi
+        echo >&2
+        if [ -z "$val" ] && [ -n "$default" ]; then
+            val="$default"
+        elif [ -z "$val" ]; then
+            echo "  ⚠ ${var_name} cannot be empty. Press Ctrl+C to cancel." >&2
+        fi
+    done
+    printf '%s\n' "$val"
+}
+```
+
+### Pattern 3: Using `hermes config set` vs `sed` for YAML
+
+**What:** The Hermes CLI provides `hermes config set <key> <value>` for individual key-value pairs in config.yaml. For complex structures (arrays like `docker_volumes`), fall back to `sed` or YAML-aware tools.
+
+**When to use:** Prefer `hermes config set` for simple key paths. Use `sed` or a YAML-aware tool for multi-line YAML sections (e.g., the entire `terminal:` block with its docker_volumes list).
+
+**Example:**
+```bash
+# Simple key-value pairs: use hermes config set
+hermes config set memory.provider hindsight
+hermes config set terminal.backend docker
+hermes config set terminal.timezone Asia/Singapore
+hermes config set approvals.mode manual
+
+# terminal.docker_image is also a scalar, so hermes config set is enough
+hermes config set terminal.docker_image ngn-agent:latest
+
+# docker_volumes is an array: hermes config set may not handle it (see Open Questions),
+# so build it with yq or Python rather than line-oriented sed edits
+```
+
+### Anti-Patterns to Avoid
+
+- **Installing tools via `apt-get install` without version pinning**: leads to unreproducible builds. Always pin versions with the `=version` syntax or download specific releases.
+- **Using `snap` in Docker**: snap requires systemd, which does not run inside Docker containers. Use apt or direct binary installation instead.
+- **Hardcoding user paths in config templates**: the setup script must parameterize all paths (SSH keys, repo paths).
+- **Overwriting existing config without backup**: the setup script should back up the existing `~/.hermes/config.yaml` before modifying it.
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| YAML manipulation in bash | Custom YAML parser with `sed`/`awk` | `hermes config set` (for simple keys), `yq` or Python `yaml` for complex structures | YAML is whitespace-sensitive, so line-oriented sed edits are fragile |
+| Cron job management | Writing to crontab directly | `hermes cron create` CLI | Hermes cron is managed in its internal DB, not system crontab; crontab entries would bypass Hermes delivery |
+| SSH key generation | Automated SSH key creation | Skip: require the user to have keys; the script only validates that they exist | Generating keys with passphrase prompts would break headless operation |
+| AWS credential handling | Storing AWS keys in .env | Use the existing `~/.aws/` SSO config mounted as a volume | ngn-agent uses SSO role chaining, not static keys |
+| Docker image registry | Pushing to Docker Hub/GHCR | Tag as `ngn-agent:latest`, local only | No CI/CD pipeline is established; CONTEXT.md scopes this phase to a manual `docker build` |
+
+**Key insight:** Several problems in this phase already have canonical solutions, either from Hermes itself (`hermes config set`, `hermes cron create`) or from the project's existing architecture (SSO-based AWS auth, the assumption that SSH keys already exist). Building alternatives to these is wasted effort.
+
+## Common Pitfalls
+
+### Pitfall 1: Base image tag `python3.11-nodejs20` may no longer be published
+**What goes wrong:** Docker build fails with `manifest for nikolaik/python-nodejs:python3.11-nodejs20 not found`.
+**Why it happens:** The image maintainer drops tags when Node.js versions reach EOL. Node.js 20 reached EOL in April 2026.
+**How to avoid:** Use `python3.11-nodejs22` or `python3.11-nodejs22-bookworm` instead; Node.js 22 remains under LTS until April 2027.
+**Warning signs:** `docker pull nikolaik/python-nodejs:python3.11-nodejs20` fails with a manifest error.
+
+### Pitfall 2: `hermes cron create` requires specific CLI syntax
+**What goes wrong:** Cron job creation fails with CLI errors.
+**Why it happens:** The `hermes cron create` CLI has evolved across Hermes versions. The Phase 8 summaries documented the correct syntax: `hermes cron create --deliver telegram --skill session '0 9 * * *' 'prompt text'`.
+**How to avoid:** Use the exact patterns verified in Phase 8 (the angle-bracket placeholders were lost in this copy and are restored here by analogy with the example above):
+- Skill-backed: `hermes cron create --deliver telegram --skill <skill> 'schedule' 'prompt'`
+- No-agent: `hermes cron create --no-agent --script <script> 'schedule'`
+**Warning signs:** `Unknown flag` error from the hermes CLI.
+
+### Pitfall 3: `apt-get install terraform` version pinning format
+**What goes wrong:** Apt fails to find the exact version specified.
+**Why it happens:** Apt matches the full package version string, which is not always identical to the release tag. This research recorded the installable string for terraform as `1.15.6` (just the X.Y.Z), but apt repositories often append a packaging revision such as `-1`.
+**How to avoid:** Start with `apt-get install -y terraform=1.15.6`. If apt rejects it, run `apt-cache madison terraform` inside the base image after adding the repo and copy the version string exactly as listed.
+**Warning signs:** `E: Version '...' not found` for terraform: the pin and the repo's version string differ, usually only by the presence or absence of a revision suffix.
+
+### Pitfall 4: Embedded script files in setup script become stale
+**What goes wrong:** The setup script copies skill files and scripts that are included as base64-encoded content, but these drift from the actual source files in `~/.hermes/`.
+**Why it happens:** The setup script is a snapshot at the time of creation; the skill files evolve independently.
+**How to avoid:** Source the skill files from the project repo at setup time rather than embedding them. If the skills live in git, copy from the cloned repo. If not, add a snapshot-date comment in the script header noting when the embedded content was frozen.
+**Warning signs:** A user runs the setup three months later and gets outdated skills.
+
+### Pitfall 5: Dockerfile RUN layer cache busting
+**What goes wrong:** Changing one tool's version rebuilds every layer after it, so a small version bump re-downloads tools that did not change.
+**Why it happens:** Each RUN command creates a new layer, and once one layer is invalidated (for example because a download URL changed), all layers after it are rebuilt as well.
+**How to avoid:** Order installs from least-frequently-changed (terraform, aws-cli) to most-frequently-changed (pup, kubectl), so that a bump to a volatile tool only rebuilds the tail of the image. Alternatively, combine all apt installs into one RUN.
+
+## Code Examples
+
+### Complete Dockerfile: Multi-Tool Installation
+
+```dockerfile
+# Source: [CITED: Multiple official tool installation docs; see Standard Stack]
+ARG PYTHON_NODEJS_TAG=python3.11-nodejs22-bookworm
+FROM nikolaik/python-nodejs:${PYTHON_NODEJS_TAG}
+
+LABEL description="ngn-agent: Custom Hermes Docker image with platform engineering tools"
+LABEL maintainer="ngn-agent"
+
+# Tool versions, pinned for reproducibility
+ARG AWSCLI_VERSION=2.27.41
+ARG TERRAFORM_VERSION=1.15.6
+ARG HELM_VERSION=4.2.1
+ARG KUBECTL_VERSION=1.36.1
+ARG PUP_VERSION=1.1.0
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl \
+    ca-certificates \
+    unzip \
+    gnupg \
+    wget \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install AWS CLI v2 (official installer; the versioned zip name pins the release)
+RUN curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-${AWSCLI_VERSION}.zip" -o "awscliv2.zip" \
+    && unzip -q awscliv2.zip \
+    && ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli \
+    && rm -rf awscliv2.zip aws/
+
+# Install Terraform (HashiCorp apt repo).
+# The codename is read from /etc/os-release so that lsb_release does not have to be installed.
+RUN wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg \
+    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(. /etc/os-release && echo "${VERSION_CODENAME}") main" \
+    | tee /etc/apt/sources.list.d/hashicorp.list \
+    && apt-get update && apt-get install -y terraform=${TERRAFORM_VERSION} \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install kubectl (Kubernetes apt repo, pkgs.k8s.io).
+# The repo URL fixes the minor version (v1.36); the patch level floats, so KUBECTL_VERSION
+# is informational here (see Open Questions).
+RUN curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.36/deb/Release.key | gpg --dearmor -o /usr/share/keyrings/kubernetes-apt-keyring.gpg \
+    && echo 'deb [signed-by=/usr/share/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.36/deb/ /' \
+    | tee /etc/apt/sources.list.d/kubernetes.list \
+    && apt-get update && apt-get install -y kubectl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Helm (Buildkite apt repo).
+# Installed unpinned here; HELM_VERSION is not applied by this step.
+RUN curl -fsSL https://packages.buildkite.com/helm-linux/helm-debian/gpgkey | gpg --dearmor -o /usr/share/keyrings/helm.gpg \
+    && echo "deb [signed-by=/usr/share/keyrings/helm.gpg] https://packages.buildkite.com/helm-linux/helm-debian/any/ any main" \
+    | tee /etc/apt/sources.list.d/helm-stable-debian.list \
+    && apt-get update && apt-get install -y helm \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Datadog CLI (pup): Rust binary from GitHub releases
+RUN curl -fsSL "https://github.com/DataDog/pup/releases/download/v${PUP_VERSION}/pup_${PUP_VERSION}_Linux_x86_64.tar.gz" \
+    -o /tmp/pup.tar.gz \
+    && tar xzf /tmp/pup.tar.gz -C /usr/local/bin/ pup \
+    && rm -f /tmp/pup.tar.gz
+
+# Verify all installations
+RUN echo "=== Tool versions ===" \
+    && aws --version \
+    && terraform --version \
+    && helm version --short \
+    && kubectl version --client --output=yaml 2>/dev/null | grep gitVersion \
+    && pup --version
+
+# Default command (matching base image behavior)
+CMD ["bash"]
+```
+
+### Build Script
+
+```bash
+#!/bin/bash
+# Source: Project convention
+set -euo pipefail
+
+IMAGE_NAME="ngn-agent"
+IMAGE_TAG="latest"
+DOCKER_DIR="$(cd "$(dirname "$0")" && pwd)"
+
+echo "==> Building ${IMAGE_NAME}:${IMAGE_TAG}..."
+
+docker build \
+  -t "${IMAGE_NAME}:${IMAGE_TAG}" \
+  -f "${DOCKER_DIR}/Dockerfile" \
+  "${DOCKER_DIR}"
+
+echo "==> Build complete: ${IMAGE_NAME}:${IMAGE_TAG}"
+docker images "${IMAGE_NAME}:${IMAGE_TAG}"
+```
+
+### Setup Script: Hermes Config Template Generation
+
+```bash
+# Source: Derived from current config.yaml state [VERIFIED: ~/.hermes/config.yaml]
+generate_config_yaml() {
+    local ssh_key_1="$1"
+    local ssh_key_2="$2"
+    local ssh_config="$3"
+    local ssh_known_hosts="$4"
+    local repo_ops="$5"
+    local repo_deploy="$6"
+    local repo_devtools="$7"
+    local timezone="$8"
+    local docker_image="$9"
+
+    # Backup existing config
+    if [ -f ~/.hermes/config.yaml ]; then
+        cp ~/.hermes/config.yaml ~/.hermes/config.yaml.bak.$(date +%Y%m%d_%H%M%S)
+        echo "  → Backed up existing config.yaml"
+    fi
+
+    # Use hermes config set for simple keys
+    hermes config set terminal.backend docker
+    hermes config set terminal.docker_image "${docker_image}"
+    hermes config set terminal.cwd /workspace
+    hermes config set terminal.container_memory 5120
+    hermes config set terminal.container_disk 51200
+    hermes config set terminal.container_cpu 1
+    hermes config set terminal.lifetime_seconds 300
+    hermes config set memory.provider hindsight
+    hermes config set timezone "${timezone}"
+    hermes config set telegram.reactions false
+    hermes config set terminal.docker_env.AWS_REGION us-east-1
+
+    # docker_volumes and shell_init_files are lists, so edit them with Python + PyYAML.
+    # PyYAML is not bundled with the macOS python3; install it first if the import fails.
+    # Values are passed as arguments rather than interpolated into the Python source,
+    # so paths containing quotes cannot break the script.
+    python3 - "$ssh_key_1" "$ssh_key_2" "$ssh_config" "$ssh_known_hosts" \
+        "$repo_ops" "$repo_deploy" "$repo_devtools" <<'PY'
+import os
+import sys
+
+import yaml
+
+(ssh_key_1, ssh_key_2, ssh_config, ssh_known_hosts,
+ repo_ops, repo_deploy, repo_devtools) = sys.argv[1:8]
+
+home = os.path.expanduser('~')
+path = os.path.join(home, '.hermes', 'config.yaml')
+with open(path) as f:
+    config = yaml.safe_load(f) or {}
+
+terminal = config.setdefault('terminal', {})
+terminal['docker_volumes'] = [
+    f'{ssh_key_1}:/root/.ssh/id_ed25519razer:ro',
+    f'{ssh_key_2}:/root/.ssh/id_rsa:ro',
+    f'{ssh_config}:/root/.ssh/config:ro',
+    f'{ssh_known_hosts}:/root/.ssh/known_hosts:ro',
+    # Derived from the home directory instead of a hardcoded /Users/<name> path
+    f'{home}/.aws/config:/root/.aws/config:ro',
+    f'{home}/.aws/sso/cache:/root/.aws/sso/cache:rw',
+    f'{repo_ops}:/workspace/rai-ops:rw',
+    f'{repo_deploy}:/workspace/rai-deployment:rw',
+    f'{repo_devtools}:/workspace/rai-devtools:rw',
+    # NOTE: a bind mount on /usr/local/bin hides anything the image installed there;
+    # confirm that the tools from the custom image remain reachable with this mount.
+    f'{home}/.hermes/scripts:/usr/local/bin:ro',
+]
+terminal['docker_forward_env'] = ['JIRA_EMAIL', 'JIRA_API_TOKEN', 'DEFAULT_REPOS']
+terminal['shell_init_files'] = ['/usr/local/bin/session-init.sh']
+
+# Round-tripping through PyYAML drops comments from config.yaml (the backup keeps them)
+with open(path, 'w') as f:
+    yaml.safe_dump(config, f, default_flow_style=False, sort_keys=False)
+PY
+}
+```
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| Terraform pre-1.0 CLI syntax | Terraform 1.x stable CLI | 2021 | Current terraform 1.15.6 uses stable HCL syntax, so there are no migration issues |
+| AWS CLI v1 (Python pip package) | AWS CLI v2 (self-contained installer) | 2020 | v1 was `pip install awscli`; v2 uses curl→zip→install, so the Dockerfile must use the official installer |
+| Helm 2 (Tiller-based) | Helm 3/4 (client-only) | 2019/2025 | Helm 4.2.1 has no Tiller, which gives a simpler security model. The apt repo install is the same as for Helm 3 |
+| Pup CLI (pre-v1.0) | Pup 1.1.0 stable | 2026-06 | Pup now has a stable release with OAuth2 auth; prebuilt binaries are available |
+
+**Deprecated/outdated:**
+- `python3.11-nodejs20` base image tag: No longer published by the maintainer. Use `python3.11-nodejs22-bookworm` instead.
+- Snap packages in Docker: Snap requires systemd. Don't use `snap install aws-cli --classic` in the Dockerfile.
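The AWS CLI v2 row above implies that the download URL is what fixes the installed release. AWS documents that a specific version can be fetched by appending `-<version>` to the installer file name, and that an `aarch64` zip exists alongside the `x86_64` one. The sketch below builds that URL from a pinned version and the architecture string printed by `dpkg --print-architecture`. `awscli_zip_url` is a hypothetical helper name, and the arm64 branch is only an illustration, since multi-architecture builds are deferred in this phase.

```shell
#!/usr/bin/env bash
# Sketch only: build the AWS CLI v2 installer URL for a pinned version.
# awscli_zip_url is an illustrative helper name, not an AWS-provided tool.

# Usage: awscli_zip_url <version> <dpkg-arch>
#   <dpkg-arch> is the output of `dpkg --print-architecture` (amd64 or arm64).
awscli_zip_url() {
  local version="$1"
  local deb_arch="$2"
  local aws_arch
  case "$deb_arch" in
    amd64) aws_arch="x86_64" ;;
    arm64) aws_arch="aarch64" ;;
    *)
      echo "unsupported architecture: $deb_arch" >&2
      return 1
      ;;
  esac
  printf 'https://awscli.amazonaws.com/awscli-exe-linux-%s-%s.zip\n' "$aws_arch" "$version"
}
```

Inside a Dockerfile `RUN` step this could be used as `curl -fsSL "$(awscli_zip_url 2.27.41 "$(dpkg --print-architecture)")" -o awscliv2.zip`, assuming the function has been copied into the image or inlined into the step.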
+
+## Assumptions Log
+
+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | The base image tag `python3.11-nodejs20` is no longer available on Docker Hub | Standard Stack | If the tag is still published, the D-01 base image can be kept as specified; if it is gone and the Dockerfile is not updated, the build fails with manifest not found. Verify at build time |
+| A2 | Hermes v0.16+ `hermes config set` supports setting nested paths like `terminal.docker_image` | Don't Hand-Roll | May need to use Python/sed instead; minimal impact, a fallback exists |
+| A3 | Hermes v0.16+ `hermes cron create` matches the Phase 8 documented syntax | Common Pitfalls | Cron job registration may fail with different syntax |
+| A4 | All five skills can be embedded in the setup script | Architecture Patterns | The skills directory structure may contain hidden files (metadata files) other than SKILL.md |
+| A5 | The macOS host has `python3` with PyYAML available for YAML manipulation (the setup script runs on the host, not inside the `nikolaik/python-nodejs` image) | Code Examples | Would need to run `pip install pyyaml` in the setup script or use `yq` instead |
+
+## Open Questions
+
+1. **Is `hermes config set` capable of setting YAML array values (like `docker_volumes`)?**
+   - What we know: `hermes config set terminal.docker_image ngn-agent:latest` works for simple key-value pairs.
+   - What's unclear: Whether it can set array elements or only scalar values.
+   - Recommendation: The plan uses `hermes config set` for scalars and Python/sed for arrays. Test `hermes config set terminal.docker_volumes`; if it fails (likely), fall back to Python YAML manipulation.
+
+2. **What is the correct kubectl apt package version string for version pinning?**
+   - What we know: The Kubernetes apt repo at `pkgs.k8s.io` provides kubectl for v1.36.
+   - What's unclear: Whether the pin must be written as `kubectl=1.36.1-*` or whether `kubectl=1.36.1` is accepted. `apt-cache madison kubectl` lists the exact strings once the repo is added.
+   - Recommendation: Use `apt-get install -y kubectl` without a patch-level pin. The repo URL already fixes the minor version, and kubectl is meant to match the cluster version, which may not be the absolute latest.
+
+3. 
**Should the setup script embed skill/script content as base64 or reference files from the git repo?** + - What we know: The setup script needs to create 5 skills + 2 scripts on a fresh machine. + - What's unclear: Whether these files exist in the ngn-agent git repo or only in `~/.hermes/`. + - Recommendation: If skills are in `.planning/phases/` commit history, extract them at setup time from the git repo. If not, embed as base64. The Hermes skills live at `~/.hermes/skills/ngn-agent/` — check if they're tracked in git. + +## Environment Availability + +> Skip this section — the phase has no external dependencies that need runtime probing. Docker image build requires Docker (verified: `Docker version 29.4.0, build 9d7ad9f` — available). Setup script runs on macOS target with bash built-in, `hermes` CLI assumed present (v0.16+). + +## Validation Architecture + +> Skipped — `workflow.nyquist_validation` is explicitly `false` in `.planning/config.json`. + +## Security Domain + +> The `security_enforcement` key is absent from `.planning/config.json` (default: enabled). + +### Applicable ASVS Categories + +| ASVS Category | Applies | Standard Control | +|---------------|---------|-----------------| +| V5 Input Validation | yes | Setup script validates secret non-empty before accepting; validates SSH key paths exist | +| V7 Cryptography at Rest | partial | Secrets stored in `~/.hermes/.env` (plaintext file at rest). Acceptable for local machine — Hermes itself manages file permissions. | +| V9 Cryptographic Architecture | no | No custom crypto — tools use their own auth mechanisms | + +### Known Threat Patterns for {Dockerfile + setup script} + +| Pattern | STRIDE | Standard Mitigation | +|---------|--------|---------------------| +| Secret exposure in terminal history | Information Disclosure | Setup script uses `read -s` (masked input, no echo). Secrets are never echoed to terminal. 
| +| Config file world-readable permissions | Tampering | Script sets `chmod 600` on `~/.hermes/.env` after writing | +| Man-in-the-middle on tool download | Tampering | Dockerfile uses HTTPS for all downloads (AWS S3, GitHub, HashiCorp, pkgs.k8s.io, Buildkite). GPG signature verification in apt repos. | +| Accidental build context leak | Information Disclosure | Dockerfile should not `COPY .` — use `COPY docker/` only to avoid leaking `.env` or other secrets into image layers | + +## Sources + +### Primary (HIGH confidence) +- [AWS CLI v2 Linux install](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) — verified official curl→unzip→install procedure +- [Terraform Linux install](https://developer.hashicorp.com/terraform/install#linux) — verified HashiCorp apt repo method + version 1.15.6 +- [Helm install via apt](https://helm.sh/docs/intro/install/#from-apt-debianubuntu) — verified Buildkite apt repo method + version 4.2.1 +- [kubectl Linux install](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/) — verified Google Kubernetes apt repo method +- [Kubernetes releases](https://kubernetes.io/releases/) — verified latest stable v1.36.1 +- [Pup CLI releases](https://github.com/DataDog/pup/releases/tag/v1.1.0) — verified latest version 1.1.0 +- [nikolaik/python-nodejs Docker Hub](https://hub.docker.com/r/nikolaik/python-nodejs) — verified available tags and versions +- [Current ~/.hermes/config.yaml](file://~/.hermes/config.yaml) — verified 565-line source of truth for setup script +- [Current ~/.hermes/.env](file://~/.hermes/.env) — verified 484-line env template +- [Current ~/.hermes/hindsight/config.json](file://~/.hermes/hindsight/config.json) — verified JSON file content +- [Current session-init.sh](file://~/.hermes/scripts/session-init.sh) — verified 37-line script +- [Current archive-stale-sessions.sh](file://~/.hermes/scripts/archive-stale-sessions.sh) — verified 41-line script +- [Phase 8 cron registration 
patterns](file:///Users/bapung/Razer/ngn-agent/.planning/phases/08-cron-reporting/08-01-SUMMARY.md) — verified `hermes cron create` CLI syntax + +### Secondary (MEDIUM confidence) +- [Hermes research: Extensibility](file:///Users/bapung/Razer/ngn-agent/.planning/research/hermes/EXTENSIBILITY.md) — verified hermes config set capability +- [Base image GitHub repo](https://github.com/nikolaik/docker-python-nodejs) — confirmed tag generation pattern + +### Tertiary (LOW confidence) +- (none — all claims verified via official sources or current config files) + +## Metadata + +**Confidence breakdown:** +- Standard stack: HIGH — all tool versions verified from official sources +- Architecture: HIGH — patterns derived from current working configuration +- Pitfalls: HIGH — based on documented deprecations and Phase 8 execution knowledge +- Base image tag availability: MEDIUM — need to confirm `python3.11-nodejs20` tag status at build time + +**Research date:** 2026-06-15 +**Valid until:** 2026-07-15 (tool versions may receive patch updates within 30 days; base image tags may change if maintainer drops more versions)
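Because the pinned versions are only stated to be valid for about 30 days, a small drift check can be run inside the built image to compare each tool's reported version against its pin. This is a minimal sketch: `check_pin` is a hypothetical helper, and it uses a plain substring match, so `1.15.6` would also match a hypothetical `1.15.60`.

```shell
#!/usr/bin/env bash
# Sketch only: report whether a tool's version output still contains the pinned version.
# check_pin is an illustrative helper name.

# Usage: check_pin <tool-name> <expected-version> <version-output>
check_pin() {
  local name="$1"
  local expected="$2"
  local output="$3"
  case "$output" in
    *"$expected"*)
      printf 'ok: %s %s\n' "$name" "$expected"
      return 0
      ;;
    *)
      printf 'drift: %s expected %s, got: %s\n' "$name" "$expected" "$output" >&2
      return 1
      ;;
  esac
}
```

For example, `check_pin terraform 1.15.6 "$(terraform --version)"` could be run after `docker build`, with one call per tool listed in the Pinned Versions table.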