docs(phase-09): research tooling and portable setup
616
.planning/phases/09-tooling-portable-setup/09-RESEARCH.md
Normal file
@@ -0,0 +1,616 @@
# Phase 9: Tooling & Portable Setup — Research

**Researched:** 2026-06-15
**Domain:** Docker image build, portable bash setup script, Hermes config management
**Confidence:** HIGH (verified via official sources for all tool versions and installation methods)

## Summary

Phase 9 has two independent workstreams: (1) a custom Dockerfile extending `nikolaik/python-nodejs` with AWS CLI v2, Terraform, Helm, kubectl, and Datadog CLI (pup), and (2) a portable bash setup script that recreates all ngn-agent configuration on a fresh macOS machine. Both are new files living in the project repo: the Dockerfile at `docker/Dockerfile` with a build script at `docker/build.sh`, and the setup script at `setup-ngn-agent.sh`.

All five tools for the Docker image are installable on the Debian-based base image. AWS CLI v2 uses its official installer (download with curl, unzip, run `./aws/install`); there is no apt repo for v2. Terraform and kubectl have official apt repos. Helm has a community-maintained apt repo hosted on Buildkite. Pup (Datadog CLI) is a Rust binary downloaded from GitHub releases. The base image tag `python3.11-nodejs20` **may no longer be available**: the maintainer appears to have dropped the Python 3.11 + Node.js 20 tags, and the lowest Node.js version now offered for Python 3.11 is `python3.11-nodejs22`. The setup script should use `hermes config set` for individual key paths where possible, with Python `yaml`, `yq`, or `sed` as the fallback for complex YAML structures.

**Primary recommendation:** Two plans: (1) Dockerfile + build script with pinned tool versions, (2) portable setup script with interactive prompt flow for secrets.

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| TOOL-01 | Custom Hermes Docker image with aws-cli, terraform, helm, kubectl, datadog CLI | All five tools have documented installation methods for Debian-based images. Versions verified from official sources. |
| SETUP-01 | Portable setup-ngn-agent.sh script recreating all config | Current config.yaml (565 lines), .env (484 lines), hindsight/config.json, 2 scripts, 5 skills, and 3 cron jobs all identified. |

## User Constraints (from CONTEXT.md)

<user_constraints>

### Locked Decisions

#### Custom Docker Image

- **D-01:** Dockerfile lives in this repo at `ngn-agent/docker/Dockerfile` and extends `nikolaik/python-nodejs:python3.11-nodejs20`
- **D-02:** Pin specific tool versions: the Dockerfile should specify exact versions for reproducibility
- **D-03:** Tools to include:
  - **aws-cli**: v2 (latest stable)
  - **terraform**: latest stable
  - **helm**: latest stable
  - **kubectl**: latest stable matching cluster version
  - **datadog CLI** (`pup`): latest stable
- **D-04:** Build script at `ngn-agent/docker/build.sh`: single command to build the image
- **D-05:** Image tag: `ngn-agent:latest` (local only, no registry push)

#### Portable Setup Script

- **D-06:** Single script at `ngn-agent/setup-ngn-agent.sh` that recreates all configuration on a fresh machine
- **D-07:** Assumes Hermes v0.16+ is already installed and `hermes` CLI is on PATH
- **D-08:** Interactive prompts for all secrets:
  - `JIRA_API_TOKEN` (required for Atlassian integrations)
  - `JIRA_EMAIL` (required for Atlassian integrations)
  - `TELEGRAM_BOT_TOKEN` (required for gateway)
  - `OPENROUTER_API_KEY` (if not already set)
- **D-09:** Configurable parameters (supplied via args or prompts):
  - SSH key paths (default: `~/.ssh/id_ed25519razer`, `~/.ssh/id_rsa`)
  - SSH config path (default: `~/.ssh/config`)
  - SSH known_hosts path (default: `~/.ssh/known_hosts`)
  - Repo paths (default: `~/Razer/rai-ops`, `~/Razer/rai-deployment`, `~/Razer/rai-devtools`)
  - Timezone (default: `Asia/Singapore`)
- **D-10:** What the setup script creates/updates:
  - `~/.hermes/config.yaml`: docker_volumes (SSH + repo mounts), shell_init_files, docker_forward_env, cron config
  - `~/.hermes/.env`: secrets and DEFAULT_REPOS
  - `~/.hermes/hindsight/config.json`: Hindsight config
  - `~/.hermes/scripts/session-init.sh`: mount verification script
  - `~/.hermes/scripts/archive-stale-sessions.sh`: archive script
  - `~/.hermes/skills/ngn-agent/`: all 5 skill directories
  - `~/.hermes/archive/sessions/`: archive directory
  - Register 3 cron jobs (ngn-daily-report, ngn-weekly-stale-summary, ngn-weekly-archive)
  - Update Docker image reference in config.yaml

### The Agent's Discretion

- **Dockerfile tool version selection**: Choose stable versions current at time of implementation
- **Setup script structure**: Interactive prompt flow, output formatting, error handling approach
- **Config file templates**: How to generate config.yaml sections, .env format, etc.

### Deferred Ideas (OUT OF SCOPE)

- Multi-architecture image builds (arm64 + amd64): defer until needed
- Cloud-native deployment (Docker Compose, Fly.io, etc.): out of scope
- CI/CD for image builds: out of scope

</user_constraints>
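
The "args or prompts" wording in D-09 could be handled with a small flag parser that starts from the documented defaults. The sketch below is an assumption about how that might look, not the agreed design: the flag names (`--timezone`, `--ssh-config`, `--known-hosts`) and the function name are hypothetical, and only three of the D-09 parameters are shown.

```shell
# Sketch only: flag names are hypothetical; defaults are the ones listed in D-09.
parse_setup_args() {
    TIMEZONE="Asia/Singapore"
    SSH_CONFIG="$HOME/.ssh/config"
    SSH_KNOWN_HOSTS="$HOME/.ssh/known_hosts"

    while [ "$#" -gt 0 ]; do
        case "$1" in
            --timezone|--ssh-config|--known-hosts)
                # Every flag takes exactly one value
                if [ "$#" -lt 2 ]; then
                    echo "Missing value for $1" >&2
                    return 1
                fi
                case "$1" in
                    --timezone)    TIMEZONE="$2" ;;
                    --ssh-config)  SSH_CONFIG="$2" ;;
                    --known-hosts) SSH_KNOWN_HOSTS="$2" ;;
                esac
                shift 2
                ;;
            *)
                echo "Unknown option: $1" >&2
                return 1
                ;;
        esac
    done
}
```

Anything not supplied as a flag keeps its default, and the script could then offer a prompt to confirm or override each value.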
|
||||
|
||||
## Architectural Responsibility Map

| Capability | Primary Tier | Secondary Tier | Rationale |
|------------|-------------|----------------|-----------|
| Docker image build | Developer Machine (macOS) | - | Builds locally, `docker build` runs on host |
| Tool installation in Dockerfile | Docker Image Build (CI/macOS) | - | Each `RUN` layer installs a tool; version-pinned for reproducibility |
| Interactive secret prompts | Setup Script (macOS CLI) | - | `read -s` reads secrets from stdin, writes to ~/.hermes/.env |
| Config file generation | Setup Script (macOS CLI) | - | Creates/modifies ~/.hermes/config.yaml, .env, config.json |
| Cron job registration | Setup Script → Hermes CLI | - | Uses `hermes cron create` CLI commands |
| Skill file copying | Setup Script (macOS CLI) | - | Copies SKILL.md files from embedded base64 or repo |

## Standard Stack

### Core

| Tool | Method | Purpose | Installation Source |
|------|--------|---------|-------------------|
| Base Image | `FROM nikolaik/python-nodejs:python3.11-nodejs22-bookworm` | Hermes-compatible Python + Node.js runtime | Docker Hub (official nikolaik image) |
| AWS CLI v2 | curl, unzip, then `./aws/install` | AWS diagnostics via CLI | [Official docs](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) |
| Terraform | HashiCorp apt repo, then `apt install terraform` | Infrastructure-as-code management | [HashiCorp Developer](https://developer.hashicorp.com/terraform/install#linux) |
| Helm | Buildkite-hosted apt repo, then `apt install helm` | Kubernetes package management | [Helm docs](https://helm.sh/docs/intro/install/#from-apt-debianubuntu) |
| kubectl | Kubernetes community apt repo (`pkgs.k8s.io`), then `apt install kubectl` | Kubernetes cluster management | [Kubernetes docs](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/) |
| Datadog CLI (pup) | curl binary from GitHub Releases | Datadog observability via CLI | [DataDog/pup releases](https://github.com/DataDog/pup/releases) |

### Pinned Versions (current as of 2026-06-15)

| Tool | Version | Source | Confidence |
|------|---------|--------|------------|
| AWS CLI v2 | 2.27.41 | [CITED: docs.aws.amazon.com/cli/](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) (version shown in output example) | HIGH |
| Terraform | 1.15.6 | [CITED: developer.hashicorp.com/terraform/install](https://developer.hashicorp.com/terraform/install) | HIGH |
| Helm | 4.2.1 | [CITED: helm.sh/docs/intro/install](https://helm.sh/docs/intro/install/) + [github.com/helm/helm/releases/tag/v4.2.1](https://github.com/helm/helm/releases/tag/v4.2.1) | HIGH |
| kubectl | 1.36.1 | [CITED: kubernetes.io/releases](https://kubernetes.io/releases/) (latest stable) | HIGH |
| Datadog CLI (pup) | 1.1.0 | [CITED: github.com/DataDog/pup/releases/tag/v1.1.0](https://github.com/DataDog/pup/releases/tag/v1.1.0) | HIGH |

### Base Image Tag Issue

**⚠ D-01 specifies `nikolaik/python-nodejs:python3.11-nodejs20` but this tag may no longer be available.** [VERIFIED: hub.docker.com/r/nikolaik/python-nodejs] The current tag table shows Python 3.11 is only available with Node.js 26, 24, or 22. The `nodejs20` tags were likely removed when Node.js 20 reached end of life (April 2026).

Available Python 3.11 tags (as of 2026-06-15):

- `python3.11-nodejs26` (Node.js 26.3.0, Debian trixie), latest
- `python3.11-nodejs26-bookworm` (Node.js 26.3.0, Debian bookworm)
- `python3.11-nodejs26-slim`
- `python3.11-nodejs24` (Node.js 24.16.0, Debian trixie)
- `python3.11-nodejs24-bookworm` (Node.js 24.16.0, Debian bookworm)
- `python3.11-nodejs22` (Node.js 22.22.3, Debian trixie)
- `python3.11-nodejs22-bookworm` (Node.js 22.22.3, Debian bookworm)

**Recommendation:** Use `python3.11-nodejs22-bookworm` as the closest match (Node.js 22 remains in LTS until April 2027). This needs user confirmation; flag for discuss-phase.
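
Because tag availability can change again, `build.sh` could probe the registry before building. The following is a minimal sketch under that assumption; `tag_exists` and `pick_base_tag` are hypothetical helper names, and the check relies on `docker manifest inspect`, which needs network access to Docker Hub.

```shell
# tag_exists: succeeds when the registry has a manifest for the given tag.
tag_exists() {
    docker manifest inspect "nikolaik/python-nodejs:$1" >/dev/null 2>&1
}

# pick_base_tag: print the first candidate tag that exists; fail if none do.
pick_base_tag() {
    local tag
    for tag in "$@"; do
        if tag_exists "$tag"; then
            printf '%s\n' "$tag"
            return 0
        fi
    done
    echo "No candidate base image tag is available" >&2
    return 1
}
```

A call such as `pick_base_tag python3.11-nodejs20 python3.11-nodejs22-bookworm` would honor D-01 if the original tag is still published and otherwise fall back to the recommended one.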
|
||||
|
||||
### Setup Script Standard Stack

| Library/Tool | Usage | Why |
|-------------|-------|-----|
| `bash` (built-in) | Script host | Zero dependencies, available on every macOS machine |
| `read -s` | Secret input | Masked input for passwords/tokens |
| `hermes config set` | Config YAML modification | Prefer over raw sed for individual key paths |
| `sed -i` | Config YAML fallback | For complex multi-line YAML blocks (e.g., docker_volumes array). Note that the BSD `sed` shipped with macOS needs an explicit backup suffix argument (`sed -i ''`) |
| `crontab` | Cron job fallback | Last resort if `hermes cron create` is not available; system crontab entries bypass Hermes delivery (see Don't Hand-Roll) |
| `base64 --decode` | Embedded file extraction | Decodes skill files and scripts embedded as base64 in the setup script; the long option works with both the macOS and GNU implementations |

## Package Legitimacy Audit

> No external packages from package registries (npm/PyPI/crates) are installed in this phase. All tools are installed in the Docker image via apt or direct binary downloads with curl. No `npm install`, `pip install`, or `cargo install` is needed.

| Package | Registry | Verdict | Disposition |
|---------|----------|---------|-------------|
| (none) | - | - | No packages to audit |

## Architecture Patterns
|
||||
|
||||
### System Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                   ngn-agent Project Repo                    │
│                                                             │
│  docker/                       setup-ngn-agent.sh           │
│  ├── Dockerfile                (portable setup script)      │
│  └── build.sh                                               │
│      (builds image)                                         │
└─────────┬───────────────────────────────────────┬───────────┘
          │                                       │
          ▼                                       ▼
┌─────────────────────┐   ┌─────────────────────────────────┐
│ Custom Docker Image │   │ Fresh macOS Machine             │
│ (tag: ngn-agent)    │   │                                 │
│                     │   │ ├─ Prerequisite checks          │
│ Base: nikolaik/     │   │ │  (Hermes installed?           │
│ python-nodejs       │   │ │   Docker running?             │
│                     │   │ │   SSH keys exist?)            │
│ Installed tools:    │   │ ├─ Interactive secret prompts   │
│ ├─ aws-cli 2.27.41  │   │ │  (JIRA_API_TOKEN, etc.)       │
│ ├─ terraform 1.15.6 │   │ ├─ Config file generation       │
│ ├─ helm 4.2.1       │   │ │  (config.yaml, .env, etc.)    │
│ ├─ kubectl 1.36.1   │   │ ├─ Script/skill copying         │
│ └─ pup 1.1.0        │   │ ├─ Cron registration            │
│                     │   │ └─ Gateway restart offer        │
└─────────────────────┘   └─────────────────────────────────┘
```
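
The prerequisite checks in the right-hand box of the diagram could be sketched as follows. This is an illustrative outline only: the helper names (`require_cmd`, `require_file`, `check_prerequisites`) are hypothetical, and using `docker info` as the "Docker running?" probe is an assumption about how the script would test daemon reachability.

```shell
# require_cmd: fail with a message when a command is not on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "Missing required command: $1" >&2
        return 1
    }
}

# require_file: fail with a message when a regular file does not exist.
require_file() {
    [ -f "$1" ] || {
        echo "Missing required file: $1" >&2
        return 1
    }
}

# check_prerequisites: run every check, report all failures, return non-zero if any failed.
# Arguments: the SSH key paths to verify.
check_prerequisites() {
    local failed=0 key
    require_cmd hermes || failed=1
    require_cmd docker || failed=1
    # 'docker info' fails when the daemon is not reachable
    docker info >/dev/null 2>&1 || {
        echo "Docker daemon is not running" >&2
        failed=1
    }
    for key in "$@"; do
        require_file "$key" || failed=1
    done
    return "$failed"
}
```

Collecting failures instead of exiting on the first one lets the user fix everything in a single pass, for example `check_prerequisites ~/.ssh/id_ed25519razer ~/.ssh/id_rsa`.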
|
||||
|
||||
### Recommended Project Structure

```
ngn-agent/
├── docker/
│   ├── Dockerfile           # Custom Hermes image with added tools
│   └── build.sh             # Single-command build script
└── setup-ngn-agent.sh       # Portable setup script (standalone)
```

**Note:** As drafted, the setup script embeds skill files and scripts as base64-encoded here-documents so that it is fully self-contained, with no external file dependencies. Pitfall 4 and Open Question 3 cover the staleness trade-off of this choice.
|
||||
|
||||
### Pattern 1: Multi-Tool Dockerfile with Version Pinning

**What:** A Dockerfile that installs 5 platform engineering tools on top of the base Python+Node.js image, using version-pinned installations for reproducibility.

**When to use:** Any time a custom Hermes Docker image is built with additional CLI tools.

**Example:**

```dockerfile
# Source: [CITED: developer.hashicorp.com/terraform/install] + [CITED: docs.aws.amazon.com/cli/]
# Excerpt: the FROM line is omitted here. Assumes a linux/amd64 build (x86_64 downloads).

# Use ARGs for version pinning
ARG AWSCLI_VERSION=2.27.41
ARG TERRAFORM_VERSION=1.15.6
ARG HELM_VERSION=4.2.1
ARG KUBECTL_VERSION=1.36.1
ARG PUP_VERSION=1.1.0

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
        unzip \
        gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install AWS CLI v2 (no apt repo for v2; official installer, pinned via the versioned zip URL)
RUN curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-${AWSCLI_VERSION}.zip" -o awscliv2.zip \
    && unzip -q awscliv2.zip \
    && ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli \
    && rm -rf awscliv2.zip aws/

# Install Terraform via HashiCorp apt repo
# The codename is read from /etc/os-release, so lsb_release and wget are not required.
# The trailing glob tolerates an optional Debian revision suffix on the package version.
RUN curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(. /etc/os-release && echo "$VERSION_CODENAME") main" \
        | tee /etc/apt/sources.list.d/hashicorp.list \
    && apt-get update && apt-get install -y "terraform=${TERRAFORM_VERSION}*" \
    && rm -rf /var/lib/apt/lists/*

# Install kubectl via the Kubernetes community apt repo (pkgs.k8s.io)
RUN curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.36/deb/Release.key | gpg --dearmor -o /usr/share/keyrings/kubernetes-apt-keyring.gpg \
    && echo 'deb [signed-by=/usr/share/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.36/deb/ /' \
        | tee /etc/apt/sources.list.d/kubernetes.list \
    && apt-get update && apt-get install -y "kubectl=${KUBECTL_VERSION}-*" \
    && rm -rf /var/lib/apt/lists/*

# Install Helm via the Buildkite-hosted apt repo
RUN curl -fsSL https://packages.buildkite.com/helm-linux/helm-debian/gpgkey | gpg --dearmor -o /usr/share/keyrings/helm.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/helm.gpg] https://packages.buildkite.com/helm-linux/helm-debian/any/ any main" \
        | tee /etc/apt/sources.list.d/helm-stable-debian.list \
    && apt-get update && apt-get install -y "helm=${HELM_VERSION}*" \
    && rm -rf /var/lib/apt/lists/*

# Install Datadog CLI (pup): Rust binary from GitHub releases
RUN curl -fsSL "https://github.com/DataDog/pup/releases/download/v${PUP_VERSION}/pup_${PUP_VERSION}_Linux_x86_64.tar.gz" \
        -o /tmp/pup.tar.gz \
    && tar xzf /tmp/pup.tar.gz -C /usr/local/bin/ pup \
    && rm -f /tmp/pup.tar.gz

# Verify all tools
RUN aws --version && terraform --version && helm version && kubectl version --client && pup --version
```
|
||||
|
||||
### Pattern 2: Interactive Secret Prompt with Validation

**What:** A bash function that prompts for a secret with masked input, rejects empty input, and re-prompts until a value (or the default) is supplied.

**When to use:** Any bash setup script that needs to collect API tokens or passwords interactively.

**Example:**

```bash
# Source: Standard bash pattern (no single authoritative source)
# Prompts and warnings go to stderr so that only the secret reaches stdout,
# which lets callers capture it with: token="$(prompt_secret NAME 'Prompt text')"
prompt_secret() {
    local var_name="$1"      # used in messages only
    local prompt_text="$2"
    local default="${3:-}"
    local val=""

    while [ -z "$val" ]; do
        if [ -n "$default" ]; then
            # Show only the first 4 characters of the default as a hint
            read -r -s -p "$prompt_text (default: ${default:0:4}...): " val
        else
            read -r -s -p "$prompt_text: " val
        fi
        echo >&2   # read -s swallows the newline; emit it on stderr
        if [ -z "$val" ] && [ -n "$default" ]; then
            val="$default"
        elif [ -z "$val" ]; then
            echo "  ⚠ ${var_name} cannot be empty. Press Ctrl+C to cancel." >&2
        fi
    done
    printf '%s\n' "$val"
}
```
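
Once a secret has been collected, it still has to land in `~/.hermes/.env` without duplicating keys on a re-run. One possible approach is sketched below; `upsert_env` is a hypothetical helper, and it assumes the file uses plain `KEY=value` lines with keys made of letters, digits, and underscores.

```shell
# upsert_env: set KEY=value in an env file, replacing any existing line for KEY.
# Arguments: <key> <value> <file>
upsert_env() {
    local key="$1" value="$2" file="$3" tmp
    mkdir -p "$(dirname "$file")"
    touch "$file"
    chmod 600 "$file"              # secrets file: owner read/write only
    tmp="$(mktemp)"
    # Keep every line except an existing assignment for this key.
    # grep exits non-zero when nothing is selected, which is fine here.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"           # overwrite in place to keep the 600 mode
    rm -f "$tmp"
}
```

Writing through `cat` rather than `mv` keeps the original file's permissions, and calling the function twice with the same key leaves a single line holding the latest value.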
|
||||
|
||||
### Pattern 3: Using `hermes config set` vs `sed` for YAML

**What:** The Hermes CLI provides `hermes config set <path> <value>` for individual key-value pairs in config.yaml. For complex structures (arrays like `docker_volumes`), fall back to a YAML-aware tool or, failing that, `sed`.

**When to use:** Prefer `hermes config set` for simple key paths. Use the fallback for multi-line YAML sections (e.g., the entire `terminal:` block with the docker_volumes list).

**Example:**

```bash
# Simple key-value pairs: use hermes config set
hermes config set memory.provider hindsight
hermes config set terminal.backend docker
hermes config set terminal.timezone Asia/Singapore
hermes config set approvals.mode manual

# terminal.docker_image is also a scalar, so hermes config set applies
hermes config set terminal.docker_image ngn-agent:latest

# docker_volumes is an array: hermes config set may not handle it,
# so build it with yq or Python, falling back to sed only if neither is available
```
|
||||
|
||||
### Anti-Patterns to Avoid

- **Installing tools via `apt-get install` without version pinning:** leads to unreproducible builds. Always pin versions with `=version` syntax or download specific releases.
- **Using `snap` in Docker:** snap requires systemd, which typically does not run inside Docker containers. Use curl/apt binary installation instead.
- **Hardcoding user paths in config templates:** the setup script must parameterize all paths (SSH keys, repo paths).
- **Overwriting existing config without backup:** the setup script should back up existing `~/.hermes/config.yaml` before modifying it.
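
The backup rule from the last bullet can be captured in a small helper that every config-writing step calls first. This is a sketch with a hypothetical function name (`backup_file`); the timestamp suffix format is an assumption.

```shell
# backup_file: copy a file to <file>.bak.<timestamp> and print the backup path.
# Does nothing (and succeeds) when the file does not exist yet.
backup_file() {
    local src="$1" dest
    [ -f "$src" ] || return 0
    dest="${src}.bak.$(date +%Y%m%d_%H%M%S)"
    cp -p "$src" "$dest"           # -p preserves mode and timestamps
    printf '%s\n' "$dest"
}
```

Printing the backup path lets the caller show it to the user, so a bad run can be reverted by copying that file back.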
|
||||
|
||||
## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| YAML manipulation in bash | Custom YAML parser with `sed`/`awk` | `hermes config set` (for simple keys), `yq` or Python `yaml` for complex structures | YAML is whitespace-sensitive, so sed edits are fragile |
| Cron job management | Writing to crontab directly | `hermes cron create` CLI | Hermes cron is managed in its internal DB, not system crontab; crontab entries would bypass Hermes delivery |
| SSH key generation | Automated SSH key creation | Skip: require the user to have keys; the script just validates they exist | SSH keys with passphrase prompting would break headless operation |
| AWS credential handling | Storing AWS keys in .env | Use existing `~/.aws/` SSO config mounted as volume | ngn-agent uses SSO role chaining, not static keys |
| Docker image registry | Pushing to Docker Hub/GHCR | Tag as `ngn-agent:latest` local only | No CI/CD pipeline established; manual `docker build` is the CONTEXT.md scope |

**Key insight:** Several things in this phase already have canonical solutions, either from Hermes itself (`hermes config set`, `hermes cron create`) or from the project's existing architecture (SSO-based AWS auth, SSH key assumption). Building alternatives to these is wasted effort.
|
||||
|
||||
## Common Pitfalls
|
||||
|
||||
### Pitfall 1: Base image tag `python3.11-nodejs20` is no longer published
**What goes wrong:** Docker build fails with `manifest for nikolaik/python-nodejs:python3.11-nodejs20 not found`.
**Why it happens:** The image maintainer drops tags when Node.js versions reach EOL. Node.js 20 reached EOL in April 2026.
**How to avoid:** Use `python3.11-nodejs22` or `python3.11-nodejs22-bookworm` instead; Node.js 22 is in LTS until April 2027.
**Warning signs:** `docker pull nikolaik/python-nodejs:python3.11-nodejs20` fails with a manifest error.
|
||||
|
||||
### Pitfall 2: `hermes cron create` requires specific CLI syntax
**What goes wrong:** Cron job creation fails with CLI errors.
**Why it happens:** The `hermes cron create` CLI has evolved across Hermes versions. The Phase 8 summaries documented the correct syntax: `hermes cron create --deliver telegram --skill session '0 9 * * *' 'prompt text'`.
**How to avoid:** Use the exact patterns verified in Phase 8:
- Skill-backed: `hermes cron create --deliver telegram --skill <name> 'schedule' 'prompt'`
- No-agent: `hermes cron create --no-agent --script <path> 'schedule'`

**Warning signs:** `Unknown flag` error from hermes CLI.
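
Since the cron syntax is the fragile part, the setup script could route both Phase 8 patterns through one wrapper with a dry-run mode, so the exact command line can be inspected before anything is registered. The sketch below uses only the two invocation shapes quoted above; the function names, the `DRY_RUN` variable, and the schedule and prompt in the usage note are hypothetical.

```shell
# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Skill-backed job: hermes cron create --deliver telegram --skill <name> 'schedule' 'prompt'
register_skill_cron() {
    local skill="$1" schedule="$2" prompt="$3"
    run hermes cron create --deliver telegram --skill "$skill" "$schedule" "$prompt"
}

# No-agent job: hermes cron create --no-agent --script <path> 'schedule'
register_script_cron() {
    local script_path="$1" schedule="$2"
    run hermes cron create --no-agent --script "$script_path" "$schedule"
}
```

For example, `DRY_RUN=1 register_skill_cron ngn-daily-report '0 9 * * *' 'Generate the daily report'` prints the command without calling Hermes; note that the dry-run output drops the original quoting, so it is for inspection rather than copy-paste.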
|
||||
|
||||
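### Pitfall 3: `apt-get install terraform` version pinning format
**What goes wrong:** Apt fails to find the exact version specified.
**Why it happens:** HashiCorp's apt repo uses specific version formats. This research found the installable version string for terraform to be `1.15.6` (just the X.Y.Z), but apt packages can also carry a Debian revision suffix such as `-1`, and the pin must match whatever the repo actually publishes.
**How to avoid:** List the published strings with `apt-cache madison terraform` and pin exactly what it prints, e.g. `apt-get install -y terraform=1.15.6`. A glob such as `"terraform=1.15.6*"` matches with or without a revision suffix.
**Warning signs:** `E: Version '...' not found`. Compare the pinned string against the `apt-cache madison` output and add or remove the `-1` suffix accordingly.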
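
To take the guesswork out of the version string, the exact value can be read from `apt-cache madison` output at build time. The helper below is a minimal sketch: `pick_apt_version` is a hypothetical name, and it assumes the usual pipe-separated madison layout (`package | version | source`).

```shell
# pick_apt_version: read `apt-cache madison <pkg>` output on stdin and print the
# first version string that starts with the given prefix. Fails if none matches.
pick_apt_version() {
    local prefix="$1"
    awk -F'|' -v p="$prefix" '
        { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v) }        # trim the version column
        index(v, p) == 1 { print v; found = 1; exit }      # prefix match
        END { exit found ? 0 : 1 }
    '
}
```

Inside a Dockerfile `RUN` step this could be used as `apt-cache madison terraform | pick_apt_version 1.15.6`, with the result passed to `apt-get install -y "terraform=<result>"`. The same approach applies to the kubectl and helm packages.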
|
||||
|
||||
### Pitfall 4: Embedded script files in setup script become stale
**What goes wrong:** The setup script installs skill files and scripts from base64-encoded content embedded in it, but that content drifts from the actual source files in `~/.hermes/`.
**Why it happens:** The setup script is a snapshot at the time of creation; the skill files evolve independently.
**How to avoid:** Source the skill files from the project repo at setup time rather than embedding them. If the skills live in git, copy from the cloned repo. If not, add a snapshot-date comment in the script header noting when the embedded content was frozen.
**Warning signs:** User runs setup in 3 months and gets outdated skills.
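
A repo-first install with an embedded fallback might look like the sketch below. `install_file` is a hypothetical helper; it prefers a file tracked in the cloned repo and only decodes the embedded base64 payload when no repo copy exists.

```shell
# install_file: prefer the copy tracked in the repo; fall back to an embedded
# base64 payload. Arguments: <repo_src> <dest> [base64_payload]
install_file() {
    local repo_src="$1" dest="$2" payload="${3:-}"
    mkdir -p "$(dirname "$dest")"
    if [ -f "$repo_src" ]; then
        cp "$repo_src" "$dest"
    elif [ -n "$payload" ]; then
        # --decode is accepted by both the GNU and macOS base64 tools
        printf '%s' "$payload" | base64 --decode > "$dest"
    else
        echo "No source available for $dest" >&2
        return 1
    fi
}
```

With this shape, the embedded payload only matters when the script is run outside a checkout, which limits how often stale content can be installed.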
|
||||
|
||||
### Pitfall 5: Dockerfile RUN layer cache busting
**What goes wrong:** Changing one tool's version rebuilds every layer after it, even though the other tools are unchanged.
**Why it happens:** Each `RUN` instruction creates a new layer, and Docker invalidates all layers that follow the first instruction whose inputs changed (for example, a download URL containing a new version).
**How to avoid:** Order installs from least-frequently-changed (terraform, aws-cli) to most-frequently-changed (pup, kubectl), so that a version bump only rebuilds the layers after it. Combining all apt installs into one `RUN` reduces the layer count, but then any version change rebuilds every apt-installed tool.
|
||||
|
||||
## Code Examples
|
||||
|
||||
### Dockerfile Complete — Multi-Tool Installation

```dockerfile
# Source: [CITED: Multiple official tool installation docs, see Standard Stack]
# Assumes a linux/amd64 build (x86_64 downloads); multi-arch is deferred.
ARG PYTHON_NODEJS_TAG=python3.11-nodejs22-bookworm
FROM nikolaik/python-nodejs:${PYTHON_NODEJS_TAG}

LABEL description="ngn-agent: Custom Hermes Docker image with platform engineering tools"
LABEL maintainer="ngn-agent"

# Tool versions: pin for reproducibility
ARG AWSCLI_VERSION=2.27.41
ARG TERRAFORM_VERSION=1.15.6
ARG HELM_VERSION=4.2.1
ARG KUBECTL_VERSION=1.36.1
ARG PUP_VERSION=1.1.0

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
        unzip \
        gnupg \
    && rm -rf /var/lib/apt/lists/*

# Install AWS CLI v2 (official installer; no apt repo for v2; pinned via the versioned zip URL)
RUN curl -fsSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64-${AWSCLI_VERSION}.zip" -o awscliv2.zip \
    && unzip -q awscliv2.zip \
    && ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli \
    && rm -rf awscliv2.zip aws/

# Install Terraform (HashiCorp apt repo)
# The codename is read from /etc/os-release, so lsb_release is not required.
# The trailing glob tolerates an optional Debian revision suffix on the package version.
RUN curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(. /etc/os-release && echo "$VERSION_CODENAME") main" \
        | tee /etc/apt/sources.list.d/hashicorp.list \
    && apt-get update && apt-get install -y "terraform=${TERRAFORM_VERSION}*" \
    && rm -rf /var/lib/apt/lists/*

# Install kubectl (Kubernetes community apt repo, pkgs.k8s.io)
RUN curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.36/deb/Release.key | gpg --dearmor -o /usr/share/keyrings/kubernetes-apt-keyring.gpg \
    && echo 'deb [signed-by=/usr/share/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.36/deb/ /' \
        | tee /etc/apt/sources.list.d/kubernetes.list \
    && apt-get update && apt-get install -y "kubectl=${KUBECTL_VERSION}-*" \
    && rm -rf /var/lib/apt/lists/*

# Install Helm (Buildkite-hosted apt repo)
RUN curl -fsSL https://packages.buildkite.com/helm-linux/helm-debian/gpgkey | gpg --dearmor -o /usr/share/keyrings/helm.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/helm.gpg] https://packages.buildkite.com/helm-linux/helm-debian/any/ any main" \
        | tee /etc/apt/sources.list.d/helm-stable-debian.list \
    && apt-get update && apt-get install -y "helm=${HELM_VERSION}*" \
    && rm -rf /var/lib/apt/lists/*

# Install Datadog CLI (pup): Rust binary from GitHub releases
RUN curl -fsSL "https://github.com/DataDog/pup/releases/download/v${PUP_VERSION}/pup_${PUP_VERSION}_Linux_x86_64.tar.gz" \
        -o /tmp/pup.tar.gz \
    && tar xzf /tmp/pup.tar.gz -C /usr/local/bin/ pup \
    && rm -f /tmp/pup.tar.gz

# Verify all installations
RUN echo "=== Tool versions ===" \
    && aws --version \
    && terraform --version \
    && helm version --short \
    && kubectl version --client --output=yaml 2>/dev/null | grep gitVersion \
    && pup --version

# Default command (matching base image behavior)
CMD ["bash"]
```
|
||||
|
||||
### Build Script

```bash
#!/bin/bash
# Source: Project convention
set -euo pipefail

IMAGE_NAME="ngn-agent"
IMAGE_TAG="latest"
# Resolve the directory containing this script so the build works from any cwd
DOCKER_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "==> Building ${IMAGE_NAME}:${IMAGE_TAG}..."

docker build \
    -t "${IMAGE_NAME}:${IMAGE_TAG}" \
    -f "${DOCKER_DIR}/Dockerfile" \
    "${DOCKER_DIR}"

echo "==> Build complete: ${IMAGE_NAME}:${IMAGE_TAG}"
docker images "${IMAGE_NAME}:${IMAGE_TAG}"
```
|
||||
|
||||
### Setup Script: Hermes Config Template Generation

```bash
# Source: Derived from current config.yaml state [VERIFIED: ~/.hermes/config.yaml]
generate_config_yaml() {
    local ssh_key_1="$1"
    local ssh_key_2="$2"
    local ssh_config="$3"
    local ssh_known_hosts="$4"
    local repo_ops="$5"
    local repo_deploy="$6"
    local repo_devtools="$7"
    local timezone="$8"
    local docker_image="$9"
    local config_path="$HOME/.hermes/config.yaml"

    # Back up the existing config before modifying it
    if [ -f "$config_path" ]; then
        cp "$config_path" "${config_path}.bak.$(date +%Y%m%d_%H%M%S)"
        echo "  → Backed up existing config.yaml"
    fi

    # Use hermes config set for simple keys
    hermes config set terminal.backend docker
    hermes config set terminal.docker_image "${docker_image}"
    hermes config set terminal.cwd /workspace
    hermes config set terminal.container_memory 5120
    hermes config set terminal.container_disk 51200
    hermes config set terminal.container_cpu 1
    hermes config set terminal.lifetime_seconds 300
    hermes config set memory.provider hindsight
    hermes config set timezone "${timezone}"
    hermes config set telegram.reactions false
    hermes config set terminal.docker_env.AWS_REGION us-east-1

    # docker_volumes and shell_init_files are arrays, so edit them with Python.
    # Requires python3 with PyYAML on the host.
    # Paths are passed as arguments instead of being interpolated into the Python
    # source, and the home directory is resolved at run time rather than hardcoded.
    python3 - "$config_path" "$ssh_key_1" "$ssh_key_2" "$ssh_config" "$ssh_known_hosts" \
        "$repo_ops" "$repo_deploy" "$repo_devtools" <<'PY'
import os
import sys

import yaml

path, key1, key2, ssh_config, known_hosts, ops, deploy, devtools = sys.argv[1:9]
home = os.path.expanduser('~')

with open(path) as f:
    config = yaml.safe_load(f) or {}

terminal = config.setdefault('terminal', {})
terminal['docker_volumes'] = [
    f'{key1}:/root/.ssh/id_ed25519razer:ro',
    f'{key2}:/root/.ssh/id_rsa:ro',
    f'{ssh_config}:/root/.ssh/config:ro',
    f'{known_hosts}:/root/.ssh/known_hosts:ro',
    f'{home}/.aws/config:/root/.aws/config:ro',
    f'{home}/.aws/sso/cache:/root/.aws/sso/cache:rw',
    f'{ops}:/workspace/rai-ops:rw',
    f'{deploy}:/workspace/rai-deployment:rw',
    f'{devtools}:/workspace/rai-devtools:rw',
    f'{home}/.hermes/scripts:/usr/local/bin:ro',
]
terminal['docker_forward_env'] = ['JIRA_EMAIL', 'JIRA_API_TOKEN', 'DEFAULT_REPOS']
terminal['shell_init_files'] = ['/usr/local/bin/session-init.sh']

with open(path, 'w') as f:
    yaml.dump(config, f, default_flow_style=False, sort_keys=False)
PY
}
```
|
||||
|
||||
## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Terraform pre-1.0 CLI syntax | Terraform 1.x stable CLI | 2021 | Current terraform 1.15.6 uses stable HCL syntax, so there are no migration issues |
| AWS CLI v1 (Python pip package) | AWS CLI v2 (self-contained installer) | 2020 | v1 was `pip install awscli`; v2 is downloaded as a zip and installed with its bundled script, so the Dockerfile must use the official installer |
| Helm 2 (Tiller-based) | Helm 3/4 (client-only) | 2019/2025 | Helm 4.2.1 has no Tiller, which gives a simpler security model. APT repo install is the same as for Helm 3 |
| Pup CLI (pre-v1.0) | Pup 1.1.0 stable | 2026-06 | Pup now has a stable release with OAuth2 auth; prebuilt binaries available |

**Deprecated/outdated:**

- `python3.11-nodejs20` base image tag: No longer published by maintainer. Use `python3.11-nodejs22-bookworm` instead.
- Snap packages in Docker: Snap requires systemd. Don't use `snap install aws-cli --classic` in Dockerfile.
|
||||
|
||||
## Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|-------|---------|---------------|
| A1 | The base image tag `python3.11-nodejs20` is no longer available on Docker Hub | Standard Stack | Build fails with manifest not found; must verify and update tag |
| A2 | Hermes v0.16+ `hermes config set` supports setting nested paths like `terminal.docker_image` | Don't Hand-Roll | May need to use Python/sed instead; minimal impact, fallback exists |
| A3 | Hermes v0.16+ `hermes cron create` matches Phase 8 documented syntax | Common Pitfalls | Cron job registration may fail with different syntax |
| A4 | All five skills can be embedded in the setup script | Architecture Patterns | Skills directory structure may have hidden files (metadata files) that aren't SKILL.md |
| A5 | The macOS host running the setup script has `python3` with PyYAML available for YAML manipulation | Code Examples | Would need to use `pip install pyyaml` in setup script or use `yq` instead |

## Open Questions

1. **Is `hermes config set` capable of setting YAML array values (like `docker_volumes`)?**
   - What we know: `hermes config set terminal.docker_image ngn-agent:latest` works for simple key-value.
   - What's unclear: Whether it can set array elements or only scalar values.
   - Recommendation: Plan uses `hermes config set` for scalars, Python/sed for arrays. Test `hermes config set terminal.docker_volumes` — if it fails (likely), fall back to Python YAML manipulation.

2. **What is the correct kubectl apt package version string for version pinning?**
   - What we know: The Kubernetes apt repo at `pkgs.k8s.io` provides kubectl for v1.36.
   - What's unclear: The exact `apt-get install kubectl=1.36.1-*` format vs just `kubectl=1.36.1`.
   - Recommendation: Use `apt-get install -y kubectl` without version pinning for kubectl, since it's meant to match the cluster version which may not be the absolute latest.

3. **Should the setup script embed skill/script content as base64 or reference files from the git repo?**
   - What we know: The setup script needs to create 5 skills + 2 scripts on a fresh machine.
   - What's unclear: Whether these files exist in the ngn-agent git repo or only in `~/.hermes/`.
   - Recommendation: If skills are in `.planning/phases/` commit history, extract them at setup time from the git repo. If not, embed as base64. The Hermes skills live at `~/.hermes/skills/ngn-agent/` — check if they're tracked in git.
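If pinning kubectl turns out to be wanted after all (question 2), the exact version string can be read from the repo metadata rather than guessed. The sketch below assumes the `pkgs.k8s.io` repo is already configured in the image and that `apt-cache madison` prints its usual `package | version | source` columns. `pick_version` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `apt-cache madison <pkg>` output on stdin and print
# the first version string that starts with the requested prefix.
pick_version() {
  local prefix="$1"
  awk -F'|' -v p="$prefix" '
    { v = $2; gsub(/[ \t]/, "", v) }          # second column holds the version
    index(v, p) == 1 { print v; found = 1; exit }
    END { exit (found ? 0 : 1) }              # non-zero when nothing matched
  '
}

# Intended use inside the image build (not run here):
#   ver="$(apt-cache madison kubectl | pick_version 1.36.1)"
#   apt-get install -y "kubectl=${ver}"
```

Resolving the string at build time sidesteps the question of which revision suffix the repo uses, at the cost of one extra `apt-cache` call.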

## Environment Availability

> Skip this section — the phase has no external dependencies that need runtime probing. Docker image build requires Docker (verified: `Docker version 29.4.0, build 9d7ad9f` — available). Setup script runs on macOS target with bash built-in, `hermes` CLI assumed present (v0.16+).

## Validation Architecture

> Skipped — `workflow.nyquist_validation` is explicitly `false` in `.planning/config.json`.

## Security Domain

> The `security_enforcement` key is absent from `.planning/config.json` (default: enabled).

### Applicable ASVS Categories

| ASVS Category | Applies | Standard Control |
|---------------|---------|-----------------|
| V5 Input Validation | yes | Setup script validates secret non-empty before accepting; validates SSH key paths exist |
| V6 Stored Cryptography | partial | Secrets stored in `~/.hermes/.env` (plaintext file at rest). Acceptable for local machine — Hermes itself manages file permissions. |
| V1.6 Cryptographic Architecture | no | No custom crypto — tools use their own auth mechanisms |
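The two input checks named in the V5 row could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, and the setup script may structure its prompt flow differently.

```shell
#!/usr/bin/env bash
# Hypothetical validators for the setup script's prompt flow.

# Succeeds only when the value is non-empty.
require_nonempty() {
  local name="$1" value="$2"
  if [ -z "$value" ]; then
    echo "error: ${name} must not be empty" >&2
    return 1
  fi
}

# Succeeds only when the path points at an existing, readable regular file.
require_key_file() {
  local path="$1"
  # Expand a leading "~/" by hand: a tilde typed at a prompt is not expanded by the shell.
  case "$path" in
    "~/"*) path="${HOME}/${path#"~/"}" ;;
  esac
  if [ ! -f "$path" ] || [ ! -r "$path" ]; then
    echo "error: SSH key not found or unreadable: ${path}" >&2
    return 1
  fi
}
```

Both functions report through their exit status, so a prompt loop can repeat the question until the check passes.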

### Known Threat Patterns for Dockerfile + Setup Script

| Pattern | STRIDE | Standard Mitigation |
|---------|--------|---------------------|
| Secret exposure in terminal history | Information Disclosure | Setup script uses `read -s` (masked input, no echo). Secrets are never echoed to terminal. |
| Config file world-readable permissions | Information Disclosure | Script sets `chmod 600` on `~/.hermes/.env` after writing |
| Man-in-the-middle on tool download | Tampering | Dockerfile uses HTTPS for all downloads (AWS S3, GitHub, HashiCorp, pkgs.k8s.io, Buildkite). GPG signature verification in apt repos. |
| Accidental build context leak | Information Disclosure | Dockerfile should not `COPY .` — use `COPY docker/` only to avoid leaking `.env` or other secrets into image layers |
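The first two mitigations in the table can be sketched as a pair of small helpers. This is an illustration only: the function names and the `EXAMPLE_KEY` variable are hypothetical, and `read -s` is a bash feature, so the script would need a bash shebang.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the example variable are illustrative.

# Prompt without echoing the typed characters (bash `read -s`).
# The prompt text goes to stderr so only the secret lands on stdout.
prompt_secret() {
  local name="$1" value=""
  printf 'Enter %s: ' "$name" >&2
  IFS= read -r -s value
  printf '\n' >&2
  printf '%s' "$value"
}

# Append NAME=VALUE to the env file, making sure it is owner-only first.
write_env_entry() {
  local file="$1" name="$2" value="$3"
  ( umask 077 && touch "$file" )   # a newly created file gets mode 0600
  chmod 600 "$file"                # tighten a pre-existing file as well
  printf '%s=%s\n' "$name" "$value" >> "$file"
}
```

A call such as `write_env_entry "$HOME/.hermes/.env" EXAMPLE_KEY "$(prompt_secret EXAMPLE_KEY)"` restricts permissions before the secret is written, so the value never sits in a world-readable file, even briefly.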

## Sources

### Primary (HIGH confidence)
- [AWS CLI v2 Linux install](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) — verified official curl→unzip→install procedure
- [Terraform Linux install](https://developer.hashicorp.com/terraform/install#linux) — verified HashiCorp apt repo method + version 1.15.6
- [Helm install via apt](https://helm.sh/docs/intro/install/#from-apt-debianubuntu) — verified Buildkite apt repo method + version 4.2.1
- [kubectl Linux install](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/) — verified `pkgs.k8s.io` Kubernetes apt repo method
- [Kubernetes releases](https://kubernetes.io/releases/) — verified latest stable v1.36.1
- [Pup CLI releases](https://github.com/DataDog/pup/releases/tag/v1.1.0) — verified latest version 1.1.0
- [nikolaik/python-nodejs Docker Hub](https://hub.docker.com/r/nikolaik/python-nodejs) — verified available tags and versions
- [Current ~/.hermes/config.yaml](file://~/.hermes/config.yaml) — verified 565-line source of truth for setup script
- [Current ~/.hermes/.env](file://~/.hermes/.env) — verified 484-line env template
- [Current ~/.hermes/hindsight/config.json](file://~/.hermes/hindsight/config.json) — verified JSON file content
- [Current session-init.sh](file://~/.hermes/scripts/session-init.sh) — verified 37-line script
- [Current archive-stale-sessions.sh](file://~/.hermes/scripts/archive-stale-sessions.sh) — verified 41-line script
- [Phase 8 cron registration patterns](file:///Users/bapung/Razer/ngn-agent/.planning/phases/08-cron-reporting/08-01-SUMMARY.md) — verified `hermes cron create` CLI syntax

### Secondary (MEDIUM confidence)
- [Hermes research: Extensibility](file:///Users/bapung/Razer/ngn-agent/.planning/research/hermes/EXTENSIBILITY.md) — verified hermes config set capability
- [Base image GitHub repo](https://github.com/nikolaik/docker-python-nodejs) — confirmed tag generation pattern

### Tertiary (LOW confidence)
- (none — all claims verified via official sources or current config files)

## Metadata

**Confidence breakdown:**
- Standard stack: HIGH — all tool versions verified from official sources
- Architecture: HIGH — patterns derived from current working configuration
- Pitfalls: HIGH — based on documented deprecations and Phase 8 execution knowledge
- Base image tag availability: MEDIUM — need to confirm `python3.11-nodejs20` tag status at build time

**Research date:** 2026-06-15
**Valid until:** 2026-07-15 (tool versions may receive patch updates within 30 days; base image tags may change if maintainer drops more versions)