# Phase 5: Hindsight Memory Provider - Context

**Gathered:** 2026-06-14
**Status:** Ready for planning

## Phase Boundary

Enable Hindsight as the cross-session memory provider for ngn-agent. Replace the built-in MEMORY.md/USER.md with Hindsight's entity-aware knowledge graph memory for persistent recall across sessions. Pure configuration change: no code, no new infrastructure, no Docker image changes.

**In scope:** Hindsight memory provider activation, configuration tuning for latency sensitivity, integration with existing Hermes agent loop

**Out of scope:** Migration of existing MEMORY.md/USER.md data, custom memory provider implementation, multi-provider setup, session lifecycle features (Phase 7), cron reporting (Phase 8)

## Implementation Decisions

### Hindsight Mode

- **D-01:** Use **local embedded** mode, in which Hermes spins up a local PostgreSQL daemon. No data is sent to an external memory service. Use the existing OpenRouter API key (`OPENROUTER_API_KEY` in `~/.hermes/.env`) for LLM extraction via `HINDSIGHT_LLM_API_KEY`.
  - Config: `mode: local_embedded` in `~/.hermes/hindsight/config.json`
  - Env: `HINDSIGHT_LLM_API_KEY=sk-or-v1-...` (same as existing OpenRouter key)
  - Provider: `openrouter` with model per-provider default

### Memory Integration

- **D-02:** Use **default (hybrid)** mode: auto-inject relevant memories before each turn and expose all 3 Hindsight tools (`hindsight_retain`, `hindsight_recall`, `hindsight_reflect`) to the agent.
  - Config: `memory_mode: hybrid` (default)

### Migration

- **D-03:** No migration from built-in memory. MEMORY.md/USER.md will continue to work in parallel as a fallback: built-in memory writes via the `memory` tool still work alongside Hindsight. Hindsight does not sync with built-in memory. This is acceptable because Hindsight will build its own knowledge graph from new sessions.

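The key reuse in D-01 can be sketched as a small text transformation on the contents of `~/.hermes/.env`. This is only an illustration: `with_hindsight_key` is a hypothetical helper, not part of Hermes or Hindsight, and it assumes the file contains plain `KEY=VALUE` lines without quoting or `export` prefixes.

```python
def with_hindsight_key(env_text: str) -> str:
    """Return env_text with HINDSIGHT_LLM_API_KEY set to the OpenRouter key.

    Assumption: plain KEY=VALUE lines, no quoting, no `export` prefixes.
    An existing HINDSIGHT_LLM_API_KEY entry is left untouched.
    """
    values = {}
    for line in env_text.splitlines():
        stripped = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        values[key.strip()] = value.strip()

    if "HINDSIGHT_LLM_API_KEY" in values:
        return env_text  # already configured, nothing to do
    if "OPENROUTER_API_KEY" not in values:
        raise KeyError("OPENROUTER_API_KEY not found")

    # Make sure the new entry starts on its own line.
    separator = "" if env_text.endswith("\n") else "\n"
    return (
        f"{env_text}{separator}"
        f"HINDSIGHT_LLM_API_KEY={values['OPENROUTER_API_KEY']}\n"
    )
```

The function is idempotent, so running it against an already updated file returns the text unchanged.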
### Recall Settings (latency-optimized)

- **D-04:** `recall_budget: low` - fastest retrieval, minimal latency overhead per turn
- **D-05:** `recall_prefetch_method: recall` - raw fact search (no LLM synthesis in the hot path)
- **D-06:** `auto_recall: true` - still auto-inject context before each turn
- **D-07:** `recall_types: observation` - default (observations only, denser per token)

### Retain Settings (latency-optimized)

- **D-08:** `retain_async: true` - processing happens in the background and never blocks the agent loop
- **D-09:** `retain_every_n_turns: 5` - extract memories every 5 turns instead of every turn, ~80% overhead reduction
- **D-10:** `auto_retain: true` - automatic retention active

### Agent's Discretion

- **Bank configuration** (`bank_id`, `bank_mission`, `bank_retain_mission`): Use defaults (`bank_id: hermes`, no missions). Planner can recommend tuning if needed.
- **Daemon startup logs and runtime monitoring**: Standard Hermes daemon management applies (`~/.hermes/logs/hindsight-embed.log`, `~/.hindsight/profiles/`).

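Decisions D-01, D-02, and D-04 through D-10 can be gathered into a single settings object, and the ~80% figure in D-09 follows from running extraction on one turn in five. The sketch below is a minimal illustration under two assumptions: that `~/.hermes/hindsight/config.json` uses a flat layout with these keys at the top level, and that the boolean options take JSON booleans. The plugin README remains the authority on the actual schema. `build_hindsight_config` and `retain_overhead_reduction` are hypothetical helper names.

```python
import json


def build_hindsight_config() -> dict:
    """Collect decisions D-01, D-02, and D-04..D-10 into one dict.

    Assumption: keys sit at the top level of config.json; check the
    plugin README for the real schema before relying on this.
    """
    return {
        "mode": "local_embedded",            # D-01
        "memory_mode": "hybrid",             # D-02
        "recall_budget": "low",              # D-04
        "recall_prefetch_method": "recall",  # D-05
        "auto_recall": True,                 # D-06
        "recall_types": "observation",       # D-07
        "retain_async": True,                # D-08
        "retain_every_n_turns": 5,           # D-09
        "auto_retain": True,                 # D-10
    }


def retain_overhead_reduction(every_n_turns: int) -> float:
    """Fraction of per-turn extraction runs avoided by batching retains."""
    if every_n_turns < 1:
        raise ValueError("every_n_turns must be >= 1")
    return 1.0 - 1.0 / every_n_turns


if __name__ == "__main__":
    config = build_hindsight_config()
    print(json.dumps(config, indent=2))
    # 1 - 1/5 = 0.8, i.e. the ~80% reduction quoted in D-09.
    print(f"{retain_overhead_reduction(config['retain_every_n_turns']):.0%}")
```

Keeping the decision IDs as comments next to each key makes it easy to trace a tuned value back to the decision that set it.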
## Canonical References

**Downstream agents MUST read these before planning or implementing.**

### Hindsight Plugin

- `~/.hermes/hermes-agent/plugins/memory/hindsight/README.md` - Full Hindsight configuration reference with all options, environment variables, and modes
- `~/.hermes/hermes-agent/plugins/memory/hindsight/plugin.yaml` - Plugin metadata, pip dependencies (`hindsight-client>=0.4.22`), hooks

### Hermes Memory System

- `~/.hermes/hermes-agent/plugins/memory/__init__.py` §12 - "Only ONE provider can be active at a time" constraint (`memory.provider` config)
- `~/.hermes/hermes-agent/agent/memory_manager.py` lines 342-354 - Provider conflict logic (second external provider silently rejected)

### Project Documents

- `.planning/REQUIREMENTS.md` §MEM-01 - Requirement definition with verification criteria
- `.planning/ROADMAP.md` §Phase 5 - Phase goal and success criteria
- `.planning/PROJECT.md` - Core value, constraints, existing Hermes config
- `.planning/research/SUMMARY.md` §Phase 1 - Research findings on Hindsight setup, pitfalls, and recommendations
- `.planning/research/PITFALLS.md` §Memory Provider Conflict - Critical pitfall on setting multiple providers

### Existing Configuration

- `~/.hermes/config.yaml` §memory - Current memory config (provider field to change, existing settings to preserve)
- `~/.hermes/.env` - Existing environment variables (OpenRouter key to reuse for `HINDSIGHT_LLM_API_KEY`)

## Existing Code Insights

### Reusable Assets

- **Hindsight plugin** (`plugins/memory/hindsight/`): Already bundled in Hermes v0.16.0, production-ready, requires only config change + pip install
- **Built-in memory** (MEMORY.md/USER.md): Continues working in parallel as fallback, no code changes needed

### Established Patterns

- **Single provider constraint** (`plugins/memory/__init__.py:12`): Hermes enforces only one external memory provider. Setting `memory.provider: hindsight` replaces built-in as the active external provider
- **Async retain** (`memory_provider.py:115-131`): Memory retains are async and non-blocking by default, which confirms our latency-optimized approach is supported upstream

### Integration Points

- `~/.hermes/config.yaml` §`memory.provider` - Change from default to `hindsight`
- `~/.hermes/.env` - Add `HINDSIGHT_LLM_API_KEY` env var (reuse existing OpenRouter key)
- `~/.hermes/hindsight/config.json` - Create with our latency-optimized settings (auto-generated by `hermes memory setup`, then tuned)

## Specific Ideas

No specific requirements; open to standard approaches. Follow the Hindsight plugin README setup instructions, then apply the latency-optimized config overrides from decisions D-04 through D-10.

## Deferred Ideas

None. Discussion stayed within phase scope.

---

*Phase: 5-Hindsight Memory Provider*
*Context gathered: 2026-06-14*