# Phase 5: Hindsight Memory Provider - Context

Gathered: 2026-06-14

Status: Ready for planning
## Phase Boundary

Enable Hindsight as the cross-session memory provider for ngn-agent. Replace the built-in MEMORY.md/USER.md with Hindsight's entity-aware knowledge graph memory for persistent recall across sessions. This is a pure configuration change: no code, no new infrastructure, no Docker image changes.
In scope: Hindsight memory provider activation, configuration tuning for latency sensitivity, integration with existing Hermes agent loop
Out of scope: Migration of existing MEMORY.md/USER.md data, custom memory provider implementation, multi-provider setup, session lifecycle features (Phase 7), cron reporting (Phase 8)
## Implementation Decisions

### Hindsight Mode

- D-01: Use local embedded mode. Hermes spins up a local PostgreSQL daemon. No data is sent externally. Use the existing OpenRouter API key (`OPENROUTER_API_KEY` in `~/.hermes/.env`) for LLM extraction via `HINDSIGHT_LLM_API_KEY`.
  - Config: `mode: local_embedded` in `~/.hermes/hindsight/config.json`
  - Env: `HINDSIGHT_LLM_API_KEY=sk-or-v1-...` (same as the existing OpenRouter key)
  - Provider: `openrouter` with the per-provider default model
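The key reuse in D-01 can be sketched as a small helper that copies the value of `OPENROUTER_API_KEY` into a new `HINDSIGHT_LLM_API_KEY` line. This is a minimal illustration only: the function name `reuse_openrouter_key` is hypothetical, and it assumes the `.env` file uses plain `NAME=value` lines without `export` prefixes or quoting.

```python
def reuse_openrouter_key(env_text: str) -> str:
    """Return env_text with HINDSIGHT_LLM_API_KEY set to the OpenRouter key.

    Assumes simple NAME=value lines (an assumption about the .env layout).
    """
    values = {}
    for line in env_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, value = stripped.partition("=")
        values[name.strip()] = value.strip()

    if "OPENROUTER_API_KEY" not in values:
        raise KeyError("OPENROUTER_API_KEY not found in env text")
    if "HINDSIGHT_LLM_API_KEY" in values:
        # Already configured: leave the text untouched.
        return env_text

    separator = "" if env_text.endswith("\n") else "\n"
    new_line = f"HINDSIGHT_LLM_API_KEY={values['OPENROUTER_API_KEY']}\n"
    return env_text + separator + new_line
```

The helper works on text rather than on the file itself, so the result can be reviewed before anything is written back to `~/.hermes/.env`. Calling it a second time on its own output returns the text unchanged.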
### Memory Integration

- D-02: Use the default (hybrid) mode: auto-inject relevant memories before each turn and expose all 3 Hindsight tools (`hindsight_retain`, `hindsight_recall`, `hindsight_reflect`) to the agent.
  - Config: `memory_mode: hybrid` (default)
### Migration

- D-03: No migration from built-in memory. MEMORY.md/USER.md will continue to work in parallel as a fallback: built-in memory writes via the `memory` tool still work alongside Hindsight. Hindsight does not sync with built-in memory. This is acceptable because Hindsight will build its own knowledge graph from new sessions.
### Recall Settings (latency-optimized)

- D-04: `recall_budget: low` - fastest retrieval, minimal latency overhead per turn
- D-05: `recall_prefetch_method: recall` - raw fact search (no LLM synthesis in the hot path)
- D-06: `auto_recall: true` - still auto-inject context before each turn
- D-07: `recall_types: observation` - default (observations only, denser per token)
### Retain Settings (latency-optimized)

- D-08: `retain_async: true` - processing happens in the background and never blocks the agent loop
- D-09: `retain_every_n_turns: 5` - extract memories every 5 turns instead of every turn, ~80% overhead reduction
- D-10: `auto_retain: true` - automatic retention active
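The ~80% figure in D-09 follows from simple counting, sketched below. The sketch assumes that retain overhead scales linearly with the number of extraction runs and that one run fires on every n-th turn; both are simplifying assumptions, and the function names are hypothetical.

```python
def extraction_runs(total_turns: int, every_n_turns: int) -> int:
    """Count extraction runs if one run fires on every n-th turn."""
    return total_turns // every_n_turns


def overhead_reduction_percent(total_turns: int, every_n_turns: int) -> float:
    """Percent fewer extraction runs compared with extracting on every turn."""
    baseline = extraction_runs(total_turns, 1)
    tuned = extraction_runs(total_turns, every_n_turns)
    return 100 * (baseline - tuned) / baseline


# 100 turns: 100 runs at every turn versus 20 runs at every 5th turn.
print(overhead_reduction_percent(100, 5))  # prints 80.0
```

Because `retain_async: true` already moves extraction off the agent loop, this reduction mainly affects background LLM calls, not per-turn latency.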
### Agent's Discretion

- Bank configuration (`bank_id`, `bank_mission`, `bank_retain_mission`): Use defaults (`bank_id: hermes`, no missions). Planner can recommend tuning if needed.
- Daemon startup logs and runtime monitoring: Standard Hermes daemon management applies (`~/.hermes/logs/hindsight-embed.log`, `~/.hindsight/profiles/`).
<canonical_refs>
## Canonical References

Downstream agents MUST read these before planning or implementing.

### Hindsight Plugin

- `~/.hermes/hermes-agent/plugins/memory/hindsight/README.md` - Full Hindsight configuration reference with all options, environment variables, and modes
- `~/.hermes/hermes-agent/plugins/memory/hindsight/plugin.yaml` - Plugin metadata, pip dependencies (`hindsight-client>=0.4.22`), hooks

### Hermes Memory System

- `~/.hermes/hermes-agent/plugins/memory/__init__.py` §12 - "Only ONE provider can be active at a time" constraint (`memory.provider` config)
- `~/.hermes/hermes-agent/agent/memory_manager.py` lines 342-354 - Provider conflict logic (second external provider silently rejected)

### Project Documents

- `.planning/REQUIREMENTS.md` §MEM-01 - Requirement definition with verification criteria
- `.planning/ROADMAP.md` §Phase 5 - Phase goal and success criteria
- `.planning/PROJECT.md` - Core value, constraints, existing Hermes config
- `.planning/research/SUMMARY.md` §Phase 1 - Research findings on Hindsight setup, pitfalls, and recommendations
- `.planning/research/PITFALLS.md` §Memory Provider Conflict - Critical pitfall on setting multiple providers

### Existing Configuration

- `~/.hermes/config.yaml` §memory - Current memory config (provider field to change, existing settings to preserve)
- `~/.hermes/.env` - Existing environment variables (OpenRouter key to reuse for `HINDSIGHT_LLM_API_KEY`)

</canonical_refs>
<code_context>
## Existing Code Insights

### Reusable Assets

- Hindsight plugin (`plugins/memory/hindsight/`): Already bundled in Hermes v0.16.0, production-ready, requires only a config change + pip install
- Built-in memory (MEMORY.md/USER.md): Continues working in parallel as a fallback, no code changes needed

### Established Patterns

- Single provider constraint (`plugins/memory/__init__.py:12`): Hermes enforces only one external memory provider. Setting `memory.provider: hindsight` replaces built-in as the active external provider
- Async retain (`memory_provider.py:115-131`): Memory retains are async and non-blocking by default, which confirms our latency-optimized approach is supported upstream
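The single-provider constraint noted above can be pictured with a toy registry. This is a hypothetical model written for illustration, not the code in `memory_manager.py`: the class name and method are assumptions, and the only behavior taken from the notes is that a second external provider is rejected without raising an error.

```python
from typing import Optional


class SingleProviderRegistry:
    """Hypothetical model of the constraint; not the Hermes implementation."""

    def __init__(self) -> None:
        self.active: Optional[str] = None

    def register(self, name: str) -> bool:
        """Activate `name` only if no external provider is active yet."""
        if self.active is not None:
            # A provider is already active: reject quietly instead of
            # raising, mirroring the "silently rejected" behavior.
            return False
        self.active = name
        return True
```

Under this model, registering `hindsight` first returns `True`, and any later registration returns `False` while `active` stays `hindsight`. The practical consequence for this phase is that a misconfigured second provider would produce no error, so the active provider should be verified explicitly after setup.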
### Integration Points

- `~/.hermes/config.yaml` §memory.provider - Change from default to `hindsight`
- `~/.hermes/.env` - Add `HINDSIGHT_LLM_API_KEY` env var (reuse existing OpenRouter key)
- `~/.hermes/hindsight/config.json` - Create with our latency-optimized settings (auto-generated by `hermes memory setup`, then tuned)

</code_context>
No specific requirements — open to standard approaches. Follow the Hindsight plugin README setup instructions, then apply the latency-optimized config overrides from decisions D-04 through D-10.
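Applying the overrides from D-04 through D-10 on top of a generated config can be sketched as a dictionary merge. The sketch assumes that `config.json` stores these settings as flat top-level keys with the value types shown; that layout, the helper name `apply_latency_overrides`, and the sample `generated` values are assumptions and should be checked against the plugin README.

```python
import json

# Latency-optimized overrides, keyed by decision ID.
LATENCY_OVERRIDES = {
    "recall_budget": "low",              # D-04
    "recall_prefetch_method": "recall",  # D-05
    "auto_recall": True,                 # D-06
    "recall_types": "observation",       # D-07
    "retain_async": True,                # D-08
    "retain_every_n_turns": 5,           # D-09
    "auto_retain": True,                 # D-10
}


def apply_latency_overrides(generated: dict) -> dict:
    """Return a copy of the generated config with D-04..D-10 applied."""
    tuned = dict(generated)  # copy so the input is not mutated
    tuned.update(LATENCY_OVERRIDES)
    return tuned


# Hypothetical stand-in for a generated config file.
generated = {"mode": "local_embedded", "memory_mode": "hybrid"}
print(json.dumps(apply_latency_overrides(generated), indent=2))
```

Settings that are not part of the overrides, such as `mode` and `memory_mode`, pass through untouched, which matches the intent of tuning the generated file instead of replacing it.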
## Deferred Ideas

None. Discussion stayed within phase scope.
Phase: 5-Hindsight Memory Provider

Context gathered: 2026-06-14