Phase 5: Hindsight Memory Provider - Context

Gathered: 2026-06-14
Status: Ready for planning

## Phase Boundary

Enable Hindsight as the cross-session memory provider for ngn-agent. Hindsight's entity-aware knowledge graph memory takes over persistent recall across sessions from the built-in MEMORY.md/USER.md, which remain available as a fallback (see D-03). This is a pure configuration change: no code, no new infrastructure, no Docker image changes.

In scope: Hindsight memory provider activation, configuration tuning for latency sensitivity, integration with existing Hermes agent loop

Out of scope: Migration of existing MEMORY.md/USER.md data, custom memory provider implementation, multi-provider setup, session lifecycle features (Phase 7), cron reporting (Phase 8)

## Implementation Decisions

Hindsight Mode

  • D-01: Use local embedded mode. Hermes spins up a local PostgreSQL daemon, so no data is sent to an external memory service. LLM extraction reuses the existing OpenRouter API key (OPENROUTER_API_KEY in ~/.hermes/.env), supplied as HINDSIGHT_LLM_API_KEY.
    • Config: mode: local_embedded in ~/.hermes/hindsight/config.json
    • Env: HINDSIGHT_LLM_API_KEY=sk-or-v1-... (same as existing OpenRouter key)
    • Provider: openrouter with model per-provider default
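
The key reuse in D-01 can be sketched as a small text transform over the `.env` contents. This is a minimal sketch, not part of Hermes or Hindsight: the helper name `with_hindsight_key` is hypothetical, and it assumes `~/.hermes/.env` holds plain `KEY=value` lines without quoting or `export` prefixes.

```python
def with_hindsight_key(env_text: str) -> str:
    """Return .env text in which HINDSIGHT_LLM_API_KEY mirrors OPENROUTER_API_KEY.

    Hypothetical helper; assumes simple KEY=value lines.
    """
    entries = {}
    for line in env_text.splitlines():
        stripped = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        entries[key.strip()] = value.strip()

    if "HINDSIGHT_LLM_API_KEY" in entries:
        return env_text  # already configured; leave the text untouched
    if "OPENROUTER_API_KEY" not in entries:
        raise KeyError("OPENROUTER_API_KEY not found in .env text")

    # Make sure the appended line starts on its own line.
    separator = "" if env_text.endswith("\n") else "\n"
    return (
        f"{env_text}{separator}"
        f"HINDSIGHT_LLM_API_KEY={entries['OPENROUTER_API_KEY']}\n"
    )
```

The function is idempotent: running it on text that already defines HINDSIGHT_LLM_API_KEY returns the text unchanged, so it is safe to re-run during setup.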

Memory Integration

  • D-02: Use default (hybrid) mode — auto-inject relevant memories before each turn + expose all 3 hindsight tools (hindsight_retain, hindsight_recall, hindsight_reflect) to the agent
    • Config: memory_mode: hybrid (default)

Migration

  • D-03: No migration from built-in memory. MEMORY.md/USER.md keep working in parallel as a fallback: writes through the built-in memory tool still succeed alongside Hindsight, but the two stores do not sync. This is acceptable because Hindsight will build its own knowledge graph from new sessions.

Recall Settings (latency-optimized)

  • D-04: recall_budget: low — fastest retrieval, minimal latency overhead per turn
  • D-05: recall_prefetch_method: recall — raw fact search (no LLM synthesis in the hot path)
  • D-06: auto_recall: true — still auto-inject context before each turn
  • D-07: recall_types: observation — default (observations only, denser per token)

Retain Settings (latency-optimized)

  • D-08: retain_async: true — processing happens in background, never blocks agent loop
  • D-09: retain_every_n_turns: 5 — extract memories every 5 turns instead of every turn, ~80% overhead reduction
  • D-10: auto_retain: true — automatic retention active
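
The "~80%" figure in D-09 follows from simple arithmetic, sketched below. The sketch assumes extraction fires on every n-th turn counted from 1; the plugin's actual counting rule is not confirmed here, and both function names are hypothetical.

```python
def retain_turns(total_turns: int, every_n: int) -> list[int]:
    """1-indexed turns on which extraction would run (assumed: every n-th turn)."""
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    return [t for t in range(1, total_turns + 1) if t % every_n == 0]


def extraction_reduction(every_n: int) -> float:
    """Fraction of extraction runs avoided compared with retaining on every turn."""
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    return 1 - 1 / every_n


# With retain_every_n_turns: 5, a 20-turn session extracts on 4 turns, not 20.
print(retain_turns(20, 5))  # [5, 10, 15, 20]
print(f"{extraction_reduction(5):.0%}")
```

Under that assumption, 4 extraction runs replace 20, which is the 80% reduction quoted in D-09. The saving applies to extraction work only; auto-recall (D-06) still runs before every turn.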

Agent's Discretion

  • Bank configuration (bank_id, bank_mission, bank_retain_mission): Use defaults (bank_id: hermes, no missions). Planner can recommend tuning if needed.
  • Daemon startup logs and runtime monitoring: Standard Hermes daemon management applies (~/.hermes/logs/hindsight-embed.log, ~/.hindsight/profiles/).

<canonical_refs>

Canonical References

Downstream agents MUST read these before planning or implementing.

Hindsight Plugin

  • ~/.hermes/hermes-agent/plugins/memory/hindsight/README.md — Full Hindsight configuration reference with all options, environment variables, and modes
  • ~/.hermes/hermes-agent/plugins/memory/hindsight/plugin.yaml — Plugin metadata, pip dependencies (hindsight-client>=0.4.22), hooks

Hermes Memory System

  • ~/.hermes/hermes-agent/plugins/memory/__init__.py §12 — "Only ONE provider can be active at a time" constraint (memory.provider config)
  • ~/.hermes/hermes-agent/agent/memory_manager.py lines 342-354 — Provider conflict logic (second external provider silently rejected)

Project Documents

  • .planning/REQUIREMENTS.md §MEM-01 — Requirement definition with verification criteria
  • .planning/ROADMAP.md §Phase 5 — Phase goal and success criteria
  • .planning/PROJECT.md — Core value, constraints, existing Hermes config
  • .planning/research/SUMMARY.md §Phase 1 — Research findings on hindsight setup, pitfalls, and recommendations
  • .planning/research/PITFALLS.md §Memory Provider Conflict — Critical pitfall on setting multiple providers

Existing Configuration

  • ~/.hermes/config.yaml §memory — Current memory config (provider field to change, existing settings to preserve)
  • ~/.hermes/.env — Existing environment variables (OpenRouter key to reuse for HINDSIGHT_LLM_API_KEY)

</canonical_refs>

<code_context>

Existing Code Insights

Reusable Assets

  • Hindsight plugin (plugins/memory/hindsight/): Already bundled in Hermes v0.16.0, production-ready, requires only config change + pip install
  • Built-in memory (MEMORY.md/USER.md): Continues working in parallel as fallback — no code changes needed

Established Patterns

  • Single provider constraint (plugins/memory/__init__.py:12): Hermes allows only one external memory provider at a time. Setting memory.provider: hindsight makes Hindsight that provider, while built-in memory keeps running alongside it (D-03)
  • Async retain (memory_provider.py:115-131): Memory retains are async and non-blocking by default — confirms our latency-optimized approach is supported upstream
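
The single-provider constraint can be illustrated with a toy model. This is not the Hermes implementation in memory_manager.py; it only mirrors the behavior described above (the first external provider wins, later ones are rejected). The function name and the second provider name are hypothetical.

```python
from typing import Optional


def pick_active_provider(requested: list[str]) -> tuple[Optional[str], list[str]]:
    """Toy model: the first requested external provider becomes active.

    Returns (active_provider, rejected_providers). Unlike the silent
    rejection described above, the rejected names are returned so that a
    misconfiguration can be surfaced during setup.
    """
    active: Optional[str] = None
    rejected: list[str] = []
    for name in requested:
        if active is None:
            active = name
        elif name != active:
            rejected.append(name)
    return active, rejected


# "other_provider" is a placeholder name, not a real Hermes plugin.
active, rejected = pick_active_provider(["hindsight", "other_provider"])
print(active, rejected)  # hindsight ['other_provider']
```

A pre-flight check along these lines is one way to catch the pitfall flagged in PITFALLS.md before the daemon starts, since a silently rejected provider is otherwise easy to miss.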

Integration Points

  • ~/.hermes/config.yaml §memory.provider — Change from default to hindsight
  • ~/.hermes/.env — Add HINDSIGHT_LLM_API_KEY env var (reuse existing OpenRouter key)
  • ~/.hermes/hindsight/config.json — Create with our latency-optimized settings (auto-generated by hermes memory setup, then tuned)

</code_context>

## Specific Ideas

No specific requirements — open to standard approaches. Follow the Hindsight plugin README setup instructions, then apply the latency-optimized config overrides from decisions D-04 through D-10.
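
Applying the D-04 through D-10 overrides on top of the generated config could look like the sketch below. It assumes `config.json` is a flat JSON object whose top-level keys match the setting names used in this document, and that `recall_types` accepts a plain string; both are assumptions to verify against the plugin README. The helper names are hypothetical.

```python
import json

# Latency-optimized overrides, keyed by decision number.
LATENCY_OVERRIDES = {
    "recall_budget": "low",              # D-04
    "recall_prefetch_method": "recall",  # D-05
    "auto_recall": True,                 # D-06
    "recall_types": "observation",       # D-07 (value format assumed)
    "retain_async": True,                # D-08
    "retain_every_n_turns": 5,           # D-09
    "auto_retain": True,                 # D-10
}


def apply_overrides(generated: dict, overrides: dict = LATENCY_OVERRIDES) -> dict:
    """Return a copy of the generated config with the overrides applied.

    Keys not mentioned in the overrides (e.g. mode, memory_mode) are preserved.
    """
    merged = dict(generated)
    merged.update(overrides)
    return merged


def render_config(generated_json: str) -> str:
    """Parse generated config text, apply overrides, and re-serialize it."""
    merged = apply_overrides(json.loads(generated_json))
    return json.dumps(merged, indent=2, sort_keys=True) + "\n"
```

Merging rather than rewriting the file keeps whatever `hermes memory setup` generated, such as `mode: local_embedded` from D-01, and changes only the seven tuned keys.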

## Deferred Ideas

None — discussion stayed within phase scope.


Phase: 5-Hindsight Memory Provider
Context gathered: 2026-06-14