# Research Summary: ngn-agent v1.1 (Session Lifecycle, Memory & Reporting)

**Project:** ngn-agent v1.1
**Domain:** Platform engineering agent (Hermes Agent-based configuration/adoption)
**Researched:** 2026-06-14
**Overall confidence:** HIGH

## Executive Summary

ngn-agent v1.1 adds three features atop the existing Hermes Agent installation: DEFAULT_REPOS auto-clone into session workspaces, Hindsight long-term memory provider, and daily cron reporting with stale session lifecycle management (30d archive) plus Jira integration. All three integrate cleanly with existing v1.0 infrastructure using documented Hermes extension points: no core source changes, no greenfield work, no new infrastructure.

**The recommended approach: additive configuration layer.** Every feature maps to an existing Hermes mechanism: `shell_init_files` for repo cloning, `memory.provider: hindsight` for memory, `hermes cron create` for reporting and archiving. The only new code is two shell scripts (session-init, stale-cleanup) and one skill markdown file (daily-report). Installation time is under 2 hours total.

**Key risks and mitigations:**

1. **Docker container restart loses cloned repos**: Mitigate by cloning to a host-mounted volume (`~/Projects:/workspace/repos:rw`)
2. **Hindsight Cloud API reliability**: Monitor logs for `sync_turn failed`; have local embedded mode as fallback
3. **SSH credential exposure inside Docker**: Use read-only deploy keys scoped per repo; never mount full `~/.ssh/`
4. **Memory provider conflict**: Set `memory.provider: hindsight` only; never add a second external provider

## Key Findings

### Recommended Stack

The stack is almost entirely existing Hermes infrastructure plus three small additions. See [STACK.md](./STACK.md) for full details.
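
To make the additive approach concrete, the following is a minimal sketch of what `session-init.sh` might look like. It assumes the script runs in the shell at terminal start, that `DEFAULT_REPOS` is already exported from `~/.hermes/.env`, and that the repos are hosted on GitHub and reachable over SSH. The `REPO_ROOT`, `CLONE_TIMEOUT`, and `GIT_HOST` variables and the `clone_default_repos` function name are hypothetical and not part of Hermes.

```shell
#!/usr/bin/env bash
# Sketch of session-init.sh: clone DEFAULT_REPOS into a host-mounted directory.
# Assumptions (not confirmed by Hermes docs): variable names, GitHub as the host.

REPO_ROOT="${REPO_ROOT:-/workspace/repos}"   # host-mounted volume inside the container
CLONE_TIMEOUT="${CLONE_TIMEOUT:-30}"         # seconds before a hanging clone is abandoned
GIT_HOST="${GIT_HOST:-git@github.com}"       # assumed SSH remote prefix

clone_default_repos() {
  local slug name target
  # DEFAULT_REPOS is a space-separated "org/repo" list; word splitting is intentional.
  for slug in ${DEFAULT_REPOS:-}; do
    name="${slug##*/}"                       # "org/repo" -> "repo"
    target="${REPO_ROOT}/${name}"
    # A clone that survived a container restart is left untouched.
    if [ -d "${target}/.git" ]; then
      echo "skip ${slug}: already cloned"
      continue
    fi
    # A failed or timed-out clone is reported but never aborts shell startup.
    timeout "${CLONE_TIMEOUT}" git clone --quiet "${GIT_HOST}:${slug}.git" "${target}" \
      || echo "warn: clone of ${slug} failed or timed out" >&2
  done
}

clone_default_repos
```

Because existing checkouts are skipped, rerunning the script after a container restart costs almost nothing when `/workspace/repos` is a host-mounted volume, and the `timeout` wrapper keeps one unreachable remote from blocking the session.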

**Core additions:**

- `hindsight-client>=0.4.22`: Python client for Hindsight Cloud API (already bundled as Hermes MemoryProvider plugin; just needs `uv pip install`)
- SSH key mount (existing): Git clone auth inside Docker, via `~/.ssh:/root/.ssh:ro` or a deploy key per repo
- `session-init.sh`: Shell script executed at terminal start via `terminal.shell_init_files`; clones DEFAULT_REPOS into `/workspace/repos/`
- `daily-report.md` skill: Hermes skill-backed cron job; agent composes daily session summary and sends via Telegram
- `stale-cleanup.sh`: `no_agent` cron script; exports sessions inactive >30d to JSON archive, deletes from live DB

**Config changes required:**

| Config | Value |
|--------|-------|
| `memory.provider` | `hindsight` |
| `terminal.shell_init_files` | `["/usr/local/bin/session-init.sh"]` |
| `terminal.docker_volumes` | Add `~/.ssh:/root/.ssh:ro` and `~/Projects:/workspace/repos:rw` |
| `HINDSIGHT_API_KEY` | Set in `~/.hermes/.env` |
| `DEFAULT_REPOS` | Space-separated `org/repo` list in `~/.hermes/.env` |

**Alternatives considered:**

| Decision | Recommended | Alternative Rejected |
|----------|-------------|---------------------|
| Hindsight mode | **Cloud** (zero infra) | Local embedded (~200MB download, 2-4GB RAM overhead) |
| Git auth method | **SSH key mount** | SSH agent forwarding (needs host socket, less reliable) |
| Session init hook | **`shell_init_files`** | Plugin `on_session_start` hook (runs after agent starts, not guaranteed before first prompt) |
| Cron mechanism | **Hermes skill + cron** | Custom Python script (wastes existing delivery infrastructure) |

### Expected Features

See [FEATURES.md](./FEATURES.md) for complete landscape, dependencies, and prioritization.

**Must have (table stakes, P1 for v1.1):**

- **DEFAULT_REPOS auto-cloned** in every new session: Manual clone per session is the #1 UX complaint. `shell_init_files` runs before agent starts, guaranteeing repos are present.
- **Cross-session persistent memory**: Built-in MEMORY.md is 2.2k chars frozen at session start. Hindsight provides entity-aware KG with semantic recall across all sessions.
- **Daily operational report**: Invisible work erodes trust. Daily Telegram report shows what the agent did, what sessions were active.
- **Stale session cleanup**: Sessions pile up indefinitely. 30d inactivity → archive to JSON → delete from live DB.

**Should have (differentiators, P2 for v1.1):**

- **Knowledge graph memory (Hindsight)**: Entity-aware cross-session recall with LLM synthesis (`hindsight_reflect`), not just FTS5 text search
- **On-demand repo cloning**: User says "clone rai-pipeline" mid-session, agent does it without leaving the conversation
- **Jira-integrated daily report**: Report includes Jira ticket status and session→ticket correlations using existing `ngn-jira` skill
- **Zero-cost stale cleanup**: `no_agent: true` cron = deterministic script, zero LLM token cost

**Defer (v1.2+):**

- On-demand repo cloning skill (trivial once default cloning works; user can already ask manually)
- Archive restore script (JSON files are text-searchable; low urgency)
- Custom ngn-agent plugin package (only valuable if shared across a team)

**Anti-features (avoid):**

- Custom scheduler (Hermes cron already handles this)
- Custom memory provider implementation (Hindsight is production-ready and bundled)
- Persistent Docker image with pre-cloned repos (image would be large, stale quickly)
- Cloud-only Hindsight mode (local embedded is managed by Hermes; Cloud adds dependency + cost)

### Architecture Approach

See [ARCHITECTURE.md](./ARCHITECTURE.md) for full component boundaries, data flows, and patterns.

All v1.1 features are an **additive plugin + script + configuration layer** around Hermes' built-in extension points. No Hermes core code is modified.

**Major components:**

1. **Hindsight Memory Provider**: Cross-session memory with knowledge graph, entity resolution, semantic recall.
   Communicates with the Hermes agent loop (pre-turn recall, post-turn retain), local PostgreSQL, and OpenRouter (LLM extraction).
2. **Repo Clone Hook** (`session-init.sh`): On session start, clones DEFAULT_REPOS from `~/.hermes/.env` into host-mounted `/workspace/repos/`. Uses the `shell_init_files` mechanism (not plugin hooks) for guaranteed execution before the agent starts.
3. **Daily Report Skill** (`daily-report.md`): Skill-backed cron job. Instructs the agent to query SessionDB for recent sessions, Hindsight for cross-session facts, and Jira for ticket updates, then format the result as a Telegram-friendly summary.
4. **Session Archive Script** (`stale-cleanup.sh`): No-agent cron script. Queries SessionDB for sessions inactive >30d, exports to JSON, deletes from live DB. Deterministic, zero LLM cost.
5. **Built-in Memory (fallback)**: Always-active fallback for critical facts via MEMORY.md/USER.md, frozen at session start.

**Four architectural patterns to follow:**

1. **Plugin Hook for Session Init**: `ctx.register_hook("on_session_start", handler)` for custom initialization per session (or `shell_init_files` for guaranteed gating)
2. **Skill-Backed Cron Jobs**: Cron jobs that load a skill with structured instructions; agent produces report guided by skill context
3. **No-Agent Script for Deterministic Automation**: `no_agent: true` cron jobs for data gathering, archiving, threshold checks
4. **Export-Before-Delete for Data Safety**: Before removing any data, export to archive file first; verify integrity before deleting

**Anti-patterns to avoid:**

- Monkey-patching Hermes core (overwritten by auto-updates)
- Direct `state.db` SQL queries (schema changes between releases; use SessionDB API)
- Storing credentials in workspace files (prompt injection exfiltration risk)

### Critical Pitfalls

See [PITFALLS.md](./PITFALLS.md) for all 10 pitfalls with prevention and detection.

**Top 5 critical:**

1. **Docker container restart loses cloned repos**: The container is destroyed after `lifetime_seconds: 300` of inactivity, so repos cloned to the ephemeral container filesystem disappear. **Prevention:** Always clone to a host-mounted volume (`~/Projects:/workspace/repos:rw`). The script must check for an existing `.git` directory before cloning.
2. **Memory provider conflict**: `MemoryManager.add_provider()` rejects a second external provider (memory_manager.py:342-354). Setting two external providers silently fails; only the first is registered. **Prevention:** Set `memory.provider: hindsight` and nothing else.
3. **Cron job prompt injection via skill content**: Cron jobs load skill content at runtime. Scanning detects patterns but false negatives are possible (cron/scheduler.py:1249-1303). **Prevention:** Keep cron skills simple and vetted. Use `no_agent` scripts for deterministic operations.
4. **SSH key exposure inside Docker**: An agent with file-read tools inside Docker has read access to the mounted `~/.ssh/`, so prompt injection could exfiltrate keys. **Prevention:** Mount `~/.ssh` read-only (`:ro`, which prevents modification but not reading), use deploy keys per repo, and consider HTTPS + scoped token instead of SSH.
5. **Shell init script blocking container start**: `shell_init_files` runs synchronously before the shell prompt, so a hanging git clone blocks agent startup. **Prevention:** Add `timeout 30` to clone operations, and consider wrapping them in `(sleep 5; ...) &` for async init.

## Implications for Roadmap

Based on research, four phases in dependency order:

### Phase 1: Hindsight Memory Provider

**Rationale:** Independent, zero-risk, enhances every other feature. Pure configuration: no scripts, no volumes, no cron changes. Quickest win (~25 min).
**Delivers:** Cross-session persistent memory with knowledge graph, entity resolution, semantic recall via Hindsight Cloud API.
**Addresses:** Cross-session persistent memory (table stakes) + Knowledge graph memory (differentiator)
**Uses:** `hindsight-client>=0.4.22`, `memory.provider: hindsight` config, `HINDSIGHT_API_KEY` env var
**Implements:** Hindsight Memory Provider component
**Avoids:** Pitfall 2, memory provider conflict (set only `hindsight`, never add second external)
**Research flag:** LOW. Well-documented Hermes configuration step. Verify Hindsight Cloud API availability and free tier limits during setup.

### Phase 2: Default Repos Auto-Clone + Credential Mount

**Rationale:** Second priority; fills the biggest UX gap (repos missing every session). Requires security-sensitive credential mounting, so needs careful implementation.
**Delivers:** DEFAULT_REPOS auto-cloned into every new session workspace via `shell_init_files` script. On-demand cloning capability (basic: user asks, agent clones).
**Addresses:** Default repos auto-cloned (table stakes) + On-demand repo cloning (differentiator)
**Uses:** `terminal.shell_init_files`, `terminal.docker_volumes` (SSH mount + workspace volume), `session-init.sh` script
**Implements:** Repo Clone Hook component
**Avoids:**

- Pitfall 1: Lost clones on container restart (mitigated by host volume mount `~/Projects:/workspace/repos:rw`)
- Pitfall 5: Blocking init script (add `timeout 30` to git clone, consider async wrapping)
- Pitfall 4: SSH key exposure (use deploy keys, read-only mount)

**Research flag:** MEDIUM. SSH credential mount security approach (deploy key vs token vs agent forwarding) needs final decision during planning. Test both `~/.ssh:ro` and HTTPS+token approaches.

### Phase 3: Daily Cron Report

**Rationale:** Third priority; needs active sessions to report on. Phase 1+2 ensure sessions have memory and repos, making sessions productive. Now we can report on them.
**Delivers:** Daily Telegram report at 09:00 listing active sessions, session titles, last message previews, token counts.
Skill-backed agent composes the summary.
**Addresses:** Daily operational report (table stakes) + Jira integration (differentiator, stretch goal)
**Uses:** `daily-report.md` skill, `hermes cron create`, existing Telegram delivery channel, existing `ngn-jira` skill
**Implements:** Daily Report Skill component
**Avoids:**

- Pitfall 3: Cron prompt injection (keep skill simple, vetted)
- Minor Pitfall 3: Wrong chat delivery (set `deliver: telegram:474440517` explicitly)

**Research flag:** MEDIUM. Daily report skill prompt quality needs iteration. The skill instructs the agent what to query and how to format. Plan for at least 2-3 prompt refinements after initial deploy. Jira integration depends on `ngn-jira` skill stability.

### Phase 4: Stale Session Archive (30d)

**Rationale:** Last priority because it's destructive. Should only run after reporting is working, so the user can see in daily reports which sessions will be affected before archiving runs.
**Delivers:** Weekly (Sunday 06:00) archival of sessions inactive >30d. Export to JSON in `~/.hermes/archive/sessions/`, delete from live DB. Summary delivered to Telegram.
**Addresses:** Stale session cleanup (table stakes)
**Uses:** `stale-cleanup.sh` script, `hermes cron create --no-agent`, `SessionDB.export_session()` / `delete_session()`
**Implements:** Session Archive Script component
**Avoids:**

- Data loss: follow the export-before-delete pattern (write JSON, verify, then delete)
- Moderate pitfall: Deleting active sessions (check `last_updated` carefully, use dry-run mode first)

**Research flag:** LOW. Deterministic script using documented SessionDB API. Add dry-run mode flag for initial testing. Consider archive verification step.

### Phase Ordering Rationale

- **Hindsight first** (Phase 1): Zero-risk configuration change. Enhances every subsequent phase by providing cross-session context. No code, no scripts, no volumes.
- **Default Repos second** (Phase 2): Independent from Hindsight (no dependency), but has the security-sensitive credential mount. Early implementation allows maximum testing of credential isolation.
- **Daily Report third** (Phase 3): Needs active sessions producing data to report on. Both Phase 1 and 2 contribute to session quality. Report can also surface Hindsight memory patterns.
- **Stale Archive fourth** (Phase 4): Destructive operation. User should see via daily reports what will be archived before the archive runs. Install archive cron after report cron so there's visible feedback first.

### Research Flags

Phases needing deeper research during planning:

- **Phase 2 (Default Repos):** SSH credential mount strategy: deploy key vs fine-grained token vs agent forwarding vs full `~/.ssh:ro`. Tradeoffs between security and simplicity need a final decision. Also verify `shell_init_files` execution ordering guarantees.
- **Phase 3 (Daily Report):** Skill prompt design for useful LLM-generated summaries. Jira API scoping: what ticket data to include, how to correlate sessions to tickets. The Jira integration scope (basic ticket status query vs full session→ticket mapping) needs definition.

Phases with standard patterns (skip research-phase):

- **Phase 1 (Hindsight):** Pure configuration: `hermes memory setup`, pick hindsight, set env vars. Hermes docs cover this completely.
- **Phase 4 (Stale Archive):** Deterministic script using `SessionDB.export_session()` / `delete_session()`: documented API, straightforward implementation, export-before-delete pattern.

## Confidence Assessment

| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All dependencies verified against Hermes v0.16.0 source code and docs. `hindsight-client` is bundled. SSH mount is standard Docker. |
| Features | HIGH | All features map to documented Hermes extension points. No speculative functionality. Prioritization derived from actual usage patterns. |
| Architecture | HIGH | Additive layer design avoids modifying Hermes core. Every component boundary matches a documented Hermes mechanism (hooks, cron, skills, config). |
| Pitfalls | HIGH | Each pitfall is sourced from specific Hermes source lines (memory_manager.py:342, cron/scheduler.py:1249-1303, etc.). Prevention strategies are concrete and testable. |

**Overall confidence: HIGH**

### Gaps to Address

| Gap | How to Address |
|-----|----------------|
| SSH credential mount: deploy key vs token vs agent forwarding | Test all approaches during Phase 2 planning. Start with deploy keys (most secure). Document security tradeoffs. |
| Hindsight Cloud API free tier limits | Create Hindsight account, verify free tier, test with actual agent usage. Fall back to local embedded mode if Cloud is unreliable. |
| Daily report quality iteration | Ship basic report in Phase 3, then iterate prompt based on actual output. Plan 2-3 refinement cycles. |
| Jira integration scope | Define in Phase 3 planning: basic ticket status query or full session→ticket correlation? Start with basic, iterate to full. |
| Archive dry-run mode | Add `--dry-run` flag to stale-cleanup.sh for initial testing. Run manually before activating cron. |

## Sources

### Primary (HIGH confidence: Hermes v0.16.0 source code + official docs)

- `agent/memory_manager.py` lines 342-354: Memory provider conflict logic (PITFALLS.md)
- `agent/memory_provider.py` lines 115-131: Async sync_turn silent failure (PITFALLS.md)
- `cron/scheduler.py` lines 1249-1303: Cron prompt injection scanning (PITFALLS.md)
- `cron/scheduler.py` line 444: Delivery origin fallback (PITFALLS.md)
- `plugins/memory/hindsight/__init__.py`: Hindsight MemoryProvider plugin (STACK.md)
- `hermes_state.py`: SessionDB API for export/delete (ARCHITECTURE.md, FEATURES.md)
- `agent/curator.py`: Skills-only execution (FEATURES.md)
- Hermes docs: hooks.md, cron.md, session-storage.md, memory.md, memory-providers.md (ARCHITECTURE.md, FEATURES.md)
- ngn-agent `config.yaml` and `initial-plan.md` (existing v1.0 baseline)

### Secondary (MEDIUM confidence)

- Hindsight documentation at https://hindsight.vectorize.io: Cloud API details and limits (STACK.md)
- Current `~/.hermes/config.yaml`: Existing Docker volumes and cron job configuration

### Tertiary (LOW confidence, needs validation)

- SSH credential mount behavior in Docker: needs testing with actual `~/.ssh:ro` mount and git clone inside container
- Hindsight Cloud API free tier reliability at scale: needs account creation to verify

---

*Research completed: 2026-06-14*
*Ready for roadmap: yes*