ngn-agent/.planning/research/ARCHITECTURE.md

Architecture Patterns

Domain: Platform engineering agent (Hermes Agent configuration)
Researched: 2026-06-14

The v1.1 features are additive — they extend an existing Hermes Agent deployment without modifying Hermes core. The architecture is a plugin + script + configuration layer around Hermes' built-in extension points.

                       ┌─────────────────────────────────────────┐
                       │            Telegram Gateway              │
                       │  (already active, TELEGRAM_HOME_CHANNEL) │
                       └────────────┬────────────────────────────┘
                                    │
                       ┌────────────▼────────────────────────────┐
                       │         Hermes Agent Runtime             │
                       │  ┌────────────────────────────────────┐  │
                       │  │        Hindsight Memory Provider   │  │
                       │  │  (local embedded PostgreSQL daemon)│  │
                       │  │  auto_retain: true                 │  │
                       │  │  auto_recall: true                 │  │
                       │  │  memory_mode: hybrid               │  │
                       │  └────────────────────────────────────┘  │
                       │  ┌────────────────────────────────────┐  │
                       │  │       Built-in Memory (fallback)    │  │
                       │  │       MEMORY.md + USER.md          │  │
                       │  └────────────────────────────────────┘  │
                       │  ┌────────────────────────────────────┐  │
                       │  │       Plugin Hook System            │  │
                       │  │  on_session_start → repo cloning   │  │
                       │  │  pre_llm_call → context injection  │  │
                       │  └────────────────────────────────────┘  │
                       └──────────────────────────────────────────┘
                                    │
              ┌─────────────────────┼─────────────────────┐
              │                     │                     │
    ┌─────────▼─────────┐ ┌────────▼────────┐ ┌─────────▼─────────┐
    │  Docker Terminal   │ │  Cron Jobs      │ │  Session Storage  │
    │  (repo workspace)  │ │  (reporting,    │ │  (state.db)       │
    │  DEFAULT_REPOS     │ │   archiving)    │ │                   │
    │  cloned via hook   │ │                 │ │  export_session() │
    └───────────────────┘ └─────────────────┘ │  delete_session()  │
                                              └───────────────────┘
                                                       │
                                              ┌────────▼─────────┐
                                              │  Archive Storage  │
                                              │  ~/.hermes/      │
                                              │  archive/sessions/│
                                              └──────────────────┘

Component Boundaries

| Component | Responsibility | Communicates With |
|---|---|---|
| Hindsight Provider | Cross-session memory with knowledge graph, entity resolution, semantic recall | Hermes agent loop (pre-turn recall, post-turn retain), local PostgreSQL, OpenRouter (LLM extraction) |
| Repo Clone Hook | On session start, clones DEFAULT_REPOS into workspace | Docker terminal (via terminal tool or subprocess), filesystem |
| Daily Report Skill | Instructs agent what data to gather and how to format the daily summary | SessionDB (via state.db queries or session_search), Telegram (via send_message), Jira API (via ngn-jira skill) |
| Session Archive Script | Exports stale sessions to JSON, deletes from live DB | SessionDB API (export_session, delete_session), archive filesystem |
| Built-in Memory | Always-active fallback for critical facts | Agent system prompt (frozen at session start) |

Data Flow

Session Start (Default Repos)

User sends first message
  → `on_session_start` hook fires
  → Repo clone plugin checks /workspace/
  → Missing repos cloned via git (needs credential mount)
  → `pre_llm_call` hook fires (is_first_turn=True)
  → Plugin injects "Cloned repos: rai-ops, rai-deployment, rai-devtools" as context
  → Agent sees repos available in workspace
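
The context-injection step in the flow above could be sketched roughly as follows. This is a minimal sketch, not the plugin's actual code: the handler name `inject_repo_context`, its return contract (a string to inject, or `None` to add nothing), and the helper `build_repo_context` are assumptions made for illustration. Only the `is_first_turn` flag, the `/workspace` root, and the repo names come from this document.

```python
from pathlib import Path
from typing import Optional

# Repo names and workspace root are taken from the flow above.
DEFAULT_REPOS = ["rai-ops", "rai-deployment", "rai-devtools"]
WORKSPACE = Path("/workspace")


def build_repo_context(workspace: Path, repos: list[str]) -> str:
    """Return a short note listing which repos are present and which are missing."""
    # A repo counts as cloned only if its .git directory exists.
    present = [name for name in repos if (workspace / name / ".git").is_dir()]
    missing = [name for name in repos if name not in present]
    parts = []
    if present:
        parts.append("Cloned repos: " + ", ".join(present))
    if missing:
        parts.append("Missing repos: " + ", ".join(missing))
    return ". ".join(parts)


def inject_repo_context(is_first_turn: bool = False, **kwargs) -> Optional[str]:
    """Hypothetical pre_llm_call handler: contribute context on the first turn only."""
    if not is_first_turn:
        return None
    return build_repo_context(WORKSPACE, DEFAULT_REPOS)
```

Reporting missing repos alongside cloned ones means a failed clone (for example, a missing credential mount) shows up in the agent's context instead of being silently assumed away.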

Memory Flow (Hindsight)

Agent turn completes
  → Built-in memory save (MEMORY.md / USER.md)
  → Hindsight auto_retain: conversation turn + entity extraction
  → Stored in local PostgreSQL with knowledge graph

Next turn (any session)
  → Hindsight auto_recall: semantic search for relevant memories
  → Results injected as context into the turn
  → Agent sees recalled facts from any past session

Daily Report Flow

Cron tick at 09:00
  → Scheduler loads daily-report skill
  → Creates fresh AIAgent session
  → Skill prompt instructs agent to:
      1. Query state.db for recent sessions
      2. Query hindsight for relevant cross-session facts
      3. Query Jira for open/updated tickets
      4. Format as Telegram-friendly summary
  → Agent produces report
  → Delivered to TELEGRAM_HOME_CHANNEL

Session Archive Flow

Cron tick on Sunday 06:00
  → No-agent script runs
  → Queries state.db for sessions inactive >30d
  → For each: export_session() → write JSON → delete_session()
  → Summary of archived sessions delivered to Telegram
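
The selection and summary steps of this flow can be sketched as pure functions. This sketch assumes the script can obtain a mapping of session ID to last-activity timestamp; how that mapping is read from state.db is not specified here, and `select_stale` and `format_summary` are hypothetical names. It also assumes that whatever text the script produces is what gets delivered to Telegram.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # the ">30d" threshold from the flow above


def select_stale(last_active: dict[str, datetime], now: datetime) -> list[str]:
    """Return IDs of sessions whose last activity is more than 30 days before `now`."""
    cutoff = now - STALE_AFTER
    # Strictly older than the cutoff: a session idle for exactly 30 days is kept.
    return sorted(sid for sid, seen in last_active.items() if seen < cutoff)


def format_summary(archived: list[str]) -> str:
    """Build the short run summary intended for the Telegram delivery."""
    if not archived:
        return "Session archive: nothing to archive."
    return f"Session archive: archived {len(archived)} session(s): " + ", ".join(archived)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Illustrative data only; real timestamps would come from the session store.
    sample = {"old": now - timedelta(days=45), "recent": now - timedelta(days=2)}
    print(format_summary(select_stale(sample, now)))
```

Keeping the cutoff logic separate from the database calls makes the threshold easy to test without touching a live state.db.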

Patterns to Follow

Pattern 1: Plugin Hook for Session Initialization

What: Use Hermes' plugin hook system (ctx.register_hook("on_session_start", handler)) to run initialization logic when a new session begins.

When: Any setup that should happen exactly once per session, before the agent processes any user message.

Example:

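import os
import subprocess

def clone_default_repos(session_id, model, platform, **kwargs):
    """Clone each default repo that is missing from the workspace."""
    repos = ["rai-ops", "rai-deployment", "rai-devtools"]
    for repo in repos:
        path = f"/workspace/{repo}"
        if not os.path.exists(path):
            # git needs a full URL; HTTPS is assumed here (use git@... for SSH).
            url = f"https://github.com/rai-apps/{repo}.git"
            subprocess.run(["git", "clone", url, path])

def register(ctx):
    """Plugin entry point: attach the handler to on_session_start."""
    ctx.register_hook("on_session_start", clone_default_repos)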

Pattern 2: Skill-Backed Cron Jobs

What: Cron jobs that load a skill before executing. The skill provides structured instructions; the cron prompt is the task.

When: Recurring tasks that benefit from agent reasoning but follow a repeatable structure.

Example:

hermes cron create "0 9 * * *" \
  --skill daily-report \
  --deliver telegram:-100474440517 \
  --name "Daily Platform Report"

The skill (daily-report/SKILL.md) contains the report template. The cron job's prompt is just "Generate today's report."

Pattern 3: No-Agent Script for Deterministic Automation

What: Cron jobs with no_agent=True that run a script directly, skipping the LLM entirely.

When: Tasks where the output is fully determined by script logic: archiving, data gathering, threshold checks.

Example:

hermes cron create "0 6 * * 0" \
  --no-agent \
  --script archive-stale-sessions.py \
  --deliver telegram:-100474440517 \
  --name "Weekly Session Archive"

Pattern 4: Export-Before-Delete for Data Safety

What: Before removing any data from the live system, export it to an archive file first.

When: Any destructive operation on session data, files, or state.

Example:

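import json
from pathlib import Path

archive_dir = Path.home() / ".hermes/archive/sessions"  # path from the diagram
data = db.export_session(session_id)  # db: an open SessionDB handle
archive_path = archive_dir / f"{session_id}.json"
archive_path.write_text(json.dumps(data, indent=2))
db.delete_session(session_id)  # reached only if the write did not raise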
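
A slightly more defensive variant of the same pattern writes the archive atomically and reads it back before deleting. This is a sketch under stated assumptions: `SessionStore` is a hypothetical stand-in for SessionDB that declares only the two methods named in this document, their signatures are guessed, and the export is assumed to be JSON-native data (dicts with string keys, lists, strings, numbers).

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Protocol


class SessionStore(Protocol):
    """Hypothetical interface: only the two methods this document names."""

    def export_session(self, session_id: str) -> Any: ...
    def delete_session(self, session_id: str) -> Any: ...


def archive_then_delete(db: SessionStore, session_id: str, archive_dir: Path) -> Path:
    """Export a session, verify the archive on disk, and only then delete it."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    data = db.export_session(session_id)
    if not data:
        raise ValueError(f"export returned nothing for {session_id}")

    final_path = archive_dir / f"{session_id}.json"
    # Temp file in the same directory, so os.replace is an atomic rename.
    fd, tmp_name = tempfile.mkstemp(dir=archive_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
        os.replace(tmp_name, final_path)
    except BaseException:
        # Clean up the partial temp file; the live session is untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise

    # Read-back check: the archive must parse and match what was exported.
    if json.loads(final_path.read_text(encoding="utf-8")) != data:
        raise RuntimeError(f"archive verification failed for {session_id}")

    db.delete_session(session_id)
    return final_path
```

Any failure before the final line raises, so the session stays in the live database until a verified copy exists on disk.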

Anti-Patterns to Avoid

Anti-Pattern 1: Monkey-Patching Hermes Core

What: Modifying ~/.hermes/hermes-agent/ source files to add custom behavior.

Why bad: Hermes updates overwrite local changes, and the agent auto-updates, so custom patches break silently and cannot be recovered.

Instead: Use documented extension points: plugin hooks, shell hooks, skills, cron jobs.

Anti-Pattern 2: Direct state.db Schema Queries in Production Scripts

What: Writing SQL queries against ~/.hermes/state.db that depend on internal schema details.

Why bad: The schema changes between releases without notice (currently v11, after 11 migrations), so queries can break after hermes update.

Instead: Use SessionDB API methods (export_session(), create_session(), get_messages()). Fall back to direct SQL only in controlled scripts that are tested after each Hermes update.
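
Where a controlled script does fall back to direct SQL, one way to fail loudly after a schema change is to check for the columns the script depends on before running any query. This sketch assumes state.db is a SQLite file. The table and column names in the usage comment are hypothetical, not the real Hermes schema, and `missing_columns` and `require_schema` are illustrative helpers.

```python
import sqlite3


def missing_columns(conn: sqlite3.Connection, table: str, required: set[str]) -> set[str]:
    """Return required columns that `table` lacks (all of them if the table is absent)."""
    # PRAGMA arguments cannot be bound as parameters, so validate the name first.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    present = {row[1] for row in rows}  # row[1] is the column name
    return required - present


def require_schema(conn: sqlite3.Connection, table: str, required: set[str]) -> None:
    """Raise before any query runs if the expected columns are not all there."""
    missing = missing_columns(conn, table, required)
    if missing:
        raise RuntimeError(
            f"{table} is missing expected columns {sorted(missing)}; "
            "re-test this script against the current Hermes release"
        )


# Usage (hypothetical table and column names):
#   conn = sqlite3.connect("/path/to/state.db")
#   require_schema(conn, "sessions", {"id", "updated_at"})
```

A guard like this does not make direct SQL safe across releases; it only turns a silent wrong result into an explicit error.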

Anti-Pattern 3: Storing Credentials in Workspace Files

What: Writing GitHub tokens or SSH keys into the Docker container's workspace.

Why bad: If the agent is compromised (prompt injection), credentials in workspace files can be exfiltrated via read_file or terminal output.

Instead: Mount credentials read-only at the Docker level (docker_volumes: [path:path:ro]). Use docker_forward_env for environment variable-based credentials.

Scalability Considerations

| Concern | At 1 user | At 10 users (future team) | Notes |
|---|---|---|---|
| Hindsight DB | <1 GB PostgreSQL | 5-50 GB PostgreSQL | Local embedded mode is single-user. For teams, switch to cloud mode or self-hosted Hindsight. |
| Session archive | ~100 sessions/year | ~1,000 sessions/year | JSON files are tiny (~50 KB each). Storage is negligible. |
| Cron report LLM cost | 1 report/day, ~1K tokens | 10 reports/day, ~10K tokens | Cost scales linearly with users. Consider no-agent mode for data sections. |
| Repo clones | 3 repos per session | Same (shared workspaces) | Container persistence means clones survive across sessions in the same container. |
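
The archive and report rows are simple linear estimates, and a quick back-of-the-envelope check makes the "negligible" claim concrete. The figures below reuse the table's own estimates (about 50 KB per archived session, about 1K tokens per report); the function names are illustrative, and a decimal kilobyte is assumed.

```python
KB = 1_000  # decimal kilobyte; the table's "~50 KB" is an estimate either way


def yearly_archive_bytes(sessions_per_year: int, avg_file_kb: int = 50) -> int:
    """Archive growth per year: sessions times average JSON file size."""
    return sessions_per_year * avg_file_kb * KB


def daily_report_tokens(reports_per_day: int, tokens_per_report: int = 1_000) -> int:
    """Daily report token usage: linear in the number of reports."""
    return reports_per_day * tokens_per_report


print(yearly_archive_bytes(100) / 1e6)    # 5.0  (MB per year at 1 user)
print(yearly_archive_bytes(1_000) / 1e6)  # 50.0 (MB per year at 10 users)
print(daily_report_tokens(10))            # 10000 (tokens per day at 10 users)
```

Even at ten users the archive grows by roughly 50 MB per year, which supports treating storage as a non-issue next to the Hindsight database.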

Sources

  • Hermes Agent docs: Hook system (website/docs/user-guide/features/hooks.md)
  • Hermes Agent docs: Cron system (website/docs/user-guide/features/cron.md)
  • Hermes Agent docs: Session storage (website/docs/developer-guide/session-storage.md)
  • Hermes Agent source: hermes_state.py, agent/curator.py, hermes_cli/hooks.py
  • ngn-agent config.yaml and initial-plan.md