docs: complete v1.1 research synthesis for session lifecycle, memory & reporting

2026-06-14 13:53:55 +08:00
parent b5e7008314
commit 4b58964a12
5 changed files with 860 additions and 0 deletions

@@ -0,0 +1,197 @@
# Architecture Patterns
**Domain:** Platform engineering agent (Hermes Agent configuration)
**Researched:** 2026-06-14
## Recommended Architecture
The v1.1 features are additive — they extend an existing Hermes Agent deployment without modifying Hermes core. The architecture is a **plugin + script + configuration** layer around Hermes' built-in extension points.
```
┌─────────────────────────────────────────┐
│ Telegram Gateway │
│ (already active, TELEGRAM_HOME_CHANNEL) │
└────────────┬────────────────────────────┘
┌────────────▼────────────────────────────┐
│ Hermes Agent Runtime │
│ ┌────────────────────────────────────┐ │
│ │ Hindsight Memory Provider │ │
│ │ (local embedded PostgreSQL daemon)│ │
│ │ auto_retain: true │ │
│ │ auto_recall: true │ │
│ │ memory_mode: hybrid │ │
│ └────────────────────────────────────┘ │
│ ┌────────────────────────────────────┐ │
│ │ Built-in Memory (fallback) │ │
│ │ MEMORY.md + USER.md │ │
│ └────────────────────────────────────┘ │
│ ┌────────────────────────────────────┐ │
│ │ Plugin Hook System │ │
│ │ on_session_start → repo cloning │ │
│ │ pre_llm_call → context injection │ │
│ └────────────────────────────────────┘ │
└──────────────────────────────────────────┘
┌─────────────────────┼─────────────────────┐
│ │ │
┌─────────▼─────────┐ ┌────────▼────────┐ ┌─────────▼─────────┐
│ Docker Terminal │ │ Cron Jobs │ │ Session Storage │
│ (repo workspace) │ │ (reporting, │ │ (state.db) │
│ DEFAULT_REPOS │ │ archiving) │ │ │
│ cloned via hook │ │ │ │ export_session() │
└───────────────────┘ └─────────────────┘ │ delete_session() │
└───────────────────┘
┌────────▼─────────┐
│ Archive Storage │
│ ~/.hermes/ │
│ archive/sessions/│
└──────────────────┘
```
### Component Boundaries
| Component | Responsibility | Communicates With |
|-----------|---------------|-------------------|
| **Hindsight Provider** | Cross-session memory with knowledge graph, entity resolution, semantic recall | Hermes agent loop (pre-turn recall, post-turn retain), local PostgreSQL, OpenRouter (LLM extraction) |
| **Repo Clone Hook** | On session start, clones DEFAULT_REPOS into workspace | Docker terminal (via `terminal` tool or subprocess), filesystem |
| **Daily Report Skill** | Instructs agent what data to gather and how to format the daily summary | SessionDB (via `state.db` queries or `session_search`), Telegram (via `send_message`), Jira API (via ngn-jira skill) |
| **Session Archive Script** | Exports stale sessions to JSON, deletes from live DB | SessionDB API (`export_session`, `delete_session`), archive filesystem |
| **Built-in Memory** | Always-active fallback for critical facts | Agent system prompt (frozen at session start) |
### Data Flow
#### Session Start (Default Repos)
```
User sends first message
→ `on_session_start` hook fires
→ Repo clone plugin checks /workspace/
→ Missing repos cloned via git (needs credential mount)
→ `pre_llm_call` hook fires (is_first_turn=True)
→ Plugin injects "Cloned repos: rai-ops, rai-deployment, rai-devtools" as context
→ Agent sees repos available in workspace
```
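The context-injection step in the flow above can be sketched as follows. This is a minimal illustration, not the Hermes hook contract: the handler name `inject_repo_context`, its `workspace` parameter, and the convention of returning a string (or `None`) to contribute context are all assumptions. Only the `pre_llm_call` hook name and the `is_first_turn` flag come from this document.

```python
import os
import tempfile

DEFAULT_REPOS = ["rai-ops", "rai-deployment", "rai-devtools"]


def build_repo_context(workspace, repos=DEFAULT_REPOS):
    """Return a one-line context string listing repos present in `workspace`."""
    present = [r for r in repos if os.path.isdir(os.path.join(workspace, r))]
    missing = [r for r in repos if r not in present]
    if not present:
        return "No default repos are cloned in the workspace."
    line = "Cloned repos: " + ", ".join(present)
    if missing:
        line += " (missing: " + ", ".join(missing) + ")"
    return line


def inject_repo_context(is_first_turn=False, workspace="/workspace", **kwargs):
    """Hypothetical pre_llm_call handler: add context on the first turn only.

    ASSUMPTION: returning a string (or None) is how a handler contributes
    context. Check the Hermes hook docs for the real contract.
    """
    if not is_first_turn:
        return None
    return build_repo_context(workspace)


# Usage: simulate a workspace where one clone failed.
with tempfile.TemporaryDirectory() as ws:
    os.mkdir(os.path.join(ws, "rai-ops"))
    os.mkdir(os.path.join(ws, "rai-devtools"))
    demo_line = inject_repo_context(is_first_turn=True, workspace=ws)
    print(demo_line)
```

Reporting missing repos alongside cloned ones lets the agent notice a failed clone instead of assuming every default repo is available.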
#### Memory Flow (Hindsight)
```
Agent turn completes
→ Built-in memory save (MEMORY.md / USER.md)
→ Hindsight auto_retain: conversation turn + entity extraction
→ Stored in local PostgreSQL with knowledge graph
Next turn (any session)
→ Hindsight auto_recall: semantic search for relevant memories
→ Results injected as context into the turn
→ Agent sees recalled facts from any past session
```
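To make the retain-then-recall cycle concrete, here is a toy stand-in. It is not Hindsight's API: the `ToyMemory` class and its method signatures are hypothetical, and it replaces semantic search and entity extraction with plain word overlap. It only shows the shape of the flow, where facts retained in one session become recallable from another.

```python
import re


class ToyMemory:
    """In-memory stand-in for a retain/recall provider (NOT Hindsight's API)."""

    def __init__(self):
        self.facts = []  # list of (session_id, text) pairs

    def retain(self, session_id, text):
        """Store one fact together with the session it came from."""
        self.facts.append((session_id, text))

    def recall(self, query, k=2):
        """Return up to k stored texts ranked by shared-word count with `query`."""
        query_words = set(re.findall(r"\w+", query.lower()))
        scored = []
        for _session_id, text in self.facts:
            overlap = len(query_words & set(re.findall(r"\w+", text.lower())))
            if overlap:
                scored.append((overlap, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


memory = ToyMemory()
memory.retain("s1", "rai-ops deploys use the staging cluster first")
memory.retain("s2", "User prefers short Telegram summaries")

# A later session asks a related question; only the matching fact comes back.
recalled = memory.recall("how do we deploy rai-ops to staging")
print(recalled)
```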
#### Daily Report Flow
```
Cron tick at 09:00
→ Scheduler loads daily-report skill
→ Creates fresh AIAgent session
→ Skill prompt instructs agent to:
    1. Query state.db for recent sessions
    2. Query hindsight for relevant cross-session facts
    3. Query Jira for open/updated tickets
    4. Format as Telegram-friendly summary
→ Agent produces report
→ Delivered to TELEGRAM_HOME_CHANNEL
```
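The final formatting step could look roughly like the sketch below. The function name `format_daily_report` and the dictionary keys (`title`, `key`, `status`, `summary`) are hypothetical. In the real flow the agent produces the text from the skill template, so this only illustrates a compact, chat-friendly layout.

```python
def format_daily_report(date, sessions, tickets):
    """Build a plain-text summary short enough to read in a chat client.

    `sessions` and `tickets` are lists of dicts; the keys used here
    ("title", "key", "status", "summary") are hypothetical.
    """
    lines = [f"Daily Platform Report {date}", ""]
    lines.append(f"Sessions ({len(sessions)}):")
    lines += [f"- {s['title']}" for s in sessions] or ["- none"]
    lines.append("")
    lines.append(f"Jira ({len(tickets)}):")
    lines += [
        f"- {t['key']} [{t['status']}] {t['summary']}" for t in tickets
    ] or ["- none"]
    return "\n".join(lines)


report = format_daily_report("2026-06-14", [{"title": "Fix ingress"}], [])
print(report)
```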
#### Session Archive Flow
```
Cron tick on Sunday 06:00
→ No-agent script runs
→ Queries state.db for sessions inactive >30d
→ For each: export_session() → write JSON → delete_session()
→ Summary of archived sessions delivered to Telegram
```
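The "inactive >30d" selection in this flow can be sketched as a pure function. The input shape, `(session_id, last_active)` pairs, is an assumption, since how the script obtains last-activity timestamps is not specified here.

```python
from datetime import datetime, timedelta, timezone


def select_stale(sessions, now, max_idle_days=30):
    """Return ids of sessions whose last activity is older than the cutoff.

    `sessions` is an iterable of (session_id, last_active) pairs, where
    last_active is a timezone-aware datetime. This shape is hypothetical.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return [sid for sid, last_active in sessions if last_active < cutoff]


now = datetime(2026, 6, 14, 6, 0, tzinfo=timezone.utc)
sessions = [
    ("a", now - timedelta(days=45)),
    ("b", now - timedelta(days=30)),  # exactly 30 days idle: not yet stale
    ("c", now - timedelta(days=2)),
]
stale_ids = select_stale(sessions, now)
print(stale_ids)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the cutoff logic testable.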
## Patterns to Follow
### Pattern 1: Plugin Hook for Session Initialization
**What:** Use Hermes' plugin hook system (`ctx.register_hook("on_session_start", handler)`) to run initialization logic when a new session begins.
**When:** Any setup that should happen exactly once per session, before the agent processes any user message.
**Example:**
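```python
import os
import subprocess

def clone_default_repos(session_id, model, platform, **kwargs):
    # Clone each default repo that is not already present in the workspace.
    repos = ["rai-ops", "rai-deployment", "rai-devtools"]
    for repo in repos:
        path = f"/workspace/{repo}"
        if not os.path.exists(path):
            # Full URL required (HTTPS assumed); a bare "github.com/..." is read as a local path.
            url = f"https://github.com/rai-apps/{repo}.git"
            subprocess.run(["git", "clone", url, path], check=False)

def register(ctx):
    # Entry point: attach the handler to the session-start hook.
    ctx.register_hook("on_session_start", clone_default_repos)
```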
### Pattern 2: Skill-Backed Cron Jobs
**What:** Cron jobs that load a skill before executing. The skill provides structured instructions; the cron prompt is the task.
**When:** Recurring tasks that benefit from agent reasoning but follow a repeatable structure.
**Example:**
```bash
hermes cron create "0 9 * * *" \
--skill daily-report \
--deliver telegram:-100474440517 \
--name "Daily Platform Report"
```
The skill (`daily-report/SKILL.md`) contains the report template. The cron job's prompt is just "Generate today's report."
### Pattern 3: No-Agent Script for Deterministic Automation
**What:** Cron jobs with `no_agent=True` that run a script directly, skipping the LLM entirely.
**When:** Tasks where the output is fully determined by script logic — archiving, data gathering, threshold checks.
**Example:**
```bash
hermes cron create "0 6 * * 0" \
--no-agent \
--script archive-stale-sessions.py \
--deliver telegram:-100474440517 \
--name "Weekly Session Archive"
```
### Pattern 4: Export-Before-Delete for Data Safety
**What:** Before removing any data from the live system, export it to an archive file first.
**When:** Any destructive operation on session data, files, or state.
**Example:**
```python
import json

data = db.export_session(session_id)
archive_path = archive_dir / f"{session_id}.json"  # archive_dir is a pathlib.Path
archive_path.write_text(json.dumps(data, indent=2))
# Delete only after the archive file has been written.
db.delete_session(session_id)
```
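A slightly more defensive variant is sketched below, assuming `export_session()` returns JSON-serializable data that round-trips unchanged. The helper `archive_session` and the `FakeDB` test double are hypothetical. Only the `export_session` and `delete_session` method names come from this document.

```python
import json
import os
import tempfile
from pathlib import Path


def archive_session(db, session_id, archive_dir):
    """Export a session to JSON, verify the file, then delete the live copy."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    data = db.export_session(session_id)
    final_path = archive_dir / f"{session_id}.json"
    tmp_path = final_path.with_name(final_path.name + ".tmp")
    tmp_path.write_text(json.dumps(data, indent=2))
    # Read the file back so a truncated write never leads to a delete.
    if json.loads(tmp_path.read_text()) != data:
        tmp_path.unlink()
        raise IOError(f"archive verification failed for {session_id}")
    os.replace(tmp_path, final_path)  # atomic rename on the same filesystem
    db.delete_session(session_id)
    return final_path


class FakeDB:
    """Test double exposing only the two methods named in this document."""

    def __init__(self, sessions):
        self.sessions = dict(sessions)

    def export_session(self, session_id):
        return self.sessions[session_id]

    def delete_session(self, session_id):
        del self.sessions[session_id]


fake = FakeDB({"abc": {"id": "abc", "messages": ["hi"]}})
with tempfile.TemporaryDirectory() as d:
    path = archive_session(fake, "abc", d)
    archived = json.loads(path.read_text())
print(archived)
```

Writing to a temporary name and renaming afterwards means a crash mid-write leaves a `.tmp` file rather than a truncated archive under the final name.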
## Anti-Patterns to Avoid
### Anti-Pattern 1: Monkey-Patching Hermes Core
**What:** Modifying `~/.hermes/hermes-agent/` source files to add custom behavior.
**Why bad:** Hermes auto-updates, and every update overwrites these files. Custom patches are lost or break silently, and cannot be recovered afterwards.
**Instead:** Use documented extension points: plugin hooks, shell hooks, skills, cron jobs.
### Anti-Pattern 2: Direct `state.db` Schema Queries in Production Scripts
**What:** Writing SQL queries against `~/.hermes/state.db` that depend on internal schema details.
**Why bad:** The schema changes between releases without notice (it is currently at v11, after 11 migrations), so queries can break after `hermes update`.
**Instead:** Use `SessionDB` API methods (`export_session()`, `create_session()`, `get_messages()`). Fall back to direct SQL only in controlled scripts that are tested after each Hermes update.
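
Where a controlled script does fall back to direct SQL, one option is to check that the columns it depends on still exist before running any query. The sketch below uses only the standard `sqlite3` module. The table name `sessions` and the column names are hypothetical and do not describe the real `state.db` schema.

```python
import sqlite3


def table_has_columns(conn, table, required):
    """Return True if `table` exists and contains every column in `required`."""
    # PRAGMA table_info cannot take bound parameters, so validate the name.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    present = {row[1] for row in rows}  # row[1] is the column name
    return set(required) <= present


# Demo against an in-memory database with a made-up schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT, updated_at REAL)")
ok = table_has_columns(conn, "sessions", ["id", "updated_at"])
drifted = table_has_columns(conn, "sessions", ["id", "last_active"])
print(ok, drifted)
```

A script can exit with a clear message when the check fails, which is easier to diagnose after an update than an SQL error deep inside a cron run.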
### Anti-Pattern 3: Storing Credentials in Workspace Files
**What:** Writing GitHub tokens or SSH keys into the Docker container's workspace.
**Why bad:** If the agent is compromised (prompt injection), credentials in workspace files can be exfiltrated via `read_file` or `terminal` output.
**Instead:** Mount credentials read-only at the Docker level (`docker_volumes: [path:path:ro]`). Use `docker_forward_env` for environment variable-based credentials.
## Scalability Considerations
| Concern | At 1 user | At 10 users (future team) | Notes |
|---------|-----------|---------------------------|-------|
| Hindsight DB | <1GB PostgreSQL | 5-50GB PostgreSQL | Local embedded mode is single-user. For teams, switch to cloud mode or self-hosted Hindsight. |
| Session archive | ~100 sessions/year | ~1,000 sessions/year | JSON files are tiny (~50KB each). Storage is negligible. |
| Cron report LLM cost | 1 report/day ~1K tokens | 10 reports/day ~10K tokens | Cost scales linearly with users. Consider no-agent mode for data sections. |
| Repo clones | 3 repos per session | Same (shared workspaces) | Container persistence means clones survive across sessions in the same container. |
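
The storage and token figures in the table follow from simple multiplication. The sketch below reproduces them from the table's own estimates (~50KB per session, ~1K tokens per report), using decimal megabytes. The function names are illustrative only.

```python
def archive_storage_mb(sessions_per_year, kb_per_session=50):
    """Yearly archive size in MB (1 MB = 1000 KB)."""
    return sessions_per_year * kb_per_session / 1000


def daily_report_tokens(users, tokens_per_report=1_000):
    """Tokens per day if each user gets one report."""
    return users * tokens_per_report


one_user_mb = archive_storage_mb(100)    # ~100 sessions/year
team_mb = archive_storage_mb(1_000)      # ~1,000 sessions/year
team_tokens = daily_report_tokens(10)    # 10 reports/day
print(one_user_mb, team_mb, team_tokens)
```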
## Sources
- Hermes Agent docs: Hook system (`website/docs/user-guide/features/hooks.md`)
- Hermes Agent docs: Cron system (`website/docs/user-guide/features/cron.md`)
- Hermes Agent docs: Session storage (`website/docs/developer-guide/session-storage.md`)
- Hermes Agent source: `hermes_state.py`, `agent/curator.py`, `hermes_cli/hooks.py`
- ngn-agent `config.yaml` and `initial-plan.md`