ngn-agent/.planning/research/hermes/SECURITY.md

# Hermes Agent: Security Model & Safety Features

## 7-Layer Security Model

1. User authorization (allowlists, DM pairing)
2. Dangerous command approval (manual/smart/off)
3. Container isolation (Docker with hardened settings)
4. MCP credential filtering (env stripping for subprocesses)
5. Context file scanning (prompt injection detection)
6. Cross-session isolation (no data sharing between sessions)
7. Input sanitization (working directory allowlist)

## Dangerous Command Approval

The approval mode is set in `~/.hermes/config.yaml` and takes one of three values:

```yaml
approvals:
  mode: manual        # manual | smart | off
  timeout: 60         # seconds to wait for response
  cron_mode: deny     # deny | approve: what a cron job does when it hits a dangerous command
```

### Pattern Detection

Hermes detects and prompts on these patterns (non-exhaustive):

| Category | Patterns |
|----------|---------|
| Delete | `rm -r`, `rm --recursive`, `rm ... /` |
| Permissions | `chmod 777/666`, `o+w`, `a+w` |
| Filesystem | `mkfs`, `dd if=`, `> /dev/sd` |
| SQL | `DROP TABLE`, `DELETE FROM` (no WHERE), `TRUNCATE` |
| System | `> /etc/`, `systemctl stop/restart`, `kill -9 -1` |
| Remote exec | `curl ... \| sh`, `bash <(curl ...)` |
| Script exec | `python -c`, `perl -e`, `ruby -e` |
| Sensitive writes | `tee`/`>`/`>>` to `/etc/`, `~/.ssh/`, `~/.hermes/.env` |
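
As a rough illustration of how a table like this can drive detection, the sketch below matches a command against a handful of regular expressions and returns the first matching category. The regexes, the `DANGEROUS_PATTERNS` name, and the `classify` function are assumptions made for this note; they cover only a subset of the rows above and are not the patterns Hermes ships.

```python
import re

# Hypothetical pattern table: an illustrative subset of the categories above.
# These regexes are NOT taken from Hermes; they only show the matching idea.
DANGEROUS_PATTERNS = [
    ("recursive delete", re.compile(r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*|--recursive)\b")),
    ("world-writable permissions", re.compile(r"\bchmod\s+(777|666|[oa]\+w)\b")),
    ("filesystem overwrite", re.compile(r"\bmkfs\b|\bdd\s+if=|>\s*/dev/sd")),
    ("destructive SQL", re.compile(r"\b(DROP\s+TABLE|TRUNCATE)\b", re.IGNORECASE)),
    ("remote script execution", re.compile(r"\bcurl\b[^|]*\|\s*(ba)?sh\b")),
]


def classify(command: str):
    """Return the label of the first matching category, or None if nothing matches."""
    for label, pattern in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return label
    return None


if __name__ == "__main__":
    for cmd in ("rm -rf /tmp/old-project", "ls -la"):
        print(cmd, "->", classify(cmd))
```

Regex matching of this kind is easy to evade (quoting, variables, aliases), which is one reason the notes treat it as one layer among several rather than a boundary.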

### Hardline Blocklist (Always-On, Even in YOLO Mode)

- `rm -rf /` and variants
- Fork bombs (`:(){ :|:& };:`)
- `mkfs.*` on mounted root
- `dd if=/dev/zero of=/dev/sd*`
- Piping untrusted URLs to `sh` at rootfs level

### Approval Flow

```
⚠️  DANGEROUS COMMAND: recursive delete
    rm -rf /tmp/old-project
    [o]nce  |  [s]ession  |  [a]lways  |  [d]eny
```
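
The four choices differ in how long an approval lasts. A minimal sketch of that bookkeeping follows, assuming approvals are keyed by the detected category label; these notes do not say whether Hermes keys on the category or on the exact command, and `ApprovalState` is a hypothetical name.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalState:
    """Hypothetical bookkeeping for the o/s/a/d choices in the prompt above."""

    session_allowed: set = field(default_factory=set)  # forgotten when the session ends
    always_allowed: set = field(default_factory=set)   # would be persisted by a real agent

    def is_preapproved(self, label: str) -> bool:
        return label in self.session_allowed or label in self.always_allowed

    def record(self, label: str, choice: str) -> bool:
        """Apply the user's choice and return True if the command may run now."""
        if choice == "d":
            return False
        if choice == "s":
            self.session_allowed.add(label)
        elif choice == "a":
            self.always_allowed.add(label)
        elif choice != "o":
            raise ValueError(f"unknown choice: {choice!r}")
        return True  # "o" approves this one run without remembering anything
```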

### YOLO Mode

```
hermes --yolo      # Bypass all approval prompts for this session
/yolo              # Toggle in-session
```

**YOLO does NOT bypass the hardline blocklist.**
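
The ordering that makes this guarantee hold can be sketched as a small decision function: the hardline check runs first, so nothing later (YOLO, `mode: off`) can override it. The `decide` function and its return values are assumptions for illustration. The notes do not describe how `smart` mode decides, so this sketch conservatively treats it like `manual`.

```python
def decide(is_hardline: bool, is_dangerous: bool, yolo: bool, mode: str) -> str:
    """Return "block", "run", or "prompt" for a command (hypothetical sketch)."""
    if mode not in ("manual", "smart", "off"):
        raise ValueError(f"unknown approval mode: {mode!r}")
    if is_hardline:
        return "block"    # checked first, so YOLO and mode=off never reach it
    if not is_dangerous:
        return "run"
    if yolo or mode == "off":
        return "run"      # approval prompts bypassed
    return "prompt"       # manual, and smart as a conservative placeholder
```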

## Docker Container Hardening

```bash
--cap-drop ALL                          # Drop ALL Linux capabilities
--cap-add DAC_OVERRIDE --cap-add CHOWN --cap-add FOWNER  # Only add back what's needed (one flag per capability)
--security-opt no-new-privileges         # Block privilege escalation
--pids-limit 256                         # Limit process count
--tmpfs /tmp:rw,nosuid,size=512m         # Size-limited temp dirs
```
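
For reference, the flags above can be assembled into a full `docker run` argument list. This is a sketch rather than Hermes' launcher: `hardened_docker_args` is a hypothetical helper, `--rm` is an assumption added here to match the "ephemeral container" framing, and the image name in the usage line is a placeholder. It repeats `--cap-add` once per capability, which is the form the Docker CLI documents.

```python
import shlex


def hardened_docker_args(image: str, command: list) -> list:
    """Build a docker run argv using the hardening flags listed above."""
    return [
        "docker", "run", "--rm",               # --rm is an assumption, not from the notes
        "--cap-drop", "ALL",
        "--cap-add", "DAC_OVERRIDE",
        "--cap-add", "CHOWN",
        "--cap-add", "FOWNER",
        "--security-opt", "no-new-privileges",
        "--pids-limit", "256",
        "--tmpfs", "/tmp:rw,nosuid,size=512m",
        image, *command,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(hardened_docker_args("python:3.12-slim", ["echo", "hello"])))
```

Passing the result to `subprocess.run` as a list avoids shell interpolation of the command being sandboxed.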

**Important:** When the terminal backend is `docker`, dangerous command checks are **skipped** by design: the container itself is the security boundary.

## Tirith Pre-Exec Scanning

Optional scanner for content-level threats:
- Homograph URL spoofing
- Pipe-to-interpreter patterns
- Terminal injection attacks

```yaml
security:
  tirith_enabled: true
  tirith_fail_open: true    # Allow commands if tirith unavailable
```
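
The `tirith_fail_open` setting trades availability against safety: with `true`, a missing scanner lets commands through; with `false`, a missing scanner blocks them. A minimal sketch of that behavior, assuming the scanner is wrapped in a callable that returns `True` for safe input and raises `OSError` when the scanner cannot be started (both are assumptions, not Tirith's actual interface):

```python
def tirith_gate(scan, command: str, fail_open: bool = True) -> bool:
    """Return True if the command may proceed.

    `scan` is a hypothetical callable: True means safe, False means a threat
    was found, and OSError means the scanner is unavailable.
    """
    try:
        return bool(scan(command))
    except OSError:
        # Scanner unavailable: the configured policy decides.
        return fail_open
```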

## SSRF Protection

All URL-capable tools block:
- Private networks (RFC 1918): `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`
- Loopback: `127.0.0.0/8`, `::1`
- Cloud metadata: `metadata.google.internal`, `169.254.169.254`
- CGNAT: `100.64.0.0/10`

Opt-out: `security.allow_private_urls: true` (not recommended for ngn-agent)
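
The block rules above map directly onto Python's `ipaddress` module. The sketch below checks only literal IP addresses and the one metadata hostname listed; it is an illustration under stated assumptions, not Hermes' implementation. In particular, `is_blocked_url` is a hypothetical name, the opt-out is assumed to disable the whole check, and a production check would also need to resolve hostnames and test every resolved address.

```python
import ipaddress
from urllib.parse import urlsplit

# Ranges taken from the list above.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
        "127.0.0.0/8", "::1/128",                         # loopback
        "169.254.169.254/32",                             # cloud metadata address
        "100.64.0.0/10",                                  # CGNAT
    )
]
BLOCKED_HOSTNAMES = {"metadata.google.internal"}


def is_blocked_url(url: str, allow_private_urls: bool = False) -> bool:
    """Return True if the URL targets a blocked host (literal IPs and listed names only)."""
    if allow_private_urls:  # assumption: the opt-out turns the whole check off
        return False
    host = urlsplit(url).hostname  # already lowercased; brackets stripped for IPv6
    if host is None:
        return True  # no parseable host: refuse rather than guess
    if host in BLOCKED_HOSTNAMES:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; a real check must resolve it first
    return any(addr in net for net in BLOCKED_NETWORKS)
```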

## Context File Injection Protection

Hermes scans context files (`AGENTS.md`, `.cursorrules`, `SOUL.md`) for:
- Instructions to ignore prior instructions
- Hidden HTML comments with suspicious keywords
- Attempts to read secrets (`.env`, credentials)
- Credential exfiltration via `curl`
- Invisible Unicode characters

Blocked files show: `[BLOCKED: file contained potential prompt injection]`
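
Three of the checks above (override instructions, suspicious hidden comments, invisible characters) can be sketched with the standard library. The regex, the keyword list, and the function names are assumptions chosen for illustration; Hermes' actual heuristics are not documented in these notes. Only the blocked-file notice string is taken from the source.

```python
import re
import unicodedata

BLOCKED_NOTICE = "[BLOCKED: file contained potential prompt injection]"

# Hypothetical heuristics, not Hermes' real rules.
OVERRIDE_RE = re.compile(
    r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.IGNORECASE
)
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
SUSPICIOUS_WORDS = ("ignore", "secret", "password", "token", "curl")


def has_suspicious_comment(text: str) -> bool:
    """True if any HTML comment body contains a suspicious keyword."""
    return any(
        word in body.lower()
        for body in COMMENT_RE.findall(text)
        for word in SUSPICIOUS_WORDS
    )


def has_invisible_chars(text: str) -> bool:
    """True if the text contains Unicode format characters (category Cf),
    which include zero-width spaces/joiners and bidirectional controls."""
    return any(unicodedata.category(ch) == "Cf" for ch in text)


def load_context_file(text: str) -> str:
    """Return the text unchanged, or the blocked notice if a check fires."""
    if OVERRIDE_RE.search(text) or has_suspicious_comment(text) or has_invisible_chars(text):
        return BLOCKED_NOTICE
    return text
```

Category `Cf` also includes the byte order mark and the soft hyphen, so a stricter or looser character set may be appropriate depending on how many false positives are tolerable.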

## MCP Credential Handling

Only `PATH`, `HOME`, `USER`, `LANG`, `LC_ALL`, `TERM`, `SHELL`, and `TMPDIR` are passed to MCP subprocesses.
All other variables, including API keys, tokens, and secrets, are stripped.
Anything set explicitly under `env:` in an `mcp_servers` entry is passed through intentionally.
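
The filtering rule amounts to an allowlist over the parent environment plus an explicit override. A minimal sketch, where `build_mcp_env` is a hypothetical helper name and the allowlist is the one given above:

```python
SAFE_ENV_KEYS = ("PATH", "HOME", "USER", "LANG", "LC_ALL", "TERM", "SHELL", "TMPDIR")


def build_mcp_env(parent_env, explicit_env=None) -> dict:
    """Keep only allowlisted variables, then layer the server's explicit env: on top."""
    env = {key: parent_env[key] for key in SAFE_ENV_KEYS if key in parent_env}
    env.update(explicit_env or {})  # explicit config is passed through intentionally
    return env
```

The result would be handed to the subprocess launcher, for example `subprocess.Popen(cmd, env=build_mcp_env(os.environ, server_env))`, so the child never inherits the agent's full environment.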

## Website Blocklist

```yaml
security:
  website_blocklist:
    enabled: true
    domains:
      - "*.internal.company.com"
      - "admin.example.com"
```
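
One plausible reading of the wildcard entries is shell-style glob matching against the URL's hostname, sketched below with `fnmatch`. Whether Hermes matches this way is an assumption, as is the `is_blocklisted` name. Note that under glob semantics `*.internal.company.com` does not match the bare `internal.company.com`, so the apex would need its own entry.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_blocklisted(url: str, domains) -> bool:
    """True if the URL's hostname matches any blocklist pattern (glob semantics assumed)."""
    host = urlsplit(url).hostname or ""  # hostname is lowercased by urlsplit
    return any(fnmatchcase(host, pattern.lower()) for pattern in domains)
```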

## Key ngn-agent Implications

| Concern | Mitigation |
|---------|------------|
| Mutating AWS commands in Docker | IAM policy on the `dev_Restricted` role (the real safety net) |
| `rm -rf /` inside container | Container is ephemeral, but hardline blocklist still blocks it |
| Agent modifying own code | Docker terminal means it can't touch host files |
| Accidental `terraform apply` | The container's limited IAM role lacks the permissions `apply` needs |
| Prompt injection | Context file scanning + approval system + container isolation |
| AWS creds inside container | `./.aws/` mounted read-only, limited role, no admin access |

## ngn-agent Config

```yaml
# Our Phase 1 config
approvals:
  mode: manual        # Start with manual approvals (checks are skipped while backend is docker, per the note above)
  timeout: 60
  cron_mode: deny     # Never auto-approve dangerous commands in cron

terminal:
  backend: docker     # Container is our security boundary
  # ... see config.yaml for details
```