Phase 2: memory, session search, git worktree configured
Phase 3: Telegram gateway connected, DM pairing approved
Phase 4: custom skills created (aws-diagnostics, jira-query, confluence-search, bitbucket-pr)
Hermes Agent: Security Model & Safety Features
7-Layer Security Model
- User authorization (allowlists, DM pairing)
- Dangerous command approval (manual/smart/off)
- Container isolation (Docker with hardened settings)
- MCP credential filtering (env stripping for subprocesses)
- Context file scanning (prompt injection detection)
- Cross-session isolation (no data sharing between sessions)
- Input sanitization (working directory allowlist)
Dangerous Command Approval
Three modes in ~/.hermes/config.yaml:
approvals:
  mode: manual      # manual | smart | off
  timeout: 60       # seconds to wait for a response
  cron_mode: deny   # deny | approve: what cron does when it hits a dangerous command
Pattern Detection
Hermes detects and prompts on these patterns (non-exhaustive):
| Category | Patterns |
|---|---|
| Delete | rm -r, rm --recursive, rm ... / |
| Permissions | chmod 777/666, o+w, a+w |
| Filesystem | mkfs, dd if=, > /dev/sd |
| SQL | DROP TABLE, DELETE FROM (no WHERE), TRUNCATE |
| System | > /etc/, systemctl stop/restart, kill -9 -1 |
| Remote exec | `curl ... \| ...` (download piped to a shell) |
| Script exec | python -e, perl -e, ruby -e |
| Sensitive writes | tee/>/>> to /etc/, ~/.ssh/, ~/.hermes/.env |
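To make the table concrete, here is a minimal sketch of regex-based pattern detection. The `DANGEROUS_PATTERNS` table and `classify` function are hypothetical names, and the regexes only approximate a few rows above; this is not Hermes's actual rule set.

```python
import re

# Hypothetical rule table: category label -> regex.
# Illustrative approximations of a few rows, not Hermes's real patterns.
DANGEROUS_PATTERNS = {
    "recursive delete": re.compile(r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*|--recursive)\b"),
    "world-writable permissions": re.compile(r"\bchmod\s+(777|666|[oa]\+w)\b"),
    "filesystem overwrite": re.compile(r"\bmkfs\b|\bdd\s+if=|>\s*/dev/sd"),
    "destructive SQL": re.compile(r"\b(DROP\s+TABLE|TRUNCATE)\b", re.IGNORECASE),
    "inline script execution": re.compile(r"\b(python|perl|ruby)\s+-e\b"),
}


def classify(command: str) -> list[str]:
    """Return the labels of every category whose pattern matches the command."""
    return [
        label
        for label, pattern in DANGEROUS_PATTERNS.items()
        if pattern.search(command)
    ]
```

An empty result means the command would run without a prompt in this sketch; any non-empty result would trigger the approval flow.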
Hardline Blocklist (Always-On, Even in YOLO Mode)
- rm -rf / and variants
- Fork bombs (:(){ :|:& };:)
- mkfs.* on mounted root
- dd if=/dev/zero of=/dev/sd*
- Piping untrusted URLs to sh at rootfs level
Approval Flow
⚠️ DANGEROUS COMMAND: recursive delete
rm -rf /tmp/old-project
[o]nce | [s]ession | [a]lways | [d]eny
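The four choices in the prompt differ only in how long the approval is remembered. A minimal sketch of that bookkeeping follows, assuming approvals are keyed by the detected category (for example "recursive delete"); the `ApprovalStore` class and its keying are hypothetical, since the source does not say how Hermes stores approvals.

```python
class ApprovalStore:
    """Tracks approvals by scope. Hypothetical sketch, not Hermes's implementation."""

    def __init__(self, persisted=None):
        self.always = set(persisted or ())  # would be loaded from disk in practice
        self.session = set()                # cleared when the session ends

    def is_approved(self, category: str) -> bool:
        """True if a previous [s]ession or [a]lways answer covers this category."""
        return category in self.always or category in self.session

    def record(self, category: str, choice: str) -> bool:
        """Apply the user's answer; return True if the command may run now."""
        if choice == "d":       # [d]eny
            return False
        if choice == "s":       # [s]ession: remember until the session ends
            self.session.add(category)
        elif choice == "a":     # [a]lways: remember across sessions
            self.always.add(category)
        elif choice != "o":     # [o]nce: run now, remember nothing
            raise ValueError(f"unknown choice: {choice!r}")
        return True
```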
YOLO Mode
hermes --yolo # Bypass all approval prompts for this session
/yolo # Toggle in-session
YOLO does NOT bypass the hardline blocklist.
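The precedence between the hardline blocklist, YOLO, and the approval mode can be sketched as a single decision function. The names `Decision` and `decide` are hypothetical, and `smart` mode is treated like `manual` here because its heuristics are not described above.

```python
from enum import Enum


class Decision(Enum):
    BLOCK = "block"  # refused outright
    RUN = "run"      # executed without a prompt
    ASK = "ask"      # approval prompt shown


def decide(is_hardline: bool, is_dangerous: bool, mode: str, yolo: bool) -> Decision:
    """Order of checks, as a sketch (not Hermes's actual code)."""
    if is_hardline:
        # Checked first, so neither YOLO nor mode=off can bypass it.
        return Decision.BLOCK
    if not is_dangerous:
        return Decision.RUN
    if yolo or mode == "off":
        return Decision.RUN
    # Assumption: "smart" falls through to a prompt, same as "manual".
    return Decision.ASK
```

Putting the hardline check first is what makes the guarantee hold regardless of the other settings.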
Docker Container Hardening
--cap-drop ALL # Drop ALL Linux capabilities
--cap-add DAC_OVERRIDE,CHOWN,FOWNER # Only add back what's needed
--security-opt no-new-privileges # Block privilege escalation
--pids-limit 256 # Limit process count
--tmpfs /tmp:rw,nosuid,size=512m # Size-limited temp dirs
Important: When the terminal backend is docker, dangerous command checks are skipped because the container itself is the security boundary. This is by design. The hardline blocklist still applies (see Key ngn-agent Implications).
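For illustration, the flags above could be assembled into a `docker run` argument list as follows. The helper name `hardened_docker_args` and the image in the usage note are hypothetical; the sketch passes one `--cap-add` per capability, which is the form the Docker CLI reference documents.

```python
def hardened_docker_args(image: str, command: list[str]) -> list[str]:
    """Build a docker run argv with the hardening flags listed above (sketch)."""
    args = ["docker", "run", "--cap-drop", "ALL"]
    # Add back only the capabilities named in the source.
    for capability in ("DAC_OVERRIDE", "CHOWN", "FOWNER"):
        args += ["--cap-add", capability]
    args += [
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--pids-limit", "256",                   # cap the process count
        "--tmpfs", "/tmp:rw,nosuid,size=512m",   # size-limited temp dir
        image,
    ]
    return args + list(command)
```

The result can be handed to `subprocess.run(...)`, for example `hardened_docker_args("python:3.12-slim", ["python", "--version"])`. Passing a list rather than a shell string avoids a second layer of shell parsing.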
Tirith Pre-Exec Scanning
Optional scanner for content-level threats:
- Homograph URL spoofing
- Pipe-to-interpreter patterns
- Terminal injection attacks
security:
  tirith_enabled: true
  tirith_fail_open: true   # Allow commands if tirith is unavailable
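The fail-open setting decides what happens when the scanner cannot give an answer. A minimal sketch, assuming a hypothetical `scanner` callable that returns True when a command is safe and raises (or is absent) when Tirith is unavailable:

```python
from typing import Callable, Optional


def allow_command(
    command: str,
    scanner: Optional[Callable[[str], bool]],
    fail_open: bool,
) -> bool:
    """Return True if the command may proceed (sketch, not Hermes's code)."""
    if scanner is None:
        # Scanner not installed: the fail_open setting decides.
        return fail_open
    try:
        return scanner(command)
    except Exception:
        # Scanner crashed or timed out: again the setting decides.
        return fail_open
```

With `fail_open=False`, an unavailable scanner blocks every command, which is stricter but can stall the agent.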
SSRF Protection
All URL-capable tools block:
- Private networks (RFC 1918): 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- Loopback: 127.0.0.0/8, ::1
- Cloud metadata: metadata.google.internal, 169.254.169.254
- CGNAT: 100.64.0.0/10
Opt-out: security.allow_private_urls: true (not recommended for ngn-agent)
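A check against these ranges can be sketched with Python's standard `ipaddress` module. The function name `is_blocked_host` is hypothetical, and the sketch assumes the opt-out disables the whole check, which the source does not confirm.

```python
import ipaddress

# Ranges taken from the SSRF list in this section.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
        "127.0.0.0/8", "::1/128",                         # loopback
        "169.254.169.254/32",                             # cloud metadata IP
        "100.64.0.0/10",                                  # CGNAT
    )
]
BLOCKED_HOSTNAMES = {"metadata.google.internal"}


def is_blocked_host(host: str, allow_private_urls: bool = False) -> bool:
    """Return True if a URL pointing at this host should be refused (sketch)."""
    if allow_private_urls:
        # Assumption: the opt-out turns off every check in this function.
        return False
    if host.lower().rstrip(".") in BLOCKED_HOSTNAMES:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A real check would resolve the name and
        # test every resolved address; this sketch does not do DNS.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

Checking only the literal host is not sufficient on its own, since a public hostname can resolve to a private address; resolution-time checks are out of scope for this sketch.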
Context File Injection Protection
Scans AGENTS.md, .cursorrules, SOUL.md for:
- Instructions to ignore prior instructions
- Hidden HTML comments with suspicious keywords
- Attempts to read secrets (.env, credentials)
- Credential exfiltration via curl
- Invisible Unicode characters
Blocked files show: [BLOCKED: file contained potential prompt injection]
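The checks above could be approximated as follows. The regexes, the keyword list, and the function names are hypothetical stand-ins; the source does not give Hermes's actual detection rules. Invisible characters are approximated here as Unicode category `Cf` (format characters such as zero-width spaces).

```python
import re
import unicodedata

BLOCKED_NOTICE = "[BLOCKED: file contained potential prompt injection]"

# Hypothetical heuristics, one per bullet in the list above.
OVERRIDE_RE = re.compile(
    r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.IGNORECASE
)
EXFIL_RE = re.compile(r"\bcurl\b[^\n]*(?:\.env|credentials)", re.IGNORECASE)
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
SUSPICIOUS_WORDS = ("ignore", "secret", "password", "token")


def has_suspicious_comment(text: str) -> bool:
    """True if any HTML comment body contains a suspicious keyword."""
    return any(
        word in body.lower()
        for body in COMMENT_RE.findall(text)
        for word in SUSPICIOUS_WORDS
    )


def has_invisible_chars(text: str) -> bool:
    """True if the text contains Unicode format (Cf) characters."""
    return any(unicodedata.category(ch) == "Cf" for ch in text)


def load_context_file(text: str) -> str:
    """Return the text unchanged, or the blocked notice if any check fires."""
    flagged = (
        OVERRIDE_RE.search(text) is not None
        or EXFIL_RE.search(text) is not None
        or has_suspicious_comment(text)
        or has_invisible_chars(text)
    )
    return BLOCKED_NOTICE if flagged else text
```

Heuristics like these produce false positives (a byte-order mark is also category `Cf`), so a real scanner would need tuning.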
MCP Credential Handling
Only PATH, HOME, USER, LANG, LC_ALL, TERM, SHELL, and TMPDIR are passed to MCP subprocesses.
All API keys, tokens, and secrets are stripped.
An explicit env: config in mcp_servers is passed through intentionally.
Website Blocklist
security:
  website_blocklist:
    enabled: true
    domains:
      - "*.internal.company.com"
      - "admin.example.com"
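One plausible reading of the wildcard entries is shell-style glob matching on the URL's hostname. The sketch below assumes that semantics, which the source does not confirm; `is_blocked_url` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_blocked_url(url: str, patterns: list[str]) -> bool:
    """True if the URL's hostname matches any blocklist pattern (sketch)."""
    host = (urlsplit(url).hostname or "").lower()
    # fnmatchcase is used so matching behaves the same on every platform;
    # both sides are lowercased because hostnames are case-insensitive.
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)
```

Under this reading, `*.internal.company.com` matches any subdomain at any depth but not the bare `internal.company.com`, so the apex would need its own entry.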
Key ngn-agent Implications
| Concern | Mitigation |
|---|---|
| Mutating AWS commands in Docker | IAM policy on dev_Restricted role — the real safety net |
| rm -rf / inside container | Container is ephemeral, but hardline blocklist still blocks it |
| Agent modifying own code | Docker terminal means it can't touch host files |
| Accidental terraform apply | Container has limited IAM — won't have apply perms |
| Prompt injection | Context file scanning + approval system + container isolation |
| AWS creds inside container | ./.aws/ mounted read-only, limited role, no admin access |
ngn-agent Config
# Our Phase 1 config
approvals:
  mode: manual      # Start with manual approvals
  timeout: 60
  cron_mode: deny   # Never auto-approve dangerous commands in cron
terminal:
  backend: docker   # Container is our security boundary
# ... see config.yaml for details