Apohara · safety
AGENTGUARD

The safety gate obfuscated commands can't slip past.

A deterministic, offline Rust safety layer for AI coding agents — an anti-bypass command gate that parses Bash structure instead of grepping for substrings, a seccomp + Landlock sandbox for the code an agent actually runs, and a prompt-injection input firewall. No model, no network at scan time.

$/plugin install apohara-agentguard
View source →

The demo

Same input, same verdict — structure over substrings.

agent session · agentguard PreToolUse · live recording
agentguard blocks three obfuscated destructive commands — a variable-aliased rm, a base64-decoded rm piped to a shell, and find . -delete — then allows a benign git commit whose message merely mentions rm -rf

Real output from the committed binary. Three obfuscated destructive commands a substring blocklist lets through — all Block; the benign commit whose message merely mentions rm -rf — Allow. The gate keys on structure, not tokens.

Why it exists

AI coding agents run shell commands on your machine; the common guard — a regex blocklist in a hook — is trivially bypassed by variable aliasing, base64, compound chains and whitespace tricks. AgentGuard parses the command structure instead of matching strings, and for anything that runs adds a fail-closed seccomp + Landlock sandbox. Deterministic. No LLM in the loop.

What it does

Three layers — detect, contain, filter.

Anti-bypass gate

Parses compound bash, resolves variables, decodes base64 / ANSI-C. Proptest-verified — keyed on structure, not tokens.

Real local sandbox

seccomp + Landlock, fail-closed: network-denied, filesystem-scoped. No Docker, no cloud. Linux today.

Injection firewall

A deterministic prefilter scans tool results and fetched content for prompt-injection patterns before they reach the agent. No model call.

Honest by design

What the code backs — and where the boundary sits.

What it catches

  • Variable-aliased & compound destructive commands
  • base64 / ANSI-C decode-and-pipe-to-shell
  • Filesystem / disk / credential-read patterns
  • Sandboxed execution reaching the network

What it does not

  • A safety hook, not an escape-proof jail
  • Nested encoders & non-literal substitutions out of scope (documented)
  • seccomp sandbox is Linux-only — fails closed elsewhere
  • Deterministic, not AI — paraphrase attacks are a known boundary

The full evasion scorecard ships in the repo. Publishing where the boundary sits is the difference between a safety claim and a marketing claim.

Quick start

Three steps to a guarded Bash tool.

  1. 1
    /plugin marketplace add SuarezPM/apohara-agentguard

    Add the marketplace in Claude Code.

  2. 2
    /plugin install apohara-agentguard

    Installs the PreToolUse hook (or cargo install apohara-agentguard for the standalone binary).

  3. 3
    agentguard check "rm -rf /"

    Verify it's live; the hook now guards every Bash call.

Dual MIT / Apache-2.0 Rust single binary No model offline SECURITY.md + threat model Fuzzed proptest-gated