AI tools run code on your machine, read your files, make network requests. Most of the time, they do exactly what you asked.
But “most of the time” isn’t “every time.”
The issue isn’t whether AI models have malicious intent — they usually don’t. The issue is that, given insufficient context, ambiguous instructions, or an injected adversarial prompt, a model will do things it believes are correct but that are actually harmful.
Where Is the Boundary?
Traditional security thinking places the defense before the AI — restrict what data it can access, sandbox its execution environment, vet its inputs.
WardnMesh chose a different position: the stdin/stdout exchange point.
When you run wardn run claude, WardnMesh starts Claude Code as a subprocess and inserts itself in between. All data flowing in and out passes through a SecurityTransform: 243 rules scan every chunk, adding under 5ms of latency.
The core logic: whatever the AI intends to do, it has to come through this checkpoint first.
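The interception pattern described above can be sketched in a few lines. This is a minimal illustration, not WardnMesh's actual implementation: the rule list, the names scan_chunk and run_checkpointed, and the 4KB chunk size are all assumptions for the sake of the example.

```python
import re
import subprocess

# Illustrative stand-ins -- WardnMesh ships 243 rules; these two are not from it.
RULES = [
    ("command_injection", re.compile(rb"rm\s+-rf\s+/")),
    ("data_exfiltration", re.compile(rb"curl\s+\S+\s*\|\s*sh")),
]

def scan_chunk(chunk: bytes):
    """Return the name of the first rule that matches, or None if clean."""
    for name, pattern in RULES:
        if pattern.search(chunk):
            return name
    return None

def run_checkpointed(cmd):
    """Run cmd as a subprocess; scan every stdout chunk before relaying it."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    relayed = []
    for chunk in iter(lambda: proc.stdout.read(4096), b""):
        hit = scan_chunk(chunk)
        if hit:
            proc.kill()  # the checkpoint stops the flow, not the model
            raise RuntimeError(f"blocked by rule: {hit}")
        relayed.append(chunk)
    proc.wait()
    return b"".join(relayed)
```

A real proxy would also wrap stdin and stderr and stream chunks onward instead of buffering them, but the checkpoint idea is the same: nothing reaches the terminal without passing the scan.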
Fail-Closed
In security design, “fail-open” and “fail-closed” represent fundamentally different philosophies.
Fail-open assumes that allowing traffic through is the safe default — only block what’s clearly dangerous. This is smoother for users, but in the worst case, if the scanner breaks, everything passes through.
WardnMesh is fail-closed: scanner failure, timeout, any exception — the default result is a block. Availability yields to security.
This has a cost. Users will hit more interruptions and more confirmation prompts. But for a security layer monitoring AI tools, this is the right default.
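Fail-closed is easy to state and easy to get wrong in code. A sketch of the wrapper shape, assuming a 5ms per-chunk budget to match the latency figure above (the function and constant names are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

SCAN_TIMEOUT_S = 0.005  # hypothetical 5 ms budget per chunk

def fail_closed_scan(scanner, chunk, timeout=SCAN_TIMEOUT_S):
    """Run scanner(chunk); any crash, hang, or timeout defaults to 'block'."""
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            # result() re-raises scanner exceptions and raises on timeout
            return pool.submit(scanner, chunk).result(timeout=timeout)
    except Exception:
        # Fail-closed: a broken scanner must never mean "everything passes".
        return "block"
```

The whole philosophy lives in that except clause: the error path returns "block", not "allow".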
Local-First as Design, Not Limitation
Keeping audit logs local is a core architectural decision in WardnMesh — not because cloud is hard, but because cloud introduces a new trust problem: who monitors the monitor?
If your AI security scanning records live on another company’s servers, your ability to interpret “what happened” is no longer yours. Local SQLite means the records are yours: portable, available offline, transferable to any audit process.
For organizations deploying AI tools in real environments, this isn’t a special requirement. It’s a prerequisite.
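The article names SQLite as the storage; the schema and function names below are illustrative guesses, not WardnMesh's actual tables. The point they demonstrate is that a local audit log is a single portable file with no third party in the read path.

```python
import sqlite3
import time

def open_audit_log(path="wardn_audit.db"):  # path is illustrative
    """Open (or create) a local audit database; the file stays on disk you own."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS events (
        ts REAL, rule TEXT, verdict TEXT, excerpt TEXT)""")
    return db

def record(db, rule, verdict, excerpt):
    """Append one scan decision to the local log."""
    db.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
               (time.time(), rule, verdict, excerpt))
    db.commit()
```

Handing the records to an audit process is then just copying the .db file; reading them offline is a plain SELECT.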
What 243 Rules Cover
WardnMesh’s ruleset covers four major categories: code injection, command injection, data exfiltration, and privilege escalation.
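One way to picture a categorized ruleset is as tagged patterns. The four patterns below are crude stand-ins invented for illustration; they are not drawn from WardnMesh's 243 rules.

```python
import re

# One illustrative pattern per category (the real ruleset has 243).
RULES = [
    ("code_injection",       re.compile(r"eval\s*\(")),
    ("command_injection",    re.compile(r";\s*rm\s+-rf")),
    ("data_exfiltration",    re.compile(r"curl\s+https?://\S+\s+-d\b")),
    ("privilege_escalation", re.compile(r"chmod\s+\+s\b")),
]

def categorize(text):
    """Return every category whose pattern matches the text."""
    return [cat for cat, pat in RULES if pat.search(text)]
```

Real detection rules are of course narrower and more numerous than one regex per category, which is exactly why keeping them current is a community effort.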
These aren’t hypothetical threats — they’re documented attack vectors from real AI tool deployments. Adversarial prompts will continue to evolve, and the ruleset must keep up. As an open-source tool, WardnMesh’s most important long-term challenge is community velocity: can rule updates keep pace with new attack patterns?
The Open Question
The biggest enemy of security tooling is user fatigue. If false positive rates are too high, users learn to reflexively click “allow,” and the entire mechanism becomes security theater.
WardnMesh has decision caching (allow once / for session / for project / always) to reduce confirmation friction. But the right balance between caching convenience and false positive rates still needs validation against real usage patterns.
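The four caching scopes the article lists could be modeled roughly as follows. The class, key scheme, and scope names as identifiers are all hypothetical; only the four scopes themselves come from the article.

```python
class DecisionCache:
    """Cache user 'allow' decisions at four scopes: once, session, project, always."""

    def __init__(self):
        self._store = {}  # (scope_key, rule) -> "allow"

    def remember(self, rule, scope, project=None, session=None):
        key = {
            "once":    None,                   # deliberately never cached
            "session": ("session", session),
            "project": ("project", project),
            "always":  ("always", "*"),
        }[scope]
        if key is not None:
            self._store[(key, rule)] = "allow"

    def lookup(self, rule, project=None, session=None):
        """Return 'allow' if any applicable scope cached it, else None (re-prompt)."""
        for key in [("always", "*"), ("project", project), ("session", session)]:
            if (key, rule) in self._store:
                return "allow"
        return None
```

Every None returned here is a confirmation prompt shown to the user, which makes the cache hit rate and the false positive rate two sides of the same fatigue problem.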