When an AI agent makes a high-stakes decision — approving a claim, escalating an incident, authorising a procedure — knowing what it decided isn't enough. You need to know whether the reasoning behind that decision was structurally sound before it commits.
That's what SENTINEL does. It sits behind agentgateway as an MCP server and audits agent reasoning quality in real time — checking whether evidence was complete, current, and properly weighted before a decision goes through.
I built SENTINEL for the AI Agent & MCP Hackathon (Secure & Govern MCP track). It demonstrates how agentgateway's MCP governance primitives — RBAC, session management, audit logging — can be combined with domain-specific reasoning audits to create a practical governance layer for autonomous agents.
Consider a healthcare AI agent — MedAgent — processing prior authorisation decisions. It evaluates whether an insurance payer will approve or deny a requested procedure. On Aetna claims, MedAgent reports 89% confidence. That number looks fine in any dashboard.
But SENTINEL has been tracking actual outcomes for 60 days. The reality:
Without SENTINEL, every one of those decisions auto-executes. Patients receive incorrect denials. Appeals pile up. Revenue leaks. Nobody notices until a quarterly audit — weeks or months later.
SENTINEL Observatory Dashboard — The reliability heatmap shows Aetna's accuracy declining week over week while UnitedHealthcare remains healthy.
SENTINEL is built in Go and runs as an MCP server (Streamable HTTP transport) behind Solo.io's agentgateway. The architecture:
MCP Clients (Claude, GPT, Custom Agents)
             │
             ▼
┌─────────────────────────────────┐
│ agentgateway (:3000)            │
│ • CEL-based RBAC per MCP tool   │
│ • Session management            │
│ • Audit logging (all calls)     │
│ • CORS for playground access    │
└────────────┬────────────────────┘
             │ Streamable HTTP
             ▼
┌─────────────────────────────────┐
│ SENTINEL MCP Server (:8081)     │
│ 4 MCP Tools                     │
└────────────┬────────────────────┘
             │
    ┌────────┼────────┬───────────┐
    ▼        ▼        ▼           ▼
Fidelity Pattern Reliability  Authority
Auditor  Library Scorer       Gate
agentgateway adds the security and governance layer that SENTINEL itself doesn't need to implement. CEL policies control who can call which tool:
mcpAuthorization:
  rules:
    # Public: anyone can read failure patterns
    - 'mcp.tool.name == "sentinel_patterns"'
    - 'mcp.tool.name == "sentinel_reliability"'
    # Authenticated: JWT required to run evaluations
    - 'mcp.tool.name == "sentinel_evaluate" && has(jwt.sub)'
    # Privileged: operator role to pull from Datadog
    - 'mcp.tool.name == "sentinel_pull_decisions" && has(jwt.sub) && "operator" in jwt.roles'
This means read-only tools (patterns, reliability) are accessible to any MCP client. Running the evaluation pipeline requires authentication. Pulling raw decision events from Datadog requires operator privileges. All invocations are audit-logged through agentgateway.
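To make the effect of these rules concrete, here is a minimal Go sketch that mirrors the four CEL expressions as a plain function. It is not how agentgateway evaluates policy. The `Caller` type and the `allowed` function are hypothetical names used only for illustration, and two modelling choices are assumptions: a missing `jwt.sub` is represented as an empty string, and a call that matches no rule is denied.

```go
package main

import "fmt"

// Caller is a hypothetical stand-in for the JWT claims the CEL rules inspect.
type Caller struct {
	Subject string   // jwt.sub; an empty string models "no token presented"
	Roles   []string // jwt.roles
}

// hasRole reports whether the caller carries the given role.
func hasRole(c Caller, role string) bool {
	for _, r := range c.Roles {
		if r == role {
			return true
		}
	}
	return false
}

// allowed mirrors the four rules from the policy above.
// Assumption: a tool that matches no rule is denied.
func allowed(tool string, c Caller) bool {
	authenticated := c.Subject != ""
	switch tool {
	case "sentinel_patterns", "sentinel_reliability":
		return true // public, read-only
	case "sentinel_evaluate":
		return authenticated
	case "sentinel_pull_decisions":
		return authenticated && hasRole(c, "operator")
	default:
		return false
	}
}

func main() {
	anonymous := Caller{}
	operator := Caller{Subject: "alice", Roles: []string{"operator"}}

	fmt.Println("anonymous -> sentinel_patterns:", allowed("sentinel_patterns", anonymous))
	fmt.Println("anonymous -> sentinel_evaluate:", allowed("sentinel_evaluate", anonymous))
	fmt.Println("operator  -> sentinel_pull_decisions:", allowed("sentinel_pull_decisions", operator))
}
```

Keeping the policy in the gateway rather than in a function like this one is the point of the design: SENTINEL's tool handlers stay free of authorization logic.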
agentgateway UI — SENTINEL's MCP tools visible in the gateway with tool-level RBAC enforcement via CEL policies.
When an agent decision arrives at sentinel_evaluate, it passes through four stages:
SENTINEL inspects every piece of evidence the agent retrieved. For each signal (payer policy, patient history, step therapy docs, clinical criteria), it checks:
• STALE_POLICY flag (CRITICAL)
• INCOMPLETE_RETRIEVAL flag
• TIMEOUT_ON_CRITICAL flag
• WEIGHT_DIVERGENCE flag
• MISSING_SIGNAL flag
• CONFIDENCE_MISMATCH flag
The output is a fidelity score (0.0–1.0) with per-signal audit details and suggested fixes.
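As a rough illustration of how per-signal checks could roll up into a single score, the Go sketch below covers three of the six flags. Only the flag names come from the list above. The `Signal` fields, the 30-day freshness limit, the penalty values, and the subtract-and-clamp scoring rule are all assumptions made for the example, not SENTINEL's actual implementation.

```go
package main

import "fmt"

// Flag names are taken from the list above; everything else is illustrative.
type Flag string

const (
	StalePolicy         Flag = "STALE_POLICY"
	IncompleteRetrieval Flag = "INCOMPLETE_RETRIEVAL"
	MissingSignal       Flag = "MISSING_SIGNAL"
)

// Signal is a hypothetical record of one piece of retrieved evidence.
type Signal struct {
	Name     string
	Present  bool // was the signal retrieved at all?
	Complete bool // did retrieval return the full document?
	AgeDays  int  // days since the source was last updated
}

// Finding ties a flag to the signal that raised it.
type Finding struct {
	Signal string
	Flag   Flag
}

const maxAgeDays = 30 // assumed freshness limit

// penalties are assumed values; the heaviest goes to the flag marked CRITICAL.
var penalties = map[Flag]float64{
	StalePolicy:         0.4,
	IncompleteRetrieval: 0.2,
	MissingSignal:       0.3,
}

// auditFidelity checks each signal and returns the findings plus a score
// in [0, 1], computed as 1 minus the summed penalties, clamped at 0.
func auditFidelity(signals []Signal) ([]Finding, float64) {
	var findings []Finding
	for _, s := range signals {
		if !s.Present {
			findings = append(findings, Finding{s.Name, MissingSignal})
			continue // nothing else to check on a missing signal
		}
		if s.AgeDays > maxAgeDays {
			findings = append(findings, Finding{s.Name, StalePolicy})
		}
		if !s.Complete {
			findings = append(findings, Finding{s.Name, IncompleteRetrieval})
		}
	}
	score := 1.0
	for _, f := range findings {
		score -= penalties[f.Flag]
	}
	if score < 0 {
		score = 0
	}
	return findings, score
}

func main() {
	findings, score := auditFidelity([]Signal{
		{Name: "payer_policy", Present: true, Complete: true, AgeDays: 90},
		{Name: "patient_history", Present: true, Complete: false, AgeDays: 1},
	})
	for _, f := range findings {
		fmt.Printf("%s: %s\n", f.Signal, f.Flag)
	}
	fmt.Printf("fidelity score: %.2f\n", score)
}
```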
The fidelity flags are matched against a library of learned failure signatures. Each pattern carries historical accuracy data:
Pattern accuracy updates via exponential moving average as new outcomes resolve.
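An exponential moving average folds each resolved outcome into the running accuracy with a fixed weight, so recent outcomes count more than old ones. The Go sketch below shows one such update step. The function name `updateAccuracy` and the smoothing factor of 0.1 are assumptions for the example; the source does not state which value SENTINEL uses.

```go
package main

import "fmt"

// updateAccuracy applies one exponential-moving-average step:
//   new = alpha*outcome + (1-alpha)*prev
// where outcome is 1 for a correct decision and 0 for an incorrect one.
// alpha (0 < alpha <= 1) controls how quickly old outcomes are forgotten.
func updateAccuracy(prev float64, correct bool, alpha float64) float64 {
	outcome := 0.0
	if correct {
		outcome = 1.0
	}
	return alpha*outcome + (1-alpha)*prev
}

func main() {
	const alpha = 0.1 // assumed smoothing factor
	accuracy := 0.23  // starting point: the Pattern Delta accuracy quoted in this post

	// Hypothetical sequence of resolved outcomes for the pattern.
	for i, correct := range []bool{true, false, false, true} {
		accuracy = updateAccuracy(accuracy, correct, alpha)
		fmt.Printf("after outcome %d (correct=%v): %.4f\n", i+1, correct, accuracy)
	}
}
```

A larger `alpha` makes the pattern's accuracy react faster to drift at the cost of more noise from individual outcomes.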
SENTINEL Pattern Library — Each failure signature carries historical accuracy data. Pattern Delta (Stale + Incomplete) has only 23% accuracy.
SENTINEL maintains rolling accuracy profiles per agent × payer combination. It computes:
Confidence vs Reality Gap — The agent reports ~89% confidence but actual accuracy is ~72%. The Aetna drift chart shows the 8-week decline from 67% to 0%.
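The confidence-versus-reality gap can be expressed as mean reported confidence minus observed accuracy over a rolling window. The Go sketch below computes that for one agent × payer pair. The type names, the window parameter, and the sample outcomes are hypothetical; they are not taken from SENTINEL's code or data.

```go
package main

import "fmt"

// Outcome is a hypothetical record of one resolved decision.
type Outcome struct {
	Confidence float64 // confidence the agent reported, in [0, 1]
	Correct    bool    // whether the decision matched the real outcome
}

// ProfileKey identifies one agent × payer combination.
type ProfileKey struct{ Agent, Payer string }

// Profile summarises a window of outcomes.
type Profile struct {
	N              int
	MeanConfidence float64
	Accuracy       float64
	Gap            float64 // MeanConfidence - Accuracy; positive means overconfident
}

// rollingProfile summarises the most recent `window` outcomes.
// A window of 0 (or less) means "use everything".
func rollingProfile(outcomes []Outcome, window int) Profile {
	if window > 0 && len(outcomes) > window {
		outcomes = outcomes[len(outcomes)-window:]
	}
	if len(outcomes) == 0 {
		return Profile{}
	}
	var confSum float64
	correct := 0
	for _, o := range outcomes {
		confSum += o.Confidence
		if o.Correct {
			correct++
		}
	}
	n := float64(len(outcomes))
	p := Profile{
		N:              len(outcomes),
		MeanConfidence: confSum / n,
		Accuracy:       float64(correct) / n,
	}
	p.Gap = p.MeanConfidence - p.Accuracy
	return p
}

func main() {
	// Made-up outcomes for illustration only.
	history := map[ProfileKey][]Outcome{
		{Agent: "MedAgent", Payer: "Aetna"}: {
			{0.9, true}, {0.9, false}, {0.9, true}, {0.9, true},
		},
	}
	for key, outcomes := range history {
		p := rollingProfile(outcomes, 50)
		fmt.Printf("%s/%s: n=%d confidence=%.2f accuracy=%.2f gap=%.2f\n",
			key.Agent, key.Payer, p.N, p.MeanConfidence, p.Accuracy, p.Gap)
	}
}
```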
The gate combines fidelity, pattern, and reliability data into a verdict:
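One simple way a gate could combine three scores is a weakest-link rule: the lowest of the three decides the verdict. The Go sketch below is an assumption about how that might look, not SENTINEL's actual logic. `HUMAN_REQUIRED` and `QUARANTINE` are verdict names that appear in this post; `AUTO_EXECUTE`, the `GateInput` type, and both thresholds are hypothetical.

```go
package main

import "fmt"

// Verdict is the authority level the gate assigns to a decision.
type Verdict string

const (
	AutoExecute   Verdict = "AUTO_EXECUTE"   // hypothetical name for the pass-through case
	HumanRequired Verdict = "HUMAN_REQUIRED" // named in this post
	Quarantine    Verdict = "QUARANTINE"     // named in this post
)

// GateInput bundles the three upstream results (all in [0, 1]).
type GateInput struct {
	Fidelity        float64 // score from the fidelity audit
	PatternAccuracy float64 // historical accuracy of the matched pattern; 1.0 if none matched
	Reliability     float64 // rolling accuracy for this agent × payer pair
}

// Assumed thresholds, chosen only for the example.
const (
	quarantineBelow = 0.5
	humanBelow      = 0.8
)

// decide applies a weakest-link rule: the lowest of the three scores
// determines the verdict.
func decide(in GateInput) Verdict {
	weakest := in.Fidelity
	if in.PatternAccuracy < weakest {
		weakest = in.PatternAccuracy
	}
	if in.Reliability < weakest {
		weakest = in.Reliability
	}
	switch {
	case weakest < quarantineBelow:
		return Quarantine
	case weakest < humanBelow:
		return HumanRequired
	default:
		return AutoExecute
	}
}

func main() {
	// A decision matching a pattern with 23% historical accuracy.
	fmt.Println(decide(GateInput{Fidelity: 0.9, PatternAccuracy: 0.23, Reliability: 0.72}))
	// A decision with no weak input.
	fmt.Println(decide(GateInput{Fidelity: 0.95, PatternAccuracy: 1.0, Reliability: 0.9}))
}
```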
Every verdict is emitted to Datadog as a custom event with full context: decision ID, authority level, fidelity score, pattern detected, reliability score. SENTINEL also creates Datadog monitors that alert when per-payer accuracy crosses a threshold, closing the drift-detection feedback loop.
Each decision is logged to Braintrust as an eval span. When outcomes resolve (days later), the ground truth is attached. This gives a persistent eval dataset: how often SENTINEL's verdicts were correct, which patterns were misclassified, and calibration quality over time.
When the authority gate issues HUMAN_REQUIRED or QUARANTINE, SENTINEL automatically creates a Cleric incident with the full decision context: the original agent reasoning, the fidelity audit, the pattern match, the reliability profile. A human billing specialist gets a structured investigation package, not a raw alert.
SENTINEL generates a daily reliability briefing script (overnight accuracy, drift alerts, blocked decisions, top patterns) and sends it to ElevenLabs for text-to-speech synthesis. The ops team gets a spoken summary at standup — a different modality for a different workflow.
The hackathon demo runs a pre-seeded scenario with 60 days of synthetic outcomes showing Aetna's drift from 84% to 44% accuracy. Then, live:
sentinel_evaluate MCP tool (through agentgateway with CEL RBAC).
The jaw-drop moment: the agent was 89% confident in a decision that belongs to a pattern with 23% historical accuracy. SENTINEL caught the gap. Without it, the decision auto-executes.
Live Decision Feed — SENTINEL intercepts decisions in real time. Aetna claims are blocked due to Pattern Delta, while UnitedHealthcare claims pass through cleanly.
agentgateway Playground — Any MCP client can connect to SENTINEL through agentgateway. Tool-level RBAC controls access, and every invocation is logged.
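For clients other than the playground, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The Go sketch below only builds that request body for the public `sentinel_patterns` tool; it does not send it. The helper name `buildToolCall` is hypothetical, and passing empty arguments is an assumption, since the tool's input schema is not shown in this post. A real client would also complete the MCP `initialize` handshake with the gateway before calling a tool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// buildToolCall serialises a tools/call request for the given tool.
// A nil argument map is sent as an empty JSON object.
func buildToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	if args == nil {
		args = map[string]any{}
	}
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	body, err := buildToolCall(1, "sentinel_patterns", nil)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	// This body would be POSTed to the gateway's MCP endpoint
	// (http://localhost:3000 in the setup described in this post).
	fmt.Println(string(body))
}
```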
# 1. Clone and configure
git clone https://github.com/espirado/agent-secure.git
cd agent-secure
cp .env.example .env # Fill in API keys
# 2. Start SENTINEL with demo data
SEED_DEMO_DATA=true go run ./cmd/sentinel
# 3. (Recommended) Run behind agentgateway
curl -sL https://agentgateway.dev/install | bash
agentgateway -f agentgateway.yaml
# 4. Open agentgateway playground
open http://localhost:15000/ui/playground/
# Connect to http://localhost:3000 and try SENTINEL's MCP tools
# 5. Run the full demo script
chmod +x run_demo.sh
./run_demo.sh
As AI agents take on more autonomy in high-stakes domains — healthcare, finance, infrastructure — there's a growing need for governance that goes beyond compliance checklists. You need to understand whether the reasoning behind a decision was sound: was the evidence complete? Was it current? Did the agent's confidence actually match reality?
SENTINEL answers those questions at the MCP protocol layer. By running behind agentgateway, every tool invocation is secured with CEL-based RBAC, every call is audit-logged, and the reasoning audit itself is just another MCP tool that any client can call.
The combination of agentgateway's governance primitives with SENTINEL's domain-specific reasoning audits shows what MCP-native agent governance can look like in practice — not bolted on after the fact, but built into the protocol layer where agent decisions actually flow.