1. Why this guide exists
Most teams discover they need agent governance after the first agent is already in production. By that point, one of three things has usually already happened: a hallucinated answer reached a customer, a tool call hit a system that should have been off-limits, or a payload containing personally identifiable information went to a vendor model with nobody redacting it on the way.
This guide is for the team that doesn't want to be that team. It assumes you've decided to build with AI agents and need a working playbook for governing them at enterprise scale.
2. Defining AI agent governance
AI agent governance is the discipline of defining, enforcing, and proving the rules under which AI agents operate inside an organisation. It covers:
- What an agent can see (data access).
- What an agent can do (tool / action permissions).
- What an agent says (output controls, redaction, LLM checks).
- What an agent leaves behind (audit logs, retention).
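To make those four dimensions concrete, here is a minimal sketch of a per-agent policy object. The `AgentPolicy` type and its field names are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class AgentPolicy:
    """Illustrative per-agent policy covering the four governance dimensions."""
    agent_id: str
    data_scopes: set[str] = field(default_factory=set)      # what it can see
    allowed_tools: set[str] = field(default_factory=set)    # what it can do
    output_checks: list[str] = field(default_factory=list)  # what it says
    log_retention_days: int = 365                           # what it leaves behind

policy = AgentPolicy(
    agent_id="support-triage-bot",
    data_scopes={"tickets:read"},
    allowed_tools={"search_kb", "create_ticket"},
    output_checks=["pii_redaction", "llm_policy_check"],
    log_retention_days=2555,  # e.g. a seven-year retention regime
)
```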
It is not model governance. Model governance is about choosing which LLMs to use and how to evaluate them; agent governance is about the behaviour of the agents built on top of those models.
3. Stakeholders and concerns
A governance programme needs buy-in from four roles, each with a different lens:
| Role | What they care about |
|---|---|
| CIO / AI lead | Time-to-deploy, board-level risk, vendor strategy |
| Risk & Compliance | Auditability, regulatory mapping, policy enforcement |
| Security | Identity, tool permissions, data exfiltration paths |
| Platform / CTO | Architecture, MCP, model independence, observability |
4. The four categories of risk
Every agent incident we've seen falls into one of four buckets. Tag your incident log against these from day one:
- Data exposure — the agent saw something it shouldn't have, or leaked it downstream.
- Unauthorised action — the agent used a tool, triggered a workflow, or wrote to a system it wasn't approved for.
- Hallucination on ungrounded data — the agent guessed because it lacked safe access to the truth.
- Audit failure — you can't reconstruct what the agent did, why, or for whom.
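Encoding the four buckets as an enum makes the day-one tagging habit cheap. A minimal sketch; the helper and record shape are illustrative, not a prescribed schema:

```python
from enum import Enum
from datetime import datetime, timezone

class AgentRisk(Enum):
    DATA_EXPOSURE = "data_exposure"
    UNAUTHORISED_ACTION = "unauthorised_action"
    UNGROUNDED_HALLUCINATION = "ungrounded_hallucination"
    AUDIT_FAILURE = "audit_failure"

def log_incident(agent_id: str, category: AgentRisk, summary: str) -> dict:
    """Return a structured incident record tagged with one of the four buckets."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "category": category.value,
        "summary": summary,
    }

incident = log_incident(
    "support-triage-bot",
    AgentRisk.DATA_EXPOSURE,
    "Ticket body containing an email address sent to vendor model",
)
```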
5. The controls that close those risks
Five controls, ordered roughly by sequence of rollout:
- Identity per agent. Each agent has its own credential, separate from the human user. This makes auditing tractable and revocation possible.
- Default-deny tool allowlists. Agents only get the tools they explicitly need; most need 5–10, not 50. See the enforcement sketch after this list.
- Redaction at the boundary. PII never leaves the perimeter unmasked. Use entity-aware redactors (Presidio or an equivalent), not regex; see the redaction sketch after this list.
- LLM checks for fuzzy policy. Use a second model to validate intent, consent, data-purpose, and minimisation rules at the boundary.
- Structured audit logs. Logs that are queryable, not free-text. Map fields to GDPR, HIPAA, SOX, and ISO 42001 control IDs.
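A default-deny allowlist is small enough to show in full. This sketch assumes a gateway that intercepts tool calls before execution; the names are illustrative:

```python
class ToolCallDenied(Exception):
    pass

ALLOWLISTS: dict[str, set[str]] = {
    # Default-deny: an agent absent from this map gets no tools at all.
    "support-triage-bot": {"search_kb", "create_ticket"},
}

def authorise_tool_call(agent_id: str, tool: str) -> None:
    """Raise unless the tool is explicitly allowlisted for this agent."""
    allowed = ALLOWLISTS.get(agent_id, set())
    if tool not in allowed:
        raise ToolCallDenied(f"{agent_id} is not approved for {tool!r}")

authorise_tool_call("support-triage-bot", "create_ticket")    # passes silently
# authorise_tool_call("support-triage-bot", "delete_customer")  # would raise
```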
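For boundary redaction, Presidio splits the work into analysis (finding entities) and anonymisation (masking them). A minimal sketch, assuming the `presidio-analyzer` and `presidio-anonymizer` packages and their spaCy model dependency are installed:

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Contact Jane Doe at jane.doe@example.com about invoice 4411."

# Entity-aware detection: names, email addresses, phone numbers, etc.
results = analyzer.analyze(text=text, language="en")

# Mask everything detected before the payload leaves the perimeter.
redacted = anonymizer.anonymize(text=text, analyzer_results=results)
print(redacted.text)  # e.g. "Contact <PERSON> at <EMAIL_ADDRESS> about invoice 4411."
```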
6. A 90-day rollout plan
A realistic sequence for a regulated enterprise:
Days 0–30: Inventory + baseline
List every agent in production today. Stand up a governance gateway in shadow mode that logs but does not block, as in the sketch below. Catalogue the actual tools, data sources, and providers in use.
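A minimal sketch of shadow mode: wrap each tool so every call is logged for the baseline but nothing is ever blocked. The decorator and names are illustrative:

```python
import functools
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("governance.shadow")

def shadow_mode(agent_id: str):
    """Log every tool call for the baseline inventory; never block."""
    def decorator(tool_fn):
        @functools.wraps(tool_fn)
        def wrapper(*args, **kwargs):
            logger.info(json.dumps({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "agent_id": agent_id,
                "tool": tool_fn.__name__,
                "mode": "shadow",  # observed, not enforced
            }))
            return tool_fn(*args, **kwargs)  # always executes in shadow mode
        return wrapper
    return decorator

@shadow_mode("support-triage-bot")
def create_ticket(subject: str) -> str:
    return f"TICKET-{hash(subject) % 10_000}"
```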
Days 31–60: Enforce + redact
Flip from shadow mode to enforcement on the three highest-risk agents. Apply redaction rules for the entity types you actually saw during the baseline. Start producing the structured audit log your risk team will live in.
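A sketch of one structured audit record; the field names and the control-ID mapping are illustrative, not a standard:

```python
import json
from datetime import datetime, timezone

audit_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "agent_id": "support-triage-bot",
    "principal": "jane.doe@corp.example",  # the human the agent acted for
    "action": "tool_call",
    "tool": "create_ticket",
    "decision": "allow",
    "redactions": [{"entity": "EMAIL_ADDRESS", "count": 1}],
    "controls": ["GDPR-Art30", "ISO42001-A.7.4"],  # illustrative control IDs
}
print(json.dumps(audit_record))  # one JSON object per line: queryable, not free-text
```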
Days 61–90: Scale + audit
Roll the gateway across every agent. Wire continuous agent-to-agent audits. Map the audit log to your regulatory framework and validate with a friendly internal-audit pass.
7. Metrics worth measuring
- Number of agents in production, per governance status (pass / fail).
- Redactions applied per day, by entity type.
- Policy blocks per day, by violation type.
- Median and p95 latency added by the governance layer.
- Audit log retention versus the strictest applicable regulation.
- Time-to-remediate when policy drift is detected.
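Most of these metrics fall out of a structured audit log with a few lines of aggregation. A sketch, assuming one JSON record per line in the shape of the audit example above; the log path and `violation` field are illustrative:

```python
import json
from collections import Counter

redactions_by_entity: Counter = Counter()
blocks_by_violation: Counter = Counter()

with open("audit.log") as fh:  # illustrative path
    for line in fh:
        record = json.loads(line)
        # Redactions applied, by entity type.
        for r in record.get("redactions", []):
            redactions_by_entity[r["entity"]] += r["count"]
        # Policy blocks, by violation type.
        if record.get("decision") == "block":
            blocks_by_violation[record.get("violation", "unknown")] += 1

print(redactions_by_entity.most_common())
print(blocks_by_violation.most_common())
```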