Typology of interpretive drifts in agentic systems
This framework proposes a structured typology of interpretive drifts observable in agentic systems, on the open web as well as in closed environments, in order to make these drifts identifiable, auditable, and governable.
Status:
Canonical framework (applicable reading grid). This page describes not occasional bugs but classes of systemic drift linked to unbounded inference, entity reconstruction, and the absence of explicit jurisdiction.
An interpretive drift is not necessarily a factual hallucination. In many cases, the response produced is coherent, cautious, and plausible. The problem is not the form of the response but its status: it oversteps a perimeter, generalizes beyond what the source supports, or introduces an unauthorized implicit norm.
This framework provides a common grid for recognizing these drifts, independently of the model, provider, or execution context. It serves as the foundation for interpretive governance of AI agents.
Canonical dependencies
- Interpretive governance
- Post-semantic (thinking & reasoning) vs interpretive governance
- SSA-E + A2 + Dual Web
- Interpretive governance for AI agents
Classification principle
The drifts described below are classified not by apparent severity, but by interpretive mechanism. Each drift corresponds to a precise break:
- perimeter break;
- jurisdiction break;
- traceability break;
- explicit negation break.
A single response can exhibit several drifts at once.
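The four break types above, and the fact that one response can combine several of them, can be sketched as a set of flags. This is a minimal illustration, not part of the framework itself: the names Break, describe, and annotation are hypothetical and chosen only for this example.

```python
from enum import Flag, auto


class Break(Flag):
    """The four break types named in the classification principle."""
    PERIMETER = auto()
    JURISDICTION = auto()
    TRACEABILITY = auto()
    EXPLICIT_NEGATION = auto()


def describe(breaks: Break) -> list[str]:
    """Return the name of every break set on an annotation, in declaration order."""
    return [b.name for b in Break if b in breaks]


# Hypothetical annotation: one response flagged with two breaks at once.
annotation = Break.PERIMETER | Break.TRACEABILITY
print(describe(annotation))
```

Using flags rather than a single value keeps the annotation faithful to the principle that breaks are not mutually exclusive.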
Major interpretive drifts
1) Silent extrapolation
The agent fills an informational gap with an implicit generalization. The response seems reasonable but relies on an undeclared assumption. This drift is common when data is partial or context-dependent.
Examples: undeclared service extension, geographic generalization, guarantee assumption, capability extrapolation.
Break: absence of inference prohibition.
2) Abusive generalization
The agent transforms a local case, example, or specific rule into a general norm. This drift is often statistically plausible but normatively false.
Examples: internal rules applied universally, observed practices presented as standards.
Break: confusion between context and invariants.
3) Moral hallucination
The agent introduces obligations, prohibitions, or recommendations presented as self-evident, without an explicit regulatory or contractual source. The response is “responsible” but creates an implicit norm.
Examples: assertion of nonexistent legal duties, supposed prohibitions, categorical recommendations without enforceable basis.
Break: absence of explicit normative jurisdiction.
4) Unjustified refusal
The agent refuses to respond without clearly indicating whether the refusal stems from a lack of data, a perimeter prohibition, or an internal policy. Refusal becomes an opaque authority decision.
Break: absence of rule traceability.
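One way to make this drift checkable is to require every refusal to carry its cause explicitly. The sketch below is an assumption-laden illustration: the Refusal record, the rule_ref field, and the choice that a lack of data needs no rule reference while prohibitions and policies do are all hypothetical design decisions, not prescriptions of the framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RefusalCause(Enum):
    """The three causes a refusal should distinguish, per the text above."""
    LACK_OF_DATA = "lack of data"
    PERIMETER_PROHIBITION = "perimeter prohibition"
    INTERNAL_POLICY = "internal policy"


@dataclass(frozen=True)
class Refusal:
    message: str
    cause: Optional[RefusalCause] = None
    rule_ref: Optional[str] = None  # hypothetical identifier of the rule invoked


def is_unjustified(refusal: Refusal) -> bool:
    """Flag a refusal whose cause is undeclared or whose rule cannot be traced."""
    if refusal.cause is None:
        return True
    if refusal.cause is RefusalCause.LACK_OF_DATA:
        # Assumption: a declared lack of data needs no rule reference.
        return False
    # Prohibitions and policies must point to a rule to be traceable.
    return not refusal.rule_ref
```

For example, Refusal("I can't help with that.") would be flagged, whereas a refusal declaring RefusalCause.INTERNAL_POLICY together with a rule reference would not.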
5) Paternalistic redirection
The agent reformulates or diverts the request toward what it considers an acceptable version of the question. The initial request is replaced by a moral or prudent interpretation.
Break: substitution of the request by a reconstructed intent.
6) Involuntary persuasion
Through tone, risk ordering, or formulation, the agent influences the user’s decision without explicitly imposing it. This drift is frequent in advisory contexts.
Break: confusion between information and decisional orientation.
7) False audit
The agent provides a narrative justification that imitates traceability (“for your safety”, “according to best practices”) without referring to an actual rule, source, or perimeter.
Break: narrative conformity without enforceable jurisdiction.
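The distinction between a narrative justification and an enforceable one can be sketched as a structural check: a justification counts as a false audit when it has text but no rule, source, or perimeter behind it. The Justification record and its field names are hypothetical, and the check only tests whether a reference is present, not whether it is valid.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Justification:
    text: str
    rule: Optional[str] = None       # hypothetical rule identifier
    source: Optional[str] = None     # hypothetical source reference
    perimeter: Optional[str] = None  # hypothetical perimeter name


def is_false_audit(j: Justification) -> bool:
    """True when a justification is stated but refers to no rule, source, or perimeter."""
    has_narrative = bool(j.text.strip())
    has_reference = any((j.rule, j.source, j.perimeter))
    return has_narrative and not has_reference
```

Under this sketch, Justification("for your safety") is flagged, while the same text accompanied by a rule reference is not.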
Framework usage
This typology can be used to:
- audit existing agents;
- identify high interpretive risk zones;
- define targeted inference prohibitions;
- structure silence or escalation rules;
- train teams in critical reading of agentic responses.
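As an illustration of the first two uses, the seven drifts can be encoded as a catalog and tallied across reviewed responses to show where interpretive risk concentrates. This is a sketch under stated assumptions: the Drift enum, the tally function, and the shape of the observations (a response identifier paired with the set of drifts a reviewer assigned) are hypothetical. Each enum value reuses the break wording given for that drift in the typology.

```python
from collections import Counter
from enum import Enum


class Drift(Enum):
    """The seven drifts, each paired with the break stated for it in the typology."""
    SILENT_EXTRAPOLATION = "absence of inference prohibition"
    ABUSIVE_GENERALIZATION = "confusion between context and invariants"
    MORAL_HALLUCINATION = "absence of explicit normative jurisdiction"
    UNJUSTIFIED_REFUSAL = "absence of rule traceability"
    PATERNALISTIC_REDIRECTION = "substitution of the request by a reconstructed intent"
    INVOLUNTARY_PERSUASION = "confusion between information and decisional orientation"
    FALSE_AUDIT = "narrative conformity without enforceable jurisdiction"


def tally(observations: list[tuple[str, set[Drift]]]) -> Counter:
    """Count how often each drift was assigned across reviewed responses."""
    counts: Counter = Counter()
    for _response_id, drifts in observations:
        counts.update(drifts)
    return counts


# Hypothetical audit log: a response may carry several drifts, or none.
observations = [
    ("resp-1", {Drift.SILENT_EXTRAPOLATION, Drift.FALSE_AUDIT}),
    ("resp-2", {Drift.FALSE_AUDIT}),
    ("resp-3", set()),
]
for drift, count in tally(observations).most_common():
    print(f"{drift.name}: {count} ({drift.value})")
```

The classification itself remains a human or separately specified judgment; the code only records and aggregates it.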
Status
This framework constitutes a stable reading grid. Any analysis, implementation, or audit of an AI agent should be able to position observed behaviors relative to this typology.