Framework

Typology of interpretive drifts in agentic systems

Canonical framework of seven interpretive drifts in agentic systems: silent extrapolation, abusive generalization, moral hallucination, unjustified refusal, paternalistic redirection, involuntary persuasion, and false audit. It serves as an audit and interpretive governance grid.

Collection: Framework
Type: Framework
Layer: transversal
Version: 1.0
Stabilization: 2026-01-27
Published: 2026-01-27
Updated: 2026-03-12

Governance artifacts

Governance files brought into scope by this page

This page is anchored to published surfaces that declare identity, precedence, limits, and the corpus reading conditions. Their order below gives the recommended reading sequence.

  1. Q-Layer in Markdown
  2. Q-Layer in YAML
  3. Interpretation policy
Policy and legitimacy #01

Q-Layer in Markdown

/response-legitimacy.md

Canonical surface for response legitimacy, clarification, and legitimate non-response.

Governs: Response legitimacy and the constraints that modulate its form.
Bounds: Plausible but inadmissible responses, or unjustified scope extensions.

Does not guarantee: This layer bounds legitimate responses; it is not proof of runtime activation.

Policy and legitimacy #02

Q-Layer in YAML

/response-legitimacy.yaml

Structured Q-Layer projection for systems that prefer YAML.

Governs: Response legitimacy and the constraints that modulate its form.
Bounds: Plausible but inadmissible responses, or unjustified scope extensions.

Does not guarantee: This layer bounds legitimate responses; it is not proof of runtime activation.

Policy and legitimacy #03

Interpretation policy

/.well-known/interpretation-policy.json

Published policy that explains interpretation, scope, and restraint constraints.

Governs: Response legitimacy and the constraints that modulate its form.
Bounds: Plausible but inadmissible responses, or unjustified scope extensions.

Does not guarantee: This layer bounds legitimate responses; it is not proof of runtime activation.

Complementary artifacts (3)

These surfaces extend the main block. They add context, discovery, routing, or observation depending on the topic.

Policy and legitimacy #04

AI usage policy

/ai-usage-policy.md

Public notice that explains how to read governance surfaces and their limits.

Policy and legitimacy #05

Output Constraints

/output-constraints.md

Surface that makes explicit the conditions of response, restraint, escalation, or non-response.

Observability #06

Q-Metrics JSON

/.well-known/q-metrics.json

Descriptive metrics surface for observing gaps, snapshots, and comparisons.

Typology of interpretive drifts in agentic systems

This framework proposes a structured typology of interpretive drifts observable in agentic systems, on the open web as well as in closed environments, in order to make these drifts identifiable, auditable, and governable.

Status:
Canonical framework (applicable reading grid). This page does not describe occasional bugs, but classes of systemic drifts linked to unbounded inference, entity reconstruction, and the absence of explicit jurisdiction.

An interpretive drift is not necessarily a factual hallucination. In many cases, the response produced is coherent, prudent, and plausible. The problem is not the form of the response, but its status: it oversteps a perimeter, abusively generalizes, or introduces an unauthorized implicit norm.

This framework provides a common grid for recognizing these drifts, independently of the model, provider, or execution context. It serves as the foundation for interpretive governance of AI agents.

Classification principle

The drifts described below are classified not by apparent severity, but by interpretive mechanism. Each drift corresponds to a precise break:

  • perimeter break;
  • jurisdiction break;
  • traceability break;
  • explicit negation break.

A single response can exhibit several drifts at once.

Major interpretive drifts

1) Silent extrapolation

The agent fills an informational gap with an implicit generalization. The response seems reasonable but relies on an undeclared assumption. This drift is frequent when data is partial or contextual.

Examples: undeclared service extension, geographic generalization, guarantee assumption, capability extrapolation.

Break: absence of inference prohibition.
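
A targeted inference prohibition can be sketched as a check of each claim against an explicitly declared perimeter. This is a minimal illustration, not a published schema: `Claim`, `DeclaredPerimeter`, and `find_silent_extrapolations` are hypothetical names, the sample values are invented, and the sketch assumes claims have already been extracted from the agent's response.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """One assertion extracted from an agent response (hypothetical shape)."""
    kind: str   # e.g. "service", "region", "guarantee", "capability"
    value: str


@dataclass
class DeclaredPerimeter:
    """What the source explicitly declares, keyed by claim kind (assumption)."""
    declared: dict[str, set[str]] = field(default_factory=dict)

    def covers(self, claim: Claim) -> bool:
        # A claim is covered only if its exact value was declared for its kind.
        return claim.value in self.declared.get(claim.kind, set())


def find_silent_extrapolations(
    claims: list[Claim], perimeter: DeclaredPerimeter
) -> list[Claim]:
    """Return the claims that no declaration supports."""
    return [c for c in claims if not perimeter.covers(c)]


# Illustrative data only.
perimeter = DeclaredPerimeter({"service": {"audit"}, "region": {"Region A"}})
claims = [
    Claim("service", "audit"),
    Claim("region", "Region B"),        # geographic generalization
    Claim("guarantee", "24h reply"),    # guarantee assumption
]
flagged = find_silent_extrapolations(claims, perimeter)
```

The check is deliberately strict: anything not declared is flagged rather than inferred, which is the opposite of the gap-filling behavior described above.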

2) Abusive generalization

The agent transforms a local case, example, or specific rule into a general norm. This drift is often statistically plausible but normatively false.

Examples: internal rules applied universally, observed practices presented as standards.

Break: confusion between context and invariants.

3) Moral hallucination

The agent introduces obligations, prohibitions, or recommendations presented as self-evident, without an explicit regulatory or contractual source. The response is “responsible” but creates an implicit norm.

Examples: assertion of nonexistent legal duties, supposed prohibitions, categorical recommendations without enforceable basis.

Break: absence of explicit normative jurisdiction.

4) Unjustified refusal

The agent refuses to respond without clearly indicating whether the refusal is due to a lack of data, a perimeter prohibition, or an internal policy. Refusal becomes an opaque authority decision.

Break: absence of rule traceability.
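
One way to make a refusal traceable is to require that it carry its cause as data. The sketch below is an assumption-laden illustration: `RefusalReason`, `Refusal`, and `rule_ref` are hypothetical names, and the three reasons simply mirror the three causes named above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RefusalReason(Enum):
    LACK_OF_DATA = "lack of data"
    PERIMETER_PROHIBITION = "perimeter prohibition"
    INTERNAL_POLICY = "internal policy"


@dataclass(frozen=True)
class Refusal:
    message: str
    reason: Optional[RefusalReason] = None
    rule_ref: Optional[str] = None  # hypothetical pointer to the rule invoked

    def is_traceable(self) -> bool:
        """Traceable here means: the reason is named and, unless the cause
        is simply missing data, the rule relied upon is identified."""
        if self.reason is None:
            return False
        if self.reason is RefusalReason.LACK_OF_DATA:
            return True
        return bool(self.rule_ref)


opaque = Refusal("I can't help with that.")
traced = Refusal(
    "Pricing is outside the declared perimeter.",
    reason=RefusalReason.PERIMETER_PROHIBITION,
    rule_ref="example-rule-id",  # placeholder, not a real identifier
)
```

Under this sketch, `opaque` would count as an unjustified refusal while `traced` would not, even though both decline to answer.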

5) Paternalistic redirection

The agent reformulates or diverts the request toward what it considers an acceptable version of the question. The initial request is replaced by a moral or prudent interpretation.

Break: substitution of a reconstructed intent for the original request.

6) Involuntary persuasion

Through tone, risk ordering, or formulation, the agent influences the user’s decision without explicitly imposing it. This drift is frequent in advisory contexts.

Break: confusion between information and decisional orientation.

7) False audit

The agent provides a narrative justification that imitates traceability (“for your safety”, “according to best practices”) without referring to an actual rule, source, or perimeter.

Break: narrative conformity without enforceable jurisdiction.
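
A false audit can be approximated as a justification that is stated but anchored to nothing. The following is a minimal sketch under stated assumptions: `Justification`, its three reference fields, and `is_false_audit` are hypothetical, and the presence of a reference is treated as sufficient, which a real audit would still need to verify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Justification:
    text: str
    rule_ref: Optional[str] = None       # hypothetical rule identifier
    source_ref: Optional[str] = None     # hypothetical source location
    perimeter_ref: Optional[str] = None  # hypothetical perimeter identifier

    def is_anchored(self) -> bool:
        # Anchored if at least one non-empty reference is present.
        return any((self.rule_ref, self.source_ref, self.perimeter_ref))


def is_false_audit(j: Justification) -> bool:
    """Flag a justification that is offered without any reference behind it."""
    return bool(j.text.strip()) and not j.is_anchored()


narrative = Justification("According to best practices.")
anchored = Justification(
    "Restraint applies to this request.",
    source_ref="/output-constraints.md",  # illustrative reference only
)
```

The design choice is to test for the absence of a reference rather than for particular phrases, since the drift lies in the missing anchor and not in the wording.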

Framework usage

This typology can be used to:

  • audit existing agents;
  • identify zones of high interpretive risk;
  • define targeted inference prohibitions;
  • structure silence or escalation rules;
  • train teams in critical reading of agentic responses.
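
The uses listed above can be supported by a small audit record. The sketch below encodes the seven drifts as an enumeration and tallies annotated findings per zone. It is illustrative only: `Drift`, `Finding`, `zone`, and `drift_counts_by_zone` are hypothetical names, and it assumes a human or tool has already annotated each response.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Drift(Enum):
    SILENT_EXTRAPOLATION = "silent extrapolation"
    ABUSIVE_GENERALIZATION = "abusive generalization"
    MORAL_HALLUCINATION = "moral hallucination"
    UNJUSTIFIED_REFUSAL = "unjustified refusal"
    PATERNALISTIC_REDIRECTION = "paternalistic redirection"
    INVOLUNTARY_PERSUASION = "involuntary persuasion"
    FALSE_AUDIT = "false audit"


@dataclass(frozen=True)
class Finding:
    response_id: str
    zone: str          # hypothetical label, e.g. a topic or workflow
    drifts: frozenset  # a single response may carry several drifts


def drift_counts_by_zone(findings):
    """Count how often each drift was observed in each zone."""
    counts = {}
    for f in findings:
        counts.setdefault(f.zone, Counter()).update(f.drifts)
    return counts


# Illustrative annotations only.
findings = [
    Finding("r1", "pricing", frozenset({Drift.SILENT_EXTRAPOLATION, Drift.FALSE_AUDIT})),
    Finding("r2", "pricing", frozenset({Drift.SILENT_EXTRAPOLATION})),
    Finding("r3", "legal", frozenset({Drift.MORAL_HALLUCINATION})),
]
counts = drift_counts_by_zone(findings)
```

Zones where one drift dominates the tally are natural candidates for a targeted inference prohibition or an escalation rule.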

Status

This framework constitutes a stable reading grid. Any analysis, implementation, or audit of an AI agent should be able to position observed behaviors relative to this typology.
