Multi-agent audits

Service-facing expertise entry for multi-agent audits: governed examination of how meaning, authority, refusal conditions, and action permissions survive across agent chains, tool calls, retrieval layers, and handoffs.

Collection: Expertise
Type: Expertise
Domain: multi-agent-audits

Engagement decision

How to recognize that this axis should be mobilized

Use this page as a decision page. The objective is not only to understand the concept, but to identify the symptoms, framing errors, use cases, and surfaces to open in order to correct the right problem.

Typical symptoms

  • A chain of agents appears productive, but no one can explain where authority shifted between handoffs.
  • One agent respects the boundaries while another silently extends the answer, the action scope, or the recommendation.
  • Retrieval, tools, planners, and executors do not preserve the same response conditions.
  • A local success metric hides a growing liability chain across the orchestration layer.

Frequent framing errors

  • Treating a multi-agent chain as if one final answer fully represents the whole system.
  • Benchmarking task completion without auditing authority transfer, refusal propagation, or provenance loss.
  • Assuming that internal tools automatically preserve canon and perimeter.
  • Confusing workflow success with interpretive legitimacy.

Use cases

  • Planner/executor chains, routing agents, retrieval agents, tool-using assistants, and mixed open-closed environments.
  • Enterprise assistant stacks where one agent summarizes, another decides, and another acts.
  • Audit of escalation chains in support, operations, legal, compliance, or knowledge systems.
  • Qualification of chain-level risk before rollout or after drift.

What gets corrected concretely

  • Mapping the handoffs where authority, perimeter, or refusal conditions break.
  • Separating canonical authority from local tool authority across the chain.
  • Reintroducing silence, escalation, and traceability rules at the right points.
  • Turning chain-level instability into a reconstructable audit basis.

Governance artifacts

Governance files brought into scope by this page

This page is anchored to published surfaces that declare identity, precedence, limits, and the corpus reading conditions. Their order below gives the recommended reading sequence.

  1. Definitions canon
  2. Q-Layer in Markdown
  3. Interpretation policy

Canon and identity (#01)

Definitions canon

/canon.md

Canonical surface that fixes identity, roles, negations, and divergence rules.

Governs: Public identity, roles, and attributes that must not drift.
Bounds: Extrapolations, entity collisions, and abusive requalification.

Does not guarantee: A canonical surface reduces ambiguity; it does not guarantee faithful restitution on its own.

Policy and legitimacy (#02)

Q-Layer in Markdown

/response-legitimacy.md

Canonical surface for response legitimacy, clarification, and legitimate non-response.

Governs: Response legitimacy and the constraints that modulate its form.
Bounds: Plausible but inadmissible responses, or unjustified scope extensions.

Does not guarantee: This layer bounds legitimate responses; it is not proof of runtime activation.

Policy and legitimacy (#03)

Interpretation policy

/.well-known/interpretation-policy.json

Published policy that explains interpretation, scope, and restraint constraints.

Governs: Response legitimacy and the constraints that modulate its form.
Bounds: Plausible but inadmissible responses, or unjustified scope extensions.

Does not guarantee: This layer bounds legitimate responses; it is not proof of runtime activation.
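
Purely as an illustration, the sketch below shows how an agent might load such a policy before answering. It is a guess under stated assumptions: the field names (scope, restraint, interpretation) are invented for this page, and only the JSON actually served at /.well-known/interpretation-policy.json is authoritative.

    // Hypothetical reader for a published interpretation policy. The field
    // names below are assumptions made for illustration; only the file served
    // at /.well-known/interpretation-policy.json defines the real shape.
    interface InterpretationPolicy {
      scope?: string[];        // assumed: topics the publisher speaks for
      restraint?: string[];    // assumed: conditions that call for non-response
      interpretation?: string; // assumed: how the corpus should be read
    }

    async function loadPolicy(origin: string): Promise<InterpretationPolicy> {
      const res = await fetch(`${origin}/.well-known/interpretation-policy.json`);
      if (!res.ok) throw new Error(`policy not published: HTTP ${res.status}`);
      return (await res.json()) as InterpretationPolicy;
    }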

Complementary artifacts (2)

These surfaces extend the main block. They add context, discovery, routing, or observation depending on the topic.

Observability (#04)

Observatory map

/observations/observatory-map.json

Structured map of observation surfaces and monitored zones.

Entrypoint (#05)

Public AI manifest

/ai-manifest.json

Structured inventory of the surfaces, registries, and modules that extend the canonical entrypoint.
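
As a discovery sketch only: an agent chain could start from this manifest to enumerate the published surfaces before consulting any of them. The field names are assumptions of this page, not the published shape of /ai-manifest.json.

    // Hypothetical discovery step: read the public AI manifest and list the
    // surfaces it declares. Field names are assumptions, not the published
    // shape of /ai-manifest.json.
    interface ManifestSketch {
      surfaces?: { path: string; role?: string }[]; // assumed inventory entries
    }

    async function listSurfaces(origin: string): Promise<string[]> {
      const res = await fetch(`${origin}/ai-manifest.json`);
      if (!res.ok) throw new Error(`manifest not published: HTTP ${res.status}`);
      const manifest = (await res.json()) as ManifestSketch;
      return (manifest.surfaces ?? []).map((s) => s.path);
    }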

Evidence layer

Probative surfaces brought into scope by this page

This page does more than point to governance files. It is also anchored to surfaces that make observation, traceability, fidelity, and audit more reconstructible. Their order below makes the minimal evidence chain explicit.

  1. Canon and scope · Definitions canon
  2. Response authorization · Q-Layer: response legitimacy
  3. Weak observation · Q-Ledger
  4. Audit report · IIP report schema

Canonical foundation (#01)

Definitions canon

/canon.md

Opposable base for identity, scope, roles, and negations that must survive synthesis.

Makes provable: The reference corpus against which fidelity can be evaluated.
Does not prove: That a system already consults it, or that an observed response stays faithful to it.
Use when: Before any observation, test, audit, or correction.

Legitimacy layer (#02)

Q-Layer: response legitimacy

/response-legitimacy.md

Surface that explains when to answer, when to suspend, and when to switch to legitimate non-response.

Makes provable: The legitimacy regime to apply before treating an output as receivable.
Does not prove: That a given response actually followed this regime, or that an agent applied it at runtime.
Use when: When a page deals with authority, non-response, execution, or restraint.

Observation ledger (#03)

Q-Ledger

/.well-known/q-ledger.json

Public ledger of inferred sessions that makes some observed consultations and sequences visible.

Makes provable: That a behavior was observed, as weak, dated, contextualized trace evidence.
Does not prove: Actor identity, system obedience, or strong proof of activation.
Use when: When it is necessary to distinguish descriptive observation from strong attestation.
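
To make "weak, dated, contextualized trace" concrete, a single entry could look like the sketch below. Every field name is an assumption of this page; the file published at /.well-known/q-ledger.json defines the actual shape.

    // Hypothetical shape of one ledger entry: a dated, contextualized
    // observation that is explicitly weak. Field names are assumptions, not
    // the published schema of /.well-known/q-ledger.json.
    interface LedgerEntrySketch {
      observedAt: string;    // ISO 8601 date of the inferred session
      surface: string;       // published surface that appeared to be consulted
      context: string;       // what was observed around the consultation
      evidenceLevel: "weak"; // descriptive observation, never strong attestation
    }

    const entry: LedgerEntrySketch = {
      observedAt: "2025-01-01T00:00:00Z",
      surface: "/canon.md",
      context: "retrieval agent cited the canon before answering",
      evidenceLevel: "weak",
    };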

Report schema (#04)

IIP report schema

/iip-report.schema.json

Public interface for an interpretation integrity report: scope, metrics, and drift taxonomy.

Makes provable: The minimal shape of a reconstructible and comparable audit report.
Does not prove: Private weights, internal heuristics, or the success of a concrete audit.
Use when: When a page discusses audit, probative deliverables, or opposable reports.
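
As a hedged sketch of what a conforming instance might carry (a scope, metrics, and a drift taxonomy), the object below is illustrative only: /iip-report.schema.json defines the admissible shape, and none of these field names are taken from it.

    // Illustrative instance of an interpretation integrity report. The real
    // admissible shape is defined by /iip-report.schema.json; every field
    // name here is an assumption made for illustration.
    interface IipReportSketch {
      scope: string[];                 // chains and surfaces under audit
      metrics: Record<string, number>; // e.g. handoffs checked, drifts found
      driftTaxonomy: string[];         // categories of observed drift
    }

    const report: IipReportSketch = {
      scope: ["planner -> executor chain"],
      metrics: { handoffsChecked: 4, driftsFound: 1 },
      driftTaxonomy: ["perimeter-extension"],
    };
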
Complementary probative surfaces (1)

These artifacts extend the main chain. They help qualify an audit, an evidence level, a citation, or a version trajectory.

Citation surface · External context

Citations

/citations.md

Minimal external reference surface used to contextualize some concepts without delegating canonical authority to them.

Multi-agent audits

This page captures a service-facing label. On this site, “multi-agent audits” designate a governed examination of how meaning, authority, refusal conditions, and action permissions survive or fracture across an agent chain.

It is not a generic agent leaderboard, not a task-success benchmark, and not a simple tool compatibility test.

What this label names on this site

A multi-agent audit starts from a simple fact: every handoff is interpretive.

When one agent delegates to another, the chain does not transfer only a task. It also transfers:

  • the perimeter of what may be answered or acted upon;
  • the authority hierarchy that should govern the answer;
  • the silences that should remain silences;
  • the exclusions, negations, and escalation rules that should survive the handoff.

This is why a multi-agent audit is really an audit of distributed interpretation under delegated authority.
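
One way to make this concrete is to write down, per handoff, the state that should survive it. The sketch below is a minimal illustration in TypeScript; every field name is an assumption of this page, not a published schema.

    // Minimal sketch of what a handoff transfers beyond the task itself.
    // All names are illustrative assumptions, not a published schema.
    interface HandoffRecord {
      fromAgent: string;
      toAgent: string;
      task: string;
      perimeter: string[];       // what may be answered or acted upon
      authorityChain: string[];  // ordered sources that should govern the answer
      silences: string[];        // questions that must remain unanswered
      exclusions: string[];      // negations that must survive the handoff
      escalationRules: string[]; // conditions that force escalation instead of action
    }

    // Example: a planner delegating to an executor without narrowing anything.
    const handoff: HandoffRecord = {
      fromAgent: "planner",
      toAgent: "executor",
      task: "draft a customer reply",
      perimeter: ["billing questions"],
      authorityChain: ["/canon.md", "/response-legitimacy.md"],
      silences: ["legal advice"],
      exclusions: ["no pricing commitments"],
      escalationRules: ["escalate refunds above the approved threshold"],
    };

An audit then asks, at each hop, which of these fields silently shrank, grew, or disappeared.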

When this entry becomes useful

This entry becomes useful when the system is no longer a single assistant, but a chain involving:

  • planners and executors;
  • routing and retrieval agents;
  • tool-calling assistants;
  • mixed open-web and internal corpora;
  • escalation paths where one agent summarizes, another decides, and another acts.

What is actually audited

On this site, a serious multi-agent audit checks whether the perimeter, the authority hierarchy, the silences, and the exclusion and escalation rules listed above survive each handoff intact.

Typical outputs

A useful audit should produce:

  • a map of the agent chain and its authority regime;
  • the handoffs where state, perimeter, or proof is lost (a minimal chain-level check is sketched after this list);
  • the points where silence should replace synthesis;
  • the rules that must be reinstated before a later agent answers or acts;
  • an evidence basis for later Interpretive risk assessment or Independent reporting.
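
A minimal sketch of that chain-level check, under the same assumptions as the handoff record above (all type and field names are invented for illustration):

    // Walk consecutive handoff states and flag every point where perimeter
    // entries, silences, or authority sources are dropped. Types and field
    // names are assumptions, not a published schema.
    interface HandoffState {
      agent: string;
      perimeter: Set<string>;
      silences: Set<string>;
      authority: string[];
    }

    function lostBetween(prev: Set<string>, next: Set<string>): string[] {
      return [...prev].filter((item) => !next.has(item));
    }

    function auditChain(chain: HandoffState[]): string[] {
      const findings: string[] = [];
      for (let i = 1; i < chain.length; i++) {
        const prev = chain[i - 1];
        const next = chain[i];
        for (const p of lostBetween(prev.perimeter, next.perimeter)) {
          findings.push(`${prev.agent} -> ${next.agent}: perimeter entry dropped: ${p}`);
        }
        for (const s of lostBetween(prev.silences, next.silences)) {
          findings.push(`${prev.agent} -> ${next.agent}: silence dropped: ${s}`);
        }
        if (prev.authority.length > 0 && next.authority.length === 0) {
          findings.push(`${prev.agent} -> ${next.agent}: authority chain lost`);
        }
      }
      return findings;
    }

Each finding names a handoff where state was lost, which is exactly the map of fracture points the audit is expected to deliver.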

What this label does not replace

Multi-agent audits do not replace the stricter structures this site defines, such as Interpretive risk assessment or Independent reporting.

They are a concrete audit entry into those stricter structures.

Doctrinal map

On this site, “multi-agent audits” redistribute toward the adjacent entries of the doctrinal map.

Back to the map: Expertise.