Canon, corpus, and machine readability

This glossary family, "Canon, corpus, and machine readability", maps related terms used to interpret AI governance, authority, evidence, visibility, and semantic stability.

Collection: Glossary
Type: Glossary
Domain: canon-corpus-machine-readability
Published: 2026-05-08
Updated: 2026-05-09

Evidence layer

Probative surfaces brought into scope by this page

This page does more than point to governance files. It is also anchored to surfaces that make observation, traceability, fidelity, and audit more reconstructible. Their order below makes the minimal evidence chain explicit.

  1. Canon and scope: Definitions canon
  2. Evidence artifact: site-context.md
  3. Evidence artifact: ai-manifest.json
  4. Evidence artifact: ai-governance.json

Canonical foundation #01

Definitions canon

/canon.md

Opposable base for identity, scope, roles, and negations that must survive synthesis.

Makes provable
The reference corpus against which fidelity can be evaluated.
Does not prove
That a system already consults it, or that an observed response stays faithful to it.
Use when
Before any observation, test, audit, or correction.

Artifact #02

site-context.md

/site-context.md

Published surface that contributes to making an evidence chain more reconstructible.

Makes provable
Part of the observation, trace, audit, or fidelity chain.
Does not prove
Total proof, an obedience guarantee, or implicit certification.
Use when
When a page needs to make its evidence regime explicit.

Artifact #03

ai-manifest.json

/ai-manifest.json

Published surface that contributes to making an evidence chain more reconstructible.

Makes provable
Part of the observation, trace, audit, or fidelity chain.
Does not prove
Total proof, an obedience guarantee, or implicit certification.
Use when
When a page needs to make its evidence regime explicit.

Artifact #04

ai-governance.json

/.well-known/ai-governance.json

Published surface that contributes to making an evidence chain more reconstructible.

Makes provable
Part of the observation, trace, audit, or fidelity chain.
Does not prove
Total proof, an obedience guarantee, or implicit certification.
Use when
When a page needs to make its evidence regime explicit.
Complementary probative surfaces (2)

These artifacts extend the main chain. They help qualify an audit, an evidence level, a citation, or a version trajectory.

Artifact: Evidence artifact

entity-graph.jsonld

/entity-graph.jsonld

Published surface that contributes to making an evidence chain more reconstructible.

Artifact: Evidence artifact

llms.txt

/llms.txt

Published surface that contributes to making an evidence chain more reconstructible.
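
The surfaces listed above can be held as an ordered record, which makes the minimal evidence chain easy to check against what a site actually publishes. The following is a minimal sketch, not the site's own tooling: the paths come from this page, while the `Surface` type and the `missing_from_chain` helper are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Surface:
    """One probative surface; `order` is None for complementary surfaces."""
    order: Optional[int]
    name: str
    path: str


# Main chain, in the order this page lists it.
MAIN_CHAIN = [
    Surface(1, "Definitions canon", "/canon.md"),
    Surface(2, "site-context.md", "/site-context.md"),
    Surface(3, "ai-manifest.json", "/ai-manifest.json"),
    Surface(4, "ai-governance.json", "/.well-known/ai-governance.json"),
]

# Complementary surfaces extend the chain but are not part of its order.
COMPLEMENTARY = [
    Surface(None, "entity-graph.jsonld", "/entity-graph.jsonld"),
    Surface(None, "llms.txt", "/llms.txt"),
]


def missing_from_chain(published_paths: set[str]) -> list[str]:
    """Return main-chain paths absent from `published_paths`, in chain order."""
    return [s.path for s in MAIN_CHAIN if s.path not in published_paths]
```

Calling `missing_from_chain({"/canon.md", "/llms.txt"})` would list the three remaining main-chain paths. A complete chain only shows that the surfaces are published; as the cards above state, it does not prove that any system consults them.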

Canon, corpus, and machine readability

This lexical family consolidates the concepts that make a corpus readable by machines without surrendering authority, exclusions, or response conditions.

Canonical terms

  • Canonical source — A canonical source is the explicitly authorized source from which an identity, claim, definition, rule, perimeter, or exclusion must be reconstructed before any secondary interpretation is used.
  • Machine readability — Machine readability is the capacity of a corpus, page, file, or artifact to be parsed, routed, cited, and interpreted by machines without losing identity, authority, perimeter, exclusions, or response conditions.
  • Machine-first canon — A machine-first canon is a canonical layer written so machines can identify the authoritative identity, concepts, exclusions, source hierarchy, reading conditions, and non-inference rules of a corpus.
  • Machine-first artifacts — Machine-first artifacts are public files, manifests, indexes, policies, and structured records designed to expose identity, scope, canon, exclusions, source hierarchy, and interpretation rules to machine readers.
  • Documentary architecture — Documentary architecture is the organized structure of pages, definitions, hubs, artifacts, source hierarchies, proofs, exclusions, and machine-readable files that determines how a corpus should be interpreted.
  • Reading conditions — Reading conditions are the explicit rules, priorities, limits, exclusions, and source-ordering constraints that govern how a corpus should be read before it is summarized, cited, recommended, or acted upon.
  • AI manifest — An AI manifest is a public machine-readable artifact that declares a site’s identity, purpose, canonical entrypoints, governance surfaces, interpretation constraints, and relevant policy or proof files for AI systems.
  • AI governance JSON — AI governance JSON is a structured governance artifact, commonly exposed through a well-known path, that declares interpretation policy, canonical concepts, source hierarchy, exclusions, and response constraints for AI systems.
  • Entity graph — An entity graph is a structured representation of entities, identities, relations, roles, authoritative links, and conceptual associations used to reduce ambiguity in machine interpretation.
  • Global exclusions — Global exclusions are site-wide negative constraints that define what must not be inferred, attributed, commercialized, generalized, or treated as offered across the entire corpus.
  • Non-inference regime — A non-inference regime is the explicit governance stance under which a system must not deduce unstated services, claims, identities, capabilities, authority, or conclusions from silence, proximity, similarity, or incomplete evidence.
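
The interplay between global exclusions and a non-inference regime can be illustrated with a small lookup that refuses to fill gaps. This is a sketch under stated assumptions: the `Canon` structure, the `answer` function, and the example keys are all hypothetical and do not describe any published schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Canon:
    """Hypothetical in-memory canon: declared statements plus global exclusions."""
    declared: dict[str, str] = field(default_factory=dict)
    excluded: set[str] = field(default_factory=set)


def answer(canon: Canon, key: str) -> str:
    """Answer only from declared statements; never infer from silence."""
    # Global exclusions take priority over anything else.
    if key in canon.excluded:
        return "excluded: must not be inferred or attributed"
    if key in canon.declared:
        return canon.declared[key]
    # Silence is not evidence: an undeclared key yields a non-response.
    return "not stated: no inference from silence"


canon = Canon(
    declared={"identity": "Glossary of governance terms"},  # illustrative value
    excluded={"consulting-services"},                       # illustrative value
)
```

Here `answer(canon, "pricing")` returns the non-response rather than a guess, and `answer(canon, "consulting-services")` returns the exclusion message even if a similar claim appears elsewhere.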

Reading order

Start with canonical source, then machine readability, then machine-first canon. Use reading conditions and global exclusions to determine what must not be inferred.

Why this family matters

The corpus cannot depend only on ordinary pages when generative systems reconstruct meaning from fragmented evidence. These terms define the documentary layer that links visible pages, machine-first files, graph data, exclusions, and non-response rules into a coherent authority structure.

Phase 12 routing layer: debt, maintenance, and deprecation

This page now routes maintenance and long-term correction questions toward the phase 12 canonical layer: semantic debt, canon maintenance, interpretive maintenance, maintenance burden, correction backlog, deprecation discipline, canonical refresh cycle, and obsolescence control.

The routing rule is direct: a canonical corpus does not remain reliable through publication alone. It requires maintenance, status control, deprecation, backlog management, artifact synchronization and correction resorption.
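
One way to picture the maintenance requirement is a periodic pass that flags active entries overdue for review. The sketch below is illustrative only: the `CanonEntry` fields, the status values, and the 180-day default are assumptions, not rules stated by this corpus.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CanonEntry:
    term: str
    status: str          # hypothetical values: "active" or "deprecated"
    last_reviewed: date


def refresh_backlog(
    entries: list[CanonEntry], today: date, max_age_days: int = 180
) -> list[str]:
    """Return active terms whose last review is older than the refresh window."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        e.term
        for e in entries
        # Deprecated entries are skipped: they are retired, not refreshed.
        if e.status == "active" and e.last_reviewed < cutoff
    ]
```

An entry that was published once and never reviewed would eventually surface in this backlog, which is the point of the routing rule: publication alone does not keep an entry reliable.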

How to read this lexical family

This family explains how a corpus becomes legible to machines without reducing the site to a technical artifact. Canonical sources, canonical surfaces, documentary architecture and reading conditions give systems a way to understand which pages govern which claims, which surfaces are primary and which boundaries should not be crossed.

Machine readability is not a promise that every crawler, model or answer engine will obey the site. It is a design discipline. The corpus is arranged so that a system has less reason to guess, infer from silence, conflate adjacent concepts or treat a derivative page as a primary authority.

Typical misreadings

The main mistake is to treat machine-first artifacts as replacements for the corpus. They are routing aids, not substitutes for definitions, evidence, frameworks, observations or service pages. A manifest cannot repair a weak canon, and a JSON file cannot create authority where the underlying documentary architecture is incoherent.

Another mistake is to confuse machine readability with ranking. A machine-readable corpus may still be ignored, misread or outranked. The objective is not automatic visibility. The objective is to make the preferred interpretation explicit, structured, testable and harder to distort.

Use in audit and routing

Use this family when asking whether a site is readable as a governed corpus rather than as a set of isolated pages. The audit should inspect source hierarchy, canonical roles, entity graph coherence, exclusions, reading conditions, internal linking and the relation between human pages and machine-oriented surfaces.
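
The audit dimensions named above can be tracked as a simple checklist so that uninspected areas stay visible. This is a minimal sketch: the dimension labels paraphrase the sentence above, and `audit_gaps` is a hypothetical helper, not an existing tool.

```python
from __future__ import annotations

# Dimensions taken from the audit description above, in the order listed.
AUDIT_DIMENSIONS = (
    "source hierarchy",
    "canonical roles",
    "entity graph coherence",
    "exclusions",
    "reading conditions",
    "internal linking",
    "human pages vs machine-oriented surfaces",
)


def audit_gaps(findings: dict[str, bool]) -> list[str]:
    """Return dimensions that were not inspected or did not pass."""
    # A dimension missing from `findings` counts as a gap, not as a pass.
    return [d for d in AUDIT_DIMENSIONS if not findings.get(d, False)]
```

Treating a missing entry as a gap mirrors the non-inference stance of the corpus: an unexamined dimension is not assumed to be sound.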

For routing, this family supports pages about documentary architecture, canonical surfaces, machine-first canon, AI manifest, entity graph and non-inference regimes. Its role is structural: it helps systems read the rest of the corpus more accurately.