Robots, AI crawlers and citation accessibility

Governance artifacts

Governance files brought into scope by this page

This page is anchored to published surfaces that declare identity, precedence, limits, and the reading conditions of the corpus. The order below is the recommended reading sequence.

Canon and identity #01

Definitions canon

/canon.md

Canonical surface that fixes identity, roles, negations, and divergence rules.

Governs: Public identity, roles, and attributes that must not drift.
Bounds: Extrapolations, entity collisions, and abusive requalification.

Does not guarantee: A canonical surface reduces ambiguity; it does not guarantee faithful restitution on its own.

Context and versioning #02

Site context

/site-context.md

Notice that qualifies the nature of the site, its reference function, and its non-transactional limits.

Governs: Editorial framing, temporality, and the readability of explicit changes.
Bounds: Silent drifts and readings that assume stability without checking versions.

Does not guarantee: Versioning makes a gap auditable; it does not automatically correct outputs already in circulation.

Entrypoint #03

Public AI manifest

/ai-manifest.json

Structured inventory of the surfaces, registries, and modules that extend the canonical entrypoint.

Governs: Access order across surfaces and initial precedence.
Bounds: Free readings that bypass the canon or the published order.

Does not guarantee: This surface publishes a reading order; it does not force execution or obedience.

Evidence layer

Probative surfaces brought into scope by this page

This page does more than point to governance files. It is also anchored to surfaces that make observation, traceability, fidelity, and audit more reconstructible. Their order below makes the minimal evidence chain explicit.

01 Canon and scope: Definitions canon
02 Weak observation: Q-Ledger

Canonical foundation #01

Definitions canon

/canon.md

Opposable base for identity, scope, roles, and negations that must survive synthesis.

Makes provable: The reference corpus against which fidelity can be evaluated.
Does not prove: That a system already consults it, or that an observed response stays faithful to it.
Use when: Before any observation, test, audit, or correction.

Observation ledger #02

Q-Ledger

/.well-known/q-ledger.json

Public ledger of inferred sessions that makes some observed consultations and sequences visible.

Makes provable: That a behavior was observed as weak, dated, contextualized trace evidence.
Does not prove: Actor identity, system obedience, or activation in any strong sense.
Use when: Descriptive observation must be distinguished from strong attestation.

Citation accessibility starts before content quality. A source that cannot be reached, rendered, previewed or parsed cannot reliably become evidence in an AI-mediated answer.

Many citation discussions jump directly to content structure. Structure matters, but it only matters after access. If the relevant page is blocked, hidden, unstable, non-canonical, impossible to render or deprived of useful preview text, the system may never reach the passage that should support the answer.

This does not mean that every crawler should receive unlimited access. It means that access policy and citation strategy must be coherent.

Accessibility is a governance decision

Robots directives, crawler rules, preview controls and security layers are not purely technical settings. They express an access policy. A site may legitimately restrict certain automated systems. But a site cannot simultaneously block the useful surface and expect that same surface to become a stable citation source.

The governance question is: which systems should be allowed to discover which surfaces, for which purpose, and under which limits?

A marketing page, a definition, a service page, a canon reference and a proof artifact do not necessarily require the same access posture. Treating them all identically creates avoidable friction.
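
One way to make differing access postures concrete is to write them as robots rules and then test what each crawler may fetch. The sketch below is illustrative only and uses Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the host `example.org` and the `/drafts/` path are assumptions introduced for the example; `/canon.md` and `/ai-manifest.json` are the surfaces named on this page. It says nothing about whether a given crawler actually honors robots rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the rules and the crawler name are assumptions
# chosen for illustration, not the policy of any real site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Allow: /canon.md
Allow: /ai-manifest.json
Disallow: /drafts/

User-agent: *
Disallow: /drafts/
"""


def access_posture(robots_txt: str, agent: str, paths: list[str]) -> dict[str, bool]:
    """Return, for each path, whether `agent` may fetch it under the rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.org is a placeholder host; only the path is matched against rules.
    return {p: parser.can_fetch(agent, f"https://example.org{p}") for p in paths}


posture = access_posture(
    ROBOTS_TXT,
    "ExampleAIBot",
    ["/canon.md", "/ai-manifest.json", "/drafts/note.md"],
)
for path, allowed in posture.items():
    print(f"{path}: {'open' if allowed else 'restricted'}")
```

Running the same check for each intended system, surface by surface, shows whether the declared policy matches the citation strategy before any content work begins.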

What citation accessibility includes

Citation accessibility is broader than a single robots rule. It includes:

Layer	Failure mode
Crawl access	the system cannot fetch the URL
Rendering	the useful content appears only after unstable client-side behavior
Preview control	the critical claim is excluded from snippet or preview use
Canonical behavior	the system sees competing or unstable canonical URLs
Content visibility	the useful passage is hidden in collapsed elements or inaccessible UI
Security behavior	bot protection blocks legitimate retrieval paths

Each failure can make a page look present to humans but weak to answer systems.
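
The rendering and content-visibility layers can be approximated with a static check: does the passage appear in the HTML as delivered, and is it inside a collapsed element? The sketch below is a simplification under stated assumptions, not a rendering model. It ignores CSS and script execution, treats text inside a `<details>` element without the `open` attribute as collapsed (except its `<summary>`), and treats anything missing from the static HTML as absent. The class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Split static HTML text into visible and collapsed parts (simplified)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0       # inside <script> or <style>
        self.details_stack = []   # True for <details open>, False if collapsed
        self.summary_depth = 0    # <summary> stays visible when collapsed
        self.visible = []
        self.collapsed = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag == "details":
            self.details_stack.append("open" in dict(attrs))
        elif tag == "summary":
            self.summary_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "details" and self.details_stack:
            self.details_stack.pop()
        elif tag == "summary" and self.summary_depth:
            self.summary_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return  # script and style bodies are not page text
        hidden = (not all(self.details_stack)) and self.summary_depth == 0
        (self.collapsed if hidden else self.visible).append(data)


def _normalize(parts: list[str]) -> str:
    """Join text fragments and collapse runs of whitespace."""
    return " ".join(" ".join(parts).split())


def passage_status(html: str, passage: str) -> str:
    """Return 'visible', 'collapsed' or 'absent' for a passage in static HTML."""
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    if passage in _normalize(collector.visible):
        return "visible"
    if passage in _normalize(collector.collapsed):
        return "collapsed"
    return "absent"


# Hypothetical sample markup, for illustration only.
SAMPLE = """
<html><body>
<h1>Definitions canon</h1>
<p>The canon fixes identity and roles.</p>
<details><summary>Limits</summary><p>It does not guarantee faithful restitution.</p></details>
<script>document.body.append("Loaded later by script.")</script>
</body></html>
"""

for text in (
    "The canon fixes identity and roles.",
    "It does not guarantee faithful restitution.",
    "Loaded later by script.",
):
    print(f"{passage_status(SAMPLE, text)}: {text}")
```

An "absent" result for a passage that humans can see in the browser suggests the content depends on client-side behavior, which is the rendering failure mode in the table above.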

Preview control is not neutral

Preview control can be useful when a site must restrict what automated systems display. But it can also remove the very passage that would have supported a citation.

The right approach is not to disable controls by default. It is to decide which passages may legitimately be reused, which should be excluded, and which canonical source should govern sensitive claims.

A page that hides its strongest evidence may still rank, but its citation readiness is weaker.
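
To see which preview controls a page actually declares, the markup can be scanned for snippet-related directives. The sketch below looks for `nosnippet` and `max-snippet` in a robots meta tag and counts elements carrying the `data-nosnippet` attribute, which are directives documented for search previews. Whether a particular AI answer system honors them is an assumption that must be verified per system. The class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class PreviewDirectiveParser(HTMLParser):
    """Collect robots meta directives and count data-nosnippet elements."""

    def __init__(self):
        super().__init__()
        self.directives = {}         # e.g. {"index": True, "max-snippet": "50"}
        self.nosnippet_elements = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            for item in (attrs.get("content") or "").split(","):
                key, _, value = item.strip().lower().partition(":")
                if key:
                    self.directives[key] = value or True
        if "data-nosnippet" in attrs:
            self.nosnippet_elements += 1


def preview_findings(html: str) -> list[str]:
    """List the preview restrictions declared in the given HTML."""
    parser = PreviewDirectiveParser()
    parser.feed(html)
    parser.close()
    findings = []
    if "nosnippet" in parser.directives:
        findings.append("page-level nosnippet: no preview text allowed")
    limit = parser.directives.get("max-snippet")
    if isinstance(limit, str) and limit.lstrip("-").isdigit():
        chars = int(limit)
        if chars == 0:
            findings.append("max-snippet:0: no preview text allowed")
        elif chars > 0:
            findings.append(f"max-snippet:{chars}: preview capped at {chars} characters")
    if parser.nosnippet_elements:
        findings.append(f"{parser.nosnippet_elements} element(s) marked data-nosnippet")
    return findings


# Hypothetical sample markup, for illustration only.
SAMPLE_PAGE = """
<head><meta name="robots" content="index, max-snippet:50"></head>
<body><p>Public claim.</p><p data-nosnippet>Sensitive claim.</p></body>
"""

for finding in preview_findings(SAMPLE_PAGE):
    print(finding)
```

The findings are only an inventory. Deciding whether each restriction is intended, and whether it removes the passage meant to support a citation, remains an editorial decision.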

Do not confuse crawl access with fidelity

Allowing access is not the same as governing interpretation. A system may fetch a page, extract the wrong passage, cite it ornamentally or synthesize beyond the source. Accessibility is therefore the first condition, not the final standard.

After access, the site still needs extractability, citation role, source hierarchy and proof of fidelity.

Audit route

A practical accessibility audit should test strategic URLs through four questions:

Can the URL be fetched by the intended systems?
Is the useful passage visible in the initial rendered content?
Do preview rules permit the claim to be displayed or cited?
Does the canonical route point to the source that should govern the claim?

The output should not be a simplistic “allow all” recommendation. It should be a governed access map: which surfaces are open, which are restricted, which are discoverable, and which must only be interpreted through stronger canonical routes.

That map prepares the future governance repo update without creating governance files inside this site repo.
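
The four audit questions and the resulting access map can be sketched as a small data structure. Everything in the example is an assumption made for illustration: the `UrlAudit` record, the classification rules, the `/services/` and `/drafts/` paths, and the audit values themselves, which are not real results. A real audit would fill the four fields from actual fetch, rendering, preview and canonical checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlAudit:
    """Answers to the four audit questions for one URL."""
    url: str
    fetchable: bool           # Q1: can the intended systems fetch it?
    visible_initially: bool   # Q2: is the passage in the initial render?
    preview_permitted: bool   # Q3: do preview rules allow display or citation?
    canonical_ok: bool        # Q4: does the canonical route point to the governing source?


def classify(audit: UrlAudit) -> str:
    """Map audit answers to a posture label (rules are an illustrative assumption)."""
    if not audit.fetchable:
        return "restricted"
    if not audit.canonical_ok:
        return "interpret via canonical route"
    if not (audit.visible_initially and audit.preview_permitted):
        return "discoverable"
    return "open"


def access_map(audits: list[UrlAudit]) -> dict[str, str]:
    """Build the governed access map: one posture label per URL."""
    return {a.url: classify(a) for a in audits}


# Illustrative values only; these are not real audit results.
AUDITS = [
    UrlAudit("/canon.md", True, True, True, True),
    UrlAudit("/site-context.md", True, True, False, True),
    UrlAudit("/services/", True, True, True, False),    # hypothetical path
    UrlAudit("/drafts/", False, False, False, False),   # hypothetical path
]

for url, posture in access_map(AUDITS).items():
    print(f"{url}: {posture}")
```

Keeping the four answers separate, rather than collapsing them into one pass or fail flag, preserves the reason behind each posture so that a later policy change can target the specific layer that failed.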
