Robots, AI crawlers and citation accessibility

Citation accessibility starts before content quality. A source that cannot be accessed, rendered, previewed or parsed cannot reliably become evidence.

Collection: Article
Type: Article
Category: seo avance
Published: 2026-05-13
Updated: 2026-05-13
Reading time: 3 min

Governance artifacts

Governance files brought into scope by this page

This page is anchored to published surfaces that declare identity, precedence, limits, and the reading conditions of the corpus. Their order below gives the recommended reading sequence.

  1. Definitions canon
  2. Site context
  3. Public AI manifest

Canon and identity #01

Definitions canon

/canon.md

Canonical surface that fixes identity, roles, negations, and divergence rules.

Governs
Public identity, roles, and attributes that must not drift.
Bounds
Extrapolations, entity collisions, and abusive requalification.

Does not guarantee: A canonical surface reduces ambiguity; it does not guarantee faithful restitution on its own.

Context and versioning #02

Site context

/site-context.md

Notice that qualifies the nature of the site, its reference function, and its non-transactional limits.

Governs
Editorial framing, temporality, and the readability of explicit changes.
Bounds
Silent drifts and readings that assume stability without checking versions.

Does not guarantee: Versioning makes a gap auditable; it does not automatically correct outputs already in circulation.

Entrypoint #03

Public AI manifest

/ai-manifest.json

Structured inventory of the surfaces, registries, and modules that extend the canonical entrypoint.

Governs
Access order across surfaces and initial precedence.
Bounds
Free readings that bypass the canon or the published order.

Does not guarantee: This surface publishes a reading order; it does not force execution or obedience.

Evidence layer

Probative surfaces brought into scope by this page

This page does more than point to governance files. It is also anchored to surfaces that make observation, traceability, fidelity, and audit more reconstructible. Their order below makes the minimal evidence chain explicit.

  1. Canon and scope: Definitions canon
  2. Weak observation: Q-Ledger

Canonical foundation #01

Definitions canon

/canon.md

Opposable base for identity, scope, roles, and negations that must survive synthesis.

Makes provable
The reference corpus against which fidelity can be evaluated.
Does not prove
That a system already consults it, or that an observed response stays faithful to it.
Use when
Before any observation, test, audit, or correction.

Observation ledger #02

Q-Ledger

/.well-known/q-ledger.json

Public ledger of inferred sessions that makes some observed consultations and sequences visible.

Makes provable
That a behavior was observed as weak, dated, contextualized trace evidence.
Does not prove
Actor identity, system obedience, or strong proof of activation.
Use when
Descriptive observation must be distinguished from strong attestation.

Citation accessibility starts before content quality. A source that cannot be reached, rendered, previewed or parsed cannot reliably become evidence in an AI-mediated answer.

Many citation discussions jump directly to content structure. Structure matters, but it only matters after access. If the relevant page is blocked, hidden, unstable, non-canonical, impossible to render or deprived of useful preview text, the system may never reach the passage that should support the answer.

This does not mean that every crawler should receive unlimited access. It means that access policy and citation strategy must be coherent.

Accessibility is a governance decision

Robots directives, crawler rules, preview controls and security layers are not purely technical settings. They express an access policy. A site may legitimately restrict certain automated systems. But a site cannot simultaneously block the useful surface and expect that same surface to become a stable citation source.

The governance question is: which systems should be allowed to discover which surfaces, for which purpose, and under which limits?

A marketing page, a definition, a service page, a canon reference and a proof artifact do not necessarily require the same access posture. Treating them all identically creates avoidable friction.
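
As a minimal sketch of differentiated access postures, the snippet below uses Python's standard `urllib.robotparser` to check which crawlers may fetch which surfaces. The robots.txt content, the `ExampleAIBot` and `OtherBot` names, and the paths are illustrative assumptions, not rules published by this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bot names and paths are illustrative only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Allow: /definitions/
Disallow: /proofs/

User-agent: *
Disallow: /drafts/
"""


def access_posture(
    robots_txt: str, agents: list[str], paths: list[str]
) -> dict[tuple[str, str], bool]:
    """Return {(agent, path): allowed} for every agent/path combination."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network call is made.
    parser.parse(robots_txt.splitlines())
    return {(a, p): parser.can_fetch(a, p) for a in agents for p in paths}


posture = access_posture(
    ROBOTS_TXT,
    agents=["ExampleAIBot", "OtherBot"],
    paths=["/definitions/canon", "/proofs/ledger", "/drafts/note"],
)
for (agent, path), allowed in posture.items():
    print(f"{agent:13} {path:20} {'allowed' if allowed else 'blocked'}")
```

In this example `ExampleAIBot` can reach the definition but not the proof artifact, while any other agent is only kept out of `/drafts/`. Real crawlers may interpret robots rules differently from this parser, so the result is a first check rather than a guarantee.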

What citation accessibility includes

Citation accessibility is broader than a single robots rule. It includes:

| Layer | Failure mode |
|---|---|
| Crawl access | the system cannot fetch the URL |
| Rendering | the useful content appears only after unstable client-side behavior |
| Preview control | the critical claim is excluded from snippet or preview use |
| Canonical behavior | the system sees competing or unstable canonical URLs |
| Content visibility | the useful passage is hidden in collapsed elements or inaccessible UI |
| Security behavior | bot protection blocks legitimate retrieval paths |

Each failure can make a page look present to humans but weak to answer systems.
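
The rendering and content-visibility layers can be approximated with a simple test: is the passage already present in the HTML the server returns, before any script runs? The sketch below assumes that approximation and uses only the standard `html.parser` module. The class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class InitialTextExtractor(HTMLParser):
    """Collect text from raw HTML, ignoring script, style and template content."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def passage_in_initial_html(html: str, passage: str) -> bool:
    """True if the passage appears in the text of the unrendered HTML."""
    extractor = InitialTextExtractor()
    extractor.feed(html)
    extractor.close()
    # Normalize whitespace so inline tags and line breaks do not hide a match.
    text = " ".join(" ".join(extractor.chunks).split()).lower()
    return " ".join(passage.split()).lower() in text


sample = (
    "<html><body><h1>Canon</h1>"
    "<p>A source must be <em>reachable</em> first.</p>"
    '<script>render("late claim")</script>'
    "</body></html>"
)
print(passage_in_initial_html(sample, "must be reachable first"))
print(passage_in_initial_html(sample, "late claim"))
```

A passage that only exists inside a script payload, as with "late claim" here, depends on client-side execution that a retrieval system may not perform. This check does not detect content hidden by CSS or collapsed UI, which needs a rendered inspection.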

Preview control is not neutral

Preview control can be useful when a site must restrict what automated systems display. But it can also remove the very passage that would have supported a citation.

The right approach is not to disable controls by default. It is to decide which passages may legitimately be reused, which should be excluded, and which canonical source should govern sensitive claims.

A page that hides its strongest evidence may still rank, but its citation readiness is weaker.
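
To make preview restrictions visible during review, a small helper can parse the value of a robots meta tag or an `X-Robots-Tag` header. The sketch below assumes the commonly documented `nosnippet` and `max-snippet:[number]` directives, where `0` suppresses text snippets and `-1` means no limit. The function name is hypothetical, user-agent prefixes in headers are not handled, and individual systems may honor these directives differently.

```python
def parse_preview_directives(value: str) -> dict:
    """Extract snippet-related flags from a robots meta / X-Robots-Tag value."""
    directives = [d.strip().lower() for d in value.split(",") if d.strip()]
    nosnippet = "nosnippet" in directives
    max_snippet = None  # None means the directive is absent.
    for directive in directives:
        if directive.startswith("max-snippet:"):
            try:
                max_snippet = int(directive.split(":", 1)[1])
            except ValueError:
                pass  # Malformed number: treat the directive as absent.
    return {
        "nosnippet": nosnippet,
        "max_snippet": max_snippet,
        # Assumption: a text preview is blocked by nosnippet or max-snippet:0.
        "text_preview_allowed": not nosnippet and max_snippet != 0,
    }


for value in ("index, follow", "nosnippet", "max-snippet:0, noarchive", "max-snippet:160"):
    print(f"{value!r:28} -> {parse_preview_directives(value)}")
```

Running this over the strategic URLs shows which pages forbid the reuse of the very passage they are expected to be cited for.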

Do not confuse crawl access with fidelity

Allowing access is not the same as governing interpretation. A system may fetch a page, extract the wrong passage, cite it ornamentally or synthesize beyond the source. Accessibility is therefore the first condition, not the final standard.

After access, the site still needs extractability, citation role, source hierarchy and proof of fidelity.

Audit route

A practical accessibility audit should test strategic URLs through four questions:

  1. Can the URL be fetched by the intended systems?
  2. Is the useful passage visible in the initial rendered content?
  3. Do preview rules permit the claim to be displayed or cited?
  4. Does the canonical route point to the source that should govern the claim?

The output should not be a simplistic “allow all” recommendation. It should be a governed access map: which surfaces are open, which are restricted, which are discoverable, and which must only be interpreted through stronger canonical routes.

That map prepares the future governance repo update without creating governance files inside this site repo.
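
The four audit questions and the resulting access map can be sketched as a small data structure. Everything below is an assumption made for illustration: the `AuditResult` fields, the classification rules, the posture labels and the `example.com` URLs are hypothetical, and a real policy would define its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditResult:
    """Answers to the four audit questions for one strategic URL."""

    url: str
    fetchable: bool          # 1. Can the intended systems fetch it?
    passage_visible: bool    # 2. Is the passage in the initial rendered content?
    preview_permitted: bool  # 3. Do preview rules allow display or citation?
    canonical_url: str       # 4. Where does the canonical route point?


def classify(result: AuditResult) -> str:
    # Assumed mapping from answers to postures; not a standard.
    if not result.fetchable:
        return "restricted"
    if result.canonical_url != result.url:
        return "interpret via canonical"
    if result.passage_visible and result.preview_permitted:
        return "open"
    return "discoverable"


def access_map(results: list[AuditResult]) -> dict[str, str]:
    """Build the governed access map: URL -> posture."""
    return {r.url: classify(r) for r in results}


results = [
    AuditResult("https://example.com/canon", True, True, True, "https://example.com/canon"),
    AuditResult("https://example.com/service", True, False, True, "https://example.com/service"),
    AuditResult("https://example.com/proof", False, True, True, "https://example.com/proof"),
    AuditResult("https://example.com/canon?ref=x", True, True, True, "https://example.com/canon"),
]
governed_map = access_map(results)
for url, posture in governed_map.items():
    print(f"{posture:24} {url}")
```

The value of the structure is that each posture is traceable to a specific answer, so a restriction reads as a decision rather than an accident.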