Multilingual corpora: translation and version hierarchy

Multilingual corpora: translation and version… states a doctrinal position on AI interpretation, authority, evidence, governance or response legitimacy.

Collection: Doctrine
Type: Doctrine
Layer: transversal
Version: 1.0
Level: normative
Published: 2026-03-22
Updated: 2026-03-22

Visual schema

Minimal hierarchy between language versions

Comparing versions does not mean comparing equivalent texts, but comparing different authority statuses.

The schema compares three statuses:

  • Master version: reference canon
  • Translated version: derived surface
  • Hybrid recomposition: invalid zone

Source authority

  • Master version: Declares the reference norm and fixes hierarchy.
  • Translated version: Depends on a master version and cannot independently create the norm.
  • Hybrid recomposition: Has no stable authority because it mixes several states of the canon.

Update rule

  • Master version: Corrects, supersedes, or retires earlier states.
  • Translated version: Must follow the change without inventing new scope.
  • Hybrid recomposition: Often persists through fragments or unsynchronized residues.

Version conflict (critical point)

  • Master version: Prevails when hierarchy is explicitly declared.
  • Translated version: Yields when there is substantial divergence from the master version.
  • Hybrid recomposition: Must be rejected as an unreliable recomposed surface.

Jurisdictional scope

  • Master version: Defines what may be opposed and reused as reference.
  • Translated version: Adapts the statement to another language without changing the regime.
  • Hybrid recomposition: Creates a mixture of languages, dates, and perimeters.

Interpretive debt

  • Master version: Remains governable if versioning and traceability are maintained.
  • Translated version: Grows quickly when linguistic synchronization breaks.
  • Hybrid recomposition: Explodes when partial quotations become more visible than the canon.
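The version-conflict row of the schema can be read as a small selection rule. The sketch below is one possible reading, not a prescribed implementation: the names `Status`, `Surface`, `select_reference`, and the `diverges_from_master` flag are hypothetical, and the choice of French as master in the sample corpus is only an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    MASTER = "master version"          # reference canon
    TRANSLATED = "translated version"  # derived surface
    HYBRID = "hybrid recomposition"    # invalid zone


@dataclass(frozen=True)
class Surface:
    lang: str
    status: Status
    text: str
    # Hypothetical flag: assumed to be set by an upstream review step
    # when a translation substantially diverges from the master.
    diverges_from_master: bool = False


def select_reference(surfaces: List[Surface], lang: str) -> Surface:
    """Return the surface that may be used as reference for `lang`."""
    # Hybrid recompositions are rejected before any comparison.
    candidates = [s for s in surfaces if s.status is not Status.HYBRID]
    masters = [s for s in candidates if s.status is Status.MASTER]
    if len(masters) != 1:
        raise ValueError("exactly one master version must be declared")
    master = masters[0]
    for s in candidates:
        usable = s.status is Status.TRANSLATED and not s.diverges_from_master
        if s.lang == lang and usable:
            return s  # a faithful translation can serve its own language
    return master  # otherwise the translation yields to the master


# Illustrative corpus; languages and texts are assumptions.
corpus = [
    Surface("fr", Status.MASTER, "texte de référence"),
    Surface("en", Status.TRANSLATED, "reference text", diverges_from_master=True),
    Surface("de", Status.TRANSLATED, "Referenztext"),
    Surface("en", Status.HYBRID, "mixed fragments"),
]
print(select_reference(corpus, "en").lang)  # fr
print(select_reference(corpus, "de").lang)  # de
```

The function refuses to answer when no single master is declared, which mirrors the schema: the master prevails only when hierarchy is explicit.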

Governance artifacts

Governance files brought into scope by this page

This page is anchored to published surfaces that declare identity, precedence, limits, and the corpus reading conditions. Their order below gives the recommended reading sequence.

  1. Site context
  2. Editorial context
  3. AI changelog
Context and versioning #01

Site context

/site-context.md

Notice that qualifies the nature of the site, its reference function, and its non-transactional limits.

Governs: Editorial framing, temporality, and the readability of explicit changes.
Bounds: Silent drifts and readings that assume stability without checking versions.

Does not guarantee: Versioning makes a gap auditable; it does not automatically correct outputs already in circulation.

Context and versioning #02

Editorial context

/editorial-context.md

Notice that fixes editorial posture, tone, abstraction level, and responsibility.

Governs: Editorial framing, temporality, and the readability of explicit changes.
Bounds: Silent drifts and readings that assume stability without checking versions.

Does not guarantee: Versioning makes a gap auditable; it does not automatically correct outputs already in circulation.

Context and versioning #03

AI changelog

/changelog-ai.md

Log of governance, identity, and machine-first surface changes.

Governs: Editorial framing, temporality, and the readability of explicit changes.
Bounds: Silent drifts and readings that assume stability without checking versions.

Does not guarantee: Versioning makes a gap auditable; it does not automatically correct outputs already in circulation.

Complementary artifacts (3)

These surfaces extend the main block. They add context, discovery, routing, or observation depending on the topic.

Entrypoint #04

Canonical AI entrypoint

/.well-known/ai-governance.json

Neutral entrypoint that declares the governance map, precedence chain, and the surfaces to read first.

Entrypoint #05

Public AI manifest

/ai-manifest.json

Structured inventory of the surfaces, registries, and modules that extend the canonical entrypoint.

Canon and identity #06

Definitions canon

/canon.md

Canonical surface that fixes identity, roles, negations, and divergence rules.

Multilingual corpora: translation and version hierarchy

Translating a canon is not only a matter of moving a text from one language to another. It means preserving an authority perimeter, exclusions, temporality, scope, and sometimes a jurisdiction. In other words, what must survive translation is not merely the general meaning but the normative structure of what can be asserted.

A multilingual corpus therefore does not naturally produce a single stable truth. It produces several linguistic surfaces that can be equivalent, partially equivalent, locally adapted, or temporarily desynchronized. Without a declared hierarchy, a synthesis system often treats this plurality as one shared reservoir available for recomposition.

This page does not require perfect simultaneity or worldwide uniformity. It establishes something more demanding: in multilingual environments, one must govern what may be combined, what must prevail, and what must not travel without conditions.


1. A translated canon is not necessarily line-by-line identical

Two language versions can be canonically compatible without being textually symmetrical. A wording may need adaptation to preserve a legal nuance, a local usage, a sector distinction, or a readability level.

The doctrinal requirement is therefore not verbal identity. The requirement is the stability of:

  • the boundary of what may be deduced;
  • the hierarchy between assertion, condition, and exclusion;
  • the date or version of validity;
  • the relation between general rule and local variant.

A text can be lexically well translated and still be doctrinally false if it weakens an exclusion, universalizes an exception, or suggests that a secondary language prevails on an attribute it does not govern.
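To make the distinction between verbal identity and structural stability concrete, here is a minimal sketch that compares two versions by their normative structure rather than by their wording. It assumes each statement carries a language-independent identifier, which the page does not specify; `NormativeStructure`, `structural_gaps`, and all sample identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class NormativeStructure:
    # Statements are referenced by hypothetical language-independent ids.
    assertions: FrozenSet[str]
    conditions: FrozenSet[str]
    exclusions: FrozenSet[str]
    local_variants: FrozenSet[str]
    valid_version: str


def structural_gaps(master: NormativeStructure,
                    translation: NormativeStructure) -> List[str]:
    """List the ways a translation departs from the master's structure."""
    gaps = []
    for item in sorted(master.exclusions - translation.exclusions):
        gaps.append(f"weakened exclusion: {item}")
    for item in sorted(master.conditions - translation.conditions):
        gaps.append(f"dropped condition: {item}")
    # A local variant of the master that shows up as a general assertion.
    for item in sorted(master.local_variants & translation.assertions):
        gaps.append(f"universalized exception: {item}")
    if master.valid_version != translation.valid_version:
        gaps.append("validity version mismatch")
    return gaps


# Illustrative data only.
master = NormativeStructure(
    assertions=frozenset({"ships-eu"}),
    conditions=frozenset({"order-before-noon"}),
    exclusions=frozenset({"no-resale"}),
    local_variants=frozenset({"free-returns-fr"}),
    valid_version="1.0",
)
translation = NormativeStructure(
    assertions=frozenset({"ships-eu", "free-returns-fr"}),
    conditions=frozenset({"order-before-noon"}),
    exclusions=frozenset(),
    local_variants=frozenset(),
    valid_version="1.0",
)
for gap in structural_gaps(master, translation):
    print(gap)
```

An empty result does not prove that the translation is faithful. It only shows that none of the structural failures named above were detected for the annotated statements.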


2. The main multilingual drifts

The most visible case is already documented in "Multilingual and temporality: when FR and EN versions do not age together". Temporal desynchronization, however, is only one case among others.

The multilingual drifts with the greatest structural impact are usually these:

  • hybrid recomposition: an answer combines fragments from several languages without declaring their status;
  • scope shift: a local adaptation is read as a universal rule;
  • erased exclusions: the translation keeps the assertion but loses the negation;
  • asymmetric archiving: one language becomes the unintended archive of the other;
  • implicit primacy: the most detailed or easiest-to-summarize language wins, even if it should not govern the attribute at stake.

In all these cases, the problem is not merely lexical. It is hierarchical.
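The first drift in the list, hybrid recomposition, lends itself to a simple check. The sketch below assumes that each fragment used in an answer records its language, its version, and an optional declared status. The `Fragment` type and the function name are hypothetical, and treating mixed versions like mixed languages is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    text: str
    lang: str
    version: str
    # e.g. "master" or "translated"; None when no status is declared.
    declared_status: Optional[str] = None


def is_hybrid_recomposition(fragments: List[Fragment]) -> bool:
    """True when fragments mix origins and at least one has no declared status."""
    origins = {(f.lang, f.version) for f in fragments}
    mixes_origins = len(origins) > 1
    has_undeclared = any(f.declared_status is None for f in fragments)
    return mixes_origins and has_undeclared


answer = [
    Fragment("Livraison sous 48 h.", "fr", "1.0"),
    Fragment("Price: 20 EUR.", "en", "1.0"),
]
print(is_hybrid_recomposition(answer))  # True
```

The check is deliberately about hierarchy rather than wording: two fragments with declared statuses may be combined under whatever rules the corpus sets, while undeclared mixtures are flagged regardless of how well they read.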


3. What a multilingual corpus must declare

A governable multilingual corpus must be able to answer, for critical attributes, simple questions:

  • which language is the reference for which type of information;
  • when a local version prevails over the general version;
  • which temporal gaps are tolerated;
  • which information is pending translation and must not be combined;
  • which elements must remain strictly synchronized.

This discipline matters especially for attributes that engage the real scope of an entity: offering, availability, location, compliance, lead times, prices, exclusions, contact procedures, roles, and perimeters.

This connects directly with product sources: FR documentation and EN pricing can each be perfectly accurate in isolation, yet together produce a description that exists nowhere.
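The five questions above can be captured as a per-attribute declaration. The following is a sketch under stated assumptions, not a format defined by this page: `AttributePolicy`, its field names, the `POLICIES` table, and the sample values (reference languages, a 30-day tolerance, German as pending) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class AttributePolicy:
    reference_lang: str                                  # reference language for this attribute
    local_override_langs: FrozenSet[str] = frozenset()   # where a local version prevails
    max_lag_days: int = 0                                # tolerated temporal gap
    pending_langs: FrozenSet[str] = frozenset()          # pending translation, not combinable
    strict_sync: bool = False                            # must remain strictly synchronized


# Illustrative declarations; every value here is an assumption.
POLICIES: Dict[str, AttributePolicy] = {
    "pricing": AttributePolicy(reference_lang="en", strict_sync=True),
    "documentation": AttributePolicy(
        reference_lang="fr",
        max_lag_days=30,
        pending_langs=frozenset({"de"}),
    ),
}


def governing_lang(attribute: str, requested_lang: str) -> str:
    """Language whose version governs `attribute` for a given request."""
    policy = POLICIES[attribute]
    if requested_lang in policy.local_override_langs:
        return requested_lang
    return policy.reference_lang


def may_combine(attribute: str, lang: str, lag_days: int) -> bool:
    """Whether the `lang` version of `attribute` may enter a synthesis."""
    policy = POLICIES[attribute]
    if lang in policy.pending_langs:
        return False
    allowed_lag = 0 if policy.strict_sync else policy.max_lag_days
    return lag_days <= allowed_lag


print(governing_lang("pricing", "fr"))         # en
print(may_combine("documentation", "de", 0))   # False
print(may_combine("pricing", "fr", 1))         # False
```

Declaring the policy per attribute rather than per language reflects the point made above: no single language has to prevail over everything.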


4. Translate negations, silences, and conditions too

A canon is not made only of positive statements. It is also made of intentional silences, boundaries, inference prohibitions, and response conditions.

This is why canonical silence must never be treated as a translation omission. What is not said in one language is not automatically fillable from another. A governed multilingual synthesis must be able to distinguish:

  • what is truly equivalent across languages;
  • what remains unspecified in all languages;
  • what is stated locally but not exportable;
  • what is temporarily absent and must not be completed.

Translating assertions without translating exclusions and conditions simply opens a wider inference space in one language than in another.
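The four-way distinction above can be sketched as a classifier over the values an attribute takes in each language. This is an illustration only: `CrossLanguageState`, `classify`, and the idea of passing declared local and pending languages as sets are hypothetical, and string equality stands in for a real equivalence judgment.

```python
from enum import Enum
from typing import Dict, Optional, Set


class CrossLanguageState(Enum):
    EQUIVALENT = "truly equivalent across languages"
    UNSPECIFIED = "unspecified in all languages"
    LOCAL_ONLY = "stated locally but not exportable"
    PENDING = "temporarily absent, must not be completed"


def classify(
    values: Dict[str, Optional[str]],  # language -> normalized value, None if silent
    local_langs: Set[str],             # languages declared as local-only for this attribute
    pending_langs: Set[str],           # languages declared as pending translation
) -> CrossLanguageState:
    stated = {lang for lang, value in values.items() if value is not None}
    missing = set(values) - stated
    if not stated:
        return CrossLanguageState.UNSPECIFIED
    if not missing:
        if len(set(values.values())) == 1:
            return CrossLanguageState.EQUIVALENT
        raise ValueError("stated values diverge: review before any synthesis")
    if stated <= local_langs:
        return CrossLanguageState.LOCAL_ONLY
    if missing <= pending_langs:
        return CrossLanguageState.PENDING
    # No declaration covers the gap, so it is never filled from another language.
    raise ValueError("undeclared gap: treat as silence, do not fill it")


print(classify({"fr": "48h", "en": None}, set(), {"en"}).name)  # PENDING
```

Raising an error for undeclared gaps is a design choice of this sketch. It keeps the function from silently completing one language from another, which is the behavior the section warns against.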


5. Temporality, memory, and persistence of old versions

In multilingual environments, system memory is not limited to internal archives. It also includes former language versions, captures, partially reused translations, indexed excerpts, and external citations.

This is why interpretive remanence is often stronger in bilingual or multiregional corpora. A correction in one language does not automatically shift the memory of the other.

Version power must therefore be read together with memory governance: what remains accessible in a secondary language can continue to govern synthesis, even if the primary language has been corrected.
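One way to make this remanence auditable is to record, for each language surface, which master revision it reflects. The sketch below assumes integer revision numbers and an accessibility flag; `LanguageSurface`, `remanent_surfaces`, and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LanguageSurface:
    lang: str
    url: str
    synced_to_revision: int  # master revision this surface reflects
    still_accessible: bool   # live page, capture, or indexed excerpt


def remanent_surfaces(master_revision: int,
                      surfaces: List[LanguageSurface]) -> List[LanguageSurface]:
    """Surfaces that still expose a state the master has since corrected."""
    return [
        s for s in surfaces
        if s.still_accessible and s.synced_to_revision < master_revision
    ]


# Illustrative inventory; URLs and revision numbers are assumptions.
inventory = [
    LanguageSurface("fr", "https://example.org/fr/offre", 7, True),
    LanguageSurface("en", "https://example.org/en/offer", 5, True),
    LanguageSurface("en", "https://example.org/en/offer-old", 3, False),
]
for surface in remanent_surfaces(7, inventory):
    print(surface.lang, surface.url)  # en https://example.org/en/offer
```

Such a list makes the gap visible. It does not correct outputs that already rely on the older state.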


6. What governing multilingual corpora does not mean

Governing multilingual corpora does not mean:

  • imposing absolute textual symmetry;
  • assuming that one language must always prevail over everything;
  • forbidding local adaptations;
  • requiring instantaneous translation of every change.

It means making visible the priority rules, the admissible gaps, and the non-combinable zones. Without those rules, translation stops being an equivalence device and becomes a reserve of fragments for opportunistic synthesis.


7. Doctrinal scope

This page extends doctrine to an object that is often under-governed: the coexistence of partially aligned linguistic truths. It does not replace a translation policy, a legal process, or a localization architecture.

It establishes only this: a multilingual site does not merely manage multiple languages. It manages multiple jurisdictions of interpretation.


Canonical connectors