404, deletion, and AI citation: what are we actually talking about?

Clarification of the regimes often conflated when a deleted page continues to appear in AI outputs. Distinguishes deletion of the source, web availability, citation persistence, surviving authority, interpretive remanence, and stateful memory.

Collection: Clarification
Type: Clarification
Version: 1.0
Published: 2026-04-14
Updated: 2026-04-14

This page clarifies a point that is often misdiagnosed: when a deleted page continues to influence an AI answer, several distinct regimes may be involved. Blending them together leads to weak explanations, incomplete remediation, and exaggerated conclusions about “model memory.”

Status of this page

This page is an interpretive clarification.

It does not claim to model the internal functioning of every system, to settle security debates, or to comment on a specific case. Its purpose is simply to establish a cleaner reading frame so that one explanation is not projected onto different mechanisms.

The vocabulary problem

When an AI output keeps mentioning content after deletion, one often hears phrases like:

  • “the page is still in memory”;
  • “the model still cites it despite the 404”;
  • “deleting the page changed nothing.”

These formulations are understandable, but they compress several questions into one.

The five questions that must be separated

1. Does the source still exist as a current web surface?

This is the simplest question. A page may still be served, or it may be redirected, replaced, removed, or returning a 404.

2. Is the source still retrievable as a direct origin?

Even if the page disappeared from its main surface, it may survive elsewhere as an archive, copy, export, screenshot, PDF, cache, or republication.

3. Does the content continue to be relayed by secondary sources?

This is the terrain of citation persistence. A deleted page may already have generated enough citations, rankings, profiles, or reprises (third-party reuses) for its framing to continue circulating without it.

4. Does an old interpretation return despite a corrected canon?

This is the terrain of interpretive remanence. The issue no longer lies only in source survival, but in the persistence of an older state inside outputs.

5. Does the system itself persist states between two points in time (t0 and t1)?

That question belongs to memory governance and stateful systems. It should not be projected automatically onto every case observed on the open web.

As long as those five questions remain mixed, the diagnosis stays blurry.

What a 404 actually says

A 404 says one precise thing: this resource is not currently available at that address.

What it does not say is just as important. A 404 does not say:

  • that nobody ever saw the page;
  • that no archive exists;
  • that no secondary citation continues to circulate;
  • that no synthesis has already absorbed its framing;
  • that no third-party surface has reused its structuring elements.

A 404 therefore acts on the current availability of the origin. By itself, it does not purge the informational environment that has already been built around it.
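
As a minimal sketch of that limit, the Python snippet below maps an HTTP status code to what it establishes about the origin, and lists separately what it leaves open. The function name `origin_availability`, the returned labels, and the constant `NOT_ESTABLISHED_BY_STATUS` are illustrative assumptions; only the status codes themselves come from HTTP, and the open questions are the ones listed above.

```python
# Assumed, illustrative labels: they are not part of HTTP or of this page's vocabulary.
REDIRECT_CODES = {301, 302, 307, 308}
UNAVAILABLE_CODES = {404, 410}  # 404 Not Found, 410 Gone

# What a status code alone cannot tell us (taken from the list above).
NOT_ESTABLISHED_BY_STATUS = (
    "whether anybody ever saw the page",
    "whether an archive exists",
    "whether secondary citations still circulate",
    "whether a synthesis has absorbed its framing",
    "whether third-party surfaces reuse its structuring elements",
)


def origin_availability(status_code: int) -> str:
    """Describe only the current availability of the origin at its main URL."""
    if status_code == 200:
        return "served"
    if status_code in REDIRECT_CODES:
        return "redirected"
    if status_code in UNAVAILABLE_CODES:
        return "unavailable"
    # Anything else (server errors, auth walls, ...) is left undecided here.
    return "undetermined"
```

Whatever the function returns, every entry in `NOT_ESTABLISHED_BY_STATUS` still has to be checked by other means.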

What it really means when “AI still cites it after deletion”

That sentence may describe several distinct situations.

Case A: the direct source is still readable elsewhere

The page disappeared from its main URL, but survives in another form. The relevant diagnosis is not “deep model memory,” but still-open access surfaces.

Case B: the answer no longer relies on the origin, but on its reprises

The original source has disappeared, but rankings, directories, profiles, or articles still relay it. The relevant diagnosis becomes citation persistence.

Case C: the old version still frames the answer as if it remained valid

Here the main problem is not the origin citation itself, but the surviving authority of a historical or secondary artifact.

Case D: the system carries a consolidated state over time

In a stateful or agentic context, a system may persist states and reuse them. That case exists, but it must not become the default explanation for every public-web phenomenon.

What should not be concluded too quickly

Saying “the model keeps the page in memory” before excluding secondary routes is methodologically weak.

That may be true in some contexts. It may also be false in many others. Good governance begins by separating:

  • past exposure;
  • present availability;
  • secondary circulation;
  • persistence of interpretation;
  • persistence of state.

To describe an observed case correctly, the minimum vocabulary should be:

  • Deletion / 404: current status of the resource at its main URL.
  • Citation persistence: survival of framing through quotations, reprises, rankings, and secondary artifacts.
  • Surviving authority: continued framing capacity despite loss of primacy.
  • Interpretive remanence: return of an older version despite a corrected canon.
  • Memory governance: regime proper to systems that persist reusable states.

This vocabulary does not solve everything, but it already prevents the confusion of a visible symptom with the wrong causal mechanism.

Minimum reading rule

Rule C-404-1: before attributing the persistence of an answer to “model memory,” one must map the secondary surfaces that remain active, qualify their status, and determine whether the case is primarily citation persistence, surviving authority, interpretive remanence, or stateful memory.
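
One way to picture the rule is as a small bookkeeping step, sketched below in Python. The `Surface` record and the `primary_regime` function are hypothetical names. The rule does not say how "primarily" is decided, so counting active surfaces per regime (with ties going to the regime listed first) is an assumed heuristic, not part of the rule itself.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# The four regime names are taken from Rule C-404-1.
REGIMES = (
    "citation persistence",
    "surviving authority",
    "interpretive remanence",
    "stateful memory",
)


@dataclass(frozen=True)
class Surface:
    """One mapped secondary surface (hypothetical record shape)."""
    description: str
    regime: str   # the analyst's qualification; must be one of REGIMES
    active: bool  # whether the surface is still active


def primary_regime(surfaces: list[Surface]) -> Optional[str]:
    """Return the regime with the most active surfaces, or None if none is active."""
    for s in surfaces:
        if s.regime not in REGIMES:
            raise ValueError(f"unknown regime: {s.regime!r}")
    counts = Counter(s.regime for s in surfaces if s.active)
    if not counts:
        return None
    # max() keeps the first maximal element, so ties follow the order of REGIMES.
    return max(REGIMES, key=lambda r: counts[r])


mapped = [
    Surface("directory entry quoting the page", "citation persistence", True),
    Surface("ranking article relaying the page", "citation persistence", True),
    Surface("old PDF export still treated as valid", "surviving authority", True),
    Surface("mirror that has since been removed", "citation persistence", False),
]
print(primary_regime(mapped))  # citation persistence
```

A `None` result means only that no mapped surface is active. It says the mapping found nothing, not that the cause has been identified.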

What this clarification changes in practice

It changes remediation.

If the problem comes from a page that is still reachable elsewhere, correction should target the access routes.

If the problem comes from citation persistence, correction should target third-party reprises.

If the problem comes from surviving authority, the source hierarchy must be requalified.

If the problem comes from interpretive remanence, version power and exogenous correction must be reinforced.

If the problem comes from persisted memory, one must work on memory objects, their temporality, and their invalidation rules.
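
The correspondence between diagnosis and remediation can be written as a simple lookup, sketched here in Python. The dictionary keys are shorthand labels chosen for this example (in particular "still-reachable copy"), and `remediation_for` is a hypothetical helper; the remediation targets are those named above.

```python
# Diagnosis label -> remediation target, following the five cases above.
REMEDIATION = {
    "still-reachable copy": "target the access routes",
    "citation persistence": "target third-party reprises",
    "surviving authority": "requalify the source hierarchy",
    "interpretive remanence": "reinforce version power and exogenous correction",
    "persisted memory": "work on memory objects, their temporality, and their invalidation rules",
}


def remediation_for(diagnosis: str) -> str:
    """Look up the remediation target for a diagnosis label."""
    try:
        return REMEDIATION[diagnosis]
    except KeyError:
        # An unknown label means the case has not been qualified yet.
        raise ValueError(f"unqualified diagnosis: {diagnosis!r}") from None


print(remediation_for("citation persistence"))  # target third-party reprises
```

The lookup deliberately fails on an unrecognized label, so that a vague diagnosis such as "model memory" cannot silently receive a remediation.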