Indirect injection: when “summarize this content” becomes an attack surface

Type: Clarification

Conceptual version: 1.0

Stabilization date: 2026-02-28

This page defines indirect injection as an authority threat that transits through a legitimate task (“summarize”, “explain”, “extract”) and converts a hostile instruction into consumed context.

Prompt injection is often imagined as an adversary who “talks to the model” directly. Yet, in a modern architecture (RAG, assisted navigation, agents), a large part of context is not provided by the user, but retrieved (pages, documents, extracts, emails, repositories, tools). Indirect injection exploits this reality: it places instructions in content that will then be treated as data.

The critical point is structural: a work instruction (“summarize this content”) forces the system to ingest third-party text. If the system does not explicitly bound what can instruct, it risks letting a hostile instruction slip into the decisional hierarchy.

Operational definition

Indirect injection: insertion of instructions or constraints in third-party content (page, document, extract, tool output) such that, during a legitimate task (summary, extraction, classification, response), the system treats these instructions as authoritative context and modifies its output, priorities, or decisions.

The central mechanism is an instruction/data confusion transiting through a processing step perceived as neutral.

Why “summarize this content” is an attack surface

A summary request has a particular property: it implicitly gives the content a status of “raw material” to ingest, without prior validation of its role.

If the system does not impose strict separation between:

  • rules (what can instruct)
  • context (what can inform)
  • sources (what can carry authority)

then content can contain a hostile instruction that will be treated as if it were compatible with the requested task, or even prioritized.

Common surfaces (where injection hides)

  • Web pages: sections invisible to the eye (footer, comments, accordions), or non-editorialized “SEO” content.
  • Documents: PDF, docs, notes, where the instruction is buried in a paragraph.
  • Tool outputs: API outputs, connectors, scrapers, logs, consumed as “raw data”.
  • RAG-indexed content: a poisoned fragment can be recalled out of context and gain implicit authority rank.

Related pages