Large language models reconstruct the world not through perception, but through probability. They weigh which statements rest on stable evidence, are well connected and logically consistent – and from this they form their own algorithmic layer of truth. This is precisely where the new visibility arises: not through reach, but through evidence.
But while AI monitoring tools measure the “what” of visibility (which brands appear in LLM responses and AI overviews), the “how” and “why” remain unclear. They diagnose retrieval signals. What they do not show is why some sources are merely mentioned while others are recommended. The reason: the actual decision is not made in phase 2 (fan-out), but in phases 5-6 (evidence weighting, reasoning). This is where discoverability becomes authority – or does not.
Visibility is not a metric but an emergent phenomenon: the result of data architecture, evidential value and semantic coherence. This article shows how LLMs actually construct answers, step by step and phase by phase – and why brands only become part of these answers if they anchor themselves in a structured way. It explains the difference between content that is merely read and content that becomes knowledge.
Because machine visibility does not mean being found. It means: belonging to the truth.
The three biggest misconceptions in the AI visibility industry

Misconception 1: AI overviews are an opportunity
GEO-optimized brands are found and retrieved – appearing in fan-out logs and AI overview statistics. But without structured evidence, they remain vague mentions instead of recommendations. They serve as background material, not as a linked source.

Misconception 2: LLMs are black boxes – you can only experiment
Trial and error leads nowhere: testing prompt variations, A/B testing content, hoping. Without an understanding of the process, optimization is flying blind. If you don’t know when the decision between mention and recommendation is made, you invest in the wrong levers. If you don’t understand entity recognition, you produce content that is unambiguous for humans but remains ambiguous for machines.

Misconception 3: AI visibility is short-term – training cycles are irrelevant
Monitoring shows visibility today. What it doesn’t show: Who will be cited in 8-12 months even without online research? Brands with consistent Q-IDs and Schema.org relations form the parametric knowledge base of future models. They will become part of the “solidified layer of truth” – cited without research.
LLM process model: Dual-Path Architecture
Visualization of the answer formation | Case L vs. Case L+O
When a Large Language Model answers a query, it is faced with a fundamental decision: Is the internally stored knowledge sufficient – or must additional research be carried out on the web? This decision determines which of two fundamentally different paths is taken.
Case L (learning data) uses only the parametric knowledge from training – a condensed image of human language and knowledge structures that is typically several months to over a year old. Updating is not continuous but takes place in discrete training cycles. If the model is from 2025, its training knowledge typically extends only to around mid-2024.
Case L+O (Learning Data + Online) supplements this static knowledge with real-time research on the web. The model opens a “candidate space” of URLs, scans their content and extracts current evidence. It recognizes structured data such as JSON-LD, Schema.org markup and entity anchors (Q-IDs, sameAs) in real time – long before they flow into future training cycles.
The following visualization shows both paths in detail: from the initial prompt to the critical decision (confidence threshold) to the final answer. What looks like a linear process at first glance is actually a complex system with feedback loops, evidence weighting and semantic synthesis – and this is precisely where it is decided whether structured data leads to a mere mention or an authoritative recommendation.
Phase details
Phase 0: Prompt Input / Intent Parsing
Tokenization → Semantic analysis → Intent vector creation → Complexity assessment. The intent vector controls all subsequent layers.
Phase 1: Internal Knowledge Retrieval
Access to parametric knowledge (trained weightings). Confidence score decides on Case L vs. L+O. With Confidence > Threshold, the system remains in Case L.
Decision node: Confidence Threshold
Is internal knowledge sufficient? YES → Case L (learning data only). NO → Case L+O (additional online research). This switch is critical for path selection.
Phase 2: Fan-Out / Candidate Retrieval
Generation of semantic search queries (1-6 tokens). Query of external search interfaces. Result: 3-10 prioritized URLs + snippets. This is where the external evidence search begins.
Phase 3: Source Evaluation / Evidence Extraction
HTML parsing + JSON-LD extraction + RDFa/microdata analysis. Structured data (Schema.org) increases evidential value. Passage ranking according to relevance to the intent vector.
Phase 4: Entity Recognition (Case L)
Use of trained entity patterns from Wikipedia, Wikidata, Common Crawl. Internal Q-ID matching based on training. No external verification.
Phase 4: Entity Recognition & Linking (Case L+O)
In addition to internal matching: verification via @id, sameAs, Q-IDs from JSON-LD. Cross-document entity coreference. The AI Visibility layer works directly here.
Phase 5: Evidence Weighting (Case L)
Weighting according to internal frequency (how often the pattern occurred in the training) and coherence (consistency in parametric knowledge).
Phase 5: Evidence Weighting (Case L+O)
Document architecture analysis: Sources with Q-IDs, sameAs, identifier receive 2-3x higher weight. Source diversity + topicality + domain authority are included in the evidence matrix.
Phase 5a: Response Planning (Case L+O)
Evidence-guided outline: Claims are prioritized according to the evidence matrix. Hierarchical relations (isPartOf, about) structure the argumentation. Feedback loop to phase 5 possible.
Phase 6: Reasoning (Case L)
Composition from learned patterns. Statistically coherent response based on frequency weighting in training. No external evidence.
Phase 6: Reasoning & Synthesis (Case L+O)
Fusion of internal (L) and external (O) evidence. Conflict resolution for contradictions. Multi-hop reasoning via entity graphs. This is where the decision is made: Mention vs. recommendation. Feedback loop to phase 3 possible.
Phase 7: Final Response Construction
Linguistic optimization + style alignment. In Case L+O: optional citation with url, name, publisher from Schema.org. Structured metadata enables explicit source citation.
Feedback loops
Phase 5 → Phase 2:
In case of insufficient evidence (low score in evidence matrix), return to fan-out with refined queries.
Phase 6 → Phase 3:
In case of inconsistencies or contradictions during reasoning: additional source retrievals or re-ranking.
Phase 5a → Phase 5:
If the argumentation structure is unclear: Re-assessment of the evidence weighting.
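Read together, the phases and feedback loops form a simple control flow. The following Python sketch is purely illustrative: the threshold value, the evidence scores and the function names are assumptions made for this article, not internals of any particular model.

```python
# Illustrative control flow of the dual-path model: confidence threshold plus
# the Phase 5 -> Phase 2 feedback loop. All values are assumptions for this sketch.
def choose_path(confidence: float, threshold: float = 0.75) -> str:
    """Decide between Case L (parametric only) and Case L+O (online research)."""
    return "Case L" if confidence >= threshold else "Case L+O"

def run_with_feedback(evidence_scores: list[float], min_score: float = 0.6,
                      max_loops: int = 2) -> dict:
    """Return to fan-out with refined queries while the evidence score is too low."""
    loops = 0
    score = evidence_scores[0]
    while score < min_score and loops < max_loops:
        loops += 1                                        # Phase 5 -> Phase 2
        score = evidence_scores[min(loops, len(evidence_scores) - 1)]
    return {"final_evidence_score": score, "fan_out_rounds": loops + 1}

print(choose_path(0.82))                       # -> Case L
print(choose_path(0.41))                       # -> Case L+O
print(run_with_feedback([0.35, 0.55, 0.71]))   # two query refinements before the score suffices
```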
Fields of influence of the truth layer
Structured data does not only work in one of the two modes – it influences the response formation on two parallel time axes and in both paths.
Transient (Case L+O): When the LLM searches online, it reads JSON-LD, Schema.org markup and entity anchors such as @id, sameAs or Q-IDs in real time. This structured data flows immediately into the evidence weighting: sources with stable identifiers receive 2-3x higher weight, cross-document entity coreference becomes possible, and the document architecture decides on mention vs. recommendation. This effect is transient – it only exists for the duration of the current search.
Persistent (Case L): Structured data that is present over a longer period of time is integrated into the parametric knowledge base in future training cycles. It becomes part of the “solidified layer of truth” that the model can use even without online access. Entities with consistent Q-IDs, sameAs links and hierarchical relations then shape the internal frequency weighting and entity recognition in Case L. This effect is persistent – it shapes the model in the long term.
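What such machine-readable evidence can look like in practice is shown below as a minimal, hypothetical Schema.org/JSON-LD block (serialized from Python). Organization name, URLs, the Q-ID and the LEI are placeholders, not real data.

```python
import json

# Hypothetical organization markup; all names, URLs and identifiers are placeholders.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://www.example-brand.com/#organization",   # stable entity anchor
    "name": "Example Brand AG",
    "url": "https://www.example-brand.com/",
    "sameAs": [
        "https://www.wikidata.org/wiki/Q00000000",           # placeholder Q-ID
        "https://www.linkedin.com/company/example-brand"
    ],
    "identifier": {
        "@type": "PropertyValue",
        "propertyID": "LEI",
        "value": "000000000000XXXXXXXX"                       # placeholder LEI
    }
}

# Embedded in the page head as <script type="application/ld+json"> ... </script>
print(json.dumps(organization, indent=2))
```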
The following illustration shows in which specific phases (layers 4-7) structured data is effective – and how it permeates the entire reasoning process from mere identification to evidence evaluation and final citation. What is often misunderstood as “SEO for LLMs” is in fact semantic infrastructure: those who build up machine-readable evidence today secure both immediate visibility and long-term reputation.
Layer 4: Entity Recognition
Mechanism: Identification and anchoring of entities
Mode of action: Immediate evidence (transient) + long-term effect (persistent). Unique IDs prevent confusion.

Layer 5: Evidence Weighting
Mechanism: Evaluation of the document architecture
Mode of action: Strengthens evidential value and citation probability. Sources with stable anchors are given greater weight.

Layer 5a: Response Planning
Mechanism: Structuring the argumentation logic
Mode of action: Influences the outline structure: well-connected entities are prioritized as central argumentation points.

Layer 6: Reasoning
Mechanism: Context integration across multiple sources
Mode of action: Increases the chance of a RECOMMENDATION instead of a mere mention. Consistent data acts as a “truth anchor”.

Layer 7: Final Response
Mechanism: Citation integration & transparency
Mode of action: Enables explicit source citation instead of vague references. Structured metadata for attribution.
“Structured data not only affects retrieval (phases 2-3), but above all entity recognition (4), evidence weighting (5), response planning (5a) and reasoning (6). This distinguishes semantic data architecture from superficial content marketing.”
Norbert Kathriner
The mechanics in detail: All steps from the request to the response
What was previously considered a black box can now be reconstructed through systematic analysis.
The following descriptions are based on 14 months of field research, evaluation of over 500 prompt-response pairs across multiple LLM platforms, and comparison with current RAG and entity linking research. They do not show how a specific model is programmed internally – that remains proprietary. But they do show the logical steps a system has to go through to get from a natural language query to a structured, evidence-based response.
This granularity is not an academic end in itself. It is the basis for identifying the points at which structured data takes effect – and therefore the prerequisite for strategic decisions: Where is it worth investing in machine-readable evidence? Which Schema.org properties have a measurable impact? When is content marketing enough, and when is semantic architecture needed?
The digital communications boutique is conducting this fundamental research not just out of academic interest, but out of necessity: if you want to understand how brands will retain control of their content in an LLM-dominated future, you need to understand how these systems make decisions. What follows is not speculation – it is the most detailed, publicly available process description of LLM response formation to date.
Part I – Case L (answer based on learning data)
Phase 0: Prompt Input / Intent Parsing
Before a model “knows” anything, it has to understand what is actually required. Phase 0 is therefore not a content-related step, but an epistemic one: a human formulation becomes a machine-readable intention. Technically, this means breaking down the text into units, recognizing fields of meaning and encoding these fields as an intent vector. This vector is the working basis for all further decisions: Do we stay in the internal knowledge space (Case L), or do we open the search space (Case L+O)?
Important: This phase does not decide anything about truth content. It merely determines relevance paths – i.e. which topics, entities and relations are even possible. At the same time, the model assesses the complexity: Is the question purely definitional? Is it time-related? Does it require evidence? This assessment results in an initial confidence value, which later serves as a switch. The clearer the intent, the lower the friction losses in all subsequent phases and the less susceptibility to prompt sensitivity.
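As a purely didactic illustration of what an intent vector and a complexity assessment might look like, here is a toy sketch in Python. Real models derive intent from learned embeddings; the hashing and keyword heuristics below are stand-ins invented for this example.

```python
import hashlib
import math

def toy_intent_vector(prompt: str, dims: int = 16) -> list[float]:
    """Toy stand-in for an intent vector: hashed token buckets, L2-normalized."""
    vec = [0.0] * dims
    for token in prompt.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def toy_complexity(prompt: str) -> str:
    """Crude heuristic stand-in for the complexity assessment described above."""
    markers = ("why", "compare", "strategy")
    return "argumentative" if any(w in prompt.lower() for w in markers) else "definitional"

print(toy_intent_vector("Which insurer is recommended for cyber policies?")[:4])
print(toy_complexity("Why is structured data relevant for AI visibility?"))
```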
For brands, this means that linguistic and semantic clarity already pays off here. If you use consistent terms, stable entities and unambiguous designations in your own communication, you increase the probability that the model will process the question in the correct semantic neighborhood. Phase 0 is therefore the place where machine readability begins as a prerequisite for everything else.
Phase 1: Internal Knowledge Retrieval
Before a language model looks outwards, it first looks within itself. This phase is the core of every purely internal answer (Case L): Here it is checked whether the existing world knowledge is sufficient to answer the query conclusively.
Technically speaking, the model accesses its parametric knowledge base – a condensed matrix of weights that were learned during training. These parameters do not store facts in the classical sense, but rather probability spaces of meaning. When a prompt activates a certain topic, the model does not react by looking it up, but by activating concept patterns: links between concepts, relations and typical argumentation chains.
This phase determines how “stable” the model thinks. It evaluates the density of evidence in its own memory, filters out noise and forms an internal confidence measure – the confidence score. If this is high enough, the model remains in its closed system (Case L). If it is low or contradictory, this triggers the transition to hybrid mode (Case L+O).
This phase is crucial for brands because it determines whether a brand even exists in the collective memory of the machine. Anyone who does not appear here will hardly be mentioned organically, even with correct online research. This is why internal visibility – i.e. trainable evidence – is the basis for any subsequent AI visibility.
Not applicable in Case L (skipped):
- Phase 2 – Fan-Out (Case L+O only)
- Phase 3 – Evidence Extraction (Case L+O only)
Phase 4: Entity Recognition (trained patterns)
When a language model “knows” something, it rarely operates on isolated terms – it understands via entities. This phase translates fuzzy human thinking into a structured, machine-readable form: words become nodes, connections become relations.
The model identifies which concepts, organizations, places or people appear in the prompt and checks whether they already exist as trained entities in the internal knowledge space. Each of these entities has a kind of semantic fingerprint: a combination of name, context and typical attributes. The internal comparison is made with stored patterns or – in a figurative sense – with known Q-IDs from the training knowledge.
The aim of this phase is not just recognition, but clarity. The model can only formulate coherent answers if it knows for sure which Allianz or Zurich is meant. To do this, it links entities with each other (“Allianz SE” – “insurance” – “Germany”) and builds a local graph of meaning. Only this graph turns mere word association into semantically comprehensible thinking.
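The disambiguation step can be sketched as follows. The entity entries, context attributes and the second Q-ID are illustrative placeholders (Q487292 for Allianz SE is taken from the example used later in this article); real systems match in embedding space rather than via keyword overlap.

```python
# Toy disambiguation of the "which Allianz is meant?" problem via context overlap.
TRAINED_ENTITIES = {
    "Q487292": {"label": "Allianz SE", "context": {"insurance", "insurer", "munich", "dax"}},
    "Q_PLACEHOLDER": {"label": "Allianz Arena", "context": {"stadium", "football", "bayern"}},
}

def disambiguate(mention: str, prompt_context: set[str]) -> str:
    """Pick the trained entity whose context overlaps most with the prompt context."""
    candidates = {qid: e for qid, e in TRAINED_ENTITIES.items()
                  if mention.lower() in e["label"].lower()}
    best = max(candidates, key=lambda qid: len(candidates[qid]["context"] & prompt_context))
    return f'{candidates[best]["label"]} ({best})'

print(disambiguate("Allianz", {"insurer", "premium", "policy"}))   # -> Allianz SE (Q487292)
print(disambiguate("Allianz", {"stadium", "season", "ticket"}))    # -> Allianz Arena (Q_PLACEHOLDER)
```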
For brands, this means that this is where it is decided whether they appear as a clear entity or are lost in the noise of meaning. Only those that are stable in terms of content, formally consistent and referenced multiple times are recognized as independent nodes – and can later exist in generative responses as a “source” or “example”.
Phase 5: Evidence Weighting (internal frequency & coherence)
Once the model has identified the relevant entities and mapped their relationships, the key question is: Which of this internal information is robust enough to become part of the answer?
In this phase, the model evaluates the strength of evidence of its own knowledge stored in the training. There are no external sources – the evaluation takes place exclusively in the parametric space of the model.
Each connection between terms and concepts carries a statistical weight that is derived from the frequency and stability in the training data.
Two dimensions determine this internal evidence assessment:
- Frequency – How often does a pattern occur in similar contexts? Frequent, consistent links are given a higher internal weight.
- Coherence – How consistent is the relationship within the semantic field? Inconsistencies or competing meanings reduce trust.
Both factors are used to create an internal evidence score, which indicates which concepts serve as a viable basis for the answer. This internal weighting replaces the external evidence process in Case L – it is the machine equivalent of a plausibility check.
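A minimal sketch of such an internal evidence score, assuming a simple weighted sum of frequency and coherence. The formula, the weights and the example claims are assumptions for illustration, not documented model internals.

```python
# Toy internal evidence score: a weighted sum of frequency and coherence.
def internal_evidence_score(frequency: float, coherence: float,
                            w_freq: float = 0.5, w_coh: float = 0.5) -> float:
    """frequency and coherence in [0, 1]; returns a combined score in [0, 1]."""
    return w_freq * frequency + w_coh * coherence

patterns = {
    "Brand X underwrites cyber risks":      internal_evidence_score(0.9, 0.8),
    "Brand X sponsors a football club":     internal_evidence_score(0.4, 0.9),
    "Brand X sells consumer electronics":   internal_evidence_score(0.1, 0.2),
}

# Frequent, coherent links become "main axes" of the answer; weak ones are discarded.
for claim, score in sorted(patterns.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:.2f}  {claim}")
```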
This phase is crucial for brands because it shows how strongly a brand is already anchored in the collective memory of the machine. Only frequently linked, semantically clear brand references generate enough internal evidence to appear in responses without additional online research.
Phase 6: Reasoning (parametric synthesis)
Once the model has evaluated its internal evidence, the actual synthesis phase begins – what we would call thinking in humans. Now the weighted patterns, entities and relationships are brought together to form a consistent structure of meaning.
The model checks which concepts fit together, which support each other and which contradict each other. It draws logical and semantic conclusions, consolidates similar patterns and discards inconsistent relations. The process resembles the mental comparison of several memories: from scattered elements of knowledge, a plausible explanation emerges that is internally stable enough to be expressed linguistically.
The prioritization follows the frequency weighting from phase 5: Frequently occurring, consistent patterns are treated as “main axes” of meaning, rare or weak patterns as supplementary marginal information.
The goal is coherence – an answer that is logically coherent, semantically harmonious and statistically plausible.
For brands, this phase means that this is where it is decided whether their presence in the model is considered a stable knowledge unit. A brand whose content appears consistently across many contexts is perceived as a reliable node in this internal space of meaning – and can therefore also appear in answers without external sources.
Phase 7: Final Response Construction
In this final phase, the model leaves the realm of pure knowledge – it begins to speak. A linguistic structure that combines style, rhythm and precision emerges from the previously formed structure of meaning. The model transforms its inner logic into linear language so that thoughts and weightings become readable for people.
First of all, the system decides how it responds: sober, explanatory or argumentative. For factual topics, clarity dominates – for analytical or strategic topics, a rhythmic, context-rich style. The model simulates communication competence: it selects sentence lengths, word fields and rhetorical transitions in such a way that they correspond to the user’s presumed expectations.
The answer is then structured: paragraphs, lists and semantic markers organize the previously condensed arguments. Each formulation is a controlled recombination of meaning and probability – not free creativity, but precise implementation of the previously calculated semantic framework.
In parallel, a constant self-monitoring process takes place: the model checks whether each formulation carries the intended content, removes redundancies, corrects implicit contradictions and smoothes out stylistic breaks. The result is a text that is coherent, readable and self-contained – the linguistic manifestation of a probabilistic truth.
For brands, this phase is the projection screen for their digital identity. This is where it becomes clear how the machine “tells” them: factual, trustworthy, concise – or fragmented and arbitrary. Those who have previously created semantic clarity are reconstructed precisely here. Those who communicate in an unstructured manner are rendered only vaguely in this phase.
Part II – Case L+O (answer with online research)
Phase 0: Prompt Input / Intent Parsing
In the hybrid path, too, everything begins with understanding – but the goal is different. While the internal mode (Case L) checks what is already known, the hybrid path aims to make gaps in knowledge visible.
The model interprets the query not only semantically, but also epistemically: Do I know enough to be able to answer?
To do this, it breaks down the user input into tokens, recognizes semantic clusters, entities and relations and uses them to form an intent vector. This intent vector describes the target content space and serves as a control signal for all subsequent layers. At the same time, the complexity of the query is evaluated – is it a definitional, factual or argumentative question? The result is a confidence profile that already reveals initial uncertainties and thus prepares the decision as to whether the search space needs to be opened.
This phase is crucial for brands: this is where it is decided whether a brand is even considered as a potential source. Only clearly structured, semantically precise communication increases the likelihood that the model will mark it as a relevant candidate for evidence in the next step.
Phase 1: Internal Knowledge Retrieval & Gap Detection
In this phase, the model performs a double movement: It first draws on its internal knowledge – as if it were “remembering” – and at the same time checks whether this remembering is sufficient. Phase 1 thus forms the bridge between the model’s familiar mental space and the as yet unknown external information space.
Technically, the model activates its trained weights, retrieves stored concepts and patterns and evaluates their relevance to the current query. It creates an internal image of the question, but this image is probabilistic – not a reference book. The model therefore does not know whether it knows the answer, but only how certain it is that the existing patterns are sufficient.
The up-to-dateness check is central: the recognized topic area is compared with the model’s own knowledge period. If the query concerns events after the training cut-off or contains terms that are not yet established in the training corpus, the system marks these areas as uncertain or potentially outdated. Semantic gaps – e.g. new institutions, technologies or terms – are also recorded in this way.
These individual assessments are used to create a confidence profile, a matrix of internal certainty and suspected knowledge gaps. If the internal evidence is deemed insufficient, the transition to an external search (fan-out queries) is activated.
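A toy version of this gap detection might look like the following. The cutoff date, the known-term vocabulary and the decision rule are invented placeholders.

```python
from datetime import date
from typing import Optional

TRAINING_CUTOFF = date(2024, 6, 30)                          # assumed knowledge cutoff
KNOWN_TERMS = {"schema.org", "json-ld", "wikidata", "rag"}   # tiny stand-in vocabulary

def needs_online_research(query_terms: set[str], query_date: Optional[date]) -> bool:
    """Mark the query as uncertain if it contains unknown terms or post-cutoff dates."""
    semantic_gap = bool(query_terms - KNOWN_TERMS)
    too_recent = query_date is not None and query_date > TRAINING_CUTOFF
    return semantic_gap or too_recent

print(needs_online_research({"json-ld", "wikidata"}, None))                        # False -> stay in Case L
print(needs_online_research({"json-ld", "new-eu-regulation"}, date(2025, 3, 1)))   # True  -> open Case L+O
```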
For brands, this means that this is where machine memory ends – and this is where the opportunity to become visible through up-to-date, structured data begins. Anyone who regularly publishes structured information (e.g. via JSON-LD, Q-IDs or Schema.org) becomes a natural anchor in this phase. Machines then do not fall back on arbitrary sources, but on those that they recognize as reliable and semantically coherent.
Phase 2: Fan-Out Query Generation
As soon as the model recognizes that its internal knowledge is not sufficient, it opens up the search space. Phase 2 marks the transition from passive memory to active research – the moment when thinking becomes action. The model begins to search specifically for evidence instead of relying solely on stored patterns.
To do this, it translates the intent vector into a large number of semantically varied search queries – so-called fan-out queries. Each query is a hypothesis: a possible path to relevant evidence. Instead of formulating just one “perfect” search query, the model generates an entire search spectrum of synonyms, paraphrases, sub-terms and context combinations. The aim is to open up the candidate space as broadly as possible without losing semantic precision.
An important aspect of this phase is the balance between exploration and exploitation: the model tries out new semantic paths (exploration), but also falls back on proven source patterns (exploitation). In this way, new insights are generated without losing stability. In addition, the model optimizes the queries syntactically – search queries are reduced to short, search engine-compatible phrases (typically 1-6 tokens) and often enriched with entities or identifiers. This increases the probability that subsequent retrieval systems (e.g. Google, Bing, internal APIs) will deliver high-quality hits.
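The query expansion can be illustrated with a small sketch: one intent is fanned out into several short, search-compatible variants (exploration) alongside the direct query (exploitation). The synonym table and the token limit are assumptions for this example.

```python
# Toy fan-out: one intent expanded into short, search-compatible query variants.
SYNONYMS = {
    "insurer": ["insurance company", "underwriter"],
    "cyber policy": ["cyber insurance", "cyber cover"],
}

def fan_out(intent_terms: list[str], max_queries: int = 6) -> list[str]:
    queries = [" ".join(intent_terms)]                     # exploitation: direct query
    for term in intent_terms:                              # exploration: synonym variants
        for alternative in SYNONYMS.get(term, []):
            variant = [alternative if t == term else t for t in intent_terms]
            queries.append(" ".join(variant))
    # keep queries short (roughly 1-6 tokens) and deduplicated, as described above
    queries = [q for q in dict.fromkeys(queries) if len(q.split()) <= 6]
    return queries[:max_queries]

print(fan_out(["cyber policy", "insurer", "germany"]))
```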
For brands, this phase means:
- Findability ≠ Recommendation. Phase 2 determines whether your content will even make it into the downstream review as a candidate – not whether it will be cited.
- “Retrievability” is the lever now. Only content with clear semantic signatures (precise descriptions, consistent @id/Q-IDs, clean sameAs links, lean search engine-compatible query terms) will reliably appear in the candidate space of search systems.
- The actual decision is made in phase 5 – evidence weighting. This is where the evidence architecture (source density, consistency of relations, agreement across multiple sources) decides whether a retrievable document is selected by the machine as an authoritative source and recommended/cited.
Phase 3: Evidence Extraction
Once the fan-out queries have been generated and prioritized, the model begins the actual search. It now enters the operational search space and retrieves the generated search queries via external interfaces – such as search engines, specialized databases or API-based knowledge sources. The result is a large number of hits: URLs, snippets and metadata. Phase 3 thus forms the link between search and understanding – this is where hits are turned into structured, usable material for the first time.
The process is similar to an automated literature search: each source is checked for semantic relevance, topicality and structural connectivity. Documents that already contain structured data formats – such as JSON-LD, RDFa or microdata – are particularly valuable. These formats provide explicit references to entities, relationships and document structures and thus significantly increase the evidential value of the source.
The model extracts passages and structural elements from the HTML and metadata context that correspond to the original intent vector. It prioritizes those text passages in which semantic agreement, entity density and document potential coincide. This results in an evidence set consisting of individual text fragments, tables, meta fields and structured statements.
The transition to phase 4 is initiated as soon as enough raw material has been collected to start the entity and relationship analysis. If relevance or structure is missing, the search space is reopened via feedback loops (return to phase 2).
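A simplified extraction step might look like this: JSON-LD blocks are pulled out of fetched HTML and parsed into structured evidence. The HTML snippet is invented, and real pipelines additionally parse RDFa/microdata and rank passages against the intent vector.

```python
import json
import re

# Invented HTML snippet standing in for a fetched candidate document.
html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "@id": "https://www.example-brand.com/#organization",
 "name": "Example Brand AG",
 "sameAs": ["https://www.wikidata.org/wiki/Q00000000"]}
</script>
</head><body><p>Example Brand AG publishes its quarterly report ...</p></body></html>
"""

def extract_json_ld(document: str) -> list[dict]:
    """Pull all JSON-LD blocks out of an HTML document and parse them."""
    blocks = re.findall(r'<script type="application/ld\+json">(.*?)</script>',
                        document, flags=re.DOTALL)
    return [json.loads(block) for block in blocks]

evidence = extract_json_ld(html)
print(evidence[0]["@id"], evidence[0]["sameAs"])
```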
For brands, this is the moment when machine visibility becomes measurable – but not yet evaluated. A document can perform well in this phase, but fall out again in phase 5 (Evidence Weighting) if its structure or document architecture is weak. Fan-out provides visibility in the candidate space, evidence extraction provides context, and only evidence weighting determines authority.
Phase 4: Entity Linking (@id / sameAs / Q-IDs verification)
Once the relevant content has been extracted, the actual semantic work begins: the model attempts to understand who or what actually appears in the texts found. Here, terms, names and concepts are converted into machine-readable entities – and linked with each other.
The model identifies entities on two levels:
- Textual level: people, organizations, places, products, time periods or events in continuous text.
- Structural level: Entities that are already defined via machine-readable anchors – such as @id, sameAs, identifier, Q-IDs or Schema.org types.
The two layers are compared and merged. For example, if “Allianz” appears in the continuous text and an entity with Q487292 (Wikidata ID of Allianz SE) is present in the JSON-LD layer at the same time, the model recognizes a unique assignment. If this anchor is missing, probabilistic matching is performed on the basis of semantic proximity and context match.
The next step is to create a temporary knowledge graph: nodes represent entities, edges their relationships – e.g. “Allianz SE ↔ operates ↔ investor relations portal ↔ publishes ↔ quarterly report”. The model analyzes the graph structure for consistency, redundancy and transitivity: if A is linked to B and B to C, A → C can often be inferred – this transitive reasoning via entity paths is crucial for verifying evidence across document boundaries.
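A reduced sketch of this verification step: a text mention is checked against the structured anchors of the JSON-LD layer and, if a sameAs link to Wikidata exists, resolved to a Q-ID. The source record is a placeholder; only the Allianz SE / Q487292 pairing is taken from the example above.

```python
from typing import Optional

# Structured entities extracted from the JSON-LD layer of the candidate documents.
structured_entities = [
    {"@id": "https://www.example.com/#org", "name": "Allianz SE",
     "sameAs": ["https://www.wikidata.org/wiki/Q487292"]},
]

def link_mention(mention: str, candidates: list[dict]) -> Optional[str]:
    """Resolve a text mention to a Q-ID if a structured sameAs anchor exists."""
    for entity in candidates:
        if mention.lower() in entity["name"].lower():
            for anchor in entity.get("sameAs", []):
                if "wikidata.org" in anchor:
                    return anchor.rsplit("/", 1)[-1]       # unambiguous assignment
    return None                                            # fall back to probabilistic matching

print(link_mention("Allianz", structured_entities))        # -> Q487292
print(link_mention("Zurich", structured_entities))         # -> None (no structured anchor)
```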
This phase is of high strategic relevance for brands: This is where it is decided whether they appear in the machine’s knowledge graph as a clearly identifiable entity or are lost as a generic term. Only entities with stable anchors (@id, sameAs, Q-IDs, LEIs, ISINs, etc.) are systematically recognized and evaluated as trustworthy references. Machines do not see logos or slogans – they recognize relationship networks.
Phase 5: Evidence Weighting (document architecture + source diversity)
Once the entities have been recognized and linked, the central question is: Which of this information is really reliable? In this phase, the model evaluates the strength of evidence of all the sources and data points collected. It sorts, weights and checks how consistent, how up-to-date and how structurally sound the evidence is.
Each source is evaluated not only in terms of content, but also architecturally. Structural integrity – i.e. the existence of clear data relationships, recurring identifiers and referenced links – is interpreted as a sign of trustworthiness. The model constructs an evidence graph from these relationships, in which nodes represent entities and edges represent document relationships.
This is followed by the actual weighting: sources with many stable anchors or facts that have been confirmed several times are given a high weighting. Contradictory or isolated statements are devalued. It is not the number of mentions that counts, but their coherence across different contexts – i.e. structural plausibility.
The result of this phase is an evidence matrix: a multidimensional assessment of the quality, reliability and density of the collected data. This matrix is the basis for the subsequent reasoning: only what is considered reliable here may become part of the answer.
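How such an evidence matrix could be assembled is sketched below. The boost factor of 2.5 reflects the 2-3x range described above, but the scores, the confirmation bonus and the example sources are assumptions for illustration.

```python
# Toy evidence matrix: stable anchors and cross-source confirmation raise the weight.
sources = [
    {"url": "https://brand-a.example/report", "has_stable_anchors": True,
     "confirmations": 3, "base_relevance": 0.7},
    {"url": "https://random-blog.example/post", "has_stable_anchors": False,
     "confirmations": 0, "base_relevance": 0.8},
]

def evidence_weight(source: dict, anchor_boost: float = 2.5) -> float:
    weight = source["base_relevance"]
    if source["has_stable_anchors"]:
        weight *= anchor_boost                         # structural integrity as a trust signal
    weight *= 1.0 + 0.2 * source["confirmations"]      # facts confirmed across sources
    return round(weight, 2)

evidence_matrix = {source["url"]: evidence_weight(source) for source in sources}
print(evidence_matrix)   # the structured, confirmed source outranks the merely relevant one
```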
For brands, this means that what could be described as machine trust is created. If you maintain your data consistently, update it regularly and link it across multiple platforms, you create the basis for being perceived as a stable truth. In this logic, visibility is no longer a volume phenomenon, but a product of trust.
Phase 5a: Response Planning (Evidence-guided Outline)
Once the evidence matrix has been created and the evidential value of each source has been calculated, the model plans its response structure. This phase is the link between evaluation and justification: A linguistic-logical framework is created here from numbers and weights – an outline that determines what is mentioned first, what is mentioned next and what is not mentioned at all.
The model creates an evidence-based argumentation structure. Each statement, each claim is prioritized according to its strength of evidence: High evidence values lead to central statements, weak evidence is integrated as marginal remarks or discarded altogether. Relationships from the evidence graph – such as isPartOf, about, mentions, hasPart – serve as hierarchical points of reference. This creates a semantic structure that determines the logical flow of the subsequent answer.
The model also checks whether the line of reasoning is complete and free of contradictions. If evidence for an important section is missing, the system can access phase 5 again via a feedback loop and readjust the weighting or consider new sources. This mechanism ensures that the answer is not only correct, but also evidence-proportional – i.e. in line with the evidence.
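A toy outline planner illustrates the principle of evidence-proportional structuring: claims are sorted by evidence weight, weak claims become marginal notes, and very weak ones are dropped. Thresholds and example claims are invented.

```python
# Toy evidence-guided outline: strong claims become main points, weak ones side notes.
claims = [
    ("Brand A offers dedicated cyber cover",        2.8),
    ("Brand A sponsors a regional sports club",     0.9),
    ("Brand A reportedly plans a new product line", 0.3),
]

def plan_outline(weighted_claims, main_threshold=2.0, drop_threshold=0.5):
    outline = {"main_points": [], "side_notes": [], "dropped": []}
    for claim, weight in sorted(weighted_claims, key=lambda c: c[1], reverse=True):
        if weight >= main_threshold:
            outline["main_points"].append(claim)
        elif weight >= drop_threshold:
            outline["side_notes"].append(claim)
        else:
            outline["dropped"].append(claim)           # not evidence-proportional
    return outline

print(plan_outline(claims))
```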
For brands, this phase means that it is decided here whether they form the central theme of the machine’s argumentation or only appear in passing. Those who provide consistent, logically linked data are perceived by the model as a structuring element of the answer – not as a footnote.
Phase 6: Reasoning (Fusion L+O + Multi-hop)
In this phase, the model combines its two knowledge spaces – internal memory (L) and external evidence (O) – into a coherent rationale. This is where the actual layer of truth is created, in which knowledge and evidence are brought together, checked and weighted.
The model checks which external evidence confirms, expands or contradicts its internal assumptions. To do this, it carries out a multi-stage, multi-hop reasoning process: information is linked along the entity paths of the evidence graph. If source A contains an assertion about an entity and source B confirms the same entity via a secondary path, this link is strengthened – structured coherence is created.
Reasoning does not follow a linear process, but rather a chain of intermediate steps (chain of thought). Each step is evaluated individually: a Process Reward Model (PRM) estimates the quality and logic of each intermediate step. In this way, the system recognizes faulty or redundant inference chains at an early stage and rejects them. At the same time, an internal self-verification runs, in which the model checks its own conclusions against the evidence matrix – a machine-based form of logical self-checking.
The result is a consolidated, probabilistically based response structure in which every statement has a traceable evidence trail. The model now also decides whether a source is treated as a mere mention or as a recommendation – this is where relevance is separated from authority. A document that is frequently cited, structurally well linked and confirmed several times moves to the top of the priority list; all others remain in the background.
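The promotion from mention to recommendation can be illustrated with a minimal multi-hop check: a claim is upgraded only if a second source confirms the same entity via an independent path and the evidence weight is high enough. The graph, the promotion rule and the threshold are illustrative assumptions.

```python
# Toy multi-hop check: a claim is only promoted if a second source confirms the
# same entity via an independent path in the evidence graph.
edges = {
    ("Brand A", "offers"): "cyber cover",
    ("cyber cover", "confirmed_by"): "industry report",
    ("industry report", "about"): "Brand A",            # second hop closes the loop
}

def multi_hop_confirmed(entity: str) -> bool:
    claim_object = edges.get((entity, "offers"))
    if claim_object is None:
        return False
    confirming_source = edges.get((claim_object, "confirmed_by"))
    return confirming_source is not None and edges.get((confirming_source, "about")) == entity

def classify(entity: str, evidence_weight: float) -> str:
    promoted = multi_hop_confirmed(entity) and evidence_weight >= 2.0
    return "RECOMMENDATION" if promoted else "MENTION"

print(classify("Brand A", 2.8))   # -> RECOMMENDATION
print(classify("Brand B", 2.8))   # -> MENTION (no confirming entity path)
```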
For brands, this phase is the most critical in the entire process: this is where it is decided whether they are cited or recommended. It is not their visibility in retrieval (phase 2-3), but their semantic and structural quality in the evidence evaluation that determines whether the machine classifies them as a reliable source. AI visibility is therefore not a result of findability, but of evidence value and coherence in the reasoning process.
Phase 7: Final Response Construction
In this final phase, the previously synthesized evidence is translated into language. The model leaves the logical space of reasoning and enters the realm of communication. The probabilistic network of evidence, weightings and conclusions becomes a linear, readable text – the linguistic form of the machine truth layer.
The model uses the outline created in phase 5a and the weighting calculated in phase 6 to construct the answer linguistically. It organizes the arguments according to the strength of their evidence, integrates references to sources and decides which evidence is explicitly mentioned or implicitly used. Language thus becomes a reasoning surface – each word is the result of a structured decision on evidence and relevance.
At the same time, a multi-stage optimization of style and coherence takes place: the model selects tonality, rhythm and density according to the context of the query. A scientific question is answered precisely and formally, a strategic or brand-related question is given more narrative depth. A constant coherence check runs in the background: overlaps, logical leaps or stylistic breaks are corrected. The result is an answer that is not only correct, but also credible, fluid and context-sensitive.
Another aspect is the transparency of origin: modern LLMs are increasingly integrating citation or source markers – such as url, name or publisher from Schema.org – to make it clear which data points were used for the justification. This gives the answer not only semantic, but also epistemic integrity.
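A minimal sketch of such a citation marker, built from the Schema.org fields named above (url, name, publisher). The source record is a placeholder.

```python
# Toy citation marker built from Schema.org fields (url, name, publisher).
source = {
    "name": "Quarterly Report Q2",
    "url": "https://www.example-brand.com/ir/q2-report",
    "publisher": {"@type": "Organization", "name": "Example Brand AG"},
}

def build_citation(src: dict) -> str:
    publisher = src.get("publisher", {}).get("name", "unknown publisher")
    return f'{src["name"]}, {publisher}: {src["url"]}'

print(build_citation(source))
# -> Quarterly Report Q2, Example Brand AG: https://www.example-brand.com/ir/q2-report
```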
For brands, this phase decides how they are quoted, paraphrased or contextualized by machines. The type and clarity with which their structured data is linked determines whether they are cited as a source or ignored as background noise.
Conclusion: From visibility to substance
The next level of digital communication is not a question of volume, but of readability in the system. If you want to understand why your brand appears – or does not appear – in ChatGPT, Perplexity or Bing, you have to look into the mechanisms of truth generation in language models.
These models do not decide who is right, but what is provable. They prefer structures that are based on stable entities (Wikidata Q-IDs, Schema.org), consistent semantics (sameAs, @id) and verifiable links (isPartOf, about, mentions). The dual-path model shows that structured data works in both modes – transiently in search mode (Case L+O) and persistently through integration in future training cycles (Case L). Those who build up machine-readable evidence today secure both immediate visibility and long-term reputation.
What the GEO industry sells as fan-out optimization falls short. Retrieval is only the beginning. The critical phase is evidence weighting (phase 5): This is where Mention separates from Recommendation. This is where it is decided whether a source is vaguely referenced or authoritatively cited. And this is precisely where Q-IDs, document architecture and cross-document entity coreference come into play – elements that no monitoring tool can measure because they operate in the reasoning layer, not the retrieval layer.
AI visibility is therefore no longer a tactic – it is scientifically based brand management. Brands that communicate in a structured way become part of the knowledge space. Brands that do not do so remain anecdotes at the edge of a system that no longer reads stories but reconstructs contexts. And if you only measure signals, you remain an observer of a system that you cannot influence.
Link tips
Scientifically based brand management through AI Visibility
SEO brand management: Who will manage brands in the future? The role of SEO & AI
Brand management through AI-supported SEO – understand the change
Machine-readable or irrelevant – why companies need to rethink now
With entities to machine-readable structures for AI Visibility
Structure beats ranking: architectural principles for AI visibility beyond SEO
Sources
The phases and individual steps described in this article were compared with current RAG research, LLM reasoning literature (Synthesis AI 2025, Wei et al. 2022) and knowledge graph integration studies (Frontiers 2025, Nature 2025). The phase structure corresponds to scientific standards, but extends these to include explicit differentiation between parametric (Case L) and hybrid mode (Case L+O) as well as detailed evidence weighting mechanisms.
- Lewis et al. 2020 (RAG Foundation): https://arxiv.org/abs/2005.11401
- Wei et al. 2022 (Chain-of-Thought Reasoning): https://arxiv.org/abs/2201.11903
- Synthesis AI 2025 (Reasoning Models): https://synthesis.ai/2025/02/25/
- arXiv RAG Survey 2024: https://arxiv.org/abs/2410.12837
- Frontiers 2025 (LLM-KG Fusion): https://www.frontiersin.org/
- Nature KG Construction 2025: https://www.nature.com/articles/s41524-025-01540-6




