How AI Answer Engines Choose Their Sources

AI answer engines choose sources in three steps: retrieval, evaluation, and citation. Find which step is costing you AI citations.

By David Jubé · 13 min read

An AI answer engine chooses a source in three steps:

  1. Retrieval. It assembles a candidate set of pages for the query, from its index, a live search call, or a connected web tool.
  2. Evaluation. It ranks and filters those candidates for trust and relevance before it writes a word.
  3. Citation. It lifts a specific passage from the survivors and attributes it inside the answer.

A page that fails to get cited usually clears the first step and breaks at the third. That is why “we rank but get no AI mentions” is the most common complaint in this category.

The good news: once you can name the three steps, you can find the one that is leaking on your page and fix it directly instead of guessing.

Key takeaways

  • An AI answer engine chooses a source in three steps: Retrieval, then Evaluation, then Citation, and each step has its own job, levers, and failure mode.
  • The model is sequential, so you cannot win the Citation step if you never survive Retrieval, and clearing Retrieval guarantees nothing about Citation.
  • A page can rank well on Google and still go uncited, because the AI engine runs its own pipeline and applies its own citation test.
  • “We rank but never get cited by AI” is almost always a Citation-step problem: the page is good, but no single passage is quotable on its own.
  • To find the leaking step, ask three questions in order: is the page retrievable, does it evaluate as trustworthy, and is the answer liftable? Stop at the first no.

The three-step model in one diagram

Hold this picture: a query comes in, the engine casts a wide net (Retrieval), throws most of the catch back (Evaluation), and quotes a sentence or two from what is left (Citation).

Each step has its own job, its own levers, and its own failure mode.

| Step | What the engine is doing | The levers you control | The failure if you lose here |
| --- | --- | --- | --- |
| 1. Retrieval | Building the candidate set of sources | Crawlability, indexation, freshness, presence in the index it leans on | Invisible: you are never in the running |
| 2. Evaluation | Scoring candidates for trust and relevance | E-E-A-T, entity clarity, corroboration across sources, topical authority | Considered but cut before the answer is written |
| 3. Citation | Lifting and attributing a specific passage | Answer-first structure, self-contained claims, schema, clean formatting | Read but not quotable: the engine paraphrases someone else |

The model is sequential. You cannot win step three if you never survive step one, and clearing step one guarantees nothing about step three.

This is the whole reason a page can rank well on Google and still go uncited: ranking proves you passed a search engine’s version of retrieval and evaluation, but the AI engine runs its own pipeline and applies its own citation test.

Keep the diagram in mind, because every fix in this cluster maps back to one of these three steps, or occasionally to two of them.

If you want the labels behind all of this (AEO, GEO, LLMO), see the acronyms this mechanism resolves. If you want the category from the top, here is what answer engine optimization actually is.

Step 1, Retrieval: how the candidate set is built

Retrieval is the engine deciding which pages even get to compete. There is no single method.

ChatGPT, when it browses, rewrites your question into several targeted searches and pulls the results. Perplexity runs a live search on every query. Google AI Overviews leans heavily on its existing index, the same one that powers its blue-link search results.

Different plumbing, same purpose: produce a shortlist of candidate sources.

What this means for you is blunt. If a page is not crawlable, not indexed, or not surfaced by the search layer the engine uses, it is not retrievable, and nothing downstream can save it.

This is the step where classic technical SEO does its quiet work. Crawl access, clean site structure, indexation, and sitemap hygiene let a search engine find and store your page, and the same work lets an answer engine retrieve it. It is the same groundwork laid out in the first ninety days of SEO for a startup.

There is no AEO shortcut around it, which is exactly why retrieval still depends on solid search fundamentals. The hands-on version of this lives in the crawlability work that powers retrieval.
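One crawlability check is easy to script. The sketch below uses Python's standard `urllib.robotparser` to test whether a robots.txt file would block particular crawlers from a URL. The sample robots.txt, the example URL, and the list of user-agent names are assumptions for illustration; swap in your own file and the agents you care about. It only checks robots.txt rules, not indexation or anything else on the Retrieval step.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def blocked_agents(robots_txt, url, agents):
    """Return the user agents that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Assumed agent names; check each engine's documentation for its crawlers.
agents = ["GPTBot", "PerplexityBot", "Googlebot"]
print(blocked_agents(SAMPLE_ROBOTS_TXT, "https://example.com/blog/post", agents))
# prints ['GPTBot']
```

If the list comes back non-empty for a page you want cited, that is a Retrieval problem to fix before anything else.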

Freshness belongs to this step too. Engines retrieve fresher pages more readily, especially for topics where the facts move.

A page last touched two years ago is a weaker retrieval candidate than the same content updated last month, even before any evaluation happens. That is the case for refreshing content before staleness costs you the candidate slot, the same kind of refresh that wins back rankings you have lost.

The practical takeaway: retrievability is a precondition, not a strategy. Get it right and you have earned a seat at the table. You have not yet earned a citation.

Step 2, Evaluation: how trust and relevance are scored

Once the engine has a candidate set, it filters. Evaluation is the step where it asks, of each retrieved page, “should I trust this, and does it actually answer the query?”

This is where the candidate set thins out fast. The engine is composing a single synthesized answer it will stand behind, so it is conservative about which sources it leans on.

The signals it weighs are the ones the search world has tracked for years, sharpened for the stakes of a one-answer response.

Google’s own helpful, people-first content guidance names experience, expertise, authoritativeness, and trustworthiness (E-E-A-T) as the qualities that separate content worth surfacing from content that merely exists.

Practically, that translates to recognized domain authority, clear authorship and attribution, and a track record on the topic. A page from a site that has demonstrably covered the subject for years evaluates better than an anonymous post on an unknown domain. Much of that domain authority is earned the slow way, through the backlinks a startup can earn with no budget.

Two evaluation levers get overlooked.

The first is entity clarity: whether the engine can tell who you are, what you are talking about, and how those things relate. AI systems do not read prose the way you do. They extract entities and the connections between them, so a page that names its subject consistently and resolves ambiguity scores better than one that is vague about its own topic.

The second is corroboration. Engines raise their confidence when independent sources agree, and they hedge or down-rank a claim that stands alone or conflicts with the consensus.

Ahrefs’ study on why ChatGPT cites a page found that the factors correlating with citation look a lot like the factors that have always signaled a trustworthy page: authority, relevance, and content that holds up against multiple independent sources.

Evaluation is where being genuinely good at your subject pays off, because the engine is, in effect, fact-checking you against everyone else it retrieved.

Step 3, Citation: why only certain passages get lifted

A page can be retrieved and pass evaluation and still not get cited, because citation operates on passages, not pages.

The engine has decided your page is trustworthy and relevant. Now it needs a specific sentence or two it can lift and attribute to support a claim in its answer.

If your answer to the query is buried three paragraphs down, hedged with qualifiers, or split across the page so no single chunk stands alone, the engine often paraphrases a competitor whose passage was cleaner, even though your page was a fine source.

This is the step founders feel as “we rank but never get cited,” and it is the most fixable.

Three properties make a passage liftable.

It is answer-first: the direct answer leads, before context or backstory.

It is self-contained: a reader (or a model) gets the point from that passage alone, without scrolling for the missing half.

And it is cleanly formatted: a descriptive heading that names the question, a tight paragraph or a list, optionally reinforced with schema so the engine can parse the structure with confidence.
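A rough way to screen passages against these properties is a few mechanical checks. The sketch below is a heuristic, not how any engine scores text: the word lists, the 80-word limit, and the function name are all assumptions made for illustration. It flags a passage that opens by pointing back at earlier text (a sign it is not self-contained), runs too long to lift as one chunk, or leans on hedging words.

```python
import re

# Assumed, illustrative word lists; tune them for your own content.
HEDGE_WORDS = {"maybe", "perhaps", "possibly", "arguably", "somewhat"}
DANGLING_OPENERS = {"this", "that", "it", "these", "those", "they"}


def liftability_issues(passage, max_words=80):
    """Return a list of heuristic reasons a passage may be hard to quote."""
    words = re.findall(r"[A-Za-z']+", passage.lower())
    if not words:
        return ["empty passage"]
    issues = []
    # A passage that opens with a pronoun usually depends on prior context.
    if words[0] in DANGLING_OPENERS:
        issues.append("opens with a reference to earlier text")
    # max_words is an arbitrary threshold, not a documented engine limit.
    if len(words) > max_words:
        issues.append("too long to lift as one chunk")
    hedges = sorted(HEDGE_WORDS.intersection(words))
    if hedges:
        issues.append("hedged: " + ", ".join(hedges))
    return issues


print(liftability_issues("Retrieval is the step where the engine builds its candidate set of pages."))
# prints []
print(liftability_issues("It is perhaps related to that."))
# prints ['opens with a reference to earlier text', 'hedged: perhaps']
```

An empty list does not prove a passage will be cited. It only means none of these simple warning signs fired, so a human read against the real question is still the final test.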

Semrush’s large AI Overviews citation study of millions of keywords found that content traits tied to clarity and structure, not just authority, move the odds of being the cited source.

Google’s own guidance on AI features in Search says the same thing in plainer terms: well-structured, complete, self-contained answers are what get surfaced. This is the heart of writing content that ranks and gets cited by AI rather than one or the other.

The on-page craft that wins this step gets its own treatment in a separate guide.
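For the schema reinforcement mentioned in this section, one common option is schema.org `FAQPage` markup in JSON-LD. The sketch below builds that markup for a single question and answer using Python's standard `json` module. The helper name `faq_jsonld` is hypothetical, and the sample question and answer are borrowed from this article's own FAQ. Treat it as one possible shape: nothing here confirms how much weight any particular engine gives this markup.

```python
import json


def faq_jsonld(question, answer):
    """Build schema.org FAQPage JSON-LD for one question/answer pair."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }
    return json.dumps(data, indent=2)


markup = faq_jsonld(
    "What is the difference between a retrieved source and a cited source?",
    "A retrieved source is any page the engine pulls during its web search; "
    "a cited source is the smaller set it actually attributes in the visible answer.",
)
print(markup)
```

The output is meant to sit inside a `<script type="application/ld+json">` tag on the page, and the answer text should match the visible, answer-first passage it describes.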

The most common failure: ranked but not cited

Put the three steps together and the most common founder complaint resolves itself. “We rank on Google but never get cited by AI” is almost always a Citation-step problem.

Ranking is proof you passed retrieval and evaluation, on Google’s terms at least. It tells you nothing about whether any passage on the page is liftable.

There is a second, quieter version of the same failure that lives one step earlier. A page can rank for its headline term and still lose all the related sub-questions to deeper, more specific pages, because the engine retrieves and evaluates per query, not per page.

Your homepage-level overview ranks; the precise answer to the precise question someone asked an AI engine sits on a page you never built, or in a paragraph you never wrote answer-first.

Either way, the diagnosis is structural, not mysterious. The page is good. The passage is not quotable, or the specific question was never answered in a quotable way.

That is a fixable miss, not a verdict on your authority.

Book a free diagnosis

Retrieval, evaluation, citation: in theory you can run the three checks yourself. In practice it is hard to be objective about your own pages, and the leaking step is rarely the one you assume. If you want a second set of eyes on which step is actually costing you AI citations, we will run your priority pages through all three and tell you where to spend first. That read is a free diagnosis, founder to founder.

How to diagnose which step your page is failing

You do not need a tool to find the leaking step. You need to ask three questions in order, and stop at the first “no.”

Is the page retrievable? Check that it is indexed (search your exact URL in Google), that nothing is blocking crawlers, and that it has been updated recently enough for the topic.

If the page is not in the index, you have a Retrieval problem and nothing else matters yet. Fix that first.

Does the page evaluate as trustworthy and on-topic? Ask whether a stranger could tell, from the page alone, who published it and why they would know.

Check that the subject is named clearly and consistently, that the claims are corroborated by what other credible sources say, and that the content goes deeper than something the model could generate on its own.

If the page is thin, anonymous, or ambiguous about its own topic, you have an Evaluation problem.

Is the answer liftable? Take the exact question a user would ask an engine. Find the passage on your page that answers it. Now copy just that passage.

Does it answer the question completely, on its own, without the surrounding page? If you have to scroll to make sense of it, or the answer is hedged and indirect, you have a Citation problem.

This last test catches the majority of “ranked but not cited” cases, and it is the one you can fix this afternoon.

Run the three questions on any page and the model tells you where the work is. Usually the first two pass and the third is where the citation is being lost.
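The stop-at-the-first-no rule can be written down in a few lines. This is a minimal sketch of the ordering logic only: the function name is hypothetical, and the three yes/no answers still come from your own manual checks.

```python
from typing import Optional

# The three steps, in the order the questions are asked.
STEPS = ("Retrieval", "Evaluation", "Citation")


def first_leaking_step(retrievable: bool, trustworthy: bool, liftable: bool) -> Optional[str]:
    """Return the first step whose question was answered "no", or None if all pass."""
    for step, passed in zip(STEPS, (retrievable, trustworthy, liftable)):
        if not passed:
            # Stop here: later steps do not matter until this one is fixed.
            return step
    return None


print(first_leaking_step(True, True, False))   # prints Citation
print(first_leaking_step(False, True, True))   # prints Retrieval
print(first_leaking_step(True, True, True))    # prints None
```

Returning only the first failure mirrors the sequential model: a page that is not retrievable gets "Retrieval" back even if its passages are perfectly liftable.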

How this maps to the work

The three-step model is the spine of everything else in answer engine optimization, and each piece of downstream work attaches to specific steps.

The per-engine playbook attaches to Retrieval and Evaluation: because ChatGPT, Perplexity, and Google AI Overviews retrieve and score differently, the same passage can win one engine and miss another, so you apply the model engine by engine rather than optimizing for “AI” as one thing.

The on-page craft attaches to Citation: answer-first structure, self-contained claims, and schema are the mechanics that make a passage liftable once the page already retrieves and evaluates well.

And the final question, whether any of it is working when AI answers often send no referrer, becomes its own discipline of triangulating observable proxies against each step.

Hold the diagram and the work organizes itself. Retrieval gets you into the candidate set. Evaluation keeps you in it. Citation gets your words into the answer.

Find the step that is leaking, fix that one, and stop optimizing the steps that were already fine.

Frequently Asked Questions

How does ChatGPT decide which sources to cite?

ChatGPT only cites sources when it runs a live web search; in its default mode it answers from training data with no citations. When browsing, it rewrites your question into several targeted queries, retrieves pages, and cites the ones whose passages most directly match the answer it is forming. Authority, freshness, and tight question-to-answer fit drive selection.

Why does my page rank on Google but never get cited by AI?

AI engines run their own retrieval pipeline and apply their own citation test, so a strong Google ranking does not carry over automatically. A page can rank for its headline term yet lose the related sub-questions to deeper pages. Thin content the model could write itself, buried answers, and unclear entities also get skipped despite strong rankings.

What is the difference between a retrieved source and a cited source?

A retrieved source is any page the engine pulls during its web search; a cited source is the smaller set it actually attributes in the visible answer. Engines retrieve broadly, then cite selectively, keeping only the pages whose specific sentences support a claim. Being retrieved is necessary but not sufficient for a citation.

What signals make a source trustworthy to an AI answer engine?

Engines weigh accuracy, authority, transparency, and consistency over time. Practical signals include recognized domain authority, clear authorship and attribution, structured formatting that isolates facts, schema markup, and recent updates. Agreement across multiple independent sources raises confidence; when sources disagree, the engine hedges or down-ranks the conflicting claim.

Does updating a page make AI more likely to cite it?

Yes, freshness is one of the strongest levers. AI-surfaced URLs tend to run fresher than organic results for the same queries, and studies show recently updated pages earn meaningfully more citations than stale ones. Engines weight recency hardest for topics tied to changing facts, regulations, or new research.

Do all AI answer engines cite the same sources?

No. ChatGPT, Google AI Overviews, and Perplexity use different indexes, retrieval methods, and scoring, so their citation sets overlap only slightly. Analyses of millions of answers find only a handful of domains appear across all three engines, and most AI-cited URLs do not rank in Google’s top ten for the same query.

Two resources behind those answers are worth reading directly: Semrush’s data on the most-cited domains across AI engines shows just how little the three engines overlap, and Frase’s guide to getting cited by AI walks through the answer-first structure that wins the Citation step.

For a plain-language tour of the whole retrieve-then-cite pipeline, The HOTH’s explainer on how answer engines work covers the mechanism end to end.
