The Ghost Citation Problem

When AI answers a question using your content, it often cites you with a link to the source. What it doesn’t do, 62% of the time, is say your name. The link is there. The brand mention is not. This is what I like to call a ghost citation: The AI uses your content but does not mention you in the response.

This week, I’m sharing:

  • Why citations and mentions are two different outcomes that require different strategies.
  • Which LLMs name brands and which treat them as anonymous sources.
  • The question format and content type that generate 30x more brand mentions.

Note from Kevin: I’m a big fan of HubSpot’s Marketing Against the Grain. I had Kieran, one of the co-hosts, on my Tech Bound podcast back in 2023. Now, they’ve launched a newsletter with smart reviews, new ideas, and practical lessons on what’s currently working. So, I thought I’d give a friendly shout out: Check it out.

This analysis draws on 3,981 domain appearances across 115 disciplines, 14 countries, and four AI search engines (ChatGPT, Google AI Overviews, Gemini, AI Mode), using data from the Semrush AI Toolkit. Each appearance is marked as “cited” (source link present) and/or “mentioned” (brand name appears in the text of the reply). The gap between those two states is the ghost citation problem.

1. 62% of the Time, LLMs Cite You Without Naming Your Brand

Most brands think that being cited means being seen. The data says otherwise.

Photo Credit: Kevin Indig

74.9% of domain appearances were cited, and 38.3% were mentioned. 61.7% are ghost citations: the site receives a source link but zero name recognition in the response text.

Only 13.2% of appearances earn both a citation and a mention. The rest are either cited but not mentioned, or mentioned but not cited.
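
As a rough sanity check, the three percentages above fit together if you assume they all share one denominator (every domain appearance in the data set). That shared denominator is an assumption, not something the data source states, and the variable names are made up for illustration.

```python
from decimal import Decimal

# Percentages quoted in the text. ASSUMPTION: all three are shares of
# the same base, i.e. every domain appearance in the data set.
cited = Decimal("74.9")      # appearances with a source link
mentioned = Decimal("38.3")  # appearances with the brand named in the text
both = Decimal("13.2")       # appearances with a link AND a name

cited_only = cited - both          # link but no name: the ghost citation
mentioned_only = mentioned - both  # name but no link
total = cited_only + mentioned_only + both

print(cited_only)      # 61.7
print(mentioned_only)  # 25.1
print(total)           # 100.0
```

Under that assumption the three groups add up to 100%, which is consistent with every appearance being cited, mentioned, or both.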

2. Every LLM Shows Different Behavior

The four AI engines handle citations and mentions in very different ways:

  • Gemini names brands in 83.7% of appearances, but only generates a citation link 21.4% of the time. It behaves like a conversational interface for brand information.
  • ChatGPT is the opposite: It cites 87.0% of the time but only mentions brands in 20.7% of responses, behaving like an academic paper with footnotes.
  • Google AI Overviews (AIOs) sit in between but lean toward citations.
  • Google’s AI Mode produces about 17% more brand mentions than ChatGPT in its results, but it still behaves more like an academic paper than its sibling Gemini.

For brands, this means that visibility in Gemini and visibility in ChatGPT are not the same thing. (This data set showed clear evidence that there is little overlap between ChatGPT’s citations/mentions and Gemini’s citations/mentions for the same information.) Optimizing for one does not benefit the other. There is no single “AI visibility metric.” There are at least four different behavioral systems working in parallel.

Photo Credit: Kevin Indig

3. Strong Brands Get Named in the Text

A clear pattern emerges among domains that appear three or more times: Content aggregators and academic sources are cited repeatedly but almost never mentioned.

  • Medium.com was cited 16 times for the same prompts across three different engines and was mentioned zero times.
  • Wikipedia.org was cited 27 times and mentioned in only two answers, both times for the same conversational question (“What is the most dangerous creature in the world?”).
  • Wired.com, sciencedirect.com, harvard.edu: same pattern.

Consumer brands with strong public recognition are mentioned in nearly 100% of outputs. AI doesn’t see the need to cite them. Instead, it refers to consumer brands directly. It knows that information about brands comes from somewhere, but it doesn’t feel the need to tell users explicitly. For publishers whose value proposition is informational authority, this is a structural problem.

*A mention rate of more than 100% means that the brand is named in the response text even when it is not cited as a source link: the engine refers to the brand by name without linking to it. To read values greater than 100% in this data set, consider that 10 mentions and 10 citations = 100%. If a brand is mentioned 12 times and cited 10 times, that’s 120%.
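
The footnote's arithmetic can be sketched as a small helper. The function name `mention_rate` is hypothetical, and dividing mentions by citations is the reading implied by the 12-versus-10 example rather than a documented formula.

```python
def mention_rate(mentions: int, citations: int) -> float:
    """Mentions as a percentage of citations (assumed definition)."""
    if citations <= 0:
        raise ValueError("mention rate is undefined without citations")
    return 100 * mentions / citations


print(mention_rate(10, 10))  # 100.0
print(mention_rate(12, 10))  # 120.0
print(mention_rate(0, 16))   # 0.0 (the Medium.com pattern: cited, never named)
```

Any value above 100 signals that the engine named the brand more often than it linked to it.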

Photo Credit: Kevin Indig

4. LLMs Disagree on the Same Brand 22% of the Time

454+ domain combinations appeared across multiple engines. In 22% of those cases (100 total), the LLMs disagreed on whether the brand should be mentioned:

  • Instagram.com was named by ChatGPT and Gemini but only cited (not named) by Google.
  • Facebook.com was named by Gemini in 3 of 3 appearances.
  • Google AI cited Facebook in 9 of 9 appearances but named it only once.

Photo Credit: Kevin Indig

Same brand, same question, but different engines and different results. This is important for measurement: A brand can appear “visible” in the data of one engine while being completely anonymous in another. Aggregating AI visibility metrics hides this difference.
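
One way to surface this kind of disagreement in your own tracking data is sketched below. The rows are toy data: only the Instagram pattern mirrors the example above, `brand-b.example` is invented, and treating "at least one engine names the brand while another does not" as a disagreement is an assumption about how the 22% was counted.

```python
from collections import defaultdict

# Toy rows: (domain, engine, brand_named_in_text). Hypothetical data.
rows = [
    ("instagram.com", "ChatGPT", True),
    ("instagram.com", "Gemini", True),
    ("instagram.com", "Google", False),   # cited, but not named
    ("brand-b.example", "ChatGPT", True),
    ("brand-b.example", "Gemini", True),
]


def disagreeing_domains(rows):
    """Return domains that some engines name and others do not."""
    outcomes = defaultdict(set)
    for domain, _engine, named in rows:
        outcomes[domain].add(named)
    return sorted(d for d, seen in outcomes.items() if len(seen) > 1)


print(disagreeing_domains(rows))  # ['instagram.com']

# The share quoted in the text: 100 disagreements out of 454 combinations.
disagreement_share = round(100 * 100 / 454, 1)
print(disagreement_share)  # 22.0
```

Reporting this list per engine, instead of one blended score, keeps the difference visible.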

5. Brand Mention Rates Vary by Geography

Controlling for the LLM, the country-level differences in mention rates are significant:

  • India and Sweden show the highest mention rates (50%), suggesting more conversational or recommendation-style query patterns in those markets.
  • Italy, Brazil, and the Netherlands show the lowest mention rates (18-22%), with the highest citation rates (82-94%).
  • The UK and Canada are in the middle but above the global average.

*Note: the dataset uses localized prompts validated by Semrush, so language is not a confounding factor.

Photo Credit: Kevin Indig

Citation and Mention Are Not the Same, and Need Different Approaches

From this analysis, four takeaways stood out to me about brands and their content strategies:

1. A citation means the AI draws from your content. A mention means it names you. We don’t yet know enough about the effects of being mentioned versus being cited, but we can say for sure that there is a system that determines when you are cited versus mentioned.

2. Your strategy must be specific to the LLM. A Gemini-first strategy is different from a ChatGPT-first strategy. Any report of AI visibility that is aggregated across LLMs is misleading.

3. Comparative content earns brand mentions. Informational content feeds the machine anonymously. If the goal is brand mentions, not just citations, focus your content strategy on evaluation, comparison, and recommendation.

4. Prompt format matters. Brands must map out not only which topics they want to appear for, but especially which phrasing patterns generate mentions versus ghost citations. Short conversational questions and long structured questions behave like separate products.

Methodology

Data source: Semrush AI Toolkit: 3,981 domain appearances across 115 disciplines, 14 countries, and four AI search engines (ChatGPT, Google AI Overviews, Gemini, Google AI Mode).

Every row in the dataset represents a domain that appeared in an AI response. Each appearance is marked as “cited” (the site appears as a source link) and/or “mentioned” (the brand name appears in the text of the reply). The gap between those two states is what this analysis calls a ghost citation: the AI used your content but did not mention your name.
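
The row structure described above can be sketched roughly as follows. This is an illustration under assumptions, not the Semrush schema: the `Appearance` fields, the state labels, and the sample rows are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    """One domain appearing in one AI response (hypothetical schema)."""
    domain: str
    engine: str
    cited: bool      # a source link to the domain is present
    mentioned: bool  # the brand name appears in the response text


def classify(a: Appearance) -> str:
    if a.cited and a.mentioned:
        return "cited_and_mentioned"
    if a.cited:
        return "ghost_citation"
    if a.mentioned:
        return "mentioned_only"
    return "neither"


def state_shares_by_engine(appearances):
    """Percentage of each state, computed separately per engine."""
    counts = defaultdict(Counter)
    for a in appearances:
        counts[a.engine][classify(a)] += 1
    return {
        engine: {state: 100 * n / sum(c.values()) for state, n in c.items()}
        for engine, c in counts.items()
    }


# Invented sample rows, for illustration only.
sample = [
    Appearance("a.example", "ChatGPT", cited=True, mentioned=False),
    Appearance("b.example", "ChatGPT", cited=True, mentioned=False),
    Appearance("c.example", "ChatGPT", cited=True, mentioned=False),
    Appearance("d.example", "ChatGPT", cited=True, mentioned=True),
    Appearance("a.example", "Gemini", cited=False, mentioned=True),
    Appearance("d.example", "Gemini", cited=True, mentioned=True),
]

shares = state_shares_by_engine(sample)
print(shares["ChatGPT"]["ghost_citation"])  # 75.0
print(shares["Gemini"]["mentioned_only"])   # 50.0
```

Keeping the shares split by engine follows the point made earlier that aggregated visibility numbers hide engine-level differences.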


Featured Image: Roman Samborskii/Shutterstock; Paulo Bobita/Search Engine Journal


