Schema Alone Will Not Win AI Citations
Summary
An Ahrefs report analyzed 6 million URLs and found that pages cited by AI were roughly three times more likely to include JSON-LD. The practical question is what this changes for SEO, content quality, and AI search visibility.
There is a certain comfort in technical SEO. The idea is that if we can just format our data correctly, if we can speak the exact language the machines prefer, we can unlock a predictable increase in visibility. For a long time, schema markup has been viewed as that "magic switch," especially as we move into the era of AI-generated search results and LLM citations.
We want to believe that by adding a few lines of JSON-LD, we are making it easier for an AI to understand our value and, consequently, cite us as a source. It feels logical. However, the gap between logic and actual performance is often where the most important lessons live. A recent study from Ahrefs suggests that, for many of us, the effort we spend on schema specifically to move the needle on AI citations might not be yielding the results we expect.
What Ahrefs Found
To understand the impact of schema, Ahrefs started with a massive dataset of 6 million URLs. The initial finding was striking: pages that are already cited by AI are roughly three times more likely to have JSON-LD schema implemented than those that aren't. On the surface, this looks like a smoking gun. It suggests a direct link between the presence of schema and the likelihood of being cited.
But as anyone who spends time with data knows, correlation is not the same as causation. It is entirely possible that sites which invest in schema are also the sites that invest more heavily in high-quality content, better user experiences, and aggressive link building. In this scenario, schema is a symptom of a high-quality site, not the cause of the AI citation.
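To make the confounding argument concrete, here is a minimal Python sketch with invented probabilities (none of these numbers come from Ahrefs). It assumes a hidden "site quality" factor that raises both the chance of having schema and the chance of being cited, while schema itself has zero effect on citations. The function name `schema_rate_ratio` and every parameter value are hypothetical.

```python
def schema_rate_ratio(p_high, p_schema_high, p_schema_low, p_cited_high, p_cited_low):
    """Return P(schema | cited) / P(schema | not cited).

    Assumes citation depends only on site quality, never on schema,
    so any ratio above 1 is produced purely by the confounder.
    """
    p_low = 1.0 - p_high

    # Joint probabilities, summing over the two quality groups.
    p_cited = p_high * p_cited_high + p_low * p_cited_low
    p_schema_and_cited = (p_high * p_schema_high * p_cited_high
                          + p_low * p_schema_low * p_cited_low)
    p_schema_and_uncited = (p_high * p_schema_high * (1.0 - p_cited_high)
                            + p_low * p_schema_low * (1.0 - p_cited_low))

    p_schema_given_cited = p_schema_and_cited / p_cited
    p_schema_given_uncited = p_schema_and_uncited / (1.0 - p_cited)
    return p_schema_given_cited / p_schema_given_uncited


# Invented inputs: 20% of sites are "high quality"; they add schema and
# earn citations far more often than the rest.
ratio = schema_rate_ratio(
    p_high=0.2,
    p_schema_high=0.7, p_schema_low=0.1,
    p_cited_high=0.6, p_cited_low=0.02,
)
print(f"Cited pages are {ratio:.1f}x as likely to have schema")
```

With these made-up inputs the ratio comes out well above 1 even though schema has no causal effect in the model. That is the pattern a controlled experiment is designed to rule out.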
To isolate the variable, Ahrefs conducted a controlled experiment. They tracked 1,885 web pages that added JSON-LD schema and matched each of them against three control pages from different domains. These control pages were chosen because they had similar citation levels but did not add schema. By comparing citations in the 30 days before and the 30 days after the schema was added, they could see if the technical change itself triggered a boost.
The results were surprisingly flat across the board. Using a matched difference-in-differences analysis to account for general platform trends, the data showed:
Google AI Overviews: a decline of 4.6%
Google AI Mode: an increase of 2.4%
ChatGPT: an increase of 2.2%
From a practical standpoint, the gains in AI Mode and ChatGPT were so small that they were indistinguishable from random noise. Essentially, adding schema did not provide a meaningful lift in citations on any of the major AI platforms tested.
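The matched difference-in-differences logic can be sketched in a few lines of Python. This is a simplified illustration, not Ahrefs' actual method or data: the function names and every citation count below are hypothetical, and each treated page is simply compared with the average change of its matched controls.

```python
from statistics import mean


def did_for_match(treated, controls):
    """Difference in differences for one treated page and its controls.

    treated:  (before, after) average daily citations for the page that added schema
    controls: list of (before, after) pairs for its matched control pages
    """
    treated_change = treated[1] - treated[0]
    control_change = mean(after - before for before, after in controls)
    # Subtracting the control change removes the platform-wide trend.
    return treated_change - control_change


def matched_did(matches):
    """Average the per-match estimates across all treated pages."""
    return mean(did_for_match(treated, controls) for treated, controls in matches)


# Hypothetical data: both groups decline, the treated pages slightly faster.
matches = [
    ((260.0, 230.0), [(250.0, 235.0), (270.0, 250.0), (255.0, 240.0)]),
    ((300.0, 290.0), [(310.0, 300.0), (290.0, 285.0), (300.0, 285.0)]),
]
effect = matched_did(matches)
print(f"Estimated effect of adding schema: {effect:+.1f} citations per day")
```

In the first hypothetical match, the treated page falls by 30 daily citations, which looks bad in isolation. Its controls fell by about 17 over the same window, so only the remaining gap can be attributed to the change.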
The AI Overview Decline
The 4.6% dip in Google AI Overviews might look alarming at first glance, but it requires a bit of nuance to interpret correctly. Ahrefs noted that both the "treated" pages (those that added schema) and the "control" pages were already seeing a decline in citations before the experiment even began.
The pages that added schema simply declined slightly faster than the control group. To put this in perspective, the difference amounted to about 12 fewer daily citations per page. In a sample where most pages were receiving hundreds of citations daily, a difference of 12 is marginal.
The researchers didn't conclude that schema actively harms your visibility. It is just as likely that this slight variance was a coincidence. The key takeaway here isn't that you should remove your schema, but rather that it didn't act as a shield against the general volatility of AI Overviews.
What The Report Doesn't Cover
No study is perfect, and it is important to look at the boundaries of this data to avoid oversimplifying the conclusion. The most significant limitation is the starting point: every page in the Ahrefs dataset already had 100 or more AI Overview citations before the schema was added.
This means these pages were already "in the club." They were already being crawled, indexed, and recognized as authoritative enough to be surfaced by the AI. The study tells us that for pages already visible to AI, schema doesn't provide an extra boost. However, it doesn't tell us if schema helps a page get noticed for the first time. It is possible that for a brand new page or a site that has never been cited, schema might still aid in the initial parsing and indexing process.
There are a few other variables that make a clean conclusion difficult:
Concurrent Changes: When a site owner decides to add JSON-LD, they are often in the middle of a larger optimization project. They might be updating the copy or improving the page structure at the same time, making it hard to attribute changes solely to the schema.
Schema Variety: The study pooled all types of schema together. It is possible that specific types of markup (like Product or Review schema) behave differently than general Article or Organization schema.
The Time Window: A 30-day window is relatively short in the world of SEO. Some effects might take longer to manifest.
Interestingly, the report mentions a separate experiment by searchVIU. They tested whether five different AI systems used schema when fetching pages in real time. The result? None of them did. The AI systems extracted only the visible HTML content, completely ignoring JSON-LD, Microdata, and RDFa. While this specifically refers to real-time fetching and not the long-term training or indexing process, it suggests that the "visible" layer of your content is far more important than the hidden technical layer when it comes to immediate AI retrieval.
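To see what "visible layer" versus "hidden layer" means in practice, the following Python sketch splits a page into its visible text and its JSON-LD blocks using only the standard library. It is an illustration of the distinction, not a description of how any specific AI system fetches pages. The class `LayerSplitter`, the function `split_layers`, and the sample page are all hypothetical.

```python
import json
from html.parser import HTMLParser


class LayerSplitter(HTMLParser):
    """Separate visible text from JSON-LD blocks in an HTML document."""

    HIDDEN_TAGS = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._hidden_depth = 0
        self._jsonld_buffer = None  # list of chunks while inside a JSON-LD script
        self.visible_chunks = []
        self.jsonld_blocks = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.HIDDEN_TAGS:
            return
        self._hidden_depth += 1
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._jsonld_buffer = []

    def handle_endtag(self, tag):
        if tag not in self.HIDDEN_TAGS or self._hidden_depth == 0:
            return
        self._hidden_depth -= 1
        if tag == "script" and self._jsonld_buffer is not None:
            self.jsonld_blocks.append("".join(self._jsonld_buffer))
            self._jsonld_buffer = None

    def handle_data(self, data):
        if self._jsonld_buffer is not None:
            self._jsonld_buffer.append(data)
        elif self._hidden_depth == 0 and data.strip():
            self.visible_chunks.append(data.strip())


def split_layers(html):
    """Return (visible_text, parsed_jsonld_objects) for an HTML string."""
    parser = LayerSplitter()
    parser.feed(html)
    parser.close()
    visible_text = " ".join(parser.visible_chunks)
    return visible_text, [json.loads(block) for block in parser.jsonld_blocks]


# Hypothetical page: the SKU exists only in the JSON-LD block.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Acme Widget", "sku": "AW-100"}
</script>
</head><body>
<h1>Acme Widget</h1>
<p>A small widget for testing.</p>
</body></html>
"""

visible, structured = split_layers(SAMPLE_HTML)
print("SKU in visible text:", "AW-100" in visible)
print("SKU in JSON-LD:", structured[0].get("sku"))
```

A reader that only looks at the visible text never sees the SKU in this sample. If a fact matters for retrieval, a check like this suggests it should also appear in the page copy rather than only in markup.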
Why This Matters
For years, the prevailing advice has been to implement every possible piece of schema to "future-proof" your site for AI. This data complicates that narrative. It suggests that while there is a strong correlation between schema and AI citations, the schema itself isn't a lever you can pull to get more citations.
Instead, the correlation likely exists because the most professional, well-maintained sites, the ones that are most likely to be cited, are also the ones most likely to have a complete technical SEO setup, including schema. Schema is a marker of quality, not necessarily a driver of it.
This doesn't mean schema is useless. It still matters for generating rich snippets in traditional search results and helping Google populate its Knowledge Graph. But if your primary goal is to increase the frequency with which an LLM cites your brand, focusing solely on JSON-LD is likely a waste of resources.
Looking Ahead
The path forward is to stop treating technical markup as a substitute for authority. If you are already being cited by AI, adding schema is unlikely to move the needle further. If you aren't being cited, schema might help the machines understand your page, but it won't replace the need for original, high-value content that the AI actually wants to reference.
The real takeaway is a reminder to focus on the "visible" web. If AI systems are ignoring JSON-LD during real-time fetches and relying on the HTML, then the clarity of your writing, the structure of your headings, and the directness of your answers are your most potent tools.
Continue using schema for the benefits it provides to traditional search and overall site health, but don't expect it to be the silver bullet for AI visibility. The machines are reading your content, not just your code.
Practical next steps
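Treat the findings as a checklist for decisions. What deserves attention now: the visible layer of the page, meaning clear writing, well-structured headings, and direct answers. What should be monitored: AI citation counts over a period longer than 30 days, since some effects may take longer to appear. What needs a stronger evidence base: whether schema helps pages that have never been cited, and whether specific schema types behave differently. What can wait: adding JSON-LD purely in the hope of earning more AI citations.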