Do AI chatbots actually read website schema?
It has become common wisdom that adding schema markup to your website dramatically improves your SEO and your visibility to large language models (LLMs). But when NightJarr put this to the test, the results told a very different story.
We tested whether AI chatbots can actually access and use schema markup when they visit websites. In eight tests, it worked only once. But when we put the same information in visible HTML, the chatbots found it 100% of the time.
What is schema markup?
Schema markup is structured data, developed by major search engines, that semantically annotates web page content to enhance machine readability and improve search results. Take a recipe as an example: elements like ingredients, prep time, calories and ratings are human-readable but not easily interpreted by machines. Schema markup encodes these elements as JSON-LD, embedding metadata in the page’s source code that explicitly defines each type of content for search engine crawlers.
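To make this concrete, here is a minimal Python sketch of what JSON-LD schema for a recipe could look like once it is embedded in a page. The property names (prepTime, nutrition, aggregateRating) come from the schema.org Recipe vocabulary. The build_recipe_jsonld helper and all of the recipe values are illustrative assumptions, not the data from our test pages.

```python
import json


def build_recipe_jsonld(name, prep_minutes, calories, rating, review_count):
    """Return a <script> tag holding schema.org Recipe markup as JSON-LD.

    Hypothetical helper for illustration only.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        # schema.org durations use ISO 8601 format, e.g. PT15M for 15 minutes
        "prepTime": f"PT{prep_minutes}M",
        "nutrition": {
            "@type": "NutritionInformation",
            "calories": f"{calories} calories",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Made-up example values, not our test data
snippet = build_recipe_jsonld("Chocolate chip cookies", 15, 150, 4.8, 120)
print(snippet)
```

Note that nothing in this block is displayed to a visitor: a browser does not render the contents of a script tag, so the prep time, calories and rating exist only in the page source.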
The research question
With AI chatbots such as ChatGPT, Claude, Perplexity and Google Gemini becoming more popular, we need to know whether they actually read schema markup when they visit websites, because:
· Many experts claim that AI chatbots use schema markup.
· Website owners spend time and money adding schema to their sites.
· If they can't read schema, there may be better ways to present information.
Our research
Our AI architect Sanushen Govender designed a rigorous test to answer whether AI chatbots can read schema markup. He created a set of test web pages, each with exactly the same recipe information (chocolate chip cookies) presented in a different format:
· Plain HTML with no special formatting (the control test)
· Content with schema markup embedded in the page
· The data as a raw JSON file
· The data in visible HTML tables
· The data in visible definition lists
· A recreation of SEO expert Mark Williams-Cook’s test, which included a schema-only address
Sanushen asked Claude, ChatGPT, Perplexity and Google Gemini the same questions, targeting data that appeared only in the schema code on some pages but was visible on others:
· What is the prep time?
· How many calories per cookie?
· What is the user rating and review count?
· What is the address of the company running the site?
The schema test page had visible content (recipe title and general description) and hidden schema code (detailed information like prep time, calories, ratings and measurements). The HTML table page had the same information displayed in regular HTML tables that anyone visiting the page could see. The page based on Williams-Cook’s test had one critical difference: our page was brand new and had never been indexed by Google. This would reveal whether the chatbots were reading the schema directly from the page, or getting information that Google had already extracted.
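For context, "reading the schema directly from the page" would mean a tool parsing the JSON-LD script blocks out of the raw HTML it fetched. The sketch below shows roughly what that involves, using only Python's standard library. We do not know how any chatbot's browsing tool is actually implemented; the JsonLdExtractor class, the sample page and the address in it are all hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD <script> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents arrive here as raw text
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# Hypothetical page: the address exists only inside the schema block
page = """
<html><body>
<h1>Chocolate chip cookies</h1>
<script type="application/ld+json">
{"@type": "Organization", "address": "1 Example Street"}
</script>
</body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(page)
print(extractor.blocks[0]["address"])
```

A tool that took this extra step could answer the address question from a brand-new, unindexed page. A tool that only had Google's previously extracted data could not, which is the distinction the new page was designed to expose.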
Test results
· Gemini produced the only successful schema access out of eight attempts, likely due to Google search integration. There were multiple hallucinations (false information confidently provided).
· When we presented the exact same information in visible HTML tables instead of hidden schema code with the same questions, Claude was successful 100% of the time.
· When data was presented as a raw JSON file, there was a 50% success rate, indicating JSON files are more accessible than schema, but still not universally reliable.
· When asked to extract the address, Claude and ChatGPT could not find one, Perplexity returned false information and Gemini produced a fictional address.
Our results show that schema works for some AI chatbots only when:
· The page has been indexed by Google
· Google has extracted the schema information
· The AI chatbot uses Google's search API instead of directly reading a website
For new pages or direct website access, schema does not work.
Our key findings
· Schema markup embedded in website code (JSON-LD) achieved only a 12.5% success rate (one successful attempt out of eight) across all AI chatbots tested.
· When the same information was presented in visible HTML structures (tables and definition lists), the success rate was 100%.
· When schema markup worked, it was because Google had already indexed the page and had extracted and processed the schema. The AI chatbot accessed Google's pre-processed data, not the original website. Schema did not work for new pages or direct website visits.
· When AI chatbots couldn't find information, some made up false information with complete confidence.
· If humans can’t see it, AI chatbots can't reliably access it through direct web browsing. Hidden metadata (like schema) fails. Visible content (like tables on the page) succeeds.
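One plausible explanation for the last finding, offered as an assumption rather than something we verified, is that browsing tools reduce a page to its visible text before the model sees it, and that step discards the contents of script tags. The Python sketch below illustrates the effect with a simple visible-text extractor. The VisibleTextExtractor class and both sample pages are hypothetical and use made-up values.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Keep text a visitor would see; drop <script> and <style> contents."""

    HIDDEN_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._hidden_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        # Only keep non-empty text that sits outside hidden tags
        if self._hidden_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages: same fact, stored in schema vs. in a visible table
schema_page = (
    "<h1>Chocolate chip cookies</h1>"
    '<script type="application/ld+json">{"prepTime": "PT15M"}</script>'
)
table_page = (
    "<h1>Chocolate chip cookies</h1>"
    "<table><tr><th>Prep time</th><td>15 minutes</td></tr></table>"
)

print(visible_text(schema_page))  # Chocolate chip cookies
print(visible_text(table_page))   # Chocolate chip cookies Prep time 15 minutes
```

Under this assumption, the prep time on the schema page never reaches the model at all, while the same fact in a table survives the conversion intact.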
What we recommend
1. For direct AI chatbot access: Make information visible on your page in regular text, HTML tables, or lists. AI chatbots read visible HTML content reliably. What works for humans works for AI.
2. For Google search: Schema still works fine for Google's search engine. Use it to get rich snippets and improve SEO. Just understand that most AI chatbots cannot access it directly. They can only access it after Google has indexed your page and processed the schema.
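The two recommendations can be combined on a single page. The Python sketch below is one possible way to do it: it renders a visible HTML table for human and chatbot readers, followed by a JSON-LD block for Google. The render_recipe function and all values are illustrative assumptions, not part of our test setup.

```python
import json
from html import escape


def render_recipe(facts, jsonld):
    """Render facts as a visible table, followed by a JSON-LD block."""
    rows = "\n".join(
        f"  <tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>"
        for label, value in facts.items()
    )
    table = f"<table>\n{rows}\n</table>"
    script = (
        '<script type="application/ld+json">'
        + json.dumps(jsonld)
        + "</script>"
    )
    return table + "\n" + script


# Made-up example values
facts = {"Prep time": "15 minutes", "Calories per cookie": "150"}
jsonld = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Chocolate chip cookies",
    "prepTime": "PT15M",
    "nutrition": {"@type": "NutritionInformation", "calories": "150 calories"},
}

html = render_recipe(facts, jsonld)
print(html)
```

Generating both outputs in one place makes it harder for the visible content and the schema to drift apart when the recipe changes.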
The bottom line
For reliable AI chatbot accessibility, make your content visible. Don't rely on schema in <script> tags. Use visible HTML structures. Our research proves that for AI chatbots, visible structure beats hidden complexity every time.
