Schema Markup for AI: The Complete Guide to Getting Cited by Language Models
Schema markup is the highest-impact factor for AI citations after bot access. This guide covers the 14 schema types that matter, JSON-LD templates for the most important ones, and data showing how markup quality correlates with citation frequency.
Founder & CEO at AgentReady
Why Schema Markup Is the #2 AI Readiness Factor
Schema markup is how you explicitly tell AI systems what your content is, who created it, and how it should be categorized. Without it, AI models must infer these details from unstructured text — and they frequently infer wrong.
In our scan data, Structured Data & Schema carries 20% of the total AI readiness score weight, second only to Bot Access at 25%. The reason is empirical: when we analyze which sites get cited by AI systems, the correlation between schema completeness and citation frequency is r = 0.61 — the second-strongest factor after overall AI readiness score.
The mechanism is straightforward. When an AI model encounters a page with valid Article schema including author, datePublished, publisher, and headline fields, it can categorize that content with near-perfect accuracy. It knows this is an article, written by a specific person, published on a specific date, by a known organization. All of this feeds into the model's trust evaluation.
Without schema, the model reads raw HTML and guesses. Is this a blog post? A product page? A company news release? Who wrote it? When? For a single page, the model might guess correctly. Across millions of pages, guessing introduces noise — and noise means your page loses to a competitor whose schema eliminates the guesswork.
78% of the 5,000 sites in our benchmark study have missing or incomplete schema markup. This is the second most common AI readiness failure after missing llms.txt. It is also one of the most fixable — schema implementation follows predictable templates that work across any CMS.
The 14 Schema Types That Matter for AI Visibility
Schema.org defines hundreds of types. Only 14 meaningfully impact AI readiness. We determined this by analyzing which schema types appear on the top-cited sites in our AI Citation Index and correlating type presence with citation frequency.
The 14 types fall into three priority groups based on impact.
Must-Have (every site needs these): Organization, Article, BreadcrumbList, and WebSite. These four types provide AI models with basic identity (who are you), content type (what is this page), navigation context (where does this page sit), and site-level information (what is this domain about). Every website should implement all four.
Category-Specific (implement based on your content type): Product, FAQPage, HowTo, LocalBusiness, Service, Person (for author pages), and Review. These types are powerful but only relevant for sites that have the corresponding content. An e-commerce site needs Product and Review. A service business needs Service and LocalBusiness. A publisher needs Person for author pages.
Advanced (high-value for specific use cases): Event, Course, and MedicalCondition. These types serve narrower audiences but provide disproportionate AI citation value in their categories. Event schema helps with AI recommendations for upcoming events. Course schema dominates educational AI responses. MedicalCondition schema is essential for health content publishers.
- Must-Have: Organization, Article, BreadcrumbList, WebSite — implement on every site
- E-Commerce: Add Product, Review/AggregateRating, Offer — detailed product data drives AI shopping citations
- Service Business: Add Service, LocalBusiness, FAQPage — local AI recommendations depend on these
- Publisher/Blog: Add Person (author pages), FAQPage, HowTo — authorship signals boost citation trust
- Education: Add Course, Organization (with educational specifics) — dominates AI learning recommendations
- Healthcare: Add MedicalCondition, MedicalOrganization, Physician — required for health AI citations
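The grouping above can be written down as a small lookup table, which makes it easy to audit which recommended types a site still lacks. This is a minimal sketch: the category keys ("ecommerce", "service", and so on) and the function names are illustrative assumptions, while the type lists are transcribed from the list above.

```python
# Schema types recommended per site category, transcribed from the list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
MUST_HAVE = ["Organization", "Article", "BreadcrumbList", "WebSite"]

EXTRA_BY_SITE_TYPE = {
    "ecommerce": ["Product", "Review", "AggregateRating", "Offer"],
    "service": ["Service", "LocalBusiness", "FAQPage"],
    "publisher": ["Person", "FAQPage", "HowTo"],
    "education": ["Course", "Organization"],
    "healthcare": ["MedicalCondition", "MedicalOrganization", "Physician"],
}


def recommended_types(site_type: str) -> list[str]:
    """Return the must-have types plus any extras for the given site category."""
    result = list(MUST_HAVE)
    for schema_type in EXTRA_BY_SITE_TYPE.get(site_type, []):
        # Skip duplicates such as Organization, which is already must-have.
        if schema_type not in result:
            result.append(schema_type)
    return result


def missing_types(site_type: str, implemented: set[str]) -> list[str]:
    """List recommended types that the site has not implemented yet."""
    return [t for t in recommended_types(site_type) if t not in implemented]


if __name__ == "__main__":
    print(missing_types("ecommerce", {"Organization", "WebSite", "Product"}))
```

An unknown category falls back to the four must-have types, which matches the advice that every site should implement them.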
Organization Schema: Your Site's Identity Card for AI
Organization schema is the most important single schema type for AI readiness. It tells AI models who owns the site, what the organization does, and where to verify its identity. Sites with complete Organization schema score 12 points higher on Authority & Trust than sites without it.
A complete Organization schema includes fields that many implementations miss. The minimum viable implementation has name, url, and logo. But AI models extract significantly more value from a comprehensive version that includes description, foundingDate, founder, sameAs (linking to social profiles for verification), contactPoint, address, and numberOfEmployees.
The sameAs field is particularly powerful for AI trust. When you list your LinkedIn, Twitter/X, Wikipedia, and Crunchbase URLs, AI models can cross-reference your organization's identity across multiple authoritative sources. This multi-source verification is a strong trust signal that directly influences citation decisions.
Place Organization schema on your homepage only. Do not duplicate it on every page — this creates redundancy that some validators flag as warnings. Reference your Organization schema from Article and other content schemas using the publisher field.
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Company Name",
  "url": "https://yoursite.com",
  "logo": {
    "@type": "ImageObject",
    "url": "https://yoursite.com/logo.png",
    "width": 600,
    "height": 60
  },
  "description": "One clear sentence about what your company does.",
  "foundingDate": "2020-01-15",
  "founder": {
    "@type": "Person",
    "name": "Founder Name"
  },
  "sameAs": [
    "https://linkedin.com/company/yourcompany",
    "https://twitter.com/yourcompany",
    "https://github.com/yourcompany"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "email": "hello@yoursite.com",
    "contactType": "customer service"
  },
  "address": {
    "@type": "PostalAddress",
    "addressCountry": "US"
  }
}
Comprehensive Organization schema for AI readiness
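To see how far an existing Organization block is from the comprehensive version described in this section, a short script can compare it against the two field lists. This is a rough sketch, not an official validator: the function name and the split into "minimum" and "recommended" are assumptions drawn from the field names listed in the text.

```python
import json

# Field lists taken from this section: the minimum viable set and the
# additional fields of the comprehensive version.
MINIMUM_FIELDS = ["name", "url", "logo"]
RECOMMENDED_FIELDS = [
    "description", "foundingDate", "founder", "sameAs",
    "contactPoint", "address", "numberOfEmployees",
]


def organization_gaps(org: dict) -> dict:
    """Report which minimum and recommended fields are absent or empty."""
    def absent(field: str) -> bool:
        return org.get(field) in (None, "", [], {})

    return {
        "missing_minimum": [f for f in MINIMUM_FIELDS if absent(f)],
        "missing_recommended": [f for f in RECOMMENDED_FIELDS if absent(f)],
    }


if __name__ == "__main__":
    raw = """
    {"@context": "https://schema.org", "@type": "Organization",
     "name": "Your Company Name", "url": "https://yoursite.com"}
    """
    print(organization_gaps(json.loads(raw)))
```

Run against the template above, this check would flag only numberOfEmployees, since the template includes every other field named in the text.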
Article Schema: The Citation Machine
Article schema is where AI citation decisions are most directly influenced. When an AI model evaluates whether to cite a page, it checks for Article schema fields that answer: who wrote this, when, for whom, and how authoritative is it?
The fields that drive citation decisions are author (linked to a Person schema with credentials), datePublished, dateModified (AI models prefer recently updated content), publisher (linked to Organization schema), and headline. These five fields are present on only 34% of the blog posts in our scan database. Sites that include all five have a 28% higher citation rate than sites missing any of them.
The dateModified field deserves special attention. AI models increasingly penalize stale content. A 2024 article that was updated in 2026 (with dateModified reflecting the update) outperforms a 2024 article with no modification signal. Updating your evergreen content and reflecting the update in both the visible page and the Article schema is a high-ROI practice.
For long-form content, include the wordCount field. AI models use word count as a proxy for content depth. Our data shows that articles with wordCount in their schema and actual word counts above 1,500 are cited 2.1x more frequently than articles under 500 words with no wordCount field.
Also include articleSection to categorize the content and keywords to provide topic context. These fields help AI models match your content to relevant queries with higher precision.
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your Article Title Here",
  "description": "One-sentence article summary.",
  "image": "https://yoursite.com/images/article-image.jpg",
  "datePublished": "2026-03-30",
  "dateModified": "2026-03-30",
  "wordCount": 1800,
  "articleSection": "AI Readiness",
  "keywords": ["AI readiness", "schema markup", "AI optimization"],
  "author": {
    "@type": "Person",
    "name": "Author Name",
    "url": "https://yoursite.com/author/name",
    "jobTitle": "Senior Editor",
    "sameAs": ["https://linkedin.com/in/authorname"]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Your Company",
    "url": "https://yoursite.com",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yoursite.com/logo.png"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/blog/article-slug"
  }
}
Article schema optimized for AI citations
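The five citation fields and the dateModified advice lend themselves to an automated check. The sketch below assumes the Article JSON-LD has already been parsed into a dictionary and that dates use the ISO 8601 form shown in the template. The function name and the wording of the messages are illustrative.

```python
from datetime import date

# The five fields this section identifies as driving citation decisions.
CITATION_FIELDS = ["author", "datePublished", "dateModified", "publisher", "headline"]


def article_issues(article: dict) -> list[str]:
    """Return human-readable problems found in an Article JSON-LD dict."""
    issues = [f"missing {field}" for field in CITATION_FIELDS if not article.get(field)]

    published = article.get("datePublished")
    modified = article.get("dateModified")
    if published and modified:
        # Compare only the date part, so full timestamps also work.
        # Malformed dates raise ValueError, which is left to the caller.
        if date.fromisoformat(modified[:10]) < date.fromisoformat(published[:10]):
            issues.append("dateModified is earlier than datePublished")
    return issues


if __name__ == "__main__":
    sample = {
        "@type": "Article",
        "headline": "Your Article Title Here",
        "datePublished": "2026-03-30",
        "dateModified": "2026-03-01",
        "author": {"@type": "Person", "name": "Author Name"},
    }
    for issue in article_issues(sample):
        print(issue)
```

A check like this only confirms that the fields exist and that the dates are ordered sensibly. It cannot tell whether the visible page was actually updated, so the schema and the page still need to be kept in sync by hand or by the CMS.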
Product Schema: Getting into AI Shopping Recommendations
AI shopping agents are the fastest-growing category of AI search. ChatGPT's product search, Perplexity Shopping, and Google's AI-powered shopping results all depend heavily on Product schema to make recommendations. E-commerce sites with complete Product schema receive 3.4x more AI-referred product page visits than sites with incomplete or missing Product schema.
The difference between "complete" and "incomplete" Product schema is critical. Most e-commerce platforms generate basic Product schema automatically — name, price, and availability. But AI shopping agents need much more to make confident recommendations: detailed description, brand, color, size, material, aggregateRating, review, sku, and image.
Consider what happens when a user asks ChatGPT: "What is the best noise-cancelling headphone under $300?" The AI needs to compare products across stores. It can only compare products whose schema includes price (with currency), aggregateRating (with reviewCount), availability, and brand. Products missing these fields are invisible to the comparison, regardless of how good they are.
Our data shows the specific fields that correlate most strongly with AI shopping citations: aggregateRating (present on only 31% of product pages), review (18%), brand (62%), and offers with complete price/currency/availability (74%). The gap between what AI needs and what most stores provide is enormous.
For stores using Shopify, the Liquid templates in default themes generate only minimal Product schema. Third-party apps like JSON-LD for SEO add the missing fields. For WooCommerce, the Yoast WooCommerce SEO plugin handles most fields. For custom stores, implement Product schema manually using the template below and validate every product template.
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Product Name",
  "description": "Detailed description of at least 100 words.",
  "image": [
    "https://yourstore.com/product-front.jpg",
    "https://yourstore.com/product-side.jpg"
  ],
  "brand": {
    "@type": "Brand",
    "name": "Brand Name"
  },
  "sku": "SKU-12345",
  "color": "Midnight Blue",
  "material": "Aluminum",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "234"
  },
  "offers": {
    "@type": "Offer",
    "price": "279.99",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://yourstore.com/product-slug",
    "priceValidUntil": "2026-12-31"
  }
}
Complete Product schema for AI shopping visibility
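The headphone example in this section names the fields a product needs before it can be compared across stores: price with currency, aggregateRating with reviewCount, availability, and brand. A minimal sketch of that check follows. The function name and the dotted gap labels are assumptions made for illustration, and only the first Offer is inspected when offers is a list.

```python
def _blank(value) -> bool:
    """True when a field is absent or an empty string, list, or dict."""
    return value is None or value == "" or value == [] or value == {}


def comparison_gaps(product: dict) -> list[str]:
    """List the comparison-critical fields missing from a Product JSON-LD dict."""
    gaps = []

    offers = product.get("offers")
    if isinstance(offers, list):
        # Simplification for this sketch: look at the first Offer only.
        offers = offers[0] if offers else {}
    if not isinstance(offers, dict):
        offers = {}
    for field in ("price", "priceCurrency", "availability"):
        if _blank(offers.get(field)):
            gaps.append(f"offers.{field}")

    rating = product.get("aggregateRating")
    if not isinstance(rating, dict):
        rating = {}
    for field in ("ratingValue", "reviewCount"):
        if _blank(rating.get(field)):
            gaps.append(f"aggregateRating.{field}")

    if _blank(product.get("brand")):
        gaps.append("brand")
    return gaps


if __name__ == "__main__":
    platform_default = {
        "@type": "Product",
        "name": "Product Name",
        "offers": {"@type": "Offer", "price": "279.99",
                   "availability": "https://schema.org/InStock"},
    }
    print(comparison_gaps(platform_default))
```

Running this against one product per template variant gives a quick list of what the platform's automatic schema leaves out.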
FAQPage Schema: The Highest-ROI Schema Type You Are Probably Missing
FAQPage schema has the highest impact-to-effort ratio of any schema type for AI readiness. It takes 15 minutes to implement per page, yet it directly provides AI models with pre-structured question-answer pairs that can be cited verbatim.
When an AI model encounters a page with FAQPage schema, it can extract precise answers to specific questions without parsing unstructured text. This is exactly what AI systems want — structured, authoritative answers that they can serve with confidence. Pages with valid FAQPage schema are cited 1.8x more frequently for question-type queries than equivalent pages without it.
The strategy is straightforward: identify the top 5-10 questions that users ask about each page's topic, write concise 1-3 sentence answers, and wrap them in FAQPage JSON-LD. Place the same FAQ content visually on the page (Google requires visible FAQ content to match schema). This gives you AI visibility plus rich result eligibility in traditional search, although since 2023 Google has shown FAQ rich results mainly for well-known government and health sites.
Common mistakes include adding FAQPage schema with dozens of questions (keep it to 5-10 per page), writing multi-paragraph answers (keep them under 3 sentences for maximum AI extractability), and duplicating the same FAQ across multiple pages (each page should have unique questions relevant to its specific topic).
For maximum impact, combine FAQPage schema with the visual FAQ section on your page. This creates a dual benefit: AI systems extract answers from the schema, and human visitors find answers in the visible content. Both channels are served by the same content investment.
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is the most important schema type for AI readiness?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Organization schema on your homepage and Article or Product schema on content pages. These provide AI models with the identity and content-type signals needed for accurate citation."
      }
    },
    {
      "@type": "Question",
      "name": "How many schema types should I implement?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Most pages need 2-3 types. Start with Organization + BreadcrumbList site-wide, then add Article, Product, or Service based on each page type."
      }
    }
  ]
}
FAQPage schema template
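Because FAQPage markup has such a regular shape, it can be generated from a plain list of question and answer pairs, with the guidelines from this section applied as warnings. The sketch below is one possible approach. Both function names are hypothetical, and the sentence counter is deliberately naive (it counts ".", "!" or "?" followed by whitespace or the end of the text), so abbreviations may be over-counted.

```python
import json
import re


def build_faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def faq_warnings(pairs: list[tuple[str, str]]) -> list[str]:
    """Flag departures from the 5-10 question and 1-3 sentence guidelines."""
    warnings = []
    if not 5 <= len(pairs) <= 10:
        warnings.append(f"{len(pairs)} questions; this guide suggests 5-10 per page")
    for question, answer in pairs:
        # Naive sentence count: terminal punctuation followed by space or end.
        sentences = len(re.findall(r"[.!?](?:\s|$)", answer))
        if sentences > 3:
            warnings.append(f"answer to {question!r} has {sentences} sentences")
    return warnings


if __name__ == "__main__":
    pairs = [
        ("How many schema types should I implement?",
         "Most pages need 2-3 types. Start with Organization + BreadcrumbList "
         "site-wide, then add Article, Product, or Service based on each page type."),
    ]
    print(json.dumps(build_faq_page(pairs), indent=2))
    print(faq_warnings(pairs))
```

Generating the markup from the same source that renders the visible FAQ section is one way to keep the two in step, since the visible content and the schema are expected to match.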
Validation, Testing, and Ongoing Monitoring
Schema markup that is present but invalid is worse than no schema at all. Invalid markup can confuse AI parsers, generate incorrect citations, or trigger trust penalties. 23% of sites in our scan database have schema markup with validation errors — errors that silently reduce AI readiness scores.
The validation workflow should be systematic: test every unique page template, not just individual pages. If your blog template generates Article schema, validate it on one post and you have validated it for all posts of that type. If your product template generates Product schema, validate one product page per template variant.
Use three validation tools in sequence. Google Rich Results Test checks for Google-specific compliance and rich snippet eligibility. Schema.org Validator checks broader compliance with the Schema.org specification. AgentReady's scanner checks AI-specific readiness factors that neither Google nor Schema.org validators cover — llms.txt integration, AI crawler accessibility, and cross-reference consistency.
Set up ongoing monitoring. Schema can break silently when themes are updated, plugins conflict, or content structures change. Monthly validation of key page templates catches regressions before they impact AI visibility. If you have a CI/CD pipeline, integrate schema validation into your deployment process — this prevents invalid markup from ever reaching production.
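As a starting point for a deployment check, the sketch below extracts every JSON-LD block from a rendered page and confirms that each one parses as JSON and declares a type. It uses only the Python standard library. The class and function names are assumptions, and the check covers syntax only: it does not replace the Schema.org or Google validators, which test the markup against the vocabulary itself.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data


def check_html(html: str) -> list[str]:
    """Return a list of problems found in the page's JSON-LD blocks."""
    parser = JsonLdExtractor()
    parser.feed(html)
    problems = []
    if not parser.blocks:
        problems.append("no JSON-LD blocks found")
    for index, raw in enumerate(parser.blocks, start=1):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"block {index}: invalid JSON ({exc.msg})")
            continue
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict) or not ("@type" in item or "@graph" in item):
                problems.append(f"block {index}: item has no @type")
    return problems


if __name__ == "__main__":
    page = '<script type="application/ld+json">{"@type": "Article",}</script>'
    for problem in check_html(page):
        print(problem)
```

In a pipeline, the build would render one page per template, pass the HTML to check_html, and fail when the returned list is not empty.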
The final piece: monitor your AI citations. Search for your brand and key topics in ChatGPT, Perplexity, and Claude quarterly. Note which pages are cited and whether the citation accurately reflects your schema. If AI is citing your content with incorrect attribution or outdated information, check your schema first — it is the most common culprit for citation inaccuracies.
- Google Rich Results Test — validates Google compliance and rich snippet eligibility
- Schema.org Validator — checks full spec compliance
- AgentReady Scanner — checks AI-specific readiness factors
- Monthly: Validate one page per template type
- After deployments: Run validation on changed templates
- Quarterly: Check AI citations for accuracy against your schema
Frequently Asked Questions
Which schema markup format should I use for AI readiness?
JSON-LD exclusively. Microdata and RDFa are technically valid but poorly supported by AI systems. Place your JSON-LD in a <script type='application/ld+json'> block in the <head> of each page. Google, ChatGPT, and all major AI platforms parse JSON-LD reliably.
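For sites that render pages from code rather than through a CMS plugin, the script block can be produced from a dictionary. This is a small sketch with a hypothetical helper name. It escapes "</" inside the payload so that a text value containing "</script>" cannot end the block early, and the "\/" escape is valid JSON, so parsers read the original value back.

```python
import json


def render_jsonld_script(data: dict) -> str:
    """Serialize a schema dict into a JSON-LD script tag for the page head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so a string value such as "</script>" cannot close the tag.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    print(render_jsonld_script({
        "@context": "https://schema.org",
        "@type": "WebSite",
        "name": "Your Company Name",
        "url": "https://yoursite.com",
    }))
```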
How many schema types should a single page have?
Most pages need 2-3 schema types. A blog post typically needs Article + BreadcrumbList, with the publisher field pointing to the Organization schema on your homepage. A product page needs Product + BreadcrumbList, again referencing your Organization. An FAQ page needs FAQPage + Article + BreadcrumbList. Keep each type in its own JSON-LD block for clarity.
Does schema markup directly improve AI readiness scores?
Yes. Structured Data & Schema accounts for 20% of the AgentReady score. Sites with comprehensive, valid JSON-LD average 18 points higher on overall AI readiness than sites without schema markup. The correlation between schema completeness and AI citation frequency is r = 0.61.
Can AI read schema markup generated by WordPress plugins?
Yes, if the markup is valid. Plugins like RankMath and Yoast generate competent JSON-LD for common types. However, plugin-generated schema often omits fields that AI systems value (author credentials, dateModified, citation sources). Always validate plugin output and supplement with manual additions where needed.