The AI Readiness Checklist: 47 Things to Fix Before AI Agents Ignore Your Site
We distilled 5,000+ site scans into a 47-point checklist covering every factor that determines whether AI agents can find, understand, and cite your website. Organized by priority with time estimates for each fix.
Founder & CEO at AgentReady
How to Use This Checklist
This checklist is organized into four priority tiers based on impact data from our 5,000+ site scans. Tier 1 (Critical) items have the highest individual impact on AI readiness scores — fixing any single Tier 1 item moves your score by an estimated 3-8 points. Tier 2 (High) items each contribute 1-3 points. Tier 3 (Standard) and Tier 4 (Advanced) items contribute under 1 point each but compound meaningfully.
Work top to bottom within each tier. Do not skip to Tier 3 before completing Tier 1 — the dependencies are real. A site with perfect schema markup but blocked AI crawlers scores lower than a site with no schema but open access.
Each item includes a time estimate for a competent developer or technically comfortable site owner. Non-technical users should budget 2-3x the stated time or engage a developer for Tier 2+ items.
Track your progress. After completing each tier, run an AgentReady scan to measure the impact. The feedback loop is fast — most changes are reflected within 48-72 hours.
Tier 1 — Critical (12 Items): The Foundation That Makes Everything Else Work
These 12 items are the highest-impact fixes. Each one addresses a binary gate — either AI can access your content or it cannot; either AI can identify your content type or it guesses. Completing all 12 Tier 1 items typically moves a site from a score of 30-40 to a score of 55-65.
The 12 Critical items cover bot access, basic schema, content rendering, and site identity. They are the foundation upon which every other optimization builds. Skip them and nothing else matters.
We estimate Tier 1 takes 4-8 hours for a typical site with developer access, or 1-2 days for a non-technical site owner using guides and plugins.
- 1. Allow GPTBot in robots.txt — Verify GPTBot is not blocked. Add explicit Allow rule. (15 min)
- 2. Allow ClaudeBot in robots.txt — Same process for Anthropic's crawler. (5 min)
- 3. Allow PerplexityBot in robots.txt — Same process for Perplexity's crawler. (5 min)
- 4. Allow Google-Extended in robots.txt — Required for Google AI Overviews. (5 min)
- 5. Allow CCBot in robots.txt — Common Crawl feeds many AI training sets. (5 min)
- 6. Verify server-side rendering — Confirm your homepage returns full HTML content to non-JS clients. Use `curl -A 'GPTBot' your-url` and verify content appears. (30 min)
- 7. Add Organization JSON-LD — Place Organization schema on your homepage with name, URL, logo, description, and sameAs links. (30 min)
- 8. Add Article/Product JSON-LD — Add appropriate schema type to your primary content pages. Article for blogs, Product for stores, Service for service pages. (1-2 hrs)
- 9. Ensure valid SSL certificate — Verify HTTPS works and the certificate is current. Mixed content warnings signal an unmaintained site. (15 min)
- 10. Check for redirect chains — Resolve any chains longer than 2 hops. AI crawlers may not follow long chains. (30 min)
- 11. Add clear H1 to every page — Each page should have exactly one H1 that describes the page's primary topic. (1 hr)
- 12. Create a basic sitemap.xml — Ensure your sitemap exists, is valid XML, and is referenced in robots.txt. (30 min)
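Items 1-5 reduce to a handful of robots.txt entries. The sketch below — a minimal example, assuming Python's standard `urllib.robotparser` — embeds one such policy and verifies that each crawler is allowed; the same check works against a copy of your live robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt granting the five AI crawlers full access (items 1-5).
# Adjust or add Disallow rules for any paths you want kept private.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]

def ai_bots_allowed(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {bot_name: allowed?} for each major AI crawler against this policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}

print(ai_bots_allowed(ROBOTS_TXT))
```

To audit a live site, fetch `https://your-domain/robots.txt` and pass its text to `ai_bots_allowed` — any `False` value means that crawler is blocked for the URL you tested.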
Tier 2 — High Priority (15 Items): Structure, Protocols, and Content Depth
With Tier 1 complete, AI can access and minimally identify your content. Tier 2 items help AI understand and trust your content. These items focus on content structure, AI protocol adoption, and authorship signals. Completing Tier 2 typically moves a site from 55-65 to 70-78.
Tier 2 requires more investment — an estimated 8-16 hours total — but the per-item impact is still significant. Several items in this tier (llms.txt, author attribution, FAQ schema) have outsized impact relative to their implementation effort.
- 13. Create and deploy llms.txt — Write an llms.txt file describing your site's purpose, key pages, and content structure. Deploy to domain root. (1 hr) — See our llms.txt guide
- 14. Add FAQ schema to top pages — Identify your top 10-20 pages and add FAQPage JSON-LD with relevant questions and answers. (2-4 hrs)
- 15. Add BreadcrumbList schema — Implement navigation breadcrumbs in JSON-LD on all interior pages. (1 hr)
- 16. Create author pages — Every content author should have a dedicated page with bio, credentials, links, and published articles. (2-3 hrs)
- 17. Add author bylines to all content — Every article and blog post should display the author's name, linking to their author page. (1 hr)
- 18. Write meta descriptions for all pages — Unique, descriptive meta descriptions under 155 characters for every indexed page. (2-4 hrs)
- 19. Implement H2/H3 heading hierarchy — Every page should have a logical heading structure. No skipped levels (H1 to H3 without H2). (2 hrs)
- 20. Add internal links between related pages — Each content page should link to 3-5 related pages within your site. Build topic clusters. (2 hrs)
- 21. Write original content over 500 words on key pages — Thin pages score poorly. Expand product descriptions, service pages, and key landing pages to 500+ words. (4-8 hrs)
- 22. Add OpenGraph and Twitter Card meta tags — These do not directly affect AI readiness but improve sharing signals that build authority. (1 hr)
- 23. Validate all JSON-LD with Google Rich Results Test — Run every schema template through validation. Fix errors and warnings. (1-2 hrs)
- 24. Check CDN and WAF settings for AI bot blocking — Cloudflare Bot Fight Mode and similar features can block AI crawlers. Whitelist known AI user agents. (30 min)
- 25. Add citation sources to content — Link to external authoritative sources. AI models use outbound citation patterns as a trust signal. (2 hrs)
- 26. Implement canonical URLs — Every page should have a self-referencing canonical tag. Duplicate content confuses AI extraction. (30 min)
- 27. Create an About page with business credentials — A detailed About page with company history, team credentials, and contact information builds trust. (1-2 hrs)
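Item 14's FAQ markup follows a fixed shape. A minimal FAQPage sketch — the question and answer text are placeholders to replace with your own content — which goes inside a `<script type="application/ld+json">` tag in the page head or body:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Example question your customers actually ask?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A direct, self-contained answer in plain language."
      }
    }
  ]
}
```

Add one `Question` object per FAQ entry to the `mainEntity` array, and keep the answer text identical to what is visible on the page.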
Tier 3 — Standard (12 Items): Optimization and Depth
Tier 3 items refine your AI readiness from good to excellent. These optimizations address content depth, technical performance, and structured data completeness. They are most impactful for sites already scoring 70+ that want to push into the A grade range.
Completing Tier 3 adds an estimated 5-10 points to a well-optimized site. Individual item impact is modest, but the cumulative effect is meaningful, especially for competitive industries where a few points separate you from competitors.
- 28. Add HowTo schema to tutorial content — If you publish how-to guides, add HowTo JSON-LD with named steps. (1-2 hrs)
- 29. Add Review/Rating schema to product pages — Include aggregateRating and individual review markup on reviewed products. (1-2 hrs)
- 30. Optimize Core Web Vitals — Target LCP under 2.5s, CLS under 0.1, INP under 200ms. Google AI Overviews correlate with CWV performance. (4-8 hrs)
- 31. Implement pagination schema — For paginated content, use proper prev/next links and consider schema markup for article series. (1 hr)
- 32. Add LocalBusiness schema (if applicable) — Local businesses should add location-specific schema with address, hours, and service area. (30 min)
- 33. Create a comprehensive FAQ section — Add FAQ content to key landing pages addressing the top 5-10 questions in your niche. (3-4 hrs)
- 34. Optimize images with alt text and captions — Every image should have descriptive alt text. AI models process alt text as content context. (2-3 hrs)
- 35. Reduce page weight to under 3MB — Heavy pages load slowly for AI crawlers on rate-limited connections. Compress images, defer scripts. (2-4 hrs)
- 36. Add publication and modification dates — Every content page should display a published date and last-updated date. Include datePublished and dateModified in Article schema. (1 hr)
- 37. Implement search functionality — A working site search helps AI agents discover content. Consider exposing search via NLWeb. (2-4 hrs)
- 38. Create a content hub or resource center — A single page linking to all key resources improves AI content discovery and internal linking. (2 hrs)
- 39. Add structured pricing data — If you list prices, use Offer schema with price, currency, and availability. AI shopping agents depend on this. (1-2 hrs)
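Item 39's pricing markup can look like the following sketch. The product name, price, and currency are placeholder values; the availability URL is one of the standard schema.org ItemAvailability values:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Product",
  "offers": {
    "@type": "Offer",
    "price": "49.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
```

Keep the structured price in sync with the displayed price — a mismatch is a trust signal in the wrong direction for both AI shopping agents and search engines.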
Tier 4 — Advanced (8 Items): Protocol Leadership and AI-Native Features
Tier 4 is for organizations committed to AI-native web presence. These items require technical investment and ongoing maintenance. They are most relevant for SaaS products, large e-commerce operations, and content publishers competing for AI visibility in crowded categories.
Sites that complete Tier 4 score 85+ on AI readiness, placing them in the top 4.2% of all websites. These items differentiate leaders from followers.
- 40. Implement NLWeb endpoint — Create a /.well-known/nlweb endpoint that answers natural language queries about your content. (1-2 weeks)
- 41. Set up MCP server for transactional features — If your site has booking, purchasing, or account features, expose them via MCP. (2-4 weeks)
- 42. Create llms-full.txt with expanded content — Beyond the standard llms.txt, create a detailed version with full content summaries for key pages. (2-4 hrs)
- 43. Monitor AI crawler access logs — Set up automated monitoring for GPTBot, ClaudeBot, and PerplexityBot in your server logs. Alert on blocks or errors. (2-4 hrs)
- 44. Implement real-time schema validation — Automate JSON-LD validation in your CI/CD pipeline. Catch schema errors before deployment. (4-8 hrs)
- 45. Build an API for structured content access — Expose your content through a documented API that AI agents can query programmatically. (1-2 weeks)
- 46. Add multilingual structured data — If you serve multiple languages, implement hreflang tags and language-specific schema markup. (4-8 hrs)
- 47. Participate in AI platform feedback programs — Submit your site to ChatGPT's publisher tools, Perplexity's publisher program, and similar initiatives. (2-4 hrs)
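Item 43's log monitoring can start as a simple script before you automate alerting. The sketch below uses hypothetical combined-format access log lines as sample data and flags AI crawler requests that did not return a 2xx status — in practice you would read lines from your web server's access log:

```python
import re

# Hypothetical sample log lines (combined log format); replace with lines
# read from your server's access log.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Jan/2026:12:01:33 +0000] "GET /pricing HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '198.51.100.4 - - [10/Jan/2026:12:02:10 +0000] "GET /blog/post HTTP/1.1" 403 512 "-" '
    '"Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [10/Jan/2026:12:03:55 +0000] "GET / HTTP/1.1" 200 10240 "-" '
    '"Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")
STATUS_RE = re.compile(r'" (\d{3}) ')  # status code follows the quoted request

def ai_bot_errors(lines):
    """Return (bot, status) pairs for AI crawler requests that did not get a 2xx."""
    hits = []
    for line in lines:
        bot = next((b for b in AI_BOTS if b in line), None)
        if bot is None:
            continue  # not an AI crawler request
        m = STATUS_RE.search(line)
        if m and not m.group(1).startswith("2"):
            hits.append((bot, int(m.group(1))))
    return hits

print(ai_bot_errors(SAMPLE_LOG))  # the ClaudeBot 403 is flagged; the 200s are not
```

Run this on a schedule (cron, a log pipeline, or your observability stack) and alert whenever the error list is non-empty — a sudden run of 403s usually means a WAF rule or robots change silently blocked a crawler.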
Expected Score Impact by Tier
Based on our scan data, here is the expected score progression as you complete each tier. These are estimates based on averages; actual results will vary based on your starting point and implementation quality.
A site starting at 25/100 (F grade) that completes all four tiers can expect to reach 80-90/100 (A/B grade). The largest jump comes from Tier 1, which addresses the most impactful binary gates. Each subsequent tier shows diminishing returns per item but continues to build competitive advantage.
The key insight from our data: the first 12 items deliver approximately 60% of the total score improvement. Do not let perfection delay progress. A site that completes Tier 1 in one week and never touches the other tiers will still outperform 73% of the web on AI readiness.
Prioritize ruthlessly. Implement in order. Measure after each tier. The AI readiness gap is real, but it is closable — and every point of improvement compounds over time as AI systems learn to trust and return to sites that make their job easier.
Expected Score Progression by Tier Completion
Frequently Asked Questions
Do I need to complete all 47 items on the checklist?
No. The checklist is organized by priority tiers. Completing the 12 Critical items (Tier 1) will move most sites from a D or F grade to a C or low B. The 15 High-priority items (Tier 2) push you into solid B territory. The remaining 20 items are optimization for sites targeting an A grade.
How long does it take to complete the full checklist?
Tier 1 (Critical) takes 4-8 hours for a typical site. Tier 2 (High) adds another 8-16 hours. Tier 3 (Standard) and Tier 4 (Advanced) require 20-40 hours combined, spread over weeks. A reasonable timeline is: Tier 1 in week one, Tier 2 by end of month one, and Tiers 3-4 over the next quarter.
Is this checklist applicable to all CMS platforms?
Yes, with caveats. All 47 items are platform-agnostic in principle. However, some items are harder or impossible on certain platforms. Wix and Squarespace limit server-side customization. Shopify restricts robots.txt editing. Each item notes platform-specific limitations where they exist.
How often should I re-audit against this checklist?
Monthly for Tier 1 items (critical access and rendering checks). Quarterly for the full checklist. After any major site update — new theme, CMS migration, significant content restructure — run the full audit immediately. AI readiness can regress silently when site changes override previous optimizations.
Check Your AI Readiness Score
Free scan. No signup required. See how AI engines like ChatGPT, Perplexity, and Google AI view your website.
Scan Your Site Free

Related Articles
The Complete Guide to Making Your Website AI-Ready in 2026
Everything you need to know about making your website visible to AI systems in 2026 — the 8 factors that determine whether AI agents cite your content or skip it entirely.
How to Fix Your robots.txt for AI Crawlers (5-Minute Guide)
Over 40% of websites accidentally block AI crawlers. Here is exactly how to fix your robots.txt in under 5 minutes, with templates for every major platform.
Schema Markup for AI: The Only Types That Actually Matter
Schema.org has over 800 types. Only 8 meaningfully impact whether AI systems understand and cite your content. Here they are, with JSON-LD examples for each.