How to Improve AI Readiness on Gatsby
Gatsby is a React-based static site generator with plugin-driven SEO capabilities. Follow these six steps to dramatically improve how AI agents discover and understand your Gatsby site.
Step-by-Step Instructions
Install and Configure gatsby-plugin-sitemap
Add the sitemap plugin to generate comprehensive XML sitemaps from all your Gatsby pages.
npm install gatsby-plugin-sitemap
// gatsby-config.js
module.exports = {
  siteMetadata: {
    siteUrl: "https://www.yoursite.com",
    title: "Your Site Name",
  },
  plugins: [
    {
      resolve: "gatsby-plugin-sitemap",
      options: {
        query: `{
          allSitePage { nodes { path } }
          allMarkdownRemark { nodes { frontmatter { date } fields { slug } } }
        }`,
        resolveSiteUrl: () => "https://www.yoursite.com",
        resolvePages: ({ allSitePage, allMarkdownRemark }) => {
          // Map markdown slugs to last-modified dates.
          // Note: this assumes your markdown slugs match the generated page paths.
          const mdPages = new Map();
          allMarkdownRemark.nodes.forEach(node => {
            mdPages.set(node.fields.slug, node.frontmatter.date);
          });
          return allSitePage.nodes.map(page => ({
            path: page.path,
            lastmod: mdPages.get(page.path) || new Date().toISOString(),
          }));
        },
        serialize: ({ path, lastmod }) => ({
          url: path,
          lastmod,
          changefreq: "weekly",
          priority: path === "/" ? 1.0 : 0.7,
        }),
      },
    },
  ],
};

Add SEO Component with Schema Markup
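The `serialize` callback is a pure function, so its priority and changefreq logic can be sanity-checked outside a Gatsby build. A minimal sketch (`serializePage` is a hypothetical name, not part of the plugin API):

```javascript
// Pure extraction of the serialize logic above for quick testing.
function serializePage({ path, lastmod }) {
  return {
    url: path,
    lastmod,
    changefreq: "weekly",
    priority: path === "/" ? 1.0 : 0.7, // homepage ranks highest
  };
}

const home = serializePage({ path: "/", lastmod: "2026-01-15" });
const post = serializePage({ path: "/blog/hello/", lastmod: "2026-01-15" });
console.log(home.priority, post.priority);
```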
Create a reusable SEO component that generates meta tags and JSON-LD for every page.
// src/components/seo.tsx
import { useStaticQuery, graphql } from "gatsby";

export function SEO({ title, description, pathname, schema, children }) {
  const { site } = useStaticQuery(graphql`
    query { site { siteMetadata { title description siteUrl } } }
  `);
  const metaTitle = title
    ? `${title} | ${site.siteMetadata.title}`
    : site.siteMetadata.title;
  const metaDesc = description || site.siteMetadata.description;
  return (
    <>
      <title>{metaTitle}</title>
      <meta name="description" content={metaDesc} />
      <meta property="og:title" content={metaTitle} />
      <meta property="og:description" content={metaDesc} />
      <meta property="og:type" content="website" />
      {/* Canonical must point at the specific page, not just the site root */}
      <link rel="canonical" href={`${site.siteMetadata.siteUrl}${pathname || ""}`} />
      {schema && (
        <script type="application/ld+json">
          {JSON.stringify(schema)}
        </script>
      )}
      {children}
    </>
  );
}

// Usage in a page (the Gatsby Head API passes `location` automatically):
export function Head({ location }) {
  return (
    <SEO
      title="Page Title"
      description="Page description"
      pathname={location.pathname}
      schema={{
        "@context": "https://schema.org",
        "@type": "Article",
        headline: "Page Title",
        datePublished: "2026-01-15",
      }}
    />
  );
}

Create llms.txt at Build Time
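Because the `schema` prop is a plain object, the JSON-LD can be built and inspected outside React. A minimal sketch with a hypothetical `buildArticleSchema` helper:

```javascript
// Hypothetical helper that builds the Article JSON-LD object
// passed to the SEO component's schema prop.
function buildArticleSchema({ headline, datePublished, authorName }) {
  const schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    headline,
    datePublished,
  };
  if (authorName) {
    // Author is optional; include it only when known.
    schema.author = { "@type": "Person", name: authorName };
  }
  return schema;
}

const schema = buildArticleSchema({
  headline: "Page Title",
  datePublished: "2026-01-15",
  authorName: "Jane Doe",
});
console.log(JSON.stringify(schema));
```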
Use Gatsby's createPages API or the onPostBuild hook to generate a static llms.txt file during the build process.
// gatsby-node.js
const fs = require("fs");
const path = require("path");

exports.onPostBuild = async ({ graphql }) => {
  const result = await graphql(`
    query {
      site { siteMetadata { title description } }
      allMarkdownRemark(sort: { frontmatter: { date: DESC } }, limit: 20) {
        nodes { frontmatter { title } fields { slug } }
      }
    }
  `);
  const { title, description } = result.data.site.siteMetadata;
  const posts = result.data.allMarkdownRemark.nodes;
  const content = [
    `# ${title}`,
    `> ${description}`, // llms.txt convention: summary as a blockquote
    "",
    "## Recent Articles",
    ...posts.map(p => `- ${p.fields.slug}: ${p.frontmatter.title}`),
    "",
    "## Key Pages",
    "- /: Homepage",
    "- /about: About us",
    "- /blog: All articles",
    "- /contact: Contact information",
  ].join("\n");
  fs.writeFileSync(path.join("public", "llms.txt"), content, "utf-8");
  console.log("Generated llms.txt");
};

Configure robots.txt
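The formatting step in `onPostBuild` can be extracted into a pure function and tested without running a Gatsby build. A sketch with a hypothetical `formatLlmsTxt` helper (slugs and titles are illustrative):

```javascript
// Pure extraction of the llms.txt formatting logic above.
function formatLlmsTxt({ title, description, posts }) {
  return [
    `# ${title}`,
    `> ${description}`, // llms.txt convention: summary as a blockquote
    "",
    "## Recent Articles",
    ...posts.map(p => `- ${p.slug}: ${p.title}`),
  ].join("\n");
}

const txt = formatLlmsTxt({
  title: "Your Site",
  description: "What the site covers",
  posts: [{ slug: "/blog/hello/", title: "Hello World" }],
});
console.log(txt);
```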
Use gatsby-plugin-robots-txt to generate a customized robots.txt with AI crawler rules.
npm install gatsby-plugin-robots-txt
// gatsby-config.js plugins array:
{
  resolve: "gatsby-plugin-robots-txt",
  options: {
    host: "https://www.yoursite.com",
    sitemap: "https://www.yoursite.com/sitemap-index.xml",
    policy: [
      { userAgent: "*", allow: "/" },
      { userAgent: "GPTBot", allow: "/" },
      { userAgent: "ChatGPT-User", allow: "/" },
      { userAgent: "ClaudeBot", allow: "/" },
      { userAgent: "PerplexityBot", allow: "/" },
    ],
  },
},

Add Organization and WebSite Schema Globally
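With the options above, the plugin should emit a robots.txt roughly like the following (exact line ordering can vary by plugin version):

```text
User-agent: *
Allow: /
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
Sitemap: https://www.yoursite.com/sitemap-index.xml
Host: https://www.yoursite.com
```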
Include Organization and WebSite schema on every page using the gatsby-ssr API or your layout component.
// gatsby-ssr.js — inject on every page
const React = require("react"); // needed for JSX in gatsby-ssr.js

exports.onRenderBody = ({ setHeadComponents }) => {
  setHeadComponents([
    <script
      key="org-schema"
      type="application/ld+json"
      dangerouslySetInnerHTML={{
        __html: JSON.stringify({
          "@context": "https://schema.org",
          "@type": "Organization",
          name: "Your Business",
          url: "https://www.yoursite.com",
          logo: "https://www.yoursite.com/logo.png",
          description: "What your business does",
        }),
      }}
    />,
    <script
      key="website-schema"
      type="application/ld+json"
      dangerouslySetInnerHTML={{
        __html: JSON.stringify({
          "@context": "https://schema.org",
          "@type": "WebSite",
          name: "Your Site",
          url: "https://www.yoursite.com",
          potentialAction: {
            "@type": "SearchAction",
            target: "https://www.yoursite.com/search?q={search_term}",
            "query-input": "required name=search_term",
          },
        }),
      }}
    />,
  ]);
};

Ensure Static HTML Output for AI Crawlers
Gatsby pre-renders all pages to static HTML by default, which is ideal for AI crawlers. Verify this is working and that no critical content is client-only.
# Gatsby builds static HTML by default — verify it:
# 1. Build your site:
gatsby build
# 2. Check the output HTML:
head -100 public/index.html
# Verify your content is in the HTML, not loaded via JS
# 3. Serve and test:
gatsby serve
# Visit http://localhost:9000
# View source (Ctrl+U) — all content should be visible

// Common pitfall: useEffect-only content
// BAD: content that only appears after client-side hydration
const [data, setData] = useState(null);
useEffect(() => { fetchData().then(setData); }, []);
// This content won't be in the static HTML!
// GOOD: use Gatsby's data layer (GraphQL) or getServerData
// Content from GraphQL queries IS in the static HTML

Recommended Tools for Gatsby
- gatsby-plugin-sitemap
- gatsby-plugin-robots-txt
- gatsby-plugin-canonical-urls
- Gatsby Head API
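The plugins above can be wired together in one config. An illustrative sketch of a minimal gatsby-config.js (URLs are placeholders, options are the common defaults):

```javascript
// gatsby-config.js — minimal sketch combining the recommended plugins.
const config = {
  siteMetadata: { siteUrl: "https://www.yoursite.com" },
  plugins: [
    "gatsby-plugin-sitemap",
    {
      resolve: "gatsby-plugin-robots-txt",
      options: { host: "https://www.yoursite.com" },
    },
    {
      resolve: "gatsby-plugin-canonical-urls",
      options: { siteUrl: "https://www.yoursite.com" },
    },
  ],
};

module.exports = config;
```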
Frequently Asked Questions
Does Gatsby pre-render content for AI crawlers?
Yes, Gatsby generates static HTML at build time for every page. AI crawlers receive fully-rendered HTML without needing to execute JavaScript. This gives Gatsby an inherent advantage for AI readiness compared to client-rendered React apps.
How do I generate llms.txt in Gatsby?
Use the onPostBuild hook in gatsby-node.js to generate a static llms.txt file in the public/ directory during the build process. You can query your content via GraphQL and format it as plain text.
Which Gatsby plugins are essential for AI readiness?
The key plugins are: gatsby-plugin-sitemap (XML sitemaps), gatsby-plugin-robots-txt (robots.txt), gatsby-plugin-canonical-urls (canonical URLs), and gatsby-plugin-react-helmet or the built-in Head API (meta tags). Most structured data should be added via custom components.