How to Improve AI Readiness on Gatsby
Gatsby is a React-based static site generator with plugin-driven SEO capabilities. Follow these six steps to dramatically improve how AI agents discover and understand your Gatsby site.
Step-by-Step Instructions
Install and Configure gatsby-plugin-sitemap
Add the sitemap plugin to generate comprehensive XML sitemaps from all your Gatsby pages.
npm install gatsby-plugin-sitemap
// gatsby-config.js
module.exports = {
  siteMetadata: {
    siteUrl: "https://www.yoursite.com",
    title: "Your Site Name",
  },
  plugins: [
    {
      resolve: "gatsby-plugin-sitemap",
      options: {
        query: `{
          allSitePage { nodes { path } }
          allMarkdownRemark { nodes { frontmatter { date } fields { slug } } }
        }`,
        resolveSiteUrl: () => "https://www.yoursite.com",
        resolvePages: ({ allSitePage, allMarkdownRemark }) => {
          // Map markdown slugs to last-modified dates.
          // Note: this assumes your markdown slugs match the generated page paths.
          const mdPages = new Map();
          allMarkdownRemark.nodes.forEach(node => {
            mdPages.set(node.fields.slug, node.frontmatter.date);
          });
          return allSitePage.nodes.map(page => ({
            path: page.path,
            lastmod: mdPages.get(page.path) || new Date().toISOString(),
          }));
        },
        serialize: ({ path, lastmod }) => ({
          url: path,
          lastmod,
          changefreq: "weekly",
          priority: path === "/" ? 1.0 : 0.7,
        }),
      },
    },
  ],
};

Add SEO Component with Schema Markup
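The `serialize` callback is a pure function, so its priority and changefreq logic can be sanity-checked outside a Gatsby build. A minimal sketch (`serializePage` is a hypothetical name, not part of the plugin API):

```javascript
// Pure extraction of the serialize logic above for quick testing.
function serializePage({ path, lastmod }) {
  return {
    url: path,
    lastmod,
    changefreq: "weekly",
    priority: path === "/" ? 1.0 : 0.7, // homepage ranks highest
  };
}

const home = serializePage({ path: "/", lastmod: "2026-01-15" });
const post = serializePage({ path: "/blog/hello/", lastmod: "2026-01-15" });
console.log(home.priority, post.priority);
```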
Create a reusable SEO component that generates meta tags and JSON-LD for every page.
// src/components/seo.tsx
import { useStaticQuery, graphql } from "gatsby";

export function SEO({ title, description, pathname, schema, children }) {
  const { site } = useStaticQuery(graphql`
    query { site { siteMetadata { title description siteUrl } } }
  `);
  const metaTitle = title
    ? `${title} | ${site.siteMetadata.title}`
    : site.siteMetadata.title;
  const metaDesc = description || site.siteMetadata.description;
  return (
    <>
      <title>{metaTitle}</title>
      <meta name="description" content={metaDesc} />
      <meta property="og:title" content={metaTitle} />
      <meta property="og:description" content={metaDesc} />
      <meta property="og:type" content="website" />
      {/* Canonical must point at the specific page, not just the site root */}
      <link rel="canonical" href={`${site.siteMetadata.siteUrl}${pathname || ""}`} />
      {schema && (
        <script type="application/ld+json">
          {JSON.stringify(schema)}
        </script>
      )}
      {children}
    </>
  );
}

// Usage in a page (the Gatsby Head API passes `location` automatically):
export function Head({ location }) {
  return (
    <SEO
      title="Page Title"
      description="Page description"
      pathname={location.pathname}
      schema={{
        "@context": "https://schema.org",
        "@type": "Article",
        headline: "Page Title",
        datePublished: "2026-01-15",
      }}
    />
  );
}

Create llms.txt at Build Time
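Because the `schema` prop is a plain object, the JSON-LD can be built and inspected outside React. A minimal sketch with a hypothetical `buildArticleSchema` helper:

```javascript
// Hypothetical helper that builds the Article JSON-LD object
// passed to the SEO component's schema prop.
function buildArticleSchema({ headline, datePublished, authorName }) {
  const schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    headline,
    datePublished,
  };
  if (authorName) {
    // Author is optional; include it only when known.
    schema.author = { "@type": "Person", name: authorName };
  }
  return schema;
}

const schema = buildArticleSchema({
  headline: "Page Title",
  datePublished: "2026-01-15",
  authorName: "Jane Doe",
});
console.log(JSON.stringify(schema));
```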
Use Gatsby's createPages API or the onPostBuild hook to generate a static llms.txt file during the build process.
// gatsby-node.js
const fs = require("fs");
const path = require("path");

exports.onPostBuild = async ({ graphql }) => {
  const result = await graphql(`
    query {
      site { siteMetadata { title description } }
      allMarkdownRemark(sort: { frontmatter: { date: DESC } }, limit: 20) {
        nodes { frontmatter { title } fields { slug } }
      }
    }
  `);
  const { title, description } = result.data.site.siteMetadata;
  const posts = result.data.allMarkdownRemark.nodes;
  const content = [
    `# ${title}`,
    `> ${description}`, // llms.txt convention: summary as a blockquote
    "",
    "## Recent Articles",
    ...posts.map(p => `- ${p.fields.slug}: ${p.frontmatter.title}`),
    "",
    "## Key Pages",
    "- /: Homepage",
    "- /about: About us",
    "- /blog: All articles",
    "- /contact: Contact information",
  ].join("\n");
  fs.writeFileSync(path.join("public", "llms.txt"), content, "utf-8");
  console.log("Generated llms.txt");
};

Configure robots.txt
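The formatting step in `onPostBuild` can be extracted into a pure function and tested without running a Gatsby build. A sketch with a hypothetical `formatLlmsTxt` helper (slugs and titles are illustrative):

```javascript
// Pure extraction of the llms.txt formatting logic above.
function formatLlmsTxt({ title, description, posts }) {
  return [
    `# ${title}`,
    `> ${description}`, // llms.txt convention: summary as a blockquote
    "",
    "## Recent Articles",
    ...posts.map(p => `- ${p.slug}: ${p.title}`),
  ].join("\n");
}

const txt = formatLlmsTxt({
  title: "Your Site",
  description: "What the site covers",
  posts: [{ slug: "/blog/hello/", title: "Hello World" }],
});
console.log(txt);
```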
Use gatsby-plugin-robots-txt to generate a customized robots.txt with AI crawler rules.
npm install gatsby-plugin-robots-txt
// gatsby-config.js plugins array:
{
  resolve: "gatsby-plugin-robots-txt",
  options: {
    host: "https://www.yoursite.com",
    sitemap: "https://www.yoursite.com/sitemap-index.xml",
    policy: [
      { userAgent: "*", allow: "/" },
      { userAgent: "GPTBot", allow: "/" },
      { userAgent: "ChatGPT-User", allow: "/" },
      { userAgent: "ClaudeBot", allow: "/" },
      { userAgent: "PerplexityBot", allow: "/" },
    ],
  },
},

Add Organization and WebSite Schema Globally
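With the options above, the plugin should emit a robots.txt roughly like the following (exact line ordering can vary by plugin version):

```text
User-agent: *
Allow: /
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
Sitemap: https://www.yoursite.com/sitemap-index.xml
Host: https://www.yoursite.com
```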
Include Organization and WebSite schema on every page using the gatsby-ssr API or your layout component.
// gatsby-ssr.js — inject on every page
const React = require("react"); // needed for JSX in gatsby-ssr.js

exports.onRenderBody = ({ setHeadComponents }) => {
  setHeadComponents([
    <script
      key="org-schema"
      type="application/ld+json"
      dangerouslySetInnerHTML={{
        __html: JSON.stringify({
          "@context": "https://schema.org",
          "@type": "Organization",
          name: "Your Business",
          url: "https://www.yoursite.com",
          logo: "https://www.yoursite.com/logo.png",
          description: "What your business does",
        }),
      }}
    />,
    <script
      key="website-schema"
      type="application/ld+json"
      dangerouslySetInnerHTML={{
        __html: JSON.stringify({
          "@context": "https://schema.org",
          "@type": "WebSite",
          name: "Your Site",
          url: "https://www.yoursite.com",
          potentialAction: {
            "@type": "SearchAction",
            target: "https://www.yoursite.com/search?q={search_term}",
            "query-input": "required name=search_term",
          },
        }),
      }}
    />,
  ]);
};

Ensure Static HTML Output for AI Crawlers
Gatsby pre-renders all pages to static HTML by default, which is ideal for AI crawlers. Verify this is working and that no critical content is client-only.
# Gatsby builds static HTML by default — verify it:
# 1. Build your site:
gatsby build
# 2. Check the output HTML:
head -100 public/index.html
# Verify your content is in the HTML, not loaded via JS
# 3. Serve and test:
gatsby serve
# Visit http://localhost:9000
# View source (Ctrl+U) — all content should be visible

// Common pitfall: useEffect-only content
// BAD: content that only appears after client-side hydration
const [data, setData] = useState(null);
useEffect(() => { fetchData().then(setData); }, []);
// This content won't be in the static HTML!
// GOOD: use Gatsby's data layer (GraphQL) or getServerData
// Content from GraphQL queries IS in the static HTML

Recommended Tools for Gatsby
- gatsby-plugin-sitemap
- gatsby-plugin-robots-txt
- gatsby-plugin-canonical-urls
- Gatsby Head API
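The plugins above can be wired together in one config. An illustrative sketch of a minimal gatsby-config.js (URLs are placeholders, options are the common defaults):

```javascript
// gatsby-config.js — minimal sketch combining the recommended plugins.
const config = {
  siteMetadata: { siteUrl: "https://www.yoursite.com" },
  plugins: [
    "gatsby-plugin-sitemap",
    {
      resolve: "gatsby-plugin-robots-txt",
      options: { host: "https://www.yoursite.com" },
    },
    {
      resolve: "gatsby-plugin-canonical-urls",
      options: { siteUrl: "https://www.yoursite.com" },
    },
  ],
};

module.exports = config;
```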
Frequently Asked Questions
Does Gatsby pre-render content for AI crawlers?
Yes, Gatsby generates static HTML at build time for every page. AI crawlers receive fully-rendered HTML without needing to execute JavaScript. This gives Gatsby an inherent advantage for AI readiness compared to client-rendered React apps.
How do I generate llms.txt in Gatsby?
Use the onPostBuild hook in gatsby-node.js to generate a static llms.txt file in the public/ directory during the build process. You can query your content via GraphQL and format it as plain text.
Which Gatsby plugins are essential for AI readiness?
The key plugins are: gatsby-plugin-sitemap (XML sitemaps), gatsby-plugin-robots-txt (robots.txt), gatsby-plugin-canonical-urls (canonical URLs), and gatsby-plugin-react-helmet or the built-in Head API (meta tags). Most structured data should be added via custom components.