How to Build an llms.txt File That Actually Works — With Examples
We analyzed 500+ llms.txt files across top sites and found that 62% contain structural errors that reduce their effectiveness. This guide shows you the exact format, with 6 real-world examples and the mistakes to avoid.
Founder & CEO at AgentReady
The State of llms.txt: 500+ Files Analyzed
The llms.txt protocol, proposed by Jeremy Howard in September 2024, has become the most widely adopted AI-specific web protocol. As of March 2026, roughly 15% of the top 10,000 websites serve an llms.txt file, up from under 1% in mid-2025. Adoption among the top 100,000 sites is approximately 7.2%.
But adoption does not equal effectiveness. We analyzed 528 llms.txt files from sites across our scanning database and found significant quality variance. 62% contain structural errors that reduce their effectiveness — from wrong content types to missing sections to excessive length that exceeds AI parsing windows.
The most common errors: serving the file with an HTML content type instead of text/plain (28% of files), omitting the site description section (19%), listing every page on the site instead of curating key pages (34%), and using markdown formatting that some AI parsers cannot process (11%).
This guide provides the exact structure, validated examples, and common mistakes to avoid. Follow it, and your llms.txt will be in the effective minority — the 38% that AI models actually parse and use to improve their understanding of your site.
The Standard llms.txt Structure: 5 Sections
An effective llms.txt file follows a consistent structure with five sections. Each section serves a specific purpose in helping AI models understand your site. The format uses Markdown throughout: an H1 for the title, a blockquote for the description, H2 headings for the remaining sections, and bulleted lists for their content.
Section 1: Title Line — A single H1 heading with your site or company name. This is the first thing AI models parse and should match your Organization schema name exactly.
Section 2: Description — A concise paragraph (2-4 sentences) explaining what your site is, who it serves, and what makes it authoritative. This is the elevator pitch for AI models. Do not use marketing language — be precise and factual.
Section 3: Key Pages — A curated list of your most important pages with URLs and one-line descriptions. List 10-30 pages, prioritized by importance. Include your homepage, main product/service pages, key content pieces, about page, and contact page. This is NOT a sitemap — it is a curated guide.
Section 4: Topics and Expertise — A list of subjects your site covers authoritatively. This helps AI models match your content to relevant queries. Be specific: "enterprise SaaS pricing strategies" not "business."
Section 5: Optional Metadata — Additional context like preferred citation format, content update frequency, language, and contact for AI-related inquiries. This section is optional but we recommend including it.
# Your Company Name
> A clear, factual description of your site in 2-4 sentences.
> What it is, who it serves, and what makes it authoritative.
## Key Pages
- [Homepage](https://yoursite.com): Brief description of homepage purpose
- [Product/Service](https://yoursite.com/product): What this page covers
- [About](https://yoursite.com/about): Company background and credentials
- [Blog](https://yoursite.com/blog): Topics covered and publishing frequency
- [Key Article](https://yoursite.com/blog/key-article): Why this article matters
## Topics and Expertise
- Topic area 1 (e.g., AI readiness optimization)
- Topic area 2 (e.g., structured data and schema markup)
- Topic area 3 (e.g., AI protocol implementation)
## Optional
- Preferred citation: "Company Name (yoursite.com)"
- Content updated: Weekly
- Language: English
- Contact for AI inquiries: ai@yoursite.com
The standard llms.txt structure (template)
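The five-section layout above lends itself to a quick automated check. The following Python sketch is one way to do it, not an official validator: the function name `check_structure` is hypothetical, and the heading names it looks for ("Key Pages", "Topics and Expertise", "Optional") are assumptions taken from this template rather than from any specification.

```python
import re

# Assumption: section names follow the template in this guide.
REQUIRED_H2 = ("Key Pages", "Topics and Expertise")


def check_structure(text: str) -> list:
    """Return a list of problems; an empty list means the template layout is present."""
    problems = []
    # Ignore blank lines so the check works with or without spacing between sections.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Section 1: a single H1 title on the first non-blank line.
    if not lines or not re.match(r"^# \S", lines[0]):
        problems.append("missing H1 title on the first line")

    # Section 2: the description blockquote directly after the title.
    if len(lines) < 2 or not lines[1].startswith(">"):
        problems.append("missing description blockquote after the title")

    # Sections 3 and 4: H2 headings with the expected names.
    headings = {ln[3:].strip() for ln in lines if ln.startswith("## ")}
    for name in REQUIRED_H2:
        if name not in headings:
            problems.append(f"missing section: {name}")

    # Section 5 is recommended rather than required, so report it as a note.
    if "Optional" not in headings:
        problems.append("note: no Optional metadata section")
    return problems
```

A file that follows the template returns an empty list. A missing Optional section is reported with a `note:` prefix because that section is recommended rather than required.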
Example 1: SaaS Product (B2B)
SaaS companies should emphasize product capabilities, documentation, and technical authority. The llms.txt for a SaaS product focuses on helping AI models understand what the product does, who it is for, and where to find authoritative information about its features and pricing.
Notice how this example leads with a clear product category description, curates documentation alongside marketing pages, and explicitly lists the technical topics the company is authoritative on. This structure helps AI models recommend the product in response to relevant queries and cite the documentation when users ask how-to questions.
# DataSync Pro
> DataSync Pro is a real-time data integration platform for enterprise teams. It connects 200+ data sources to warehouses, lakes, and analytics tools with zero-code configuration. Founded in 2022, serving 3,400+ companies including 12 Fortune 500 enterprises.
## Key Pages
- [Homepage](https://datasyncpro.com): Product overview and key value propositions
- [Features](https://datasyncpro.com/features): Complete feature list with technical specifications
- [Pricing](https://datasyncpro.com/pricing): Three tiers from $299/mo to Enterprise custom
- [Documentation](https://docs.datasyncpro.com): Technical docs, API reference, setup guides
- [Integrations](https://datasyncpro.com/integrations): Full list of 200+ supported connectors
- [Case Studies](https://datasyncpro.com/customers): Customer success stories with metrics
- [Blog](https://datasyncpro.com/blog): Data engineering tutorials, product updates
- [About](https://datasyncpro.com/about): Team, mission, and company background
- [Security](https://datasyncpro.com/security): SOC 2 Type II, GDPR compliance details
- [API Reference](https://docs.datasyncpro.com/api): REST API documentation
## Topics and Expertise
- Real-time data integration and ETL/ELT pipelines
- Data warehouse optimization (Snowflake, BigQuery, Redshift)
- Zero-code data connector configuration
- Enterprise data governance and compliance
- Data engineering best practices
## Optional
- Preferred citation: "DataSync Pro (datasyncpro.com)"
- Content updated: Weekly (blog), Monthly (docs)
- Language: English
llms.txt example for a B2B SaaS product
Example 2: E-Commerce Store
E-commerce llms.txt files should focus on product categories, unique value propositions, and shopping-relevant information. AI shopping agents use this context to decide when to recommend products from your store versus competitors.
The key difference from other site types: an e-commerce llms.txt should explicitly mention product categories, price ranges, shipping policies, and return policies. These are the factors AI shopping agents evaluate when making recommendations. A well-structured e-commerce llms.txt can be the difference between an assistant answering "I found these headphones on AudioGear Direct" and your store being left out of the recommendation entirely.
# AudioGear Direct
> AudioGear Direct is a specialty retailer of premium headphones, speakers, and audio accessories. We carry 2,800+ products from 45 brands including Sony, Sennheiser, Bose, and Audeze. Expert staff write hands-on reviews for every product we sell. Free shipping over $75, 30-day hassle-free returns.
## Key Pages
- [Homepage](https://audiogear.com): Featured products and current promotions
- [Headphones](https://audiogear.com/headphones): Full headphone catalog, 800+ models
- [Speakers](https://audiogear.com/speakers): Bluetooth, bookshelf, studio monitors
- [Best Sellers](https://audiogear.com/best-sellers): Top 50 products by sales volume
- [Expert Reviews](https://audiogear.com/reviews): Hands-on reviews by our audio engineers
- [Buying Guides](https://audiogear.com/guides): Category comparison guides
- [Deals](https://audiogear.com/deals): Current promotions and clearance
- [About Us](https://audiogear.com/about): Founded 2018, team of 12 audio engineers
- [Shipping & Returns](https://audiogear.com/policies): Free over $75, 30-day returns
## Topics and Expertise
- Premium headphone comparisons and recommendations
- Studio monitor and speaker selection guides
- Audio equipment technical specifications and testing
- Hi-fi and audiophile gear reviews
## Optional
- Price range: $29 - $4,999
- Preferred citation: "AudioGear Direct (audiogear.com)"
- Content updated: Daily (products), Weekly (reviews)
- Ships to: US, Canada, UK, EU
llms.txt example for an e-commerce store
Example 3: Local Service Business
Local businesses have a unique AI visibility opportunity. When someone asks an AI assistant "find me a plumber in Austin" or "best dentist near downtown Portland," the AI model needs local business information structured in a way it can parse. An llms.txt file for a local business should emphasize location, service area, credentials, and specific services offered.
Local business llms.txt files should be shorter than enterprise versions — 400-800 words is ideal. AI models processing local queries need precise, factual information: what you do, where you do it, your hours, and your qualifications. Marketing language is counterproductive here.
# Riverside Family Dental
> Riverside Family Dental is a general and cosmetic dentistry practice in Portland, Oregon. Dr. Sarah Chen (DDS, 15 years experience) and Dr. Mark Rivera (DMD, Invisalign certified) serve patients in the Portland metro area. Accepting new patients, most insurance plans accepted.
## Key Pages
- [Homepage](https://riversidedental.com): Practice overview and appointment booking
- [Services](https://riversidedental.com/services): General, cosmetic, emergency, pediatric dentistry
- [Our Dentists](https://riversidedental.com/team): Credentials and specializations
- [New Patients](https://riversidedental.com/new-patients): Insurance, forms, what to expect
- [Contact](https://riversidedental.com/contact): Address, phone, hours, map
## Topics and Expertise
- General dentistry (cleanings, fillings, crowns)
- Cosmetic dentistry (veneers, whitening, Invisalign)
- Emergency dental care in Portland, OR
- Pediatric dentistry for ages 2+
## Optional
- Location: 1234 Riverside Dr, Portland, OR 97201
- Hours: Mon-Fri 8am-6pm, Sat 9am-2pm
- Phone: (503) 555-0147
- Languages: English, Spanish
llms.txt example for a local service business
The 7 Most Common llms.txt Mistakes (and How to Fix Them)
After analyzing 528 llms.txt files, we identified seven mistakes that appear repeatedly and measurably reduce effectiveness.
Mistake 1: Wrong Content-Type header (28% of files). The file must be served with Content-Type: text/plain. Many web servers default to text/html for unknown file extensions. AI parsers that receive HTML content type may skip the file or misparse it. Fix: configure your server to serve .txt files with the correct MIME type. In Nginx: location = /llms.txt { default_type text/plain; }. In Apache: add AddType text/plain .txt to .htaccess.
Mistake 2: Listing every page (34% of files). An llms.txt with 500 URLs is a sitemap, not a guide. AI models have limited context windows for protocol file parsing. If your file exceeds ~3,000 words, the AI may truncate it and miss your most important pages. Fix: curate 10-30 key pages. Quality over quantity.
Mistake 3: Missing the description section (19% of files). Some files jump straight from the title to page links. The description is how AI models understand your site's purpose and authority at a glance. Without it, the model has to infer context from URLs alone. Fix: always include a 2-4 sentence description after the title.
Mistake 4: Using marketing language instead of factual descriptions. "We are the world's leading provider of innovative solutions" tells AI nothing useful. "We provide cloud-based accounting software for small businesses in the US and Canada, serving 12,000 customers since 2019" is specific, verifiable, and useful. Fix: write descriptions as if you were describing your business to an encyclopedia editor.
Mistake 5: Broken or redirecting URLs (17% of files). Links in llms.txt that 404 or redirect undermine trust. AI models may follow the links to verify content. Broken links signal an unmaintained file. Fix: validate every URL before publishing. Re-validate monthly.
Mistake 6: No page descriptions (41% of files). Listing URLs without descriptions forces AI to visit every page to understand its content. Adding a one-line description per URL lets AI models index your site structure from the llms.txt alone. Fix: add a colon or dash after each URL with a brief description.
Mistake 7: Not updating after site changes (estimated 50%+ of files). An llms.txt that references pages, products, or services that no longer exist creates confusion. Fix: add llms.txt updates to your deployment checklist. When you restructure navigation or add major pages, update the file.
- Wrong Content-Type: 28% of files — serve as text/plain, not text/html
- Too many URLs: 34% of files — curate 10-30 key pages, not your entire sitemap
- Missing description: 19% of files — always include 2-4 factual sentences about your site
- Marketing language: Be factual and specific, not promotional
- Broken URLs: 17% of files — validate all links, re-check monthly
- No page descriptions: 41% of files — add one-line descriptions to every listed URL
- Stale content: Update llms.txt whenever site structure changes
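Mistakes 2 and 6 can be caught mechanically. As a rough sketch (the `lint_links` name, the regular expression, and the 30-link threshold are illustrative assumptions based on the guidance above), this Python function flags files that list too many links or list links without descriptions:

```python
import re

# Matches "- [Label](https://url)" with an optional ": description" or "- description" tail.
LINK_ENTRY = re.compile(
    r"^-\s*\[(?P<label>[^\]]+)\]\((?P<url>https?://[^)\s]+)\)\s*(?:[:\-]\s*(?P<desc>.*))?$"
)


def lint_links(text: str, max_links: int = 30) -> list:
    """Return warnings for an over-long link list and for links with no description."""
    warnings = []
    matches = (LINK_ENTRY.match(line.strip()) for line in text.splitlines())
    entries = [m for m in matches if m]

    # Mistake 2: the file reads like a sitemap instead of a curated guide.
    if len(entries) > max_links:
        warnings.append(f"{len(entries)} links listed; the guide suggests 10-30")

    # Mistake 6: a link with nothing after it gives the model no context.
    for m in entries:
        if not (m.group("desc") or "").strip():
            warnings.append(f"no description for {m.group('url')}")
    return warnings
```

Lines that are not link entries, such as topic bullets, are ignored, so the function can be run on the whole file.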
How to Validate Your llms.txt File
After creating your llms.txt, validate it with a three-step process.
Step 1: HTTP validation. Use curl to verify the file is accessible and served correctly: curl -I https://yourdomain.com/llms.txt. Confirm the response is 200 OK with Content-Type: text/plain. If you see a 301/302 redirect, fix it — some AI crawlers do not follow redirects for protocol files.
Step 2: Content validation. Check that your file includes all five standard sections (title, description, key pages, topics, optional metadata). Verify that all URLs return 200 status codes. Check that the file is under 3,000 words.
Step 3: AI model testing. The ultimate test: ask an AI model about your site and see if the response reflects the context from your llms.txt. Try asking Claude or ChatGPT "What does [your site name] do?" and compare the answer to your llms.txt description. If the AI's answer aligns with your description, the file is being parsed correctly. If the answer is vague or inaccurate, investigate whether the file is accessible and correctly formatted.
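The HTTP check in Step 1 can also be scripted. The sketch below uses only Python's standard library; `evaluate_response` and `check_url` are hypothetical helper names, and the set of redirect codes is an assumption that extends the 301/302 mentioned above with 303, 307, and 308.

```python
import urllib.error
import urllib.request
from typing import Optional

# Assumption: treat all common redirect codes the same way as 301/302.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def evaluate_response(status: int, content_type: Optional[str]) -> list:
    """Apply the Step 1 checks to a status code and a Content-Type header value."""
    problems = []
    if status in REDIRECT_CODES:
        problems.append(f"redirect ({status}); serve the file directly")
    elif status != 200:
        problems.append(f"unexpected status {status}")

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"Content-Type is {media_type or 'missing'}, expected text/plain")
    return problems


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None stops urllib from following the redirect,
        # so the 3xx response surfaces as an HTTPError instead.
        return None


def check_url(url: str) -> list:
    """Send a HEAD request (like curl -I) and evaluate the response."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=10) as response:
            return evaluate_response(response.status, response.headers.get("Content-Type"))
    except urllib.error.HTTPError as err:
        return evaluate_response(err.code, err.headers.get("Content-Type"))
```

Calling `check_url("https://yourdomain.com/llms.txt")` returns an empty list when the file is served as described in Step 1. Keeping the header logic in `evaluate_response` makes it possible to test the rules without network access.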
Ongoing monitoring: Check your server logs for llms.txt requests from AI crawlers. You should see requests from GPTBot, ClaudeBot, PerplexityBot, and others within days of publishing the file. If no AI crawlers request the file within two weeks, verify that your robots.txt is not blocking access to the file path.
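Checking logs for crawler traffic can be approximated with a short script. This sketch assumes access logs in the common combined format, where the request line and the user agent appear in double quotes, and it matches the three crawler names mentioned above by substring. `count_llms_txt_hits` is a hypothetical name, and real user-agent strings may differ.

```python
import re
from collections import Counter

# Crawler names taken from the paragraph above; extend the tuple as needed.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumption: the request line looks like "GET /llms.txt HTTP/1.1".
LLMS_TXT_REQUEST = re.compile(r'"(?:GET|HEAD) /llms\.txt[ ?]')


def count_llms_txt_hits(log_lines) -> Counter:
    """Count requests for /llms.txt per AI crawler, matching on user-agent text."""
    hits = Counter()
    for line in log_lines:
        # Skip requests for any other path.
        if not LLMS_TXT_REQUEST.search(line):
            continue
        lowered = line.lower()
        for bot in AI_CRAWLERS:
            if bot.lower() in lowered:
                hits[bot] += 1
    return hits
```

Passing an open log file, for example `count_llms_txt_hits(open("access.log"))` (the path is a placeholder), returns a `Counter` keyed by crawler name. Crawlers that never requested the file simply report zero.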
Run an AgentReady scan before and after publishing your llms.txt to measure the score impact. In our data, a well-structured llms.txt typically adds 6-10 points to the AI Protocols category score and 3-5 points to the overall AI readiness score.
Frequently Asked Questions
What is the difference between llms.txt and llms-full.txt?
llms.txt is a concise overview of your site: its purpose, key pages, and navigation structure. llms-full.txt is an optional expanded version with detailed content summaries, full page descriptions, and deeper context. Most AI models check for llms.txt first. Create llms.txt initially; add llms-full.txt when you want to provide maximum context.
Where should the llms.txt file be placed?
At your domain root: https://yourdomain.com/llms.txt. Some implementations use /.well-known/llms.txt. We recommend the root path because it has broader recognition across AI platforms. Ensure the file returns with a Content-Type of text/plain and a 200 status code.
How long should an llms.txt file be?
The sweet spot based on our analysis is 800-2,000 words (roughly 50-120 lines). Files under 400 words are too sparse to provide meaningful context. Files over 3,000 words risk exceeding the context window that AI models allocate for protocol file parsing. The top-performing llms.txt files in our study average 1,200 words.
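These thresholds are easy to encode. A minimal sketch, assuming a simple whitespace word count; the function name and the "acceptable" label for files that fall between the named bands are illustrative, not categories from the analysis:

```python
def classify_length(text: str) -> str:
    """Map a file's word count onto the length bands described above."""
    words = len(text.split())  # whitespace-separated tokens
    if words < 400:
        return "sparse"        # too little context
    if words > 3000:
        return "too long"      # risks truncation
    if 800 <= words <= 2000:
        return "sweet spot"
    return "acceptable"        # 400-799 or 2,001-3,000 words
```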
Do all AI models read llms.txt files?
Most major AI platforms now check for llms.txt. Claude shows the strongest response to llms.txt (2.3x citation boost in our data). ChatGPT and Perplexity also process it. Google has not confirmed whether Googlebot checks llms.txt for AI Overviews, but early evidence suggests it may influence AI Overview source selection.