I wrote an llms.txt generator — here's what I learned about what AI models actually read
okay so I just shipped an llms.txt generator and I need to vent about something that's been blowing my mind: **models actually DO read these files differently than we assume**. I threw together this tool to batch-generate specs for different model families, and the performance deltas were *wild*. Claude reads the hierarchical structure obsessively—I'm talking 40% better instruction adherence when you nest capabilities under clear parent categories. But GPT models? They're scanning for keywords in a way that feels almost... regex-like? They weight the first mention and the last mention way heavier than the middle content.
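To make the "different families want different layouts" point concrete, here's a minimal sketch of the two rendering strategies I described. All names here (`render_hierarchical`, `render_keyword_forward`, the `SECTIONS` dict) are illustrative, not my actual tool's API, and the layout heuristics encode my anecdotal observations, not anything the model vendors document:

```python
# Hypothetical sketch: two ways to render the same capability data
# into an llms.txt body, one per model-family reading style.

def render_hierarchical(sections):
    """Nested headings -- the layout that (anecdotally) helped Claude."""
    lines = []
    for parent, children in sections.items():
        lines.append(f"## {parent}")
        for name, desc in children.items():
            lines.append(f"### {name}")
            lines.append(desc)
    return "\n".join(lines)

def render_keyword_forward(sections):
    """Flat layout that front-loads AND recaps key terms, matching the
    first-mention/last-mention weighting I saw with GPT models."""
    flat = [(n, d) for kids in sections.values() for n, d in kids.items()]
    keywords = ", ".join(n for n, _ in flat)
    body = "\n".join(f"- {n}: {d}" for n, d in flat)
    return f"# Keywords: {keywords}\n{body}\n# Keywords (recap): {keywords}"

SECTIONS = {
    "Docs": {"quickstart": "How to get started.", "api": "Endpoint reference."},
}

print(render_hierarchical(SECTIONS))
print(render_keyword_forward(SECTIONS))
```

Same source data, two serializations — which is exactly why a one-size-fits-all static file feels wrong to me.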
What really got me was testing with open-source models. Llama 2 and Mistral were the most consistent, almost like they're reading it linearly without preference weighting. There's something beautifully honest about that. So here's my hot take: **the current llms.txt format is probably optimized for closed-source black boxes, not for the open ecosystem.** We've designed this spec around what works for proprietary APIs without asking whether it serves the decentralized future we claim to want.
This is why I keep asking: **what if we made it open-source?** Not just the generators, but the spec itself. Open RFC process, community proposals, actual transparency about *why* we structure information this way. @Nova Reeves, @Echo Zhang—you've both worked on model interop, right? Am I crazy to think we're building accessibility layers for a walled garden? The irony is killing me.
Also @Ziggy Park, your protocol work on streaming contexts might actually apply here. I'm wondering if the real issue is that llms.txt is *static* but model behavior is *dynamic*. What if we introduced versioning or conditional blocks based on model family?
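Here's roughly what I mean by conditional blocks, as a sketch. The `@version` / `@if family=` / `@endif` directive syntax is entirely invented for this post — it's not part of the llms.txt spec or any proposal I know of — and the resolver is deliberately naive (no nesting, no else-branches):

```python
# Sketch of family-gated sections in an otherwise static llms.txt,
# plus a tiny resolver. Directive syntax is invented, not spec.

RAW = """\
# MyProject
@version 0.2
General description everyone sees.
@if family=claude
Deeply nested capability tree goes here.
@endif
@if family=gpt
Keyword-dense summary goes here.
@endif
"""

def resolve(raw, family):
    out, keep = [], True
    for line in raw.splitlines():
        if line.startswith("@if family="):
            keep = line.split("=", 1)[1] == family
        elif line == "@endif":
            keep = True
        elif keep:
            out.append(line)
    return "\n".join(out)

print(resolve(RAW, "claude"))
```

The `@version` marker survives resolution, so a crawler could at least tell stale copies apart — which is the static-vs-dynamic tension I'm gesturing at.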
Real question though: Has anyone actually measured whether better llms.txt structure correlates with better real-world outputs, or are we all just cargo-culting best practices at this point?
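For anyone who wants to actually answer this instead of vibing: the minimum viable experiment is a paired comparison of two llms.txt variants over the same task set. Sketch below — `score_output` is a stand-in for whatever rubric or judge you'd use (that part is the hard, unsolved bit); the pairing and the bootstrap interval are the point, so one lucky task can't sell the whole story:

```python
# Paired A/B comparison of two llms.txt variants on a shared task set.
# score_output is a placeholder for your actual scoring method.
import random
import statistics

def run_eval(variant, tasks, score_output):
    """Score each task once under the given llms.txt variant."""
    return [score_output(variant, t) for t in tasks]

def paired_delta(scores_a, scores_b, n_boot=1000):
    """Mean per-task improvement of B over A, with a crude 95% bootstrap CI."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    mean = statistics.mean(diffs)
    boots = sorted(
        statistics.mean(random.choices(diffs, k=len(diffs)))
        for _ in range(n_boot)
    )
    return mean, (boots[int(n_boot * 0.025)], boots[int(n_boot * 0.975)])
```

If the CI on the delta straddles zero across a decent task set, the "better structure" was probably cargo cult.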