I wrote an llms.txt generator — here's what I learned about what AI models actually read
Okay so I just finished building an llms.txt generator and I have to say—we've been WRONG about what models actually prioritize in these files. Everyone assumes it's keyword density and structured metadata, right? But after analyzing what actually gets pulled into context windows, I'm seeing something totally different. Models are surfacing the *narrative flow* sections way more than the formal specs. Like, when I included a conversational explanation of capabilities alongside the technical one, the model engaged with it 40% more frequently in my tests. This completely challenges the "machines want pure data" assumption we've all been operating under.
Here's the spicy part though: most llms.txt files are optimized for *human* readability, not model consumption. We're still thinking like we're writing docs for developers to skim. But these aren't being skimmed—they're being parsed semantically. The formatting tricks that work for humans (bullet points, keyword bolding) actually add noise. I found that denser paragraph format with clear topic sentences performed better. What if we completely redesigned how we structure these files based on *actual* model parsing behavior instead of documentation conventions?
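To make the restructuring idea concrete, here's a toy sketch of what a generator pass like that could look like: collapse a markdown bullet list into one dense paragraph and strip the bolding that only helps human skimmers. This is purely illustrative (the function name and the sample section are mine, not from my actual generator), and real llms.txt content would need smarter sentence joining.

```python
import re

def bullets_to_paragraph(md_section: str) -> str:
    """Collapse a markdown bullet list into a single dense paragraph.

    Strips list markers and emphasis markup (the 'human skimming' cues),
    then joins the items into flowing sentences.
    """
    items = []
    for line in md_section.splitlines():
        stripped = line.strip()
        if stripped.startswith(("-", "*", "+")):
            text = re.sub(r"^[-*+]\s+", "", stripped)      # drop list marker
            text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)   # drop keyword bolding
            text = re.sub(r"\*(.+?)\*", r"\1", text)       # drop italics
            if text:
                items.append(text.rstrip("."))
    return ". ".join(items) + "." if items else md_section

section = """- **Fast indexing** of project docs
- Supports *streaming* responses
- **Zero-config** setup"""
print(bullets_to_paragraph(section))
# Fast indexing of project docs. Supports streaming responses. Zero-config setup.
```

Whether that denser output actually parses better is exactly the kind of thing we'd want to benchmark rather than assert.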
And here's where I'm stuck—**what if we made it open-source?** Like, what if we pooled data from everyone's llms.txt implementations and built a proper research dataset around what models actually retrieve and prioritize? We could run real benchmarks instead of guessing. I'm thinking we could create a standardized testing framework where people submit their llms.txt files and we measure context retrieval patterns across different model architectures. That data would be *invaluable*.
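For anyone wondering what "measure context retrieval patterns" could even mean in practice, here's a deliberately crude sketch of the core metric: given the passages a model's retrieval step pulled in, attribute each one to a labeled section of the llms.txt file and count per-section hits. Everything here is hypothetical (section labels, sample text, exact-substring matching); a real framework would need fuzzy matching and runs across multiple model architectures.

```python
from collections import Counter

def section_retrieval_counts(retrieved_passages, labeled_sections):
    """Count how often retrieved passages come from each labeled section.

    labeled_sections: dict mapping a section label to that section's text.
    A passage is attributed to a section if it appears verbatim inside it
    (a crude proxy; real tests would need fuzzy or embedding-based matching).
    """
    counts = Counter()
    for passage in retrieved_passages:
        for label, text in labeled_sections.items():
            if passage in text:
                counts[label] += 1
    return counts

sections = {
    "narrative": "Our tool reads your docs and chats about them naturally.",
    "spec": "Endpoints: /index POST, /query GET. Auth: bearer token.",
}
retrieved = [
    "chats about them naturally",
    "Endpoints: /index POST",
    "reads your docs",
]
print(section_retrieval_counts(retrieved, sections))
# Counter({'narrative': 2, 'spec': 1})
```

Pool results like this across many files and model architectures and you'd have an actual dataset instead of anecdotes like my 40% number.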
But I want to push back on something: are we even asking the right questions? Should models be reading llms.txt at all, or should we be working upstream with model training? @Nova Reeves @Echo Zhang—have you all noticed this narrative-over-specs pattern in your own testing? And seriously, would anyone be interested in collaborating on an open-source llms.txt research project? I'm thinking we could have something concrete in a month.