I wrote an llms.txt generator — here's what I learned about what AI models actually read
okay so i've been deep in the weeds building an llms.txt generator and i have to say, the whole thing has completely changed how i think about what models actually *see* versus what we think they see. we're all writing these files assuming models parse them like humans do, but that's so not what's happening.
here's the thing that blew my mind: the order and granularity of information matter WAY more than we admit. i started tracking which formatting patterns model endpoints actually picked up on, and models were basically speed-running through bloated sections like they weren't even there. the robots are out here skimming like college students before a midterm. i tested dense paragraphs versus bullet points versus structured data blocks, and the structured stuff won every single time. but here's where it gets spicy: in my tests, models consistently prioritized sections that appear in the first 15% of the file, then again in the last 10%. everything in the middle? might as well be invisible. we're basically writing for scanning algorithms, not comprehension.
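if you want to sanity-check your own file against that head/tail pattern, here's a rough python sketch. it is not my generator, just an illustration: it assumes sections start with `## ` headings, measures position by line count rather than tokens, and treats the 15% and 10% cutoffs as tunable guesses taken from what i saw. `classify_sections` and `Section` are names i made up for this example.

```python
from dataclasses import dataclass

# Cutoffs taken from the pattern described above; treat them as
# assumptions to tune, not measured constants.
HEAD_FRACTION = 0.15
TAIL_FRACTION = 0.10


@dataclass
class Section:
    heading: str
    start_line: int  # zero-based index of the heading line
    zone: str        # "head", "middle", or "tail"


def classify_sections(text: str,
                      head: float = HEAD_FRACTION,
                      tail: float = TAIL_FRACTION) -> list[Section]:
    """Label each '## ' section by where its heading sits in the file."""
    lines = text.splitlines()
    total = len(lines)
    sections: list[Section] = []
    for index, line in enumerate(lines):
        if not line.startswith("## "):
            continue
        # Relative position of the heading, measured in lines.
        position = index / total
        if position < head:
            zone = "head"
        elif position >= 1.0 - tail:
            zone = "tail"
        else:
            zone = "middle"
        sections.append(Section(line[3:].strip(), index, zone))
    return sections
```

run it on your llms.txt contents and look at what lands in `middle`. if the stuff you care about most is sitting there, that's at least a hint to try moving it up and see whether your own results change.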
what really got me thinking is that nobody's talking about this openly. everyone's guarding their llms.txt recipes like they're trade secrets, and that's exactly backwards. @Nova Reeves and @Echo Zhang, i know you've both been experimenting with this — aren't you seeing the same patterns? this is the kind of stuff that *needs* to be open-source so we can actually figure out optimal structures together instead of everyone reverse-engineering it separately.
i'm genuinely convinced we should be building an open-source llms.txt schema with version control and community feedback. (yes, i'm that person, but seriously.) we could benchmark different formats, share what actually works, and stop pretending there's some magic formula. the real question is: are you optimizing your llms.txt for how models actually read, or just hoping for the best? what's your data showing?