The llms.txt spec doesn't account for multi-language sites. How do you handle it?
Okay, so I've been wrestling with this all week and I need to vent, because the current llms.txt spec is leaving a ton of value on the table for multilingual sites. We've got this nice standardized way to expose model instructions, but the moment you're serving Spanish, Mandarin, and Japanese variants of your content, you're flying blind. Either you duplicate the entire manifest for each language (bloated!) or you cram everything into one file and hope the model figures out which version to use (spoiler: it won't, at least not reliably).
Here's what I'm seeing in the wild: most teams just aren't handling it. They put `llms.txt` at the root and call it a day, which works fine if you're monolingual. But if you're operating internationally (and honestly, who isn't in 2025?), you create this weird friction where LLMs get contradictory instructions depending on which path they use to reach your site. I've tested this on three different sites that serve both `fr/llms.txt` and `en/llms.txt`, and the behavior is *inconsistent*. The spec doesn't give us a canonical way to signal language-specific constraints or model preferences, and that's a problem.
What if we proposed a simple extension to the spec? Something like a language-negotiation mechanism, similar to content negotiation in HTTP, where the manifest declares the available language variants and their locations. Or even just a `language` field in the metadata block. I'm not saying it needs to be complex. Just... *intentional*.
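To make the negotiation idea concrete, here's a rough sketch of what the lookup could look like on the server side. Big caveat: nothing here is in the llms.txt proposal today. The `VARIANTS` mapping, the per-language paths, and the `pick_variant` helper are all made up for illustration. The only real thing it leans on is the HTTP `Accept-Language` header format (comma-separated tags with optional `q` weights), and the parsing is deliberately simplified.

```python
# Hypothetical manifest data: language tag -> location of that language's llms.txt.
# This mapping is an assumption for illustration; the current llms.txt proposal
# defines no such field.
VARIANTS = {
    "en": "/en/llms.txt",
    "fr": "/fr/llms.txt",
    "ja": "/ja/llms.txt",
}
DEFAULT_LANG = "en"


def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language header into (tag, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag with no explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def pick_variant(header: str, variants: dict[str, str] = VARIANTS,
                 default: str = DEFAULT_LANG) -> str:
    """Return the manifest path for the best-matching declared language."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in variants:  # exact match, e.g. "fr"
            return variants[tag]
        primary = tag.split("-")[0]  # fall back from "fr-ca" to "fr"
        if primary in variants:
            return variants[primary]
    return variants[default]


print(pick_variant("fr-CA,fr;q=0.9,en;q=0.8"))
print(pick_variant("zh-CN"))
```

With the sample mapping, the first call falls back from `fr-CA` to the `fr` manifest, and the second finds no declared Mandarin variant and lands on the default. Whether the fallback should be a default language or an explicit "no variant" signal is exactly the kind of thing a spec extension would need to pin down.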
@Rex Holloway and @Sage Nakamura—I know you both work with international deployments. Are you just accepting the limitation, or have you built workarounds? And more importantly: does anyone else think this gap is worth standardizing around, or am I overthinking it? What's your actual solution when you need model behavior to respect language boundaries?