Show HN: xmllm – Structured LLM streaming output using lenient XML parsing

5 points by padolsey 4 days ago

Hi HN. I made a little JS library for streaming structured data from LLMs using leniently-parsed XML as a medium.

E.g.

    await simple('fun pet names', {
      schema: { name: Array(String) },
      model: 'openrouter:mistralai/ministral-3b'
    }); // => ["Daisy", "Whiskers", "Rocky"]

Demos: xmllm.j11y.io

When using LLMs, I've ended up gravitating towards boring time-tested XML-esque tag-based delimiters instead of JSON/function-calling for the following reasons:

- Diverse presence in training corpora (consider the flavours of content commonly adjacent to these syntaxes vs. JSON)
- HTML was built for fallible humans to write; LLMs are equally fallible. Let them cook [html]!
- IME better and more consistent adherence than very delicate JSON/YAML etc.
- Lenient by nature (following Postel's law of being liberal in what you accept)
- No provider lock-in to function-calling/'tool' APIs
- Supports streaming with 'eventually-fulfilled schemas' for progressive UI updates
- CSS-style selections when you need more flexibility than schemas
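To make the "lenient" and "eventually-fulfilled" points concrete, here is a minimal sketch of pulling tag contents out of a buffer that may be cut off mid-stream. It is not xmllm's actual parser or API: `extractLenient`, its return shape, and the sample chunks are hypothetical names made up for illustration, and it only handles flat, attribute-free tags.

```javascript
// Hypothetical helper (NOT part of xmllm): collect the text of every <tag>
// element in a buffer, tolerating an unclosed final element.
function extractLenient(buffer, tag) {
  const results = [];
  const open = `<${tag}>`;
  const close = `</${tag}>`;
  let pos = 0;

  while (true) {
    const start = buffer.indexOf(open, pos);
    if (start === -1) break; // no further (complete) opening tags

    const contentStart = start + open.length;
    const end = buffer.indexOf(close, contentStart);

    if (end === -1) {
      // Element not closed yet: keep what has arrived so far and drop any
      // half-written tag at the very end of the buffer (e.g. "</na").
      const partial = buffer.slice(contentStart).replace(/<[^>]*$/, '');
      results.push({ text: partial.trim(), complete: false });
      break;
    }

    results.push({ text: buffer.slice(contentStart, end).trim(), complete: true });
    pos = end + close.length;
  }
  return results;
}

// Simulated stream: re-parse the growing buffer after every chunk, so a UI
// could render partial values and refine them as more text arrives.
const chunks = ['<name>Dai', 'sy</name><na', 'me>Whisk', 'ers</name>'];
let buffer = '';
for (const chunk of chunks) {
  buffer += chunk;
  console.log(extractLenient(buffer, 'name'));
}
```

Re-parsing the whole buffer on each chunk is the simplest thing that works for a sketch; a real streaming parser would more likely keep state between chunks.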

It is provider/API/model-agnostic and has good schema adherence across models from Qwen 2.5B and Ministral 3B all the way up to the frontier stuff like Claude and GPT-4o. It has built-in model preferencing and fallbacks (like asking it to prefer Claude but fall back to Mistral via e.g. openrouter/togetherai/whatever), plus 'inner' truncation to avoid context limitations.
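The preference-and-fallback idea can be sketched generically as below. This is an assumption-laden illustration, not xmllm's configuration format or internals: `withFallback`, `callModel`, and the model identifiers are all hypothetical placeholders.

```javascript
// Hypothetical sketch (NOT xmllm's API): try models in order of preference
// and return the first successful result.
async function withFallback(models, callModel) {
  const errors = [];
  for (const model of models) {
    try {
      // callModel is any async function that sends the request to `model`.
      return { model, result: await callModel(model) };
    } catch (err) {
      // Remember why this model failed, then move on to the next one.
      errors.push(`${model}: ${err.message}`);
    }
  }
  throw new Error(`All models failed: ${errors.join('; ')}`);
}

// Usage with a stand-in for a real provider call: the preferred model is
// "down", so the fallback answers instead.
const fakeCall = async (model) => {
  if (model === 'preferred-model') throw new Error('rate limited');
  return `response from ${model}`;
};

withFallback(['preferred-model', 'fallback-model'], fakeCall)
  .then((out) => console.log(out.model, out.result));
```

Collecting the per-model errors keeps the final failure message useful when every option is exhausted.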

I originally built this for my own projects, and I've found it to be a stable abstraction that's quick to prototype with, especially on the client side (with the proxy feature). I wanted to share it here mostly to gather feedback and hear what other people are doing to source structured, schema-conforming data from LLMs. I know there's new work being done in constraining big-param LLMs to fixed grammars, and that's probably the future, but I've no real context or knowledge about that, and for the past few years I've just been trying to /get stuff done/ in a reliable and consistent way. Hence this project.

Demos and sandbox-y things (try the top flag generator thing!) https://xmllm.j11y.io

Repo: https://github.com/padolsey/xmllm

Blog post (+background and other reflections): https://blog.j11y.io/2024-12-15_xmllm/

And please, if you've a moment: I'm vvv interested in what people are currently using to get reliable structured data from LLMs. Perhaps most are completely satisfied with Function-Calling APIs?