0x: A Token-Efficient, Verifiable Compilation Target for LLM Code Generation

Authors: Han Kim
Papers: IOV Labs · flagship · open source · 13pp · 2026-05-28

Abstract

Large language models spend most of their output tokens on framework boilerplate. We present 0x, a compact AI-first language that compiles a single source to React, Vue 3, Svelte 5, React Native, Express, and Terraform, and we use it to ask two questions that any code-generation target must answer. The first is efficiency: measured with a real BPE tokenizer across ten apps, 0x source is 2.41× smaller than the React it compiles to (58% fewer tokens; 1.88× vs. Vue, 1.80× vs. Svelte), which we treat as a conservative lower bound. The second is hittability: when naively prompted, gpt-4o produces 0x that compiles on only 1 of 5 tasks, because it does not know the syntax of a language absent from its training data; familiarity beats compactness. Critically, every failure is a syntax error, not a semantic one. Because syntax is exactly what structure enforcement removes, we constrain generation to a schema-guaranteed AST and render canonical 0x ourselves. Combined with real compiler work (desugaring JS spread, normalizing strict equality, and two lexer fixes, with all 303 tests still passing), this raises first-try compilation from 1/5 to 5/5, and it holds at 7/8 on a fresh task set. The compiler-as-verifier, not the prompt, is what makes a compact DSL a viable LLM target. Everything is open source and reproducible with one command.
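
The size ratios and the percentage figure in the abstract are two views of the same measurement. As a minimal sketch, assuming each ratio is target-framework tokens divided by 0x tokens (the paper may aggregate per app rather than over totals), the fraction of tokens saved is 1 - 1/ratio. The function name `token_reduction` is hypothetical and introduced only for this illustration.

```python
def token_reduction(ratio: float) -> float:
    """Fraction of tokens saved when the target is `ratio` times larger than the source.

    Assumption: ratio = target tokens / source tokens, so savings = 1 - 1/ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


# Ratios quoted in the abstract (target tokens relative to 0x tokens).
ratios = {"React": 2.41, "Vue": 1.88, "Svelte": 1.80}

for target, ratio in ratios.items():
    print(f"{target}: {ratio:.2f}x -> {token_reduction(ratio):.1%} fewer tokens")
```

Under this assumption the React ratio of 2.41× works out to roughly 58.5% fewer tokens, consistent with the 58% quoted above, while the Vue and Svelte ratios correspond to reductions in the mid-40% range.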

Keywords

large language models
code generation
domain-specific languages
token efficiency
constrained decoding
structured output
compilers
verification
reproducibility
