General Info
What is text diffusion here?
A masked-diffusion language model (LLaDA / MaskGIT style). The SQL answer begins as a fixed window of
masked tokens; each step predicts every masked position and permanently commits the most confident ones,
so the query emerges in parallel instead of left-to-right. During training, the model sees examples at many
noise levels by hiding different fractions of the target SQL, then learns to reconstruct only the hidden
tokens from the prompt, schema, and visible SQL tokens. Unlike a causal decoder, the transformer can look
across the whole proposed SQL window at once, which helps it coordinate table names, joins, filters, and
syntax globally. At inference time this runs in reverse: early high-confidence tokens become anchors, and
later steps use that partially completed query to resolve the remaining blanks. Adaptive early stopping can
finish before the full step budget when every remaining blank is already predicted with high confidence.
The animation shows those intermediate states.
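The decoding loop described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the demo's actual code: toy_predict is a hypothetical stand-in for the model (it returns the known target with a confidence that rises as the window fills in), and denoise, stop_threshold, and the even per-step commit schedule are illustrative choices.

```python
import math
import random

MASK = "[MASK]"  # placeholder symbol for a still-hidden position


def toy_predict(tokens, target, rng):
    """Hypothetical stand-in for the model: one (token, confidence) pair per position.

    A real model would run the transformer over prompt + schema + SQL window.
    Here confidence simply grows as more of the window is already visible.
    """
    visible = sum(tok != MASK for tok in tokens)
    preds = []
    for i in range(len(tokens)):
        conf = min(1.0, 0.3 + 0.7 * visible / len(tokens) + 0.2 * rng.random())
        preds.append((target[i], conf))
    return preds


def denoise(predict, length, steps, stop_threshold=0.9):
    """Confidence-based unmasking with adaptive early stopping (illustrative)."""
    tokens = [MASK] * length
    history = [list(tokens)]
    for step in range(steps):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break
        preds = predict(tokens)
        # Early stop: every remaining blank is already predicted confidently.
        if all(preds[i][1] >= stop_threshold for i in masked):
            for i in masked:
                tokens[i] = preds[i][0]
            history.append(list(tokens))
            break
        # Otherwise commit only the most confident blanks, spreading the
        # remaining blanks evenly over the remaining steps.
        k = math.ceil(len(masked) / (steps - step))
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]  # committed tokens are never revisited
        history.append(list(tokens))
    return tokens, history


target = "SELECT name FROM users WHERE age > 30".split()
rng = random.Random(0)
final, history = denoise(lambda toks: toy_predict(toks, target, rng), len(target), steps=8)
for state in history:
    print(" ".join(state))
```

Each entry in history is one intermediate state of the kind the animation displays. A larger steps value commits fewer tokens per step, which gives later predictions more context at the cost of more model calls.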
Model
Backbone: ModernBERT-base with a masked-LM head, fine-tuned LLaDA-style (mask ratio t sampled
continuously, ELBO loss weighted by 1/t) on the gretelai/synthetic_text_to_sql dataset with
on-the-fly augmentation (spaced / mixed-case identifiers, colloquial prompts) for robustness.
Input is your question (Prompt) plus the schema DDL (Context); output is the SQL.
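As a rough illustration of the LLaDA-style objective named above, the sketch below masks a target sequence at ratio t and weights the reconstruction loss on the hidden positions by 1/t. It is an assumption-laden sketch rather than the training code: MASK_ID, mask_target, and weighted_masked_loss are hypothetical names, the per-position log-probabilities are supplied by hand instead of by a model, and normalizing by target length follows the usual LLaDA formulation rather than anything stated on this page.

```python
import math
import random

MASK_ID = -1  # hypothetical id for the mask token


def mask_target(target_ids, t, rng):
    """Forward process: hide each target token independently with probability t."""
    return [MASK_ID if rng.random() < t else tok for tok in target_ids]


def weighted_masked_loss(target_ids, noisy_ids, log_probs, t):
    """Negative log-likelihood on masked positions only, weighted by 1/t.

    log_probs[i] maps token id -> predicted log-probability at position i.
    Visible positions contribute nothing to the loss.
    """
    nll = 0.0
    for i, (clean, noisy) in enumerate(zip(target_ids, noisy_ids)):
        if noisy == MASK_ID:
            nll -= log_probs[i][clean]
    return nll / (t * len(target_ids))


rng = random.Random(0)
target_ids = [5, 6, 7, 8]
t = rng.uniform(0.05, 1.0)  # a fresh mask ratio per example; lower bound avoids dividing by ~0
noisy_ids = mask_target(target_ids, t, rng)

# Stand-in predictions: a uniform distribution over a 4-token vocabulary.
uniform = {tok: math.log(0.25) for tok in target_ids}
log_probs = [uniform] * len(target_ids)
print(noisy_ids, weighted_masked_loss(target_ids, noisy_ids, log_probs, t))
```

The 1/t weight compensates for the fact that low mask ratios hide few tokens, so each noise level contributes on a comparable scale.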
Runtime profile
This is an early Research Preview deployed on a CPU (yes, I can't afford GPUs).
Model parameters: 149.66M
Requested backend: onnx
Runtime backend: onnxruntime (worker process)
Requested dtype: fp32
Effective dtype: fp32
How to use
Put your question in Prompt and the table schema (e.g. a CREATE TABLE …) in Context, then run.
More denoising steps can improve quality but take longer.
Contact
Email: thorekoritzius@gmail.com