Diffusion LLM
Text Diffusion Playground
vs AR: compares diffusion decoding with traditional autoregressive decoding, which emits about one token per forward pass. Diffusion fills many tokens at once, so fewer steps is better.
Ready for generation. Submit a prompt to start.
Token confidence: Each step, the model rates its confidence in every still-masked token. Gate: the least-confident remaining token; early-stop watches this one. Mean: the average across remaining tokens. A wide gap means a few hard tokens are holding back the rest.
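As a rough illustration of the two statistics described above, the sketch below computes the gate (minimum) and the mean over the still-masked positions and applies a threshold check on the gate. The function names, the list-based inputs, and the 0.9 threshold are assumptions made for this example; they are not taken from the playground's code.

```python
from statistics import mean

def confidence_summary(confidences, masked):
    """Return (gate, mean) over still-masked positions, or None if none remain.

    confidences: one float in [0, 1] per position (hypothetical model output)
    masked:      one bool per position, True while the token is still masked
    """
    remaining = [c for c, m in zip(confidences, masked) if m]
    if not remaining:
        return None
    # Gate = least-confident remaining token; mean = average over the rest.
    return min(remaining), mean(remaining)

def should_stop_early(confidences, masked, threshold=0.9):
    """Stop once even the least-confident masked token clears the threshold.

    The threshold value is an assumption for illustration only.
    """
    summary = confidence_summary(confidences, masked)
    if summary is None:
        return True  # nothing left to fill
    gate, _ = summary
    return gate >= threshold
```

A large difference between the two returned values corresponds to the "wide gap" case: most blanks are easy, but one or two hard tokens keep the gate low.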
Run a generation, then scrub the steps to watch token confidence rise.
General Info
What is text diffusion here?
A masked-diffusion language model (LLaDA / MaskGIT style). The SQL answer begins as a fixed window of masked tokens; each step predicts every masked position and permanently commits the most confident ones, so the query emerges in parallel instead of left-to-right. During training, the model sees examples at many noise levels by hiding different fractions of the target SQL, then learns to reconstruct only the hidden tokens from the prompt, schema, and visible SQL tokens. Unlike a causal decoder, the transformer can look across the whole proposed SQL window at once, which helps it coordinate table names, joins, filters, and syntax globally. At inference time this runs in reverse: early high-confidence tokens become anchors, and later steps use that partially completed query to resolve the remaining blanks. Adaptive early stopping can finish before the full step budget when every remaining blank is already predicted with high confidence. The animation shows those intermediate states.
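The decoding loop described above can be sketched in a few lines. This is a minimal toy version under stated assumptions, not the playground's implementation: `decode`, `toy_predict`, the per-step commit schedule (an even share of the remaining blanks), and the threshold values are all hypothetical, and the "model" is a lookup table whose confidence rises as more tokens become visible.

```python
import math

MASK = "[MASK]"

def decode(predict, length, max_steps, stop_threshold=0.9):
    """Iteratively unmask a fixed window, committing the most confident tokens.

    predict(tokens) must return one (token, confidence) pair per position.
    Returns the final tokens and the list of intermediate states.
    """
    tokens = [MASK] * length
    history = [list(tokens)]
    for step in range(max_steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = predict(tokens)
        gate = min(preds[i][1] for i in masked)
        if gate >= stop_threshold:
            # Adaptive early stop: every remaining blank is confident enough.
            chosen = masked
        else:
            # Assumed schedule: commit an even share of the remaining blanks.
            k = math.ceil(len(masked) / (max_steps - step))
            chosen = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in chosen:
            tokens[i] = preds[i][0]  # committed tokens are never re-masked
        history.append(list(tokens))
    return tokens, history

# Toy stand-in for the model: fixed answers, confidence grows with context.
TARGET = ["SELECT", "name", "FROM", "users", "WHERE", "age", ">", "30"]
BASE = [0.95, 0.40, 0.90, 0.50, 0.85, 0.30, 0.35, 0.20]

def toy_predict(tokens):
    visible = sum(t != MASK for t in tokens) / len(tokens)
    return [(TARGET[i], min(1.0, BASE[i] + 0.6 * visible))
            for i in range(len(tokens))]

if __name__ == "__main__":
    final, states = decode(toy_predict, len(TARGET), max_steps=4)
    for n, state in enumerate(states):
        print(n, " ".join(state))
```

In this toy run the keywords SELECT and FROM are committed first and act as anchors, and the eight tokens are filled in four steps rather than eight. Lowering `stop_threshold` lets the loop finish before the step budget is used up.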
Model
Backbone: ModernBERT-base with a masked-LM head, fine-tuned LLaDA-style (continuous mask ratio, 1/t-weighted ELBO) on the gretelai/synthetic_text_to_sql dataset with on-the-fly augmentation (spaced / mixed-case identifiers, colloquial prompts) for robustness.
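To make "continuous mask ratio, 1/t-weighted" more concrete, here is a small sketch of how such a training objective is commonly written for a single example: draw a mask ratio t, hide each target token with probability t, and weight the negative log-likelihood of the hidden tokens by 1/t. The helper names, the plain-list inputs, and the normalisation by sequence length are assumptions for illustration; the actual training code for this model may differ.

```python
import math
import random

MASK = "[MASK]"

def mask_target(target, t, rng):
    """Hide each target token independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in target]

def weighted_masked_loss(target, noisy, log_probs, t):
    """1/t-weighted negative log-likelihood over masked positions only.

    log_probs[i] is the log-probability a (hypothetical) model assigns to the
    true token target[i] given the prompt, schema, and visible tokens.
    Dividing by len(target) is an assumed normalisation.
    """
    nll = -sum(lp for tok, lp in zip(noisy, log_probs) if tok == MASK)
    return nll / (t * len(target))

if __name__ == "__main__":
    rng = random.Random(0)
    target = ["SELECT", "name", "FROM", "users"]
    t = rng.uniform(0.05, 1.0)            # continuous mask ratio, kept away from 0
    noisy = mask_target(target, t, rng)
    fake_log_probs = [math.log(0.5)] * 4  # placeholder model outputs
    print(noisy, weighted_masked_loss(target, noisy, fake_log_probs, t))
```

The 1/t factor compensates for the fact that low mask ratios hide few tokens, so each noise level contributes comparably to the objective.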
Input is your question (Prompt) plus the schema DDL (Context); output is the SQL.
Runtime profile

This is an early Research Preview deployed on a CPU (yes, I can't afford GPUs).

Model parameters: 149.66M
Requested backend: onnx
Runtime backend: onnxruntime (worker process)
Requested dtype: fp32
Effective dtype: fp32
How to use
Put your question in Prompt and the table schema (e.g. a CREATE TABLE …) in Context, then run. More denoising steps can improve quality but take longer.
Contact
Email: thorekoritzius@gmail.com