cad0: A (shitty) Text-to-CAD Model

I keep asking AI for a mounting bracket and getting triangle soup.

Every text-to-3D model gives me a frozen mesh. Dead geometry. Can't edit it, can't constrain it, can't feed it to a CNC machine without reverse engineering everything. If I say "50x30mm plate with 4 mounting holes," I want something I can open in CAD, tweak the hole diameter, and re-export. Not a .obj file I have to throw away and start over.

The CAD industry calls this "parametric." The AI industry has mostly ignored it: meshes are easier to generate and most people just want to render things.

So I trained a model to output the real thing!

Try the Demo · Open in vcad.io · Get the Weights

The Output Format

cad0 produces this:

Input:  "L-bracket: 50mm x 30mm x 3mm thick"

Output: C 50 30 3
        C 3 30 50
        T 1 47 0 0
        U 0 2

This compiles to an actual BRep solid in vcad, a parametric CAD app I'm building in the browser. Exact surfaces, exact edges, editable parameters. C is a box, T is translate, U is union. The format is called Compact IR, and it's designed specifically for LLM output.

L-bracket rendered in vcad

Same bracket in JSON? 400+ tokens. Compact IR does it in 30. That 13x difference compounds fast when you're generating real parts.

The format is learnable because it's regular: one operation per line, line number equals node ID, no nesting, no quotes, no braces. The model either produces valid IR or it doesn't; there's no middle ground where it hallucinates a JSON key that almost works.
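
To show how little machinery the format needs on the consuming side, here's a minimal parsing sketch in TypeScript. It's not the actual vcad compiler, just an illustration of the one-op-per-line, line-number-as-node-ID structure.

type IRNode = { id: number; op: string; args: number[] };

function parseCompactIR(source: string): IRNode[] {
  return source
    .trim()
    .split("\n")
    .map((line, index) => {
      const [op, ...rest] = line.trim().split(/\s+/);
      return {
        id: index,              // the line number doubles as the node ID
        op,                     // "C" = box, "T" = translate, "U" = union, ...
        args: rest.map(Number), // every argument is a plain number
      };
    });
}

// The L-bracket from above: two boxes, translate the second, union them.
const bracket = parseCompactIR(`
C 50 30 3
C 3 30 50
T 1 47 0 0
U 0 2
`);
// [{ id: 0, op: "C", args: [50, 30, 3] }, { id: 1, op: "C", ... }, ...]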

Training Data

I needed data. Lots of it.

Synthetic generation: procedural generators for common part families, each spitting out random valid parts with matching descriptions. Brackets, standoffs, enclosures, gears, flanges, clips. The generators are TypeScript functions that know what a "mounting plate" looks like: randomize the dimensions, the hole count, the thickness, generate the IR, generate the prompt, repeat.
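
To make that concrete, here's a stripped-down sketch of what one part family looks like. The ranges, helper names, and prompt phrasing are illustrative, not the real generator code; only the IR ops mirror the examples in this post.

type Example = { prompt: string; ir: string };

const rand = (min: number, max: number) => min + Math.random() * (max - min);
const round = (n: number, step = 0.5) => Math.round(n / step) * step;

function mountingPlate(): Example {
  const width = round(rand(30, 120));
  const depth = round(rand(20, 80));
  const thick = round(rand(2, 6));
  const holeR = round(rand(1.5, 4));
  const margin = round(holeR * 2 + 2); // keep holes clear of the edges

  // Node 0 is the plate; each corner adds cylinder -> translate -> subtract.
  const lines = [`C ${width} ${depth} ${thick}`];
  let solid = 0; // node ID of the current result
  const corners: [number, number][] = [
    [margin, margin],
    [width - margin, margin],
    [margin, depth - margin],
    [width - margin, depth - margin],
  ];
  for (const [x, y] of corners) {
    lines.push(`Y ${holeR} ${thick + 4}`);            // over-length cylinder, like the plate example later
    lines.push(`T ${lines.length - 1} ${x} ${y} -2`); // move it into place
    lines.push(`D ${solid} ${lines.length - 1}`);     // subtract from the running solid
    solid = lines.length - 1;
  }

  return {
    prompt: `${width}x${depth}mm mounting plate, ${thick}mm thick, ${holeR * 2}mm holes in the corners`,
    ir: lines.join("\n"),
  };
}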

530K examples later, I had a dataset (a ~70K subset survives on HuggingFace; the rest was lost in a flurry of vibecoding). QLoRA fine-tuning on Qwen2.5-Coder-7B, one epoch on an H100, about 9 hours. Eval loss landed at 0.324. The model is up on HuggingFace.
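
Serialization is the boring part: one prompt/completion pair per JSONL line. A simplified sketch (field names and the round-robin over part families are illustrative, not the published dataset's schema):

import { createWriteStream } from "node:fs";

type Example = { prompt: string; ir: string };
type Generator = () => Example;

function writeDataset(path: string, generators: Generator[], count: number) {
  const out = createWriteStream(path);
  for (let i = 0; i < count; i++) {
    const pick = generators[i % generators.length]; // round-robin over part families
    const { prompt, ir } = pick();
    out.write(JSON.stringify({ prompt, completion: ir }) + "\n");
  }
  out.end();
}

// e.g. writeDataset("cad0-train.jsonl", [mountingPlate /*, bracket, standoff, ... */], 530_000);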

What Works

In-distribution parts (stuff that looks like the training data) come out clean.

Mounting plate:

C 50 30 4.5
Y 3.75 9
T 1 5 5 -2.25
D 0 2
...

Correct structure, correct hole positions. It even inferred reasonable defaults (4.5mm thick, 7.5mm diameter holes) that nobody specified. The model just knows what a mounting plate looks like.

Mounting plate with holes

Enclosure:

C 100 60 40
SH 0 3

Two operations. "Enclosure with 3mm walls" becomes a box with a shell operation applied. That's domain knowledge, not syntax memorization.

Hub with bolt pattern:

Y 25 10
Y 5 12
T 1 0 0 -1
D 0 2
Y 3 12
T 4 15 0 -1
D 3 5
...

Cylinder, center bore, bolt holes on a radius. The model places the bolt circle at a reasonable distance from the edge without being told.

Hub with bolt pattern

What Doesn't

Ask for "just a cube" and the model adds holes. It can't help itself, because every bracket in the training data has holes, every flange has a bolt pattern. The model learned that parts have features, which is usually true but not always what you want.

Hex standoffs come out square. Compact IR doesn't have a hex primitive (yet), and I didn't include examples of building hexagons from boolean intersection patterns. Fixable with better training data.

Diameter vs. radius trips it up sometimes. You say "10mm diameter," but the IR uses radius. The model gets confused about which one you meant. Classic unit problem: should have caught it in the training data.
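
The data-side fix is simple enough: randomize the phrasing, keep the IR canonical. Something like this (a hypothetical helper, just to illustrate the idea):

// Hypothetical generator helper: the prompt randomly says "diameter" or
// "radius", but the IR argument is always the radius.
function holePhrase(radiusMm: number): string {
  return Math.random() < 0.5
    ? `${radiusMm * 2}mm diameter holes`
    : `${radiusMm}mm radius holes`;
}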

The Numbers

Metric                      Value
Base model                  Qwen2.5-Coder-7B
Training samples            530K
Training time               9h 14m
Hardware                    1x H100 80GB
Eval loss                   0.324
In-distribution accuracy    ~75%
Cold start latency          ~30s (Modal spin-up)
Warm inference              2-5s

Not production-ready. But good enough to be useful.

What's Next

cad0-mini. The 7B model distilled down to 500M for browser inference. Knowledge distillation on 8x A100-80GB, 3 epochs, 3h 47m, final loss 0.52. Weights are up on HuggingFace. Next step is ONNX export and Transformers.js integration: text-to-CAD running entirely in your browser, no server round-trip.
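
Roughly what the browser side should look like once the ONNX export lands. The model ID and generation settings are placeholders; only the Transformers.js pipeline call is the library's real API.

import { pipeline } from "@huggingface/transformers";

// Placeholder repo name for the distilled model; swap in the real one once it's published with ONNX weights.
const generate = await pipeline("text-generation", "campedersen/cad0-mini");

const result = await generate("mounting plate, 50x30mm, 4 corner holes", {
  max_new_tokens: 128,
});

console.log(result); // [{ generated_text: "...Compact IR, ready to compile in vcad..." }]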

Better training data. Simple primitives, hex patterns, consistent units. The failures are all data problems, which means they're fixable with more generators and better coverage.

Multi-turn editing. "Make the holes larger." Right now the model sometimes hallucinates follow-up conversation. That's fixable with proper turn structure in the training data.

Try It

The HuggingFace Space runs on ZeroGPU: first request takes ~30s to load the model, then 2-5s per request.

Or grab the weights and run it yourself:

huggingface-cli download campedersen/cad0 --local-dir ./cad0

Text-to-mesh is a solved problem. Text-to-CAD is not.

Meshes are for rendering. Parametric models are for manufacturing: they can be edited, constrained, and fed into CNC machines without reverse engineering. Different problems, different outputs.

cad0 is a first attempt at the second one. Not accurate enough for real work yet, but the approach works: synthetic data, token-efficient DSL, compile to real geometry. The rest is iteration.