
🧠 Z80 LLM

Training tiny neural nets to predict Z80 assembly for retro hardware

What it is

A small language model trained from scratch on Z80 assembly source code, designed to run on real retro hardware or to assist with programming it. The model predicts assembly instructions to help with code generation and completion for Z80-based systems, including an actual IMSAI 8080 with a Z80 upgrade card.

Approach

Quaternary GRU architecture: a custom model using four-level (2-bit) discrete states, not a standard Transformer
312M-character corpus: curated Z80 assembly source code from real projects
Full-precision training first: quantize afterward (post-training quantization), not during training
Targeting a 64 KB footprint: the model has to fit in the memory a Z80 can address directly (16-bit address bus, 64 KB)
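
The "train in full precision, then quantize" step can be sketched in a few lines. This is a minimal illustration, not the project's actual scheme: it assumes "quaternary" means four uniformly spaced levels (2 bits per value) with one scale per weight list, and the function names are hypothetical.

```python
def quantize_quaternary(weights):
    """Map floats to 2-bit codes 0..3 on a uniform grid over [-scale, +scale].

    ASSUMPTION: uniform symmetric levels (-1, -1/3, +1/3, +1) * scale.
    The source does not say which quantization grid is used.
    """
    scale = max((abs(w) for w in weights), default=0.0) or 1.0
    codes = []
    for w in weights:
        # Normalise to the range [0, 3] and round to the nearest level.
        level = round((w / scale + 1.0) * 1.5)
        codes.append(min(3, max(0, level)))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from 2-bit codes."""
    return [(c / 1.5 - 1.0) * scale for c in codes]


def pack_codes(codes):
    """Pack four 2-bit codes into each byte, first code in the low bits."""
    padded = list(codes) + [0] * (-len(codes) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = padded[i:i + 4]
        out.append(a | (b << 2) | (c << 4) | (d << 6))
    return bytes(out)


def unpack_codes(packed, count):
    """Inverse of pack_codes; `count` drops the padding codes."""
    codes = []
    for byte in packed:
        for shift in (0, 2, 4, 6):
            codes.append((byte >> shift) & 0b11)
    return codes[:count]


if __name__ == "__main__":
    codes, scale = quantize_quaternary([-0.9, -0.2, 0.1, 0.8, 0.5])
    packed = pack_codes(codes)
    print(codes, scale, packed.hex())
    print(dequantize(unpack_codes(packed, len(codes)), scale))
```

Packing four codes per byte is what makes the memory budget plausible: shifts and masks like these are cheap on an 8-bit CPU, although the real on-device layout may differ.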

Working Config

Hidden size	512
Embedding	64
Sequence length	128
Batch size	4096
Learning rate	0.001
Epochs	5
GPU	GTX 1080 Ti (11 GB)
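
To relate this config to the 64 KB target, here is a rough parameter-count estimate. It is a sketch under stated assumptions, not the project's real architecture: a single standard GRU layer (three gate blocks, PyTorch-style double biases), an embedding table, a linear output layer, and a hypothetical 128-symbol character vocabulary that the page does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    hidden_size: int = 512
    embedding_size: int = 64
    seq_len: int = 128
    batch_size: int = 4096
    learning_rate: float = 0.001
    epochs: int = 5
    vocab_size: int = 128  # ASSUMPTION: 7-bit ASCII; not stated in the source


def gru_param_count(cfg):
    """Count parameters for embedding + one standard GRU layer + output layer."""
    h, e, v = cfg.hidden_size, cfg.embedding_size, cfg.vocab_size
    embedding = v * e
    # Reset, update, and candidate blocks: input weights, recurrent weights,
    # and two bias vectors each (the layout torch.nn.GRU uses).
    gru = 3 * (h * e + h * h + 2 * h)
    output = h * v + v
    return embedding + gru + output


def footprint_bytes(n_params, bits_per_weight):
    """Bytes needed to store n_params values, rounded up to a whole byte."""
    return (n_params * bits_per_weight + 7) // 8


if __name__ == "__main__":
    cfg = Config()
    n = gru_param_count(cfg)
    for bits in (32, 4, 2):
        print(f"{bits:>2} bits/weight: {footprint_bytes(n, bits) / 1024:.1f} KiB")
```

Under these assumptions the config above comes to about 0.96M parameters, or roughly 235 KiB at 2 bits per weight. That is well above 64 KB, so read it as an estimate of the gap to the target rather than as a description of the model that ends up on the hardware.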

Why it matters

Most LLMs target modern x86/ARM or cloud inference. This one is built for a machine that predates the internet. It's an exercise in extreme-constraint ML: what's the smallest useful model you can train, and can it actually run on hardware designed in the 1970s?