Type a prompt and watch a from-scratch GPT-style language model — trained entirely on Wikipedia — generate text in real time. Adjust temperature, top-k, and top-p to explore different sampling behaviors.
~50K English Wikipedia articles are downloaded, cleaned of markup and metadata, and prepared as a training corpus.
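The markup cleanup step might look something like the following minimal sketch. The regex rules here are illustrative assumptions, not the project's exact pipeline: they strip templates, wiki links, bold/italic quotes, HTML tags, and heading markers.

```python
import re

def clean_wikitext(text):
    """Strip common wiki markup from an article (illustrative rules only)."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)                      # templates like {{cite ...}}
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)   # [[target|label]] -> label
    text = re.sub(r"'{2,}", "", text)                               # ''italic'' / '''bold''' quotes
    text = re.sub(r"<[^>]+>", "", text)                             # inline HTML tags
    text = re.sub(r"==+\s*([^=]+?)\s*==+", r"\1", text)             # == Heading == -> Heading
    return re.sub(r"[ \t]+", " ", text).strip()                     # collapse runs of spaces
```

For example, `clean_wikitext("'''Paris''' is the [[capital city|capital]] of [[France]].")` yields plain prose with links and emphasis markup removed.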
A Byte-Pair Encoding tokenizer is trained from scratch on the corpus, learning subword units that compactly represent the text with a fixed-size vocabulary.
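The core of BPE training is simple to sketch: repeatedly count adjacent symbol pairs across the corpus (weighted by word frequency) and merge the most frequent pair into a new symbol. A minimal, assumption-laden version over a toy word-frequency dict:

```python
from collections import Counter

def bpe_train(word_counts, num_merges):
    """Learn BPE merge rules from a {word: frequency} dict (toy sketch)."""
    # Each word starts as a tuple of single characters.
    vocab = {tuple(w): c for w, c in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with the merged symbol.
        new_vocab = {}
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = count
        vocab = new_vocab
    return merges

merges = bpe_train({"low": 5, "lower": 2, "lowest": 2}, num_merges=3)
# First merge is ('l', 'o'): it appears in every word, 9 times total.
```

A production tokenizer adds byte-level fallback, special tokens, and a fast merge lookup, but the learned artifact is exactly this ordered list of merge rules.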
A GPT-style transformer decoder learns to predict the next token by training on millions of token sequences with cosine-scheduled AdamW optimization.
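The cosine schedule mentioned above typically means linear warmup followed by cosine decay of the learning rate; the resulting value is fed to AdamW each step. A small sketch (parameter names are illustrative, not the project's actual config):

```python
import math

def cosine_lr(step, max_steps, peak_lr, warmup_steps, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp linearly from ~0 up to peak_lr over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    # Cosine curve from peak_lr at progress=0 to min_lr at progress=1.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In a PyTorch loop this would set `group["lr"] = cosine_lr(step, ...)` on each optimizer parameter group before `optimizer.step()`.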
Given a prompt, the model autoregressively generates tokens using configurable sampling — temperature, top-k, and nucleus (top-p) — for varied output.
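The three sampling controls compose naturally: temperature rescales the logits, top-k truncates to the k most likely tokens, and top-p keeps the smallest set whose cumulative probability reaches p. A dependency-free sketch of one decoding step (the real model would supply the logits):

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sample_next(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample one token id from raw logits with temperature / top-k / top-p."""
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    probs = softmax([l / temperature for l in logits])
    # Token ids ordered by probability, most likely first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Top-k: keep only the k most likely tokens.
    if top_k > 0:
        order = order[:top_k]
    # Top-p (nucleus): keep the smallest prefix reaching cumulative mass p.
    if top_p < 1.0:
        kept, cum = [], 0.0
        for i in order:
            kept.append(i)
            cum += probs[i]
            if cum >= top_p:
                break
        order = kept
    # Renormalize over the surviving tokens and draw one.
    total = sum(probs[i] for i in order)
    r = rng.random() * total
    for i in order:
        r -= probs[i]
        if r <= 0:
            return i
    return order[-1]
```

With `top_k=1` this reduces to greedy decoding; a very small `top_p` behaves the same way whenever one token dominates the distribution.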