Writing about web engineering, infrastructure, and the occasional experiment.
I built a “Victorian-ish” Nanochat by scraping and cleaning 700k Internet Archive texts (published up to 1899), then mid-training the model and fine-tuning it (SFT) with custom synthetic chat data. It’s imperfect and it hallucinates, but it was a fun end-to-end dive into a real LLM pipeline.