Show HN: MacMind — A transformer neural network in HyperCard on a 1989 Macintosh

April 16, 2026

Overview

Can you run a transformer on a Macintosh SE/30? Yes, and it looks like a stack of cards. MacMind is a single-layer, single-head transformer with 1,216 parameters, implemented entirely in HyperTalk, the 1987 scripting language designed for interactive card stacks, not tensor math. According to the project, every part of the network (token embeddings, positional encodings, scaled dot-product self-attention, cross-entropy loss, full backpropagation and SGD) is written in plain HyperTalk, with no compiled code or external libraries. Option-click any button and you can read the actual math. It is delightfully old-school and eerily modern at the same time.
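For readers who want the central operation spelled out, here is a minimal Python sketch of single-head scaled dot-product self-attention. It is not MacMind's HyperTalk code: the function names (`softmax`, `matmul`, `self_attention`), the two-token input and the identity weight matrices are all assumptions chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, wq, wk, wv):
    """Single-head attention: softmax(Q K^T / sqrt(d)) V.

    x is a list of token vectors; wq, wk, wv are projection matrices.
    Returns the attended output and the attention weights.
    """
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    scores = [
        [sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d) for k_row in k]
        for q_row in q
    ]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


# Hypothetical toy input: two tokens, two dimensions, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(tokens, identity, identity, identity)
print(weights)
```

Each row of `weights` sums to 1 and says how much one position draws from every other position. That matrix is the kind of thing an attention-map viewer displays.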

What it learns

The toy task is the bit-reversal permutation, the opening shuffle of the Fast Fourier Transform. Feed it random 8-element sequences and the model has to work out the positional pattern purely through attention and gradient descent, with no rule book handed to it. After training, the attention map reportedly shows the FFT’s butterfly routing pattern: a 1987 scripting toy independently rediscovering a 1965 algorithm. The math is the same as in large language models; the difference is scale, not kind: 1,216 parameters versus the hundreds of billions to trillions reported for the largest current models.
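To make the target concrete, the bit-reversal permutation can be sketched in a few lines of Python. This is the rule the model is expected to learn, not code from the stack; the helper names `bit_reverse_indices` and `bit_reverse_permute` are made up for this example.

```python
def bit_reverse_indices(n):
    """Return the bit-reversal permutation of range(n); n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    indices = []
    for i in range(n):
        rev = 0
        for b in range(bits):
            # Move bit b of i to the mirrored position (bits - 1 - b).
            if i & (1 << b):
                rev |= 1 << (bits - 1 - b)
        indices.append(rev)
    return indices


def bit_reverse_permute(seq):
    """Reorder seq so that output position i holds seq[reversed bits of i]."""
    return [seq[j] for j in bit_reverse_indices(len(seq))]


print(bit_reverse_indices(8))               # [0, 4, 2, 6, 1, 5, 3, 7]
print(bit_reverse_permute(list("abcdefgh")))  # ['a', 'e', 'c', 'g', 'b', 'f', 'd', 'h']
```

For eight elements the index is three bits wide, so position 1 (001) swaps with position 4 (100) and position 3 (011) swaps with position 6 (110), while 0, 2, 5 and 7 stay put. Applying the permutation twice restores the original order.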

Play the stack

MacMind ships as a five-card HyperCard stack: title, training, inference, an attention-map viewer, and an about page. Click “Train 10” to run ten training steps or “Train to 100%” to keep training until the model reaches 100% on a sample; you can also start long runs from the Message Box. Each step of the training loop generates a random input, runs a forward pass, computes the loss, backpropagates every gradient and updates all weights, with progress bars and a live log. Heads up: the training log runs into HyperCard’s 30,000-character limit, so clear it manually after a long run.
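The shape of that loop (random input, forward pass, loss, gradients, SGD update) can be sketched in Python with a deliberately stripped-down stand-in. This is an assumption-laden toy, not MacMind's model: it learns only an 8×8 table of position-to-position attention logits, with no embeddings or projections, and the vocabulary size, learning rate, step count and all names are hypothetical.

```python
import math
import random

N = 8        # sequence length, as in the article
VOCAB = 10   # assumption: tokens are the digits 0-9


def bit_reverse(i, bits=3):
    """Reverse the low `bits` bits of i."""
    return int(format(i, f"0{bits}b")[::-1], 2)


TARGET_POS = [bit_reverse(i) for i in range(N)]


def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    # logits[i][j]: how strongly output position i attends to input position j.
    logits = [[0.0] * N for _ in range(N)]
    losses = []
    for _ in range(steps):
        # 1. Generate a random input and its bit-reversed target.
        x = [rng.randrange(VOCAB) for _ in range(N)]
        target = [x[j] for j in TARGET_POS]
        loss = 0.0
        for i in range(N):
            # 2. Forward pass: attention weights, then probability of the right token.
            attn = softmax(logits[i])
            p = sum(a for a, tok in zip(attn, x) if tok == target[i])
            # 3. Cross-entropy loss for this output position.
            loss -= math.log(p)
            # 4. Backprop through the softmax, then 5. plain SGD update.
            for j in range(N):
                match = 1.0 if x[j] == target[i] else 0.0
                grad = attn[j] * (1.0 - match / p)
                logits[i][j] -= lr * grad
        losses.append(loss)
    return logits, losses


logits, losses = train()
learned = [max(range(N), key=lambda j: row[j]) for row in logits]
print(learned)
```

After training, the strongest logit in each row should sit at the bit-reversed position, which is the same routing pattern the article says shows up in MacMind's attention map. A real transformer has to learn this through embeddings and query/key projections rather than a direct lookup table, so treat this only as an outline of the loop.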

Why it matters

This is the kind of demo that cuts through hype: it makes backpropagation and attention inspectable and tweakable on hardware that predates the web. It’s cheeky, educational and a little sentimental, a reminder that the core ideas behind modern AI are accessible, not mystical. Need proof that you don’t need a cloud bill to understand the engine under the hood? Here it is, with a retro GUI and the hood wide open.

Sources: github.com/seanfdz, Hacker News