Grow-only counter on a sequentially consistent KV store exposes surprising nondeterminism

The challenge, in plain English
Fly.io’s Maelstrom challenge #4 asks for a grow-only counter. On paper it is simple: accept add and read requests, and once all adds have been applied, every node should read the full sum. The catch is that you must build it on top of SeqKV, Maelstrom’s built-in key-value service, and that constraint turns a textbook exercise into a puzzle box. The author walks through the problem and, along the way, teases out the subtleties that make distributed systems education fun and frustrating in equal measure.
Why the naïve approach stumbles
The obvious read-modify-write approach (read the counter, add the delta, write it back) fails under concurrency: two nodes can read the same base value, and whichever writes last overwrites the other’s update. SeqKV offers a CompareAndSwap operation, and wrapping the update in a CAS loop seems like the fix: retry until you win. But the author reports that this CAS-based solution produces valid results on some runs and invalid results on others; it is not deterministic under the test harness. The post calls out practical concerns too: an infinite retry loop is sloppy without timeouts or error handling.
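To make the CAS retry idea concrete, here is a minimal sketch, not the author’s code. It assumes a hypothetical in-memory store (`InMemoryKV`) standing in for SeqKV, with made-up `read` and `compare_and_swap` methods, and it caps the retries (`max_attempts`, also an assumption) instead of looping forever. Because this stand-in is a single-process, lock-protected dictionary, it cannot reproduce the flaky results the post describes; it only shows the shape of the loop.

```python
import threading


class InMemoryKV:
    """Hypothetical stand-in for a KV service with compare-and-swap.

    This is NOT Maelstrom's API; a missing key is treated as 0.
    """

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def read(self, key):
        with self._lock:
            return self._data.get(key, 0)

    def compare_and_swap(self, key, expected, new):
        """Atomically set key to `new` only if it currently equals `expected`."""
        with self._lock:
            if self._data.get(key, 0) != expected:
                return False
            self._data[key] = new
            return True


def add(kv, key, delta, max_attempts=1000):
    """Add delta to the counter with a bounded CAS retry loop."""
    for _ in range(max_attempts):
        current = kv.read(key)
        if kv.compare_and_swap(key, current, current + delta):
            return current + delta
        # Another writer changed the value between our read and our CAS; retry.
    raise RuntimeError(f"gave up after {max_attempts} CAS attempts")


def run_demo(workers=4, adds_per_worker=50):
    """Run several threads that each add 1 repeatedly; return the final value."""
    kv = InMemoryKV()

    def worker():
        for _ in range(adds_per_worker):
            add(kv, "counter", 1)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kv.read("counter")


if __name__ == "__main__":
    print(run_demo())
```

Bounding the loop turns "spin until you win" into an explicit failure that the caller can handle, which addresses the timeout and error-handling concern raised in the post. In a real node the retry budget would more likely be a deadline than an attempt count.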
Bring in CRDTs — or at least a CRDT-ish idea
The author next sketches a G-Counter CRDT: treat the counter as a vector of per-node counts, have each node increment only its own slot, and compute the total by summing the vector. On SeqKV you can mimic the vector by using one key per node. Elegant, right? Yet the author reports that this approach also sometimes passes and sometimes fails under the Maelstrom test. The weirdness persists; splitting state across keys doesn’t magically make the test deterministic.
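The one-key-per-node layout can be sketched as follows. This is an illustrative sketch under stated assumptions, not the author’s implementation: `InMemoryKV`, `GCounterNode`, and the `counter/<node id>` key naming are all hypothetical, and the store is a plain dictionary rather than SeqKV. Each node writes only its own key, so no compare-and-swap is needed, and a read sums every node’s key.

```python
class InMemoryKV:
    """Hypothetical stand-in for a shared KV service; NOT Maelstrom's API."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # A key that was never written counts as 0.
        return self._data.get(key, 0)

    def write(self, key, value):
        self._data[key] = value


class GCounterNode:
    """One node of a G-Counter that stores its slot under its own key."""

    def __init__(self, node_id, all_node_ids, kv):
        self.node_id = node_id
        self.all_node_ids = list(all_node_ids)
        self.kv = kv
        self._local = 0  # this node's own contribution

    @staticmethod
    def _key(node_id):
        # Hypothetical key naming scheme, one key per node.
        return f"counter/{node_id}"

    def add(self, delta):
        if delta < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self._local += delta
        # Only this node ever writes this key, so a plain write suffices.
        self.kv.write(self._key(self.node_id), self._local)

    def read(self):
        # The counter's value is the sum of every node's slot.
        return sum(self.kv.read(self._key(n)) for n in self.all_node_ids)


if __name__ == "__main__":
    kv = InMemoryKV()
    ids = ["n0", "n1", "n2"]
    nodes = [GCounterNode(i, ids, kv) for i in ids]
    nodes[0].add(3)
    nodes[1].add(4)
    nodes[2].add(5)
    print([n.read() for n in nodes])
```

In this single-process stand-in every read sees the latest write to every key, so all nodes agree immediately. That guarantee is exactly what a real run against SeqKV may not give you, so the sketch shows the data layout only and says nothing about why the Maelstrom test is flaky.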
So what’s the takeaway?
The emotional heart of the piece is this: the subtle semantics of the underlying KV service (sequential consistency, API details) and the behavior of the test harness itself can turn straightforward distributed designs into flaky experiments. The author’s point is not that these solutions are hopeless, but that the learning comes from poking at the failure modes. Want to see the code and the full write-up? The detailed exploration is on Bruno Calza’s blog (linked from Hacker News), and it’s a neat reminder: distributed systems will humble you, often with a wink.
Sources: brunocalza.me, Hacker News