S3 Files and the changing face of S3 — Amazon engineer lays out a practical fix for a maddening cloud problem

What happened
According to a post by Andy Warfield on the All Things Distributed blog, a common modern pain point traces back to a botany lab at UBC. Work on sunflower DNA (promiscuity jokes aside) led Warfield and collaborators to confront a blunt truth: their tools expect a local POSIX filesystem, while cloud storage is object-based. The result? Researchers and engineers spend precious time copying terabytes around and chasing multiple inconsistent copies instead of doing science or training models.
The workaround and the new idea
Warfield recounts how a grad student, JS Legare, built a system nicknamed “bunnies” to package analyses in containers and run massively parallel genomics workloads on S3 with serverless compute. It was a win for velocity and repeatability, but friction persisted at the storage boundary. S3 offered durability, low cost and parallelism, yet every tool in the lab expected files. So the team kept hitting the same wall: data movement, wasted time, and brittle pipelines.
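To make the storage boundary concrete, here is a minimal Python sketch of the copy-then-process pattern described above. It is an illustration only, not code from Warfield's post: the in-memory `OBJECT_STORE` dict stands in for a real object store such as S3, and `count_sequences`, `stage_object` and `run_on_object` are hypothetical names invented for this example.

```python
import os
import tempfile

# Hypothetical stand-in for an object store: keys map to whole byte blobs.
# A real pipeline would call an S3 client here; this dict is only an illustration.
OBJECT_STORE = {
    "genomes/sample1.fasta": b">seq1\nACGTACGT\n>seq2\nGGCC\n",
}


def count_sequences(path):
    """A file-oriented tool: it only accepts a local filesystem path."""
    with open(path, "r") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def stage_object(key, directory):
    """Copy a whole object to local disk so a file-based tool can open it."""
    local_path = os.path.join(directory, os.path.basename(key))
    with open(local_path, "wb") as out:
        out.write(OBJECT_STORE[key])
    return local_path


def run_on_object(key):
    """Stage the object, run the tool, then discard the local copy."""
    with tempfile.TemporaryDirectory() as scratch:
        local_path = stage_object(key, scratch)
        return count_sequences(local_path)


if __name__ == "__main__":
    print(run_on_object("genomes/sample1.fasta"))
```

The tool itself is trivial; the cost sits in `stage_object`, which must copy the entire object before any work can start. At terabyte scale, and repeated across many workers, that staging step is the data movement the lab kept running into.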
According to the post, the response was S3 Files, an effort to bridge object storage and filesystem expectations and remove that costly manual copy step. Warfield’s post walks through the hard lessons, a few comic missteps (including an “ill-fated” naming attempt), and why this problem crops up across industries, from wet labs to machine-learning pipelines. The emotional kernel is obvious: people were being forced to play traffic cop for their own data.
Why it matters
Why should anyone outside genomics care? Because the story is a microcosm of a broader trend: cloud primitives are powerful, but developer and researcher workflows still expect old-school files. Solving that mismatch could save countless hours and unlock faster experimentation. Warfield’s account is a clear nudge and, if you’ve ever cursed at a long-running rsync, a strangely satisfying one. It’s a short, practical read for anyone wrestling with large datasets in the cloud.
Sources: allthingsdistributed.com, Hacker News