Progressive encoding and decoding of "repeated" protobuffer fields

What’s broken
A new deep-dive explains how the protobuf wire format can be used to stream large repeated message fields, so you don't have to keep the whole message in RAM. The author, writing about Perfetto trace files used by tools like Tonbandgerät and CircumSpect, says traces can quickly balloon into millions of TracePacket messages and gigabytes of data, and standard protobuf workflows reportedly force you to deserialize the entire parent Trace structure at once. Frustrating, right? No one wants to babysit a gigabyte-sized struct while waiting for post-processing.
How the wire helps
The trick isn't magic; it's the wire format. The post walks through varint encoding and length-delimited submessages, the primitives protobuf uses to pack tags, lengths, and payloads, and shows how understanding those pieces makes progressive append and read strategies possible. By writing each TracePacket as a wire-encoded record (tag + length + data), one after another, you can stream out packets as they're generated, and conversely read and process packets one at a time without materializing the full Trace in memory. This works because, on the wire, a repeated message field is just a sequence of such records, so concatenating them yields a valid encoding of the parent message. Most protobuf libraries reportedly don't provide this pattern out of the box, so you have to think in bytes, not objects.
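To make the tag + length + data idea concrete, here is a minimal Python sketch. It is not the code from the post. It assumes the repeated TracePacket field has field number 1 in the parent Trace message, and the helper names (encode_varint, read_varint, append_packet, iter_packets) are made up for illustration. Payloads are treated as already-serialized TracePacket bytes.

```python
import io
from typing import BinaryIO, Iterator, Optional

# Assumption: the repeated TracePacket field is field number 1 in Trace.
PACKET_FIELD_NUMBER = 1
WIRE_TYPE_LEN = 2  # protobuf wire type for length-delimited values


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("varint sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint; return None on a clean end of stream."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None  # no bytes consumed: normal end of input
            raise EOFError("stream ended in the middle of a varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")


def append_packet(stream: BinaryIO, payload: bytes) -> None:
    """Append one serialized packet as a tag + length + data record."""
    tag = (PACKET_FIELD_NUMBER << 3) | WIRE_TYPE_LEN
    stream.write(encode_varint(tag))
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def iter_packets(stream: BinaryIO) -> Iterator[bytes]:
    """Yield packet payloads one at a time without loading the whole file."""
    expected_tag = (PACKET_FIELD_NUMBER << 3) | WIRE_TYPE_LEN
    while True:
        tag = read_varint(stream)
        if tag is None:
            return
        if tag != expected_tag:
            raise ValueError(f"unexpected tag {tag}; sketch handles one field")
        length = read_varint(stream)
        if length is None:
            raise EOFError("missing length after tag")
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated packet payload")
        yield payload


if __name__ == "__main__":
    buf = io.BytesIO()
    for chunk in (b"first", b"second"):
        append_packet(buf, chunk)  # stand-ins for serialized TracePackets
    buf.seek(0)
    for payload in iter_packets(buf):
        print(len(payload), payload)
```

In real use, each payload would come from the library's normal per-message serializer and be handed to its per-message parser, so only one packet is in memory at a time. A production reader would also need to skip or handle other top-level fields rather than reject them.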
Why it matters
This is a practical win for anyone capturing long-running traces or operating on huge message logs: lower memory use, faster write paths, and simpler post-processing pipelines. It also nudges a larger conversation about serialization design — zero-copy formats like Cap'n Proto promise different trade-offs, but sometimes you can get surprisingly far simply by reading the spec and doing a little byte fiddling. The blog is a neat reminder: sometimes the best tool is a clearer look at the format you already have.
Sources: schilk.co, Hacker News