Linux extreme performance H1 load generator

Overview
Glass Cannon, a new high-performance HTTP/1.1 and WebSocket load generator available at gcannon.org, has been released. It is built on Linux io_uring and targets low-level, extreme-throughput testing. It requires Linux 6.1+, gcc, and liburing-dev 2.5+. The project claims to be "the fastest HTTP load generator available." Want to squeeze more requests per second out of a single box? This tool is explicitly engineered for that.
Features and interface
Glass Cannon pairs a simple CLI (the only required argument is the target URL) with an optional rich TUI: live progress, a req/s sparkline, throughput stats updated every second, and a color-coded percentile table (cyan, yellow, red for p99/p99.9). Every response is recorded, and the author says percentiles are exact rather than estimated. JSON output for CI and dashboards is available with --json, and WebSocket runs add ws_upgrades and ws_frames fields. The tool tracks reconnects when you use -r N. It also keeps a rolling history (~100 runs) at ~/.gcannon/history.bin, shown as trend bars in the TUI, with a --clear-history switch to reset it.
Precise latency measurement
Where Glass Cannon tries to stand out is latency fidelity. It uses clock_gettime(CLOCK_MONOTONIC) via the kernel vDSO for fast, monotonic nanosecond timestamps and records per-request send and arrival times in per-connection circular buffers. Responses are matched FIFO to send timestamps; arrival − send gives microsecond-resolution latency samples. A two-tier histogram captures the distribution: tier 1 covers 0–10 ms at 1 μs resolution (10,000 buckets) and tier 2 spans higher latencies (roughly 10 ms–5 s at coarser resolution). Histograms auto-zoom to the observed range and you can enable per-template latency breakdowns when using multiple raw request files.
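The measurement path described above can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not Glass Cannon's actual source: the type and function names (send_ring, latency_hist, on_response), the ring capacity, and the 1 ms tier-2 bucket width are all hypothetical. Only the use of clock_gettime(CLOCK_MONOTONIC), the FIFO matching, and the tier-1 layout (0-10 ms at 1 μs, 10,000 buckets) come from the description. On mainstream Linux platforms glibc normally serves this clock_gettime call through the vDSO, so no system call is made.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define RING_CAP      1024   /* hypothetical cap on in-flight requests per connection */
#define TIER1_BUCKETS 10000  /* 0-10 ms at 1 us per bucket (from the article) */
#define TIER2_BUCKETS 4990   /* 10 ms-5 s at 1 ms per bucket (assumed width) */

/* Per-connection circular buffer of send timestamps (hypothetical layout). */
typedef struct {
    uint64_t send_ns[RING_CAP];
    size_t head;   /* index of the oldest in-flight request */
    size_t count;  /* number of in-flight requests */
} send_ring;

/* Two-tier latency histogram (hypothetical layout). */
typedef struct {
    uint64_t tier1[TIER1_BUCKETS];
    uint64_t tier2[TIER2_BUCKETS];
    uint64_t overflow;  /* samples of 5 s or more */
} latency_hist;

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Record the send time of a request; returns -1 if the ring is full. */
static int ring_push(send_ring *r, uint64_t t_ns) {
    if (r->count == RING_CAP) return -1;
    r->send_ns[(r->head + r->count) % RING_CAP] = t_ns;
    r->count++;
    return 0;
}

/* Pop the oldest send time (FIFO); returns -1 if nothing is in flight. */
static int ring_pop(send_ring *r, uint64_t *t_ns) {
    if (r->count == 0) return -1;
    *t_ns = r->send_ns[r->head];
    r->head = (r->head + 1) % RING_CAP;
    r->count--;
    return 0;
}

/* Place one latency sample into the matching tier. */
static void hist_record(latency_hist *h, uint64_t latency_ns) {
    uint64_t us = latency_ns / 1000;
    if (us < 10000)
        h->tier1[us]++;                    /* 1 us buckets */
    else if (us < 5000000)
        h->tier2[(us - 10000) / 1000]++;   /* 1 ms buckets */
    else
        h->overflow++;
}

/* Match a response to the oldest outstanding request and record
   latency = arrival - send. Returns -1 for an unmatched response. */
static int on_response(send_ring *r, latency_hist *h, uint64_t arrival_ns) {
    uint64_t sent;
    if (ring_pop(r, &sent) != 0) return -1;
    assert(arrival_ns >= sent);  /* holds for a monotonic clock */
    hist_record(h, arrival_ns - sent);
    return 0;
}
```

In a real event loop, ring_push(&ring, now_ns()) would run when a request is submitted and on_response(&ring, &hist, now_ns()) when a complete response is parsed. FIFO matching is valid because HTTP/1.1 responses on one connection arrive in request order.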
Why it matters
In an era of microservices and tight SLOs, accurate latency numbers matter as much as raw throughput. Glass Cannon reads like a tool for people who demand both: high RPS from io_uring batching and precise, reproducible latency sampling for percentiles that teams actually trust. Will it displace existing load testers? That remains to be seen, but for benchmark junkies and CI-driven performance gates, this one is worth a spin.
Sources: gcannon.org, Hacker News