feat: add cachecompress package to compress static files for HTTP by spikecurtis · Pull Request #21915 · coder/coder

relates to: coder/internal#1300

Adds a new package called cachecompress, which takes an http.FileSystem and wraps it with an on-disk cache of compressed files. Files are compressed lazily, when they are requested over HTTP.

Why we want this

With cached compression, we significantly reduce CPU utilization during workspace creation.

image.png

This is from a 2k scaletest run at the top of this stack of PRs, where cachecompress is used to serve /bin/ files. Previously we pegged the 4-core Coderds, with profiling showing 40% of CPU going to zstd compression (cf. coder/internal#1300).

With this change, compression drops to 1s of CPU time (from 7 minutes).

Implementation details

The basic structure is taken from Chi's Compressor middleware. I've reproduced the LICENSE in the directory because it's MIT licensed, not AGPL like the rest of Coder.

I've structured it not as a middleware that calls an arbitrary upstream HTTP handler, but as a handler that takes an explicit http.FileSystem. This is for safety: it ensures we only cache static files, never dynamically generated content.

One limitation is that on first request for a resource, it compresses the whole file before starting to return any data to the client. For large files like the Coder binaries, this can add 1-5 seconds to the time-to-first-byte, depending on the compression used.

I think this is reasonable: it only affects the very first download of the binary with a particular compression for a particular Coderd.

If we later find this unacceptable, we can fix it without changing interfaces. We can poll the file system to figure out how much data is available while the compression is in progress.