Task #2981
closed
Help Online: Optionally compress static pages
0%
Description
Help Online packages are rather large, at 1.8GiB per release. Xapian indices (for full text search, cf. #2555), including spelling/stemming suggestions, add another 1.9GiB per release on top of that, which raises some concerns in terms of scalability and sustainability from an infrastructure perspective.
6.4 has 164735 files right now; here are the extensions weighing ≥4MiB:
ext   #files  avg. size    tot size
----  ------  ---------  ----------
svg     4341    1.16kiB     4.91MiB
ods      329   20.53kiB     6.59MiB
png      724   13.88kiB     9.81MiB
js       204  276.81kiB    55.96MiB
html  159061    9.43kiB  1464.93MiB
Some of these files (html, js, svg, css) have a fairly high compression ratio. In fact all modern browsers send Accept-Encoding: gzip headers in their requests, causing the HTTPd to compress the payload on the fly, which the client then decompresses on reception. This saves traffic, but not space, and costs the HTTPd some overhead for the extra processing.
Storing these files pre-compressed on the server instead would save space as well, with further advantages:
- compression is done once and for all on Olivier Hallot's workstation, meaning less work to be done on the HTTPd side (hence faster processing time);
- since compression isn't done on the fly one can safely use more aggressive options (compression level) without risk of DoS'ing ourselves; and
- the HTTPd can safely add a Content-Length header to the response (this is not possible for pipelined compression since the server doesn't know the size of the payload by the time it writes the header part).
For the few browsers not supporting gzip or not sending Accept-Encoding: gzip in the request, the requested file, stored compressed on the server, would be decompressed on the fly by the server, and the decompressed payload served as is (without Content-Length header). So pretty much the opposite of what's performed right now.
Concretely, what I request is a flag to optionally run
find /path/to/6.4 -type f \
\( -name "*.css" -o -name "*.html" -o -name "*.js" -o -name "*.svg" \) \
\! -size -128c \
-print0 | xargs -r0 gzip -n
after a successful build. Symbolic links require some extra care: if the target is compressed, the link should be removed and replaced by one with a .gz suffix that targets the .gz counterpart.
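The symbolic link handling could look roughly like the sketch below, meant to run after the find/gzip step. The function name relink_compressed is hypothetical, and the sketch assumes file names without embedded newlines. Because find -type f skips symbolic links, a link whose target was compressed is left dangling, which is what the check relies on.

```shell
# relink_compressed DIR
# For every symlink under DIR whose target is gone but has a .gz counterpart,
# replace the link with LINK.gz pointing at TARGET.gz.
# (Hypothetical helper; assumes no newlines in file names.)
relink_compressed() {
    find "$1" -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        # Resolve relative targets against the directory holding the link.
        case $target in
            /*) resolved=$target ;;
            *)  resolved=$(dirname "$link")/$target ;;
        esac
        # Leave links alone unless their target was actually compressed.
        if [ ! -e "$resolved" ] && [ -f "$resolved.gz" ]; then
            rm -- "$link"
            ln -s -- "$target.gz" "$link.gz"
        fi
    done
}
```

Links to files that were not compressed (wrong extension, or below the size threshold) keep their original name and target.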
I.e., compress these files with gzip(1)'s default options, but only those of at least 128 bytes.
Maybe the list of extensions to compress and the compression threshold (128 bytes) could be specified by the flag.
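One possible shape for such a configurable step, as a sketch only: the function name compress_static, its argument order and its defaults are assumptions for illustration, not an existing build option. It wraps the same find/xargs/gzip pipeline as above.

```shell
# compress_static DIR [THRESHOLD_BYTES [EXT...]]
# Gzip regular files under DIR that have one of the given extensions and are
# at least THRESHOLD_BYTES in size. Defaults: 128 bytes; css, html, js, svg.
# (Hypothetical helper; name and interface are illustrative assumptions.)
compress_static() {
    dir=$1
    threshold=${2:-128}
    if [ "$#" -gt 2 ]; then shift 2; else set -- css html js svg; fi

    # Turn the extension list into: -name "*.a" -o -name "*.b" ...
    first=1
    for ext in "$@"; do
        if [ "$first" -eq 1 ]; then
            set -- -name "*.$ext"
            first=0
        else
            set -- "$@" -o -name "*.$ext"
        fi
    done

    find "$dir" -type f \( "$@" \) ! -size "-${threshold}c" -print0 |
        xargs -r0 gzip -n
}
```

Called as compress_static /path/to/6.4 it behaves like the command above; compress_static /path/to/6.4 256 html js would restrict it to html and js files of at least 256 bytes.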
I'll take care of the server configuration. (In fact I already have a PoC for 6.4.) That requires a new location{} block, and since we already had to add one for 6.4 (for #2555) it's best for the infra team if that flag were added to 6.4 as well.