linux-next/lib/zstd/decompress
Nick Terrell 40eb0e915d zstd: Backport Huffman speed improvement from upstream
Backport upstream commit c7269ad [0] to improve zstd decoding speed.

Updating the kernel to zstd v1.5.5 earlier in this patch series
regressed zstd decoding speed. The cause turned out to be that gcc was
not unrolling the inner loops of the Huffman decoder, which execute a
constant number of times [1]. This hurts performance significantly,
because these loops are expected to be completely branch-free. This
commit fixes the issue by unrolling the loops manually [2].
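
As a rough illustration of the difference between a constant-trip-count
loop and its manually unrolled form, here is a minimal sketch. It is not
the upstream code: `decode_one`, `decode4_loop`, `decode4_unrolled` and
the 2-bit lookup table are hypothetical stand-ins for a Huffman symbol
decode.

```c
#include <stdint.h>

/* Hypothetical stand-in for one symbol decode: use the top 2 bits of the
 * bit container as a table index, emit one byte, consume the 2 bits. */
static inline void decode_one(uint64_t *bits, uint8_t **out,
                              const uint8_t table[4])
{
    *(*out)++ = table[*bits >> 62];
    *bits <<= 2;
}

/* Loop form: the trip count is the constant 4, but whether the compiler
 * unrolls it is up to its heuristics. */
static void decode4_loop(uint64_t *bits, uint8_t **out,
                         const uint8_t table[4])
{
    for (int i = 0; i < 4; ++i)
        decode_one(bits, out, table);
}

/* Manually unrolled form: same work, with no loop counter or branch
 * left for the compiler to remove. */
static void decode4_unrolled(uint64_t *bits, uint8_t **out,
                             const uint8_t table[4])
{
    decode_one(bits, out, table);
    decode_one(bits, out, table);
    decode_one(bits, out, table);
    decode_one(bits, out, table);
}
```

Both functions produce identical output; the unrolled one simply does
not depend on the compiler's unrolling decision.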

The commit also fixes a minor issue by masking a variable shift amount
with 0x3F. The shift was already guaranteed to be less than 64, but gcc
couldn't prove that, and emitted suboptimal code.
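
The masking idea can be sketched as follows. This is an illustration
only: `consume_bits` is a hypothetical helper, not a function from the
zstd sources, and it assumes the caller already guarantees a shift
amount below 64.

```c
#include <stdint.h>

/* Hypothetical helper: drop `nb_bits` already-consumed bits from a
 * 64-bit container. The caller is assumed to guarantee nb_bits < 64. */
static inline uint64_t consume_bits(uint64_t container, unsigned nb_bits)
{
    /* For valid inputs the mask changes nothing, but it makes the
     * 0..63 range explicit, so the compiler does not have to prove it. */
    return container << (nb_bits & 0x3F);
}
```

Because the mask never alters a valid shift amount, the result is
unchanged for every input the caller is allowed to pass.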

Finally, the upstream commit added a build macro
`HUF_DISABLE_FAST_DECODE`, which is not used in the kernel but is kept
to maintain a clean import from upstream.

This commit was generated from upstream signed tag v1.5.5-kernel [3] by:

  export ZSTD=/path/to/repo/zstd/
  export LINUX=/path/to/repo/linux/
  cd "$ZSTD/contrib/linux-kernel"
  git checkout v1.5.5-kernel
  make import LINUX="$LINUX"

I ran my benchmark & test suite before and after this commit to measure
the overall decompression speed benefit. It benchmarks zstd at several
compression levels. These benchmarks measure the total time it takes to
read data from the compressed filesystem.

Component | Level | Read time delta
----------|-------|----------------
Btrfs     |     1 | -7.0%
Btrfs     |     3 | -3.9%
Btrfs     |     5 | -4.7%
Btrfs     |     7 | -5.5%
Btrfs     |     9 | -2.4%
Squashfs  |     1 | -9.1%
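
For reading the table, the sketch below shows one way such a delta
could be computed. It assumes the delta is the relative change in read
time, (after - before) / before, so a negative value means reads got
faster. The function name is hypothetical, and the exact formula used
by the benchmark suite is not stated here.

```c
/* Hypothetical helper: relative change in read time, in percent.
 * Assumes delta = (after - before) / before; negative means faster. */
static double read_time_delta_pct(double before, double after)
{
    return 100.0 * (after - before) / before;
}
```

Under that assumption, a read that took 100 time units before the
commit and 93 after would be reported as -7.0%. These figures are
purely illustrative, not measurements.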

Link: c7269add7e
Link: https://gist.github.com/terrelln/2e14ff1fb197102a08d7823d8044978d
Link: https://gist.github.com/terrelln/a70bde22a2abc800691fb65c21eabc2a
Link: https://github.com/facebook/zstd/tree/v1.5.5-kernel
Signed-off-by: Nick Terrell <terrelln@fb.com>
2023-11-20 14:49:06 -08:00
huf_decompress.c            zstd: Backport Huffman speed improvement from upstream  2023-11-20 14:49:06 -08:00
zstd_ddict.c                zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00
zstd_ddict.h                zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00
zstd_decompress_block.c     zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00
zstd_decompress_block.h     zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00
zstd_decompress_internal.h  zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00
zstd_decompress.c           zstd: import upstream v1.5.5                            2023-11-20 14:48:34 -08:00