linux-next/lib/zstd/compress/hist.h
Nick Terrell 98988fc8e9 zstd: import upstream v1.5.5
Import upstream zstd v1.5.5 to expose upstream's QAT integration.

Import from upstream commit 58b3ef79 [0]. This is one commit before the
tag v1.5.5-kernel [1], which is signed with upstream's signing key. The
next patch in the series imports from v1.5.5-kernel, and is included in
the series, rather than just importing directly from v1.5.5-kernel,
because it is a non-trivial patch applied to improve the kernel's
decompression speed. This commit contains 3 backported patches on top of
v1.5.5: Two from the Linux copy of zstd, and one from upstream's `dev`
branch.

In addition to keeping the kernel's copy of zstd up to date, this update
was requested by Intel to expose upstream zstd's external match provider
API to the kernel, which allows QAT to accelerate the LZ match finding
stage.

This commit was generated by:

  export ZSTD=/path/to/repo/zstd/
  export LINUX=/path/to/repo/linux/
  cd "$ZSTD/contrib/linux-kernel"
  git checkout v1.5.5-kernel~
  make import LINUX="$LINUX"

I tested and benchmarked this commit on x86-64 with gcc-13.2.1 on an
Intel i9-9900K by running my benchmark scripts that benchmark zstd's
performance in btrfs and squashfs compressed filesystems. This commit
improves compression speed, especially for higher compression levels,
and regresses decompression speed. But the decompression speed
regression is addressed by the next patch in the series.

Component,	Level,	C. time delta,	size delta,	D. time delta
Btrfs    ,	    1,	        -1.9%,	     +0.0%,	        +9.5%
Btrfs    ,	    3,	        -5.6%,	     +0.0%,	        +7.4%
Btrfs    ,	    5,	        -4.9%,	     +0.0%,	        +5.0%
Btrfs    ,	    7,	        -5.7%,	     +0.0%,	        +5.2%
Btrfs    ,	    9,	        -5.7%,	     +0.0%,	        +4.0%
Squashfs ,	    1,	          N/A,	      0.0%,	       +11.6%

I also boot tested with a zstd compressed kernel on i386 and aarch64.

Link: 58b3ef79eb
Link: https://github.com/facebook/zstd/tree/v1.5.5-kernel
Signed-off-by: Nick Terrell <terrelln@fb.com>
2023-11-20 14:48:34 -08:00

/* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */
/* ******************************************************************
* hist : Histogram functions
* part of Finite State Entropy project
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* You can contact the author at :
* - FSE source repository : https://github.com/Cyan4973/FiniteStateEntropy
* - Public forum : https://groups.google.com/forum/#!forum/lz4c
*
* This source code is licensed under both the BSD-style license (found in the
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
* in the COPYING file in the root directory of this source tree).
* You may select, at your option, one of the above-listed licenses.
****************************************************************** */
/* --- dependencies --- */
#include "../common/zstd_deps.h" /* size_t */
/* --- simple histogram functions --- */
/*! HIST_count():
* Provides the precise count of each byte within a table 'count'.
* 'count' is a table of unsigned int, of minimum size (*maxSymbolValuePtr+1).
* Updates *maxSymbolValuePtr with actual largest symbol value detected.
* @return : count of the most frequent symbol (which isn't identified).
* or an error code, which can be tested using HIST_isError().
* note : if return == srcSize, there is only one symbol.
*/
size_t HIST_count(unsigned* count, unsigned* maxSymbolValuePtr,
                  const void* src, size_t srcSize);
unsigned HIST_isError(size_t code); /**< tells if a return value is an error code */
/* --- advanced histogram functions --- */
#define HIST_WKSP_SIZE_U32 1024
#define HIST_WKSP_SIZE (HIST_WKSP_SIZE_U32 * sizeof(unsigned))
/* HIST_count_wksp() :
* Same as HIST_count(), but using an externally provided scratch buffer.
* Benefit is this function will use very little stack space.
* `workSpace` is a writable buffer which must be 4-byte aligned,
* `workSpaceSize` must be >= HIST_WKSP_SIZE
*/
size_t HIST_count_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
                       const void* src, size_t srcSize,
                       void* workSpace, size_t workSpaceSize);
/* HIST_countFast() :
* Same as HIST_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr.
* This function is unsafe: behavior is undefined (it may segfault) if any value within `src` is `> *maxSymbolValuePtr`.
*/
size_t HIST_countFast(unsigned* count, unsigned* maxSymbolValuePtr,
                      const void* src, size_t srcSize);
/* HIST_countFast_wksp() :
* Same as HIST_countFast(), but using an externally provided scratch buffer.
* `workSpace` is a writable buffer which must be 4-byte aligned,
* `workSpaceSize` must be >= HIST_WKSP_SIZE
*/
size_t HIST_countFast_wksp(unsigned* count, unsigned* maxSymbolValuePtr,
                           const void* src, size_t srcSize,
                           void* workSpace, size_t workSpaceSize);
/*! HIST_count_simple() :
* Like HIST_countFast(), this function is unsafe:
* behavior is undefined (it may segfault) if any value within `src` is `> *maxSymbolValuePtr`.
* It is also a bit slower for large inputs.
* However, it does not need any additional memory (not even on stack).
* @return : count of the most frequent symbol.
* Note this function doesn't produce any error (i.e. it must succeed).
*/
unsigned HIST_count_simple(unsigned* count, unsigned* maxSymbolValuePtr,
                           const void* src, size_t srcSize);
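
The calling convention documented for HIST_count_simple() can be illustrated with a small standalone sketch. The name `sketch_count_simple` is hypothetical, and the body is an assumption reconstructed from the header comments alone; it is not zstd's implementation. It only shows the contract: `count` holds at least `*maxSymbolValuePtr + 1` entries, `*maxSymbolValuePtr` is updated to the largest symbol actually seen, and the return value is the count of the most frequent symbol.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that mirrors the documented contract of
 * HIST_count_simple(); NOT the zstd implementation.
 * Precondition (unchecked, as in the header): every byte in `src`
 * is <= *maxSymbolValuePtr, and `count` has *maxSymbolValuePtr + 1 entries. */
unsigned sketch_count_simple(unsigned* count, unsigned* maxSymbolValuePtr,
                             const void* src, size_t srcSize)
{
    const unsigned char* ip = (const unsigned char*)src;
    unsigned maxSymbolValue = *maxSymbolValuePtr;
    unsigned largestCount = 0;
    unsigned s;
    size_t i;

    /* Clear every slot the caller declared as valid. */
    memset(count, 0, ((size_t)maxSymbolValue + 1) * sizeof(*count));

    /* Empty input: no symbol was seen at all. */
    if (srcSize == 0) {
        *maxSymbolValuePtr = 0;
        return 0;
    }

    /* One pass over the input, one counter increment per byte. */
    for (i = 0; i < srcSize; i++)
        count[ip[i]]++;

    /* Shrink maxSymbolValue down to the largest symbol actually present.
     * The loop terminates because srcSize > 0 guarantees a nonzero slot. */
    while (count[maxSymbolValue] == 0)
        maxSymbolValue--;
    *maxSymbolValuePtr = maxSymbolValue;

    /* The return value is the frequency of the most common symbol. */
    for (s = 0; s <= maxSymbolValue; s++)
        if (count[s] > largestCount)
            largestCount = count[s];

    return largestCount;
}
```

A caller would typically pass a 256-entry `count` table with `*maxSymbolValuePtr` set to 255, which satisfies the precondition for any byte input. As the note on HIST_count() says, a return value equal to `srcSize` means the input consists of a single repeated symbol.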