2019-05-19 12:07:45 +00:00
|
|
|
# SPDX-License-Identifier: GPL-2.0-only
|
2005-04-16 22:20:36 +00:00
|
|
|
#
|
|
|
|
# Library configuration
|
|
|
|
#
|
|
|
|
|
2009-03-06 16:21:46 +00:00
|
|
|
config BINARY_PRINTF
|
|
|
|
def_bool n
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
menu "Library routines"
|
|
|
|
|
2009-07-13 10:35:12 +00:00
|
|
|
config RAID6_PQ
|
|
|
|
tristate
|
|
|
|
|
2018-11-12 23:26:52 +00:00
|
|
|
config RAID6_PQ_BENCHMARK
|
|
|
|
bool "Automatically choose fastest RAID6 PQ functions"
|
|
|
|
depends on RAID6_PQ
|
|
|
|
default y
|
|
|
|
help
|
|
|
|
Benchmark all available RAID6 PQ functions on init and choose the
|
|
|
|
fastest one.
|
|
|
|
|
2020-05-08 15:39:35 +00:00
|
|
|
config LINEAR_RANGES
|
|
|
|
tristate
|
|
|
|
|
2019-05-02 20:23:29 +00:00
|
|
|
config PACKING
|
|
|
|
bool "Generic bitfield packing and unpacking"
|
2022-12-10 00:44:23 +00:00
|
|
|
select BITREVERSE
|
2019-05-02 20:23:29 +00:00
|
|
|
default n
|
|
|
|
help
|
|
|
|
This option provides the packing() helper function, which permits
|
|
|
|
converting bitfields between a CPU-usable representation and a
|
|
|
|
memory representation that can have any combination of these quirks:
|
|
|
|
- Is little endian (bytes are reversed within a 32-bit group)
|
|
|
|
- The least-significant 32-bit word comes first (within a 64-bit
|
|
|
|
group)
|
|
|
|
- The most significant bit of a byte is at its right (bit 0 of a
|
|
|
|
register description is numerically 2^7).
|
|
|
|
Drivers may use these helpers to match the bit indices as described
|
|
|
|
in the data sheets of the peripherals they are in control of.
|
|
|
|
|
|
|
|
When in doubt, say N.
|
|
|
|
|
2024-10-02 21:51:55 +00:00
|
|
|
config PACKING_KUNIT_TEST
|
|
|
|
tristate "KUnit tests for packing library" if !KUNIT_ALL_TESTS
|
|
|
|
depends on PACKING && KUNIT
|
|
|
|
default KUNIT_ALL_TESTS
|
|
|
|
help
|
|
|
|
This builds KUnit tests for the packing library.
|
|
|
|
|
|
|
|
For more information on KUnit and unit tests in general,
|
|
|
|
please refer to the KUnit documentation in Documentation/dev-tools/kunit/.
|
|
|
|
|
|
|
|
When in doubt, say N.
|
|
|
|
|
2006-12-08 10:36:25 +00:00
|
|
|
config BITREVERSE
|
|
|
|
tristate
|
|
|
|
|
2014-11-03 02:01:03 +00:00
|
|
|
config HAVE_ARCH_BITREVERSE
|
2015-02-17 00:00:20 +00:00
|
|
|
bool
|
2014-11-03 02:01:03 +00:00
|
|
|
default n
|
|
|
|
help
|
2015-04-16 19:49:07 +00:00
|
|
|
This option enables the use of hardware bit-reversal instructions on
|
|
|
|
architectures which support such operations.
|
2014-11-03 02:01:03 +00:00
|
|
|
|
2021-05-17 07:22:34 +00:00
|
|
|
config ARCH_HAS_STRNCPY_FROM_USER
|
2012-05-24 20:12:28 +00:00
|
|
|
bool
|
|
|
|
|
2021-05-17 07:22:34 +00:00
|
|
|
config ARCH_HAS_STRNLEN_USER
|
2012-05-26 18:06:38 +00:00
|
|
|
bool
|
|
|
|
|
2021-05-17 07:22:34 +00:00
|
|
|
config GENERIC_STRNCPY_FROM_USER
|
|
|
|
def_bool !ARCH_HAS_STRNCPY_FROM_USER
|
|
|
|
|
|
|
|
config GENERIC_STRNLEN_USER
|
|
|
|
def_bool !ARCH_HAS_STRNLEN_USER
|
|
|
|
|
2013-06-04 16:46:26 +00:00
|
|
|
config GENERIC_NET_UTILS
|
|
|
|
bool
|
|
|
|
|
2019-05-14 22:43:05 +00:00
|
|
|
source "lib/math/Kconfig"
|
|
|
|
|
2012-01-29 22:20:48 +00:00
|
|
|
config NO_GENERIC_PCI_IOPORT_MAP
|
|
|
|
bool
|
|
|
|
|
2011-11-24 12:54:28 +00:00
|
|
|
config GENERIC_IOMAP
|
|
|
|
bool
|
2011-11-24 18:45:20 +00:00
|
|
|
select GENERIC_PCI_IOMAP
|
2011-11-24 12:54:28 +00:00
|
|
|
|
lib: add support for stmp-style devices
MX23/28 use IP cores which follow a register layout I have first seen on
STMP3xxx SoCs. In this layout, every register actually has four u32:
1.) to store a value directly
2.) a SET register where every 1-bit sets the corresponding bit,
others are unaffected
3.) same with a CLR register
4.) same with a TOG (toggle) register
Also, the 2 MSBs in register 0 are always the same and can be used to reset
the IP core.
All this is strictly speaking not mach-specific (but IP core specific) and,
thus, doesn't need to be in mach-mxs/include. At least mx6 also uses IP cores
following this stmp-style. So:
Introduce a stmp-style device, put the code and defines for that in a public
place (lib/), and let drivers for stmp-style devices select that code.
To avoid regressions and ease reviewing, the actual code is simply copied from
mach-mxs. It definitely wants updates, but those need a separate patch series.
Voila, mach dependency gone, reusable code introduced. Note that I didn't
remove the duplicated code from mach-mxs yet, first the drivers have to be
converted.
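The four-slot register layout described above can be modeled in a few lines of C. This is only a sketch on a simulated register: `struct stmp_reg`, `stmp_write()` and the `STMP_*` slot names are hypothetical and are not the code copied from mach-mxs.

```c
#include <stdint.h>

/* Hypothetical indices of the four u32 slots that make up one register. */
enum { STMP_VAL = 0, STMP_SET = 1, STMP_CLR = 2, STMP_TOG = 3 };

/* Simulated register: only the value is stored; SET/CLR/TOG are write views. */
struct stmp_reg {
	uint32_t val;
};

/* Emulate what the IP core does when software writes one of the four slots. */
static void stmp_write(struct stmp_reg *reg, int slot, uint32_t bits)
{
	switch (slot) {
	case STMP_VAL:		/* store the value directly */
		reg->val = bits;
		break;
	case STMP_SET:		/* every 1-bit sets the corresponding bit */
		reg->val |= bits;
		break;
	case STMP_CLR:		/* every 1-bit clears the corresponding bit */
		reg->val &= ~bits;
		break;
	case STMP_TOG:		/* every 1-bit toggles the corresponding bit */
		reg->val ^= bits;
		break;
	default:		/* unknown slot: ignore the write */
		break;
	}
}
```

The point of the SET/CLR/TOG slots is that a driver can change individual bits with a single write, without a read-modify-write sequence on the value slot.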
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Dong Aisheng <dong.aisheng@linaro.org>
2011-08-31 18:35:40 +00:00
|
|
|
config STMP_DEVICE
|
|
|
|
bool
|
2012-12-18 00:01:39 +00:00
|
|
|
|
lockref: implement lockless reference count updates using cmpxchg()
Instead of taking the spinlock, the lockless versions atomically check
that the lock is not taken, and do the reference count update using a
cmpxchg() loop. This is semantically identical to doing the reference
count update protected by the lock, but avoids the "wait for lock"
contention that you get when accesses to the reference count are
contended.
Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
Even when the lockref reference counts are updated atomically with
cmpxchg, the fact that they also verify the state of the spinlock means
that the lockless updates can never happen while somebody else holds the
spinlock.
So while "lockref_put_or_lock()" looks a lot like just another name for
"atomic_dec_and_lock()", and both optimize to lockless updates, they are
fundamentally different: the decrement done by atomic_dec_and_lock() is
truly independent of any lock (as long as it doesn't decrement to zero),
so a locked region can still see the count change.
The lockref structure, in contrast, really is a *locked* reference
count. If you hold the spinlock, the reference count will be stable and
you can modify the reference count without using atomics, because even
the lockless updates will see and respect the state of the lock.
In order to enable the cmpxchg lockless code, the architecture needs to
do three things:
(1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
in an aligned u64, and have a "cmpxchg()" implementation that works
on such a u64 data type.
(2) define a helper function to test for a spinlock being unlocked
("arch_spin_value_unlocked()")
(3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
Kconfig file.
This enables it for x86-64 (but not 32-bit, we'd need to make sure
cmpxchg() turns into the proper cmpxchg8b in order to enable it for
32-bit mode).
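The cmpxchg loop described above can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the kernel implementation: `struct lockref_sketch`, the packing (lock word in the low 32 bits, count in the high 32 bits, lock value 0 meaning unlocked) and both helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: low 32 bits = lock word (0 = unlocked), high 32 bits = count. */
struct lockref_sketch {
	_Atomic uint64_t lock_count;
};

/* Stand-in for the arch helper that tests whether a lock value is unlocked. */
static bool lock_value_unlocked(uint64_t v)
{
	return (uint32_t)v == 0;
}

/*
 * Try to take a reference without acquiring the lock.
 * Returns false if the lock is held; a real caller would then fall back
 * to taking the spinlock and updating the count under it.
 */
static bool lockref_get_lockless(struct lockref_sketch *ref)
{
	uint64_t old = atomic_load(&ref->lock_count);

	while (lock_value_unlocked(old)) {
		uint64_t next = old + (UINT64_C(1) << 32);

		/* On failure, old is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&ref->lock_count, &old, next))
			return true;
	}
	return false;
}
```

Because the compare-and-exchange covers the lock word and the count together, the update can only succeed while the lock is observed unlocked, which is the property that distinguishes this scheme from a plain atomic counter.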
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02 19:12:15 +00:00
|
|
|
config ARCH_USE_CMPXCHG_LOCKREF
|
|
|
|
bool
|
|
|
|
|
2014-09-13 18:14:53 +00:00
|
|
|
config ARCH_HAS_FAST_MULTIPLIER
|
|
|
|
bool
|
|
|
|
|
2020-04-16 18:24:02 +00:00
|
|
|
config ARCH_USE_SYM_ANNOTATIONS
|
|
|
|
bool
|
|
|
|
|
2018-03-14 18:15:50 +00:00
|
|
|
config INDIRECT_PIO
|
|
|
|
bool "Access I/O in non-MMIO mode"
|
|
|
|
depends on ARM64
|
2023-03-23 16:33:52 +00:00
|
|
|
depends on HAS_IOPORT
|
2018-03-14 18:15:50 +00:00
|
|
|
help
|
|
|
|
On some platforms where no separate I/O space exists, there are I/O
|
|
|
|
hosts which can not be accessed in MMIO mode. Using the logical PIO
|
|
|
|
mechanism, the host-local I/O resource can be mapped into system
|
|
|
|
logic PIO space shared with MMIO hosts, such as PCI/PCIe, then the
|
|
|
|
system can access the I/O devices with the mapped-logic PIO through
|
|
|
|
I/O accessors.
|
|
|
|
|
|
|
|
This way has relatively little I/O performance cost. Please make
|
|
|
|
sure your devices really need this configure item enabled.
|
|
|
|
|
|
|
|
When in doubt, say N.
|
|
|
|
|
2021-03-05 12:19:52 +00:00
|
|
|
config INDIRECT_IOMEM
|
|
|
|
bool
|
|
|
|
help
|
|
|
|
This is selected by other options/architectures to provide the
|
|
|
|
emulated iomem accessors.
|
|
|
|
|
|
|
|
config INDIRECT_IOMEM_FALLBACK
|
|
|
|
bool
|
|
|
|
depends on INDIRECT_IOMEM
|
|
|
|
help
|
|
|
|
If INDIRECT_IOMEM is selected, this enables falling back to plain
|
|
|
|
mmio accesses when the IO memory address is not a registered
|
|
|
|
emulated region.
|
|
|
|
|
lib: Add register read/write tracing support
Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
are typically used to read/write from/to memory mapped registers
and can cause hangs or undefined behaviour in the following
cases:
* If the access to the register space is unclocked, for example: if
there is an access to multimedia(MM) block registers without MM
clocks.
* If the register space is protected and not set to be accessible from
non-secure world, for example: only EL3 (EL: Exception level) access
is allowed and any EL2/EL1 access is forbidden.
* If xPU(memory/register protection units) is controlling access to
certain memory/register space for specific clients.
and more...
Such cases usually result in instant reboot/SErrors/NOC or interconnect
hangs and tracing these register accesses can be very helpful to debug
such issues during initial development stages and also in later stages.
So use ftrace trace events to log such MMIO register accesses which
provides rich feature set such as early enablement of trace events,
filtering capability, dumping ftrace logs on console and many more.
Sample output:
rwmmio_write: __qcom_geni_serial_console_write+0x160/0x1e0 width=32 val=0xa0d5d addr=0xfffffbfffdbff700
rwmmio_post_write: __qcom_geni_serial_console_write+0x160/0x1e0 width=32 val=0xa0d5d addr=0xfffffbfffdbff700
rwmmio_read: qcom_geni_serial_poll_bit+0x94/0x138 width=32 addr=0xfffffbfffdbff610
rwmmio_post_read: qcom_geni_serial_poll_bit+0x94/0x138 width=32 val=0x0 addr=0xfffffbfffdbff610
Co-developed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-18 16:44:14 +00:00
|
|
|
config TRACE_MMIO_ACCESS
|
|
|
|
bool "Register read/write tracing"
|
|
|
|
depends on TRACING && ARCH_HAVE_TRACE_MMIO_ACCESS
|
|
|
|
help
|
|
|
|
Create tracepoints for MMIO read/write operations. These trace events
|
|
|
|
can be used for logging all MMIO read/write operations.
|
|
|
|
|
2022-01-12 14:01:38 +00:00
|
|
|
source "lib/crypto/Kconfig"
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
config CRC_CCITT
|
|
|
|
tristate "CRC-CCITT functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC-CCITT functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC-CCITT
|
|
|
|
functions require M here.
|
|
|
|
|
2005-08-17 11:17:26 +00:00
|
|
|
config CRC16
|
|
|
|
tristate "CRC16 functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC16 functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC16
|
|
|
|
functions require M here.
|
|
|
|
|
2008-06-25 15:22:42 +00:00
|
|
|
config CRC_T10DIF
|
|
|
|
tristate "CRC calculation for the T10 Data Integrity Field"
|
|
|
|
help
|
|
|
|
This option is only needed if a module that's not in the
|
|
|
|
kernel tree needs to calculate CRC checks for use with the
|
|
|
|
SCSI data integrity subsystem.
|
|
|
|
|
2024-12-02 01:20:46 +00:00
|
|
|
config ARCH_HAS_CRC_T10DIF
|
|
|
|
bool
|
|
|
|
|
|
|
|
choice
|
|
|
|
prompt "CRC-T10DIF implementation"
|
|
|
|
depends on CRC_T10DIF
|
|
|
|
default CRC_T10DIF_IMPL_ARCH if ARCH_HAS_CRC_T10DIF
|
|
|
|
default CRC_T10DIF_IMPL_GENERIC if !ARCH_HAS_CRC_T10DIF
|
|
|
|
help
|
|
|
|
This option allows you to override the default choice of CRC-T10DIF
|
|
|
|
implementation.
|
|
|
|
|
|
|
|
config CRC_T10DIF_IMPL_ARCH
|
|
|
|
bool "Architecture-optimized" if ARCH_HAS_CRC_T10DIF
|
|
|
|
help
|
|
|
|
Use the optimized implementation of CRC-T10DIF for the selected
|
|
|
|
architecture. It is recommended to keep this enabled, as it can
|
|
|
|
greatly improve CRC-T10DIF performance.
|
|
|
|
|
|
|
|
config CRC_T10DIF_IMPL_GENERIC
|
|
|
|
bool "Generic implementation"
|
|
|
|
help
|
|
|
|
Use the generic table-based implementation of CRC-T10DIF. Selecting
|
|
|
|
this will reduce code size slightly but can greatly reduce CRC-T10DIF
|
|
|
|
performance.
|
|
|
|
|
|
|
|
endchoice
|
|
|
|
|
|
|
|
config CRC_T10DIF_ARCH
|
|
|
|
tristate
|
|
|
|
default CRC_T10DIF if CRC_T10DIF_IMPL_ARCH
|
|
|
|
|
2022-03-03 20:13:10 +00:00
|
|
|
config CRC64_ROCKSOFT
|
|
|
|
tristate "CRC calculation for the Rocksoft model CRC64"
|
|
|
|
select CRC64
|
|
|
|
select CRYPTO
|
|
|
|
select CRYPTO_CRC64_ROCKSOFT
|
|
|
|
help
|
|
|
|
This option provides a CRC64 API to a registered crypto driver.
|
|
|
|
This is used with the block layer's data integrity subsystem.
|
|
|
|
|
2006-06-12 14:17:04 +00:00
|
|
|
config CRC_ITU_T
|
|
|
|
tristate "CRC ITU-T V.41 functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC ITU-T V.41 functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC ITU-T V.41
|
|
|
|
functions require M here.
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
config CRC32
|
2012-03-23 22:02:25 +00:00
|
|
|
tristate "CRC32/CRC32c functions"
|
2005-04-16 22:20:36 +00:00
|
|
|
default y
|
2006-12-08 10:36:25 +00:00
|
|
|
select BITREVERSE
|
2005-04-16 22:20:36 +00:00
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
2012-03-23 22:02:25 +00:00
|
|
|
modules require CRC32/CRC32c functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC32/CRC32c
|
|
|
|
functions require M here.
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config ARCH_HAS_CRC32
|
|
|
|
bool
|
|
|
|
|
2012-03-23 22:02:26 +00:00
|
|
|
choice
|
|
|
|
prompt "CRC32 implementation"
|
|
|
|
depends on CRC32
|
2024-12-02 01:08:27 +00:00
|
|
|
default CRC32_IMPL_ARCH_PLUS_SLICEBY8 if ARCH_HAS_CRC32
|
|
|
|
default CRC32_IMPL_SLICEBY8 if !ARCH_HAS_CRC32
|
2012-03-28 21:42:56 +00:00
|
|
|
help
|
2024-12-02 01:08:27 +00:00
|
|
|
This option allows you to override the default choice of CRC32
|
|
|
|
implementation. Choose the default unless you know that you need one
|
|
|
|
of the others.
|
2012-03-23 22:02:26 +00:00
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config CRC32_IMPL_ARCH_PLUS_SLICEBY8
|
|
|
|
bool "Arch-optimized, with fallback to slice-by-8" if ARCH_HAS_CRC32
|
|
|
|
help
|
|
|
|
Use architecture-optimized implementation of CRC32. Fall back to
|
|
|
|
slice-by-8 in cases where the arch-optimized implementation cannot be
|
|
|
|
used, e.g. if the CPU lacks support for the needed instructions.
|
|
|
|
|
|
|
|
This is the default when an arch-optimized implementation exists.
|
|
|
|
|
|
|
|
config CRC32_IMPL_ARCH_PLUS_SLICEBY1
|
|
|
|
bool "Arch-optimized, with fallback to slice-by-1" if ARCH_HAS_CRC32
|
|
|
|
help
|
|
|
|
Use architecture-optimized implementation of CRC32, but fall back to
|
|
|
|
slice-by-1 instead of slice-by-8 in order to reduce the binary size.
|
|
|
|
|
|
|
|
config CRC32_IMPL_SLICEBY8
|
2012-03-23 22:02:26 +00:00
|
|
|
bool "Slice by 8 bytes"
|
|
|
|
help
|
|
|
|
Calculate checksum 8 bytes at a time with a clever slicing algorithm.
|
2024-12-02 01:08:27 +00:00
|
|
|
This is much slower than the architecture-optimized implementation of
|
|
|
|
CRC32 (if the selected arch has one), but it is portable and is the
|
|
|
|
fastest implementation when no arch-optimized implementation is
|
|
|
|
available. It uses an 8KiB lookup table. Most modern processors have
|
|
|
|
enough cache to hold this table without thrashing the cache.
|
2012-03-23 22:02:26 +00:00
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config CRC32_IMPL_SLICEBY4
|
2012-03-23 22:02:26 +00:00
|
|
|
bool "Slice by 4 bytes"
|
|
|
|
help
|
|
|
|
Calculate checksum 4 bytes at a time with a clever slicing algorithm.
|
|
|
|
This is a bit slower than slice by 8, but has a smaller 4KiB lookup
|
|
|
|
table.
|
|
|
|
|
|
|
|
Only choose this option if you know what you are doing.
|
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config CRC32_IMPL_SLICEBY1
|
|
|
|
bool "Slice by 1 byte (Sarwate's algorithm)"
|
2012-03-23 22:02:26 +00:00
|
|
|
help
|
|
|
|
Calculate checksum a byte at a time using Sarwate's algorithm. This
|
2024-12-02 01:08:27 +00:00
|
|
|
is not particularly fast, but has a small 1KiB lookup table.
|
2012-03-23 22:02:26 +00:00
|
|
|
|
|
|
|
Only choose this option if you know what you are doing.
|
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config CRC32_IMPL_BIT
|
2012-03-23 22:02:26 +00:00
|
|
|
bool "Classic Algorithm (one bit at a time)"
|
|
|
|
help
|
|
|
|
Calculate checksum one bit at a time. This is VERY slow, but has
|
|
|
|
no lookup table. This is provided as a debugging option.
|
|
|
|
|
|
|
|
Only choose this option if you are debugging crc32.
|
|
|
|
|
|
|
|
endchoice
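The difference between the bit-at-a-time and slice-by-1 (Sarwate) options above can be sketched in standalone C. This is a userspace illustration, not the kernel's lib/crc32.c: the function names are hypothetical, and the reflected CRC-32 polynomial 0xEDB88320 is assumed. The 256-entry table of 32-bit words is where the 1KiB figure comes from.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xEDB88320u	/* reflected CRC-32 polynomial (assumed) */

/* Classic algorithm: one bit at a time, no lookup table. */
static uint32_t crc32_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
	}
	return crc;
}

/* 256 entries * 4 bytes = 1 KiB lookup table. */
static uint32_t crc32_table[256];

/* Precompute the effect of eight bitwise steps for every possible byte. */
static void crc32_table_init(void)
{
	for (uint32_t n = 0; n < 256; n++) {
		uint32_t c = n;

		for (int i = 0; i < 8; i++)
			c = (c >> 1) ^ ((c & 1) ? CRC32_POLY_LE : 0);
		crc32_table[n] = c;
	}
}

/* Sarwate's algorithm: one table lookup per input byte. */
static uint32_t crc32_sarwate(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc >> 8) ^ crc32_table[(crc ^ *p++) & 0xFF];
	return crc;
}
```

Both functions take the running CRC as a seed and apply no final inversion, so a caller wanting the conventional CRC-32 would pass 0xFFFFFFFF and invert the result. The slice-by-4 and slice-by-8 options extend the same idea with more tables so that several bytes are consumed per iteration.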
|
|
|
|
|
2024-12-02 01:08:27 +00:00
|
|
|
config CRC32_ARCH
|
|
|
|
tristate
|
|
|
|
default CRC32 if CRC32_IMPL_ARCH_PLUS_SLICEBY8 || CRC32_IMPL_ARCH_PLUS_SLICEBY1
|
|
|
|
|
|
|
|
config CRC32_SLICEBY8
|
|
|
|
bool
|
|
|
|
default y if CRC32_IMPL_SLICEBY8 || CRC32_IMPL_ARCH_PLUS_SLICEBY8
|
|
|
|
|
|
|
|
config CRC32_SLICEBY4
|
|
|
|
bool
|
|
|
|
default y if CRC32_IMPL_SLICEBY4
|
|
|
|
|
|
|
|
config CRC32_SARWATE
|
|
|
|
bool
|
|
|
|
default y if CRC32_IMPL_SLICEBY1 || CRC32_IMPL_ARCH_PLUS_SLICEBY1
|
|
|
|
|
|
|
|
config CRC32_BIT
|
|
|
|
bool
|
|
|
|
default y if CRC32_IMPL_BIT
|
|
|
|
|
lib: add crc64 calculation routines
Patch series "add crc64 calculation as kernel library", v5.
This patchset adds basic implementation of crc64 calculation as a Linux
kernel library. Since bcache already does crc64 by itself, this patchset
also modifies bcache code to use the new crc64 library routine.
Currently bcache is the only user of crc64 calculation, another potential
user is bcachefs which is on the way to be in mainline kernel. Therefore
it makes sense to make crc64 calculation to be a public library.
bcache uses crc64 as its storage checksum. If a change to the crc lib routines
produces an inconsistent result, the mismatched checksum may make bcache
'think' the on-disk data is corrupted; such a change should be avoided or
detected as early as possible. Therefore a patch is being prepared which
adds a crc test framework, to check consistency of different calculations.
This patch (of 2):
Add the rewritten crc64 calculation routines for the Linux kernel. The CRC64
polynomial arithmetic follows the ECMA-182 specification, inspired by the CRC
paper of Dr. Ross N. Williams (see
http://www.ross.net/crc/download/crc_v3.txt) and other public domain
implementations.
All the changes work in this way,
- When Linux kernel is built, host program lib/gen_crc64table.c will be
compiled to lib/gen_crc64table and executed.
- The output of gen_crc64table is a lookup table (for POLY
0x42f0e1eba9ea3693) which contains 256 64-bit numbers; this table is
dumped into the header file lib/crc64table.h.
- Then the header file is included by lib/crc64.c for normal 64bit crc
calculation.
- Function declaration of the crc64 calculation routines is placed in
include/linux/crc64.h
Currently bcache is the only user of crc64_be(), another potential user is
bcachefs which is on the way to be in mainline kernel. Therefore it makes
sense to move crc64 calculation into lib/crc64.c as public code.
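The table generation and lookup steps listed above can be sketched as follows. This is a self-contained illustration rather than the contents of lib/gen_crc64table.c or lib/crc64.c: the table is built at run time instead of being dumped to a header, the function names are hypothetical, and the polynomial constant is the 64-bit ECMA-182 value as I understand it.

```c
#include <stddef.h>
#include <stdint.h>

/* ECMA-182 polynomial, MSB-first form (assumed value). */
#define CRC64_ECMA182_POLY UINT64_C(0x42F0E1EBA9EA3693)

/* 256 entries, each a 64-bit number. */
static uint64_t crc64_table[256];

/* What a table generator would compute before dumping it to a header. */
static void crc64_table_init(void)
{
	for (int i = 0; i < 256; i++) {
		uint64_t crc = (uint64_t)i << 56;

		for (int bit = 0; bit < 8; bit++) {
			if (crc & (UINT64_C(1) << 63))
				crc = (crc << 1) ^ CRC64_ECMA182_POLY;
			else
				crc <<= 1;
		}
		crc64_table[i] = crc;
	}
}

/* MSB-first table-driven CRC64: one lookup per input byte. */
static uint64_t crc64_be_sketch(uint64_t crc, const unsigned char *p, size_t len)
{
	while (len--)
		crc = (crc << 8) ^ crc64_table[(crc >> 56) ^ *p++];
	return crc;
}
```

Generating the table on the build host, as the commit describes, avoids paying this initialization cost at boot and keeps the table in read-only data.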
[colyli@suse.de: fix review comments from v4]
Link: http://lkml.kernel.org/r/20180726053352.2781-2-colyli@suse.de
Link: http://lkml.kernel.org/r/20180718165545.1622-2-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Noah Massey <noah.massey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 04:57:11 +00:00
|
|
|
config CRC64
|
|
|
|
tristate "CRC64 functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC64 functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC64
|
|
|
|
functions require M here.
|
|
|
|
|
2017-06-06 21:08:39 +00:00
|
|
|
config CRC4
|
|
|
|
tristate "CRC4 functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC4 functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC4
|
|
|
|
functions require M here.
|
|
|
|
|
2007-07-17 11:04:03 +00:00
|
|
|
config CRC7
|
|
|
|
tristate "CRC7 functions"
|
|
|
|
help
|
|
|
|
This option is provided for the case where no in-kernel-tree
|
|
|
|
modules require CRC7 functions, but a module built outside
|
|
|
|
the kernel tree does. Such modules that use library CRC7
|
|
|
|
functions require M here.
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
config LIBCRC32C
|
|
|
|
tristate "CRC32c (Castagnoli, et al) Cyclic Redundancy-Check"
|
2024-12-02 01:08:40 +00:00
|
|
|
select CRC32
|
2005-04-16 22:20:36 +00:00
|
|
|
help
|
2024-12-02 01:08:40 +00:00
|
|
|
This option just selects CRC32 and is provided for compatibility
|
|
|
|
purposes until the users are updated to select CRC32 directly.
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2011-05-31 09:22:15 +00:00
|
|
|
config CRC8
|
|
|
|
tristate "CRC8 function"
|
|
|
|
help
|
|
|
|
This option provides CRC8 function. Drivers may select this
|
|
|
|
when they need to do cyclic redundancy check according to the CRC8
|
|
|
|
algorithm. Module will be called crc8.
|
|
|
|
|
lib: Add xxhash module
Adds xxhash kernel module with xxh32 and xxh64 hashes. xxhash is an
extremely fast non-cryptographic hash algorithm for checksumming.
The zstd compression and decompression modules added in the next patch
require xxhash. I extracted it out from zstd since it is useful on its
own. I copied the code from the upstream XXHash source repository and
translated it into kernel style. I ran benchmarks and tests in the kernel
and tests in userland.
I benchmarked xxhash as a special character device. I ran in four modes,
no-op, xxh32, xxh64, and crc32. The no-op mode simply copies the data to
kernel space and ignores it. The xxh32, xxh64, and crc32 modes compute
hashes on the copied data. I also ran it with four different buffer sizes.
The benchmark file is located in the upstream zstd source repository under
`contrib/linux-kernel/xxhash_test.c` [1].
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD. I benchmarked using the file `filesystem.squashfs`
from `ubuntu-16.10-desktop-amd64.iso`, which is 1,536,217,088 B large.
Run the following commands for the benchmark:
modprobe xxhash_test
mknod xxhash_test c 245 0
time cp filesystem.squashfs xxhash_test
The time is reported by the time of the userland `cp`.
The GB/s is computed with
1,536,217,088 B / time(buffer size, hash)
which includes the time to copy from userland.
The Adjusted GB/s is computed with
1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
| Buffer Size (B) | Hash | Time (s) | GB/s | Adjusted GB/s |
|-----------------|-------|----------|------|---------------|
| 1024 | none | 0.408 | 3.77 | - |
| 1024 | xxh32 | 0.649 | 2.37 | 6.37 |
| 1024 | xxh64 | 0.542 | 2.83 | 11.46 |
| 1024 | crc32 | 1.290 | 1.19 | 1.74 |
| 4096 | none | 0.380 | 4.04 | - |
| 4096 | xxh32 | 0.645 | 2.38 | 5.79 |
| 4096 | xxh64 | 0.500 | 3.07 | 12.80 |
| 4096 | crc32 | 1.168 | 1.32 | 1.95 |
| 8192 | none | 0.351 | 4.38 | - |
| 8192 | xxh32 | 0.614 | 2.50 | 5.84 |
| 8192 | xxh64 | 0.464 | 3.31 | 13.60 |
| 8192 | crc32 | 1.163 | 1.32 | 1.89 |
| 16384 | none | 0.346 | 4.43 | - |
| 16384 | xxh32 | 0.590 | 2.60 | 6.30 |
| 16384 | xxh64 | 0.466 | 3.30 | 12.80 |
| 16384 | crc32 | 1.183 | 1.30 | 1.84 |
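The two throughput formulas can be checked against the table with a short calculation. This is a sketch with hypothetical helper names; it assumes the 1,536,217,088 B file size stated above and decimal gigabytes (1 GB = 10^9 B), which is what the tabulated values are consistent with.

```c
#include <assert.h>

/* Size of filesystem.squashfs in bytes, as stated in the commit message. */
#define FILE_SIZE_BYTES 1536217088.0

/* Raw throughput: includes the time to copy the data from userland. */
static double gb_per_s(double hash_seconds)
{
	return FILE_SIZE_BYTES / hash_seconds / 1e9;
}

/* Adjusted throughput: subtracts the no-op ("none") copy time first. */
static double adjusted_gb_per_s(double hash_seconds, double none_seconds)
{
	/* The hash run must take longer than the plain copy. */
	assert(hash_seconds > none_seconds);
	return FILE_SIZE_BYTES / (hash_seconds - none_seconds) / 1e9;
}
```

For the 1024 B row, `gb_per_s(0.542)` is about 2.83 and `adjusted_gb_per_s(0.542, 0.408)` is about 11.46, matching the xxh64 entries.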
Tested in userland using the test-suite in the zstd repo under
`contrib/linux-kernel/test/XXHashUserlandTest.cpp` [2] by mocking the
kernel functions. A line in each branch of every function in `xxhash.c`
was commented out to ensure that the test-suite fails. Additionally
tested while testing zstd and with SMHasher [3].
[1] https://phabricator.intern.facebook.com/P57526246
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/XXHashUserlandTest.cpp
[3] https://github.com/aappleby/smhasher
zstd source repository: https://github.com/facebook/zstd
XXHash source repository: https://github.com/cyan4973/xxhash
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2017-08-04 20:19:17 +00:00
|
|
|
config XXHASH
|
|
|
|
tristate
|
|
|
|
|
2006-09-12 07:04:40 +00:00
|
|
|
config AUDIT_GENERIC
|
|
|
|
bool
|
|
|
|
depends on AUDIT && !AUDIT_ARCH
|
|
|
|
default y
|
|
|
|
|
2014-03-15 05:48:00 +00:00
|
|
|
config AUDIT_ARCH_COMPAT_GENERIC
|
|
|
|
bool
|
|
|
|
default n
|
|
|
|
|
|
|
|
config AUDIT_COMPAT_GENERIC
|
|
|
|
bool
|
|
|
|
depends on AUDIT_GENERIC && AUDIT_ARCH_COMPAT_GENERIC && COMPAT
|
|
|
|
default y
|
|
|
|
|
2013-11-11 11:20:37 +00:00
|
|
|
config RANDOM32_SELFTEST
|
|
|
|
bool "PRNG perform self test on init"
|
|
|
|
help
|
|
|
|
This option enables the 32 bit PRNG library functions to perform a
|
|
|
|
self test on initialization.
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
#
|
|
|
|
# compression support is select'ed if needed
|
|
|
|
#
|
2015-05-07 17:49:14 +00:00
|
|
|
config 842_COMPRESS
|
2016-01-13 22:24:02 +00:00
|
|
|
select CRC32
|
2015-05-07 17:49:14 +00:00
|
|
|
tristate
|
|
|
|
|
|
|
|
config 842_DECOMPRESS
|
2016-01-13 22:24:02 +00:00
|
|
|
select CRC32
|
2015-05-07 17:49:14 +00:00
|
|
|
tristate
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
config ZLIB_INFLATE
|
|
|
|
tristate
|
|
|
|
|
|
|
|
config ZLIB_DEFLATE
|
|
|
|
tristate
|
2015-10-15 22:28:35 +00:00
|
|
|
select BITREVERSE
|
2005-04-16 22:20:36 +00:00
|
|
|
|
lib/zlib: add s390 hardware support for kernel zlib_deflate
Patch series "S390 hardware support for kernel zlib", v3.
With IBM z15 mainframe the new DFLTCC instruction is available. It
implements deflate algorithm in hardware (Nest Acceleration Unit - NXU)
with estimated compression and decompression performance orders of
magnitude faster than the current zlib.
This patchset adds s390 hardware compression support to kernel zlib.
The code is based on the userspace zlib implementation:
https://github.com/madler/zlib/pull/410
The coding style is also preserved for future maintainability. There is
only a limited set of userspace zlib functions represented in the kernel.
Apart from that, all the memory allocation should be performed in
advance. Thus, the workarea structures are extended with the parameter
lists required for the DEFLATE CONVERSION CALL instruction.
Since kernel zlib itself does not support gzip headers, only Adler-32
checksum is processed (also can be produced by DFLTCC facility). Like
it was implemented for userspace, kernel zlib will compress in hardware
on level 1, and in software on all other levels. Decompression will
always happen in hardware (when enabled).
Two DFLTCC compression calls produce the same results only when they
both are made on machines of the same generation, and when the
respective buffers have the same offset relative to the start of the
page. Therefore care should be taken when using hardware compression
when reproducible results are desired. However, it always produces
standard-conforming output which can be inflated anyway.
The new kernel command line parameter 'dfltcc' is introduced to
configure s390 zlib hardware support:
Format: { on | off | def_only | inf_only | always }
on: s390 zlib hardware support for compression on
level 1 and decompression (default)
off: No s390 zlib hardware support
def_only: s390 zlib hardware support for deflate
only (compression on level 1)
inf_only: s390 zlib hardware support for inflate
only (decompression)
always: Same as 'on' but ignores the selected compression
level always using hardware support (used for debugging)
The main purpose of the integration of the NXU support into the kernel
zlib is the use of hardware deflate in btrfs filesystem with on-the-fly
compression enabled. Apart from that, hardware support can also be used
during boot for decompressing the kernel or the ramdisk image.
With the patch for btrfs expanding zlib buffer from 1 to 4 pages (patch
6) the following performance results have been achieved using the
ramdisk with btrfs. These are relative numbers based on throughput rate
and compression ratio for zlib level 1:
Input data              Deflate rate   Inflate rate   Compression ratio
                        NXU/Software   NXU/Software   NXU/Software
stream of zeroes        1.46           1.02           1.00
random ASCII data       10.44          3.00           0.96
ASCII text (dickens)    6.21           3.33           0.94
binary data (vmlinux)   8.37           3.90           1.02
This means that s390 hardware deflate can provide up to 10 times faster
compression (on level 1) and up to 4 times faster decompression (refers
to all compression levels) for btrfs zlib.
Disclaimer: Performance results are based on IBM internal tests using DD
command-line utility on btrfs on a Fedora 30 based internal driver in
native LPAR on a z15 system. Results may vary based on individual
workload, configuration and software levels.
This patch (of 9):
Create zlib_dfltcc library with the s390 DEFLATE CONVERSION CALL
implementation and related compression functions. Update zlib_deflate
functions with the hooks for s390 hardware support and adjust workspace
structures with extra parameter lists required for hardware deflate.
Link: http://lkml.kernel.org/r/20200103223334.20669-2-zaslonko@linux.ibm.com
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Eduard Shishkin <edward6@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31 06:16:17 +00:00
|
|
|
config ZLIB_DFLTCC
|
|
|
|
def_bool y
|
|
|
|
depends on S390
|
|
|
|
prompt "Enable s390x DEFLATE CONVERSION CALL support for kernel zlib"
|
|
|
|
help
|
|
|
|
Enable s390x hardware support for zlib in the kernel.
|
|
|
|
|
2007-07-11 00:22:24 +00:00
|
|
|
config LZO_COMPRESS
|
|
|
|
tristate
|
|
|
|
|
|
|
|
config LZO_DECOMPRESS
|
|
|
|
tristate
|
|
|
|
|
2013-07-08 23:01:49 +00:00
|
|
|
config LZ4_COMPRESS
|
|
|
|
tristate
|
|
|
|
|
|
|
|
config LZ4HC_COMPRESS
|
|
|
|
tristate
|
|
|
|
|
2013-07-08 23:01:46 +00:00
|
|
|
config LZ4_DECOMPRESS
|
|
|
|
tristate
|
|
|
|
|
2022-09-29 02:08:23 +00:00
|
|
|
config ZSTD_COMMON
|
lib: Add zstd modules
Add zstd compression and decompression kernel modules.
zstd offers a wide variety of compression speed and quality trade-offs.
It can compress at speeds approaching lz4, and quality approaching lzma.
zstd decompresses at speeds more than twice as fast as zlib, and
decompression speed remains roughly the same across all compression levels.
The code was ported from the upstream zstd source repository. The
`linux/zstd.h` header was modified to match linux kernel style.
The cross-platform and allocation code was stripped out. Instead zstd
requires the caller to pass a preallocated workspace. The source files
were clang-formatted [1] to match the Linux Kernel style as much as
possible. Otherwise, the code was unmodified. We would like to avoid
as much further manual modification to the source code as possible, so it
will be easier to keep the kernel zstd up to date.
I benchmarked zstd compression as a special character device. I ran zstd
and zlib compression at several levels, as well as performing no
compression, which measures the time spent copying the data to kernel space.
Data is passed to the compressor 4096 B at a time. The benchmark file is
located in the upstream zstd source repository under
`contrib/linux-kernel/zstd_compress_test.c` [2].
I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and an SSD. I benchmarked using `silesia.tar` [3], which is
211,988,480 B large. Run the following commands for the benchmark:
sudo modprobe zstd_compress_test
sudo mknod zstd_compress_test c 245 0
sudo cp silesia.tar zstd_compress_test
The time is reported by the time of the userland `cp`.
The MB/s is computed with
211,988,480 B / time(method)
which includes the time to copy from userland.
The Adjusted MB/s is computed with
211,988,480 B / (time(method) - time(none)).
The memory reported is the amount of memory the compressor requests.
| Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) |
|----------|----------|----------|-------|---------|----------|----------|
| none | 211988480 | 0.100 | 1 | 2119.88 | - | - |
| zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 |
| zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 |
| zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 |
| zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 |
| zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 |
| zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 |
| zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 |
| zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 |
| zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 |
| zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 |
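The throughput and ratio columns in the table above appear to follow directly from the stated size of `silesia.tar` (211,988,480 B) and the measured times. The sketch below recomputes them for one row; the function names are illustrative only and are not part of the benchmark sources, and 1 MB is assumed to mean 10^6 bytes.

```python
# Size of silesia.tar as stated in the commit message.
INPUT_BYTES = 211_988_480


def mb_per_s(seconds):
    """Raw throughput in MB/s (assuming 1 MB = 10**6 B), userland copy included."""
    return INPUT_BYTES / seconds / 1e6


def adjusted_mb_per_s(seconds, copy_seconds):
    """Throughput after subtracting the time of the copy-only ('none') run."""
    return INPUT_BYTES / (seconds - copy_seconds) / 1e6


def compression_ratio(compressed_bytes):
    """Uncompressed size divided by compressed size."""
    return INPUT_BYTES / compressed_bytes


if __name__ == "__main__":
    none_time = 0.100                           # 'none' row
    zstd1_time, zstd1_size = 1.044, 73_645_762  # 'zstd -1' row
    print(f"MB/s     : {mb_per_s(zstd1_time):.2f}")
    print(f"Adj MB/s : {adjusted_mb_per_s(zstd1_time, none_time):.2f}")
    print(f"Ratio    : {compression_ratio(zstd1_size):.3f}")
```

For the `zstd -1` row this reproduces the 203.05 MB/s, 224.56 adjusted MB/s, and 2.878 ratio listed in the table.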
I benchmarked zstd decompression using the same method on the same machine.
The benchmark file is located in the upstream zstd repo under
`contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is
the amount of memory required to decompress data compressed with the given
compression level. If you know the maximum size of your input, you can
reduce the memory usage of decompression irrespective of the compression
level.
| Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) |
|----------|----------|---------|---------------|-------------|
| none | 0.025 | 8479.54 | - | - |
| zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 |
| zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 |
| zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 |
| zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 |
| zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 |
| zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 |
| zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 |
| zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 |
Tested in userland using the test-suite in the zstd repo under
`contrib/linux-kernel/test/UserlandTest.cpp` [5] by mocking the kernel
functions. Fuzz tested using libfuzzer [6] with the fuzz harnesses under
`contrib/linux-kernel/test/{RoundTripCrash.c,DecompressCrash.c}` [7] [8]
with ASAN, UBSAN, and MSAN. Additionally, it was tested while testing the
BtrFS and SquashFS patches coming next.
[1] https://clang.llvm.org/docs/ClangFormat.html
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_compress_test.c
[3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
[4] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_decompress_test.c
[5] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/UserlandTest.cpp
[6] http://llvm.org/docs/LibFuzzer.html
[7] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/RoundTripCrash.c
[8] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/DecompressCrash.c
zstd source repository: https://github.com/facebook/zstd
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2017-08-10 02:35:53 +00:00
|
|
|
select XXHASH
|
|
|
|
tristate
|
|
|
|
|
2022-09-29 02:08:23 +00:00
|
|
|
config ZSTD_COMPRESS
|
|
|
|
select ZSTD_COMMON
|
|
|
|
tristate
|
|
|
|
|
2017-08-10 02:35:53 +00:00
|
|
|
config ZSTD_DECOMPRESS
|
2022-09-29 02:08:23 +00:00
|
|
|
select ZSTD_COMMON
|
2017-08-10 02:35:53 +00:00
|
|
|
tristate
|
|
|
|
|
2011-01-13 01:01:22 +00:00
|
|
|
source "lib/xz/Kconfig"
|
|
|
|
|
2009-01-05 21:48:31 +00:00
|
|
|
#
|
|
|
|
# These all provide a common interface (hence the apparent duplication with
|
|
|
|
# ZLIB_INFLATE; DECOMPRESS_GZIP is just a wrapper.)
|
|
|
|
#
|
|
|
|
config DECOMPRESS_GZIP
|
2009-01-07 08:01:43 +00:00
|
|
|
select ZLIB_INFLATE
|
2009-01-05 21:48:31 +00:00
|
|
|
tristate
|
|
|
|
|
|
|
|
config DECOMPRESS_BZIP2
|
|
|
|
tristate
|
|
|
|
|
|
|
|
config DECOMPRESS_LZMA
|
|
|
|
tristate
|
|
|
|
|
decompressors: add boot-time XZ support
This implements the API defined in <linux/decompress/generic.h> which is
used for kernel, initramfs, and initrd decompression. This patch together
with the first patch is enough for XZ-compressed initramfs and initrd;
XZ-compressed kernel will need arch-specific changes.
The buffering requirements described in decompress_unxz.c are stricter
than with gzip, so the relevant changes should be done to the
arch-specific code when adding support for XZ-compressed kernel.
Similarly, the heap size in arch-specific pre-boot code may need to be
increased (30 KiB is enough).
The XZ decompressor needs memmove(), memeq() (memcmp() == 0), and
memzero() (memset(ptr, 0, size)), which aren't available in all
arch-specific pre-boot environments. I'm including simple versions in
decompress_unxz.c, but a cleaner solution would naturally be nicer.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alain Knaff <alain@knaff.lu>
Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-13 01:01:23 +00:00
|
|
|
config DECOMPRESS_XZ
|
|
|
|
select XZ_DEC
|
|
|
|
tristate
|
|
|
|
|
2010-01-08 22:42:46 +00:00
|
|
|
config DECOMPRESS_LZO
|
|
|
|
select LZO_DECOMPRESS
|
|
|
|
tristate
|
|
|
|
|
2013-07-08 23:01:46 +00:00
|
|
|
config DECOMPRESS_LZ4
|
|
|
|
select LZ4_DECOMPRESS
|
|
|
|
tristate
|
|
|
|
|
2020-07-30 19:08:35 +00:00
|
|
|
config DECOMPRESS_ZSTD
|
|
|
|
select ZSTD_DECOMPRESS
|
|
|
|
tristate
|
|
|
|
|
2005-06-22 00:15:02 +00:00
|
|
|
#
|
|
|
|
# Generic allocator support is selected if needed
|
|
|
|
#
|
|
|
|
config GENERIC_ALLOCATOR
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-06-22 00:15:02 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
#
|
|
|
|
# reed solomon support is select'ed if needed
|
|
|
|
#
|
|
|
|
config REED_SOLOMON
|
|
|
|
tristate
|
|
|
|
|
|
|
|
config REED_SOLOMON_ENC8
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
config REED_SOLOMON_DEC8
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
config REED_SOLOMON_ENC16
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
config REED_SOLOMON_DEC16
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-04-16 22:20:36 +00:00
|
|
|
|
lib: add shared BCH ECC library
This is a new software BCH encoding/decoding library, similar to the shared
Reed-Solomon library.
Binary BCH (Bose-Chaudhuri-Hocquenghem) codes are widely used to correct
errors in NAND flash devices requiring more than 1-bit ecc correction; they
are generally better suited for NAND flash than RS codes because NAND bit
errors do not occur in bursts. Latest SLC NAND devices typically require at
least 4-bit ecc protection per 512-byte block.
This library provides software encoding/decoding, but may also be used with
ASIC/SoC hardware BCH engines to perform error correction. It is currently
being used for this purpose on an OMAP3630 board (4bit/8bit HW BCH). It
has also been used to decode raw dumps of NAND devices with on-die BCH ecc
engines (e.g. Micron 4bit ecc SLC devices).
Latest NAND devices (including SLC) can exhibit high error rates (typically
a dozen or more bitflips per hour during stress tests); in order to
minimize the performance impact of error correction, this library
implements recently developed algorithms for fast polynomial root finding
(see bch.c header for details) instead of the traditional exhaustive Chien
root search; a few performance figures are provided below:
Platform: arm926ejs @ 468 MHz, 32 KiB icache, 16 KiB dcache
BCH ecc : 4-bit per 512 bytes
Encoding average throughput: 250 Mbits/s
Error correction time (compared with Chien search):
        average   worst     average (Chien)   worst (Chien)
----------------------------------------------------------
1 bit   8.5 µs    11 µs     200 µs            383 µs
2 bit   9.7 µs    12.5 µs   477 µs            728 µs
3 bit   18.1 µs   20.6 µs   758 µs            1010 µs
4 bit   19.5 µs   23 µs     1028 µs           1280 µs
In the above figures, "worst" is meant in terms of error pattern, not in
terms of cache miss / page faults effects (not taken into account here).
The library has been extensively tested on the following platforms: x86,
x86_64, arm926ejs, omap3630, qemu-ppc64, qemu-mips.
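The "4-bit per 512 bytes" configuration used for the figures above can be related to the constraint stated in the BCH_CONST_M help text of this file, (k + m*t) <= 2**m - 1, with 'm' limited to the range 5..15. The following is a minimal sketch, not code from the library: the helper names are hypothetical, and the parity size is assumed to be m*t bits, as the k + m*t term suggests.

```python
def min_field_order(data_bits, t, m_min=5, m_max=15):
    """Smallest m in [m_min, m_max] with data_bits + m*t <= 2**m - 1.

    Returns None if no m in the range satisfies the constraint.
    """
    for m in range(m_min, m_max + 1):
        if data_bits + m * t <= 2**m - 1:
            return m
    return None


def ecc_bytes(m, t):
    """Bytes needed to store m*t parity bits (assumed parity size), rounded up."""
    return (m * t + 7) // 8


if __name__ == "__main__":
    k = 512 * 8  # data bits in one 512-byte block
    t = 4        # correctable bit errors per block
    m = min_field_order(k, t)
    print("m =", m, "ecc bytes =", ecc_bytes(m, t))
```

Under these assumptions a 512-byte block with t = 4 needs m = 13, since m = 12 gives 4096 + 48 = 4144 > 4095, and the parity occupies 7 bytes.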
Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11 10:05:32 +00:00
|
|
|
#
|
|
|
|
# BCH support is selected if needed
|
|
|
|
#
|
|
|
|
config BCH
|
|
|
|
tristate
|
2023-07-30 08:17:17 +00:00
|
|
|
select BITREVERSE
|
2011-03-11 10:05:32 +00:00
|
|
|
|
|
|
|
config BCH_CONST_PARAMS
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2011-03-11 10:05:32 +00:00
|
|
|
help
|
|
|
|
Drivers may select this option to force specific constant
|
|
|
|
values for parameters 'm' (Galois field order) and 't'
|
|
|
|
(error correction capability). Those specific values must
|
|
|
|
be set by declaring default values for symbols BCH_CONST_M
|
|
|
|
and BCH_CONST_T.
|
|
|
|
Doing so will enable extra compiler optimizations,
|
|
|
|
improving encoding and decoding performance up to 2x for
|
|
|
|
usual (m,t) values (typically such that m*t < 200).
|
|
|
|
When this option is selected, the BCH library supports
|
|
|
|
only a single (m,t) configuration. This is mainly useful
|
|
|
|
for NAND flash board drivers requiring known, fixed BCH
|
|
|
|
parameters.
|
|
|
|
|
|
|
|
config BCH_CONST_M
|
|
|
|
int
|
|
|
|
range 5 15
|
|
|
|
help
|
|
|
|
Constant value for Galois field order 'm'. If 'k' is the
|
|
|
|
number of data bits to protect, 'm' should be chosen such
|
|
|
|
that (k + m*t) <= 2**m - 1.
|
|
|
|
Drivers should declare a default value for this symbol if
|
|
|
|
they select option BCH_CONST_PARAMS.
|
|
|
|
|
|
|
|
config BCH_CONST_T
|
|
|
|
int
|
|
|
|
help
|
|
|
|
Constant value for error correction capability in bits 't'.
|
|
|
|
Drivers should declare a default value for this symbol if
|
|
|
|
they select option BCH_CONST_PARAMS.
|
|
|
|
|
2005-06-25 00:39:03 +00:00
|
|
|
#
|
|
|
|
# Textsearch support is select'ed if needed
|
|
|
|
#
|
2005-06-24 03:49:30 +00:00
|
|
|
config TEXTSEARCH
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2005-04-16 22:20:36 +00:00
|
|
|
|
[LIB]: Knuth-Morris-Pratt textsearch algorithm
Implements a linear-time string-matching algorithm due to Knuth,
Morris, and Pratt [1]. Their algorithm avoids the explicit
computation of the transition function DELTA altogether. Its
matching time is O(n), for n being length(text), using just an
auxiliary function PI[1..m], for m being length(pattern),
precomputed from the pattern in time O(m). The array PI allows
the transition function DELTA to be computed efficiently
"on the fly" as needed. Roughly speaking, for any state
"q" = 0,1,...,m and any character "a" in SIGMA, the value
PI["q"] contains the information that is independent of "a" and
is needed to compute DELTA("q", "a") [2]. Since the array PI
has only m entries, whereas DELTA has O(m|SIGMA|) entries, we
save a factor of |SIGMA| in the preprocessing time by computing
PI rather than DELTA.
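The description above can be sketched in a few lines. This is an illustrative Python version, not the kernel's C implementation; it uses 0-based indexing, so `pi[i]` here corresponds to PI[i+1] in the text, and the function names are chosen for this example only.

```python
def prefix_function(pattern):
    """pi[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of pattern[:i+1]. Runs in O(m)."""
    pi = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = pi[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        pi[i] = k
    return pi


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text, in O(n)."""
    if not pattern:
        return []
    pi = prefix_function(pattern)
    matches = []
    q = 0  # automaton state: number of pattern characters matched so far
    for i, ch in enumerate(text):
        while q > 0 and ch != pattern[q]:
            q = pi[q - 1]  # DELTA(q, ch) computed on the fly from PI
        if ch == pattern[q]:
            q += 1
        if q == len(pattern):
            matches.append(i - q + 1)
            q = pi[q - 1]  # keep going to find overlapping matches
    return matches


if __name__ == "__main__":
    print(prefix_function("ababaca"))
    print(kmp_search("abababca", "aba"))
```

The `while` loops are where the transition function is evaluated lazily: instead of a table with one entry per (state, character) pair, the search follows `pi` backwards until a state is found from which the current character can extend the match.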
[1] Cormen, Leiserson, Rivest, Stein
Introduction to Algorithms, 2nd Edition, MIT Press
[2] See finite automata theory
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-24 03:58:37 +00:00
|
|
|
config TEXTSEARCH_KMP
|
2005-06-25 00:39:03 +00:00
|
|
|
tristate
|
2005-06-24 03:58:37 +00:00
|
|
|
|
2005-08-25 23:12:22 +00:00
|
|
|
config TEXTSEARCH_BM
|
2005-08-25 23:23:11 +00:00
|
|
|
tristate
|
2005-08-25 23:12:22 +00:00
|
|
|
|
2005-06-24 03:59:16 +00:00
|
|
|
config TEXTSEARCH_FSM
|
2005-06-25 00:39:03 +00:00
|
|
|
tristate
|
2005-06-24 03:59:16 +00:00
|
|
|
|
2009-11-20 19:13:39 +00:00
|
|
|
config BTREE
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2009-11-20 19:13:39 +00:00
|
|
|
|
2014-03-17 12:21:54 +00:00
|
|
|
config INTERVAL_TREE
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2014-03-17 12:21:54 +00:00
|
|
|
help
|
|
|
|
Simple, embeddable, interval-tree. Can find the start of an
|
|
|
|
overlapping range in log(n) time and then iterate over all
|
|
|
|
overlapping nodes. The algorithm is implemented as an
|
|
|
|
augmented rbtree.
|
|
|
|
|
|
|
|
See:
|
|
|
|
|
2020-04-01 17:33:43 +00:00
|
|
|
Documentation/core-api/rbtree.rst
|
2014-03-17 12:21:54 +00:00
|
|
|
|
|
|
|
for more information.
|
|
|
|
|
2022-11-29 20:29:26 +00:00
|
|
|
config INTERVAL_TREE_SPAN_ITER
|
|
|
|
bool
|
|
|
|
depends on INTERVAL_TREE
|
|
|
|
|
2017-11-04 03:09:45 +00:00
|
|
|
config XARRAY_MULTI
|
2016-05-21 00:01:54 +00:00
|
|
|
bool
|
2017-11-04 03:09:45 +00:00
|
|
|
help
|
|
|
|
Support entries which occupy multiple consecutive indices in the
|
|
|
|
XArray.
|
2016-05-21 00:01:54 +00:00
|
|
|
|
Add a generic associative array implementation.
Add a generic associative array implementation that can be used as the
container for keyrings, thereby massively increasing the capacity available
whilst also speeding up searching in keyrings that contain a lot of keys.
This may also be useful in FS-Cache for tracking cookies.
Documentation is added into Documentation/associative_array.txt
Some of the properties of the implementation are:
(1) Objects are opaque pointers. The implementation does not care where they
point (if anywhere) or what they point to (if anything).
[!] NOTE: Pointers to objects _must_ be zero in the two least significant
bits.
(2) Objects do not need to contain linkage blocks for use by the array. This
permits an object to be located in multiple arrays simultaneously.
Rather, the array is made up of metadata blocks that point to objects.
(3) Objects are labelled as being one of two types (the type is a bool value).
This information is stored in the array, but has no consequence to the
array itself or its algorithms.
(4) Objects require index keys to locate them within the array.
(5) Index keys must be unique. Inserting an object with the same key as one
already in the array will replace the old object.
(6) Index keys can be of any length and can be of different lengths.
(7) Index keys should encode the length early on, before any variation due to
length is seen.
(8) Index keys can include a hash to scatter objects throughout the array.
(9) The array can be iterated over. The objects will not necessarily come out in
key order.
(10) The array can be iterated whilst it is being modified, provided the RCU
readlock is being held by the iterator. Note, however, under these
circumstances, some objects may be seen more than once. If this is a
problem, the iterator should lock against modification. Objects will not
be missed, however, unless deleted.
(11) Objects in the array can be looked up by means of their index key.
(12) Objects can be looked up whilst the array is being modified, provided the
RCU readlock is being held by the thread doing the look up.
The implementation uses a tree of 16-pointer nodes internally that are indexed
on each level by nibbles from the index key. To improve memory efficiency,
shortcuts can be emplaced to skip over what would otherwise be a series of
single-occupancy nodes. Further, nodes pack leaf object pointers into spare
space in the node rather than making an extra branch until such time as an
object needs to be added to a full node.
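To illustrate how a tree of 16-pointer nodes can be indexed by nibbles of the index key, the sketch below derives the slot chosen at each level for an integer key. It is a simplified model only: the order in which nibbles are consumed (least significant first) is an assumption made for this example, and shortcuts and leaf packing are not modelled.

```python
FAN_OUT = 16        # pointers per node, as stated in the text
BITS_PER_LEVEL = 4  # one nibble selects one of the 16 slots


def slot_at_level(key, level):
    """Slot index (0..15) used at the given tree depth.

    Assumption: nibbles are taken starting from the least significant end.
    """
    return (key >> (level * BITS_PER_LEVEL)) & (FAN_OUT - 1)


def slot_path(key, levels):
    """Sequence of slot indices followed from the root for the first `levels` levels."""
    return [slot_at_level(key, level) for level in range(levels)]


if __name__ == "__main__":
    print(slot_path(0x3A7, 3))
```

With this nibble order, the key 0x3A7 selects slot 7 at the root, then slot 10, then slot 3.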
Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 09:35:17 +00:00
|
|
|
config ASSOCIATIVE_ARRAY
|
|
|
|
bool
|
|
|
|
help
|
|
|
|
Generic associative array. Can be searched and iterated over whilst
|
|
|
|
it is being modified. It is also reasonably quick to search and
|
|
|
|
modify. The algorithms are non-recursive, and the trees are highly
|
|
|
|
capacious.
|
|
|
|
|
|
|
|
See:
|
|
|
|
|
2018-05-08 18:14:57 +00:00
|
|
|
Documentation/core-api/assoc_array.rst
|
|
|
|
|
|
|
|
for more information.
|
|
|
|
|
2017-03-18 00:35:23 +00:00
|
|
|
config CLOSURES
|
|
|
|
bool
|
|
|
|
|
2007-02-11 15:41:31 +00:00
|
|
|
config HAS_IOMEM
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2007-02-11 15:41:31 +00:00
|
|
|
depends on !NO_IOMEM
|
|
|
|
default y
|
|
|
|
|
2023-03-23 16:33:52 +00:00
|
|
|
config HAS_IOPORT
|
|
|
|
bool
|
|
|
|
|
2014-04-07 22:39:19 +00:00
|
|
|
config HAS_IOPORT_MAP
|
2014-12-20 20:41:11 +00:00
|
|
|
bool
|
2014-04-07 22:39:19 +00:00
|
|
|
depends on HAS_IOMEM && !NO_IOPORT_MAP
|
2006-12-13 08:35:00 +00:00
|
|
|
default y
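As an illustration of how a consumer might rely on these symbols, a driver that needs I/O port mapping could gate itself on HAS_IOPORT_MAP. This is only a sketch: EXAMPLE_PORT_DRIVER and its prompt text are hypothetical names invented here, not entries that exist in the tree.

```kconfig
# Hypothetical driver entry; the symbol name and prompt are assumptions.
config EXAMPLE_PORT_DRIVER
	tristate "Example driver needing I/O port mapping (hypothetical)"
	depends on HAS_IOPORT_MAP
	help
	  Only offered when HAS_IOPORT_MAP is set, which in turn requires
	  HAS_IOMEM and that the architecture has not set NO_IOPORT_MAP.
```

Using "depends on" rather than "select" here keeps the decision with the architecture, since HAS_IOPORT_MAP has no prompt and is derived from other symbols.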
|
|
|
|
|
2018-06-12 17:01:45 +00:00
|
|
|
source "kernel/dma/Kconfig"
|
2007-05-06 21:49:09 +00:00
|
|
|
|
2018-01-05 16:26:46 +00:00
|
|
|
config SGL_ALLOC
|
|
|
|
bool
|
|
|
|
default n
|
|
|
|
|
2018-04-03 13:47:59 +00:00
|
|
|
config IOMMU_HELPER
|
|
|
|
bool
|
|
|
|
|
2007-08-22 21:01:36 +00:00
|
|
|
config CHECK_SIGNATURE
|
|
|
|
bool
|
|
|
|
|
2008-12-13 10:50:27 +00:00
|
|
|
config CPUMASK_OFFSTACK
|
|
|
|
bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
|
|
|
|
help
|
|
|
|
Use dynamic allocation for cpumask_var_t, instead of putting
|
|
|
|
them on the stack. This is a bit more expensive, but avoids
|
|
|
|
stack overflow.
|
|
|
|
|
lib/cpumask: add FORCE_NR_CPUS config option
The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
but defined at boot-time when kernel parses ACPI/DT tables, and stored in
nr_cpu_ids. In many practical cases, number of CPUs for a target is known
at compile time, and can be provided with NR_CPUS.
In that case, the compiler may be instructed to treat NR_CPUS as the actual
number of CPUs, not an upper limit. This allows many cpumask routines to be
optimized and significantly shrinks the size of the kernel image.
This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
NR_CPUS and enable corresponding optimizations.
If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
that doesn't hold.
The new option is especially useful in embedded applications because
kernel configurations are unique for each SoC, the number of CPUs is
constant and known well, and memory limitations are typically harder.
For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-05 23:08:20 +00:00
|
|
|
config FORCE_NR_CPUS
|
2024-06-18 16:00:04 +00:00
|
|
|
def_bool !SMP
|
|
|
|
|
2011-01-19 11:03:25 +00:00
|
|
|
config CPU_RMAP
|
|
|
|
bool
|
|
|
|
depends on SMP
|
|
|
|
|
dql: Dynamic queue limits
Implementation of dynamic queue limits (dql). This is a library which
allows a queue limit to be dynamically managed. The goal of dql is
to keep the queue limit (the number of objects allowed in the queue) as
small as possible without allowing the queue to be starved.
dql would be used with a queue which has these properties:
1) Objects are queued up to some limit which can be expressed as a
count of objects.
2) Periodically a completion process executes which retires consumed
objects.
3) Starvation occurs when limit has been reached, all queued data has
actually been consumed but completion processing has not yet run,
so queuing new data is blocked.
4) Minimizing the amount of queued data is desirable.
A canonical example of such a queue would be a NIC HW transmit queue.
The queue limit is dynamic, it will increase or decrease over time
depending on the workload. The queue limit is recalculated each time
completion processing is done. Increases occur when the queue is
starved and can exponentially increase over successive intervals.
Decreases occur when more data is being maintained in the queue than
needed to prevent starvation. The number of extra objects, or "slack",
is measured over successive intervals, and to avoid hysteresis the
limit is only reduced by the minimum slack seen over a configurable
time period.
dql API provides routines to manage the queue:
- dql_init is called to initialize the dql structure
- dql_reset is called to reset dynamic values
- dql_queued is called when objects are being enqueued
- dql_avail returns availability in the queue
- dql_completed is called when objects have been consumed in the queue
Configuration consists of:
- max_limit, maximum limit
- min_limit, minimum limit
- slack_hold_time, time to measure instances of slack before reducing
queue limit
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-28 16:32:35 +00:00
|
|
|
config DQL
|
|
|
|
bool
|
|
|
|
|
2014-08-06 23:09:23 +00:00
|
|
|
config GLOB
|
|
|
|
bool
|
|
|
|
# This actually supports modular compilation, but the module overhead
|
|
|
|
# is ridiculous for the amount of code involved. Until an out-of-tree
|
|
|
|
# driver asks for it, we'll just link it directly into the kernel
|
|
|
|
# when required. Since we're ignoring out-of-tree users, there's also
|
|
|
|
# no need to bother prompting for a manual decision:
|
|
|
|
# prompt "glob_match() function"
|
|
|
|
help
|
|
|
|
This option provides a glob_match function for performing
|
|
|
|
simple text pattern matching. It originated in the ATA code
|
|
|
|
to blacklist particular drive models, but other device drivers
|
|
|
|
may need similar functionality.
|
|
|
|
|
|
|
|
All drivers in the Linux kernel tree that require this function
|
|
|
|
should automatically select this option. Say N unless you
|
|
|
|
are compiling an out-of-tree driver which tells you that it
|
|
|
|
depends on this.
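The help text above says in-tree users should select this option themselves. A minimal sketch of what that might look like follows; EXAMPLE_MODEL_QUIRKS is a hypothetical symbol used only for illustration, not a real driver.

```kconfig
# Hypothetical in-tree user; the symbol name is an assumption.
config EXAMPLE_MODEL_QUIRKS
	tristate "Example driver matching device model strings (hypothetical)"
	select GLOB
	help
	  Matches device model names against simple text patterns, so it
	  pulls in the glob_match function by selecting GLOB.
```

Because GLOB is a bool with no prompt, a tristate user built as a module still causes the glob code to be linked into the kernel image.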
|
|
|
|
|
2014-08-06 23:09:25 +00:00
|
|
|
config GLOB_SELFTEST
|
2017-02-24 23:00:52 +00:00
|
|
|
tristate "glob self-test on init"
|
2014-08-06 23:09:25 +00:00
|
|
|
depends on GLOB
|
|
|
|
help
|
|
|
|
This option enables a simple self-test of the glob_match
|
|
|
|
function on startup. It is primarily useful for people
|
|
|
|
working on the code to ensure they haven't introduced any
|
|
|
|
regressions.
|
|
|
|
|
|
|
|
It only adds a little bit of code and slows kernel boot (or
|
|
|
|
module load) by a small amount, so you're welcome to play with
|
|
|
|
it, but you probably don't need it.
|
|
|
|
|
2009-03-04 06:53:30 +00:00
|
|
|
#
|
|
|
|
# Netlink attribute parsing support is select'ed if needed
|
|
|
|
#
|
|
|
|
config NLATTR
|
|
|
|
bool
|
|
|
|
|
2009-06-12 21:10:05 +00:00
|
|
|
#
|
|
|
|
# Generic 64-bit atomic support is selected if needed
|
|
|
|
#
|
|
|
|
config GENERIC_ATOMIC64
|
|
|
|
bool
|
|
|
|
|
2009-09-25 23:07:19 +00:00
|
|
|
config LRU_CACHE
|
|
|
|
tristate
|
|
|
|
|
2012-02-01 22:17:54 +00:00
|
|
|
config CLZ_TAB
|
|
|
|
bool
|
|
|
|
|
2015-11-10 13:56:14 +00:00
|
|
|
config IRQ_POLL
|
|
|
|
bool "IRQ polling library"
|
|
|
|
help
|
|
|
|
Helper library for interrupt mitigation using polling.
|
|
|
|
|
2011-08-31 11:05:16 +00:00
|
|
|
config MPILIB
|
2012-01-17 15:12:06 +00:00
|
|
|
tristate
|
2012-02-01 22:17:54 +00:00
|
|
|
select CLZ_TAB
|
2011-08-31 11:05:16 +00:00
|
|
|
help
|
|
|
|
Multiprecision maths library from GnuPG.
|
|
|
|
It is used to implement RSA digital signature verification,
|
|
|
|
which is used by IMA/EVM digital signature extension.
|
|
|
|
|
2012-01-17 15:12:03 +00:00
|
|
|
config SIGNATURE
|
2012-01-17 15:12:06 +00:00
|
|
|
tristate
|
2014-07-11 15:59:45 +00:00
|
|
|
depends on KEYS
|
|
|
|
select CRYPTO
|
2012-01-17 15:12:04 +00:00
|
|
|
select CRYPTO_SHA1
|
2011-10-14 12:25:16 +00:00
|
|
|
select MPILIB
|
|
|
|
help
|
|
|
|
Digital signature verification. Currently only RSA is supported.
|
|
|
|
Implementation is done using the GnuPG MPI library.
|
|
|
|
|
2019-01-10 15:33:17 +00:00
|
|
|
config DIMLIB
|
2024-05-06 17:50:40 +00:00
|
|
|
tristate
|
2024-06-21 10:13:50 +00:00
|
|
|
depends on NET
|
2019-01-10 15:33:17 +00:00
|
|
|
help
|
|
|
|
Dynamic Interrupt Moderation library.
|
2019-09-26 00:20:42 +00:00
|
|
|
Implements an algorithm for dynamically changing CQ moderation values
|
2019-01-10 15:33:17 +00:00
|
|
|
according to run time performance.
|
|
|
|
|
2012-07-05 16:12:38 +00:00
|
|
|
#
|
|
|
|
# libfdt files, only selected if needed.
|
|
|
|
#
|
|
|
|
config LIBFDT
|
|
|
|
bool
|
|
|
|
|
2012-09-21 22:30:46 +00:00
|
|
|
config OID_REGISTRY
|
|
|
|
tristate
|
|
|
|
help
|
|
|
|
Enable fast lookup object identifier registry.
|
|
|
|
|
2013-04-15 20:09:45 +00:00
|
|
|
config UCS2_STRING
|
2019-12-07 01:04:08 +00:00
|
|
|
tristate
|
2013-04-15 20:09:45 +00:00
|
|
|
|
2019-06-21 09:52:29 +00:00
|
|
|
#
|
|
|
|
# generic vdso
|
|
|
|
#
|
|
|
|
source "lib/vdso/Kconfig"
|
|
|
|
|
2013-06-09 09:46:43 +00:00
|
|
|
source "lib/fonts/Kconfig"
|
|
|
|
|
lib: scatterlist: add sg splitting function
Sometimes a scatter-gather has to be split into several chunks, or sub
scatter lists. This happens for example if a scatter list will be
handled by multiple DMA channels, each one filling a part of it.
A concrete example comes with the media V4L2 API, where the scatter list
is allocated from userspace to hold an image, regardless of the
knowledge of how many DMAs will fill it :
- in a simple RGB565 case, one DMA will pump data from the camera ISP
to memory
- in the trickier YUV422 case, 3 DMAs will pump data from the camera
ISP pipes, one for pipe Y, one for pipe U and one for pipe V
For these cases, it is necessary to split the original scatter list into
multiple scatter lists, which is the purpose of this patch.
The guarantees that are required for this patch are :
- the intersection of spans of any couple of resulting scatter lists is
empty.
- the union of spans of all resulting scatter lists is a subrange of
the span of the original scatter list.
- streaming DMA API operations (mapping, unmapping) should not happen
on both the resulting and the original scatter lists. It's either
the former or the latter.
- the caller is responsible for calling kfree() on the resulting
scatterlists.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-08 08:44:10 +00:00
|
|
|
config SG_SPLIT
|
|
|
|
def_bool n
|
|
|
|
help
|
2015-09-04 10:45:05 +00:00
|
|
|
Provides a helper to split scatterlists into chunks, each chunk being
|
|
|
|
a scatterlist. This should be selected by a driver or an API which
|
|
|
|
wishes to split a scatterlist amongst multiple DMA channels.
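Since SG_SPLIT is a def_bool n symbol with no prompt, it is only enabled when something selects it. The fragment below is a sketch of such a user; EXAMPLE_MULTI_DMA_CAPTURE is a hypothetical name, loosely modelled on the multi-channel camera case described in the commit message.

```kconfig
# Hypothetical driver feeding one buffer from several DMA channels;
# the symbol name is an assumption.
config EXAMPLE_MULTI_DMA_CAPTURE
	tristate "Example capture driver using several DMA channels (hypothetical)"
	select SG_SPLIT
	help
	  Splits one scatterlist into per-channel scatterlists, so it
	  needs the scatterlist splitting helper.
```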
|
|
|
|
|
2016-04-04 21:48:11 +00:00
|
|
|
config SG_POOL
|
|
|
|
def_bool n
|
|
|
|
help
|
|
|
|
Provides a helper to allocate chained scatterlists. This should be
|
|
|
|
selected by a driver or an API which wishes to allocate chained
|
|
|
|
scatterlists.
|
|
|
|
|
2014-08-08 21:23:25 +00:00
|
|
|
#
|
|
|
|
# sg chaining option
|
|
|
|
#
|
|
|
|
|
2018-11-09 08:51:00 +00:00
|
|
|
config ARCH_NO_SG_CHAIN
|
2014-08-08 21:23:25 +00:00
|
|
|
def_bool n
|
|
|
|
|
2015-06-25 07:08:39 +00:00
|
|
|
config ARCH_HAS_PMEM_API
|
|
|
|
bool
|
|
|
|
|
2019-11-07 01:43:31 +00:00
|
|
|
config MEMREGION
|
|
|
|
bool
|
|
|
|
|
memregion: Add cpu_cache_invalidate_memregion() interface
With CXL security features, and CXL dynamic provisioning, global CPU
cache flushing nvdimm requirements are no longer specific to that
subsystem, even beyond the scope of security_ops. CXL will need such
semantics for features not necessarily limited to persistent memory.
The functionality this is enabling is to be able to instantaneously
secure erase potentially terabytes of memory at once and the kernel
needs to be sure that none of the data from before the erase is still
present in the cache. It is also used when unlocking a memory device
where speculative reads and firmware accesses could have cached poison
from before the device was unlocked. Lastly this facility is used when
mapping new devices, or new capacity into an established physical
address range. I.e. when the driver switches DeviceA mapping AddressX to
DeviceB mapping AddressX then any cached data from DeviceA:AddressX
needs to be invalidated.
This capability is typically only used once per-boot (for unlock), or
once per bare metal provisioning event (secure erase), like when handing
off the system to another tenant or decommissioning a device. It may
also be used for dynamic CXL region provisioning.
Users must first call cpu_cache_has_invalidate_memregion() to know
whether this functionality is available on the architecture. On x86 this
respects the constraints of when wbinvd() is tolerable. It is already
the case that wbinvd() is problematic to allow in VMs due to its global
performance impact and KVM, for example, has been known to just trap and
ignore the call. With confidential computing guest execution of wbinvd()
may even trigger an exception. Given guests should not be messing with
the bare metal address map via CXL configuration changes
cpu_cache_has_invalidate_memregion() returns false in VMs.
While this global cache invalidation facility is exported to modules,
since NVDIMM and CXL support can be built as a module, it is not for
general use. The intent is that this facility is not available outside
of specific "device-memory" use cases. To make that expectation as clear
as possible the API is scoped to a new "DEVMEM" module namespace that
only the NVDIMM and CXL subsystems are expected to import.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-28 18:34:04 +00:00
|
|
|
config ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
|
|
|
|
bool
|
|
|
|
|
2020-01-30 20:06:07 +00:00
|
|
|
config ARCH_HAS_MEMREMAP_COMPAT_ALIGN
|
|
|
|
bool
|
|
|
|
|
2019-04-23 16:38:08 +00:00
|
|
|
# use memcpy to implement user copies for nommu architectures
|
|
|
|
config UACCESS_MEMCPY
|
|
|
|
bool
|
|
|
|
|
2017-05-29 19:22:50 +00:00
|
|
|
config ARCH_HAS_UACCESS_FLUSHCACHE
|
|
|
|
bool
|
|
|
|
|
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.
Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:
On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.
Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().
Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.
One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.
[ bp: Massage a bit. ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-06 03:40:16 +00:00
|
|
|
# arch has a concept of a recoverable synchronous exception due to a
|
|
|
|
# memory-read error like x86 machine-check or ARM data-abort, and
|
|
|
|
# implements copy_mc_to_{user,kernel} to abort and report
|
|
|
|
# 'bytes-transferred' if that exception fires when accessing the source
|
|
|
|
# buffer.
|
|
|
|
config ARCH_HAS_COPY_MC
|
2018-05-23 06:17:03 +00:00
|
|
|
bool
|
|
|
|
|
2019-04-25 09:45:21 +00:00
|
|
|
# Temporary. Goes away when all archs are cleaned up
|
|
|
|
config ARCH_STACKWALK
|
|
|
|
bool
|
|
|
|
|
2016-03-25 21:22:08 +00:00
|
|
|
config STACKDEPOT
|
|
|
|
bool
|
|
|
|
select STACKTRACE
|
2023-11-20 17:47:04 +00:00
|
|
|
help
|
|
|
|
Stack depot: stack trace storage that avoids duplication
|
2016-03-25 21:22:08 +00:00
|
|
|
|
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be
allocated from memblock, even if stack depot ends up not actually used.
The default size of stack_table is 4MB on 32-bit, 8MB on 64-bit.
This is fine for use-cases such as KASAN which is also a config option
and has overhead on its own. But it's an issue for functionality that
has to be actually enabled on boot (page_owner) or depends on hardware
(GPU drivers) and thus the memory might be wasted. This was raised as
an issue [1] when attempting to add stackdepot support for SLUB's debug
object tracking functionality. It's common to build kernels with
CONFIG_SLUB_DEBUG and enable slub_debug on boot only when needed, or
create only specific kmem caches with debugging for testing purposes.
It would thus be more efficient if stackdepot's table was allocated only
when actually going to be used. This patch thus makes the allocation
(and whole stack_depot_init() call) optional:
- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
well-defined point of allocation as part of mem_init(). Make
CONFIG_KASAN select this flag.
- Other users have to call stack_depot_init() as part of their own init
when it's determined that stack depot will actually be used. This may
depend on both config and runtime conditions. Convert current users
which are page_owner and several in the DRM subsystem. Same will be
done for SLUB later.
- Because the init might now be called after the boot-time memblock
allocation has given all memory to the buddy allocator, change
stack_depot_init() to allocate stack_table with kvmalloc() when
memblock is no longer available. Also handle allocation failure by
disabling stackdepot (could have theoretically happened even with
memblock allocation previously), and don't unnecessarily align the
memblock allocation to its own size anymore.
[1] https://lore.kernel.org/all/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/
Link: https://lkml.kernel.org/r/20211013073005.11351-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Marco Elver <elver@google.com> # stackdepot
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
From: Colin Ian King <colin.king@canonical.com>
Subject: lib/stackdepot: fix spelling mistake and grammar in pr_err message
There is a spelling mistake of the word allocation so fix this and
re-phrase the message to make it easier to read.
Link: https://lkml.kernel.org/r/20211015104159.11282-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup
On FLATMEM, we call page_ext_init_flatmem_late() just before
kmem_cache_init() which means stack_depot_init() (called by page owner
init) will not properly recognize that it should use kvmalloc() and not
memblock_alloc(). memblock_alloc() will also not issue a warning and
return a block of memory that can be invalid and cause kernel page fault when
saving stacks, as reported by the kernel test robot [1].
Fix this by moving page_ext_init_flatmem_late() below kmem_cache_init() so
that slab_is_available() is true during stack_depot_init(). SPARSEMEM
doesn't have this issue, as it doesn't do page_ext_init_flatmem_late(),
but a different page_ext_init() even later in the boot process.
Thanks to Mike Rapoport for pointing out the FLATMEM init ordering issue.
While at it, also actually resolve a checkpatch warning in stack_depot_init()
from DRM CI, which was supposed to be in the original patch already.
[1] https://lore.kernel.org/all/20211014085450.GC18719@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/6abd9213-19a9-6d58-cedc-2414386d2d81@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup3
Due to cd06ab2fd48f ("drm/locking: add backtrace for locking contended
locks without backoff") landing recently to -next adding a new stack depot
user in drivers/gpu/drm/drm_modeset_lock.c we need to add an appropriate
call to stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/2a692365-cfa1-64f2-34e0-8aa5674dce5e@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup4
Due to 4e66934eaadc ("lib: add reference counting tracking
infrastructure") landing recently to net-next adding a new stack depot
user in lib/ref_tracker.c we need to add an appropriate call to
stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/45c1b738-1a2f-5b5f-2f6d-86fab206d01c@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Slab <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-22 06:14:27 +00:00
|
|
|
config STACKDEPOT_ALWAYS_INIT
|
|
|
|
bool
|
|
|
|
select STACKDEPOT
|
2023-11-20 17:47:04 +00:00
|
|
|
help
|
|
|
|
Always initialize stack depot during early boot
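The commit message above describes two ways of using stack depot: selecting STACKDEPOT_ALWAYS_INIT (as it says CONFIG_KASAN does), or selecting only STACKDEPOT and calling stack_depot_init() from the user's own init path. A sketch of both patterns follows; both symbol names are hypothetical and exist only for illustration.

```kconfig
# Hypothetical always-on debugging feature: stack depot must be ready
# early, so the table is allocated during boot.
config EXAMPLE_ALWAYS_ON_TRACKER
	bool "Example tracker active from early boot (hypothetical)"
	select STACKDEPOT_ALWAYS_INIT

# Hypothetical feature that is only switched on at runtime: it selects
# STACKDEPOT alone and is expected to call stack_depot_init() itself
# once it knows stack depot will actually be used.
config EXAMPLE_ON_DEMAND_TRACKER
	bool "Example tracker enabled on demand (hypothetical)"
	select STACKDEPOT
```

The second form avoids allocating the stack table on systems where the feature is built in but never enabled, which is the memory saving the commit message is after.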
|
|
|
|
|
|
|
|
config STACKDEPOT_MAX_FRAMES
|
|
|
|
int "Maximum number of frames in trace saved in stack depot"
|
|
|
|
range 1 256
|
|
|
|
default 64
|
|
|
|
depends on STACKDEPOT
|
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be
allocated from memblock, even if stack depot ends up not actually used.
The default size of stack_table is 4MB on 32-bit, 8MB on 64-bit.
This is fine for use-cases such as KASAN which is also a config option
and has overhead on its own. But it's an issue for functionality that
has to be actually enabled on boot (page_owner) or depends on hardware
(GPU drivers) and thus the memory might be wasted. This was raised as
an issue [1] when attempting to add stackdepot support for SLUB's debug
object tracking functionality. It's common to build kernels with
CONFIG_SLUB_DEBUG and enable slub_debug on boot only when needed, or
create only specific kmem caches with debugging for testing purposes.
It would thus be more efficient if stackdepot's table was allocated only
when actually going to be used. This patch thus makes the allocation
(and whole stack_depot_init() call) optional:
- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
well-defined point of allocation as part of mem_init(). Make
CONFIG_KASAN select this flag.
- Other users have to call stack_depot_init() as part of their own init
when it's determined that stack depot will actually be used. This may
depend on both config and runtime conditions. Convert current users
which are page_owner and several in the DRM subsystem. Same will be
done for SLUB later.
- Because the init might now be called after the boot-time memblock
allocation has given all memory to the buddy allocator, change
stack_depot_init() to allocate stack_table with kvmalloc() when
memblock is no longer available. Also handle allocation failure by
disabling stackdepot (could have theoretically happened even with
memblock allocation previously), and don't unnecessarily align the
memblock allocation to its own size anymore.
[1] https://lore.kernel.org/all/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/
Link: https://lkml.kernel.org/r/20211013073005.11351-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Marco Elver <elver@google.com> # stackdepot
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
From: Colin Ian King <colin.king@canonical.com>
Subject: lib/stackdepot: fix spelling mistake and grammar in pr_err message
There is a spelling mistake of the word allocation so fix this and
re-phrase the message to make it easier to read.
Link: https://lkml.kernel.org/r/20211015104159.11282-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup
On FLATMEM, we call page_ext_init_flatmem_late() just before
kmem_cache_init(), which means stack_depot_init() (called by page owner
init) will not properly recognize that it should use kvmalloc() and not
memblock_alloc(). memblock_alloc() will also not issue a warning and will
return a block of memory that can be invalid and cause a kernel page fault
when saving stacks, as reported by the kernel test robot [1].
Fix this by moving page_ext_init_flatmem_late() below kmem_cache_init() so
that slab_is_available() is true during stack_depot_init(). SPARSEMEM
doesn't have this issue, as it doesn't do page_ext_init_flatmem_late(),
but a different page_ext_init() even later in the boot process.
Thanks to Mike Rapoport for pointing out the FLATMEM init ordering issue.
While at it, also actually resolve a checkpatch warning in stack_depot_init()
from DRM CI, which was supposed to be in the original patch already.
[1] https://lore.kernel.org/all/20211014085450.GC18719@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/6abd9213-19a9-6d58-cedc-2414386d2d81@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup3
Due to cd06ab2fd48f ("drm/locking: add backtrace for locking contended
locks without backoff") landing recently to -next adding a new stack depot
user in drivers/gpu/drm/drm_modeset_lock.c we need to add an appropriate
call to stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/2a692365-cfa1-64f2-34e0-8aa5674dce5e@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Marco Elver <elver@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Oliver Glitta <glittao@gmail.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
From: Vlastimil Babka <vbabka@suse.cz>
Subject: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup4
Due to 4e66934eaadc ("lib: add reference counting tracking
infrastructure") landing recently to net-next adding a new stack depot
user in lib/ref_tracker.c we need to add an appropriate call to
stack_depot_init() there as well.
Link: https://lkml.kernel.org/r/45c1b738-1a2f-5b5f-2f6d-86fab206d01c@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Slab <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
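The split described in the commit message above can be sketched as a Kconfig fragment. This is only an illustration: EXAMPLE_LAZY_STACK_USER and EXAMPLE_EARLY_STACK_USER are hypothetical symbols that do not exist in the tree, and the first one assumes its own init code calls stack_depot_init() once it knows stack depot will actually be used.

```
# Hypothetical user that is only switched on at boot or at runtime.
# It selects STACKDEPOT alone, so nothing is allocated up front and
# its init path is expected to call stack_depot_init() itself.
config EXAMPLE_LAZY_STACK_USER
	bool
	depends on STACKTRACE_SUPPORT
	select STACKDEPOT

# Hypothetical user that needs the table from early boot onwards.
# Selecting STACKDEPOT_ALWAYS_INIT keeps the allocation at the
# well-defined early point, as the commit message describes for KASAN.
config EXAMPLE_EARLY_STACK_USER
	bool
	depends on STACKTRACE_SUPPORT
	select STACKDEPOT
	select STACKDEPOT_ALWAYS_INIT
```

The first form mirrors how REF_TRACKER below pulls in STACKDEPOT without forcing the early allocation.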
2022-01-22 06:14:27 +00:00
|
|
|
|
2021-12-05 04:21:55 +00:00
|
|
|
config REF_TRACKER
|
|
|
|
bool
|
|
|
|
depends on STACKTRACE_SUPPORT
|
|
|
|
select STACKDEPOT
|
|
|
|
|
2016-09-17 14:38:44 +00:00
|
|
|
config SBITMAP
|
|
|
|
bool
|
|
|
|
|
2017-02-03 09:29:06 +00:00
|
|
|
config PARMAN
|
2017-02-24 10:25:55 +00:00
|
|
|
tristate "parman" if COMPILE_TEST
|
2017-02-03 09:29:06 +00:00
|
|
|
|
2019-09-09 21:54:21 +00:00
|
|
|
config OBJAGG
|
|
|
|
tristate "objagg" if COMPILE_TEST
|
|
|
|
|
2023-09-11 14:39:43 +00:00
|
|
|
config LWQ_TEST
|
|
|
|
bool "Boot-time test for lwq queuing"
|
|
|
|
help
|
|
|
|
Run boot-time test of light-weight queuing.
|
|
|
|
|
2005-06-24 03:49:30 +00:00
|
|
|
endmenu
|
2017-05-23 17:28:26 +00:00
|
|
|
|
2019-08-13 09:24:04 +00:00
|
|
|
config GENERIC_IOREMAP
|
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_ASHLDI3
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_ASHRDI3
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_LSHRDI3
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_MULDI3
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_CMPDI2
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
|
|
|
|
2018-04-11 07:50:17 +00:00
|
|
|
config GENERIC_LIB_UCMPDI2
|
2017-05-23 17:28:26 +00:00
|
|
|
bool
|
2020-07-24 00:21:59 +00:00
|
|
|
|
2020-07-09 18:43:21 +00:00
|
|
|
config GENERIC_LIB_DEVMEM_IS_ALLOWED
|
|
|
|
bool
|
|
|
|
|
2020-07-24 00:21:59 +00:00
|
|
|
config PLDMFW
|
|
|
|
bool
|
|
|
|
default n
|
2021-01-27 19:06:13 +00:00
|
|
|
|
|
|
|
config ASN1_ENCODER
|
|
|
|
tristate
|
2022-04-01 21:40:29 +00:00
|
|
|
|
|
|
|
config POLYNOMIAL
|
|
|
|
tristate
|
2023-10-12 18:53:54 +00:00
|
|
|
|
|
|
|
config FIRMWARE_TABLE
|
|
|
|
bool
|
2024-10-11 14:12:14 +00:00
|
|
|
|
|
|
|
config UNION_FIND
|
|
|
|
bool
|
lib/min_heap: introduce non-inline versions of min heap API functions
Patch series "Enhance min heap API with non-inline functions and
optimizations", v2.
Add non-inline versions of the min heap API functions in lib/min_heap.c
and update all users outside of kernel/events/core.c to use these
non-inline versions. To mitigate the performance impact of indirect
function calls caused by the non-inline versions of the swap and compare
functions, a builtin swap has been introduced that swaps elements based on
their size. Additionally, it micro-optimizes the efficiency of the min
heap by pre-scaling the counter, following the same approach as in
lib/sort.c. Documentation for the min heap API has also been added to the
core-api section.
This patch (of 10):
All current min heap API functions are marked with '__always_inline'.
However, as the number of users increases, inlining these functions
everywhere leads to an increase in kernel size.
In performance-critical paths, such as when perf events are enabled and
min heap functions are called on every context switch, it is important to
retain the inline versions for optimal performance. To balance this, the
original inline functions are kept, and additional non-inline versions of
the functions have been added in lib/min_heap.c.
Link: https://lkml.kernel.org/r/20241020040200.939973-1-visitorckw@gmail.com
Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
Link: https://lkml.kernel.org/r/20241020040200.939973-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
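The trade-off described in the commit message above can be illustrated with a small userspace sketch. It is not the kernel's lib/min_heap.c: struct heap_ops, swap_by_size(), heap_sift_down_inline(), heap_sift_down(), heap_heapify() and int_less() are all hypothetical names chosen for this example. The sketch assumes a design where the inline body stays available for hot paths, an out-of-line wrapper is emitted once for everyone else, and a size-based builtin swap avoids an indirect call for swapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback table; only the comparison stays indirect. */
struct heap_ops {
	int (*less)(const void *a, const void *b);
};

/* Builtin swap: pick a fast path from the element size. */
static void swap_by_size(void *a, void *b, size_t size)
{
	if (size == sizeof(uint32_t)) {
		uint32_t ta, tb;

		memcpy(&ta, a, sizeof(ta));
		memcpy(&tb, b, sizeof(tb));
		memcpy(a, &tb, sizeof(tb));
		memcpy(b, &ta, sizeof(ta));
	} else if (size == sizeof(uint64_t)) {
		uint64_t ta, tb;

		memcpy(&ta, a, sizeof(ta));
		memcpy(&tb, b, sizeof(tb));
		memcpy(a, &tb, sizeof(tb));
		memcpy(b, &ta, sizeof(ta));
	} else {
		unsigned char *pa = a, *pb = b;

		/* Generic fallback: swap byte by byte. */
		while (size--) {
			unsigned char t = *pa;

			*pa++ = *pb;
			*pb++ = t;
		}
	}
}

/* Inline body, kept for callers that cannot afford a function call. */
static inline void heap_sift_down_inline(void *data, size_t nr,
					 size_t elem_size, size_t pos,
					 const struct heap_ops *ops)
{
	unsigned char *base = data;

	for (;;) {
		size_t left = 2 * pos + 1, right = left + 1, smallest = pos;

		if (left < nr && ops->less(base + left * elem_size,
					   base + smallest * elem_size))
			smallest = left;
		if (right < nr && ops->less(base + right * elem_size,
					    base + smallest * elem_size))
			smallest = right;
		if (smallest == pos)
			return;
		swap_by_size(base + pos * elem_size,
			     base + smallest * elem_size, elem_size);
		pos = smallest;
	}
}

/* Out-of-line wrapper: one copy of the code shared by ordinary callers. */
void heap_sift_down(void *data, size_t nr, size_t elem_size, size_t pos,
		    const struct heap_ops *ops)
{
	heap_sift_down_inline(data, nr, elem_size, pos, ops);
}

/* Build a min heap in place using the out-of-line sift-down. */
void heap_heapify(void *data, size_t nr, size_t elem_size,
		  const struct heap_ops *ops)
{
	assert(elem_size > 0);
	for (size_t i = nr / 2; i-- > 0; )
		heap_sift_down(data, nr, elem_size, i, ops);
}

/* Example comparison callback for int elements. */
int int_less(const void *a, const void *b)
{
	return *(const int *)a < *(const int *)b;
}
```

A caller would fill an array, pass it to heap_heapify() together with a struct heap_ops whose less member points at int_less(), and then find the smallest element at index 0. A hot path could call heap_sift_down_inline() directly instead of the wrapper.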
2024-10-20 04:01:51 +00:00
|
|
|
|
|
|
|
config MIN_HEAP
|
|
|
|
bool
|