The only current user of the DH KPP algorithm, the
keyctl(KEYCTL_DH_COMPUTE) syscall, doesn't set the domain parameter ->q
in struct dh. Remove it and any associated (de)serialization code in
crypto_dh_encode_key() and crypto_dh_decode_key. Adjust the encoded
->secret values in testmgr's DH test vectors accordingly.
Note that the dh-generic implementation would have initialized its
struct dh_ctx's ->q from the decoded struct dh's ->q, if present. If this
struct dh_ctx's ->q would ever have been non-NULL, it would have enabled a
full key validation as specified in NIST SP800-56A in dh_is_pubkey_valid().
However, as outlined above, ->q is always NULL in practice and the full key
validation code is effectively dead. A later patch will make
dh_is_pubkey_valid() to calculate Q from P on the fly, if possible, so
don't remove struct dh_ctx's ->q now, but leave it there until that has
happened.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The upcoming support for the RFC 7919 ffdhe group parameters will be
made available in the form of templates like "ffdhe2048(dh)",
"ffdhe3072(dh)" and so on. Template instantiations thereof would wrap the
inner "dh" kpp_alg and also provide kpp_alg services to the outside again.
The primitves needed for providing kpp_alg services from template instances
have been introduced with the previous patch. Continue this work now and
implement everything needed for enabling template instances to make use
of inner KPP algorithms like "dh".
More specifically, define a struct crypto_kpp_spawn in close analogy to
crypto_skcipher_spawn, crypto_shash_spawn and alike. Implement a
crypto_grab_kpp() and crypto_drop_kpp() pair for binding such a spawn to
some inner kpp_alg and for releasing it respectively. Template
implementations can instantiate transforms from the underlying kpp_alg by
means of the new crypto_spawn_kpp(). Finally, provide the
crypto_spawn_kpp_alg() helper for accessing a spawn's underlying kpp_alg
during template instantiation.
Annotate everything with proper kernel-doc comments, even though
include/crypto/internal/kpp.h is not considered for the generated docs.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The upcoming support for the RFC 7919 ffdhe group parameters will be
made available in the form of templates like "ffdhe2048(dh)",
"ffdhe3072(dh)" and so on. Template instantiations thereof would wrap the
inner "dh" kpp_alg and also provide kpp_alg services to the outside again.
Furthermore, it might be perhaps be desirable to provide KDF templates in
the future, which would similarly wrap an inner kpp_alg and present
themselves to the outside as another kpp_alg, transforming the shared
secret on its way out.
Introduce the bits needed for supporting KPP template instances. Everything
related to inner kpp_alg spawns potentially being held by such template
instances will be deferred to a subsequent patch in order to facilitate
review.
Define struct struct kpp_instance in close analogy to the already existing
skcipher_instance, shash_instance and alike, but wrapping a struct kpp_alg.
Implement the new kpp_register_instance() template instance registration
primitive. Provide some helper functions for
- going back and forth between a generic struct crypto_instance and the new
struct kpp_instance,
- obtaining the instantiating kpp_instance from a crypto_kpp transform and
- for accessing a given kpp_instance's implementation specific context
data.
Annotate everything with proper kernel-doc comments, even though
include/crypto/internal/kpp.h is not considered for the generated docs.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When doing iperf over ipsec with crypto hardware sun8i-ce, I hit some
spinlock recursion bug.
This is due to completion function called with enabled BH.
Add check a to detect this.
Fixes: 735d37b542 ("crypto: engine - Introduce the block request crypto engine framework")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The lrw template relies on ecb to work. So we need to declare
a Kconfig dependency as well as a module softdep on it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The xts module needs ecb to be present as it's meant to work
on top of ecb. This patch adds a softdep so ecb can be included
automatically into the initramfs.
Reported-by: rftc <rftc@gmx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
FIPS 140 requires a minimum security strength of 112 bits. This implies
that the HMAC key must not be smaller than 112 in FIPS mode.
This restriction implies that the test vectors for HMAC that have a key
that is smaller than 112 bits must be disabled when FIPS support is
compiled.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
By adding the support for the flag fips_skip, hash / HMAC test vectors
may be marked to be not applicable in FIPS mode. Such vectors are
silently skipped in FIPS mode.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The multibuffer algorithms was removed already in 2018, so it is
necessary to clear the test code left by tcrypt.
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The C standard does not support dereferencing pointers that are not
aligned with respect to the pointed-to type, and doing so is technically
undefined behavior, even if the underlying hardware supports it.
This means that conditionally dereferencing such pointers based on
whether CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y is not the right thing
to do, and actually results in alignment faults on ARM, which are fixed
up on a slow path. Instead, we should use the unaligned accessors in
such cases: on architectures that don't care about alignment, they will
result in identical codegen whereas, e.g., codegen on ARM will avoid
doubleword loads and stores but use ordinary ones, which are able to
tolerate misalignment.
Link: https://lore.kernel.org/linux-crypto/CAHk-=wiKkdYLY0bv+nXrcJz3NH9mAqPAafX7PpW5EwVtxsEu7Q@mail.gmail.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function crypto_authenc_decrypt_tail discards its flags
argument and always relies on the flags from the original request
when starting its sub-request.
This is clearly wrong as it may cause the SLEEPABLE flag to be
set when it shouldn't.
Fixes: 92d95ba917 ("crypto: authenc - Convert to new AEAD interface")
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The new convention for akcipher_alg::verify makes it unclear which
values are the lengths of the signature and digest. Add local variables
to make it clearer what is going on.
Also rename the digest_size variable in pkcs1pad_sign(), as it is
actually the digest *info* size, not the digest size which is different.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Before checking whether the expected digest_info is present, we need to
check that there are enough bytes remaining.
Fixes: a49de377e0 ("crypto: Add hash param to pkcs1pad")
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
RSA PKCS#1 v1.5 signatures are required to be the same length as the RSA
key size. RFC8017 specifically requires the verifier to check this
(https://datatracker.ietf.org/doc/html/rfc8017#section-8.2.2).
Commit a49de377e0 ("crypto: Add hash param to pkcs1pad") changed the
kernel to allow longer signatures, but didn't explain this part of the
change; it seems to be unrelated to the rest of the commit.
Revert this change, since it doesn't appear to be correct.
We can be pretty sure that no one is relying on overly-long signatures
(which would have to be front-padded with zeroes) being supported, given
that they would have been broken since commit c7381b0128
("crypto: akcipher - new verify API for public key algorithms").
Fixes: a49de377e0 ("crypto: Add hash param to pkcs1pad")
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Tadeusz Struk <tadeusz.struk@linaro.org>
Suggested-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit c7381b0128 ("crypto: akcipher - new verify API for public key
algorithms") changed akcipher_alg::verify to take in both the signature
and the actual hash and do the signature verification, rather than just
return the hash expected by the signature as was the case before. To do
this, it implemented a hack where the signature and hash are
concatenated with each other in one scatterlist.
Obviously, for this to work correctly, akcipher_alg::verify needs to
correctly extract the two items from the scatterlist it is given.
Unfortunately, it doesn't correctly extract the hash in the case where
the signature is longer than the RSA key size, as it assumes that the
signature's length is equal to the RSA key size. This causes a prefix
of the hash, or even the entire hash, to be taken from the *signature*.
(Note, the case of a signature longer than the RSA key size should not
be allowed in the first place; a separate patch will fix that.)
It is unclear whether the resulting scheme has any useful security
properties.
Fix this by correctly extracting the hash from the scatterlist.
Fixes: c7381b0128 ("crypto: akcipher - new verify API for public key algorithms")
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The pkcs1pad template can be instantiated with an arbitrary akcipher
algorithm, which doesn't make sense; it is specifically an RSA padding
scheme. Make it check that the underlying algorithm really is RSA.
Fixes: 3d5b1ecdea ("crypto: rsa - RSA padding algorithm")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In addition to sha256 we must also enable hmac for the kdf self-test
to work.
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 304b4acee2 ("crypto: kdf - select SHA-256 required...")
Fixes: 026a733e66 ("crypto: kdf - add SP800-108 counter key...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As testmgr is part of cryptomgr which was designed to be unloadable
as a module, it shouldn't export any symbols for other crypto
modules to use as that would prevent it from being unloaded. All
its functionality is meant to be accessed through notifiers.
The symbol crypto_simd_disabled_for_test was added to testmgr
which caused it to be pinned as a module if its users were also
loaded. This patch moves it out of testmgr and into crypto/algapi.c
so cryptomgr can again be unloaded and replaced on demand.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
tcrypt supports testing of SM3 hash algorithms that use AVX
instruction acceleration.
In order to add the sm3 asynchronous test to the appropriate
position, shift the testcase sequence number of the multi buffer
backward and start from 450.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds AVX assembly accelerated implementation of SM3 secure
hash algorithm. From the benchmark data, compared to pure software
implementation sm3-generic, the performance increase is up to 38%.
The main algorithm implementation based on SM3 AES/BMI2 accelerated
work by libgcrypt at:
https://gnupg.org/software/libgcrypt/index.html
Benchmark on Intel i5-6200U 2.30GHz, performance data of two
implementations, pure software sm3-generic and sm3-avx acceleration.
The data comes from the 326 mode and 422 mode of tcrypt. The abscissas
are different lengths of per update. The data is tabulated and the
unit is Mb/s:
update-size | 16 64 256 1024 2048 4096 8192
------------+-------------------------------------------------------
sm3-generic | 105.97 129.60 182.12 189.62 188.06 193.66 194.88
sm3-avx | 119.87 163.05 244.44 260.92 257.60 264.87 265.88
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SM3 generic library is stand-alone implementation, it is necessary
making the sm3-generic implementation to depends on SM3 library.
The functions crypto_sm3_*() provided by sm3_generic is no longer
exported.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SM3 generic library is stand-alone implementation, it is necessary
for the calculation of sm2 z digest to depends on SM3 library
instead of sm3-generic.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 6048fdcc5f ("lib/crypto: blake2s: include as built-in") took
away a number of prompt texts from other crypto libraries. This makes
values flip from built-in to module when oldconfig runs, and causes
problems when these crypto libs need to be built in for thingslike
BIG_KEYS.
Fixes: 6048fdcc5f ("lib/crypto: blake2s: include as built-in")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
[Jason: - moved menu into submenu of lib/ instead of root menu
- fixed chacha sub-dependencies for CONFIG_CRYPTO]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Pull signal/exit/ptrace updates from Eric Biederman:
"This set of changes deletes some dead code, makes a lot of cleanups
which hopefully make the code easier to follow, and fixes bugs found
along the way.
The end-game which I have not yet reached yet is for fatal signals
that generate coredumps to be short-circuit deliverable from
complete_signal, for force_siginfo_to_task not to require changing
userspace configured signal delivery state, and for the ptrace stops
to always happen in locations where we can guarantee on all
architectures that the all of the registers are saved and available on
the stack.
Removal of profile_task_ext, profile_munmap, and profile_handoff_task
are the big successes for dead code removal this round.
A bunch of small bug fixes are included, as most of the issues
reported were small enough that they would not affect bisection so I
simply added the fixes and did not fold the fixes into the changes
they were fixing.
There was a bug that broke coredumps piped to systemd-coredump. I
dropped the change that caused that bug and replaced it entirely with
something much more restrained. Unfortunately that required some
rebasing.
Some successes after this set of changes: There are few enough calls
to do_exit to audit in a reasonable amount of time. The lifetime of
struct kthread now matches the lifetime of struct task, and the
pointer to struct kthread is no longer stored in set_child_tid. The
flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
removed. Issues where task->exit_code was examined with
signal->group_exit_code should been examined were fixed.
There are several loosely related changes included because I am
cleaning up and if I don't include them they will probably get lost.
The original postings of these changes can be found at:
https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.orghttps://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.orghttps://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org
I trimmed back the last set of changes to only the obviously correct
once. Simply because there was less time for review than I had hoped"
* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
ptrace/m68k: Stop open coding ptrace_report_syscall
ptrace: Remove unused regs argument from ptrace_report_syscall
ptrace: Remove second setting of PT_SEIZED in ptrace_attach
taskstats: Cleanup the use of task->exit_code
exit: Use the correct exit_code in /proc/<pid>/stat
exit: Fix the exit_code for wait_task_zombie
exit: Coredumps reach do_group_exit
exit: Remove profile_handoff_task
exit: Remove profile_task_exit & profile_munmap
signal: clean up kernel-doc comments
signal: Remove the helper signal_group_exit
signal: Rename group_exit_task group_exec_task
coredump: Stop setting signal->group_exit_task
signal: Remove SIGNAL_GROUP_COREDUMP
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
signal: Make coredump handling explicit in complete_signal
signal: Have prepare_signal detect coredumps using signal->core_state
signal: Have the oom killer detect coredumps using signal->core_state
exit: Move force_uaccess back into do_exit
exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
...
-----BEGIN PGP SIGNATURE-----
iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCYdzf7hIcamFya2tvQGtl
cm5lbC5vcmcACgkQGnq6IXRrq9IA/AEA2sX9fNNYSYnUwvi/Ju+Y8BgW4pA+GvA0
L8iSuUkWdssA/iQFdQ3vyDK0CI56G1jerKMyT7o8QEuJmUYogTRV7+oA
=7q7g
-----END PGP SIGNATURE-----
Merge tag 'tpmdd-next-v5.17-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull TPM updates from Jarkko Sakkinen:
"Other than bug fixes for TPM, this includes a patch for asymmetric
keys to allow to look up and verify with self-signed certificates
(keys without so called AKID - Authority Key Identifier) using a new
"dn:" prefix in the query"
* tag 'tpmdd-next-v5.17-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
lib: remove redundant assignment to variable ret
tpm: fix NPE on probe for missing device
tpm: fix potential NULL pointer access in tpm_del_char_device
tpm: Add Upgrade/Reduced mode support for TPM2 modules
char: tpm: cr50: Set TPM_FIRMWARE_POWER_MANAGED based on device property
keys: X.509 public key issuer lookup without AKID
tpm_tis: Fix an error handling path in 'tpm_tis_core_init()'
tpm: tpm_tis_spi_cr50: Add default RNG quality
tpm/st33zp24: drop unneeded over-commenting
tpm: add request_locality before write TPM_INT_ENABLE
Pull crypto updates from Herbert Xu:
"Algorithms:
- Drop alignment requirement for data in aesni
- Use synchronous seeding from the /dev/random in DRBG
- Reseed nopr DRBGs every 5 minutes from /dev/random
- Add KDF algorithms currently used by security/DH
- Fix lack of entropy on some AMD CPUs with jitter RNG
Drivers:
- Add support for the D1 variant in sun8i-ce
- Add SEV_INIT_EX support in ccp
- PFVF support for GEN4 host driver in qat
- Compression support for GEN4 devices in qat
- Add cn10k random number generator support"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (145 commits)
crypto: af_alg - rewrite NULL pointer check
lib/mpi: Add the return value check of kcalloc()
crypto: qat - fix definition of ring reset results
crypto: hisilicon - cleanup warning in qm_get_qos_value()
crypto: kdf - select SHA-256 required for self-test
crypto: x86/aesni - don't require alignment of data
crypto: ccp - remove unneeded semicolon
crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
crypto: s390/sha512 - Use macros instead of direct IV numbers
crypto: sparc/sha - remove duplicate hash init function
crypto: powerpc/sha - remove duplicate hash init function
crypto: mips/sha - remove duplicate hash init function
crypto: sha256 - remove duplicate generic hash init function
crypto: jitter - add oversampling of noise source
MAINTAINERS: update SEC2 driver maintainers list
crypto: ux500 - Use platform_get_irq() to get the interrupt
crypto: hisilicon/qm - disable qm clock-gating
crypto: omap-aes - Fix broken pm_runtime_and_get() usage
MAINTAINERS: update caam crypto driver maintainers list
crypto: octeontx2 - prevent underflow in get_cores_bmap()
...
There are non-root X.509 v3 certificates in use out there that contain
no Authority Key Identifier extension (RFC5280 section 4.2.1.1). For
trust verification purposes the kernel asymmetric key type keeps two
struct asymmetric_key_id instances that the key can be looked up by,
and another two to look up the key's issuer. The x509 public key type
and the PKCS7 type generate them from the SKID and AKID extensions in
the certificate. In effect current code has no way to look up the
issuer certificate for verification without the AKID.
To remedy this, add a third asymmetric_key_id blob to the arrays in
both asymmetric_key_id's (for certficate subject) and in the
public_keys_signature's auth_ids (for issuer lookup), using just raw
subject and issuer DNs from the certificate. Adapt
asymmetric_key_ids() and its callers to use the third ID for lookups
when none of the other two are available. Attempt to keep the logic
intact when they are, to minimise behaviour changes. Adapt the
restrict functions' NULL-checks to include that ID too. Do not modify
the lookup logic in pkcs7_verify.c, the AKID extensions are still
required there.
Internally use a new "dn:" prefix to the search specifier string
generated for the key lookup in find_asymmetric_key(). This tells
asymmetric_key_match_preparse to only match the data against the raw
DN in the third ID and shouldn't conflict with search specifiers
already in use.
In effect implement what (2) in the struct asymmetric_key_id comment
(include/keys/asymmetric-type.h) is probably talking about already, so
do not modify that comment. It is also how "openssl verify" looks up
issuer certificates without the AKID available. Lookups by the raw
DN are unambiguous only provided that the CAs respect the condition in
RFC5280 4.2.1.1 that the AKID may only be omitted if the CA uses
a single signing key.
The following is an example of two things that this change enables.
A self-signed ceritficate is generated following the example from
https://letsencrypt.org/docs/certificates-for-localhost/, and can be
looked up by an identifier and verified against itself by linking to a
restricted keyring -- both things not possible before due to the missing
AKID extension:
$ openssl req -x509 -out localhost.crt -outform DER -keyout localhost.key \
-newkey rsa:2048 -nodes -sha256 \
-subj '/CN=localhost' -extensions EXT -config <( \
echo -e "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\n" \
"subjectAltName=DNS:localhost\nkeyUsage=digitalSignature\n" \
"extendedKeyUsage=serverAuth")
$ keyring=`keyctl newring test @u`
$ trusted=`keyctl padd asymmetric trusted $keyring < localhost.crt`; \
echo $trusted
39726322
$ keyctl search $keyring asymmetric dn:3112301006035504030c096c6f63616c686f7374
39726322
$ keyctl restrict_keyring $keyring asymmetric key_or_keyring:$trusted
$ keyctl padd asymmetric verified $keyring < localhost.crt
Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Because of the possible alloc failure of the alloc_page(), it could
return NULL pointer.
And there is a check below the sg_assign_page().
But it will be more logical to move the NULL check before the
sg_assign_page().
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation for using blake2s in the RNG, we change the way that it
is wired-in to the build system. Instead of using ifdefs to select the
right symbol, we use weak symbols. And because ARM doesn't need the
generic implementation, we make the generic one default only if an arch
library doesn't need it already, and then have arch libraries that do
need it opt-in. So that the arch libraries can remain tristate rather
than bool, we then split the shash part from the glue code.
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The self test of the KDF is based on SHA-256. Thus, this algorithm must
be present as otherwise a warning is issued.
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_sha256_init() and sha256_base_init() are the same repeated
implementations, remove the crypto_sha256_init() in generic
implementation, sha224 is the same process.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The output n bits can receive more than n bits of min entropy, of course,
but the fixed output of the conditioning function can only asymptotically
approach the output size bits of min entropy, not attain that bound.
Random maps will tend to have output collisions, which reduces the
creditable output entropy (that is what SP 800-90B Section 3.1.5.1.2
attempts to bound).
The value "64" is justified in Appendix A.4 of the current 90C draft,
and aligns with NIST's in "epsilon" definition in this document, which is
that a string can be considered "full entropy" if you can bound the min
entropy in each bit of output to at least 1-epsilon, where epsilon is
required to be <= 2^(-32).
Note, this patch causes the Jitter RNG to cut its performance in half in
FIPS mode because the conditioning function of the LFSR produces 64 bits
of entropy in one block. The oversampling requires that additionally 64
bits of entropy are sampled from the noise source. If the conditioner is
changed, such as using SHA-256, the impact of the oversampling is only
one fourth, because for the 256 bit block of the conditioner, only 64
additional bits from the noise source must be sampled.
This patch is derived from the user space jitterentropy-library.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Reviewed-by: Simo Sorce <simo@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Update module_put_and_exit to call kthread_exit instead of do_exit.
Change the name to reflect this change in functionality. All of the
users of module_put_and_exit are causing the current kthread to exit
so this change makes it clear what is happening. There is no
functional change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The jitterentropy collection loop in jent_gen_entropy() can in principle
run indefinitely without making any progress if it only receives stuck
measurements as determined by jent_stuck(). After 31 consecutive stuck
samples, the Repetition Count Test (RCT) would fail anyway and the
jitterentropy RNG instances moved into ->health_failure == 1 state.
jent_gen_entropy()'s caller, jent_read_entropy() would then check for
this ->health_failure condition and return an error if found set. It
follows that there's absolutely no point in continuing the collection loop
in jent_gen_entropy() once the RCT has failed.
Make the jitterentropy collection loop more robust by terminating it upon
jent_health_failure() so that it won't continue to run indefinitely without
making any progress.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The jitterentropy's Repetition Count Test (RCT) as well as the Adaptive
Proportion Test (APT) are run unconditionally on any collected samples.
However, their result, i.e. ->health_failure, will only get checked if
fips_enabled is set, c.f. the jent_health_failure() wrapper.
I would argue that a RCT or APT failure indicates that something's
seriously off and that this should always be reported as an error,
independently of whether FIPS mode is enabled or not: it should be up to
callers whether or not and how to handle jitterentropy failures.
Make jent_health_failure() to unconditionally return ->health_failure,
independent of whether fips_enabled is set.
Note that fips_enabled isn't accessed from the jitterentropy code anymore
now. Remove the linux/fips.h include as well as the jent_fips_enabled()
wrapper.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A subsequent patch will make the jitterentropy RNG to unconditionally
report health test errors back to callers, independent of whether
fips_enabled is set or not. The DRBG needs access to a functional
jitterentropy instance only in FIPS mode (because it's the only SP800-90B
compliant entropy source as it currently stands). Thus, it is perfectly
fine for the DRBGs to obtain entropy from the jitterentropy source only
on a best effort basis if fips_enabled is off.
Make the DRBGs to ignore jitterentropy failures if fips_enabled is not set.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On Dec 31 2023 NIST sunsets TDES for FIPS use. To prevent FIPS
validations to be completed in the future to be affected by the TDES
sunsetting, disallow TDES already now. Otherwise a FIPS validation would
need to be "touched again" end 2023 to handle TDES accordingly.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
FIPS disallows DH with keys < 2048 bits. Thus, the kernel should
consider the enforcement of this limit.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
FIPS disallows RSA with keys < 2048 bits. Thus, the kernel should
consider the enforcement of this limit.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The APT compares the current time stamp with a pre-set value. The
current code only considered the 4 LSB only. Yet, after reviews by
mathematicians of the user space Jitter RNG version >= 3.1.0, it was
concluded that the APT can be calculated on the 32 LSB of the time
delta. Thi change is applied to the kernel.
This fixes a bug where an AMD EPYC fails this test as its RDTSC value
contains zeros in the LSB. The most appropriate fix would have been to
apply a GCD calculation and divide the time stamp by the GCD. Yet, this
is a significant code change that will be considered for a future
update. Note, tests showed that constantly the GCD always was 32 on
these systems, i.e. the 5 LSB were always zero (thus failing the APT
since it only considered the 4 LSB for its calculation).
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SP800-108 defines three KDFs - this patch provides the counter KDF
implementation.
The KDF is implemented as a service function where the caller has to
maintain the hash / HMAC state. Apart from this hash/HMAC state, no
additional state is required to be maintained by either the caller or
the KDF implementation.
The key for the KDF is set with the crypto_kdf108_setkey function which
is intended to be invoked before the caller requests a key derivation
operation via crypto_kdf108_ctr_generate.
SP800-108 allows the use of either a HMAC or a hash as crypto primitive
for the KDF. When a HMAC primtive is intended to be used,
crypto_kdf108_setkey must be used to set the HMAC key. Otherwise, for a
hash crypto primitve crypto_kdf108_ctr_generate can be used immediately
after allocating the hash handle.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In contrast to the fully prediction resistant 'pr' DRBGs, the 'nopr'
variants get seeded once at boot and reseeded only rarely thereafter,
namely only after 2^20 requests have been served each. AFAICT, this
reseeding based on the number of requests served is primarily motivated
by information theoretic considerations, c.f. NIST SP800-90Ar1,
sec. 8.6.8 ("Reseeding").
However, given the relatively large seed lifetime of 2^20 requests, the
'nopr' DRBGs can hardly be considered to provide any prediction resistance
whatsoever, i.e. to protect against threats like side channel leaks of the
internal DRBG state (think e.g. leaked VM snapshots). This is expected and
completely in line with the 'nopr' naming, but as e.g. the
"drbg_nopr_hmac_sha512" implementation is potentially being used for
providing the "stdrng" and thus, the crypto_default_rng serving the
in-kernel crypto, it would certainly be desirable to achieve at least the
same level of prediction resistance as get_random_bytes() does.
Note that the chacha20 rngs underlying get_random_bytes() get reseeded
every CRNG_RESEED_INTERVAL == 5min: the secondary, per-NUMA node rngs from
the primary one and the primary rng in turn from the entropy pool, provided
sufficient entropy is available.
The 'nopr' DRBGs do draw randomness from get_random_bytes() for their
initial seed already, so making them to reseed themselves periodically from
get_random_bytes() in order to let them benefit from the latter's
prediction resistance is not such a big change conceptually.
In principle, it would have been also possible to make the 'nopr' DRBGs to
periodically invoke a full reseeding operation, i.e. to also consider the
jitterentropy source (if enabled) in addition to get_random_bytes() for the
seed value. However, get_random_bytes() is relatively lightweight as
compared to the jitterentropy generation process and thus, even though the
'nopr' reseeding is supposed to get invoked infrequently, it's IMO still
worthwhile to avoid occasional latency spikes for drbg_generate() and
stick to get_random_bytes() only. As an additional remark, note that
drawing randomness from the non-SP800-90B-conforming get_random_bytes()
only won't adversely affect SP800-90A conformance either: the very same is
being done during boot via drbg_seed_from_random() already once
rng_is_initialized() flips to true and it follows that if the DRBG
implementation does conform to SP800-90A now, it will continue to do so.
Make the 'nopr' DRBGs to reseed themselves periodically from
get_random_bytes() every CRNG_RESEED_INTERVAL == 5min.
More specifically, introduce a new member ->last_seed_time to struct
drbg_state for recording in units of jiffies when the last seeding
operation had taken place. Make __drbg_seed() maintain it and let
drbg_generate() invoke a reseed from get_random_bytes() via
drbg_seed_from_random() if more than 5min have passed by since the last
seeding operation. Be careful to not to reseed if in testing mode though,
or otherwise the drbg related tests in crypto/testmgr.c would fail to
reproduce the expected output.
In order to keep the formatting clean in drbg_generate() wrap the logic
for deciding whether or not a reseed is due in a new helper,
drbg_nopr_reseed_interval_elapsed().
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that drbg_prepare_hrng() doesn't do anything but to instantiate a
jitterentropy crypto_rng instance, it looks a little odd to have the
related error handling at its only caller, drbg_instantiate().
Move the handling of jitterentropy allocation failures from
drbg_instantiate() close to the allocation itself in drbg_prepare_hrng().
There is no change in behaviour.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
get_random_bytes() usually hasn't full entropy available by the time DRBG
instances are first getting seeded from it during boot. Thus, the DRBG
implementation registers random_ready_callbacks which would in turn
schedule some work for reseeding the DRBGs once get_random_bytes() has
sufficient entropy available.
For reference, the relevant history around handling DRBG (re)seeding in
the context of a not yet fully seeded get_random_bytes() is:
commit 16b369a91d ("random: Blocking API for accessing
nonblocking_pool")
commit 4c7879907e ("crypto: drbg - add async seeding operation")
commit 205a525c33 ("random: Add callback API for random pool
readiness")
commit 57225e6797 ("crypto: drbg - Use callback API for random
readiness")
commit c2719503f5 ("random: Remove kernel blocking API")
However, some time later, the initialization state of get_random_bytes()
has been made queryable via rng_is_initialized() introduced with commit
9a47249d44 ("random: Make crng state queryable"). This primitive now
allows for streamlining the DRBG reseeding from get_random_bytes() by
replacing that aforementioned asynchronous work scheduling from
random_ready_callbacks with some simpler, synchronous code in
drbg_generate() next to the related logic already present therein. Apart
from improving overall code readability, this change will also enable DRBG
users to rely on wait_for_random_bytes() for ensuring that the initial
seeding has completed, if desired.
The previous patches already laid the grounds by making drbg_seed() to
record at each DRBG instance whether it was being seeded at a time when
rng_is_initialized() still had been false as indicated by
->seeded == DRBG_SEED_STATE_PARTIAL.
All that remains to be done now is to make drbg_generate() check for this
condition, determine whether rng_is_initialized() has flipped to true in
the meanwhile and invoke a reseed from get_random_bytes() if so.
Make this move:
- rename the former drbg_async_seed() work handler, i.e. the one in charge
of reseeding a DRBG instance from get_random_bytes(), to
"drbg_seed_from_random()",
- change its signature as appropriate, i.e. make it take a struct
drbg_state rather than a work_struct and change its return type from
"void" to "int" in order to allow for passing error information from
e.g. its __drbg_seed() invocation onwards to callers,
- make drbg_generate() invoke this drbg_seed_from_random() once it
encounters a DRBG instance with ->seeded == DRBG_SEED_STATE_PARTIAL by
the time rng_is_initialized() has flipped to true and
- prune everything related to the former, random_ready_callback based
mechanism.
As drbg_seed_from_random() is now getting invoked from drbg_generate() with
the ->drbg_mutex being held, it must not attempt to recursively grab it
once again. Remove the corresponding mutex operations from what is now
drbg_seed_from_random(). Furthermore, as drbg_seed_from_random() can now
report errors directly to its caller, there's no need for it to temporarily
switch the DRBG's ->seeded state to DRBG_SEED_STATE_UNSEEDED so that a
failure of the subsequently invoked __drbg_seed() will get signaled to
drbg_generate(). Don't do it then.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since commit 42ea507fae ("crypto: drbg - reseed often if seedsource is
degraded"), the maximum seed lifetime represented by ->reseed_threshold
gets temporarily lowered if the get_random_bytes() source cannot provide
sufficient entropy yet, as is common during boot, and restored back to
the original value again once that has changed.
More specifically, if the add_random_ready_callback() invoked from
drbg_prepare_hrng() in the course of DRBG instantiation does not return
-EALREADY, that is, if get_random_bytes() has not been fully initialized
at this point yet, drbg_prepare_hrng() will lower ->reseed_threshold
to a value of 50. The drbg_async_seed() scheduled from said
random_ready_callback will eventually restore the original value.
A future patch will replace the random_ready_callback based notification
mechanism and thus, there will be no add_random_ready_callback() return
value anymore which could get compared to -EALREADY.
However, there's __drbg_seed() which gets invoked in the course of both,
the DRBG instantiation as well as the eventual reseeding from
get_random_bytes() in aforementioned drbg_async_seed(), if any. Moreover,
it knows about the get_random_bytes() initialization state by the time the
seed data had been obtained from it: the new_seed_state argument introduced
with the previous patch would get set to DRBG_SEED_STATE_PARTIAL in case
get_random_bytes() had not been fully initialized yet and to
DRBG_SEED_STATE_FULL otherwise. Thus, __drbg_seed() provides a convenient
alternative for managing that ->reseed_threshold lowering and restoring at
a central place.
Move all ->reseed_threshold adjustment code from drbg_prepare_hrng() and
drbg_async_seed() respectively to __drbg_seed(). Make __drbg_seed()
lower the ->reseed_threshold to 50 in case its new_seed_state argument
equals DRBG_SEED_STATE_PARTIAL and let it restore the original value
otherwise.
There is no change in behaviour.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the DRBG implementation schedules asynchronous works from
random_ready_callbacks for reseeding the DRBG instances with output from
get_random_bytes() once the latter has sufficient entropy available.
However, as the get_random_bytes() initialization state can get queried by
means of rng_is_initialized() now, there is no real need for this
asynchronous reseeding logic anymore and it's better to keep things simple
by doing it synchronously when needed instead, i.e. from drbg_generate()
once rng_is_initialized() has flipped to true.
Of course, for this to work, drbg_generate() would need some means by which
it can tell whether or not rng_is_initialized() has flipped to true since
the last seeding from get_random_bytes(). Or equivalently, whether or not
the last seed from get_random_bytes() has happened when
rng_is_initialized() was still evaluating to false.
As it currently stands, enum drbg_seed_state allows for the representation
of two different DRBG seeding states: DRBG_SEED_STATE_UNSEEDED and
DRBG_SEED_STATE_FULL. The former makes drbg_generate() to invoke a full
reseeding operation involving both, the rather expensive jitterentropy as
well as the get_random_bytes() randomness sources. The DRBG_SEED_STATE_FULL
state on the other hand implies that no reseeding at all is required for a
!->pr DRBG variant.
Introduce the new DRBG_SEED_STATE_PARTIAL state to enum drbg_seed_state for
representing the condition that a DRBG was being seeded when
rng_is_initialized() had still been false. In particular, this new state
implies that
- the given DRBG instance has been fully seeded from the jitterentropy
source (if enabled)
- and drbg_generate() is supposed to reseed from get_random_bytes()
*only* once rng_is_initialized() turns to true.
Up to now, the __drbg_seed() helper used to set the given DRBG instance's
->seeded state to constant DRBG_SEED_STATE_FULL. Introduce a new argument
allowing for the specification of the to be written ->seeded value instead.
Make the first of its two callers, drbg_seed(), determine the appropriate
value based on rng_is_initialized(). The remaining caller,
drbg_async_seed(), is known to get invoked only once rng_is_initialized()
is true, hence let it pass constant DRBG_SEED_STATE_FULL for the new
argument to __drbg_seed().
There is no change in behaviour, except for that the pr_devel() in
drbg_generate() would now report "unseeded" for ->pr DRBG instances which
had last been seeded when rng_is_initialized() was still evaluating to
false.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are two different randomness sources the DRBGs are getting seeded
from, namely the jitterentropy source (if enabled) and get_random_bytes().
At initial DRBG seeding time during boot, the latter might not have
collected sufficient entropy for seeding itself yet and thus, the DRBG
implementation schedules a reseed work from a random_ready_callback once
that has happened. This is particularly important for the !->pr DRBG
instances, for which (almost) no further reseeds are getting triggered
during their lifetime.
Because collecting data from the jitterentropy source is a rather expensive
operation, the aforementioned asynchronously scheduled reseed work
restricts itself to get_random_bytes() only. That is, it in some sense
amends the initial DRBG seed derived from jitterentropy output at full
(estimated) entropy with fresh randomness obtained from get_random_bytes()
once that has been seeded with sufficient entropy itself.
With the advent of rng_is_initialized(), there is no real need for doing
the reseed operation from an asynchronously scheduled work anymore and a
subsequent patch will make it synchronous by moving it next to related
logic already present in drbg_generate().
However, for tracking whether a full reseed including the jitterentropy
source is required or a "partial" reseed involving only get_random_bytes()
would be sufficient already, the boolean struct drbg_state's ->seeded
member must become a tristate value.
Prepare for this by introducing the new enum drbg_seed_state and change
struct drbg_state's ->seeded member's type from bool to that type.
For facilitating review, enum drbg_seed_state is made to only contain
two members corresponding to the former ->seeded values of false and true
resp. at this point: DRBG_SEED_STATE_UNSEEDED and DRBG_SEED_STATE_FULL. A
third one for tracking the intermediate state of "seeded from jitterentropy
only" will be introduced with a subsequent patch.
There is no change in behaviour at this point.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stephan Müller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to the BER encoding rules, integer value should be encoded
as two's complement, and if the highest bit of a positive integer
is 1, should add a leading zero-octet.
The kernel's built-in RSA algorithm cannot recognize negative numbers
when parsing keys, so it can pass this test case.
Export the key to file and run the following command to verify the
fix result:
openssl asn1parse -inform DER -in /path/to/key/file
Signed-off-by: Lei He <helei.sig11@bytedance.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This PR includes 5 commits that update the zstd library version:
1. Adds a new kernel-style wrapper around zstd. This wrapper API
is functionally equivalent to the subset of the current zstd API that is
currently used. The wrapper API changes to be kernel style so that the symbols
don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same
API and preserves the semantics, so that none of the callers need to be
updated. All callers are updated in the commit, because there are zero
functional changes.
2. Adds an indirection for `lib/decompress_unzstd.c` so it
doesn't depend on the layout of `lib/zstd/` to include every source file.
This allows the next patch to be automatically generated.
3. Imports the zstd-1.4.10 source code. This commit is automatically generated
from upstream zstd (https://github.com/facebook/zstd).
4. Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
5. Fixes a newly added build warning for clang.
The discussion around this patchset has been pretty long, so I've included a
FAQ-style summary of the history of the patchset, and why we are taking this
approach.
Why do we need to update?
-------------------------
The zstd version in the kernel is based off of zstd-1.3.1, which is was released
August 20, 2017. Since then zstd has seen many bug fixes and performance
improvements. And, importantly, upstream zstd is continuously fuzzed by OSS-Fuzz,
and bug fixes aren't backported to older versions. So the only way to sanely get
these fixes is to keep up to date with upstream zstd. There are no known security
issues that affect the kernel, but we need to be able to update in case there
are. And while there are no known security issues, there are relevant bug fixes.
For example the problem with large kernel decompression has been fixed upstream
for over 2 years https://lkml.org/lkml/2020/9/29/27.
Additionally the performance improvements for kernel use cases are significant.
Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
- BtrFS zstd compression at levels 1 and 3 is 5% faster
- BtrFS zstd decompression+read is 15% faster
- SquashFS zstd decompression+read is 15% faster
- F2FS zstd compression+write at level 3 is 8% faster
- F2FS zstd decompression+read is 20% faster
- ZRAM decompression+read is 30% faster
- Kernel zstd decompression is 35% faster
- Initramfs zstd decompression+build is 5% faster
On top of this, there are significant performance improvements coming down the
line in the next zstd release, and the new automated update patch generation
will allow us to pull them easily.
How is the update patch generated?
----------------------------------
The first two patches are preparation for updating the zstd version. Then the
3rd patch in the series imports upstream zstd into the kernel. This patch is
automatically generated from upstream. A script makes the necessary changes and
imports it into the kernel. The changes are:
- Replace all libc dependencies with kernel replacements and rewrite includes.
- Remove unncessary portability macros like: #if defined(_MSC_VER).
- Use the kernel xxhash instead of bundling it.
This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.
The automated process makes it easy to keep the kernel version of zstd up to
date. The current zstd in the kernel shares the guts of the code, but has a lot
of API and minor changes to work in the kernel. This is because at the time
upstream zstd was not ready to be used in the kernel envrionment as-is. But,
since then upstream zstd has evolved to support being used in the kernel as-is.
Why are we updating in one big patch?
-------------------------------------
The 3rd patch in the series is very large. This is because it is restructuring
the code, so it both deletes the existing zstd, and re-adds the new structure.
Future updates will be directly proportional to the changes in upstream zstd
since the last import. They will admittidly be large, as zstd is an actively
developed project, and has hundreds of commits between every release. However,
there is no other great alternative.
One option ruled out is to replay every upstream zstd commit. This is not feasible
for several reasons:
- There are over 3500 upstream commits since the zstd version in the kernel.
- The automation to automatically generate the kernel update was only added recently,
so older commits cannot easily be imported.
- Not every upstream zstd commit builds.
- Only zstd releases are "supported", and individual commits may have bugs that were
fixed before a release.
Another option to reduce the patch size would be to first reorganize to the new
file structure, and then apply the patch. However, the current kernel zstd is formatted
with clang-format to be more "kernel-like". But, the new method imports zstd as-is,
without additional formatting, to allow for closer correlation with upstream, and
easier debugging. So the patch wouldn't be any smaller.
It also doesn't make sense to import upstream zstd commit by commit going
forward. Upstream zstd doesn't support production use cases running of the
development branch. We have a lot of post-commit fuzzing that catches many bugs,
so indiviudal commits may be buggy, but fixed before a release. So going forward,
I intend to import every (important) zstd release into the Kernel.
So, while it isn't ideal, updating in one big patch is the only patch I see forward.
Who is responsible for this code?
---------------------------------
I am. This patchset adds me as the maintainer for zstd. Previously, there was no tree
for zstd patches. Because of that, there were several patches that either got ignored,
or took a long time to merge, since it wasn't clear which tree should pick them up.
I'm officially stepping up as maintainer, and setting up my tree as the path through
which zstd patches get merged. I'll make sure that patches to the kernel zstd get
ported upstream, so they aren't erased when the next version update happens.
How is this code tested?
------------------------
I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS, Kernel,
InitRAMFS). I also tested Kernel & InitRAMFS on i386 and aarch64. I checked both
performance and correctness.
Also, thanks to many people in the community who have tested these patches locally.
If you have tested the patches, please reply with a Tested-By so I can collect them
for the PR I will send to Linus.
Lastly, this code will bake in linux-next before being merged into v5.16.
Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
------------------------------------------------------------
This patchset has been outstanding since 2020, and zstd-1.4.10 was the latest
release when it was created. Since the update patch is automatically generated
from upstream, I could generate it from zstd-1.5.0. However, there were some
large stack usage regressions in zstd-1.5.0, and are only fixed in the latest
development branch. And the latest development branch contains some new code that
needs to bake in the fuzzer before I would feel comfortable releasing to the
kernel.
Once this patchset has been merged, and we've released zstd-1.5.1, we can update
the kernel to zstd-1.5.1, and exercise the update process.
You may notice that zstd-1.4.10 doesn't exist upstream. This release is an
artifical release based off of zstd-1.4.9, with some fixes for the kernel
backported from the development branch. I will tag the zstd-1.4.10 release after
this patchset is merged, so the Linux Kernel is running a known version of zstd
that can be debugged upstream.
Why was a wrapper API added?
----------------------------
The first versions of this patchset migrated the kernel to the upstream zstd
API. It first added a shim API that supported the new upstream API with the old
code, then updated callers to use the new shim API, then transitioned to the
new code and deleted the shim API. However, Cristoph Hellwig suggested that we
transition to a kernel style API, and hide zstd's upstream API behind that.
This is because zstd's upstream API is supports many other use cases, and does
not follow the kernel style guide, while the kernel API is focused on the
kernel's use cases, and follows the kernel style guide.
Where is the previous discussion?
---------------------------------
Links for the discussions of the previous versions of the patch set.
The largest changes in the design of the patchset are driven by the discussions
in V11, V5, and V1. Sorry for the mix of links, I couldn't find most of the the
threads on lkml.org.
V12: https://www.spinics.net/lists/linux-crypto/msg58189.html
V11: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/
V10: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/
V9: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/
V8: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/
V7: https://lkml.org/lkml/2020/12/3/1195
V6: https://lkml.org/lkml/2020/12/2/1245
V5: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
V4: https://www.spinics.net/lists/linux-btrfs/msg105783.html
V3: https://lkml.org/lkml/2020/9/23/1074
V2: https://www.spinics.net/lists/linux-btrfs/msg105505.html
V1: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGJyKIACgkQuzRpqaNE
qPXnmw/+PKyCn6LvRQqNfdpF5f59j/B1Fab15tkpVyz3UWnCw+EKaPZOoTfIsjRf
7TMUVm4iGsm+6xBO/YrGdRl4IxocNgXzsgnJ1lTGDbvfRC1tG+YNwuv+EEXwKYq5
Yz3DRwDotgsrV0Kg05b+VIgkmAuY3ukmu2n09LnAdKkxoIgmHw3MIDCdVZW2Br4c
sjJmYI+fiJd7nAlbDa42VOrdTiLzkl/2BsjWBqTv6zbiQ5uuJGsKb7P3kpcybWzD
5C118pyE3qlVyvFz+UFu8WbN0NSf47DP22KV/3IrhNX7CVQxYBe+9/oVuPWTgRx0
4Vl0G6u7rzh4wDZuGqTC3LYWwH9GfycI0fnVC0URP2XMOcGfPlGd3L0PEmmAeTmR
fEbaGAN4dr0jNO3lmbyAGe/G8tvtXQx/4ZjS9Pa3TlQP24GARU/f78/blbKR87Vz
BGMndmSi92AscgXb9buO3bCwAY1YtH5WiFaZT1XVk42cj4MiOLvPTvP4UMzDDxcZ
56ahmAP/84kd6H+cv9LmgEMqcIBmxdUcO1nuAItJ4wdrMUgw3+lrbxwFkH9xPV7I
okC1K0TIVEobADbxbdMylxClAylbuW+37Pko97NmAlnzNCPNE38f3s3gtXRrUTaR
IP8jv5UQ7q3dFiWnNLLodx5KM6s32GVBKRLRnn/6SJB7QzlyHXU=
=Xb18
-----END PGP SIGNATURE-----
Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux
Pull zstd update from Nick Terrell:
"Update to zstd-1.4.10.
Add myself as the maintainer of zstd and update the zstd version in
the kernel, which is now 4 years out of date, to a much more recent
zstd release. This includes bug fixes, much more extensive fuzzing,
and performance improvements. And generates the kernel zstd
automatically from upstream zstd, so it is easier to keep the zstd
verison up to date, and we don't fall so far out of date again.
This includes 5 commits that update the zstd library version:
- Adds a new kernel-style wrapper around zstd.
This wrapper API is functionally equivalent to the subset of the
current zstd API that is currently used. The wrapper API changes to
be kernel style so that the symbols don't collide with zstd's
symbols. The update to zstd-1.4.10 maintains the same API and
preserves the semantics, so that none of the callers need to be
updated. All callers are updated in the commit, because there are
zero functional changes.
- Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
depend on the layout of `lib/zstd/` to include every source file.
This allows the next patch to be automatically generated.
- Imports the zstd-1.4.10 source code. This commit is automatically
generated from upstream zstd (https://github.com/facebook/zstd).
- Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
- Fixes a newly added build warning for clang.
The discussion around this patchset has been pretty long, so I've
included a FAQ-style summary of the history of the patchset, and why
we are taking this approach.
Why do we need to update?
-------------------------
The zstd version in the kernel is based off of zstd-1.3.1, which is
was released August 20, 2017. Since then zstd has seen many bug fixes
and performance improvements. And, importantly, upstream zstd is
continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
older versions. So the only way to sanely get these fixes is to keep
up to date with upstream zstd.
There are no known security issues that affect the kernel, but we need
to be able to update in case there are. And while there are no known
security issues, there are relevant bug fixes. For example the problem
with large kernel decompression has been fixed upstream for over 2
years [1]
Additionally the performance improvements for kernel use cases are
significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
- BtrFS zstd compression at levels 1 and 3 is 5% faster
- BtrFS zstd decompression+read is 15% faster
- SquashFS zstd decompression+read is 15% faster
- F2FS zstd compression+write at level 3 is 8% faster
- F2FS zstd decompression+read is 20% faster
- ZRAM decompression+read is 30% faster
- Kernel zstd decompression is 35% faster
- Initramfs zstd decompression+build is 5% faster
On top of this, there are significant performance improvements coming
down the line in the next zstd release, and the new automated update
patch generation will allow us to pull them easily.
How is the update patch generated?
----------------------------------
The first two patches are preparation for updating the zstd version.
Then the 3rd patch in the series imports upstream zstd into the
kernel. This patch is automatically generated from upstream. A script
makes the necessary changes and imports it into the kernel. The
changes are:
- Replace all libc dependencies with kernel replacements and rewrite
includes.
- Remove unncessary portability macros like: #if defined(_MSC_VER).
- Use the kernel xxhash instead of bundling it.
This automation gets tested every commit by upstream's continuous
integration. When we cut a new zstd release, we will submit a patch to
the kernel to update the zstd version in the kernel.
The automated process makes it easy to keep the kernel version of zstd
up to date. The current zstd in the kernel shares the guts of the
code, but has a lot of API and minor changes to work in the kernel.
This is because at the time upstream zstd was not ready to be used in
the kernel envrionment as-is. But, since then upstream zstd has
evolved to support being used in the kernel as-is.
Why are we updating in one big patch?
-------------------------------------
The 3rd patch in the series is very large. This is because it is
restructuring the code, so it both deletes the existing zstd, and
re-adds the new structure. Future updates will be directly
proportional to the changes in upstream zstd since the last import.
They will admittidly be large, as zstd is an actively developed
project, and has hundreds of commits between every release. However,
there is no other great alternative.
One option ruled out is to replay every upstream zstd commit. This is
not feasible for several reasons:
- There are over 3500 upstream commits since the zstd version in the
kernel.
- The automation to automatically generate the kernel update was only
added recently, so older commits cannot easily be imported.
- Not every upstream zstd commit builds.
- Only zstd releases are "supported", and individual commits may have
bugs that were fixed before a release.
Another option to reduce the patch size would be to first reorganize
to the new file structure, and then apply the patch. However, the
current kernel zstd is formatted with clang-format to be more
"kernel-like". But, the new method imports zstd as-is, without
additional formatting, to allow for closer correlation with upstream,
and easier debugging. So the patch wouldn't be any smaller.
It also doesn't make sense to import upstream zstd commit by commit
going forward. Upstream zstd doesn't support production use cases
running of the development branch. We have a lot of post-commit
fuzzing that catches many bugs, so indiviudal commits may be buggy,
but fixed before a release. So going forward, I intend to import every
(important) zstd release into the Kernel.
So, while it isn't ideal, updating in one big patch is the only patch
I see forward.
Who is responsible for this code?
---------------------------------
I am. This patchset adds me as the maintainer for zstd. Previously,
there was no tree for zstd patches. Because of that, there were
several patches that either got ignored, or took a long time to merge,
since it wasn't clear which tree should pick them up. I'm officially
stepping up as maintainer, and setting up my tree as the path through
which zstd patches get merged. I'll make sure that patches to the
kernel zstd get ported upstream, so they aren't erased when the next
version update happens.
How is this code tested?
------------------------
I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
aarch64. I checked both performance and correctness.
Also, thanks to many people in the community who have tested these
patches locally.
Lastly, this code will bake in linux-next before being merged into
v5.16.
Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
------------------------------------------------------------
This patchset has been outstanding since 2020, and zstd-1.4.10 was the
latest release when it was created. Since the update patch is
automatically generated from upstream, I could generate it from
zstd-1.5.0.
However, there were some large stack usage regressions in zstd-1.5.0,
and are only fixed in the latest development branch. And the latest
development branch contains some new code that needs to bake in the
fuzzer before I would feel comfortable releasing to the kernel.
Once this patchset has been merged, and we've released zstd-1.5.1, we
can update the kernel to zstd-1.5.1, and exercise the update process.
You may notice that zstd-1.4.10 doesn't exist upstream. This release
is an artifical release based off of zstd-1.4.9, with some fixes for
the kernel backported from the development branch. I will tag the
zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
is running a known version of zstd that can be debugged upstream.
Why was a wrapper API added?
----------------------------
The first versions of this patchset migrated the kernel to the
upstream zstd API. It first added a shim API that supported the new
upstream API with the old code, then updated callers to use the new
shim API, then transitioned to the new code and deleted the shim API.
However, Cristoph Hellwig suggested that we transition to a kernel
style API, and hide zstd's upstream API behind that. This is because
zstd's upstream API is supports many other use cases, and does not
follow the kernel style guide, while the kernel API is focused on the
kernel's use cases, and follows the kernel style guide.
Where is the previous discussion?
---------------------------------
Links for the discussions of the previous versions of the patch set
below. The largest changes in the design of the patchset are driven by
the discussions in v11, v5, and v1. Sorry for the mix of links, I
couldn't find most of the the threads on lkml.org"
Link: https://lkml.org/lkml/2020/9/29/27 [1]
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8]
Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1]
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested By: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
MAINTAINERS: Add maintainer entry for zstd
lib: zstd: Upgrade to latest upstream zstd version 1.4.10
lib: zstd: Add decompress_sources.h for decompress_unzstd
lib: zstd: Add kernel-specific API