Linux kernel source tree
Go to file
Eric Biggers 1862eb0073 crypto: arm/blake2b - add NEON-accelerated BLAKE2b
Add a NEON-accelerated implementation of BLAKE2b.

On Cortex-A7 (which these days is the most common ARM processor that
doesn't have the ARMv8 Crypto Extensions), this is over twice as fast as
SHA-256, and slightly faster than SHA-1.  It is also almost three times
as fast as the generic implementation of BLAKE2b:

	Algorithm            Cycles per byte (on 4096-byte messages)
	===================  =======================================
	blake2b-256-neon     14.0
	sha1-neon            16.3
	blake2s-256-arm      18.8
	sha1-asm             20.8
	blake2s-256-generic  26.0
	sha256-neon	     28.9
	sha256-asm	     32.0
	blake2b-256-generic  38.9

This implementation isn't directly based on any other implementation,
but it borrows some ideas from previous NEON code I've written as well
as from chacha-neon-core.S.  At least on Cortex-A7, it is faster than
the other NEON implementations of BLAKE2b I'm aware of (the
implementation in the BLAKE2 official repository using intrinsics, and
Andrew Moon's implementation which can be found in SUPERCOP).  It does
only one block at a time, so it performs well on short messages too.

NEON-accelerated BLAKE2b is useful because there is interest in using
BLAKE2b-256 for dm-verity on low-end Android devices (specifically,
devices that lack the ARMv8 Crypto Extensions) to replace SHA-1.  On
these devices, the performance cost of upgrading to SHA-256 may be
unacceptable, whereas BLAKE2b-256 would actually improve performance.

Although BLAKE2b is intended for 64-bit platforms (unlike BLAKE2s which
is intended for 32-bit platforms), on 32-bit ARM processors with NEON,
BLAKE2b is actually faster than BLAKE2s.  This is because NEON supports
64-bit operations, and because BLAKE2s's block size is too small for
NEON to be helpful for it.  The best I've been able to do with BLAKE2s
on Cortex-A7 is 18.8 cpb with an optimized scalar implementation.

(I didn't try BLAKE2sp and BLAKE3, which in theory would be faster, but
they're more complex as they require running multiple hashes at once.
Note that BLAKE2b already uses all the NEON bandwidth on the Cortex-A7,
so I expect that any speedup from BLAKE2sp or BLAKE3 would come only
from the smaller number of rounds, not from the extra parallelism.)

For now this BLAKE2b implementation is only wired up to the shash API,
since there is no library API for BLAKE2b yet.  However, I've tried to
keep things consistent with BLAKE2s, e.g. by defining
blake2b_compress_arch() which is analogous to blake2s_compress_arch()
and could be exported for use by the library API later if needed.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-01-03 08:41:39 +11:00
arch crypto: arm/blake2b - add NEON-accelerated BLAKE2b 2021-01-03 08:41:39 +11:00
block block: update some copyrights 2020-12-22 08:43:06 -07:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto crypto: blake2b - update file comment 2021-01-03 08:41:39 +11:00
Documentation dt-bindings: crypto: Add Keem Bay OCS HCU bindings 2021-01-03 08:41:36 +11:00
drivers wireguard: Kconfig: select CRYPTO_BLAKE2S_ARM 2021-01-03 08:41:39 +11:00
fs proc mountinfo: make splice available again 2020-12-27 12:00:36 -08:00
include crypto: blake2b - sync with blake2s implementation 2021-01-03 08:41:39 +11:00
init kasan, arm64: only use kasan_depth for software modes 2020-12-22 12:55:07 -08:00
ipc Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
kernel Misc fixes/updates: 2020-12-27 09:06:10 -08:00
lib crypto: blake2s - move update and final logic to internal/blake2s.h 2021-01-03 08:41:38 +11:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm virtio,vdpa: features, cleanups, fixes 2020-12-24 12:06:46 -08:00
net 9p for 5.11-rc1 2020-12-21 10:28:02 -08:00
samples ARM: SoC drivers for v5.11 2020-12-16 16:38:41 -08:00
scripts Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux 2020-12-25 11:05:32 -08:00
security Provide a fix for the incorrect handling of privilege 2020-12-24 14:08:43 -08:00
sound sound fixes for 5.11-rc1 2020-12-23 15:11:08 -08:00
tools Fix a segfault that occurs when built with Clang. 2020-12-27 09:08:23 -08:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt ARM: 2020-12-20 10:44:05 -08:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap MAINTAINERS: crypto: s5p-sss: drop Kamil Konieczny 2021-01-03 08:41:34 +11:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-11 22:29:38 -08:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: Add maintainers for Keem Bay OCS HCU driver 2021-01-03 08:41:37 +11:00
Makefile Linux 5.11-rc1 2020-12-27 15:30:22 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.