9017 Commits

Author SHA1 Message Date
Mark Rutland
bd7d95cafb arm64: KVM: Consistently advance singlestep when emulating instructions
When we emulate a guest instruction, we don't advance the hardware
singlestep state machine, and thus the guest will receive a software
step exception after the next instruction which is not emulated by the
host.

We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
whether userspace requested a single step, and fake a debug exception
from within the kernel. Other times, we advance the HW singlestep state
and rely on the HW to generate the exception for us. Thus, the observed step
behaviour differs for host and guest.

Let's make this simpler and consistent by always advancing the HW
singlestep state machine when we skip an instruction. Thus we can rely
on the hardware to generate the singlestep exception for us, and never
need to explicitly check for an active-pending step, nor do we need to
fake a debug exception from the guest.

Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18 14:11:37 +00:00
Logan Gunthorpe
d1402fc708 mm: introduce common STRUCT_PAGE_MAX_SHIFT define
This define is used by arm64 to calculate the size of the vmemmap
region.  It is defined as the log2 of the upper bound on the size of a
struct page.
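
As a rough illustration of the definition above, the sketch below computes
a log2 upper bound for a structure size and uses it to size a vmemmap-style
region. The helper names, the structure sizes and the 47-bit / 4 KiB
parameters are assumptions chosen for the example, not values taken from
the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest shift such that (1 << shift) >= size; hypothetical helper. */
static unsigned int sketch_max_shift(size_t size)
{
	unsigned int shift = 0;

	while (((size_t)1 << shift) < size)
		shift++;
	return shift;
}

/*
 * Bytes needed for one descriptor per page when each descriptor is
 * rounded up to (1 << max_shift) bytes. All parameters are assumptions.
 */
static uint64_t sketch_vmemmap_bytes(unsigned int linear_bits,
				     unsigned int page_shift,
				     unsigned int max_shift)
{
	uint64_t pages = UINT64_C(1) << (linear_bits - page_shift);

	return pages << max_shift;
}
```

Rounding the size up to a power of two means the region size can be
computed with shifts alone, which is why a shift rather than a byte count
is the convenient quantity to share between architectures.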

We move it into mm_types.h so it can be defined properly instead of set
and checked with a build bug.  This also allows us to use the same
define for riscv.

Link: http://lkml.kernel.org/r/20181107205433.3875-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-14 15:05:45 -08:00
Linus Torvalds
eb6cf9f8cb - Invalidate the caches before clearing the DMA buffer via the
non-cacheable alias in the FORCE_CONTIGUOUS case

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Invalidate the caches before clearing the DMA buffer via the
  non-cacheable alias in the FORCE_CONTIGUOUS case"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
2018-12-14 09:36:41 -08:00
Miles Chen
12f799c8c7 arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset()
When debugging with KASLR, it is sometimes necessary to have PHYS_OFFSET to
perform linear virtual address to physical address translation.
Sometimes we are debugging with only limited information, such as a kernel
log and a symbol file, so print PHYS_OFFSET in dump_kernel_offset() for that
case.

Tested by:
echo c > /proc/sysrq-trigger
[   11.996161] SMP: stopping secondary CPUs
[   11.996732] Kernel Offset: 0x2522200000 from 0xffffff8008000000
[   11.996881] PHYS_OFFSET: 0xffffffeb40000000

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-14 09:33:49 +00:00
Christoph Hellwig
356da6d0cd dma-mapping: bypass indirect calls for dma-direct
Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.
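
The idea can be sketched in plain C as follows. This is a simplified
user-space model, not the kernel code: the types, the function names and
the convention that a NULL ops pointer means "direct mapping" are all
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_dma_addr_t;

/* Hypothetical ops table, standing in for a per-device set of callbacks. */
struct sketch_map_ops {
	sketch_dma_addr_t (*map)(uint64_t phys);
};

/* Direct mapping: the device address is simply the physical address. */
static sketch_dma_addr_t sketch_direct_map(uint64_t phys)
{
	return phys;
}

/* A made-up non-direct implementation that applies a fixed offset. */
static sketch_dma_addr_t sketch_translated_map(uint64_t phys)
{
	return phys | (UINT64_C(1) << 40);
}

/*
 * Fast path: when no ops table is installed (assumed here to mean the
 * direct mapping is in use), call the direct function by name so the
 * compiler emits a direct call instead of an indirect one.
 */
static sketch_dma_addr_t sketch_map(const struct sketch_map_ops *ops,
				    uint64_t phys)
{
	if (ops == NULL)
		return sketch_direct_map(phys);
	return ops->map(phys);
}
```

The branch costs one predictable comparison, which is typically cheaper
than an indirect call on systems where indirect branches are expensive.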

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:18 +01:00
Christoph Hellwig
55897af630 dma-direct: merge swiotlb_dma_ops into the dma_direct code
While the dma-direct code is (relatively) clean and simple we actually
have to use the swiotlb ops for the mapping on many architectures due
to devices with addressing limits.  Instead of keeping two
implementations around this commit allows the dma-direct
implementation to call the swiotlb bounce buffering functions and
thus share the guts of the mapping implementation.  This also
simplifies the dma-mapping setup on a few architectures where we
don't have to differentiate which implementation to use.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:17 +01:00
Robin Murphy
90ac706e98 dma-mapping: factor out dummy DMA ops
The dummy DMA ops are currently used by arm64 for any device which has
an invalid ACPI description and is thus barred from using DMA due to not
knowing whether it is cache-coherent or not. Factor these out into
general dma-mapping code so that they can be referenced from other
common code paths. In the process, we can prune all the optional
callbacks which just do the same thing as the default behaviour, and
fill in .map_resource for completeness.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: moved to a separate source file]
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13 21:06:12 +01:00
Christoph Hellwig
3731c3d477 dma-mapping: always build the direct mapping code
All architectures except for sparc64 use the dma-direct code in some
form, and even for sparc64 we had the discussion of a direct mapping
mode a while ago.  In preparation for directly calling the direct
mapping code, don't bother making it optional but always build the
code in.  This is a minor hardship for some powerpc and arm configs
that don't pull it in yet (although they should in a release or two),
and sparc64 which currently doesn't need it at all, but it will
reduce the ifdef mess we'd otherwise need significantly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:11 +01:00
Rob Herring
acc2038738 Merge branch 'yaml-bindings-for-v4.21' into dt/next 2018-12-13 11:20:36 -06:00
Will Deacon
97bebc5fac arm64: sysreg: Use _BITUL() when defining register bits
Using shifts directly is error-prone and can cause inadvertent sign
extensions or build problems with older versions of binutils.

Consistent use of the _BITUL() macro makes these problems disappear.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:47 +00:00
Will Deacon
1e013d0612 arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches
Open-coding the pointer-auth HWCAPs is a mess and can be avoided by
reusing the multi-cap logic from the CPU errata framework.

Move the multi_entry_cap_matches code to cpufeature.h and reuse it for
the pointer auth HWCAPs.

Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:47 +00:00
Will Deacon
a56005d321 arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4
We can easily avoid defining the two meta-capabilities for the address
and generic keys, so remove them and instead just check both of the
architected and impdef capabilities when determining the level of system
support.

Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:47 +00:00
Will Deacon
84931327a8 arm64: ptr auth: Move per-thread keys from thread_info to thread_struct
We don't need to get at the per-thread keys from assembly at all, so
they can live alongside the rest of the per-thread register state in
thread_struct instead of thread_info.

This will also allow straightforward whitelisting of the keys for
hardened usercopy should we expose them via a ptrace request later on.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:47 +00:00
Mark Rutland
04ca3204fa arm64: enable pointer authentication
Now that all the necessary bits are in place for userspace, add the
necessary Kconfig logic to allow this to be enabled.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Kristina Martsenko
ba83088565 arm64: add prctl control for resetting ptrauth keys
Add an arm64-specific prctl to allow a thread to reinitialize its
pointer authentication keys to random values. This can be useful when
exec() is not used for starting new processes, to ensure that different
processes still have different keys.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
ccc4381082 arm64: perf: strip PAC when unwinding userspace
When the kernel is unwinding userspace callchains, we can't expect that
the userspace consumer of these callchains has the data necessary to
strip the PAC from the stored LR.

This patch has the kernel strip the PAC from user stackframes when the
in-kernel unwinder is used. This only affects the LR value, and not the
FP.

This only affects the in-kernel unwinder. When userspace performs
unwinding, it is up to userspace to strip PACs as necessary (which can
be determined from DWARF information).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
ec6e822d1a arm64: expose user PAC bit positions via ptrace
When pointer authentication is in use, data/instruction pointers have a
number of PAC bits inserted into them. The number and position of these
bits depends on the configured TCR_ELx.TxSZ and whether tagging is
enabled. ARMv8.3 allows tagging to differ for instruction and data
pointers.

For userspace debuggers to unwind the stack and/or to follow pointer
chains, they need to be able to remove the PAC bits before attempting to
use a pointer.

This patch adds a new structure with masks describing the location of
the PAC bits in userspace instruction and data pointers (i.e. those
addressable via TTBR0), which userspace can query via PTRACE_GETREGSET.
By clearing these bits from pointers (and replacing them with the value
of bit 55), userspace can acquire the PAC-less versions.
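
A debugger-side helper for this could look roughly like the sketch below.
The function name and the example mask are assumptions; a real tool would
take the mask from the regset rather than hard-coding it.

```c
#include <stdint.h>

#define SKETCH_BIT55 (UINT64_C(1) << 55)

/*
 * Remove the PAC from a pointer given a mask of PAC bit positions.
 * Every masked bit is replaced with the value of bit 55, as the text
 * above describes.
 */
static uint64_t sketch_strip_pac(uint64_t ptr, uint64_t pac_mask)
{
	if (ptr & SKETCH_BIT55)
		return ptr | pac_mask;	/* bit 55 set: fill with ones */
	return ptr & ~pac_mask;		/* bit 55 clear: fill with zeroes */
}
```

With a hypothetical mask of 0x007f000000000000 (bits 54:48), the signed
pointer 0x00120000deadbeef strips to 0x00000000deadbeef.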

This new regset is exposed when the kernel is built with (user) pointer
authentication support, and the address authentication feature is
enabled. Otherwise, the regset is hidden.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: Fix to use vabits_user instead of VA_BITS and rename macro]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
7503197562 arm64: add basic pointer authentication support
This patch adds basic support for pointer authentication, allowing
userspace to make use of APIAKey, APIBKey, APDAKey, APDBKey, and
APGAKey. The kernel maintains key values for each process (shared by all
threads within), which are initialised to random values at exec() time.

The ID_AA64ISAR1_EL1.{APA,API,GPA,GPI} fields are exposed to userspace,
to describe that pointer authentication instructions are available and
that the kernel is managing the keys. Two new hwcaps are added for the
same reason: PACA (for address authentication) and PACG (for generic
authentication).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Tested-by: Adam Wallis <awallis@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: Fix sizeof() usage and unroll address key initialisation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
6984eb47d5 arm64/cpufeature: detect pointer authentication
So that we can dynamically handle the presence of pointer authentication
functionality, wire up probing code in cpufeature.c.

From ARMv8.3 onwards, ID_AA64ISAR1 is no longer entirely RES0, and now
has four fields describing the presence of pointer authentication
functionality:

* APA - address authentication present, using an architected algorithm
* API - address authentication present, using an IMP DEF algorithm
* GPA - generic authentication present, using an architected algorithm
* GPI - generic authentication present, using an IMP DEF algorithm

This patch checks for both address and generic authentication,
separately. It is assumed that if all CPUs support an IMP DEF algorithm,
the same algorithm is used across all CPUs.
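
The separate checks can be sketched as below. This is an illustration and
not the cpufeature.c code: the helper names are made up, and the field
offsets (APA at bit 4, API at bit 8, GPA at bit 24, GPI at bit 28, each
four bits wide) are stated from the architecture documentation as
understood here rather than from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field offsets within ID_AA64ISAR1; assumed, see the note above. */
#define SKETCH_ISAR1_APA_SHIFT	4
#define SKETCH_ISAR1_API_SHIFT	8
#define SKETCH_ISAR1_GPA_SHIFT	24
#define SKETCH_ISAR1_GPI_SHIFT	28

/* Extract one 4-bit ID register field. */
static unsigned int sketch_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Address authentication: architected or IMP DEF algorithm present. */
static bool sketch_has_address_auth(uint64_t isar1)
{
	return sketch_field(isar1, SKETCH_ISAR1_APA_SHIFT) != 0 ||
	       sketch_field(isar1, SKETCH_ISAR1_API_SHIFT) != 0;
}

/* Generic authentication: architected or IMP DEF algorithm present. */
static bool sketch_has_generic_auth(uint64_t isar1)
{
	return sketch_field(isar1, SKETCH_ISAR1_GPA_SHIFT) != 0 ||
	       sketch_field(isar1, SKETCH_ISAR1_GPI_SHIFT) != 0;
}
```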

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
b3669b1e1c arm64: Don't trap host pointer auth use to EL2
To allow EL0 (and/or EL1) to use pointer authentication functionality,
we must ensure that pointer authentication instructions and accesses to
pointer authentication keys are not trapped to EL2.

This patch ensures that HCR_EL2 is configured appropriately when the
kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK},
ensuring that EL1 can access keys and permit EL0 use of instructions.
For VHE kernels host EL0 (TGE && E2H) is unaffected by these settings,
and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't
bother setting them.

This does not enable support for KVM guests, since KVM manages HCR_EL2
itself when running VMs.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
a1ee8abb95 arm64/kvm: hide ptrauth from guests
In subsequent patches we're going to expose ptrauth to the host kernel
and userspace, but things are a bit trickier for guest kernels. For the
time being, let's hide ptrauth from KVM guests.

Regardless of how well-behaved the guest kernel is, guest userspace
could attempt to use ptrauth instructions, triggering a trap to EL2,
resulting in noise from kvm_handle_unknown_ec(). So let's write up a
handler for the PAC trap, which silently injects an UNDEF into the
guest, as if the feature were really missing.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:46 +00:00
Mark Rutland
4eaed6aa2c arm64/kvm: consistently handle host HCR_EL2 flags
In KVM we define the configuration of HCR_EL2 for a VHE HOST in
HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the
non-VHE host flags, and open-code HCR_RW. Further, in head.S we
open-code the flags for VHE and non-VHE configurations.

In future, we're going to want to configure more flags for the host, so
let's add a HCR_HOST_NVHE_FLAGS definition, and consistently use both
HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S.

We now use mov_q to generate the HCR_EL2 value, as we use when
configuring other registers in head.S.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:45 +00:00
Mark Rutland
aa6eece8ec arm64: add pointer authentication register bits
The ARMv8.3 pointer authentication extension adds:

* New fields in ID_AA64ISAR1 to report the presence of pointer
  authentication functionality.

* New control bits in SCTLR_ELx to enable this functionality.

* New system registers to hold the keys necessary for this
  functionality.

* A new ESR_ELx.EC code used when the new instructions are affected by
  configurable traps

This patch adds the relevant definitions to <asm/sysreg.h> and
<asm/esr.h> for these, to be used by subsequent patches.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:45 +00:00
Kristina Martsenko
1556065735 arm64: add comments about EC exception levels
To make it clear which exceptions can't be taken to EL1 or EL2, add
comments next to the ESR_ELx_EC_* macro definitions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 16:42:45 +00:00
Will Deacon
26a25c841d arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
Although the upper 32 bits of the PMEVTYPER<n>_EL0 registers are RES0,
we should treat the EXCLUDE_EL* bit definitions as unsigned so that we
avoid accidentally sign-extending the privilege filtering bit (bit 31)
into the upper half of the register.
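
The hazard can be reproduced in a few lines of standalone C. The macro
names below are made up for the example, and INT32_MIN stands in for a
bit-31 constant that has signed int type.

```c
#include <stdint.h>

/* Bit 31 held in a signed 32-bit constant (hypothetical name). */
#define SKETCH_EXCLUDE_SIGNED	INT32_MIN
/* Bit 31 defined as an unsigned 64-bit constant (hypothetical name). */
#define SKETCH_EXCLUDE_UNSIGNED	(UINT64_C(1) << 31)

/*
 * ORing a signed int into a 64-bit value converts the int first, so a
 * negative value is sign-extended and bits 63:32 all become set.
 */
static uint64_t sketch_or_signed(uint64_t evtype)
{
	return evtype | SKETCH_EXCLUDE_SIGNED;
}

/* The unsigned definition sets bit 31 only. */
static uint64_t sketch_or_unsigned(uint64_t evtype)
{
	return evtype | SKETCH_EXCLUDE_UNSIGNED;
}
```

Starting from zero, the signed variant yields 0xffffffff80000000 while the
unsigned variant yields 0x0000000080000000.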

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 15:34:44 +00:00
Will Deacon
2a355ec257 arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
While the CSV3 field of the ID_AA64_PFR0 CPU ID register can be checked
to see if a CPU is susceptible to Meltdown and therefore requires kpti
to be enabled, existing CPUs do not implement this field.

We therefore whitelist all unaffected Cortex-A CPUs that do not implement
the CSV3 field.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 14:14:21 +00:00
Laurent Pinchart
6f61a2c8f1 arm64: dts: renesas: draak: Fix CVBS input
A typo in the adv7180 DT node prevents successful probing of the VIN.
Fix it.

Fixes: 6a0942c20f5c ("arm64: dts: renesas: draak: Describe CVBS input")
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2018-12-13 13:47:30 +01:00
Ard Biesheuvel
2fe55987b2 crypto: arm64/chacha - use combined SIMD/ALU routine for more speed
To some degree, most known AArch64 micro-architectures appear to be
able to issue ALU instructions in parallel with SIMD instructions
without affecting the SIMD throughput. This means we can use the ALU
to process a fifth ChaCha block while the SIMD is processing four
blocks in parallel.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:55 +08:00
Ard Biesheuvel
f2ca1cbd0f crypto: arm64/chacha - optimize for arbitrary length inputs
Update the 4-way NEON ChaCha routine so it can handle input of any
length >64 bytes in its entirety, rather than having to call into
the 1-way routine and/or memcpy()s via temp buffers to handle the
tail of a ChaCha invocation that is not a multiple of 256 bytes.

On inputs that are a multiple of 256 bytes (and thus in tcrypt
benchmarks), performance drops by around 1% on Cortex-A57, while
performance for inputs drawn randomly from the range [64, 1024)
increases by around 30%.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:40 +08:00
Eric Biggers
19c11c97c3 crypto: arm64/chacha - add XChaCha12 support
Now that the ARM64 NEON implementation of ChaCha20 and XChaCha20 has
been refactored to support varying the number of rounds, add support for
XChaCha12.  This is identical to XChaCha20 except for the number of
rounds, which is 12 instead of 20.  This can be used by Adiantum.
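
For readers unfamiliar with the construction, a scalar C sketch of a
ChaCha permutation with a configurable round count is shown below. It is
not the NEON code touched by this patch, and the function names are
hypothetical; it only illustrates that the 12-round and 20-round variants
share everything except the loop bound.

```c
#include <stdint.h>

static uint32_t sketch_rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four words of the 16-word state. */
static void sketch_quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = sketch_rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = sketch_rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = sketch_rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = sketch_rotl32(x[b] ^ x[c], 7);
}

/* nrounds is expected to be even, for example 12 or 20. */
static void sketch_chacha_permute(uint32_t x[16], int nrounds)
{
	for (int i = 0; i < nrounds; i += 2) {
		/* Column round. */
		sketch_quarter_round(x, 0, 4, 8, 12);
		sketch_quarter_round(x, 1, 5, 9, 13);
		sketch_quarter_round(x, 2, 6, 10, 14);
		sketch_quarter_round(x, 3, 7, 11, 15);
		/* Diagonal round. */
		sketch_quarter_round(x, 0, 5, 10, 15);
		sketch_quarter_round(x, 1, 6, 11, 12);
		sketch_quarter_round(x, 2, 7, 8, 13);
		sketch_quarter_round(x, 3, 4, 9, 14);
	}
}
```

A caller would pass 12 or 20 as nrounds; the rest of the cipher (state
setup, adding the original state back, counter handling) is omitted here.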

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:37 +08:00
Eric Biggers
95a34b779e crypto: arm64/chacha20 - refactor to allow varying number of rounds
In preparation for adding XChaCha12 support, rename/refactor the ARM64
NEON implementation of ChaCha20 to support different numbers of rounds.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:36 +08:00
Eric Biggers
cc7cf991e9 crypto: arm64/chacha20 - add XChaCha20 support
Add an XChaCha20 implementation that is hooked up to the ARM64 NEON
implementation of ChaCha20.  This can be used by Adiantum.

A NEON implementation of single-block HChaCha20 is also added so that
XChaCha20 can use it rather than the generic implementation.  This
required refactoring the ChaCha20 permutation into its own function.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:36 +08:00
Eric Biggers
a00fa0c887 crypto: arm64/nhpoly1305 - add NEON-accelerated NHPoly1305
Add an ARM64 NEON implementation of NHPoly1305, an ε-almost-∆-universal
hash function used in the Adiantum encryption mode.  For now, only the
NH portion is actually NEON-accelerated; the Poly1305 part is less
performance-critical so is just implemented in C.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> # big-endian
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-12-13 18:24:35 +08:00
Olof Johansson
8e22bce990 rockpro64 regulator fixes that cause stability issues

Merge tag 'v4.20-rockchip-dts64fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

rockpro64 regulator fixes that cause stability issues

* tag 'v4.20-rockchip-dts64fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: fix rk3399-rockpro64 regulator gpios

Signed-off-by: Olof Johansson <olof@lixom.net>
2018-12-12 14:21:26 -08:00
Olof Johansson
50ba37008f Renesas ARM Based SoC Updates for v4.21

Merge tag 'renesas-soc-for-v4.21' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/soc

Renesas ARM Based SoC Updates for v4.21

* pm-rmobile driver
  - Move to drivers/soc/renesas/
  - Clean up struct rmobile_pm_domain
* Renesas SoC Kconfig Symbols
  - Move symbols for ARM and SoCs to drivers/soc/renesas/
  - Hide ARCH_RZN1 to improve consistency
* SH-Mobile AG5 (sh73a0) SoC: Remove obsolete inclusion of <asm/smp_twd.h>
* Restrict TWD and SCU to Renesas ARM based SoCs where they are present
* Enable GPIOLIB on Renesas arm64 based SoCs to allow GPIO driver selection

* tag 'renesas-soc-for-v4.21' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: R-Mobile: Move pm-rmobile to drivers/soc/renesas/
  ARM: shmobile: R-Mobile: Clean up struct rmobile_pm_domain
  ARM: shmobile: Move SoC Kconfig symbols to drivers/soc/renesas/
  arm64: renesas: Move SoC Kconfig symbols to drivers/soc/renesas/
  ARM: shmobile: Hide ARCH_RZN1 to improve consistency
  ARM: shmobile: sh73a0: Remove obsolete inclusion of <asm/smp_twd.h>
  ARM: shmobile: Restrict TWD support to SoCs that have it
  ARM: shmobile: Restrict SCU support to SoCs that have it
  arm64: renesas: Enable GPIOLIB to allow GPIO driver selection

Signed-off-by: Olof Johansson <olof@lixom.net>
2018-12-12 13:49:58 -08:00
Will Deacon
b47f515bdc Merge branch 'for-next/perf' into aarch64/for-next/core
Merge in arm64 perf and PMU driver updates, including support for the
system/uncore PMU in the ThunderX2 platform.
2018-12-12 19:00:25 +00:00
Ard Biesheuvel
0a1213fa74 arm64: enable per-task stack canaries
This enables the use of per-task stack canary values if GCC has
support for emitting the stack canary reference relative to the
value of sp_el0, which holds the task struct pointer in the arm64
kernel.

The $(eval) extends KBUILD_CFLAGS at the moment the make rule is
applied, which means asm-offsets.o (which we rely on for the offset
value) is built without the arguments, and everything built afterwards
has the options set.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 18:45:31 +00:00
Robin Murphy
4ab2150615 arm64: Add memory hotplug support
Wire up the basic support for hot-adding memory. Since memory hotplug
is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
cross-check the presence of a section in the manner of the generic
implementation, before falling back to memblock to check for no-map
regions within a present section as before. By having arch_add_memory()
create the linear mapping first, this then makes everything work in the
way that __add_section() expects.

We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
should be safe from races by virtue of the global device hotplug lock.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:43:43 +00:00
Will Deacon
6e4ede698d arm64: percpu: Fix LSE implementation of value-returning pcpu atomics
Commit 959bf2fd03b5 ("arm64: percpu: Rewrite per-cpu ops to allow use of
LSE atomics") introduced alternative code sequences for the arm64 percpu
atomics, so that the LSE instructions can be patched in at runtime if
they are supported by the CPU.

Unfortunately, when patching in the LSE sequence for a value-returning
pcpu atomic, the argument registers are the wrong way round. The
implementation of this_cpu_add_return() therefore ends up adding
uninitialised stack to the percpu variable and returning garbage.

As it turns out, there aren't very many users of the value-returning
percpu atomics in mainline and we only spotted this due to a failure in
the kprobes selftests. In this case, when attempting to single-step over
the out-of-line instruction slot, the debug monitors would not be
enabled because calling this_cpu_inc_return() on the kernel debug
monitor refcount would fail to detect the transition from 0. We would
consequently execute past the slot and take an undefined instruction
exception from the kernel, resulting in a BUG:

 | kernel BUG at arch/arm64/kernel/traps.c:421!
 | PREEMPT SMP
 | pc : do_undefinstr+0x268/0x278
 | lr : do_undefinstr+0x124/0x278
 | Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
 | Call trace:
 |  do_undefinstr+0x268/0x278
 |  el1_undef+0x10/0x78
 |  0xffff00000803c004
 |  init_kprobes+0x150/0x180
 |  do_one_initcall+0x74/0x178
 |  kernel_init_freeable+0x188/0x224
 |  kernel_init+0x10/0x100
 |  ret_from_fork+0x10/0x1c

Fix the argument order to get the value-returning pcpu atomics working
correctly when implemented using the LSE instructions.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:43:35 +00:00
Mark Rutland
c3296a1391 arm64: add <asm/asm-prototypes.h>
While we can export symbols from assembly files, CONFIG_MODVERSIONS requires C
declarations of anything that's exported.

Let's account for this as other architectures do by placing these declarations
in <asm/asm-prototypes.h>, which kbuild will automatically use to generate
modversion information for assembly files.

Since we already define most prototypes in existing headers, we simply need to
include those headers in <asm/asm-prototypes.h>, and don't need to duplicate
these.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:10:18 +00:00
Will Deacon
9b31cf493f arm64: mm: Introduce MAX_USER_VA_BITS definition
With the introduction of 52-bit virtual addressing for userspace, we are
now in a position where the virtual addressing capability of userspace
may exceed that of the kernel. Consequently, the VA_BITS definition
cannot be used blindly, since it reflects only the size of kernel
virtual addresses.

This patch introduces MAX_USER_VA_BITS which is either VA_BITS or 52
depending on whether 52-bit virtual addressing has been configured at
build time, removing a few places where the 52 is open-coded based on
explicit CONFIG_ guards.
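
A minimal sketch of what such a definition could look like, assuming a
48-bit kernel VA configuration with 52-bit user addressing enabled. The
values chosen and the max_user_va_bits() helper are illustrative
assumptions, not the code from this patch:

```c
/* Illustrative build-time configuration (assumed values, not from the patch). */
#define VA_BITS 48                      /* kernel virtual address bits */
#define CONFIG_ARM64_USER_VA_BITS_52 1  /* pretend 52-bit user VAs are enabled */

/* Either VA_BITS or 52, depending on the build-time configuration. */
#ifdef CONFIG_ARM64_USER_VA_BITS_52
#define MAX_USER_VA_BITS 52
#else
#define MAX_USER_VA_BITS VA_BITS
#endif

/* Hypothetical helper: widest user address the configuration allows. */
static inline int max_user_va_bits(void)
{
	return MAX_USER_VA_BITS;
}
```

Callers that size user address space would then refer to MAX_USER_VA_BITS
rather than repeating the CONFIG_ check at each site.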

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 11:51:40 +00:00
Martin KaFai Lau
37ab566c17 bpf: arm64: Enable arm64 jit to provide bpf_line_info
This patch enables arm64's bpf_int_jit_compile() to provide
bpf_line_info by calling bpf_prog_fill_jited_linfo().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-12 02:16:56 +01:00
Arnd Bergmann
4d08d20f1c arm64: fix ARM64_USER_VA_BITS_52 builds
In some randconfig builds, the new CONFIG_ARM64_USER_VA_BITS_52
triggered a build failure:

arch/arm64/mm/proc.S:287: Error: immediate out of range

As it turns out, we were incorrectly setting PGTABLE_LEVELS here,
lacking any other default value.
This fixes the calculation of CONFIG_PGTABLE_LEVELS to consider
all combinations again.
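
For reference, the level count follows from the VA size and page size by
simple arithmetic. The sketch below is not the Kconfig logic itself;
pgtable_levels() is a hypothetical helper that assumes 8-byte table
entries and one page per table:

```c
/*
 * Each translation table holds 2^(page_shift - 3) eight-byte entries, so
 * every level resolves (page_shift - 3) bits of the address above the
 * page offset. Round up to count a partially used top level.
 */
static int pgtable_levels(int va_bits, int page_shift)
{
	int bits_per_level = page_shift - 3;
	int bits_to_resolve = va_bits - page_shift;

	return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

Under these assumptions, 64K pages (page_shift of 16) need three levels
for either a 48-bit or a 52-bit address space, so both options should
select the same value.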

Fixes: 68d23da4373a ("arm64: Kconfig: Re-jig CONFIG options for 52-bit VA")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-11 20:07:12 +00:00
Will Deacon
7faa313f05 arm64: preempt: Fix big-endian when checking preempt count in assembly
Commit 396244692232 ("arm64: preempt: Provide our own implementation of
asm/preempt.h") extended the preempt count field in struct thread_info
to 64 bits, so that it consists of a 32-bit count plus a 32-bit flag
indicating whether or not the current task needs rescheduling.

Whilst the asm-offsets definition of TSK_TI_PREEMPT was updated to point
to this new field, the assembly usage was left untouched meaning that a
32-bit load from TSK_TI_PREEMPT on a big-endian machine actually returns
the reschedule flag instead of the count.

Whilst we could fix this by pointing TSK_TI_PREEMPT at the count field,
we're actually better off reworking the two assembly users so that they
operate on the whole 64-bit value, rather than inspecting the thread
flags separately in order to determine whether a reschedule is needed.
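
The effect of the narrow load can be sketched in portable C by laying the
64-bit value out in an explicit byte order. This assumes the count lives
in the low 32 bits and the reschedule flag in the high 32 bits; the
store64() and load32() helpers and the sample values are hypothetical:

```c
#include <stdint.h>

/* Write a 64-bit value into buf using the requested byte order. */
static void store64(uint8_t buf[8], uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 56 - 8 * i : 8 * i;

		buf[i] = (uint8_t)(v >> shift);
	}
}

/* Read 32 bits from the start of buf using the requested byte order. */
static uint32_t load32(const uint8_t *buf, int big_endian)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 24 - 8 * i : 8 * i;

		v |= (uint32_t)buf[i] << shift;
	}
	return v;
}

/* Pack the two 32-bit halves: flag in the high word, count in the low. */
static uint64_t pack_preempt(uint32_t count, uint32_t resched_flag)
{
	return ((uint64_t)resched_flag << 32) | count;
}
```

With a count of 3 and a flag of 1, a 32-bit load from offset 0 yields 3
in little-endian layout but 1 in big-endian layout, which is why
operating on the full 64-bit value sidesteps the problem.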

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-11 20:07:03 +00:00
Robin Murphy
3238c359ac arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
We need to invalidate the caches *before* clearing the buffer via the
non-cacheable alias, else in the worst case __dma_flush_area() may
write back dirty lines over the top of our nice new zeros.

Fixes: dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag")
Cc: <stable@vger.kernel.org> # 4.18.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-11 11:55:32 +00:00
Arnd Bergmann
732291c4fa arm64: kexec_file: include linux/vmalloc.h
This is needed for compilation in some configurations that don't
include it implicitly:

arch/arm64/kernel/machine_kexec_file.c: In function 'arch_kimage_file_post_load_cleanup':
arch/arm64/kernel/machine_kexec_file.c:37:2: error: implicit declaration of function 'vfree'; did you mean 'kvfree'? [-Werror=implicit-function-declaration]

Fixes: 52b2a8af7436 ("arm64: kexec_file: load initrd and device-tree")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-11 10:37:38 +00:00
David S. Miller
addb067983 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-11

The following pull-request contains BPF updates for your *net-next* tree.

It has three minor merge conflicts, resolutions:

1) tools/testing/selftests/bpf/test_verifier.c

 Take first chunk with alignment_prevented_execution.

2) net/core/filter.c

  [...]
  case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
  case bpf_ctx_range(struct __sk_buff, wire_len):
        return false;
  [...]

3) include/uapi/linux/bpf.h

  Take the second chunk for the two cases each.

The main changes are:

1) Add support for BPF line info via BTF and extend libbpf as well
   as bpftool's program dump to annotate output with BPF C code to
   facilitate debugging and introspection, from Martin.

2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter
   and all JIT backends, from Jiong.

3) Improve BPF test coverage on archs with no efficient unaligned
   access by adding an "any alignment" flag to the BPF program load
   to forcefully disable verifier alignment checks, from David.

4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for
   proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz.

5) Extend tc BPF programs to use a new __sk_buff field called wire_len
   for more accurate accounting of packets going to wire, from Petar.

6) Improve bpftool to allow dumping the trace pipe from it and add
   several improvements in bash completion and map/prog dump,
   from Quentin.

7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for
   kernel addresses and add a dedicated BPF JIT backend allocator,
   from Ard.

8) Add a BPF helper function for IR remotes to report mouse movements,
   from Sean.

9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info
   member naming consistent with existing conventions, from Yonghong
   and Song.

10) Misc cleanups and improvements, such as allowing the interface name
    to be passed via cmdline for the xdp1 BPF example, from Matteo.

11) Fix a potential segfault in BPF sample loader's kprobes handling,
    from Daniel T.

12) Fix SPDX license in libbpf's README.rst, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10 18:00:43 -08:00
Will Deacon
4a1daf29d3 arm64: mm: EXPORT vabits_user to modules
TASK_SIZE is defined using the vabits_user variable for 64-bit tasks,
so ensure that this variable is exported to modules to avoid the
following build breakage with allmodconfig:

 | ERROR: "vabits_user" [lib/test_user_copy.ko] undefined!
 | ERROR: "vabits_user" [drivers/misc/lkdtm/lkdtm.ko] undefined!
 | ERROR: "vabits_user" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 19:20:23 +00:00
Will Deacon
d34664f63b Merge branch 'for-next/kexec' into aarch64/for-next/core
Merge in kexec_file_load() support from Akashi Takahiro.
2018-12-10 18:57:17 +00:00
Will Deacon
bc84a2d106 Merge branch 'kvm/cortex-a76-erratum-1165522' into aarch64/for-next/core
Pull in KVM workaround for A76 erratum #1165522.

Conflicts:
	arch/arm64/include/asm/cpucaps.h
2018-12-10 18:53:52 +00:00