69 Commits

Mark Rutland
97d935faac arm64: Unmask Debug + SError in do_notify_resume()
When returning to a user context, the arm64 entry code masks all DAIF
exceptions before handling pending work in exit_to_user_mode_prepare()
and do_notify_resume(), where it will transiently unmask all DAIF
exceptions. This is a holdover from the old entry assembly, which
conservatively masked all DAIF exceptions, and it's only necessary to
mask interrupts at this point during the exception return path, so long
as we subsequently mask all DAIF exceptions before the actual exception
return.

While most DAIF manipulation follows a save...restore sequence, the
manipulation in do_notify_resume() is the other way around, unmasking
all DAIF exceptions before masking them again. This is unfortunate as we
unnecessarily mask Debug and SError exceptions, and it would be nice to
remove this special case to make DAIF manipulation simpler and more
consistent.

This patch changes exit_to_user_mode_prepare() and do_notify_resume() to
only mask interrupts while handling pending work, masking other DAIF
exceptions after this has completed. This removes the unusual DAIF
manipulation and allows Debug and SError exceptions to be taken for a
slightly longer window during the exception return path.
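
As an illustrative sketch only (not the literal diff), the reworked tail
of the return path looks roughly as follows, using existing arm64/kernel
helpers and eliding other exit work:

  static void exit_to_user_mode_prepare(struct pt_regs *regs)
  {
          unsigned long flags;

          local_irq_disable();                    /* mask IRQs only */

          flags = read_thread_flags();
          if (unlikely(flags & _TIF_WORK_MASK))
                  do_notify_resume(regs, flags);  /* may transiently unmask IRQs */

          local_daif_mask();                      /* mask all DAIF before the ERET */
  }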

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240206123848.1696480-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@linux.dev>
2024-02-20 18:12:13 +00:00
Mark Rutland
997d79eb93 arm64: Move do_notify_resume() to entry-common.c
Currently do_notify_resume() lives in arch/arm64/kernel/signal.c, but it would
make more sense for it to live in entry-common.c as it handles more than
signals, and is coupled with the rest of the return-to-userspace sequence (e.g.
with unusual DAIF masking that matches the exception return requirements).

Move do_notify_resume() to entry-common.c.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240206123848.1696480-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@linux.dev>
2024-02-20 18:12:13 +00:00
Mark Rutland
f130ac0ae4 arm64: syscall: unmask DAIF earlier for SVCs
For a number of historical reasons, when handling SVCs we don't unmask
DAIF in el0_svc() or el0_svc_compat(), and instead do so later in
el0_svc_common(). This is unfortunate and makes it harder to make
changes to the DAIF management in entry-common.c as we'd like to do as
cleanup and preparation for FEAT_NMI support. We can move the DAIF
unmasking to entry-common.c as long as we also hoist the
fp_user_discard() logic, as reasoned below.

We converted the syscall trace logic from assembly to C in commit:

  f37099b6992a0b81 ("arm64: convert syscall trace logic to C")

... which was intended to have no functional change, and mirrored the
existing assembly logic to avoid the risk of any functional regression.

With the logic in C, it's clear that there is currently no reason to
unmask DAIF so late within el0_svc_common():

* The thread flags are read prior to unmasking DAIF, but are not
  consumed until after DAIF is unmasked, and we don't perform a
  read-modify-write sequence of the thread flags for which we might need
  to serialize against an IPI modifying the flags. Similarly, for any
  thread flags set by other threads, whether DAIF is masked or not has
  no impact.

  The read_thread_flags() helper performs a single-copy-atomic read of
  the flags, and so this can safely be moved after unmasking DAIF.

* The pt_regs::orig_x0 and pt_regs::syscallno fields are neither
  consumed nor modified by the handler for any DAIF exception (e.g.
  these do not exist in the `perf_event_arm_regs` enum and are not
  sampled by perf in its IRQ handler).

  Thus, the manipulation of pt_regs::orig_x0 and pt_regs::syscallno can
  safely be moved after unmasking DAIF.

Given the above, we can safely hoist unmasking of DAIF out of
el0_svc_common(), and into its immediate callers: do_el0_svc() and
do_el0_svc_compat(). Further:

* In do_el0_svc(), we sample the syscall number from
  pt_regs::regs[8]. This is not modified by the handler for any DAIF
  exception, and thus can safely be moved after unmasking DAIF.

  As fp_user_discard() operates on the live FP/SVE/SME register state,
  this needs to occur before we clear DAIF.IF, as interrupts could
  result in preemption which would cause this state to become foreign.
  As fp_user_discard() is the first function called within do_el0_svc(),
  it has no dependency on other parts of do_el0_svc() and can be moved
  earlier so long as it is called prior to unmasking DAIF.IF.

* In do_el0_svc_compat(), we sample the syscall number from
  pt_regs::regs[7]. This is not modified by the handler for any DAIF
  exception, and thus can safely be moved after unmasking DAIF.

  Compat threads cannot use SVE or SME, so there's no need for
  el0_svc_compat() to call fp_user_discard().

Given the above, we can safely hoist the unmasking of DAIF out of
do_el0_svc() and do_el0_svc_compat(), and into their immediate callers:
el0_svc() and el0_svc_compat(), so long as we also hoist
fp_user_discard() into el0_svc().
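
As a rough sketch of the resulting call order (erratum workarounds and
other details elided; not the literal patch):

  static void noinstr el0_svc(struct pt_regs *regs)
  {
          enter_from_user_mode(regs);
          fp_user_discard();              /* must run before IRQs can preempt us */
          local_daif_restore(DAIF_PROCCTX);
          do_el0_svc(regs);
          exit_to_user_mode(regs);
  }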

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808101148.1064172-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 12:23:48 +01:00
Catalin Marinas
f42039d10b Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
  docs: perf: Add new description for HiSilicon UC PMU
  drivers/perf: hisi: Add support for HiSilicon UC PMU driver
  drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
  perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
  perf/arm-cmn: Add sysfs identifier
  perf/arm-cmn: Revamp model detection
  perf/arm_dmc620: Add cpumask
  dt-bindings: perf: fsl-imx-ddr: Add i.MX93 compatible
  drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver
  perf/arm_cspmu: Decouple APMT dependency
  perf/arm_cspmu: Clean up ACPI dependency
  ACPI/APMT: Don't register invalid resource
  perf/arm_cspmu: Fix event attribute type
  perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
  drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  drivers/perf: apple_m1: Force 63bit counters for M2 CPUs
  perf/arm-cmn: Fix DTC reset
  perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust
  perf/arm-cci: Slightly optimize cci_pmu_sync_counters()

* for-next/kpti:
  : Simplify KPTI trampoline exit code
  arm64: entry: Simplify tramp_alias macro and tramp_exit routine
  arm64: entry: Preserve/restore X29 even for compat tasks

* for-next/missing-proto-warn:
  : Address -Wmissing-prototype warnings
  arm64: add alt_cb_patch_nops prototype
  arm64: move early_brk64 prototype to header
  arm64: signal: include asm/exception.h
  arm64: kaslr: add kaslr_early_init() declaration
  arm64: flush: include linux/libnvdimm.h
  arm64: module-plts: inline linux/moduleloader.h
  arm64: hide unused is_valid_bugaddr()
  arm64: efi: add efi_handle_corrupted_x18 prototype
  arm64: cpuidle: fix #ifdef for acpi functions
  arm64: kvm: add prototypes for functions called in asm
  arm64: spectre: provide prototypes for internal functions
  arm64: move cpu_suspend_set_dbg_restorer() prototype to header
  arm64: avoid prototype warnings for syscalls
  arm64: add scs_patch_vmlinux prototype
  arm64: xor-neon: mark xor_arm64_neon_*() static

* for-next/iss2-decode:
  : Add decode of ISS2 to data abort reports
  arm64/esr: Add decode of ISS2 to data abort reporting
  arm64/esr: Use GENMASK() for the ISS mask

* for-next/kselftest:
  : Various arm64 kselftest improvements
  kselftest/arm64: Log signal code and address for unexpected signals
  kselftest/arm64: Add a smoke test for ptracing hardware break/watch points

* for-next/misc:
  : Miscellaneous patches
  arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
  arm64: hibernate: remove WARN_ON in save_processor_state
  arm64/fpsimd: Exit streaming mode when flushing tasks
  arm64: mm: fix VA-range sanity check
  arm64/mm: remove now-superfluous ISBs from TTBR writes
  arm64: consolidate rox page protection logic
  arm64: set __exception_irq_entry with __irq_entry as a default
  arm64: syscall: unmask DAIF for tracing status
  arm64: lockdep: enable checks for held locks when returning to userspace
  arm64/cpucaps: increase string width to properly format cpucaps.h
  arm64/cpufeature: Use helper for ECV CNTPOFF cpufeature

* for-next/feat_mops:
  : Support for ARMv8.8 memcpy instructions in userspace
  kselftest/arm64: add MOPS to hwcap test
  arm64: mops: allow disabling MOPS from the kernel command line
  arm64: mops: detect and enable FEAT_MOPS
  arm64: mops: handle single stepping after MOPS exception
  arm64: mops: handle MOPS exceptions
  KVM: arm64: hide MOPS from guests
  arm64: mops: don't disable host MOPS instructions from EL2
  arm64: mops: document boot requirements for MOPS
  KVM: arm64: switch HCRX_EL2 between host and guest
  arm64: cpufeature: detect FEAT_HCX
  KVM: arm64: initialize HCRX_EL2

* for-next/module-alloc:
  : Make the arm64 module allocation code more robust (clean-up, VA range expansion)
  arm64: module: rework module VA range selection
  arm64: module: mandate MODULE_PLTS
  arm64: module: move module randomization to module.c
  arm64: kaslr: split kaslr/module initialization
  arm64: kasan: remove !KASAN_VMALLOC remnants
  arm64: module: remove old !KASAN_VMALLOC logic

* for-next/sysreg: (21 commits)
  : More sysreg conversions to automatic generation
  arm64/sysreg: Convert TRBIDR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBTRG_EL1 register to automatic generation
  arm64/sysreg: Convert TRBMAR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBSR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBBASER_EL1 register to automatic generation
  arm64/sysreg: Convert TRBPTR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBLIMITR_EL1 register to automatic generation
  arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBTRG_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBMAR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBSR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBBASER_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBPTR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format
  arm64/sysreg: Convert OSECCR_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation
  arm64/sysreg: Convert OSLAR_EL1 to automatic generation
  arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1
  arm64/sysreg: Convert MDSCR_EL1 to automatic register generation
  ...

* for-next/cpucap:
  : arm64 cpucap clean-up
  arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities()
  arm64: cpufeature: use cpucap naming
  arm64: alternatives: use cpucap naming
  arm64: standardise cpucap bitmap names

* for-next/acpi:
  : Various arm64-related ACPI patches
  ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init()

* for-next/kdump:
  : Simplify the crashkernel reservation behaviour of crashkernel=X,high on arm64
  arm64: add kdump.rst into index.rst
  Documentation: add kdump.rst to present crashkernel reservation on arm64
  arm64: kdump: simplify the reservation behaviour of crashkernel=,high

* for-next/acpi-doc:
  : Update ACPI documentation for Arm systems
  Documentation/arm64: Update ACPI tables from BBR
  Documentation/arm64: Update references in arm-acpi
  Documentation/arm64: Update ARM and arch reference

* for-next/doc:
  : arm64 documentation updates
  Documentation/arm64: Add ptdump documentation

* for-next/tpidr2-fix:
  : Fix the TPIDR2_EL0 register restoring on sigreturn
  kselftest/arm64: Add a test case for TPIDR2 restore
  arm64/signal: Restore TPIDR2 register rather than memory state
2023-06-23 18:32:20 +01:00
Eric Chan
ab1e29acdb arm64: lockdep: enable checks for held locks when returning to userspace
Currently arm64 doesn't use CONFIG_GENERIC_ENTRY and doesn't call
lockdep_sys_exit() when returning to userspace.
This means that lockdep won't check for held locks when
returning to userspace, which would be useful to detect kernel bugs.

Call lockdep_sys_exit() when returning to userspace,
enabling checking for held locks.

At the same time, rename arm64's prepare_exit_to_user_mode() to
exit_to_user_mode_prepare() to more clearly align with the naming
in the generic entry code.
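
A minimal sketch of where the new call sits, assuming the flow described
above (pending work already handled, DAIF masked for the return):

  static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
  {
          /* ... mask DAIF and handle pending work via do_notify_resume() ... */

          lockdep_sys_exit();     /* warn if we return to EL0 holding locks */
  }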

Signed-off-by: Eric Chan <ericchancf@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230531090909.357047-1-ericchancf@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 17:35:52 +01:00
Kristina Martsenko
8536ceaa74 arm64: mops: handle MOPS exceptions
The memory copy/set instructions added as part of FEAT_MOPS can take an
exception (e.g. page fault) part-way through their execution and resume
execution afterwards.

If however the task is re-scheduled and execution resumes on a different
CPU, then the CPU may take a new type of exception to indicate this.
This is because the architecture allows two options (Option A and Option
B) to implement the instructions and a heterogeneous system can have
different implementations between CPUs.

In this case the OS has to reset the registers and restart execution
from the prologue instruction. The algorithm for doing this is provided
as part of the Arm ARM.

Add an exception handler for the new exception and wire it up for
userspace tasks.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20230509142235.3284028-8-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05 17:05:41 +01:00
Josh Poimboeuf
5ab6876c78 arm64/cpu: Mark cpu_park_loop() and friends __noreturn
In preparation for marking panic_smp_self_stop() __noreturn across the
kernel, first mark the arm64 implementation of cpu_park_loop() and
related functions __noreturn.
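
For illustration, a simplified sketch of the annotation (not the exact
arm64 body):

  void __noreturn cpu_park_loop(void)
  {
          local_irq_disable();
          while (1)
                  wfe();          /* parks forever; the compiler now knows this */
  }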

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/55787d3193ea3e295ccbb097abfab0a10ae49d45.1681342859.git.jpoimboe@kernel.org
2023-04-14 17:31:24 +02:00
Will Deacon
5f4c374760 Merge branch 'for-next/undef-traps' into for-next/core
* for-next/undef-traps:
  arm64: armv8_deprecated: fix unused-function error
  arm64: armv8_deprecated: rework deprected instruction handling
  arm64: armv8_deprecated: move aarch32 helper earlier
  arm64: armv8_deprecated move emulation functions
  arm64: armv8_deprecated: fold ops into insn_emulation
  arm64: rework EL0 MRS emulation
  arm64: factor insn read out of call_undef_hook()
  arm64: factor out EL1 SSBS emulation hook
  arm64: split EL0/EL1 UNDEF handlers
  arm64: allow kprobes on EL0 handlers
2022-12-06 11:34:25 +00:00
Mark Rutland
61d64a376e arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.

Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.

Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:

* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
  within kprobes. Therefore it is safe for do_el0_undef to be
  instrumented with kprobes, and it does not need to be marked with
  NOKPROBE_SYMBOL().

* For UNDEFINED exceptions taken from EL1, either:

  (a) The exception has been taken when manipulating SSBS; these cases
      are limited and do not occur within code that can be invoked
      recursively via kprobes. Hence, in these cases instrumentation
      with kprobes is benign.

  (b) The exception has been taken for an unknown reason, as other than
      manipulating SSBS we do not expect to take UNDEFINED exceptions
      from EL1. Any handling of these exceptions is best-effort.

  ... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
  isn't sufficient to prevent recursion via kprobes as functions it
  calls (including die()) are instrumentable via kprobes.

  Hence, it's not worthwhile to mark do_el1_undef() with
  NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
  so their NOKPROBE_SYMBOL() annotations are also removed.

Aside from the new instrumentability, there should be no functional
change as a result of this patch.
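
A hedged sketch of the resulting split (bodies heavily elided; not the
literal patch):

  void do_el0_undef(struct pt_regs *regs, unsigned long esr)
  {
          /* try EL0 hooks / instruction emulation, else deliver SIGILL */
          force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
  }

  void do_el1_undef(struct pt_regs *regs, unsigned long esr)
  {
          /* best-effort: try EL1 hooks (e.g. SSBS emulation), else die() */
          die("Oops - Undefined instruction", regs, esr);
  }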

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:17 +00:00
Mark Rutland
b3a0c010e9 arm64: allow kprobes on EL0 handlers
Currently do_sysinstr() and do_cp15instr() are marked with
NOKPROBE_SYMBOL(). However, these are only called for exceptions taken
from EL0, and there is no risk of recursion in kprobes, so this is not
necessary.

Remove the NOKPROBE_SYMBOL() annotation, and rename the two functions to
more clearly indicate that these are solely for exceptions taken from
EL0, better matching the names used by the lower level entry points in
entry-common.c.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:17 +00:00
Mukesh Ojha
59598b42eb arm64: entry: Fix typo
Fix the following typo in entry-common.c
intrumentable => instrumentable

Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1667027268-1255-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-08 14:03:25 +00:00
Mark Rutland
024f4b2e1f arm64: entry: avoid kprobe recursion
The cortex_a76_erratum_1463225_debug_handler() function is called when
handling debug exceptions (and synchronous exceptions from BRK
instructions), and so is called when a probed function executes. If the
compiler does not inline cortex_a76_erratum_1463225_debug_handler(), it
can be probed.

If cortex_a76_erratum_1463225_debug_handler() is probed, any debug
exception or software breakpoint exception will result in recursive
exceptions leading to a stack overflow. This can be triggered with the
ftrace multiple_probes selftest, and as per the example splat below.

This is a regression caused by commit:

  6459b8469753e9fe ("arm64: entry: consolidate Cortex-A76 erratum 1463225 workaround")

... which removed the NOKPROBE_SYMBOL() annotation associated with the
function.

My intent was that cortex_a76_erratum_1463225_debug_handler() would be
inlined into its caller, el1_dbg(), which is marked noinstr and cannot
be probed. Mark cortex_a76_erratum_1463225_debug_handler() as
__always_inline to ensure this.
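
For illustration, forcing the handler to be inlined into its noinstr
caller removes the standalone symbol that kprobes could otherwise hook
(sketch; the workaround body is unchanged and elided here):

  static __always_inline bool
  cortex_a76_erratum_1463225_debug_handler(struct pt_regs *regs)
  {
          /* existing erratum workaround logic ... */
          return false;
  }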

Example splat prior to this patch (with recursive entries elided):

| # echo p cortex_a76_erratum_1463225_debug_handler > /sys/kernel/debug/tracing/kprobe_events
| # echo p do_el0_svc >> /sys/kernel/debug/tracing/kprobe_events
| # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
| Insufficient stack space to handle exception!
| ESR: 0x0000000096000047 -- DABT (current EL)
| FAR: 0xffff800009cefff0
| Task stack:     [0xffff800009cf0000..0xffff800009cf4000]
| IRQ stack:      [0xffff800008000000..0xffff800008004000]
| Overflow stack: [0xffff00007fbc00f0..0xffff00007fbc10f0]
| CPU: 0 PID: 145 Comm: sh Not tainted 6.0.0 #2
| Hardware name: linux,dummy-virt (DT)
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : arm64_enter_el1_dbg+0x4/0x20
| lr : el1_dbg+0x24/0x5c
| sp : ffff800009cf0000
| x29: ffff800009cf0000 x28: ffff000002c74740 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: 00000000604003c5 x22: ffff80000801745c x21: 0000aaaac95ac068
| x20: 00000000f2000004 x19: ffff800009cf0040 x18: 0000000000000000
| x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
| x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
| x11: 0000000000000010 x10: ffff800008c87190 x9 : ffff800008ca00d0
| x8 : 000000000000003c x7 : 0000000000000000 x6 : 0000000000000000
| x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000043a4
| x2 : 00000000f2000004 x1 : 00000000f2000004 x0 : ffff800009cf0040
| Kernel panic - not syncing: kernel stack overflow
| CPU: 0 PID: 145 Comm: sh Not tainted 6.0.0 #2
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace+0xe4/0x104
|  show_stack+0x18/0x4c
|  dump_stack_lvl+0x64/0x7c
|  dump_stack+0x18/0x38
|  panic+0x14c/0x338
|  test_taint+0x0/0x2c
|  panic_bad_stack+0x104/0x118
|  handle_bad_stack+0x34/0x48
|  __bad_stack+0x78/0x7c
|  arm64_enter_el1_dbg+0x4/0x20
|  el1h_64_sync_handler+0x40/0x98
|  el1h_64_sync+0x64/0x68
|  cortex_a76_erratum_1463225_debug_handler+0x0/0x34
...
|  el1h_64_sync_handler+0x40/0x98
|  el1h_64_sync+0x64/0x68
|  cortex_a76_erratum_1463225_debug_handler+0x0/0x34
...
|  el1h_64_sync_handler+0x40/0x98
|  el1h_64_sync+0x64/0x68
|  cortex_a76_erratum_1463225_debug_handler+0x0/0x34
|  el1h_64_sync_handler+0x40/0x98
|  el1h_64_sync+0x64/0x68
|  do_el0_svc+0x0/0x28
|  el0t_64_sync_handler+0x84/0xf0
|  el0t_64_sync+0x18c/0x190
| Kernel Offset: disabled
| CPU features: 0x0080,00005021,19001080
| Memory Limit: none
| ---[ end Kernel panic - not syncing: kernel stack overflow ]---

With this patch, cortex_a76_erratum_1463225_debug_handler() is inlined
into el1_dbg(), and el1_dbg() cannot be probed:

| # echo p cortex_a76_erratum_1463225_debug_handler > /sys/kernel/debug/tracing/kprobe_events
| sh: write error: No such file or directory
| # grep -w cortex_a76_erratum_1463225_debug_handler /proc/kallsyms | wc -l
| 0
| # echo p el1_dbg > /sys/kernel/debug/tracing/kprobe_events
| sh: write error: Invalid argument
| # grep -w el1_dbg /proc/kallsyms | wc -l
| 1

Fixes: 6459b8469753 ("arm64: entry: consolidate Cortex-A76 erratum 1463225 workaround")
Cc: <stable@vger.kernel.org> # 5.12.x
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221017090157.2881408-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-01 17:43:31 +00:00
Mark Rutland
830a2a4d85 arm64: rework BTI exception handling
If a BTI exception is taken from EL1, the entry code will treat this as
an unhandled exception and will panic() the kernel. This is inconsistent
with the way we handle FPAC exceptions, which have a dedicated handler
and only necessarily kill the thread from which the exception was
taken, and we don't log all the information that could be relevant to
debug the issue.

The code in do_bti() has:

	BUG_ON(!user_mode(regs));

... and it seems like the intent was to call this for EL1 BTI
exceptions, as with FPAC, but this was omitted due to an oversight.

This patch adds separate EL0 and EL1 BTI exception handlers, with the
latter calling die() directly to report the original context the BTI
exception was taken from. This matches our handling of FPAC exceptions.
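
Roughly, the split looks like this (a sketch; strings and details may
differ):

  void do_el0_bti(struct pt_regs *regs)
  {
          force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
  }

  void do_el1_bti(struct pt_regs *regs, unsigned long esr)
  {
          die("Oops - BTI", regs, esr);
  }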

Prior to this patch, a BTI failure is reported as:

| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace.part.0+0xcc/0xe0
|  show_stack+0x18/0x5c
|  dump_stack_lvl+0x64/0x80
|  dump_stack+0x18/0x34
|  panic+0x170/0x360
|  arm64_exit_nmi.isra.0+0x0/0x80
|  el1h_64_sync_handler+0x64/0xd0
|  el1h_64_sync+0x64/0x68
|  test_bti_callee+0x4/0x10
|  smp_cpus_done+0xb0/0xbc
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20

With this patch applied, a BTI failure is reported as:

| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804
| Call trace:
|  test_bti_callee+0x4/0x10
|  smp_cpus_done+0xb0/0xbc
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf)

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Mark Rutland
a1fafa3b24 arm64: rework FPAC exception handling
If an FPAC exception is taken from EL1, the entry code will call
do_ptrauth_fault(), where due to:

	BUG_ON(!user_mode(regs))

... the kernel will report a problem within do_ptrauth_fault() rather
than reporting the original context the FPAC exception was taken from.
The pt_regs and ESR value reported will be from within
do_ptrauth_fault() and the code dump will be for the BRK in BUG_ON(),
which isn't sufficient to debug the cause of the original exception.

This patch makes the reporting better by having separate EL0 and EL1
FPAC exception handlers, with the latter calling die() directly to
report the original context the FPAC exception was taken from.

Note that we only need to prevent kprobes of the EL1 FPAC handler, since
the EL0 FPAC handler cannot be called recursively.

For consistency with do_el0_svc*(), I've named the split functions
do_el{0,1}_fpac() rather than do_el{0,1}_ptrauth_fault(). I've also
clarified the comment to not imply there are causes other than FPAC
exceptions.

Prior to this patch FPAC exceptions are reported as:

| kernel BUG at arch/arm64/kernel/traps.c:517!
| Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00130-g9c8a180a1cdf-dirty #12
| Hardware name: FVP Base RevC (DT)
| pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : do_ptrauth_fault+0x3c/0x40
| lr : el1_fpac+0x34/0x54
| sp : ffff80000a3bbc80
| x29: ffff80000a3bbc80 x28: ffff0008001d8000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: 0000000020400009 x22: ffff800008f70fa4 x21: ffff80000a3bbe00
| x20: 0000000072000000 x19: ffff80000a3bbcb0 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783
| x5 : ffff80000a3bbcb0 x4 : ffff0008001d8000 x3 : 0000000072000000
| x2 : 0000000000000000 x1 : 0000000020400009 x0 : ffff80000a3bbcb0
| Call trace:
|  do_ptrauth_fault+0x3c/0x40
|  el1h_64_sync_handler+0xc4/0xd0
|  el1h_64_sync+0x64/0x68
|  test_pac+0x8/0x10
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: 97fffe5e a8c17bfd d50323bf d65f03c0 (d4210000)

With this patch applied FPAC exceptions are reported as:

| Internal error: Oops - FPAC: 0000000072000000 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g78846e1c4757-dirty #11
| Hardware name: FVP Base RevC (DT)
| pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : test_pac+0x8/0x10
| lr : 0x0
| sp : ffff80000a3bbe00
| x29: ffff80000a3bbe00 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2c8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff80000a007000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783
| x5 : ffff80000a2c6000 x4 : ffff0008001d8000 x3 : ffff800009f88378
| x2 : 0000000000000000 x1 : 0000000080210000 x0 : ffff000001a90000
| Call trace:
|  test_pac+0x8/0x10
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: d50323bf d65f03c0 d503233f aa1f03fe (d50323bf)

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Mark Rutland
0f2cb928a1 arm64: consistently pass ESR_ELx to die()
Currently, bug_handler() and kasan_handler() call die() with '0' as the
'err' value, whereas die_kernel_fault() passes the ESR_ELx value.

For consistency, this patch ensures we always pass the ESR_ELx value to
die(). As this is only called for exceptions taken from kernel mode,
there should be no user-visible change as a result of this patch.

For UNDEFINED exceptions, I've had to modify do_undefinstr() and its
callers to pass the ESR_ELx value. In all cases the ESR_ELx value had
already been read and was available.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Frederic Weisbecker
493c182282 context_tracking: Take NMI eqs entrypoints over RCU
The RCU dynticks counter is going to be merged into the context tracking
subsystem. Prepare with moving the NMI extended quiescent states
entrypoints to context tracking. For now those are dumb redirection to
existing RCU calls.
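
Sketch of the redirection; the ct_nmi_enter()/ct_nmi_exit() names are the
new context-tracking entrypoints assumed here:

  void noinstr ct_nmi_enter(void)
  {
          rcu_nmi_enter();        /* dumb redirection for now */
  }

  void noinstr ct_nmi_exit(void)
  {
          rcu_nmi_exit();
  }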

Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
2022-07-05 13:32:59 -07:00
Frederic Weisbecker
6f0e6c1598 context_tracking: Take IRQ eqs entrypoints over RCU
The RCU dynticks counter is going to be merged into the context tracking
subsystem. Prepare with moving the IRQ extended quiescent states
entrypoints to context tracking. For now those are dumb redirection to
existing RCU calls.

[ paulmck: Apply Stephen Rothwell feedback from -next. ]
[ paulmck: Apply Nathan Chancellor feedback. ]

Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
2022-07-05 13:32:59 -07:00
Linus Torvalds
2319be1356 Locking changes in this cycle were:
- rwsem cleanups & optimizations/fixes:
     - Conditionally wake waiters in reader/writer slowpaths
     - Always try to wake waiters in out_nolock path
 
  - Add try_cmpxchg64() implementation, with arch optimizations - and use it to
    micro-optimize sched_clock_{local,remote}()
 
  - Various force-inlining fixes to address objdump instrumentation-check warnings
 
  - Add lock contention tracepoints:
 
     lock:contention_begin
     lock:contention_end
 
  - Misc smaller fixes & cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmKLsrERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1js3g//cPR9PYlvZv87T2hI8VWKfNzapgSmwCsH
 1P+nk27Pef+jfxHr/N7YScvSD06+z2wIroLE3npPNETmNd1X8obBDThmeD4VI899
 J6h4sE0cFOpTG/mHeECFxqnDuzhdHiRHWS52RxOwTjZTpdbeKWZYueC0Mvqn+tIp
 UM2D2yTseIHs67ikxYtayU/iJgSZ+PYrMPv9nSVUjIFILmg7gMIz38OZYQzR84++
 auL3m8sAq/i2pjzDBbXMpfYeu177/tPHpPJr2rOErLEXWqK2K6op8+CbX4z3yv3z
 EBBhGiUNqDmFaFuIgg7Mx94SvPh8MBGexUnT0XA2aXPwyP9oAaenCk2CZ1j9u15m
 /Xp1A4KNvg1WY8jHu5ZM4VIEXQ7d6Gwtbej7IeovUxBD6y7Trb3+rxb7PVdZX941
 uVGjss1Lgk70wUQqBqBPmBm08O6NUF3vekHlona5CZTQgEF84zD7+7D++QPaAZo7
 kiuNUptdgfq6X0xqgP88GX1KU85gJYoF5Q13vb7lAcv19QhRG5JBJeWMYiXEmg12
 Ktl97Sru0zCpCY1NCvwsBll09SLVO9kX3Lq+QFD8bFMZ0obsGIBotHq1qH6U7cH8
 RY6esVBF/1/+qdrxOKs8qowlJ4EUp/3bX0R/MKYHJJbulj/aaE916HvMsoN/QR4Y
 oW7GcxMQGLE=
 =gaS5
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - rwsem cleanups & optimizations/fixes:
    - Conditionally wake waiters in reader/writer slowpaths
    - Always try to wake waiters in out_nolock path

 - Add try_cmpxchg64() implementation, with arch optimizations - and use
   it to micro-optimize sched_clock_{local,remote}()

 - Various force-inlining fixes to address objdump instrumentation-check
   warnings

 - Add lock contention tracepoints:

    lock:contention_begin
    lock:contention_end

 - Misc smaller fixes & cleanups

* tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}
  locking/atomic/x86: Introduce arch_try_cmpxchg64
  locking/atomic: Add generic try_cmpxchg64 support
  futex: Remove a PREEMPT_RT_FULL reference.
  locking/qrwlock: Change "queue rwlock" to "queued rwlock"
  lockdep: Delete local_irq_enable_in_hardirq()
  locking/mutex: Make contention tracepoints more consistent wrt adaptive spinning
  locking: Apply contention tracepoints in the slow path
  locking: Add lock contention tracepoints
  locking/rwsem: Always try to wake waiters in out_nolock path
  locking/rwsem: Conditionally wake waiters in reader/writer slowpaths
  locking/rwsem: No need to check for handoff bit if wait queue empty
  lockdep: Fix -Wunused-parameter for _THIS_IP_
  x86/mm: Force-inline __phys_addr_nodebug()
  x86/kvm/svm: Force-inline GHCB accessors
  task_stack, x86/cea: Force-inline stack helpers
2022-05-24 10:18:23 -07:00
Catalin Marinas
0616ea3f1b Merge branch 'for-next/esr-elx-64-bit' into for-next/core
* for-next/esr-elx-64-bit:
  : Treat ESR_ELx as a 64-bit register.
  KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high
  KVM: arm64: Treat ESR_EL2 as a 64-bit register
  arm64: Treat ESR_ELx as a 64-bit register
  arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
  arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly
2022-05-20 18:51:54 +01:00
Alexandru Elisei
8d56e5c5a9 arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32].  This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.

As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.

Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.

The KVM hypervisor will receive a similar update in a subsequent patch.

[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-29 19:26:27 +01:00
Mark Brown
8bd7f91c03 arm64/sme: Implement traps and syscall handling for SME
By default all SME operations in userspace will trap.  When this happens
we allocate storage space for the SME register state, set up the SVE
registers and disable traps.  We do not need to initialize ZA since the
architecture guarantees that it will be zeroed when enabled, and ZA is
disabled when we trap.

On syscall we exit streaming mode if we were previously in it and ensure
that all but the lower 128 bits of the registers are zeroed while
preserving the state of ZA. This follows the aarch64 PCS for SME: ZA
state is preserved over a function call and streaming mode is exited.
Since the traps for SME do not distinguish between streaming mode SVE
and ZA usage, if ZA is in use then rather than re-enabling traps we
instead zero the parts of the SVE registers not shared with FPSIMD and
leave SME enabled; this simplifies handling SME traps. If ZA is not in
use then we re-enable SME traps and fall through to normal handling of
SVE.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-17-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:05 +01:00
Nick Desaulniers
8b023accc8 lockdep: Fix -Wunused-parameter for _THIS_IP_
While looking into a bug related to the compiler's handling of addresses
of labels, I noticed some uses of _THIS_IP_ seemed unused in lockdep.
Drive by cleanup.

-Wunused-parameter:
kernel/locking/lockdep.c:1383:22: warning: unused parameter 'ip'
kernel/locking/lockdep.c:4246:48: warning: unused parameter 'ip'
kernel/locking/lockdep.c:4844:19: warning: unused parameter 'ip'

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20220314221909.2027027-1-ndesaulniers@google.com
2022-04-05 10:24:34 +02:00
Linus Torvalds
3fe2f7446f Changes in this cycle were:
- Cleanups for SCHED_DEADLINE
  - Tracing updates/fixes
  - CPU Accounting fixes
  - First wave of changes to optimize the overhead of the scheduler build,
    from the fast-headers tree - including placeholder *_api.h headers for
    later header split-ups.
  - Preempt-dynamic using static_branch() for ARM64
  - Isolation housekeeping mask rework; preparatory for further changes
  - NUMA-balancing: deal with CPU-less nodes
  - NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD)
  - Updates to RSEQ UAPI in preparation for glibc usage
  - Lots of RSEQ/selftests, for same
  - Add Suren as PSI co-maintainer
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmI5rg8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hGrw/+M3QOk6fH7G48wjlNnBvcOife6ls+Ni4k
 ixOAcF4JKoixO8HieU5vv0A7yf/83tAa6fpeXeMf1hkCGc0NSlmLtuIux+WOmoAL
 LzCyDEYfiP8KnVh0A1Tui/lK0+AkGo21O6ADhQE2gh8o2LpslOHQMzvtyekSzeeb
 mVxMYQN+QH0m518xdO2D8IQv9ctOYK0eGjmkqdNfntOlytypPZHeNel/tCzwklP/
 dElJUjNiSKDlUgTBPtL3DfpoLOI/0mHF2p6NEXvNyULxSOqJTu8pv9Z2ADb2kKo1
 0D56iXBDngMi9MHIJLgvzsA8gKzHLFSuPbpODDqkTZCa28vaMB9NYGhJ643NtEie
 IXTJEvF1rmNkcLcZlZxo0yjL0fjvPkczjw4Vj27gbrUQeEBfb4mfuI4BRmij63Ep
 qEkgQTJhduCqqrQP1rVyhwWZRk1JNcVug+F6N42qWW3fg1xhj0YSrLai2c9nPez6
 3Zt98H8YGS1Z/JQomSw48iGXVqfTp/ETI7uU7jqHK8QcjzQ4lFK5H4GZpwuqGBZi
 NJJ1l97XMEas+rPHiwMEN7Z1DVhzJLCp8omEj12QU+tGLofxxwAuuOVat3CQWLRk
 f80Oya3TLEgd22hGIKDRmHa22vdWnNQyS0S15wJotawBzQf+n3auS9Q3/rh979+t
 ES/qvlGxTIs=
 =Z8uT
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Cleanups for SCHED_DEADLINE

 - Tracing updates/fixes

 - CPU Accounting fixes

 - First wave of changes to optimize the overhead of the scheduler
   build, from the fast-headers tree - including placeholder *_api.h
   headers for later header split-ups.

 - Preempt-dynamic using static_branch() for ARM64

 - Isolation housekeeping mask rework; preparatory for further changes

 - NUMA-balancing: deal with CPU-less nodes

 - NUMA-balancing: tune systems that have multiple LLC cache domains per
   node (eg. AMD)

 - Updates to RSEQ UAPI in preparation for glibc usage

 - Lots of RSEQ/selftests, for same

 - Add Suren as PSI co-maintainer

* tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
  sched/headers: ARM needs asm/paravirt_api_clock.h too
  sched/numa: Fix boot crash on arm64 systems
  headers/prep: Fix header to build standalone: <linux/psi.h>
  sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y
  cgroup: Fix suspicious rcu_dereference_check() usage warning
  sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers
  sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains
  sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity()
  sched/deadline,rt: Remove unused functions for !CONFIG_SMP
  sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently
  sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
  sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file
  sched/deadline: Remove unused def_dl_bandwidth
  sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
  sched/tracing: Don't re-read p->state when emitting sched_switch event
  sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
  sched/cpuacct: Remove redundant RCU read lock
  sched/cpuacct: Optimize away RCU read lock
  sched/cpuacct: Fix charge percpu cpuusage
  sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies
  ...
2022-03-22 14:39:12 -07:00
Peter Collingbourne
38ddf7dafa arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary
On some microarchitectures, clearing PSTATE.TCO is expensive. Clearing
TCO is only necessary if in-kernel MTE is enabled, or if MTE is
enabled in the userspace process in synchronous (or, soon, asymmetric)
mode, because we do not report uaccess faults to userspace in none
or asynchronous modes. Therefore, adjust the kernel entry code to
clear TCO only if necessary.

Because it is now possible to switch to a task in which TCO needs to
be clear from a task in which TCO is set, we also need to do the same
thing on task switch.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I52d82a580bd0500d420be501af2c35fa8c90729e
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220219012945.894950-2-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22 21:48:44 +00:00
Mark Rutland
1b2d3451ee arm64: Support PREEMPT_DYNAMIC
This patch enables support for PREEMPT_DYNAMIC on arm64, allowing the
preemption model to be chosen at boot time.

Specifically, this patch selects HAVE_PREEMPT_DYNAMIC_KEY, so that each
preemption function is an out-of-line call with an early return
depending upon a static key. This leaves almost all the codegen up to
the compiler, and side-steps a number of pain points with static calls
(e.g. interaction with CFI schemes). This should have no worse overhead
than using non-inline static calls, as those use out-of-line trampolines
with early returns.

For example, the dynamic_cond_resched() wrapper looks as follows when
enabled. When disabled, the first `B` is replaced with a `NOP`,
resulting in an early return.

| <dynamic_cond_resched>:
|        bti     c
|        b       <dynamic_cond_resched+0x10>     // or `nop`
|        mov     w0, #0x0
|        ret
|        mrs     x0, sp_el0
|        ldr     x0, [x0, #8]
|        cbnz    x0, <dynamic_cond_resched+0x8>
|        paciasp
|        stp     x29, x30, [sp, #-16]!
|        mov     x29, sp
|        bl      <preempt_schedule_common>
|        mov     w0, #0x1
|        ldp     x29, x30, [sp], #16
|        autiasp
|        ret

... compared to the regular form of the function:

| <__cond_resched>:
|        bti     c
|        mrs     x0, sp_el0
|        ldr     x1, [x0, #8]
|        cbz     x1, <__cond_resched+0x18>
|        mov     w0, #0x0
|        ret
|        paciasp
|        stp     x29, x30, [sp, #-16]!
|        mov     x29, sp
|        bl      <preempt_schedule_common>
|        mov     w0, #0x1
|        ldp     x29, x30, [sp], #16
|        autiasp
|        ret

Since arm64 does not yet use the generic entry code, we must define our
own `sk_dynamic_irqentry_exit_cond_resched`, which will be
enabled/disabled by the common code in kernel/sched/core.c. All other
preemption functions and associated static keys are defined there.
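
A hedged sketch of the arm64-specific piece (the need_irq_preemption()
helper name is illustrative; the key itself is toggled by
kernel/sched/core.c):

  #ifdef CONFIG_PREEMPT_DYNAMIC
  DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
  #define need_irq_preemption() \
          (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
  #else
  #define need_irq_preemption()   (IS_ENABLED(CONFIG_PREEMPTION))
  #endif

The IRQ-return path can then bail out early when the key is disabled.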

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20220214165216.2231574-8-mark.rutland@arm.com
2022-02-19 11:11:09 +01:00
Mark Rutland
8e12ab7c0e arm64: entry: Centralize preemption decision
For historical reasons, the decision of whether or not to preempt is
spread across arm64_preempt_schedule_irq() and __el1_irq(), and it would
be clearer if this were all in one place.

Also, arm64_preempt_schedule_irq() calls lockdep_assert_irqs_disabled(),
but this is redundant, as we have a subsequent identical assertion in
__exit_to_kernel_mode(), and preempt_schedule_irq() will
BUG_ON(!irqs_disabled()) anyway.

This patch removes the redundant assertion and centralizes the
preemption decision making within arm64_preempt_schedule_irq().

Other than the slight change to assertion behaviour, there should be no
functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20220214165216.2231574-7-mark.rutland@arm.com
2022-02-19 11:11:08 +01:00
Mark Rutland
342b380878 arm64: Snapshot thread flags
Some thread flags can be set remotely, and so even when IRQs are disabled,
the flags can change under our feet. Generally this is unlikely to cause a
problem in practice, but it is somewhat unsound, and KCSAN will
legitimately warn that there is a data race.

To avoid such issues, a snapshot of the flags has to be taken prior to
using them. Some places already use READ_ONCE() for that, others do not.

Convert them all to the new flag accessor helpers.
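
The generic accessor boils down to a single snapshot via READ_ONCE()
(sketch of the helper from <linux/thread_info.h>):

  static __always_inline unsigned long read_thread_flags(void)
  {
          return READ_ONCE(current_thread_info()->flags);
  }

Callers take one snapshot and then test bits on that snapshot, rather
than re-reading current_thread_info()->flags for each check.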

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20211129130653.2037928-7-mark.rutland@arm.com
2021-12-01 00:06:44 +01:00
Mark Rutland
26dc129342 irq: arm64: perform irqentry in entry code
In preparation for removing HANDLE_DOMAIN_IRQ_IRQENTRY, have arch/arm64
perform all the irqentry accounting in its entry code.

As arch/arm64 already performs portions of the irqentry logic in
enter_from_kernel_mode() and exit_to_kernel_mode(), including
rcu_irq_{enter,exit}(), the only additional calls that need to be made
are to irq_{enter,exit}_rcu(). Removing the calls to
rcu_irq_{enter,exit}() from handle_domain_irq() ensures that we inform
RCU once per IRQ entry and will correctly identify quiescent periods.

Since we should not call irq_{enter,exit}_rcu() when entering a
pseudo-NMI, el1_interrupt() is reworked to have separate __el1_irq() and
__el1_pnmi() paths for regular IRQ and pseudo-NMI entry, with
irq_{enter,exit}_rcu() only called for the former.

In preparation for removing HANDLE_DOMAIN_IRQ, the irq regs are managed
in do_interrupt_handler() for both regular IRQ and pseudo-NMI. This is
currently redundant, but not harmful.

For clarity the preemption logic is moved into __el1_irq(). We should
never preempt within a pseudo-NMI, and arm64_enter_nmi() already
enforces this by incrementing the preempt_count, but it's clearer if we
never invoke the preemption logic when entering a pseudo-NMI.
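
A hedged sketch of the resulting split (assertions, PMR handling and the
exact preemption check elided):

  static void noinstr __el1_pnmi(struct pt_regs *regs,
                                 void (*handler)(struct pt_regs *))
  {
          arm64_enter_nmi(regs);
          do_interrupt_handler(regs, handler);
          arm64_exit_nmi(regs);
  }

  static void noinstr __el1_irq(struct pt_regs *regs,
                                void (*handler)(struct pt_regs *))
  {
          enter_from_kernel_mode(regs);

          irq_enter_rcu();
          do_interrupt_handler(regs, handler);
          irq_exit_rcu();

          /* preemption decision (elided) lives here, never on the pNMI path */

          exit_to_kernel_mode(regs);
  }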

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Pingfan Liu <kernelfans@gmail.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
2021-10-26 10:12:53 +01:00
Mark Rutland
e130338eed arm64: entry: call exit_to_user_mode() from C
When handling an exception from EL0, we perform the entry work in that
exception's C handler, and once the C handler has finished, we return
back to the entry assembly. Subsequently in the common `ret_to_user`
assembly we perform the exit work that balances with the entry work.
This can be somewhat difficult to follow, and makes it hard to rework
the return paths (e.g. to pass additional context to the exit code, or
to have exception return logic for specific exceptions).

This patch reworks the entry code such that each EL0 C exception handler
is responsible for both the entry and exit work. This clearly balances
the two (and will permit additional variation in future), and avoids an
unnecessary bounce between assembly and C in the common case, leaving
`ret_from_fork` as the only place assembly has to call the exit code.
This means that the exit work is now inlined into the C handler, which
is already the case for the entry work, and allows the compiler to
generate better code (e.g. by immediately returning when there is no
exit work to perform).

To align with other exception entry/exit helpers, enter_from_user_mode()
is updated to take the EL0 pt_regs as a parameter, though this is
currently unused.

There should be no functional change as a result of this patch. However,
this should lead to slightly better backtraces when an error is
encountered within do_notify_resume(), as the C handler should appear in
the backtrace, indicating the specific exception that the kernel was
entered with.
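
The resulting shape of an EL0 handler looks roughly like this (sketch,
using the data abort handler as an example):

  static void noinstr el0_da(struct pt_regs *regs, unsigned long esr)
  {
          unsigned long far = read_sysreg(far_el1);

          enter_from_user_mode(regs);
          local_daif_restore(DAIF_PROCCTX);
          do_mem_abort(far, esr, regs);
          exit_to_user_mode(regs);
  }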

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20210802140733.52716-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-05 14:10:32 +01:00
Mark Rutland
4d1c2ee270 arm64: entry: move bulk of ret_to_user to C
In `ret_to_user` we perform some conditional work depending on the
thread flags, then perform some IRQ/context tracking which is intended
to balance with the IRQ/context tracking performed in the entry C code.

For simplicity and consistency, it would be preferable to move this all
to C. As a step towards that, this patch moves the conditional work and
IRQ/context tracking into a C helper function. To aid bisectability,
this is called from the `ret_to_user` assembly, and a subsequent patch
will move the call to C code.

As local_daif_mask() handles all necessary tracing and PMR manipulation,
we no longer need to handle this explicitly. As we call
exit_to_user_mode() directly, the `user_enter_irqoff` macro is no longer
used, and can be removed. As enter_from_user_mode() and
exit_to_user_mode() are no longer called from assembly, these can be
made static, and as these are typically very small, they are marked
__always_inline to avoid the overhead of a function call.

For now, enablement of single-step is left in entry.S, and for this we
still need to read the flags in ret_to_user(). It is safe to read this
separately as TIF_SINGLESTEP is not part of _TIF_WORK_MASK.
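
A rough sketch of the C helper this introduces (shape only; the exact
body and helper names in the patch may differ):

  asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
  {
          unsigned long flags;

          local_daif_mask();

          flags = READ_ONCE(current_thread_info()->flags);
          if (unlikely(flags & _TIF_WORK_MASK))
                  do_notify_resume(regs, flags);

          mte_check_tfsr_exit();

          /* Balances enter_from_user_mode(); no instrumentable code after. */
          exit_to_user_mode();
  }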

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20210802140733.52716-4-mark.rutland@arm.com
[catalin.marinas@arm.com: removed unused gic_prio_kentry_setup macro]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-05 14:09:49 +01:00
Mark Rutland
bc29b71f53 arm64: entry: clarify entry/exit helpers
When entering an exception, we must perform irq/context state management
before we can use instrumentable C code. Similarly, when exiting an
exception we cannot use instrumentable C code after we perform
irq/context state management.

Originally, we'd intended that the enter_from_*() and exit_to_*()
helpers would enforce this by virtue of being the first and last
functions called, respectively, in an exception handler. However, as
they now call instrumentable code themselves, this is not as clearly
true.

To make this more robust, this patch splits the irq/context state
management into separate helpers, with all the helpers commented to make
their intended purpose more obvious.

In exit_to_kernel_mode() we'll now check TFSR_EL1 before we assert that
IRQs are disabled, but this ordering is not important, and other than
this there should be no functional change as a result of this patch.
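
A condensed sketch of the resulting structure (abbreviated; names follow
mainline entry-common.c and details may differ from the patch):

  /* Bare-minimum state management: must run first, must not be instrumented. */
  static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)
  {
          lockdep_hardirqs_off(CALLER_ADDR0);
          rcu_irq_enter_check_tick();        /* idle/RCU-not-watching case elided */
          trace_hardirqs_off_finish();
  }

  /* Instrumentable work (e.g. the TFSR_EL1 check) is safe after the above. */
  static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
  {
          __enter_from_kernel_mode(regs);
          mte_check_tfsr_entry();
  }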

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20210802140733.52716-3-mark.rutland@arm.com
[catalin.marinas@arm.com: comment typos fix-up]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-05 14:06:55 +01:00
Mark Rutland
46a2b02d23 arm64: entry: consolidate entry/exit helpers
To make the various entry/exit helpers easier to understand and easier
to compare, this patch moves all the entry/exit helpers to be adjacent
at the top of entry-common.c, rather than being spread out throughout
the file.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20210802140733.52716-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-05 14:04:19 +01:00
Mark Rutland
31a7f0f6c8 arm64: entry: add missing noinstr
We intend that all the early exception handling code is marked as
`noinstr`, but we forgot this for __el0_error_handler_common(), which is
called before we have completed entry from user mode. If it were
instrumented, we could run into problems with RCU, lockdep, etc.

Mark it as `noinstr` to prevent this.

The few other functions in entry-common.c which do not have `noinstr` are
called once we've completed entry, and are safe to instrument.
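
The fix is essentially the one annotation; a hedged sketch of the
function's rough shape at this point (body abbreviated, details may
differ):

  static void noinstr __el0_error_handler_common(struct pt_regs *regs)
  {
          unsigned long esr = read_sysreg(esr_el1);

          enter_from_user_mode();
          local_daif_restore(DAIF_ERRCTX);
          arm64_enter_nmi(regs);
          do_serror(regs, esr);
          arm64_exit_nmi(regs);
          local_daif_restore(DAIF_PROCCTX);
  }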

Fixes: bb8e93a287a5 ("arm64: entry: convert SError handlers to C")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210714172801.16475-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-07-15 17:36:51 +01:00
Mark Rutland
6ecbc78c3d arm64: entry: make NMI entry/exit functions static
Now that we only call arm64_enter_nmi() and arm64_exit_nmi() from within
entry-common.c, let's make these static to ensure this remains the case.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-19-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:56 +01:00
Mark Rutland
d60b228fd1 arm64: entry: split SDEI entry
We'd like to keep all the entry sequencing in entry-common.c, as this
will allow us to ensure this is consistent, and free from any unsound
instrumentation.

Currently __sdei_handler() performs the NMI entry/exit sequences in
sdei.c. Let's split the low-level entry sequence from the event
handling, moving the former to entry-common.c and keeping the latter in
sdei.c. The event handling function is renamed to do_sdei_event(),
matching the do_${FOO}() pattern used for other exception handlers.
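
After the split, the low-level entry wrapper in entry-common.c is
roughly as follows (sketch, matching the mainline shape):

  asmlinkage noinstr unsigned long
  __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
  {
          unsigned long ret;

          arm64_enter_nmi(regs);
          ret = do_sdei_event(regs, arg);
          arm64_exit_nmi(regs);

          return ret;
  }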

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-18-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:56 +01:00
Mark Rutland
8168f09886 arm64: entry: split bad stack entry
We'd like to keep all the entry sequencing in entry-common.c, as this
will allow us to ensure this is consistent, and free from any unsound
instrumentation.

Currently handle_bad_stack() performs the NMI entry sequence in traps.c.
Let's split the low-level entry sequence from the reporting, moving the
former to entry-common.c and keeping the latter in traps.c. To make it
clear that the reporting function never returns, it is renamed to
panic_bad_stack().
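
The resulting entry-side wrapper is roughly (sketch; register widths and
exact arguments may differ):

  asmlinkage void noinstr handle_bad_stack(struct pt_regs *regs)
  {
          unsigned long esr = read_sysreg(esr_el1);
          unsigned long far = read_sysreg(far_el1);

          arm64_enter_nmi(regs);
          panic_bad_stack(regs, esr, far);   /* never returns */
  }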

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-17-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:56 +01:00
Mark Rutland
afd05e28c9 arm64: entry: fold el1_inv() into el1h_64_sync_handler()
An unexpected synchronous exception from EL1h could happen at any time,
and for robustness we should treat this as an NMI, making minimal
assumptions about the context the exception was taken from.

Currently el1_inv() assumes we can use enter_from_kernel_mode(), and
also assumes that we should inherit the original DAIF value. Neither of
these are desirable when we take an unexpected exception. Further,
after el1_inv() calls __panic_unhandled(), the remainder of the function
is unreachable, and therefore superfluous.

Let's address this and simplify things by having el1h_64_sync_handler()
call __panic_unhandled() directly, without any of the redundant logic.
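
In other words, the unexpected case becomes a plain default branch; a
hedged sketch (other exception classes elided):

  asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs)
  {
          unsigned long esr = read_sysreg(esr_el1);

          switch (ESR_ELx_EC(esr)) {
          case ESR_ELx_EC_DABT_CUR:
          case ESR_ELx_EC_IABT_CUR:
                  el1_abort(regs, esr);
                  break;
          /* ... other expected exception classes ... */
          default:
                  __panic_unhandled(regs, "64-bit el1h sync", esr);
          }
  }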

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reported-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-16-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
ec841aab8d arm64: entry: handle all vectors with C
We have 16 architectural exception vectors, and depending on kernel
configuration we handle 8 or 12 of these with C code, with the remaining
8 or 4 of these handled as special cases in the entry assembly.

It would be nicer if the entry assembly were uniform for all exceptions,
and we deferred any specific handling of the exceptions to C code. This
way the entry assembly can be more easily templated without ifdeffery or
special cases, and it's easier to modify the handling of these cases in
future (e.g. to dump additional registers or other context).

This patch reworks the entry code so that we always have a C handler for
every architectural exception vector, with the entry assembly being
completely uniform. We now have to handle exceptions from EL1t and EL1h,
and also have to handle exceptions from AArch32 even when the kernel is
built without CONFIG_COMPAT. To make this clear and to simplify
templating, we rename the top-level exception handlers with a consistent
naming scheme:

  asm: <el+sp>_<regsize>_<type>
  c:   <el+sp>_<regsize>_<type>_handler

... where:

  <el+sp> is `el1t`, `el1h`, or `el0t`
  <regsize> is `64` or `32`
  <type> is `sync`, `irq`, `fiq`, or `error`

... e.g.

  asm: el1h_64_sync
  c:   el1h_64_sync_handler

... with lower-level handlers simply using "el1" and "compat" as today.

For unexpected exceptions, this information is passed to
__panic_unhandled(), so it can report the specific vector an unexpected
exception was taken from, e.g.

| Unhandled 64-bit el1t sync exception

For vectors we never expect to enter legitimately, the C code is
generated using a macro to avoid code duplication. The exceptions are
handled via __panic_unhandled(), replacing bad_mode() (which is
removed).
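
The generating macro looks roughly like the following (sketch, close to
the mainline version):

  #define UNHANDLED(el, regsize, vector)                                        \
  asmlinkage void noinstr el##_##regsize##_##vector##_handler(struct pt_regs *regs) \
  {                                                                             \
          const char *desc = #regsize "-bit " #el " " #vector;                  \
          __panic_unhandled(regs, desc, read_sysreg(esr_el1));                  \
  }

  UNHANDLED(el1t, 64, sync)
  UNHANDLED(el1t, 64, irq)
  /* ... one instance per vector we never expect to take ... */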

The `kernel_ventry` and `entry_handler` assembly macros are updated to
handle the new naming scheme. In theory it should be possible to
generate the entry functions at the same time as the vectors using a
single table, but this will require reworking the linker script to split
the two into separate sections, so for now we have separate tables.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-15-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
ca0c2647f5 arm64: entry: improve bad_mode()
Our use of bad_mode() has a few rough edges:

* AArch64 doesn't use the term "mode", and refers to "Execution
  states", "Exception levels", and "Selected stack pointer".

* We log the exception type (SYNC/IRQ/FIQ/SError), but not the actual
  "mode" (though this can be decoded from the SPSR value).

* We use bad_mode() as a second-level handler for unexpected synchronous
  exceptions, where the "mode" is legitimate, but the specific exception
  is not.

* We dump the ESR value, but call this "code", and so it's not clear to
  all readers that this is the ESR.

... and all of this can be somewhat opaque to those who aren't extremely
familiar with the code.

Let's make this a bit clearer by having bad_mode() log "Unhandled
${TYPE} exception" rather than "Bad mode in ${TYPE} handler", using
"ESR" rather than "code", and having the final panic() log "Unhandled
exception" rather than "Bad mode".

In future we'd like to log the specific architectural vector rather than
just the type of exception, so we also split the core of bad_mode() out
into a helper called __panic_unhandled(), which takes the vector as a
string argument.
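
A hedged sketch of the resulting helper (format details may differ):

  static void noinstr __panic_unhandled(struct pt_regs *regs, const char *vector,
                                        unsigned long esr)
  {
          arm64_enter_nmi(regs);

          console_verbose();

          pr_crit("Unhandled %s exception on CPU%d, ESR 0x%016lx -- %s\n",
                  vector, smp_processor_id(), esr, esr_get_class_string(esr));

          __show_regs(regs);
          panic("Unhandled exception");
  }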

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-13-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
cbed5f8d3f arm64: entry: move bad_mode() to entry-common.c
In subsequent patches we'll rework the way bad_mode() is called by
exception entry code. In preparation for this, let's move bad_mode()
itself into entry-common.c.

Let's also mark it as noinstr (e.g. to prevent it being kprobed), and
make the `handler` array a local variable, as it is only used by
bad_mode() and will be removed entirely in a subsequent patch.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
064dbfb416 arm64: entry: convert IRQ+FIQ handlers to C
For various reasons we'd like to convert the bulk of arm64's exception
triage logic to C. As a step towards that, this patch converts the EL1
and EL0 IRQ+FIQ triage logic to C.

Separate C functions are added for the native and compat cases so that
in subsequent patches we can handle native/compat differences in C.

Since the triage functions can now call arm64_apply_bp_hardening()
directly, the do_el0_irq_bp_hardening() wrapper function is removed.

Since the user_exit_irqoff macro is now unused, it is removed. The
user_enter_irqoff macro is still used by the ret_to_user code, and
cannot be removed at this time.
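
A heavily simplified sketch of the direction this takes, using the later
mainline el0_interrupt() shape (the code at this point in the series
differs in detail):

  static void noinstr el0_interrupt(struct pt_regs *regs,
                                    void (*handler)(struct pt_regs *))
  {
          enter_from_user_mode(regs);

          write_sysreg(DAIF_PROCCTX_NOIRQ, daif);

          if (regs->pc & BIT(55))
                  arm64_apply_bp_hardening();

          irq_enter_rcu();
          do_interrupt_handler(regs, handler);
          irq_exit_rcu();

          exit_to_user_mode(regs);
  }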

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
101a5b665d arm64: entry: move NMI preempt logic to C
Currently portions of our preempt logic are written in C while other
parts are written in assembly. Let's clean this up a little bit by
moving the NMI preempt checks to C. For now, the preempt count (and
need_resched) checking is left in assembly, and will be converted
with the body of the IRQ handler in subsequent patches.

Other than the increased lockdep coverage there should be no functional
change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
33a3581a76 arm64: entry: move arm64_preempt_schedule_irq to entry-common.c
Subsequent patches will pull more of the IRQ entry handling into C. To
keep this in one place, let's move arm64_preempt_schedule_irq() into
entry-common.c along with the other entry management functions.

We no longer need to include <linux/lockdep.h> in process.c, so the
include directive is removed.

There should be no functional change as a result of this patch.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:55 +01:00
Mark Rutland
bb8e93a287 arm64: entry: convert SError handlers to C
For various reasons we'd like to convert the bulk of arm64's exception
triage logic to C. As a step towards that, this patch converts the EL1
and EL0 SError triage logic to C.

Separate C functions are added for the native and compat cases so that
in subsequent patches we can handle native/compat differences in C.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:54 +01:00
Mark Rutland
f7c706f039 arm64: entry: unmask IRQ+FIQ after EL0 handling
For non-fatal exceptions taken from EL0, we expect that at some point
during exception handling it is possible to return to a regular process
context with all exceptions unmasked (e.g. as we do in
do_notify_resume()), and we generally aim to unmask exceptions wherever
possible.

While handling SError and debug exceptions from EL0, we need to leave
some exceptions masked during handling. Handling SError requires us to
mask SError (which also requires masking IRQ+FIQ), and handling debug
exceptions requires us to mask debug (which also requires masking
SError+IRQ+FIQ).

Once do_serror() or do_debug_exception() has returned, we no longer need
to mask exceptions, and can unmask them all, which is what we did prior
to commit:

  9034f6251572a474 ("arm64: Do not enable IRQs for ct_user_exit")

... where we had to mask IRQs, as context_tracking_user_exit()
expected IRQs to be masked.

Since then, we realised that our context tracking wasn't entirely
correct, and reworked the entry code to fix this. As of commit:

  23529049c6842382 ("arm64: entry: fix non-NMI user<->kernel transitions")

... we replaced the call to context_tracking_user_exit() with a call to
user_exit_irqoff() as part of enter_from_user_mode(), which occurs
earlier, before we run the body of the handler and unmask exceptions in
DAIF.

When we return to userspace, we go via ret_to_user(), which masks
exceptions in DAIF prior to calling user_enter_irqoff() as part of
exit_to_user_mode().

Thus, there's no longer a reason to leave IRQs or FIQs masked at the end
of the EL0 debug or error handlers, as neither the user exit context
tracking nor the user entry context tracking requires this. Let's bring
these into line with other EL0 exception handlers and ensure that IRQ
and FIQ are unmasked in DAIF at some point during the handler.
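
For example, the EL0 debug handler can restore DAIF_PROCCTX once
do_debug_exception() has returned; a hedged sketch (later mainline
shape, details differ slightly at this point in history):

  static void noinstr el0_dbg(struct pt_regs *regs, unsigned long esr)
  {
          /* Debug exceptions (and SError/IRQ/FIQ) stay masked while handling. */
          unsigned long far = read_sysreg(far_el1);

          enter_from_user_mode(regs);
          do_debug_exception(far, esr, regs);
          /* Handler done: nothing requires masking any longer. */
          local_daif_restore(DAIF_PROCCTX);
          exit_to_user_mode(regs);
  }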

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 11:35:54 +01:00
Mark Rutland
4d6a38da8e arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1"
on the command line hits a warning during kernel entry, due to the way
we manipulate the PMR.

Early in the entry sequence, we call lockdep_hardirqs_off() to inform
lockdep that interrupts have been masked (as the HW sets DAIF when
entering an exception). Architecturally PMR_EL1 is not affected by
exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in
the exception entry sequence, so early in exception entry the PMR can
indicate that interrupts are unmasked even though they are masked by
DAIF.

If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that
interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the
exception entry paths, and hence lockdep_hardirqs_off() will WARN() that
something is amiss.

We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during
exception entry so that kernel code sees a consistent environment. We
must also update local_daif_inherit() to undo this, as it currently only
touches DAIF. For other paths, local_daif_restore() will update both
DAIF and the PMR. With this done, we can remove the existing special
cases which set this later in the entry code.
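
The local_daif_inherit() change amounts to restoring the saved PMR
alongside DAIF; roughly (sketch, close to the mainline helper):

  static inline void local_daif_inherit(struct pt_regs *regs)
  {
          unsigned long flags = regs->pstate & DAIF_MASK;

          if (interrupts_enabled(regs))
                  trace_hardirqs_on();

          if (system_uses_irq_prio_masking())
                  gic_write_pmr(regs->pmr_save);

          /*
           * We can't use local_daif_restore(regs->pstate) here as
           * system_has_prio_mask_debugging() won't restore the I bit if it
           * can use the pmr instead.
           */
          write_sysreg(flags, daif);
  }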

We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with
local_daif_save(), as this will warn if it ever encounters
(GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This
matches the gic_prio_kentry_setup that we have to retain for
ret_to_user.

The original splat from Zenghui's report was:

| DEBUG_LOCKS_WARN_ON(!irqs_disabled())
| WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
| Modules linked in:
| CPU: 3 PID: 125 Comm: modprobe Tainted: G        W         5.12.0-rc8+ #463
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : lockdep_hardirqs_off+0xd4/0xe8
| lr : lockdep_hardirqs_off+0xd4/0xe8
| sp : ffff80002a39bad0
| pmr_save: 000000e0
| x29: ffff80002a39bad0 x28: ffff0000de214bc0
| x27: ffff0000de1c0400 x26: 000000000049b328
| x25: 0000000000406f30 x24: ffff0000de1c00a0
| x23: 0000000020400005 x22: ffff8000105f747c
| x21: 0000000096000044 x20: 0000000000498ef9
| x19: ffff80002a39bc88 x18: ffffffffffffffff
| x17: 0000000000000000 x16: ffff800011c61eb0
| x15: ffff800011700a88 x14: 0720072007200720
| x13: 0720072007200720 x12: 0720072007200720
| x11: 0720072007200720 x10: 0720072007200720
| x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
| x7 : ffff8000119f0800 x6 : c0000000ffff7fff
| x5 : ffff8000119f07a8 x4 : 0000000000000001
| x3 : 9bcdab23f2432800 x2 : ffff800011730538
| x1 : 9bcdab23f2432800 x0 : 0000000000000000
| Call trace:
|  lockdep_hardirqs_off+0xd4/0xe8
|  enter_from_kernel_mode.isra.5+0x7c/0xa8
|  el1_abort+0x24/0x100
|  el1_sync_handler+0x80/0xd0
|  el1_sync+0x6c/0x100
|  __arch_clear_user+0xc/0x90
|  load_elf_binary+0x9fc/0x1450
|  bprm_execve+0x404/0x880
|  kernel_execve+0x180/0x188
|  call_usermodehelper_exec_async+0xdc/0x158
|  ret_from_fork+0x10/0x18

Fixes: 23529049c684 ("arm64: entry: fix non-NMI user<->kernel transitions")
Fixes: 7cd1ea1010ac ("arm64: entry: fix non-NMI kernel<->kernel transitions")
Fixes: f0cd5ac1e4c5 ("arm64: entry: fix NMI {user, kernel}->kernel transitions")
Fixes: 2a9b3e6ac69a ("arm64: entry: fix EL1 debug transitions")
Link: https://lore.kernel.org/r/f4012761-026f-4e51-3a0c-7524e434e8b3@huawei.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210428111555.50880-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-05-05 18:13:58 +01:00
Vincenzo Frascino
65812c6921 arm64: mte: Enable async tag check fault
MTE provides a mode that asynchronously updates the TFSR_EL1 register
when a tag check exception is detected.

To take advantage of this mode the kernel has to verify the status of
the register at:
  1. Context switching
  2. Return to user/EL0 (not required on entry from EL0 since the kernel
  did not run)
  3. Kernel entry from EL1
  4. Kernel exit to EL1

If the register is non-zero a trace is reported.

Add the required features for EL1 detection and reporting.

Note: the ITFSB bit is set in the SCTLR_EL1 register, hence it guarantees
that the indirect writes to TFSR_EL1 are synchronized at exception entry
to EL1. On the context switch path the synchronization is guaranteed by
the dsb() in __switch_to().
The dsb(nsh) in mte_check_tfsr_exit() is provisional pending
confirmation by the architects.
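
A hedged sketch of the check and the exit-side synchronization (names as
in mainline; details may differ from this patch):

  void mte_check_tfsr_el1(void)
  {
          u64 tfsr_el1;

          if (!system_supports_mte())
                  return;

          tfsr_el1 = read_sysreg_s(SYS_TFSR_EL1);

          if (unlikely(tfsr_el1 & SYS_TFSR_EL1_TF1)) {
                  /* Clear the register so the fault is reported only once. */
                  write_sysreg_s(0, SYS_TFSR_EL1);
                  kasan_report_async();
          }
  }

  static inline void mte_check_tfsr_exit(void)
  {
          if (!system_supports_mte())
                  return;

          /*
           * Indirect writes to TFSR_EL1 are synchronized on entry to EL1 by
           * ITFSB; on the exit path an explicit dsb(nsh) is needed (see the
           * provisional note above).
           */
          dsb(nsh);
          isb();

          mte_check_tfsr_el1();
  }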

Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210315132019.33202-8-vincenzo.frascino@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-11 10:56:40 +01:00
Mark Rutland
6459b84697 arm64: entry: consolidate Cortex-A76 erratum 1463225 workaround
The workaround for Cortex-A76 erratum 1463225 is split across the
syscall and debug handlers in separate files. This structure currently
forces us to do some redundant work for debug exceptions from EL0, is a
little difficult to follow, and gets in the way of some future rework of
the exception entry code as it requires exceptions to be unmasked late
in the syscall handling path.

To simplify things, and as a preparatory step for future rework of
exception entry, this patch moves all the workaround logic into
entry-common.c. As the debug handler only needs to run for EL1 debug
exceptions, we no longer call it for EL0 debug exceptions, and no longer
need to check user_mode(regs) as this is always false. For clarity
cortex_a76_erratum_1463225_debug_handler() is changed to return bool.

In the SVC path, the workaround is applied earlier, but this should have
no functional impact as exceptions are still masked. In the debug path
we run the fixup before explicitly disabling preemption, but we will not
attempt to preempt before returning from the exception.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210202120341.28858-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-08 17:39:02 +00:00
Catalin Marinas
d889797530 Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/core
* arm64/for-next/fixes: (26 commits)
  arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
  arm64: mte: Fix typo in macro definition
  arm64: entry: fix EL1 debug transitions
  arm64: entry: fix NMI {user, kernel}->kernel transitions
  arm64: entry: fix non-NMI kernel<->kernel transitions
  arm64: ptrace: prepare for EL1 irq/rcu tracking
  arm64: entry: fix non-NMI user<->kernel transitions
  arm64: entry: move el1 irq/nmi logic to C
  arm64: entry: prepare ret_to_user for function call
  arm64: entry: move enter_from_user_mode to entry-common.c
  arm64: entry: mark entry code as noinstr
  arm64: mark idle code as noinstr
  arm64: syscall: exit userspace before unmasking exceptions
  arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
  arm64: pgtable: Fix pte_accessible()
  ACPI/IORT: Fix doc warnings in iort.c
  arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build
  arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver
  arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list
  arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist
  ...

# Conflicts:
#	arch/arm64/include/asm/exception.h
#	arch/arm64/kernel/sdei.c
2020-12-09 18:04:55 +00:00
Mark Rutland
2a9b3e6ac6 arm64: entry: fix EL1 debug transitions
In debug_exception_enter() and debug_exception_exit() we trace hardirqs
on/off while RCU isn't guaranteed to be watching, and we don't save and
restore the hardirq state, and so may return with this having changed.

Handle this appropriately with new entry/exit helpers which do the bare
minimum to ensure this is appropriately maintained, without marking
debug exceptions as NMIs. These are placed in entry-common.c with the
other entry/exit helpers.
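
A hedged sketch of the minimal helpers this adds (close to the mainline
versions; exact field names may differ):

  static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
  {
          /* Record whether hardirqs were traced on, so exit can restore it. */
          regs->lockdep_hardirqs = lockdep_hardirqs_enabled();

          lockdep_hardirqs_off(CALLER_ADDR0);
          rcu_nmi_enter();

          trace_hardirqs_off_finish();
  }

  static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
  {
          bool restore = regs->lockdep_hardirqs;

          if (restore) {
                  trace_hardirqs_on_prepare();
                  lockdep_hardirqs_on_prepare(CALLER_ADDR0);
          }

          rcu_nmi_exit();
          if (restore)
                  lockdep_hardirqs_on(CALLER_ADDR0);
  }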

In future we'll want to reconsider whether some debug exceptions should
be NMIs, but this will require a significant refactoring, and for now
this should prevent issues with lockdep and RCU.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30 12:11:38 +00:00