mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-06 05:02:31 +00:00
407f1d8c1b
Patch series "kfence: optimize timer scheduling", v2.

We have observed that mostly-idle systems with KFENCE enabled wake up
otherwise idle CPUs, preventing them from entering a lower power state.
Debugging revealed that KFENCE spends too much active time in
toggle_allocation_gate().

The first version of KFENCE used all the right bits to be
scheduling-optimal, and thus power-efficient, by simply using wait_event()
+ wake_up(). That code was unfortunately removed.

As KFENCE was exposed to various different configs and tests, the
scheduling-optimal code slowly disappeared: first because of hung task
warnings, and finally because of deadlocks when an allocation is made by
timer code with debug objects enabled. Clearly, the "fixes" were not too
friendly for devices that want to be power-efficient.

Therefore, let's try a little harder to fix the hung task and deadlock
problems that we have with wait_event() + wake_up(), while remaining as
scheduling-friendly and power-efficient as possible.

Crucially, we need to defer the wake_up() to an irq_work, avoiding any
potential for deadlock.

The result with this series is that on the devices where we observed a
power regression, power usage returns to baseline levels.

This patch (of 3):

On mostly-idle systems, we have observed that toggle_allocation_gate() is
a cause of frequent wake-ups, preventing an otherwise idle CPU from going
into a lower power state.

A late change in KFENCE's development, due to a potential deadlock [1],
required changing the scheduling-friendly wait_event_timeout() and
wake_up() to an open-coded wait-loop using schedule_timeout().

[1] https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com

To avoid unnecessary wake-ups, switch to using wait_event_timeout().

Unfortunately, we still cannot use a version with direct wake_up() in
__kfence_alloc() due to the same potential for deadlock as in [1].
Instead, add a level of indirection via an irq_work that is scheduled if
we determine that the kfence_timer requires a wake_up().

Link: https://lkml.kernel.org/r/20210421105132.3965998-1-elver@google.com
Link: https://lkml.kernel.org/r/20210421105132.3965998-2-elver@google.com
Fixes: 0ce20dd840 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
84 lines
3.0 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only

config HAVE_ARCH_KFENCE
	bool

menuconfig KFENCE
	bool "KFENCE: low-overhead sampling-based memory safety error detector"
	depends on HAVE_ARCH_KFENCE && (SLAB || SLUB)
	select STACKTRACE
	select IRQ_WORK
	help
	  KFENCE is a low-overhead sampling-based detector of heap out-of-bounds
	  access, use-after-free, and invalid-free errors. KFENCE is designed
	  to have negligible cost to permit enabling it in production
	  environments.

	  See <file:Documentation/dev-tools/kfence.rst> for more details.

	  Note that KFENCE is not a substitute for explicit testing with tools
	  such as KASAN. KFENCE can detect a subset of bugs that KASAN can
	  detect, albeit at very different performance profiles. If you can
	  afford to use KASAN, continue using KASAN, for example in test
	  environments. If your kernel targets production use, and cannot
	  enable KASAN due to its cost, consider using KFENCE.

if KFENCE

config KFENCE_STATIC_KEYS
	bool "Use static keys to set up allocations"
	default y
	depends on JUMP_LABEL # To ensure performance, require jump labels
	help
	  Use static keys (static branches) to set up KFENCE allocations. Using
	  static keys is normally recommended, because it avoids a dynamic
	  branch in the allocator's fast path. However, with very low sample
	  intervals, or on systems that do not support jump labels, a dynamic
	  branch may still be an acceptable performance trade-off.

config KFENCE_SAMPLE_INTERVAL
	int "Default sample interval in milliseconds"
	default 100
	help
	  The KFENCE sample interval determines the frequency with which heap
	  allocations will be guarded by KFENCE. May be overridden via boot
	  parameter "kfence.sample_interval".

	  Set this to 0 to disable KFENCE by default, in which case only
	  setting "kfence.sample_interval" to a non-zero value enables KFENCE.

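#
# Illustrative sketch for KFENCE_SAMPLE_INTERVAL above; this comment is not
# part of the upstream file, and the value 500 is an arbitrary assumption
# chosen only for the example. The build-time default would be set in
# .config as:
#
#   CONFIG_KFENCE_SAMPLE_INTERVAL=500
#
# and could be overridden at boot by appending the parameter named in the
# help text to the kernel command line:
#
#   kfence.sample_interval=500
#
# Both values are in milliseconds. Per the help text, a build-time value of
# 0 leaves KFENCE disabled unless the boot parameter is set to a non-zero
# value.
#
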
config KFENCE_NUM_OBJECTS
	int "Number of guarded objects available"
	range 1 65535
	default 255
	help
	  The number of guarded objects available. For each KFENCE object, 2
	  pages are required: one containing the object and one used as a
	  guard page. Guard pages are shared between neighbouring objects, so
	  that every object page has a guard page on either side.

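#
# Worked example for KFENCE_NUM_OBJECTS above; this comment is illustrative
# and not part of the upstream file. Documentation/dev-tools/kfence.rst
# gives the size of the KFENCE memory pool as
#
#   (number of objects + 1) * 2 * PAGE_SIZE
#
# Assuming a page size of 4 KiB (an assumption; it depends on the
# architecture and configuration) and the default of 255 objects:
#
#   (255 + 1) * 2 * 4096 bytes = 2097152 bytes = 2 MiB
#
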
config KFENCE_STRESS_TEST_FAULTS
	int "Stress testing of fault handling and error reporting" if EXPERT
	default 0
	help
	  The inverse probability with which to randomly protect KFENCE object
	  pages, resulting in spurious use-after-frees. The main purpose of
	  this option is to stress test KFENCE with concurrent error reports
	  and allocations/frees. A value of 0 disables stress testing logic.

	  Only for KFENCE testing; set to 0 if you are not a KFENCE developer.

config KFENCE_KUNIT_TEST
	tristate "KFENCE integration test suite" if !KUNIT_ALL_TESTS
	default KUNIT_ALL_TESTS
	depends on TRACEPOINTS && KUNIT
	help
	  Test suite for KFENCE, testing various error detection scenarios with
	  various allocation types, and checking that reports are correctly
	  output to console.

	  Say Y here if you want the test to be built into the kernel and run
	  during boot; say M if you want the test to build as a module; say N
	  if you are unsure.

endif # KFENCE