mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-15 02:05:33 +00:00
fa48e8d2c7
Add Documentation/scheduler/sched-ext.rst which gives a high-level overview and pointers to the examples. v6: - Add paragraph explaining debug dump. v5: - Updated to reflect /sys/kernel interface change. Kconfig options added. v4: - README improved, reformatted in markdown and renamed to README.md. v3: - Added tools/sched_ext/README. - Dropped _example prefix from scheduler names. v2: - Apply minor edits suggested by Bagas. Caveats section dropped as all of them are addressed. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: David Vernet <dvernet@meta.com> Acked-by: Josh Don <joshdon@google.com> Acked-by: Hao Luo <haoluo@google.com> Acked-by: Barret Rhoden <brho@google.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com>
161 lines
5.8 KiB
Plaintext
161 lines
5.8 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
|
|
|
|
config PREEMPT_NONE_BUILD
|
|
bool
|
|
|
|
config PREEMPT_VOLUNTARY_BUILD
|
|
bool
|
|
|
|
config PREEMPT_BUILD
|
|
bool
|
|
select PREEMPTION
|
|
select UNINLINE_SPIN_UNLOCK if !ARCH_INLINE_SPIN_UNLOCK
|
|
|
|
choice
|
|
prompt "Preemption Model"
|
|
default PREEMPT_NONE
|
|
|
|
config PREEMPT_NONE
|
|
bool "No Forced Preemption (Server)"
|
|
select PREEMPT_NONE_BUILD if !PREEMPT_DYNAMIC
|
|
help
|
|
This is the traditional Linux preemption model, geared towards
|
|
throughput. It will still provide good latencies most of the
|
|
time, but there are no guarantees and occasional longer delays
|
|
are possible.
|
|
|
|
Select this option if you are building a kernel for a server or
|
|
scientific/computation system, or if you want to maximize the
|
|
raw processing power of the kernel, irrespective of scheduling
|
|
latencies.
|
|
|
|
config PREEMPT_VOLUNTARY
|
|
bool "Voluntary Kernel Preemption (Desktop)"
|
|
depends on !ARCH_NO_PREEMPT
|
|
select PREEMPT_VOLUNTARY_BUILD if !PREEMPT_DYNAMIC
|
|
help
|
|
This option reduces the latency of the kernel by adding more
|
|
"explicit preemption points" to the kernel code. These new
|
|
preemption points have been selected to reduce the maximum
|
|
latency of rescheduling, providing faster application reactions,
|
|
at the cost of slightly lower throughput.
|
|
|
|
This allows reaction to interactive events by allowing a
|
|
low priority process to voluntarily preempt itself even if it
|
|
is in kernel mode executing a system call. This allows
|
|
applications to run more 'smoothly' even when the system is
|
|
under load.
|
|
|
|
Select this if you are building a kernel for a desktop system.
|
|
|
|
config PREEMPT
|
|
bool "Preemptible Kernel (Low-Latency Desktop)"
|
|
depends on !ARCH_NO_PREEMPT
|
|
select PREEMPT_BUILD
|
|
help
|
|
This option reduces the latency of the kernel by making
|
|
all kernel code (that is not executing in a critical section)
|
|
preemptible. This allows reaction to interactive events by
|
|
permitting a low priority process to be preempted involuntarily
|
|
even if it is in kernel mode executing a system call and would
|
|
otherwise not be about to reach a natural preemption point.
|
|
This allows applications to run more 'smoothly' even when the
|
|
system is under load, at the cost of slightly lower throughput
|
|
and a slight runtime overhead to kernel code.
|
|
|
|
Select this if you are building a kernel for a desktop or
|
|
embedded system with latency requirements in the milliseconds
|
|
range.
|
|
|
|
config PREEMPT_RT
|
|
bool "Fully Preemptible Kernel (Real-Time)"
|
|
depends on EXPERT && ARCH_SUPPORTS_RT
|
|
select PREEMPTION
|
|
help
|
|
This option turns the kernel into a real-time kernel by replacing
|
|
various locking primitives (spinlocks, rwlocks, etc.) with
|
|
preemptible priority-inheritance aware variants, enforcing
|
|
interrupt threading and introducing mechanisms to break up long
|
|
non-preemptible sections. This makes the kernel, except for very
|
|
low level and critical code paths (entry code, scheduler, low
|
|
level interrupt handling) fully preemptible and brings most
|
|
execution contexts under scheduler control.
|
|
|
|
Select this if you are building a kernel for systems which
|
|
require real-time guarantees.
|
|
|
|
endchoice
|
|
|
|
config PREEMPT_COUNT
|
|
bool
|
|
|
|
config PREEMPTION
|
|
bool
|
|
select PREEMPT_COUNT
|
|
|
|
config PREEMPT_DYNAMIC
|
|
bool "Preemption behaviour defined on boot"
|
|
depends on HAVE_PREEMPT_DYNAMIC && !PREEMPT_RT
|
|
select JUMP_LABEL if HAVE_PREEMPT_DYNAMIC_KEY
|
|
select PREEMPT_BUILD
|
|
default y if HAVE_PREEMPT_DYNAMIC_CALL
|
|
help
|
|
This option allows to define the preemption model on the kernel
|
|
command line parameter and thus override the default preemption
|
|
model defined during compile time.
|
|
|
|
The feature is primarily interesting for Linux distributions which
|
|
provide a pre-built kernel binary to reduce the number of kernel
|
|
flavors they offer while still offering different usecases.
|
|
|
|
The runtime overhead is negligible with HAVE_STATIC_CALL_INLINE enabled
|
|
but if runtime patching is not available for the specific architecture
|
|
then the potential overhead should be considered.
|
|
|
|
Interesting if you want the same pre-built kernel should be used for
|
|
both Server and Desktop workloads.
|
|
|
|
config SCHED_CORE
|
|
bool "Core Scheduling for SMT"
|
|
depends on SCHED_SMT
|
|
help
|
|
This option permits Core Scheduling, a means of coordinated task
|
|
selection across SMT siblings. When enabled -- see
|
|
prctl(PR_SCHED_CORE) -- task selection ensures that all SMT siblings
|
|
will execute a task from the same 'core group', forcing idle when no
|
|
matching task is found.
|
|
|
|
Use of this feature includes:
|
|
- mitigation of some (not all) SMT side channels;
|
|
- limiting SMT interference to improve determinism and/or performance.
|
|
|
|
SCHED_CORE is default disabled. When it is enabled and unused,
|
|
which is the likely usage by Linux distributions, there should
|
|
be no measurable impact on performance.
|
|
|
|
config SCHED_CLASS_EXT
|
|
bool "Extensible Scheduling Class"
|
|
depends on BPF_SYSCALL && BPF_JIT
|
|
help
|
|
This option enables a new scheduler class sched_ext (SCX), which
|
|
allows scheduling policies to be implemented as BPF programs to
|
|
achieve the following:
|
|
|
|
- Ease of experimentation and exploration: Enabling rapid
|
|
iteration of new scheduling policies.
|
|
- Customization: Building application-specific schedulers which
|
|
implement policies that are not applicable to general-purpose
|
|
schedulers.
|
|
- Rapid scheduler deployments: Non-disruptive swap outs of
|
|
scheduling policies in production environments.
|
|
|
|
sched_ext leverages BPF struct_ops feature to define a structure
|
|
which exports function callbacks and flags to BPF programs that
|
|
wish to implement scheduling policies. The struct_ops structure
|
|
exported by sched_ext is struct sched_ext_ops, and is conceptually
|
|
similar to struct sched_class.
|
|
|
|
For more information:
|
|
Documentation/scheduler/sched-ext.rst
|
|
https://github.com/sched-ext/scx
|