mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-03 19:55:31 +00:00
7b0888b7cc
The core-sched support is composed of the following parts: - task_struct->scx.core_sched_at is added. This is a timestamp which can be used to order tasks. Depending on whether the BPF scheduler implements custom ordering, it tracks either global FIFO ordering of all tasks or local-DSQ ordering within the dispatched tasks on a CPU. - prio_less() is updated to call scx_prio_less() when comparing SCX tasks. scx_prio_less() calls ops.core_sched_before() if available or uses the core_sched_at timestamp. For global FIFO ordering, the BPF scheduler doesn't need to do anything. Otherwise, it should implement ops.core_sched_before() which reflects the ordering. - When core-sched is enabled, balance_scx() balances all SMT siblings so that they all have tasks dispatched if necessary before pick_task_scx() is called. pick_task_scx() picks between the current task and the first dispatched task on the local DSQ based on availability and the core_sched_at timestamps. Note that FIFO ordering is expected among the already dispatched tasks whether running or on the local DSQ, so this path always compares core_sched_at instead of calling into ops.core_sched_before(). qmap_core_sched_before() is added to scx_qmap. It scales the distances from the heads of the queues to compare the tasks across different priority queues and seems to behave as expected. v3: Fixed build error when !CONFIG_SCHED_SMT reported by Andrea Righi. v2: Sched core added the const qualifiers to prio_less task arguments. Explicitly drop them for ops.core_sched_before() task arguments. BPF enforces access control through the verifier, so the qualifier isn't actually operative and only gets in the way when interacting with various helpers. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: David Vernet <dvernet@meta.com> Reviewed-by: Josh Don <joshdon@google.com> Cc: Andrea Righi <andrea.righi@canonical.com>
160 lines
5.7 KiB
Plaintext
160 lines
5.7 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
|
|
|
|
config PREEMPT_NONE_BUILD
|
|
bool
|
|
|
|
config PREEMPT_VOLUNTARY_BUILD
|
|
bool
|
|
|
|
config PREEMPT_BUILD
|
|
bool
|
|
select PREEMPTION
|
|
select UNINLINE_SPIN_UNLOCK if !ARCH_INLINE_SPIN_UNLOCK
|
|
|
|
choice
|
|
prompt "Preemption Model"
|
|
default PREEMPT_NONE
|
|
|
|
config PREEMPT_NONE
|
|
bool "No Forced Preemption (Server)"
|
|
select PREEMPT_NONE_BUILD if !PREEMPT_DYNAMIC
|
|
help
|
|
This is the traditional Linux preemption model, geared towards
|
|
throughput. It will still provide good latencies most of the
|
|
time, but there are no guarantees and occasional longer delays
|
|
are possible.
|
|
|
|
Select this option if you are building a kernel for a server or
|
|
scientific/computation system, or if you want to maximize the
|
|
raw processing power of the kernel, irrespective of scheduling
|
|
latencies.
|
|
|
|
config PREEMPT_VOLUNTARY
|
|
bool "Voluntary Kernel Preemption (Desktop)"
|
|
depends on !ARCH_NO_PREEMPT
|
|
select PREEMPT_VOLUNTARY_BUILD if !PREEMPT_DYNAMIC
|
|
help
|
|
This option reduces the latency of the kernel by adding more
|
|
"explicit preemption points" to the kernel code. These new
|
|
preemption points have been selected to reduce the maximum
|
|
latency of rescheduling, providing faster application reactions,
|
|
at the cost of slightly lower throughput.
|
|
|
|
This allows reaction to interactive events by allowing a
|
|
low priority process to voluntarily preempt itself even if it
|
|
is in kernel mode executing a system call. This allows
|
|
applications to run more 'smoothly' even when the system is
|
|
under load.
|
|
|
|
Select this if you are building a kernel for a desktop system.
|
|
|
|
config PREEMPT
|
|
bool "Preemptible Kernel (Low-Latency Desktop)"
|
|
depends on !ARCH_NO_PREEMPT
|
|
select PREEMPT_BUILD
|
|
help
|
|
This option reduces the latency of the kernel by making
|
|
all kernel code (that is not executing in a critical section)
|
|
preemptible. This allows reaction to interactive events by
|
|
permitting a low priority process to be preempted involuntarily
|
|
even if it is in kernel mode executing a system call and would
|
|
otherwise not be about to reach a natural preemption point.
|
|
This allows applications to run more 'smoothly' even when the
|
|
system is under load, at the cost of slightly lower throughput
|
|
and a slight runtime overhead to kernel code.
|
|
|
|
Select this if you are building a kernel for a desktop or
|
|
embedded system with latency requirements in the milliseconds
|
|
range.
|
|
|
|
config PREEMPT_RT
|
|
bool "Fully Preemptible Kernel (Real-Time)"
|
|
depends on EXPERT && ARCH_SUPPORTS_RT
|
|
select PREEMPTION
|
|
help
|
|
This option turns the kernel into a real-time kernel by replacing
|
|
various locking primitives (spinlocks, rwlocks, etc.) with
|
|
preemptible priority-inheritance aware variants, enforcing
|
|
interrupt threading and introducing mechanisms to break up long
|
|
non-preemptible sections. This makes the kernel, except for very
|
|
low level and critical code paths (entry code, scheduler, low
|
|
level interrupt handling) fully preemptible and brings most
|
|
execution contexts under scheduler control.
|
|
|
|
Select this if you are building a kernel for systems which
|
|
require real-time guarantees.
|
|
|
|
endchoice
|
|
|
|
config PREEMPT_COUNT
|
|
bool
|
|
|
|
config PREEMPTION
|
|
bool
|
|
select PREEMPT_COUNT
|
|
|
|
config PREEMPT_DYNAMIC
|
|
bool "Preemption behaviour defined on boot"
|
|
depends on HAVE_PREEMPT_DYNAMIC && !PREEMPT_RT
|
|
select JUMP_LABEL if HAVE_PREEMPT_DYNAMIC_KEY
|
|
select PREEMPT_BUILD
|
|
default y if HAVE_PREEMPT_DYNAMIC_CALL
|
|
help
|
|
This option allows to define the preemption model on the kernel
|
|
command line parameter and thus override the default preemption
|
|
model defined during compile time.
|
|
|
|
The feature is primarily interesting for Linux distributions which
|
|
provide a pre-built kernel binary to reduce the number of kernel
|
|
flavors they offer while still offering different usecases.
|
|
|
|
The runtime overhead is negligible with HAVE_STATIC_CALL_INLINE enabled
|
|
but if runtime patching is not available for the specific architecture
|
|
then the potential overhead should be considered.
|
|
|
|
Interesting if you want the same pre-built kernel should be used for
|
|
both Server and Desktop workloads.
|
|
|
|
config SCHED_CORE
|
|
bool "Core Scheduling for SMT"
|
|
depends on SCHED_SMT
|
|
help
|
|
This option permits Core Scheduling, a means of coordinated task
|
|
selection across SMT siblings. When enabled -- see
|
|
prctl(PR_SCHED_CORE) -- task selection ensures that all SMT siblings
|
|
will execute a task from the same 'core group', forcing idle when no
|
|
matching task is found.
|
|
|
|
Use of this feature includes:
|
|
- mitigation of some (not all) SMT side channels;
|
|
- limiting SMT interference to improve determinism and/or performance.
|
|
|
|
SCHED_CORE is default disabled. When it is enabled and unused,
|
|
which is the likely usage by Linux distributions, there should
|
|
be no measurable impact on performance.
|
|
|
|
config SCHED_CLASS_EXT
|
|
bool "Extensible Scheduling Class"
|
|
depends on BPF_SYSCALL && BPF_JIT
|
|
help
|
|
This option enables a new scheduler class sched_ext (SCX), which
|
|
allows scheduling policies to be implemented as BPF programs to
|
|
achieve the following:
|
|
|
|
- Ease of experimentation and exploration: Enabling rapid
|
|
iteration of new scheduling policies.
|
|
- Customization: Building application-specific schedulers which
|
|
implement policies that are not applicable to general-purpose
|
|
schedulers.
|
|
- Rapid scheduler deployments: Non-disruptive swap outs of
|
|
scheduling policies in production environments.
|
|
|
|
sched_ext leverages BPF struct_ops feature to define a structure
|
|
which exports function callbacks and flags to BPF programs that
|
|
wish to implement scheduling policies. The struct_ops structure
|
|
exported by sched_ext is struct sched_ext_ops, and is conceptually
|
|
similar to struct sched_class.
|
|
|
|
For more information:
|
|
https://github.com/sched-ext/scx
|