Linux kernel source tree
Go to file
Tejun Heo a4103eacc2 sched_ext: Add a cgroup scheduler which uses flattened hierarchy
This patch adds scx_flatcg example scheduler which implements hierarchical
weight-based cgroup CPU control by flattening the cgroup hierarchy into a
single layer by compounding the active weight share at each level.

This flattening of hierarchy can bring a substantial performance gain when
the cgroup hierarchy is nested multiple levels. in a simple benchmark using
wrk[8] on apache serving a CGI script calculating sha1sum of a small file,
it outperforms CFS by ~3% with CPU controller disabled and by ~10% with two
apache instances competing with 2:1 weight ratio nested four level deep.

However, the gain comes at the cost of not being able to properly handle
thundering herd of cgroups. For example, if many cgroups which are nested
behind a low priority parent cgroup wake up around the same time, they may
be able to consume more CPU cycles than they are entitled to. In many use
cases, this isn't a real concern especially given the performance gain.
Also, there are ways to mitigate the problem further by e.g. introducing an
extra scheduling layer on cgroup delegation boundaries.

v5: - Updated to specify SCX_OPS_HAS_CGROUP_WEIGHT instead of
      SCX_OPS_KNOB_CGROUP_WEIGHT.

v4: - Revert reference counted kptr for cgv_node as the change caused easily
      reproducible stalls.

v3: - Updated to reflect the core API changes including ops.init/exit_task()
      and direct dispatch from ops.select_cpu(). Fixes and improvements
      including additional statistics.

    - Use reference counted kptr for cgv_node instead of xchg'ing against
      stash location.

    - Dropped '-p' option.

v2: - Use SCX_BUG[_ON]() to simplify error handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
2024-09-04 10:24:59 -10:00
arch minmax: add a few more MIN_T/MAX_T users 2024-07-28 13:41:14 -07:00
block block: fix deadlock between sd_remove & sd_release 2024-07-24 09:51:21 -06:00
certs kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
crypto crypto: testmgr - generate power-of-2 lengths more often 2024-07-13 11:50:28 +12:00
Documentation Linux 6.11-rc1 2024-07-30 09:30:11 -10:00
drivers Linux 6.11-rc1 2024-07-30 09:30:11 -10:00
fs sched/rt: Rename realtime_{prio, task}() to rt_or_dl_{prio, task}() 2024-08-07 18:32:38 +02:00
include sched_ext: Add cgroup support 2024-09-04 10:24:59 -10:00
init sched_ext: Add cgroup support 2024-09-04 10:24:59 -10:00
io_uring io_uring/napi: pass ktime to io_napi_adjust_timeout 2024-07-26 08:31:59 -06:00
ipc sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
kernel sched_ext: Add cgroup support 2024-09-04 10:24:59 -10:00
lib Linux 6.11-rc1 2024-07-30 09:30:11 -10:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm sched/rt: Rename realtime_{prio, task}() to rt_or_dl_{prio, task}() 2024-08-07 18:32:38 +02:00
net minmax: add a few more MIN_T/MAX_T users 2024-07-28 13:41:14 -07:00
rust Rust changes for v6.11 2024-07-27 13:44:54 -07:00
samples Driver core changes for 6.11-rc1 2024-07-25 10:42:22 -07:00
scripts Kbuild fixes for v6.11 2024-07-28 14:02:48 -07:00
security apparmor-pr-2024-07-24 PR 2024-07-25 2024-07-27 13:28:39 -07:00
sound Devicetree fixes for 6.11, part 1 2024-07-27 12:46:16 -07:00
tools sched_ext: Add a cgroup scheduler which uses flattened hierarchy 2024-09-04 10:24:59 -10:00
usr initramfs: shorten cmd_initfs in usr/Makefile 2024-07-16 01:07:52 +09:00
virt KVM generic changes for 6.11 2024-07-16 09:51:36 -04:00
.clang-format Docs: Move clang-format from process/ to dev-tools/ 2024-06-26 16:36:00 -06:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: add script and target to generate pacman package 2024-07-22 01:24:22 +09:00
.mailmap MAINTAINERS: mailmap: update James Clark's email address 2024-07-26 14:32:35 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS tracing: Update of MAINTAINERS and CREDITS file 2024-07-18 14:08:42 -07:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Linux 6.11-rc1 2024-07-30 09:30:11 -10:00
Makefile Linux 6.11-rc1 2024-07-28 14:19:55 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.