linux/tools
Tejun Heo a4103eacc2 sched_ext: Add a cgroup scheduler which uses flattened hierarchy
This patch adds scx_flatcg example scheduler which implements hierarchical
weight-based cgroup CPU control by flattening the cgroup hierarchy into a
single layer by compounding the active weight share at each level.

This flattening of hierarchy can bring a substantial performance gain when
the cgroup hierarchy is nested multiple levels. in a simple benchmark using
wrk[8] on apache serving a CGI script calculating sha1sum of a small file,
it outperforms CFS by ~3% with CPU controller disabled and by ~10% with two
apache instances competing with 2:1 weight ratio nested four level deep.

However, the gain comes at the cost of not being able to properly handle
thundering herd of cgroups. For example, if many cgroups which are nested
behind a low priority parent cgroup wake up around the same time, they may
be able to consume more CPU cycles than they are entitled to. In many use
cases, this isn't a real concern especially given the performance gain.
Also, there are ways to mitigate the problem further by e.g. introducing an
extra scheduling layer on cgroup delegation boundaries.

v5: - Updated to specify SCX_OPS_HAS_CGROUP_WEIGHT instead of
      SCX_OPS_KNOB_CGROUP_WEIGHT.

v4: - Revert reference counted kptr for cgv_node as the change caused easily
      reproducible stalls.

v3: - Updated to reflect the core API changes including ops.init/exit_task()
      and direct dispatch from ops.select_cpu(). Fixes and improvements
      including additional statistics.

    - Use reference counted kptr for cgv_node instead of xchg'ing against
      stash location.

    - Dropped '-p' option.

v2: - Use SCX_BUG[_ON]() to simplify error handling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
2024-09-04 10:24:59 -10:00
..
accounting
arch asm-generic updates for 6.11 2024-07-16 12:09:03 -07:00
bootconfig
bpf tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids 2024-07-22 16:35:30 +02:00
build perf tools fixes for v6.11 2024-07-23 18:15:51 -07:00
certs
cgroup proc: rewrite stable_page_flags() 2024-04-25 20:56:15 -07:00
counter
crypto crypto: ccp - Update return values for some unit tests 2024-02-24 08:41:20 +08:00
debugging
firewire
firmware
gpio gpio: add sloppy logic analyzer using polling 2024-07-01 10:54:11 +02:00
hv tools: hv: suppress the invalid warning for packed member alignment 2024-05-28 05:27:35 +00:00
iio
include bitmap-6.11-rc1 2024-07-26 09:50:36 -07:00
kvm/kvm_stat
laptop
leds
lib bitmap-6.11-rc1 2024-07-26 09:50:36 -07:00
memory-model kcsan: Add __data_racy documentation and module description 2024-07-15 15:44:40 -07:00
mm tools/mm: introduce a tool to assess swap entry allocation for thp_swapout 2024-07-10 12:14:51 -07:00
net/ynl tools: ynl: use ident name for Family, too. 2024-07-03 19:13:20 -07:00
objtool - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN 2024-07-21 17:15:46 -07:00
pci
pcmcia
perf perf tools fixes for v6.11 2024-07-23 18:15:51 -07:00
power turbostat release 2024.07.26 2024-07-28 10:52:15 -07:00
rcu tools/rcu: Add rcu-updaters.sh script 2024-06-06 11:44:42 -07:00
sched_ext sched_ext: Add a cgroup scheduler which uses flattened hierarchy 2024-09-04 10:24:59 -10:00
scripts treewide: remove meaningless assignments in Makefiles 2024-02-23 14:19:07 -08:00
sound ASoC: dapm-graph: new tool to visualize DAPM state 2024-04-21 09:58:17 +09:00
spi
testing sched_ext: Add cgroup support 2024-09-04 10:24:59 -10:00
thermal
time
tracing perf tools fixes for v6.11 2024-07-23 18:15:51 -07:00
usb
verification tools/verification: Use pkg-config in lib_setup of Makefile.config 2024-07-17 13:14:51 -07:00
virtio tools/virtio: creating pipe assertion in vringh_test 2024-07-04 11:00:31 -04:00
wmi
workqueue workqueue: remove unnecessary import and function in wq_monitor.py 2024-03-25 10:11:32 -10:00
writeback writeback: add wb_monitor.py script to monitor writeback info on bdi 2024-05-05 17:53:51 -07:00
Makefile sched_ext: Add scx_simple and scx_example_qmap example schedulers 2024-06-18 10:09:17 -10:00