linux-next/tools
Tejun Heo 616db8779b workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE
If a per-cpu work item hogs the CPU, it can prevent other work items from
starting through concurrency management. A per-cpu workqueue which intends
to host such CPU-hogging work items can choose to not participate in
concurrency management by setting %WQ_CPU_INTENSIVE; however, this can be
error-prone and difficult to debug when missed.

This patch adds an automatic CPU usage based detection. If a
concurrency-managed work item consumes more CPU time than the threshold
(10ms by default) continuously without intervening sleeps, wq_worker_tick()
which is called from scheduler_tick() will detect the condition and
automatically mark it CPU_INTENSIVE.

The mechanism isn't foolproof:

* Detection depends on tick hitting the work item. Getting preempted at the
  right timings may allow a violating work item to evade detection at least
  temporarily.

* nohz_full CPUs may not be running ticks and thus can fail detection.

* Even when detection is working, the 10ms detection delays can add up if
  many CPU-hogging work items are queued at the same time.

However, in vast majority of cases, this should be able to detect violations
reliably and provide reasonable protection with a small increase in code
complexity.

If some work items trigger this condition repeatedly, the bigger problem
likely is the CPU being saturated with such per-cpu work items and the
solution would be making them UNBOUND. The next patch will add a debug
mechanism to help spot such cases.

v4: Documentation for workqueue.cpu_intensive_thresh_us added to
    kernel-parameters.txt.

v3: Switch to use wq_worker_tick() instead of hooking into preemptions as
    suggested by Peter.

v2: Lai pointed out that wq_worker_stopping() also needs to be called from
    preemption and rtlock paths and an earlier patch was updated
    accordingly. This patch adds a comment describing the risk of infinte
    recursions and how they're avoided.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
2023-05-17 17:02:08 -10:00
..
accounting delayacct: track delays from IRQ/SOFTIRQ 2023-04-18 16:39:34 -07:00
arch Disable building BPF based features by default for v6.4. 2023-05-07 11:32:18 -07:00
bootconfig bootconfig: Fix testcase to increase max node 2023-03-22 01:00:28 +09:00
bpf Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
build tools build: Add a feature test for scandirat(), that is not implemented so far in musl and uclibc 2023-04-04 13:18:17 -03:00
certs
cgroup tools:cgroup:memcg_shrinker remove redundant import 2023-01-18 17:12:59 -08:00
counter
debugging
edid
firewire
firmware
gpio tools: gpio: fix -c option of gpio-event-mon 2023-01-27 14:05:46 +01:00
hv
iio tools/iio/iio_utils:fix memory leak 2023-01-21 17:52:26 +00:00
include Disable building BPF based features by default for v6.4. 2023-05-07 11:32:18 -07:00
io_uring
kvm/kvm_stat tools/kvm_stat: use canonical ftrace path 2023-03-29 06:52:08 -04:00
laptop
leds
lib Disable building BPF based features by default for v6.4. 2023-05-07 11:32:18 -07:00
memory-model LKMM scripting updates for v6.4 2023-04-24 12:02:25 -07:00
mm slab changes for 6.4 2023-04-25 13:00:41 -07:00
net/ynl tools: ynl: Rename ethtool to ethtool.py 2023-04-13 22:18:29 -07:00
objtool Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
pci
pcmcia
perf Disable building BPF based features by default for v6.4. 2023-05-07 11:32:18 -07:00
power Power management updates for 6.4-rc1 2023-04-25 18:44:10 -07:00
rcu tools: rcu: Add usage function and check for argument 2023-03-11 18:10:17 -08:00
scripts sh updates for v6.4 2023-04-27 17:41:23 -07:00
spi
testing Disable building BPF based features by default for v6.4. 2023-05-07 11:32:18 -07:00
thermal
time
tracing rtla/timerlat: Fix "Previous IRQ" auto analysis' line 2023-04-25 19:26:59 -04:00
usb tools: usb: ffs-aio-example: Fix build error with aarch64-*-gnu-gcc toolchain(s) 2022-11-09 12:37:56 +01:00
verification rv: Fix addition on an uninitialized variable 'run' 2023-04-25 17:02:13 -04:00
virtio tools/virtio: fix build caused by virtio_ring changes 2023-04-21 03:02:35 -04:00
wmi
workqueue workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE 2023-05-17 17:02:08 -10:00
Makefile tools/Makefile: do missed s/vm/mm/ 2023-04-18 14:22:12 -07:00