mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
(synced 2025-01-18 02:46:06 +00:00)

workqueue: Fix selection of wake_cpu in kick_pool()

With cpu_possible_mask=0-63 and cpu_online_mask=0-7 the following
kernel oops was observed:

smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 8 CPUs
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000803
[..]
Call Trace:
 arch_vcpu_is_preempted+0x12/0x80
 select_idle_sibling+0x42/0x560
 select_task_rq_fair+0x29a/0x3b0
 try_to_wake_up+0x38e/0x6e0
 kick_pool+0xa4/0x198
 __queue_work.part.0+0x2bc/0x3a8
 call_timer_fn+0x36/0x160
 __run_timers+0x1e2/0x328
 __run_timer_base+0x5a/0x88
 run_timer_softirq+0x40/0x78
 __do_softirq+0x118/0x388
 irq_exit_rcu+0xc0/0xd8
 do_ext_irq+0xae/0x168
 ext_int_handler+0xbe/0xf0
 psw_idle_exit+0x0/0xc
 default_idle_call+0x3c/0x110
 do_idle+0xd4/0x158
 cpu_startup_entry+0x40/0x48
 rest_init+0xc6/0xc8
 start_kernel+0x3c4/0x5e0
 startup_continue+0x3c/0x50

The crash is caused by calling arch_vcpu_is_preempted() for an offline
CPU. To avoid this, select the cpu with cpumask_any_and_distribute()
to mask __pod_cpumask with cpu_online_mask. In case no cpu is left in
the pool, skip the assignment.

tj: This doesn't fully fix the bug as CPUs can still go down between
    picking the target CPU and the wake call. Fixing that likely
    requires adding cpu_online() test to either the sched or s390 arch
    code. However, regardless of how that is fixed, workqueue shouldn't
    be picking a CPU which isn't online as that would result in
    unpredictable and worse behavior.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Fixes: 8639ecebc9b1 ("workqueue: Implement non-strict affinity scope for unbound workqueues")
Cc: stable@vger.kernel.org # v6.6+
Signed-off-by: Tejun Heo <tj@kernel.org>

parent a1d34930d1
commit 57a01eafdc

@@ -1277,8 +1277,12 @@ static bool kick_pool(struct worker_pool *pool)
 	    !cpumask_test_cpu(p->wake_cpu, pool->attrs->__pod_cpumask)) {
 		struct work_struct *work = list_first_entry(&pool->worklist,
 						struct work_struct, entry);
-		p->wake_cpu = cpumask_any_distribute(pool->attrs->__pod_cpumask);
-		get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
+		int wake_cpu = cpumask_any_and_distribute(pool->attrs->__pod_cpumask,
+							  cpu_online_mask);
+		if (wake_cpu < nr_cpu_ids) {
+			p->wake_cpu = wake_cpu;
+			get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
+		}
 	}
 #endif
 	wake_up_process(p);
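
The selection logic in the hunk above can be modelled in user space. The following is a minimal sketch, not kernel code: it assumes at most 64 CPUs so that a cpumask fits in a uint64_t, and the names NR_CPU_IDS, any_and_distribute() and pick_wake_cpu() are hypothetical stand-ins for nr_cpu_ids, cpumask_any_and_distribute() and the patched assignment in kick_pool(). The rotation state is a plain static here, which is a simplification of whatever the kernel helper does internally.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 64	/* hypothetical stand-in for the kernel's nr_cpu_ids */

/*
 * Simplified model of cpumask_any_and_distribute(): return a CPU that is
 * set in both masks, starting the scan just after the CPU handed out last
 * time. Returns NR_CPU_IDS when the intersection is empty.
 */
static int any_and_distribute(uint64_t a, uint64_t b)
{
	static int prev = -1;	/* last CPU returned; simplification of the kernel's state */
	uint64_t both = a & b;
	int i;

	for (i = 1; i <= NR_CPU_IDS; i++) {
		int cpu = (prev + i) % NR_CPU_IDS;

		if (both & (UINT64_C(1) << cpu)) {
			prev = cpu;
			return cpu;
		}
	}
	return NR_CPU_IDS;
}

/*
 * Model of the patched assignment: only replace the current wake CPU when
 * an online CPU exists in the pod; otherwise keep the old value.
 */
static int pick_wake_cpu(int cur, uint64_t pod, uint64_t online)
{
	int cpu;

	assert(cur >= 0 && cur < NR_CPU_IDS);
	cpu = any_and_distribute(pod, online);
	return cpu < NR_CPU_IDS ? cpu : cur;
}
```

With online CPUs 0-7 (mask 0xFF), a pod covering CPUs 8-15 (mask 0xFF00) has no online member, so pick_wake_cpu() returns the caller's current value unchanged. A pod covering CPUs 4-11 (mask 0x0FF0) yields a CPU in the range 4-7. As the tj note in the commit message points out, a CPU can still go offline between this choice and the wake-up, and the sketch does not model that race.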