sched_ext: Temporarily work around pick_task_scx() being called without balance_scx()

pick_task_scx() must be preceded by balance_scx() but there currently is a
bug where fair could say yes on balance() but no on pick_task(), which then
ends up calling pick_task_scx() without preceding balance_scx(). Work around
by dropping WARN_ON_ONCE() and ignoring cases which don't make sense.

This isn't great and can theoretically lead to stalls. However, for
switch_all cases, this happens only while a BPF scheduler is being loaded or
unloaded, and, for partial cases, fair will likely keep triggering this CPU.

This will be reverted once the fair behavior is fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
This commit is contained in:
Tejun Heo 2024-09-06 08:17:09 -10:00
parent 649e980dad
commit da330f5e4c

@@ -2909,9 +2909,24 @@ static struct task_struct *pick_task_scx(struct rq *rq)
 	 * If balance_scx() is telling us to keep running @prev, replenish slice
 	 * if necessary and keep running @prev. Otherwise, pop the first one
 	 * from the local DSQ.
+	 *
+	 * WORKAROUND:
+	 *
+	 * %SCX_RQ_BAL_KEEP should be set iff $prev is on SCX as it must just
+	 * have gone through balance_scx(). Unfortunately, there currently is a
+	 * bug where fair could say yes on balance() but no on pick_task(),
+	 * which then ends up calling pick_task_scx() without preceding
+	 * balance_scx().
+	 *
+	 * For now, ignore cases where $prev is not on SCX. This isn't great and
+	 * can theoretically lead to stalls. However, for switch_all cases, this
+	 * happens only while a BPF scheduler is being loaded or unloaded, and,
+	 * for partial cases, fair will likely keep triggering this CPU.
+	 *
+	 * Once fair is fixed, restore WARN_ON_ONCE().
 	 */
 	if ((rq->scx.flags & SCX_RQ_BAL_KEEP) &&
-	    !WARN_ON_ONCE(prev->sched_class != &ext_sched_class)) {
+	    prev->sched_class == &ext_sched_class) {
 		p = prev;
 		if (!p->scx.slice)
 			p->scx.slice = SCX_SLICE_DFL;
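
The effect of the changed condition can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: the types `struct task` and `struct rq`, the classes `fair_class` and `ext_class`, the constants `RQ_BAL_KEEP` and `SLICE_DFL`, and the helper `pick_keep_prev()` are all hypothetical stand-ins introduced for illustration. It models only the "keep running prev" branch. A stale keep flag paired with a non-SCX prev is silently ignored instead of triggering a warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct sched_class { const char *name; };

static const struct sched_class fair_class = { "fair" };
static const struct sched_class ext_class  = { "ext" };

#define RQ_BAL_KEEP (1u << 0)	/* models SCX_RQ_BAL_KEEP */
#define SLICE_DFL   20u		/* placeholder for SCX_SLICE_DFL; value is arbitrary */

struct task {
	const struct sched_class *sched_class;
	unsigned int slice;
};

struct rq {
	unsigned int flags;
};

/*
 * Models the post-workaround check: keep @prev only when the keep flag is
 * set AND @prev belongs to the ext class. If the flag is set but @prev is
 * in another class (the case the commit describes), the flag is ignored
 * without warning.
 *
 * Returns @prev when it is kept, NULL otherwise. The real function would
 * go on to pop the first task from the local DSQ in the NULL case.
 */
static struct task *pick_keep_prev(const struct rq *rq, struct task *prev)
{
	if ((rq->flags & RQ_BAL_KEEP) && prev->sched_class == &ext_class) {
		/* Replenish an exhausted slice before continuing to run prev. */
		if (!prev->slice)
			prev->slice = SLICE_DFL;
		return prev;
	}
	return NULL;
}
```

In this model, a prev task in `fair_class` with `RQ_BAL_KEEP` set is returned as NULL and its slice is left untouched, which corresponds to the "ignore cases where $prev is not on SCX" behavior described in the comment.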