linux-stable/kernel/time
Yu Liao 408bfb6b0a tick/broadcast: Make takeover of broadcast hrtimer reliable
commit f7d43dd206 upstream.

Running the LTP hotplug stress test on a aarch64 machine results in
rcu_sched stall warnings when the broadcast hrtimer was owned by the
un-plugged CPU. The issue is the following:

CPU1 (owns the broadcast hrtimer)	CPU2

				tick_broadcast_enter()
				  // shutdown local timer device
				  broadcast_shutdown_local()
				...
				tick_broadcast_exit()
				  clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT)
				  // timer device is not programmed
				  cpumask_set_cpu(cpu, tick_broadcast_force_mask)

				initiates offlining of CPU1
take_cpu_down()
/*
 * CPU1 shuts down and does not
 * send broadcast IPI anymore
 */
				takedown_cpu()
				  hotplug_cpu__broadcast_tick_pull()
				    // move broadcast hrtimer to this CPU
				    clockevents_program_event()
				      bc_set_next()
					hrtimer_start()
					/*
					 * timer device is not programmed
					 * because only the first expiring
					 * timer will trigger clockevent
					 * device reprogramming
					 */

What happens is that CPU2 exits broadcast mode with force bit set, then the
local timer device is not reprogrammed and CPU2 expects to receive the
expired event by the broadcast IPI. But this does not happen because CPU1
is offlined by CPU2. CPU switches the clockevent device to ONESHOT state,
but does not reprogram the device.

The subsequent reprogramming of the hrtimer broadcast device does not
program the clockevent device of CPU2 either because the pending expiry
time is already in the past and the CPU expects the event to be delivered.
As a consequence all CPUs which wait for a broadcast event to be delivered
are stuck forever.

Fix this issue by reprogramming the local timer device if the broadcast
force bit of the CPU is set so that the broadcast hrtimer is delivered.

[ tglx: Massage comment and change log. Add Fixes tag ]

Fixes: 989dcb645c ("tick: Handle broadcast wakeup of multiple cpus")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240711124843.64167-1-liaoyu15@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-03 08:49:30 +02:00
..
alarmtimer.c alarmtimer: Prevent starvation by small intervals and SIG_IGN 2023-02-22 12:59:55 +01:00
clockevents.c clockevents: Use dedicated list iterator variable 2022-04-10 12:38:45 +02:00
clocksource-wdtest.c clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
clocksource.c clocksource: Skip watchdog check for large watchdog intervals 2024-02-16 19:06:31 +01:00
hrtimer.c hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range() 2024-02-23 09:12:50 +01:00
itimer.c time: Prevent undefined behaviour in timespec64_to_ns() 2020-10-26 11:48:11 +01:00
jiffies.c clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
Kconfig context_tracking: Take idle eqs entrypoints over RCU 2022-07-05 13:32:16 -07:00
Makefile time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
ntp_internal.h ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
ntp.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
posix-clock.c posix-clocks: Rename the clock_get() callback to clock_get_timespec() 2020-01-14 12:20:49 +01:00
posix-cpu-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-05-11 23:03:00 +09:00
posix-stubs.c timers: Prevent union confusion from unexpected restart_syscall() 2023-03-10 09:33:49 +01:00
posix-timers.c posix-timers: Prevent RT livelock in itimer_delete() 2023-07-19 16:20:59 +02:00
posix-timers.h posix-clocks: Introduce clock_get_ktime() callback 2020-01-14 12:20:51 +01:00
sched_clock.c time/sched_clock: Fix formatting of frequency reporting code 2022-05-02 14:29:04 +02:00
test_udelay.c time/debug: Fix memory leak with using debugfs_lookup() 2023-03-10 09:33:52 +01:00
tick-broadcast-hrtimer.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
tick-broadcast.c tick/broadcast: Make takeover of broadcast hrtimer reliable 2024-08-03 08:49:30 +02:00
tick-common.c tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device() 2024-06-21 14:35:59 +02:00
tick-internal.h clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
tick-legacy.c timekeeping: remove xtime_update 2020-10-30 21:57:07 +01:00
tick-oneshot.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
tick-sched.c tick/sched: Preserve number of idle sleeps across CPU hotplug events 2024-01-31 16:17:12 -08:00
tick-sched.h tick: Detect and fix jiffies update stall 2022-03-07 23:01:19 +01:00
time_test.c time: test: Fix incorrect format specifier 2024-03-26 18:20:29 -04:00
time.c time: Correct the prototype of ns_to_kernel_old_timeval and ns_to_timespec64 2022-08-09 20:02:13 +02:00
timeconst.bc time: Add SPDX license identifiers 2018-11-23 11:51:20 +01:00
timeconv.c time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
timecounter.c time/timecounter: Mark 1st argument of timecounter_cyc2time() as const 2021-04-16 21:03:50 +02:00
timekeeping_debug.c timekeeping/debug: No need to check return value of debugfs_create functions 2019-01-29 20:08:41 +01:00
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timekeeping.c timekeeping: Fix cross-timestamp interpolation for non-x86 2024-03-26 18:20:29 -04:00
timekeeping.h asm-generic: cross-architecture timer cleanup 2020-12-16 00:07:17 -08:00
timer_list.c timer_list: Print name of per-cpu wakeup device 2021-05-31 17:04:49 +02:00
timer.c timers: Rename del_timer() to timer_delete() 2024-05-17 11:56:13 +02:00
vsyscall.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00