2018-10-31 19:21:09 +01:00
|
|
|
// SPDX-License-Identifier: GPL-2.0
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
|
|
|
* This file contains the base functions to manage periodic tick
|
|
|
|
* related events.
|
|
|
|
*
|
|
|
|
* Copyright(C) 2005-2006, Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
* Copyright(C) 2005-2007, Red Hat, Inc., Ingo Molnar
|
|
|
|
* Copyright(C) 2006-2007, Timesys Corp., Thomas Gleixner
|
|
|
|
*/
|
2024-04-09 12:29:12 +02:00
|
|
|
#include <linux/compiler.h>
|
2007-02-16 01:28:01 -08:00
|
|
|
#include <linux/cpu.h>
|
|
|
|
#include <linux/err.h>
|
|
|
|
#include <linux/hrtimer.h>
|
[S390] genirq/clockevents: move irq affinity prototypes/inlines to interrupt.h
> Generic code is not supposed to include irq.h. Replace this include
> by linux/hardirq.h instead and add/replace an include of linux/irq.h
> in asm header files where necessary.
> This change should only matter for architectures that make use of
> GENERIC_CLOCKEVENTS.
> Architectures in question are mips, x86, arm, sh, powerpc, uml and sparc64.
>
> I did some cross compile tests for mips, x86_64, arm, powerpc and sparc64.
> This patch fixes also build breakages caused by the include replacement in
> tick-common.h.
I generally dislike adding optional linux/* includes in asm/* includes -
I'm nervous about this causing include loops.
However, there's a separate point to be discussed here.
That is, what interfaces are expected of every architecture in the kernel.
If generic code wants to be able to set the affinity of interrupts, then
that needs to become part of the interfaces listed in linux/interrupt.h
rather than linux/irq.h.
So what I suggest is this approach instead (against Linus' tree of a
couple of days ago) - we move irq_set_affinity() and irq_can_set_affinity()
to linux/interrupt.h, change the linux/irq.h includes to linux/interrupt.h
and include asm/irq_regs.h where needed (asm/irq_regs.h is supposed to be
rarely used include since not much touches the stacked parent context
registers.)
Build tested on ARM PXA family kernels and ARM's Realview platform
kernels which both use genirq.
[ tglx@linutronix.de: add GENERIC_HARDIRQ dependencies ]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:46:24 +02:00
|
|
|
#include <linux/interrupt.h>
|
2020-01-10 16:39:02 +08:00
|
|
|
#include <linux/nmi.h>
|
2007-02-16 01:28:01 -08:00
|
|
|
#include <linux/percpu.h>
|
|
|
|
#include <linux/profile.h>
|
|
|
|
#include <linux/sched.h>
|
2013-04-25 20:31:49 +00:00
|
|
|
#include <linux/module.h>
|
2015-05-10 01:23:35 +02:00
|
|
|
#include <trace/events/power.h>
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2008-04-17 07:46:24 +02:00
|
|
|
#include <asm/irq_regs.h>
|
|
|
|
|
2007-02-16 01:28:02 -08:00
|
|
|
#include "tick-internal.h"
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
|
|
|
* Tick devices
|
|
|
|
*/
|
2007-02-16 01:28:02 -08:00
|
|
|
DEFINE_PER_CPU(struct tick_device, tick_cpu_device);
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
2020-11-17 14:19:44 +01:00
|
|
|
* Tick next event: keeps track of the tick time. It's updated by the
|
|
|
|
* CPU which handles the tick and protected by jiffies_lock. There is
|
|
|
|
* no requirement to hold the jiffies seqcount as a writer for it.
|
2007-02-16 01:28:01 -08:00
|
|
|
*/
|
2007-02-16 01:28:02 -08:00
|
|
|
ktime_t tick_next_period;
|
2013-11-15 14:15:33 -08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* tick_do_timer_cpu is a timer core internal variable which holds the CPU NR
|
|
|
|
* which is responsible for calling do_timer(), i.e. the timekeeping stuff. This
|
|
|
|
* variable has two functions:
|
|
|
|
*
|
|
|
|
* 1) Prevent a thundering herd of a gazillion CPUs trying to grab the
|
|
|
|
* timekeeping lock all at once. Only the CPU which is assigned to do the
|
|
|
|
* update is handling it.
|
|
|
|
*
|
|
|
|
* 2) Hand off the duty in the NOHZ idle case by setting the value to
|
|
|
|
* TICK_DO_TIMER_NONE, i.e. a nonexistent CPU. The next CPU which looks
|
|
|
|
* at it will take over and keep the time keeping alive. The handover
|
|
|
|
* procedure also covers cpu hotplug.
|
|
|
|
*/
|
2008-09-22 18:46:37 +02:00
|
|
|
int tick_do_timer_cpu __read_mostly = TICK_DO_TIMER_BOOT;
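/*
 * Illustration only, not part of tick-common.c: a minimal user-space sketch
 * of the hand-off described above, assuming C11 atomics. The names
 * model_timer_cpu, MODEL_TIMER_NONE, model_drop_duty() and
 * model_try_take_duty() are hypothetical, and the compare-and-exchange is a
 * choice made for this sketch; the kernel code shown here uses READ_ONCE()
 * and WRITE_ONCE() on tick_do_timer_cpu instead.
 */

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical sentinel playing the role of TICK_DO_TIMER_NONE. */
#define MODEL_TIMER_NONE (-1)

/* CPU number that currently owns the timekeeping duty, or the sentinel. */
static _Atomic int model_timer_cpu = MODEL_TIMER_NONE;

/* The owner gives up the duty (e.g. before a long idle period). */
void model_drop_duty(int cpu)
{
	int expected = cpu;

	/* Only the current owner may clear the slot; other callers are no-ops. */
	atomic_compare_exchange_strong(&model_timer_cpu, &expected,
				       MODEL_TIMER_NONE);
}

/* Returns 1 if @cpu owns the duty after the call, taking it over if vacant. */
int model_try_take_duty(int cpu)
{
	int expected = MODEL_TIMER_NONE;

	if (atomic_compare_exchange_strong(&model_timer_cpu, &expected, cpu))
		return 1;
	/* On failure, expected holds the current owner. */
	return expected == cpu;
}
```

/*
 * Only one caller can win the vacant slot, which is the property that avoids
 * every CPU contending for the timekeeping lock.
 */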
|
2019-04-11 13:34:48 +10:00
|
|
|
#ifdef CONFIG_NO_HZ_FULL
|
|
|
|
/*
|
|
|
|
* tick_do_timer_boot_cpu indicates the boot CPU temporarily owns
|
|
|
|
* tick_do_timer_cpu and it should be taken over by an eligible secondary
|
|
|
|
* when one comes online.
|
|
|
|
*/
|
|
|
|
static int tick_do_timer_boot_cpu __read_mostly = -1;
|
|
|
|
#endif
|
2007-02-16 01:28:01 -08:00
|
|
|
|
[PATCH] Add debugging feature /proc/timer_list
add /proc/timer_list, which prints all currently pending (high-res) timers,
all clock-event sources and their parameters in a human-readable form.
Sample output:
Timer List Version: v0.1
HRTIMER_MAX_CLOCK_BASES: 2
now at 4246046273872 nsecs
cpu: 0
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
# expires at 4246432689566 nsecs [in 386415694 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050
# expires at 4247018194689 nsecs [in 971920817 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909
# expires at 4247351358392 nsecs [in 1305084520 nsecs]
#3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157
# expires at 4249097614968 nsecs [in 3051341096 nsecs]
#4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888
# expires at 4251329900926 nsecs [in 5283627054 nsecs]
.expires_next : 4246432689566 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 31306
.idle_tick : 4246020791890 nsecs
.tick_stopped : 1
.idle_jiffies : 986504
.idle_calls : 40700
.idle_sleeps : 36014
.idle_entrytime : 4246019418883 nsecs
.idle_sleeptime : 4178181972709 nsecs
cpu: 1
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
# expires at 4246050084568 nsecs [in 3810696 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227
# expires at 4261010635003 nsecs [in 14964361131 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332
# expires at 5469485798970 nsecs [in 1223439525098 nsecs]
.expires_next : 4246050084568 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 24043
.idle_tick : 4246046084568 nsecs
.tick_stopped : 0
.idle_jiffies : 986510
.idle_calls : 26360
.idle_sleeps : 22551
.idle_entrytime : 4246043874339 nsecs
.idle_sleeptime : 4170763761184 nsecs
tick_broadcast_mask: 00000003
event_broadcast_mask: 00000001
CPU#0's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246432689566 nsecs
CPU#1's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246050084568 nsecs
Clock Event Device: hpet
capabilities: 00000007
max_delta_ns: 2147483647
min_delta_ns: 3352
mult: 61496110
shift: 32
set_next_event: hpet_next_event
set_mode: hpet_set_mode
event_handler: handle_nextevt_broadcast
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 01:28:15 -08:00
|
|
|
/*
|
|
|
|
* Debugging: see timer_list.c
|
|
|
|
*/
|
|
|
|
struct tick_device *tick_get_device(int cpu)
|
|
|
|
{
|
|
|
|
return &per_cpu(tick_cpu_device, cpu);
|
|
|
|
}
|
|
|
|
|
2007-02-16 01:28:03 -08:00
|
|
|
/**
|
|
|
|
* tick_is_oneshot_available - check for a oneshot capable event device
|
|
|
|
*/
|
|
|
|
int tick_is_oneshot_available(void)
|
|
|
|
{
|
2010-12-08 16:22:55 +01:00
|
|
|
struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
|
2007-02-16 01:28:03 -08:00
|
|
|
|
2011-02-25 22:34:23 +01:00
|
|
|
if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT))
|
|
|
|
return 0;
|
|
|
|
if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
|
|
|
|
return 1;
|
|
|
|
return tick_broadcast_oneshot_available();
|
2007-02-16 01:28:03 -08:00
|
|
|
}
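/*
 * Illustration only: the decision made by tick_is_oneshot_available() can be
 * sketched as a pure function. The MODEL_FEAT_* bit values and struct
 * model_dev are hypothetical stand-ins (the real CLOCK_EVT_FEAT_* values are
 * defined elsewhere), and the result of tick_broadcast_oneshot_available()
 * is passed in as a plain flag.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits, chosen for this sketch only. */
#define MODEL_FEAT_ONESHOT 0x1u
#define MODEL_FEAT_C3STOP  0x2u

struct model_dev {
	unsigned int features;
};

bool model_oneshot_available(const struct model_dev *dev,
			     bool broadcast_oneshot_ok)
{
	/* No device, or a device without oneshot support: not available. */
	if (!dev || !(dev->features & MODEL_FEAT_ONESHOT))
		return false;
	/* A device that keeps running in deep idle needs no broadcast help. */
	if (!(dev->features & MODEL_FEAT_C3STOP))
		return true;
	/* Otherwise the answer depends on the broadcast device. */
	return broadcast_oneshot_ok;
}
```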
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
|
|
|
* Periodic tick
|
|
|
|
*/
|
|
|
|
static void tick_periodic(int cpu)
|
|
|
|
{
|
2024-04-09 12:29:12 +02:00
|
|
|
if (READ_ONCE(tick_do_timer_cpu) == cpu) {
|
2020-03-21 12:25:58 +01:00
|
|
|
raw_spin_lock(&jiffies_lock);
|
|
|
|
write_seqcount_begin(&jiffies_seq);
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
/* Keep track of the next tick event */
|
2020-11-17 14:19:49 +01:00
|
|
|
tick_next_period = ktime_add_ns(tick_next_period, TICK_NSEC);
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
do_timer(1);
|
2020-03-21 12:25:58 +01:00
|
|
|
write_seqcount_end(&jiffies_seq);
|
|
|
|
raw_spin_unlock(&jiffies_lock);
|
2013-12-12 13:10:55 -08:00
|
|
|
update_wall_time();
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
update_process_times(user_mode(get_irq_regs()));
|
|
|
|
profile_tick(CPU_PROFILING);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Event handler for periodic ticks
|
|
|
|
*/
|
|
|
|
void tick_handle_periodic(struct clock_event_device *dev)
|
|
|
|
{
|
|
|
|
int cpu = smp_processor_id();
|
2014-03-25 13:56:23 +05:30
|
|
|
ktime_t next = dev->next_event;
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
tick_periodic(cpu);
|
|
|
|
|
2015-04-14 21:08:51 +00:00
|
|
|
/*
|
|
|
|
* The cpu might have transitioned to HIGHRES or NOHZ mode via
|
|
|
|
* update_process_times() -> run_local_timers() ->
|
|
|
|
* hrtimer_run_queues().
|
|
|
|
*/
|
2024-02-25 23:54:56 +01:00
|
|
|
if (IS_ENABLED(CONFIG_TICK_ONESHOT) && dev->event_handler != tick_handle_periodic)
|
2015-04-14 21:08:51 +00:00
|
|
|
return;
|
|
|
|
|
2015-05-21 13:33:46 +05:30
|
|
|
if (!clockevent_state_oneshot(dev))
|
2007-02-16 01:28:01 -08:00
|
|
|
return;
|
|
|
|
for (;;) {
|
2014-03-25 13:56:23 +05:30
|
|
|
/*
|
|
|
|
* Set up the next period for devices that do not have
|
|
|
|
* periodic mode:
|
|
|
|
*/
|
2020-11-17 14:19:49 +01:00
|
|
|
next = ktime_add_ns(next, TICK_NSEC);
|
2014-03-25 13:56:23 +05:30
|
|
|
|
2011-08-23 15:29:42 +02:00
|
|
|
if (!clockevents_program_event(dev, next, false))
|
2007-02-16 01:28:01 -08:00
|
|
|
return;
|
2009-05-01 13:10:25 -07:00
|
|
|
/*
|
|
|
|
* Have to be careful here. If we're in oneshot mode,
|
|
|
|
* before we call tick_periodic() in a loop, we need
|
|
|
|
* to be sure we're using a real hardware clocksource.
|
|
|
|
* Otherwise we could get trapped in an infinite
|
|
|
|
* loop, as the tick_periodic() increments jiffies,
|
2014-03-25 16:09:18 +05:30
|
|
|
* which then will increment time, possibly causing
|
2009-05-01 13:10:25 -07:00
|
|
|
* the loop to trigger again and again.
|
|
|
|
*/
|
|
|
|
if (timekeeping_valid_for_hres())
|
|
|
|
tick_periodic(cpu);
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
}
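/*
 * Illustration only: a user-space sketch of the catch-up loop in
 * tick_handle_periodic(). model_program_event() is a hypothetical stand-in
 * for clockevents_program_event() and is assumed to fail whenever the
 * requested expiry is not in the future. Times are plain nanosecond counts,
 * and @period must be positive or the loop never terminates.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device model: programming succeeds (0) only for future expiries. */
int model_program_event(int64_t now, int64_t expires)
{
	return expires > now ? 0 : -1;
}

/*
 * Advance @next by @period until the device accepts it and return the
 * expiry that was finally programmed.
 */
int64_t model_next_tick(int64_t now, int64_t next, int64_t period)
{
	for (;;) {
		next += period;
		if (!model_program_event(now, next))
			return next;
		/* The real handler accounts for the missed tick at this point. */
	}
}
```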
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Setup the device for a periodic tick
|
|
|
|
*/
|
2007-02-16 01:28:02 -08:00
|
|
|
void tick_setup_periodic(struct clock_event_device *dev, int broadcast)
|
2007-02-16 01:28:01 -08:00
|
|
|
{
|
2007-02-16 01:28:02 -08:00
|
|
|
tick_set_periodic_handler(dev, broadcast);
|
|
|
|
|
|
|
|
/* Broadcast setup ? */
|
|
|
|
if (!tick_device_is_functional(dev))
|
|
|
|
return;
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2008-09-22 19:04:02 +02:00
|
|
|
if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
|
|
|
|
!tick_broadcast_oneshot_active()) {
|
2015-06-02 14:08:46 +02:00
|
|
|
clockevents_switch_state(dev, CLOCK_EVT_STATE_PERIODIC);
|
2007-02-16 01:28:01 -08:00
|
|
|
} else {
|
2019-03-18 20:55:56 +01:00
|
|
|
unsigned int seq;
|
2007-02-16 01:28:01 -08:00
|
|
|
ktime_t next;
|
|
|
|
|
|
|
|
do {
|
2020-03-21 12:25:58 +01:00
|
|
|
seq = read_seqcount_begin(&jiffies_seq);
|
2007-02-16 01:28:01 -08:00
|
|
|
next = tick_next_period;
|
2020-03-21 12:25:58 +01:00
|
|
|
} while (read_seqcount_retry(&jiffies_seq, seq));
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2015-06-02 14:08:46 +02:00
|
|
|
clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
for (;;) {
|
2011-08-23 15:29:42 +02:00
|
|
|
if (!clockevents_program_event(dev, next, false))
|
2007-02-16 01:28:01 -08:00
|
|
|
return;
|
2020-11-17 14:19:49 +01:00
|
|
|
next = ktime_add_ns(next, TICK_NSEC);
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
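/*
 * Illustration only: the read side used above to sample tick_next_period
 * follows the usual sequence-counter retry pattern. The sketch below models
 * it in user space with C11 atomics; model_seq, model_next_period and both
 * functions are hypothetical names, and sequentially consistent atomics
 * replace the kernel's seqcount primitives and barriers.
 */

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint model_seq;              /* even: stable, odd: write in progress */
static _Atomic int64_t model_next_period;  /* the protected value */

/* Writer: callers are assumed to be serialized by an outer lock. */
void model_write_period(int64_t value)
{
	atomic_fetch_add(&model_seq, 1);         /* becomes odd */
	atomic_store(&model_next_period, value);
	atomic_fetch_add(&model_seq, 1);         /* becomes even again */
}

/* Reader: retry until a snapshot was taken with no writer in between. */
int64_t model_read_period(void)
{
	unsigned int seq;
	int64_t value;

	do {
		/* Wait for an even count, i.e. no write in progress. */
		while ((seq = atomic_load(&model_seq)) & 1u)
			;
		value = atomic_load(&model_next_period);
	} while (atomic_load(&model_seq) != seq);

	return value;
}
```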
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Setup the tick device
|
|
|
|
*/
|
|
|
|
static void tick_setup_device(struct tick_device *td,
|
|
|
|
struct clock_event_device *newdev, int cpu,
|
2008-12-13 21:20:26 +10:30
|
|
|
const struct cpumask *cpumask)
|
2007-02-16 01:28:01 -08:00
|
|
|
{
|
|
|
|
void (*handler)(struct clock_event_device *) = NULL;
|
2016-12-25 12:30:41 +01:00
|
|
|
ktime_t next_event = 0;
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* First device setup ?
|
|
|
|
*/
|
|
|
|
if (!td->evtdev) {
|
|
|
|
/*
|
|
|
|
* If no cpu took the do_timer update, assign it to
|
|
|
|
* this cpu:
|
|
|
|
*/
|
2024-04-09 12:29:12 +02:00
|
|
|
if (READ_ONCE(tick_do_timer_cpu) == TICK_DO_TIMER_BOOT) {
|
|
|
|
WRITE_ONCE(tick_do_timer_cpu, cpu);
|
2023-06-15 11:18:30 +02:00
|
|
|
tick_next_period = ktime_get();
|
2019-04-11 13:34:48 +10:00
|
|
|
#ifdef CONFIG_NO_HZ_FULL
|
|
|
|
/*
|
2024-05-28 14:20:19 +02:00
|
|
|
* The boot CPU may be nohz_full, in which case the
|
|
|
|
* first housekeeping secondary will take do_timer()
|
|
|
|
* from it.
|
2019-04-11 13:34:48 +10:00
|
|
|
*/
|
|
|
|
if (tick_nohz_full_cpu(cpu))
|
|
|
|
tick_do_timer_boot_cpu = cpu;
|
|
|
|
|
2024-05-28 14:20:19 +02:00
|
|
|
} else if (tick_do_timer_boot_cpu != -1 && !tick_nohz_full_cpu(cpu)) {
|
2019-04-11 13:34:48 +10:00
|
|
|
tick_do_timer_boot_cpu = -1;
|
2024-05-28 14:20:19 +02:00
|
|
|
/*
|
|
|
|
* The boot CPU will stay in periodic (NOHZ disabled)
|
|
|
|
* mode until clocksource_done_booting() called after
|
|
|
|
* smp_init() selects a high resolution clocksource and
|
|
|
|
* timekeeping_notify() kicks the NOHZ stuff alive.
|
|
|
|
*
|
|
|
|
* So this WRITE_ONCE can only race with the READ_ONCE
|
|
|
|
* check in tick_periodic() but this race is harmless.
|
|
|
|
*/
|
|
|
|
WRITE_ONCE(tick_do_timer_cpu, cpu);
|
2019-04-11 13:34:48 +10:00
|
|
|
#endif
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Startup in periodic mode first.
|
|
|
|
*/
|
|
|
|
td->mode = TICKDEV_MODE_PERIODIC;
|
|
|
|
} else {
|
|
|
|
handler = td->evtdev->event_handler;
|
|
|
|
next_event = td->evtdev->next_event;
|
2008-09-03 21:36:50 +00:00
|
|
|
td->evtdev->event_handler = clockevents_handle_noop;
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
td->evtdev = newdev;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When the device is not per cpu, pin the interrupt to the
|
|
|
|
* current cpu:
|
|
|
|
*/
|
2008-12-13 21:20:26 +10:30
|
|
|
if (!cpumask_equal(newdev->cpumask, cpumask))
|
2008-12-13 21:20:26 +10:30
|
|
|
irq_set_affinity(newdev->irq, cpumask);
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2007-02-16 01:28:02 -08:00
|
|
|
/*
|
|
|
|
* When global broadcasting is active, check if the current
|
|
|
|
* device is registered as a placeholder for broadcast mode.
|
|
|
|
* This allows us to handle this x86 misfeature in a generic
|
2013-07-01 22:14:10 +02:00
|
|
|
* way. This function also returns !=0 when we keep the
|
|
|
|
* current active broadcast state for this CPU.
|
2007-02-16 01:28:02 -08:00
|
|
|
*/
|
|
|
|
if (tick_device_uses_broadcast(newdev, cpu))
|
|
|
|
return;
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
if (td->mode == TICKDEV_MODE_PERIODIC)
|
|
|
|
tick_setup_periodic(newdev, 0);
|
2007-02-16 01:28:03 -08:00
|
|
|
else
|
|
|
|
tick_setup_oneshot(newdev, handler, next_event);
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
|
2013-04-25 20:31:50 +00:00
|
|
|
void tick_install_replacement(struct clock_event_device *newdev)
|
|
|
|
{
|
2014-08-17 12:30:25 -05:00
|
|
|
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
|
2013-04-25 20:31:50 +00:00
|
|
|
int cpu = smp_processor_id();
|
|
|
|
|
|
|
|
clockevents_exchange_device(td->evtdev, newdev);
|
|
|
|
tick_setup_device(td, newdev, cpu, cpumask_of(cpu));
|
|
|
|
if (newdev->features & CLOCK_EVT_FEAT_ONESHOT)
|
|
|
|
tick_oneshot_notify();
|
|
|
|
}
|
|
|
|
|
2013-04-25 20:31:50 +00:00
|
|
|
static bool tick_check_percpu(struct clock_event_device *curdev,
|
|
|
|
struct clock_event_device *newdev, int cpu)
|
|
|
|
{
|
|
|
|
if (!cpumask_test_cpu(cpu, newdev->cpumask))
|
|
|
|
return false;
|
|
|
|
if (cpumask_equal(newdev->cpumask, cpumask_of(cpu)))
|
|
|
|
return true;
|
|
|
|
/* Check if irq affinity can be set */
|
|
|
|
if (newdev->irq >= 0 && !irq_can_set_affinity(newdev->irq))
|
|
|
|
return false;
|
|
|
|
/* Prefer an existing cpu local device */
|
|
|
|
if (curdev && cpumask_equal(curdev->cpumask, cpumask_of(cpu)))
|
|
|
|
return false;
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool tick_check_preferred(struct clock_event_device *curdev,
|
|
|
|
struct clock_event_device *newdev)
|
|
|
|
{
|
|
|
|
/* Prefer oneshot capable device */
|
|
|
|
if (!(newdev->features & CLOCK_EVT_FEAT_ONESHOT)) {
|
|
|
|
if (curdev && (curdev->features & CLOCK_EVT_FEAT_ONESHOT))
|
|
|
|
return false;
|
|
|
|
if (tick_oneshot_mode_active())
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2013-06-13 11:39:50 -07:00
|
|
|
/*
|
|
|
|
* Use the higher rated one, but prefer a CPU local device with a lower
|
|
|
|
* rating than a non-CPU local device
|
|
|
|
*/
|
|
|
|
return !curdev ||
|
|
|
|
newdev->rating > curdev->rating ||
|
2018-07-09 16:45:35 +01:00
|
|
|
!cpumask_equal(curdev->cpumask, newdev->cpumask);
|
2013-04-25 20:31:50 +00:00
|
|
|
}
|
|
|
|
|
2013-04-25 20:31:50 +00:00
|
|
|
/*
|
|
|
|
* Check whether the new device is a better fit than curdev. curdev
|
|
|
|
* can be NULL !
|
|
|
|
*/
|
|
|
|
bool tick_check_replacement(struct clock_event_device *curdev,
|
|
|
|
struct clock_event_device *newdev)
|
|
|
|
{
|
2014-04-15 10:54:37 +05:30
|
|
|
if (!tick_check_percpu(curdev, newdev, smp_processor_id()))
|
2013-04-25 20:31:50 +00:00
|
|
|
return false;
|
|
|
|
|
|
|
|
return tick_check_preferred(curdev, newdev);
|
|
|
|
}
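/*
 * Illustration only: the selection rules of tick_check_percpu() and
 * tick_check_preferred() restated over a simplified device model. struct
 * model_clockdev, the bitmask cpumask, MODEL_FEAT_ONESHOT and the
 * irq_movable flag (true when the device has no interrupt or its affinity
 * can be set) are assumptions made for this sketch.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FEAT_ONESHOT 0x1u  /* hypothetical bit value */

struct model_clockdev {
	unsigned int features;
	int rating;
	unsigned long cpumask;  /* bit n set: device can serve CPU n */
	bool irq_movable;       /* no irq, or its affinity can be changed */
};

bool model_check_percpu(const struct model_clockdev *curdev,
			const struct model_clockdev *newdev, int cpu)
{
	unsigned long self = 1UL << cpu;

	if (!(newdev->cpumask & self))
		return false;           /* cannot serve this CPU at all */
	if (newdev->cpumask == self)
		return true;            /* strictly CPU local */
	if (!newdev->irq_movable)
		return false;           /* interrupt cannot be pinned here */
	if (curdev && curdev->cpumask == self)
		return false;           /* keep an existing CPU local device */
	return true;
}

bool model_check_preferred(const struct model_clockdev *curdev,
			   const struct model_clockdev *newdev,
			   bool oneshot_mode_active)
{
	if (!(newdev->features & MODEL_FEAT_ONESHOT)) {
		if (curdev && (curdev->features & MODEL_FEAT_ONESHOT))
			return false;
		if (oneshot_mode_active)
			return false;
	}
	/* Higher rating wins, but a CPU local device beats a non-local one. */
	return !curdev || newdev->rating > curdev->rating ||
	       curdev->cpumask != newdev->cpumask;
}

bool model_check_replacement(const struct model_clockdev *curdev,
			     const struct model_clockdev *newdev, int cpu,
			     bool oneshot_mode_active)
{
	if (!model_check_percpu(curdev, newdev, cpu))
		return false;
	return model_check_preferred(curdev, newdev, oneshot_mode_active);
}
```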
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
2013-04-25 20:31:48 +00:00
|
|
|
* Check whether the newly registered device should be used. Called with
|
|
|
|
* clockevents_lock held and interrupts disabled.
|
2007-02-16 01:28:01 -08:00
|
|
|
*/
|
2013-04-25 20:31:47 +00:00
|
|
|
void tick_check_new_device(struct clock_event_device *newdev)
|
2007-02-16 01:28:01 -08:00
|
|
|
{
|
|
|
|
struct clock_event_device *curdev;
|
|
|
|
struct tick_device *td;
|
2013-04-25 20:31:47 +00:00
|
|
|
int cpu;
|
2007-02-16 01:28:01 -08:00
|
|
|
|
|
|
|
cpu = smp_processor_id();
|
|
|
|
td = &per_cpu(tick_cpu_device, cpu);
|
|
|
|
curdev = td->evtdev;
|
|
|
|
|
2021-03-26 02:23:28 +00:00
|
|
|
if (!tick_check_replacement(curdev, newdev))
|
2013-04-25 20:31:50 +00:00
|
|
|
goto out_bc;
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2013-04-25 20:31:49 +00:00
|
|
|
if (!try_module_get(newdev->owner))
|
|
|
|
return;
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
|
|
|
* Replace the possibly existing device with the new
|
2007-02-16 01:28:02 -08:00
|
|
|
* device. If the current device is the broadcast device, do
|
|
|
|
* not give it back to the clockevents layer !
|
2007-02-16 01:28:01 -08:00
|
|
|
*/
|
2007-02-16 01:28:02 -08:00
|
|
|
if (tick_is_broadcast_device(curdev)) {
|
2008-09-16 11:32:50 -07:00
|
|
|
clockevents_shutdown(curdev);
|
2007-02-16 01:28:02 -08:00
|
|
|
curdev = NULL;
|
|
|
|
}
|
2007-02-16 01:28:01 -08:00
|
|
|
clockevents_exchange_device(curdev, newdev);
|
2009-01-01 10:12:25 +10:30
|
|
|
tick_setup_device(td, newdev, cpu, cpumask_of(cpu));
|
2007-02-16 01:28:03 -08:00
|
|
|
if (newdev->features & CLOCK_EVT_FEAT_ONESHOT)
|
|
|
|
tick_oneshot_notify();
|
2013-04-25 20:31:47 +00:00
|
|
|
return;
|
2007-02-16 01:28:02 -08:00
|
|
|
|
|
|
|
out_bc:
|
|
|
|
/*
|
|
|
|
* Can the new device be used as a broadcast device ?
|
|
|
|
*/
|
2021-05-24 23:18:16 +01:00
|
|
|
tick_install_broadcast_device(newdev, cpu);
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|
|
|
|
|
2015-07-07 16:29:38 +02:00
|
|
|
/**
|
|
|
|
* tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
|
|
|
|
* @state: The target state (enter/exit)
|
|
|
|
*
|
|
|
|
* The system enters/leaves a state where affected devices might stop.
|
|
|
|
* Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
|
|
|
|
*
|
|
|
|
* Called with interrupts disabled, so clockevents_lock is not
|
|
|
|
* required here because the local clock event device cannot go away
|
|
|
|
* under us.
|
|
|
|
*/
|
|
|
|
int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
|
|
|
|
{
|
|
|
|
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
|
|
|
|
|
|
|
|
if (!(td->evtdev->features & CLOCK_EVT_FEAT_C3STOP))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
return __tick_broadcast_oneshot_control(state);
|
|
|
|
}
|
2015-07-14 12:01:04 +02:00
|
|
|
EXPORT_SYMBOL_GPL(tick_broadcast_oneshot_control);
|
2015-07-07 16:29:38 +02:00
|
|
|
|
2015-04-03 02:37:24 +02:00
|
|
|
#ifdef CONFIG_HOTPLUG_CPU
|
2024-02-25 23:55:07 +01:00
|
|
|
void tick_assert_timekeeping_handover(void)
|
|
|
|
{
|
|
|
|
WARN_ON_ONCE(tick_do_timer_cpu == smp_processor_id());
|
|
|
|
}
|
2008-12-01 14:09:07 +01:00
|
|
|
/*
|
2024-02-25 23:54:59 +01:00
|
|
|
* Stop the tick and transfer the timekeeping job away from a dying cpu.
|
2008-12-01 14:09:07 +01:00
|
|
|
*/
|
2024-02-25 23:54:59 +01:00
|
|
|
int tick_cpu_dying(unsigned int dying_cpu)
|
2008-12-01 14:09:07 +01:00
|
|
|
{
|
2024-02-25 23:54:59 +01:00
|
|
|
/*
|
2024-04-09 12:29:12 +02:00
|
|
|
* If the current CPU is the timekeeper, it's the only one that can
|
|
|
|
* safely hand over its duty. Also all online CPUs are in stop
|
|
|
|
* machine, guaranteed not to be idle, therefore there is no
|
|
|
|
* concurrency and it's safe to pick any online successor.
|
2024-02-25 23:54:59 +01:00
|
|
|
*/
|
|
|
|
if (tick_do_timer_cpu == dying_cpu)
|
2020-12-06 22:12:54 +01:00
|
|
|
tick_do_timer_cpu = cpumask_first(cpu_online_mask);
|
2024-02-25 23:54:59 +01:00
|
|
|
|
2024-02-25 23:55:06 +01:00
|
|
|
/* Make sure the CPU won't try to retake the timekeeping duty */
|
|
|
|
tick_sched_timer_dying(dying_cpu);
|
2024-02-25 23:55:00 +01:00
|
|
|
|
2024-02-25 23:55:01 +01:00
|
|
|
/* Remove CPU from timer broadcasting */
|
|
|
|
tick_offline_cpu(dying_cpu);
|
|
|
|
|
2024-02-25 23:54:59 +01:00
|
|
|
return 0;
|
2008-12-01 14:09:07 +01:00
|
|
|
}
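/*
 * Illustration only: the successor choice in tick_cpu_dying() modeled over
 * a plain bitmask. model_first_cpu() is a hypothetical stand-in for
 * cpumask_first() and returns -1 for an empty mask, which differs from the
 * kernel helper. The sketch also assumes the dying CPU is no longer part of
 * the online mask and clears its bit explicitly.
 */

```c
#include <assert.h>

/* Index of the lowest set bit in @mask, or -1 if no bit is set. */
int model_first_cpu(unsigned long mask)
{
	for (int cpu = 0; mask; cpu++, mask >>= 1)
		if (mask & 1UL)
			return cpu;
	return -1;
}

/* Returns the CPU that owns the timekeeping duty after @dying_cpu goes away. */
int model_handover(int timer_cpu, int dying_cpu, unsigned long online_mask)
{
	/* Assumption of this sketch: the dying CPU must not be picked. */
	online_mask &= ~(1UL << dying_cpu);

	if (timer_cpu == dying_cpu)
		return model_first_cpu(online_mask);
	return timer_cpu;
}
```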
|
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/*
|
|
|
|
* Shutdown an event device on a given cpu:
|
|
|
|
*
|
|
|
|
* This is called on a live CPU, when a CPU is dead. So we cannot
|
|
|
|
* access the hardware device itself.
|
|
|
|
* We just set the mode and remove it from the lists.
|
|
|
|
*/
|
2015-04-03 02:38:05 +02:00
|
|
|
void tick_shutdown(unsigned int cpu)
|
2007-02-16 01:28:01 -08:00
|
|
|
{
|
2015-04-03 02:38:05 +02:00
|
|
|
struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
|
2007-02-16 01:28:01 -08:00
|
|
|
struct clock_event_device *dev = td->evtdev;
|
|
|
|
|
|
|
|
td->mode = TICKDEV_MODE_PERIODIC;
|
|
|
|
if (dev) {
|
|
|
|
/*
|
|
|
|
* Prevent the clock events layer from trying to call
|
|
|
|
* the set mode function!
|
|
|
|
*/
|
2015-06-02 14:13:46 +02:00
|
|
|
clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
|
2007-02-16 01:28:01 -08:00
|
|
|
clockevents_exchange_device(dev, NULL);
|
2013-04-25 11:45:53 +02:00
|
|
|
dev->event_handler = clockevents_handle_noop;
|
2007-02-16 01:28:01 -08:00
|
|
|
td->evtdev = NULL;
|
|
|
|
}
|
|
|
|
}
|
2015-04-03 02:38:05 +02:00
|
|
|
#endif
|
2007-02-16 01:28:01 -08:00
|
|
|
|
2015-03-25 13:09:16 +01:00
|
|
|
/**
|
2015-03-25 13:11:04 +01:00
|
|
|
* tick_suspend_local - Suspend the local tick device
|
2015-03-25 13:09:16 +01:00
|
|
|
*
|
2015-03-25 13:11:04 +01:00
|
|
|
* Called from the local cpu for freeze with interrupts disabled.
|
2015-03-25 13:09:16 +01:00
|
|
|
*
|
|
|
|
* No locks required. Nothing can change the per cpu device.
|
|
|
|
*/
|
2015-03-25 13:11:52 +01:00
|
|
|
void tick_suspend_local(void)
|
2007-03-06 08:25:42 +01:00
|
|
|
{
|
2014-08-17 12:30:25 -05:00
|
|
|
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
|
2007-03-06 08:25:42 +01:00
|
|
|
|
2008-09-16 11:32:50 -07:00
|
|
|
clockevents_shutdown(td->evtdev);
|
2007-03-06 08:25:42 +01:00
|
|
|
}
|
|
|
|
|
2015-03-25 13:09:16 +01:00
|
|
|
/**
|
2015-03-25 13:11:04 +01:00
|
|
|
* tick_resume_local - Resume the local tick device
|
2015-03-25 13:09:16 +01:00
|
|
|
*
|
2015-03-25 13:11:04 +01:00
|
|
|
* Called from the local CPU for unfreeze or XEN resume magic.
|
2015-03-25 13:09:16 +01:00
|
|
|
*
|
|
|
|
* No locks required. Nothing can change the per cpu device.
|
|
|
|
*/
|
2015-03-25 13:11:04 +01:00
|
|
|
void tick_resume_local(void)
|
2007-03-06 08:25:42 +01:00
|
|
|
{
|
2015-03-25 13:11:04 +01:00
|
|
|
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
|
|
|
|
bool broadcast = tick_resume_check_broadcast();
|
2007-03-06 08:25:42 +01:00
|
|
|
|
2015-02-27 17:21:32 +05:30
|
|
|
clockevents_tick_resume(td->evtdev);
|
2007-07-21 04:37:34 -07:00
|
|
|
if (!broadcast) {
|
|
|
|
if (td->mode == TICKDEV_MODE_PERIODIC)
|
|
|
|
tick_setup_periodic(td->evtdev, 0);
|
|
|
|
else
|
|
|
|
tick_resume_oneshot();
|
|
|
|
}
|
2021-07-13 15:39:51 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Ensure that hrtimers are up to date and the clockevents device
|
|
|
|
* is reprogrammed correctly when high resolution timers are
|
|
|
|
* enabled.
|
|
|
|
*/
|
|
|
|
hrtimers_resume_local();
|
2007-03-06 08:25:42 +01:00
|
|
|
}
|
|
|
|
|
2015-03-25 13:11:04 +01:00
|
|
|
/**
|
|
|
|
* tick_suspend - Suspend the tick and the broadcast device
|
|
|
|
*
|
|
|
|
* Called from syscore_suspend() via timekeeping_suspend with only one
|
|
|
|
* CPU online and interrupts disabled or from tick_unfreeze() under
|
|
|
|
* tick_freeze_lock.
|
|
|
|
*
|
|
|
|
* No locks required. Nothing can change the per cpu device.
|
|
|
|
*/
|
|
|
|
void tick_suspend(void)
|
|
|
|
{
|
|
|
|
tick_suspend_local();
|
|
|
|
tick_suspend_broadcast();
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* tick_resume - Resume the tick and the broadcast device
|
|
|
|
*
|
|
|
|
* Called from syscore_resume() via timekeeping_resume with only one
|
|
|
|
* CPU online and interrupts disabled.
|
|
|
|
*
|
|
|
|
* No locks required. Nothing can change the per cpu device.
|
|
|
|
*/
|
|
|
|
void tick_resume(void)
|
|
|
|
{
|
|
|
|
tick_resume_broadcast();
|
|
|
|
tick_resume_local();
|
|
|
|
}
|
|
|
|
|
2015-05-16 01:38:15 +02:00
|
|
|
#ifdef CONFIG_SUSPEND
|
PM / sleep: Make it possible to quiesce timers during suspend-to-idle
The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.
However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion. A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.
The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping. That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs. It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.
Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit. Also some
of these callbacks manipulate local clock event devices of the CPUs
which really shouldn't be done after suspending their ticks.
To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices. Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-13 23:50:43 +01:00
|
|
|
static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
|
|
|
|
static unsigned int tick_freeze_depth;
|
|
|
|
|
|
|
|
/**
|
|
|
|
* tick_freeze - Suspend the local tick and (possibly) timekeeping.
|
|
|
|
*
|
|
|
|
* Check if this is the last online CPU executing the function and if so,
|
|
|
|
* suspend timekeeping. Otherwise suspend the local tick.
|
|
|
|
*
|
|
|
|
* Call with interrupts disabled. Must be balanced with %tick_unfreeze().
|
|
|
|
* Interrupts must not be enabled before the subsequent %tick_unfreeze().
|
|
|
|
*/
|
|
|
|
void tick_freeze(void)
|
|
|
|
{
|
|
|
|
raw_spin_lock(&tick_freeze_lock);
|
|
|
|
|
|
|
|
tick_freeze_depth++;
|
2015-05-10 01:23:35 +02:00
|
|
|
if (tick_freeze_depth == num_online_cpus()) {
|
|
|
|
trace_suspend_resume(TPS("timekeeping_freeze"),
|
|
|
|
smp_processor_id(), true);
|
2018-05-25 17:54:41 +02:00
|
|
|
system_state = SYSTEM_SUSPEND;
|
2019-03-29 10:59:09 +08:00
|
|
|
sched_clock_suspend();
|
2015-02-13 23:50:43 +01:00
|
|
|
timekeeping_suspend();
|
2015-05-10 01:23:35 +02:00
|
|
|
} else {
|
2015-03-25 13:11:04 +01:00
|
|
|
tick_suspend_local();
|
2015-05-10 01:23:35 +02:00
|
|
|
}
|
2015-02-13 23:50:43 +01:00
|
|
|
|
|
|
|
raw_spin_unlock(&tick_freeze_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* tick_unfreeze - Resume the local tick and (possibly) timekeeping.
|
|
|
|
*
|
|
|
|
* Check if this is the first CPU executing the function and if so, resume
|
|
|
|
* timekeeping. Otherwise resume the local tick.
|
|
|
|
*
|
|
|
|
* Call with interrupts disabled. Must be balanced with %tick_freeze().
|
|
|
|
* Interrupts must not be enabled after the preceding %tick_freeze().
|
|
|
|
*/
|
|
|
|
void tick_unfreeze(void)
|
|
|
|
{
|
|
|
|
raw_spin_lock(&tick_freeze_lock);
|
|
|
|
|
2015-05-10 01:23:35 +02:00
|
|
|
if (tick_freeze_depth == num_online_cpus()) {
|
2015-02-13 23:50:43 +01:00
|
|
|
timekeeping_resume();
|
2019-03-29 10:59:09 +08:00
|
|
|
sched_clock_resume();
|
2018-05-25 17:54:41 +02:00
|
|
|
system_state = SYSTEM_RUNNING;
|
2015-05-10 01:23:35 +02:00
|
|
|
trace_suspend_resume(TPS("timekeeping_freeze"),
|
|
|
|
smp_processor_id(), false);
|
|
|
|
} else {
|
2020-01-10 16:39:02 +08:00
|
|
|
touch_softlockup_watchdog();
|
2015-04-03 15:21:51 +02:00
|
|
|
tick_resume_local();
|
2015-05-10 01:23:35 +02:00
|
|
|
}
|
2015-02-13 23:50:43 +01:00
|
|
|
|
|
|
|
tick_freeze_depth--;
|
|
|
|
|
|
|
|
raw_spin_unlock(&tick_freeze_lock);
|
|
|
|
}
|
2015-05-16 01:38:15 +02:00
|
|
|
#endif /* CONFIG_SUSPEND */
|
2015-02-13 23:50:43 +01:00
|
|
|
|
2007-02-16 01:28:01 -08:00
|
|
|
/**
|
|
|
|
* tick_init - initialize the tick control
|
|
|
|
*/
|
|
|
|
void __init tick_init(void)
|
|
|
|
{
|
2013-03-05 14:25:32 +01:00
|
|
|
tick_broadcast_init();
|
2014-08-16 17:47:18 +02:00
|
|
|
tick_nohz_init();
|
2007-02-16 01:28:01 -08:00
|
|
|
}
|