Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (146 commits)
  tools, perf: Documentation for the power events API
  perf: Add calls to suspend trace point
  perf script: Make some lists static
  perf script: Use the default lost event handler
  perf session: Warn about errors when processing pipe events too
  perf tools: Fix perf_event.h header usage
  perf test: Clarify some error reports in the open syscall test
  x86, NMI: Add touch_nmi_watchdog to io_check_error delay
  x86: Avoid calling arch_trigger_all_cpu_backtrace() at the same time
  x86: Only call smp_processor_id in non-preempt cases
  perf timechart: Adjust perf timechart to the new power events
  perf: Clean up power events by introducing new, more generic ones
  perf: Do not export power_frequency, but power_start event
  perf test: Add test for counting open syscalls
  perf evsel: Auto allocate resources needed for some methods
  perf evsel: Use {cpu,thread}_map to shorten list of parameters
  perf tools: Refactor all_tids to hold nr and the map
  perf tools: Refactor cpumap to hold nr and the map
  perf evsel: Introduce per cpu and per thread open helpers
  perf evsel: Steal the counter reading routines from stat
  ...
commit 28d9bfc37c

CREDITS (2 changed lines)
@@ -2365,8 +2365,6 @@ E: acme@redhat.com
W: http://oops.ghostprotocols.net:81/blog/
P: 1024D/9224DF01 D5DF E3BB E3C8 BCBB F8AD 841A B6AB 4681 9224 DF01
D: IPX, LLC, DCCP, cyc2x, wl3501_cs, net/ hacks
S: R. Brasílio Itiberê, 4270/1010 - Água Verde
S: 80240-060 - Curitiba - Paraná
S: Brazil

N: Karsten Merker
@@ -1579,20 +1579,12 @@ and is between 256 and 4096 characters. It is defined in the file

	nmi_watchdog=	[KNL,BUGS=X86] Debugging features for SMP kernels
			Format: [panic,][num]
			Valid num: 0,1,2
			Valid num: 0
			0 - turn nmi_watchdog off
			1 - use the IO-APIC timer for the NMI watchdog
			2 - use the local APIC for the NMI watchdog using
			a performance counter. Note: This will use one
			performance counter and the local APIC's performance
			vector.
			When panic is specified, panic when an NMI watchdog
			timeout occurs.
			This is useful when you use a panic=... timeout and
			need the box quickly up again.
			Instead of 1 and 2 it is possible to use the following
			symbolic names: lapic and ioapic
			Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic

	netpoll.carrier_timeout=
			[NET] Specifies amount of time (in seconds) that

Documentation/trace/events-power.txt (new file, 90 lines)
@@ -0,0 +1,90 @@

			Subsystem Trace Points: power

The power tracing system captures events related to power transitions
within the kernel. Broadly speaking there are three major subheadings:

  o Power state switch which reports events related to suspend (S-states),
    cpuidle (C-states) and cpufreq (P-states)
  o System clock related changes
  o Power domains related changes and transitions

This document describes what each of the tracepoints is and why they
might be useful.

Cf. include/trace/events/power.h for the events definitions.

1. Power state switch events
============================

1.1 New trace API
-----------------

A 'cpu' event class gathers the CPU-related events: cpuidle and
cpufreq.

cpu_idle		"state=%lu cpu_id=%lu"
cpu_frequency		"state=%lu cpu_id=%lu"

A suspend event is used to indicate the system going in and out of the
suspend mode:

machine_suspend		"state=%lu"

Note: the value of '-1' or '4294967295' for state means an exit from the
current state, i.e. trace_cpu_idle(4, smp_processor_id()) means that the
system enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT,
smp_processor_id()) means that the system exits the previous idle state.

The event which has 'state=4294967295' in the trace is very important to the
user space tools which are using it to detect the end of the current state,
and so to correctly draw the states diagrams and to calculate accurate
statistics etc.
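For illustration, here is a minimal sketch of how a cpuidle driver could bracket
its idle routine with these events. The my_enter_idle() helper and the chosen
state index are hypothetical; only trace_cpu_idle() and PWR_EVENT_EXIT come from
include/trace/events/power.h as described above.

	#include <trace/events/power.h>

	/* hypothetical driver hook; 'state' is the target C-state index */
	static void my_enter_idle(unsigned int state)
	{
		/* report the C-state we are about to enter */
		trace_cpu_idle(state, smp_processor_id());

		/* ... architecture-specific idle instruction ... */

		/* report that this CPU left the idle state */
		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
	}

User space tools then pair the enter event with the following
'state=4294967295' event on the same cpu_id to compute the residency of
that idle state.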
1.2 DEPRECATED trace API
------------------------

A new Kconfig option CONFIG_EVENT_POWER_TRACING_DEPRECATED with the default
value of 'y' has been created. This allows the legacy trace power API to be
used conjointly with the new trace API.
The Kconfig option, the old trace API (in include/trace/events/power.h) and
the old trace points will disappear in a future release (namely 2.6.41).

power_start		"type=%lu state=%lu cpu_id=%lu"
power_frequency		"type=%lu state=%lu cpu_id=%lu"
power_end		"cpu_id=%lu"

The 'type' parameter takes one of those macros:
 . POWER_NONE	= 0,
 . POWER_CSTATE	= 1,	/* C-State */
 . POWER_PSTATE	= 2,	/* Frequency change or DVFS */

The 'state' parameter is set depending on the type:
 . Target C-state for type=POWER_CSTATE,
 . Target frequency for type=POWER_PSTATE,

power_end is used to indicate the exit of a state, corresponding to the latest
power_start event.

2. Clocks events
================
The clock events are used for clock enable/disable and for
clock rate change.

clock_enable		"%s state=%lu cpu_id=%lu"
clock_disable		"%s state=%lu cpu_id=%lu"
clock_set_rate		"%s state=%lu cpu_id=%lu"

The first parameter gives the clock name (e.g. "gpio1_iclk").
The second parameter is '1' for enable, '0' for disable, the target
clock rate for set_rate.
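As an illustration, a platform clock framework could emit these events from its
enable and set_rate paths. The function names and the assumption that struct clk
carries a 'name' field are hypothetical (clock structures are platform specific);
only the trace_clock_enable()/trace_clock_set_rate() tracepoints come from
include/trace/events/power.h.

	#include <trace/events/power.h>

	/* hypothetical platform clock operations */
	static int my_clk_enable(struct clk *clk)
	{
		trace_clock_enable(clk->name, 1, smp_processor_id());
		/* ... set the enable bit in the clock controller ... */
		return 0;
	}

	static int my_clk_set_rate(struct clk *clk, unsigned long rate)
	{
		trace_clock_set_rate(clk->name, rate, smp_processor_id());
		/* ... program the dividers/PLL for the new rate ... */
		return 0;
	}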
3. Power domains events
=======================
The power domain events are used for power domains transitions

power_domain_target	"%s state=%lu cpu_id=%lu"

The first parameter gives the power domain name (e.g. "mpu_pwrdm").
The second parameter is the power domain target state.
@@ -4627,7 +4627,7 @@ PERFORMANCE EVENTS SUBSYSTEM
M:	Peter Zijlstra <a.p.zijlstra@chello.nl>
M:	Paul Mackerras <paulus@samba.org>
M:	Ingo Molnar <mingo@elte.hu>
M:	Arnaldo Carvalho de Melo <acme@redhat.com>
M:	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
S:	Supported
F:	kernel/perf_event*.c
F:	include/linux/perf_event.h
@@ -1,10 +1,4 @@
#ifndef __ASM_ALPHA_PERF_EVENT_H
#define __ASM_ALPHA_PERF_EVENT_H

#ifdef CONFIG_PERF_EVENTS
extern void init_hw_perf_events(void);
#else
static inline void init_hw_perf_events(void) { }
#endif

#endif /* __ASM_ALPHA_PERF_EVENT_H */
@@ -112,8 +112,6 @@ init_IRQ(void)
	wrent(entInt, 0);

	alpha_mv.init_irq();

	init_hw_perf_events();
}

/*
@@ -14,6 +14,7 @@
#include <linux/kernel.h>
#include <linux/kdebug.h>
#include <linux/mutex.h>
#include <linux/init.h>

#include <asm/hwrpb.h>
#include <asm/atomic.h>
@@ -863,13 +864,13 @@ static void alpha_perf_event_irq_handler(unsigned long la_ptr,
/*
 * Init call to initialise performance events at kernel startup.
 */
void __init init_hw_perf_events(void)
int __init init_hw_perf_events(void)
{
	pr_info("Performance events: ");

	if (!supported_cpu()) {
		pr_cont("No support for your CPU.\n");
		return;
		return 0;
	}

	pr_cont("Supported CPU type!\n");
@@ -881,6 +882,8 @@ void __init init_hw_perf_events(void)
	/* And set up PMU specification */
	alpha_pmu = &ev67_pmu;

	perf_pmu_register(&pmu);
}
	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);

	return 0;
}
early_initcall(init_hw_perf_events);
@ -3034,11 +3034,11 @@ init_hw_perf_events(void)
|
||||
pr_info("no hardware support available\n");
|
||||
}
|
||||
|
||||
perf_pmu_register(&pmu);
|
||||
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
|
||||
|
||||
return 0;
|
||||
}
|
||||
arch_initcall(init_hw_perf_events);
|
||||
early_initcall(init_hw_perf_events);
|
||||
|
||||
/*
|
||||
* Callchain handling code.
|
||||
|
@ -1047,6 +1047,6 @@ init_hw_perf_events(void)
|
||||
|
||||
return 0;
|
||||
}
|
||||
arch_initcall(init_hw_perf_events);
|
||||
early_initcall(init_hw_perf_events);
|
||||
|
||||
#endif /* defined(CONFIG_CPU_MIPS32)... */
|
||||
|
@ -126,4 +126,4 @@ static int init_e500_pmu(void)
|
||||
return register_fsl_emb_pmu(&e500_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_e500_pmu);
|
||||
early_initcall(init_e500_pmu);
|
||||
|
@ -414,4 +414,4 @@ static int init_mpc7450_pmu(void)
|
||||
return register_power_pmu(&mpc7450_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_mpc7450_pmu);
|
||||
early_initcall(init_mpc7450_pmu);
|
||||
|
@ -1379,7 +1379,7 @@ int register_power_pmu(struct power_pmu *pmu)
|
||||
freeze_events_kernel = MMCR0_FCHV;
|
||||
#endif /* CONFIG_PPC64 */
|
||||
|
||||
perf_pmu_register(&power_pmu);
|
||||
perf_pmu_register(&power_pmu, "cpu", PERF_TYPE_RAW);
|
||||
perf_cpu_notifier(power_pmu_notifier);
|
||||
|
||||
return 0;
|
||||
|
@ -681,7 +681,7 @@ int register_fsl_emb_pmu(struct fsl_emb_pmu *pmu)
|
||||
pr_info("%s performance monitor hardware support registered\n",
|
||||
pmu->name);
|
||||
|
||||
perf_pmu_register(&fsl_emb_pmu);
|
||||
perf_pmu_register(&fsl_emb_pmu, "cpu", PERF_TYPE_RAW);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -613,4 +613,4 @@ static int init_power4_pmu(void)
|
||||
return register_power_pmu(&power4_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_power4_pmu);
|
||||
early_initcall(init_power4_pmu);
|
||||
|
@ -682,4 +682,4 @@ static int init_power5p_pmu(void)
|
||||
return register_power_pmu(&power5p_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_power5p_pmu);
|
||||
early_initcall(init_power5p_pmu);
|
||||
|
@ -621,4 +621,4 @@ static int init_power5_pmu(void)
|
||||
return register_power_pmu(&power5_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_power5_pmu);
|
||||
early_initcall(init_power5_pmu);
|
||||
|
@ -544,4 +544,4 @@ static int init_power6_pmu(void)
|
||||
return register_power_pmu(&power6_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_power6_pmu);
|
||||
early_initcall(init_power6_pmu);
|
||||
|
@ -369,4 +369,4 @@ static int init_power7_pmu(void)
|
||||
return register_power_pmu(&power7_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_power7_pmu);
|
||||
early_initcall(init_power7_pmu);
|
||||
|
@ -494,4 +494,4 @@ static int init_ppc970_pmu(void)
|
||||
return register_power_pmu(&ppc970_pmu);
|
||||
}
|
||||
|
||||
arch_initcall(init_ppc970_pmu);
|
||||
early_initcall(init_ppc970_pmu);
|
||||
|
@ -250,4 +250,4 @@ static int __init sh7750_pmu_init(void)
|
||||
|
||||
return register_sh_pmu(&sh7750_pmu);
|
||||
}
|
||||
arch_initcall(sh7750_pmu_init);
|
||||
early_initcall(sh7750_pmu_init);
|
||||
|
@ -284,4 +284,4 @@ static int __init sh4a_pmu_init(void)
|
||||
|
||||
return register_sh_pmu(&sh4a_pmu);
|
||||
}
|
||||
arch_initcall(sh4a_pmu_init);
|
||||
early_initcall(sh4a_pmu_init);
|
||||
|
@ -389,7 +389,7 @@ int __cpuinit register_sh_pmu(struct sh_pmu *_pmu)
|
||||
|
||||
WARN_ON(_pmu->num_events > MAX_HWEVENTS);
|
||||
|
||||
perf_pmu_register(&pmu);
|
||||
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
|
||||
perf_cpu_notifier(sh_pmu_notifier);
|
||||
return 0;
|
||||
}
|
||||
|
@ -4,8 +4,6 @@
|
||||
#ifdef CONFIG_PERF_EVENTS
|
||||
#include <asm/ptrace.h>
|
||||
|
||||
extern void init_hw_perf_events(void);
|
||||
|
||||
#define perf_arch_fetch_caller_regs(regs, ip) \
|
||||
do { \
|
||||
unsigned long _pstate, _asi, _pil, _i7, _fp; \
|
||||
@ -26,8 +24,6 @@ do { \
|
||||
(regs)->u_regs[UREG_I6] = _fp; \
|
||||
(regs)->u_regs[UREG_I7] = _i7; \
|
||||
} while (0)
|
||||
#else
|
||||
static inline void init_hw_perf_events(void) { }
|
||||
#endif
|
||||
|
||||
#endif
|
||||
|
@ -270,8 +270,6 @@ int __init nmi_init(void)
|
||||
atomic_set(&nmi_active, -1);
|
||||
}
|
||||
}
|
||||
if (!err)
|
||||
init_hw_perf_events();
|
||||
|
||||
return err;
|
||||
}
|
||||
|
@ -1307,20 +1307,23 @@ static bool __init supported_pmu(void)
|
||||
return false;
|
||||
}
|
||||
|
||||
void __init init_hw_perf_events(void)
|
||||
int __init init_hw_perf_events(void)
|
||||
{
|
||||
pr_info("Performance events: ");
|
||||
|
||||
if (!supported_pmu()) {
|
||||
pr_cont("No support for PMU type '%s'\n", sparc_pmu_type);
|
||||
return;
|
||||
return 0;
|
||||
}
|
||||
|
||||
pr_cont("Supported PMU type is '%s'\n", sparc_pmu_type);
|
||||
|
||||
perf_pmu_register(&pmu);
|
||||
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
|
||||
register_die_notifier(&perf_event_nmi_notifier);
|
||||
|
||||
return 0;
|
||||
}
|
||||
early_initcall(init_hw_perf_events);
|
||||
|
||||
void perf_callchain_kernel(struct perf_callchain_entry *entry,
|
||||
struct pt_regs *regs)
|
||||
|
@ -180,8 +180,15 @@ extern void *text_poke_early(void *addr, const void *opcode, size_t len);
|
||||
* On the local CPU you need to be protected again NMI or MCE handlers seeing an
|
||||
* inconsistent instruction while you patch.
|
||||
*/
|
||||
struct text_poke_param {
|
||||
void *addr;
|
||||
const void *opcode;
|
||||
size_t len;
|
||||
};
|
||||
|
||||
extern void *text_poke(void *addr, const void *opcode, size_t len);
|
||||
extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
|
||||
extern void text_poke_smp_batch(struct text_poke_param *params, int n);
|
||||
|
||||
#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
|
||||
#define IDEAL_NOP_SIZE_5 5
|
||||
|
@ -15,10 +15,6 @@ static inline int irq_canonicalize(int irq)
|
||||
return ((irq == 2) ? 9 : irq);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_X86_LOCAL_APIC
|
||||
# define ARCH_HAS_NMI_WATCHDOG
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_32
|
||||
extern void irq_ctx_init(int cpu);
|
||||
#else
|
||||
|
@ -28,7 +28,7 @@ extern void die(const char *, struct pt_regs *,long);
|
||||
extern int __must_check __die(const char *, struct pt_regs *, long);
|
||||
extern void show_registers(struct pt_regs *regs);
|
||||
extern void show_trace(struct task_struct *t, struct pt_regs *regs,
|
||||
unsigned long *sp, unsigned long bp);
|
||||
unsigned long *sp);
|
||||
extern void __show_regs(struct pt_regs *regs, int all);
|
||||
extern void show_regs(struct pt_regs *regs);
|
||||
extern unsigned long oops_begin(void);
|
||||
|
@ -123,6 +123,10 @@
|
||||
#define MSR_AMD64_IBSCTL 0xc001103a
|
||||
#define MSR_AMD64_IBSBRTARGET 0xc001103b
|
||||
|
||||
/* Fam 15h MSRs */
|
||||
#define MSR_F15H_PERF_CTL 0xc0010200
|
||||
#define MSR_F15H_PERF_CTR 0xc0010201
|
||||
|
||||
/* Fam 10h MSRs */
|
||||
#define MSR_FAM10H_MMIO_CONF_BASE 0xc0010058
|
||||
#define FAM10H_MMIO_CONF_ENABLE (1<<0)
|
||||
|
@ -5,41 +5,15 @@
|
||||
#include <asm/irq.h>
|
||||
#include <asm/io.h>
|
||||
|
||||
#ifdef ARCH_HAS_NMI_WATCHDOG
|
||||
|
||||
/**
|
||||
* do_nmi_callback
|
||||
*
|
||||
* Check to see if a callback exists and execute it. Return 1
|
||||
* if the handler exists and was handled successfully.
|
||||
*/
|
||||
int do_nmi_callback(struct pt_regs *regs, int cpu);
|
||||
#ifdef CONFIG_X86_LOCAL_APIC
|
||||
|
||||
extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
|
||||
extern int check_nmi_watchdog(void);
|
||||
#if !defined(CONFIG_LOCKUP_DETECTOR)
|
||||
extern int nmi_watchdog_enabled;
|
||||
#endif
|
||||
extern int avail_to_resrv_perfctr_nmi_bit(unsigned int);
|
||||
extern int reserve_perfctr_nmi(unsigned int);
|
||||
extern void release_perfctr_nmi(unsigned int);
|
||||
extern int reserve_evntsel_nmi(unsigned int);
|
||||
extern void release_evntsel_nmi(unsigned int);
|
||||
|
||||
extern void setup_apic_nmi_watchdog(void *);
|
||||
extern void stop_apic_nmi_watchdog(void *);
|
||||
extern void disable_timer_nmi_watchdog(void);
|
||||
extern void enable_timer_nmi_watchdog(void);
|
||||
extern int nmi_watchdog_tick(struct pt_regs *regs, unsigned reason);
|
||||
extern void cpu_nmi_set_wd_enabled(void);
|
||||
|
||||
extern atomic_t nmi_active;
|
||||
extern unsigned int nmi_watchdog;
|
||||
#define NMI_NONE 0
|
||||
#define NMI_IO_APIC 1
|
||||
#define NMI_LOCAL_APIC 2
|
||||
#define NMI_INVALID 3
|
||||
|
||||
struct ctl_table;
|
||||
extern int proc_nmi_enabled(struct ctl_table *, int ,
|
||||
void __user *, size_t *, loff_t *);
|
||||
@ -47,33 +21,8 @@ extern int unknown_nmi_panic;
|
||||
|
||||
void arch_trigger_all_cpu_backtrace(void);
|
||||
#define arch_trigger_all_cpu_backtrace arch_trigger_all_cpu_backtrace
|
||||
|
||||
static inline void localise_nmi_watchdog(void)
|
||||
{
|
||||
if (nmi_watchdog == NMI_IO_APIC)
|
||||
nmi_watchdog = NMI_LOCAL_APIC;
|
||||
}
|
||||
|
||||
/* check if nmi_watchdog is active (ie was specified at boot) */
|
||||
static inline int nmi_watchdog_active(void)
|
||||
{
|
||||
/*
|
||||
* actually it should be:
|
||||
* return (nmi_watchdog == NMI_LOCAL_APIC ||
|
||||
* nmi_watchdog == NMI_IO_APIC)
|
||||
* but since they are power of two we could use a
|
||||
* cheaper way --cvg
|
||||
*/
|
||||
return nmi_watchdog & (NMI_LOCAL_APIC | NMI_IO_APIC);
|
||||
}
|
||||
#endif
|
||||
|
||||
void lapic_watchdog_stop(void);
|
||||
int lapic_watchdog_init(unsigned nmi_hz);
|
||||
int lapic_wd_event(unsigned nmi_hz);
|
||||
unsigned lapic_adjust_nmi_hz(unsigned hz);
|
||||
void disable_lapic_nmi_watchdog(void);
|
||||
void enable_lapic_nmi_watchdog(void);
|
||||
void stop_nmi(void);
|
||||
void restart_nmi(void);
|
||||
|
||||
|
@ -125,7 +125,6 @@ union cpuid10_edx {
|
||||
#define IBS_OP_MAX_CNT_EXT 0x007FFFFFULL /* not a register bit mask */
|
||||
|
||||
#ifdef CONFIG_PERF_EVENTS
|
||||
extern void init_hw_perf_events(void);
|
||||
extern void perf_events_lapic_init(void);
|
||||
|
||||
#define PERF_EVENT_INDEX_OFFSET 0
|
||||
@ -156,7 +155,6 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
|
||||
}
|
||||
|
||||
#else
|
||||
static inline void init_hw_perf_events(void) { }
|
||||
static inline void perf_events_lapic_init(void) { }
|
||||
#endif
|
||||
|
||||
|
@ -744,14 +744,6 @@ enum P4_ESCR_EMASKS {
|
||||
};
|
||||
|
||||
/*
|
||||
* P4 PEBS specifics (Replay Event only)
|
||||
*
|
||||
* Format (bits):
|
||||
* 0-6: metric from P4_PEBS_METRIC enum
|
||||
* 7 : reserved
|
||||
* 8 : reserved
|
||||
* 9-11 : reserved
|
||||
*
|
||||
* Note we have UOP and PEBS bits reserved for now
|
||||
* just in case if we will need them once
|
||||
*/
|
||||
@ -788,5 +780,60 @@ enum P4_PEBS_METRIC {
|
||||
P4_PEBS_METRIC__max
|
||||
};
|
||||
|
||||
/*
|
||||
* Notes on internal configuration of ESCR+CCCR tuples
|
||||
*
|
||||
* Since P4 has quite the different architecture of
|
||||
* performance registers in compare with "architectural"
|
||||
* once and we have on 64 bits to keep configuration
|
||||
* of performance event, the following trick is used.
|
||||
*
|
||||
* 1) Since both ESCR and CCCR registers have only low
|
||||
* 32 bits valuable, we pack them into a single 64 bit
|
||||
* configuration. Low 32 bits of such config correspond
|
||||
* to low 32 bits of CCCR register and high 32 bits
|
||||
* correspond to low 32 bits of ESCR register.
|
||||
*
|
||||
* 2) The meaning of every bit of such config field can
|
||||
* be found in Intel SDM but it should be noted that
|
||||
* we "borrow" some reserved bits for own usage and
|
||||
* clean them or set to a proper value when we do
|
||||
* a real write to hardware registers.
|
||||
*
|
||||
* 3) The format of bits of config is the following
|
||||
* and should be either 0 or set to some predefined
|
||||
* values:
|
||||
*
|
||||
* Low 32 bits
|
||||
* -----------
|
||||
* 0-6: P4_PEBS_METRIC enum
|
||||
* 7-11: reserved
|
||||
* 12: reserved (Enable)
|
||||
* 13-15: reserved (ESCR select)
|
||||
* 16-17: Active Thread
|
||||
* 18: Compare
|
||||
* 19: Complement
|
||||
* 20-23: Threshold
|
||||
* 24: Edge
|
||||
* 25: reserved (FORCE_OVF)
|
||||
* 26: reserved (OVF_PMI_T0)
|
||||
* 27: reserved (OVF_PMI_T1)
|
||||
* 28-29: reserved
|
||||
* 30: reserved (Cascade)
|
||||
* 31: reserved (OVF)
|
||||
*
|
||||
* High 32 bits
|
||||
* ------------
|
||||
* 0: reserved (T1_USR)
|
||||
* 1: reserved (T1_OS)
|
||||
* 2: reserved (T0_USR)
|
||||
* 3: reserved (T0_OS)
|
||||
* 4: Tag Enable
|
||||
* 5-8: Tag Value
|
||||
* 9-24: Event Mask (may use P4_ESCR_EMASK_BIT helper)
|
||||
* 25-30: enum P4_EVENTS
|
||||
* 31: reserved (HT thread)
|
||||
*/
|
||||
|
||||
#endif /* PERF_EVENT_P4_H */
|
||||
|
||||
|
@ -48,7 +48,6 @@ static inline void __init smpboot_setup_io_apic(void)
|
||||
setup_IO_APIC();
|
||||
else {
|
||||
nr_ioapics = 0;
|
||||
localise_nmi_watchdog();
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
@ -7,6 +7,7 @@
|
||||
#define _ASM_X86_STACKTRACE_H
|
||||
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/ptrace.h>
|
||||
|
||||
extern int kstack_depth_to_print;
|
||||
|
||||
@ -46,7 +47,7 @@ struct stacktrace_ops {
|
||||
};
|
||||
|
||||
void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
|
||||
unsigned long *stack, unsigned long bp,
|
||||
unsigned long *stack,
|
||||
const struct stacktrace_ops *ops, void *data);
|
||||
|
||||
#ifdef CONFIG_X86_32
|
||||
@ -57,13 +58,39 @@ void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
|
||||
#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_FRAME_POINTER
|
||||
static inline unsigned long
|
||||
stack_frame(struct task_struct *task, struct pt_regs *regs)
|
||||
{
|
||||
unsigned long bp;
|
||||
|
||||
if (regs)
|
||||
return regs->bp;
|
||||
|
||||
if (task == current) {
|
||||
/* Grab bp right from our regs */
|
||||
get_bp(bp);
|
||||
return bp;
|
||||
}
|
||||
|
||||
/* bp is the last reg pushed by switch_to */
|
||||
return *(unsigned long *)task->thread.sp;
|
||||
}
|
||||
#else
|
||||
static inline unsigned long
|
||||
stack_frame(struct task_struct *task, struct pt_regs *regs)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
#endif
|
||||
|
||||
extern void
|
||||
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *stack, unsigned long bp, char *log_lvl);
|
||||
unsigned long *stack, char *log_lvl);
|
||||
|
||||
extern void
|
||||
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *sp, unsigned long bp, char *log_lvl);
|
||||
unsigned long *sp, char *log_lvl);
|
||||
|
||||
extern unsigned int code_bytes;
|
||||
|
||||
|
@ -10,12 +10,6 @@
|
||||
unsigned long long native_sched_clock(void);
|
||||
extern int recalibrate_cpu_khz(void);
|
||||
|
||||
#if defined(CONFIG_X86_32) && defined(CONFIG_X86_IO_APIC)
|
||||
extern int timer_ack;
|
||||
#else
|
||||
# define timer_ack (0)
|
||||
#endif
|
||||
|
||||
extern int no_timer_check;
|
||||
|
||||
/* Accelerators for sched_clock()
|
||||
|
@ -591,17 +591,21 @@ static atomic_t stop_machine_first;
|
||||
static int wrote_text;
|
||||
|
||||
struct text_poke_params {
|
||||
void *addr;
|
||||
const void *opcode;
|
||||
size_t len;
|
||||
struct text_poke_param *params;
|
||||
int nparams;
|
||||
};
|
||||
|
||||
static int __kprobes stop_machine_text_poke(void *data)
|
||||
{
|
||||
struct text_poke_params *tpp = data;
|
||||
struct text_poke_param *p;
|
||||
int i;
|
||||
|
||||
if (atomic_dec_and_test(&stop_machine_first)) {
|
||||
text_poke(tpp->addr, tpp->opcode, tpp->len);
|
||||
for (i = 0; i < tpp->nparams; i++) {
|
||||
p = &tpp->params[i];
|
||||
text_poke(p->addr, p->opcode, p->len);
|
||||
}
|
||||
smp_wmb(); /* Make sure other cpus see that this has run */
|
||||
wrote_text = 1;
|
||||
} else {
|
||||
@ -610,8 +614,12 @@ static int __kprobes stop_machine_text_poke(void *data)
|
||||
smp_mb(); /* Load wrote_text before following execution */
|
||||
}
|
||||
|
||||
flush_icache_range((unsigned long)tpp->addr,
|
||||
(unsigned long)tpp->addr + tpp->len);
|
||||
for (i = 0; i < tpp->nparams; i++) {
|
||||
p = &tpp->params[i];
|
||||
flush_icache_range((unsigned long)p->addr,
|
||||
(unsigned long)p->addr + p->len);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -631,10 +639,13 @@ static int __kprobes stop_machine_text_poke(void *data)
|
||||
void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
|
||||
{
|
||||
struct text_poke_params tpp;
|
||||
struct text_poke_param p;
|
||||
|
||||
tpp.addr = addr;
|
||||
tpp.opcode = opcode;
|
||||
tpp.len = len;
|
||||
p.addr = addr;
|
||||
p.opcode = opcode;
|
||||
p.len = len;
|
||||
tpp.params = &p;
|
||||
tpp.nparams = 1;
|
||||
atomic_set(&stop_machine_first, 1);
|
||||
wrote_text = 0;
|
||||
/* Use __stop_machine() because the caller already got online_cpus. */
|
||||
@ -642,6 +653,26 @@ void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
|
||||
return addr;
|
||||
}
|
||||
|
||||
/**
 * text_poke_smp_batch - Update instructions on a live kernel on SMP
 * @params: an array of text_poke parameters
 * @n: the number of elements in params.
 *
 * Modify multi-byte instruction by using stop_machine() on SMP. Since the
 * stop_machine() is heavy task, it is better to aggregate text_poke requests
 * and do it once if possible.
 *
 * Note: Must be called under get_online_cpus() and text_mutex.
 */
void __kprobes text_poke_smp_batch(struct text_poke_param *params, int n)
{
	struct text_poke_params tpp = {.params = params, .nparams = n};

	atomic_set(&stop_machine_first, 1);
	wrote_text = 0;
	stop_machine(stop_machine_text_poke, (void *)&tpp, NULL);
}

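For context, a hedged sketch of how a caller might batch several patch sites into
a single stop_machine() pass with this interface. The queue_patch()/flush_patches()
helpers and the fixed-size array are made up for illustration; only
text_poke_smp_batch(), struct text_poke_param, text_mutex and
get_online_cpus()/put_online_cpus() are existing kernel interfaces.

	/* Sketch: aggregate several patch sites, then poke them in one pass. */
	static struct text_poke_param batch[8];	/* illustrative fixed-size batch */
	static int batch_n;

	static void queue_patch(void *addr, const void *opcode, size_t len)
	{
		batch[batch_n].addr   = addr;
		batch[batch_n].opcode = opcode;
		batch[batch_n].len    = len;
		batch_n++;
	}

	static void flush_patches(void)
	{
		get_online_cpus();
		mutex_lock(&text_mutex);
		/* one stop_machine() covers all queued sites */
		text_poke_smp_batch(batch, batch_n);
		batch_n = 0;
		mutex_unlock(&text_mutex);
		put_online_cpus();
	}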
#if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
|
||||
|
||||
#ifdef CONFIG_X86_64
|
||||
|
@ -3,10 +3,7 @@
|
||||
#
|
||||
|
||||
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o probe_$(BITS).o ipi.o
|
||||
ifneq ($(CONFIG_HARDLOCKUP_DETECTOR),y)
|
||||
obj-$(CONFIG_X86_LOCAL_APIC) += nmi.o
|
||||
endif
|
||||
obj-$(CONFIG_HARDLOCKUP_DETECTOR) += hw_nmi.o
|
||||
obj-y += hw_nmi.o
|
||||
|
||||
obj-$(CONFIG_X86_IO_APIC) += io_apic.o
|
||||
obj-$(CONFIG_SMP) += ipi.o
|
||||
|
@ -31,7 +31,6 @@
|
||||
#include <linux/init.h>
|
||||
#include <linux/cpu.h>
|
||||
#include <linux/dmi.h>
|
||||
#include <linux/nmi.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/mm.h>
|
||||
|
||||
@ -799,11 +798,7 @@ void __init setup_boot_APIC_clock(void)
|
||||
* PIT/HPET going. Otherwise register lapic as a dummy
|
||||
* device.
|
||||
*/
|
||||
if (nmi_watchdog != NMI_IO_APIC)
|
||||
lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY;
|
||||
else
|
||||
pr_warning("APIC timer registered as dummy,"
|
||||
" due to nmi_watchdog=%d!\n", nmi_watchdog);
|
||||
lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY;
|
||||
|
||||
/* Setup the lapic or request the broadcast */
|
||||
setup_APIC_timer();
|
||||
@ -1387,7 +1382,6 @@ void __cpuinit end_local_APIC_setup(void)
|
||||
}
|
||||
#endif
|
||||
|
||||
setup_apic_nmi_watchdog(NULL);
|
||||
apic_pm_activate();
|
||||
|
||||
/*
|
||||
@ -1758,17 +1752,10 @@ int __init APIC_init_uniprocessor(void)
|
||||
setup_IO_APIC();
|
||||
else {
|
||||
nr_ioapics = 0;
|
||||
localise_nmi_watchdog();
|
||||
}
|
||||
#else
|
||||
localise_nmi_watchdog();
|
||||
#endif
|
||||
|
||||
x86_init.timers.setup_percpu_clockev();
|
||||
#ifdef CONFIG_X86_64
|
||||
check_nmi_watchdog();
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@ -17,20 +17,31 @@
|
||||
#include <linux/nmi.h>
|
||||
#include <linux/module.h>
|
||||
|
||||
#ifdef CONFIG_HARDLOCKUP_DETECTOR
|
||||
u64 hw_nmi_get_sample_period(void)
|
||||
{
|
||||
return (u64)(cpu_khz) * 1000 * 60;
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef ARCH_HAS_NMI_WATCHDOG
|
||||
|
||||
#ifdef arch_trigger_all_cpu_backtrace
|
||||
/* For reliability, we're prepared to waste bits here. */
|
||||
static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
|
||||
|
||||
/* "in progress" flag of arch_trigger_all_cpu_backtrace */
|
||||
static unsigned long backtrace_flag;
|
||||
|
||||
void arch_trigger_all_cpu_backtrace(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (test_and_set_bit(0, &backtrace_flag))
|
||||
/*
|
||||
* If there is already a trigger_all_cpu_backtrace() in progress
|
||||
* (backtrace_flag == 1), don't output double cpu dump infos.
|
||||
*/
|
||||
return;
|
||||
|
||||
cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
|
||||
|
||||
printk(KERN_INFO "sending NMI to all CPUs:\n");
|
||||
@ -42,6 +53,9 @@ void arch_trigger_all_cpu_backtrace(void)
|
||||
break;
|
||||
mdelay(1);
|
||||
}
|
||||
|
||||
clear_bit(0, &backtrace_flag);
|
||||
smp_mb__after_clear_bit();
|
||||
}
|
||||
|
||||
static int __kprobes
|
||||
@ -50,7 +64,7 @@ arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self,
|
||||
{
|
||||
struct die_args *args = __args;
|
||||
struct pt_regs *regs;
|
||||
int cpu = smp_processor_id();
|
||||
int cpu;
|
||||
|
||||
switch (cmd) {
|
||||
case DIE_NMI:
|
||||
@ -62,6 +76,7 @@ arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self,
|
||||
}
|
||||
|
||||
regs = args->regs;
|
||||
cpu = smp_processor_id();
|
||||
|
||||
if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
|
||||
static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
|
||||
@ -91,18 +106,3 @@ static int __init register_trigger_all_cpu_backtrace(void)
|
||||
}
|
||||
early_initcall(register_trigger_all_cpu_backtrace);
|
||||
#endif
|
||||
|
||||
/* STUB calls to mimic old nmi_watchdog behaviour */
|
||||
#if defined(CONFIG_X86_LOCAL_APIC)
|
||||
unsigned int nmi_watchdog = NMI_NONE;
|
||||
EXPORT_SYMBOL(nmi_watchdog);
|
||||
void acpi_nmi_enable(void) { return; }
|
||||
void acpi_nmi_disable(void) { return; }
|
||||
#endif
|
||||
atomic_t nmi_active = ATOMIC_INIT(0); /* oprofile uses this */
|
||||
EXPORT_SYMBOL(nmi_active);
|
||||
int unknown_nmi_panic;
|
||||
void cpu_nmi_set_wd_enabled(void) { return; }
|
||||
void stop_apic_nmi_watchdog(void *unused) { return; }
|
||||
void setup_apic_nmi_watchdog(void *unused) { return; }
|
||||
int __init check_nmi_watchdog(void) { return 0; }
|
||||
|
@ -54,7 +54,6 @@
|
||||
#include <asm/dma.h>
|
||||
#include <asm/timer.h>
|
||||
#include <asm/i8259.h>
|
||||
#include <asm/nmi.h>
|
||||
#include <asm/msidef.h>
|
||||
#include <asm/hypertransport.h>
|
||||
#include <asm/setup.h>
|
||||
@ -2642,24 +2641,6 @@ static void lapic_register_intr(int irq)
|
||||
"edge");
|
||||
}
|
||||
|
||||
static void __init setup_nmi(void)
|
||||
{
|
||||
/*
|
||||
* Dirty trick to enable the NMI watchdog ...
|
||||
* We put the 8259A master into AEOI mode and
|
||||
* unmask on all local APICs LVT0 as NMI.
|
||||
*
|
||||
* The idea to use the 8259A in AEOI mode ('8259A Virtual Wire')
|
||||
* is from Maciej W. Rozycki - so we do not have to EOI from
|
||||
* the NMI handler or the timer interrupt.
|
||||
*/
|
||||
apic_printk(APIC_VERBOSE, KERN_INFO "activating NMI Watchdog ...");
|
||||
|
||||
enable_NMI_through_LVT0();
|
||||
|
||||
apic_printk(APIC_VERBOSE, " done.\n");
|
||||
}
|
||||
|
||||
/*
|
||||
* This looks a bit hackish but it's about the only one way of sending
|
||||
* a few INTA cycles to 8259As and any associated glue logic. ICR does
|
||||
@ -2765,15 +2746,6 @@ static inline void __init check_timer(void)
|
||||
*/
|
||||
apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
|
||||
legacy_pic->init(1);
|
||||
#ifdef CONFIG_X86_32
|
||||
{
|
||||
unsigned int ver;
|
||||
|
||||
ver = apic_read(APIC_LVR);
|
||||
ver = GET_APIC_VERSION(ver);
|
||||
timer_ack = (nmi_watchdog == NMI_IO_APIC && !APIC_INTEGRATED(ver));
|
||||
}
|
||||
#endif
|
||||
|
||||
pin1 = find_isa_irq_pin(0, mp_INT);
|
||||
apic1 = find_isa_irq_apic(0, mp_INT);
|
||||
@ -2821,10 +2793,6 @@ static inline void __init check_timer(void)
|
||||
unmask_ioapic(cfg);
|
||||
}
|
||||
if (timer_irq_works()) {
|
||||
if (nmi_watchdog == NMI_IO_APIC) {
|
||||
setup_nmi();
|
||||
legacy_pic->unmask(0);
|
||||
}
|
||||
if (disable_timer_pin_1 > 0)
|
||||
clear_IO_APIC_pin(0, pin1);
|
||||
goto out;
|
||||
@ -2850,11 +2818,6 @@ static inline void __init check_timer(void)
|
||||
if (timer_irq_works()) {
|
||||
apic_printk(APIC_QUIET, KERN_INFO "....... works.\n");
|
||||
timer_through_8259 = 1;
|
||||
if (nmi_watchdog == NMI_IO_APIC) {
|
||||
legacy_pic->mask(0);
|
||||
setup_nmi();
|
||||
legacy_pic->unmask(0);
|
||||
}
|
||||
goto out;
|
||||
}
|
||||
/*
|
||||
@ -2866,15 +2829,6 @@ static inline void __init check_timer(void)
|
||||
apic_printk(APIC_QUIET, KERN_INFO "....... failed.\n");
|
||||
}
|
||||
|
||||
if (nmi_watchdog == NMI_IO_APIC) {
|
||||
apic_printk(APIC_QUIET, KERN_WARNING "timer doesn't work "
|
||||
"through the IO-APIC - disabling NMI Watchdog!\n");
|
||||
nmi_watchdog = NMI_NONE;
|
||||
}
|
||||
#ifdef CONFIG_X86_32
|
||||
timer_ack = 0;
|
||||
#endif
|
||||
|
||||
apic_printk(APIC_QUIET, KERN_INFO
|
||||
"...trying to set up timer as Virtual Wire IRQ...\n");
|
||||
|
||||
|
@ -1,567 +0,0 @@
|
||||
/*
|
||||
* NMI watchdog support on APIC systems
|
||||
*
|
||||
* Started by Ingo Molnar <mingo@redhat.com>
|
||||
*
|
||||
* Fixes:
|
||||
* Mikael Pettersson : AMD K7 support for local APIC NMI watchdog.
|
||||
* Mikael Pettersson : Power Management for local APIC NMI watchdog.
|
||||
* Mikael Pettersson : Pentium 4 support for local APIC NMI watchdog.
|
||||
* Pavel Machek and
|
||||
* Mikael Pettersson : PM converted to driver model. Disable/enable API.
|
||||
*/
|
||||
|
||||
#include <asm/apic.h>
|
||||
|
||||
#include <linux/nmi.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/delay.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/sysdev.h>
|
||||
#include <linux/sysctl.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/kprobes.h>
|
||||
#include <linux/cpumask.h>
|
||||
#include <linux/kernel_stat.h>
|
||||
#include <linux/kdebug.h>
|
||||
#include <linux/smp.h>
|
||||
|
||||
#include <asm/i8259.h>
|
||||
#include <asm/io_apic.h>
|
||||
#include <asm/proto.h>
|
||||
#include <asm/timer.h>
|
||||
|
||||
#include <asm/mce.h>
|
||||
|
||||
#include <asm/mach_traps.h>
|
||||
|
||||
int unknown_nmi_panic;
|
||||
int nmi_watchdog_enabled;
|
||||
|
||||
/* For reliability, we're prepared to waste bits here. */
|
||||
static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
|
||||
|
||||
/* nmi_active:
|
||||
* >0: the lapic NMI watchdog is active, but can be disabled
|
||||
* <0: the lapic NMI watchdog has not been set up, and cannot
|
||||
* be enabled
|
||||
* 0: the lapic NMI watchdog is disabled, but can be enabled
|
||||
*/
|
||||
atomic_t nmi_active = ATOMIC_INIT(0); /* oprofile uses this */
|
||||
EXPORT_SYMBOL(nmi_active);
|
||||
|
||||
unsigned int nmi_watchdog = NMI_NONE;
|
||||
EXPORT_SYMBOL(nmi_watchdog);
|
||||
|
||||
static int panic_on_timeout;
|
||||
|
||||
static unsigned int nmi_hz = HZ;
|
||||
static DEFINE_PER_CPU(short, wd_enabled);
|
||||
static int endflag __initdata;
|
||||
|
||||
static inline unsigned int get_nmi_count(int cpu)
|
||||
{
|
||||
return per_cpu(irq_stat, cpu).__nmi_count;
|
||||
}
|
||||
|
||||
static inline int mce_in_progress(void)
|
||||
{
|
||||
#if defined(CONFIG_X86_MCE)
|
||||
return atomic_read(&mce_entry) > 0;
|
||||
#endif
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Take the local apic timer and PIT/HPET into account. We don't
|
||||
* know which one is active, when we have highres/dyntick on
|
||||
*/
|
||||
static inline unsigned int get_timer_irqs(int cpu)
|
||||
{
|
||||
return per_cpu(irq_stat, cpu).apic_timer_irqs +
|
||||
per_cpu(irq_stat, cpu).irq0_irqs;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
/*
|
||||
* The performance counters used by NMI_LOCAL_APIC don't trigger when
|
||||
* the CPU is idle. To make sure the NMI watchdog really ticks on all
|
||||
* CPUs during the test make them busy.
|
||||
*/
|
||||
static __init void nmi_cpu_busy(void *data)
|
||||
{
|
||||
local_irq_enable_in_hardirq();
|
||||
/*
|
||||
* Intentionally don't use cpu_relax here. This is
|
||||
* to make sure that the performance counter really ticks,
|
||||
* even if there is a simulator or similar that catches the
|
||||
* pause instruction. On a real HT machine this is fine because
|
||||
* all other CPUs are busy with "useless" delay loops and don't
|
||||
* care if they get somewhat less cycles.
|
||||
*/
|
||||
while (endflag == 0)
|
||||
mb();
|
||||
}
|
||||
#endif
|
||||
|
||||
static void report_broken_nmi(int cpu, unsigned int *prev_nmi_count)
|
||||
{
|
||||
printk(KERN_CONT "\n");
|
||||
|
||||
printk(KERN_WARNING
|
||||
"WARNING: CPU#%d: NMI appears to be stuck (%d->%d)!\n",
|
||||
cpu, prev_nmi_count[cpu], get_nmi_count(cpu));
|
||||
|
||||
printk(KERN_WARNING
|
||||
"Please report this to bugzilla.kernel.org,\n");
|
||||
printk(KERN_WARNING
|
||||
"and attach the output of the 'dmesg' command.\n");
|
||||
|
||||
per_cpu(wd_enabled, cpu) = 0;
|
||||
atomic_dec(&nmi_active);
|
||||
}
|
||||
|
||||
static void __acpi_nmi_disable(void *__unused)
|
||||
{
|
||||
apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
|
||||
}
|
||||
|
||||
int __init check_nmi_watchdog(void)
|
||||
{
|
||||
unsigned int *prev_nmi_count;
|
||||
int cpu;
|
||||
|
||||
if (!nmi_watchdog_active() || !atomic_read(&nmi_active))
|
||||
return 0;
|
||||
|
||||
prev_nmi_count = kmalloc(nr_cpu_ids * sizeof(int), GFP_KERNEL);
|
||||
if (!prev_nmi_count)
|
||||
goto error;
|
||||
|
||||
printk(KERN_INFO "Testing NMI watchdog ... ");
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
smp_call_function(nmi_cpu_busy, (void *)&endflag, 0);
|
||||
#endif
|
||||
|
||||
for_each_possible_cpu(cpu)
|
||||
prev_nmi_count[cpu] = get_nmi_count(cpu);
|
||||
local_irq_enable();
|
||||
mdelay((20 * 1000) / nmi_hz); /* wait 20 ticks */
|
||||
|
||||
for_each_online_cpu(cpu) {
|
||||
if (!per_cpu(wd_enabled, cpu))
|
||||
continue;
|
||||
if (get_nmi_count(cpu) - prev_nmi_count[cpu] <= 5)
|
||||
report_broken_nmi(cpu, prev_nmi_count);
|
||||
}
|
||||
endflag = 1;
|
||||
if (!atomic_read(&nmi_active)) {
|
||||
kfree(prev_nmi_count);
|
||||
atomic_set(&nmi_active, -1);
|
||||
goto error;
|
||||
}
|
||||
printk("OK.\n");
|
||||
|
||||
/*
|
||||
* now that we know it works we can reduce NMI frequency to
|
||||
* something more reasonable; makes a difference in some configs
|
||||
*/
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
nmi_hz = lapic_adjust_nmi_hz(1);
|
||||
|
||||
kfree(prev_nmi_count);
|
||||
return 0;
|
||||
error:
|
||||
if (nmi_watchdog == NMI_IO_APIC) {
|
||||
if (!timer_through_8259)
|
||||
legacy_pic->mask(0);
|
||||
on_each_cpu(__acpi_nmi_disable, NULL, 1);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_X86_32
|
||||
timer_ack = 0;
|
||||
#endif
|
||||
return -1;
|
||||
}
|
||||
|
||||
static int __init setup_nmi_watchdog(char *str)
|
||||
{
|
||||
unsigned int nmi;
|
||||
|
||||
if (!strncmp(str, "panic", 5)) {
|
||||
panic_on_timeout = 1;
|
||||
str = strchr(str, ',');
|
||||
if (!str)
|
||||
return 1;
|
||||
++str;
|
||||
}
|
||||
|
||||
if (!strncmp(str, "lapic", 5))
|
||||
nmi_watchdog = NMI_LOCAL_APIC;
|
||||
else if (!strncmp(str, "ioapic", 6))
|
||||
nmi_watchdog = NMI_IO_APIC;
|
||||
else {
|
||||
get_option(&str, &nmi);
|
||||
if (nmi >= NMI_INVALID)
|
||||
return 0;
|
||||
nmi_watchdog = nmi;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
__setup("nmi_watchdog=", setup_nmi_watchdog);
|
||||
|
||||
/*
|
||||
* Suspend/resume support
|
||||
*/
|
||||
#ifdef CONFIG_PM
|
||||
|
||||
static int nmi_pm_active; /* nmi_active before suspend */
|
||||
|
||||
static int lapic_nmi_suspend(struct sys_device *dev, pm_message_t state)
|
||||
{
|
||||
/* only CPU0 goes here, other CPUs should be offline */
|
||||
nmi_pm_active = atomic_read(&nmi_active);
|
||||
stop_apic_nmi_watchdog(NULL);
|
||||
BUG_ON(atomic_read(&nmi_active) != 0);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int lapic_nmi_resume(struct sys_device *dev)
|
||||
{
|
||||
/* only CPU0 goes here, other CPUs should be offline */
|
||||
if (nmi_pm_active > 0) {
|
||||
setup_apic_nmi_watchdog(NULL);
|
||||
touch_nmi_watchdog();
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct sysdev_class nmi_sysclass = {
|
||||
.name = "lapic_nmi",
|
||||
.resume = lapic_nmi_resume,
|
||||
.suspend = lapic_nmi_suspend,
|
||||
};
|
||||
|
||||
static struct sys_device device_lapic_nmi = {
|
||||
.id = 0,
|
||||
.cls = &nmi_sysclass,
|
||||
};
|
||||
|
||||
static int __init init_lapic_nmi_sysfs(void)
|
||||
{
|
||||
int error;
|
||||
|
||||
/*
|
||||
* should really be a BUG_ON but b/c this is an
|
||||
* init call, it just doesn't work. -dcz
|
||||
*/
|
||||
if (nmi_watchdog != NMI_LOCAL_APIC)
|
||||
return 0;
|
||||
|
||||
if (atomic_read(&nmi_active) < 0)
|
||||
return 0;
|
||||
|
||||
error = sysdev_class_register(&nmi_sysclass);
|
||||
if (!error)
|
||||
error = sysdev_register(&device_lapic_nmi);
|
||||
return error;
|
||||
}
|
||||
|
||||
/* must come after the local APIC's device_initcall() */
|
||||
late_initcall(init_lapic_nmi_sysfs);
|
||||
|
||||
#endif /* CONFIG_PM */
|
||||
|
||||
static void __acpi_nmi_enable(void *__unused)
|
||||
{
|
||||
apic_write(APIC_LVT0, APIC_DM_NMI);
|
||||
}
|
||||
|
||||
/*
|
||||
* Enable timer based NMIs on all CPUs:
|
||||
*/
|
||||
void acpi_nmi_enable(void)
|
||||
{
|
||||
if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
|
||||
on_each_cpu(__acpi_nmi_enable, NULL, 1);
|
||||
}
|
||||
|
||||
/*
|
||||
* Disable timer based NMIs on all CPUs:
|
||||
*/
|
||||
void acpi_nmi_disable(void)
|
||||
{
|
||||
if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
|
||||
on_each_cpu(__acpi_nmi_disable, NULL, 1);
|
||||
}
|
||||
|
||||
/*
|
||||
* This function is called as soon the LAPIC NMI watchdog driver has everything
|
||||
* in place and it's ready to check if the NMIs belong to the NMI watchdog
|
||||
*/
|
||||
void cpu_nmi_set_wd_enabled(void)
|
||||
{
|
||||
__get_cpu_var(wd_enabled) = 1;
|
||||
}
|
||||
|
||||
void setup_apic_nmi_watchdog(void *unused)
|
||||
{
|
||||
if (__get_cpu_var(wd_enabled))
|
||||
return;
|
||||
|
||||
/* cheap hack to support suspend/resume */
|
||||
/* if cpu0 is not active neither should the other cpus */
|
||||
if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
|
||||
return;
|
||||
|
||||
switch (nmi_watchdog) {
|
||||
case NMI_LOCAL_APIC:
|
||||
if (lapic_watchdog_init(nmi_hz) < 0) {
|
||||
__get_cpu_var(wd_enabled) = 0;
|
||||
return;
|
||||
}
|
||||
/* FALL THROUGH */
|
||||
case NMI_IO_APIC:
|
||||
__get_cpu_var(wd_enabled) = 1;
|
||||
atomic_inc(&nmi_active);
|
||||
}
|
||||
}
|
||||
|
||||
void stop_apic_nmi_watchdog(void *unused)
|
||||
{
|
||||
/* only support LOCAL and IO APICs for now */
|
||||
if (!nmi_watchdog_active())
|
||||
return;
|
||||
if (__get_cpu_var(wd_enabled) == 0)
|
||||
return;
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
lapic_watchdog_stop();
|
||||
else
|
||||
__acpi_nmi_disable(NULL);
|
||||
__get_cpu_var(wd_enabled) = 0;
|
||||
atomic_dec(&nmi_active);
|
||||
}
|
||||
|
||||
/*
|
||||
* the best way to detect whether a CPU has a 'hard lockup' problem
|
||||
* is to check it's local APIC timer IRQ counts. If they are not
|
||||
* changing then that CPU has some problem.
|
||||
*
|
||||
* as these watchdog NMI IRQs are generated on every CPU, we only
|
||||
* have to check the current processor.
|
||||
*
|
||||
* since NMIs don't listen to _any_ locks, we have to be extremely
|
||||
* careful not to rely on unsafe variables. The printk might lock
|
||||
* up though, so we have to break up any console locks first ...
|
||||
* [when there will be more tty-related locks, break them up here too!]
|
||||
*/
|
||||
|
||||
static DEFINE_PER_CPU(unsigned, last_irq_sum);
|
||||
static DEFINE_PER_CPU(long, alert_counter);
|
||||
static DEFINE_PER_CPU(int, nmi_touch);
|
||||
|
||||
void touch_nmi_watchdog(void)
|
||||
{
|
||||
if (nmi_watchdog_active()) {
|
||||
unsigned cpu;
|
||||
|
||||
/*
|
||||
* Tell other CPUs to reset their alert counters. We cannot
|
||||
* do it ourselves because the alert count increase is not
|
||||
* atomic.
|
||||
*/
|
||||
for_each_present_cpu(cpu) {
|
||||
if (per_cpu(nmi_touch, cpu) != 1)
|
||||
per_cpu(nmi_touch, cpu) = 1;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Tickle the softlockup detector too:
|
||||
*/
|
||||
touch_softlockup_watchdog();
|
||||
}
|
||||
EXPORT_SYMBOL(touch_nmi_watchdog);
|
||||
|
||||
notrace __kprobes int
|
||||
nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
|
||||
{
|
||||
/*
|
||||
* Since current_thread_info()-> is always on the stack, and we
|
||||
* always switch the stack NMI-atomically, it's safe to use
|
||||
* smp_processor_id().
|
||||
*/
|
||||
unsigned int sum;
|
||||
int touched = 0;
|
||||
int cpu = smp_processor_id();
|
||||
int rc = 0;
|
||||
|
||||
sum = get_timer_irqs(cpu);
|
||||
|
||||
if (__get_cpu_var(nmi_touch)) {
|
||||
__get_cpu_var(nmi_touch) = 0;
|
||||
touched = 1;
|
||||
}
|
||||
|
||||
/* We can be called before check_nmi_watchdog, hence NULL check. */
|
||||
if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
|
||||
static DEFINE_RAW_SPINLOCK(lock); /* Serialise the printks */
|
||||
|
||||
raw_spin_lock(&lock);
|
||||
printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
|
||||
show_regs(regs);
|
||||
dump_stack();
|
||||
raw_spin_unlock(&lock);
|
||||
cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
|
||||
|
||||
rc = 1;
|
||||
}
|
||||
|
||||
/* Could check oops_in_progress here too, but it's safer not to */
|
||||
if (mce_in_progress())
|
||||
touched = 1;
|
||||
|
||||
/* if the none of the timers isn't firing, this cpu isn't doing much */
|
||||
if (!touched && __get_cpu_var(last_irq_sum) == sum) {
|
||||
/*
|
||||
* Ayiee, looks like this CPU is stuck ...
|
||||
* wait a few IRQs (5 seconds) before doing the oops ...
|
||||
*/
|
||||
__this_cpu_inc(alert_counter);
|
||||
if (__this_cpu_read(alert_counter) == 5 * nmi_hz)
|
||||
/*
|
||||
* die_nmi will return ONLY if NOTIFY_STOP happens..
|
||||
*/
|
||||
die_nmi("BUG: NMI Watchdog detected LOCKUP",
|
||||
regs, panic_on_timeout);
|
||||
} else {
|
||||
__get_cpu_var(last_irq_sum) = sum;
|
||||
__this_cpu_write(alert_counter, 0);
|
||||
}
|
||||
|
||||
/* see if the nmi watchdog went off */
|
||||
if (!__get_cpu_var(wd_enabled))
|
||||
return rc;
|
||||
switch (nmi_watchdog) {
|
||||
case NMI_LOCAL_APIC:
|
||||
rc |= lapic_wd_event(nmi_hz);
|
||||
break;
|
||||
case NMI_IO_APIC:
|
||||
/*
|
||||
* don't know how to accurately check for this.
|
||||
* just assume it was a watchdog timer interrupt
|
||||
* This matches the old behaviour.
|
||||
*/
|
||||
rc = 1;
|
||||
break;
|
||||
}
|
||||
return rc;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SYSCTL
|
||||
|
||||
static void enable_ioapic_nmi_watchdog_single(void *unused)
|
||||
{
|
||||
__get_cpu_var(wd_enabled) = 1;
|
||||
atomic_inc(&nmi_active);
|
||||
__acpi_nmi_enable(NULL);
|
||||
}
|
||||
|
||||
static void enable_ioapic_nmi_watchdog(void)
|
||||
{
|
||||
on_each_cpu(enable_ioapic_nmi_watchdog_single, NULL, 1);
|
||||
touch_nmi_watchdog();
|
||||
}
|
||||
|
||||
static void disable_ioapic_nmi_watchdog(void)
|
||||
{
|
||||
on_each_cpu(stop_apic_nmi_watchdog, NULL, 1);
|
||||
}
|
||||
|
||||
static int __init setup_unknown_nmi_panic(char *str)
|
||||
{
|
||||
unknown_nmi_panic = 1;
|
||||
return 1;
|
||||
}
|
||||
__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
|
||||
|
||||
static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu)
|
||||
{
|
||||
unsigned char reason = get_nmi_reason();
|
||||
char buf[64];
|
||||
|
||||
sprintf(buf, "NMI received for unknown reason %02x\n", reason);
|
||||
die_nmi(buf, regs, 1); /* Always panic here */
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* proc handler for /proc/sys/kernel/nmi
|
||||
*/
|
||||
int proc_nmi_enabled(struct ctl_table *table, int write,
|
||||
void __user *buffer, size_t *length, loff_t *ppos)
|
||||
{
|
||||
int old_state;
|
||||
|
||||
nmi_watchdog_enabled = (atomic_read(&nmi_active) > 0) ? 1 : 0;
|
||||
old_state = nmi_watchdog_enabled;
|
||||
proc_dointvec(table, write, buffer, length, ppos);
|
||||
if (!!old_state == !!nmi_watchdog_enabled)
|
||||
return 0;
|
||||
|
||||
if (atomic_read(&nmi_active) < 0 || !nmi_watchdog_active()) {
|
||||
printk(KERN_WARNING
|
||||
"NMI watchdog is permanently disabled\n");
|
||||
return -EIO;
|
||||
}
|
||||
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC) {
|
||||
if (nmi_watchdog_enabled)
|
||||
enable_lapic_nmi_watchdog();
|
||||
else
|
||||
disable_lapic_nmi_watchdog();
|
||||
} else if (nmi_watchdog == NMI_IO_APIC) {
|
||||
if (nmi_watchdog_enabled)
|
||||
enable_ioapic_nmi_watchdog();
|
||||
else
|
||||
disable_ioapic_nmi_watchdog();
|
||||
} else {
|
||||
printk(KERN_WARNING
|
||||
"NMI watchdog doesn't know what hardware to touch\n");
|
||||
return -EIO;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* CONFIG_SYSCTL */
|
||||
|
||||
int do_nmi_callback(struct pt_regs *regs, int cpu)
|
||||
{
|
||||
#ifdef CONFIG_SYSCTL
|
||||
if (unknown_nmi_panic)
|
||||
return unknown_nmi_panic_callback(regs, cpu);
|
||||
#endif
|
||||
return 0;
|
||||
}
|
||||
|
||||
void arch_trigger_all_cpu_backtrace(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
|
||||
|
||||
printk(KERN_INFO "sending NMI to all CPUs:\n");
|
||||
apic->send_IPI_all(NMI_VECTOR);
|
||||
|
||||
/* Wait for up to 10 seconds for all CPUs to do the backtrace */
|
||||
for (i = 0; i < 10 * 1000; i++) {
|
||||
if (cpumask_empty(to_cpumask(backtrace_mask)))
|
||||
break;
|
||||
mdelay(1);
|
||||
}
|
||||
}
|
@ -894,7 +894,6 @@ void __init identify_boot_cpu(void)
|
||||
#else
|
||||
vgetcpu_set_mode();
|
||||
#endif
|
||||
init_hw_perf_events();
|
||||
}
|
||||
|
||||
void __cpuinit identify_secondary_cpu(struct cpuinfo_x86 *c)
|
||||
|
@ -330,9 +330,6 @@ static bool reserve_pmc_hardware(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
disable_lapic_nmi_watchdog();
|
||||
|
||||
for (i = 0; i < x86_pmu.num_counters; i++) {
|
||||
if (!reserve_perfctr_nmi(x86_pmu.perfctr + i))
|
||||
goto perfctr_fail;
|
||||
@ -355,9 +352,6 @@ perfctr_fail:
|
||||
for (i--; i >= 0; i--)
|
||||
release_perfctr_nmi(x86_pmu.perfctr + i);
|
||||
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
enable_lapic_nmi_watchdog();
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
@ -369,9 +363,6 @@ static void release_pmc_hardware(void)
|
||||
release_perfctr_nmi(x86_pmu.perfctr + i);
|
||||
release_evntsel_nmi(x86_pmu.eventsel + i);
|
||||
}
|
||||
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
enable_lapic_nmi_watchdog();
|
||||
}
|
||||
|
||||
#else
|
||||
@ -384,15 +375,53 @@ static void release_pmc_hardware(void) {}
|
||||
static bool check_hw_exists(void)
|
||||
{
|
||||
u64 val, val_new = 0;
|
||||
int ret = 0;
|
||||
int i, reg, ret = 0;
|
||||
|
||||
/*
|
||||
* Check to see if the BIOS enabled any of the counters, if so
|
||||
* complain and bail.
|
||||
*/
|
||||
for (i = 0; i < x86_pmu.num_counters; i++) {
|
||||
reg = x86_pmu.eventsel + i;
|
||||
ret = rdmsrl_safe(reg, &val);
|
||||
if (ret)
|
||||
goto msr_fail;
|
||||
if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
|
||||
goto bios_fail;
|
||||
}
|
||||
|
||||
if (x86_pmu.num_counters_fixed) {
|
||||
reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
|
||||
ret = rdmsrl_safe(reg, &val);
|
||||
if (ret)
|
||||
goto msr_fail;
|
||||
for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
|
||||
if (val & (0x03 << i*4))
|
||||
goto bios_fail;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Now write a value and read it back to see if it matches,
|
||||
* this is needed to detect certain hardware emulators (qemu/kvm)
|
||||
* that don't trap on the MSR access and always return 0s.
|
||||
*/
|
||||
val = 0xabcdUL;
|
||||
ret |= checking_wrmsrl(x86_pmu.perfctr, val);
|
||||
ret = checking_wrmsrl(x86_pmu.perfctr, val);
|
||||
ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
|
||||
if (ret || val != val_new)
|
||||
return false;
|
||||
goto msr_fail;
|
||||
|
||||
return true;
|
||||
|
||||
bios_fail:
|
||||
printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
|
||||
printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
|
||||
return false;
|
||||
|
||||
msr_fail:
|
||||
printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
|
||||
return false;
|
||||
}
|
||||
|
||||
static void reserve_ds_buffers(void);
|
||||
@ -451,7 +480,7 @@ static int x86_setup_perfctr(struct perf_event *event)
|
||||
struct hw_perf_event *hwc = &event->hw;
|
||||
u64 config;
|
||||
|
||||
if (!hwc->sample_period) {
|
||||
if (!is_sampling_event(event)) {
|
||||
hwc->sample_period = x86_pmu.max_period;
|
||||
hwc->last_period = hwc->sample_period;
|
||||
local64_set(&hwc->period_left, hwc->sample_period);
|
||||
@ -1362,7 +1391,7 @@ static void __init pmu_check_apic(void)
|
||||
pr_info("no hardware sampling interrupt available.\n");
|
||||
}
|
||||
|
||||
void __init init_hw_perf_events(void)
|
||||
int __init init_hw_perf_events(void)
|
||||
{
|
||||
struct event_constraint *c;
|
||||
int err;
|
||||
@ -1377,20 +1406,18 @@ void __init init_hw_perf_events(void)
|
||||
err = amd_pmu_init();
|
||||
break;
|
||||
default:
|
||||
return;
|
||||
return 0;
|
||||
}
|
||||
if (err != 0) {
|
||||
pr_cont("no PMU driver, software events only.\n");
|
||||
return;
|
||||
return 0;
|
||||
}
|
||||
|
||||
pmu_check_apic();
|
||||
|
||||
/* sanity check that the hardware exists or is emulated */
|
||||
if (!check_hw_exists()) {
|
||||
pr_cont("Broken PMU hardware detected, software events only.\n");
|
||||
return;
|
||||
}
|
||||
if (!check_hw_exists())
|
||||
return 0;
|
||||
|
||||
pr_cont("%s PMU driver.\n", x86_pmu.name);
|
||||
|
||||
@ -1438,9 +1465,12 @@ void __init init_hw_perf_events(void)
|
||||
pr_info("... fixed-purpose events: %d\n", x86_pmu.num_counters_fixed);
|
||||
pr_info("... event mask: %016Lx\n", x86_pmu.intel_ctrl);
|
||||
|
||||
perf_pmu_register(&pmu);
|
||||
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
|
||||
perf_cpu_notifier(x86_pmu_notifier);
|
||||
|
||||
return 0;
|
||||
}
|
||||
early_initcall(init_hw_perf_events);
|
||||
|
||||
static inline void x86_pmu_read(struct perf_event *event)
|
||||
{
|
||||
@ -1686,7 +1716,7 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
|
||||
|
||||
perf_callchain_store(entry, regs->ip);
|
||||
|
||||
dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
|
||||
dump_trace(NULL, regs, NULL, &backtrace_ops, entry);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_COMPAT
|
||||
|
@ -1,7 +1,5 @@
|
||||
#ifdef CONFIG_CPU_SUP_AMD
|
||||
|
||||
static DEFINE_RAW_SPINLOCK(amd_nb_lock);
|
||||
|
||||
static __initconst const u64 amd_hw_cache_event_ids
|
||||
[PERF_COUNT_HW_CACHE_MAX]
|
||||
[PERF_COUNT_HW_CACHE_OP_MAX]
|
||||
@ -275,7 +273,7 @@ done:
|
||||
return &emptyconstraint;
|
||||
}
|
||||
|
||||
static struct amd_nb *amd_alloc_nb(int cpu, int nb_id)
|
||||
static struct amd_nb *amd_alloc_nb(int cpu)
|
||||
{
|
||||
struct amd_nb *nb;
|
||||
int i;
|
||||
@ -285,7 +283,7 @@ static struct amd_nb *amd_alloc_nb(int cpu, int nb_id)
|
||||
if (!nb)
|
||||
return NULL;
|
||||
|
||||
nb->nb_id = nb_id;
|
||||
nb->nb_id = -1;
|
||||
|
||||
/*
|
||||
* initialize all possible NB constraints
|
||||
@ -306,7 +304,7 @@ static int amd_pmu_cpu_prepare(int cpu)
|
||||
if (boot_cpu_data.x86_max_cores < 2)
|
||||
return NOTIFY_OK;
|
||||
|
||||
cpuc->amd_nb = amd_alloc_nb(cpu, -1);
|
||||
cpuc->amd_nb = amd_alloc_nb(cpu);
|
||||
if (!cpuc->amd_nb)
|
||||
return NOTIFY_BAD;
|
||||
|
||||
@ -325,8 +323,6 @@ static void amd_pmu_cpu_starting(int cpu)
|
||||
nb_id = amd_get_nb_id(cpu);
|
||||
WARN_ON_ONCE(nb_id == BAD_APICID);
|
||||
|
||||
raw_spin_lock(&amd_nb_lock);
|
||||
|
||||
for_each_online_cpu(i) {
|
||||
nb = per_cpu(cpu_hw_events, i).amd_nb;
|
||||
if (WARN_ON_ONCE(!nb))
|
||||
@ -341,8 +337,6 @@ static void amd_pmu_cpu_starting(int cpu)
|
||||
|
||||
cpuc->amd_nb->nb_id = nb_id;
|
||||
cpuc->amd_nb->refcnt++;
|
||||
|
||||
raw_spin_unlock(&amd_nb_lock);
|
||||
}
|
||||
|
||||
static void amd_pmu_cpu_dead(int cpu)
|
||||
@ -354,8 +348,6 @@ static void amd_pmu_cpu_dead(int cpu)
|
||||
|
||||
cpuhw = &per_cpu(cpu_hw_events, cpu);
|
||||
|
||||
raw_spin_lock(&amd_nb_lock);
|
||||
|
||||
if (cpuhw->amd_nb) {
|
||||
struct amd_nb *nb = cpuhw->amd_nb;
|
||||
|
||||
@ -364,8 +356,6 @@ static void amd_pmu_cpu_dead(int cpu)
|
||||
|
||||
cpuhw->amd_nb = NULL;
|
||||
}
|
||||
|
||||
raw_spin_unlock(&amd_nb_lock);
|
||||
}
|
||||
|
||||
static __initconst const struct x86_pmu amd_pmu = {
|
||||
|
@ -816,6 +816,32 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (ret)
return ret;

if (event->attr.precise_ip &&
(event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
/*
* Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
* (0x003c) so that we can use it with PEBS.
*
* The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
* PEBS capable. However we can use INST_RETIRED.ANY_P
* (0x00c0), which is a PEBS capable event, to get the same
* count.
*
* INST_RETIRED.ANY_P counts the number of cycles that retires
* CNTMASK instructions. By setting CNTMASK to a value (16)
* larger than the maximum number of instructions that can be
* retired per cycle (4) and then inverting the condition, we
* count all cycles that retire 16 or less instructions, which
* is every cycle.
*
* Thereby we gain a PEBS capable cycle counter.
*/
u64 alt_config = 0x108000c0; /* INST_RETIRED.TOTAL_CYCLES */

alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
event->hw.config = alt_config;
}

if (event->attr.type != PERF_TYPE_RAW)
return 0;

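For reference, a minimal standalone sketch (plain C, not kernel code; the field offsets are the standard x86 PERFEVTSEL layout, assumed here rather than taken from the patch) decoding the 0x108000c0 constant used above:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t alt = 0x108000c0ULL;	/* alt_config from the hunk above */

	printf("event select: 0x%02" PRIx64 "\n", alt & 0xff);		/* 0xc0: INST_RETIRED.ANY_P */
	printf("umask:        0x%02" PRIx64 "\n", (alt >> 8) & 0xff);	/* 0x00 */
	printf("inv:          %" PRIu64 "\n", (alt >> 23) & 1);		/* invert the cmask compare */
	printf("cmask:        %" PRIu64 "\n", (alt >> 24) & 0xff);	/* 16, above the 4-wide retire limit */
	return 0;
}
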
@ -16,32 +16,12 @@
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/bitops.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/nmi.h>
|
||||
#include <asm/nmi.h>
|
||||
#include <linux/kprobes.h>
|
||||
|
||||
#include <asm/apic.h>
|
||||
#include <asm/perf_event.h>
|
||||
|
||||
struct nmi_watchdog_ctlblk {
|
||||
unsigned int cccr_msr;
|
||||
unsigned int perfctr_msr; /* the MSR to reset in NMI handler */
|
||||
unsigned int evntsel_msr; /* the MSR to select the events to handle */
|
||||
};
|
||||
|
||||
/* Interface defining a CPU specific perfctr watchdog */
|
||||
struct wd_ops {
|
||||
int (*reserve)(void);
|
||||
void (*unreserve)(void);
|
||||
int (*setup)(unsigned nmi_hz);
|
||||
void (*rearm)(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz);
|
||||
void (*stop)(void);
|
||||
unsigned perfctr;
|
||||
unsigned evntsel;
|
||||
u64 checkbit;
|
||||
};
|
||||
|
||||
static const struct wd_ops *wd_ops;
|
||||
|
||||
/*
|
||||
* this number is calculated from Intel's MSR_P4_CRU_ESCR5 register and it's
|
||||
* offset from MSR_P4_BSU_ESCR0.
|
||||
@ -60,8 +40,6 @@ static const struct wd_ops *wd_ops;
|
||||
static DECLARE_BITMAP(perfctr_nmi_owner, NMI_MAX_COUNTER_BITS);
|
||||
static DECLARE_BITMAP(evntsel_nmi_owner, NMI_MAX_COUNTER_BITS);
|
||||
|
||||
static DEFINE_PER_CPU(struct nmi_watchdog_ctlblk, nmi_watchdog_ctlblk);
|
||||
|
||||
/* converts an msr to an appropriate reservation bit */
|
||||
static inline unsigned int nmi_perfctr_msr_to_bit(unsigned int msr)
|
||||
{
|
||||
@ -172,623 +150,3 @@ void release_evntsel_nmi(unsigned int msr)
|
||||
clear_bit(counter, evntsel_nmi_owner);
|
||||
}
|
||||
EXPORT_SYMBOL(release_evntsel_nmi);
|
||||
|
||||
void disable_lapic_nmi_watchdog(void)
|
||||
{
|
||||
BUG_ON(nmi_watchdog != NMI_LOCAL_APIC);
|
||||
|
||||
if (atomic_read(&nmi_active) <= 0)
|
||||
return;
|
||||
|
||||
on_each_cpu(stop_apic_nmi_watchdog, NULL, 1);
|
||||
|
||||
if (wd_ops)
|
||||
wd_ops->unreserve();
|
||||
|
||||
BUG_ON(atomic_read(&nmi_active) != 0);
|
||||
}
|
||||
|
||||
void enable_lapic_nmi_watchdog(void)
|
||||
{
|
||||
BUG_ON(nmi_watchdog != NMI_LOCAL_APIC);
|
||||
|
||||
/* are we already enabled */
|
||||
if (atomic_read(&nmi_active) != 0)
|
||||
return;
|
||||
|
||||
/* are we lapic aware */
|
||||
if (!wd_ops)
|
||||
return;
|
||||
if (!wd_ops->reserve()) {
|
||||
printk(KERN_ERR "NMI watchdog: cannot reserve perfctrs\n");
|
||||
return;
|
||||
}
|
||||
|
||||
on_each_cpu(setup_apic_nmi_watchdog, NULL, 1);
|
||||
touch_nmi_watchdog();
|
||||
}
|
||||
|
||||
/*
|
||||
* Activate the NMI watchdog via the local APIC.
|
||||
*/
|
||||
|
||||
static unsigned int adjust_for_32bit_ctr(unsigned int hz)
{
u64 counter_val;
unsigned int retval = hz;

/*
* On Intel CPUs with P6/ARCH_PERFMON only 32 bits in the counter
* are writable, with higher bits sign extending from bit 31.
* So, we can only program the counter with 31 bit values and
* 32nd bit should be 1, for 33.. to be 1.
* Find the appropriate nmi_hz
*/
counter_val = (u64)cpu_khz * 1000;
do_div(counter_val, retval);
if (counter_val > 0x7fffffffULL) {
u64 count = (u64)cpu_khz * 1000;
do_div(count, 0x7fffffffUL);
retval = count + 1;
}
return retval;
}
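As a standalone illustration of the clamp above (plain C; do_div() replaced by ordinary 64-bit division, and the 4 GHz cpu_khz value is just an assumed example):

#include <stdint.h>
#include <stdio.h>

static unsigned int adjust_for_32bit_ctr(uint64_t cpu_khz, unsigned int hz)
{
	uint64_t counter_val = cpu_khz * 1000 / hz;

	/* Only 31 bits are programmable; if the period overflows, raise hz. */
	if (counter_val > 0x7fffffffULL)
		hz = (unsigned int)(cpu_khz * 1000 / 0x7fffffffULL) + 1;
	return hz;
}

int main(void)
{
	/* A 4 GHz CPU at nmi_hz=1 would need 4e9 counts > 2^31-1, so hz becomes 2. */
	printf("%u\n", adjust_for_32bit_ctr(4000000, 1));
	return 0;
}
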
|
||||
static void write_watchdog_counter(unsigned int perfctr_msr,
|
||||
const char *descr, unsigned nmi_hz)
|
||||
{
|
||||
u64 count = (u64)cpu_khz * 1000;
|
||||
|
||||
do_div(count, nmi_hz);
|
||||
if (descr)
|
||||
pr_debug("setting %s to -0x%08Lx\n", descr, count);
|
||||
wrmsrl(perfctr_msr, 0 - count);
|
||||
}
|
||||
|
||||
static void write_watchdog_counter32(unsigned int perfctr_msr,
|
||||
const char *descr, unsigned nmi_hz)
|
||||
{
|
||||
u64 count = (u64)cpu_khz * 1000;
|
||||
|
||||
do_div(count, nmi_hz);
|
||||
if (descr)
|
||||
pr_debug("setting %s to -0x%08Lx\n", descr, count);
|
||||
wrmsr(perfctr_msr, (u32)(-count), 0);
|
||||
}
|
||||
|
||||
/*
|
||||
* AMD K7/K8/Family10h/Family11h support.
|
||||
* AMD keeps this interface nicely stable so there is not much variety
|
||||
*/
|
||||
#define K7_EVNTSEL_ENABLE (1 << 22)
|
||||
#define K7_EVNTSEL_INT (1 << 20)
|
||||
#define K7_EVNTSEL_OS (1 << 17)
|
||||
#define K7_EVNTSEL_USR (1 << 16)
|
||||
#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING 0x76
|
||||
#define K7_NMI_EVENT K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
|
||||
|
||||
static int setup_k7_watchdog(unsigned nmi_hz)
|
||||
{
|
||||
unsigned int perfctr_msr, evntsel_msr;
|
||||
unsigned int evntsel;
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
|
||||
perfctr_msr = wd_ops->perfctr;
|
||||
evntsel_msr = wd_ops->evntsel;
|
||||
|
||||
wrmsrl(perfctr_msr, 0UL);
|
||||
|
||||
evntsel = K7_EVNTSEL_INT
|
||||
| K7_EVNTSEL_OS
|
||||
| K7_EVNTSEL_USR
|
||||
| K7_NMI_EVENT;
|
||||
|
||||
/* setup the timer */
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
write_watchdog_counter(perfctr_msr, "K7_PERFCTR0", nmi_hz);
|
||||
|
||||
/* initialize the wd struct before enabling */
|
||||
wd->perfctr_msr = perfctr_msr;
|
||||
wd->evntsel_msr = evntsel_msr;
|
||||
wd->cccr_msr = 0; /* unused */
|
||||
|
||||
/* ok, everything is initialized, announce that we're set */
|
||||
cpu_nmi_set_wd_enabled();
|
||||
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
evntsel |= K7_EVNTSEL_ENABLE;
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void single_msr_stop_watchdog(void)
|
||||
{
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
|
||||
wrmsr(wd->evntsel_msr, 0, 0);
|
||||
}
|
||||
|
||||
static int single_msr_reserve(void)
|
||||
{
|
||||
if (!reserve_perfctr_nmi(wd_ops->perfctr))
|
||||
return 0;
|
||||
|
||||
if (!reserve_evntsel_nmi(wd_ops->evntsel)) {
|
||||
release_perfctr_nmi(wd_ops->perfctr);
|
||||
return 0;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void single_msr_unreserve(void)
|
||||
{
|
||||
release_evntsel_nmi(wd_ops->evntsel);
|
||||
release_perfctr_nmi(wd_ops->perfctr);
|
||||
}
|
||||
|
||||
static void __kprobes
|
||||
single_msr_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
|
||||
{
|
||||
/* start the cycle over again */
|
||||
write_watchdog_counter(wd->perfctr_msr, NULL, nmi_hz);
|
||||
}
|
||||
|
||||
static const struct wd_ops k7_wd_ops = {
|
||||
.reserve = single_msr_reserve,
|
||||
.unreserve = single_msr_unreserve,
|
||||
.setup = setup_k7_watchdog,
|
||||
.rearm = single_msr_rearm,
|
||||
.stop = single_msr_stop_watchdog,
|
||||
.perfctr = MSR_K7_PERFCTR0,
|
||||
.evntsel = MSR_K7_EVNTSEL0,
|
||||
.checkbit = 1ULL << 47,
|
||||
};
|
||||
|
||||
/*
|
||||
* Intel Model 6 (PPro+,P2,P3,P-M,Core1)
|
||||
*/
|
||||
#define P6_EVNTSEL0_ENABLE (1 << 22)
|
||||
#define P6_EVNTSEL_INT (1 << 20)
|
||||
#define P6_EVNTSEL_OS (1 << 17)
|
||||
#define P6_EVNTSEL_USR (1 << 16)
|
||||
#define P6_EVENT_CPU_CLOCKS_NOT_HALTED 0x79
|
||||
#define P6_NMI_EVENT P6_EVENT_CPU_CLOCKS_NOT_HALTED
|
||||
|
||||
static int setup_p6_watchdog(unsigned nmi_hz)
|
||||
{
|
||||
unsigned int perfctr_msr, evntsel_msr;
|
||||
unsigned int evntsel;
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
|
||||
perfctr_msr = wd_ops->perfctr;
|
||||
evntsel_msr = wd_ops->evntsel;
|
||||
|
||||
/* KVM doesn't implement this MSR */
|
||||
if (wrmsr_safe(perfctr_msr, 0, 0) < 0)
|
||||
return 0;
|
||||
|
||||
evntsel = P6_EVNTSEL_INT
|
||||
| P6_EVNTSEL_OS
|
||||
| P6_EVNTSEL_USR
|
||||
| P6_NMI_EVENT;
|
||||
|
||||
/* setup the timer */
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
nmi_hz = adjust_for_32bit_ctr(nmi_hz);
|
||||
write_watchdog_counter32(perfctr_msr, "P6_PERFCTR0", nmi_hz);
|
||||
|
||||
/* initialize the wd struct before enabling */
|
||||
wd->perfctr_msr = perfctr_msr;
|
||||
wd->evntsel_msr = evntsel_msr;
|
||||
wd->cccr_msr = 0; /* unused */
|
||||
|
||||
/* ok, everything is initialized, announce that we're set */
|
||||
cpu_nmi_set_wd_enabled();
|
||||
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
evntsel |= P6_EVNTSEL0_ENABLE;
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void __kprobes p6_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
|
||||
{
|
||||
/*
|
||||
* P6 based Pentium M need to re-unmask
|
||||
* the apic vector but it doesn't hurt
|
||||
* other P6 variant.
|
||||
* ArchPerfom/Core Duo also needs this
|
||||
*/
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
|
||||
/* P6/ARCH_PERFMON has 32 bit counter write */
|
||||
write_watchdog_counter32(wd->perfctr_msr, NULL, nmi_hz);
|
||||
}
|
||||
|
||||
static const struct wd_ops p6_wd_ops = {
|
||||
.reserve = single_msr_reserve,
|
||||
.unreserve = single_msr_unreserve,
|
||||
.setup = setup_p6_watchdog,
|
||||
.rearm = p6_rearm,
|
||||
.stop = single_msr_stop_watchdog,
|
||||
.perfctr = MSR_P6_PERFCTR0,
|
||||
.evntsel = MSR_P6_EVNTSEL0,
|
||||
.checkbit = 1ULL << 39,
|
||||
};
|
||||
|
||||
/*
|
||||
* Intel P4 performance counters.
|
||||
* By far the most complicated of all.
|
||||
*/
|
||||
#define MSR_P4_MISC_ENABLE_PERF_AVAIL (1 << 7)
|
||||
#define P4_ESCR_EVENT_SELECT(N) ((N) << 25)
|
||||
#define P4_ESCR_OS (1 << 3)
|
||||
#define P4_ESCR_USR (1 << 2)
|
||||
#define P4_CCCR_OVF_PMI0 (1 << 26)
|
||||
#define P4_CCCR_OVF_PMI1 (1 << 27)
|
||||
#define P4_CCCR_THRESHOLD(N) ((N) << 20)
|
||||
#define P4_CCCR_COMPLEMENT (1 << 19)
|
||||
#define P4_CCCR_COMPARE (1 << 18)
|
||||
#define P4_CCCR_REQUIRED (3 << 16)
|
||||
#define P4_CCCR_ESCR_SELECT(N) ((N) << 13)
|
||||
#define P4_CCCR_ENABLE (1 << 12)
|
||||
#define P4_CCCR_OVF (1 << 31)
|
||||
|
||||
#define P4_CONTROLS 18
|
||||
static unsigned int p4_controls[18] = {
|
||||
MSR_P4_BPU_CCCR0,
|
||||
MSR_P4_BPU_CCCR1,
|
||||
MSR_P4_BPU_CCCR2,
|
||||
MSR_P4_BPU_CCCR3,
|
||||
MSR_P4_MS_CCCR0,
|
||||
MSR_P4_MS_CCCR1,
|
||||
MSR_P4_MS_CCCR2,
|
||||
MSR_P4_MS_CCCR3,
|
||||
MSR_P4_FLAME_CCCR0,
|
||||
MSR_P4_FLAME_CCCR1,
|
||||
MSR_P4_FLAME_CCCR2,
|
||||
MSR_P4_FLAME_CCCR3,
|
||||
MSR_P4_IQ_CCCR0,
|
||||
MSR_P4_IQ_CCCR1,
|
||||
MSR_P4_IQ_CCCR2,
|
||||
MSR_P4_IQ_CCCR3,
|
||||
MSR_P4_IQ_CCCR4,
|
||||
MSR_P4_IQ_CCCR5,
|
||||
};
|
||||
/*
|
||||
* Set up IQ_COUNTER0 to behave like a clock, by having IQ_CCCR0 filter
|
||||
* CRU_ESCR0 (with any non-null event selector) through a complemented
|
||||
* max threshold. [IA32-Vol3, Section 14.9.9]
|
||||
*/
|
||||
static int setup_p4_watchdog(unsigned nmi_hz)
|
||||
{
|
||||
unsigned int perfctr_msr, evntsel_msr, cccr_msr;
|
||||
unsigned int evntsel, cccr_val;
|
||||
unsigned int misc_enable, dummy;
|
||||
unsigned int ht_num;
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
|
||||
rdmsr(MSR_IA32_MISC_ENABLE, misc_enable, dummy);
|
||||
if (!(misc_enable & MSR_P4_MISC_ENABLE_PERF_AVAIL))
|
||||
return 0;
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
/* detect which hyperthread we are on */
|
||||
if (smp_num_siblings == 2) {
|
||||
unsigned int ebx, apicid;
|
||||
|
||||
ebx = cpuid_ebx(1);
|
||||
apicid = (ebx >> 24) & 0xff;
|
||||
ht_num = apicid & 1;
|
||||
} else
|
||||
#endif
|
||||
ht_num = 0;
|
||||
|
||||
/*
|
||||
* performance counters are shared resources
|
||||
* assign each hyperthread its own set
|
||||
* (re-use the ESCR0 register, seems safe
|
||||
* and keeps the cccr_val the same)
|
||||
*/
|
||||
if (!ht_num) {
|
||||
/* logical cpu 0 */
|
||||
perfctr_msr = MSR_P4_IQ_PERFCTR0;
|
||||
evntsel_msr = MSR_P4_CRU_ESCR0;
|
||||
cccr_msr = MSR_P4_IQ_CCCR0;
|
||||
cccr_val = P4_CCCR_OVF_PMI0 | P4_CCCR_ESCR_SELECT(4);
|
||||
|
||||
/*
|
||||
* If we're on the kdump kernel or other situation, we may
|
||||
* still have other performance counter registers set to
|
||||
* interrupt and they'll keep interrupting forever because
|
||||
* of the P4_CCCR_OVF quirk. So we need to ACK all the
|
||||
* pending interrupts and disable all the registers here,
|
||||
* before reenabling the NMI delivery. Refer to p4_rearm()
|
||||
* about the P4_CCCR_OVF quirk.
|
||||
*/
|
||||
if (reset_devices) {
|
||||
unsigned int low, high;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < P4_CONTROLS; i++) {
|
||||
rdmsr(p4_controls[i], low, high);
|
||||
low &= ~(P4_CCCR_ENABLE | P4_CCCR_OVF);
|
||||
wrmsr(p4_controls[i], low, high);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
/* logical cpu 1 */
|
||||
perfctr_msr = MSR_P4_IQ_PERFCTR1;
|
||||
evntsel_msr = MSR_P4_CRU_ESCR0;
|
||||
cccr_msr = MSR_P4_IQ_CCCR1;
|
||||
|
||||
/* Pentium 4 D processors don't support P4_CCCR_OVF_PMI1 */
|
||||
if (boot_cpu_data.x86_model == 4 && boot_cpu_data.x86_mask == 4)
|
||||
cccr_val = P4_CCCR_OVF_PMI0;
|
||||
else
|
||||
cccr_val = P4_CCCR_OVF_PMI1;
|
||||
cccr_val |= P4_CCCR_ESCR_SELECT(4);
|
||||
}
|
||||
|
||||
evntsel = P4_ESCR_EVENT_SELECT(0x3F)
|
||||
| P4_ESCR_OS
|
||||
| P4_ESCR_USR;
|
||||
|
||||
cccr_val |= P4_CCCR_THRESHOLD(15)
|
||||
| P4_CCCR_COMPLEMENT
|
||||
| P4_CCCR_COMPARE
|
||||
| P4_CCCR_REQUIRED;
|
||||
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
wrmsr(cccr_msr, cccr_val, 0);
|
||||
write_watchdog_counter(perfctr_msr, "P4_IQ_COUNTER0", nmi_hz);
|
||||
|
||||
wd->perfctr_msr = perfctr_msr;
|
||||
wd->evntsel_msr = evntsel_msr;
|
||||
wd->cccr_msr = cccr_msr;
|
||||
|
||||
/* ok, everything is initialized, announce that we're set */
|
||||
cpu_nmi_set_wd_enabled();
|
||||
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
cccr_val |= P4_CCCR_ENABLE;
|
||||
wrmsr(cccr_msr, cccr_val, 0);
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void stop_p4_watchdog(void)
|
||||
{
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
wrmsr(wd->cccr_msr, 0, 0);
|
||||
wrmsr(wd->evntsel_msr, 0, 0);
|
||||
}
|
||||
|
||||
static int p4_reserve(void)
|
||||
{
|
||||
if (!reserve_perfctr_nmi(MSR_P4_IQ_PERFCTR0))
|
||||
return 0;
|
||||
#ifdef CONFIG_SMP
|
||||
if (smp_num_siblings > 1 && !reserve_perfctr_nmi(MSR_P4_IQ_PERFCTR1))
|
||||
goto fail1;
|
||||
#endif
|
||||
if (!reserve_evntsel_nmi(MSR_P4_CRU_ESCR0))
|
||||
goto fail2;
|
||||
/* RED-PEN why is ESCR1 not reserved here? */
|
||||
return 1;
|
||||
fail2:
|
||||
#ifdef CONFIG_SMP
|
||||
if (smp_num_siblings > 1)
|
||||
release_perfctr_nmi(MSR_P4_IQ_PERFCTR1);
|
||||
fail1:
|
||||
#endif
|
||||
release_perfctr_nmi(MSR_P4_IQ_PERFCTR0);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void p4_unreserve(void)
|
||||
{
|
||||
#ifdef CONFIG_SMP
|
||||
if (smp_num_siblings > 1)
|
||||
release_perfctr_nmi(MSR_P4_IQ_PERFCTR1);
|
||||
#endif
|
||||
release_evntsel_nmi(MSR_P4_CRU_ESCR0);
|
||||
release_perfctr_nmi(MSR_P4_IQ_PERFCTR0);
|
||||
}
|
||||
|
||||
static void __kprobes p4_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
|
||||
{
|
||||
unsigned dummy;
|
||||
/*
|
||||
* P4 quirks:
|
||||
* - An overflown perfctr will assert its interrupt
|
||||
* until the OVF flag in its CCCR is cleared.
|
||||
* - LVTPC is masked on interrupt and must be
|
||||
* unmasked by the LVTPC handler.
|
||||
*/
|
||||
rdmsrl(wd->cccr_msr, dummy);
|
||||
dummy &= ~P4_CCCR_OVF;
|
||||
wrmsrl(wd->cccr_msr, dummy);
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
/* start the cycle over again */
|
||||
write_watchdog_counter(wd->perfctr_msr, NULL, nmi_hz);
|
||||
}
|
||||
|
||||
static const struct wd_ops p4_wd_ops = {
|
||||
.reserve = p4_reserve,
|
||||
.unreserve = p4_unreserve,
|
||||
.setup = setup_p4_watchdog,
|
||||
.rearm = p4_rearm,
|
||||
.stop = stop_p4_watchdog,
|
||||
/* RED-PEN this is wrong for the other sibling */
|
||||
.perfctr = MSR_P4_BPU_PERFCTR0,
|
||||
.evntsel = MSR_P4_BSU_ESCR0,
|
||||
.checkbit = 1ULL << 39,
|
||||
};
|
||||
|
||||
/*
|
||||
* Watchdog using the Intel architected PerfMon.
|
||||
* Used for Core2 and hopefully all future Intel CPUs.
|
||||
*/
|
||||
#define ARCH_PERFMON_NMI_EVENT_SEL ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL
|
||||
#define ARCH_PERFMON_NMI_EVENT_UMASK ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK
|
||||
|
||||
static struct wd_ops intel_arch_wd_ops;
|
||||
|
||||
static int setup_intel_arch_watchdog(unsigned nmi_hz)
|
||||
{
|
||||
unsigned int ebx;
|
||||
union cpuid10_eax eax;
|
||||
unsigned int unused;
|
||||
unsigned int perfctr_msr, evntsel_msr;
|
||||
unsigned int evntsel;
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
|
||||
/*
|
||||
* Check whether the Architectural PerfMon supports
|
||||
* Unhalted Core Cycles Event or not.
|
||||
* NOTE: Corresponding bit = 0 in ebx indicates event present.
|
||||
*/
|
||||
cpuid(10, &(eax.full), &ebx, &unused, &unused);
|
||||
if ((eax.split.mask_length <
|
||||
(ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX+1)) ||
|
||||
(ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT))
|
||||
return 0;
|
||||
|
||||
perfctr_msr = wd_ops->perfctr;
|
||||
evntsel_msr = wd_ops->evntsel;
|
||||
|
||||
wrmsrl(perfctr_msr, 0UL);
|
||||
|
||||
evntsel = ARCH_PERFMON_EVENTSEL_INT
|
||||
| ARCH_PERFMON_EVENTSEL_OS
|
||||
| ARCH_PERFMON_EVENTSEL_USR
|
||||
| ARCH_PERFMON_NMI_EVENT_SEL
|
||||
| ARCH_PERFMON_NMI_EVENT_UMASK;
|
||||
|
||||
/* setup the timer */
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
nmi_hz = adjust_for_32bit_ctr(nmi_hz);
|
||||
write_watchdog_counter32(perfctr_msr, "INTEL_ARCH_PERFCTR0", nmi_hz);
|
||||
|
||||
wd->perfctr_msr = perfctr_msr;
|
||||
wd->evntsel_msr = evntsel_msr;
|
||||
wd->cccr_msr = 0; /* unused */
|
||||
|
||||
/* ok, everything is initialized, announce that we're set */
|
||||
cpu_nmi_set_wd_enabled();
|
||||
|
||||
apic_write(APIC_LVTPC, APIC_DM_NMI);
|
||||
evntsel |= ARCH_PERFMON_EVENTSEL_ENABLE;
|
||||
wrmsr(evntsel_msr, evntsel, 0);
|
||||
intel_arch_wd_ops.checkbit = 1ULL << (eax.split.bit_width - 1);
|
||||
return 1;
|
||||
}
|
||||
|
||||
static struct wd_ops intel_arch_wd_ops __read_mostly = {
|
||||
.reserve = single_msr_reserve,
|
||||
.unreserve = single_msr_unreserve,
|
||||
.setup = setup_intel_arch_watchdog,
|
||||
.rearm = p6_rearm,
|
||||
.stop = single_msr_stop_watchdog,
|
||||
.perfctr = MSR_ARCH_PERFMON_PERFCTR1,
|
||||
.evntsel = MSR_ARCH_PERFMON_EVENTSEL1,
|
||||
};
|
||||
|
||||
static void probe_nmi_watchdog(void)
|
||||
{
|
||||
switch (boot_cpu_data.x86_vendor) {
|
||||
case X86_VENDOR_AMD:
|
||||
if (boot_cpu_data.x86 == 6 ||
|
||||
(boot_cpu_data.x86 >= 0xf && boot_cpu_data.x86 <= 0x15))
|
||||
wd_ops = &k7_wd_ops;
|
||||
return;
|
||||
case X86_VENDOR_INTEL:
|
||||
/* Work around where perfctr1 doesn't have a working enable
|
||||
* bit as described in the following errata:
|
||||
* AE49 Core Duo and Intel Core Solo 65 nm
|
||||
* AN49 Intel Pentium Dual-Core
|
||||
* AF49 Dual-Core Intel Xeon Processor LV
|
||||
*/
|
||||
if ((boot_cpu_data.x86 == 6 && boot_cpu_data.x86_model == 14) ||
|
||||
((boot_cpu_data.x86 == 6 && boot_cpu_data.x86_model == 15 &&
|
||||
boot_cpu_data.x86_mask == 4))) {
|
||||
intel_arch_wd_ops.perfctr = MSR_ARCH_PERFMON_PERFCTR0;
|
||||
intel_arch_wd_ops.evntsel = MSR_ARCH_PERFMON_EVENTSEL0;
|
||||
}
|
||||
if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
|
||||
wd_ops = &intel_arch_wd_ops;
|
||||
break;
|
||||
}
|
||||
switch (boot_cpu_data.x86) {
|
||||
case 6:
|
||||
if (boot_cpu_data.x86_model > 13)
|
||||
return;
|
||||
|
||||
wd_ops = &p6_wd_ops;
|
||||
break;
|
||||
case 15:
|
||||
wd_ops = &p4_wd_ops;
|
||||
break;
|
||||
default:
|
||||
return;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/* Interface to nmi.c */
|
||||
|
||||
int lapic_watchdog_init(unsigned nmi_hz)
|
||||
{
|
||||
if (!wd_ops) {
|
||||
probe_nmi_watchdog();
|
||||
if (!wd_ops) {
|
||||
printk(KERN_INFO "NMI watchdog: CPU not supported\n");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (!wd_ops->reserve()) {
|
||||
printk(KERN_ERR
|
||||
"NMI watchdog: cannot reserve perfctrs\n");
|
||||
return -1;
|
||||
}
|
||||
}
|
||||
|
||||
if (!(wd_ops->setup(nmi_hz))) {
|
||||
printk(KERN_ERR "Cannot setup NMI watchdog on CPU %d\n",
|
||||
raw_smp_processor_id());
|
||||
return -1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void lapic_watchdog_stop(void)
|
||||
{
|
||||
if (wd_ops)
|
||||
wd_ops->stop();
|
||||
}
|
||||
|
||||
unsigned lapic_adjust_nmi_hz(unsigned hz)
|
||||
{
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
if (wd->perfctr_msr == MSR_P6_PERFCTR0 ||
|
||||
wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR1)
|
||||
hz = adjust_for_32bit_ctr(hz);
|
||||
return hz;
|
||||
}
|
||||
|
||||
int __kprobes lapic_wd_event(unsigned nmi_hz)
|
||||
{
|
||||
struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
|
||||
u64 ctr;
|
||||
|
||||
rdmsrl(wd->perfctr_msr, ctr);
|
||||
if (ctr & wd_ops->checkbit) /* perfctr still running? */
|
||||
return 0;
|
||||
|
||||
wd_ops->rearm(wd, nmi_hz);
|
||||
return 1;
|
||||
}
|
||||
|
@ -175,21 +175,21 @@ static const struct stacktrace_ops print_trace_ops = {
|
||||
|
||||
void
|
||||
show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *stack, unsigned long bp, char *log_lvl)
|
||||
unsigned long *stack, char *log_lvl)
|
||||
{
|
||||
printk("%sCall Trace:\n", log_lvl);
|
||||
dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl);
|
||||
dump_trace(task, regs, stack, &print_trace_ops, log_lvl);
|
||||
}
|
||||
|
||||
void show_trace(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *stack, unsigned long bp)
|
||||
unsigned long *stack)
|
||||
{
|
||||
show_trace_log_lvl(task, regs, stack, bp, "");
|
||||
show_trace_log_lvl(task, regs, stack, "");
|
||||
}
|
||||
|
||||
void show_stack(struct task_struct *task, unsigned long *sp)
|
||||
{
|
||||
show_stack_log_lvl(task, NULL, sp, 0, "");
|
||||
show_stack_log_lvl(task, NULL, sp, "");
|
||||
}
|
||||
|
||||
/*
|
||||
@ -210,7 +210,7 @@ void dump_stack(void)
|
||||
init_utsname()->release,
|
||||
(int)strcspn(init_utsname()->version, " "),
|
||||
init_utsname()->version);
|
||||
show_trace(NULL, NULL, &stack, bp);
|
||||
show_trace(NULL, NULL, &stack);
|
||||
}
|
||||
EXPORT_SYMBOL(dump_stack);
|
||||
|
||||
|
@ -17,11 +17,12 @@
#include <asm/stacktrace.h>


void dump_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp,
void dump_trace(struct task_struct *task,
struct pt_regs *regs, unsigned long *stack,
const struct stacktrace_ops *ops, void *data)
{
int graph = 0;
unsigned long bp;

if (!task)
task = current;
@ -34,18 +35,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
stack = (unsigned long *)task->thread.sp;
}

#ifdef CONFIG_FRAME_POINTER
if (!bp) {
if (task == current) {
/* Grab bp right from our regs */
get_bp(bp);
} else {
/* bp is the last reg pushed by switch_to */
bp = *(unsigned long *) task->thread.sp;
}
}
#endif

bp = stack_frame(task, regs);
for (;;) {
struct thread_info *context;

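The bp argument that used to be threaded through these calls is now derived internally via stack_frame(). For intuition only, a tiny userspace sketch (GCC/Clang builtins, frame pointers assumed enabled) showing the frame/return-address pair such an unwinder starts from:

#include <stdio.h>

/* Print the current frame pointer and return address; a full unwinder
 * would follow the saved-bp chain from here, much like dump_trace(). */
__attribute__((noinline)) static void show_frame(void)
{
	void *bp = __builtin_frame_address(0);
	void *ra = __builtin_return_address(0);

	printf("bp=%p return=%p\n", bp, ra);
}

int main(void)
{
	show_frame();
	return 0;
}
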
@ -65,7 +55,7 @@ EXPORT_SYMBOL(dump_trace);
|
||||
|
||||
void
|
||||
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *sp, unsigned long bp, char *log_lvl)
|
||||
unsigned long *sp, char *log_lvl)
|
||||
{
|
||||
unsigned long *stack;
|
||||
int i;
|
||||
@ -87,7 +77,7 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
touch_nmi_watchdog();
|
||||
}
|
||||
printk(KERN_CONT "\n");
|
||||
show_trace_log_lvl(task, regs, sp, bp, log_lvl);
|
||||
show_trace_log_lvl(task, regs, sp, log_lvl);
|
||||
}
|
||||
|
||||
|
||||
@ -112,8 +102,7 @@ void show_registers(struct pt_regs *regs)
|
||||
u8 *ip;
|
||||
|
||||
printk(KERN_EMERG "Stack:\n");
|
||||
show_stack_log_lvl(NULL, regs, ®s->sp,
|
||||
0, KERN_EMERG);
|
||||
show_stack_log_lvl(NULL, regs, ®s->sp, KERN_EMERG);
|
||||
|
||||
printk(KERN_EMERG "Code: ");
|
||||
|
||||
|
@ -139,8 +139,8 @@ fixup_bp_irq_link(unsigned long bp, unsigned long *stack,
|
||||
* severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
|
||||
*/
|
||||
|
||||
void dump_trace(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *stack, unsigned long bp,
|
||||
void dump_trace(struct task_struct *task,
|
||||
struct pt_regs *regs, unsigned long *stack,
|
||||
const struct stacktrace_ops *ops, void *data)
|
||||
{
|
||||
const unsigned cpu = get_cpu();
|
||||
@ -149,6 +149,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned used = 0;
|
||||
struct thread_info *tinfo;
|
||||
int graph = 0;
|
||||
unsigned long bp;
|
||||
|
||||
if (!task)
|
||||
task = current;
|
||||
@ -160,18 +161,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
|
||||
stack = (unsigned long *)task->thread.sp;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_FRAME_POINTER
|
||||
if (!bp) {
|
||||
if (task == current) {
|
||||
/* Grab bp right from our regs */
|
||||
get_bp(bp);
|
||||
} else {
|
||||
/* bp is the last reg pushed by switch_to */
|
||||
bp = *(unsigned long *) task->thread.sp;
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
bp = stack_frame(task, regs);
|
||||
/*
|
||||
* Print function call entries in all stacks, starting at the
|
||||
* current stack address. If the stacks consist of nested
|
||||
@ -235,7 +225,7 @@ EXPORT_SYMBOL(dump_trace);
|
||||
|
||||
void
|
||||
show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
unsigned long *sp, unsigned long bp, char *log_lvl)
|
||||
unsigned long *sp, char *log_lvl)
|
||||
{
|
||||
unsigned long *irq_stack_end;
|
||||
unsigned long *irq_stack;
|
||||
@ -279,7 +269,7 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
|
||||
preempt_enable();
|
||||
|
||||
printk(KERN_CONT "\n");
|
||||
show_trace_log_lvl(task, regs, sp, bp, log_lvl);
|
||||
show_trace_log_lvl(task, regs, sp, log_lvl);
|
||||
}
|
||||
|
||||
void show_registers(struct pt_regs *regs)
|
||||
@ -308,7 +298,7 @@ void show_registers(struct pt_regs *regs)
|
||||
|
||||
printk(KERN_EMERG "Stack:\n");
|
||||
show_stack_log_lvl(NULL, regs, (unsigned long *)sp,
|
||||
regs->bp, KERN_EMERG);
|
||||
KERN_EMERG);
|
||||
|
||||
printk(KERN_EMERG "Code: ");
|
||||
|
||||
|
@ -1184,6 +1184,10 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op,
|
||||
{
|
||||
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
|
||||
|
||||
/* This is possible if op is under delayed unoptimizing */
|
||||
if (kprobe_disabled(&op->kp))
|
||||
return;
|
||||
|
||||
preempt_disable();
|
||||
if (kprobe_running()) {
|
||||
kprobes_inc_nmissed_count(&op->kp);
|
||||
@ -1401,10 +1405,16 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Replace a breakpoint (int3) with a relative jump. */
|
||||
int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
|
||||
#define MAX_OPTIMIZE_PROBES 256
|
||||
static struct text_poke_param *jump_poke_params;
|
||||
static struct jump_poke_buffer {
|
||||
u8 buf[RELATIVEJUMP_SIZE];
|
||||
} *jump_poke_bufs;
|
||||
|
||||
static void __kprobes setup_optimize_kprobe(struct text_poke_param *tprm,
|
||||
u8 *insn_buf,
|
||||
struct optimized_kprobe *op)
|
||||
{
|
||||
unsigned char jmp_code[RELATIVEJUMP_SIZE];
|
||||
s32 rel = (s32)((long)op->optinsn.insn -
|
||||
((long)op->kp.addr + RELATIVEJUMP_SIZE));
|
||||
|
||||
@ -1412,16 +1422,79 @@ int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
|
||||
memcpy(op->optinsn.copied_insn, op->kp.addr + INT3_SIZE,
|
||||
RELATIVE_ADDR_SIZE);
|
||||
|
||||
jmp_code[0] = RELATIVEJUMP_OPCODE;
|
||||
*(s32 *)(&jmp_code[1]) = rel;
|
||||
insn_buf[0] = RELATIVEJUMP_OPCODE;
|
||||
*(s32 *)(&insn_buf[1]) = rel;
|
||||
|
||||
tprm->addr = op->kp.addr;
|
||||
tprm->opcode = insn_buf;
|
||||
tprm->len = RELATIVEJUMP_SIZE;
|
||||
}
|
||||
|
||||
/*
* Replace breakpoints (int3) with relative jumps.
* Caller must call with locking kprobe_mutex and text_mutex.
*/
void __kprobes arch_optimize_kprobes(struct list_head *oplist)
{
struct optimized_kprobe *op, *tmp;
int c = 0;

list_for_each_entry_safe(op, tmp, oplist, list) {
WARN_ON(kprobe_disabled(&op->kp));
/* Setup param */
setup_optimize_kprobe(&jump_poke_params[c],
jump_poke_bufs[c].buf, op);
list_del_init(&op->list);
if (++c >= MAX_OPTIMIZE_PROBES)
break;
}

/*
* text_poke_smp doesn't support NMI/MCE code modifying.
* However, since kprobes itself also doesn't support NMI/MCE
* code probing, it's not a problem.
*/
text_poke_smp(op->kp.addr, jmp_code, RELATIVEJUMP_SIZE);
return 0;
text_poke_smp_batch(jump_poke_params, c);
}
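The rewrite above is the usual gather-then-flush batching pattern: requests are collected into a fixed-size parameter array so the expensive cross-CPU text_poke synchronization is paid once per batch rather than once per probe. A minimal userspace sketch of that pattern (names and the MAX_BATCH size are illustrative, not taken from the patch):

#include <stdio.h>

#define MAX_BATCH 4

/* Stand-in for the batch applier: handle all queued requests in one go. */
static void apply_batch(const int *reqs, int n)
{
	(void)reqs;
	printf("flushing a batch of %d request(s)\n", n);
}

int main(void)
{
	int queue[MAX_BATCH];
	int n = 0;

	for (int req = 0; req < 10; req++) {
		queue[n++] = req;
		if (n == MAX_BATCH) {	/* flush when the buffer fills */
			apply_batch(queue, n);
			n = 0;
		}
	}
	if (n)			/* flush the remainder */
		apply_batch(queue, n);
	return 0;
}
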
|
||||
static void __kprobes setup_unoptimize_kprobe(struct text_poke_param *tprm,
|
||||
u8 *insn_buf,
|
||||
struct optimized_kprobe *op)
|
||||
{
|
||||
/* Set int3 to first byte for kprobes */
|
||||
insn_buf[0] = BREAKPOINT_INSTRUCTION;
|
||||
memcpy(insn_buf + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
|
||||
|
||||
tprm->addr = op->kp.addr;
|
||||
tprm->opcode = insn_buf;
|
||||
tprm->len = RELATIVEJUMP_SIZE;
|
||||
}
|
||||
|
||||
/*
|
||||
* Recover original instructions and breakpoints from relative jumps.
|
||||
* Caller must call with locking kprobe_mutex.
|
||||
*/
|
||||
extern void arch_unoptimize_kprobes(struct list_head *oplist,
|
||||
struct list_head *done_list)
|
||||
{
|
||||
struct optimized_kprobe *op, *tmp;
|
||||
int c = 0;
|
||||
|
||||
list_for_each_entry_safe(op, tmp, oplist, list) {
|
||||
/* Setup param */
|
||||
setup_unoptimize_kprobe(&jump_poke_params[c],
|
||||
jump_poke_bufs[c].buf, op);
|
||||
list_move(&op->list, done_list);
|
||||
if (++c >= MAX_OPTIMIZE_PROBES)
|
||||
break;
|
||||
}
|
||||
|
||||
/*
|
||||
* text_poke_smp doesn't support NMI/MCE code modifying.
|
||||
* However, since kprobes itself also doesn't support NMI/MCE
|
||||
* code probing, it's not a problem.
|
||||
*/
|
||||
text_poke_smp_batch(jump_poke_params, c);
|
||||
}
|
||||
|
||||
/* Replace a relative jump with a breakpoint (int3). */
|
||||
@ -1453,11 +1526,35 @@ static int __kprobes setup_detour_execution(struct kprobe *p,
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int __kprobes init_poke_params(void)
|
||||
{
|
||||
/* Allocate code buffer and parameter array */
|
||||
jump_poke_bufs = kmalloc(sizeof(struct jump_poke_buffer) *
|
||||
MAX_OPTIMIZE_PROBES, GFP_KERNEL);
|
||||
if (!jump_poke_bufs)
|
||||
return -ENOMEM;
|
||||
|
||||
jump_poke_params = kmalloc(sizeof(struct text_poke_param) *
|
||||
MAX_OPTIMIZE_PROBES, GFP_KERNEL);
|
||||
if (!jump_poke_params) {
|
||||
kfree(jump_poke_bufs);
|
||||
jump_poke_bufs = NULL;
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
#else /* !CONFIG_OPTPROBES */
|
||||
static int __kprobes init_poke_params(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
#endif
|
||||
|
||||
int __init arch_init_kprobes(void)
|
||||
{
|
||||
return 0;
|
||||
return init_poke_params();
|
||||
}
|
||||
|
||||
int __kprobes arch_trampoline_kprobe(struct kprobe *p)
|
||||
|
@ -91,8 +91,7 @@ void exit_thread(void)
|
||||
void show_regs(struct pt_regs *regs)
|
||||
{
|
||||
show_registers(regs);
|
||||
show_trace(NULL, regs, (unsigned long *)kernel_stack_pointer(regs),
|
||||
regs->bp);
|
||||
show_trace(NULL, regs, (unsigned long *)kernel_stack_pointer(regs));
|
||||
}
|
||||
|
||||
void show_regs_common(void)
|
||||
@ -374,6 +373,7 @@ void default_idle(void)
|
||||
{
|
||||
if (hlt_use_halt()) {
|
||||
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
|
||||
trace_cpu_idle(1, smp_processor_id());
|
||||
current_thread_info()->status &= ~TS_POLLING;
|
||||
/*
|
||||
* TS_POLLING-cleared state must be visible before we
|
||||
@ -444,6 +444,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
|
||||
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
|
||||
{
|
||||
trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
|
||||
trace_cpu_idle((ax>>4)+1, smp_processor_id());
|
||||
if (!need_resched()) {
|
||||
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
|
||||
clflush((void *)¤t_thread_info()->flags);
|
||||
@ -460,6 +461,7 @@ static void mwait_idle(void)
|
||||
{
|
||||
if (!need_resched()) {
|
||||
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
|
||||
trace_cpu_idle(1, smp_processor_id());
|
||||
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
|
||||
clflush((void *)¤t_thread_info()->flags);
|
||||
|
||||
@ -481,10 +483,12 @@ static void mwait_idle(void)
|
||||
static void poll_idle(void)
|
||||
{
|
||||
trace_power_start(POWER_CSTATE, 0, smp_processor_id());
|
||||
trace_cpu_idle(0, smp_processor_id());
|
||||
local_irq_enable();
|
||||
while (!need_resched())
|
||||
cpu_relax();
|
||||
trace_power_end(0);
|
||||
trace_power_end(smp_processor_id());
|
||||
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -113,8 +113,8 @@ void cpu_idle(void)
|
||||
stop_critical_timings();
|
||||
pm_idle();
|
||||
start_critical_timings();
|
||||
|
||||
trace_power_end(smp_processor_id());
|
||||
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
|
||||
}
|
||||
tick_nohz_restart_sched_tick();
|
||||
preempt_enable_no_resched();
|
||||
|
@ -142,6 +142,8 @@ void cpu_idle(void)
|
||||
start_critical_timings();
|
||||
|
||||
trace_power_end(smp_processor_id());
|
||||
trace_cpu_idle(PWR_EVENT_EXIT,
|
||||
smp_processor_id());
|
||||
|
||||
/* In many cases the interrupt that ended idle
|
||||
has already called exit_idle. But some idle
|
||||
|
@ -281,6 +281,13 @@ static void __cpuinit smp_callin(void)
|
||||
*/
|
||||
smp_store_cpu_info(cpuid);
|
||||
|
||||
/*
|
||||
* This must be done before setting cpu_online_mask
|
||||
* or calling notify_cpu_starting.
|
||||
*/
|
||||
set_cpu_sibling_map(raw_smp_processor_id());
|
||||
wmb();
|
||||
|
||||
notify_cpu_starting(cpuid);
|
||||
|
||||
/*
|
||||
@ -316,16 +323,6 @@ notrace static void __cpuinit start_secondary(void *unused)
|
||||
*/
|
||||
check_tsc_sync_target();
|
||||
|
||||
if (nmi_watchdog == NMI_IO_APIC) {
|
||||
legacy_pic->mask(0);
|
||||
enable_NMI_through_LVT0();
|
||||
legacy_pic->unmask(0);
|
||||
}
|
||||
|
||||
/* This must be done before setting cpu_online_mask */
|
||||
set_cpu_sibling_map(raw_smp_processor_id());
|
||||
wmb();
|
||||
|
||||
/*
|
||||
* We need to hold call_lock, so there is no inconsistency
|
||||
* between the time smp_call_function() determines number of
|
||||
@ -1061,8 +1058,6 @@ static int __init smp_sanity_check(unsigned max_cpus)
|
||||
printk(KERN_INFO "SMP mode deactivated.\n");
|
||||
smpboot_clear_io_apic();
|
||||
|
||||
localise_nmi_watchdog();
|
||||
|
||||
connect_bsp_APIC();
|
||||
setup_local_APIC();
|
||||
end_local_APIC_setup();
|
||||
@ -1196,7 +1191,6 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
|
||||
#ifdef CONFIG_X86_IO_APIC
|
||||
setup_ioapic_dest();
|
||||
#endif
|
||||
check_nmi_watchdog();
|
||||
mtrr_aps_init();
|
||||
}
|
||||
|
||||
@ -1341,8 +1335,6 @@ int native_cpu_disable(void)
|
||||
if (cpu == 0)
|
||||
return -EBUSY;
|
||||
|
||||
if (nmi_watchdog == NMI_LOCAL_APIC)
|
||||
stop_apic_nmi_watchdog(NULL);
|
||||
clear_local_APIC();
|
||||
|
||||
cpu_disable_common();
|
||||
|
@ -73,22 +73,22 @@ static const struct stacktrace_ops save_stack_ops_nosched = {
|
||||
*/
|
||||
void save_stack_trace(struct stack_trace *trace)
|
||||
{
|
||||
dump_trace(current, NULL, NULL, 0, &save_stack_ops, trace);
|
||||
dump_trace(current, NULL, NULL, &save_stack_ops, trace);
|
||||
if (trace->nr_entries < trace->max_entries)
|
||||
trace->entries[trace->nr_entries++] = ULONG_MAX;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(save_stack_trace);
|
||||
|
||||
void save_stack_trace_bp(struct stack_trace *trace, unsigned long bp)
|
||||
void save_stack_trace_regs(struct stack_trace *trace, struct pt_regs *regs)
|
||||
{
|
||||
dump_trace(current, NULL, NULL, bp, &save_stack_ops, trace);
|
||||
dump_trace(current, regs, NULL, &save_stack_ops, trace);
|
||||
if (trace->nr_entries < trace->max_entries)
|
||||
trace->entries[trace->nr_entries++] = ULONG_MAX;
|
||||
}
|
||||
|
||||
void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
|
||||
{
|
||||
dump_trace(tsk, NULL, NULL, 0, &save_stack_ops_nosched, trace);
|
||||
dump_trace(tsk, NULL, NULL, &save_stack_ops_nosched, trace);
|
||||
if (trace->nr_entries < trace->max_entries)
|
||||
trace->entries[trace->nr_entries++] = ULONG_MAX;
|
||||
}
|
||||
|
@ -22,10 +22,6 @@
|
||||
#include <asm/hpet.h>
|
||||
#include <asm/time.h>
|
||||
|
||||
#if defined(CONFIG_X86_32) && defined(CONFIG_X86_IO_APIC)
|
||||
int timer_ack;
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_64
|
||||
volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
|
||||
#endif
|
||||
@ -63,20 +59,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
|
||||
/* Keep nmi watchdog up to date */
|
||||
inc_irq_stat(irq0_irqs);
|
||||
|
||||
/* Optimized out for !IO_APIC and x86_64 */
|
||||
if (timer_ack) {
|
||||
/*
|
||||
* Subtle, when I/O APICs are used we have to ack timer IRQ
|
||||
* manually to deassert NMI lines for the watchdog if run
|
||||
* on an 82489DX-based system.
|
||||
*/
|
||||
raw_spin_lock(&i8259A_lock);
|
||||
outb(0x0c, PIC_MASTER_OCW3);
|
||||
/* Ack the IRQ; AEOI will end it automatically. */
|
||||
inb(PIC_MASTER_POLL);
|
||||
raw_spin_unlock(&i8259A_lock);
|
||||
}
|
||||
|
||||
global_clock_event->event_handler(global_clock_event);
|
||||
|
||||
/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
|
||||
|
@ -83,6 +83,8 @@ EXPORT_SYMBOL_GPL(used_vectors);
|
||||
|
||||
static int ignore_nmis;
|
||||
|
||||
int unknown_nmi_panic;
|
||||
|
||||
static inline void conditional_sti(struct pt_regs *regs)
|
||||
{
|
||||
if (regs->flags & X86_EFLAGS_IF)
|
||||
@ -300,6 +302,13 @@ gp_in_kernel:
|
||||
die("general protection fault", regs, error_code);
|
||||
}
|
||||
|
||||
static int __init setup_unknown_nmi_panic(char *str)
|
||||
{
|
||||
unknown_nmi_panic = 1;
|
||||
return 1;
|
||||
}
|
||||
__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
|
||||
|
||||
static notrace __kprobes void
|
||||
mem_parity_error(unsigned char reason, struct pt_regs *regs)
|
||||
{
|
||||
@ -342,9 +351,11 @@ io_check_error(unsigned char reason, struct pt_regs *regs)
reason = (reason & 0xf) | 8;
outb(reason, 0x61);

i = 2000;
while (--i)
udelay(1000);
i = 20000;
while (--i) {
touch_nmi_watchdog();
udelay(100);
}

reason &= ~8;
outb(reason, 0x61);
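Both variants busy-wait for roughly the same total time; the change only slices the delay finer so the NMI watchdog can be touched between slices. A quick check of that arithmetic (plain C, illustrative only):

#include <stdio.h>

int main(void)
{
	/* old loop: 2000 iterations of 1000 us; new loop: 20000 iterations of 100 us */
	printf("old: %d us\n", 2000 * 1000);	/* 2,000,000 us = 2 s */
	printf("new: %d us\n", 20000 * 100);	/* 2,000,000 us = 2 s */
	return 0;
}
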
|
||||
@ -371,7 +382,7 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
|
||||
reason, smp_processor_id());
|
||||
|
||||
printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
|
||||
if (panic_on_unrecovered_nmi)
|
||||
if (unknown_nmi_panic || panic_on_unrecovered_nmi)
|
||||
panic("NMI: Not continuing");
|
||||
|
||||
printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
|
||||
@ -397,20 +408,8 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
|
||||
if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
|
||||
== NOTIFY_STOP)
|
||||
return;
|
||||
|
||||
#ifndef CONFIG_LOCKUP_DETECTOR
|
||||
/*
|
||||
* Ok, so this is none of the documented NMI sources,
|
||||
* so it must be the NMI watchdog.
|
||||
*/
|
||||
if (nmi_watchdog_tick(regs, reason))
|
||||
return;
|
||||
if (!do_nmi_callback(regs, cpu))
|
||||
#endif /* !CONFIG_LOCKUP_DETECTOR */
|
||||
unknown_nmi_error(reason, regs);
|
||||
#else
|
||||
unknown_nmi_error(reason, regs);
|
||||
#endif
|
||||
unknown_nmi_error(reason, regs);
|
||||
|
||||
return;
|
||||
}
|
||||
@ -446,14 +445,12 @@ do_nmi(struct pt_regs *regs, long error_code)
|
||||
|
||||
void stop_nmi(void)
|
||||
{
|
||||
acpi_nmi_disable();
|
||||
ignore_nmis++;
|
||||
}
|
||||
|
||||
void restart_nmi(void)
|
||||
{
|
||||
ignore_nmis--;
|
||||
acpi_nmi_enable();
|
||||
}
|
||||
|
||||
/* May run on IST stack. */
|
||||
|
@ -185,7 +185,7 @@ void kmemcheck_error_save(enum kmemcheck_shadow state,
|
||||
e->trace.entries = e->trace_entries;
|
||||
e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
|
||||
e->trace.skip = 0;
|
||||
save_stack_trace_bp(&e->trace, regs->bp);
|
||||
save_stack_trace_regs(&e->trace, regs);
|
||||
|
||||
/* Round address down to nearest 16 bytes */
|
||||
shadow_copy = kmemcheck_shadow_lookup(address
|
||||
|
@ -126,7 +126,7 @@ x86_backtrace(struct pt_regs * const regs, unsigned int depth)
|
||||
if (!user_mode_vm(regs)) {
|
||||
unsigned long stack = kernel_stack_pointer(regs);
|
||||
if (depth)
|
||||
dump_trace(NULL, regs, (unsigned long *)stack, 0,
|
||||
dump_trace(NULL, regs, (unsigned long *)stack,
|
||||
&backtrace_ops, &depth);
|
||||
return;
|
||||
}
|
||||
|
@ -732,6 +732,9 @@ int __init op_nmi_init(struct oprofile_operations *ops)
|
||||
case 0x14:
|
||||
cpu_type = "x86-64/family14h";
|
||||
break;
|
||||
case 0x15:
|
||||
cpu_type = "x86-64/family15h";
|
||||
break;
|
||||
default:
|
||||
return -ENODEV;
|
||||
}
|
||||
|
@ -58,9 +58,6 @@ static void timer_stop(void)
|
||||
|
||||
int __init op_nmi_timer_init(struct oprofile_operations *ops)
|
||||
{
|
||||
if ((nmi_watchdog != NMI_IO_APIC) || (atomic_read(&nmi_active) <= 0))
|
||||
return -ENODEV;
|
||||
|
||||
ops->start = timer_start;
|
||||
ops->stop = timer_stop;
|
||||
ops->cpu_type = "timer";
|
||||
|
@ -29,11 +29,12 @@
|
||||
#include "op_x86_model.h"
|
||||
#include "op_counter.h"
|
||||
|
||||
#define NUM_COUNTERS 4
|
||||
#define NUM_COUNTERS 4
|
||||
#define NUM_COUNTERS_F15H 6
|
||||
#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
|
||||
#define NUM_VIRT_COUNTERS 32
|
||||
#define NUM_VIRT_COUNTERS 32
|
||||
#else
|
||||
#define NUM_VIRT_COUNTERS NUM_COUNTERS
|
||||
#define NUM_VIRT_COUNTERS 0
|
||||
#endif
|
||||
|
||||
#define OP_EVENT_MASK 0x0FFF
|
||||
@ -41,7 +42,8 @@
|
||||
|
||||
#define MSR_AMD_EVENTSEL_RESERVED ((0xFFFFFCF0ULL<<32)|(1ULL<<21))
|
||||
|
||||
static unsigned long reset_value[NUM_VIRT_COUNTERS];
|
||||
static int num_counters;
|
||||
static unsigned long reset_value[OP_MAX_COUNTER];
|
||||
|
||||
#define IBS_FETCH_SIZE 6
|
||||
#define IBS_OP_SIZE 12
|
||||
@ -387,7 +389,7 @@ static void op_mux_switch_ctrl(struct op_x86_model_spec const *model,
|
||||
int i;
|
||||
|
||||
/* enable active counters */
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
int virt = op_x86_phys_to_virt(i);
|
||||
if (!reset_value[virt])
|
||||
continue;
|
||||
@ -406,7 +408,7 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
if (!msrs->counters[i].addr)
|
||||
continue;
|
||||
release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
|
||||
@ -418,7 +420,7 @@ static int op_amd_fill_in_addresses(struct op_msrs * const msrs)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_COUNTERS; i++) {
|
||||
for (i = 0; i < num_counters; i++) {
|
||||
if (!reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
|
||||
goto fail;
|
||||
if (!reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) {
|
||||
@ -426,8 +428,13 @@ static int op_amd_fill_in_addresses(struct op_msrs * const msrs)
|
||||
goto fail;
|
||||
}
|
||||
/* both registers must be reserved */
|
||||
msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
|
||||
msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
|
||||
if (num_counters == NUM_COUNTERS_F15H) {
|
||||
msrs->counters[i].addr = MSR_F15H_PERF_CTR + (i << 1);
|
||||
msrs->controls[i].addr = MSR_F15H_PERF_CTL + (i << 1);
|
||||
} else {
|
||||
msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
|
||||
msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
|
||||
}
|
||||
continue;
|
||||
fail:
|
||||
if (!counter_config[i].enabled)
|
||||
@ -447,7 +454,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
|
||||
int i;
|
||||
|
||||
/* setup reset_value */
|
||||
for (i = 0; i < NUM_VIRT_COUNTERS; ++i) {
|
||||
for (i = 0; i < OP_MAX_COUNTER; ++i) {
|
||||
if (counter_config[i].enabled
|
||||
&& msrs->counters[op_x86_virt_to_phys(i)].addr)
|
||||
reset_value[i] = counter_config[i].count;
|
||||
@ -456,7 +463,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
|
||||
}
|
||||
|
||||
/* clear all counters */
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
if (!msrs->controls[i].addr)
|
||||
continue;
|
||||
rdmsrl(msrs->controls[i].addr, val);
|
||||
@ -472,7 +479,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
|
||||
}
|
||||
|
||||
/* enable active counters */
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
int virt = op_x86_phys_to_virt(i);
|
||||
if (!reset_value[virt])
|
||||
continue;
|
||||
@ -503,7 +510,7 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
|
||||
u64 val;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
int virt = op_x86_phys_to_virt(i);
|
||||
if (!reset_value[virt])
|
||||
continue;
|
||||
@ -526,7 +533,7 @@ static void op_amd_start(struct op_msrs const * const msrs)
|
||||
u64 val;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
if (!reset_value[op_x86_phys_to_virt(i)])
|
||||
continue;
|
||||
rdmsrl(msrs->controls[i].addr, val);
|
||||
@ -546,7 +553,7 @@ static void op_amd_stop(struct op_msrs const * const msrs)
|
||||
* Subtle: stop on all counters to avoid race with setting our
|
||||
* pm callback
|
||||
*/
|
||||
for (i = 0; i < NUM_COUNTERS; ++i) {
|
||||
for (i = 0; i < num_counters; ++i) {
|
||||
if (!reset_value[op_x86_phys_to_virt(i)])
|
||||
continue;
|
||||
rdmsrl(msrs->controls[i].addr, val);
|
||||
@ -706,18 +713,29 @@ static int setup_ibs_files(struct super_block *sb, struct dentry *root)
|
||||
return 0;
|
||||
}
|
||||
|
||||
struct op_x86_model_spec op_amd_spec;
|
||||
|
||||
static int op_amd_init(struct oprofile_operations *ops)
|
||||
{
|
||||
init_ibs();
|
||||
create_arch_files = ops->create_files;
|
||||
ops->create_files = setup_ibs_files;
|
||||
|
||||
if (boot_cpu_data.x86 == 0x15) {
|
||||
num_counters = NUM_COUNTERS_F15H;
|
||||
} else {
|
||||
num_counters = NUM_COUNTERS;
|
||||
}
|
||||
|
||||
op_amd_spec.num_counters = num_counters;
|
||||
op_amd_spec.num_controls = num_counters;
|
||||
op_amd_spec.num_virt_counters = max(num_counters, NUM_VIRT_COUNTERS);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
struct op_x86_model_spec op_amd_spec = {
|
||||
.num_counters = NUM_COUNTERS,
|
||||
.num_controls = NUM_COUNTERS,
|
||||
.num_virt_counters = NUM_VIRT_COUNTERS,
|
||||
/* num_counters/num_controls filled in at runtime */
|
||||
.reserved = MSR_AMD_EVENTSEL_RESERVED,
|
||||
.event_mask = OP_EVENT_MASK,
|
||||
.init = op_amd_init,
|
||||
|
@ -11,7 +11,7 @@
|
||||
#include <linux/oprofile.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/ptrace.h>
|
||||
#include <linux/nmi.h>
|
||||
#include <asm/nmi.h>
|
||||
#include <asm/msr.h>
|
||||
#include <asm/fixmap.h>
|
||||
#include <asm/apic.h>
|
||||
|
@ -577,9 +577,7 @@ acpi_ns_init_one_device(acpi_handle obj_handle,
|
||||
* as possible (without an NMI being received in the middle of
|
||||
* this) - so disable NMIs and initialize the device:
|
||||
*/
|
||||
acpi_nmi_disable();
|
||||
status = acpi_ns_evaluate(info);
|
||||
acpi_nmi_enable();
|
||||
|
||||
if (ACPI_SUCCESS(status)) {
|
||||
walk_info->num_INI++;
|
||||
|
@ -355,6 +355,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
|
||||
dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
|
||||
(unsigned long)freqs->cpu);
|
||||
trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
|
||||
trace_cpu_frequency(freqs->new, freqs->cpu);
|
||||
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
|
||||
CPUFREQ_POSTCHANGE, freqs);
|
||||
if (likely(policy) && likely(policy->cpu == freqs->cpu))
|
||||
|
@ -107,6 +107,7 @@ static void cpuidle_idle_call(void)
|
||||
if (cpuidle_curr_governor->reflect)
|
||||
cpuidle_curr_governor->reflect(dev);
|
||||
trace_power_end(smp_processor_id());
|
||||
trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -220,9 +220,8 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
|
||||
kt_before = ktime_get_real();
|
||||
|
||||
stop_critical_timings();
|
||||
#ifndef MODULE
|
||||
trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
|
||||
#endif
|
||||
trace_cpu_idle((eax >> 4) + 1, cpu);
|
||||
if (!need_resched()) {
|
||||
|
||||
__monitor((void *)¤t_thread_info()->flags, 0, 0);
|
||||
|
@ -642,19 +642,14 @@ static struct notifier_block die_notifier = {
|
||||
*/
|
||||
|
||||
#ifdef CONFIG_HPWDT_NMI_DECODING
|
||||
#ifdef ARCH_HAS_NMI_WATCHDOG
|
||||
#ifdef CONFIG_X86_LOCAL_APIC
|
||||
static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
|
||||
{
|
||||
/*
|
||||
* If nmi_watchdog is turned off then we can turn on
|
||||
* our nmi decoding capability.
|
||||
*/
|
||||
if (!nmi_watchdog_active())
|
||||
hpwdt_nmi_decoding = 1;
|
||||
else
|
||||
dev_warn(&dev->dev, "NMI decoding is disabled. To enable this "
|
||||
"functionality you must reboot with nmi_watchdog=0 "
|
||||
"and load the hpwdt driver with priority=1.\n");
|
||||
hpwdt_nmi_decoding = 1;
|
||||
}
|
||||
#else
|
||||
static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
|
||||
@ -662,7 +657,7 @@ static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
|
||||
dev_warn(&dev->dev, "NMI decoding is disabled. "
|
||||
"Your kernel does not support a NMI Watchdog.\n");
|
||||
}
|
||||
#endif /* ARCH_HAS_NMI_WATCHDOG */
|
||||
#endif /* CONFIG_X86_LOCAL_APIC */
|
||||
|
||||
static int __devinit hpwdt_init_nmi_decoding(struct pci_dev *dev)
|
||||
{
|
||||
|
@ -154,12 +154,14 @@ enum {
|
||||
TRACE_EVENT_FL_ENABLED_BIT,
|
||||
TRACE_EVENT_FL_FILTERED_BIT,
|
||||
TRACE_EVENT_FL_RECORDED_CMD_BIT,
|
||||
TRACE_EVENT_FL_CAP_ANY_BIT,
|
||||
};
|
||||
|
||||
enum {
|
||||
TRACE_EVENT_FL_ENABLED = (1 << TRACE_EVENT_FL_ENABLED_BIT),
|
||||
TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT),
|
||||
TRACE_EVENT_FL_RECORDED_CMD = (1 << TRACE_EVENT_FL_RECORDED_CMD_BIT),
|
||||
TRACE_EVENT_FL_CAP_ANY = (1 << TRACE_EVENT_FL_CAP_ANY_BIT),
|
||||
};
|
||||
|
||||
struct ftrace_event_call {
|
||||
@ -196,6 +198,14 @@ struct ftrace_event_call {
|
||||
#endif
|
||||
};
|
||||
|
||||
#define __TRACE_EVENT_FLAGS(name, value) \
|
||||
static int __init trace_init_flags_##name(void) \
|
||||
{ \
|
||||
event_##name.flags = value; \
|
||||
return 0; \
|
||||
} \
|
||||
early_initcall(trace_init_flags_##name);
|
||||
|
||||
#define PERF_MAX_TRACE_SIZE 2048
|
||||
|
||||
#define MAX_FILTER_PRED 32
|
||||
@ -215,6 +225,10 @@ enum {
|
||||
FILTER_PTR_STRING,
|
||||
};
|
||||
|
||||
#define EVENT_STORAGE_SIZE 128
|
||||
extern struct mutex event_storage_mutex;
|
||||
extern char event_storage[EVENT_STORAGE_SIZE];
|
||||
|
||||
extern int trace_event_raw_init(struct ftrace_event_call *call);
|
||||
extern int trace_define_field(struct ftrace_event_call *call, const char *type,
|
||||
const char *name, int offset, int size,
|
||||
|
@ -275,7 +275,9 @@ extern int arch_prepared_optinsn(struct arch_optimized_insn *optinsn);
|
||||
extern int arch_check_optimized_kprobe(struct optimized_kprobe *op);
|
||||
extern int arch_prepare_optimized_kprobe(struct optimized_kprobe *op);
|
||||
extern void arch_remove_optimized_kprobe(struct optimized_kprobe *op);
|
||||
extern int arch_optimize_kprobe(struct optimized_kprobe *op);
|
||||
extern void arch_optimize_kprobes(struct list_head *oplist);
|
||||
extern void arch_unoptimize_kprobes(struct list_head *oplist,
|
||||
struct list_head *done_list);
|
||||
extern void arch_unoptimize_kprobe(struct optimized_kprobe *op);
|
||||
extern kprobe_opcode_t *get_optinsn_slot(void);
|
||||
extern void free_optinsn_slot(kprobe_opcode_t *slot, int dirty);
|
||||
|
@ -14,22 +14,14 @@
* may be used to reset the timeout - for code which intentionally
* disables interrupts for a long time. This call is stateless.
*/
#ifdef ARCH_HAS_NMI_WATCHDOG
#if defined(ARCH_HAS_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
#include <asm/nmi.h>
extern void touch_nmi_watchdog(void);
extern void acpi_nmi_disable(void);
extern void acpi_nmi_enable(void);
#else
#ifndef CONFIG_HARDLOCKUP_DETECTOR
static inline void touch_nmi_watchdog(void)
{
touch_softlockup_watchdog();
}
#else
extern void touch_nmi_watchdog(void);
#endif
static inline void acpi_nmi_disable(void) { }
static inline void acpi_nmi_enable(void) { }
#endif

/*
|
||||
|
@ -215,8 +215,9 @@ struct perf_event_attr {
				 */
				precise_ip     :  2, /* skid constraint       */
				mmap_data      :  1, /* non-exec mmap data    */
				sample_id_all  :  1, /* sample_type all events */

				__reserved_1   : 46;
				__reserved_1   : 45;

	union {
		__u32		wakeup_events;	/* wakeup every n events */
@ -327,6 +328,15 @@ struct perf_event_header {
enum perf_event_type {

	/*
	 * If perf_event_attr.sample_id_all is set then all event types will
	 * have the sample_type selected fields related to where/when
	 * (identity) an event took place (TID, TIME, ID, CPU, STREAM_ID)
	 * described in PERF_RECORD_SAMPLE below, it will be stashed just after
	 * the perf_event_header and the fields already present for the existing
	 * fields, i.e. at the end of the payload. That way a newer perf.data
	 * file will be supported by older perf tools, with these new optional
	 * fields being ignored.
	 *
	 * The MMAP events record the PROT_EXEC mappings so that we can
	 * correlate userspace IPs to code. They have the following structure:
	 *
@ -578,6 +588,10 @@ struct perf_event;
struct pmu {
	struct list_head		entry;

	struct device			*dev;
	char				*name;
	int				type;

	int * __percpu			pmu_disable_count;
	struct perf_cpu_context * __percpu pmu_cpu_context;
	int				task_ctx_nr;
@ -758,6 +772,9 @@ struct perf_event {
	u64				shadow_ctx_time;

	struct perf_event_attr		attr;
	u16				header_size;
	u16				id_header_size;
	u16				read_size;
	struct hw_perf_event		hw;

	struct perf_event_context	*ctx;
@ -903,7 +920,7 @@ struct perf_output_handle {

#ifdef CONFIG_PERF_EVENTS

extern int perf_pmu_register(struct pmu *pmu);
extern int perf_pmu_register(struct pmu *pmu, char *name, int type);
extern void perf_pmu_unregister(struct pmu *pmu);

extern int perf_num_counters(void);
@ -970,6 +987,11 @@ extern int perf_event_overflow(struct perf_event *event, int nmi,
				struct perf_sample_data *data,
				struct pt_regs *regs);

static inline bool is_sampling_event(struct perf_event *event)
{
	return event->attr.sample_period != 0;
}

/*
 * Return 1 for a software event, 0 for a hardware event
 */
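perf_pmu_register() now takes a printable name and a type id in addition to the pmu itself. A minimal sketch of the new call, modelled on the kernel/hw_breakpoint.c hunk further down (the example_pmu wrapper and the type value 6 are hypothetical; only the signature and the "breakpoint"/PERF_TYPE_BREAKPOINT call site come from this commit):

#include <linux/init.h>
#include <linux/perf_event.h>

static struct pmu example_pmu = {
	/* .event_init, .add, .del, .start, .stop, .read callbacks go here */
};

static int __init example_pmu_init(void)
{
	/* Old call:  perf_pmu_register(&example_pmu);
	 * New call:  add a name plus a type id; hw_breakpoint passes
	 * "breakpoint" and PERF_TYPE_BREAKPOINT.  The id 6 here is a
	 * made-up value for this sketch, not from the commit. */
	return perf_pmu_register(&example_pmu, "example", 6);
}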
@ -316,6 +316,7 @@ extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
|
||||
size_t *lenp, loff_t *ppos);
|
||||
extern unsigned int softlockup_panic;
|
||||
extern int softlockup_thresh;
|
||||
void lockup_detector_init(void);
|
||||
#else
|
||||
static inline void touch_softlockup_watchdog(void)
|
||||
{
|
||||
@ -326,6 +327,9 @@ static inline void touch_softlockup_watchdog_sync(void)
|
||||
static inline void touch_all_softlockup_watchdogs(void)
|
||||
{
|
||||
}
|
||||
static inline void lockup_detector_init(void)
|
||||
{
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_DETECT_HUNG_TASK
|
||||
|
@ -2,6 +2,7 @@
|
||||
#define __LINUX_STACKTRACE_H
|
||||
|
||||
struct task_struct;
|
||||
struct pt_regs;
|
||||
|
||||
#ifdef CONFIG_STACKTRACE
|
||||
struct task_struct;
|
||||
@ -13,7 +14,8 @@ struct stack_trace {
|
||||
};
|
||||
|
||||
extern void save_stack_trace(struct stack_trace *trace);
|
||||
extern void save_stack_trace_bp(struct stack_trace *trace, unsigned long bp);
|
||||
extern void save_stack_trace_regs(struct stack_trace *trace,
|
||||
struct pt_regs *regs);
|
||||
extern void save_stack_trace_tsk(struct task_struct *tsk,
|
||||
struct stack_trace *trace);
|
||||
|
||||
|
@ -127,8 +127,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
|
||||
#define SYSCALL_TRACE_ENTER_EVENT(sname) \
|
||||
static struct syscall_metadata \
|
||||
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
|
||||
static struct ftrace_event_call \
|
||||
__attribute__((__aligned__(4))) event_enter_##sname; \
|
||||
static struct ftrace_event_call __used \
|
||||
__attribute__((__aligned__(4))) \
|
||||
__attribute__((section("_ftrace_events"))) \
|
||||
@ -137,13 +135,12 @@ extern struct trace_event_functions exit_syscall_print_funcs;
|
||||
.class = &event_class_syscall_enter, \
|
||||
.event.funcs = &enter_syscall_print_funcs, \
|
||||
.data = (void *)&__syscall_meta_##sname,\
|
||||
}
|
||||
}; \
|
||||
__TRACE_EVENT_FLAGS(enter_##sname, TRACE_EVENT_FL_CAP_ANY)
|
||||
|
||||
#define SYSCALL_TRACE_EXIT_EVENT(sname) \
|
||||
static struct syscall_metadata \
|
||||
__attribute__((__aligned__(4))) __syscall_meta_##sname; \
|
||||
static struct ftrace_event_call \
|
||||
__attribute__((__aligned__(4))) event_exit_##sname; \
|
||||
static struct ftrace_event_call __used \
|
||||
__attribute__((__aligned__(4))) \
|
||||
__attribute__((section("_ftrace_events"))) \
|
||||
@ -152,7 +149,8 @@ extern struct trace_event_functions exit_syscall_print_funcs;
|
||||
.class = &event_class_syscall_exit, \
|
||||
.event.funcs = &exit_syscall_print_funcs, \
|
||||
.data = (void *)&__syscall_meta_##sname,\
|
||||
}
|
||||
}; \
|
||||
__TRACE_EVENT_FLAGS(exit_##sname, TRACE_EVENT_FL_CAP_ANY)
|
||||
|
||||
#define SYSCALL_METADATA(sname, nb) \
|
||||
SYSCALL_TRACE_ENTER_EVENT(sname); \
|
||||
|
@ -106,6 +106,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
|
||||
|
||||
#define TP_PROTO(args...) args
|
||||
#define TP_ARGS(args...) args
|
||||
#define TP_CONDITION(args...) args
|
||||
|
||||
#ifdef CONFIG_TRACEPOINTS
|
||||
|
||||
@ -119,12 +120,14 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
|
||||
* as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
|
||||
* "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
|
||||
*/
|
||||
#define __DO_TRACE(tp, proto, args) \
|
||||
#define __DO_TRACE(tp, proto, args, cond) \
|
||||
do { \
|
||||
struct tracepoint_func *it_func_ptr; \
|
||||
void *it_func; \
|
||||
void *__data; \
|
||||
\
|
||||
if (!(cond)) \
|
||||
return; \
|
||||
rcu_read_lock_sched_notrace(); \
|
||||
it_func_ptr = rcu_dereference_sched((tp)->funcs); \
|
||||
if (it_func_ptr) { \
|
||||
@ -142,7 +145,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
|
||||
* not add unwanted padding between the beginning of the section and the
|
||||
* structure. Force alignment to the same alignment as the section start.
|
||||
*/
|
||||
#define __DECLARE_TRACE(name, proto, args, data_proto, data_args) \
|
||||
#define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
|
||||
extern struct tracepoint __tracepoint_##name; \
|
||||
static inline void trace_##name(proto) \
|
||||
{ \
|
||||
@ -151,7 +154,8 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
|
||||
do_trace: \
|
||||
__DO_TRACE(&__tracepoint_##name, \
|
||||
TP_PROTO(data_proto), \
|
||||
TP_ARGS(data_args)); \
|
||||
TP_ARGS(data_args), \
|
||||
TP_CONDITION(cond)); \
|
||||
} \
|
||||
static inline int \
|
||||
register_trace_##name(void (*probe)(data_proto), void *data) \
|
||||
@ -186,7 +190,7 @@ do_trace: \
|
||||
EXPORT_SYMBOL(__tracepoint_##name)
|
||||
|
||||
#else /* !CONFIG_TRACEPOINTS */
|
||||
#define __DECLARE_TRACE(name, proto, args, data_proto, data_args) \
|
||||
#define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
|
||||
static inline void trace_##name(proto) \
|
||||
{ } \
|
||||
static inline int \
|
||||
@ -227,13 +231,20 @@ do_trace: \
|
||||
* "void *__data, proto" as the callback prototype.
|
||||
*/
|
||||
#define DECLARE_TRACE_NOARGS(name) \
|
||||
__DECLARE_TRACE(name, void, , void *__data, __data)
|
||||
__DECLARE_TRACE(name, void, , 1, void *__data, __data)
|
||||
|
||||
#define DECLARE_TRACE(name, proto, args) \
|
||||
__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \
|
||||
__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), 1, \
|
||||
PARAMS(void *__data, proto), \
|
||||
PARAMS(__data, args))
|
||||
|
||||
#define DECLARE_TRACE_CONDITION(name, proto, args, cond) \
|
||||
__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), PARAMS(cond), \
|
||||
PARAMS(void *__data, proto), \
|
||||
PARAMS(__data, args))
|
||||
|
||||
#define TRACE_EVENT_FLAGS(event, flag)
|
||||
|
||||
#endif /* DECLARE_TRACE */
|
||||
|
||||
#ifndef TRACE_EVENT
|
||||
@ -347,11 +358,21 @@ do_trace: \
|
||||
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
|
||||
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
|
||||
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
|
||||
#define DEFINE_EVENT_CONDITION(template, name, proto, \
|
||||
args, cond) \
|
||||
DECLARE_TRACE_CONDITION(name, PARAMS(proto), \
|
||||
PARAMS(args), PARAMS(cond))
|
||||
|
||||
#define TRACE_EVENT(name, proto, args, struct, assign, print) \
|
||||
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
|
||||
#define TRACE_EVENT_FN(name, proto, args, struct, \
|
||||
assign, print, reg, unreg) \
|
||||
DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
|
||||
#define TRACE_EVENT_CONDITION(name, proto, args, cond, \
|
||||
struct, assign, print) \
|
||||
DECLARE_TRACE_CONDITION(name, PARAMS(proto), \
|
||||
PARAMS(args), PARAMS(cond))
|
||||
|
||||
#define TRACE_EVENT_FLAGS(event, flag)
|
||||
|
||||
#endif /* ifdef TRACE_EVENT (see note above) */
|
||||
|
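TRACE_EVENT_CONDITION() and DECLARE_TRACE_CONDITION() wire TP_CONDITION() into __DO_TRACE() so the probe body is skipped when the condition evaluates false. A hedged usage sketch (the event name and fields below are invented for illustration; only the macro shape comes from these hunks):

TRACE_EVENT_CONDITION(example_io_done,

	TP_PROTO(unsigned int dev_id, int ret),

	TP_ARGS(dev_id, ret),

	TP_CONDITION(ret != 0),		/* only trace failed completions */

	TP_STRUCT__entry(
		__field(u32, dev_id)
		__field(int, ret)
	),

	TP_fast_assign(
		__entry->dev_id = dev_id;
		__entry->ret = ret;
	),

	TP_printk("dev_id=%u ret=%d", __entry->dev_id, __entry->ret)
);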
@ -26,6 +26,15 @@
|
||||
#define TRACE_EVENT(name, proto, args, tstruct, assign, print) \
|
||||
DEFINE_TRACE(name)
|
||||
|
||||
#undef TRACE_EVENT_CONDITION
|
||||
#define TRACE_EVENT_CONDITION(name, proto, args, cond, tstruct, assign, print) \
|
||||
TRACE_EVENT(name, \
|
||||
PARAMS(proto), \
|
||||
PARAMS(args), \
|
||||
PARAMS(tstruct), \
|
||||
PARAMS(assign), \
|
||||
PARAMS(print))
|
||||
|
||||
#undef TRACE_EVENT_FN
|
||||
#define TRACE_EVENT_FN(name, proto, args, tstruct, \
|
||||
assign, print, reg, unreg) \
|
||||
@ -39,6 +48,10 @@
|
||||
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
|
||||
DEFINE_TRACE(name)
|
||||
|
||||
#undef DEFINE_EVENT_CONDITION
|
||||
#define DEFINE_EVENT_CONDITION(template, name, proto, args, cond) \
|
||||
DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
|
||||
|
||||
#undef DECLARE_TRACE
|
||||
#define DECLARE_TRACE(name, proto, args) \
|
||||
DEFINE_TRACE(name)
|
||||
@ -75,9 +88,11 @@
|
||||
|
||||
#undef TRACE_EVENT
|
||||
#undef TRACE_EVENT_FN
|
||||
#undef TRACE_EVENT_CONDITION
|
||||
#undef DECLARE_EVENT_CLASS
|
||||
#undef DEFINE_EVENT
|
||||
#undef DEFINE_EVENT_PRINT
|
||||
#undef DEFINE_EVENT_CONDITION
|
||||
#undef TRACE_HEADER_MULTI_READ
|
||||
#undef DECLARE_TRACE
|
||||
|
||||
|
@ -7,16 +7,67 @@
|
||||
#include <linux/ktime.h>
|
||||
#include <linux/tracepoint.h>
|
||||
|
||||
#ifndef _TRACE_POWER_ENUM_
|
||||
#define _TRACE_POWER_ENUM_
|
||||
enum {
|
||||
POWER_NONE = 0,
|
||||
POWER_CSTATE = 1, /* C-State */
|
||||
POWER_PSTATE = 2, /* Frequency change or DVFS */
|
||||
POWER_SSTATE = 3, /* Suspend */
|
||||
};
|
||||
DECLARE_EVENT_CLASS(cpu,
|
||||
|
||||
TP_PROTO(unsigned int state, unsigned int cpu_id),
|
||||
|
||||
TP_ARGS(state, cpu_id),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u32, state )
|
||||
__field( u32, cpu_id )
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->state = state;
|
||||
__entry->cpu_id = cpu_id;
|
||||
),
|
||||
|
||||
TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
|
||||
(unsigned long)__entry->cpu_id)
|
||||
);
|
||||
|
||||
DEFINE_EVENT(cpu, cpu_idle,
|
||||
|
||||
TP_PROTO(unsigned int state, unsigned int cpu_id),
|
||||
|
||||
TP_ARGS(state, cpu_id)
|
||||
);
|
||||
|
||||
/* This file can get included multiple times, TRACE_HEADER_MULTI_READ at top */
|
||||
#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING
|
||||
#define _PWR_EVENT_AVOID_DOUBLE_DEFINING
|
||||
|
||||
#define PWR_EVENT_EXIT -1
|
||||
#endif
|
||||
|
||||
DEFINE_EVENT(cpu, cpu_frequency,
|
||||
|
||||
TP_PROTO(unsigned int frequency, unsigned int cpu_id),
|
||||
|
||||
TP_ARGS(frequency, cpu_id)
|
||||
);
|
||||
|
||||
TRACE_EVENT(machine_suspend,
|
||||
|
||||
TP_PROTO(unsigned int state),
|
||||
|
||||
TP_ARGS(state),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u32, state )
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->state = state;
|
||||
),
|
||||
|
||||
TP_printk("state=%lu", (unsigned long)__entry->state)
|
||||
);
|
||||
|
||||
/* This code will be removed after deprecation time exceeded (2.6.41) */
|
||||
#ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
|
||||
|
||||
/*
|
||||
* The power events are used for cpuidle & suspend (power_start, power_end)
|
||||
* and for cpufreq (power_frequency)
|
||||
@ -75,6 +126,36 @@ TRACE_EVENT(power_end,
|
||||
|
||||
);
|
||||
|
||||
/* Deprecated dummy functions must be protected against multi-declaration */
|
||||
#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
|
||||
#define _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
|
||||
|
||||
enum {
|
||||
POWER_NONE = 0,
|
||||
POWER_CSTATE = 1,
|
||||
POWER_PSTATE = 2,
|
||||
};
|
||||
#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
|
||||
|
||||
#else /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
|
||||
|
||||
#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
|
||||
#define _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
|
||||
enum {
|
||||
POWER_NONE = 0,
|
||||
POWER_CSTATE = 1,
|
||||
POWER_PSTATE = 2,
|
||||
};
|
||||
|
||||
/* These dummy declarations have to be ripped out when the deprecated
|
||||
events get removed */
|
||||
static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {};
|
||||
static inline void trace_power_end(u64 cpuid) {};
|
||||
static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {};
|
||||
#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
|
||||
|
||||
#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
|
||||
|
||||
/*
|
||||
* The clock events are used for clock enable/disable and for
|
||||
* clock rate change
|
||||
@ -153,7 +234,6 @@ DEFINE_EVENT(power_domain, power_domain_target,
|
||||
|
||||
TP_ARGS(name, state, cpu_id)
|
||||
);
|
||||
|
||||
#endif /* _TRACE_POWER_H */
|
||||
|
||||
/* This part must be outside protection */
|
||||
|
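The new generic events above replace power_start/power_end/power_frequency. A rough sketch of how a cpuidle or cpufreq driver would emit them (the call sites are illustrative only; the trace_cpu_idle()/trace_cpu_frequency() wrappers follow from the DEFINE_EVENT() lines in this hunk, and using PWR_EVENT_EXIT to mark the idle exit is an assumption based on its definition here):

#include <trace/events/power.h>

static void example_enter_idle(unsigned int cstate, unsigned int cpu)
{
	trace_cpu_idle(cstate, cpu);		/* entering C-state 'cstate' */
	/* ... architecture idle instruction ... */
	trace_cpu_idle(PWR_EVENT_EXIT, cpu);	/* back out of idle */
}

static void example_set_frequency(unsigned int freq_khz, unsigned int cpu)
{
	/* ... program the new P-state ... */
	trace_cpu_frequency(freq_khz, cpu);
}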
@ -40,6 +40,8 @@ TRACE_EVENT_FN(sys_enter,
|
||||
syscall_regfunc, syscall_unregfunc
|
||||
);
|
||||
|
||||
TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY)
|
||||
|
||||
TRACE_EVENT_FN(sys_exit,
|
||||
|
||||
TP_PROTO(struct pt_regs *regs, long ret),
|
||||
@ -62,6 +64,8 @@ TRACE_EVENT_FN(sys_exit,
|
||||
syscall_regfunc, syscall_unregfunc
|
||||
);
|
||||
|
||||
TRACE_EVENT_FLAGS(sys_exit, TRACE_EVENT_FL_CAP_ANY)
|
||||
|
||||
#endif /* CONFIG_HAVE_SYSCALL_TRACEPOINTS */
|
||||
|
||||
#endif /* _TRACE_EVENTS_SYSCALLS_H */
|
||||
|
@ -82,6 +82,10 @@
|
||||
TRACE_EVENT(name, PARAMS(proto), PARAMS(args), \
|
||||
PARAMS(tstruct), PARAMS(assign), PARAMS(print)) \
|
||||
|
||||
#undef TRACE_EVENT_FLAGS
|
||||
#define TRACE_EVENT_FLAGS(name, value) \
|
||||
__TRACE_EVENT_FLAGS(name, value)
|
||||
|
||||
#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
|
||||
|
||||
|
||||
@ -129,6 +133,9 @@
|
||||
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
|
||||
DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
|
||||
|
||||
#undef TRACE_EVENT_FLAGS
|
||||
#define TRACE_EVENT_FLAGS(event, flag)
|
||||
|
||||
#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
|
||||
|
||||
/*
|
||||
@ -289,13 +296,19 @@ static struct trace_event_functions ftrace_event_type_funcs_##call = { \
|
||||
|
||||
#undef __array
|
||||
#define __array(type, item, len) \
|
||||
BUILD_BUG_ON(len > MAX_FILTER_STR_VAL); \
|
||||
ret = trace_define_field(event_call, #type "[" #len "]", #item, \
|
||||
do { \
|
||||
mutex_lock(&event_storage_mutex); \
|
||||
BUILD_BUG_ON(len > MAX_FILTER_STR_VAL); \
|
||||
snprintf(event_storage, sizeof(event_storage), \
|
||||
"%s[%d]", #type, len); \
|
||||
ret = trace_define_field(event_call, event_storage, #item, \
|
||||
offsetof(typeof(field), item), \
|
||||
sizeof(field.item), \
|
||||
is_signed_type(type), FILTER_OTHER); \
|
||||
if (ret) \
|
||||
return ret;
|
||||
mutex_unlock(&event_storage_mutex); \
|
||||
if (ret) \
|
||||
return ret; \
|
||||
} while (0);
|
||||
|
||||
#undef __dynamic_array
|
||||
#define __dynamic_array(type, item, len) \
|
||||
|
@ -67,6 +67,7 @@
|
||||
#include <linux/sfi.h>
|
||||
#include <linux/shmem_fs.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/perf_event.h>
|
||||
|
||||
#include <asm/io.h>
|
||||
#include <asm/bugs.h>
|
||||
@ -603,6 +604,8 @@ asmlinkage void __init start_kernel(void)
|
||||
"enabled *very* early, fixing it\n");
|
||||
local_irq_disable();
|
||||
}
|
||||
idr_init_cache();
|
||||
perf_event_init();
|
||||
rcu_init();
|
||||
radix_tree_init();
|
||||
/* init some links before init_ISA_irqs() */
|
||||
@ -658,7 +661,6 @@ asmlinkage void __init start_kernel(void)
|
||||
enable_debug_pagealloc();
|
||||
kmemleak_init();
|
||||
debug_objects_mem_init();
|
||||
idr_init_cache();
|
||||
setup_per_cpu_pageset();
|
||||
numa_policy_init();
|
||||
if (late_time_init)
|
||||
@ -882,6 +884,7 @@ static int __init kernel_init(void * unused)
|
||||
smp_prepare_cpus(setup_max_cpus);
|
||||
|
||||
do_pre_smp_initcalls();
|
||||
lockup_detector_init();
|
||||
|
||||
smp_init();
|
||||
sched_init_smp();
|
||||
|
@ -641,7 +641,7 @@ int __init init_hw_breakpoint(void)
|
||||
|
||||
constraints_initialized = 1;
|
||||
|
||||
perf_pmu_register(&perf_breakpoint);
|
||||
perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT);
|
||||
|
||||
return register_die_notifier(&hw_breakpoint_exceptions_nb);
|
||||
|
||||
|
565
kernel/kprobes.c
@ -354,13 +354,20 @@ static inline int kprobe_aggrprobe(struct kprobe *p)
|
||||
return p->pre_handler == aggr_pre_handler;
|
||||
}
|
||||
|
||||
/* Return true(!0) if the kprobe is unused */
|
||||
static inline int kprobe_unused(struct kprobe *p)
|
||||
{
|
||||
return kprobe_aggrprobe(p) && kprobe_disabled(p) &&
|
||||
list_empty(&p->list);
|
||||
}
|
||||
|
||||
/*
|
||||
* Keep all fields in the kprobe consistent
|
||||
*/
|
||||
static inline void copy_kprobe(struct kprobe *old_p, struct kprobe *p)
|
||||
static inline void copy_kprobe(struct kprobe *ap, struct kprobe *p)
|
||||
{
|
||||
memcpy(&p->opcode, &old_p->opcode, sizeof(kprobe_opcode_t));
|
||||
memcpy(&p->ainsn, &old_p->ainsn, sizeof(struct arch_specific_insn));
|
||||
memcpy(&p->opcode, &ap->opcode, sizeof(kprobe_opcode_t));
|
||||
memcpy(&p->ainsn, &ap->ainsn, sizeof(struct arch_specific_insn));
|
||||
}
|
||||
|
||||
#ifdef CONFIG_OPTPROBES
|
||||
@ -384,6 +391,17 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
|
||||
}
|
||||
}
|
||||
|
||||
/* Free optimized instructions and optimized_kprobe */
|
||||
static __kprobes void free_aggr_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
arch_remove_optimized_kprobe(op);
|
||||
arch_remove_kprobe(p);
|
||||
kfree(op);
|
||||
}
|
||||
|
||||
/* Return true(!0) if the kprobe is ready for optimization. */
|
||||
static inline int kprobe_optready(struct kprobe *p)
|
||||
{
|
||||
@ -397,6 +415,33 @@ static inline int kprobe_optready(struct kprobe *p)
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Return true(!0) if the kprobe is disarmed. Note: p must be on hash list */
|
||||
static inline int kprobe_disarmed(struct kprobe *p)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
/* If kprobe is not aggr/opt probe, just return kprobe is disabled */
|
||||
if (!kprobe_aggrprobe(p))
|
||||
return kprobe_disabled(p);
|
||||
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
|
||||
return kprobe_disabled(p) && list_empty(&op->list);
|
||||
}
|
||||
|
||||
/* Return true(!0) if the probe is queued on (un)optimizing lists */
|
||||
static int __kprobes kprobe_queued(struct kprobe *p)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
if (kprobe_aggrprobe(p)) {
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
if (!list_empty(&op->list))
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Return an optimized kprobe whose optimizing code replaces
|
||||
* instructions including addr (exclude breakpoint).
|
||||
@ -422,30 +467,23 @@ static struct kprobe *__kprobes get_optimized_kprobe(unsigned long addr)
|
||||
|
||||
/* Optimization staging list, protected by kprobe_mutex */
|
||||
static LIST_HEAD(optimizing_list);
|
||||
static LIST_HEAD(unoptimizing_list);
|
||||
|
||||
static void kprobe_optimizer(struct work_struct *work);
|
||||
static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
|
||||
static DECLARE_COMPLETION(optimizer_comp);
|
||||
#define OPTIMIZE_DELAY 5
|
||||
|
||||
/* Kprobe jump optimizer */
|
||||
static __kprobes void kprobe_optimizer(struct work_struct *work)
|
||||
/*
|
||||
* Optimize (replace a breakpoint with a jump) kprobes listed on
|
||||
* optimizing_list.
|
||||
*/
|
||||
static __kprobes void do_optimize_kprobes(void)
|
||||
{
|
||||
struct optimized_kprobe *op, *tmp;
|
||||
|
||||
/* Lock modules while optimizing kprobes */
|
||||
mutex_lock(&module_mutex);
|
||||
mutex_lock(&kprobe_mutex);
|
||||
if (kprobes_all_disarmed || !kprobes_allow_optimization)
|
||||
goto end;
|
||||
|
||||
/*
|
||||
* Wait for quiesence period to ensure all running interrupts
|
||||
* are done. Because optprobe may modify multiple instructions
|
||||
* there is a chance that Nth instruction is interrupted. In that
|
||||
* case, running interrupt can return to 2nd-Nth byte of jump
|
||||
* instruction. This wait is for avoiding it.
|
||||
*/
|
||||
synchronize_sched();
|
||||
/* Optimization never be done when disarmed */
|
||||
if (kprobes_all_disarmed || !kprobes_allow_optimization ||
|
||||
list_empty(&optimizing_list))
|
||||
return;
|
||||
|
||||
/*
|
||||
* The optimization/unoptimization refers online_cpus via
|
||||
@ -459,17 +497,111 @@ static __kprobes void kprobe_optimizer(struct work_struct *work)
|
||||
*/
|
||||
get_online_cpus();
|
||||
mutex_lock(&text_mutex);
|
||||
list_for_each_entry_safe(op, tmp, &optimizing_list, list) {
|
||||
WARN_ON(kprobe_disabled(&op->kp));
|
||||
if (arch_optimize_kprobe(op) < 0)
|
||||
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
|
||||
list_del_init(&op->list);
|
||||
arch_optimize_kprobes(&optimizing_list);
|
||||
mutex_unlock(&text_mutex);
|
||||
put_online_cpus();
|
||||
}
|
||||
|
||||
/*
|
||||
* Unoptimize (replace a jump with a breakpoint and remove the breakpoint
|
||||
* if need) kprobes listed on unoptimizing_list.
|
||||
*/
|
||||
static __kprobes void do_unoptimize_kprobes(struct list_head *free_list)
|
||||
{
|
||||
struct optimized_kprobe *op, *tmp;
|
||||
|
||||
/* Unoptimization must be done anytime */
|
||||
if (list_empty(&unoptimizing_list))
|
||||
return;
|
||||
|
||||
/* Ditto to do_optimize_kprobes */
|
||||
get_online_cpus();
|
||||
mutex_lock(&text_mutex);
|
||||
arch_unoptimize_kprobes(&unoptimizing_list, free_list);
|
||||
/* Loop free_list for disarming */
|
||||
list_for_each_entry_safe(op, tmp, free_list, list) {
|
||||
/* Disarm probes if marked disabled */
|
||||
if (kprobe_disabled(&op->kp))
|
||||
arch_disarm_kprobe(&op->kp);
|
||||
if (kprobe_unused(&op->kp)) {
|
||||
/*
|
||||
* Remove unused probes from hash list. After waiting
|
||||
* for synchronization, these probes are reclaimed.
|
||||
* (reclaiming is done by do_free_cleaned_kprobes.)
|
||||
*/
|
||||
hlist_del_rcu(&op->kp.hlist);
|
||||
} else
|
||||
list_del_init(&op->list);
|
||||
}
|
||||
mutex_unlock(&text_mutex);
|
||||
put_online_cpus();
|
||||
end:
|
||||
}
|
||||
|
||||
/* Reclaim all kprobes on the free_list */
|
||||
static __kprobes void do_free_cleaned_kprobes(struct list_head *free_list)
|
||||
{
|
||||
struct optimized_kprobe *op, *tmp;
|
||||
|
||||
list_for_each_entry_safe(op, tmp, free_list, list) {
|
||||
BUG_ON(!kprobe_unused(&op->kp));
|
||||
list_del_init(&op->list);
|
||||
free_aggr_kprobe(&op->kp);
|
||||
}
|
||||
}
|
||||
|
||||
/* Start optimizer after OPTIMIZE_DELAY passed */
|
||||
static __kprobes void kick_kprobe_optimizer(void)
|
||||
{
|
||||
if (!delayed_work_pending(&optimizing_work))
|
||||
schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
|
||||
}
|
||||
|
||||
/* Kprobe jump optimizer */
|
||||
static __kprobes void kprobe_optimizer(struct work_struct *work)
|
||||
{
|
||||
LIST_HEAD(free_list);
|
||||
|
||||
/* Lock modules while optimizing kprobes */
|
||||
mutex_lock(&module_mutex);
|
||||
mutex_lock(&kprobe_mutex);
|
||||
|
||||
/*
|
||||
* Step 1: Unoptimize kprobes and collect cleaned (unused and disarmed)
|
||||
* kprobes before waiting for quiesence period.
|
||||
*/
|
||||
do_unoptimize_kprobes(&free_list);
|
||||
|
||||
/*
|
||||
* Step 2: Wait for quiesence period to ensure all running interrupts
|
||||
* are done. Because optprobe may modify multiple instructions
|
||||
* there is a chance that Nth instruction is interrupted. In that
|
||||
* case, running interrupt can return to 2nd-Nth byte of jump
|
||||
* instruction. This wait is for avoiding it.
|
||||
*/
|
||||
synchronize_sched();
|
||||
|
||||
/* Step 3: Optimize kprobes after quiesence period */
|
||||
do_optimize_kprobes();
|
||||
|
||||
/* Step 4: Free cleaned kprobes after quiesence period */
|
||||
do_free_cleaned_kprobes(&free_list);
|
||||
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
mutex_unlock(&module_mutex);
|
||||
|
||||
/* Step 5: Kick optimizer again if needed */
|
||||
if (!list_empty(&optimizing_list) || !list_empty(&unoptimizing_list))
|
||||
kick_kprobe_optimizer();
|
||||
else
|
||||
/* Wake up all waiters */
|
||||
complete_all(&optimizer_comp);
|
||||
}
|
||||
|
||||
/* Wait for completing optimization and unoptimization */
|
||||
static __kprobes void wait_for_kprobe_optimizer(void)
|
||||
{
|
||||
if (delayed_work_pending(&optimizing_work))
|
||||
wait_for_completion(&optimizer_comp);
|
||||
}
|
||||
|
||||
/* Optimize kprobe if p is ready to be optimized */
|
||||
@ -495,28 +627,85 @@ static __kprobes void optimize_kprobe(struct kprobe *p)
|
||||
/* Check if it is already optimized. */
|
||||
if (op->kp.flags & KPROBE_FLAG_OPTIMIZED)
|
||||
return;
|
||||
|
||||
op->kp.flags |= KPROBE_FLAG_OPTIMIZED;
|
||||
list_add(&op->list, &optimizing_list);
|
||||
if (!delayed_work_pending(&optimizing_work))
|
||||
schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
|
||||
|
||||
if (!list_empty(&op->list))
|
||||
/* This is under unoptimizing. Just dequeue the probe */
|
||||
list_del_init(&op->list);
|
||||
else {
|
||||
list_add(&op->list, &optimizing_list);
|
||||
kick_kprobe_optimizer();
|
||||
}
|
||||
}
|
||||
|
||||
/* Short cut to direct unoptimizing */
|
||||
static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
|
||||
{
|
||||
get_online_cpus();
|
||||
arch_unoptimize_kprobe(op);
|
||||
put_online_cpus();
|
||||
if (kprobe_disabled(&op->kp))
|
||||
arch_disarm_kprobe(&op->kp);
|
||||
}
|
||||
|
||||
/* Unoptimize a kprobe if p is optimized */
|
||||
static __kprobes void unoptimize_kprobe(struct kprobe *p)
|
||||
static __kprobes void unoptimize_kprobe(struct kprobe *p, bool force)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
if ((p->flags & KPROBE_FLAG_OPTIMIZED) && kprobe_aggrprobe(p)) {
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
if (!list_empty(&op->list))
|
||||
/* Dequeue from the optimization queue */
|
||||
if (!kprobe_aggrprobe(p) || kprobe_disarmed(p))
|
||||
return; /* This is not an optprobe nor optimized */
|
||||
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
if (!kprobe_optimized(p)) {
|
||||
/* Unoptimized or unoptimizing case */
|
||||
if (force && !list_empty(&op->list)) {
|
||||
/*
|
||||
* Only if this is unoptimizing kprobe and forced,
|
||||
* forcibly unoptimize it. (No need to unoptimize
|
||||
* unoptimized kprobe again :)
|
||||
*/
|
||||
list_del_init(&op->list);
|
||||
else
|
||||
/* Replace jump with break */
|
||||
arch_unoptimize_kprobe(op);
|
||||
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
|
||||
force_unoptimize_kprobe(op);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
|
||||
if (!list_empty(&op->list)) {
|
||||
/* Dequeue from the optimization queue */
|
||||
list_del_init(&op->list);
|
||||
return;
|
||||
}
|
||||
/* Optimized kprobe case */
|
||||
if (force)
|
||||
/* Forcibly update the code: this is a special case */
|
||||
force_unoptimize_kprobe(op);
|
||||
else {
|
||||
list_add(&op->list, &unoptimizing_list);
|
||||
kick_kprobe_optimizer();
|
||||
}
|
||||
}
|
||||
|
||||
/* Cancel unoptimizing for reusing */
|
||||
static void reuse_unused_kprobe(struct kprobe *ap)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
BUG_ON(!kprobe_unused(ap));
|
||||
/*
|
||||
* Unused kprobe MUST be on the way of delayed unoptimizing (means
|
||||
* there is still a relative jump) and disabled.
|
||||
*/
|
||||
op = container_of(ap, struct optimized_kprobe, kp);
|
||||
if (unlikely(list_empty(&op->list)))
|
||||
printk(KERN_WARNING "Warning: found a stray unused "
|
||||
"aggrprobe@%p\n", ap->addr);
|
||||
/* Enable the probe again */
|
||||
ap->flags &= ~KPROBE_FLAG_DISABLED;
|
||||
/* Optimize it again (remove from op->list) */
|
||||
BUG_ON(!kprobe_optready(ap));
|
||||
optimize_kprobe(ap);
|
||||
}
|
||||
|
||||
/* Remove optimized instructions */
|
||||
@ -525,12 +714,12 @@ static void __kprobes kill_optimized_kprobe(struct kprobe *p)
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
if (!list_empty(&op->list)) {
|
||||
/* Dequeue from the optimization queue */
|
||||
if (!list_empty(&op->list))
|
||||
/* Dequeue from the (un)optimization queue */
|
||||
list_del_init(&op->list);
|
||||
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
|
||||
}
|
||||
/* Don't unoptimize, because the target code will be freed. */
|
||||
|
||||
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
|
||||
/* Don't touch the code, because it is already freed. */
|
||||
arch_remove_optimized_kprobe(op);
|
||||
}
|
||||
|
||||
@ -543,16 +732,6 @@ static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
|
||||
arch_prepare_optimized_kprobe(op);
|
||||
}
|
||||
|
||||
/* Free optimized instructions and optimized_kprobe */
|
||||
static __kprobes void free_aggr_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct optimized_kprobe *op;
|
||||
|
||||
op = container_of(p, struct optimized_kprobe, kp);
|
||||
arch_remove_optimized_kprobe(op);
|
||||
kfree(op);
|
||||
}
|
||||
|
||||
/* Allocate new optimized_kprobe and try to prepare optimized instructions */
|
||||
static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
|
||||
{
|
||||
@ -587,7 +766,8 @@ static __kprobes void try_to_optimize_kprobe(struct kprobe *p)
|
||||
op = container_of(ap, struct optimized_kprobe, kp);
|
||||
if (!arch_prepared_optinsn(&op->optinsn)) {
|
||||
/* If failed to setup optimizing, fallback to kprobe */
|
||||
free_aggr_kprobe(ap);
|
||||
arch_remove_optimized_kprobe(op);
|
||||
kfree(op);
|
||||
return;
|
||||
}
|
||||
|
||||
@ -631,21 +811,16 @@ static void __kprobes unoptimize_all_kprobes(void)
|
||||
return;
|
||||
|
||||
kprobes_allow_optimization = false;
|
||||
printk(KERN_INFO "Kprobes globally unoptimized\n");
|
||||
get_online_cpus(); /* For avoiding text_mutex deadlock */
|
||||
mutex_lock(&text_mutex);
|
||||
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
|
||||
head = &kprobe_table[i];
|
||||
hlist_for_each_entry_rcu(p, node, head, hlist) {
|
||||
if (!kprobe_disabled(p))
|
||||
unoptimize_kprobe(p);
|
||||
unoptimize_kprobe(p, false);
|
||||
}
|
||||
}
|
||||
|
||||
mutex_unlock(&text_mutex);
|
||||
put_online_cpus();
|
||||
/* Allow all currently running kprobes to complete */
|
||||
synchronize_sched();
|
||||
/* Wait for unoptimizing completion */
|
||||
wait_for_kprobe_optimizer();
|
||||
printk(KERN_INFO "Kprobes globally unoptimized\n");
|
||||
}
|
||||
|
||||
int sysctl_kprobes_optimization;
|
||||
@ -669,44 +844,60 @@ int proc_kprobes_optimization_handler(struct ctl_table *table, int write,
|
||||
}
|
||||
#endif /* CONFIG_SYSCTL */
|
||||
|
||||
/* Put a breakpoint for a probe. Must be called with text_mutex locked */
|
||||
static void __kprobes __arm_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *old_p;
|
||||
struct kprobe *_p;
|
||||
|
||||
/* Check collision with other optimized kprobes */
|
||||
old_p = get_optimized_kprobe((unsigned long)p->addr);
|
||||
if (unlikely(old_p))
|
||||
unoptimize_kprobe(old_p); /* Fallback to unoptimized kprobe */
|
||||
_p = get_optimized_kprobe((unsigned long)p->addr);
|
||||
if (unlikely(_p))
|
||||
/* Fallback to unoptimized kprobe */
|
||||
unoptimize_kprobe(_p, true);
|
||||
|
||||
arch_arm_kprobe(p);
|
||||
optimize_kprobe(p); /* Try to optimize (add kprobe to a list) */
|
||||
}
|
||||
|
||||
static void __kprobes __disarm_kprobe(struct kprobe *p)
|
||||
/* Remove the breakpoint of a probe. Must be called with text_mutex locked */
|
||||
static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
|
||||
{
|
||||
struct kprobe *old_p;
|
||||
struct kprobe *_p;
|
||||
|
||||
unoptimize_kprobe(p); /* Try to unoptimize */
|
||||
arch_disarm_kprobe(p);
|
||||
unoptimize_kprobe(p, false); /* Try to unoptimize */
|
||||
|
||||
/* If another kprobe was blocked, optimize it. */
|
||||
old_p = get_optimized_kprobe((unsigned long)p->addr);
|
||||
if (unlikely(old_p))
|
||||
optimize_kprobe(old_p);
|
||||
if (!kprobe_queued(p)) {
|
||||
arch_disarm_kprobe(p);
|
||||
/* If another kprobe was blocked, optimize it. */
|
||||
_p = get_optimized_kprobe((unsigned long)p->addr);
|
||||
if (unlikely(_p) && reopt)
|
||||
optimize_kprobe(_p);
|
||||
}
|
||||
/* TODO: reoptimize others after unoptimized this probe */
|
||||
}
|
||||
|
||||
#else /* !CONFIG_OPTPROBES */
|
||||
|
||||
#define optimize_kprobe(p) do {} while (0)
|
||||
#define unoptimize_kprobe(p) do {} while (0)
|
||||
#define unoptimize_kprobe(p, f) do {} while (0)
|
||||
#define kill_optimized_kprobe(p) do {} while (0)
|
||||
#define prepare_optimized_kprobe(p) do {} while (0)
|
||||
#define try_to_optimize_kprobe(p) do {} while (0)
|
||||
#define __arm_kprobe(p) arch_arm_kprobe(p)
|
||||
#define __disarm_kprobe(p) arch_disarm_kprobe(p)
|
||||
#define __disarm_kprobe(p, o) arch_disarm_kprobe(p)
|
||||
#define kprobe_disarmed(p) kprobe_disabled(p)
|
||||
#define wait_for_kprobe_optimizer() do {} while (0)
|
||||
|
||||
/* There should be no unused kprobes that can be reused without optimization */
|
||||
static void reuse_unused_kprobe(struct kprobe *ap)
|
||||
{
|
||||
printk(KERN_ERR "Error: There should be no unused kprobe here.\n");
|
||||
BUG_ON(kprobe_unused(ap));
|
||||
}
|
||||
|
||||
static __kprobes void free_aggr_kprobe(struct kprobe *p)
|
||||
{
|
||||
arch_remove_kprobe(p);
|
||||
kfree(p);
|
||||
}
|
||||
|
||||
@ -732,11 +923,10 @@ static void __kprobes arm_kprobe(struct kprobe *kp)
|
||||
/* Disarm a kprobe with text_mutex */
|
||||
static void __kprobes disarm_kprobe(struct kprobe *kp)
|
||||
{
|
||||
get_online_cpus(); /* For avoiding text_mutex deadlock */
|
||||
/* Ditto */
|
||||
mutex_lock(&text_mutex);
|
||||
__disarm_kprobe(kp);
|
||||
__disarm_kprobe(kp, true);
|
||||
mutex_unlock(&text_mutex);
|
||||
put_online_cpus();
|
||||
}
|
||||
|
||||
/*
|
||||
@ -942,7 +1132,7 @@ static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
|
||||
BUG_ON(kprobe_gone(ap) || kprobe_gone(p));
|
||||
|
||||
if (p->break_handler || p->post_handler)
|
||||
unoptimize_kprobe(ap); /* Fall back to normal kprobe */
|
||||
unoptimize_kprobe(ap, true); /* Fall back to normal kprobe */
|
||||
|
||||
if (p->break_handler) {
|
||||
if (ap->break_handler)
|
||||
@ -993,19 +1183,21 @@ static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
|
||||
* This is the second or subsequent kprobe at the address - handle
|
||||
* the intricacies
|
||||
*/
|
||||
static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
|
||||
static int __kprobes register_aggr_kprobe(struct kprobe *orig_p,
|
||||
struct kprobe *p)
|
||||
{
|
||||
int ret = 0;
|
||||
struct kprobe *ap = old_p;
|
||||
struct kprobe *ap = orig_p;
|
||||
|
||||
if (!kprobe_aggrprobe(old_p)) {
|
||||
/* If old_p is not an aggr_kprobe, create new aggr_kprobe. */
|
||||
ap = alloc_aggr_kprobe(old_p);
|
||||
if (!kprobe_aggrprobe(orig_p)) {
|
||||
/* If orig_p is not an aggr_kprobe, create new aggr_kprobe. */
|
||||
ap = alloc_aggr_kprobe(orig_p);
|
||||
if (!ap)
|
||||
return -ENOMEM;
|
||||
init_aggr_kprobe(ap, old_p);
|
||||
}
|
||||
init_aggr_kprobe(ap, orig_p);
|
||||
} else if (kprobe_unused(ap))
|
||||
/* This probe is going to die. Rescue it */
|
||||
reuse_unused_kprobe(ap);
|
||||
|
||||
if (kprobe_gone(ap)) {
|
||||
/*
|
||||
@ -1039,23 +1231,6 @@ static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
|
||||
return add_new_kprobe(ap, p);
|
||||
}
|
||||
|
||||
/* Try to disable aggr_kprobe, and return 1 if succeeded.*/
|
||||
static int __kprobes try_to_disable_aggr_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *kp;
|
||||
|
||||
list_for_each_entry_rcu(kp, &p->list, list) {
|
||||
if (!kprobe_disabled(kp))
|
||||
/*
|
||||
* There is an active probe on the list.
|
||||
* We can't disable aggr_kprobe.
|
||||
*/
|
||||
return 0;
|
||||
}
|
||||
p->flags |= KPROBE_FLAG_DISABLED;
|
||||
return 1;
|
||||
}
|
||||
|
||||
static int __kprobes in_kprobes_functions(unsigned long addr)
|
||||
{
|
||||
struct kprobe_blackpoint *kb;
|
||||
@ -1098,34 +1273,33 @@ static kprobe_opcode_t __kprobes *kprobe_addr(struct kprobe *p)
|
||||
/* Check passed kprobe is valid and return kprobe in kprobe_table. */
|
||||
static struct kprobe * __kprobes __get_valid_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *old_p, *list_p;
|
||||
struct kprobe *ap, *list_p;
|
||||
|
||||
old_p = get_kprobe(p->addr);
|
||||
if (unlikely(!old_p))
|
||||
ap = get_kprobe(p->addr);
|
||||
if (unlikely(!ap))
|
||||
return NULL;
|
||||
|
||||
if (p != old_p) {
|
||||
list_for_each_entry_rcu(list_p, &old_p->list, list)
|
||||
if (p != ap) {
|
||||
list_for_each_entry_rcu(list_p, &ap->list, list)
|
||||
if (list_p == p)
|
||||
/* kprobe p is a valid probe */
|
||||
goto valid;
|
||||
return NULL;
|
||||
}
|
||||
valid:
|
||||
return old_p;
|
||||
return ap;
|
||||
}
|
||||
|
||||
/* Return error if the kprobe is being re-registered */
|
||||
static inline int check_kprobe_rereg(struct kprobe *p)
|
||||
{
|
||||
int ret = 0;
|
||||
struct kprobe *old_p;
|
||||
|
||||
mutex_lock(&kprobe_mutex);
|
||||
old_p = __get_valid_kprobe(p);
|
||||
if (old_p)
|
||||
if (__get_valid_kprobe(p))
|
||||
ret = -EINVAL;
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
@ -1229,67 +1403,121 @@ fail_with_jump_label:
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(register_kprobe);
|
||||
|
||||
/* Check if all probes on the aggrprobe are disabled */
|
||||
static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
|
||||
{
|
||||
struct kprobe *kp;
|
||||
|
||||
list_for_each_entry_rcu(kp, &ap->list, list)
|
||||
if (!kprobe_disabled(kp))
|
||||
/*
|
||||
* There is an active probe on the list.
|
||||
* We can't disable this ap.
|
||||
*/
|
||||
return 0;
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* Disable one kprobe: Make sure called under kprobe_mutex is locked */
|
||||
static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *orig_p;
|
||||
|
||||
/* Get an original kprobe for return */
|
||||
orig_p = __get_valid_kprobe(p);
|
||||
if (unlikely(orig_p == NULL))
|
||||
return NULL;
|
||||
|
||||
if (!kprobe_disabled(p)) {
|
||||
/* Disable probe if it is a child probe */
|
||||
if (p != orig_p)
|
||||
p->flags |= KPROBE_FLAG_DISABLED;
|
||||
|
||||
/* Try to disarm and disable this/parent probe */
|
||||
if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
|
||||
disarm_kprobe(orig_p);
|
||||
orig_p->flags |= KPROBE_FLAG_DISABLED;
|
||||
}
|
||||
}
|
||||
|
||||
return orig_p;
|
||||
}
|
||||
|
||||
/*
|
||||
* Unregister a kprobe without a scheduler synchronization.
|
||||
*/
|
||||
static int __kprobes __unregister_kprobe_top(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *old_p, *list_p;
|
||||
struct kprobe *ap, *list_p;
|
||||
|
||||
old_p = __get_valid_kprobe(p);
|
||||
if (old_p == NULL)
|
||||
/* Disable kprobe. This will disarm it if needed. */
|
||||
ap = __disable_kprobe(p);
|
||||
if (ap == NULL)
|
||||
return -EINVAL;
|
||||
|
||||
if (old_p == p ||
|
||||
(kprobe_aggrprobe(old_p) &&
|
||||
list_is_singular(&old_p->list))) {
|
||||
if (ap == p)
|
||||
/*
|
||||
* Only probe on the hash list. Disarm only if kprobes are
|
||||
* enabled and not gone - otherwise, the breakpoint would
|
||||
* already have been removed. We save on flushing icache.
|
||||
* This probe is an independent(and non-optimized) kprobe
|
||||
* (not an aggrprobe). Remove from the hash list.
|
||||
*/
|
||||
if (!kprobes_all_disarmed && !kprobe_disabled(old_p))
|
||||
disarm_kprobe(old_p);
|
||||
hlist_del_rcu(&old_p->hlist);
|
||||
} else {
|
||||
goto disarmed;
|
||||
|
||||
/* Following process expects this probe is an aggrprobe */
|
||||
WARN_ON(!kprobe_aggrprobe(ap));
|
||||
|
||||
if (list_is_singular(&ap->list) && kprobe_disarmed(ap))
|
||||
/*
|
||||
* !disarmed could happen if the probe is under delayed
|
||||
* unoptimizing.
|
||||
*/
|
||||
goto disarmed;
|
||||
else {
|
||||
/* If disabling probe has special handlers, update aggrprobe */
|
||||
if (p->break_handler && !kprobe_gone(p))
|
||||
old_p->break_handler = NULL;
|
||||
ap->break_handler = NULL;
|
||||
if (p->post_handler && !kprobe_gone(p)) {
|
||||
list_for_each_entry_rcu(list_p, &old_p->list, list) {
|
||||
list_for_each_entry_rcu(list_p, &ap->list, list) {
|
||||
if ((list_p != p) && (list_p->post_handler))
|
||||
goto noclean;
|
||||
}
|
||||
old_p->post_handler = NULL;
|
||||
ap->post_handler = NULL;
|
||||
}
|
||||
noclean:
|
||||
/*
|
||||
* Remove from the aggrprobe: this path will do nothing in
|
||||
* __unregister_kprobe_bottom().
|
||||
*/
|
||||
list_del_rcu(&p->list);
|
||||
if (!kprobe_disabled(old_p)) {
|
||||
try_to_disable_aggr_kprobe(old_p);
|
||||
if (!kprobes_all_disarmed) {
|
||||
if (kprobe_disabled(old_p))
|
||||
disarm_kprobe(old_p);
|
||||
else
|
||||
/* Try to optimize this probe again */
|
||||
optimize_kprobe(old_p);
|
||||
}
|
||||
}
|
||||
if (!kprobe_disabled(ap) && !kprobes_all_disarmed)
|
||||
/*
|
||||
* Try to optimize this probe again, because post
|
||||
* handler may have been changed.
|
||||
*/
|
||||
optimize_kprobe(ap);
|
||||
}
|
||||
return 0;
|
||||
|
||||
disarmed:
|
||||
BUG_ON(!kprobe_disarmed(ap));
|
||||
hlist_del_rcu(&ap->hlist);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
|
||||
{
|
||||
struct kprobe *old_p;
|
||||
struct kprobe *ap;
|
||||
|
||||
if (list_empty(&p->list))
|
||||
/* This is an independent kprobe */
|
||||
arch_remove_kprobe(p);
|
||||
else if (list_is_singular(&p->list)) {
|
||||
/* "p" is the last child of an aggr_kprobe */
|
||||
old_p = list_entry(p->list.next, struct kprobe, list);
|
||||
/* This is the last child of an aggrprobe */
|
||||
ap = list_entry(p->list.next, struct kprobe, list);
|
||||
list_del(&p->list);
|
||||
arch_remove_kprobe(old_p);
|
||||
free_aggr_kprobe(old_p);
|
||||
free_aggr_kprobe(ap);
|
||||
}
|
||||
/* Otherwise, do nothing. */
|
||||
}
|
||||
|
||||
int __kprobes register_kprobes(struct kprobe **kps, int num)
|
||||
@ -1607,29 +1835,13 @@ static void __kprobes kill_kprobe(struct kprobe *p)
|
||||
int __kprobes disable_kprobe(struct kprobe *kp)
|
||||
{
|
||||
int ret = 0;
|
||||
struct kprobe *p;
|
||||
|
||||
mutex_lock(&kprobe_mutex);
|
||||
|
||||
/* Check whether specified probe is valid. */
|
||||
p = __get_valid_kprobe(kp);
|
||||
if (unlikely(p == NULL)) {
|
||||
/* Disable this kprobe */
|
||||
if (__disable_kprobe(kp) == NULL)
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* If the probe is already disabled (or gone), just return */
|
||||
if (kprobe_disabled(kp))
|
||||
goto out;
|
||||
|
||||
kp->flags |= KPROBE_FLAG_DISABLED;
|
||||
if (p != kp)
|
||||
/* When kp != p, p is always enabled. */
|
||||
try_to_disable_aggr_kprobe(p);
|
||||
|
||||
if (!kprobes_all_disarmed && kprobe_disabled(p))
|
||||
disarm_kprobe(p);
|
||||
out:
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
return ret;
|
||||
}
|
||||
@ -1927,36 +2139,27 @@ static void __kprobes disarm_all_kprobes(void)
|
||||
mutex_lock(&kprobe_mutex);
|
||||
|
||||
/* If kprobes are already disarmed, just return */
|
||||
if (kprobes_all_disarmed)
|
||||
goto already_disabled;
|
||||
if (kprobes_all_disarmed) {
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
return;
|
||||
}
|
||||
|
||||
kprobes_all_disarmed = true;
|
||||
printk(KERN_INFO "Kprobes globally disabled\n");
|
||||
|
||||
/*
|
||||
* Here we call get_online_cpus() for avoiding text_mutex deadlock,
|
||||
* because disarming may also unoptimize kprobes.
|
||||
*/
|
||||
get_online_cpus();
|
||||
mutex_lock(&text_mutex);
|
||||
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
|
||||
head = &kprobe_table[i];
|
||||
hlist_for_each_entry_rcu(p, node, head, hlist) {
|
||||
if (!arch_trampoline_kprobe(p) && !kprobe_disabled(p))
|
||||
__disarm_kprobe(p);
|
||||
__disarm_kprobe(p, false);
|
||||
}
|
||||
}
|
||||
|
||||
mutex_unlock(&text_mutex);
|
||||
put_online_cpus();
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
/* Allow all currently running kprobes to complete */
|
||||
synchronize_sched();
|
||||
return;
|
||||
|
||||
already_disabled:
|
||||
mutex_unlock(&kprobe_mutex);
|
||||
return;
|
||||
/* Wait for disarming all kprobes by optimizer */
|
||||
wait_for_kprobe_optimizer();
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -13,6 +13,7 @@
|
||||
#include <linux/mm.h>
|
||||
#include <linux/cpu.h>
|
||||
#include <linux/smp.h>
|
||||
#include <linux/idr.h>
|
||||
#include <linux/file.h>
|
||||
#include <linux/poll.h>
|
||||
#include <linux/slab.h>
|
||||
@ -21,7 +22,9 @@
|
||||
#include <linux/dcache.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/ptrace.h>
|
||||
#include <linux/reboot.h>
|
||||
#include <linux/vmstat.h>
|
||||
#include <linux/device.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <linux/hardirq.h>
|
||||
#include <linux/rculist.h>
|
||||
@ -133,6 +136,28 @@ static void unclone_ctx(struct perf_event_context *ctx)
|
||||
}
|
||||
}
|
||||
|
||||
static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
|
||||
{
|
||||
/*
|
||||
* only top level events have the pid namespace they were created in
|
||||
*/
|
||||
if (event->parent)
|
||||
event = event->parent;
|
||||
|
||||
return task_tgid_nr_ns(p, event->ns);
|
||||
}
|
||||
|
||||
static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
|
||||
{
|
||||
/*
|
||||
* only top level events have the pid namespace they were created in
|
||||
*/
|
||||
if (event->parent)
|
||||
event = event->parent;
|
||||
|
||||
return task_pid_nr_ns(p, event->ns);
|
||||
}
|
||||
|
||||
/*
|
||||
* If we inherit events we want to return the parent event id
|
||||
* to userspace.
|
||||
@ -312,9 +337,84 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
|
||||
ctx->nr_stat++;
|
||||
}
|
||||
|
||||
/*
|
||||
* Called at perf_event creation and when events are attached/detached from a
|
||||
* group.
|
||||
*/
|
||||
static void perf_event__read_size(struct perf_event *event)
|
||||
{
|
||||
int entry = sizeof(u64); /* value */
|
||||
int size = 0;
|
||||
int nr = 1;
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
|
||||
size += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
|
||||
size += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_ID)
|
||||
entry += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_GROUP) {
|
||||
nr += event->group_leader->nr_siblings;
|
||||
size += sizeof(u64);
|
||||
}
|
||||
|
||||
size += entry * nr;
|
||||
event->read_size = size;
|
||||
}
|
||||
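As a worked example of the read_size computation above (the read_format combination and sibling count are chosen for illustration, not taken from the commit):

/*
 * read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_ID |
 *               PERF_FORMAT_GROUP, group leader with 2 siblings:
 *
 *   size  = 8 (time_enabled) + 8 (group nr field) = 16
 *   entry = 8 (value) + 8 (id)                    = 16
 *   nr    = 1 + 2 siblings                        =  3
 *
 *   event->read_size = 16 + 16 * 3 = 64 bytes
 */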
|
||||
static void perf_event__header_size(struct perf_event *event)
|
||||
{
|
||||
struct perf_sample_data *data;
|
||||
u64 sample_type = event->attr.sample_type;
|
||||
u16 size = 0;
|
||||
|
||||
perf_event__read_size(event);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_IP)
|
||||
size += sizeof(data->ip);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ADDR)
|
||||
size += sizeof(data->addr);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_PERIOD)
|
||||
size += sizeof(data->period);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_READ)
|
||||
size += event->read_size;
|
||||
|
||||
event->header_size = size;
|
||||
}
|
||||
|
||||
static void perf_event__id_header_size(struct perf_event *event)
|
||||
{
|
||||
struct perf_sample_data *data;
|
||||
u64 sample_type = event->attr.sample_type;
|
||||
u16 size = 0;
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TID)
|
||||
size += sizeof(data->tid_entry);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TIME)
|
||||
size += sizeof(data->time);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ID)
|
||||
size += sizeof(data->id);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_STREAM_ID)
|
||||
size += sizeof(data->stream_id);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_CPU)
|
||||
size += sizeof(data->cpu_entry);
|
||||
|
||||
event->id_header_size = size;
|
||||
}
|
||||
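A worked example for the sample_id_all sizing above (an illustration, not from the commit; tid_entry and cpu_entry are each a pair of u32s):

/*
 * attr.sample_id_all = 1, sample_type = PERF_SAMPLE_TID |
 * PERF_SAMPLE_TIME | PERF_SAMPLE_CPU:
 *
 *   id_header_size = 8 (pid,tid) + 8 (time) + 8 (cpu,reserved) = 24
 *
 * i.e. 24 bytes are appended to every non-sample record, which older
 * perf tools can simply ignore as trailing optional fields.
 */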
|
||||
static void perf_group_attach(struct perf_event *event)
|
||||
{
|
||||
struct perf_event *group_leader = event->group_leader;
|
||||
struct perf_event *group_leader = event->group_leader, *pos;
|
||||
|
||||
/*
|
||||
* We can have double attach due to group movement in perf_event_open.
|
||||
@ -333,6 +433,11 @@ static void perf_group_attach(struct perf_event *event)
|
||||
|
||||
list_add_tail(&event->group_entry, &group_leader->sibling_list);
|
||||
group_leader->nr_siblings++;
|
||||
|
||||
perf_event__header_size(group_leader);
|
||||
|
||||
list_for_each_entry(pos, &group_leader->sibling_list, group_entry)
|
||||
perf_event__header_size(pos);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -391,7 +496,7 @@ static void perf_group_detach(struct perf_event *event)
|
||||
if (event->group_leader != event) {
|
||||
list_del_init(&event->group_entry);
|
||||
event->group_leader->nr_siblings--;
|
||||
return;
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!list_empty(&event->group_entry))
|
||||
@ -410,6 +515,12 @@ static void perf_group_detach(struct perf_event *event)
|
||||
/* Inherit group flags from the previous leader */
|
||||
sibling->group_flags = event->group_flags;
|
||||
}
|
||||
|
||||
out:
|
||||
perf_event__header_size(event->group_leader);
|
||||
|
||||
list_for_each_entry(tmp, &event->group_leader->sibling_list, group_entry)
|
||||
perf_event__header_size(tmp);
|
||||
}
|
||||
|
||||
static inline int
|
||||
@ -1073,7 +1184,7 @@ static int perf_event_refresh(struct perf_event *event, int refresh)
|
||||
/*
|
||||
* not supported on inherited events
|
||||
*/
|
||||
if (event->attr.inherit)
|
||||
if (event->attr.inherit || !is_sampling_event(event))
|
||||
return -EINVAL;
|
||||
|
||||
atomic_add(refresh, &event->event_limit);
|
||||
@ -2289,31 +2400,6 @@ static int perf_release(struct inode *inode, struct file *file)
|
||||
return perf_event_release_kernel(event);
|
||||
}
|
||||
|
||||
static int perf_event_read_size(struct perf_event *event)
|
||||
{
|
||||
int entry = sizeof(u64); /* value */
|
||||
int size = 0;
|
||||
int nr = 1;
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
|
||||
size += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
|
||||
size += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_ID)
|
||||
entry += sizeof(u64);
|
||||
|
||||
if (event->attr.read_format & PERF_FORMAT_GROUP) {
|
||||
nr += event->group_leader->nr_siblings;
|
||||
size += sizeof(u64);
|
||||
}
|
||||
|
||||
size += entry * nr;
|
||||
|
||||
return size;
|
||||
}
|
||||
|
||||
u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
|
||||
{
|
||||
struct perf_event *child;
|
||||
@ -2428,7 +2514,7 @@ perf_read_hw(struct perf_event *event, char __user *buf, size_t count)
|
||||
if (event->state == PERF_EVENT_STATE_ERROR)
|
||||
return 0;
|
||||
|
||||
if (count < perf_event_read_size(event))
|
||||
if (count < event->read_size)
|
||||
return -ENOSPC;
|
||||
|
||||
WARN_ON_ONCE(event->ctx->parent_ctx);
|
||||
@ -2514,7 +2600,7 @@ static int perf_event_period(struct perf_event *event, u64 __user *arg)
|
||||
int ret = 0;
|
||||
u64 value;
|
||||
|
||||
if (!event->attr.sample_period)
|
||||
if (!is_sampling_event(event))
|
||||
return -EINVAL;
|
||||
|
||||
if (copy_from_user(&value, arg, sizeof(value)))
|
||||
@ -3305,6 +3391,73 @@ __always_inline void perf_output_copy(struct perf_output_handle *handle,
|
||||
} while (len);
|
||||
}
|
||||
|
||||
static void __perf_event_header__init_id(struct perf_event_header *header,
|
||||
struct perf_sample_data *data,
|
||||
struct perf_event *event)
|
||||
{
|
||||
u64 sample_type = event->attr.sample_type;
|
||||
|
||||
data->type = sample_type;
|
||||
header->size += event->id_header_size;
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TID) {
|
||||
/* namespace issues */
|
||||
data->tid_entry.pid = perf_event_pid(event, current);
|
||||
data->tid_entry.tid = perf_event_tid(event, current);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TIME)
|
||||
data->time = perf_clock();
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ID)
|
||||
data->id = primary_event_id(event);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_STREAM_ID)
|
||||
data->stream_id = event->id;
|
||||
|
||||
if (sample_type & PERF_SAMPLE_CPU) {
|
||||
data->cpu_entry.cpu = raw_smp_processor_id();
|
||||
data->cpu_entry.reserved = 0;
|
||||
}
|
||||
}
|
||||
|
||||
static void perf_event_header__init_id(struct perf_event_header *header,
|
||||
struct perf_sample_data *data,
|
||||
struct perf_event *event)
|
||||
{
|
||||
if (event->attr.sample_id_all)
|
||||
__perf_event_header__init_id(header, data, event);
|
||||
}
|
||||
|
||||
static void __perf_event__output_id_sample(struct perf_output_handle *handle,
|
||||
struct perf_sample_data *data)
|
||||
{
|
||||
u64 sample_type = data->type;
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TID)
|
||||
perf_output_put(handle, data->tid_entry);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TIME)
|
||||
perf_output_put(handle, data->time);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ID)
|
||||
perf_output_put(handle, data->id);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_STREAM_ID)
|
||||
perf_output_put(handle, data->stream_id);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_CPU)
|
||||
perf_output_put(handle, data->cpu_entry);
|
||||
}
|
||||
|
||||
static void perf_event__output_id_sample(struct perf_event *event,
|
||||
struct perf_output_handle *handle,
|
||||
struct perf_sample_data *sample)
|
||||
{
|
||||
if (event->attr.sample_id_all)
|
||||
__perf_event__output_id_sample(handle, sample);
|
||||
}
|
||||
|
||||
int perf_output_begin(struct perf_output_handle *handle,
|
||||
struct perf_event *event, unsigned int size,
|
||||
int nmi, int sample)
|
||||
@ -3312,6 +3465,7 @@ int perf_output_begin(struct perf_output_handle *handle,
|
||||
struct perf_buffer *buffer;
|
||||
unsigned long tail, offset, head;
|
||||
int have_lost;
|
||||
struct perf_sample_data sample_data;
|
||||
struct {
|
||||
struct perf_event_header header;
|
||||
u64 id;
|
||||
@ -3338,8 +3492,12 @@ int perf_output_begin(struct perf_output_handle *handle,
|
||||
goto out;
|
||||
|
||||
have_lost = local_read(&buffer->lost);
|
||||
if (have_lost)
|
||||
size += sizeof(lost_event);
|
||||
if (have_lost) {
|
||||
lost_event.header.size = sizeof(lost_event);
|
||||
perf_event_header__init_id(&lost_event.header, &sample_data,
|
||||
event);
|
||||
size += lost_event.header.size;
|
||||
}
|
||||
|
||||
perf_output_get_handle(handle);
|
||||
|
||||
@ -3370,11 +3528,11 @@ int perf_output_begin(struct perf_output_handle *handle,
|
||||
if (have_lost) {
|
||||
lost_event.header.type = PERF_RECORD_LOST;
|
||||
lost_event.header.misc = 0;
|
||||
lost_event.header.size = sizeof(lost_event);
|
||||
lost_event.id = event->id;
|
||||
lost_event.lost = local_xchg(&buffer->lost, 0);
|
||||
|
||||
perf_output_put(handle, lost_event);
|
||||
perf_event__output_id_sample(event, handle, &sample_data);
|
||||
}
|
||||
|
||||
return 0;
|
||||
@ -3407,28 +3565,6 @@ void perf_output_end(struct perf_output_handle *handle)
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
|
||||
{
|
||||
/*
|
||||
* only top level events have the pid namespace they were created in
|
||||
*/
|
||||
if (event->parent)
|
||||
event = event->parent;
|
||||
|
||||
return task_tgid_nr_ns(p, event->ns);
|
||||
}
|
||||
|
||||
static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
|
||||
{
|
||||
/*
|
||||
* only top level events have the pid namespace they were created in
|
||||
*/
|
||||
if (event->parent)
|
||||
event = event->parent;
|
||||
|
||||
return task_pid_nr_ns(p, event->ns);
|
||||
}
|
||||
|
||||
static void perf_output_read_one(struct perf_output_handle *handle,
|
||||
struct perf_event *event,
|
||||
u64 enabled, u64 running)
|
||||
@ -3603,62 +3739,17 @@ void perf_prepare_sample(struct perf_event_header *header,
|
||||
{
|
||||
u64 sample_type = event->attr.sample_type;
|
||||
|
||||
data->type = sample_type;
|
||||
|
||||
header->type = PERF_RECORD_SAMPLE;
|
||||
header->size = sizeof(*header);
|
||||
header->size = sizeof(*header) + event->header_size;
|
||||
|
||||
header->misc = 0;
|
||||
header->misc |= perf_misc_flags(regs);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_IP) {
|
||||
__perf_event_header__init_id(header, data, event);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_IP)
|
||||
data->ip = perf_instruction_pointer(regs);
|
||||
|
||||
header->size += sizeof(data->ip);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TID) {
|
||||
/* namespace issues */
|
||||
data->tid_entry.pid = perf_event_pid(event, current);
|
||||
data->tid_entry.tid = perf_event_tid(event, current);
|
||||
|
||||
header->size += sizeof(data->tid_entry);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_TIME) {
|
||||
data->time = perf_clock();
|
||||
|
||||
header->size += sizeof(data->time);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ADDR)
|
||||
header->size += sizeof(data->addr);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_ID) {
|
||||
data->id = primary_event_id(event);
|
||||
|
||||
header->size += sizeof(data->id);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_STREAM_ID) {
|
||||
data->stream_id = event->id;
|
||||
|
||||
header->size += sizeof(data->stream_id);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_CPU) {
|
||||
data->cpu_entry.cpu = raw_smp_processor_id();
|
||||
data->cpu_entry.reserved = 0;
|
||||
|
||||
header->size += sizeof(data->cpu_entry);
|
||||
}
|
||||
|
||||
if (sample_type & PERF_SAMPLE_PERIOD)
|
||||
header->size += sizeof(data->period);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_READ)
|
||||
header->size += perf_event_read_size(event);
|
||||
|
||||
if (sample_type & PERF_SAMPLE_CALLCHAIN) {
|
||||
int size = 1;
|
||||
|
||||
@ -3722,23 +3813,26 @@ perf_event_read_event(struct perf_event *event,
|
||||
struct task_struct *task)
|
||||
{
|
||||
struct perf_output_handle handle;
|
||||
struct perf_sample_data sample;
|
||||
struct perf_read_event read_event = {
|
||||
.header = {
|
||||
.type = PERF_RECORD_READ,
|
||||
.misc = 0,
|
||||
.size = sizeof(read_event) + perf_event_read_size(event),
|
||||
.size = sizeof(read_event) + event->read_size,
|
||||
},
|
||||
.pid = perf_event_pid(event, task),
|
||||
.tid = perf_event_tid(event, task),
|
||||
};
|
||||
int ret;
|
||||
|
||||
perf_event_header__init_id(&read_event.header, &sample, event);
|
||||
ret = perf_output_begin(&handle, event, read_event.header.size, 0, 0);
|
||||
if (ret)
|
||||
return;
|
||||
|
||||
perf_output_put(&handle, read_event);
|
||||
perf_output_read(&handle, event);
|
||||
perf_event__output_id_sample(event, &handle, &sample);
|
||||
|
||||
perf_output_end(&handle);
|
||||
}
|
||||
@ -3768,14 +3862,16 @@ static void perf_event_task_output(struct perf_event *event,
|
||||
struct perf_task_event *task_event)
|
||||
{
|
||||
struct perf_output_handle handle;
|
||||
struct perf_sample_data sample;
|
||||
struct task_struct *task = task_event->task;
|
||||
int size, ret;
|
||||
int ret, size = task_event->event_id.header.size;
|
||||
|
||||
size = task_event->event_id.header.size;
|
||||
ret = perf_output_begin(&handle, event, size, 0, 0);
|
||||
perf_event_header__init_id(&task_event->event_id.header, &sample, event);
|
||||
|
||||
ret = perf_output_begin(&handle, event,
|
||||
task_event->event_id.header.size, 0, 0);
|
||||
if (ret)
|
||||
return;
|
||||
goto out;
|
||||
|
||||
task_event->event_id.pid = perf_event_pid(event, task);
|
||||
task_event->event_id.ppid = perf_event_pid(event, current);
|
||||
@ -3785,7 +3881,11 @@ static void perf_event_task_output(struct perf_event *event,
|
||||
|
||||
perf_output_put(&handle, task_event->event_id);
|
||||
|
||||
perf_event__output_id_sample(event, &handle, &sample);
|
||||
|
||||
perf_output_end(&handle);
|
||||
out:
|
||||
task_event->event_id.header.size = size;
|
||||
}
|
||||
|
||||
static int perf_event_task_match(struct perf_event *event)
|
||||
@ -3900,11 +4000,16 @@ static void perf_event_comm_output(struct perf_event *event,
|
||||
struct perf_comm_event *comm_event)
|
||||
{
|
||||
struct perf_output_handle handle;
|
||||
struct perf_sample_data sample;
|
||||
int size = comm_event->event_id.header.size;
|
||||
int ret = perf_output_begin(&handle, event, size, 0, 0);
|
||||
int ret;
|
||||
|
||||
perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
|
||||
ret = perf_output_begin(&handle, event,
|
||||
comm_event->event_id.header.size, 0, 0);
|
||||
|
||||
if (ret)
|
||||
return;
|
||||
goto out;
|
||||
|
||||
comm_event->event_id.pid = perf_event_pid(event, comm_event->task);
|
||||
comm_event->event_id.tid = perf_event_tid(event, comm_event->task);
|
||||
@ -3912,7 +4017,12 @@ static void perf_event_comm_output(struct perf_event *event,
|
||||
perf_output_put(&handle, comm_event->event_id);
|
||||
perf_output_copy(&handle, comm_event->comm,
|
||||
comm_event->comm_size);
|
||||
|
||||
perf_event__output_id_sample(event, &handle, &sample);
|
||||
|
||||
perf_output_end(&handle);
|
||||
out:
|
||||
comm_event->event_id.header.size = size;
|
||||
}
|
||||
|
||||
static int perf_event_comm_match(struct perf_event *event)
|
||||
@ -3957,7 +4067,6 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event)
|
||||
comm_event->comm_size = size;
|
||||
|
||||
comm_event->event_id.header.size = sizeof(comm_event->event_id) + size;
|
||||
|
||||
rcu_read_lock();
|
||||
list_for_each_entry_rcu(pmu, &pmus, entry) {
|
||||
cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
|
||||
@ -4038,11 +4147,15 @@ static void perf_event_mmap_output(struct perf_event *event,
|
||||
struct perf_mmap_event *mmap_event)
|
||||
{
|
||||
struct perf_output_handle handle;
|
||||
struct perf_sample_data sample;
|
||||
int size = mmap_event->event_id.header.size;
|
||||
int ret = perf_output_begin(&handle, event, size, 0, 0);
|
||||
int ret;
|
||||
|
||||
perf_event_header__init_id(&mmap_event->event_id.header, &sample, event);
|
||||
ret = perf_output_begin(&handle, event,
|
||||
mmap_event->event_id.header.size, 0, 0);
|
||||
if (ret)
|
||||
return;
|
||||
goto out;
|
||||
|
||||
mmap_event->event_id.pid = perf_event_pid(event, current);
|
||||
mmap_event->event_id.tid = perf_event_tid(event, current);
|
||||
@ -4050,7 +4163,12 @@ static void perf_event_mmap_output(struct perf_event *event,
|
||||
perf_output_put(&handle, mmap_event->event_id);
|
||||
perf_output_copy(&handle, mmap_event->file_name,
|
||||
mmap_event->file_size);
|
||||
|
||||
perf_event__output_id_sample(event, &handle, &sample);
|
||||
|
||||
perf_output_end(&handle);
|
||||
out:
|
||||
mmap_event->event_id.header.size = size;
|
||||
}
|
||||
|
||||
static int perf_event_mmap_match(struct perf_event *event,
|
||||
@@ -4205,6 +4323,7 @@ void perf_event_mmap(struct vm_area_struct *vma)
 static void perf_log_throttle(struct perf_event *event, int enable)
 {
	struct perf_output_handle handle;
	struct perf_sample_data sample;
	int ret;

	struct {
@@ -4226,11 +4345,15 @@ static void perf_log_throttle(struct perf_event *event, int enable)
	if (enable)
		throttle_event.header.type = PERF_RECORD_UNTHROTTLE;

	ret = perf_output_begin(&handle, event, sizeof(throttle_event), 1, 0);
	perf_event_header__init_id(&throttle_event.header, &sample, event);

	ret = perf_output_begin(&handle, event,
				throttle_event.header.size, 1, 0);
	if (ret)
		return;

	perf_output_put(&handle, throttle_event);
	perf_event__output_id_sample(event, &handle, &sample);
	perf_output_end(&handle);
 }

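The side-band output hunks in this file all follow the same new pattern: perf_event_header__init_id() sizes and fills the optional ID sample for the event, perf_output_begin() then reserves the (now larger) record, and perf_event__output_id_sample() appends the ID data before perf_output_end(). A minimal sketch of that pattern, meant to sit alongside the functions above in kernel/perf_event.c; the record layout and PERF_RECORD_MY_EVENT are hypothetical, used only to illustrate the flow:

/*
 * Illustrative sketch only: how a new side-band record would pick up
 * the optional ID sample data after this change. PERF_RECORD_MY_EVENT
 * and the payload field are made up for the example.
 */
static void perf_log_my_event(struct perf_event *event)
{
	struct perf_output_handle handle;
	struct perf_sample_data sample;
	int ret;

	struct {
		struct perf_event_header header;
		u64 payload;
	} rec = {
		.header = {
			.type = PERF_RECORD_MY_EVENT,	/* hypothetical */
			.misc = 0,
			.size = sizeof(rec),
		},
		.payload = 0,
	};

	/* Grows rec.header.size by the ID sample bytes this event asked for. */
	perf_event_header__init_id(&rec.header, &sample, event);

	ret = perf_output_begin(&handle, event, rec.header.size, 0, 0);
	if (ret)
		return;

	perf_output_put(&handle, rec);
	/* Appends TID/TIME/ID/STREAM_ID/CPU as selected in attr.sample_type. */
	perf_event__output_id_sample(event, &handle, &sample);
	perf_output_end(&handle);
}
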
@@ -4246,6 +4369,13 @@ static int __perf_event_overflow(struct perf_event *event, int nmi,
	struct hw_perf_event *hwc = &event->hw;
	int ret = 0;

	/*
	 * Non-sampling counters might still use the PMI to fold short
	 * hardware counters, ignore those.
	 */
	if (unlikely(!is_sampling_event(event)))
		return 0;

	if (!throttle) {
		hwc->interrupts++;
	} else {
@@ -4391,7 +4521,7 @@ static void perf_swevent_event(struct perf_event *event, u64 nr,
	if (!regs)
		return;

	if (!hwc->sample_period)
	if (!is_sampling_event(event))
		return;

	if (nr == 1 && hwc->sample_period == 1 && !event->attr.freq)
@@ -4554,7 +4684,7 @@ static int perf_swevent_add(struct perf_event *event, int flags)
	struct hw_perf_event *hwc = &event->hw;
	struct hlist_head *head;

	if (hwc->sample_period) {
	if (is_sampling_event(event)) {
		hwc->last_period = hwc->sample_period;
		perf_swevent_set_period(event);
	}
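These hunks replace open-coded hwc->sample_period tests with the new is_sampling_event() helper. Its definition is not part of this excerpt; presumably it is a trivial inline in include/linux/perf_event.h along these lines (a sketch of the intent, not the verbatim kernel code):

/*
 * Hedged sketch of the helper the hunks above switch to: "does this
 * event sample at all?". The real definition may differ in detail.
 */
static inline int is_sampling_event(struct perf_event *event)
{
	return event->attr.sample_period != 0;
}
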
@ -4811,15 +4941,6 @@ static int perf_tp_event_init(struct perf_event *event)
|
||||
if (event->attr.type != PERF_TYPE_TRACEPOINT)
|
||||
return -ENOENT;
|
||||
|
||||
/*
|
||||
* Raw tracepoint data is a severe data leak, only allow root to
|
||||
* have these.
|
||||
*/
|
||||
if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
|
||||
perf_paranoid_tracepoint_raw() &&
|
||||
!capable(CAP_SYS_ADMIN))
|
||||
return -EPERM;
|
||||
|
||||
err = perf_trace_init(event);
|
||||
if (err)
|
||||
return err;
|
||||
@ -4842,7 +4963,7 @@ static struct pmu perf_tracepoint = {
|
||||
|
||||
static inline void perf_tp_register(void)
|
||||
{
|
||||
perf_pmu_register(&perf_tracepoint);
|
||||
perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
|
||||
}
|
||||
|
||||
static int perf_event_set_filter(struct perf_event *event, void __user *arg)
|
||||
@ -4932,31 +5053,33 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
|
||||
static void perf_swevent_start_hrtimer(struct perf_event *event)
|
||||
{
|
||||
struct hw_perf_event *hwc = &event->hw;
|
||||
s64 period;
|
||||
|
||||
if (!is_sampling_event(event))
|
||||
return;
|
||||
|
||||
hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
|
||||
hwc->hrtimer.function = perf_swevent_hrtimer;
|
||||
if (hwc->sample_period) {
|
||||
s64 period = local64_read(&hwc->period_left);
|
||||
|
||||
if (period) {
|
||||
if (period < 0)
|
||||
period = 10000;
|
||||
period = local64_read(&hwc->period_left);
|
||||
if (period) {
|
||||
if (period < 0)
|
||||
period = 10000;
|
||||
|
||||
local64_set(&hwc->period_left, 0);
|
||||
} else {
|
||||
period = max_t(u64, 10000, hwc->sample_period);
|
||||
}
|
||||
__hrtimer_start_range_ns(&hwc->hrtimer,
|
||||
local64_set(&hwc->period_left, 0);
|
||||
} else {
|
||||
period = max_t(u64, 10000, hwc->sample_period);
|
||||
}
|
||||
__hrtimer_start_range_ns(&hwc->hrtimer,
|
||||
ns_to_ktime(period), 0,
|
||||
HRTIMER_MODE_REL_PINNED, 0);
|
||||
}
|
||||
}
|
||||
|
||||
static void perf_swevent_cancel_hrtimer(struct perf_event *event)
|
||||
{
|
||||
struct hw_perf_event *hwc = &event->hw;
|
||||
|
||||
if (hwc->sample_period) {
|
||||
if (is_sampling_event(event)) {
|
||||
ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
|
||||
local64_set(&hwc->period_left, ktime_to_ns(remaining));
|
||||
|
||||
@ -5184,8 +5307,61 @@ static void free_pmu_context(struct pmu *pmu)
|
||||
out:
|
||||
mutex_unlock(&pmus_lock);
|
||||
}
|
||||
static struct idr pmu_idr;

int perf_pmu_register(struct pmu *pmu)
static ssize_t
type_show(struct device *dev, struct device_attribute *attr, char *page)
{
	struct pmu *pmu = dev_get_drvdata(dev);

	return snprintf(page, PAGE_SIZE-1, "%d\n", pmu->type);
}

static struct device_attribute pmu_dev_attrs[] = {
	__ATTR_RO(type),
	__ATTR_NULL,
};

static int pmu_bus_running;
static struct bus_type pmu_bus = {
	.name		= "event_source",
	.dev_attrs	= pmu_dev_attrs,
};

static void pmu_dev_release(struct device *dev)
{
	kfree(dev);
}

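The new "event_source" bus exposes each PMU's type id through the read-only sysfs attribute backed by type_show() above. A hedged userspace sketch of how a tool could consume it; the /sys/bus/event_source/devices/<pmu>/type path is an assumption based on the bus and attribute names registered here, and the helper name is illustrative:

/* Hedged userspace sketch: read a PMU's dynamic type id from the new
 * "event_source" bus and open a counter on it via perf_event_open(). */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

static int open_dynamic_pmu_event(const char *pmu, unsigned long long config)
{
	char path[256];
	FILE *f;
	int type;
	struct perf_event_attr attr;

	/* Assumed sysfs location for the "type" attribute shown above. */
	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/type", pmu);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);

	memset(&attr, 0, sizeof(attr));
	attr.size   = sizeof(attr);
	attr.type   = type;	/* the value printed by type_show() */
	attr.config = config;

	/* Count on the current task, any CPU. */
	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}
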
static int pmu_dev_alloc(struct pmu *pmu)
|
||||
{
|
||||
int ret = -ENOMEM;
|
||||
|
||||
pmu->dev = kzalloc(sizeof(struct device), GFP_KERNEL);
|
||||
if (!pmu->dev)
|
||||
goto out;
|
||||
|
||||
device_initialize(pmu->dev);
|
||||
ret = dev_set_name(pmu->dev, "%s", pmu->name);
|
||||
if (ret)
|
||||
goto free_dev;
|
||||
|
||||
dev_set_drvdata(pmu->dev, pmu);
|
||||
pmu->dev->bus = &pmu_bus;
|
||||
pmu->dev->release = pmu_dev_release;
|
||||
ret = device_add(pmu->dev);
|
||||
if (ret)
|
||||
goto free_dev;
|
||||
|
||||
out:
|
||||
return ret;
|
||||
|
||||
free_dev:
|
||||
put_device(pmu->dev);
|
||||
goto out;
|
||||
}
|
||||
|
||||
int perf_pmu_register(struct pmu *pmu, char *name, int type)
|
||||
{
|
||||
int cpu, ret;
|
||||
|
||||
@ -5195,13 +5371,38 @@ int perf_pmu_register(struct pmu *pmu)
|
||||
if (!pmu->pmu_disable_count)
|
||||
goto unlock;
|
||||
|
||||
pmu->type = -1;
|
||||
if (!name)
|
||||
goto skip_type;
|
||||
pmu->name = name;
|
||||
|
||||
if (type < 0) {
|
||||
int err = idr_pre_get(&pmu_idr, GFP_KERNEL);
|
||||
if (!err)
|
||||
goto free_pdc;
|
||||
|
||||
err = idr_get_new_above(&pmu_idr, pmu, PERF_TYPE_MAX, &type);
|
||||
if (err) {
|
||||
ret = err;
|
||||
goto free_pdc;
|
||||
}
|
||||
}
|
||||
pmu->type = type;
|
||||
|
||||
if (pmu_bus_running) {
|
||||
ret = pmu_dev_alloc(pmu);
|
||||
if (ret)
|
||||
goto free_idr;
|
||||
}
|
||||
|
||||
skip_type:
|
||||
pmu->pmu_cpu_context = find_pmu_context(pmu->task_ctx_nr);
|
||||
if (pmu->pmu_cpu_context)
|
||||
goto got_cpu_context;
|
||||
|
||||
pmu->pmu_cpu_context = alloc_percpu(struct perf_cpu_context);
|
||||
if (!pmu->pmu_cpu_context)
|
||||
goto free_pdc;
|
||||
goto free_dev;
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
struct perf_cpu_context *cpuctx;
|
||||
@ -5245,6 +5446,14 @@ unlock:
|
||||
|
||||
return ret;
|
||||
|
||||
free_dev:
|
||||
device_del(pmu->dev);
|
||||
put_device(pmu->dev);
|
||||
|
||||
free_idr:
|
||||
if (pmu->type >= PERF_TYPE_MAX)
|
||||
idr_remove(&pmu_idr, pmu->type);
|
||||
|
||||
free_pdc:
|
||||
free_percpu(pmu->pmu_disable_count);
|
||||
goto unlock;
|
||||
@ -5264,6 +5473,10 @@ void perf_pmu_unregister(struct pmu *pmu)
|
||||
synchronize_rcu();
|
||||
|
||||
free_percpu(pmu->pmu_disable_count);
|
||||
if (pmu->type >= PERF_TYPE_MAX)
|
||||
idr_remove(&pmu_idr, pmu->type);
|
||||
device_del(pmu->dev);
|
||||
put_device(pmu->dev);
|
||||
free_pmu_context(pmu);
|
||||
}
|
||||
|
||||
@ -5273,6 +5486,13 @@ struct pmu *perf_init_event(struct perf_event *event)
|
||||
int idx;
|
||||
|
||||
idx = srcu_read_lock(&pmus_srcu);
|
||||
|
||||
rcu_read_lock();
|
||||
pmu = idr_find(&pmu_idr, event->attr.type);
|
||||
rcu_read_unlock();
|
||||
if (pmu)
|
||||
goto unlock;
|
||||
|
||||
list_for_each_entry_rcu(pmu, &pmus, entry) {
|
||||
int ret = pmu->event_init(event);
|
||||
if (!ret)
|
||||
@ -5737,6 +5957,12 @@ SYSCALL_DEFINE5(perf_event_open,
|
||||
list_add_tail(&event->owner_entry, ¤t->perf_event_list);
|
||||
mutex_unlock(¤t->perf_event_mutex);
|
||||
|
||||
/*
|
||||
* Precalculate sample_data sizes
|
||||
*/
|
||||
perf_event__header_size(event);
|
||||
perf_event__id_header_size(event);
|
||||
|
||||
/*
|
||||
* Drop the reference on the group_event after placing the
|
||||
* new event on the sibling_list. This ensures destruction
|
||||
@ -6089,6 +6315,12 @@ inherit_event(struct perf_event *parent_event,
|
||||
child_event->ctx = child_ctx;
|
||||
child_event->overflow_handler = parent_event->overflow_handler;
|
||||
|
||||
/*
|
||||
* Precalculate sample_data sizes
|
||||
*/
|
||||
perf_event__header_size(child_event);
|
||||
perf_event__id_header_size(child_event);
|
||||
|
||||
/*
|
||||
* Link it up in the child's context:
|
||||
*/
|
||||
@ -6320,7 +6552,7 @@ static void __cpuinit perf_event_init_cpu(int cpu)
|
||||
mutex_unlock(&swhash->hlist_mutex);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_HOTPLUG_CPU
|
||||
#if defined CONFIG_HOTPLUG_CPU || defined CONFIG_KEXEC
|
||||
static void perf_pmu_rotate_stop(struct pmu *pmu)
|
||||
{
|
||||
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
|
||||
@ -6374,6 +6606,26 @@ static void perf_event_exit_cpu(int cpu)
|
||||
static inline void perf_event_exit_cpu(int cpu) { }
|
||||
#endif
|
||||
|
||||
static int
|
||||
perf_reboot(struct notifier_block *notifier, unsigned long val, void *v)
|
||||
{
|
||||
int cpu;
|
||||
|
||||
for_each_online_cpu(cpu)
|
||||
perf_event_exit_cpu(cpu);
|
||||
|
||||
return NOTIFY_OK;
|
||||
}
|
||||
|
||||
/*
|
||||
* Run the perf reboot notifier at the very last possible moment so that
|
||||
* the generic watchdog code runs as long as possible.
|
||||
*/
|
||||
static struct notifier_block perf_reboot_notifier = {
|
||||
.notifier_call = perf_reboot,
|
||||
.priority = INT_MIN,
|
||||
};
|
||||
|
||||
static int __cpuinit
|
||||
perf_cpu_notify(struct notifier_block *self, unsigned long action, void *hcpu)
|
||||
{
|
||||
@@ -6402,14 +6654,45 @@ void __init perf_event_init(void)
{
	int ret;

	idr_init(&pmu_idr);

	perf_event_init_all_cpus();
	init_srcu_struct(&pmus_srcu);
	perf_pmu_register(&perf_swevent);
	perf_pmu_register(&perf_cpu_clock);
	perf_pmu_register(&perf_task_clock);
	perf_pmu_register(&perf_swevent, "software", PERF_TYPE_SOFTWARE);
	perf_pmu_register(&perf_cpu_clock, NULL, -1);
	perf_pmu_register(&perf_task_clock, NULL, -1);
	perf_tp_register();
	perf_cpu_notifier(perf_cpu_notify);
	register_reboot_notifier(&perf_reboot_notifier);

	ret = init_hw_breakpoint();
	WARN(ret, "hw_breakpoint initialization failed with: %d", ret);
}

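The call sites above show the new three-argument perf_pmu_register(): a PMU name (or NULL to stay anonymous) and either a fixed PERF_TYPE_* id or -1 to request a dynamic id from pmu_idr. A hedged sketch of how a driver-provided PMU might use it; "mydrv" and the stub callbacks are illustrative, with the callback signatures assumed from the struct pmu usage elsewhere in this file:

/*
 * Hedged sketch only: registering a driver PMU against the new
 * perf_pmu_register(pmu, name, type) signature. Passing -1 asks the
 * core for a dynamic type id, which is then visible through the
 * event_source sysfs attribute added earlier in this diff.
 */
#include <linux/perf_event.h>
#include <linux/errno.h>
#include <linux/init.h>

static struct pmu mydrv_pmu;

static int mydrv_event_init(struct perf_event *event)
{
	/* Only claim events opened with the type id we were assigned. */
	if (event->attr.type != mydrv_pmu.type)
		return -ENOENT;
	return 0;
}

static int  mydrv_add(struct perf_event *event, int flags)   { return 0; }
static void mydrv_del(struct perf_event *event, int flags)   { }
static void mydrv_start(struct perf_event *event, int flags) { }
static void mydrv_stop(struct perf_event *event, int flags)  { }
static void mydrv_read(struct perf_event *event)             { }

static struct pmu mydrv_pmu = {
	.event_init	= mydrv_event_init,
	.add		= mydrv_add,
	.del		= mydrv_del,
	.start		= mydrv_start,
	.stop		= mydrv_stop,
	.read		= mydrv_read,
};

static int __init mydrv_pmu_init(void)
{
	/* NULL name + fixed PERF_TYPE_* would also be valid, as above. */
	return perf_pmu_register(&mydrv_pmu, "mydrv", -1);
}
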
static int __init perf_event_sysfs_init(void)
{
	struct pmu *pmu;
	int ret;

	mutex_lock(&pmus_lock);

	ret = bus_register(&pmu_bus);
	if (ret)
		goto unlock;

	list_for_each_entry(pmu, &pmus, entry) {
		if (!pmu->name || pmu->type < 0)
			continue;

		ret = pmu_dev_alloc(pmu);
		WARN(ret, "Failed to register pmu: %s, reason %d\n", pmu->name, ret);
	}
	pmu_bus_running = 1;
	ret = 0;

unlock:
	mutex_unlock(&pmus_lock);

	return ret;
}
device_initcall(perf_event_sysfs_init);

@@ -22,6 +22,7 @@
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/suspend.h>
#include <trace/events/power.h>

#include "power.h"

@@ -201,6 +202,7 @@ int suspend_devices_and_enter(suspend_state_t state)
	if (!suspend_ops)
		return -ENOSYS;

	trace_machine_suspend(state);
	if (suspend_ops->begin) {
		error = suspend_ops->begin(state);
		if (error)
@@ -229,6 +231,7 @@ int suspend_devices_and_enter(suspend_state_t state)
 Close:
	if (suspend_ops->end)
		suspend_ops->end();
	trace_machine_suspend(PWR_EVENT_EXIT);
	return error;

 Recover_platform:

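suspend_devices_and_enter() is now bracketed by trace_machine_suspend(), entering with the target state and leaving with PWR_EVENT_EXIT. The event itself is declared in include/trace/events/power.h, which is not shown in this excerpt; the following is only a rough illustration of what such a one-argument TRACE_EVENT declaration looks like, not the actual header contents (the real declaration may use a shared event class):

/* Illustrative TRACE_EVENT sketch for a one-argument power event. */
TRACE_EVENT(machine_suspend,

	TP_PROTO(unsigned int state),

	TP_ARGS(state),

	TP_STRUCT__entry(
		__field(u32, state)
	),

	TP_fast_assign(
		__entry->state = state;
	),

	TP_printk("state=%lu", (unsigned long)__entry->state)
);
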
@@ -8293,8 +8293,6 @@ void __init sched_init(void)
	zalloc_cpumask_var(&cpu_isolated_map, GFP_NOWAIT);
#endif /* SMP */

	perf_event_init();

	scheduler_running = 1;
}

@ -745,8 +745,15 @@ static struct ctl_table kern_table[] = {
|
||||
.extra1 = &zero,
|
||||
.extra2 = &one,
|
||||
},
|
||||
{
|
||||
.procname = "nmi_watchdog",
|
||||
.data = &watchdog_enabled,
|
||||
.maxlen = sizeof (int),
|
||||
.mode = 0644,
|
||||
.proc_handler = proc_dowatchdog_enabled,
|
||||
},
|
||||
#endif
|
||||
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
|
||||
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
|
||||
{
|
||||
.procname = "unknown_nmi_panic",
|
||||
.data = &unknown_nmi_panic,
|
||||
@ -754,13 +761,6 @@ static struct ctl_table kern_table[] = {
|
||||
.mode = 0644,
|
||||
.proc_handler = proc_dointvec,
|
||||
},
|
||||
{
|
||||
.procname = "nmi_watchdog",
|
||||
.data = &nmi_watchdog_enabled,
|
||||
.maxlen = sizeof (int),
|
||||
.mode = 0644,
|
||||
.proc_handler = proc_nmi_enabled,
|
||||
},
|
||||
#endif
|
||||
#if defined(CONFIG_X86)
|
||||
{
|
||||
|
@ -136,7 +136,6 @@ static const struct bin_table bin_kern_table[] = {
|
||||
{ CTL_INT, KERN_IA64_UNALIGNED, "ignore-unaligned-usertrap" },
|
||||
{ CTL_INT, KERN_COMPAT_LOG, "compat-log" },
|
||||
{ CTL_INT, KERN_MAX_LOCK_DEPTH, "max_lock_depth" },
|
||||
{ CTL_INT, KERN_NMI_WATCHDOG, "nmi_watchdog" },
|
||||
{ CTL_INT, KERN_PANIC_ON_NMI, "panic_on_unrecovered_nmi" },
|
||||
{}
|
||||
};
|
||||
|
@@ -69,6 +69,21 @@ config EVENT_TRACING
	select CONTEXT_SWITCH_TRACER
	bool

config EVENT_POWER_TRACING_DEPRECATED
	depends on EVENT_TRACING
	bool "Deprecated power event trace API, to be removed"
	default y
	help
	  Provides old power event types:
	  C-state/idle accounting events:
	  power:power_start
	  power:power_end
	  and old cpufreq accounting event:
	  power:power_frequency
	  This is for userspace compatibility
	  and will vanish after 5 kernel iterations,
	  namely 2.6.41.

config CONTEXT_SWITCH_TRACER
	bool

@@ -13,5 +13,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>

EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
#ifdef EVENT_POWER_TRACING_DEPRECATED
EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
#endif
EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);

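With the deprecated exports now guarded by EVENT_POWER_TRACING_DEPRECATED, new code is expected to emit the generic events such as power:cpu_idle instead. A hedged sketch of the idle entry/exit convention, assuming trace_cpu_idle() takes a state and a cpu id and that PWR_EVENT_EXIT marks the exit, mirroring the trace_machine_suspend() usage earlier in this diff:

/*
 * Hedged sketch: reporting idle entry/exit with the new generic event.
 * The exact trace_cpu_idle() prototype is assumed, and my_idle_enter()
 * and my_idle_exit() are illustrative driver hooks, not kernel APIs.
 */
static void my_idle_enter(unsigned int cstate)
{
	trace_cpu_idle(cstate, smp_processor_id());
	/* ... architecture-specific idle entry ... */
}

static void my_idle_exit(void)
{
	/* ... woken up ... */
	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
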
@ -21,17 +21,46 @@ typedef typeof(unsigned long [PERF_MAX_TRACE_SIZE / sizeof(unsigned long)])
|
||||
/* Count the events in use (per event id, not per instance) */
|
||||
static int total_ref_count;
|
||||
|
||||
static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
|
||||
struct perf_event *p_event)
|
||||
{
|
||||
/* No tracing, just counting, so no obvious leak */
|
||||
if (!(p_event->attr.sample_type & PERF_SAMPLE_RAW))
|
||||
return 0;
|
||||
|
||||
/* Some events are ok to be traced by non-root users... */
|
||||
if (p_event->attach_state == PERF_ATTACH_TASK) {
|
||||
if (tp_event->flags & TRACE_EVENT_FL_CAP_ANY)
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* ...otherwise raw tracepoint data can be a severe data leak,
|
||||
* only allow root to have these.
|
||||
*/
|
||||
if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
|
||||
return -EPERM;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int perf_trace_event_init(struct ftrace_event_call *tp_event,
|
||||
struct perf_event *p_event)
|
||||
{
|
||||
struct hlist_head __percpu *list;
|
||||
int ret = -ENOMEM;
|
||||
int ret;
|
||||
int cpu;
|
||||
|
||||
ret = perf_trace_event_perm(tp_event, p_event);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
p_event->tp_event = tp_event;
|
||||
if (tp_event->perf_refcount++ > 0)
|
||||
return 0;
|
||||
|
||||
ret = -ENOMEM;
|
||||
|
||||
list = alloc_percpu(struct hlist_head);
|
||||
if (!list)
|
||||
goto fail;
|
||||
|
@@ -27,6 +27,12 @@

DEFINE_MUTEX(event_mutex);

DEFINE_MUTEX(event_storage_mutex);
EXPORT_SYMBOL_GPL(event_storage_mutex);

char event_storage[EVENT_STORAGE_SIZE];
EXPORT_SYMBOL_GPL(event_storage);

LIST_HEAD(ftrace_events);
LIST_HEAD(ftrace_common_fields);

@ -83,13 +83,19 @@ static void __always_unused ____ftrace_check_##name(void) \
|
||||
|
||||
#undef __array
|
||||
#define __array(type, item, len) \
|
||||
BUILD_BUG_ON(len > MAX_FILTER_STR_VAL); \
|
||||
ret = trace_define_field(event_call, #type "[" #len "]", #item, \
|
||||
do { \
|
||||
BUILD_BUG_ON(len > MAX_FILTER_STR_VAL); \
|
||||
mutex_lock(&event_storage_mutex); \
|
||||
snprintf(event_storage, sizeof(event_storage), \
|
||||
"%s[%d]", #type, len); \
|
||||
ret = trace_define_field(event_call, event_storage, #item, \
|
||||
offsetof(typeof(field), item), \
|
||||
sizeof(field.item), \
|
||||
is_signed_type(type), FILTER_OTHER); \
|
||||
if (ret) \
|
||||
return ret;
|
||||
mutex_unlock(&event_storage_mutex); \
|
||||
if (ret) \
|
||||
return ret; \
|
||||
} while (0);
|
||||
|
||||
#undef __array_desc
|
||||
#define __array_desc(type, container, item, len) \
|
||||
|
@ -57,6 +57,8 @@ static int __init hardlockup_panic_setup(char *str)
|
||||
{
|
||||
if (!strncmp(str, "panic", 5))
|
||||
hardlockup_panic = 1;
|
||||
else if (!strncmp(str, "0", 1))
|
||||
no_watchdog = 1;
|
||||
return 1;
|
||||
}
|
||||
__setup("nmi_watchdog=", hardlockup_panic_setup);
|
||||
@ -548,13 +550,13 @@ static struct notifier_block __cpuinitdata cpu_nfb = {
|
||||
.notifier_call = cpu_callback
|
||||
};
|
||||
|
||||
static int __init spawn_watchdog_task(void)
|
||||
void __init lockup_detector_init(void)
|
||||
{
|
||||
void *cpu = (void *)(long)smp_processor_id();
|
||||
int err;
|
||||
|
||||
if (no_watchdog)
|
||||
return 0;
|
||||
return;
|
||||
|
||||
err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
|
||||
WARN_ON(notifier_to_errno(err));
|
||||
@ -562,6 +564,5 @@ static int __init spawn_watchdog_task(void)
|
||||
cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
|
||||
register_cpu_notifier(&cpu_nfb);
|
||||
|
||||
return 0;
|
||||
return;
|
||||
}
|
||||
early_initcall(spawn_watchdog_task);
|
||||
|
@ -173,7 +173,8 @@ config LOCKUP_DETECTOR
|
||||
An NMI is generated every 60 seconds or so to check for hardlockups.
|
||||
|
||||
config HARDLOCKUP_DETECTOR
|
||||
def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI
|
||||
def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI && \
|
||||
!ARCH_HAS_NMI_WATCHDOG
|
||||
|
||||
config BOOTPARAM_SOFTLOCKUP_PANIC
|
||||
bool "Panic (Reboot) On Soft Lockups"
|
||||
|
@ -214,17 +214,22 @@ ifdef BUILD_C_RECORDMCOUNT
|
||||
# The empty.o file is created in the make process in order to determine
|
||||
# the target endianness and word size. It is made before all other C
|
||||
# files, including recordmcount.
|
||||
cmd_record_mcount = if [ $(@) != "scripts/mod/empty.o" ]; then \
|
||||
$(objtree)/scripts/recordmcount "$(@)"; \
|
||||
fi;
|
||||
sub_cmd_record_mcount = \
|
||||
if [ $(@) != "scripts/mod/empty.o" ]; then \
|
||||
$(objtree)/scripts/recordmcount "$(@)"; \
|
||||
fi;
|
||||
else
|
||||
cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
|
||||
sub_cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
|
||||
"$(if $(CONFIG_CPU_BIG_ENDIAN),big,little)" \
|
||||
"$(if $(CONFIG_64BIT),64,32)" \
|
||||
"$(OBJDUMP)" "$(OBJCOPY)" "$(CC) $(KBUILD_CFLAGS)" \
|
||||
"$(LD)" "$(NM)" "$(RM)" "$(MV)" \
|
||||
"$(if $(part-of-module),1,0)" "$(@)";
|
||||
endif
|
||||
cmd_record_mcount = \
|
||||
if [ "$(findstring -pg,$(_c_flags))" = "-pg" ]; then \
|
||||
$(sub_cmd_record_mcount) \
|
||||
fi;
|
||||
endif
|
||||
|
||||
define rule_cc_o_c
|
||||
|
@ -24,12 +24,47 @@ OPTIONS
|
||||
--input=::
|
||||
Input file name. (default: perf.data)
|
||||
|
||||
-d::
|
||||
--dsos=<dso[,dso...]>::
|
||||
Only consider symbols in these dsos.
|
||||
-s::
|
||||
--symbol=<symbol>::
|
||||
Symbol to annotate.
|
||||
|
||||
-f::
|
||||
--force::
|
||||
Don't complain, do it.
|
||||
|
||||
-v::
|
||||
--verbose::
|
||||
Be more verbose. (Show symbol address, etc)
|
||||
|
||||
-D::
|
||||
--dump-raw-trace::
|
||||
Dump raw trace in ASCII.
|
||||
|
||||
-k::
|
||||
--vmlinux=<file>::
|
||||
vmlinux pathname.
|
||||
|
||||
-m::
|
||||
--modules::
|
||||
Load module symbols. WARNING: use only with -k and LIVE kernel.
|
||||
|
||||
-l::
|
||||
--print-line::
|
||||
Print matching source lines (may be slow).
|
||||
|
||||
-P::
|
||||
--full-paths::
|
||||
Don't shorten the displayed pathnames.
|
||||
|
||||
--stdio:: Use the stdio interface.
|
||||
|
||||
--tui:: Use the TUI interface Use of --tui requires a tty, if one is not
|
||||
present, as when piping to other commands, the stdio interface is
|
||||
used. This interfaces starts by centering on the line with more
|
||||
samples, TAB/UNTAB cycles thru the lines with more samples.
|
||||
samples, TAB/UNTAB cycles through the lines with more samples.
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
|
@@ -18,6 +18,9 @@ perf report.

OPTIONS
-------
-H::
--with-hits::
        Show only DSOs with hits.
-i::
--input=::
        Input file name. (default: perf.data)

@ -19,6 +19,18 @@ If no parameters are passed it will assume perf.data.old and perf.data.
|
||||
|
||||
OPTIONS
|
||||
-------
|
||||
-M::
|
||||
--displacement::
|
||||
Show position displacement relative to baseline.
|
||||
|
||||
-D::
|
||||
--dump-raw-trace::
|
||||
Dump raw trace in ASCII.
|
||||
|
||||
-m::
|
||||
--modules::
|
||||
Load module symbols. WARNING: use only with -k and LIVE kernel
|
||||
|
||||
-d::
|
||||
--dsos=::
|
||||
Only consider symbols in these dsos. CSV that understands
|
||||
@ -42,7 +54,7 @@ OPTIONS
|
||||
--field-separator=::
|
||||
|
||||
Use a special separator character and don't pad with spaces, replacing
|
||||
all occurances of this separator in symbol names (and other output)
|
||||
all occurrences of this separator in symbol names (and other output)
|
||||
with a '.' character, that thus it's the only non valid separator.
|
||||
|
||||
-v::
|
||||
@ -50,6 +62,13 @@ OPTIONS
|
||||
Be verbose, for instance, show the raw counts in addition to the
|
||||
diff.
|
||||
|
||||
-f::
|
||||
--force::
|
||||
Don't complain, do it.
|
||||
|
||||
--symfs=<directory>::
|
||||
Look for files with symbols relative to this directory.
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
linkperf:perf-record[1]
|
||||
|
@@ -22,7 +22,7 @@ There are a couple of variants of perf kvm:
  a performance counter profile of guest os in realtime
  of an arbitrary workload.

  'perf kvm record <command>' to record the performance couinter profile
  'perf kvm record <command>' to record the performance counter profile
  of an arbitrary workload and save it into a perf data file. If both
  --host and --guest are input, the perf data file name is perf.data.kvm.
  If there is no --host but --guest, the file name is perf.data.guest.
@@ -40,6 +40,12 @@ There are a couple of variants of perf kvm:

OPTIONS
-------
-i::
--input=::
        Input file name.
-o::
--output::
        Output file name.
--host=::
        Collect host side performance profile.
--guest=::