mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (synced 2025-01-01 18:55:12 +00:00)
perf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents

TPEBS (Timed PEBS(Precise Event-Based Sampling)) is a new feature Intel PMU from
Granite Rapids microarchitecture. It will be used in new TMA (Top-Down
Microarchitecture Analysis) releases. Add related introduction to documents
while adding new code to support it in 'perf stat'.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-8-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent d546e3acf3
commit 169f18fd98
@@ -72,6 +72,7 @@ counted. The following modifiers exist:
W - group is weak and will fallback to non-group if not schedulable,
e - group or event are exclusive and do not share the PMU
b - use BPF aggregration (see perf stat --bpf-counters)
R - retire latency value of the event

The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times:
@@ -325,6 +325,36 @@ other four level 2 metrics by subtracting corresponding metrics as below.
Fetch_Bandwidth = Frontend_Bound - Fetch_Latency
Core_Bound = Backend_Bound - Memory_Bound

TPEBS in TopDown
================

TPEBS (Timed PEBS) is one of the new Intel PMU features provided since Granite
Rapids microarchitecture. The TPEBS feature adds a 16 bit retire_latency field
in the Basic Info group of the PEBS record. It records the Core cycles since the
retirement of the previous instruction to the retirement of current instruction.
Please refer to Section 8.4.1 of "Intel® Architecture Instruction Set Extensions
Programming Reference" for more details about this feature. Because this feature
extends PEBS record, sampling with weight option is required to get the
retire_latency value.

perf record -e event_name -W ...

In the most recent release of TMA, the metrics begin to use event retire_latency
values in some of the metrics’ formulas on processors that support TPEBS feature.
For previous generations that do not support TPEBS, the values are static and
predefined per processor family by the hardware architects. Due to the diversity
of workloads in execution environments, retire_latency values measured at real
time are more accurate. Therefore, new TMA metrics that use TPEBS will provide
more accurate performance analysis results.

To support TPEBS in TMA metrics, a new modifier :R on event is added. Perf would
capture retire_latency value of required events(event with :R in metric formula)
with perf record. The retire_latency value would be used in metric calculation.
Currently, this feature is supported through perf stat

perf stat -M metric_name --record-tpebs ...



[1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
[2] https://sites.google.com/site/analysismethods/yasin-pubs
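
The role retire_latency plays in a metric, as described in the added TPEBS section, can be sketched in Python. This is an illustration only and not code from this commit: the static fallback value, the sample values, and the stall_fraction formula are all hypothetical, and averaging the samples is an assumed way of reducing them to a single number.

```python
from statistics import mean

# Hypothetical static value (in core cycles) of the kind predefined per
# processor family when TPEBS is not available. Not a real constant.
STATIC_RETIRE_LATENCY = 7.0


def retire_latency(samples, static_default=STATIC_RETIRE_LATENCY):
    """Return the mean of the sampled retire_latency values.

    Falls back to the static default when no samples were captured,
    mirroring the "static and predefined" behaviour on older processors.
    Averaging is an assumption made for this sketch.
    """
    return mean(samples) if samples else static_default


def stall_fraction(event_count, latency, total_cycles):
    """Hypothetical metric: share of cycles attributed to an event.

    Estimated as event_count * retire_latency / total_cycles.
    This is NOT an actual TMA formula.
    """
    return event_count * latency / total_cycles


if __name__ == "__main__":
    samples = [4, 6, 10, 12]      # made-up retire_latency samples
    event_count = 1000            # made-up event count
    total_cycles = 100_000        # made-up cycle count

    measured = retire_latency(samples)
    print("measured latency:", measured)
    print("metric with measured latency:",
          stall_fraction(event_count, measured, total_cycles))
    print("metric with static latency:",
          stall_fraction(event_count, STATIC_RETIRE_LATENCY, total_cycles))
```

The two printed metric values differ only in the latency term, which is the point the documentation makes: a value measured on the running workload replaces a per-family constant in the same formula.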