mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-07 13:43:51 +00:00
13159a139d
Add a common options section and move some items to the section. Also add description of new options to report options. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/20240802180913.1023886-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
134 lines
4.0 KiB
Plaintext
perf-mem(1)
===========

NAME
----
perf-mem - Profile memory accesses

SYNOPSIS
--------
[verse]
'perf mem' [<options>] (record [<command>] | report)

DESCRIPTION
-----------
"perf mem record" runs a command and gathers memory operation data
from it into perf.data. Perf record options are accepted and passed through.

"perf mem report" displays the result. It invokes perf report with the
right set of options to display a memory access profile. By default, loads
and stores are sampled. Use the -t option to limit sampling to loads or stores.

Note that on Intel systems the memory latency reported is the use latency,
not the pure load (or store) latency. Use latency includes any pipeline
queuing delays in addition to the memory subsystem latency.

On Arm64 this uses SPE to sample load and store operations, so hardware
and kernel support are required. See linkperf:perf-arm-spe[1] for a setup guide.
Due to the statistical nature of SPE sampling, not every memory operation will
be sampled.

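The record/report workflow described above can be sketched as a small Python wrapper. This is only an illustration, not part of perf: the helper names `perf_mem_argv` and `profile` are hypothetical, and the argument order simply follows the synopsis (options first, then the subcommand, then the workload).

```python
import shutil
import subprocess


def perf_mem_argv(subcommand, mem_type=None, command=()):
    """Build an argv list shaped like the synopsis:
    perf mem [<options>] (record [<command>] | report)."""
    if subcommand not in ("record", "report"):
        raise ValueError("subcommand must be 'record' or 'report'")
    argv = ["perf", "mem"]
    if mem_type is not None:
        # -t limits sampling to loads or stores (default: load,store)
        argv += ["-t", mem_type]
    argv.append(subcommand)
    if subcommand == "record":
        # The workload to profile; ignored for report.
        argv += list(command)
    return argv


def profile(command, mem_type=None):
    """Hypothetical helper: record a workload, then display the report."""
    if shutil.which("perf") is None:
        raise RuntimeError("perf not found in PATH")
    subprocess.run(perf_mem_argv("record", mem_type, command), check=True)
    subprocess.run(perf_mem_argv("report", mem_type), check=True)
```

Calling `profile(["sleep", "1"], mem_type="load")` would, under these assumptions, record load samples for the workload and then open the report on the resulting perf.data.
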
COMMON OPTIONS
--------------
-f::
--force::
	Don't do ownership validation.

-t::
--type=<type>::
	Select the memory operation type: load or store (default: load,store).

-v::
--verbose::
	Be more verbose (show counter open errors, etc.).

-p::
--phys-data::
	Record/Report sample physical addresses.

--data-page-size::
	Record/Report sample data address page size.

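As an illustration of the --type value format (load, store, or the default load,store), a script that drives perf mem might check the value before passing it on. The sketch below is a client-side check only; `parse_mem_type` and `VALID_TYPES` are hypothetical names, and perf's own validation may differ.

```python
VALID_TYPES = ("load", "store")  # operation types named in the --type description


def parse_mem_type(value="load,store"):
    """Split a --type value into a validated list of operation types.

    The default mirrors the documented default (load,store).
    """
    selected = [item.strip() for item in value.split(",") if item.strip()]
    if not selected:
        raise ValueError("at least one of load or store is required")
    unknown = [item for item in selected if item not in VALID_TYPES]
    if unknown:
        raise ValueError(f"unknown memory operation type(s): {unknown}")
    return selected
```

For example, `parse_mem_type("store")` returns `["store"]`, while an unrecognized name raises `ValueError` before any perf command is launched.
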
RECORD OPTIONS
--------------
<command>...::
	Any command you can specify in a shell.

-e::
--event <event>::
	Event selector. Use 'perf mem record -e list' to list available events.

-K::
--all-kernel::
	Configure all used events to run in kernel space.

-U::
--all-user::
	Configure all used events to run in user space.

--ldlat <n>::
	Specify the desired latency for load events. Supported on Intel and Arm64
	processors only. Ignored on other architectures.

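The record options above can be combined on one command line. The following sketch assembles such a command from keyword arguments; the helper names `record_options` and `record_argv` are hypothetical, and the example only builds the argument list without running perf.

```python
def record_options(event=None, all_kernel=False, all_user=False, ldlat=None):
    """Translate keyword arguments into 'perf mem record' option strings."""
    opts = []
    if event is not None:
        opts += ["-e", event]          # event selector ("list" shows available events)
    if all_kernel:
        opts.append("--all-kernel")    # run all used events in kernel space
    if all_user:
        opts.append("--all-user")      # run all used events in user space
    if ldlat is not None:
        opts += ["--ldlat", str(int(ldlat))]  # desired latency for load events
    return opts


def record_argv(command, **kwargs):
    """Build the full argv: perf mem record <options> <command>..."""
    return ["perf", "mem", "record", *record_options(**kwargs), *command]
```

For instance, `record_argv(["sleep", "1"], all_user=True, ldlat=30)` yields the arguments for a user-space-only recording with a load latency setting of 30. The value 30 is an arbitrary illustration, not a recommendation.
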
REPORT OPTIONS
--------------
-i::
--input=<file>::
	Input file name.

-C::
--cpu=<cpu>::
	Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
	comma-separated list with no spaces: 0,1. Ranges of CPUs are specified with -,
	for example 0-2. The default is to monitor all CPUs.

-D::
--dump-raw-samples::
	Dump the raw decoded samples on the screen in a format that is easy to parse,
	with one sample per line.

-s::
--sort=<key>::
	Group result by given key(s) - multiple keys can be specified
	in CSV format. The keys specific to memory samples are:
	symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem, snoop,
	dcacheline, phys_daddr, data_page_size, blocked.

	- symbol_daddr: name of data symbol being accessed at the time of sample
	- symbol_iaddr: name of code symbol being executed at the time of sample
	- dso_daddr: name of library or module containing the data being accessed
	             at the time of the sample
	- locked: whether the bus was locked at the time of the sample
	- tlb: type of tlb access for the data at the time of the sample
	- mem: type of memory access for the data at the time of the sample
	- snoop: type of snoop (if any) for the data at the time of the sample
	- dcacheline: the cacheline the data address is on at the time of the sample
	- phys_daddr: physical address of data being accessed at the time of sample
	- data_page_size: the page size of the data being accessed at the time of sample
	- blocked: reason for the blocked load access for the data at the time of the sample

	The default sort keys are changed to local_weight, mem, sym, dso,
	symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat.

-T::
--type-profile::
	Show data-type profile result instead of code symbols. This requires
	debug information and changes the default sort keys to:
	mem, snoop, tlb, type.

-U::
--hide-unresolved::
	Only display entries resolved to a symbol.

-x::
--field-separator=<separator>::
	Specify the field separator used when dumping raw samples (-D option). By default,
	the separator is the space character.

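Since -D prints one sample per line and -x sets the field separator, the dumped output can be split into fields with a few lines of Python. The sketch below makes two assumptions that are not stated in this page: that header or comment lines begin with `#`, and that the caller knows which column is which. The function name `parse_raw_samples` is hypothetical.

```python
def parse_raw_samples(text, separator=None):
    """Split dumped sample lines into lists of fields.

    separator=None splits on runs of whitespace, matching the default
    space-separated output; otherwise pass the string that was given to -x.
    Lines that are empty or start with '#' are skipped (assumed to be headers).
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([field.strip() for field in line.split(separator)])
    return rows
```

Choosing an explicit separator with -x that is unlikely to appear in symbol names may make the split more robust than relying on the default space, because a field that itself contains spaces would otherwise be broken apart.
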
In addition, all perf report options are valid for report, and all perf record
options are valid for record.

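As an illustration of the CPU list syntax accepted by --cpu (comma-separated values with no spaces, and ranges written with -), the following sketch expands such a string into explicit CPU numbers. The function `parse_cpu_list` is a hypothetical helper, not perf's own parser, and it assumes that single CPUs and ranges may be mixed in one list.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0,1' or '0-2' into a sorted list of ints."""
    if any(ch.isspace() for ch in spec):
        raise ValueError("CPU list must not contain spaces")
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"invalid CPU range: {part}")
            cpus.update(range(low, high + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))  # int('') raises ValueError for empty items
    return sorted(cpus)
```

With this helper, `parse_cpu_list("0,1")` gives `[0, 1]` and `parse_cpu_list("0-2")` gives `[0, 1, 2]`.
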
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]