linux-stable/tools/perf/Documentation/perf-annotate.txt
Namhyung Kim ce533c9bc6 perf annotate: Add --skip-empty option
As in 'perf report', we want to hide empty events in the 'perf annotate'
output.  This matches the behavior of the same option in perf report.

For example, the following command uses 3 events, including the dummy event.

  $ perf mem record -a -- perf test -w noploop

  $ perf evlist
  cpu/mem-loads,ldlat=30/P
  cpu/mem-stores/P
  dummy:u

Using perf annotate with just --group shows all 3 events.

  $ perf annotate --group --stdio | head
   Percent                 |	Source code & Disassembly of ...
  --------------------------------------------------------------
                           : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00    0.00 :    e06d:       movl    %edx, %r13d

Now with --skip-empty, it'll hide the last dummy event.

  $ perf annotate --group --stdio --skip-empty | head
   Percent         |	Source code & Disassembly of ...
  ------------------------------------------------------
                   : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00 :    e06d:       movl    %edx, %r13d

Committer testing:

  root@x1:~# perf evlist
  cpu_atom/mem-loads,ldlat=30/P
  cpu_atom/mem-stores/P
  dummy:u
  root@x1:~#

Before:

  root@x1:~# perf annotate --group --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P, dummy:u', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent                       0x9900 <do_lookup_x>:
                                  pushq      %rbp
                                  movq       %rsp,%rbp
                                  pushq      %r15
                                  pushq      %r14
                                  pushq      %r13
                                  pushq      %r12
                                  pushq      %rbx
                                  subq       $0x88,%rsp
                                  movq       %rdi,-0x50(%rbp)
                                  movl       8(%r9),%edi
                                  movq       0x10(%rbp),%r12
                                  movq       0x28(%rbp),%r10
                                  movq       %rdx,-0x70(%rbp)
                                  movq       %rcx,-0x58(%rbp)
                                  movq       %rdi,%r11
     0.00    5.73    0.00         movq       %r8,-0x68(%rbp)
                                  movq       (%r9),%r8
                                  movl       %esi,%eax
     8.30    0.00    0.00         movl       0x30(%rbp),%r9d
                                  movl       %esi,%r15d
                                  shrl       $6, %eax
                                  movq       %r8,%r13
  root@x1:~#

After:

  root@x1:~# perf annotate --group --skip-empty --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent               0x9900 <do_lookup_x>:
                          pushq      %rbp
                          movq       %rsp,%rbp
                          pushq      %r15
                          pushq      %r14
                          pushq      %r13
                          pushq      %r12
                          pushq      %rbx
                          subq       $0x88,%rsp
                          movq       %rdi,-0x50(%rbp)
                          movl       8(%r9),%edi
                          movq       0x10(%rbp),%r12
                          movq       0x28(%rbp),%r10
                          movq       %rdx,-0x70(%rbp)
                          movq       %rcx,-0x58(%rbp)
                          movq       %rdi,%r11
     0.00    5.73         movq       %r8,-0x68(%rbp)
                          movq       (%r9),%r8
                          movl       %esi,%eax
     8.30    0.00         movl       0x30(%rbp),%r9d
                          movl       %esi,%r15d
                          shrl       $6, %eax
                          movq       %r8,%r13
  root@x1:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05 16:14:01 -03:00

perf-annotate(1)
================
NAME
----
perf-annotate - Read perf.data (created by perf record) and display annotated code
SYNOPSIS
--------
[verse]
'perf annotate' [-i <file> | --input=file] [symbol_name]
DESCRIPTION
-----------
This command reads the input file and displays an annotated version of the
code. If the object file has debug symbols then the source code will be
displayed alongside assembly code.
If there is no debug info in the object, then annotated assembly is displayed.
OPTIONS
-------
-i::
--input=<file>::
Input file name. (default: perf.data unless stdin is a fifo)
-d::
--dsos=<dso[,dso...]>::
Only consider symbols in these dsos.
-s::
--symbol=<symbol>::
Symbol to annotate.
-f::
--force::
Don't do ownership validation.
-v::
--verbose::
Be more verbose. (Show symbol address, etc)
-q::
--quiet::
Do not show any warnings or messages. (Suppress -v)
-n::
--show-nr-samples::
Show the number of samples for each symbol.
-D::
--dump-raw-trace::
Dump raw trace in ASCII.
-k::
--vmlinux=<file>::
vmlinux pathname.
--ignore-vmlinux::
Ignore vmlinux files.
--itrace::
Options for decoding instruction tracing data. The options are:
include::itrace.txt[]
To disable decoding entirely, use --no-itrace.
-m::
--modules::
Load module symbols. WARNING: use only with -k and a LIVE kernel.
-l::
--print-line::
Print matching source lines (may be slow).
-P::
--full-paths::
Don't shorten the displayed pathnames.
--stdio:: Use the stdio interface.
--stdio2:: Use the stdio2 interface, non-interactive, uses the TUI formatting.
--stdio-color=<mode>::
'always', 'never' or 'auto'. Configures color output from the command
line, in addition to the "color.ui" setting in .perfconfig.
Use '--stdio-color always' to generate color even when redirecting
to a pipe or file. Using just '--stdio-color' is equivalent to
using 'always'.
--tui:: Use the TUI interface. Use of --tui requires a tty, if one is not
present, as when piping to other commands, the stdio interface is
used. This interface starts by centering on the line with the most
samples; TAB/UNTAB cycles through the lines with more samples.
--gtk:: Use the GTK interface.
-C::
--cpu=<cpu>:: Only report samples for the list of CPUs provided. Multiple CPUs can
be provided as a comma-separated list with no space: 0,1. Ranges of
CPUs are specified with -: 0-2. Default is to report samples on all
CPUs.
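The list syntax accepted by --cpu can be illustrated with a small Python sketch. This is not perf's own parser; the function name `parse_cpu_list` is hypothetical and the code only models the two forms described above (comma-separated values and `first-last` ranges).

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "0,1" or "0-2,5" into sorted CPU numbers.

    Illustrative only: models the syntax described for --cpu, not perf's parser.
    """
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as "0-2" includes both end points.
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("0,1"))
    print(parse_cpu_list("0-2"))
```

Under these assumptions, `0,1` selects CPUs 0 and 1, and `0-2` selects CPUs 0, 1 and 2.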
--asm-raw::
Show raw instruction encoding of assembly instructions.
--show-total-period:: Show a column with the sum of periods.
--source::
Interleave source code with assembly code. Enabled by default,
disable with --no-source.
--symfs=<directory>::
Look for files with symbols relative to this directory.
-M::
--disassembler-style=:: Set disassembler style for objdump.
--addr2line=<path>::
Path to addr2line binary.
--objdump=<path>::
Path to objdump binary.
--prefix=PREFIX::
--prefix-strip=N::
Remove the first N entries from source file path names in executables
and add PREFIX. This allows displaying source code that was compiled on
a system with a different file system layout.
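As a rough model of how --prefix and --prefix-strip combine, the Python sketch below drops the first N path components and prepends a prefix. The function name `remap_source_path` and the example paths are hypothetical, and edge cases (relative paths, N larger than the path depth) may be handled differently by the real tools.

```python
def remap_source_path(path, prefix, strip):
    """Drop the first `strip` components of `path`, then prepend `prefix`.

    Illustrative model of --prefix=PREFIX with --prefix-strip=N only.
    """
    parts = [p for p in path.split("/") if p]   # ignore empty components
    kept = parts[strip:]                        # remove the first N entries
    return "/".join([prefix.rstrip("/")] + kept)


if __name__ == "__main__":
    # Hypothetical build-time path remapped onto a local source tree.
    print(remap_source_path("/build/src/kernel/sched.c", "/home/me/linux", 2))
```

With these hypothetical inputs, a file recorded as `/build/src/kernel/sched.c` would be looked up as `/home/me/linux/kernel/sched.c`.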
--skip-missing::
Skip symbols that cannot be annotated.
--group::
Show event group information together.
--demangle::
Demangle symbol names to human readable form. It's enabled by default,
disable with --no-demangle.
--demangle-kernel::
Demangle kernel symbol names to human readable form (for C++ kernels).
--percent-type::
Set annotation percent type from the following choices:
global-period, local-period, global-hits, local-hits
The local/global keywords select whether the percentage is computed
in the scope of the function (local) or of the whole data (global).
The period/hits keywords select the base the percentage is computed
on: the sample periods or the number of samples (hits).
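The four percent types differ only in which numerator and denominator they use. The Python sketch below shows that arithmetic under stated assumptions: the function name `annotation_percent`, the dictionary layout, and the numbers are all hypothetical, and perf's internal bookkeeping is not modeled.

```python
VALID_TYPES = {"global-period", "local-period", "global-hits", "local-hits"}


def annotation_percent(line, symbol_total, data_total, percent_type):
    """Return the percentage for one annotated line.

    Each argument is a dict with "period" and "hits" keys (an assumed layout).
    """
    if percent_type not in VALID_TYPES:
        raise ValueError("unknown percent type: " + percent_type)
    scope, base = percent_type.split("-")       # e.g. ("local", "period")
    # local: relative to the function; global: relative to the whole data.
    denominator = symbol_total if scope == "local" else data_total
    if denominator[base] == 0:
        return 0.0
    return 100.0 * line[base] / denominator[base]


if __name__ == "__main__":
    line = {"period": 50, "hits": 1}            # hypothetical numbers
    symbol = {"period": 200, "hits": 4}
    data = {"period": 1000, "hits": 10}
    for ptype in sorted(VALID_TYPES):
        print(ptype, annotation_percent(line, symbol, data, ptype))
```

The same line can therefore look hot under a local type and negligible under a global one, which is worth keeping in mind when comparing outputs.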
--percent-limit::
Do not show functions which have an overhead under that percent on
stdio or stdio2 (Default: 0). Note that this is about selection of
functions to display, not about lines within the function.
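The distinction between selecting functions and selecting lines can be sketched in Python. The function name `functions_to_show` and the overhead values are hypothetical; the sketch assumes the limit is applied to each function's overall overhead, as the description above states.

```python
def functions_to_show(overhead_by_function, percent_limit=0.0):
    """Keep functions whose overhead is not under the limit.

    The filter works on whole functions; lines inside a kept function
    are never dropped by it.
    """
    return [name for name, pct in overhead_by_function.items()
            if pct >= percent_limit]


if __name__ == "__main__":
    overhead = {"func_a": 12.5, "func_b": 0.4, "func_c": 3.0}  # hypothetical
    print(functions_to_show(overhead, percent_limit=1.0))
```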
--data-type[=TYPE_NAME]::
Display data type annotation instead of code. It infers the data type of
samples (if they are memory-accessing instructions) using DWARF debug
information. It can take an optional data type name as an argument. In
that case it shows the annotation for that type only; otherwise it shows
all data types it finds.
--type-stat::
Show stats for the data type annotation.
--skip-empty::
Do not display empty (or dummy) events.
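The effect of --skip-empty on grouped output can be modeled as dropping whole columns. The Python sketch below assumes that "empty" means an event with zero samples; the function name `skip_empty_events` and the per-event sample counts are hypothetical, and perf's actual criterion may differ.

```python
def skip_empty_events(event_names, sample_counts, rows):
    """Drop the columns of events that recorded no samples.

    `rows` holds one list of per-event percentages per annotated line.
    """
    keep = [i for i, count in enumerate(sample_counts) if count > 0]
    names = [event_names[i] for i in keep]
    filtered = [[row[i] for i in keep] for row in rows]
    return names, filtered


if __name__ == "__main__":
    names = ["cpu_atom/mem-loads,ldlat=30/P", "cpu_atom/mem-stores/P", "dummy:u"]
    counts = [12, 8, 0]                         # hypothetical split of samples
    rows = [[0.00, 5.73, 0.00], [8.30, 0.00, 0.00]]
    print(skip_empty_events(names, counts, rows))
```

A column is removed only when its event is empty overall, so a line showing 0.00 for a non-empty event keeps that column.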
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]