perf tools improvements and fixes for v6.12:

- Use BPF + BTF to collect and pretty print syscall and tracepoint arguments in
   'perf trace', done as a GSoC activity.
 
 - Data-type profiling improvements:
 
   - Cache debuginfo to speed up data type resolution.
 
   - Add the 'typecln' sort order, to show which cacheline in a target is hot or
     cold. The following shows members in the cfs_rq's first cache line:
 
       $ perf report -s type,typecln,typeoff -H
       ...
       -    2.67%        struct cfs_rq
          +    1.23%        struct cfs_rq: cache-line 2
          +    0.57%        struct cfs_rq: cache-line 4
          +    0.46%        struct cfs_rq: cache-line 6
          -    0.41%        struct cfs_rq: cache-line 0
                  0.39%        struct cfs_rq +0x14 (h_nr_running)
                  0.02%        struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
 
   - When a typedef resolves to an unnamed struct, use the typedef name.
 
   - When a struct has just one basic type field (int, etc), resolve the type
     sort order to the name of the struct, not the type of the field.
 
   - Support type folding/unfolding in the data-type annotation TUI.
 
   - Fix bitfields offsets and sizes.
 
   - Initial support for PowerPC, using libcapstone and the usual objdump
     disassembly parsing routines.
 
 - Add support for disassembling and addr2line using the LLVM libraries,
   speeding up those operations.
 
 - Support --addr2line option in 'perf script' as with other tools.
 
 - Intel branch counters (LBR event logging) support, only available in recent
   Intel processors. For instance, the new "brcntr" field can be requested from
   'perf script' to print the information collected by this feature:
 
   $ perf script -F +brstackinsn,+brcntr
 
   # Branch counter abbr list:
   # branch-instructions:ppp = A
   # branch-misses = B
   # '-' No event occurs
   # '+' Event occurrences may be lost due to branch counter saturated
       tchain_edit  332203 3366329.405674:  53030 branch-instructions:ppp:    401781 f3+0x2c (home/sdp/test/tchain_edit)
          f3+31:
       0000000000401774   insn: eb 04                  br_cntr: AA  # PRED 5 cycles [5]
       000000000040177a   insn: 81 7d fc 0f 27 00 00
       0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 1 cycles [6] 2.00 IPC
       0000000000401766   insn: 8b 45 fc
       0000000000401769   insn: 83 e0 01
       000000000040176c   insn: 85 c0
       000000000040176e   insn: 74 06                  br_cntr: A   # PRED 1 cycles [7] 4.00 IPC
       0000000000401776   insn: 83 45 fc 01
       000000000040177a   insn: 81 7d fc 0f 27 00 00
       0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 7 cycles [14] 0.43 IPC
 
 - Support Timed PEBS (Precise Event-Based Sampling), a recent hardware feature
   in Intel processors.
 
 - Add 'perf ftrace profile' subcommand, using ftrace's function-graph tracer so
   that users can see the total, average, max execution time as well as the
   number of invocations easily, for instance:
 
   $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
     perf stat -e cycles -C1 true 2> /dev/null | head
   # Total (us)  Avg (us)  Max (us)  Count  Function
         65.611    65.611    65.611      1  __x64_sys_perf_event_open
         30.527    30.527    30.527      1  anon_inode_getfile
         30.260    30.260    30.260      1  __anon_inode_getfile
         29.700    29.700    29.700      1  alloc_file_pseudo
         17.578    17.578    17.578      1  d_alloc_pseudo
         17.382    17.382    17.382      1  __d_alloc
         16.738    16.738    16.738      1  kmem_cache_alloc_lru
         15.686    15.686    15.686      1  perf_event_alloc
         14.012     7.006    11.264      2  obj_cgroup_charge
   #
 
 - 'perf sched timehist' improvements, including the addition of priority
   showing/filtering command line options.
 
 - Various improvements to 'perf probe', including 'perf test' regression
   tests.
 
 - Introduce 'perf check', initially to check if some feature is in place,
   and use it in 'perf test'.
 
 - Various fixes for 32-bit systems.
 
 - Address more leak sanitizer failures.
 
 - Fix memory leaks (LBR, disasm lock ops, etc).
 
 - More reference counting fixes (branch_info, etc).
 
 - Constify 'struct perf_tool' parameters to improve code generation and reduce
   the chances of having its internals changed, which isn't expected.
 
 - More constifications in various other places.
 
 - Add more build tests, including for JEVENTS.
 
 - Add more 'perf test' entries ('perf record LBR', pipe/inject, --setup-filter,
   'perf ftrace', 'cgroup sampling', etc).
 
 - Inject build ids for all entries in a call chain in 'perf inject', not just
   for the main sample.
 
 - Improve the BPF based sample filter, allowing root to set up filters in bpffs
   that can then be used by non-root users.
 
 - Allow filtering by cgroups with the BPF based sample filter.
 
 - Allow a more compact 'perf mem report' output using -T/--type-profile, and
   also provide a --sort option, similar to the one in 'perf report' and
   'perf top', to set up the sort order manually.
 
 - Fix --group behavior in 'perf annotate' when leader has no samples, where it
   was not showing anything even when other events in the group had samples.
 
 - Fix spinlock and rwlock accounting in 'perf lock contention'.
 
 - Fix libsubcmd fixdep Makefile dependencies.
 
 - Improve 'perf ftrace' error message when ftrace isn't available.
 
 - Update various Intel JSON vendor event files.
 
 - ARM64 CoreSight hardware tracing infrastructure improvements, mostly not
   visible to users.
 
 - Update power10 JSON events.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZuwxgwAKCRCyPKLppCJ+
 JxfHAQCrgSD4itg4HA7znUoYBEGL73NisJT2Juq0lyDK2gniOQD+Mln6isvRnMag
 k7BFXvgHj/LDQdOznkG2pojSFJcSgQo=
 =kazH
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Use BPF + BTF to collect and pretty print syscall and tracepoint
   arguments in 'perf trace', done as a GSoC activity

 - Data-type profiling improvements:

     - Cache debuginfo to speed up data type resolution

     - Add the 'typecln' sort order, to show which cacheline in a target
       is hot or cold. The following shows members in the cfs_rq's first
       cache line:

         $ perf report -s type,typecln,typeoff -H
         ...
         -    2.67%        struct cfs_rq
            +    1.23%        struct cfs_rq: cache-line 2
            +    0.57%        struct cfs_rq: cache-line 4
            +    0.46%        struct cfs_rq: cache-line 6
            -    0.41%        struct cfs_rq: cache-line 0
                    0.39%        struct cfs_rq +0x14 (h_nr_running)
                    0.02%        struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)

     - When a typedef resolves to an unnamed struct, use the typedef name

     - When a struct has just one basic type field (int, etc), resolve
       the type sort order to the name of the struct, not the type of
       the field

     - Support type folding/unfolding in the data-type annotation TUI

     - Fix bitfields offsets and sizes

     - Initial support for PowerPC, using libcapstone and the usual
       objdump disassembly parsing routines

 - Add support for disassembling and addr2line using the LLVM libraries,
   speeding up those operations

 - Support --addr2line option in 'perf script' as with other tools

 - Intel branch counters (LBR event logging) support, only available in
   recent Intel processors. For instance, the new "brcntr" field can be
   requested from 'perf script' to print the information collected by
   this feature:

     $ perf script -F +brstackinsn,+brcntr

     # Branch counter abbr list:
     # branch-instructions:ppp = A
     # branch-misses = B
     # '-' No event occurs
     # '+' Event occurrences may be lost due to branch counter saturated
         tchain_edit  332203 3366329.405674:  53030 branch-instructions:ppp:    401781 f3+0x2c (home/sdp/test/tchain_edit)
            f3+31:
         0000000000401774   insn: eb 04                  br_cntr: AA  # PRED 5 cycles [5]
         000000000040177a   insn: 81 7d fc 0f 27 00 00
         0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 1 cycles [6] 2.00 IPC
         0000000000401766   insn: 8b 45 fc
         0000000000401769   insn: 83 e0 01
         000000000040176c   insn: 85 c0
         000000000040176e   insn: 74 06                  br_cntr: A   # PRED 1 cycles [7] 4.00 IPC
         0000000000401776   insn: 83 45 fc 01
         000000000040177a   insn: 81 7d fc 0f 27 00 00
         0000000000401781   insn: 7e e3                  br_cntr: A   # PRED 7 cycles [14] 0.43 IPC

 - Support Timed PEBS (Precise Event-Based Sampling), a recent hardware
   feature in Intel processors

 - Add 'perf ftrace profile' subcommand, using ftrace's function-graph
   tracer so that users can see the total, average, max execution time
   as well as the number of invocations easily, for instance:

     $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
       perf stat -e cycles -C1 true 2> /dev/null | head
     # Total (us)  Avg (us)  Max (us)  Count  Function
           65.611    65.611    65.611      1  __x64_sys_perf_event_open
           30.527    30.527    30.527      1  anon_inode_getfile
           30.260    30.260    30.260      1  __anon_inode_getfile
           29.700    29.700    29.700      1  alloc_file_pseudo
           17.578    17.578    17.578      1  d_alloc_pseudo
           17.382    17.382    17.382      1  __d_alloc
           16.738    16.738    16.738      1  kmem_cache_alloc_lru
           15.686    15.686    15.686      1  perf_event_alloc
           14.012     7.006    11.264      2  obj_cgroup_charge

 - 'perf sched timehist' improvements, including the addition of
   priority showing/filtering command line options

 - Various improvements to 'perf probe', including 'perf test'
   regression tests

 - Introduce 'perf check', initially to check if some feature is
   in place, and use it in 'perf test'

 - Various fixes for 32-bit systems

 - Address more leak sanitizer failures

 - Fix memory leaks (LBR, disasm lock ops, etc)

 - More reference counting fixes (branch_info, etc)

 - Constify 'struct perf_tool' parameters to improve code generation
   and reduce the chances of having its internals changed, which isn't
   expected

 - More constifications in various other places

 - Add more build tests, including for JEVENTS

 - Add more 'perf test' entries ('perf record LBR', pipe/inject,
   --setup-filter, 'perf ftrace', 'cgroup sampling', etc)

 - Inject build ids for all entries in a call chain in 'perf inject',
   not just for the main sample

 - Improve the BPF based sample filter, allowing root to set up filters
   in bpffs that can then be used by non-root users

 - Allow filtering by cgroups with the BPF based sample filter

 - Allow a more compact 'perf mem report' output using
   -T/--type-profile, and also provide a --sort option, similar to the
   one in 'perf report' and 'perf top', to set up the sort order manually

 - Fix --group behavior in 'perf annotate' when leader has no samples,
   where it was not showing anything even when other events in the group
   had samples

 - Fix spinlock and rwlock accounting in 'perf lock contention'

 - Fix libsubcmd fixdep Makefile dependencies

 - Improve 'perf ftrace' error message when ftrace isn't available

 - Update various Intel JSON vendor event files

 - ARM64 CoreSight hardware tracing infrastructure improvements, mostly
   not visible to users

 - Update power10 JSON events

* tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (310 commits)
  perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space
  perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space
  perf env: Find correct branch counter info on hybrid
  perf evlist: Print hint for group
  tools: Drop nonsensical -O6
  perf pmu: To info add event_type_desc
  perf evsel: Add accessor for tool_event
  perf pmus: Fake PMU clean up
  perf list: Avoid potential out of bounds memory read
  perf help: Fix a typo ("bellow")
  perf ftrace: Detect whether ftrace is enabled on system
  perf test shell probe_vfs_getname: Remove extraneous '=' from probe line number regex
  perf build: Require at least clang 16.0.6 to build BPF skeletons
  perf trace: If a syscall arg is marked as 'const', assume it is coming _from_ userspace
  perf parse-events: Remove duplicated include in parse-events.c
  perf callchain: Allow symbols to be optional when resolving a callchain
  perf inject: Lazy build-id mmap2 event insertion
  perf inject: Add new mmap2-buildid-all option
  perf inject: Fix build ID injection
  perf annotate-data: Add pr_debug_scope()
  ...
Linus Torvalds 2024-09-22 09:11:14 -07:00
commit 891e8abed5
290 changed files with 15195 additions and 4267 deletions

View File

@ -1,3 +0,0 @@
hostprogs := fixdep
fixdep-y := fixdep.o

View File

@ -43,12 +43,5 @@ ifneq ($(wildcard $(TMP_O)),)
$(Q)$(MAKE) -C feature OUTPUT=$(TMP_O) clean >/dev/null
endif
$(OUTPUT)fixdep-in.o: FORCE
$(Q)$(MAKE) $(build)=fixdep
$(OUTPUT)fixdep: $(OUTPUT)fixdep-in.o
$(QUIET_LINK)$(HOSTCC) $(KBUILD_HOSTLDFLAGS) -o $@ $<
FORCE:
.PHONY: FORCE
$(OUTPUT)fixdep: $(srctree)/tools/build/fixdep.c
$(QUIET_CC)$(HOSTCC) $(KBUILD_HOSTCFLAGS) $(KBUILD_HOSTLDFLAGS) -o $@ $<

View File

@ -100,7 +100,6 @@ FEATURE_TESTS_EXTRA := \
libunwind-debug-frame-aarch64 \
cxx \
llvm \
llvm-version \
clang \
libbpf \
libbpf-btf__load_from_kernel_by_id \
@ -136,6 +135,7 @@ FEATURE_DISPLAY ?= \
libunwind \
libdw-dwarf-unwind \
libcapstone \
llvm-perf \
zlib \
lzma \
get_cpuid \

View File

@ -1,8 +1,18 @@
# SPDX-License-Identifier: GPL-2.0-only
build := -f $(srctree)/tools/build/Makefile.build dir=. obj
# More than just $(Q), we sometimes want to suppress all command output from a
# recursive make -- even the 'up to date' printout.
ifeq ($(V),1)
Q ?=
SILENT_MAKE = +$(Q)$(MAKE)
else
Q ?= @
SILENT_MAKE = +$(Q)$(MAKE) --silent
endif
fixdep:
$(Q)$(MAKE) -C $(srctree)/tools/build CFLAGS= LDFLAGS= $(OUTPUT)fixdep
$(SILENT_MAKE) -C $(srctree)/tools/build CFLAGS= LDFLAGS= $(OUTPUT)fixdep
fixdep-clean:
$(Q)$(MAKE) -C $(srctree)/tools/build clean

View File

@ -73,7 +73,7 @@ FILES= \
test-libopencsd.bin \
test-clang.bin \
test-llvm.bin \
test-llvm-version.bin \
test-llvm-perf.bin \
test-libaio.bin \
test-libzstd.bin \
test-clang-bpf-co-re.bin \
@ -388,9 +388,12 @@ $(OUTPUT)test-llvm.bin:
$(shell $(LLVM_CONFIG) --system-libs) \
> $(@:.bin=.make.output) 2>&1
$(OUTPUT)test-llvm-version.bin:
$(BUILDXX) -std=gnu++17 \
-I$(shell $(LLVM_CONFIG) --includedir) \
$(OUTPUT)test-llvm-perf.bin:
$(BUILDXX) -std=gnu++17 \
-I$(shell $(LLVM_CONFIG) --includedir) \
-L$(shell $(LLVM_CONFIG) --libdir) \
$(shell $(LLVM_CONFIG) --libs Core BPF) \
$(shell $(LLVM_CONFIG) --system-libs) \
> $(@:.bin=.make.output) 2>&1
$(OUTPUT)test-clang.bin:

View File

@ -134,10 +134,6 @@
#undef main
#endif
#define main main_test_libcapstone
# include "test-libcapstone.c"
#undef main
#define main main_test_lzma
# include "test-lzma.c"
#undef main

View File

@ -0,0 +1,14 @@
// SPDX-License-Identifier: GPL-2.0
#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/raw_ostream.h"
#if LLVM_VERSION_MAJOR < 13
# error "Perf requires llvm-devel/llvm-dev version 13 or greater"
#endif
int main()
{
llvm::errs() << "Hello World!\n";
llvm::llvm_shutdown();
return 0;
}

View File

@ -49,12 +49,21 @@
* Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload.
* Used to associate a CPU with the CoreSight Trace ID.
* [07:00] - Trace ID - uses 8 bits to make value easy to read in file.
* [59:08] - Unused (SBZ)
* [63:60] - Version
* [39:08] - Sink ID - as reported in /sys/bus/event_source/devices/cs_etm/sinks/
* Added in minor version 1.
* [55:40] - Unused (SBZ)
* [59:56] - Minor Version - previously existing fields are compatible with
* all minor versions.
* [63:60] - Major Version - previously existing fields mean different things
* in new major versions.
*/
#define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0)
#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60)
#define CS_AUX_HW_ID_SINK_ID_MASK GENMASK_ULL(39, 8)
#define CS_AUX_HW_ID_CURR_VERSION 0
#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56)
#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60)
#define CS_AUX_HW_ID_MAJOR_VERSION 0
#define CS_AUX_HW_ID_MINOR_VERSION 1
#endif
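
As a rough illustration of the new payload layout described in the comment above, the following sketch packs and unpacks a PERF_RECORD_AUX_OUTPUT_HW_ID value. The `GENMASK_ULL` definition here is a local stand-in for the kernel macro, and `hw_id_field()` and `hw_id_pack()` are hypothetical helpers that are not part of this commit.

```c
#include <stdint.h>

/* Local stand-in for the kernel's GENMASK_ULL(h, l): bits l..h set. */
#define GENMASK_ULL(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Masks copied from the hunk above. */
#define CS_AUX_HW_ID_TRACE_ID_MASK      GENMASK_ULL(7, 0)
#define CS_AUX_HW_ID_SINK_ID_MASK       GENMASK_ULL(39, 8)
#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56)
#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60)

/*
 * Hypothetical helper: extract the field selected by a non-zero mask.
 * Dividing by the lowest set bit of the mask is a right shift by the
 * field's bit offset.
 */
static inline uint64_t hw_id_field(uint64_t payload, uint64_t mask)
{
	return (payload & mask) / (mask & -mask);
}

/* Hypothetical helper: build a payload from its four fields. */
static inline uint64_t hw_id_pack(uint8_t trace_id, uint32_t sink_id,
				  unsigned int minor, unsigned int major)
{
	return (uint64_t)trace_id |
	       ((uint64_t)sink_id << 8) |
	       ((uint64_t)(minor & 0xf) << 56) |
	       ((uint64_t)(major & 0xf) << 60);
}
```

A reader that only knows minor version 0 can still take the Trace ID from bits [07:00], since the comment states that previously existing fields stay compatible across minor versions.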

View File

@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
extern char *strim(char *);
extern void remove_spaces(char *s);
extern void *memchr_inv(const void *start, int c, size_t bytes);
#endif /* _TOOLS_LINUX_STRING_H_ */

View File

@ -31,11 +31,7 @@ CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -U_FORTIFY_SOURCE -fPIC
ifeq ($(DEBUG),0)
ifeq ($(CC_NO_CLANG), 0)
CFLAGS += -O3
else
CFLAGS += -O6
endif
endif
ifeq ($(DEBUG),0)

View File

@ -69,7 +69,7 @@ char *get_tracing_file(const char *name)
{
char *file;
if (asprintf(&file, "%s/%s", tracing_path_mount(), name) < 0)
if (asprintf(&file, "%s%s", tracing_path_mount(), name) < 0)
return NULL;
return file;

View File

@ -5,3 +5,4 @@ TAGS
tags
cscope.*
/bpf_helper_defs.h
fixdep

View File

@ -108,6 +108,8 @@ MAKEOVERRIDES=
all:
OUTPUT ?= ./
OUTPUT := $(abspath $(OUTPUT))/
export srctree OUTPUT CC LD CFLAGS V
include $(srctree)/tools/build/Makefile.include
@ -141,7 +143,10 @@ all: fixdep
all_cmd: $(CMD_TARGETS) check
$(BPF_IN_SHARED): force $(BPF_GENERATED)
$(SHARED_OBJDIR) $(STATIC_OBJDIR):
$(Q)mkdir -p $@
$(BPF_IN_SHARED): force $(BPF_GENERATED) | $(SHARED_OBJDIR)
@(test -f ../../include/uapi/linux/bpf.h -a -f ../../../include/uapi/linux/bpf.h && ( \
(diff -B ../../include/uapi/linux/bpf.h ../../../include/uapi/linux/bpf.h >/dev/null) || \
echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'" >&2 )) || true
@ -151,9 +156,11 @@ $(BPF_IN_SHARED): force $(BPF_GENERATED)
@(test -f ../../include/uapi/linux/if_xdp.h -a -f ../../../include/uapi/linux/if_xdp.h && ( \
(diff -B ../../include/uapi/linux/if_xdp.h ../../../include/uapi/linux/if_xdp.h >/dev/null) || \
echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true
$(SILENT_MAKE) -C $(srctree)/tools/build CFLAGS= LDFLAGS= OUTPUT=$(SHARED_OBJDIR) $(SHARED_OBJDIR)fixdep
$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(SHARED_OBJDIR) CFLAGS="$(CFLAGS) $(SHLIB_FLAGS)"
$(BPF_IN_STATIC): force $(BPF_GENERATED)
$(BPF_IN_STATIC): force $(BPF_GENERATED) | $(STATIC_OBJDIR)
$(SILENT_MAKE) -C $(srctree)/tools/build CFLAGS= LDFLAGS= OUTPUT=$(STATIC_OBJDIR) $(STATIC_OBJDIR)fixdep
$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
$(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h
@ -263,7 +270,7 @@ install_pkgconfig: $(PC_FILE)
install: install_lib install_pkgconfig install_headers
clean:
clean: fixdep-clean
$(call QUIET_CLEAN, libbpf) $(RM) -rf $(CMD_TARGETS) \
*~ .*.d .*.cmd LIBBPF-CFLAGS $(BPF_GENERATED) \
$(SHARED_OBJDIR) $(STATIC_OBJDIR) \

tools/lib/perf/.gitignore (new file)
View File

@ -0,0 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
libperf.pc
libperf.so.*
tests-shared
tests-static

View File

@ -153,6 +153,19 @@ char *strim(char *s)
return skip_spaces(s);
}
/*
* remove_spaces - Removes whitespaces from @s
*/
void remove_spaces(char *s)
{
char *d = s;
do {
while (*d == ' ')
++d;
} while ((*s++ = *d++));
}
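
A small standalone sketch of how remove_spaces() behaves may help here. The function body is copied from the hunk above; `strips_to()` is a hypothetical wrapper added only for demonstration. Note that only the ' ' character is dropped, so tabs and newlines are kept.

```c
#include <string.h>

/* Copied from the hunk above: compacts @s in place, dropping ' ' characters. */
void remove_spaces(char *s)
{
	char *d = s;	/* read cursor; s acts as the write cursor */

	do {
		while (*d == ' ')
			++d;
	} while ((*s++ = *d++));	/* the copy stops after the NUL byte */
}

/*
 * Hypothetical helper: copy @in into a scratch buffer, strip the spaces
 * and report whether the result equals @expect.
 */
static int strips_to(const char *in, const char *expect)
{
	char buf[64];

	if (strlen(in) >= sizeof(buf))
		return 0;
	strcpy(buf, in);
	remove_spaces(buf);
	return strcmp(buf, expect) == 0;
}
```

The read cursor never falls behind the write cursor, so the in-place copy is safe without a temporary buffer.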
/**
* strreplace - Replace all occurrences of character in string.
* @s: The string to operate on.

View File

@ -38,10 +38,8 @@ endif
ifeq ($(DEBUG),1)
CFLAGS += -O0
else ifeq ($(CC_NO_CLANG), 0)
CFLAGS += -O3
else
CFLAGS += -O6
CFLAGS += -O3
endif
# Treat warnings as errors unless directed not to
@ -76,7 +74,7 @@ include $(srctree)/tools/build/Makefile.include
all: fixdep $(LIBFILE)
$(SUBCMD_IN): FORCE
$(SUBCMD_IN): fixdep FORCE
@$(MAKE) $(build)=libsubcmd
$(LIBFILE): $(SUBCMD_IN)

View File

@ -633,10 +633,11 @@ int parse_options_subcommand(int argc, const char **argv, const struct option *o
const char *const subcommands[], const char *usagestr[], int flags)
{
struct parse_opt_ctx_t ctx;
char *buf = NULL;
/* build usage string if it's not provided */
if (subcommands && !usagestr[0]) {
char *buf = NULL;
astrcatf(&buf, "%s %s [<options>] {", subcmd_config.exec_name, argv[0]);
for (int i = 0; subcommands[i]; i++) {
@ -678,10 +679,7 @@ int parse_options_subcommand(int argc, const char **argv, const struct option *o
astrcatf(&error_buf, "unknown switch `%c'", *ctx.opt);
usage_with_options(usagestr, options);
}
if (buf) {
usagestr[0] = NULL;
free(buf);
}
return parse_options_end(&ctx);
}

View File

@ -31,11 +31,7 @@ CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS)
CFLAGS += -ggdb3 -Wall -Wextra -std=gnu11 -U_FORTIFY_SOURCE -fPIC
ifeq ($(DEBUG),0)
ifeq ($(CC_NO_CLANG), 0)
CFLAGS += -O3
else
CFLAGS += -O6
endif
endif
ifeq ($(DEBUG),0)

View File

@ -1,5 +1,6 @@
perf-bench-y += builtin-bench.o
perf-y += builtin-annotate.o
perf-y += builtin-check.o
perf-y += builtin-config.o
perf-y += builtin-diff.o
perf-y += builtin-evlist.o

View File

@ -165,6 +165,9 @@ include::itrace.txt[]
--type-stat::
Show stats for the data type annotation.
--skip-empty::
Do not display empty (or dummy) events.
SEE ALSO
--------

View File

@ -0,0 +1,82 @@
perf-check(1)
===============
NAME
----
perf-check - check if features are present in perf
SYNOPSIS
--------
[verse]
'perf check' [<options>]
'perf check' {feature <feature_list>} [<options>]
DESCRIPTION
-----------
With no subcommand given, the 'perf check' command just prints the command
usage on the standard output.
If the subcommand 'feature' is used, then the status of the feature is printed
on the standard output (unless '-q' is also passed), i.e. whether it is
compiled-in/built-in or not.
Also, 'perf check feature' returns with exit status 0 if the feature
is built-in, otherwise returns with exit status 1.
SUBCOMMANDS
-----------
feature::
Print whether feature(s) is compiled-in or not, and also returns with an
exit status of 0, if passed feature(s) are compiled-in, else 1.
It expects a feature list as an argument. There can be a single feature
name/macro, or multiple features can also be passed as a comma-separated
list, in which case the exit status will be 0 only if all of the passed
features are compiled-in.
The feature names/macros are case-insensitive.
Example Usage:
perf check feature libtraceevent
perf check feature HAVE_LIBTRACEEVENT
perf check feature libtraceevent,bpf
Supported feature names/macro:
aio / HAVE_AIO_SUPPORT
bpf / HAVE_LIBBPF_SUPPORT
bpf_skeletons / HAVE_BPF_SKEL
debuginfod / HAVE_DEBUGINFOD_SUPPORT
dwarf / HAVE_DWARF_SUPPORT
dwarf_getlocations / HAVE_DWARF_GETLOCATIONS_SUPPORT
dwarf-unwind / HAVE_DWARF_UNWIND_SUPPORT
auxtrace / HAVE_AUXTRACE_SUPPORT
libaudit / HAVE_LIBAUDIT_SUPPORT
libbfd / HAVE_LIBBFD_SUPPORT
libcapstone / HAVE_LIBCAPSTONE_SUPPORT
libcrypto / HAVE_LIBCRYPTO_SUPPORT
libdw-dwarf-unwind / HAVE_DWARF_SUPPORT
libelf / HAVE_LIBELF_SUPPORT
libnuma / HAVE_LIBNUMA_SUPPORT
libopencsd / HAVE_CSTRACE_SUPPORT
libperl / HAVE_LIBPERL_SUPPORT
libpfm4 / HAVE_LIBPFM
libpython / HAVE_LIBPYTHON_SUPPORT
libslang / HAVE_SLANG_SUPPORT
libtraceevent / HAVE_LIBTRACEEVENT
libunwind / HAVE_LIBUNWIND_SUPPORT
lzma / HAVE_LZMA_SUPPORT
numa_num_possible_cpus / HAVE_LIBNUMA_SUPPORT
syscall_table / HAVE_SYSCALL_TABLE_SUPPORT
zlib / HAVE_ZLIB_SUPPORT
zstd / HAVE_ZSTD_SUPPORT
OPTIONS
-------
-q::
--quiet::
Do not print any messages or warnings
This can be used along with subcommands such as 'perf check feature'
to hide unnecessary output in test scripts, eg.
'perf check feature --quiet libtraceevent'
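
The exit-status rule described above (0 only if every listed feature is built in, with case-insensitive matching on either the name or the macro) can be sketched as follows. This is not perf's implementation: the table contents and the `built_in` values are made up for illustration, and `check_feature_list()` is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct feature {
	const char *name;
	const char *macro;
	int built_in;	/* illustrative values, not a real build */
};

static const struct feature features[] = {
	{ "libtraceevent", "HAVE_LIBTRACEEVENT",     1 },
	{ "bpf",           "HAVE_LIBBPF_SUPPORT",    1 },
	{ "libunwind",     "HAVE_LIBUNWIND_SUPPORT", 0 },
};

/* Case-insensitive compare of the token a[0..alen) against string b. */
static int name_eq(const char *a, size_t alen, const char *b)
{
	size_t i;

	if (strlen(b) != alen)
		return 0;
	for (i = 0; i < alen; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 0;
	return 1;
}

static int feature_is_builtin(const char *tok, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(features) / sizeof(features[0]); i++)
		if (name_eq(tok, len, features[i].name) ||
		    name_eq(tok, len, features[i].macro))
			return features[i].built_in;
	return 0;	/* unknown names count as "not built in" in this sketch */
}

/* Returns the exit status: 0 if all listed features are built in, else 1. */
static int check_feature_list(const char *list)
{
	int status = 0;

	if (!*list)
		return 1;	/* arbitrary choice for an empty list */
	while (*list) {
		size_t len = strcspn(list, ",");

		if (!feature_is_builtin(list, len))
			status = 1;
		list += len;
		if (*list == ',')
			list++;
	}
	return status;
}
```

In a test script the same check is what makes `perf check feature -q libtraceevent,bpf` usable as a guard before running a test that needs both features.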

View File

@ -9,7 +9,7 @@ perf-ftrace - simple wrapper for kernel's ftrace functionality
SYNOPSIS
--------
[verse]
'perf ftrace' {trace|latency} <command>
'perf ftrace' {trace|latency|profile} <command>
DESCRIPTION
-----------
@ -23,6 +23,9 @@ kernel's ftrace infrastructure.
'perf ftrace latency' calculates execution latency of a given function
(optionally with BPF) and display it as a histogram.
'perf ftrace profile' shows an execution profile for each function including
total, average, max time and the number of calls.
The following options apply to perf ftrace.
COMMON OPTIONS
@ -125,6 +128,7 @@ OPTIONS for 'perf ftrace trace'
- verbose - Show process names, PIDs, timestamps, etc.
- thresh=<n> - Setup trace duration threshold in microseconds.
- depth=<n> - Set max depth for function graph tracer to follow.
- tail - Print function name at the end.
OPTIONS for 'perf ftrace latency'
@ -145,6 +149,48 @@ OPTIONS for 'perf ftrace latency'
Use nano-second instead of micro-second as a base unit of the histogram.
OPTIONS for 'perf ftrace profile'
---------------------------------
-T::
--trace-funcs=::
Set function filter on the given function (or a glob pattern).
Multiple functions can be given by using this option more than once.
The function argument also can be a glob pattern. It will be passed
to 'set_ftrace_filter' in tracefs.
-N::
--notrace-funcs=::
Do not trace functions given by the argument. Like -T option, this
can be used more than once to specify multiple functions (or glob
patterns). It will be passed to 'set_ftrace_notrace' in tracefs.
-G::
--graph-funcs=::
Set graph filter on the given function (or a glob pattern). This is
useful to trace for functions executed from the given function. This
can be used more than once to specify multiple functions. It will be
passed to 'set_graph_function' in tracefs.
-g::
--nograph-funcs=::
Set graph notrace filter on the given function (or a glob pattern).
Like -G option, this is useful for the function_graph tracer only and
disables tracing for function executed from the given function. This
can be used more than once to specify multiple functions. It will be
passed to 'set_graph_notrace' in tracefs.
-m::
--buffer-size::
Set the size of per-cpu tracing buffer, <size> is expected to
be a number with appended unit character - B/K/M/G.
-s::
--sort=::
Sort the result by the given field. Available values are:
total, avg, max, count, name. Default is 'total'.
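
To make the reported columns concrete, here is a minimal sketch of how total, average, max and count relate for one function, plus a comparator for the default 'total' sort. The struct and function names are hypothetical and do not come from perf's sources.

```c
#include <stdlib.h>

struct profile_stat {
	const char *func;
	double total_us;	/* sum of all durations */
	double max_us;		/* longest single invocation */
	unsigned long count;	/* number of invocations */
};

/* Account one function-graph duration sample (in microseconds). */
static void profile_add(struct profile_stat *st, double us)
{
	st->total_us += us;
	if (us > st->max_us)
		st->max_us = us;
	st->count++;
}

/* The average is derived, not stored: total / count. */
static double profile_avg(const struct profile_stat *st)
{
	return st->count ? st->total_us / st->count : 0.0;
}

/* qsort() comparator: larger totals first. */
static int cmp_total_desc(const void *a, const void *b)
{
	const struct profile_stat *x = a, *y = b;

	return (x->total_us < y->total_us) - (x->total_us > y->total_us);
}

static void profile_sort_by_total(struct profile_stat *stats, size_t n)
{
	qsort(stats, n, sizeof(*stats), cmp_total_desc);
}
```

For instance, two samples of 11.264 us and 2.748 us give a total of 14.012 us, an average of 7.006 us, a max of 11.264 us and a count of 2.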
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-trace[1]

View File

@ -115,9 +115,9 @@ STAT LIVE OPTIONS
-m::
--mmap-pages=::
Number of mmap data pages (must be a power of two) or size
specification with appended unit character - B/K/M/G. The
size is rounded up to have nearest pages power of two value.
Number of mmap data pages (must be a power of two) or size
specification in bytes with appended unit character - B/K/M/G.
The size is rounded up to the nearest power-of-two page value.
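
The rounding rule can be sketched as below, under the assumption that a byte size is first rounded up to whole pages and the page count is then rounded up to a power of two. The function names are hypothetical and the page size is passed in rather than queried from the system.

```c
#include <limits.h>

/* Smallest power of two >= n (returns 1 for n == 0). */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	/* The second condition stops before the shift could overflow. */
	while (p < n && p <= ULONG_MAX / 2)
		p <<= 1;
	return p;
}

/* Convert a size in bytes to a power-of-two number of mmap data pages. */
static unsigned long mmap_pages_from_bytes(unsigned long bytes,
					   unsigned long page_size)
{
	unsigned long pages = (bytes + page_size - 1) / page_size;

	return roundup_pow2(pages);
}
```

With 4 KiB pages, a request for 12 KiB (3 pages) becomes 4 pages, while 1 MiB is already a power-of-two page count (256) and is left unchanged.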
-a::
--all-cpus::

View File

@ -72,6 +72,7 @@ counted. The following modifiers exist:
W - group is weak and will fallback to non-group if not schedulable,
e - group or event are exclusive and do not share the PMU
b - use BPF aggregration (see perf stat --bpf-counters)
R - retire latency value of the event
The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times:

View File

@ -28,15 +28,8 @@ and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide.
Due to the statistical nature of SPE sampling, not every memory operation will
be sampled.
OPTIONS
-------
<command>...::
Any command you can specify in a shell.
-i::
--input=<file>::
Input file name.
COMMON OPTIONS
--------------
-f::
--force::
Don't do ownership validation
@ -45,24 +38,9 @@ OPTIONS
--type=<type>::
Select the memory operation type: load or store (default: load,store)
-D::
--dump-raw-samples::
Dump the raw decoded samples on the screen in a format that is easy to parse with
one sample per line.
-x::
--field-separator=<separator>::
Specify the field separator used when dump raw samples (-D option). By default,
The separator is the space character.
-C::
--cpu=<cpu>::
Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. Default
is to monitor all CPUS.
-U::
--hide-unresolved::
Only display entries resolved to a symbol.
-v::
--verbose::
Be more verbose (show counter open errors, etc)
-p::
--phys-data::
@ -73,6 +51,9 @@ OPTIONS
RECORD OPTIONS
--------------
<command>...::
Any command you can specify in a shell.
-e::
--event <event>::
Event selector. Use 'perf mem record -e list' to list available events.
@ -85,14 +66,65 @@ RECORD OPTIONS
--all-user::
Configure all used events to run in user space.
-v::
--verbose::
Be more verbose (show counter open errors, etc)
--ldlat <n>::
Specify desired latency for loads event. Supported on Intel and Arm64
processors only. Ignored on other archs.
REPORT OPTIONS
--------------
-i::
--input=<file>::
Input file name.
-C::
--cpu=<cpu>::
Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -
like 0-2. Default is to monitor all CPUS.
-D::
--dump-raw-samples::
Dump the raw decoded samples on the screen in a format that is easy to parse with
one sample per line.
-s::
--sort=<key>::
Group result by given key(s) - multiple keys can be specified
in CSV format. The keys specific to memory samples are:
symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem, snoop,
dcacheline, phys_daddr, data_page_size, blocked.
- symbol_daddr: name of data symbol being executed on at the time of sample
- symbol_iaddr: name of code symbol being executed on at the time of sample
- dso_daddr: name of library or module containing the data being executed
on at the time of the sample
- locked: whether the bus was locked at the time of the sample
- tlb: type of tlb access for the data at the time of the sample
- mem: type of memory access for the data at the time of the sample
- snoop: type of snoop (if any) for the data at the time of the sample
- dcacheline: the cacheline the data address is on at the time of the sample
- phys_daddr: physical address of data being executed on at the time of sample
- data_page_size: the data page size of data being executed on at the time of sample
- blocked: reason of blocked load access for the data at the time of the sample
And the default sort keys are changed to local_weight, mem, sym, dso,
symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat.
-T::
--type-profile::
Show data-type profile result instead of code symbols. This requires
the debug information and it will change the default sort keys to:
mem, snoop, tlb, type.
-U::
--hide-unresolved::
Only display entries resolved to a symbol.
-x::
--field-separator=<separator>::
Specify the field separator used when dumping raw samples (-D option). By default,
the separator is the space character.
In addition, for report all perf report options are valid, and for record
all perf record options.

View File

@ -273,10 +273,11 @@ OPTIONS
-m::
--mmap-pages=::
Number of mmap data pages (must be a power of two) or size
specification with appended unit character - B/K/M/G. The
size is rounded up to have nearest pages power of two value.
Also, by adding a comma, the number of mmap pages for AUX
area tracing can be specified.
specification in bytes with appended unit character - B/K/M/G.
The size is rounded up to the nearest power-of-two page value.
By adding a comma, an additional parameter with the same
semantics used for the normal mmap areas can be specified for
AUX tracing area.
-g::
Enables call-graph (stack chain/backtrace) recording for both
@ -828,6 +829,11 @@ filtered through the mask provided by -C option.
only, as of now. So the applications built without the frame
pointer might see bogus addresses.
--setup-filter=<action>::
Prepare BPF filter to be used by regular users. The action should be
either "pin" or "unpin". The filter can be used after it's pinned.
include::intel-hybrid.txt[]
SEE ALSO

@ -614,6 +614,7 @@ include::itrace.txt[]
'Avg Cycles%' - block average sampled cycles / sum of total block average
sampled cycles
'Avg Cycles' - block average sampled cycles
'Branch Counter' - block branch counter histogram (with -v showing the number)
--skip-empty::
Do not print 0 results in the --stat output.

@ -212,6 +212,15 @@ OPTIONS for 'perf sched timehist'
--state::
Show task state when it switched out.
--show-prio::
Show task priority.
--prio::
Only show events for given task priority(ies). Multiple priorities can be
provided as a comma-separated list with no spaces: 0,120. Ranges of
priorities are specified with -: 120-129. A combination of both can also be
provided: 0,120-129.
OPTIONS for 'perf sched replay'
------------------------------
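
The --prio syntax described above (a comma-separated list whose items are either a single priority or a low-high range) could be parsed roughly as follows. This is a minimal sketch and not perf's own parser: the name `parse_prio_list`, the `MAX_PRIO` bound of 140 and the boolean-array representation are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PRIO 140 /* assumed upper bound; hypothetical for this sketch */

/*
 * Parse a list such as "0,120-129" into wanted[]; wanted[p] is set for
 * every selected priority p. Returns 0 on success, -1 on malformed input.
 */
static int parse_prio_list(const char *str, bool wanted[MAX_PRIO])
{
	const char *p = str;

	memset(wanted, 0, MAX_PRIO * sizeof(wanted[0]));

	while (*p) {
		char *end;
		long lo, hi;

		if (*p < '0' || *p > '9')
			return -1;	/* spaces and signs are rejected */
		lo = strtol(p, &end, 10);
		hi = lo;
		if (*end == '-') {	/* range item: lo-hi */
			p = end + 1;
			if (*p < '0' || *p > '9')
				return -1;
			hi = strtol(p, &end, 10);
		}
		if (lo > hi || hi >= MAX_PRIO)
			return -1;
		for (long i = lo; i <= hi; i++)
			wanted[i] = true;

		if (*end == ',') {
			if (end[1] == '\0')
				return -1;	/* trailing comma */
			p = end + 1;
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;
		}
	}
	return 0;
}
```

With the input "0,120-129" the sketch selects priority 0 and priorities 120 through 129; an item containing a space, such as "0, 120", is rejected, matching the "no spaces" wording of the option text.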

@ -134,7 +134,7 @@ OPTIONS
srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output,
brstackinsn, brstackinsnlen, brstackdisasm, brstackoff, callindent, insn, disasm,
insnlen, synth, phys_addr, metric, misc, srccode, ipc, data_page_size,
code_page_size, ins_lat, machine_pid, vcpu, cgroup, retire_lat,
code_page_size, ins_lat, machine_pid, vcpu, cgroup, retire_lat, brcntr,
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
@ -369,6 +369,9 @@ OPTIONS
--demangle-kernel::
Demangle kernel symbol names to human readable form (for C++ kernels).
--addr2line=<path>::
Path to addr2line binary.
--header
Show perf.data header.

@ -498,6 +498,14 @@ To interpret the results it is usually needed to know on which
CPUs the workload runs on. If needed the CPUs can be forced using
taskset.
--record-tpebs::
Enable automatic sampling on Intel TPEBS retire_latency events (events with the :R
modifier). Without this option, perf does not capture dynamic retire_latency
at runtime; a zero value is currently assigned to the retire_latency event when
this option is not set. The TPEBS hardware feature is available starting with the
Intel Granite Rapids microarchitecture. This option exists only on x86_64 and is
meaningful only on Intel platforms with the TPEBS feature.
--td-level::
Print the top-down statistics that equal the input level. It allows
users to print the interested top-down metrics level instead of the

@ -83,8 +83,8 @@ Default is to monitor all CPUS.
-m <pages>::
--mmap-pages=<pages>::
Number of mmap data pages (must be a power of two) or size
specification with appended unit character - B/K/M/G. The
size is rounded up to have nearest pages power of two value.
specification in bytes with appended unit character - B/K/M/G.
The size is rounded up to the nearest power-of-two page value.
-p <pid>::
--pid=<pid>::
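
The rounding rule in the reworded --mmap-pages text (a size in bytes is converted to pages and then rounded up to the nearest power of two) can be illustrated with a small sketch. The helper names `roundup_pow2` and `size_to_mmap_pages` are hypothetical and the page size is passed in as a parameter; this is not perf's argument parser.

```c
#include <stdint.h>

/* Round v up to the next power of two (v == 0 yields 1). */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	if (v > (UINT64_C(1) << 63))
		return 0;	/* not representable in 64 bits */
	while (p < v)
		p <<= 1;
	return p;
}

/* Hypothetical helper: byte size -> power-of-two number of pages */
static uint64_t size_to_mmap_pages(uint64_t bytes, uint64_t page_size)
{
	/* Ceiling division: a partial page still needs a whole page */
	uint64_t pages = (bytes + page_size - 1) / page_size;

	return roundup_pow2(pages);
}
```

For example, with 4 KiB pages a 3M request covers 768 pages, which the sketch rounds up to 1024 pages.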

@ -106,8 +106,8 @@ filter out the startup phase of the program, which is often very different.
-m::
--mmap-pages=::
Number of mmap data pages (must be a power of two) or size
specification with appended unit character - B/K/M/G. The
size is rounded up to have nearest pages power of two value.
specification in bytes with appended unit character - B/K/M/G.
The size is rounded up to the nearest power-of-two page value.
-C::
--cpu::

@ -325,6 +325,36 @@ other four level 2 metrics by subtracting corresponding metrics as below.
Fetch_Bandwidth = Frontend_Bound - Fetch_Latency
Core_Bound = Backend_Bound - Memory_Bound
TPEBS in TopDown
================
TPEBS (Timed PEBS) is a new Intel PMU feature available starting with the Granite
Rapids microarchitecture. It adds a 16-bit retire_latency field to the Basic Info
group of the PEBS record. The field records the core cycles from the retirement of
the previous instruction to the retirement of the current instruction.
Please refer to Section 8.4.1 of "Intel® Architecture Instruction Set Extensions
Programming Reference" for more details about this feature. Because this feature
extends the PEBS record, sampling with the weight option is required to get the
retire_latency value.
perf record -e event_name -W ...
In the most recent release of TMA, some metric formulas use event retire_latency
values on processors that support the TPEBS feature.
For previous generations that do not support TPEBS, the values are static and
predefined per processor family by the hardware architects. Because workloads and
execution environments vary, retire_latency values measured at run time are more
accurate. Therefore, new TMA metrics that use TPEBS provide more accurate
performance analysis results.
To support TPEBS in TMA metrics, a new :R event modifier is added. Perf captures
the retire_latency value of the required events (events with :R in the metric
formula) with perf record, and uses that value in the metric calculation.
Currently, this feature is supported through perf stat:
perf stat -M metric_name --record-tpebs ...
[1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
[2] https://sites.google.com/site/analysismethods/yasin-pubs

@ -51,8 +51,14 @@ else
override DEBUG = 0
endif
ifeq ($(JOBS),1)
BUILD_TYPE := sequential
else
BUILD_TYPE := parallel
endif
define print_msg
@printf ' BUILD: Doing '\''make \033[33m-j'$(JOBS)'\033[m'\'' parallel build\n'
@printf ' BUILD: Doing '\''make \033[33m-j'$(JOBS)'\033[m'\'' $(BUILD_TYPE) build\n'
endef
define make

@ -31,14 +31,8 @@ $(call detected_var,SRCARCH)
ifneq ($(NO_SYSCALL_TABLE),1)
NO_SYSCALL_TABLE := 1
ifeq ($(SRCARCH),x86)
ifeq (${IS_64_BIT}, 1)
NO_SYSCALL_TABLE := 0
endif
else
ifeq ($(SRCARCH),$(filter $(SRCARCH),powerpc arm64 s390 mips loongarch))
NO_SYSCALL_TABLE := 0
endif
ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 powerpc arm64 s390 mips loongarch))
NO_SYSCALL_TABLE := 0
endif
ifneq ($(NO_SYSCALL_TABLE),1)
@ -55,8 +49,9 @@ endif
# Additional ARCH settings for x86
ifeq ($(SRCARCH),x86)
$(call detected,CONFIG_X86)
CFLAGS += -I$(OUTPUT)arch/x86/include/generated
ifeq (${IS_64_BIT}, 1)
CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -I$(OUTPUT)arch/x86/include/generated
CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT
ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
LIBUNWIND_LIBS = -lunwind-x86_64 -lunwind -llzma
$(call detected,CONFIG_X86_64)
@ -238,11 +233,7 @@ endif
ifeq ($(DEBUG),0)
CORE_CFLAGS += -DNDEBUG=1
ifeq ($(CC_NO_CLANG), 0)
CORE_CFLAGS += -O3
else
CORE_CFLAGS += -O6
endif
CORE_CFLAGS += -O3
else
CORE_CFLAGS += -g
CXXFLAGS += -g
@ -710,8 +701,8 @@ ifeq ($(BUILD_BPF_SKEL),1)
BUILD_BPF_SKEL := 0
else
CLANG_VERSION := $(shell $(CLANG) --version | head -1 | sed 's/.*clang version \([[:digit:]]\+.[[:digit:]]\+.[[:digit:]]\+\).*/\1/g')
ifeq ($(call version-lt3,$(CLANG_VERSION),12.0.1),1)
$(warning Warning: Disabled BPF skeletons as reliable BTF generation needs at least $(CLANG) version 12.0.1)
ifeq ($(call version-lt3,$(CLANG_VERSION),16.0.6),1)
$(warning Warning: Disabled BPF skeletons as at least $(CLANG) version 16.0.6 is reported to be a working setup with the current set of BPF based perf features)
BUILD_BPF_SKEL := 0
endif
endif
@ -985,6 +976,23 @@ ifdef BUILD_NONDISTRO
endif
endif
ifndef NO_LIBLLVM
$(call feature_check,llvm-perf)
ifeq ($(feature-llvm-perf), 1)
CFLAGS += -DHAVE_LIBLLVM_SUPPORT
CFLAGS += $(shell $(LLVM_CONFIG) --cflags)
CXXFLAGS += -DHAVE_LIBLLVM_SUPPORT
CXXFLAGS += $(shell $(LLVM_CONFIG) --cxxflags)
LIBLLVM = $(shell $(LLVM_CONFIG) --libs all) $(shell $(LLVM_CONFIG) --system-libs)
EXTLIBS += -L$(shell $(LLVM_CONFIG) --libdir) $(LIBLLVM)
EXTLIBS += -lstdc++
$(call detected,CONFIG_LIBLLVM)
else
$(warning No libllvm 13+ found, slower source file resolution, please install llvm-devel/llvm-dev)
NO_LIBLLVM := 1
endif
endif
ifndef NO_DEMANGLE
$(call feature_check,cxa-demangle)
ifeq ($(feature-cxa-demangle), 1)
@ -1031,17 +1039,6 @@ ifndef NO_LIBZSTD
endif
endif
ifndef NO_LIBCAP
ifeq ($(feature-libcap), 1)
CFLAGS += -DHAVE_LIBCAP_SUPPORT
EXTLIBS += -lcap
$(call detected,CONFIG_LIBCAP)
else
$(warning No libcap found, disables capability support, please install libcap-devel/libcap-dev)
NO_LIBCAP := 1
endif
endif
ifndef NO_BACKTRACE
ifeq ($(feature-backtrace), 1)
CFLAGS += -DHAVE_BACKTRACE_SUPPORT

@ -163,6 +163,8 @@ ifneq ($(OUTPUT),)
# for flex/bison parsers.
VPATH += $(OUTPUT)
export VPATH
# create symlink to the original source
SOURCE := $(shell ln -sf $(srctree)/tools/perf $(OUTPUT)/source)
endif
ifeq ($(V),1)
@ -1140,6 +1142,8 @@ install-tests: all install-gtk
$(INSTALL) tests/shell/common/*.pl '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/common'; \
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/base_probe'; \
$(INSTALL) tests/shell/base_probe/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/base_probe'; \
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/base_report'; \
$(INSTALL) tests/shell/base_report/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/base_report'; \
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/coresight' ; \
$(INSTALL) tests/shell/coresight/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/coresight'
$(Q)$(MAKE) -C tests/shell/coresight install-tests
@ -1277,6 +1281,8 @@ clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean $(
$(OUTPUT)util/intel-pt-decoder/inat-tables.c \
$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c \
$(OUTPUT)pmu-events/pmu-events.c \
$(OUTPUT)pmu-events/test-empty-pmu-events.c \
$(OUTPUT)pmu-events/empty-pmu-events.log \
$(OUTPUT)pmu-events/metric_test.log \
$(OUTPUT)$(fadvise_advice_array) \
$(OUTPUT)$(fsconfig_arrays) \

@ -643,7 +643,8 @@ static bool cs_etm_is_ete(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu)
static __u64 cs_etm_get_legacy_trace_id(struct perf_cpu cpu)
{
return CORESIGHT_LEGACY_CPU_TRACE_ID(cpu.cpu);
/* Wrap at 48 so that invalid trace IDs aren't saved into files. */
return CORESIGHT_LEGACY_CPU_TRACE_ID(cpu.cpu % 48);
}
static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, struct perf_cpu cpu)
@ -654,8 +655,7 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr,
/* Get trace configuration register */
data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr);
/* traceID set to legacy version, in case new perf running on older system */
data[CS_ETMV4_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) |
CORESIGHT_TRACE_ID_UNUSED_FLAG;
data[CS_ETMV4_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu);
/* Get read-only information from sysFS */
cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0],
@ -687,7 +687,7 @@ static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, st
/* Get trace configuration register */
data[CS_ETE_TRCCONFIGR] = cs_etmv4_get_config(itr);
/* traceID set to legacy version, in case new perf running on older system */
data[CS_ETE_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG;
data[CS_ETE_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu);
/* Get read-only information from sysFS */
cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR0], &data[CS_ETE_TRCIDR0]);
@ -743,8 +743,7 @@ static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset,
/* Get configuration register */
info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr);
/* traceID set to legacy value in case new perf running on old system */
info->priv[*offset + CS_ETM_ETMTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) |
CORESIGHT_TRACE_ID_UNUSED_FLAG;
info->priv[*offset + CS_ETM_ETMTRACEIDR] = cs_etm_get_legacy_trace_id(cpu);
/* Get read-only information from sysFS */
cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv3_ro[CS_ETM_ETMCCER],
&info->priv[*offset + CS_ETM_ETMCCER]);
@ -888,7 +887,6 @@ struct auxtrace_record *cs_etm_record_init(int *err)
}
ptr->cs_etm_pmu = cs_etm_pmu;
ptr->itr.pmu = cs_etm_pmu;
ptr->itr.parse_snapshot_options = cs_etm_parse_snapshot_options;
ptr->itr.recording_options = cs_etm_recording_options;
ptr->itr.info_priv_size = cs_etm_info_priv_size;
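
The `cpu.cpu % 48` wrap in cs_etm_get_legacy_trace_id() above can be checked with a little arithmetic. The sketch below assumes the legacy mapping is 0x10 + 2 * cpu and that valid trace IDs lie strictly between 0x00 and 0x70; both values are restated locally as assumptions rather than included from the kernel header, and the helper names are hypothetical.

```c
#include <stdbool.h>

/* Assumed constants, restated locally for this sketch */
#define LEGACY_TRACE_ID(cpu)	(0x10 + ((cpu) * 2))
#define TRACE_ID_RES_0		0x00
#define TRACE_ID_RES_TOP	0x70

/* An ID is usable only if it lies strictly between the reserved bounds */
static bool trace_id_is_valid(int id)
{
	return id > TRACE_ID_RES_0 && id < TRACE_ID_RES_TOP;
}

/* Mirrors the wrapped lookup shown in the hunk above */
static int legacy_trace_id_wrapped(int cpu)
{
	return LEGACY_TRACE_ID(cpu % 48);
}
```

Under these assumptions CPU 47 maps to 0x6e, the last valid even ID, while an unwrapped CPU 48 would map to the reserved value 0x70. Wrapping at 48 therefore keeps every stored legacy ID inside the valid range, at the cost of IDs repeating on systems with more than 48 CPUs.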

@ -23,16 +23,19 @@ void perf_pmu__arch_init(struct perf_pmu *pmu)
#ifdef HAVE_AUXTRACE_SUPPORT
if (!strcmp(pmu->name, CORESIGHT_ETM_PMU_NAME)) {
/* add ETM default config here */
pmu->auxtrace = true;
pmu->selectable = true;
pmu->perf_event_attr_init_default = cs_etm_get_default_config;
#if defined(__aarch64__)
} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
pmu->auxtrace = true;
pmu->selectable = true;
pmu->is_uncore = false;
pmu->perf_event_attr_init_default = arm_spe_pmu_default_config;
if (strstarts(pmu->name, "arm_spe_"))
pmu->mem_events = perf_mem_events_arm;
} else if (strstarts(pmu->name, HISI_PTT_PMU_NAME)) {
pmu->auxtrace = true;
pmu->selectable = true;
#endif
}

@ -11,7 +11,8 @@ struct arm64_annotate {
static int arm64_mov__parse(struct arch *arch __maybe_unused,
struct ins_operands *ops,
struct map_symbol *ms __maybe_unused)
struct map_symbol *ms __maybe_unused,
struct disasm_line *dl __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *endptr;

@ -8,6 +8,7 @@
#include <linux/types.h>
#include <linux/bitops.h>
#include <linux/log2.h>
#include <linux/string.h>
#include <linux/zalloc.h>
#include <time.h>
@ -132,32 +133,66 @@ static __u64 arm_spe_pmu__sample_period(const struct perf_pmu *arm_spe_pmu)
return sample_period;
}
static void arm_spe_setup_evsel(struct evsel *evsel, struct perf_cpu_map *cpus)
{
u64 bit;
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = arm_spe_pmu__sample_period(evsel->pmu);
evsel->needs_auxtrace_mmap = true;
/*
* To obtain the auxtrace buffer file descriptor, the auxtrace event
* must come first.
*/
evlist__to_front(evsel->evlist, evsel);
/*
* In the case of per-cpu mmaps, sample CPU for AUX event;
* also enable the timestamp tracing for samples correlation.
*/
if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
evsel__set_sample_bit(evsel, CPU);
evsel__set_config_if_unset(evsel->pmu, evsel, "ts_enable", 1);
}
/*
* Set this only so that perf report knows that SPE generates memory info. It has no effect
* on the opening of the event or the SPE data produced.
*/
evsel__set_sample_bit(evsel, DATA_SRC);
/*
* The PHYS_ADDR flag does not affect the driver behaviour, it is used to
* inform that the resulting output's SPE samples contain physical addresses
* where applicable.
*/
bit = perf_pmu__format_bits(evsel->pmu, "pa_enable");
if (evsel->core.attr.config & bit)
evsel__set_sample_bit(evsel, PHYS_ADDR);
}
static int arm_spe_recording_options(struct auxtrace_record *itr,
struct evlist *evlist,
struct record_opts *opts)
{
struct arm_spe_recording *sper =
container_of(itr, struct arm_spe_recording, itr);
struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
struct evsel *evsel, *arm_spe_evsel = NULL;
struct evsel *evsel, *tmp;
struct perf_cpu_map *cpus = evlist->core.user_requested_cpus;
bool privileged = perf_event_paranoid_check(-1);
struct evsel *tracking_evsel;
int err;
u64 bit;
sper->evlist = evlist;
evlist__for_each_entry(evlist, evsel) {
if (evsel->core.attr.type == arm_spe_pmu->type) {
if (arm_spe_evsel) {
pr_err("There may be only one " ARM_SPE_PMU_NAME "x event\n");
if (evsel__is_aux_event(evsel)) {
if (!strstarts(evsel->pmu_name, ARM_SPE_PMU_NAME)) {
pr_err("Found unexpected auxtrace event: %s\n",
evsel->pmu_name);
return -EINVAL;
}
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = arm_spe_pmu__sample_period(arm_spe_pmu);
evsel->needs_auxtrace_mmap = true;
arm_spe_evsel = evsel;
opts->full_auxtrace = true;
}
}
@ -222,37 +257,11 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
pr_debug2("%sx snapshot size: %zu\n", ARM_SPE_PMU_NAME,
opts->auxtrace_snapshot_size);
/*
* To obtain the auxtrace buffer file descriptor, the auxtrace event
* must come first.
*/
evlist__to_front(evlist, arm_spe_evsel);
/*
* In the case of per-cpu mmaps, sample CPU for AUX event;
* also enable the timestamp tracing for samples correlation.
*/
if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) {
evsel__set_sample_bit(arm_spe_evsel, CPU);
evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
"ts_enable", 1);
evlist__for_each_entry_safe(evlist, tmp, evsel) {
if (evsel__is_aux_event(evsel))
arm_spe_setup_evsel(evsel, cpus);
}
/*
* Set this only so that perf report knows that SPE generates memory info. It has no effect
* on the opening of the event or the SPE data produced.
*/
evsel__set_sample_bit(arm_spe_evsel, DATA_SRC);
/*
* The PHYS_ADDR flag does not affect the driver behaviour, it is used to
* inform that the resulting output's SPE samples contain physical addresses
* where applicable.
*/
bit = perf_pmu__format_bits(arm_spe_pmu, "pa_enable");
if (arm_spe_evsel->core.attr.config & bit)
evsel__set_sample_bit(arm_spe_evsel, PHYS_ADDR);
/* Add dummy event to keep tracking */
err = parse_event(evlist, "dummy:u");
if (err)
@ -301,12 +310,16 @@ static int arm_spe_snapshot_start(struct auxtrace_record *itr)
struct arm_spe_recording *ptr =
container_of(itr, struct arm_spe_recording, itr);
struct evsel *evsel;
int ret = -EINVAL;
evlist__for_each_entry(ptr->evlist, evsel) {
if (evsel->core.attr.type == ptr->arm_spe_pmu->type)
return evsel__disable(evsel);
if (evsel__is_aux_event(evsel)) {
ret = evsel__disable(evsel);
if (ret < 0)
return ret;
}
}
return -EINVAL;
return ret;
}
static int arm_spe_snapshot_finish(struct auxtrace_record *itr)
@ -314,12 +327,16 @@ static int arm_spe_snapshot_finish(struct auxtrace_record *itr)
struct arm_spe_recording *ptr =
container_of(itr, struct arm_spe_recording, itr);
struct evsel *evsel;
int ret = -EINVAL;
evlist__for_each_entry(ptr->evlist, evsel) {
if (evsel->core.attr.type == ptr->arm_spe_pmu->type)
return evsel__enable(evsel);
if (evsel__is_aux_event(evsel)) {
ret = evsel__enable(evsel);
if (ret < 0)
return ret;
}
}
return -EINVAL;
return ret;
}
static int arm_spe_alloc_wrapped_array(struct arm_spe_recording *ptr, int idx)
@ -497,7 +514,6 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
}
sper->arm_spe_pmu = arm_spe_pmu;
sper->itr.pmu = arm_spe_pmu;
sper->itr.snapshot_start = arm_spe_snapshot_start;
sper->itr.snapshot_finish = arm_spe_snapshot_finish;
sper->itr.find_snapshot = arm_spe_find_snapshot;
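
The reworked snapshot callbacks above apply an operation to every AUX event, stop at the first failure, and return -EINVAL only when no AUX event exists at all. The control flow can be sketched with stand-in types; `struct fake_evsel` and `for_each_aux` are hypothetical and do not use perf's evlist API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct evsel; hypothetical, for illustration only */
struct fake_evsel {
	bool is_aux;
	int  op_result;	/* what the enable/disable call would return */
	bool touched;	/* set when the operation was attempted */
};

static int for_each_aux(struct fake_evsel *evsels, size_t n)
{
	int ret = -EINVAL;	/* stays -EINVAL if no AUX event is found */

	for (size_t i = 0; i < n; i++) {
		if (!evsels[i].is_aux)
			continue;
		evsels[i].touched = true;
		ret = evsels[i].op_result;
		if (ret < 0)
			return ret;	/* first failure aborts the walk */
	}
	return ret;
}
```

Compared with the old code, which returned after the first matching event, this shape handles several SPE events in one evlist while keeping the error for the "no AUX event" case.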

@ -174,7 +174,6 @@ struct auxtrace_record *hisi_ptt_recording_init(int *err,
}
pttr->hisi_ptt_pmu = hisi_ptt_pmu;
pttr->itr.pmu = hisi_ptt_pmu;
pttr->itr.recording_options = hisi_ptt_recording_options;
pttr->itr.info_priv_size = hisi_ptt_info_priv_size;
pttr->itr.info_fill = hisi_ptt_info_fill;

@ -5,7 +5,8 @@
* Copyright (C) 2020-2023 Loongson Technology Corporation Limited
*/
static int loongarch_call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
static int loongarch_call__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
struct disasm_line *dl __maybe_unused)
{
char *c, *endptr, *tok, *name;
struct map *map = ms->map;
@ -51,7 +52,8 @@ static struct ins_ops loongarch_call_ops = {
.scnprintf = call__scnprintf,
};
static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms)
static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms,
struct disasm_line *dl __maybe_unused)
{
struct map *map = ms->map;
struct symbol *sym = ms->sym;

@ -49,12 +49,266 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
}
#define PPC_OP(op) (((op) >> 26) & 0x3F)
#define PPC_21_30(R) (((R) >> 1) & 0x3ff)
#define PPC_22_30(R) (((R) >> 1) & 0x1ff)
struct insn_offset {
const char *name;
int value;
};
/*
* There are memory instructions with opcode 31 which are
* of X Form, Example:
* ldx RT,RA,RB
* ______________________________________
* | 31 | RT | RA | RB | 21 |/|
* --------------------------------------
* 0 6 11 16 21 30 31
*
* But not all instructions with opcode 31 are memory instructions.
* Example: add RT,RA,RB
*
* Use bits 21 to 30 to check memory insns with 31 as opcode.
* In ins_array below, for ldx instruction:
* name => OP_31_XOP_LDX
* value => 21
*/
static struct insn_offset ins_array[] = {
{ .name = "OP_31_XOP_LXSIWZX", .value = 12, },
{ .name = "OP_31_XOP_LWARX", .value = 20, },
{ .name = "OP_31_XOP_LDX", .value = 21, },
{ .name = "OP_31_XOP_LWZX", .value = 23, },
{ .name = "OP_31_XOP_LDUX", .value = 53, },
{ .name = "OP_31_XOP_LWZUX", .value = 55, },
{ .name = "OP_31_XOP_LXSIWAX", .value = 76, },
{ .name = "OP_31_XOP_LDARX", .value = 84, },
{ .name = "OP_31_XOP_LBZX", .value = 87, },
{ .name = "OP_31_XOP_LVX", .value = 103, },
{ .name = "OP_31_XOP_LBZUX", .value = 119, },
{ .name = "OP_31_XOP_STXSIWX", .value = 140, },
{ .name = "OP_31_XOP_STDX", .value = 149, },
{ .name = "OP_31_XOP_STWX", .value = 151, },
{ .name = "OP_31_XOP_STDUX", .value = 181, },
{ .name = "OP_31_XOP_STWUX", .value = 183, },
{ .name = "OP_31_XOP_STBX", .value = 215, },
{ .name = "OP_31_XOP_STVX", .value = 231, },
{ .name = "OP_31_XOP_STBUX", .value = 247, },
{ .name = "OP_31_XOP_LHZX", .value = 279, },
{ .name = "OP_31_XOP_LHZUX", .value = 311, },
{ .name = "OP_31_XOP_LXVDSX", .value = 332, },
{ .name = "OP_31_XOP_LWAX", .value = 341, },
{ .name = "OP_31_XOP_LHAX", .value = 343, },
{ .name = "OP_31_XOP_LWAUX", .value = 373, },
{ .name = "OP_31_XOP_LHAUX", .value = 375, },
{ .name = "OP_31_XOP_STHX", .value = 407, },
{ .name = "OP_31_XOP_STHUX", .value = 439, },
{ .name = "OP_31_XOP_LXSSPX", .value = 524, },
{ .name = "OP_31_XOP_LDBRX", .value = 532, },
{ .name = "OP_31_XOP_LSWX", .value = 533, },
{ .name = "OP_31_XOP_LWBRX", .value = 534, },
{ .name = "OP_31_XOP_LFSUX", .value = 567, },
{ .name = "OP_31_XOP_LXSDX", .value = 588, },
{ .name = "OP_31_XOP_LSWI", .value = 597, },
{ .name = "OP_31_XOP_LFDX", .value = 599, },
{ .name = "OP_31_XOP_LFDUX", .value = 631, },
{ .name = "OP_31_XOP_STXSSPX", .value = 652, },
{ .name = "OP_31_XOP_STDBRX", .value = 660, },
{ .name = "OP_31_XOP_STXWX", .value = 661, },
{ .name = "OP_31_XOP_STWBRX", .value = 662, },
{ .name = "OP_31_XOP_STFSX", .value = 663, },
{ .name = "OP_31_XOP_STFSUX", .value = 695, },
{ .name = "OP_31_XOP_STXSDX", .value = 716, },
{ .name = "OP_31_XOP_STSWI", .value = 725, },
{ .name = "OP_31_XOP_STFDX", .value = 727, },
{ .name = "OP_31_XOP_STFDUX", .value = 759, },
{ .name = "OP_31_XOP_LXVW4X", .value = 780, },
{ .name = "OP_31_XOP_LHBRX", .value = 790, },
{ .name = "OP_31_XOP_LXVD2X", .value = 844, },
{ .name = "OP_31_XOP_LFIWAX", .value = 855, },
{ .name = "OP_31_XOP_LFIWZX", .value = 887, },
{ .name = "OP_31_XOP_STXVW4X", .value = 908, },
{ .name = "OP_31_XOP_STHBRX", .value = 918, },
{ .name = "OP_31_XOP_STXVD2X", .value = 972, },
{ .name = "OP_31_XOP_STFIWX", .value = 983, },
};
/*
* Arithmetic instructions with opcode 31.
* These instructions are tracked to record register state
* changes. Example:
*
* lwz r10,264(r3)
* add r31, r3, r3
* lwz r9, 0(r31)
*
* Here instruction tracking needs to identify the "add"
* instruction and copy the data type of r3 to r31. If a sample
* hits the following "lwz r9, 0(r31)", this tracking allows the
* data type of r31 to be resolved.
*/
static struct insn_offset arithmetic_ins_op_31[] = {
{ .name = "SUB_CARRY_XO_FORM", .value = 8, },
{ .name = "MUL_HDW_XO_FORM1", .value = 9, },
{ .name = "ADD_CARRY_XO_FORM", .value = 10, },
{ .name = "MUL_HW_XO_FORM1", .value = 11, },
{ .name = "SUB_XO_FORM", .value = 40, },
{ .name = "MUL_HDW_XO_FORM", .value = 73, },
{ .name = "MUL_HW_XO_FORM", .value = 75, },
{ .name = "SUB_EXT_XO_FORM", .value = 136, },
{ .name = "ADD_EXT_XO_FORM", .value = 138, },
{ .name = "SUB_ZERO_EXT_XO_FORM", .value = 200, },
{ .name = "ADD_ZERO_EXT_XO_FORM", .value = 202, },
{ .name = "SUB_EXT_XO_FORM2", .value = 232, },
{ .name = "MUL_DW_XO_FORM", .value = 233, },
{ .name = "ADD_EXT_XO_FORM2", .value = 234, },
{ .name = "MUL_W_XO_FORM", .value = 235, },
{ .name = "ADD_XO_FORM", .value = 266, },
{ .name = "DIV_DW_XO_FORM1", .value = 457, },
{ .name = "DIV_W_XO_FORM1", .value = 459, },
{ .name = "DIV_DW_XO_FORM", .value = 489, },
{ .name = "DIV_W_XO_FORM", .value = 491, },
};
static struct insn_offset arithmetic_two_ops[] = {
{ .name = "mulli", .value = 7, },
{ .name = "subfic", .value = 8, },
{ .name = "addic", .value = 12, },
{ .name = "addic.", .value = 13, },
{ .name = "addi", .value = 14, },
{ .name = "addis", .value = 15, },
};
static int cmp_offset(const void *a, const void *b)
{
const struct insn_offset *val1 = a;
const struct insn_offset *val2 = b;
return (val1->value - val2->value);
}
static struct ins_ops *check_ppc_insn(struct disasm_line *dl)
{
int raw_insn = dl->raw.raw_insn;
int opcode = PPC_OP(raw_insn);
int mem_insn_31 = PPC_21_30(raw_insn);
struct insn_offset *ret;
struct insn_offset mem_insns_31_opcode = {
"OP_31_INSN",
mem_insn_31
};
char name_insn[32];
/*
* Instructions with opcode 32 to 63 are memory
* instructions in powerpc
*/
if ((opcode & 0x20)) {
/*
* For a raw instruction without a name, set the name
* to the opcode so that it can be used in insn-stat
*/
if (!strlen(dl->ins.name)) {
sprintf(name_insn, "%d", opcode);
dl->ins.name = strdup(name_insn);
}
return &load_store_ops;
} else if (opcode == 31) {
/* Check for memory instructions with opcode 31 */
ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
if (ret) {
if (!strlen(dl->ins.name))
dl->ins.name = strdup(ret->name);
return &load_store_ops;
} else {
mem_insns_31_opcode.value = PPC_22_30(raw_insn);
ret = bsearch(&mem_insns_31_opcode, arithmetic_ins_op_31, ARRAY_SIZE(arithmetic_ins_op_31),
sizeof(arithmetic_ins_op_31[0]), cmp_offset);
if (ret != NULL)
return &arithmetic_ops;
/* Bits 21 to 30 have the value 444 for the "mr" insn, i.e. OR X-form */
if (PPC_21_30(raw_insn) == 444)
return &arithmetic_ops;
}
} else {
mem_insns_31_opcode.value = opcode;
ret = bsearch(&mem_insns_31_opcode, arithmetic_two_ops, ARRAY_SIZE(arithmetic_two_ops),
sizeof(arithmetic_two_ops[0]), cmp_offset);
if (ret != NULL)
return &arithmetic_ops;
}
return NULL;
}
/*
* Instruction tracking function to track register state moves.
* Example sequence:
* ld r10,264(r3)
* mr r31,r3
* <after some sequence>
* ld r9,312(r31)
*
* The sequence above shows that the register state of r3
* is moved to r31. update_insn_state_powerpc tracks these
* state changes.
*/
#ifdef HAVE_DWARF_SUPPORT
static void update_insn_state_powerpc(struct type_state *state,
struct data_loc_info *dloc, Dwarf_Die * cu_die __maybe_unused,
struct disasm_line *dl)
{
struct annotated_insn_loc loc;
struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
struct type_state_reg *tsr;
u32 insn_offset = dl->al.offset;
if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
return;
/*
* Value 444 in bits 21:30 identifies the "mr"
* instruction. "mr" is an extended OR, so swap the
* source and destination registers to set them correctly
*/
if (PPC_21_30(dl->raw.raw_insn) == 444) {
int src_reg = src->reg1;
src->reg1 = dst->reg1;
dst->reg1 = src_reg;
}
if (!has_reg_type(state, dst->reg1))
return;
tsr = &state->regs[dst->reg1];
if (!has_reg_type(state, src->reg1) ||
!state->regs[src->reg1].ok) {
tsr->ok = false;
return;
}
tsr->type = state->regs[src->reg1].type;
tsr->kind = state->regs[src->reg1].kind;
tsr->ok = true;
pr_debug_dtp("mov [%x] reg%d -> reg%d",
insn_offset, src->reg1, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
#endif /* HAVE_DWARF_SUPPORT */
static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
{
if (!arch->initialized) {
arch->initialized = true;
arch->associate_instruction_ops = powerpc__associate_instruction_ops;
arch->objdump.comment_char = '#';
annotate_opts.show_asm_raw = true;
}
return 0;
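
To make the opcode-31 lookup in check_ppc_insn() above concrete, the sketch below decodes one raw instruction word and searches a shortened copy of the table with bsearch(). The excerpted table `mem_xop` and the helper `lookup_mem_xop` are hypothetical names for this illustration; the macros and the comparison function follow the hunk above. The example words 0x7C64282A (`ldx r3,r4,r5`) and 0x7C642A14 (`add r3,r4,r5`) were assembled by hand from the field layout in the comment above, so treat them as assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_21_30(R)	(((R) >> 1) & 0x3ff)

struct insn_offset {
	const char *name;
	int value;
};

/* Excerpt of the table above; must stay sorted by .value for bsearch() */
static const struct insn_offset mem_xop[] = {
	{ "OP_31_XOP_LWARX", 20 },
	{ "OP_31_XOP_LDX",   21 },
	{ "OP_31_XOP_LWZX",  23 },
	{ "OP_31_XOP_STDX", 149 },
};

static int cmp_offset(const void *a, const void *b)
{
	const struct insn_offset *x = a, *y = b;

	return x->value - y->value;
}

/* Returns the table name for an opcode-31 memory insn, or NULL */
static const char *lookup_mem_xop(uint32_t raw_insn)
{
	struct insn_offset key = { "key", (int)PPC_21_30(raw_insn) };
	const struct insn_offset *hit;

	if (PPC_OP(raw_insn) != 31)
		return NULL;
	hit = bsearch(&key, mem_xop, sizeof(mem_xop) / sizeof(mem_xop[0]),
		      sizeof(mem_xop[0]), cmp_offset);
	return hit ? hit->name : NULL;
}
```

Because bsearch() requires sorted input, any new table entry has to be inserted in ascending order of its extended opcode; an out-of-order entry would be silently missed rather than reported.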

@ -98,3 +98,56 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
}
#define PPC_OP(op) (((op) >> 26) & 0x3F)
#define PPC_RA(a) (((a) >> 16) & 0x1f)
#define PPC_RT(t) (((t) >> 21) & 0x1f)
#define PPC_RB(b) (((b) >> 11) & 0x1f)
#define PPC_D(D) ((D) & 0xfffe)
#define PPC_DS(DS) ((DS) & 0xfffc)
#define OP_LD 58
#define OP_STD 62
static int get_source_reg(u32 raw_insn)
{
return PPC_RA(raw_insn);
}
static int get_target_reg(u32 raw_insn)
{
return PPC_RT(raw_insn);
}
static int get_offset_opcode(u32 raw_insn)
{
int opcode = PPC_OP(raw_insn);
/* DS- form */
if ((opcode == OP_LD) || (opcode == OP_STD))
return PPC_DS(raw_insn);
else
return PPC_D(raw_insn);
}
/*
* Fills the required fields for op_loc depending on if it
* is a source or target.
* D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
* DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
* X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
*/
void get_powerpc_regs(u32 raw_insn, int is_source,
struct annotated_op_loc *op_loc)
{
if (is_source)
op_loc->reg1 = get_source_reg(raw_insn);
else
op_loc->reg1 = get_target_reg(raw_insn);
if (op_loc->multi_regs)
op_loc->reg2 = PPC_RB(raw_insn);
/* TODO: Implement offset handling for X Form */
if ((op_loc->mem_ref) && (PPC_OP(raw_insn) != 31))
op_loc->offset = get_offset_opcode(raw_insn);
}
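
The D-form and DS-form handling in get_powerpc_regs() above can be exercised on two hand-assembled words. In this sketch `struct decoded` and `decode_d_ds` are hypothetical; the macros are copied from the hunk above. The words 0xE93F0138 (`ld r9,312(r31)`) and 0x81430108 (`lwz r10,264(r3)`) were built by hand from the field layout and are assumptions. As in the source, the displacement is only masked, not sign-extended.

```c
#include <stdint.h>

#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_RA(a)	(((a) >> 16) & 0x1f)
#define PPC_RT(t)	(((t) >> 21) & 0x1f)
#define PPC_D(D)	((D) & 0xfffe)
#define PPC_DS(DS)	((DS) & 0xfffc)
#define OP_LD		58
#define OP_STD		62

/* Hypothetical result type for this sketch */
struct decoded {
	int rt, ra, offset;
};

static struct decoded decode_d_ds(uint32_t raw_insn)
{
	int opcode = PPC_OP(raw_insn);
	struct decoded d = {
		.rt = PPC_RT(raw_insn),
		.ra = PPC_RA(raw_insn),
	};

	/* DS-form reserves the two low bits for an extended opcode */
	if (opcode == OP_LD || opcode == OP_STD)
		d.offset = PPC_DS(raw_insn);
	else
		d.offset = PPC_D(raw_insn);
	return d;
}
```

For `ld r9,312(r31)` the sketch yields RT = 9, RA = 31 and offset 312; for `lwz r10,264(r3)` it yields RT = 10, RA = 3 and offset 264.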

@ -2,7 +2,7 @@
#include <linux/compiler.h>
static int s390_call__parse(struct arch *arch, struct ins_operands *ops,
struct map_symbol *ms)
struct map_symbol *ms, struct disasm_line *dl __maybe_unused)
{
char *endptr, *tok, *name;
struct map *map = ms->map;
@ -52,7 +52,8 @@ static struct ins_ops s390_call_ops = {
static int s390_mov__parse(struct arch *arch __maybe_unused,
struct ins_operands *ops,
struct map_symbol *ms __maybe_unused)
struct map_symbol *ms __maybe_unused,
struct disasm_line *dl __maybe_unused)
{
char *s = strchr(ops->raw, ','), *target, *endptr;

@ -13,6 +13,7 @@ PERF_HAVE_JITDUMP := 1
generated := $(OUTPUT)arch/x86/include/generated
out := $(generated)/asm
header := $(out)/syscalls_64.c
header_32 := $(out)/syscalls_32.c
sys := $(srctree)/tools/perf/arch/x86/entry/syscalls
systbl := $(sys)/syscalltbl.sh
@ -22,7 +23,10 @@ $(shell [ -d '$(out)' ] || mkdir -p '$(out)')
$(header): $(sys)/syscall_64.tbl $(systbl)
$(Q)$(SHELL) '$(systbl)' $(sys)/syscall_64.tbl 'x86_64' > $@
$(header_32): $(sys)/syscall_32.tbl $(systbl)
$(Q)$(SHELL) '$(systbl)' $(sys)/syscall_32.tbl 'x86' > $@
clean::
$(call QUIET_CLEAN, x86) $(RM) -r $(header) $(generated)
archheaders: $(header)
archheaders: $(header) $(header_32)

@ -206,3 +206,392 @@ static int x86__annotate_init(struct arch *arch, char *cpuid)
arch->initialized = true;
return err;
}
#ifdef HAVE_DWARF_SUPPORT
static void update_insn_state_x86(struct type_state *state,
struct data_loc_info *dloc, Dwarf_Die *cu_die,
struct disasm_line *dl)
{
struct annotated_insn_loc loc;
struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
struct type_state_reg *tsr;
Dwarf_Die type_die;
u32 insn_offset = dl->al.offset;
int fbreg = dloc->fbreg;
int fboff = 0;
if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
return;
if (ins__is_call(&dl->ins)) {
struct symbol *func = dl->ops.target.sym;
if (func == NULL)
return;
/* __fentry__ will preserve all registers */
if (!strcmp(func->name, "__fentry__"))
return;
pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
/* Otherwise invalidate caller-saved registers after call */
for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
if (state->regs[i].caller_saved)
state->regs[i].ok = false;
}
/* Update register with the return type (if any) */
if (die_find_func_rettype(cu_die, func->name, &type_die)) {
tsr = &state->regs[state->ret_reg];
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
pr_debug_dtp("call [%x] return -> reg%d",
insn_offset, state->ret_reg);
pr_debug_type_name(&type_die, tsr->kind);
}
return;
}
if (!strncmp(dl->ins.name, "add", 3)) {
u64 imm_value = -1ULL;
int offset;
const char *var_name = NULL;
struct map_symbol *ms = dloc->ms;
u64 ip = ms->sym->start + dl->al.offset;
if (!has_reg_type(state, dst->reg1))
return;
tsr = &state->regs[dst->reg1];
tsr->copied_from = -1;
if (src->imm)
imm_value = src->offset;
else if (has_reg_type(state, src->reg1) &&
state->regs[src->reg1].kind == TSR_KIND_CONST)
imm_value = state->regs[src->reg1].imm_value;
else if (src->reg1 == DWARF_REG_PC) {
u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
src->offset, dl);
if (get_global_var_info(dloc, var_addr,
&var_name, &offset) &&
!strcmp(var_name, "this_cpu_off") &&
tsr->kind == TSR_KIND_CONST) {
tsr->kind = TSR_KIND_PERCPU_BASE;
tsr->ok = true;
imm_value = tsr->imm_value;
}
}
else
return;
if (tsr->kind != TSR_KIND_PERCPU_BASE)
return;
if (get_global_var_type(cu_die, dloc, ip, imm_value, &offset,
&type_die) && offset == 0) {
/*
* This is not a pointer type, but it should be treated
* as a pointer.
*/
tsr->type = type_die;
tsr->kind = TSR_KIND_POINTER;
tsr->ok = true;
pr_debug_dtp("add [%x] percpu %#"PRIx64" -> reg%d",
insn_offset, imm_value, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
return;
}
if (strncmp(dl->ins.name, "mov", 3))
return;
if (dloc->fb_cfa) {
u64 ip = dloc->ms->sym->start + dl->al.offset;
u64 pc = map__rip_2objdump(dloc->ms->map, ip);
if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
fbreg = -1;
}
/* Case 1. register to register or segment:offset to register transfers */
if (!src->mem_ref && !dst->mem_ref) {
if (!has_reg_type(state, dst->reg1))
return;
tsr = &state->regs[dst->reg1];
tsr->copied_from = -1;
if (dso__kernel(map__dso(dloc->ms->map)) &&
src->segment == INSN_SEG_X86_GS && src->imm) {
u64 ip = dloc->ms->sym->start + dl->al.offset;
u64 var_addr;
int offset;
/*
* In kernel, %gs points to a per-cpu region for the
* current CPU. Access with a constant offset should
* be treated as a global variable access.
*/
var_addr = src->offset;
if (var_addr == 40) {
tsr->kind = TSR_KIND_CANARY;
tsr->ok = true;
pr_debug_dtp("mov [%x] stack canary -> reg%d\n",
insn_offset, dst->reg1);
return;
}
if (!get_global_var_type(cu_die, dloc, ip, var_addr,
&offset, &type_die) ||
!die_get_member_type(&type_die, offset, &type_die)) {
tsr->ok = false;
return;
}
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
pr_debug_dtp("mov [%x] this-cpu addr=%#"PRIx64" -> reg%d",
insn_offset, var_addr, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
return;
}
if (src->imm) {
tsr->kind = TSR_KIND_CONST;
tsr->imm_value = src->offset;
tsr->ok = true;
pr_debug_dtp("mov [%x] imm=%#x -> reg%d\n",
insn_offset, tsr->imm_value, dst->reg1);
return;
}
if (!has_reg_type(state, src->reg1) ||
!state->regs[src->reg1].ok) {
tsr->ok = false;
return;
}
tsr->type = state->regs[src->reg1].type;
tsr->kind = state->regs[src->reg1].kind;
tsr->imm_value = state->regs[src->reg1].imm_value;
tsr->ok = true;
/* To copy back the variable type later (hopefully) */
if (tsr->kind == TSR_KIND_TYPE)
tsr->copied_from = src->reg1;
pr_debug_dtp("mov [%x] reg%d -> reg%d",
insn_offset, src->reg1, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
/* Case 2. memory to register transfers */
if (src->mem_ref && !dst->mem_ref) {
int sreg = src->reg1;
if (!has_reg_type(state, dst->reg1))
return;
tsr = &state->regs[dst->reg1];
tsr->copied_from = -1;
retry:
/* Check stack variables with offset */
if (sreg == fbreg) {
struct type_state_stack *stack;
int offset = src->offset - fboff;
stack = find_stack_state(state, offset);
if (stack == NULL) {
tsr->ok = false;
return;
} else if (!stack->compound) {
tsr->type = stack->type;
tsr->kind = stack->kind;
tsr->ok = true;
} else if (die_get_member_type(&stack->type,
offset - stack->offset,
&type_die)) {
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
} else {
tsr->ok = false;
return;
}
pr_debug_dtp("mov [%x] -%#x(stack) -> reg%d",
insn_offset, -offset, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
/* And then dereference the pointer if it has one */
else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
state->regs[sreg].kind == TSR_KIND_TYPE &&
die_deref_ptr_type(&state->regs[sreg].type,
src->offset, &type_die)) {
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
pr_debug_dtp("mov [%x] %#x(reg%d) -> reg%d",
insn_offset, src->offset, sreg, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
/* Or check if it's a global variable */
else if (sreg == DWARF_REG_PC) {
struct map_symbol *ms = dloc->ms;
u64 ip = ms->sym->start + dl->al.offset;
u64 addr;
int offset;
addr = annotate_calc_pcrel(ms, ip, src->offset, dl);
if (!get_global_var_type(cu_die, dloc, ip, addr, &offset,
&type_die) ||
!die_get_member_type(&type_die, offset, &type_die)) {
tsr->ok = false;
return;
}
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
pr_debug_dtp("mov [%x] global addr=%"PRIx64" -> reg%d",
insn_offset, addr, dst->reg1);
pr_debug_type_name(&type_die, tsr->kind);
}
/* And check percpu access with base register */
else if (has_reg_type(state, sreg) &&
state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
u64 ip = dloc->ms->sym->start + dl->al.offset;
u64 var_addr = src->offset;
int offset;
if (src->multi_regs) {
int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
state->regs[reg2].kind == TSR_KIND_CONST)
var_addr += state->regs[reg2].imm_value;
}
/*
* In kernel, %gs points to a per-cpu region for the
* current CPU. Access with a constant offset should
* be treated as a global variable access.
*/
if (get_global_var_type(cu_die, dloc, ip, var_addr,
&offset, &type_die) &&
die_get_member_type(&type_die, offset, &type_die)) {
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
if (src->multi_regs) {
pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
insn_offset, src->offset, src->reg1,
src->reg2, dst->reg1);
} else {
pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
insn_offset, src->offset, sreg, dst->reg1);
}
pr_debug_type_name(&tsr->type, tsr->kind);
} else {
tsr->ok = false;
}
}
/* And then dereference the calculated pointer if it has one */
else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
state->regs[sreg].kind == TSR_KIND_POINTER &&
die_get_member_type(&state->regs[sreg].type,
src->offset, &type_die)) {
tsr->type = type_die;
tsr->kind = TSR_KIND_TYPE;
tsr->ok = true;
pr_debug_dtp("mov [%x] pointer %#x(reg%d) -> reg%d",
insn_offset, src->offset, sreg, dst->reg1);
pr_debug_type_name(&tsr->type, tsr->kind);
}
/* Or try another register if any */
else if (src->multi_regs && sreg == src->reg1 &&
src->reg1 != src->reg2) {
sreg = src->reg2;
goto retry;
}
else {
int offset;
const char *var_name = NULL;
/* It might be a per-cpu variable access (in the kernel) */
if (src->offset < 0) {
if (get_global_var_info(dloc, (s64)src->offset,
&var_name, &offset) &&
!strcmp(var_name, "__per_cpu_offset")) {
tsr->kind = TSR_KIND_PERCPU_BASE;
tsr->ok = true;
pr_debug_dtp("mov [%x] percpu base reg%d\n",
insn_offset, dst->reg1);
return;
}
}
tsr->ok = false;
}
}
/* Case 3. register to memory transfers */
if (!src->mem_ref && dst->mem_ref) {
if (!has_reg_type(state, src->reg1) ||
!state->regs[src->reg1].ok)
return;
/* Check stack variables with offset */
if (dst->reg1 == fbreg) {
struct type_state_stack *stack;
int offset = dst->offset - fboff;
tsr = &state->regs[src->reg1];
stack = find_stack_state(state, offset);
if (stack) {
/*
* The source register is likely to hold a type
* of member if it's a compound type. Do not
* update the stack variable type since we can
* get the member type later by using the
* die_get_member_type().
*/
if (!stack->compound)
set_stack_state(stack, offset, tsr->kind,
&tsr->type);
} else {
findnew_stack_state(state, offset, tsr->kind,
&tsr->type);
}
pr_debug_dtp("mov [%x] reg%d -> -%#x(stack)",
insn_offset, src->reg1, -offset);
pr_debug_type_name(&tsr->type, tsr->kind);
}
/*
* Ignore other transfers since it'd set a value in a struct
* and won't change the type.
*/
}
/* Case 4. memory to memory transfers (not handled for now) */
}
#endif
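
Note on the hunk above: the call handling and the register-to-register mov case in update_insn_state_x86() can be pictured with a much smaller model. What follows is only a sketch under stated assumptions, not the perf implementation: struct reg_state, struct state, handle_call() and handle_mov_reg() are hypothetical names, an integer type_id stands in for the Dwarf_Die the real code stores, and details such as copied_from, the %gs cases and stack slots are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's type_state structures. */
enum reg_kind { KIND_TYPE, KIND_CONST };

struct reg_state {
	int type_id;        /* stand-in for a Dwarf_Die */
	enum reg_kind kind;
	bool ok;            /* true if the type information is valid */
	bool caller_saved;  /* true if a call may clobber this register */
};

#define NR_REGS 16

struct state {
	struct reg_state regs[NR_REGS];
	int ret_reg;        /* register that carries a function's return value */
};

/* After a call, caller-saved registers no longer hold a known type. */
void handle_call(struct state *st, bool has_ret_type, int ret_type_id)
{
	assert(st != NULL);
	for (size_t i = 0; i < NR_REGS; i++) {
		if (st->regs[i].caller_saved)
			st->regs[i].ok = false;
	}
	/* The return register gets the callee's return type, if known. */
	if (has_ret_type) {
		struct reg_state *r = &st->regs[st->ret_reg];

		r->type_id = ret_type_id;
		r->kind = KIND_TYPE;
		r->ok = true;
	}
}

/* Register-to-register mov: the destination inherits the source's state. */
void handle_mov_reg(struct state *st, int src, int dst)
{
	assert(st != NULL);
	if (src < 0 || src >= NR_REGS || dst < 0 || dst >= NR_REGS)
		return;
	if (!st->regs[src].ok) {
		/* Unknown source: the destination becomes unknown too. */
		st->regs[dst].ok = false;
		return;
	}
	st->regs[dst].type_id = st->regs[src].type_id;
	st->regs[dst].kind = st->regs[src].kind;
	st->regs[dst].ok = true;
}
```

Calling handle_call() on a state where registers 0 and 1 are caller-saved and ret_reg is 0 leaves register 1 invalid and register 0 holding the return type. A later handle_mov_reg() from a still-valid callee-saved register copies its type to the destination.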


@@ -0,0 +1,470 @@
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# 32-bit system call numbers and entry vectors
#
# The format is:
# <number> <abi> <name> <entry point> [<compat entry point> [noreturn]]
#
# The __ia32_sys and __ia32_compat_sys stubs are created on-the-fly for
# sys_*() system calls and compat_sys_*() compat system calls if
# IA32_EMULATION is defined, and expect struct pt_regs *regs as their only
# parameter.
#
# The abi is always "i386" for this file.
#
0 i386 restart_syscall sys_restart_syscall
1 i386 exit sys_exit - noreturn
2 i386 fork sys_fork
3 i386 read sys_read
4 i386 write sys_write
5 i386 open sys_open compat_sys_open
6 i386 close sys_close
7 i386 waitpid sys_waitpid
8 i386 creat sys_creat
9 i386 link sys_link
10 i386 unlink sys_unlink
11 i386 execve sys_execve compat_sys_execve
12 i386 chdir sys_chdir
13 i386 time sys_time32
14 i386 mknod sys_mknod
15 i386 chmod sys_chmod
16 i386 lchown sys_lchown16
17 i386 break
18 i386 oldstat sys_stat
19 i386 lseek sys_lseek compat_sys_lseek
20 i386 getpid sys_getpid
21 i386 mount sys_mount
22 i386 umount sys_oldumount
23 i386 setuid sys_setuid16
24 i386 getuid sys_getuid16
25 i386 stime sys_stime32
26 i386 ptrace sys_ptrace compat_sys_ptrace
27 i386 alarm sys_alarm
28 i386 oldfstat sys_fstat
29 i386 pause sys_pause
30 i386 utime sys_utime32
31 i386 stty
32 i386 gtty
33 i386 access sys_access
34 i386 nice sys_nice
35 i386 ftime
36 i386 sync sys_sync
37 i386 kill sys_kill
38 i386 rename sys_rename
39 i386 mkdir sys_mkdir
40 i386 rmdir sys_rmdir
41 i386 dup sys_dup
42 i386 pipe sys_pipe
43 i386 times sys_times compat_sys_times
44 i386 prof
45 i386 brk sys_brk
46 i386 setgid sys_setgid16
47 i386 getgid sys_getgid16
48 i386 signal sys_signal
49 i386 geteuid sys_geteuid16
50 i386 getegid sys_getegid16
51 i386 acct sys_acct
52 i386 umount2 sys_umount
53 i386 lock
54 i386 ioctl sys_ioctl compat_sys_ioctl
55 i386 fcntl sys_fcntl compat_sys_fcntl64
56 i386 mpx
57 i386 setpgid sys_setpgid
58 i386 ulimit
59 i386 oldolduname sys_olduname
60 i386 umask sys_umask
61 i386 chroot sys_chroot
62 i386 ustat sys_ustat compat_sys_ustat
63 i386 dup2 sys_dup2
64 i386 getppid sys_getppid
65 i386 getpgrp sys_getpgrp
66 i386 setsid sys_setsid
67 i386 sigaction sys_sigaction compat_sys_sigaction
68 i386 sgetmask sys_sgetmask
69 i386 ssetmask sys_ssetmask
70 i386 setreuid sys_setreuid16
71 i386 setregid sys_setregid16
72 i386 sigsuspend sys_sigsuspend
73 i386 sigpending sys_sigpending compat_sys_sigpending
74 i386 sethostname sys_sethostname
75 i386 setrlimit sys_setrlimit compat_sys_setrlimit
76 i386 getrlimit sys_old_getrlimit compat_sys_old_getrlimit
77 i386 getrusage sys_getrusage compat_sys_getrusage
78 i386 gettimeofday sys_gettimeofday compat_sys_gettimeofday
79 i386 settimeofday sys_settimeofday compat_sys_settimeofday
80 i386 getgroups sys_getgroups16
81 i386 setgroups sys_setgroups16
82 i386 select sys_old_select compat_sys_old_select
83 i386 symlink sys_symlink
84 i386 oldlstat sys_lstat
85 i386 readlink sys_readlink
86 i386 uselib sys_uselib
87 i386 swapon sys_swapon
88 i386 reboot sys_reboot
89 i386 readdir sys_old_readdir compat_sys_old_readdir
90 i386 mmap sys_old_mmap compat_sys_ia32_mmap
91 i386 munmap sys_munmap
92 i386 truncate sys_truncate compat_sys_truncate
93 i386 ftruncate sys_ftruncate compat_sys_ftruncate
94 i386 fchmod sys_fchmod
95 i386 fchown sys_fchown16
96 i386 getpriority sys_getpriority
97 i386 setpriority sys_setpriority
98 i386 profil
99 i386 statfs sys_statfs compat_sys_statfs
100 i386 fstatfs sys_fstatfs compat_sys_fstatfs
101 i386 ioperm sys_ioperm
102 i386 socketcall sys_socketcall compat_sys_socketcall
103 i386 syslog sys_syslog
104 i386 setitimer sys_setitimer compat_sys_setitimer
105 i386 getitimer sys_getitimer compat_sys_getitimer
106 i386 stat sys_newstat compat_sys_newstat
107 i386 lstat sys_newlstat compat_sys_newlstat
108 i386 fstat sys_newfstat compat_sys_newfstat
109 i386 olduname sys_uname
110 i386 iopl sys_iopl
111 i386 vhangup sys_vhangup
112 i386 idle
113 i386 vm86old sys_vm86old sys_ni_syscall
114 i386 wait4 sys_wait4 compat_sys_wait4
115 i386 swapoff sys_swapoff
116 i386 sysinfo sys_sysinfo compat_sys_sysinfo
117 i386 ipc sys_ipc compat_sys_ipc
118 i386 fsync sys_fsync
119 i386 sigreturn sys_sigreturn compat_sys_sigreturn
120 i386 clone sys_clone compat_sys_ia32_clone
121 i386 setdomainname sys_setdomainname
122 i386 uname sys_newuname
123 i386 modify_ldt sys_modify_ldt
124 i386 adjtimex sys_adjtimex_time32
125 i386 mprotect sys_mprotect
126 i386 sigprocmask sys_sigprocmask compat_sys_sigprocmask
127 i386 create_module
128 i386 init_module sys_init_module
129 i386 delete_module sys_delete_module
130 i386 get_kernel_syms
131 i386 quotactl sys_quotactl
132 i386 getpgid sys_getpgid
133 i386 fchdir sys_fchdir
134 i386 bdflush sys_ni_syscall
135 i386 sysfs sys_sysfs
136 i386 personality sys_personality
137 i386 afs_syscall
138 i386 setfsuid sys_setfsuid16
139 i386 setfsgid sys_setfsgid16
140 i386 _llseek sys_llseek
141 i386 getdents sys_getdents compat_sys_getdents
142 i386 _newselect sys_select compat_sys_select
143 i386 flock sys_flock
144 i386 msync sys_msync
145 i386 readv sys_readv
146 i386 writev sys_writev
147 i386 getsid sys_getsid
148 i386 fdatasync sys_fdatasync
149 i386 _sysctl sys_ni_syscall
150 i386 mlock sys_mlock
151 i386 munlock sys_munlock
152 i386 mlockall sys_mlockall
153 i386 munlockall sys_munlockall
154 i386 sched_setparam sys_sched_setparam
155 i386 sched_getparam sys_sched_getparam
156 i386 sched_setscheduler sys_sched_setscheduler
157 i386 sched_getscheduler sys_sched_getscheduler
158 i386 sched_yield sys_sched_yield
159 i386 sched_get_priority_max sys_sched_get_priority_max
160 i386 sched_get_priority_min sys_sched_get_priority_min
161 i386 sched_rr_get_interval sys_sched_rr_get_interval_time32
162 i386 nanosleep sys_nanosleep_time32
163 i386 mremap sys_mremap
164 i386 setresuid sys_setresuid16
165 i386 getresuid sys_getresuid16
166 i386 vm86 sys_vm86 sys_ni_syscall
167 i386 query_module
168 i386 poll sys_poll
169 i386 nfsservctl
170 i386 setresgid sys_setresgid16
171 i386 getresgid sys_getresgid16
172 i386 prctl sys_prctl
173 i386 rt_sigreturn sys_rt_sigreturn compat_sys_rt_sigreturn
174 i386 rt_sigaction sys_rt_sigaction compat_sys_rt_sigaction
175 i386 rt_sigprocmask sys_rt_sigprocmask compat_sys_rt_sigprocmask
176 i386 rt_sigpending sys_rt_sigpending compat_sys_rt_sigpending
177 i386 rt_sigtimedwait sys_rt_sigtimedwait_time32 compat_sys_rt_sigtimedwait_time32
178 i386 rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
179 i386 rt_sigsuspend sys_rt_sigsuspend compat_sys_rt_sigsuspend
180 i386 pread64 sys_ia32_pread64
181 i386 pwrite64 sys_ia32_pwrite64
182 i386 chown sys_chown16
183 i386 getcwd sys_getcwd
184 i386 capget sys_capget
185 i386 capset sys_capset
186 i386 sigaltstack sys_sigaltstack compat_sys_sigaltstack
187 i386 sendfile sys_sendfile compat_sys_sendfile
188 i386 getpmsg
189 i386 putpmsg
190 i386 vfork sys_vfork
191 i386 ugetrlimit sys_getrlimit compat_sys_getrlimit
192 i386 mmap2 sys_mmap_pgoff
193 i386 truncate64 sys_ia32_truncate64
194 i386 ftruncate64 sys_ia32_ftruncate64
195 i386 stat64 sys_stat64 compat_sys_ia32_stat64
196 i386 lstat64 sys_lstat64 compat_sys_ia32_lstat64
197 i386 fstat64 sys_fstat64 compat_sys_ia32_fstat64
198 i386 lchown32 sys_lchown
199 i386 getuid32 sys_getuid
200 i386 getgid32 sys_getgid
201 i386 geteuid32 sys_geteuid
202 i386 getegid32 sys_getegid
203 i386 setreuid32 sys_setreuid
204 i386 setregid32 sys_setregid
205 i386 getgroups32 sys_getgroups
206 i386 setgroups32 sys_setgroups
207 i386 fchown32 sys_fchown
208 i386 setresuid32 sys_setresuid
209 i386 getresuid32 sys_getresuid
210 i386 setresgid32 sys_setresgid
211 i386 getresgid32 sys_getresgid
212 i386 chown32 sys_chown
213 i386 setuid32 sys_setuid
214 i386 setgid32 sys_setgid
215 i386 setfsuid32 sys_setfsuid
216 i386 setfsgid32 sys_setfsgid
217 i386 pivot_root sys_pivot_root
218 i386 mincore sys_mincore
219 i386 madvise sys_madvise
220 i386 getdents64 sys_getdents64
221 i386 fcntl64 sys_fcntl64 compat_sys_fcntl64
# 222 is unused
# 223 is unused
224 i386 gettid sys_gettid
225 i386 readahead sys_ia32_readahead
226 i386 setxattr sys_setxattr
227 i386 lsetxattr sys_lsetxattr
228 i386 fsetxattr sys_fsetxattr
229 i386 getxattr sys_getxattr
230 i386 lgetxattr sys_lgetxattr
231 i386 fgetxattr sys_fgetxattr
232 i386 listxattr sys_listxattr
233 i386 llistxattr sys_llistxattr
234 i386 flistxattr sys_flistxattr
235 i386 removexattr sys_removexattr
236 i386 lremovexattr sys_lremovexattr
237 i386 fremovexattr sys_fremovexattr
238 i386 tkill sys_tkill
239 i386 sendfile64 sys_sendfile64
240 i386 futex sys_futex_time32
241 i386 sched_setaffinity sys_sched_setaffinity compat_sys_sched_setaffinity
242 i386 sched_getaffinity sys_sched_getaffinity compat_sys_sched_getaffinity
243 i386 set_thread_area sys_set_thread_area
244 i386 get_thread_area sys_get_thread_area
245 i386 io_setup sys_io_setup compat_sys_io_setup
246 i386 io_destroy sys_io_destroy
247 i386 io_getevents sys_io_getevents_time32
248 i386 io_submit sys_io_submit compat_sys_io_submit
249 i386 io_cancel sys_io_cancel
250 i386 fadvise64 sys_ia32_fadvise64
# 251 is available for reuse (was briefly sys_set_zone_reclaim)
252 i386 exit_group sys_exit_group - noreturn
253 i386 lookup_dcookie
254 i386 epoll_create sys_epoll_create
255 i386 epoll_ctl sys_epoll_ctl
256 i386 epoll_wait sys_epoll_wait
257 i386 remap_file_pages sys_remap_file_pages
258 i386 set_tid_address sys_set_tid_address
259 i386 timer_create sys_timer_create compat_sys_timer_create
260 i386 timer_settime sys_timer_settime32
261 i386 timer_gettime sys_timer_gettime32
262 i386 timer_getoverrun sys_timer_getoverrun
263 i386 timer_delete sys_timer_delete
264 i386 clock_settime sys_clock_settime32
265 i386 clock_gettime sys_clock_gettime32
266 i386 clock_getres sys_clock_getres_time32
267 i386 clock_nanosleep sys_clock_nanosleep_time32
268 i386 statfs64 sys_statfs64 compat_sys_statfs64
269 i386 fstatfs64 sys_fstatfs64 compat_sys_fstatfs64
270 i386 tgkill sys_tgkill
271 i386 utimes sys_utimes_time32
272 i386 fadvise64_64 sys_ia32_fadvise64_64
273 i386 vserver
274 i386 mbind sys_mbind
275 i386 get_mempolicy sys_get_mempolicy
276 i386 set_mempolicy sys_set_mempolicy
277 i386 mq_open sys_mq_open compat_sys_mq_open
278 i386 mq_unlink sys_mq_unlink
279 i386 mq_timedsend sys_mq_timedsend_time32
280 i386 mq_timedreceive sys_mq_timedreceive_time32
281 i386 mq_notify sys_mq_notify compat_sys_mq_notify
282 i386 mq_getsetattr sys_mq_getsetattr compat_sys_mq_getsetattr
283 i386 kexec_load sys_kexec_load compat_sys_kexec_load
284 i386 waitid sys_waitid compat_sys_waitid
# 285 sys_setaltroot
286 i386 add_key sys_add_key
287 i386 request_key sys_request_key
288 i386 keyctl sys_keyctl compat_sys_keyctl
289 i386 ioprio_set sys_ioprio_set
290 i386 ioprio_get sys_ioprio_get
291 i386 inotify_init sys_inotify_init
292 i386 inotify_add_watch sys_inotify_add_watch
293 i386 inotify_rm_watch sys_inotify_rm_watch
294 i386 migrate_pages sys_migrate_pages
295 i386 openat sys_openat compat_sys_openat
296 i386 mkdirat sys_mkdirat
297 i386 mknodat sys_mknodat
298 i386 fchownat sys_fchownat
299 i386 futimesat sys_futimesat_time32
300 i386 fstatat64 sys_fstatat64 compat_sys_ia32_fstatat64
301 i386 unlinkat sys_unlinkat
302 i386 renameat sys_renameat
303 i386 linkat sys_linkat
304 i386 symlinkat sys_symlinkat
305 i386 readlinkat sys_readlinkat
306 i386 fchmodat sys_fchmodat
307 i386 faccessat sys_faccessat
308 i386 pselect6 sys_pselect6_time32 compat_sys_pselect6_time32
309 i386 ppoll sys_ppoll_time32 compat_sys_ppoll_time32
310 i386 unshare sys_unshare
311 i386 set_robust_list sys_set_robust_list compat_sys_set_robust_list
312 i386 get_robust_list sys_get_robust_list compat_sys_get_robust_list
313 i386 splice sys_splice
314 i386 sync_file_range sys_ia32_sync_file_range
315 i386 tee sys_tee
316 i386 vmsplice sys_vmsplice
317 i386 move_pages sys_move_pages
318 i386 getcpu sys_getcpu
319 i386 epoll_pwait sys_epoll_pwait
320 i386 utimensat sys_utimensat_time32
321 i386 signalfd sys_signalfd compat_sys_signalfd
322 i386 timerfd_create sys_timerfd_create
323 i386 eventfd sys_eventfd
324 i386 fallocate sys_ia32_fallocate
325 i386 timerfd_settime sys_timerfd_settime32
326 i386 timerfd_gettime sys_timerfd_gettime32
327 i386 signalfd4 sys_signalfd4 compat_sys_signalfd4
328 i386 eventfd2 sys_eventfd2
329 i386 epoll_create1 sys_epoll_create1
330 i386 dup3 sys_dup3
331 i386 pipe2 sys_pipe2
332 i386 inotify_init1 sys_inotify_init1
333 i386 preadv sys_preadv compat_sys_preadv
334 i386 pwritev sys_pwritev compat_sys_pwritev
335 i386 rt_tgsigqueueinfo sys_rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
336 i386 perf_event_open sys_perf_event_open
337 i386 recvmmsg sys_recvmmsg_time32 compat_sys_recvmmsg_time32
338 i386 fanotify_init sys_fanotify_init
339 i386 fanotify_mark sys_fanotify_mark compat_sys_fanotify_mark
340 i386 prlimit64 sys_prlimit64
341 i386 name_to_handle_at sys_name_to_handle_at
342 i386 open_by_handle_at sys_open_by_handle_at compat_sys_open_by_handle_at
343 i386 clock_adjtime sys_clock_adjtime32
344 i386 syncfs sys_syncfs
345 i386 sendmmsg sys_sendmmsg compat_sys_sendmmsg
346 i386 setns sys_setns
347 i386 process_vm_readv sys_process_vm_readv
348 i386 process_vm_writev sys_process_vm_writev
349 i386 kcmp sys_kcmp
350 i386 finit_module sys_finit_module
351 i386 sched_setattr sys_sched_setattr
352 i386 sched_getattr sys_sched_getattr
353 i386 renameat2 sys_renameat2
354 i386 seccomp sys_seccomp
355 i386 getrandom sys_getrandom
356 i386 memfd_create sys_memfd_create
357 i386 bpf sys_bpf
358 i386 execveat sys_execveat compat_sys_execveat
359 i386 socket sys_socket
360 i386 socketpair sys_socketpair
361 i386 bind sys_bind
362 i386 connect sys_connect
363 i386 listen sys_listen
364 i386 accept4 sys_accept4
365 i386 getsockopt sys_getsockopt sys_getsockopt
366 i386 setsockopt sys_setsockopt sys_setsockopt
367 i386 getsockname sys_getsockname
368 i386 getpeername sys_getpeername
369 i386 sendto sys_sendto
370 i386 sendmsg sys_sendmsg compat_sys_sendmsg
371 i386 recvfrom sys_recvfrom compat_sys_recvfrom
372 i386 recvmsg sys_recvmsg compat_sys_recvmsg
373 i386 shutdown sys_shutdown
374 i386 userfaultfd sys_userfaultfd
375 i386 membarrier sys_membarrier
376 i386 mlock2 sys_mlock2
377 i386 copy_file_range sys_copy_file_range
378 i386 preadv2 sys_preadv2 compat_sys_preadv2
379 i386 pwritev2 sys_pwritev2 compat_sys_pwritev2
380 i386 pkey_mprotect sys_pkey_mprotect
381 i386 pkey_alloc sys_pkey_alloc
382 i386 pkey_free sys_pkey_free
383 i386 statx sys_statx
384 i386 arch_prctl sys_arch_prctl compat_sys_arch_prctl
385 i386 io_pgetevents sys_io_pgetevents_time32 compat_sys_io_pgetevents
386 i386 rseq sys_rseq
393 i386 semget sys_semget
394 i386 semctl sys_semctl compat_sys_semctl
395 i386 shmget sys_shmget
396 i386 shmctl sys_shmctl compat_sys_shmctl
397 i386 shmat sys_shmat compat_sys_shmat
398 i386 shmdt sys_shmdt
399 i386 msgget sys_msgget
400 i386 msgsnd sys_msgsnd compat_sys_msgsnd
401 i386 msgrcv sys_msgrcv compat_sys_msgrcv
402 i386 msgctl sys_msgctl compat_sys_msgctl
403 i386 clock_gettime64 sys_clock_gettime
404 i386 clock_settime64 sys_clock_settime
405 i386 clock_adjtime64 sys_clock_adjtime
406 i386 clock_getres_time64 sys_clock_getres
407 i386 clock_nanosleep_time64 sys_clock_nanosleep
408 i386 timer_gettime64 sys_timer_gettime
409 i386 timer_settime64 sys_timer_settime
410 i386 timerfd_gettime64 sys_timerfd_gettime
411 i386 timerfd_settime64 sys_timerfd_settime
412 i386 utimensat_time64 sys_utimensat
413 i386 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64
414 i386 ppoll_time64 sys_ppoll compat_sys_ppoll_time64
416 i386 io_pgetevents_time64 sys_io_pgetevents compat_sys_io_pgetevents_time64
417 i386 recvmmsg_time64 sys_recvmmsg compat_sys_recvmmsg_time64
418 i386 mq_timedsend_time64 sys_mq_timedsend
419 i386 mq_timedreceive_time64 sys_mq_timedreceive
420 i386 semtimedop_time64 sys_semtimedop
421 i386 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 i386 futex_time64 sys_futex
423 i386 sched_rr_get_interval_time64 sys_sched_rr_get_interval
424 i386 pidfd_send_signal sys_pidfd_send_signal
425 i386 io_uring_setup sys_io_uring_setup
426 i386 io_uring_enter sys_io_uring_enter
427 i386 io_uring_register sys_io_uring_register
428 i386 open_tree sys_open_tree
429 i386 move_mount sys_move_mount
430 i386 fsopen sys_fsopen
431 i386 fsconfig sys_fsconfig
432 i386 fsmount sys_fsmount
433 i386 fspick sys_fspick
434 i386 pidfd_open sys_pidfd_open
435 i386 clone3 sys_clone3
436 i386 close_range sys_close_range
437 i386 openat2 sys_openat2
438 i386 pidfd_getfd sys_pidfd_getfd
439 i386 faccessat2 sys_faccessat2
440 i386 process_madvise sys_process_madvise
441 i386 epoll_pwait2 sys_epoll_pwait2 compat_sys_epoll_pwait2
442 i386 mount_setattr sys_mount_setattr
443 i386 quotactl_fd sys_quotactl_fd
444 i386 landlock_create_ruleset sys_landlock_create_ruleset
445 i386 landlock_add_rule sys_landlock_add_rule
446 i386 landlock_restrict_self sys_landlock_restrict_self
447 i386 memfd_secret sys_memfd_secret
448 i386 process_mrelease sys_process_mrelease
449 i386 futex_waitv sys_futex_waitv
450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node
451 i386 cachestat sys_cachestat
452 i386 fchmodat2 sys_fchmodat2
453 i386 map_shadow_stack sys_map_shadow_stack
454 i386 futex_wake sys_futex_wake
455 i386 futex_wait sys_futex_wait
456 i386 futex_requeue sys_futex_requeue
457 i386 statmount sys_statmount
458 i386 listmount sys_listmount
459 i386 lsm_get_self_attr sys_lsm_get_self_attr
460 i386 lsm_set_self_attr sys_lsm_set_self_attr
461 i386 lsm_list_modules sys_lsm_list_modules
462 i386 mseal sys_mseal
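
Note on the table above: its header documents the line format as "<number> <abi> <name> <entry point> [<compat entry point> [noreturn]]", with "#" starting a comment. The build itself feeds this file to syscalltbl.sh. The C below is only an illustrative sketch of reading one such line: struct sys_entry and parse_syscall_line() are hypothetical names, the buffer sizes are arbitrary, and the optional compat and noreturn columns are ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record type; field widths are arbitrary for this sketch. */
struct sys_entry {
	int nr;
	char abi[16];
	char name[64];
	char entry[64];  /* empty string when no entry point is listed */
};

/*
 * Parse one table line. Returns 1 on success and 0 for comments, blank
 * lines, and anything lacking at least "<number> <abi> <name>".
 */
int parse_syscall_line(const char *line, struct sys_entry *out)
{
	int n;

	assert(line != NULL && out != NULL);

	/* Skip leading whitespace, then reject comments and empty lines. */
	line += strspn(line, " \t");
	if (*line == '#' || *line == '\n' || *line == '\0')
		return 0;

	/* Zero the record so a missing entry point reads as "". */
	memset(out, 0, sizeof(*out));
	n = sscanf(line, "%d %15s %63s %63s",
		   &out->nr, out->abi, out->name, out->entry);
	return n >= 3;
}
```

With this sketch, "3 i386 read sys_read" yields nr 3, name "read" and entry "sys_read", while "17 i386 break" parses with an empty entry and "# 222 is unused" is skipped.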


@@ -15,7 +15,7 @@
#if defined(__x86_64__)
struct perf_event__synthesize_extra_kmaps_cb_args {
struct perf_tool *tool;
const struct perf_tool *tool;
perf_event__handler_t process;
struct machine *machine;
union perf_event *event;
@@ -65,7 +65,7 @@ static int perf_event__synthesize_extra_kmaps_cb(struct map *map, void *data)
return 0;
}
int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
int perf_event__synthesize_extra_kmaps(const struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine)
{


@@ -89,6 +89,12 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs)
return 1;
}
/* A retire latency event should not be a group leader. */
if (lhs->retire_lat && !rhs->retire_lat)
return 1;
if (!lhs->retire_lat && rhs->retire_lat)
return -1;
/* Default ordering by insertion index. */
return lhs->core.idx - rhs->core.idx;
}
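
Note on the hunk above: the added comparison makes retire latency events sort after all other events, so one of them does not end up first in the ordering. A minimal, self-contained sketch of the same ordering rule, assuming a reduced hypothetical struct ev in place of struct evsel and using the standard qsort():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for perf's struct evsel. */
struct ev {
	int idx;          /* insertion index */
	bool retire_lat;  /* true for a retire latency event */
};

/* Order retire latency events after the others, then by insertion index. */
int ev_cmp(const void *a, const void *b)
{
	const struct ev *lhs = a, *rhs = b;

	if (lhs->retire_lat && !rhs->retire_lat)
		return 1;
	if (!lhs->retire_lat && rhs->retire_lat)
		return -1;
	/* Default ordering by insertion index. */
	return lhs->idx - rhs->idx;
}

/* Sort an array of events in place with the comparator above. */
void sort_events(struct ev *evs, size_t n)
{
	assert(evs != NULL || n == 0);
	if (n == 0)
		return;
	qsort(evs, n, sizeof(*evs), ev_cmp);
}
```

Sorting three events where only the one with index 0 is a retire latency event moves that event to the end, and the other two keep their insertion order.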


@@ -434,7 +434,6 @@ struct auxtrace_record *intel_bts_recording_init(int *err)
}
btsr->intel_bts_pmu = intel_bts_pmu;
btsr->itr.pmu = intel_bts_pmu;
btsr->itr.recording_options = intel_bts_recording_options;
btsr->itr.info_priv_size = intel_bts_info_priv_size;
btsr->itr.info_fill = intel_bts_info_fill;


@@ -1197,7 +1197,6 @@ struct auxtrace_record *intel_pt_recording_init(int *err)
}
ptr->intel_pt_pmu = intel_pt_pmu;
ptr->itr.pmu = intel_pt_pmu;
ptr->itr.recording_options = intel_pt_recording_options;
ptr->itr.info_priv_size = intel_pt_info_priv_size;
ptr->itr.info_fill = intel_pt_info_fill;


@@ -49,7 +49,7 @@ static const char *const bench_usage[] = {
static atomic_t event_count;
static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
static int process_synthesized_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)


@@ -221,7 +221,8 @@ static int process_branch_callback(struct evsel *evsel,
if (a.map != NULL)
dso__set_hit(map__dso(a.map));
hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
hist__account_cycles(sample->branch_stack, al, sample, false,
NULL, evsel);
ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
out:
@@ -279,7 +280,7 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
return ret;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@@ -396,10 +397,10 @@ static void print_annotate_item_stat(struct list_head *head, const char *title)
printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
total_good, 100.0 * total_good / (total ?: 1),
total_bad, 100.0 * total_bad / (total ?: 1));
printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
printf(" %-20s: %5s %5s\n", "Name/opcode", "Good", "Bad");
printf("-----------------------------------------------------------\n");
list_for_each_entry(istat, head, list)
printf(" %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
printf(" %-20s: %5d %5d\n", istat->name, istat->good, istat->bad);
printf("\n");
}
@@ -632,12 +633,22 @@ static int __cmd_annotate(struct perf_annotate *ann)
evlist__for_each_entry(session->evlist, pos) {
struct hists *hists = evsel__hists(pos);
u32 nr_samples = hists->stats.nr_samples;
struct ui_progress prog;
struct evsel *evsel;
if (!symbol_conf.event_group || !evsel__is_group_leader(pos))
continue;
for_each_group_member(evsel, pos)
nr_samples += evsel__hists(evsel)->stats.nr_samples;
if (nr_samples == 0)
continue;
if (!symbol_conf.event_group || !evsel__is_group_leader(pos))
continue;
ui_progress__init(&prog, nr_samples,
"Sorting group events for output...");
evsel__output_resort(pos, &prog);
ui_progress__finish();
hists__find_annotations(hists, pos, ann);
}
@@ -686,28 +697,7 @@ static const char * const annotate_usage[] = {
int cmd_annotate(int argc, const char **argv)
{
struct perf_annotate annotate = {
.tool = {
.sample = process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.namespaces = perf_event__process_namespaces,
.attr = perf_event__process_attr,
.build_id = perf_event__process_build_id,
#ifdef HAVE_LIBTRACEEVENT
.tracing_data = perf_event__process_tracing_data,
#endif
.id_index = perf_event__process_id_index,
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.feature = process_feature_event,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
};
struct perf_annotate annotate = {};
struct perf_data data = {
.mode = PERF_DATA_MODE_READ,
};
@@ -795,6 +785,8 @@ int cmd_annotate(int argc, const char **argv)
"Show stats for the data type annotation"),
OPT_BOOLEAN(0, "insn-stat", &annotate.insn_stat,
"Show instruction stats for the data type annotation"),
OPT_BOOLEAN(0, "skip-empty", &symbol_conf.skip_empty,
"Do not display empty (or dummy) events in the output"),
OPT_END()
};
int ret;
@@ -864,6 +856,25 @@ int cmd_annotate(int argc, const char **argv)
data.path = input_name;
perf_tool__init(&annotate.tool, /*ordered_events=*/true);
annotate.tool.sample = process_sample_event;
annotate.tool.mmap = perf_event__process_mmap;
annotate.tool.mmap2 = perf_event__process_mmap2;
annotate.tool.comm = perf_event__process_comm;
annotate.tool.exit = perf_event__process_exit;
annotate.tool.fork = perf_event__process_fork;
annotate.tool.namespaces = perf_event__process_namespaces;
annotate.tool.attr = perf_event__process_attr;
annotate.tool.build_id = perf_event__process_build_id;
#ifdef HAVE_LIBTRACEEVENT
annotate.tool.tracing_data = perf_event__process_tracing_data;
#endif
annotate.tool.id_index = perf_event__process_id_index;
annotate.tool.auxtrace_info = perf_event__process_auxtrace_info;
annotate.tool.auxtrace = perf_event__process_auxtrace;
annotate.tool.feature = process_feature_event;
annotate.tool.ordering_requires_timestamps = true;
annotate.session = perf_session__new(&data, &annotate.tool);
if (IS_ERR(annotate.session))
return PTR_ERR(annotate.session);
@@ -916,11 +927,15 @@ int cmd_annotate(int argc, const char **argv)
sort_order = "dso,symbol";
/*
* Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
* if branch info is in perf data in TUI mode.
* Set SORT_MODE__BRANCH so that annotate displays IPC/Cycle and
* branch counters, if the corresponding branch info is available
* in the perf data in the TUI mode.
*/
if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack)
if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) {
sort__mode = SORT_MODE__BRANCH;
if (annotate.session->evlist->nr_br_cntr > 0)
annotate_opts.show_br_cntr = true;
}
if (setup_sorting(NULL) < 0)
usage_with_options(annotate_usage, options);
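
Reviewer note: the hunks above replace a designated initializer for the tool callbacks with a call to perf_tool__init() followed by per-member assignments. The sketch below illustrates that "init first, then override" pattern. All `demo_*` names are hypothetical stand-ins, and the assumption that the init helper fills every member with a safe default is mine; the body of perf_tool__init() is not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_tool: two callbacks and one flag. */
struct demo_tool {
	int (*sample)(int value);
	int (*comm)(int value);
	bool ordered_events;
};

/* Default callback: accept the event and do nothing. */
static int demo_stub(int value)
{
	(void)value;
	return 0;
}

/* Hypothetical analogue of perf_tool__init(): every member gets a default. */
static void demo_tool__init(struct demo_tool *tool, bool ordered_events)
{
	assert(tool != NULL);
	tool->sample = demo_stub;
	tool->comm = demo_stub;
	tool->ordered_events = ordered_events;
}

/* A tool-specific handler that replaces one of the defaults. */
static int demo_count_sample(int value)
{
	return value + 1;
}

/* Build a tool the way the converted commands do: init, then override. */
struct demo_tool demo_tool__new(void)
{
	struct demo_tool tool;

	demo_tool__init(&tool, /*ordered_events=*/true);
	tool.sample = demo_count_sample; /* only the members this tool cares about */
	return tool;
}
```

With this shape, a callback that a command never assigns (here `comm`) still points at a callable default, so a later pass to fill in defaults is not needed.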


@ -89,6 +89,7 @@ static int perf_session__list_build_ids(bool force, bool with_hits)
.mode = PERF_DATA_MODE_READ,
.force = force,
};
struct perf_tool build_id__mark_dso_hit_ops;
symbol__elf_init();
/*
@ -97,6 +98,15 @@ static int perf_session__list_build_ids(bool force, bool with_hits)
if (filename__fprintf_build_id(input_name, stdout) > 0)
goto out;
perf_tool__init(&build_id__mark_dso_hit_ops, /*ordered_events=*/true);
build_id__mark_dso_hit_ops.sample = build_id__mark_dso_hit;
build_id__mark_dso_hit_ops.mmap = perf_event__process_mmap;
build_id__mark_dso_hit_ops.mmap2 = perf_event__process_mmap2;
build_id__mark_dso_hit_ops.fork = perf_event__process_fork;
build_id__mark_dso_hit_ops.exit = perf_event__exit_del_thread;
build_id__mark_dso_hit_ops.attr = perf_event__process_attr;
build_id__mark_dso_hit_ops.build_id = perf_event__process_build_id;
session = perf_session__new(&data, &build_id__mark_dso_hit_ops);
if (IS_ERR(session))
return PTR_ERR(session);


@ -273,7 +273,7 @@ static void compute_stats(struct c2c_hist_entry *c2c_he,
update_stats(&cstats->load, weight);
}
static int process_sample_event(struct perf_tool *tool __maybe_unused,
static int process_sample_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -385,24 +385,6 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
goto out;
}
static struct perf_c2c c2c = {
.tool = {
.sample = process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.lost = perf_event__process_lost,
.attr = perf_event__process_attr,
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
};
static const char * const c2c_usage[] = {
"perf c2c {record|report}",
NULL
@ -3070,6 +3052,19 @@ static int perf_c2c__report(int argc, const char **argv)
data.path = input_name;
data.force = symbol_conf.force;
perf_tool__init(&c2c.tool, /*ordered_events=*/true);
c2c.tool.sample = process_sample_event;
c2c.tool.mmap = perf_event__process_mmap;
c2c.tool.mmap2 = perf_event__process_mmap2;
c2c.tool.comm = perf_event__process_comm;
c2c.tool.exit = perf_event__process_exit;
c2c.tool.fork = perf_event__process_fork;
c2c.tool.lost = perf_event__process_lost;
c2c.tool.attr = perf_event__process_attr;
c2c.tool.auxtrace_info = perf_event__process_auxtrace_info;
c2c.tool.auxtrace = perf_event__process_auxtrace;
c2c.tool.auxtrace_error = perf_event__process_auxtrace_error;
c2c.tool.ordering_requires_timestamps = true;
session = perf_session__new(&data, &c2c.tool);
if (IS_ERR(session)) {
err = PTR_ERR(session);
@ -3266,7 +3261,7 @@ static int perf_c2c__record(int argc, const char **argv)
return -1;
}
if (perf_pmu__mem_events_init(pmu)) {
if (perf_pmu__mem_events_init()) {
pr_err("failed: memory events not supported\n");
return -1;
}
@ -3290,19 +3285,15 @@ static int perf_c2c__record(int argc, const char **argv)
* PERF_MEM_EVENTS__LOAD_STORE if it is supported.
*/
if (e->tag) {
e->record = true;
perf_mem_record[PERF_MEM_EVENTS__LOAD_STORE] = true;
rec_argv[i++] = "-W";
} else {
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
e->record = true;
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
e->record = true;
perf_mem_record[PERF_MEM_EVENTS__LOAD] = true;
perf_mem_record[PERF_MEM_EVENTS__STORE] = true;
}
}
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
if (e->record)
if (perf_mem_record[PERF_MEM_EVENTS__LOAD])
rec_argv[i++] = "-W";
rec_argv[i++] = "-d";

tools/perf/builtin-check.c Normal file

@ -0,0 +1,180 @@
// SPDX-License-Identifier: GPL-2.0
#include "builtin.h"
#include "color.h"
#include "util/debug.h"
#include "util/header.h"
#include <tools/config.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <subcmd/parse-options.h>
static const char * const check_subcommands[] = { "feature", NULL };
static struct option check_options[] = {
OPT_BOOLEAN('q', "quiet", &quiet, "do not show any warnings or messages"),
OPT_END()
};
static struct option check_feature_options[] = { OPT_PARENT(check_options) };
static const char *check_usage[] = { NULL, NULL };
static const char *check_feature_usage[] = {
"perf check feature <feature_list>",
NULL
};
struct feature_status supported_features[] = {
FEATURE_STATUS("aio", HAVE_AIO_SUPPORT),
FEATURE_STATUS("bpf", HAVE_LIBBPF_SUPPORT),
FEATURE_STATUS("bpf_skeletons", HAVE_BPF_SKEL),
FEATURE_STATUS("debuginfod", HAVE_DEBUGINFOD_SUPPORT),
FEATURE_STATUS("dwarf", HAVE_DWARF_SUPPORT),
FEATURE_STATUS("dwarf_getlocations", HAVE_DWARF_GETLOCATIONS_SUPPORT),
FEATURE_STATUS("dwarf-unwind", HAVE_DWARF_UNWIND_SUPPORT),
FEATURE_STATUS("auxtrace", HAVE_AUXTRACE_SUPPORT),
FEATURE_STATUS("libaudit", HAVE_LIBAUDIT_SUPPORT),
FEATURE_STATUS("libbfd", HAVE_LIBBFD_SUPPORT),
FEATURE_STATUS("libcapstone", HAVE_LIBCAPSTONE_SUPPORT),
FEATURE_STATUS("libcrypto", HAVE_LIBCRYPTO_SUPPORT),
FEATURE_STATUS("libdw-dwarf-unwind", HAVE_DWARF_SUPPORT),
FEATURE_STATUS("libelf", HAVE_LIBELF_SUPPORT),
FEATURE_STATUS("libnuma", HAVE_LIBNUMA_SUPPORT),
FEATURE_STATUS("libopencsd", HAVE_CSTRACE_SUPPORT),
FEATURE_STATUS("libperl", HAVE_LIBPERL_SUPPORT),
FEATURE_STATUS("libpfm4", HAVE_LIBPFM),
FEATURE_STATUS("libpython", HAVE_LIBPYTHON_SUPPORT),
FEATURE_STATUS("libslang", HAVE_SLANG_SUPPORT),
FEATURE_STATUS("libtraceevent", HAVE_LIBTRACEEVENT),
FEATURE_STATUS("libunwind", HAVE_LIBUNWIND_SUPPORT),
FEATURE_STATUS("lzma", HAVE_LZMA_SUPPORT),
FEATURE_STATUS("numa_num_possible_cpus", HAVE_LIBNUMA_SUPPORT),
FEATURE_STATUS("syscall_table", HAVE_SYSCALL_TABLE_SUPPORT),
FEATURE_STATUS("zlib", HAVE_ZLIB_SUPPORT),
FEATURE_STATUS("zstd", HAVE_ZSTD_SUPPORT),
/* this should remain at end, to know the array end */
FEATURE_STATUS(NULL, _)
};
static void on_off_print(const char *status)
{
printf("[ ");
if (!strcmp(status, "OFF"))
color_fprintf(stdout, PERF_COLOR_RED, "%-3s", status);
else
color_fprintf(stdout, PERF_COLOR_GREEN, "%-3s", status);
printf(" ]");
}
/* Helper function to print status of a feature along with name/macro */
static void status_print(const char *name, const char *macro,
const char *status)
{
printf("%22s: ", name);
on_off_print(status);
printf(" # %s\n", macro);
}
#define STATUS(feature) \
do { \
if (feature.is_builtin) \
status_print(feature.name, feature.macro, "on"); \
else \
status_print(feature.name, feature.macro, "OFF"); \
} while (0)
/**
* check whether "feature" is built-in with perf
*
* returns:
* 0: NOT built-in or Feature not known
* 1: Built-in
*/
static int has_support(const char *feature)
{
for (int i = 0; supported_features[i].name; ++i) {
if ((strcasecmp(feature, supported_features[i].name) == 0) ||
(strcasecmp(feature, supported_features[i].macro) == 0)) {
if (!quiet)
STATUS(supported_features[i]);
return supported_features[i].is_builtin;
}
}
if (!quiet)
pr_err("Unknown feature '%s', please use 'perf version --build-options' to see which ones are available.\n", feature);
return 0;
}
/**
* Usage: 'perf check feature <feature_list>'
*
* <feature_list> can be a single feature name/macro, or a comma-separated list
* of feature names/macros
* eg. argument can be "libtraceevent" or "libtraceevent,bpf" etc
*
* In case of a comma-separated list, feature_enabled will be 1, only if
* all features passed in the string are supported
*
* Note that argv will get modified
*/
static int subcommand_feature(int argc, const char **argv)
{
char *feature_list;
char *feature_name;
int feature_enabled;
argc = parse_options(argc, argv, check_feature_options,
check_feature_usage, 0);
if (!argc)
usage_with_options(check_feature_usage, check_feature_options);
if (argc > 1) {
pr_err("Too many arguments passed to 'perf check feature'\n");
return -1;
}
feature_enabled = 1;
/* feature_list is a non-const copy of 'argv[0]' */
feature_list = strdup(argv[0]);
if (!feature_list) {
pr_err("ERROR: failed to allocate memory for feature list\n");
return -1;
}
feature_name = strtok(feature_list, ",");
while (feature_name) {
feature_enabled &= has_support(feature_name);
feature_name = strtok(NULL, ",");
}
free(feature_list);
return !feature_enabled;
}
int cmd_check(int argc, const char **argv)
{
argc = parse_options_subcommand(argc, argv, check_options,
check_subcommands, check_usage, 0);
if (!argc)
usage_with_options(check_usage, check_options);
if (strcmp(argv[0], "feature") == 0)
return subcommand_feature(argc, argv);
/* If no subcommand matched above, print usage help */
pr_err("Unknown subcommand: %s\n", argv[0]);
usage_with_options(check_usage, check_options);
/* free usage string allocated by parse_options_subcommand */
free((void *)check_usage[0]);
return 0;
}
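
Reviewer note: subcommand_feature() above splits its argument on commas and ANDs the per-feature results, so the exit status is 0 only when every listed feature is built in. The sketch below reproduces that logic in isolation. The `demo_*` names and the three-entry feature table are hypothetical, chosen only so the example is self-contained; the real table is `supported_features[]`, and the real lookup also matches macro names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical, much reduced version of the feature table. */
struct demo_feature {
	const char *name;
	int is_builtin;
};

static const struct demo_feature demo_features[] = {
	{ "bpf", 1 },
	{ "libtraceevent", 1 },
	{ "libunwind", 0 },
	{ NULL, 0 }, /* terminator, as in the original table */
};

/* Returns 1 if the feature is known and built in, 0 otherwise. */
static int demo_has_support(const char *name)
{
	for (size_t i = 0; demo_features[i].name; i++) {
		if (strcasecmp(name, demo_features[i].name) == 0)
			return demo_features[i].is_builtin;
	}
	return 0; /* unknown features count as unsupported */
}

/*
 * Returns 0 if every feature in the comma-separated list is built in,
 * 1 if at least one is not, and -1 if the list could not be copied.
 */
int demo_check_features(const char *list)
{
	char *copy, *name;
	int enabled = 1;

	assert(list != NULL);

	/* strtok() writes into its argument, so work on a private copy. */
	copy = strdup(list);
	if (!copy)
		return -1;

	for (name = strtok(copy, ","); name; name = strtok(NULL, ","))
		enabled &= demo_has_support(name);

	free(copy);
	return !enabled;
}
```

Because the return value is inverted, it can be used directly as a process exit status: `demo_check_features("bpf,libtraceevent")` yields 0 with the table above, while a list containing "libunwind" yields 1.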


@ -1434,7 +1434,7 @@ static int __cmd_signal(struct daemon *daemon, struct option parent_options[],
}
memset(&cmd, 0, sizeof(cmd));
cmd.signal.cmd = CMD_SIGNAL,
cmd.signal.cmd = CMD_SIGNAL;
cmd.signal.sig = SIGUSR2;
strncpy(cmd.signal.name, name, sizeof(cmd.signal.name) - 1);


@ -388,7 +388,7 @@ struct hist_entry_ops block_hist_ops = {
.free = block_hist_free,
};
static int diff__process_sample_event(struct perf_tool *tool,
static int diff__process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -431,8 +431,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
goto out;
}
hist__account_cycles(sample->branch_stack, &al, sample, false,
NULL);
hist__account_cycles(sample->branch_stack, &al, sample,
false, NULL, evsel);
break;
case COMPUTE_STREAM:
@ -467,21 +467,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
return ret;
}
static struct perf_diff pdiff = {
.tool = {
.sample = diff__process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.lost = perf_event__process_lost,
.namespaces = perf_event__process_namespaces,
.cgroup = perf_event__process_cgroup,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
};
static struct perf_diff pdiff;
static struct evsel *evsel_match(struct evsel *evsel,
struct evlist *evlist)
@ -705,7 +691,7 @@ static void hists__precompute(struct hists *hists)
if (compute == COMPUTE_CYCLES) {
bh = container_of(he, struct block_hist, he);
init_block_hist(bh);
block_info__process_sym(he, bh, NULL, 0);
block_info__process_sym(he, bh, NULL, 0, 0);
}
data__for_each_file_new(i, d) {
@ -728,7 +714,7 @@ static void hists__precompute(struct hists *hists)
pair_bh = container_of(pair, struct block_hist,
he);
init_block_hist(pair_bh);
block_info__process_sym(pair, pair_bh, NULL, 0);
block_info__process_sym(pair, pair_bh, NULL, 0, 0);
bh = container_of(he, struct block_hist, he);
@ -1959,6 +1945,18 @@ int cmd_diff(int argc, const char **argv)
if (ret < 0)
return ret;
perf_tool__init(&pdiff.tool, /*ordered_events=*/true);
pdiff.tool.sample = diff__process_sample_event;
pdiff.tool.mmap = perf_event__process_mmap;
pdiff.tool.mmap2 = perf_event__process_mmap2;
pdiff.tool.comm = perf_event__process_comm;
pdiff.tool.exit = perf_event__process_exit;
pdiff.tool.fork = perf_event__process_fork;
pdiff.tool.lost = perf_event__process_lost;
pdiff.tool.namespaces = perf_event__process_namespaces;
pdiff.tool.cgroup = perf_event__process_cgroup;
pdiff.tool.ordering_requires_timestamps = true;
perf_config(diff__config, NULL);
argc = parse_options(argc, argv, options, diff_usage, 0);


@ -35,13 +35,13 @@ static int __cmd_evlist(const char *file_name, struct perf_attr_details *details
.mode = PERF_DATA_MODE_READ,
.force = details->force,
};
struct perf_tool tool = {
/* only needed for pipe mode */
.attr = perf_event__process_attr,
.feature = process_header_feature,
};
bool has_tracepoint = false;
struct perf_tool tool;
bool has_tracepoint = false, has_group = false;
perf_tool__init(&tool, /*ordered_events=*/false);
/* only needed for pipe mode */
tool.attr = perf_event__process_attr;
tool.feature = process_header_feature;
session = perf_session__new(&data, &tool);
if (IS_ERR(session))
return PTR_ERR(session);
@ -54,11 +54,17 @@ static int __cmd_evlist(const char *file_name, struct perf_attr_details *details
if (pos->core.attr.type == PERF_TYPE_TRACEPOINT)
has_tracepoint = true;
if (!evsel__is_group_leader(pos))
has_group = true;
}
if (has_tracepoint && !details->trace_fields)
printf("# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events\n");
if (has_group && !details->event_group)
printf("# Tip: use 'perf evlist -g' to show group information\n");
perf_session__delete(session);
return 0;
}


@ -13,6 +13,7 @@
#include <signal.h>
#include <stdlib.h>
#include <fcntl.h>
#include <inttypes.h>
#include <math.h>
#include <poll.h>
#include <ctype.h>
@ -22,15 +23,18 @@
#include "debug.h"
#include <subcmd/pager.h>
#include <subcmd/parse-options.h>
#include <api/io.h>
#include <api/fs/tracing_path.h>
#include "evlist.h"
#include "target.h"
#include "cpumap.h"
#include "hashmap.h"
#include "thread_map.h"
#include "strfilter.h"
#include "util/cap.h"
#include "util/config.h"
#include "util/ftrace.h"
#include "util/stat.h"
#include "util/units.h"
#include "util/parse-sublevel-options.h"
@ -59,6 +63,41 @@ static void ftrace__workload_exec_failed_signal(int signo __maybe_unused,
done = true;
}
static bool check_ftrace_capable(void)
{
bool used_root;
if (perf_cap__capable(CAP_PERFMON, &used_root))
return true;
if (!used_root && perf_cap__capable(CAP_SYS_ADMIN, &used_root))
return true;
pr_err("ftrace only works for %s!\n",
used_root ? "root"
: "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
);
return false;
}
static bool is_ftrace_supported(void)
{
char *file;
bool supported = false;
file = get_tracing_file("set_ftrace_pid");
if (!file) {
pr_debug("cannot get tracing file set_ftrace_pid\n");
return false;
}
if (!access(file, F_OK))
supported = true;
put_tracing_file(file);
return supported;
}
static int __write_tracing_file(const char *name, const char *val, bool append)
{
char *file;
@ -228,6 +267,7 @@ static void reset_tracing_options(struct perf_ftrace *ftrace __maybe_unused)
write_tracing_option_file("funcgraph-irqs", "1");
write_tracing_option_file("funcgraph-proc", "0");
write_tracing_option_file("funcgraph-abstime", "0");
write_tracing_option_file("funcgraph-tail", "0");
write_tracing_option_file("latency-format", "0");
write_tracing_option_file("irq-info", "0");
}
@ -464,6 +504,17 @@ static int set_tracing_funcgraph_verbose(struct perf_ftrace *ftrace)
return 0;
}
static int set_tracing_funcgraph_tail(struct perf_ftrace *ftrace)
{
if (!ftrace->graph_tail)
return 0;
if (write_tracing_option_file("funcgraph-tail", "1") < 0)
return -1;
return 0;
}
static int set_tracing_thresh(struct perf_ftrace *ftrace)
{
int ret;
@ -540,6 +591,11 @@ static int set_tracing_options(struct perf_ftrace *ftrace)
return -1;
}
if (set_tracing_funcgraph_tail(ftrace) < 0) {
pr_err("failed to set tracing option funcgraph-tail\n");
return -1;
}
return 0;
}
@ -569,18 +625,6 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace)
.events = POLLIN,
};
if (!(perf_cap__capable(CAP_PERFMON) ||
perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
#ifdef HAVE_LIBCAP_SUPPORT
"users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
#else
"root"
#endif
);
return -1;
}
select_tracer(ftrace);
if (reset_tracing_files(ftrace) < 0) {
@ -885,18 +929,6 @@ static int __cmd_latency(struct perf_ftrace *ftrace)
};
int buckets[NUM_BUCKET] = { };
if (!(perf_cap__capable(CAP_PERFMON) ||
perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
#ifdef HAVE_LIBCAP_SUPPORT
"users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
#else
"root"
#endif
);
return -1;
}
trace_fd = prepare_func_latency(ftrace);
if (trace_fd < 0)
goto out;
@ -950,6 +982,326 @@ static int __cmd_latency(struct perf_ftrace *ftrace)
return (done && !workload_exec_errno) ? 0 : -1;
}
static size_t profile_hash(long func, void *ctx __maybe_unused)
{
return str_hash((char *)func);
}
static bool profile_equal(long func1, long func2, void *ctx __maybe_unused)
{
return !strcmp((char *)func1, (char *)func2);
}
static int prepare_func_profile(struct perf_ftrace *ftrace)
{
ftrace->tracer = "function_graph";
ftrace->graph_tail = 1;
ftrace->profile_hash = hashmap__new(profile_hash, profile_equal, NULL);
if (ftrace->profile_hash == NULL)
return -ENOMEM;
return 0;
}
/* This is saved in a hashmap keyed by the function name */
struct ftrace_profile_data {
struct stats st;
};
static int add_func_duration(struct perf_ftrace *ftrace, char *func, double time_ns)
{
struct ftrace_profile_data *prof = NULL;
if (!hashmap__find(ftrace->profile_hash, func, &prof)) {
char *key = strdup(func);
if (key == NULL)
return -ENOMEM;
prof = zalloc(sizeof(*prof));
if (prof == NULL) {
free(key);
return -ENOMEM;
}
init_stats(&prof->st);
hashmap__add(ftrace->profile_hash, key, prof);
}
update_stats(&prof->st, time_ns);
return 0;
}
/*
* The ftrace function_graph text output normally looks like below:
*
* CPU DURATION FUNCTION
*
* 0) | syscall_trace_enter.isra.0() {
* 0) | __audit_syscall_entry() {
* 0) | auditd_test_task() {
* 0) 0.271 us | __rcu_read_lock();
* 0) 0.275 us | __rcu_read_unlock();
* 0) 1.254 us | } /\* auditd_test_task *\/
* 0) 0.279 us | ktime_get_coarse_real_ts64();
* 0) 2.227 us | } /\* __audit_syscall_entry *\/
* 0) 2.713 us | } /\* syscall_trace_enter.isra.0 *\/
*
* Parse the line and get the duration and function name.
*/
static int parse_func_duration(struct perf_ftrace *ftrace, char *line, size_t len)
{
char *p;
char *func;
double duration;
/* skip CPU */
p = strchr(line, ')');
if (p == NULL)
return 0;
/* get duration */
p = skip_spaces(p + 1);
/* no duration? */
if (p == NULL || *p == '|')
return 0;
/* skip markers like '*' or '!' for longer than ms */
if (!isdigit(*p))
p++;
duration = strtod(p, &p);
if (strncmp(p, " us", 3)) {
pr_debug("non-usec time found.. ignoring\n");
return 0;
}
/*
* profile stat keeps the max and min values as integer,
* convert to nsec time so that we can have accurate max.
*/
duration *= 1000;
/* skip to the pipe */
while (p < line + len && *p != '|')
p++;
if (*p++ != '|')
return -EINVAL;
/* get function name */
func = skip_spaces(p);
/* skip the closing bracket and the start of comment */
if (*func == '}')
func += 5;
/* remove semi-colon or end of comment at the end */
p = line + len - 1;
while (!isalnum(*p) && *p != ']') {
*p = '\0';
--p;
}
return add_func_duration(ftrace, func, duration);
}
enum perf_ftrace_profile_sort_key {
PFP_SORT_TOTAL = 0,
PFP_SORT_AVG,
PFP_SORT_MAX,
PFP_SORT_COUNT,
PFP_SORT_NAME,
};
static enum perf_ftrace_profile_sort_key profile_sort = PFP_SORT_TOTAL;
static int cmp_profile_data(const void *a, const void *b)
{
const struct hashmap_entry *e1 = *(const struct hashmap_entry **)a;
const struct hashmap_entry *e2 = *(const struct hashmap_entry **)b;
struct ftrace_profile_data *p1 = e1->pvalue;
struct ftrace_profile_data *p2 = e2->pvalue;
double v1, v2;
switch (profile_sort) {
case PFP_SORT_NAME:
return strcmp(e1->pkey, e2->pkey);
case PFP_SORT_AVG:
v1 = p1->st.mean;
v2 = p2->st.mean;
break;
case PFP_SORT_MAX:
v1 = p1->st.max;
v2 = p2->st.max;
break;
case PFP_SORT_COUNT:
v1 = p1->st.n;
v2 = p2->st.n;
break;
case PFP_SORT_TOTAL:
default:
v1 = p1->st.n * p1->st.mean;
v2 = p2->st.n * p2->st.mean;
break;
}
if (v1 > v2)
return -1;
else
return 1;
}
static void print_profile_result(struct perf_ftrace *ftrace)
{
struct hashmap_entry *entry, **profile;
size_t i, nr, bkt;
nr = hashmap__size(ftrace->profile_hash);
if (nr == 0)
return;
profile = calloc(nr, sizeof(*profile));
if (profile == NULL) {
pr_err("failed to allocate memory for the result\n");
return;
}
i = 0;
hashmap__for_each_entry(ftrace->profile_hash, entry, bkt)
profile[i++] = entry;
assert(i == nr);
//cmp_profile_data(profile[0], profile[1]);
qsort(profile, nr, sizeof(*profile), cmp_profile_data);
printf("# %10s %10s %10s %10s %s\n",
"Total (us)", "Avg (us)", "Max (us)", "Count", "Function");
for (i = 0; i < nr; i++) {
const char *name = profile[i]->pkey;
struct ftrace_profile_data *p = profile[i]->pvalue;
printf("%12.3f %10.3f %6"PRIu64".%03"PRIu64" %10.0f %s\n",
p->st.n * p->st.mean / 1000, p->st.mean / 1000,
p->st.max / 1000, p->st.max % 1000, p->st.n, name);
}
free(profile);
hashmap__for_each_entry(ftrace->profile_hash, entry, bkt) {
free((char *)entry->pkey);
free(entry->pvalue);
}
hashmap__free(ftrace->profile_hash);
ftrace->profile_hash = NULL;
}
static int __cmd_profile(struct perf_ftrace *ftrace)
{
char *trace_file;
int trace_fd;
char buf[4096];
struct io io;
char *line = NULL;
size_t line_len = 0;
if (prepare_func_profile(ftrace) < 0) {
pr_err("failed to prepare func profiler\n");
goto out;
}
if (reset_tracing_files(ftrace) < 0) {
pr_err("failed to reset ftrace\n");
goto out;
}
/* reset ftrace buffer */
if (write_tracing_file("trace", "0") < 0)
goto out;
if (set_tracing_options(ftrace) < 0)
return -1;
if (write_tracing_file("current_tracer", ftrace->tracer) < 0) {
pr_err("failed to set current_tracer to %s\n", ftrace->tracer);
goto out_reset;
}
setup_pager();
trace_file = get_tracing_file("trace_pipe");
if (!trace_file) {
pr_err("failed to open trace_pipe\n");
goto out_reset;
}
trace_fd = open(trace_file, O_RDONLY);
put_tracing_file(trace_file);
if (trace_fd < 0) {
pr_err("failed to open trace_pipe\n");
goto out_reset;
}
fcntl(trace_fd, F_SETFL, O_NONBLOCK);
if (write_tracing_file("tracing_on", "1") < 0) {
pr_err("can't enable tracing\n");
goto out_close_fd;
}
evlist__start_workload(ftrace->evlist);
io__init(&io, trace_fd, buf, sizeof(buf));
io.timeout_ms = -1;
while (!done && !io.eof) {
if (io__getline(&io, &line, &line_len) < 0)
break;
if (parse_func_duration(ftrace, line, line_len) < 0)
break;
}
write_tracing_file("tracing_on", "0");
if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, buf, sizeof(buf));
/* flush stdout first so below error msg appears at the end. */
fflush(stdout);
pr_err("workload failed: %s\n", emsg);
goto out_free_line;
}
/* read remaining buffer contents */
io.timeout_ms = 0;
while (!io.eof) {
if (io__getline(&io, &line, &line_len) < 0)
break;
if (parse_func_duration(ftrace, line, line_len) < 0)
break;
}
print_profile_result(ftrace);
out_free_line:
free(line);
out_close_fd:
close(trace_fd);
out_reset:
reset_tracing_files(ftrace);
out:
return (done && !workload_exec_errno) ? 0 : -1;
}
static int perf_ftrace_config(const char *var, const char *value, void *cb)
{
struct perf_ftrace *ftrace = cb;
@ -1099,6 +1451,7 @@ static int parse_graph_tracer_opts(const struct option *opt,
{ .name = "verbose", .value_ptr = &ftrace->graph_verbose },
{ .name = "thresh", .value_ptr = &ftrace->graph_thresh },
{ .name = "depth", .value_ptr = &ftrace->graph_depth },
{ .name = "tail", .value_ptr = &ftrace->graph_tail },
{ .name = NULL, }
};
@ -1112,10 +1465,35 @@ static int parse_graph_tracer_opts(const struct option *opt,
return 0;
}
static int parse_sort_key(const struct option *opt, const char *str, int unset)
{
enum perf_ftrace_profile_sort_key *key = (void *)opt->value;
if (unset)
return 0;
if (!strcmp(str, "total"))
*key = PFP_SORT_TOTAL;
else if (!strcmp(str, "avg"))
*key = PFP_SORT_AVG;
else if (!strcmp(str, "max"))
*key = PFP_SORT_MAX;
else if (!strcmp(str, "count"))
*key = PFP_SORT_COUNT;
else if (!strcmp(str, "name"))
*key = PFP_SORT_NAME;
else {
pr_err("Unknown sort key: %s\n", str);
return -1;
}
return 0;
}
enum perf_ftrace_subcommand {
PERF_FTRACE_NONE,
PERF_FTRACE_TRACE,
PERF_FTRACE_LATENCY,
PERF_FTRACE_PROFILE,
};
int cmd_ftrace(int argc, const char **argv)
@ -1181,13 +1559,31 @@ int cmd_ftrace(int argc, const char **argv)
"Use nano-second histogram"),
OPT_PARENT(common_options),
};
const struct option profile_options[] = {
OPT_CALLBACK('T', "trace-funcs", &ftrace.filters, "func",
"Trace given functions using function tracer",
parse_filter_func),
OPT_CALLBACK('N', "notrace-funcs", &ftrace.notrace, "func",
"Do not trace given functions", parse_filter_func),
OPT_CALLBACK('G', "graph-funcs", &ftrace.graph_funcs, "func",
"Trace given functions using function_graph tracer",
parse_filter_func),
OPT_CALLBACK('g', "nograph-funcs", &ftrace.nograph_funcs, "func",
"Set nograph filter on given functions", parse_filter_func),
OPT_CALLBACK('m', "buffer-size", &ftrace.percpu_buffer_size, "size",
"Size of per cpu buffer, needs to use a B, K, M or G suffix.", parse_buffer_size),
OPT_CALLBACK('s', "sort", &profile_sort, "key",
"Sort result by key: total (default), avg, max, count, name.",
parse_sort_key),
OPT_PARENT(common_options),
};
const struct option *options = ftrace_options;
const char * const ftrace_usage[] = {
"perf ftrace [<options>] [<command>]",
"perf ftrace [<options>] -- [<command>] [<options>]",
"perf ftrace {trace|latency} [<options>] [<command>]",
"perf ftrace {trace|latency} [<options>] -- [<command>] [<options>]",
"perf ftrace {trace|latency|profile} [<options>] [<command>]",
"perf ftrace {trace|latency|profile} [<options>] -- [<command>] [<options>]",
NULL
};
enum perf_ftrace_subcommand subcmd = PERF_FTRACE_NONE;
@ -1202,6 +1598,14 @@ int cmd_ftrace(int argc, const char **argv)
signal(SIGCHLD, sig_handler);
signal(SIGPIPE, sig_handler);
if (!check_ftrace_capable())
return -1;
if (!is_ftrace_supported()) {
pr_err("ftrace is not supported on this system\n");
return -ENOTSUP;
}
ret = perf_config(perf_ftrace_config, &ftrace);
if (ret < 0)
return -1;
@ -1212,6 +1616,9 @@ int cmd_ftrace(int argc, const char **argv)
} else if (!strcmp(argv[1], "latency")) {
subcmd = PERF_FTRACE_LATENCY;
options = latency_options;
} else if (!strcmp(argv[1], "profile")) {
subcmd = PERF_FTRACE_PROFILE;
options = profile_options;
}
if (subcmd != PERF_FTRACE_NONE) {
@ -1247,6 +1654,9 @@ int cmd_ftrace(int argc, const char **argv)
}
cmd_func = __cmd_latency;
break;
case PERF_FTRACE_PROFILE:
cmd_func = __cmd_profile;
break;
case PERF_FTRACE_NONE:
default:
pr_err("Invalid subcommand\n");
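
Reviewer note: the new 'perf ftrace profile' code above extracts a duration and a function name from each function_graph line before feeding them into a hashmap. The sketch below shows the same parsing idea as a standalone function. It is a simplification under stated assumptions, not the code from the patch: `demo_parse_line` is a hypothetical name, it copies the name into a caller-supplied buffer instead of editing the line in place, and it cuts the name at the first space, '(' or ';', so anything following the name on the line is dropped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one function_graph line, for example
 *   " 0)   0.271 us    |        __rcu_read_lock();"
 * Returns 1 and fills *nsec and func on success, 0 if the line carries
 * no usable duration or name (entry lines, headers, other time units).
 */
int demo_parse_line(const char *line, double *nsec, char *func, size_t func_sz)
{
	const char *p, *end;
	char *num_end;
	size_t n;

	assert(line && nsec && func && func_sz > 0);

	/* skip the CPU column, which ends at the first ')' */
	p = strchr(line, ')');
	if (!p)
		return 0;
	p++;
	while (*p == ' ' || *p == '\t')
		p++;

	/* function entry lines have no duration before the pipe */
	if (*p == '\0' || *p == '|')
		return 0;

	/* skip a one-character overhead marker such as '!' or '*' */
	if (!isdigit((unsigned char)*p))
		p++;

	/* durations are printed in usec; keep them as nsec like the patch */
	*nsec = strtod(p, &num_end) * 1000.0;
	if (strncmp(num_end, " us", 3) != 0)
		return 0;

	p = strchr(num_end, '|');
	if (!p)
		return 0;
	p++;
	while (*p == ' ' || *p == '\t')
		p++;

	/* exit lines name the function inside a trailing comment */
	if (*p == '}') {
		if (strncmp(p, "} /* ", 5) != 0)
			return 0;
		p += 5;
	}

	/* the name ends at the first '(', ';', space or newline */
	end = p;
	while (*end && *end != '(' && *end != ';' && *end != ' ' && *end != '\n')
		end++;

	n = (size_t)(end - p);
	if (n == 0 || n >= func_sz)
		return 0;

	memcpy(func, p, n);
	func[n] = '\0';
	return 1;
}
```

A caller would invoke this once per line read from trace_pipe and, when it returns 1, add the duration to the statistics kept for that function name.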


@ -417,7 +417,7 @@ static void open_html(const char *path)
static int show_html_page(const char *perf_cmd)
{
const char *page = cmd_to_page(perf_cmd);
char *page_path; /* it leaks but we exec bellow */
char *page_path; /* it leaks but we exec below */
if (get_html_page_path(&page_path, page) < 0)
return -1;

File diff suppressed because it is too large


@ -955,7 +955,7 @@ static bool perf_kmem__skip_sample(struct perf_sample *sample)
typedef int (*tracepoint_handler)(struct evsel *evsel,
struct perf_sample *sample);
static int process_sample_event(struct perf_tool *tool __maybe_unused,
static int process_sample_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -986,15 +986,6 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return err;
}
static struct perf_tool perf_kmem = {
.sample = process_sample_event,
.comm = perf_event__process_comm,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.namespaces = perf_event__process_namespaces,
.ordered_events = true,
};
static double fragmentation(unsigned long n_req, unsigned long n_alloc)
{
if (n_alloc == 0)
@ -1971,6 +1962,7 @@ int cmd_kmem(int argc, const char **argv)
NULL
};
struct perf_session *session;
struct perf_tool perf_kmem;
static const char errmsg[] = "No %s allocation events found. Have you run 'perf kmem record --%s'?\n";
int ret = perf_config(kmem_config, NULL);
@ -1998,6 +1990,13 @@ int cmd_kmem(int argc, const char **argv)
data.path = input_name;
perf_tool__init(&perf_kmem, /*ordered_events=*/true);
perf_kmem.sample = process_sample_event;
perf_kmem.comm = perf_event__process_comm;
perf_kmem.mmap = perf_event__process_mmap;
perf_kmem.mmap2 = perf_event__process_mmap2;
perf_kmem.namespaces = perf_event__process_namespaces;
kmem_session = session = perf_session__new(&data, &perf_kmem);
if (IS_ERR(session))
return PTR_ERR(session);
@ -2058,7 +2057,8 @@ int cmd_kmem(int argc, const char **argv)
out_delete:
perf_session__delete(session);
/* free usage string allocated by parse_options_subcommand */
free((void *)kmem_usage[0]);
return ret;
}


@ -1166,7 +1166,7 @@ static void print_result(struct perf_kvm_stat *kvm)
}
#if defined(HAVE_TIMERFD_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
static int process_lost_event(struct perf_tool *tool,
static int process_lost_event(const struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -1187,7 +1187,7 @@ static bool skip_sample(struct perf_kvm_stat *kvm,
return false;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -1603,19 +1603,17 @@ static int read_events(struct perf_kvm_stat *kvm)
{
int ret;
struct perf_tool eops = {
.sample = process_sample_event,
.comm = perf_event__process_comm,
.namespaces = perf_event__process_namespaces,
.ordered_events = true,
};
struct perf_data file = {
.path = kvm->file_name,
.mode = PERF_DATA_MODE_READ,
.force = kvm->force,
};
kvm->tool = eops;
perf_tool__init(&kvm->tool, /*ordered_events=*/true);
kvm->tool.sample = process_sample_event;
kvm->tool.comm = perf_event__process_comm;
kvm->tool.namespaces = perf_event__process_namespaces;
kvm->session = perf_session__new(&file, &kvm->tool);
if (IS_ERR(kvm->session)) {
pr_err("Initializing perf session failed\n");
@ -1919,14 +1917,13 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
/* event handling */
perf_tool__init(&kvm->tool, /*ordered_events=*/true);
kvm->tool.sample = process_sample_event;
kvm->tool.comm = perf_event__process_comm;
kvm->tool.exit = perf_event__process_exit;
kvm->tool.fork = perf_event__process_fork;
kvm->tool.lost = process_lost_event;
kvm->tool.namespaces = perf_event__process_namespaces;
kvm->tool.ordered_events = true;
perf_tool__fill_defaults(&kvm->tool);
/* set defaults */
kvm->display_time = 1;
@ -2187,5 +2184,8 @@ int cmd_kvm(int argc, const char **argv)
else
usage_with_options(kvm_usage, kvm_options);
/* free usage string allocated by parse_options_subcommand */
free((void *)kvm_usage[0]);
return 0;
}


@ -958,7 +958,7 @@ static int top_sched_switch_event(struct perf_kwork *kwork,
}
static struct kwork_class kwork_irq;
static int process_irq_handler_entry_event(struct perf_tool *tool,
static int process_irq_handler_entry_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -971,7 +971,7 @@ static int process_irq_handler_entry_event(struct perf_tool *tool,
return 0;
}
static int process_irq_handler_exit_event(struct perf_tool *tool,
static int process_irq_handler_exit_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1037,7 +1037,7 @@ static struct kwork_class kwork_irq = {
};
static struct kwork_class kwork_softirq;
static int process_softirq_raise_event(struct perf_tool *tool,
static int process_softirq_raise_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1051,7 +1051,7 @@ static int process_softirq_raise_event(struct perf_tool *tool,
return 0;
}
static int process_softirq_entry_event(struct perf_tool *tool,
static int process_softirq_entry_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1065,7 +1065,7 @@ static int process_softirq_entry_event(struct perf_tool *tool,
return 0;
}
static int process_softirq_exit_event(struct perf_tool *tool,
static int process_softirq_exit_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1167,7 +1167,7 @@ static struct kwork_class kwork_softirq = {
};
static struct kwork_class kwork_workqueue;
static int process_workqueue_activate_work_event(struct perf_tool *tool,
static int process_workqueue_activate_work_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1181,7 +1181,7 @@ static int process_workqueue_activate_work_event(struct perf_tool *tool,
return 0;
}
static int process_workqueue_execute_start_event(struct perf_tool *tool,
static int process_workqueue_execute_start_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1195,7 +1195,7 @@ static int process_workqueue_execute_start_event(struct perf_tool *tool,
return 0;
}
static int process_workqueue_execute_end_event(struct perf_tool *tool,
static int process_workqueue_execute_end_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1266,7 +1266,7 @@ static struct kwork_class kwork_workqueue = {
};
static struct kwork_class kwork_sched;
static int process_sched_switch_event(struct perf_tool *tool,
static int process_sched_switch_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1945,12 +1945,12 @@ static int perf_kwork__report(struct perf_kwork *kwork)
return 0;
}
typedef int (*tracepoint_handler)(struct perf_tool *tool,
typedef int (*tracepoint_handler)(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine);
static int perf_kwork__process_tracepoint_sample(struct perf_tool *tool,
static int perf_kwork__process_tracepoint_sample(const struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct perf_sample *sample,
struct evsel *evsel,
@ -2322,12 +2322,6 @@ int cmd_kwork(int argc, const char **argv)
{
static struct perf_kwork kwork = {
.class_list = LIST_HEAD_INIT(kwork.class_list),
.tool = {
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.sample = perf_kwork__process_tracepoint_sample,
.ordered_events = true,
},
.atom_page_list = LIST_HEAD_INIT(kwork.atom_page_list),
.sort_list = LIST_HEAD_INIT(kwork.sort_list),
.cmp_id = LIST_HEAD_INIT(kwork.cmp_id),
@ -2462,6 +2456,11 @@ int cmd_kwork(int argc, const char **argv)
"record", "report", "latency", "timehist", "top", NULL
};
perf_tool__init(&kwork.tool, /*ordered_events=*/true);
kwork.tool.mmap = perf_event__process_mmap;
kwork.tool.mmap2 = perf_event__process_mmap2;
kwork.tool.sample = perf_kwork__process_tracepoint_sample;
argc = parse_options_subcommand(argc, argv, kwork_options,
kwork_subcommands, kwork_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
@ -2520,5 +2519,8 @@ int cmd_kwork(int argc, const char **argv)
} else
usage_with_options(kwork_usage, kwork_options);
/* free usage string allocated by parse_options_subcommand */
free((void *)kwork_usage[0]);
return 0;
}


@ -173,7 +173,7 @@ static void default_print_event(void *ps, const char *pmu_name, const char *topi
if (pmu_name && strcmp(pmu_name, "default_core")) {
desc_len = strlen(desc);
desc_len = asprintf(&desc_with_unit,
desc[desc_len - 1] != '.'
desc_len > 0 && desc[desc_len - 1] != '.'
? "%s. Unit: %s" : "%s Unit: %s",
desc, pmu_name);
}
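
The added `desc_len > 0` test keeps `desc[desc_len - 1]` from reading before the start of the string when the description is empty. A minimal standalone sketch of the same guard follows. The `demo_*` names are hypothetical helpers, not perf functions, and the sketch builds the output with a fixed format string rather than selecting between two format strings as the hunk does:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: returns nonzero when a period
 * should be appended to desc. The length check comes first so that an
 * empty string never leads to a read of desc[-1].
 */
static int demo_needs_period(const char *desc)
{
	size_t desc_len = strlen(desc);

	return desc_len > 0 && desc[desc_len - 1] != '.';
}

/*
 * Hypothetical helper: writes "<desc>[.] Unit: <pmu_name>" into buf and
 * returns snprintf()'s result.
 */
static int demo_describe(char *buf, size_t size, const char *desc,
			 const char *pmu_name)
{
	return snprintf(buf, size, "%s%s Unit: %s", desc,
			demo_needs_period(desc) ? "." : "", pmu_name);
}
```

With this guard, an empty description yields no extra period, and a description that already ends in '.' is not given a second one.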


@ -1501,7 +1501,7 @@ static const struct evsel_str_handler contention_tracepoints[] = {
{ "lock:contention_end", evsel__process_contention_end, },
};
static int process_event_update(struct perf_tool *tool,
static int process_event_update(const struct perf_tool *tool,
union perf_event *event,
struct evlist **pevlist)
{
@ -1520,7 +1520,7 @@ static int process_event_update(struct perf_tool *tool,
typedef int (*tracepoint_handler)(struct evsel *evsel,
struct perf_sample *sample);
static int process_sample_event(struct perf_tool *tool __maybe_unused,
static int process_sample_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -1933,22 +1933,21 @@ static bool force;
static int __cmd_report(bool display_info)
{
int err = -EINVAL;
struct perf_tool eops = {
.attr = perf_event__process_attr,
.event_update = process_event_update,
.sample = process_sample_event,
.comm = perf_event__process_comm,
.mmap = perf_event__process_mmap,
.namespaces = perf_event__process_namespaces,
.tracing_data = perf_event__process_tracing_data,
.ordered_events = true,
};
struct perf_tool eops;
struct perf_data data = {
.path = input_name,
.mode = PERF_DATA_MODE_READ,
.force = force,
};
perf_tool__init(&eops, /*ordered_events=*/true);
eops.attr = perf_event__process_attr;
eops.event_update = process_event_update;
eops.sample = process_sample_event;
eops.comm = perf_event__process_comm;
eops.mmap = perf_event__process_mmap;
eops.namespaces = perf_event__process_namespaces;
eops.tracing_data = perf_event__process_tracing_data;
session = perf_session__new(&data, &eops);
if (IS_ERR(session)) {
pr_err("Initializing perf session failed\n");
@ -2069,15 +2068,7 @@ static int check_lock_contention_options(const struct option *options,
static int __cmd_contention(int argc, const char **argv)
{
int err = -EINVAL;
struct perf_tool eops = {
.attr = perf_event__process_attr,
.event_update = process_event_update,
.sample = process_sample_event,
.comm = perf_event__process_comm,
.mmap = perf_event__process_mmap,
.tracing_data = perf_event__process_tracing_data,
.ordered_events = true,
};
struct perf_tool eops;
struct perf_data data = {
.path = input_name,
.mode = PERF_DATA_MODE_READ,
@ -2100,6 +2091,14 @@ static int __cmd_contention(int argc, const char **argv)
con.result = &lockhash_table[0];
perf_tool__init(&eops, /*ordered_events=*/true);
eops.attr = perf_event__process_attr;
eops.event_update = process_event_update;
eops.sample = process_sample_event;
eops.comm = perf_event__process_comm;
eops.mmap = perf_event__process_mmap;
eops.tracing_data = perf_event__process_tracing_data;
session = perf_session__new(use_bpf ? NULL : &data, &eops);
if (IS_ERR(session)) {
pr_err("Initializing perf session failed\n");
@ -2713,6 +2712,9 @@ int cmd_lock(int argc, const char **argv)
usage_with_options(lock_usage, lock_options);
}
/* free usage string allocated by parse_options_subcommand */
free((void *)lock_usage[0]);
zfree(&lockhash_table);
return rc;
}
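
The conversions in this file follow one pattern: the designated initializer is dropped, perf_tool__init() is called first, and the handlers are assigned afterwards. A minimal sketch of that pattern is below. It assumes the init routine installs a no-op stub for every callback, which the replacement of NULL with process_event_sample_stub elsewhere in this diff suggests but which is not shown here. All `demo_*` names are hypothetical and are not perf APIs:

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_tool;
typedef int (*demo_handler)(const struct demo_tool *tool, int value);

/* Hypothetical, heavily reduced stand-in for struct perf_tool. */
struct demo_tool {
	demo_handler sample;
	demo_handler comm;
	bool ordered_events;
};

/* No-op default, so callers never need to test a callback for NULL. */
static int demo_stub(const struct demo_tool *tool, int value)
{
	(void)tool;
	(void)value;
	return 0;
}

/* Assumed behaviour: record the ordering flag and install stubs everywhere. */
static void demo_tool__init(struct demo_tool *tool, bool ordered_events)
{
	tool->ordered_events = ordered_events;
	tool->sample = demo_stub;
	tool->comm = demo_stub;
}

/* A tool-specific handler that overrides one of the stubs. */
static int demo_process_sample(const struct demo_tool *tool, int value)
{
	(void)tool;
	return value * 2;
}

/* Mirrors the order used in the hunks: init first, then per-tool handlers. */
static void demo_setup(struct demo_tool *tool)
{
	demo_tool__init(tool, /*ordered_events=*/true);
	tool->sample = demo_process_sample;
}
```

Doing the init before the assignments matters in this sketch: calling demo_tool__init() afterwards would overwrite the tool-specific handler with the stub.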


@ -19,6 +19,7 @@
#include "util/symbol.h"
#include "util/pmus.h"
#include "util/sample.h"
#include "util/sort.h"
#include "util/string2.h"
#include "util/util.h"
#include <linux/err.h>
@ -28,12 +29,16 @@
struct perf_mem {
struct perf_tool tool;
char const *input_name;
const char *input_name;
const char *sort_key;
bool hide_unresolved;
bool dump_raw;
bool force;
bool phys_addr;
bool data_page_size;
bool all_kernel;
bool all_user;
bool data_type;
int operation;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@ -42,7 +47,7 @@ struct perf_mem {
static int parse_record_events(const struct option *opt,
const char *str, int unset __maybe_unused)
{
struct perf_mem *mem = *(struct perf_mem **)opt->value;
struct perf_mem *mem = (struct perf_mem *)opt->value;
struct perf_pmu *pmu;
pmu = perf_mem_events_find_pmu();
@ -62,33 +67,19 @@ static int parse_record_events(const struct option *opt,
return 0;
}
static const char * const __usage[] = {
"perf mem record [<options>] [<command>]",
"perf mem record [<options>] -- <command> [<options>]",
NULL
};
static const char * const *record_mem_usage = __usage;
static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
static int __cmd_record(int argc, const char **argv, struct perf_mem *mem,
const struct option *options)
{
int rec_argc, i = 0, j;
int start, end;
const char **rec_argv;
int ret;
bool all_user = false, all_kernel = false;
struct perf_mem_event *e;
struct perf_pmu *pmu;
struct option options[] = {
OPT_CALLBACK('e', "event", &mem, "event",
"event selector. use 'perf mem record -e list' to list available events",
parse_record_events),
OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_BOOLEAN('U', "all-user", &all_user, "collect only user level data"),
OPT_BOOLEAN('K', "all-kernel", &all_kernel, "collect only kernel level data"),
OPT_END()
const char * const record_usage[] = {
"perf mem record [<options>] [<command>]",
"perf mem record [<options>] -- <command> [<options>]",
NULL
};
pmu = perf_mem_events_find_pmu();
@ -97,12 +88,12 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
return -1;
}
if (perf_pmu__mem_events_init(pmu)) {
if (perf_pmu__mem_events_init()) {
pr_err("failed: memory events not supported\n");
return -1;
}
argc = parse_options(argc, argv, options, record_mem_usage,
argc = parse_options(argc, argv, options, record_usage,
PARSE_OPT_KEEP_UNKNOWN);
/* Max number of arguments multiplied by number of PMUs that can support them. */
@ -126,22 +117,17 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
if (e->tag &&
(mem->operation & MEM_OPERATION_LOAD) &&
(mem->operation & MEM_OPERATION_STORE)) {
e->record = true;
perf_mem_record[PERF_MEM_EVENTS__LOAD_STORE] = true;
rec_argv[i++] = "-W";
} else {
if (mem->operation & MEM_OPERATION_LOAD) {
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
e->record = true;
}
if (mem->operation & MEM_OPERATION_LOAD)
perf_mem_record[PERF_MEM_EVENTS__LOAD] = true;
if (mem->operation & MEM_OPERATION_STORE) {
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__STORE);
e->record = true;
}
if (mem->operation & MEM_OPERATION_STORE)
perf_mem_record[PERF_MEM_EVENTS__STORE] = true;
}
e = perf_pmu__mem_events_ptr(pmu, PERF_MEM_EVENTS__LOAD);
if (e->record)
if (perf_mem_record[PERF_MEM_EVENTS__LOAD])
rec_argv[i++] = "-W";
rec_argv[i++] = "-d";
@ -158,10 +144,10 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
goto out;
end = i;
if (all_user)
if (mem->all_user)
rec_argv[i++] = "--all-user";
if (all_kernel)
if (mem->all_kernel)
rec_argv[i++] = "--all-kernel";
if (mem->cpu_list) {
@ -188,7 +174,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
}
static int
dump_raw_samples(struct perf_tool *tool,
dump_raw_samples(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -262,7 +248,7 @@ dump_raw_samples(struct perf_tool *tool,
return 0;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel __maybe_unused,
@ -285,7 +271,23 @@ static int report_raw_events(struct perf_mem *mem)
.force = mem->force,
};
int ret;
struct perf_session *session = perf_session__new(&data, &mem->tool);
struct perf_session *session;
perf_tool__init(&mem->tool, /*ordered_events=*/true);
mem->tool.sample = process_sample_event;
mem->tool.mmap = perf_event__process_mmap;
mem->tool.mmap2 = perf_event__process_mmap2;
mem->tool.comm = perf_event__process_comm;
mem->tool.lost = perf_event__process_lost;
mem->tool.fork = perf_event__process_fork;
mem->tool.attr = perf_event__process_attr;
mem->tool.build_id = perf_event__process_build_id;
mem->tool.namespaces = perf_event__process_namespaces;
mem->tool.auxtrace_info = perf_event__process_auxtrace_info;
mem->tool.auxtrace = perf_event__process_auxtrace;
mem->tool.auxtrace_error = perf_event__process_auxtrace_error;
session = perf_session__new(&data, &mem->tool);
if (IS_ERR(session))
return PTR_ERR(session);
@ -319,16 +321,21 @@ static int report_raw_events(struct perf_mem *mem)
perf_session__delete(session);
return ret;
}
static char *get_sort_order(struct perf_mem *mem)
{
bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false;
char sort[128];
if (mem->sort_key)
scnprintf(sort, sizeof(sort), "--sort=%s", mem->sort_key);
else if (mem->data_type)
strcpy(sort, "--sort=mem,snoop,tlb,type");
/*
* there is no weight (cost) associated with stores, so don't print
* the column
*/
if (!(mem->operation & MEM_OPERATION_LOAD)) {
else if (!(mem->operation & MEM_OPERATION_LOAD)) {
strcpy(sort, "--sort=mem,sym,dso,symbol_daddr,"
"dso_daddr,tlb,locked");
} else if (has_extra_options) {
@ -343,14 +350,26 @@ static char *get_sort_order(struct perf_mem *mem)
if (mem->data_page_size)
strcat(sort, ",data_page_size");
/* make sure it has 'type' sort key even -s option is used */
if (mem->data_type && !strstr(sort, "type"))
strcat(sort, ",type");
return strdup(sort);
}
static int report_events(int argc, const char **argv, struct perf_mem *mem)
static int __cmd_report(int argc, const char **argv, struct perf_mem *mem,
const struct option *options)
{
const char **rep_argv;
int ret, i = 0, j, rep_argc;
char *new_sort_order;
const char * const report_usage[] = {
"perf mem report [<options>]",
NULL
};
argc = parse_options(argc, argv, options, report_usage,
PARSE_OPT_KEEP_UNKNOWN);
if (mem->dump_raw)
return report_raw_events(mem);
@ -368,10 +387,11 @@ static int report_events(int argc, const char **argv, struct perf_mem *mem)
if (new_sort_order)
rep_argv[i++] = new_sort_order;
for (j = 1; j < argc; j++, i++)
for (j = 0; j < argc; j++, i++)
rep_argv[i] = argv[j];
ret = cmd_report(i, rep_argv);
free(new_sort_order);
free(rep_argv);
return ret;
}
@ -449,47 +469,51 @@ int cmd_mem(int argc, const char **argv)
{
struct stat st;
struct perf_mem mem = {
.tool = {
.sample = process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.lost = perf_event__process_lost,
.fork = perf_event__process_fork,
.attr = perf_event__process_attr,
.build_id = perf_event__process_build_id,
.namespaces = perf_event__process_namespaces,
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
.ordered_events = true,
},
.input_name = "perf.data",
/*
* default to both load an store sampling
*/
.operation = MEM_OPERATION_LOAD | MEM_OPERATION_STORE,
};
char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__MEMORY);
const struct option mem_options[] = {
OPT_CALLBACK('t', "type", &mem.operation,
"type", "memory operations(load,store) Default load,store",
parse_mem_ops),
OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
"list of cpus to profile"),
OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
OPT_END()
};
const struct option record_options[] = {
OPT_CALLBACK('e', "event", &mem, "event",
"event selector. use 'perf mem record -e list' to list available events",
parse_record_events),
OPT_UINTEGER(0, "ldlat", &perf_mem_events__loads_ldlat, "mem-loads latency"),
OPT_BOOLEAN('U', "all-user", &mem.all_user, "collect only user level data"),
OPT_BOOLEAN('K', "all-kernel", &mem.all_kernel, "collect only kernel level data"),
OPT_PARENT(mem_options)
};
const struct option report_options[] = {
OPT_BOOLEAN('D', "dump-raw-samples", &mem.dump_raw,
"dump raw samples in ASCII"),
OPT_BOOLEAN('U', "hide-unresolved", &mem.hide_unresolved,
"Only display entries resolved to a symbol"),
OPT_STRING('i', "input", &input_name, "file",
"input file name"),
OPT_STRING('C', "cpu", &mem.cpu_list, "cpu",
"list of cpus to profile"),
OPT_STRING_NOEMPTY('x', "field-separator", &symbol_conf.field_sep,
"separator",
"separator for columns, no spaces will be added"
" between columns '.' is reserved."),
OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"),
OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"),
OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"),
OPT_END()
OPT_STRING('s', "sort", &mem.sort_key, "key[,key2...]",
sort_order_help),
OPT_BOOLEAN('T', "type-profile", &mem.data_type,
"Show data-type profile result"),
OPT_PARENT(mem_options)
};
const char *const mem_subcommands[] = { "record", "report", NULL };
const char *mem_usage[] = {
@ -498,7 +522,7 @@ int cmd_mem(int argc, const char **argv)
};
argc = parse_options_subcommand(argc, argv, mem_options, mem_subcommands,
mem_usage, PARSE_OPT_KEEP_UNKNOWN);
mem_usage, PARSE_OPT_STOP_AT_NON_OPTION);
if (!argc || !(strncmp(argv[0], "rec", 3) || mem.operation))
usage_with_options(mem_usage, mem_options);
@ -511,11 +535,14 @@ int cmd_mem(int argc, const char **argv)
}
if (strlen(argv[0]) > 2 && strstarts("record", argv[0]))
return __cmd_record(argc, argv, &mem);
return __cmd_record(argc, argv, &mem, record_options);
else if (strlen(argv[0]) > 2 && strstarts("report", argv[0]))
return report_events(argc, argv, &mem);
return __cmd_report(argc, argv, &mem, report_options);
else
usage_with_options(mem_usage, mem_options);
/* free usage string allocated by parse_options_subcommand */
free((void *)mem_usage[0]);
return 0;
}


@ -171,6 +171,7 @@ struct record {
bool timestamp_filename;
bool timestamp_boundary;
bool off_cpu;
const char *filter_action;
struct switch_output switch_output;
unsigned long long samples;
unsigned long output_max_size; /* = 0: unlimited */
@ -193,6 +194,15 @@ static const char *affinity_tags[PERF_AFFINITY_MAX] = {
"SYS", "NODE", "CPU"
};
static int build_id__process_mmap(const struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine);
static int build_id__process_mmap2(const struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine);
static int process_timestamp_boundary(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine);
#ifndef HAVE_GETTID
static inline pid_t gettid(void)
{
@ -608,7 +618,7 @@ static int record__comp_enabled(struct record *rec)
return rec->opts.comp_level > 0;
}
static int process_synthesized_event(struct perf_tool *tool,
static int process_synthesized_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -619,7 +629,7 @@ static int process_synthesized_event(struct perf_tool *tool,
static struct mutex synth_lock;
static int process_locked_synthesized_event(struct perf_tool *tool,
static int process_locked_synthesized_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -704,7 +714,7 @@ static void record__sig_exit(void)
#ifdef HAVE_AUXTRACE_SUPPORT
static int record__process_auxtrace(struct perf_tool *tool,
static int record__process_auxtrace(const struct perf_tool *tool,
struct mmap *map,
union perf_event *event, void *data1,
size_t len1, void *data2, size_t len2)
@ -1389,7 +1399,7 @@ static int record__open(struct record *rec)
"even with a suitable vmlinux or kallsyms file.\n\n");
}
if (evlist__apply_filters(evlist, &pos)) {
if (evlist__apply_filters(evlist, &pos, &opts->target)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
pos->filter ?: "BPF", evsel__name(pos), errno,
str_error_r(errno, msg, sizeof(msg)));
@ -1416,7 +1426,7 @@ static void set_timestamp_boundary(struct record *rec, u64 sample_time)
rec->evlist->last_sample_time = sample_time;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -1458,7 +1468,7 @@ static int process_buildids(struct record *rec)
* first/last samples.
*/
if (rec->buildid_all && !rec->timestamp_boundary)
rec->tool.sample = NULL;
rec->tool.sample = process_event_sample_stub;
return perf_session__process_events(session);
}
@ -2364,13 +2374,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGTERM, sig_handler);
signal(SIGSEGV, sigsegv_handler);
if (rec->opts.record_namespaces)
tool->namespace_events = true;
if (rec->opts.record_cgroup) {
#ifdef HAVE_FILE_HANDLE
tool->cgroup_events = true;
#else
#ifndef HAVE_FILE_HANDLE
pr_err("cgroup tracking is not supported\n");
return -1;
#endif
@ -2386,6 +2391,18 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
signal(SIGUSR2, SIG_IGN);
}
perf_tool__init(tool, /*ordered_events=*/true);
tool->sample = process_sample_event;
tool->fork = perf_event__process_fork;
tool->exit = perf_event__process_exit;
tool->comm = perf_event__process_comm;
tool->namespaces = perf_event__process_namespaces;
tool->mmap = build_id__process_mmap;
tool->mmap2 = build_id__process_mmap2;
tool->itrace_start = process_timestamp_boundary;
tool->aux = process_timestamp_boundary;
tool->namespace_events = rec->opts.record_namespaces;
tool->cgroup_events = rec->opts.record_cgroup;
session = perf_session__new(data, tool);
if (IS_ERR(session)) {
pr_err("Perf session creation failed.\n");
@ -3243,7 +3260,7 @@ static const char * const __record_usage[] = {
};
const char * const *record_usage = __record_usage;
static int build_id__process_mmap(struct perf_tool *tool, union perf_event *event,
static int build_id__process_mmap(const struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine)
{
/*
@ -3255,7 +3272,7 @@ static int build_id__process_mmap(struct perf_tool *tool, union perf_event *even
return perf_event__process_mmap(tool, event, sample, machine);
}
static int build_id__process_mmap2(struct perf_tool *tool, union perf_event *event,
static int build_id__process_mmap2(const struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine)
{
/*
@ -3268,7 +3285,7 @@ static int build_id__process_mmap2(struct perf_tool *tool, union perf_event *eve
return perf_event__process_mmap2(tool, event, sample, machine);
}
static int process_timestamp_boundary(struct perf_tool *tool,
static int process_timestamp_boundary(const struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct perf_sample *sample,
struct machine *machine __maybe_unused)
@ -3326,18 +3343,6 @@ static struct record record = {
.ctl_fd_ack = -1,
.synth = PERF_SYNTH_ALL,
},
.tool = {
.sample = process_sample_event,
.fork = perf_event__process_fork,
.exit = perf_event__process_exit,
.comm = perf_event__process_comm,
.namespaces = perf_event__process_namespaces,
.mmap = build_id__process_mmap,
.mmap2 = build_id__process_mmap2,
.itrace_start = process_timestamp_boundary,
.aux = process_timestamp_boundary,
.ordered_events = true,
},
};
const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
@ -3557,6 +3562,8 @@ static struct option __record_options[] = {
"write collected trace data into several data files using parallel threads",
record__parse_threads),
OPT_BOOLEAN(0, "off-cpu", &record.off_cpu, "Enable off-cpu analysis"),
OPT_STRING(0, "setup-filter", &record.filter_action, "pin|unpin",
"BPF filter action"),
OPT_END()
};
@ -4086,6 +4093,18 @@ int cmd_record(int argc, const char **argv)
pr_warning("WARNING: --timestamp-filename option is not available in parallel streaming mode.\n");
}
if (rec->filter_action) {
if (!strcmp(rec->filter_action, "pin"))
err = perf_bpf_filter__pin();
else if (!strcmp(rec->filter_action, "unpin"))
err = perf_bpf_filter__unpin();
else {
pr_warning("Unknown BPF filter action: %s\n", rec->filter_action);
err = -EINVAL;
}
goto out_opts;
}
/*
* Allow aliases to facilitate the lookup of symbols for address
* filters. Refer to auxtrace_parse_filters().
@ -4242,13 +4261,13 @@ int cmd_record(int argc, const char **argv)
err = __cmd_record(&record, argc, argv);
out:
evlist__delete(rec->evlist);
record__free_thread_masks(rec, rec->nr_threads);
rec->nr_threads = 0;
symbol__exit();
auxtrace_record__free(rec->itr);
out_opts:
record__free_thread_masks(rec, rec->nr_threads);
rec->nr_threads = 0;
evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
evlist__delete(rec->evlist);
return err;
}
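
The --setup-filter handling added to cmd_record() dispatches on the option string and then leaves through out_opts without recording anything. A sketch of just that dispatch is below. demo_pin() and demo_unpin() are hypothetical stand-ins for perf_bpf_filter__pin() and perf_bpf_filter__unpin(), whose real behaviour is not part of this diff:

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins; the real functions manage pinned BPF objects. */
static int demo_pin(void)
{
	return 0;
}

static int demo_unpin(void)
{
	return 0;
}

/*
 * Maps the action string to a handler, as the hunk does for "pin" and
 * "unpin". Any other string is rejected with -EINVAL.
 */
static int demo_filter_action(const char *action)
{
	if (!strcmp(action, "pin"))
		return demo_pin();
	if (!strcmp(action, "unpin"))
		return demo_unpin();
	return -EINVAL;
}
```

The comparison is exact, so in this sketch a prefix such as "pi" is treated as an unknown action rather than an abbreviation.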


@ -263,7 +263,7 @@ static int process_feature_event(struct perf_session *session,
return 0;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -328,7 +328,7 @@ static int process_sample_event(struct perf_tool *tool,
if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
hist__account_cycles(sample->branch_stack, &al, sample,
rep->nonany_branch_mode,
&rep->total_cycles);
&rep->total_cycles, evsel);
}
ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
@ -339,7 +339,7 @@ static int process_sample_event(struct perf_tool *tool,
return ret;
}
static int process_read_event(struct perf_tool *tool,
static int process_read_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct evsel *evsel,
@ -565,6 +565,7 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
struct hists *hists = evsel__hists(pos);
const char *evname = evsel__name(pos);
i++;
if (symbol_conf.event_group && !evsel__is_group_leader(pos))
continue;
@ -574,7 +575,14 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
if (rep->total_cycles_mode) {
report__browse_block_hists(&rep->block_reports[i++].hist,
char *buf;
if (!annotation_br_cntr_abbr_list(&buf, pos, true)) {
fprintf(stdout, "%s", buf);
fprintf(stdout, "#\n");
free(buf);
}
report__browse_block_hists(&rep->block_reports[i - 1].hist,
rep->min_percent, pos, NULL);
continue;
}
@ -765,7 +773,7 @@ static void report__output_resort(struct report *rep)
ui_progress__finish();
}
static int count_sample_event(struct perf_tool *tool __maybe_unused,
static int count_sample_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct evsel *evsel,
@ -777,7 +785,7 @@ static int count_sample_event(struct perf_tool *tool __maybe_unused,
return 0;
}
static int count_lost_samples_event(struct perf_tool *tool,
static int count_lost_samples_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine __maybe_unused)
@ -787,22 +795,28 @@ static int count_lost_samples_event(struct perf_tool *tool,
evsel = evlist__id2evsel(rep->session->evlist, sample->id);
if (evsel) {
hists__inc_nr_lost_samples(evsel__hists(evsel),
event->lost_samples.lost);
struct hists *hists = evsel__hists(evsel);
u32 count = event->lost_samples.lost;
if (event->header.misc & PERF_RECORD_MISC_LOST_SAMPLES_BPF)
hists__inc_nr_dropped_samples(hists, count);
else
hists__inc_nr_lost_samples(hists, count);
}
return 0;
}
static int process_attr(struct perf_tool *tool __maybe_unused,
static int process_attr(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct evlist **pevlist);
static void stats_setup(struct report *rep)
{
memset(&rep->tool, 0, sizeof(rep->tool));
perf_tool__init(&rep->tool, /*ordered_events=*/false);
rep->tool.attr = process_attr;
rep->tool.sample = count_sample_event;
rep->tool.lost_samples = count_lost_samples_event;
rep->tool.event_update = perf_event__process_event_update;
rep->tool.no_warn = true;
}
@ -817,8 +831,7 @@ static int stats_print(struct report *rep)
static void tasks_setup(struct report *rep)
{
memset(&rep->tool, 0, sizeof(rep->tool));
rep->tool.ordered_events = true;
perf_tool__init(&rep->tool, /*ordered_events=*/true);
if (rep->mmaps_mode) {
rep->tool.mmap = perf_event__process_mmap;
rep->tool.mmap2 = perf_event__process_mmap2;
@ -1119,18 +1132,23 @@ static int __cmd_report(struct report *rep)
report__output_resort(rep);
if (rep->total_cycles_mode) {
int block_hpps[6] = {
int nr_hpps = 4;
int block_hpps[PERF_HPP_REPORT__BLOCK_MAX_INDEX] = {
PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
PERF_HPP_REPORT__BLOCK_LBR_CYCLES,
PERF_HPP_REPORT__BLOCK_CYCLES_PCT,
PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
PERF_HPP_REPORT__BLOCK_RANGE,
PERF_HPP_REPORT__BLOCK_DSO,
};
if (session->evlist->nr_br_cntr > 0)
block_hpps[nr_hpps++] = PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER;
block_hpps[nr_hpps++] = PERF_HPP_REPORT__BLOCK_RANGE;
block_hpps[nr_hpps++] = PERF_HPP_REPORT__BLOCK_DSO;
rep->block_reports = block_info__create_report(session->evlist,
rep->total_cycles,
block_hpps, 6,
block_hpps, nr_hpps,
&rep->nr_block_reports);
if (!rep->block_reports)
return -1;
@ -1233,7 +1251,7 @@ parse_percent_limit(const struct option *opt, const char *str,
return 0;
}
static int process_attr(struct perf_tool *tool __maybe_unused,
static int process_attr(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct evlist **pevlist)
{
@ -1272,37 +1290,13 @@ int cmd_report(int argc, const char **argv)
NULL
};
struct report report = {
.tool = {
.sample = process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.namespaces = perf_event__process_namespaces,
.cgroup = perf_event__process_cgroup,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.lost = perf_event__process_lost,
.read = process_read_event,
.attr = process_attr,
#ifdef HAVE_LIBTRACEEVENT
.tracing_data = perf_event__process_tracing_data,
#endif
.build_id = perf_event__process_build_id,
.id_index = perf_event__process_id_index,
.auxtrace_info = perf_event__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.event_update = perf_event__process_event_update,
.feature = process_feature_event,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
.max_stack = PERF_MAX_STACK_DEPTH,
.pretty_printing_style = "normal",
.socket_filter = -1,
.skip_empty = true,
};
char *sort_order_help = sort_help("sort by key(s):");
char *field_order_help = sort_help("output field(s): overhead period sample ");
char *sort_order_help = sort_help("sort by key(s):", SORT_MODE__NORMAL);
char *field_order_help = sort_help("output field(s):", SORT_MODE__NORMAL);
const char *disassembler_style = NULL, *objdump_path = NULL, *addr2line_path = NULL;
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
@ -1477,6 +1471,7 @@ int cmd_report(int argc, const char **argv)
};
int ret = hists__init();
char sort_tmp[128];
bool ordered_events = true;
if (ret < 0)
goto exit;
@ -1531,7 +1526,7 @@ int cmd_report(int argc, const char **argv)
report.tasks_mode = true;
if (dump_trace && report.disable_order)
report.tool.ordered_events = false;
ordered_events = false;
if (quiet)
perf_quiet_option();
@ -1562,6 +1557,29 @@ int cmd_report(int argc, const char **argv)
symbol_conf.skip_empty = report.skip_empty;
repeat:
perf_tool__init(&report.tool, ordered_events);
report.tool.sample = process_sample_event;
report.tool.mmap = perf_event__process_mmap;
report.tool.mmap2 = perf_event__process_mmap2;
report.tool.comm = perf_event__process_comm;
report.tool.namespaces = perf_event__process_namespaces;
report.tool.cgroup = perf_event__process_cgroup;
report.tool.exit = perf_event__process_exit;
report.tool.fork = perf_event__process_fork;
report.tool.lost = perf_event__process_lost;
report.tool.read = process_read_event;
report.tool.attr = process_attr;
#ifdef HAVE_LIBTRACEEVENT
report.tool.tracing_data = perf_event__process_tracing_data;
#endif
report.tool.build_id = perf_event__process_build_id;
report.tool.id_index = perf_event__process_id_index;
report.tool.auxtrace_info = perf_event__process_auxtrace_info;
report.tool.auxtrace = perf_event__process_auxtrace;
report.tool.event_update = perf_event__process_event_update;
report.tool.feature = process_feature_event;
report.tool.ordering_requires_timestamps = true;
session = perf_session__new(&data, &report.tool);
if (IS_ERR(session)) {
ret = PTR_ERR(session);
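
The hunks above replace the static `.tool = { ... }` initializer with a call to `perf_tool__init(&report.tool, ordered_events)` followed by plain member assignments. The sketch below illustrates that general pattern only. The `sketch_*` struct, handlers and init function are hypothetical stand-ins, and what `perf_tool__init()` actually sets inside perf is an assumption here, not something this diff shows.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_tool;
typedef int (*sketch_handler)(const struct sketch_tool *tool, int event);

/* Hypothetical, much reduced stand-in for struct perf_tool. */
struct sketch_tool {
	sketch_handler sample;
	sketch_handler comm;
	bool ordered_events;
	bool ordering_requires_timestamps;
};

/* Default callback: accept the event and do nothing. */
static int sketch_stub(const struct sketch_tool *tool, int event)
{
	(void)tool;
	(void)event;
	return 0;
}

/* Put every member into a known state before the caller overrides some. */
static void sketch_tool_init(struct sketch_tool *tool, bool ordered_events)
{
	tool->sample = sketch_stub;
	tool->comm = sketch_stub;
	tool->ordered_events = ordered_events;
	tool->ordering_requires_timestamps = false;
}

/* Example of a tool-specific handler installed after init. */
static int sketch_count_sample(const struct sketch_tool *tool, int event)
{
	(void)tool;
	return event + 1;
}
```

A caller would run `sketch_tool_init(&t, true)` and then assign only the handlers it cares about, for example `t.sample = sketch_count_sample;`. Members left alone keep a callable default, so handlers can take a `const` tool pointer without checking for NULL callbacks.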

@ -51,6 +51,7 @@
#define COMM_LEN 20
#define SYM_LEN 129
#define MAX_PID 1024000
#define MAX_PRIO 140
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@ -228,11 +229,14 @@ struct perf_sched {
bool show_next;
bool show_migrations;
bool show_state;
bool show_prio;
u64 skipped_samples;
const char *time_str;
struct perf_time_interval ptime;
struct perf_time_interval hist_time;
volatile bool thread_funcs_exit;
const char *prio_str;
DECLARE_BITMAP(prio_bitmap, MAX_PRIO);
};
/* per thread run time data */
@ -258,6 +262,8 @@ struct thread_runtime {
bool comm_changed;
u64 migrations;
int prio;
};
/* per event run time data */
@ -920,6 +926,11 @@ struct sort_dimension {
struct list_head list;
};
static inline void init_prio(struct thread_runtime *r)
{
r->prio = -1;
}
/*
* handle runtime stats saved per thread
*/
@ -932,6 +943,7 @@ static struct thread_runtime *thread__init_runtime(struct thread *thread)
return NULL;
init_stats(&r->run_stats);
init_prio(r);
thread__set_priv(thread, r);
return r;
@ -1489,7 +1501,7 @@ static void perf_sched__sort_lat(struct perf_sched *sched)
}
}
static int process_sched_wakeup_event(struct perf_tool *tool,
static int process_sched_wakeup_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1502,7 +1514,7 @@ static int process_sched_wakeup_event(struct perf_tool *tool,
return 0;
}
static int process_sched_wakeup_ignore(struct perf_tool *tool __maybe_unused,
static int process_sched_wakeup_ignore(const struct perf_tool *tool __maybe_unused,
struct evsel *evsel __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -1770,7 +1782,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
return 0;
}
static int process_sched_switch_event(struct perf_tool *tool,
static int process_sched_switch_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1796,7 +1808,7 @@ static int process_sched_switch_event(struct perf_tool *tool,
return err;
}
static int process_sched_runtime_event(struct perf_tool *tool,
static int process_sched_runtime_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1809,7 +1821,7 @@ static int process_sched_runtime_event(struct perf_tool *tool,
return 0;
}
static int perf_sched__process_fork_event(struct perf_tool *tool,
static int perf_sched__process_fork_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -1826,7 +1838,7 @@ static int perf_sched__process_fork_event(struct perf_tool *tool,
return 0;
}
static int process_sched_migrate_task_event(struct perf_tool *tool,
static int process_sched_migrate_task_event(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
@ -1839,12 +1851,12 @@ static int process_sched_migrate_task_event(struct perf_tool *tool,
return 0;
}
typedef int (*tracepoint_handler)(struct perf_tool *tool,
typedef int (*tracepoint_handler)(const struct perf_tool *tool,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine);
static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_unused,
static int perf_sched__process_tracepoint_sample(const struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct perf_sample *sample,
struct evsel *evsel,
@ -1860,7 +1872,7 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_
return err;
}
static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused,
static int perf_sched__process_comm(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2036,6 +2048,24 @@ static char *timehist_get_commstr(struct thread *thread)
return str;
}
/* prio field format: xxx or xxx->yyy */
#define MAX_PRIO_STR_LEN 8
static char *timehist_get_priostr(struct evsel *evsel,
struct thread *thread,
struct perf_sample *sample)
{
static char prio_str[16];
int prev_prio = (int)evsel__intval(evsel, sample, "prev_prio");
struct thread_runtime *tr = thread__priv(thread);
if (tr->prio != prev_prio && tr->prio != -1)
scnprintf(prio_str, sizeof(prio_str), "%d->%d", tr->prio, prev_prio);
else
scnprintf(prio_str, sizeof(prio_str), "%d", prev_prio);
return prio_str;
}
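
The formatting rule in `timehist_get_priostr()` ("xxx" or "xxx->yyy") can be sketched in isolation as follows. The function name and the caller-supplied buffer are illustrative choices, not part of the patch, which returns a pointer to a static buffer instead.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * saved_prio: priority recorded at the task's last sched-in, or -1 if none.
 * prev_prio:  priority reported by the current sched_switch event.
 * Writes "saved->prev" when the priority changed while the task ran,
 * otherwise just "prev".
 */
static const char *sketch_prio_str(char *buf, size_t len,
				   int saved_prio, int prev_prio)
{
	if (saved_prio != -1 && saved_prio != prev_prio)
		snprintf(buf, len, "%d->%d", saved_prio, prev_prio);
	else
		snprintf(buf, len, "%d", prev_prio);
	return buf;
}
```

With priorities below `MAX_PRIO` (140), the longest result is eight characters such as "139->100", which matches the `MAX_PRIO_STR_LEN` column width of 8.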
static void timehist_header(struct perf_sched *sched)
{
u32 ncpus = sched->max_cpu.cpu + 1;
@ -2053,8 +2083,14 @@ static void timehist_header(struct perf_sched *sched)
printf(" ");
}
printf(" %-*s %9s %9s %9s", comm_width,
"task name", "wait time", "sch delay", "run time");
if (sched->show_prio) {
printf(" %-*s %-*s %9s %9s %9s",
comm_width, "task name", MAX_PRIO_STR_LEN, "prio",
"wait time", "sch delay", "run time");
} else {
printf(" %-*s %9s %9s %9s", comm_width,
"task name", "wait time", "sch delay", "run time");
}
if (sched->show_state)
printf(" %s", "state");
@ -2069,8 +2105,14 @@ static void timehist_header(struct perf_sched *sched)
if (sched->show_cpu_visual)
printf(" %*s ", ncpus, "");
printf(" %-*s %9s %9s %9s", comm_width,
"[tid/pid]", "(msec)", "(msec)", "(msec)");
if (sched->show_prio) {
printf(" %-*s %-*s %9s %9s %9s",
comm_width, "[tid/pid]", MAX_PRIO_STR_LEN, "",
"(msec)", "(msec)", "(msec)");
} else {
printf(" %-*s %9s %9s %9s", comm_width,
"[tid/pid]", "(msec)", "(msec)", "(msec)");
}
if (sched->show_state)
printf(" %5s", "");
@ -2085,9 +2127,15 @@ static void timehist_header(struct perf_sched *sched)
if (sched->show_cpu_visual)
printf(" %.*s ", ncpus, graph_dotted_line);
printf(" %.*s %.9s %.9s %.9s", comm_width,
graph_dotted_line, graph_dotted_line, graph_dotted_line,
graph_dotted_line);
if (sched->show_prio) {
printf(" %.*s %.*s %.9s %.9s %.9s",
comm_width, graph_dotted_line, MAX_PRIO_STR_LEN, graph_dotted_line,
graph_dotted_line, graph_dotted_line, graph_dotted_line);
} else {
printf(" %.*s %.9s %.9s %.9s", comm_width,
graph_dotted_line, graph_dotted_line, graph_dotted_line,
graph_dotted_line);
}
if (sched->show_state)
printf(" %.5s", graph_dotted_line);
@ -2134,6 +2182,9 @@ static void timehist_print_sample(struct perf_sched *sched,
printf(" %-*s ", comm_width, timehist_get_commstr(thread));
if (sched->show_prio)
printf(" %-*s ", MAX_PRIO_STR_LEN, timehist_get_priostr(evsel, thread, sample));
wait_time = tr->dt_sleep + tr->dt_iowait + tr->dt_preempt;
print_sched_time(wait_time, 6);
@ -2301,6 +2352,7 @@ static int init_idle_thread(struct thread *thread)
if (itr == NULL)
return -ENOMEM;
init_prio(&itr->tr);
init_stats(&itr->tr.run_stats);
callchain_init(&itr->callchain);
callchain_cursor_reset(&itr->cursor);
@ -2455,12 +2507,33 @@ static bool timehist_skip_sample(struct perf_sched *sched,
struct perf_sample *sample)
{
bool rc = false;
int prio = -1;
struct thread_runtime *tr = NULL;
if (thread__is_filtered(thread)) {
rc = true;
sched->skipped_samples++;
}
if (sched->prio_str) {
/*
* Because priority may be changed during task execution,
* first read priority from prev sched_in event for current task.
* If prev sched_in event is not saved, then read priority from
* current task sched_out event.
*/
tr = thread__get_runtime(thread);
if (tr && tr->prio != -1)
prio = tr->prio;
else if (evsel__name_is(evsel, "sched:sched_switch"))
prio = evsel__intval(evsel, sample, "prev_prio");
if (prio != -1 && !test_bit(prio, sched->prio_bitmap)) {
rc = true;
sched->skipped_samples++;
}
}
if (sched->idle_hist) {
if (!evsel__name_is(evsel, "sched:sched_switch"))
rc = true;
@ -2506,7 +2579,7 @@ static void timehist_print_wakeup_event(struct perf_sched *sched,
printf("\n");
}
static int timehist_sched_wakeup_ignore(struct perf_tool *tool __maybe_unused,
static int timehist_sched_wakeup_ignore(const struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
struct evsel *evsel __maybe_unused,
struct perf_sample *sample __maybe_unused,
@ -2515,7 +2588,7 @@ static int timehist_sched_wakeup_ignore(struct perf_tool *tool __maybe_unused,
return 0;
}
static int timehist_sched_wakeup_event(struct perf_tool *tool,
static int timehist_sched_wakeup_event(const struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct evsel *evsel,
struct perf_sample *sample,
@ -2599,7 +2672,7 @@ static void timehist_print_migration_event(struct perf_sched *sched,
printf("\n");
}
static int timehist_migrate_task_event(struct perf_tool *tool,
static int timehist_migrate_task_event(const struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct evsel *evsel,
struct perf_sample *sample,
@ -2627,7 +2700,31 @@ static int timehist_migrate_task_event(struct perf_tool *tool,
return 0;
}
static int timehist_sched_change_event(struct perf_tool *tool,
static void timehist_update_task_prio(struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine)
{
struct thread *thread;
struct thread_runtime *tr = NULL;
const u32 next_pid = evsel__intval(evsel, sample, "next_pid");
const u32 next_prio = evsel__intval(evsel, sample, "next_prio");
if (next_pid == 0)
thread = get_idle_thread(sample->cpu);
else
thread = machine__findnew_thread(machine, -1, next_pid);
if (thread == NULL)
return;
tr = thread__get_runtime(thread);
if (tr == NULL)
return;
tr->prio = next_prio;
}
static int timehist_sched_change_event(const struct perf_tool *tool,
union perf_event *event,
struct evsel *evsel,
struct perf_sample *sample,
@ -2650,6 +2747,9 @@ static int timehist_sched_change_event(struct perf_tool *tool,
goto out;
}
if (sched->show_prio || sched->prio_str)
timehist_update_task_prio(evsel, sample, machine);
thread = timehist_get_thread(sched, sample, machine, evsel);
if (thread == NULL) {
rc = -1;
@ -2683,9 +2783,12 @@ static int timehist_sched_change_event(struct perf_tool *tool,
* - previous sched event is out of window - we are done
* - sample time is beyond window user cares about - reset it
* to close out stats for time window interest
* - If tprev is 0, that is, sched_in event for current task is
* not recorded, cannot determine whether sched_in event is
* within time window interest - ignore it
*/
if (ptime->end) {
if (tprev > ptime->end)
if (!tprev || tprev > ptime->end)
goto out;
if (t > ptime->end)
@ -2700,8 +2803,6 @@ static int timehist_sched_change_event(struct perf_tool *tool,
struct idle_thread_runtime *itr = (void *)tr;
struct thread_runtime *last_tr;
BUG_ON(thread__tid(thread) != 0);
if (itr->last_thread == NULL)
goto out;
@ -2727,10 +2828,10 @@ static int timehist_sched_change_event(struct perf_tool *tool,
itr->last_thread = NULL;
}
}
if (!sched->summary_only)
timehist_print_sample(sched, evsel, sample, &al, thread, t, state);
if (!sched->summary_only)
timehist_print_sample(sched, evsel, sample, &al, thread, t, state);
}
out:
if (sched->hist_time.start == 0 && t >= ptime->start)
@ -2758,7 +2859,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
return rc;
}
static int timehist_sched_switch_event(struct perf_tool *tool,
static int timehist_sched_switch_event(const struct perf_tool *tool,
union perf_event *event,
struct evsel *evsel,
struct perf_sample *sample,
@ -2767,7 +2868,7 @@ static int timehist_sched_switch_event(struct perf_tool *tool,
return timehist_sched_change_event(tool, event, evsel, sample, machine);
}
static int process_lost(struct perf_tool *tool __maybe_unused,
static int process_lost(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine __maybe_unused)
@ -3010,13 +3111,13 @@ static void timehist_print_summary(struct perf_sched *sched,
printf(" (x %d)\n", sched->max_cpu.cpu);
}
typedef int (*sched_handler)(struct perf_tool *tool,
typedef int (*sched_handler)(const struct perf_tool *tool,
union perf_event *event,
struct evsel *evsel,
struct perf_sample *sample,
struct machine *machine);
static int perf_timehist__process_sample(struct perf_tool *tool,
static int perf_timehist__process_sample(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -3066,6 +3167,47 @@ static int timehist_check_attr(struct perf_sched *sched,
return 0;
}
static int timehist_parse_prio_str(struct perf_sched *sched)
{
char *p;
unsigned long start_prio, end_prio;
const char *str = sched->prio_str;
if (!str)
return 0;
while (isdigit(*str)) {
p = NULL;
start_prio = strtoul(str, &p, 0);
if (start_prio >= MAX_PRIO || (*p != '\0' && *p != ',' && *p != '-'))
return -1;
if (*p == '-') {
str = ++p;
p = NULL;
end_prio = strtoul(str, &p, 0);
if (end_prio >= MAX_PRIO || (*p != '\0' && *p != ','))
return -1;
if (end_prio < start_prio)
return -1;
} else {
end_prio = start_prio;
}
for (; start_prio <= end_prio; start_prio++)
__set_bit(start_prio, sched->prio_bitmap);
if (*p)
++p;
str = p;
}
return 0;
}
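
As a rough illustration of the list syntax accepted by `--prio` (single values and `start-end` ranges separated by commas), here is a standalone sketch. It uses a plain `bool` array in place of the kernel-style bitmap, and it is deliberately stricter than the patch: it parses decimal only and rejects leftover non-digit input, whereas the function above passes base 0 to `strtoul()` and returns 0 as soon as it meets a non-digit. All `sketch_*` and `SKETCH_*` names are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_PRIO 140	/* mirrors MAX_PRIO in the patch */

/*
 * Parse a list such as "0-99,120" into allowed[], one flag per priority.
 * Returns 0 on success, -1 on malformed input.
 */
static int sketch_parse_prio_list(const char *str, bool allowed[SKETCH_MAX_PRIO])
{
	while (isdigit((unsigned char)*str)) {
		char *end;
		unsigned long first = strtoul(str, &end, 10);
		unsigned long last = first;

		if (first >= SKETCH_MAX_PRIO)
			return -1;
		if (*end == '-') {
			/* Range form: the upper bound must be a number too. */
			str = end + 1;
			if (!isdigit((unsigned char)*str))
				return -1;
			last = strtoul(str, &end, 10);
			if (last >= SKETCH_MAX_PRIO || last < first)
				return -1;
		}
		if (*end != '\0' && *end != ',')
			return -1;
		for (unsigned long p = first; p <= last; p++)
			allowed[p] = true;
		str = (*end == ',') ? end + 1 : end;
	}
	/* Anything left over (for example a leading letter) is an error. */
	return *str == '\0' ? 0 : -1;
}
```

A filter in the style of `timehist_skip_sample()` would then skip a sample whenever its priority is known and `allowed[prio]` is false.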
static int perf_sched__timehist(struct perf_sched *sched)
{
struct evsel_str_handler handlers[] = {
@ -3100,7 +3242,6 @@ static int perf_sched__timehist(struct perf_sched *sched)
sched->tool.tracing_data = perf_event__process_tracing_data;
sched->tool.build_id = perf_event__process_build_id;
sched->tool.ordered_events = true;
sched->tool.ordering_requires_timestamps = true;
symbol_conf.use_callchain = sched->show_callchain;
@ -3121,12 +3262,18 @@ static int perf_sched__timehist(struct perf_sched *sched)
if (perf_time__parse_str(&sched->ptime, sched->time_str) != 0) {
pr_err("Invalid time string\n");
return -EINVAL;
err = -EINVAL;
goto out;
}
if (timehist_check_attr(sched, evlist) != 0)
goto out;
if (timehist_parse_prio_str(sched) != 0) {
pr_err("Invalid prio string\n");
goto out;
}
setup_pager();
/* prefer sched_waking if it is captured */
@ -3605,14 +3752,6 @@ int cmd_sched(int argc, const char **argv)
{
static const char default_sort_order[] = "avg, max, switch, runtime";
struct perf_sched sched = {
.tool = {
.sample = perf_sched__process_tracepoint_sample,
.comm = perf_sched__process_comm,
.namespaces = perf_event__process_namespaces,
.lost = perf_event__process_lost,
.fork = perf_sched__process_fork_event,
.ordered_events = true,
},
.cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
.sort_list = LIST_HEAD_INIT(sched.sort_list),
.sort_order = default_sort_order,
@ -3691,6 +3830,9 @@ int cmd_sched(int argc, const char **argv)
OPT_STRING('t', "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
"analyze events only for given thread id(s)"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
OPT_BOOLEAN(0, "show-prio", &sched.show_prio, "Show task priority"),
OPT_STRING(0, "prio", &sched.prio_str, "prio",
"analyze events only for given task priority(ies)"),
OPT_PARENT(sched_options)
};
@ -3733,6 +3875,13 @@ int cmd_sched(int argc, const char **argv)
};
int ret;
perf_tool__init(&sched.tool, /*ordered_events=*/true);
sched.tool.sample = perf_sched__process_tracepoint_sample;
sched.tool.comm = perf_sched__process_comm;
sched.tool.namespaces = perf_event__process_namespaces;
sched.tool.lost = perf_event__process_lost;
sched.tool.fork = perf_sched__process_fork_event;
argc = parse_options_subcommand(argc, argv, sched_options, sched_subcommands,
sched_usage, PARSE_OPT_STOP_AT_NON_OPTION);
if (!argc)
@ -3805,5 +3954,8 @@ int cmd_sched(int argc, const char **argv)
usage_with_options(sched_usage, sched_options);
}
/* free usage string allocated by parse_options_subcommand */
free((void *)sched_usage[0]);
return 0;
}

@ -62,6 +62,7 @@
#include "util/record.h"
#include "util/util.h"
#include "util/cgroup.h"
#include "util/annotate.h"
#include "perf.h"
#include <linux/ctype.h>
@ -138,6 +139,7 @@ enum perf_output_field {
PERF_OUTPUT_DSOFF = 1ULL << 41,
PERF_OUTPUT_DISASM = 1ULL << 42,
PERF_OUTPUT_BRSTACKDISASM = 1ULL << 43,
PERF_OUTPUT_BRCNTR = 1ULL << 44,
};
struct perf_script {
@ -213,6 +215,7 @@ struct output_option {
{.str = "cgroup", .field = PERF_OUTPUT_CGROUP},
{.str = "retire_lat", .field = PERF_OUTPUT_RETIRE_LAT},
{.str = "brstackdisasm", .field = PERF_OUTPUT_BRSTACKDISASM},
{.str = "brcntr", .field = PERF_OUTPUT_BRCNTR},
};
enum {
@ -520,6 +523,12 @@ static int evsel__check_attr(struct evsel *evsel, struct perf_session *session)
"Hint: run 'perf record -b ...'\n");
return -EINVAL;
}
if (PRINT_FIELD(BRCNTR) &&
!(evlist__combined_branch_type(session->evlist) & PERF_SAMPLE_BRANCH_COUNTERS)) {
pr_err("Display of branch counter requested but it's not enabled\n"
"Hint: run 'perf record -j any,counter ...'\n");
return -EINVAL;
}
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID", PERF_OUTPUT_TID|PERF_OUTPUT_PID))
return -EINVAL;
@ -789,6 +798,19 @@ static int perf_sample__fprintf_start(struct perf_script *script,
int printed = 0;
char tstr[128];
/*
* Print the branch counter's abbreviation list,
* if the branch counter is available.
*/
if (PRINT_FIELD(BRCNTR) && !verbose) {
char *buf;
if (!annotation_br_cntr_abbr_list(&buf, evsel, true)) {
printed += fprintf(stdout, "%s", buf);
free(buf);
}
}
if (PRINT_FIELD(MACHINE_PID) && sample->machine_pid)
printed += fprintf(fp, "VM:%5d ", sample->machine_pid);
@ -1195,7 +1217,9 @@ static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
struct perf_insn *x, u8 *inbuf, int len,
int insn, FILE *fp, int *total_cycles,
struct perf_event_attr *attr,
struct thread *thread)
struct thread *thread,
struct evsel *evsel,
u64 br_cntr)
{
int ilen = 0;
int printed = fprintf(fp, "\t%016" PRIx64 "\t", ip);
@ -1216,6 +1240,29 @@ static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
addr_location__exit(&al);
}
if (PRINT_FIELD(BRCNTR)) {
struct evsel *pos = evsel__leader(evsel);
unsigned int i = 0, j, num, mask, width;
perf_env__find_br_cntr_info(evsel__env(evsel), NULL, &width);
mask = (1L << width) - 1;
printed += fprintf(fp, "br_cntr: ");
evlist__for_each_entry_from(evsel->evlist, pos) {
if (!(pos->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS))
continue;
if (evsel__leader(pos) != evsel__leader(evsel))
break;
num = (br_cntr >> (i++ * width)) & mask;
if (!verbose) {
for (j = 0; j < num; j++)
printed += fprintf(fp, "%s", pos->abbr_name);
} else
printed += fprintf(fp, "%s %d ", pos->name, num);
}
printed += fprintf(fp, "\t");
}
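
The block above unpacks one 64-bit word per branch (`sample->branch_stack_cntr[i]`) into per-event counts, each `width` bits wide, using a shift and a mask. A minimal standalone sketch of that unpacking step follows. The helper name and its bounds checks are assumptions added for illustration, and it builds the mask from a 64-bit constant rather than `1L`.

```c
#include <stdint.h>

/*
 * Extract the idx-th counter from a packed word in which every counter
 * occupies `width` bits, with index 0 in the lowest bits.
 * Returns 0 for widths or indexes that do not fit the word.
 */
static unsigned int sketch_br_cntr_field(uint64_t packed, unsigned int idx,
					 unsigned int width)
{
	uint64_t mask;

	if (width == 0 || width > 32 || (uint64_t)idx * width >= 64)
		return 0;
	mask = (UINT64_C(1) << width) - 1;
	return (unsigned int)((packed >> (idx * width)) & mask);
}
```

In non-verbose mode the patch then prints the event's abbreviation once per occurrence, so a field value of 2 for the event abbreviated "A" shows up as "AA".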
printed += fprintf(fp, "#%s%s%s%s",
en->flags.predicted ? " PRED" : "",
en->flags.mispred ? " MISPRED" : "",
@ -1272,6 +1319,7 @@ static int ip__fprintf_sym(uint64_t addr, struct thread *thread,
}
static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
struct evsel *evsel,
struct thread *thread,
struct perf_event_attr *attr,
struct machine *machine, FILE *fp)
@ -1285,6 +1333,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
unsigned off;
struct symbol *lastsym = NULL;
int total_cycles = 0;
u64 br_cntr = 0;
if (!(br && br->nr))
return 0;
@ -1296,6 +1345,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
x.machine = machine;
x.cpu = sample->cpu;
if (PRINT_FIELD(BRCNTR) && sample->branch_stack_cntr)
br_cntr = sample->branch_stack_cntr[nr - 1];
printed += fprintf(fp, "%c", '\n');
/* Handle first from jump, of which we don't know the entry. */
@ -1307,7 +1359,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
x.cpumode, x.cpu, &lastsym, attr, fp);
printed += ip__fprintf_jump(entries[nr - 1].from, &entries[nr - 1],
&x, buffer, len, 0, fp, &total_cycles,
attr, thread);
attr, thread, evsel, br_cntr);
if (PRINT_FIELD(SRCCODE))
printed += print_srccode(thread, x.cpumode, entries[nr - 1].from);
}
@ -1337,8 +1389,10 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp);
if (ip == end) {
if (PRINT_FIELD(BRCNTR) && sample->branch_stack_cntr)
br_cntr = sample->branch_stack_cntr[i];
printed += ip__fprintf_jump(ip, &entries[i], &x, buffer + off, len - off, ++insn, fp,
&total_cycles, attr, thread);
&total_cycles, attr, thread, evsel, br_cntr);
if (PRINT_FIELD(SRCCODE))
printed += print_srccode(thread, x.cpumode, ip);
break;
@ -1375,7 +1429,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
* Due to pipeline delays the LBRs might be missing a branch
* or two, which can result in very large or negative blocks
* between final branch and sample. When this happens just
* continue walking after the last TO until we hit a branch.
* continue walking after the last TO.
*/
start = entries[0].to;
end = sample->ip;
@ -1410,7 +1464,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
printed += fprintf(fp, "\n");
if (ilen == 0)
break;
if (arch_is_branch(buffer + off, len - off, x.is64bit) && start + off != sample->ip) {
if ((attr->branch_sample_type == 0 || attr->branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
&& arch_is_uncond_branch(buffer + off, len - off, x.is64bit)
&& start + off != sample->ip) {
/*
* Hit a missing branch. Just stop.
*/
@ -1547,6 +1603,7 @@ void script_fetch_insn(struct perf_sample *sample, struct thread *thread,
}
static int perf_sample__fprintf_insn(struct perf_sample *sample,
struct evsel *evsel,
struct perf_event_attr *attr,
struct thread *thread,
struct machine *machine, FILE *fp,
@ -1567,7 +1624,7 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
printed += sample__fprintf_insn_asm(sample, thread, machine, fp, al);
}
if (PRINT_FIELD(BRSTACKINSN) || PRINT_FIELD(BRSTACKINSNLEN) || PRINT_FIELD(BRSTACKDISASM))
printed += perf_sample__fprintf_brstackinsn(sample, thread, attr, machine, fp);
printed += perf_sample__fprintf_brstackinsn(sample, evsel, thread, attr, machine, fp);
return printed;
}
@ -1639,7 +1696,7 @@ static int perf_sample__fprintf_bts(struct perf_sample *sample,
if (print_srcline_last)
printed += map__fprintf_srcline(al->map, al->addr, "\n ", fp);
printed += perf_sample__fprintf_insn(sample, attr, thread, machine, fp, al);
printed += perf_sample__fprintf_insn(sample, evsel, attr, thread, machine, fp, al);
printed += fprintf(fp, "\n");
if (PRINT_FIELD(SRCCODE)) {
int ret = map__fprintf_srccode(al->map, al->addr, stdout,
@ -2297,7 +2354,7 @@ static void process_event(struct perf_script *script,
if (evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
perf_sample__fprintf_bpf_output(sample, fp);
perf_sample__fprintf_insn(sample, attr, thread, machine, fp, al);
perf_sample__fprintf_insn(sample, evsel, attr, thread, machine, fp, al);
if (PRINT_FIELD(PHYS_ADDR))
fprintf(fp, "%16" PRIx64, sample->phys_addr);
@ -2399,7 +2456,7 @@ static bool filter_cpu(struct perf_sample *sample)
return false;
}
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -2486,7 +2543,7 @@ static int process_sample_event(struct perf_tool *tool,
// Used when scr->per_event_dump is not set
static struct evsel_script es_stdout;
static int process_attr(struct perf_tool *tool, union perf_event *event,
static int process_attr(const struct perf_tool *tool, union perf_event *event,
struct evlist **pevlist)
{
struct perf_script *scr = container_of(tool, struct perf_script, tool);
@ -2552,7 +2609,7 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
return 0;
}
static int print_event_with_time(struct perf_tool *tool,
static int print_event_with_time(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine,
@ -2588,14 +2645,14 @@ static int print_event_with_time(struct perf_tool *tool,
return 0;
}
static int print_event(struct perf_tool *tool, union perf_event *event,
static int print_event(const struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine,
pid_t pid, pid_t tid)
{
return print_event_with_time(tool, event, sample, machine, pid, tid, 0);
}
static int process_comm_event(struct perf_tool *tool,
static int process_comm_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2607,7 +2664,7 @@ static int process_comm_event(struct perf_tool *tool,
event->comm.tid);
}
static int process_namespaces_event(struct perf_tool *tool,
static int process_namespaces_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2619,7 +2676,7 @@ static int process_namespaces_event(struct perf_tool *tool,
event->namespaces.tid);
}
static int process_cgroup_event(struct perf_tool *tool,
static int process_cgroup_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2631,7 +2688,7 @@ static int process_cgroup_event(struct perf_tool *tool,
sample->tid);
}
static int process_fork_event(struct perf_tool *tool,
static int process_fork_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2643,7 +2700,7 @@ static int process_fork_event(struct perf_tool *tool,
event->fork.pid, event->fork.tid,
event->fork.time);
}
static int process_exit_event(struct perf_tool *tool,
static int process_exit_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2656,7 +2713,7 @@ static int process_exit_event(struct perf_tool *tool,
return perf_event__process_exit(tool, event, sample, machine);
}
static int process_mmap_event(struct perf_tool *tool,
static int process_mmap_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2668,7 +2725,7 @@ static int process_mmap_event(struct perf_tool *tool,
event->mmap.tid);
}
static int process_mmap2_event(struct perf_tool *tool,
static int process_mmap2_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2680,7 +2737,7 @@ static int process_mmap2_event(struct perf_tool *tool,
event->mmap2.tid);
}
static int process_switch_event(struct perf_tool *tool,
static int process_switch_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2712,7 +2769,7 @@ static int process_auxtrace_error(struct perf_session *session,
}
static int
process_lost_event(struct perf_tool *tool,
process_lost_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2722,7 +2779,7 @@ process_lost_event(struct perf_tool *tool,
}
static int
process_throttle_event(struct perf_tool *tool __maybe_unused,
process_throttle_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2733,7 +2790,7 @@ process_throttle_event(struct perf_tool *tool __maybe_unused,
}
static int
process_finished_round_event(struct perf_tool *tool __maybe_unused,
process_finished_round_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct ordered_events *oe __maybe_unused)
@ -2743,7 +2800,7 @@ process_finished_round_event(struct perf_tool *tool __maybe_unused,
}
static int
process_bpf_events(struct perf_tool *tool __maybe_unused,
process_bpf_events(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -2755,7 +2812,7 @@ process_bpf_events(struct perf_tool *tool __maybe_unused,
sample->tid);
}
static int process_text_poke_events(struct perf_tool *tool,
static int process_text_poke_events(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -3757,7 +3814,7 @@ static
int process_thread_map_event(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
const struct perf_tool *tool = session->tool;
struct perf_script *script = container_of(tool, struct perf_script, tool);
if (dump_trace)
@ -3779,7 +3836,7 @@ static
int process_cpu_map_event(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
const struct perf_tool *tool = session->tool;
struct perf_script *script = container_of(tool, struct perf_script, tool);
if (dump_trace)
@ -3809,11 +3866,10 @@ static int process_feature_event(struct perf_session *session,
static int perf_script__process_auxtrace_info(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
int ret = perf_event__process_auxtrace_info(session, event);
if (ret == 0) {
const struct perf_tool *tool = session->tool;
struct perf_script *script = container_of(tool, struct perf_script, tool);
ret = perf_script__setup_per_event_dump(script);
@ -3900,38 +3956,7 @@ int cmd_script(int argc, const char **argv)
const char *dlfilter_file = NULL;
const char **__argv;
int i, j, err = 0;
struct perf_script script = {
.tool = {
.sample = process_sample_event,
.mmap = perf_event__process_mmap,
.mmap2 = perf_event__process_mmap2,
.comm = perf_event__process_comm,
.namespaces = perf_event__process_namespaces,
.cgroup = perf_event__process_cgroup,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.attr = process_attr,
.event_update = perf_event__process_event_update,
#ifdef HAVE_LIBTRACEEVENT
.tracing_data = perf_event__process_tracing_data,
#endif
.feature = process_feature_event,
.build_id = perf_event__process_build_id,
.id_index = perf_event__process_id_index,
.auxtrace_info = perf_script__process_auxtrace_info,
.auxtrace = perf_event__process_auxtrace,
.auxtrace_error = perf_event__process_auxtrace_error,
.stat = perf_event__process_stat_event,
.stat_round = process_stat_round_event,
.stat_config = process_stat_config_event,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.throttle = process_throttle_event,
.unthrottle = process_throttle_event,
.ordered_events = true,
.ordering_requires_timestamps = true,
},
};
struct perf_script script = {};
struct perf_data data = {
.mode = PERF_DATA_MODE_READ,
};
@ -3979,7 +4004,8 @@ int cmd_script(int argc, const char **argv)
"brstacksym,flags,data_src,weight,bpf-output,brstackinsn,"
"brstackinsnlen,brstackdisasm,brstackoff,callindent,insn,disasm,insnlen,synth,"
"phys_addr,metric,misc,srccode,ipc,tod,data_page_size,"
"code_page_size,ins_lat,machine_pid,vcpu,cgroup,retire_lat",
"code_page_size,ins_lat,machine_pid,vcpu,cgroup,retire_lat,"
"brcntr",
parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
@ -4052,6 +4078,8 @@ int cmd_script(int argc, const char **argv)
"Enable symbol demangling"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_STRING(0, "addr2line", &symbol_conf.addr2line_path, "path",
"addr2line binary to use for line numbers"),
OPT_STRING(0, "time", &script.time_str, "str",
"Time span of interest (start,stop)"),
OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
@ -4103,10 +4131,8 @@ int cmd_script(int argc, const char **argv)
data.path = input_name;
data.force = symbol_conf.force;
if (unsorted_dump) {
if (unsorted_dump)
dump_trace = true;
script.tool.ordered_events = false;
}
if (symbol__validate_sym_arguments())
return -1;
@ -4297,6 +4323,34 @@ int cmd_script(int argc, const char **argv)
use_browser = 0;
}
perf_tool__init(&script.tool, !unsorted_dump);
script.tool.sample = process_sample_event;
script.tool.mmap = perf_event__process_mmap;
script.tool.mmap2 = perf_event__process_mmap2;
script.tool.comm = perf_event__process_comm;
script.tool.namespaces = perf_event__process_namespaces;
script.tool.cgroup = perf_event__process_cgroup;
script.tool.exit = perf_event__process_exit;
script.tool.fork = perf_event__process_fork;
script.tool.attr = process_attr;
script.tool.event_update = perf_event__process_event_update;
#ifdef HAVE_LIBTRACEEVENT
script.tool.tracing_data = perf_event__process_tracing_data;
#endif
script.tool.feature = process_feature_event;
script.tool.build_id = perf_event__process_build_id;
script.tool.id_index = perf_event__process_id_index;
script.tool.auxtrace_info = perf_script__process_auxtrace_info;
script.tool.auxtrace = perf_event__process_auxtrace;
script.tool.auxtrace_error = perf_event__process_auxtrace_error;
script.tool.stat = perf_event__process_stat_event;
script.tool.stat_round = process_stat_round_event;
script.tool.stat_config = process_stat_config_event;
script.tool.thread_map = process_thread_map_event;
script.tool.cpu_map = process_cpu_map_event;
script.tool.throttle = process_throttle_event;
script.tool.unthrottle = process_throttle_event;
script.tool.ordering_requires_timestamps = true;
session = perf_session__new(&data, &script.tool);
if (IS_ERR(session))
return PTR_ERR(session);
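
The hunks above replace a static designated initializer for the tool callbacks with a call to perf_tool__init() followed by per-field assignments. A minimal sketch of that pattern follows. Every `demo_*` name is hypothetical and is not perf code, and the sketch assumes the init helper installs a do-nothing default for every callback, which this diff does not show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct perf_tool. */
struct demo_tool {
	int (*sample)(const struct demo_tool *tool, int value);
	int (*comm)(const struct demo_tool *tool, int value);
	bool ordered_events;
};

/* Default handler: ignore the event and report success. */
static int demo_stub(const struct demo_tool *tool, int value)
{
	(void)tool;
	(void)value;
	return 0;
}

/* Assumed analogue of perf_tool__init(): every callback gets a safe default. */
static void demo_tool__init(struct demo_tool *tool, bool ordered_events)
{
	tool->sample = demo_stub;
	tool->comm = demo_stub;
	tool->ordered_events = ordered_events;
}

/* A command-specific handler, standing in for process_sample_event(). */
static int demo_process_sample(const struct demo_tool *tool, int value)
{
	(void)tool;
	return value * 2;
}

/* Build a tool the way the converted commands do: init first, then override. */
static struct demo_tool demo_tool__new_for_script(bool unsorted_dump)
{
	struct demo_tool tool;

	demo_tool__init(&tool, !unsorted_dump);
	tool.sample = demo_process_sample;
	return tool;
}
```

One consequence of this layout is that the ordering decision (`!unsorted_dump` in the script hunk) is passed once to the init call instead of being patched into the struct afterwards.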

@ -70,6 +70,7 @@
#include "util/bpf_counter.h"
#include "util/iostat.h"
#include "util/util.h"
#include "util/intel-tpebs.h"
#include "asm/bug.h"
#include <linux/time64.h>
@ -248,7 +249,7 @@ static void perf_stat__reset_stats(void)
perf_stat__reset_shadow_stats();
}
static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
static int process_synthesized_event(const struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -293,14 +294,14 @@ static int read_single_counter(struct evsel *counter, int cpu_map_idx, int threa
* terminates. Use the wait4 values in that case.
*/
if (err && cpu_map_idx == 0 &&
(counter->tool_event == PERF_TOOL_USER_TIME ||
counter->tool_event == PERF_TOOL_SYSTEM_TIME)) {
(evsel__tool_event(counter) == PERF_TOOL_USER_TIME ||
evsel__tool_event(counter) == PERF_TOOL_SYSTEM_TIME)) {
u64 val, *start_time;
struct perf_counts_values *count =
perf_counts(counter->counts, cpu_map_idx, thread);
start_time = xyarray__entry(counter->start_times, cpu_map_idx, thread);
if (counter->tool_event == PERF_TOOL_USER_TIME)
if (evsel__tool_event(counter) == PERF_TOOL_USER_TIME)
val = ru_stats.ru_utime_usec_stat.mean;
else
val = ru_stats.ru_stime_usec_stat.mean;
@ -683,6 +684,9 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
if (child_pid != -1)
kill(child_pid, SIGTERM);
tpebs_delete();
return COUNTER_FATAL;
}
@ -833,7 +837,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
return -1;
}
if (evlist__apply_filters(evsel_list, &counter)) {
if (evlist__apply_filters(evsel_list, &counter, &target)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
counter->filter, evsel__name(counter), errno,
str_error_r(errno, msg, sizeof(msg)));
@ -2180,7 +2184,7 @@ static
int process_stat_config_event(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
const struct perf_tool *tool = session->tool;
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
perf_event__read_stat_config(&stat_config, &event->stat_config);
@ -2229,7 +2233,7 @@ static
int process_thread_map_event(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
const struct perf_tool *tool = session->tool;
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
if (st->threads) {
@ -2248,7 +2252,7 @@ static
int process_cpu_map_event(struct perf_session *session,
union perf_event *event)
{
struct perf_tool *tool = session->tool;
const struct perf_tool *tool = session->tool;
struct perf_stat *st = container_of(tool, struct perf_stat, tool);
struct perf_cpu_map *cpus;
@ -2271,15 +2275,6 @@ static const char * const stat_report_usage[] = {
};
static struct perf_stat perf_stat = {
.tool = {
.attr = perf_event__process_attr,
.event_update = perf_event__process_event_update,
.thread_map = process_thread_map_event,
.cpu_map = process_cpu_map_event,
.stat_config = process_stat_config_event,
.stat = perf_event__process_stat_event,
.stat_round = process_stat_round_event,
},
.aggr_mode = AGGR_UNSET,
.aggr_level = 0,
};
@ -2322,6 +2317,15 @@ static int __cmd_report(int argc, const char **argv)
perf_stat.data.path = input_name;
perf_stat.data.mode = PERF_DATA_MODE_READ;
perf_tool__init(&perf_stat.tool, /*ordered_events=*/false);
perf_stat.tool.attr = perf_event__process_attr;
perf_stat.tool.event_update = perf_event__process_event_update;
perf_stat.tool.thread_map = process_thread_map_event;
perf_stat.tool.cpu_map = process_cpu_map_event;
perf_stat.tool.stat_config = process_stat_config_event;
perf_stat.tool.stat = perf_event__process_stat_event;
perf_stat.tool.stat_round = process_stat_round_event;
session = perf_session__new(&perf_stat.data, &perf_stat.tool);
if (IS_ERR(session))
return PTR_ERR(session);
@ -2471,6 +2475,10 @@ int cmd_stat(int argc, const char **argv)
"disable adding events for the metric threshold calculation"),
OPT_BOOLEAN(0, "topdown", &topdown_run,
"measure top-down statistics"),
#ifdef HAVE_ARCH_X86_64_SUPPORT
OPT_BOOLEAN(0, "record-tpebs", &tpebs_recording,
"enable recording for tpebs when retire_latency required"),
#endif
OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
"Set the metrics level for the top-down statistics (0: max level)"),
OPT_BOOLEAN(0, "smi-cost", &smi_cost,

@ -320,7 +320,7 @@ static int *cpus_cstate_state;
static u64 *cpus_pstate_start_times;
static u64 *cpus_pstate_state;
static int process_comm_event(struct perf_tool *tool,
static int process_comm_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -330,7 +330,7 @@ static int process_comm_event(struct perf_tool *tool,
return 0;
}
static int process_fork_event(struct perf_tool *tool,
static int process_fork_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -340,7 +340,7 @@ static int process_fork_event(struct perf_tool *tool,
return 0;
}
static int process_exit_event(struct perf_tool *tool,
static int process_exit_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
@ -571,7 +571,7 @@ typedef int (*tracepoint_handler)(struct timechart *tchart,
struct perf_sample *sample,
const char *backtrace);
static int process_sample_event(struct perf_tool *tool,
static int process_sample_event(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -1606,10 +1606,16 @@ static int __cmd_timechart(struct timechart *tchart, const char *output_name)
.mode = PERF_DATA_MODE_READ,
.force = tchart->force,
};
struct perf_session *session = perf_session__new(&data, &tchart->tool);
struct perf_session *session;
int ret = -EINVAL;
perf_tool__init(&tchart->tool, /*ordered_events=*/true);
tchart->tool.comm = process_comm_event;
tchart->tool.fork = process_fork_event;
tchart->tool.exit = process_exit_event;
tchart->tool.sample = process_sample_event;
session = perf_session__new(&data, &tchart->tool);
if (IS_ERR(session))
return PTR_ERR(session);
@ -1924,13 +1930,6 @@ parse_time(const struct option *opt, const char *arg, int __maybe_unused unset)
int cmd_timechart(int argc, const char **argv)
{
struct timechart tchart = {
.tool = {
.comm = process_comm_event,
.fork = process_fork_event,
.exit = process_exit_event,
.sample = process_sample_event,
.ordered_events = true,
},
.proc_num = 15,
.min_time = NSEC_PER_MSEC,
.merge_dist = 1000,

@ -191,7 +191,7 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
if (use_browser <= 0)
sleep(5);
map__set_erange_warned(map, true);
map__set_erange_warned(map);
}
static void perf_top__record_precise_ip(struct perf_top *top,
@ -735,12 +735,12 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
perf_top__record_precise_ip(top, iter->he, iter->sample, evsel, al->addr);
hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
!(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY),
NULL);
!(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY),
NULL, evsel);
return 0;
}
static void perf_event__process_sample(struct perf_tool *tool,
static void perf_event__process_sample(const struct perf_tool *tool,
const union perf_event *event,
struct evsel *evsel,
struct perf_sample *sample,
@ -1055,7 +1055,7 @@ static int perf_top__start_counters(struct perf_top *top)
}
}
if (evlist__apply_filters(evlist, &counter)) {
if (evlist__apply_filters(evlist, &counter, &opts->target)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
counter->filter ?: "BPF", evsel__name(counter), errno,
str_error_r(errno, msg, sizeof(msg)));

@ -19,6 +19,7 @@
#ifdef HAVE_LIBBPF_SUPPORT
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include <bpf/btf.h>
#ifdef HAVE_BPF_SKEL
#include "bpf_skel/augmented_raw_syscalls.skel.h"
#endif
@ -64,6 +65,7 @@
#include "syscalltbl.h"
#include "rb_resort.h"
#include "../perf.h"
#include "trace_augment.h"
#include <errno.h>
#include <inttypes.h>
@ -101,6 +103,12 @@
/*
* strtoul: Go from a string to a value, i.e. for msr: MSR_FS_BASE to 0xc0000100
*
* We have to explicitly mark the direction of the data flow, i.e. whether it
* goes from the kernel to user space or the other way around. The BPF collector
* we have so far copies only from user to kernel space, so mark the arguments
* that go in that direction, so that we don't end up collecting the previous
* contents for syscall args that go from kernel to user space.
*/
struct syscall_arg_fmt {
size_t (*scnprintf)(char *bf, size_t size, struct syscall_arg *arg);
@ -109,7 +117,12 @@ struct syscall_arg_fmt {
void *parm;
const char *name;
u16 nr_entries; // for arrays
bool from_user;
bool show_zero;
#ifdef HAVE_LIBBPF_SUPPORT
const struct btf_type *type;
int type_id; /* used in btf_dump */
#endif
};
struct syscall_fmt {
@ -139,6 +152,9 @@ struct trace {
} syscalls;
#ifdef HAVE_BPF_SKEL
struct augmented_raw_syscalls_bpf *skel;
#endif
#ifdef HAVE_LIBBPF_SUPPORT
struct btf *btf;
#endif
struct record_opts opts;
struct evlist *evlist;
@ -196,6 +212,7 @@ struct trace {
bool show_string_prefix;
bool force;
bool vfs_getname;
bool force_btf;
int trace_pgfaults;
char *perfconfig_events;
struct {
@ -204,6 +221,20 @@ struct trace {
} oe;
};
static void trace__load_vmlinux_btf(struct trace *trace __maybe_unused)
{
#ifdef HAVE_LIBBPF_SUPPORT
if (trace->btf != NULL)
return;
trace->btf = btf__load_vmlinux_btf();
if (verbose > 0) {
fprintf(trace->output, trace->btf ? "vmlinux BTF loaded\n" :
"Failed to load vmlinux BTF\n");
}
#endif
}
struct tp_field {
int offset;
union {
@ -830,6 +861,15 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
#define SCA_FILENAME syscall_arg__scnprintf_filename
// 'argname' is just documentational at this point, to remove the previous comment with that info
#define SCA_FILENAME_FROM_USER(argname) \
{ .scnprintf = SCA_FILENAME, \
.from_user = true, }
static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg);
#define SCA_BUF syscall_arg__scnprintf_buf
static size_t syscall_arg__scnprintf_pipe_flags(char *bf, size_t size,
struct syscall_arg *arg)
{
@ -887,6 +927,177 @@ static size_t syscall_arg__scnprintf_getrandom_flags(char *bf, size_t size,
#define SCA_GETRANDOM_FLAGS syscall_arg__scnprintf_getrandom_flags
#ifdef HAVE_LIBBPF_SUPPORT
static void syscall_arg_fmt__cache_btf_enum(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
{
int id;
type = strstr(type, "enum ");
if (type == NULL)
return;
type += 5; // skip "enum " to get the enumeration name
id = btf__find_by_name(btf, type);
if (id < 0)
return;
arg_fmt->type = btf__type_by_id(btf, id);
}
static bool syscall_arg__strtoul_btf_enum(char *bf, size_t size, struct syscall_arg *arg, u64 *val)
{
const struct btf_type *bt = arg->fmt->type;
struct btf *btf = arg->trace->btf;
struct btf_enum *be = btf_enum(bt);
for (int i = 0; i < btf_vlen(bt); ++i, ++be) {
const char *name = btf__name_by_offset(btf, be->name_off);
int max_len = max(size, strlen(name));
if (strncmp(name, bf, max_len) == 0) {
*val = be->val;
return true;
}
}
return false;
}
static bool syscall_arg__strtoul_btf_type(char *bf, size_t size, struct syscall_arg *arg, u64 *val)
{
const struct btf_type *bt;
char *type = arg->type_name;
struct btf *btf;
trace__load_vmlinux_btf(arg->trace);
btf = arg->trace->btf;
if (btf == NULL)
return false;
if (arg->fmt->type == NULL) {
// See if this is an enum
syscall_arg_fmt__cache_btf_enum(arg->fmt, btf, type);
}
// Now let's see if we have a BTF type resolved
bt = arg->fmt->type;
if (bt == NULL)
return false;
// If it is an enum:
if (btf_is_enum(arg->fmt->type))
return syscall_arg__strtoul_btf_enum(bf, size, arg, val);
return false;
}
static size_t btf_enum_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, int val)
{
struct btf_enum *be = btf_enum(type);
const int nr_entries = btf_vlen(type);
for (int i = 0; i < nr_entries; ++i, ++be) {
if (be->val == val) {
return scnprintf(bf, size, "%s",
btf__name_by_offset(btf, be->name_off));
}
}
return 0;
}
struct trace_btf_dump_snprintf_ctx {
char *bf;
size_t printed, size;
};
static void trace__btf_dump_snprintf(void *vctx, const char *fmt, va_list args)
{
struct trace_btf_dump_snprintf_ctx *ctx = vctx;
ctx->printed += vscnprintf(ctx->bf + ctx->printed, ctx->size - ctx->printed, fmt, args);
}
static size_t btf_struct_scnprintf(const struct btf_type *type, struct btf *btf, char *bf, size_t size, struct syscall_arg *arg)
{
struct trace_btf_dump_snprintf_ctx ctx = {
.bf = bf,
.size = size,
};
struct augmented_arg *augmented_arg = arg->augmented.args;
int type_id = arg->fmt->type_id, consumed;
struct btf_dump *btf_dump;
LIBBPF_OPTS(btf_dump_opts, dump_opts);
LIBBPF_OPTS(btf_dump_type_data_opts, dump_data_opts);
if (arg == NULL || arg->augmented.args == NULL)
return 0;
dump_data_opts.compact = true;
dump_data_opts.skip_names = !arg->trace->show_arg_names;
btf_dump = btf_dump__new(btf, trace__btf_dump_snprintf, &ctx, &dump_opts);
if (btf_dump == NULL)
return 0;
/* pretty print the struct data here */
if (btf_dump__dump_type_data(btf_dump, type_id, arg->augmented.args->value, type->size, &dump_data_opts) == 0) {
/* nothing was dumped: release the dumper before bailing out */
btf_dump__free(btf_dump);
return 0;
}
consumed = sizeof(*augmented_arg) + augmented_arg->size;
arg->augmented.args = ((void *)arg->augmented.args) + consumed;
arg->augmented.size -= consumed;
btf_dump__free(btf_dump);
return ctx.printed;
}
static size_t trace__btf_scnprintf(struct trace *trace, struct syscall_arg *arg, char *bf,
size_t size, int val, char *type)
{
struct syscall_arg_fmt *arg_fmt = arg->fmt;
if (trace->btf == NULL)
return 0;
if (arg_fmt->type == NULL) {
// Check if this is an enum and if we have the BTF type for it.
syscall_arg_fmt__cache_btf_enum(arg_fmt, trace->btf, type);
}
// Did we manage to find a BTF type for the syscall/tracepoint argument?
if (arg_fmt->type == NULL)
return 0;
if (btf_is_enum(arg_fmt->type))
return btf_enum_scnprintf(arg_fmt->type, trace->btf, bf, size, val);
else if (btf_is_struct(arg_fmt->type) || btf_is_union(arg_fmt->type))
return btf_struct_scnprintf(arg_fmt->type, trace->btf, bf, size, arg);
return 0;
}
#else // HAVE_LIBBPF_SUPPORT
static size_t trace__btf_scnprintf(struct trace *trace __maybe_unused, struct syscall_arg *arg __maybe_unused,
char *bf __maybe_unused, size_t size __maybe_unused, int val __maybe_unused,
char *type __maybe_unused)
{
return 0;
}
static bool syscall_arg__strtoul_btf_type(char *bf __maybe_unused, size_t size __maybe_unused,
struct syscall_arg *arg __maybe_unused, u64 *val __maybe_unused)
{
return false;
}
#endif // HAVE_LIBBPF_SUPPORT
#define STUL_BTF_TYPE syscall_arg__strtoul_btf_type
#define STRARRAY(name, array) \
{ .scnprintf = SCA_STRARRAY, \
.strtoul = STUL_STRARRAY, \
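
The BTF enum helpers in this hunk translate in both directions: value to name for printing, and name to value for filter parsing. The sketch below shows the same idea without libbpf. The `demo_*` names and the table contents are hypothetical, and the name lookup here requires an exact-length match, which is a simplification and not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the (name, value) pairs BTF stores for an enum. */
struct demo_enum_entry {
	const char *name;
	int val;
};

static const struct demo_enum_entry demo_modes[] = {
	{ "DEMO_MODE_ABS", 0 },
	{ "DEMO_MODE_REL", 1 },
};

/* Value -> name: returns the number of characters written, 0 if unknown. */
static size_t demo_enum_snprintf(const struct demo_enum_entry *entries, size_t nr_entries,
				 char *bf, size_t size, int val)
{
	for (size_t i = 0; i < nr_entries; ++i) {
		if (entries[i].val == val) {
			int n = snprintf(bf, size, "%s", entries[i].name);

			if (n < 0)
				return 0;
			/* On truncation report what actually fits, like scnprintf(). */
			if ((size_t)n >= size)
				return size ? size - 1 : 0;
			return (size_t)n;
		}
	}
	return 0;
}

/* Name -> value: 'bf' need not be NUL-terminated, 'size' is its length. */
static bool demo_enum_strtoval(const struct demo_enum_entry *entries, size_t nr_entries,
			       const char *bf, size_t size, int *val)
{
	for (size_t i = 0; i < nr_entries; ++i) {
		if (strlen(entries[i].name) == size &&
		    strncmp(entries[i].name, bf, size) == 0) {
			*val = entries[i].val;
			return true;
		}
	}
	return false;
}
```

Returning 0 for an unknown value lets a caller fall back to another formatter, which is how the hunk's callers treat a zero return from the BTF printer.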
@ -921,16 +1132,17 @@ static const struct syscall_fmt syscall_fmts[] = {
[1] = { .scnprintf = SCA_PTR, /* arg2 */ }, }, },
{ .name = "bind",
.arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
[1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ },
[1] = SCA_SOCKADDR_FROM_USER(umyaddr),
[2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
{ .name = "bpf",
.arg = { [0] = STRARRAY(cmd, bpf_cmd), }, },
.arg = { [0] = STRARRAY(cmd, bpf_cmd),
[1] = { .from_user = true /* attr */, }, } },
{ .name = "brk", .hexret = true,
.arg = { [0] = { .scnprintf = SCA_PTR, /* brk */ }, }, },
{ .name = "clock_gettime",
.arg = { [0] = STRARRAY(clk_id, clockid), }, },
{ .name = "clock_nanosleep",
.arg = { [2] = { .scnprintf = SCA_TIMESPEC, /* rqtp */ }, }, },
.arg = { [2] = SCA_TIMESPEC_FROM_USER(req), }, },
{ .name = "clone", .errpid = true, .nr_args = 5,
.arg = { [0] = { .name = "flags", .scnprintf = SCA_CLONE_FLAGS, },
[1] = { .name = "child_stack", .scnprintf = SCA_HEX, },
@ -941,7 +1153,7 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [0] = { .scnprintf = SCA_CLOSE_FD, /* fd */ }, }, },
{ .name = "connect",
.arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
[1] = { .scnprintf = SCA_SOCKADDR, /* servaddr */ },
[1] = SCA_SOCKADDR_FROM_USER(servaddr),
[2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
{ .name = "epoll_ctl",
.arg = { [1] = STRARRAY(op, epoll_ctl_ops), }, },
@ -949,11 +1161,11 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [1] = { .scnprintf = SCA_EFD_FLAGS, /* flags */ }, }, },
{ .name = "faccessat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dirfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* pathname */ },
[1] = SCA_FILENAME_FROM_USER(pathname),
[2] = { .scnprintf = SCA_ACCMODE, /* mode */ }, }, },
{ .name = "faccessat2",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dirfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* pathname */ },
[1] = SCA_FILENAME_FROM_USER(pathname),
[2] = { .scnprintf = SCA_ACCMODE, /* mode */ },
[3] = { .scnprintf = SCA_FACCESSAT2_FLAGS, /* flags */ }, }, },
{ .name = "fchmodat",
@ -975,7 +1187,7 @@ static const struct syscall_fmt syscall_fmts[] = {
[2] = { .scnprintf = SCA_FSMOUNT_ATTR_FLAGS, /* attr_flags */ }, }, },
{ .name = "fspick",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* path */ },
[1] = SCA_FILENAME_FROM_USER(path),
[2] = { .scnprintf = SCA_FSPICK_FLAGS, /* flags */ }, }, },
{ .name = "fstat", .alias = "newfstat", },
{ .name = "futex",
@ -1039,29 +1251,29 @@ static const struct syscall_fmt syscall_fmts[] = {
.parm = &strarray__mmap_flags, },
[5] = { .scnprintf = SCA_HEX, /* offset */ }, }, },
{ .name = "mount",
.arg = { [0] = { .scnprintf = SCA_FILENAME, /* dev_name */ },
.arg = { [0] = SCA_FILENAME_FROM_USER(devname),
[3] = { .scnprintf = SCA_MOUNT_FLAGS, /* flags */
.mask_val = SCAMV_MOUNT_FLAGS, /* flags */ }, }, },
{ .name = "move_mount",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* from_dfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* from_pathname */ },
[1] = SCA_FILENAME_FROM_USER(pathname),
[2] = { .scnprintf = SCA_FDAT, /* to_dfd */ },
[3] = { .scnprintf = SCA_FILENAME, /* to_pathname */ },
[3] = SCA_FILENAME_FROM_USER(pathname),
[4] = { .scnprintf = SCA_MOVE_MOUNT_FLAGS, /* flags */ }, }, },
{ .name = "mprotect",
.arg = { [0] = { .scnprintf = SCA_HEX, /* start */ },
[2] = { .scnprintf = SCA_MMAP_PROT, .show_zero = true, /* prot */ }, }, },
{ .name = "mq_unlink",
.arg = { [0] = { .scnprintf = SCA_FILENAME, /* u_name */ }, }, },
.arg = { [0] = SCA_FILENAME_FROM_USER(u_name), }, },
{ .name = "mremap", .hexret = true,
.arg = { [3] = { .scnprintf = SCA_MREMAP_FLAGS, /* flags */ }, }, },
{ .name = "name_to_handle_at",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
{ .name = "nanosleep",
.arg = { [0] = { .scnprintf = SCA_TIMESPEC, /* req */ }, }, },
.arg = { [0] = SCA_TIMESPEC_FROM_USER(req), }, },
{ .name = "newfstatat", .alias = "fstatat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dirfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* pathname */ },
[1] = SCA_FILENAME_FROM_USER(pathname),
[3] = { .scnprintf = SCA_FS_AT_FLAGS, /* flags */ }, }, },
{ .name = "open",
.arg = { [1] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
@ -1072,7 +1284,7 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ },
[2] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
{ .name = "perf_event_open",
.arg = { [0] = { .scnprintf = SCA_PERF_ATTR, /* attr */ },
.arg = { [0] = SCA_PERF_ATTR_FROM_USER(attr),
[2] = { .scnprintf = SCA_INT, /* cpu */ },
[3] = { .scnprintf = SCA_FD, /* group_fd */ },
[4] = { .scnprintf = SCA_PERF_FLAGS, /* flags */ }, }, },
@ -1097,7 +1309,8 @@ static const struct syscall_fmt syscall_fmts[] = {
{ .name = "pread", .alias = "pread64", },
{ .name = "preadv", .alias = "pread", },
{ .name = "prlimit64",
.arg = { [1] = STRARRAY(resource, rlimit_resources), }, },
.arg = { [1] = STRARRAY(resource, rlimit_resources),
[2] = { .from_user = true /* new_rlim */, }, }, },
{ .name = "pwrite", .alias = "pwrite64", },
{ .name = "readlinkat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
@ -1114,6 +1327,8 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
[2] = { .scnprintf = SCA_FDAT, /* newdirfd */ },
[4] = { .scnprintf = SCA_RENAMEAT2_FLAGS, /* flags */ }, }, },
{ .name = "rseq", .errpid = true,
.arg = { [0] = { .from_user = true /* rseq */, }, }, },
{ .name = "rt_sigaction",
.arg = { [0] = { .scnprintf = SCA_SIGNUM, /* sig */ }, }, },
{ .name = "rt_sigprocmask",
@ -1135,12 +1350,15 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [2] = { .scnprintf = SCA_MSG_FLAGS, /* flags */ }, }, },
{ .name = "sendto",
.arg = { [3] = { .scnprintf = SCA_MSG_FLAGS, /* flags */ },
[4] = { .scnprintf = SCA_SOCKADDR, /* addr */ }, }, },
[4] = SCA_SOCKADDR_FROM_USER(addr), }, },
{ .name = "set_robust_list", .errpid = true,
.arg = { [0] = { .from_user = true /* head */, }, }, },
{ .name = "set_tid_address", .errpid = true, },
{ .name = "setitimer",
.arg = { [0] = STRARRAY(which, itimers), }, },
{ .name = "setrlimit",
.arg = { [0] = STRARRAY(resource, rlimit_resources), }, },
.arg = { [0] = STRARRAY(resource, rlimit_resources),
[1] = { .from_user = true /* rlim */, }, }, },
{ .name = "setsockopt",
.arg = { [1] = STRARRAY(level, socket_level), }, },
{ .name = "socket",
@ -1157,9 +1375,9 @@ static const struct syscall_fmt syscall_fmts[] = {
[2] = { .scnprintf = SCA_FS_AT_FLAGS, /* flags */ } ,
[3] = { .scnprintf = SCA_STATX_MASK, /* mask */ }, }, },
{ .name = "swapoff",
.arg = { [0] = { .scnprintf = SCA_FILENAME, /* specialfile */ }, }, },
.arg = { [0] = SCA_FILENAME_FROM_USER(specialfile), }, },
{ .name = "swapon",
.arg = { [0] = { .scnprintf = SCA_FILENAME, /* specialfile */ }, }, },
.arg = { [0] = SCA_FILENAME_FROM_USER(specialfile), }, },
{ .name = "symlinkat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
{ .name = "sync_file_range",
@ -1169,11 +1387,11 @@ static const struct syscall_fmt syscall_fmts[] = {
{ .name = "tkill",
.arg = { [1] = { .scnprintf = SCA_SIGNUM, /* sig */ }, }, },
{ .name = "umount2", .alias = "umount",
.arg = { [0] = { .scnprintf = SCA_FILENAME, /* name */ }, }, },
.arg = { [0] = SCA_FILENAME_FROM_USER(name), }, },
{ .name = "uname", .alias = "newuname", },
{ .name = "unlinkat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ },
[1] = { .scnprintf = SCA_FILENAME, /* pathname */ },
[1] = SCA_FILENAME_FROM_USER(pathname),
[2] = { .scnprintf = SCA_FS_AT_FLAGS, /* flags */ }, }, },
{ .name = "utimensat",
.arg = { [0] = { .scnprintf = SCA_FDAT, /* dirfd */ }, }, },
@ -1181,6 +1399,8 @@ static const struct syscall_fmt syscall_fmts[] = {
.arg = { [2] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
{ .name = "waitid", .errpid = true,
.arg = { [3] = { .scnprintf = SCA_WAITID_OPTIONS, /* options */ }, }, },
{ .name = "write", .errpid = true,
.arg = { [1] = { .scnprintf = SCA_BUF /* buf */, .from_user = true, }, }, },
};
static int syscall_fmt__cmp(const void *name, const void *fmtp)
@ -1238,6 +1458,7 @@ struct syscall {
bool is_exit;
bool is_open;
bool nonexistent;
bool use_btf;
struct tep_format_field *args;
const char *name;
const struct syscall_fmt *fmt;
@ -1551,6 +1772,32 @@ static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
return 0;
}
#define MAX_CONTROL_CHAR 31
#define MAX_ASCII 127
static size_t syscall_arg__scnprintf_buf(char *bf, size_t size, struct syscall_arg *arg)
{
struct augmented_arg *augmented_arg = arg->augmented.args;
unsigned char *orig = (unsigned char *)augmented_arg->value;
size_t printed = 0;
int consumed;
if (augmented_arg == NULL)
return 0;
for (int j = 0; j < augmented_arg->size; ++j) {
bool control_char = orig[j] <= MAX_CONTROL_CHAR || orig[j] >= MAX_ASCII;
/* print control characters (0~31 and 127), and non-ascii characters in \(digits) */
printed += scnprintf(bf + printed, size - printed, control_char ? "\\%d" : "%c", (int)orig[j]);
}
consumed = sizeof(*augmented_arg) + augmented_arg->size;
arg->augmented.args = ((void *)arg->augmented.args) + consumed;
arg->augmented.size -= consumed;
return printed;
}
static bool trace__filter_duration(struct trace *trace, double t)
{
return t < (trace->duration_filter * NSEC_PER_MSEC);
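
The buffer printer added in this hunk emits printable ASCII as-is and everything else (bytes 0 to 31, and 127 and above) as a backslash followed by the decimal byte value. A standalone sketch of that escaping rule follows. `demo_escape_buf` and the `DEMO_*` macros are hypothetical names, and the truncation handling is an assumption layered on plain snprintf(), since the kernel-style scnprintf() is not available outside the perf tree.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_CONTROL_CHAR 31
#define DEMO_MAX_ASCII 127

/*
 * Copy 'len' bytes from 'data' into 'bf', escaping control and non-ASCII
 * bytes as \<decimal>. Returns the number of characters written, not
 * counting the terminating NUL.
 */
static size_t demo_escape_buf(char *bf, size_t size, const unsigned char *data, size_t len)
{
	size_t printed = 0;

	if (size == 0)
		return 0;
	bf[0] = '\0';

	for (size_t j = 0; j < len; ++j) {
		int n;

		if (data[j] <= DEMO_MAX_CONTROL_CHAR || data[j] >= DEMO_MAX_ASCII)
			n = snprintf(bf + printed, size - printed, "\\%d", (int)data[j]);
		else
			n = snprintf(bf + printed, size - printed, "%c", (int)data[j]);

		if (n < 0)
			break;
		if ((size_t)n >= size - printed) {
			/* Truncated: snprintf() already NUL-terminated the output. */
			printed = size - 1;
			break;
		}
		printed += (size_t)n;
	}
	return printed;
}
```

With this rule a payload such as the three bytes `h`, `i`, newline is rendered as `hi\10`, which keeps the trace output on a single line.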
@ -1637,7 +1884,7 @@ static int trace__process_event(struct trace *trace, struct machine *machine,
return ret;
}
static int trace__tool_process(struct perf_tool *tool,
static int trace__tool_process(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
@ -1744,7 +1991,8 @@ static const struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *n
}
static struct tep_format_field *
syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field)
syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field,
bool *use_btf)
{
struct tep_format_field *last_field = NULL;
int len;
@ -1757,11 +2005,15 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
len = strlen(field->name);
// As far as heuristics (or intention) goes this seems to hold true, and makes sense!
if ((field->flags & TEP_FIELD_IS_POINTER) && strstarts(field->type, "const "))
arg->from_user = true;
if (strcmp(field->type, "const char *") == 0 &&
((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
strstr(field->name, "path") != NULL))
strstr(field->name, "path") != NULL)) {
arg->scnprintf = SCA_FILENAME;
else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
} else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
arg->scnprintf = SCA_PTR;
else if (strcmp(field->type, "pid_t") == 0)
arg->scnprintf = SCA_PID;
@ -1782,6 +2034,9 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
* 7 unsigned long
*/
arg->scnprintf = SCA_FD;
} else if (strstr(field->type, "enum") && use_btf != NULL) {
*use_btf = true;
arg->strtoul = STUL_BTF_TYPE;
} else {
const struct syscall_arg_fmt *fmt =
syscall_arg_fmt__find_by_name(field->name);
@ -1798,7 +2053,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
static int syscall__set_arg_fmts(struct syscall *sc)
{
struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args);
struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args,
&sc->use_btf);
if (last_field)
sc->args_size = last_field->offset + last_field->size;
@ -1811,6 +2067,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
char tp_name[128];
struct syscall *sc;
const char *name = syscalltbl__name(trace->sctbl, id);
int err;
#ifdef HAVE_SYSCALL_TABLE_SUPPORT
if (trace->syscalls.table == NULL) {
@ -1883,15 +2140,21 @@ static int trace__read_syscall_info(struct trace *trace, int id)
sc->is_exit = !strcmp(name, "exit_group") || !strcmp(name, "exit");
sc->is_open = !strcmp(name, "open") || !strcmp(name, "openat");
return syscall__set_arg_fmts(sc);
err = syscall__set_arg_fmts(sc);
/* after calling syscall__set_arg_fmts() we'll know whether use_btf is true */
if (sc->use_btf)
trace__load_vmlinux_btf(trace);
return err;
}
static int evsel__init_tp_arg_scnprintf(struct evsel *evsel)
static int evsel__init_tp_arg_scnprintf(struct evsel *evsel, bool *use_btf)
{
struct syscall_arg_fmt *fmt = evsel__syscall_arg_fmt(evsel);
if (fmt != NULL) {
syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields);
syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields, use_btf);
return 0;
}
@ -2050,7 +2313,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
unsigned char *args, void *augmented_args, int augmented_args_size,
struct trace *trace, struct thread *thread)
{
size_t printed = 0;
size_t printed = 0, btf_printed;
unsigned long val;
u8 bit = 1;
struct syscall_arg arg = {
@ -2066,6 +2329,7 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
.show_string_prefix = trace->show_string_prefix,
};
struct thread_trace *ttrace = thread__priv(thread);
void *default_scnprintf;
/*
* Things like fcntl will set this in its 'cmd' formatter to pick the
@ -2093,9 +2357,13 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
/*
* Suppress this argument if its value is zero and show_zero
* property isn't set.
*
* If it has a BTF type, then override the zero suppression knob
* as the common case is for zero in an enum to have an associated entry.
*/
if (val == 0 && !trace->show_zeros &&
!(sc->arg_fmt && sc->arg_fmt[arg.idx].show_zero))
!(sc->arg_fmt && sc->arg_fmt[arg.idx].show_zero) &&
!(sc->arg_fmt && sc->arg_fmt[arg.idx].strtoul == STUL_BTF_TYPE))
continue;
printed += scnprintf(bf + printed, size - printed, "%s", printed ? ", " : "");
@ -2103,6 +2371,17 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
if (trace->show_arg_names)
printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
default_scnprintf = sc->arg_fmt[arg.idx].scnprintf;
if (trace->force_btf || default_scnprintf == NULL || default_scnprintf == SCA_PTR) {
btf_printed = trace__btf_scnprintf(trace, &arg, bf + printed,
size - printed, val, field->type);
if (btf_printed) {
printed += btf_printed;
continue;
}
}
printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
bf + printed, size - printed, &arg, val);
}
@ -2749,7 +3028,7 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
size_t size = sizeof(bf);
struct tep_format_field *field = evsel->tp_format->format.fields;
struct syscall_arg_fmt *arg = __evsel__syscall_arg_fmt(evsel);
size_t printed = 0;
size_t printed = 0, btf_printed;
unsigned long val;
u8 bit = 1;
struct syscall_arg syscall_arg = {
@ -2791,7 +3070,7 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
val = syscall_arg_fmt__mask_val(arg, &syscall_arg, val);
/* Suppress this argument if its value is zero and show_zero property isn't set. */
if (val == 0 && !trace->show_zeros && !arg->show_zero)
if (val == 0 && !trace->show_zeros && !arg->show_zero && arg->strtoul != STUL_BTF_TYPE)
continue;
printed += scnprintf(bf + printed, size - printed, "%s", printed ? ", " : "");
@ -2799,6 +3078,12 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
if (trace->show_arg_names)
printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
btf_printed = trace__btf_scnprintf(trace, &syscall_arg, bf + printed, size - printed, val, field->type);
if (btf_printed) {
printed += btf_printed;
continue;
}
printed += syscall_arg_fmt__scnprintf_val(arg, bf + printed, size - printed, &syscall_arg, val);
}
@ -3009,7 +3294,7 @@ static void trace__set_base_time(struct trace *trace,
trace->base_time = sample->time;
}
static int trace__process_sample(struct perf_tool *tool,
static int trace__process_sample(const struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct evsel *evsel,
@ -3276,6 +3561,23 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
}
#ifdef HAVE_BPF_SKEL
static int syscall_arg_fmt__cache_btf_struct(struct syscall_arg_fmt *arg_fmt, struct btf *btf, char *type)
{
int id;
if (arg_fmt->type != NULL)
return -1;
id = btf__find_by_name(btf, type);
if (id < 0)
return -1;
arg_fmt->type = btf__type_by_id(btf, id);
arg_fmt->type_id = id;
return 0;
}
static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
{
struct bpf_program *pos, *prog = NULL;
@ -3351,6 +3653,91 @@ static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->skel->progs.syscall_unaugmented);
}
static int trace__bpf_sys_enter_beauty_map(struct trace *trace, int key, unsigned int *beauty_array)
{
struct tep_format_field *field;
struct syscall *sc = trace__syscall_info(trace, NULL, key);
const struct btf_type *bt;
char *struct_offset, *tmp, name[32];
bool can_augment = false;
int i, cnt;
if (sc == NULL)
return -1;
trace__load_vmlinux_btf(trace);
if (trace->btf == NULL)
return -1;
for (i = 0, field = sc->args; field; ++i, field = field->next) {
// XXX We're only collecting pointer payloads _from_ user space
if (!sc->arg_fmt[i].from_user)
continue;
struct_offset = strstr(field->type, "struct ");
if (struct_offset == NULL)
struct_offset = strstr(field->type, "union ");
else
struct_offset++; // "union" is shorter
if (field->flags & TEP_FIELD_IS_POINTER && struct_offset) { /* struct or union (think BPF's attr arg) */
struct_offset += 6;
/* for 'struct foo *', we only want 'foo' */
for (tmp = struct_offset, cnt = 0; *tmp != ' ' && *tmp != '\0'; ++tmp, ++cnt) {
}
strncpy(name, struct_offset, cnt);
name[cnt] = '\0';
/* cache struct's btf_type and type_id */
if (syscall_arg_fmt__cache_btf_struct(&sc->arg_fmt[i], trace->btf, name))
continue;
bt = sc->arg_fmt[i].type;
beauty_array[i] = bt->size;
can_augment = true;
} else if (field->flags & TEP_FIELD_IS_POINTER && /* string */
strcmp(field->type, "const char *") == 0 &&
(strstr(field->name, "name") ||
strstr(field->name, "path") ||
strstr(field->name, "file") ||
strstr(field->name, "root") ||
strstr(field->name, "key") ||
strstr(field->name, "special") ||
strstr(field->name, "type") ||
strstr(field->name, "description"))) {
beauty_array[i] = 1;
can_augment = true;
} else if (field->flags & TEP_FIELD_IS_POINTER && /* buffer */
strstr(field->type, "char *") &&
(strstr(field->name, "buf") ||
strstr(field->name, "val") ||
strstr(field->name, "msg"))) {
int j;
struct tep_format_field *field_tmp;
/* find the size of the buffer that appears in pairs with buf */
for (j = 0, field_tmp = sc->args; field_tmp; ++j, field_tmp = field_tmp->next) {
if (!(field_tmp->flags & TEP_FIELD_IS_POINTER) && /* only integers */
(strstr(field_tmp->name, "count") ||
strstr(field_tmp->name, "siz") || /* size, bufsiz */
(strstr(field_tmp->name, "len") && strcmp(field_tmp->name, "filename")))) {
/* filename's got 'len' in it, we don't want that */
beauty_array[i] = -(j + 1);
can_augment = true;
break;
}
}
}
}
if (can_augment)
return 0;
return -1;
}
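Note on the name extraction in trace__bpf_sys_enter_beauty_map(): for a field type such as 'struct foo *' or 'union bpf_attr *', only the bare tag ('foo', 'bpf_attr') is handed to the BTF lookup, and it is copied into a fixed 32-byte buffer. The following is a minimal sketch of the same parsing with an explicit bounds check. The helper name demo_struct_name() is hypothetical and not part of perf; rejecting a tag that does not fit the buffer is an assumption of this sketch, not something the diff does.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: copy the tag of a
 * "struct foo *" or "union foo *" type string into 'name'.
 * Returns 0 on success, -1 if there is no struct/union keyword
 * or the tag does not fit into 'size' bytes (including the NUL).
 */
static int demo_struct_name(const char *type, char *name, size_t size)
{
	const char *p = strstr(type, "struct ");
	size_t skip = sizeof("struct ") - 1;
	size_t len;

	if (p == NULL) {
		p = strstr(type, "union ");
		skip = sizeof("union ") - 1;
	}
	if (p == NULL || size == 0)
		return -1;

	p += skip;              /* step over the keyword and its space */
	len = strcspn(p, " ");  /* the tag ends at the next space or at the NUL */
	if (len == 0 || len >= size)
		return -1;

	memcpy(name, p, len);
	name[len] = '\0';
	return 0;
}
```

With a 32-byte buffer this yields "foo" for "struct foo *" and "bpf_attr" for "union bpf_attr *", and returns -1 for "const char *".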
static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
{
struct tep_format_field *field, *candidate_field;
@ -3455,7 +3842,9 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
{
int map_enter_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_enter);
int map_exit_fd = bpf_map__fd(trace->skel->maps.syscalls_sys_exit);
int beauty_map_fd = bpf_map__fd(trace->skel->maps.beauty_map_enter);
int err = 0;
unsigned int beauty_array[6];
for (int i = 0; i < trace->sctbl->syscalls.nr_entries; ++i) {
int prog_fd, key = syscalltbl__id_at_idx(trace->sctbl, i);
@ -3474,6 +3863,15 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
if (err)
break;
/* use beauty_map to tell BPF how many bytes to collect, set beauty_map's value here */
memset(beauty_array, 0, sizeof(beauty_array));
err = trace__bpf_sys_enter_beauty_map(trace, key, (unsigned int *)beauty_array);
if (err)
continue;
err = bpf_map_update_elem(beauty_map_fd, &key, beauty_array, BPF_ANY);
if (err)
break;
}
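Note on the beauty_array values: the loop above stores one slot per syscall argument, and trace__bpf_sys_enter_beauty_map() fills each slot with one of three encodings: the BTF size of a struct or union, 1 for a string, or -(j + 1) for a buffer whose length is carried by argument j. A slot left at 0 means nothing is collected. The following is a minimal sketch of how a consumer could decode a slot. The names beauty_kind, beauty_slot and beauty_decode() are hypothetical and not part of perf, the BPF-side handling is not shown in this diff, and the cast assumes a two's complement int of the same width as unsigned int.

```c
/* Hypothetical types for illustration only; not part of perf. */
enum beauty_kind { BEAUTY_NONE, BEAUTY_STRING, BEAUTY_STRUCT, BEAUTY_BUFFER };

struct beauty_slot {
	enum beauty_kind kind;
	unsigned int struct_size; /* meaningful for BEAUTY_STRUCT only */
	int size_arg_idx;         /* meaningful for BEAUTY_BUFFER only */
};

/* Decode one beauty_array slot using the encodings described above. */
static struct beauty_slot beauty_decode(unsigned int raw)
{
	struct beauty_slot slot = { BEAUTY_NONE, 0, -1 };
	int as_signed = (int)raw; /* assumption: two's complement int */

	if (as_signed < 0) {
		/* stored as -(j + 1): recover j, the index of the size argument */
		slot.kind = BEAUTY_BUFFER;
		slot.size_arg_idx = -as_signed - 1;
	} else if (raw == 1) {
		slot.kind = BEAUTY_STRING;
	} else if (raw > 1) {
		slot.kind = BEAUTY_STRUCT;
		slot.struct_size = raw;
	}
	return slot;
}
```

One consequence of this scheme is that a struct whose BTF size is exactly 1 cannot be told apart from a string, and a struct of size 0 looks like an unused slot.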
/*
@ -3680,7 +4078,8 @@ static int ordered_events__deliver_event(struct ordered_events *oe,
return __trace__deliver_event(trace, event->event);
}
static struct syscall_arg_fmt *evsel__find_syscall_arg_fmt_by_name(struct evsel *evsel, char *arg)
static struct syscall_arg_fmt *evsel__find_syscall_arg_fmt_by_name(struct evsel *evsel, char *arg,
char **type)
{
struct tep_format_field *field;
struct syscall_arg_fmt *fmt = __evsel__syscall_arg_fmt(evsel);
@ -3689,13 +4088,15 @@ static struct syscall_arg_fmt *evsel__find_syscall_arg_fmt_by_name(struct evsel
return NULL;
for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
if (strcmp(field->name, arg) == 0)
if (strcmp(field->name, arg) == 0) {
*type = field->type;
return fmt;
}
return NULL;
}
static int trace__expand_filter(struct trace *trace __maybe_unused, struct evsel *evsel)
static int trace__expand_filter(struct trace *trace, struct evsel *evsel)
{
char *tok, *left = evsel->filter, *new_filter = evsel->filter;
@ -3728,14 +4129,14 @@ static int trace__expand_filter(struct trace *trace __maybe_unused, struct evsel
struct syscall_arg_fmt *fmt;
int left_size = tok - left,
right_size = right_end - right;
char arg[128];
char arg[128], *type;
while (isspace(left[left_size - 1]))
--left_size;
scnprintf(arg, sizeof(arg), "%.*s", left_size, left);
fmt = evsel__find_syscall_arg_fmt_by_name(evsel, arg);
fmt = evsel__find_syscall_arg_fmt_by_name(evsel, arg, &type);
if (fmt == NULL) {
pr_err("\"%s\" not found in \"%s\", can't set filter \"%s\"\n",
arg, evsel->name, evsel->filter);
@ -3748,6 +4149,9 @@ static int trace__expand_filter(struct trace *trace __maybe_unused, struct evsel
if (fmt->strtoul) {
u64 val;
struct syscall_arg syscall_arg = {
.trace = trace,
.fmt = fmt,
.type_name = type,
.parm = fmt->parm,
};
@ -3959,7 +4363,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
err = trace__expand_filters(trace, &evsel);
if (err)
goto out_delete_evlist;
err = evlist__apply_filters(evlist, &evsel);
err = evlist__apply_filters(evlist, &evsel, &trace->opts.target);
if (err < 0)
goto out_error_apply_filters;
@ -4451,7 +4855,7 @@ static void evsel__set_syscall_arg_fmt(struct evsel *evsel, const char *name)
}
}
static int evlist__set_syscall_tp_fields(struct evlist *evlist)
static int evlist__set_syscall_tp_fields(struct evlist *evlist, bool *use_btf)
{
struct evsel *evsel;
@ -4460,7 +4864,7 @@ static int evlist__set_syscall_tp_fields(struct evlist *evlist)
continue;
if (strcmp(evsel->tp_format->system, "syscalls")) {
evsel__init_tp_arg_scnprintf(evsel);
evsel__init_tp_arg_scnprintf(evsel, use_btf);
continue;
}
@ -4781,6 +5185,8 @@ int cmd_trace(int argc, const char **argv)
OPT_INTEGER('D', "delay", &trace.opts.target.initial_delay,
"ms to wait before starting measurement after program "
"start"),
OPT_BOOLEAN(0, "force-btf", &trace.force_btf, "Prefer btf_dump general pretty printer "
"to customized ones"),
OPTS_EVSWITCH(&trace.evswitch),
OPT_END()
};
@ -4938,11 +5344,16 @@ int cmd_trace(int argc, const char **argv)
}
if (trace.evlist->core.nr_entries > 0) {
bool use_btf = false;
evlist__set_default_evsel_handler(trace.evlist, trace__event_handler);
if (evlist__set_syscall_tp_fields(trace.evlist)) {
if (evlist__set_syscall_tp_fields(trace.evlist, &use_btf)) {
perror("failed to set syscalls:* tracepoint fields");
goto out;
}
if (use_btf)
trace__load_vmlinux_btf(&trace);
}
if (trace.sort_events) {

@ -46,45 +46,18 @@ static void status_print(const char *name, const char *macro,
printf(" # %s\n", macro);
}
#define STATUS(__d, __m) \
do { \
if (IS_BUILTIN(__d)) \
status_print(#__m, #__d, "on"); \
else \
status_print(#__m, #__d, "OFF"); \
#define STATUS(feature) \
do { \
if (feature.is_builtin) \
status_print(feature.name, feature.macro, "on"); \
else \
status_print(feature.name, feature.macro, "OFF"); \
} while (0)
static void library_status(void)
{
STATUS(HAVE_DWARF_SUPPORT, dwarf);
STATUS(HAVE_DWARF_GETLOCATIONS_SUPPORT, dwarf_getlocations);
#ifndef HAVE_SYSCALL_TABLE_SUPPORT
STATUS(HAVE_LIBAUDIT_SUPPORT, libaudit);
#endif
STATUS(HAVE_SYSCALL_TABLE_SUPPORT, syscall_table);
STATUS(HAVE_LIBBFD_SUPPORT, libbfd);
STATUS(HAVE_DEBUGINFOD_SUPPORT, debuginfod);
STATUS(HAVE_LIBELF_SUPPORT, libelf);
STATUS(HAVE_LIBNUMA_SUPPORT, libnuma);
STATUS(HAVE_LIBNUMA_SUPPORT, numa_num_possible_cpus);
STATUS(HAVE_LIBPERL_SUPPORT, libperl);
STATUS(HAVE_LIBPYTHON_SUPPORT, libpython);
STATUS(HAVE_SLANG_SUPPORT, libslang);
STATUS(HAVE_LIBCRYPTO_SUPPORT, libcrypto);
STATUS(HAVE_LIBUNWIND_SUPPORT, libunwind);
STATUS(HAVE_DWARF_SUPPORT, libdw-dwarf-unwind);
STATUS(HAVE_LIBCAPSTONE_SUPPORT, libcapstone);
STATUS(HAVE_ZLIB_SUPPORT, zlib);
STATUS(HAVE_LZMA_SUPPORT, lzma);
STATUS(HAVE_AUXTRACE_SUPPORT, get_cpuid);
STATUS(HAVE_LIBBPF_SUPPORT, bpf);
STATUS(HAVE_AIO_SUPPORT, aio);
STATUS(HAVE_ZSTD_SUPPORT, zstd);
STATUS(HAVE_LIBPFM, libpfm4);
STATUS(HAVE_LIBTRACEEVENT, libtraceevent);
STATUS(HAVE_BPF_SKEL, bpf_skeletons);
STATUS(HAVE_DWARF_UNWIND_SUPPORT, dwarf-unwind-support);
STATUS(HAVE_CSTRACE_SUPPORT, libopencsd);
for (int i = 0; supported_features[i].name; ++i)
STATUS(supported_features[i]);
}
int cmd_version(int argc, const char **argv)

@ -2,6 +2,22 @@
#ifndef BUILTIN_H
#define BUILTIN_H
#include <stddef.h>
#include <linux/compiler.h>
#include <tools/config.h>
struct feature_status {
const char *name;
const char *macro;
int is_builtin;
};
#define FEATURE_STATUS(name_, macro_) { \
.name = name_, \
.macro = #macro_, \
.is_builtin = IS_BUILTIN(macro_) }
extern struct feature_status supported_features[];
struct cmdnames;
void list_common_cmds_help(void);
@ -11,6 +27,7 @@ int cmd_annotate(int argc, const char **argv);
int cmd_bench(int argc, const char **argv);
int cmd_buildid_cache(int argc, const char **argv);
int cmd_buildid_list(int argc, const char **argv);
int cmd_check(int argc, const char **argv);
int cmd_config(int argc, const char **argv);
int cmd_c2c(int argc, const char **argv);
int cmd_diff(int argc, const char **argv);
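Note on the feature table: the STATUS() rewrite and the FEATURE_STATUS() initializer above replace a hand-written list of calls with a single array that is walked until an entry with a NULL name is found. The following is a minimal sketch of that pattern. All DEMO_* and demo_* names are hypothetical, the table contents are made up, and the sketch evaluates the macro value directly instead of using perf's IS_BUILTIN(), so each DEMO_HAVE_* macro must be defined as 0 or 1. The terminating NULL entry is an assumption inferred from the loop condition, since the definition of supported_features[] is not part of this hunk.

```c
#include <stdio.h>

/* Stand-ins for build-time feature macros; in this sketch always 0 or 1. */
#define DEMO_HAVE_ZLIB 1
#define DEMO_HAVE_LZMA 0

struct feature_status {
	const char *name;
	const char *macro;
	int is_builtin;
};

/* #macro_ stringifies the unexpanded macro name; macro_ expands to 0 or 1. */
#define DEMO_FEATURE_STATUS(name_, macro_) {	\
	.name = name_,				\
	.macro = #macro_,			\
	.is_builtin = macro_ }

static const struct feature_status demo_features[] = {
	DEMO_FEATURE_STATUS("zlib", DEMO_HAVE_ZLIB),
	DEMO_FEATURE_STATUS("lzma", DEMO_HAVE_LZMA),
	{ .name = NULL }, /* sentinel: terminates the loop below */
};

/* Print one status line per feature and return how many are enabled. */
static int demo_print_features(FILE *out)
{
	int enabled = 0;

	for (int i = 0; demo_features[i].name; ++i) {
		const struct feature_status *f = &demo_features[i];

		fprintf(out, "%22s: [ %-3s ]  # %s\n", f->name,
			f->is_builtin ? "on" : "OFF", f->macro);
		enabled += f->is_builtin;
	}
	return enabled;
}
```

Keeping the data in one array lets more than one command iterate over the same list, which is presumably why the table is declared extern in the header.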

@ -172,6 +172,7 @@ check lib/ctype.c '-I "^EXPORT_SYMBOL" -I "^#include <linux/export.h>" -B
check lib/list_sort.c '-I "^#include <linux/bug.h>"'
# diff non-symmetric files
check_2 tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
check_2 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
check_2 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
check_2 tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl

@ -52,6 +52,7 @@ static struct cmd_struct commands[] = {
{ "archive", NULL, 0 },
{ "buildid-cache", cmd_buildid_cache, 0 },
{ "buildid-list", cmd_buildid_list, 0 },
{ "check", cmd_check, 0 },
{ "config", cmd_config, 0 },
{ "c2c", cmd_c2c, 0 },
{ "diff", cmd_diff, 0 },

@ -11,6 +11,8 @@ METRIC_TEST_PY = pmu-events/metric_test.py
EMPTY_PMU_EVENTS_C = pmu-events/empty-pmu-events.c
PMU_EVENTS_C = $(OUTPUT)pmu-events/pmu-events.c
METRIC_TEST_LOG = $(OUTPUT)pmu-events/metric_test.log
TEST_EMPTY_PMU_EVENTS_C = $(OUTPUT)pmu-events/test-empty-pmu-events.c
EMPTY_PMU_EVENTS_TEST_LOG = $(OUTPUT)pmu-events/empty-pmu-events.log
ifeq ($(JEVENTS_ARCH),)
JEVENTS_ARCH=$(SRCARCH)
@ -31,7 +33,15 @@ $(METRIC_TEST_LOG): $(METRIC_TEST_PY) $(METRIC_PY)
$(call rule_mkdir)
$(Q)$(call echo-cmd,test)$(PYTHON) $< 2> $@ || (cat $@ && false)
$(PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG)
$(TEST_EMPTY_PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) none none pmu-events/arch $@
$(EMPTY_PMU_EVENTS_TEST_LOG): $(EMPTY_PMU_EVENTS_C) $(TEST_EMPTY_PMU_EVENTS_C)
$(call rule_mkdir)
$(Q)$(call echo-cmd,test)diff -u $^ 2> $@ || (cat $@ && false)
$(PMU_EVENTS_C): $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(METRIC_PY) $(METRIC_TEST_LOG) $(EMPTY_PMU_EVENTS_TEST_LOG)
$(call rule_mkdir)
$(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(JEVENTS_ARCH) $(JEVENTS_MODEL) pmu-events/arch $@
endif

@ -77,9 +77,6 @@
{
"ArchStdEvent": "OP_RETIRED"
},
{
"ArchStdEvent": "OP_SPEC"
},
{
"PublicDescription": "Operation speculatively executed, NOP",
"EventCode": "0x100",

@ -1,12 +1,22 @@
[
{
"EventCode": "0x1002C",
"EventName": "PM_LD_PREFETCH_CACHE_LINE_MISS",
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a prefetch request."
},
{
"EventCode": "0x200FD",
"EventName": "PM_L1_ICACHE_MISS",
"BriefDescription": "Demand instruction cache miss."
},
{
"EventCode": "0x30068",
"EventName": "PM_L1_ICACHE_RELOADED_PREF",
"BriefDescription": "Counts all instruction cache prefetch reloads (includes demand turned into prefetch)."
},
{
"EventCode": "0x300F4",
"EventName": "PM_RUN_INST_CMPL_CONC",
"BriefDescription": "PowerPC instruction completed by this thread when all threads in the core had the run-latch set."
},
{
"EventCode": "0x400F6",
"EventName": "PM_BR_MPRED_CMPL",
"BriefDescription": "A mispredicted branch completed. Includes direction and target."
}
]

@ -1,4 +1,14 @@
[
{
"EventCode": "0x1505E",
"EventName": "PM_LD_HIT_L1",
"BriefDescription": "Load finished without experiencing an L1 miss."
},
{
"EventCode": "0x100FC",
"EventName": "PM_LD_REF_L1",
"BriefDescription": "All L1 D cache load references counted at finish, gated by reject. In P9 and earlier this event counted only cacheable loads but in P10 both cacheable and non-cacheable loads are included."
},
{
"EventCode": "0x200FE",
"EventName": "PM_DATA_FROM_L2MISS",
@ -9,11 +19,41 @@
"EventName": "PM_DATA_FROM_L3MISS",
"BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss."
},
{
"EventCode": "0x400F0",
"EventName": "PM_LD_DEMAND_MISS_L1_FIN",
"BriefDescription": "Load missed L1, counted at finish time."
},
{
"EventCode": "0x400FE",
"EventName": "PM_DATA_FROM_MEMORY",
"BriefDescription": "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss."
},
{
"EventCode": "0x0000004080",
"EventName": "PM_INST_FROM_L1",
"BriefDescription": "An instruction fetch hit in the L1. Each fetch group contains 8 instructions. The same line can hit 4 times if 32 sequential instructions are fetched."
},
{
"EventCode": "0x000000026080",
"EventName": "PM_L2_LD_MISS",
"BriefDescription": "All successful D-Side Load dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2."
},
{
"EventCode": "0x000000026880",
"EventName": "PM_L2_ST_MISS",
"BriefDescription": "All successful D-Side Store dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2."
},
{
"EventCode": "0x010000046880",
"EventName": "PM_L2_ST_HIT",
"BriefDescription": "All successful D-side store dispatches for this thread that were L2 hits. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2."
},
{
"EventCode": "0x000000036880",
"EventName": "PM_L2_INST_MISS",
"BriefDescription": "All successful instruction (demand and prefetch) dispatches for this thread that missed in the L2. Since the event happens in a 2:1 clock domain and is time-sliced across all 4 threads, the event count should be multiplied by 2."
},
{
"EventCode": "0x000300000000C040",
"EventName": "PM_INST_FROM_L2",

@ -74,19 +74,49 @@
"EventName": "PM_ISSUE_KILL",
"BriefDescription": "Cycles in which an instruction or group of instructions were cancelled after being issued. This event increments once per occurrence, regardless of how many instructions are included in the issue group."
},
{
"EventCode": "0x44054",
"EventName": "PM_VECTOR_LD_CMPL",
"BriefDescription": "Vector load instruction completed."
},
{
"EventCode": "0x44056",
"EventName": "PM_VECTOR_ST_CMPL",
"BriefDescription": "Vector store instruction completed."
},
{
"EventCode": "0x4D05E",
"EventName": "PM_BR_CMPL",
"BriefDescription": "A branch completed. All branches are included."
},
{
"EventCode": "0x4E054",
"EventName": "PM_DTLB_HIT_1G",
"BriefDescription": "Data TLB hit (DERAT reload) page size 1G. Implies radix translation. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "0x400F6",
"EventName": "PM_BR_MPRED_CMPL",
"BriefDescription": "A mispredicted branch completed. Includes direction and target."
},
{
"EventCode": "0x400FC",
"EventName": "PM_ITLB_MISS",
"BriefDescription": "Instruction TLB reload (after a miss), all page sizes. Includes only demand misses."
},
{
"EventCode": "0x00000048B4",
"EventName": "PM_BR_TKN_UNCOND_FIN",
"BriefDescription": "An unconditional branch finished. All unconditional branches are taken."
},
{
"EventCode": "0x00000040B8",
"EventName": "PM_PRED_BR_TKN_COND_DIR",
"BriefDescription": "A conditional branch finished with correctly predicted direction. Resolved taken."
},
{
"EventCode": "0x00000048B8",
"EventName": "PM_PRED_BR_NTKN_COND_DIR",
"BriefDescription": "A conditional branch finished with correctly predicted direction. Resolved not taken."
}
]

@ -4,9 +4,19 @@
"EventName": "PM_STCX_FAIL_FIN",
"BriefDescription": "Conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock."
},
{
"EventCode": "0x2E014",
"EventName": "PM_STCX_FIN",
"BriefDescription": "Conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
"EventCode": "0x4E050",
"EventName": "PM_STCX_PASS_FIN",
"BriefDescription": "Conditional store instruction (STCX) passed. LARX and STCX are instructions used to acquire a lock."
},
{
"EventCode": "0x000000C8B8",
"EventName": "PM_STCX_SUCCESS_CMPL",
"BriefDescription": "STCX instructions that completed successfully. Specifically, counts only when a pass status is returned from the nest."
}
]

@ -69,6 +69,11 @@
"EventName": "PM_XFER_FROM_SRC_PMC3",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
"EventCode": "0x3F04A",
"EventName": "PM_LSU_ST5_FIN",
"BriefDescription": "LSU Finished an internal operation in ST2 port."
},
{
"EventCode": "0x3C054",
"EventName": "PM_DERAT_MISS_16M",
@ -108,5 +113,30 @@
"EventCode": "0x4C05A",
"EventName": "PM_DTLB_MISS_1G",
"BriefDescription": "Data TLB reload (after a miss) page size 1G. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
"EventCode": "0x000000F880",
"EventName": "PM_SNOOP_TLBIE_CYC",
"BriefDescription": "Cycles in which TLBIE snoops are executed in the LSU."
},
{
"EventCode": "0x000000F084",
"EventName": "PM_SNOOP_TLBIE_CACHE_WALK_CYC",
"BriefDescription": "TLBIE snoop cycles in which the data cache is being walked."
},
{
"EventCode": "0x000000F884",
"EventName": "PM_SNOOP_TLBIE_WAIT_ST_CYC",
"BriefDescription": "TLBIE snoop cycles in which older stores are still draining."
},
{
"EventCode": "0x000000F088",
"EventName": "PM_SNOOP_TLBIE_WAIT_LD_CYC",
"BriefDescription": "TLBIE snoop cycles in which older loads are still draining."
},
{
"EventCode": "0x000000F08C",
"EventName": "PM_SNOOP_TLBIE_WAIT_MMU_CYC",
"BriefDescription": "TLBIE snoop cycles in which the Load-Store unit is waiting for the MMU to finish invalidation."
}
]

@ -1,89 +1,19 @@
[
{
"EventCode": "0x1002C",
"EventName": "PM_LD_PREFETCH_CACHE_LINE_MISS",
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a prefetch request."
},
{
"EventCode": "0x1505E",
"EventName": "PM_LD_HIT_L1",
"BriefDescription": "Load finished without experiencing an L1 miss."
},
{
"EventCode": "0x1F056",
"EventName": "PM_DISP_SS0_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 1 or 2 instructions."
},
{
"EventCode": "0x1F05A",
"EventName": "PM_DISP_HELD_SYNC_CYC",
"BriefDescription": "Cycles dispatch is held because of a synchronizing instruction that requires the ICT to be empty before dispatch."
},
{
"EventCode": "0x10066",
"EventName": "PM_ADJUNCT_CYC",
"BriefDescription": "Cycles in which the thread is in Adjunct state. MSR[S HV PR] bits = 011."
},
{
"EventCode": "0x100FC",
"EventName": "PM_LD_REF_L1",
"BriefDescription": "All L1 D cache load references counted at finish, gated by reject. In P9 and earlier this event counted only cacheable loads but in P10 both cacheable and non-cacheable loads are included."
},
{
"EventCode": "0x2E010",
"EventName": "PM_ADJUNCT_INST_CMPL",
"BriefDescription": "PowerPC instruction completed while the thread was in Adjunct state."
},
{
"EventCode": "0x2E014",
"EventName": "PM_STCX_FIN",
"BriefDescription": "Conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
"EventCode": "0x2F054",
"EventName": "PM_DISP_SS1_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 1 or 2 instructions."
},
{
"EventCode": "0x2F056",
"EventName": "PM_DISP_SS1_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 3 or 4 instructions."
},
{
"EventCode": "0x200F2",
"EventName": "PM_INST_DISP",
"BriefDescription": "PowerPC instruction dispatched."
},
{
"EventCode": "0x200FD",
"EventName": "PM_L1_ICACHE_MISS",
"BriefDescription": "Demand instruction cache miss."
},
{
"EventCode": "0x3F04A",
"EventName": "PM_LSU_ST5_FIN",
"BriefDescription": "LSU Finished an internal operation in ST2 port."
},
{
"EventCode": "0x3405A",
"EventName": "PM_PRIVILEGED_INST_CMPL",
"BriefDescription": "PowerPC instruction completed while the thread was in Privileged state."
},
{
"EventCode": "0x3F054",
"EventName": "PM_DISP_SS0_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 3 or 4 instructions."
},
{
"EventCode": "0x3F056",
"EventName": "PM_DISP_SS0_8_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 5, 6, 7 or 8 instructions."
},
{
"EventCode": "0x30068",
"EventName": "PM_L1_ICACHE_RELOADED_PREF",
"BriefDescription": "Counts all instruction cache prefetch reloads (includes demand turned into prefetch)."
},
{
"EventCode": "0x300F6",
"EventName": "PM_LD_DEMAND_MISS_L1",
@ -95,18 +25,48 @@
"BriefDescription": "Counts all instruction cache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch."
},
{
"EventCode": "0x44054",
"EventName": "PM_VECTOR_LD_CMPL",
"BriefDescription": "Vector load instruction completed."
"EventCode": "0x00000038BC",
"EventName": "PM_ISYNC_CMPL",
"BriefDescription": "Isync completion count per thread."
},
{
"EventCode": "0x4D05E",
"EventName": "PM_BR_CMPL",
"BriefDescription": "A branch completed. All branches are included."
"EventCode": "0x000000C088",
"EventName": "PM_LD0_32B_FIN",
"BriefDescription": "256-bit load finished in the LD0 load execution unit."
},
{
"EventCode": "0x400F0",
"EventName": "PM_LD_DEMAND_MISS_L1_FIN",
"BriefDescription": "Load missed L1, counted at finish time."
"EventCode": "0x000000C888",
"EventName": "PM_LD1_32B_FIN",
"BriefDescription": "256-bit load finished in the LD1 load execution unit."
},
{
"EventCode": "0x000000C090",
"EventName": "PM_LD0_UNALIGNED_FIN",
"BriefDescription": "Load instructions in LD0 port that are either unaligned, or treated as unaligned and require an additional recycle through the pipeline using the load gather buffer. This typically adds about 10 cycles to the latency of the instruction. This includes loads that cross the 128 byte boundary, octword loads that are not aligned, and a special forward progress case of a load that does not hit in the L1 and crosses the 32 byte boundary and is launched NTC. Counted at finish time."
},
{
"EventCode": "0x000000C890",
"EventName": "PM_LD1_UNALIGNED_FIN",
"BriefDescription": "Load instructions in LD1 port that are either unaligned, or treated as unaligned and require an additional recycle through the pipeline using the load gather buffer. This typically adds about 10 cycles to the latency of the instruction. This includes loads that cross the 128 byte boundary, octword loads that are not aligned, and a special forward progress case of a load that does not hit in the L1 and crosses the 32 byte boundary and is launched NTC. Counted at finish time."
},
{
"EventCode": "0x000000C0A4",
"EventName": "PM_ST0_UNALIGNED_FIN",
"BriefDescription": "Store instructions in ST0 port that are either unaligned, or treated as unaligned and require an additional recycle through the pipeline. This typically adds about 10 cycles to the latency of the instruction. This only includes stores that cross the 128 byte boundary. Counted at finish time."
},
{
"EventCode": "0x000000C8A4",
"EventName": "PM_ST1_UNALIGNED_FIN",
"BriefDescription": "Store instructions in ST1 port that are either unaligned, or treated as unaligned and require an additional recycle through the pipeline. This typically adds about 10 cycles to the latency of the instruction. This only includes stores that cross the 128 byte boundary. Counted at finish time."
},
{
"EventCode": "0x000000D0B4",
"EventName": "PM_DC_PREF_STRIDED_CONF",
"BriefDescription": "A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software."
},
{
"EventCode": "0x0000004884",
"EventName": "PM_NO_FETCH_IBUF_FULL_CYC",
"BriefDescription": "Cycles in which no instructions are fetched because there is no room in the instruction buffers."
}
]

@ -94,11 +94,21 @@
"EventName": "PM_CMPL_STALL_LWSYNC",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a lwsync waiting to complete."
},
{
"EventCode": "0x1F056",
"EventName": "PM_DISP_SS0_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 1 or 2 instructions."
},
{
"EventCode": "0x1F058",
"EventName": "PM_DISP_HELD_CYC",
"BriefDescription": "Cycles dispatch is held."
},
{
"EventCode": "0x1F05A",
"EventName": "PM_DISP_HELD_SYNC_CYC",
"BriefDescription": "Cycles dispatch is held because of a synchronizing instruction that requires the ICT to be empty before dispatch."
},
{
"EventCode": "0x10064",
"EventName": "PM_DISP_STALL_IC_L2",
@ -229,6 +239,16 @@
"EventName": "PM_NTC_FIN",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline (NTC) finishes. Note that instructions can finish out of order, therefore not all the instructions that finish have a Next-to-complete status."
},
{
"EventCode": "0x2F054",
"EventName": "PM_DISP_SS1_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 1 or 2 instructions."
},
{
"EventCode": "0x2F056",
"EventName": "PM_DISP_SS1_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 3 or 4 instructions."
},
{
"EventCode": "0x20066",
"EventName": "PM_DISP_HELD_OTHER_CYC",
@ -329,6 +349,16 @@
"EventName": "PM_DISP_STALL_IC_L3",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L3."
},
{
"EventCode": "0x3F054",
"EventName": "PM_DISP_SS0_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 3 or 4 instructions."
},
{
"EventCode": "0x3F056",
"EventName": "PM_DISP_SS0_8_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 5, 6, 7 or 8 instructions."
},
{
"EventCode": "0x30060",
"EventName": "PM_DISP_HELD_XVFC_MAPPER_CYC",
@ -458,5 +488,20 @@
"EventCode": "0x400F8",
"EventName": "PM_FLUSH",
"BriefDescription": "Flush (any type)."
},
{
"EventCode": "0x0B0000016080",
"EventName": "PM_L2_TLBIE_SLBIE_START",
"BriefDescription": "NCU Master received a TLBIE/SLBIEG/SLBIAG operation from the core. Event count should be multiplied by 2 since the data is coming from a 2:1 clock domain and the data is time sliced across all 4 threads."
},
{
"EventCode": "0x0B0000016880",
"EventName": "PM_L2_TLBIE_SLBIE_DELAY",
"BriefDescription": "Cycles when a TLBIE/SLBIEG/SLBIAG command was held in a hottemp condition by the NCU Master. Multiply this count by 1000 to obtain the total number of cycles. This can be divided by PM_L2_TLBIE_SLBIE_SENT to obtain the average time a TLBIE/SLBIEG/SLBIAG command was held. Event count should be multiplied by 2 since the data is coming from a 2:1 clock domain and the data is time sliced across all 4 threads."
},
{
"EventCode": "0x0B0000026880",
"EventName": "PM_L2_SNP_TLBIE_SLBIE_DELAY",
"BriefDescription": "Cycles when a TLBIE/SLBIEG/SLBIAG that targets this thread's LPAR was in flight while in a hottemp condition. Multiply this count by 1000 to obtain the total number of cycles. This can be divided by PM_L2_SNP_TLBIE_SLBIE_START to obtain the overall efficiency. Note: 'inflight' means SnpTLB has been sent to core(ie doesn't include when SnpTLB is in NCU waiting to be launched serially behind different SnpTLB). The NCU Snooper gets in a 'hottemp' delay window when it detects it is above its TLBIE/SLBIE threshold for process SnpTLBIE/SLBIE with this core. Event count should be multiplied by 2 since the data is coming from a 2:1 clock domain and the data is time sliced across all 4 threads."
}
]

@ -104,6 +104,11 @@
"EventName": "PM_RUN_CYC",
"BriefDescription": "Processor cycles gated by the run latch."
},
{
"EventCode": "0x200F8",
"EventName": "PM_EXT_INT",
"BriefDescription": "Cycles an external interrupt was active."
},
{
"EventCode": "0x30010",
"EventName": "PM_PMC2_OVERFLOW",
@ -124,6 +129,11 @@
"EventName": "PM_PMC6_OVERFLOW",
"BriefDescription": "The event selected for PMC6 caused the event counter to overflow."
},
{
"EventCode": "0x3405A",
"EventName": "PM_PRIVILEGED_INST_CMPL",
"BriefDescription": "PowerPC instruction completed while the thread was in Privileged state."
},
{
"EventCode": "0x3006C",
"EventName": "PM_RUN_CYC_SMT2_MODE",

@ -4577,7 +4577,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : CRds issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4588,7 +4588,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : DRds issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4599,7 +4599,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4609,7 +4609,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4619,7 +4619,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : LLCPrefRFO issued by iA Cores that hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4630,7 +4630,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : RFOs issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4651,7 +4651,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : CRds issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4662,7 +4662,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : DRds issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4673,7 +4673,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -4683,7 +4683,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -4693,7 +4693,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : LLCPrefRFO issued by iA Cores that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4704,7 +4704,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : RFOs issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4747,7 +4747,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_ITOM",
"Experimental": "1",
"Filter": "config1=0x49033",
"Filter": "config1=0x4903300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO ItoM requests that miss the LLC. An ItoM request is used by IIO to request a data write without first reading the data for ownership.",
"UMask": "0x24",
@ -4759,7 +4759,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_RDCUR",
"Experimental": "1",
"Filter": "config1=0x43C33",
"Filter": "config1=0x43c3300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO RdCur requests and miss the LLC. A RdCur request is used by IIO to read data without changing state.",
"UMask": "0x24",
@ -4771,7 +4771,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_RFO",
"Experimental": "1",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO RFO requests that miss the LLC. A read for ownership (RFO) requests a cache line to be cached in E state with the intent to modify.",
"UMask": "0x24",
@ -4999,7 +4999,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -5010,7 +5010,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -5021,7 +5021,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -5031,7 +5031,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -5041,7 +5041,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -5052,7 +5052,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -5073,7 +5073,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -5084,7 +5084,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -5095,7 +5095,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -5105,7 +5105,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -5115,7 +5115,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -5126,7 +5126,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -5171,7 +5171,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM",
"Experimental": "1",
"Filter": "config1=0x49033",
"Filter": "config1=0x4903300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO ItoM requests that miss the LLC. An ItoM is used by IIO to request a data write without first reading the data for ownership.",
"UMask": "0x24",
@ -5183,7 +5183,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_RDCUR",
"Experimental": "1",
"Filter": "config1=0x43C33",
"Filter": "config1=0x43c3300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO RdCur requests that miss the LLC. A RdCur request is used by IIO to read data without changing state.",
"UMask": "0x24",
@ -5195,7 +5195,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_RFO",
"Experimental": "1",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO RFO requests that miss the LLC. A read for ownership (RFO) requests data to be cached in E state with the intent to modify.",
"UMask": "0x24",


@ -0,0 +1,142 @@
{
"Backend": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Bad": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BadSpec": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BigFootprint": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BrMispredicts": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Branches": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvBC": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvBO": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvCB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvFB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvIO": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvMB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvML": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvMP": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvMS": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvMT": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvOB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"BvUW": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"C0Wait": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"CacheHits": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"CacheMisses": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"CodeGen": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Compute": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Cor": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"DSB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"DSBmiss": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"DataSharing": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Fed": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"FetchBW": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"FetchLat": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Flops": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"FpScalar": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"FpVector": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Frontend": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"HPC": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"IcMiss": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Ifetch": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"InsType": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"IntVector": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"L2Evicts": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"LSD": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Load_Store_Miss": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MachineClears": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Machine_Clears": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Mem": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MemOffcore": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Mem_Exec": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MemoryBW": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MemoryBound": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MemoryLat": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MemoryTLB": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Memory_BW": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Memory_Lat": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"MicroSeq": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"OS": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Offcore": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"PGO": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Pipeline": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"PortsUtil": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Power": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Prefetches": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Ret": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Retire": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"SMT": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Server": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Snoop": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"SoC": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"Summary": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"TmaL1": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"TmaL2": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"TmaL3mem": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"TopdownL1": "Metrics for top-down breakdown at level 1",
"TopdownL2": "Metrics for top-down breakdown at level 2",
"TopdownL3": "Metrics for top-down breakdown at level 3",
"TopdownL4": "Metrics for top-down breakdown at level 4",
"TopdownL5": "Metrics for top-down breakdown at level 5",
"TopdownL6": "Metrics for top-down breakdown at level 6",
"load_store_bound": "Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet",
"tma_L1_group": "Metrics for top-down breakdown at level 1",
"tma_L2_group": "Metrics for top-down breakdown at level 2",
"tma_L3_group": "Metrics for top-down breakdown at level 3",
"tma_L4_group": "Metrics for top-down breakdown at level 4",
"tma_L5_group": "Metrics for top-down breakdown at level 5",
"tma_L6_group": "Metrics for top-down breakdown at level 6",
"tma_alu_op_utilization_group": "Metrics contributing to tma_alu_op_utilization category",
"tma_assists_group": "Metrics contributing to tma_assists category",
"tma_backend_bound_group": "Metrics contributing to tma_backend_bound category",
"tma_bad_speculation_group": "Metrics contributing to tma_bad_speculation category",
"tma_branch_mispredicts_group": "Metrics contributing to tma_branch_mispredicts category",
"tma_branch_resteers_group": "Metrics contributing to tma_branch_resteers category",
"tma_core_bound_group": "Metrics contributing to tma_core_bound category",
"tma_dram_bound_group": "Metrics contributing to tma_dram_bound category",
"tma_dtlb_load_group": "Metrics contributing to tma_dtlb_load category",
"tma_dtlb_store_group": "Metrics contributing to tma_dtlb_store category",
"tma_fetch_bandwidth_group": "Metrics contributing to tma_fetch_bandwidth category",
"tma_fetch_latency_group": "Metrics contributing to tma_fetch_latency category",
"tma_fp_arith_group": "Metrics contributing to tma_fp_arith category",
"tma_fp_vector_group": "Metrics contributing to tma_fp_vector category",
"tma_frontend_bound_group": "Metrics contributing to tma_frontend_bound category",
"tma_heavy_operations_group": "Metrics contributing to tma_heavy_operations category",
"tma_ifetch_bandwidth_group": "Metrics contributing to tma_ifetch_bandwidth category",
"tma_ifetch_latency_group": "Metrics contributing to tma_ifetch_latency category",
"tma_int_operations_group": "Metrics contributing to tma_int_operations category",
"tma_issue2P": "Metrics related by the issue $issue2P",
"tma_issueBM": "Metrics related by the issue $issueBM",
"tma_issueBW": "Metrics related by the issue $issueBW",
"tma_issueComp": "Metrics related by the issue $issueComp",
"tma_issueD0": "Metrics related by the issue $issueD0",
"tma_issueFB": "Metrics related by the issue $issueFB",
"tma_issueFL": "Metrics related by the issue $issueFL",
"tma_issueL1": "Metrics related by the issue $issueL1",
"tma_issueLat": "Metrics related by the issue $issueLat",
"tma_issueMC": "Metrics related by the issue $issueMC",
"tma_issueMS": "Metrics related by the issue $issueMS",
"tma_issueMV": "Metrics related by the issue $issueMV",
"tma_issueRFO": "Metrics related by the issue $issueRFO",
"tma_issueSL": "Metrics related by the issue $issueSL",
"tma_issueSO": "Metrics related by the issue $issueSO",
"tma_issueSmSt": "Metrics related by the issue $issueSmSt",
"tma_issueSpSt": "Metrics related by the issue $issueSpSt",
"tma_issueSyncxn": "Metrics related by the issue $issueSyncxn",
"tma_issueTLB": "Metrics related by the issue $issueTLB",
"tma_l1_bound_group": "Metrics contributing to tma_l1_bound category",
"tma_l3_bound_group": "Metrics contributing to tma_l3_bound category",
"tma_light_operations_group": "Metrics contributing to tma_light_operations category",
"tma_load_op_utilization_group": "Metrics contributing to tma_load_op_utilization category",
"tma_machine_clears_group": "Metrics contributing to tma_machine_clears category",
"tma_mem_latency_group": "Metrics contributing to tma_mem_latency category",
"tma_memory_bound_group": "Metrics contributing to tma_memory_bound category",
"tma_microcode_sequencer_group": "Metrics contributing to tma_microcode_sequencer category",
"tma_mite_group": "Metrics contributing to tma_mite category",
"tma_other_light_ops_group": "Metrics contributing to tma_other_light_ops category",
"tma_ports_utilization_group": "Metrics contributing to tma_ports_utilization category",
"tma_ports_utilized_0_group": "Metrics contributing to tma_ports_utilized_0 category",
"tma_ports_utilized_3m_group": "Metrics contributing to tma_ports_utilized_3m category",
"tma_resource_bound_group": "Metrics contributing to tma_resource_bound category",
"tma_retiring_group": "Metrics contributing to tma_retiring category",
"tma_serializing_operation_group": "Metrics contributing to tma_serializing_operation category",
"tma_store_bound_group": "Metrics contributing to tma_store_bound category",
"tma_store_op_utilization_group": "Metrics contributing to tma_store_op_utilization category"
}

File diff suppressed because it is too large


@ -4454,7 +4454,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : CRds issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4465,7 +4465,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : DRds issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4476,7 +4476,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4486,7 +4486,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4496,7 +4496,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : LLCPrefRFO issued by iA Cores that hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4507,7 +4507,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : RFOs issued by iA Cores that Hit the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4528,7 +4528,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : CRds issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4539,7 +4539,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : DRds issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4550,7 +4550,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -4560,7 +4560,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@ -4570,7 +4570,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : LLCPrefRFO issued by iA Cores that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4581,7 +4581,7 @@
"Counter": "0,1,2,3",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Inserts : RFOs issued by iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4624,7 +4624,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_ITOM",
"Experimental": "1",
"Filter": "config1=0x49033",
"Filter": "config1=0x4903300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO ItoM requests that miss the LLC. An ItoM request is used by IIO to request a data write without first reading the data for ownership.",
"UMask": "0x24",
@ -4636,7 +4636,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_RDCUR",
"Experimental": "1",
"Filter": "config1=0x43C33",
"Filter": "config1=0x43c3300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO RdCur requests and miss the LLC. A RdCur request is used by IIO to read data without changing state.",
"UMask": "0x24",
@ -4648,7 +4648,7 @@
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_RFO",
"Experimental": "1",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries successfully inserted into the TOR that are generated from local IO RFO requests that miss the LLC. A read for ownership (RFO) requests a cache line to be cached in E state with the intent to modify.",
"UMask": "0x24",
@ -4865,7 +4865,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4876,7 +4876,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4887,7 +4887,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4897,7 +4897,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefDRD",
"Filter": "config1=0x4b433",
"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x11",
"Unit": "CHA"
@ -4907,7 +4907,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefRFO",
"Filter": "config1=0x4b033",
"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4918,7 +4918,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_RFO",
"Filter": "config1=0x40033",
"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x11",
@ -4939,7 +4939,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CRD",
"Filter": "config1=0x40233",
"Filter": "config1=0x4023300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4950,7 +4950,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD",
"Filter": "config1=0x40433",
"Filter": "config1=0x4043300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@ -4961,7 +4961,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefCRD",
"Filter": "config1=0x4b233",
"Filter": "config1=0x4b23300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@@ -4971,7 +4971,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefDRD",
-"Filter": "config1=0x4b433",
+"Filter": "config1=0x4b43300000000",
"PerPkg": "1",
"UMask": "0x21",
"Unit": "CHA"
@@ -4981,7 +4981,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefRFO",
-"Filter": "config1=0x4b033",
+"Filter": "config1=0x4b03300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@@ -4992,7 +4992,7 @@
"Counter": "0",
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO",
-"Filter": "config1=0x40033",
+"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
"UMask": "0x21",
@@ -5037,7 +5037,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM",
"Experimental": "1",
-"Filter": "config1=0x49033",
+"Filter": "config1=0x4903300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO ItoM requests that miss the LLC. An ItoM is used by IIO to request a data write without first reading the data for ownership.",
"UMask": "0x24",
@@ -5049,7 +5049,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_RDCUR",
"Experimental": "1",
-"Filter": "config1=0x43C33",
+"Filter": "config1=0x43c3300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO RdCur requests that miss the LLC. A RdCur request is used by IIO to read data without changing state.",
"UMask": "0x24",
@@ -5061,7 +5061,7 @@
"EventCode": "0x36",
"EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_RFO",
"Experimental": "1",
-"Filter": "config1=0x40033",
+"Filter": "config1=0x4003300000000",
"PerPkg": "1",
"PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that are generated from local IO RFO requests that miss the LLC. A read for ownership (RFO) requests data to be cached in E state with the intent to modify.",
"UMask": "0x24",

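Every Filter change in the hunks above follows the same pattern: the new config1 value is the old value shifted left by 32 bits. The following is a minimal Python check of that observation over the old/new pairs copied from these hunks. The helper name `shifted` is hypothetical, and the reason the filter bits belong in the upper half of config1 is not stated in this diff.

```python
# Old and new "Filter" config1 values, copied from the hunks above.
PAIRS = [
    (0x40033, 0x4003300000000),
    (0x40233, 0x4023300000000),
    (0x40433, 0x4043300000000),
    (0x4b233, 0x4b23300000000),
    (0x4b433, 0x4b43300000000),
    (0x4b033, 0x4b03300000000),
    (0x49033, 0x4903300000000),
    (0x43C33, 0x43c3300000000),
]


def shifted(old: int, bits: int = 32) -> int:
    """Hypothetical helper: move a filter value into the upper bits."""
    return old << bits


# True when every new value equals its old value shifted left by 32 bits.
all_match = all(shifted(old) == new for old, new in PAIRS)
print(all_match)
```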

@@ -1,61 +1,4 @@
[
-{
-"BriefDescription": "MMIO reads. Derived from unc_cha_tor_inserts.ia_miss",
-"Counter": "0,1,2,3",
-"EventCode": "0x35",
-"EventName": "LLC_MISSES.MMIO_READ",
-"Filter": "config1=0x40040e33",
-"PerPkg": "1",
-"PublicDescription": "TOR Inserts : All requests from iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
-"UMask": "0xc001fe01",
-"Unit": "CHA"
-},
-{
-"BriefDescription": "MMIO writes. Derived from unc_cha_tor_inserts.ia_miss",
-"Counter": "0,1,2,3",
-"EventCode": "0x35",
-"EventName": "LLC_MISSES.MMIO_WRITE",
-"Filter": "config1=0x40041e33",
-"PerPkg": "1",
-"PublicDescription": "TOR Inserts : All requests from iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
-"UMask": "0xc001fe01",
-"Unit": "CHA"
-},
-{
-"BriefDescription": "LLC misses - Uncacheable reads (from cpu) . Derived from unc_cha_tor_inserts.ia_miss",
-"Counter": "0,1,2,3",
-"EventCode": "0x35",
-"EventName": "LLC_MISSES.UNCACHEABLE",
-"Filter": "config1=0x40e33",
-"PerPkg": "1",
-"PublicDescription": "TOR Inserts : All requests from iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
-"UMask": "0xc001fe01",
-"Unit": "CHA"
-},
-{
-"BriefDescription": "Streaming stores (full cache line). Derived from unc_cha_tor_inserts.ia_miss",
-"Counter": "0,1,2,3",
-"EventCode": "0x35",
-"EventName": "LLC_REFERENCES.STREAMING_FULL",
-"Filter": "config1=0x41833",
-"PerPkg": "1",
-"PublicDescription": "TOR Inserts : All requests from iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
-"ScaleUnit": "64Bytes",
-"UMask": "0xc001fe01",
-"Unit": "CHA"
-},
-{
-"BriefDescription": "Streaming stores (partial cache line). Derived from unc_cha_tor_inserts.ia_miss",
-"Counter": "0,1,2,3",
-"EventCode": "0x35",
-"EventName": "LLC_REFERENCES.STREAMING_PARTIAL",
-"Filter": "config1=0x41a33",
-"PerPkg": "1",
-"PublicDescription": "TOR Inserts : All requests from iA Cores that Missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
-"ScaleUnit": "64Bytes",
-"UMask": "0xc001fe01",
-"Unit": "CHA"
-},
{
"BriefDescription": "CMS Agent0 AD Credits Acquired : For Transgress 0",
"Counter": "0,1,2,3",

File diff suppressed because it is too large


@@ -503,8 +503,11 @@ def print_pending_events() -> None:
first = True
last_pmu = None
+last_name = None
pmus = set()
for event in sorted(_pending_events, key=event_cmp_key):
+if last_pmu and last_pmu == event.pmu:
+assert event.name != last_name, f"Duplicate event: {last_pmu}/{last_name}/ in {_pending_events_tblname}"
if event.pmu != last_pmu:
if not first:
_args.output_file.write('};\n')
@@ -516,6 +519,7 @@ def print_pending_events() -> None:
pmus.add((event.pmu, pmu_name))
_args.output_file.write(event.to_c_string(metric=False))
+last_name = event.name
_pending_events = []
_args.output_file.write(f"""
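The duplicate check added to print_pending_events() relies on the events being sorted, so two events with the same PMU and name end up next to each other. The following is a minimal, self-contained sketch of that idea. The `Event` tuple and `check_no_duplicates` are hypothetical stand-ins, and plain tuple ordering is assumed in place of `event_cmp_key`, whose definition is not shown in this diff.

```python
from typing import NamedTuple, Optional, Sequence


class Event(NamedTuple):
    """Hypothetical stand-in for the event objects used by jevents.py."""
    pmu: str
    name: str


def check_no_duplicates(events: Sequence[Event]) -> None:
    """Assert that no (pmu, name) pair appears twice.

    Sorting by (pmu, name) makes duplicates adjacent, so comparing each
    event against the previous one is enough.
    """
    last_pmu: Optional[str] = None
    last_name: Optional[str] = None
    for event in sorted(events):
        if last_pmu and last_pmu == event.pmu:
            assert event.name != last_name, \
                f"Duplicate event: {last_pmu}/{last_name}/"
        last_pmu = event.pmu
        last_name = event.name
```

The same name on two different PMUs passes the check, since the comparison only runs while the PMU is unchanged.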
@@ -631,14 +635,17 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
"""Process a JSON file during the main walk."""
-def is_leaf_dir(path: str) -> bool:
+def is_leaf_dir_ignoring_sys(path: str) -> bool:
for item in os.scandir(path):
-if item.is_dir():
+if item.is_dir() and item.name != 'sys':
return False
return True
-# model directory, reset topic
-if item.is_dir() and is_leaf_dir(item.path):
+# Model directories are leaves (ignoring possible sys
+# directories). The FTW will walk into the directory next. Flush
+# pending events and metrics and update the table names for the new
+# model directory.
+if item.is_dir() and is_leaf_dir_ignoring_sys(item.path):
print_pending_events()
print_pending_metrics()
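The effect of ignoring 'sys' subdirectories can be illustrated with a throwaway directory tree. This is a sketch under assumptions: the directory and file names below are hypothetical, and the function body mirrors the one in the hunk above except that os.scandir() is used as a context manager so the iterator is closed on early return.

```python
import os
import tempfile


def is_leaf_dir_ignoring_sys(path: str) -> bool:
    """Return True if path has no subdirectories other than 'sys'."""
    with os.scandir(path) as entries:
        for item in entries:
            if item.is_dir() and item.name != 'sys':
                return False
    return True


with tempfile.TemporaryDirectory() as arch_dir:
    # Hypothetical layout: <arch>/<model>/sys plus one JSON file.
    model_dir = os.path.join(arch_dir, "model")
    os.makedirs(os.path.join(model_dir, "sys"))
    open(os.path.join(model_dir, "cache.json"), "w").close()

    # The model directory only contains 'sys', so it counts as a leaf.
    model_is_leaf = is_leaf_dir_ignoring_sys(model_dir)
    # The arch directory contains the model directory, so it does not.
    arch_is_leaf = is_leaf_dir_ignoring_sys(arch_dir)
```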
@@ -906,7 +913,7 @@ static int pmu_events_table__find_event_pmu(const struct pmu_events_table *table
do_call:
return fn ? fn(&pe, table, data) : 0;
}
-return -1000;
+return PMU_EVENTS__NOT_FOUND;
}
int pmu_events_table__for_each_event(const struct pmu_events_table *table,
@@ -944,10 +951,10 @@ int pmu_events_table__find_event(const struct pmu_events_table *table,
continue;
ret = pmu_events_table__find_event_pmu(table, table_pmu, name, fn, data);
-if (ret != -1000)
+if (ret != PMU_EVENTS__NOT_FOUND)
return ret;
}
-return -1000;
+return PMU_EVENTS__NOT_FOUND;
}
size_t pmu_events_table__num_events(const struct pmu_events_table *table,
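The two hunks above replace the magic number -1000 with a named sentinel so that "not found" can be told apart from a callback's own return value. The control flow can be sketched in Python as follows. This is an illustration only: the set-based tables and both function names are hypothetical, and the sentinel's value is assumed to equal the literal it replaces.

```python
from typing import Callable, Iterable, Optional, Set

# Assumed to keep the value of the literal replaced in the hunks above.
PMU_EVENTS__NOT_FOUND = -1000


def find_event_pmu(table: Set[str], name: str,
                   fn: Optional[Callable[[str], int]]) -> int:
    """Look up name in one per-PMU table (hypothetical set of names)."""
    if name in table:
        return fn(name) if fn else 0
    return PMU_EVENTS__NOT_FOUND


def find_event(tables: Iterable[Set[str]], name: str,
               fn: Optional[Callable[[str], int]]) -> int:
    """Try each table; stop at the first result that is not the sentinel."""
    for table in tables:
        ret = find_event_pmu(table, name, fn)
        if ret != PMU_EVENTS__NOT_FOUND:
            return ret
    return PMU_EVENTS__NOT_FOUND
```

Because the callback's result is returned unchanged, a callback that reports an error with some other negative value still stops the search, and only the sentinel means "keep looking".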
@@ -1256,6 +1263,10 @@ such as "arm/cortex-a34".''',
'output_file', type=argparse.FileType('w', encoding='utf-8'), nargs='?', default=sys.stdout)
_args = ap.parse_args()
+_args.output_file.write(f"""
+/* SPDX-License-Identifier: GPL-2.0 */
+/* THIS FILE WAS AUTOGENERATED BY jevents.py arch={_args.arch} model={_args.model} ! */
+""")
_args.output_file.write("""
#include <pmu-events/pmu-events.h>
#include "util/header.h"
@@ -1281,7 +1292,7 @@ struct pmu_table_entry {
if item.name == _args.arch or _args.arch == 'all' or item.name == 'test':
archs.append(item.name)
-if len(archs) < 2:
+if len(archs) < 2 and _args.arch != 'none':
raise IOError(f'Missing architecture directory \'{_args.arch}\'')
archs.sort()

Some files were not shown because too many files have changed in this diff