perf tools improvements and fixes for v6.8:
- Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns
  processing patches, switching roles from perf-tools to perf-tools-next
  at each Linux release.

Data profiling:

- Associate samples that identify loads and stores with data
  structures. This uses events available on Intel, AMD and others and
  DWARF info:

    # To get memory access samples in kernel for 1 second (on Intel)
    $ perf mem record -a -K --ldlat=4 -- sleep 1

    # Similar for AMD (but it requires a 6.3+ kernel for BPF filters)
    $ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1

  Then, amongst several modes of post processing, one can do things like:

    $ perf report -s type,typeoff --hierarchy --group --stdio
    ...
    #
    # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
    # Event count (approx.): 602758064
    #
    #                    Overhead  Data Type / Data Type Offset
    # ...........................  ............................
    #
        26.09%   3.28%   0.00%     long unsigned int
           26.09%   3.28%   0.00%     long unsigned int +0 (no field)
        18.48%   0.73%   0.00%     struct page
           10.83%   0.02%   0.00%     struct page +8 (lru.next)
            3.90%   0.28%   0.00%     struct page +0 (flags)
            3.45%   0.06%   0.00%     struct page +24 (mapping)
            0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
            0.02%   0.06%   0.00%     struct page +32 (index)
            0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
            0.02%   0.01%   0.00%     struct page +56 (memcg_data)
            0.00%   0.01%   0.00%     struct page +16 (lru.prev)
        15.37%  17.54%   0.00%     (stack operation)
           15.37%  17.54%   0.00%     (stack operation) +0 (no field)
        11.71%  50.27%   0.00%     (unknown)
           11.71%  50.27%   0.00%     (unknown) +0 (no field)

    $ perf annotate --data-type
    ...
    Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
    ============================================================================
        samples     offset       size  field
             13          0        640  struct cfs_rq            {
              2          0         16      struct load_weight   load {
              2          0          8          unsigned long    weight;
              0          8          4          u32              inv_weight;
                                           };
              0         16          8      unsigned long        runnable_weight;
              0         24          4      unsigned int         nr_running;
              1         28          4      unsigned int         h_nr_running;
    ...

    $ perf annotate --data-type=page --group
    Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
     event[0] = cpu/mem-loads,ldlat=4/P
     event[1] = cpu/mem-stores/P
     event[2] = dummy:u
    ===================================================================================
     samples  offset  size  field
     447  33   0       0    64  struct page    {
     108   8   0       0     8      long unsigned int      flags;
     319  13   0       8    40      union  {
     319  13   0       8    40          struct  {
     236   2   0       8    16              union  {
     236   2   0       8    16                  struct list_head   lru {
     236   1   0       8     8                      struct list_head*  next;
       0   1   0      16     8                      struct list_head*  prev;
                                                 };
     236   2   0       8    16                  struct  {
     236   1   0       8     8                      void*              __filler;
       0   1   0      16     4                      unsigned int       mlock_count;
                                                 };
     236   2   0       8    16                  struct list_head   buddy_list {
     236   1   0       8     8                      struct list_head*  next;
       0   1   0      16     8                      struct list_head*  prev;
                                                 };
     236   2   0       8    16                  struct list_head   pcp_list {
     236   1   0       8     8                      struct list_head*  next;
       0   1   0      16     8                      struct list_head*  prev;
                                                 };
                                             };
      82   4   0      24     8          struct address_space*  mapping;
       1   7   0      32     8          union  {
       1   7   0      32     8              long unsigned int  index;
       1   7   0      32     8              long unsigned int  share;
                                         };
       0   0   0      40     8          long unsigned int      private;
                                     };

  This uses the existing annotate code, calling objdump to do the
  disassembly, with improvements to avoid having this take too long;
  longer term, a switch to a disassembler library, possibly reusing
  code in the kernel, will be pursued.

  This is the initial implementation, please use it and report
  impressions and bugs. Make sure the kernel-debuginfo packages match
  the running kernel; the 'perf report' phase for non-short perf.data
  files may take a while.

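  As a quick way to try the new sort keys on a userspace target, a toy
  workload like the one below (illustrative only, not part of this
  series) concentrates its memory traffic on two fields of one struct;
  built with -g so DWARF is available and recorded with 'perf mem
  record', most samples should be attributed to 'struct record' +0 and
  +8 by 'perf report -s type,typeoff':

    /*
     * Hypothetical example workload for data-type profiling:
     *
     *   gcc -g -O2 -o hotstruct hotstruct.c
     *   perf mem record --ldlat=4 -- ./hotstruct   (Intel; on AMD use the
     *                                               BPF --filter form above)
     *   perf report -s type,typeoff --hierarchy --stdio
     */
    #include <stdio.h>
    #include <stdlib.h>

    struct record {
        long key;              /* offset 0  */
        long count;            /* offset 8  */
        char payload[48];      /* offset 16, rarely touched */
    };

    #define NRECS (1 << 20)

    int main(void)
    {
        struct record *recs = calloc(NRECS, sizeof(*recs));
        long sum = 0;

        if (!recs)
            return 1;

        /* Touch only two fields so their type offsets dominate the profile. */
        for (int pass = 0; pass < 200; pass++) {
            for (int i = 0; i < NRECS; i++) {
                recs[i].key = i;            /* store at +0      */
                recs[i].count += i & 7;     /* load+store at +8 */
                sum += recs[i].count;       /* load at +8       */
            }
        }

        printf("sum=%ld\n", sum);
        free(recs);
        return 0;
    }
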
There is a great article about it on LWN:

    https://lwn.net/Articles/955709/ - "Data-type profiling for perf"

One last test I did while writing this text, on an AMD Ryzen 5950X,
using a distro kernel, while doing a simple 'find /' on an otherwise
idle system, resulted in:

    # uname -r
    6.6.9-100.fc38.x86_64

    # perf -vv | grep BPF_
                 bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
       bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL

    # rpm -qa | grep kernel-debuginfo
    kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
    kernel-debuginfo-6.6.9-100.fc38.x86_64

    # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]

    # ls -la perf.data
    -rw-------. 1 root root 2346486 Jan  9 18:36 perf.data

    # perf evlist
    ibs_op//
    dummy:u

    # perf evlist -v
    ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
    dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1

    # perf report -s type,typeoff --hierarchy --group --stdio
    # Total Lost Samples: 0
    #
    # Samples: 2K of events 'ibs_op//, dummy:u'
    # Event count (approx.): 1904553038
    #
    #            Overhead  Data Type / Data Type Offset
    # ...................  ............................
    #
        73.70%   0.00%     (unknown)
           73.70%   0.00%     (unknown) +0 (no field)
         3.01%   0.00%     long unsigned int
            3.00%   0.00%     long unsigned int +0 (no field)
            0.01%   0.00%     long unsigned int +2 (no field)
         2.73%   0.00%     struct task_struct
            1.71%   0.00%     struct task_struct +52 (on_cpu)
            0.38%   0.00%     struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
            0.23%   0.00%     struct task_struct +2100 (rcu_read_lock_nesting)
            0.14%   0.00%     struct task_struct +2384 ()
            0.06%   0.00%     struct task_struct +3096 (signal)
            0.05%   0.00%     struct task_struct +3616 (cgroups)
            0.05%   0.00%     struct task_struct +2344 (active_mm)
            0.02%   0.00%     struct task_struct +46 (flags)
            0.02%   0.00%     struct task_struct +2096 (migration_disabled)
            0.01%   0.00%     struct task_struct +24 (__state)
            0.01%   0.00%     struct task_struct +3956 (mm_cid_active)
            0.01%   0.00%     struct task_struct +1048 (cpus_ptr)
            0.01%   0.00%     struct task_struct +184 (se.group_node.next)
            0.01%   0.00%     struct task_struct +20 (thread_info.cpu)
            0.00%   0.00%     struct task_struct +104 (on_rq)
            0.00%   0.00%     struct task_struct +2456 (pid)
         1.36%   0.00%     struct module
            0.59%   0.00%     struct module +952 (kallsyms)
            0.42%   0.00%     struct module +0 (state)
            0.23%   0.00%     struct module +8 (list.next)
            0.12%   0.00%     struct module +216 (syms)
         0.95%   0.00%     struct inode
            0.41%   0.00%     struct inode +40 (i_sb)
            0.22%   0.00%     struct inode +0 (i_mode)
            0.06%   0.00%     struct inode +76 (i_rdev)
            0.06%   0.00%     struct inode +56 (i_security)
    <SNIP>

perf top/report:

- Don't ignore job control, allowing control+Z + bg to work.

- Add s390 raw data interpretation for PAI (Processor Activity
  Instrumentation) counters.

perf archive:

- Add new option '--all' to pack perf.data with DSOs.

- Add new option '--unpack' to expand tarballs.

Initialization speedups:

- Lazily initialize zstd streams to save memory when not using them.

- Lazily allocate/size the mmap event copy buffer (a sketch of this
  grow-on-demand pattern follows below).

- Lazy load kernel symbols in 'perf record'.

- Be lazier in allocating the lost samples buffer in 'perf record'.

- Don't synthesize BPF events when disabled via the command line
  (perf record --no-bpf-event).

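  The mmap event copy mentioned above used to be a fixed-size array
  embedded in every struct perf_mmap; it is now a pointer plus a size
  that is only grown when a wrapped event actually needs copying. A
  standalone sketch of that grow-on-demand pattern (the names here are
  illustrative, not the exact tools/lib/perf ones):

    #include <stdlib.h>

    struct copy_buf {
        void   *data;   /* NULL until the first wrapped event shows up  */
        size_t  size;   /* current allocation; grows but never shrinks  */
    };

    /* Return a buffer of at least 'need' bytes, reallocating only when required. */
    static void *copy_buf__reserve(struct copy_buf *buf, size_t need)
    {
        void *tmp;

        if (need <= buf->size)
            return buf->data;

        tmp = realloc(buf->data, need);
        if (!tmp)
            return NULL;    /* the old allocation is still valid */

        buf->data = tmp;
        buf->size = need;
        return buf->data;
    }

    static void copy_buf__exit(struct copy_buf *buf)
    {
        free(buf->data);
        buf->data = NULL;
        buf->size = 0;
    }
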
Assorted improvements:

- Show a note on AMD systems that the :p, :pp, :ppp and :P modifiers
  are all the same, as IBS (Instruction Based Sampling) is used and it
  is inherently precise, not having levels of precision like on Intel
  systems.

- When 'cycles' isn't available, fall back to the "task-clock" event
  when not system wide, not to 'cpu-clock'.

- Add a --debug-file option to redirect debug output, e.g.:

    $ perf --debug-file /tmp/perf.log record -v true

- Shrink 'struct map' to under one cacheline by avoiding function
  pointers for selecting whether addresses are identity or DSO
  relative, and by using just a byte for some boolean struct members
  (see the sketch below).

- Resolve the arch specific strerrno just once to use in
  perf_env__arch_strerrno().

- Reduce memory for recording the PERF_RECORD_LOST_SAMPLES event.

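  The 'struct map' shrink is roughly the transformation sketched below:
  per-map function pointers become a small one-byte selector, and wider
  flag members become single bytes, so the whole structure fits in one
  cacheline (field names and exact layout are simplified here, not the
  actual perf definition):

    #include <stdbool.h>
    #include <stdint.h>

    enum mapping_type {
        MAPPING__IDENTITY,      /* address is used as-is          */
        MAPPING__DSO_RELATIVE,  /* address is relative to the DSO */
    };

    struct map_before {                     /* spills past one cacheline */
        uint64_t start, end, pgoff;
        /* one 8-byte pointer per conversion direction */
        uint64_t (*map_ip)(const struct map_before *m, uint64_t ip);
        uint64_t (*unmap_ip)(const struct map_before *m, uint64_t ip);
        int      erange_warned;             /* 4 bytes for a flag */
        int      priv;
    };

    struct map_after {                      /* fits in a cacheline */
        uint64_t start, end, pgoff;
        uint8_t  mapping_type;              /* enum mapping_type, 1 byte */
        bool     erange_warned;             /* 1 byte */
        bool     priv;                      /* 1 byte */
    };

    /* The conversion becomes a branch on the selector instead of an indirect call. */
    static inline uint64_t map__map_ip(const struct map_after *m, uint64_t ip)
    {
        return m->mapping_type == MAPPING__DSO_RELATIVE ?
               ip - m->start + m->pgoff : ip;
    }
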
Assorted fixes:

- Fix the default 'perf top' usage on Intel hybrid systems, now it
  starts with a browser showing the number of samples for Efficiency
  (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This
  behaviour is similar on ARM64, with its respective set of big.LITTLE
  processors.

- Fix segfault on the build_mem_topology() error path.

- Fix 'perf mem' error on hybrid systems related to the availability
  of the mem event in a PMU.

- Fix missing reference count gets (map, maps) in the db-export code.

- Avoid recursively taking env->bpf_progs.lock in the 'perf_env' code.

- Use the newly introduced maps__for_each_map() to add missing locking
  around iteration of 'struct map' entries.

- Parse NOTE segments until the build id is found, don't stop on the
  first one: ELF files may have several such NOTE segments (the note
  walking involved is sketched below).

- Remove 'egrep' usage, it's deprecated, use 'grep -E' instead.

- Warn first about missing libelf, not libbpf, which depends on libelf.

- Use an alternative to 'find ... -printf' as this isn't supported in
  busybox.

- Address python 3.6 DeprecationWarning for string escapes.

- Fix memory leak in uniq() in libsubcmd.

- Fix man page formatting for 'perf lock'.

- Fix some spelling mistakes.

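  The build-id fix above is about iteration: an ELF file may carry
  several PT_NOTE segments and the GNU build-id note is not necessarily
  in the first one. A minimal sketch of scanning one note buffer until
  NT_GNU_BUILD_ID is found (assumes a 64-bit ELF and 4-byte note
  alignment; the real code additionally loops over every PT_NOTE
  program header instead of stopping at the first):

    #include <elf.h>
    #include <stddef.h>
    #include <string.h>

    #define NOTE_ALIGN(x) (((x) + 3) & ~3u)

    /* Copy the build id into 'bid' and return its length, or 0 if not found. */
    static size_t find_build_id(const void *notes, size_t size,
                                unsigned char *bid, size_t bid_max)
    {
        const unsigned char *p = notes, *end = p + size;

        while (p + sizeof(Elf64_Nhdr) <= end) {
            const Elf64_Nhdr *nhdr = (const Elf64_Nhdr *)p;
            const char *name = (const char *)(nhdr + 1);
            const unsigned char *desc =
                (const unsigned char *)name + NOTE_ALIGN(nhdr->n_namesz);

            if (desc + NOTE_ALIGN(nhdr->n_descsz) > end)
                break;

            if (nhdr->n_type == NT_GNU_BUILD_ID &&
                nhdr->n_namesz == sizeof("GNU") &&
                !memcmp(name, "GNU", sizeof("GNU")) &&
                nhdr->n_descsz > 0 && nhdr->n_descsz <= bid_max) {
                memcpy(bid, desc, nhdr->n_descsz);
                return nhdr->n_descsz;
            }

            /* Not a build id: keep going, more notes may follow. */
            p = desc + NOTE_ALIGN(nhdr->n_descsz);
        }
        return 0;
    }
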
perf tests:

- Fail shell tests that need some symbol in perf itself if perf is
  stripped. These tests check if a symbol is resolved, if some hot
  function is indeed detected by profiling, etc.

- The 'perf test sigtrap' test is currently failing on PREEMPT_RT, skip
  it if sleeping spinlocks are detected (using BTF) and point to the
  mailing list discussion about it. This test is also being skipped on
  several architectures (powerpc, s390x, arm and aarch64) due to other
  pending issues with instruction breakpoints.

- Adjust the 'perf record' offcpu profiling test cases for s390.

- Fix 'Setup struct perf_event_attr' failures on s390 z/VM guests,
  addressing issues caused by the fallback from cycles to task-clock
  done in this release.

- Fix the mask for the VG register in the user-regs test.

- Run shellcheck on the 'perf test' shell scripts automatically to make
  sure changes don't introduce things it flags as problematic.

- Add an option to change the objdump binary and allow it to be set via
  'perf config'.

- Add basic 'perf script', 'perf list --json' and 'perf diff' tests.

- Basic branch counter support.

- Make DSO tests a suite rather than individual tests.

- Remove atomics from test_loop to avoid test failures.

- Fix call chain match on powerpc for the record+probe_libc_inet_pton
  test.

- Improve the Intel hybrid tests.

Vendor event files (JSON):

powerpc:

- Update the datasource event name to fix duplicate events on IBM's
  Power10.

- Add PVN for the HX-C2000 CPU with Power8 Architecture.

Intel:

- Alderlake/rocketlake metric fixes.

- Update emeraldrapids events to v1.02.

- Update icelakex events to v1.23.

- Update sapphirerapids events to v1.17.

- Add skx, clx, icx and spr UPI bandwidth metrics.

AMD:

- Add Zen 4 memory controller events.

RISC-V:

- Add StarFive Dubhe-80 and Dubhe-90 JSON files:
    https://www.starfivetech.com/en/site/cpu-u

- Add a T-HEAD C9xx JSON file:
    https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md

ARM64:

- Remove UTF-8 characters from cmn.json that were causing build
  failures in some distros.

- Add core PMU events and metrics for Ampere One X.

- Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT.

libperf:

- Rename several perf_cpu_map constructors to clarify what they really
  do.

- Ditto for some other methods, coping with some issues in their
  semantics, like perf_cpu_map__empty() ->
  perf_cpu_map__has_any_cpu_or_is_empty() (see the usage sketch below).

- Document perf_cpu_map__nr()'s behavior.

perf stat:

- Exit if parsing of event groups fails.

- Combine the -A/--no-aggr and --no-merge options.

- Fix the help message for the --metric-no-threshold option.

Hardware tracing:

ARM64 CoreSight:

- Bump the minimum OpenCSD version to ensure a bugfix is present.

- Add the 'T' itrace option for timestamp trace.

- Set the start VM address of the executable file to 0 and don't ignore
  the first sample in the arm-cs-trace-disasm.py 'perf script'.

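  A minimal tour of the renamed libperf cpumap constructors, loosely
  based on the counting examples shipped with libperf (build against
  libperf with -lperf; error handling trimmed):

    #include <stdarg.h>
    #include <stdio.h>
    #include <perf/core.h>
    #include <perf/cpumap.h>

    static int libperf_print(enum libperf_print_level level,
                             const char *fmt, va_list ap)
    {
        return vfprintf(stderr, fmt, ap);
    }

    int main(void)
    {
        struct perf_cpu_map *online, *any;
        struct perf_cpu cpu;
        int idx;

        libperf_init(libperf_print);

        /* was: perf_cpu_map__new(NULL) / perf_cpu_map__default_new() */
        online = perf_cpu_map__new_online_cpus();
        /* was: perf_cpu_map__dummy_new() */
        any = perf_cpu_map__new_any_cpu();
        if (!online || !any)
            return 1;

        printf("%d online CPUs\n", perf_cpu_map__nr(online));

        /* The renamed predicate spells out that the "any CPU" (-1) entry counts too. */
        printf("empty or any-CPU map? %d\n",
               perf_cpu_map__has_any_cpu_or_is_empty(any));

        perf_cpu_map__for_each_cpu(cpu, idx, online)
            printf("idx %d -> cpu %d\n", idx, cpu.cpu);

        perf_cpu_map__put(online);
        perf_cpu_map__put(any);
        return 0;
    }
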
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----

iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZZ3FpgAKCRCyPKLppCJ+
Jz21AQDB93J4X05bwHJlRloN3KuA3LuwzvAQkwFoJSfFFMDnzgEAgbAMF1sANirP
5UcGxVgqoXWdrp9pkMcGlcFc7jsz5gA=
=SM26
-----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo; see the tag
message above for the full changelog.

* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
  MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
  perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
  perf db-export: Fix missing reference count get in call_path_from_sample()
  perf tests: Add perf script test
  libsubcmd: Fix memory leak in uniq()
  perf TUI: Don't ignore job control
  perf vendor events intel: Update sapphirerapids events to v1.17
  perf vendor events intel: Update icelakex events to v1.23
  perf vendor events intel: Update emeraldrapids events to v1.02
  perf vendor events intel: Alderlake/rocketlake metric fixes
  perf x86 test: Add hybrid test for conflicting legacy/sysfs event
  perf x86 test: Update hybrid expectations
  perf vendor events amd: Add Zen 4 memory controller events
  perf stat: Fix hard coded LL miss units
  perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
  perf env: Avoid recursively taking env->bpf_progs.lock
  perf annotate: Add --insn-stat option for debugging
  perf annotate: Add --type-stat option for debugging
  perf annotate: Support event group display
  perf annotate: Add --data-type option
  ...

commit 9d64bf433c

MAINTAINERS

@@ -17140,10 +17140,10 @@ PERFORMANCE EVENTS SUBSYSTEM
 M:	Peter Zijlstra <peterz@infradead.org>
 M:	Ingo Molnar <mingo@redhat.com>
 M:	Arnaldo Carvalho de Melo <acme@kernel.org>
+M:	Namhyung Kim <namhyung@kernel.org>
 R:	Mark Rutland <mark.rutland@arm.com>
 R:	Alexander Shishkin <alexander.shishkin@linux.intel.com>
 R:	Jiri Olsa <jolsa@kernel.org>
-R:	Namhyung Kim <namhyung@kernel.org>
 R:	Ian Rogers <irogers@google.com>
 R:	Adrian Hunter <adrian.hunter@intel.com>
 L:	linux-perf-users@vger.kernel.org

tools/build/Makefile.feature

@@ -32,6 +32,7 @@ FEATURE_TESTS_BASIC :=                  \
         backtrace                       \
         dwarf                           \
         dwarf_getlocations              \
+        dwarf_getcfi                    \
         eventfd                         \
         fortify-source                  \
         get_current_dir_name            \

tools/build/feature/Makefile

@@ -7,6 +7,7 @@ FILES=                                  \
         test-bionic.bin                 \
         test-dwarf.bin                  \
         test-dwarf_getlocations.bin     \
+        test-dwarf_getcfi.bin           \
         test-eventfd.bin                \
         test-fortify-source.bin         \
         test-get_current_dir_name.bin   \

@@ -154,6 +155,9 @@ $(OUTPUT)test-dwarf.bin:
 $(OUTPUT)test-dwarf_getlocations.bin:
 	$(BUILD) $(DWARFLIBS)
 
+$(OUTPUT)test-dwarf_getcfi.bin:
+	$(BUILD) $(DWARFLIBS)
+
 $(OUTPUT)test-libelf-getphdrnum.bin:
 	$(BUILD) -lelf
 

tools/build/feature/test-dwarf_getcfi.c (new file, 9 lines)

@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <elfutils/libdw.h>
+
+int main(void)
+{
+	Dwarf *dwarf = NULL;
+	return dwarf_getcfi(dwarf) == NULL;
+}

tools/build/feature/test-libopencsd.c

@@ -4,9 +4,9 @@
 /*
  * Check OpenCSD library version is sufficient to provide required features
  */
-#define OCSD_MIN_VER ((1 << 16) | (1 << 8) | (1))
+#define OCSD_MIN_VER ((1 << 16) | (2 << 8) | (1))
 #if !defined(OCSD_VER_NUM) || (OCSD_VER_NUM < OCSD_MIN_VER)
-#error "OpenCSD >= 1.1.1 is required"
+#error "OpenCSD >= 1.2.1 is required"
 #endif
 
 int main(void)

@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
|
||||
|
||||
PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT = 18, /* save privilege mode */
|
||||
|
||||
PERF_SAMPLE_BRANCH_COUNTERS_SHIFT = 19, /* save occurrences of events on a branch */
|
||||
|
||||
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
|
||||
};
|
||||
|
||||
@ -235,6 +237,8 @@ enum perf_branch_sample_type {
|
||||
|
||||
PERF_SAMPLE_BRANCH_PRIV_SAVE = 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
|
||||
|
||||
PERF_SAMPLE_BRANCH_COUNTERS = 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,
|
||||
|
||||
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
|
||||
};
|
||||
|
||||
@ -982,6 +986,12 @@ enum perf_event_type {
|
||||
* { u64 nr;
|
||||
* { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
|
||||
* { u64 from, to, flags } lbr[nr];
|
||||
* #
|
||||
* # The format of the counters is decided by the
|
||||
* # "branch_counter_nr" and "branch_counter_width",
|
||||
* # which are defined in the ABI.
|
||||
* #
|
||||
* { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
|
||||
* } && PERF_SAMPLE_BRANCH_STACK
|
||||
*
|
||||
* { u64 abi; # enum perf_sample_regs_abi
|
||||
@ -1427,6 +1437,9 @@ struct perf_branch_entry {
|
||||
reserved:31;
|
||||
};
|
||||
|
||||
/* Size of used info bits in struct perf_branch_entry */
|
||||
#define PERF_BRANCH_ENTRY_INFO_BITS_MAX 33
|
||||
|
||||
union perf_sample_weight {
|
||||
__u64 full;
|
||||
#if defined(__LITTLE_ENDIAN_BITFIELD)
|
||||
|
@ -16,6 +16,7 @@
|
||||
#include <sys/mount.h>
|
||||
|
||||
#include "fs.h"
|
||||
#include "../io.h"
|
||||
#include "debug-internal.h"
|
||||
|
||||
#define _STR(x) #x
|
||||
@ -344,53 +345,24 @@ int filename__read_ull(const char *filename, unsigned long long *value)
|
||||
return filename__read_ull_base(filename, value, 0);
|
||||
}
|
||||
|
||||
#define STRERR_BUFSIZE 128 /* For the buffer size of strerror_r */
|
||||
|
||||
int filename__read_str(const char *filename, char **buf, size_t *sizep)
|
||||
{
|
||||
size_t size = 0, alloc_size = 0;
|
||||
void *bf = NULL, *nbf;
|
||||
int fd, n, err = 0;
|
||||
char sbuf[STRERR_BUFSIZE];
|
||||
struct io io;
|
||||
char bf[128];
|
||||
int err;
|
||||
|
||||
fd = open(filename, O_RDONLY);
|
||||
if (fd < 0)
|
||||
io.fd = open(filename, O_RDONLY);
|
||||
if (io.fd < 0)
|
||||
return -errno;
|
||||
|
||||
do {
|
||||
if (size == alloc_size) {
|
||||
alloc_size += BUFSIZ;
|
||||
nbf = realloc(bf, alloc_size);
|
||||
if (!nbf) {
|
||||
err = -ENOMEM;
|
||||
break;
|
||||
}
|
||||
|
||||
bf = nbf;
|
||||
}
|
||||
|
||||
n = read(fd, bf + size, alloc_size - size);
|
||||
if (n < 0) {
|
||||
if (size) {
|
||||
pr_warn("read failed %d: %s\n", errno,
|
||||
strerror_r(errno, sbuf, sizeof(sbuf)));
|
||||
err = 0;
|
||||
} else
|
||||
err = -errno;
|
||||
|
||||
break;
|
||||
}
|
||||
|
||||
size += n;
|
||||
} while (n > 0);
|
||||
|
||||
if (!err) {
|
||||
*sizep = size;
|
||||
*buf = bf;
|
||||
io__init(&io, io.fd, bf, sizeof(bf));
|
||||
*buf = NULL;
|
||||
err = io__getdelim(&io, buf, sizep, /*delim=*/-1);
|
||||
if (err < 0) {
|
||||
free(*buf);
|
||||
*buf = NULL;
|
||||
} else
|
||||
free(bf);
|
||||
|
||||
close(fd);
|
||||
err = 0;
|
||||
close(io.fd);
|
||||
return err;
|
||||
}
|
||||
|
||||
@ -475,15 +447,22 @@ int sysfs__read_str(const char *entry, char **buf, size_t *sizep)
|
||||
|
||||
int sysfs__read_bool(const char *entry, bool *value)
|
||||
{
|
||||
char *buf;
|
||||
size_t size;
|
||||
int ret;
|
||||
struct io io;
|
||||
char bf[16];
|
||||
int ret = 0;
|
||||
char path[PATH_MAX];
|
||||
const char *sysfs = sysfs__mountpoint();
|
||||
|
||||
ret = sysfs__read_str(entry, &buf, &size);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
if (!sysfs)
|
||||
return -1;
|
||||
|
||||
switch (buf[0]) {
|
||||
snprintf(path, sizeof(path), "%s/%s", sysfs, entry);
|
||||
io.fd = open(path, O_RDONLY);
|
||||
if (io.fd < 0)
|
||||
return -errno;
|
||||
|
||||
io__init(&io, io.fd, bf, sizeof(bf));
|
||||
switch (io__get_char(&io)) {
|
||||
case '1':
|
||||
case 'y':
|
||||
case 'Y':
|
||||
@ -497,8 +476,7 @@ int sysfs__read_bool(const char *entry, bool *value)
|
||||
default:
|
||||
ret = -1;
|
||||
}
|
||||
|
||||
free(buf);
|
||||
close(io.fd);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
@ -12,6 +12,7 @@
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <unistd.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
struct io {
|
||||
/* File descriptor being read/ */
|
||||
@ -140,8 +141,8 @@ static inline int io__get_dec(struct io *io, __u64 *dec)
|
||||
}
|
||||
}
|
||||
|
||||
/* Read up to and including the first newline following the pattern of getline. */
|
||||
static inline ssize_t io__getline(struct io *io, char **line_out, size_t *line_len_out)
|
||||
/* Read up to and including the first delim. */
|
||||
static inline ssize_t io__getdelim(struct io *io, char **line_out, size_t *line_len_out, int delim)
|
||||
{
|
||||
char buf[128];
|
||||
int buf_pos = 0;
|
||||
@ -151,7 +152,7 @@ static inline ssize_t io__getline(struct io *io, char **line_out, size_t *line_l
|
||||
|
||||
/* TODO: reuse previously allocated memory. */
|
||||
free(*line_out);
|
||||
while (ch != '\n') {
|
||||
while (ch != delim) {
|
||||
ch = io__get_char(io);
|
||||
|
||||
if (ch < 0)
|
||||
@ -184,4 +185,9 @@ static inline ssize_t io__getline(struct io *io, char **line_out, size_t *line_l
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
static inline ssize_t io__getline(struct io *io, char **line_out, size_t *line_len_out)
|
||||
{
|
||||
return io__getdelim(io, line_out, line_len_out, /*delim=*/'\n');
|
||||
}
|
||||
|
||||
#endif /* __API_IO__ */
|
||||
|
@ -39,7 +39,7 @@ int main(int argc, char **argv)
|
||||
|
||||
libperf_init(libperf_print);
|
||||
|
||||
cpus = perf_cpu_map__new(NULL);
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
if (!cpus) {
|
||||
fprintf(stderr, "failed to create cpus\n");
|
||||
return -1;
|
||||
|
@ -97,7 +97,7 @@ In this case we will monitor all the available CPUs:
|
||||
|
||||
[source,c]
|
||||
--
|
||||
42 cpus = perf_cpu_map__new(NULL);
|
||||
42 cpus = perf_cpu_map__new_online_cpus();
|
||||
43 if (!cpus) {
|
||||
44 fprintf(stderr, "failed to create cpus\n");
|
||||
45 return -1;
|
||||
|
@ -37,7 +37,7 @@ SYNOPSIS
|
||||
|
||||
struct perf_cpu_map;
|
||||
|
||||
struct perf_cpu_map *perf_cpu_map__dummy_new(void);
|
||||
struct perf_cpu_map *perf_cpu_map__new_any_cpu(void);
|
||||
struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list);
|
||||
struct perf_cpu_map *perf_cpu_map__read(FILE *file);
|
||||
struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
|
||||
@ -46,7 +46,7 @@ SYNOPSIS
|
||||
void perf_cpu_map__put(struct perf_cpu_map *map);
|
||||
int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
|
||||
int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
|
||||
bool perf_cpu_map__empty(const struct perf_cpu_map *map);
|
||||
bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map);
|
||||
int perf_cpu_map__max(struct perf_cpu_map *map);
|
||||
bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu);
|
||||
|
||||
|
@ -9,6 +9,7 @@
|
||||
#include <unistd.h>
|
||||
#include <ctype.h>
|
||||
#include <limits.h>
|
||||
#include "internal.h"
|
||||
|
||||
void perf_cpu_map__set_nr(struct perf_cpu_map *map, int nr_cpus)
|
||||
{
|
||||
@ -27,7 +28,7 @@ struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
|
||||
return result;
|
||||
}
|
||||
|
||||
struct perf_cpu_map *perf_cpu_map__dummy_new(void)
|
||||
struct perf_cpu_map *perf_cpu_map__new_any_cpu(void)
|
||||
{
|
||||
struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
|
||||
|
||||
@ -66,15 +67,21 @@ void perf_cpu_map__put(struct perf_cpu_map *map)
|
||||
}
|
||||
}
|
||||
|
||||
static struct perf_cpu_map *cpu_map__default_new(void)
|
||||
static struct perf_cpu_map *cpu_map__new_sysconf(void)
|
||||
{
|
||||
struct perf_cpu_map *cpus;
|
||||
int nr_cpus;
|
||||
int nr_cpus, nr_cpus_conf;
|
||||
|
||||
nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
|
||||
if (nr_cpus < 0)
|
||||
return NULL;
|
||||
|
||||
nr_cpus_conf = sysconf(_SC_NPROCESSORS_CONF);
|
||||
if (nr_cpus != nr_cpus_conf) {
|
||||
pr_warning("Number of online CPUs (%d) differs from the number configured (%d) the CPU map will only cover the first %d CPUs.",
|
||||
nr_cpus, nr_cpus_conf, nr_cpus);
|
||||
}
|
||||
|
||||
cpus = perf_cpu_map__alloc(nr_cpus);
|
||||
if (cpus != NULL) {
|
||||
int i;
|
||||
@ -86,9 +93,27 @@ static struct perf_cpu_map *cpu_map__default_new(void)
|
||||
return cpus;
|
||||
}
|
||||
|
||||
struct perf_cpu_map *perf_cpu_map__default_new(void)
|
||||
static struct perf_cpu_map *cpu_map__new_sysfs_online(void)
|
||||
{
|
||||
return cpu_map__default_new();
|
||||
struct perf_cpu_map *cpus = NULL;
|
||||
FILE *onlnf;
|
||||
|
||||
onlnf = fopen("/sys/devices/system/cpu/online", "r");
|
||||
if (onlnf) {
|
||||
cpus = perf_cpu_map__read(onlnf);
|
||||
fclose(onlnf);
|
||||
}
|
||||
return cpus;
|
||||
}
|
||||
|
||||
struct perf_cpu_map *perf_cpu_map__new_online_cpus(void)
|
||||
{
|
||||
struct perf_cpu_map *cpus = cpu_map__new_sysfs_online();
|
||||
|
||||
if (cpus)
|
||||
return cpus;
|
||||
|
||||
return cpu_map__new_sysconf();
|
||||
}
|
||||
|
||||
|
||||
@ -180,27 +205,11 @@ struct perf_cpu_map *perf_cpu_map__read(FILE *file)
|
||||
|
||||
if (nr_cpus > 0)
|
||||
cpus = cpu_map__trim_new(nr_cpus, tmp_cpus);
|
||||
else
|
||||
cpus = cpu_map__default_new();
|
||||
out_free_tmp:
|
||||
free(tmp_cpus);
|
||||
return cpus;
|
||||
}
|
||||
|
||||
static struct perf_cpu_map *cpu_map__read_all_cpu_map(void)
|
||||
{
|
||||
struct perf_cpu_map *cpus = NULL;
|
||||
FILE *onlnf;
|
||||
|
||||
onlnf = fopen("/sys/devices/system/cpu/online", "r");
|
||||
if (!onlnf)
|
||||
return cpu_map__default_new();
|
||||
|
||||
cpus = perf_cpu_map__read(onlnf);
|
||||
fclose(onlnf);
|
||||
return cpus;
|
||||
}
|
||||
|
||||
struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
|
||||
{
|
||||
struct perf_cpu_map *cpus = NULL;
|
||||
@ -211,7 +220,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
|
||||
int max_entries = 0;
|
||||
|
||||
if (!cpu_list)
|
||||
return cpu_map__read_all_cpu_map();
|
||||
return perf_cpu_map__new_online_cpus();
|
||||
|
||||
/*
|
||||
* must handle the case of empty cpumap to cover
|
||||
@ -268,10 +277,12 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
|
||||
|
||||
if (nr_cpus > 0)
|
||||
cpus = cpu_map__trim_new(nr_cpus, tmp_cpus);
|
||||
else if (*cpu_list != '\0')
|
||||
cpus = cpu_map__default_new();
|
||||
else
|
||||
cpus = perf_cpu_map__dummy_new();
|
||||
else if (*cpu_list != '\0') {
|
||||
pr_warning("Unexpected characters at end of cpu list ('%s'), using online CPUs.",
|
||||
cpu_list);
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
} else
|
||||
cpus = perf_cpu_map__new_any_cpu();
|
||||
invalid:
|
||||
free(tmp_cpus);
|
||||
out:
|
||||
@ -300,7 +311,7 @@ int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
|
||||
return cpus ? __perf_cpu_map__nr(cpus) : 1;
|
||||
}
|
||||
|
||||
bool perf_cpu_map__empty(const struct perf_cpu_map *map)
|
||||
bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
|
||||
{
|
||||
return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
|
||||
}
|
||||
|
@ -39,7 +39,7 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
|
||||
if (evsel->system_wide) {
|
||||
/* System wide: set the cpu map of the evsel to all online CPUs. */
|
||||
perf_cpu_map__put(evsel->cpus);
|
||||
evsel->cpus = perf_cpu_map__new(NULL);
|
||||
evsel->cpus = perf_cpu_map__new_online_cpus();
|
||||
} else if (evlist->has_user_cpus && evsel->is_pmu_core) {
|
||||
/*
|
||||
* User requested CPUs on a core PMU, ensure the requested CPUs
|
||||
@ -619,7 +619,7 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist)
|
||||
|
||||
/* One for each CPU */
|
||||
nr_mmaps = perf_cpu_map__nr(evlist->all_cpus);
|
||||
if (perf_cpu_map__empty(evlist->all_cpus)) {
|
||||
if (perf_cpu_map__has_any_cpu_or_is_empty(evlist->all_cpus)) {
|
||||
/* Plus one for each thread */
|
||||
nr_mmaps += perf_thread_map__nr(evlist->threads);
|
||||
/* Minus the per-thread CPU (-1) */
|
||||
@ -653,7 +653,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
|
||||
if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
|
||||
return -ENOMEM;
|
||||
|
||||
if (perf_cpu_map__empty(cpus))
|
||||
if (perf_cpu_map__has_any_cpu_or_is_empty(cpus))
|
||||
return mmap_per_thread(evlist, ops, mp);
|
||||
|
||||
return mmap_per_cpu(evlist, ops, mp);
|
||||
|
@ -120,7 +120,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus,
|
||||
static struct perf_cpu_map *empty_cpu_map;
|
||||
|
||||
if (empty_cpu_map == NULL) {
|
||||
empty_cpu_map = perf_cpu_map__dummy_new();
|
||||
empty_cpu_map = perf_cpu_map__new_any_cpu();
|
||||
if (empty_cpu_map == NULL)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
@ -33,7 +33,8 @@ struct perf_mmap {
|
||||
bool overwrite;
|
||||
u64 flush;
|
||||
libperf_unmap_cb_t unmap_cb;
|
||||
char event_copy[PERF_SAMPLE_MAX_SIZE] __aligned(8);
|
||||
void *event_copy;
|
||||
size_t event_copy_sz;
|
||||
struct perf_mmap *next;
|
||||
};
|
||||
|
||||
|
@ -19,10 +19,23 @@ struct perf_cache {
|
||||
struct perf_cpu_map;
|
||||
|
||||
/**
|
||||
* perf_cpu_map__dummy_new - a map with a singular "any CPU"/dummy -1 value.
|
||||
* perf_cpu_map__new_any_cpu - a map with a singular "any CPU"/dummy -1 value.
|
||||
*/
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_any_cpu(void);
|
||||
/**
|
||||
* perf_cpu_map__new_online_cpus - a map read from
|
||||
* /sys/devices/system/cpu/online if
|
||||
* available. If reading wasn't possible a map
|
||||
* is created using the online processors
|
||||
* assuming the first 'n' processors are all
|
||||
* online.
|
||||
*/
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_online_cpus(void);
|
||||
/**
|
||||
* perf_cpu_map__new - create a map from the given cpu_list such as "0-7". If no
|
||||
* cpu_list argument is provided then
|
||||
* perf_cpu_map__new_online_cpus is returned.
|
||||
*/
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__dummy_new(void);
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void);
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list);
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file);
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
|
||||
@ -31,12 +44,23 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
|
||||
LIBPERF_API struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
|
||||
struct perf_cpu_map *other);
|
||||
LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map);
|
||||
/**
|
||||
* perf_cpu_map__cpu - get the CPU value at the given index. Returns -1 if index
|
||||
* is invalid.
|
||||
*/
|
||||
LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx);
|
||||
/**
|
||||
* perf_cpu_map__nr - for an empty map returns 1, as perf_cpu_map__cpu returns a
|
||||
* cpu of -1 for an invalid index, this makes an empty map
|
||||
* look like it contains the "any CPU"/dummy value. Otherwise
|
||||
* the result is the number CPUs in the map plus one if the
|
||||
* "any CPU"/dummy value is present.
|
||||
*/
|
||||
LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus);
|
||||
/**
|
||||
* perf_cpu_map__empty - is map either empty or the "any CPU"/dummy value.
|
||||
* perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value.
|
||||
*/
|
||||
LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map);
|
||||
LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map);
|
||||
LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map);
|
||||
LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu);
|
||||
LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs,
|
||||
@ -51,6 +75,12 @@ LIBPERF_API bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map);
|
||||
(idx) < perf_cpu_map__nr(cpus); \
|
||||
(idx)++, (cpu) = perf_cpu_map__cpu(cpus, idx))
|
||||
|
||||
#define perf_cpu_map__for_each_cpu_skip_any(_cpu, idx, cpus) \
|
||||
for ((idx) = 0, (_cpu) = perf_cpu_map__cpu(cpus, idx); \
|
||||
(idx) < perf_cpu_map__nr(cpus); \
|
||||
(idx)++, (_cpu) = perf_cpu_map__cpu(cpus, idx)) \
|
||||
if ((_cpu).cpu != -1)
|
||||
|
||||
#define perf_cpu_map__for_each_idx(idx, cpus) \
|
||||
for ((idx) = 0; (idx) < perf_cpu_map__nr(cpus); (idx)++)
|
||||
|
||||
|
@ -1,15 +1,15 @@
|
||||
LIBPERF_0.0.1 {
|
||||
global:
|
||||
libperf_init;
|
||||
perf_cpu_map__dummy_new;
|
||||
perf_cpu_map__default_new;
|
||||
perf_cpu_map__new_any_cpu;
|
||||
perf_cpu_map__new_online_cpus;
|
||||
perf_cpu_map__get;
|
||||
perf_cpu_map__put;
|
||||
perf_cpu_map__new;
|
||||
perf_cpu_map__read;
|
||||
perf_cpu_map__nr;
|
||||
perf_cpu_map__cpu;
|
||||
perf_cpu_map__empty;
|
||||
perf_cpu_map__has_any_cpu_or_is_empty;
|
||||
perf_cpu_map__max;
|
||||
perf_cpu_map__has;
|
||||
perf_thread_map__new_array;
|
||||
|
@ -19,6 +19,7 @@
|
||||
void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
|
||||
bool overwrite, libperf_unmap_cb_t unmap_cb)
|
||||
{
|
||||
/* Assume fields were zero initialized. */
|
||||
map->fd = -1;
|
||||
map->overwrite = overwrite;
|
||||
map->unmap_cb = unmap_cb;
|
||||
@ -51,13 +52,18 @@ int perf_mmap__mmap(struct perf_mmap *map, struct perf_mmap_param *mp,
|
||||
|
||||
void perf_mmap__munmap(struct perf_mmap *map)
|
||||
{
|
||||
if (map && map->base != NULL) {
|
||||
if (!map)
|
||||
return;
|
||||
|
||||
zfree(&map->event_copy);
|
||||
map->event_copy_sz = 0;
|
||||
if (map->base) {
|
||||
munmap(map->base, perf_mmap__mmap_len(map));
|
||||
map->base = NULL;
|
||||
map->fd = -1;
|
||||
refcount_set(&map->refcnt, 0);
|
||||
}
|
||||
if (map && map->unmap_cb)
|
||||
if (map->unmap_cb)
|
||||
map->unmap_cb(map);
|
||||
}
|
||||
|
||||
@ -223,9 +229,17 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map,
|
||||
*/
|
||||
if ((*startp & map->mask) + size != ((*startp + size) & map->mask)) {
|
||||
unsigned int offset = *startp;
|
||||
unsigned int len = min(sizeof(*event), size), cpy;
|
||||
unsigned int len = size, cpy;
|
||||
void *dst = map->event_copy;
|
||||
|
||||
if (size > map->event_copy_sz) {
|
||||
dst = realloc(map->event_copy, size);
|
||||
if (!dst)
|
||||
return NULL;
|
||||
map->event_copy = dst;
|
||||
map->event_copy_sz = size;
|
||||
}
|
||||
|
||||
do {
|
||||
cpy = min(map->mask + 1 - (offset & map->mask), len);
|
||||
memcpy(dst, &data[offset & map->mask], cpy);
|
||||
|
@ -21,7 +21,7 @@ int test_cpumap(int argc, char **argv)
|
||||
|
||||
libperf_init(libperf_print);
|
||||
|
||||
cpus = perf_cpu_map__dummy_new();
|
||||
cpus = perf_cpu_map__new_any_cpu();
|
||||
if (!cpus)
|
||||
return -1;
|
||||
|
||||
@ -29,7 +29,7 @@ int test_cpumap(int argc, char **argv)
|
||||
perf_cpu_map__put(cpus);
|
||||
perf_cpu_map__put(cpus);
|
||||
|
||||
cpus = perf_cpu_map__default_new();
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
if (!cpus)
|
||||
return -1;
|
||||
|
||||
|
@ -46,7 +46,7 @@ static int test_stat_cpu(void)
|
||||
};
|
||||
int err, idx;
|
||||
|
||||
cpus = perf_cpu_map__new(NULL);
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
__T("failed to create cpus", cpus);
|
||||
|
||||
evlist = perf_evlist__new();
|
||||
@ -261,7 +261,7 @@ static int test_mmap_thread(void)
|
||||
threads = perf_thread_map__new_dummy();
|
||||
__T("failed to create threads", threads);
|
||||
|
||||
cpus = perf_cpu_map__dummy_new();
|
||||
cpus = perf_cpu_map__new_any_cpu();
|
||||
__T("failed to create cpus", cpus);
|
||||
|
||||
perf_thread_map__set_pid(threads, 0, pid);
|
||||
@ -350,7 +350,7 @@ static int test_mmap_cpus(void)
|
||||
|
||||
attr.config = id;
|
||||
|
||||
cpus = perf_cpu_map__new(NULL);
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
__T("failed to create cpus", cpus);
|
||||
|
||||
evlist = perf_evlist__new();
|
||||
|
@ -27,7 +27,7 @@ static int test_stat_cpu(void)
|
||||
};
|
||||
int err, idx;
|
||||
|
||||
cpus = perf_cpu_map__new(NULL);
|
||||
cpus = perf_cpu_map__new_online_cpus();
|
||||
__T("failed to create cpus", cpus);
|
||||
|
||||
evsel = perf_evsel__new(&attr);
|
||||
|
@ -52,11 +52,21 @@ void uniq(struct cmdnames *cmds)
|
||||
if (!cmds->cnt)
|
||||
return;
|
||||
|
||||
for (i = j = 1; i < cmds->cnt; i++)
|
||||
if (strcmp(cmds->names[i]->name, cmds->names[i-1]->name))
|
||||
cmds->names[j++] = cmds->names[i];
|
||||
|
||||
for (i = 1; i < cmds->cnt; i++) {
|
||||
if (!strcmp(cmds->names[i]->name, cmds->names[i-1]->name))
|
||||
zfree(&cmds->names[i - 1]);
|
||||
}
|
||||
for (i = 0, j = 0; i < cmds->cnt; i++) {
|
||||
if (cmds->names[i]) {
|
||||
if (i == j)
|
||||
j++;
|
||||
else
|
||||
cmds->names[j++] = cmds->names[i];
|
||||
}
|
||||
}
|
||||
cmds->cnt = j;
|
||||
while (j < i)
|
||||
cmds->names[j++] = NULL;
|
||||
}
|
||||
|
||||
void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes)
|
||||
|
4
tools/perf/.gitignore
vendored
4
tools/perf/.gitignore
vendored
@ -39,6 +39,9 @@ trace/beauty/generated/
|
||||
pmu-events/pmu-events.c
|
||||
pmu-events/jevents
|
||||
pmu-events/metric_test.log
|
||||
tests/shell/*.shellcheck_log
|
||||
tests/shell/coresight/*.shellcheck_log
|
||||
tests/shell/lib/*.shellcheck_log
|
||||
feature/
|
||||
libapi/
|
||||
libbpf/
|
||||
@ -49,3 +52,4 @@ libtraceevent/
|
||||
libtraceevent_plugins/
|
||||
fixdep
|
||||
Documentation/doc.dep
|
||||
python_ext_build/
|
||||
|
@ -25,6 +25,7 @@
|
||||
q quicker (less detailed) decoding
|
||||
A approximate IPC
|
||||
Z prefer to ignore timestamps (so-called "timeless" decoding)
|
||||
T use the timestamp trace as kernel time
|
||||
|
||||
The default is all events i.e. the same as --itrace=iybxwpe,
|
||||
except for perf script where it is --itrace=ce
|
||||
|
@ -155,6 +155,17 @@ include::itrace.txt[]
|
||||
stdio or stdio2 (Default: 0). Note that this is about selection of
|
||||
functions to display, not about lines within the function.
|
||||
|
||||
--data-type[=TYPE_NAME]::
|
||||
Display data type annotation instead of code. It infers data type of
|
||||
samples (if they are memory accessing instructions) using DWARF debug
|
||||
information. It can take an optional argument of data type name. In
|
||||
that case it'd show annotation for the type only, otherwise it'd show
|
||||
all data types it finds.
|
||||
|
||||
--type-stat::
|
||||
Show stats for the data type annotation.
|
||||
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
linkperf:perf-record[1], linkperf:perf-report[1]
|
||||
|
@ -251,7 +251,8 @@ annotate.*::
|
||||
addr2line binary to use for file names and line numbers.
|
||||
|
||||
annotate.objdump::
|
||||
objdump binary to use for disassembly and annotations.
|
||||
objdump binary to use for disassembly and annotations,
|
||||
including in the 'perf test' command.
|
||||
|
||||
annotate.disassembler_style::
|
||||
Use this to change the default disassembler style to some other value
|
||||
@ -722,7 +723,6 @@ session-<NAME>.*::
|
||||
Defines new record session for daemon. The value is record's
|
||||
command line without the 'record' keyword.
|
||||
|
||||
|
||||
SEE ALSO
|
||||
--------
|
||||
linkperf:perf[1]
|
||||
|
@ -81,11 +81,13 @@ For Intel systems precise event sampling is implemented with PEBS
|
||||
which supports up to precise-level 2, and precise level 3 for
|
||||
some special cases
|
||||
|
||||
On AMD systems it is implemented using IBS (up to precise-level 2).
|
||||
The precise modifier works with event types 0x76 (cpu-cycles, CPU
|
||||
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
|
||||
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
|
||||
(IbsOpCntCtl) set respectively (see the
|
||||
On AMD systems it is implemented using IBS OP (up to precise-level 2).
|
||||
Unlike Intel PEBS which provides levels of precision, AMD core pmu is
|
||||
inherently non-precise and IBS is inherently precise. (i.e. ibs_op//,
|
||||
ibs_op//p, ibs_op//pp and ibs_op//ppp are all same). The precise modifier
|
||||
works with event types 0x76 (cpu-cycles, CPU clocks not halted) and 0xC1
|
||||
(micro-ops retired). Both events map to IBS execution sampling (IBS op)
|
||||
with the IBS Op Counter Control bit (IbsOpCntCtl) set respectively (see the
|
||||
Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
|
||||
section of the [AMD Processor Programming Reference (PPR)] relevant to the
|
||||
family, model and stepping of the processor being used).
|
||||
|
@ -119,7 +119,7 @@ INFO OPTIONS
|
||||
|
||||
|
||||
CONTENTION OPTIONS
|
||||
--------------
|
||||
------------------
|
||||
|
||||
-k::
|
||||
--key=<value>::
|
||||
|
@ -445,6 +445,10 @@ following filters are defined:
|
||||
4th-Gen Xeon+ server), the save branch type is unconditionally enabled
|
||||
when the taken branch stack sampling is enabled.
|
||||
- priv: save privilege state during sampling in case binary is not available later
|
||||
- counter: save occurrences of the event since the last branch entry. Currently, the
|
||||
feature is only supported by a newer CPU, e.g., Intel Sierra Forest and
|
||||
later platforms. An error out is expected if it's used on the unsupported
|
||||
kernel or CPUs.
|
||||
|
||||
+
|
||||
The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
|
||||
|
@ -118,6 +118,9 @@ OPTIONS
|
||||
- retire_lat: On X86, this reports pipeline stall of this instruction compared
|
||||
to the previous instruction in cycles. And currently supported only on X86
|
||||
- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
|
||||
- type: Data type of sample memory access.
|
||||
- typeoff: Offset in the data type of sample memory access.
|
||||
- symoff: Offset in the symbol.
|
||||
|
||||
By default, comm, dso and symbol keys are used.
|
||||
(i.e. --sort comm,dso,symbol)
|
||||
|
@ -422,7 +422,34 @@ See perf list output for the possible metrics and metricgroups.
|
||||
|
||||
-A::
|
||||
--no-aggr::
|
||||
Do not aggregate counts across all monitored CPUs.
|
||||
--no-merge::
|
||||
Do not aggregate/merge counts across monitored CPUs or PMUs.
|
||||
|
||||
When multiple events are created from a single event specification,
|
||||
stat will, by default, aggregate the event counts and show the result
|
||||
in a single row. This option disables that behavior and shows the
|
||||
individual events and counts.
|
||||
|
||||
Multiple events are created from a single event specification when:
|
||||
|
||||
1. PID monitoring isn't requested and the system has more than one
|
||||
CPU. For example, a system with 8 SMT threads will have one event
|
||||
opened on each thread and aggregation is performed across them.
|
||||
|
||||
2. Prefix or glob wildcard matching is used for the PMU name. For
|
||||
example, multiple memory controller PMUs may exist typically with a
|
||||
suffix of _0, _1, etc. By default the event counts will all be
|
||||
combined if the PMU is specified without the suffix such as
|
||||
uncore_imc rather than uncore_imc_0.
|
||||
|
||||
3. Aliases, which are listed immediately after the Kernel PMU events
|
||||
by perf list, are used.
|
||||
|
||||
--hybrid-merge::
|
||||
Merge core event counts from all core PMUs. In hybrid or big.LITTLE
|
||||
systems by default each core PMU will report its count
|
||||
separately. This option forces core PMU counts to be combined to give
|
||||
a behavior closer to having a single CPU type in the system.
|
||||
|
||||
--topdown::
|
||||
Print top-down metrics supported by the CPU. This allows to determine
|
||||
@@ -475,29 +502,6 @@ highlight 'tma_frontend_bound'. This metric may be drilled into with

Error out if the input is higher than the supported max level.

--no-merge::
Do not merge results from same PMUs.

When multiple events are created from a single event specification,
stat will, by default, aggregate the event counts and show the result
in a single row. This option disables that behavior and shows
the individual events and counts.

Multiple events are created from a single event specification when:
1. Prefix or glob matching is used for the PMU name.
2. Aliases, which are listed immediately after the Kernel PMU events
by perf list, are used.

--hybrid-merge::
Merge the hybrid event counts from all PMUs.

For hybrid events, by default, the stat aggregates and reports the event
counts per PMU. But sometimes, it's also useful to aggregate event counts
from all PMUs. This option enables that behavior and reports the counts
without PMUs.

For non-hybrid events, it should be no effect.

--smi-cost::
Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.

@@ -64,6 +64,9 @@ OPTIONS
perf-event-open - Print perf_event_open() arguments and
return value

--debug-file::
Write debug output to a specified file.

DESCRIPTION
-----------
Performance counters for Linux are a new kernel-based subsystem
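A possible invocation, assuming the new option is passed as a top-level flag like the existing --debug (the path is arbitrary and the exact flag placement is an assumption, not taken from this series):

  $ perf --debug verbose=2 --debug-file /tmp/perf-debug.log record -- sleep 1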
@@ -476,6 +476,11 @@ else
else
CFLAGS += -DHAVE_DWARF_GETLOCATIONS_SUPPORT
endif # dwarf_getlocations
ifneq ($(feature-dwarf_getcfi), 1)
msg := $(warning Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.142);
else
CFLAGS += -DHAVE_DWARF_CFI_SUPPORT
endif # dwarf_getcfi
endif # Dwarf support
endif # libelf support
endif # NO_LIBELF
@@ -680,15 +685,15 @@ ifndef BUILD_BPF_SKEL
endif

ifeq ($(BUILD_BPF_SKEL),1)
ifeq ($(filter -DHAVE_LIBBPF_SUPPORT, $(CFLAGS)),)
dummy := $(warning Warning: Disabled BPF skeletons as libbpf is required)
BUILD_BPF_SKEL := 0
else ifeq ($(filter -DHAVE_LIBELF_SUPPORT, $(CFLAGS)),)
ifeq ($(filter -DHAVE_LIBELF_SUPPORT, $(CFLAGS)),)
dummy := $(warning Warning: Disabled BPF skeletons as libelf is required by bpftool)
BUILD_BPF_SKEL := 0
else ifeq ($(filter -DHAVE_ZLIB_SUPPORT, $(CFLAGS)),)
dummy := $(warning Warning: Disabled BPF skeletons as zlib is required by bpftool)
BUILD_BPF_SKEL := 0
else ifeq ($(filter -DHAVE_LIBBPF_SUPPORT, $(CFLAGS)),)
dummy := $(warning Warning: Disabled BPF skeletons as libbpf is required)
BUILD_BPF_SKEL := 0
else ifeq ($(call get-executable,$(CLANG)),)
dummy := $(warning Warning: Disabled BPF skeletons as clang ($(CLANG)) is missing)
BUILD_BPF_SKEL := 0
@@ -134,6 +134,8 @@ include ../scripts/utilities.mak
# x86 instruction decoder - new instructions test
#
# Define GEN_VMLINUX_H to generate vmlinux.h from the BTF.
#
# Define NO_SHELLCHECK if you do not want to run shellcheck during build

# As per kernel Makefile, avoid funny character set dependencies
unexport LC_ALL
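Assuming these knobs are passed like the other perf Makefile switches (a sketch, not commands taken from this series):

  $ make -C tools/perf NO_SHELLCHECK=1                   # skip the shellcheck pass
  $ make -C tools/perf BUILD_BPF_SKEL=1 GEN_VMLINUX_H=1  # generate vmlinux.h from BTF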
@@ -227,8 +229,15 @@ else
force_fixdep := $(config)
endif

# Runs shellcheck on perf test shell scripts
ifeq ($(NO_SHELLCHECK),1)
SHELLCHECK :=
else
SHELLCHECK := $(shell which shellcheck 2> /dev/null)
endif

export srctree OUTPUT RM CC CXX LD AR CFLAGS CXXFLAGS V BISON FLEX AWK
export HOSTCC HOSTLD HOSTAR HOSTCFLAGS
export HOSTCC HOSTLD HOSTAR HOSTCFLAGS SHELLCHECK

include $(srctree)/tools/build/Makefile.include

@@ -1152,7 +1161,7 @@ bpf-skel-clean:

clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean $(LIBPERF)-clean arm64-sysreg-defs-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
$(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
$(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete -o -name '*.shellcheck_log' -delete
$(Q)$(RM) $(OUTPUT).config-detected
$(call QUIET_CLEAN, core-progs) $(RM) $(ALL_PROGRAMS) perf perf-read-vdso32 perf-read-vdsox32 $(OUTPUT)$(LIBJVMTI).so
$(call QUIET_CLEAN, core-gen) $(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope* $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)FEATURE-DUMP $(OUTPUT)util/*-bison* $(OUTPUT)util/*-flex* \
@ -199,7 +199,7 @@ static int cs_etm_validate_config(struct auxtrace_record *itr,
|
||||
{
|
||||
int i, err = -EINVAL;
|
||||
struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
|
||||
|
||||
/* Set option of each CPU we have */
|
||||
for (i = 0; i < cpu__max_cpu().cpu; i++) {
|
||||
@ -211,7 +211,7 @@ static int cs_etm_validate_config(struct auxtrace_record *itr,
|
||||
* program can run on any CPUs in this case, thus don't skip
|
||||
* validation.
|
||||
*/
|
||||
if (!perf_cpu_map__empty(event_cpus) &&
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus) &&
|
||||
!perf_cpu_map__has(event_cpus, cpu))
|
||||
continue;
|
||||
|
||||
@ -435,7 +435,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
|
||||
* Also the case of per-cpu mmaps, need the contextID in order to be notified
|
||||
* when a context switch happened.
|
||||
*/
|
||||
if (!perf_cpu_map__empty(cpus)) {
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
|
||||
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
|
||||
"timestamp", 1);
|
||||
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
|
||||
@ -461,7 +461,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
|
||||
evsel->core.attr.sample_period = 1;
|
||||
|
||||
/* In per-cpu case, always need the time of mmap events etc */
|
||||
if (!perf_cpu_map__empty(cpus))
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus))
|
||||
evsel__set_sample_bit(evsel, TIME);
|
||||
|
||||
err = cs_etm_validate_config(itr, cs_etm_evsel);
|
||||
@ -536,10 +536,10 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
|
||||
int i;
|
||||
int etmv3 = 0, etmv4 = 0, ete = 0;
|
||||
struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus;
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
|
||||
|
||||
/* cpu map is not empty, we have specific CPUs to work with */
|
||||
if (!perf_cpu_map__empty(event_cpus)) {
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
|
||||
for (i = 0; i < cpu__max_cpu().cpu; i++) {
|
||||
struct perf_cpu cpu = { .cpu = i, };
|
||||
|
||||
@ -802,7 +802,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
|
||||
u64 nr_cpu, type;
|
||||
struct perf_cpu_map *cpu_map;
|
||||
struct perf_cpu_map *event_cpus = session->evlist->core.user_requested_cpus;
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL);
|
||||
struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus();
|
||||
struct cs_etm_recording *ptr =
|
||||
container_of(itr, struct cs_etm_recording, itr);
|
||||
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
|
||||
@ -814,7 +814,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr,
|
||||
return -EINVAL;
|
||||
|
||||
/* If the cpu_map is empty all online CPUs are involved */
|
||||
if (perf_cpu_map__empty(event_cpus)) {
|
||||
if (perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) {
|
||||
cpu_map = online_cpus;
|
||||
} else {
|
||||
/* Make sure all specified CPUs are online */
|
||||
|
@ -232,7 +232,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
|
||||
* In the case of per-cpu mmaps, sample CPU for AUX event;
|
||||
* also enable the timestamp tracing for samples correlation.
|
||||
*/
|
||||
if (!perf_cpu_map__empty(cpus)) {
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
|
||||
evsel__set_sample_bit(arm_spe_evsel, CPU);
|
||||
evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
|
||||
"ts_enable", 1);
|
||||
@ -265,7 +265,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
|
||||
tracking_evsel->core.attr.sample_period = 1;
|
||||
|
||||
/* In per-cpu case, always need the time of mmap events etc */
|
||||
if (!perf_cpu_map__empty(cpus)) {
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
|
||||
evsel__set_sample_bit(tracking_evsel, TIME);
|
||||
evsel__set_sample_bit(tracking_evsel, CPU);
|
||||
|
||||
|
@ -57,7 +57,7 @@ static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus)
|
||||
|
||||
int get_cpuid(char *buf, size_t sz)
|
||||
{
|
||||
struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
|
||||
struct perf_cpu_map *cpus = perf_cpu_map__new_online_cpus();
|
||||
int ret;
|
||||
|
||||
if (!cpus)
|
||||
|
@ -61,10 +61,10 @@ static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, st
|
||||
const char *c = strchr(ops->raw, '#');
|
||||
u64 start, end;
|
||||
|
||||
ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char);
|
||||
ops->raw_func_start = strchr(ops->raw, '<');
|
||||
ops->jump.raw_comment = strchr(ops->raw, arch->objdump.comment_char);
|
||||
ops->jump.raw_func_start = strchr(ops->raw, '<');
|
||||
|
||||
if (ops->raw_func_start && c > ops->raw_func_start)
|
||||
if (ops->jump.raw_func_start && c > ops->jump.raw_func_start)
|
||||
c = NULL;
|
||||
|
||||
if (c++ != NULL)
|
||||
|
@ -47,7 +47,7 @@ static int test__hybrid_hw_group_event(struct evlist *evlist)
|
||||
evsel = evsel__next(evsel);
|
||||
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
|
||||
TEST_ASSERT_VAL("wrong hybrid type", test_hybrid_type(evsel, PERF_TYPE_RAW));
|
||||
TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS));
|
||||
TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_INSTRUCTIONS));
|
||||
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
|
||||
return TEST_OK;
|
||||
}
|
||||
@ -102,7 +102,7 @@ static int test__hybrid_group_modifier1(struct evlist *evlist)
|
||||
evsel = evsel__next(evsel);
|
||||
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
|
||||
TEST_ASSERT_VAL("wrong hybrid type", test_hybrid_type(evsel, PERF_TYPE_RAW));
|
||||
TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS));
|
||||
TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_INSTRUCTIONS));
|
||||
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
|
||||
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
|
||||
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
|
||||
@ -163,6 +163,24 @@ static int test__checkevent_pmu(struct evlist *evlist)
|
||||
return TEST_OK;
|
||||
}
|
||||
|
||||
static int test__hybrid_hw_group_event_2(struct evlist *evlist)
|
||||
{
|
||||
struct evsel *evsel, *leader;
|
||||
|
||||
evsel = leader = evlist__first(evlist);
|
||||
TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
|
||||
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
|
||||
TEST_ASSERT_VAL("wrong hybrid type", test_hybrid_type(evsel, PERF_TYPE_RAW));
|
||||
TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES));
|
||||
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
|
||||
|
||||
evsel = evsel__next(evsel);
|
||||
TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
|
||||
TEST_ASSERT_VAL("wrong config", evsel->core.attr.config == 0x3c);
|
||||
TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
|
||||
return TEST_OK;
|
||||
}
|
||||
|
||||
struct evlist_test {
|
||||
const char *name;
|
||||
bool (*valid)(void);
|
||||
@ -171,27 +189,27 @@ struct evlist_test {
|
||||
|
||||
static const struct evlist_test test__hybrid_events[] = {
|
||||
{
|
||||
.name = "cpu_core/cpu-cycles/",
|
||||
.name = "cpu_core/cycles/",
|
||||
.check = test__hybrid_hw_event_with_pmu,
|
||||
/* 0 */
|
||||
},
|
||||
{
|
||||
.name = "{cpu_core/cpu-cycles/,cpu_core/instructions/}",
|
||||
.name = "{cpu_core/cycles/,cpu_core/branches/}",
|
||||
.check = test__hybrid_hw_group_event,
|
||||
/* 1 */
|
||||
},
|
||||
{
|
||||
.name = "{cpu-clock,cpu_core/cpu-cycles/}",
|
||||
.name = "{cpu-clock,cpu_core/cycles/}",
|
||||
.check = test__hybrid_sw_hw_group_event,
|
||||
/* 2 */
|
||||
},
|
||||
{
|
||||
.name = "{cpu_core/cpu-cycles/,cpu-clock}",
|
||||
.name = "{cpu_core/cycles/,cpu-clock}",
|
||||
.check = test__hybrid_hw_sw_group_event,
|
||||
/* 3 */
|
||||
},
|
||||
{
|
||||
.name = "{cpu_core/cpu-cycles/k,cpu_core/instructions/u}",
|
||||
.name = "{cpu_core/cycles/k,cpu_core/branches/u}",
|
||||
.check = test__hybrid_group_modifier1,
|
||||
/* 4 */
|
||||
},
|
||||
@ -215,6 +233,11 @@ static const struct evlist_test test__hybrid_events[] = {
|
||||
.check = test__hybrid_cache_event,
|
||||
/* 8 */
|
||||
},
|
||||
{
|
||||
.name = "{cpu_core/cycles/,cpu_core/cpu-cycles/}",
|
||||
.check = test__hybrid_hw_group_event_2,
|
||||
/* 9 */
|
||||
},
|
||||
};
|
||||
|
||||
static int test_event(const struct evlist_test *e)
|
||||
|
@ -113,3 +113,41 @@ int regs_query_register_offset(const char *name)
|
||||
return roff->offset;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
struct dwarf_regs_idx {
|
||||
const char *name;
|
||||
int idx;
|
||||
};
|
||||
|
||||
static const struct dwarf_regs_idx x86_regidx_table[] = {
|
||||
{ "rax", 0 }, { "eax", 0 }, { "ax", 0 }, { "al", 0 },
|
||||
{ "rdx", 1 }, { "edx", 1 }, { "dx", 1 }, { "dl", 1 },
|
||||
{ "rcx", 2 }, { "ecx", 2 }, { "cx", 2 }, { "cl", 2 },
|
||||
{ "rbx", 3 }, { "edx", 3 }, { "bx", 3 }, { "bl", 3 },
|
||||
{ "rsi", 4 }, { "esi", 4 }, { "si", 4 }, { "sil", 4 },
|
||||
{ "rdi", 5 }, { "edi", 5 }, { "di", 5 }, { "dil", 5 },
|
||||
{ "rbp", 6 }, { "ebp", 6 }, { "bp", 6 }, { "bpl", 6 },
|
||||
{ "rsp", 7 }, { "esp", 7 }, { "sp", 7 }, { "spl", 7 },
|
||||
{ "r8", 8 }, { "r8d", 8 }, { "r8w", 8 }, { "r8b", 8 },
|
||||
{ "r9", 9 }, { "r9d", 9 }, { "r9w", 9 }, { "r9b", 9 },
|
||||
{ "r10", 10 }, { "r10d", 10 }, { "r10w", 10 }, { "r10b", 10 },
|
||||
{ "r11", 11 }, { "r11d", 11 }, { "r11w", 11 }, { "r11b", 11 },
|
||||
{ "r12", 12 }, { "r12d", 12 }, { "r12w", 12 }, { "r12b", 12 },
|
||||
{ "r13", 13 }, { "r13d", 13 }, { "r13w", 13 }, { "r13b", 13 },
|
||||
{ "r14", 14 }, { "r14d", 14 }, { "r14w", 14 }, { "r14b", 14 },
|
||||
{ "r15", 15 }, { "r15d", 15 }, { "r15w", 15 }, { "r15b", 15 },
|
||||
{ "rip", DWARF_REG_PC },
|
||||
};
|
||||
|
||||
int get_arch_regnum(const char *name)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
if (*name != '%')
|
||||
return -EINVAL;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(x86_regidx_table); i++)
|
||||
if (!strcmp(x86_regidx_table[i].name, name + 1))
|
||||
return x86_regidx_table[i].idx;
|
||||
return -ENOENT;
|
||||
}
|
||||
|
@ -14,66 +14,79 @@
|
||||
|
||||
#if defined(__x86_64__)
|
||||
|
||||
struct perf_event__synthesize_extra_kmaps_cb_args {
|
||||
struct perf_tool *tool;
|
||||
perf_event__handler_t process;
|
||||
struct machine *machine;
|
||||
union perf_event *event;
|
||||
};
|
||||
|
||||
static int perf_event__synthesize_extra_kmaps_cb(struct map *map, void *data)
|
||||
{
|
||||
struct perf_event__synthesize_extra_kmaps_cb_args *args = data;
|
||||
union perf_event *event = args->event;
|
||||
struct kmap *kmap;
|
||||
size_t size;
|
||||
|
||||
if (!__map__is_extra_kernel_map(map))
|
||||
return 0;
|
||||
|
||||
kmap = map__kmap(map);
|
||||
|
||||
size = sizeof(event->mmap) - sizeof(event->mmap.filename) +
|
||||
PERF_ALIGN(strlen(kmap->name) + 1, sizeof(u64)) +
|
||||
args->machine->id_hdr_size;
|
||||
|
||||
memset(event, 0, size);
|
||||
|
||||
event->mmap.header.type = PERF_RECORD_MMAP;
|
||||
|
||||
/*
|
||||
* kernel uses 0 for user space maps, see kernel/perf_event.c
|
||||
* __perf_event_mmap
|
||||
*/
|
||||
if (machine__is_host(args->machine))
|
||||
event->header.misc = PERF_RECORD_MISC_KERNEL;
|
||||
else
|
||||
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
|
||||
|
||||
event->mmap.header.size = size;
|
||||
|
||||
event->mmap.start = map__start(map);
|
||||
event->mmap.len = map__size(map);
|
||||
event->mmap.pgoff = map__pgoff(map);
|
||||
event->mmap.pid = args->machine->pid;
|
||||
|
||||
strlcpy(event->mmap.filename, kmap->name, PATH_MAX);
|
||||
|
||||
if (perf_tool__process_synth_event(args->tool, event, args->machine, args->process) != 0)
|
||||
return -1;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
|
||||
perf_event__handler_t process,
|
||||
struct machine *machine)
|
||||
{
|
||||
int rc = 0;
|
||||
struct map_rb_node *pos;
|
||||
int rc;
|
||||
struct maps *kmaps = machine__kernel_maps(machine);
|
||||
union perf_event *event = zalloc(sizeof(event->mmap) +
|
||||
machine->id_hdr_size);
|
||||
struct perf_event__synthesize_extra_kmaps_cb_args args = {
|
||||
.tool = tool,
|
||||
.process = process,
|
||||
.machine = machine,
|
||||
.event = zalloc(sizeof(args.event->mmap) + machine->id_hdr_size),
|
||||
};
|
||||
|
||||
if (!event) {
|
||||
if (!args.event) {
|
||||
pr_debug("Not enough memory synthesizing mmap event "
|
||||
"for extra kernel maps\n");
|
||||
return -1;
|
||||
}
|
||||
|
||||
maps__for_each_entry(kmaps, pos) {
|
||||
struct kmap *kmap;
|
||||
size_t size;
|
||||
struct map *map = pos->map;
|
||||
rc = maps__for_each_map(kmaps, perf_event__synthesize_extra_kmaps_cb, &args);
|
||||
|
||||
if (!__map__is_extra_kernel_map(map))
|
||||
continue;
|
||||
|
||||
kmap = map__kmap(map);
|
||||
|
||||
size = sizeof(event->mmap) - sizeof(event->mmap.filename) +
|
||||
PERF_ALIGN(strlen(kmap->name) + 1, sizeof(u64)) +
|
||||
machine->id_hdr_size;
|
||||
|
||||
memset(event, 0, size);
|
||||
|
||||
event->mmap.header.type = PERF_RECORD_MMAP;
|
||||
|
||||
/*
|
||||
* kernel uses 0 for user space maps, see kernel/perf_event.c
|
||||
* __perf_event_mmap
|
||||
*/
|
||||
if (machine__is_host(machine))
|
||||
event->header.misc = PERF_RECORD_MISC_KERNEL;
|
||||
else
|
||||
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
|
||||
|
||||
event->mmap.header.size = size;
|
||||
|
||||
event->mmap.start = map__start(map);
|
||||
event->mmap.len = map__size(map);
|
||||
event->mmap.pgoff = map__pgoff(map);
|
||||
event->mmap.pid = machine->pid;
|
||||
|
||||
strlcpy(event->mmap.filename, kmap->name, PATH_MAX);
|
||||
|
||||
if (perf_tool__process_synth_event(tool, event, machine,
|
||||
process) != 0) {
|
||||
rc = -1;
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
free(event);
|
||||
free(args.event);
|
||||
return rc;
|
||||
}
|
||||
|
||||
|
@ -143,7 +143,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
|
||||
if (!opts->full_auxtrace)
|
||||
return 0;
|
||||
|
||||
if (opts->full_auxtrace && !perf_cpu_map__empty(cpus)) {
|
||||
if (opts->full_auxtrace && !perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
|
||||
pr_err(INTEL_BTS_PMU_NAME " does not support per-cpu recording\n");
|
||||
return -EINVAL;
|
||||
}
|
||||
@ -224,7 +224,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr,
|
||||
* In the case of per-cpu mmaps, we need the CPU on the
|
||||
* AUX event.
|
||||
*/
|
||||
if (!perf_cpu_map__empty(cpus))
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus))
|
||||
evsel__set_sample_bit(intel_bts_evsel, CPU);
|
||||
}
|
||||
|
||||
|
@ -369,7 +369,7 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
|
||||
ui__warning("Intel Processor Trace: TSC not available\n");
|
||||
}
|
||||
|
||||
per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.user_requested_cpus);
|
||||
per_cpu_mmaps = !perf_cpu_map__has_any_cpu_or_is_empty(session->evlist->core.user_requested_cpus);
|
||||
|
||||
auxtrace_info->type = PERF_AUXTRACE_INTEL_PT;
|
||||
auxtrace_info->priv[INTEL_PT_PMU_TYPE] = intel_pt_pmu->type;
|
||||
@ -774,7 +774,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
|
||||
* Per-cpu recording needs sched_switch events to distinguish different
|
||||
* threads.
|
||||
*/
|
||||
if (have_timing_info && !perf_cpu_map__empty(cpus) &&
|
||||
if (have_timing_info && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) &&
|
||||
!record_opts__no_switch_events(opts)) {
|
||||
if (perf_can_record_switch_events()) {
|
||||
bool cpu_wide = !target__none(&opts->target) &&
|
||||
@ -832,7 +832,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
|
||||
* In the case of per-cpu mmaps, we need the CPU on the
|
||||
* AUX event.
|
||||
*/
|
||||
if (!perf_cpu_map__empty(cpus))
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus))
|
||||
evsel__set_sample_bit(intel_pt_evsel, CPU);
|
||||
}
|
||||
|
||||
@ -858,7 +858,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
|
||||
tracking_evsel->immediate = true;
|
||||
|
||||
/* In per-cpu case, always need the time of mmap events etc */
|
||||
if (!perf_cpu_map__empty(cpus)) {
|
||||
if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) {
|
||||
evsel__set_sample_bit(tracking_evsel, TIME);
|
||||
/* And the CPU for switch events */
|
||||
evsel__set_sample_bit(tracking_evsel, CPU);
|
||||
@ -870,7 +870,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
|
||||
* Warn the user when we do not have enough information to decode i.e.
|
||||
* per-cpu with no sched_switch (except workload-only).
|
||||
*/
|
||||
if (!ptr->have_sched_switch && !perf_cpu_map__empty(cpus) &&
|
||||
if (!ptr->have_sched_switch && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) &&
|
||||
!target__none(&opts->target) &&
|
||||
!intel_pt_evsel->core.attr.exclude_user)
|
||||
ui__warning("Intel Processor Trace decoding will not be possible except for kernel tracing!\n");
|
||||
|
@ -330,7 +330,7 @@ int bench_epoll_ctl(int argc, const char **argv)
|
||||
act.sa_sigaction = toggle_done;
|
||||
sigaction(SIGINT, &act, NULL);
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
goto errmem;
|
||||
|
||||
|
@ -444,7 +444,7 @@ int bench_epoll_wait(int argc, const char **argv)
|
||||
act.sa_sigaction = toggle_done;
|
||||
sigaction(SIGINT, &act, NULL);
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
goto errmem;
|
||||
|
||||
|
@ -138,7 +138,7 @@ int bench_futex_hash(int argc, const char **argv)
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
goto errmem;
|
||||
|
||||
|
@ -172,7 +172,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
|
||||
if (argc)
|
||||
goto err;
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
err(EXIT_FAILURE, "calloc");
|
||||
|
||||
|
@ -174,7 +174,7 @@ int bench_futex_requeue(int argc, const char **argv)
|
||||
if (argc)
|
||||
goto err;
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
err(EXIT_FAILURE, "cpu_map__new");
|
||||
|
||||
|
@ -264,7 +264,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
|
||||
err(EXIT_FAILURE, "mlockall");
|
||||
}
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
err(EXIT_FAILURE, "calloc");
|
||||
|
||||
|
@ -149,7 +149,7 @@ int bench_futex_wake(int argc, const char **argv)
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
cpu = perf_cpu_map__new(NULL);
|
||||
cpu = perf_cpu_map__new_online_cpus();
|
||||
if (!cpu)
|
||||
err(EXIT_FAILURE, "calloc");
|
||||
|
||||
|
@ -32,7 +32,7 @@ static bool sync_mode;
|
||||
static const struct option options[] = {
|
||||
OPT_U64('l', "loop", &loops, "Specify number of loops"),
|
||||
OPT_BOOLEAN('s', "sync-mode", &sync_mode,
|
||||
"Enable the synchronious mode for seccomp notifications"),
|
||||
"Enable the synchronous mode for seccomp notifications"),
|
||||
OPT_END()
|
||||
};
|
||||
|
||||
|
@ -20,6 +20,7 @@
|
||||
#include "util/evlist.h"
|
||||
#include "util/evsel.h"
|
||||
#include "util/annotate.h"
|
||||
#include "util/annotate-data.h"
|
||||
#include "util/event.h"
|
||||
#include <subcmd/parse-options.h>
|
||||
#include "util/parse-events.h"
|
||||
@ -45,7 +46,6 @@
|
||||
struct perf_annotate {
|
||||
struct perf_tool tool;
|
||||
struct perf_session *session;
|
||||
struct annotation_options opts;
|
||||
#ifdef HAVE_SLANG_SUPPORT
|
||||
bool use_tui;
|
||||
#endif
|
||||
@ -56,9 +56,13 @@ struct perf_annotate {
|
||||
bool skip_missing;
|
||||
bool has_br_stack;
|
||||
bool group_set;
|
||||
bool data_type;
|
||||
bool type_stat;
|
||||
bool insn_stat;
|
||||
float min_percent;
|
||||
const char *sym_hist_filter;
|
||||
const char *cpu_list;
|
||||
const char *target_data_type;
|
||||
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
|
||||
};
|
||||
|
||||
@ -94,6 +98,7 @@ static void process_basic_block(struct addr_map_symbol *start,
|
||||
struct annotation *notes = sym ? symbol__annotation(sym) : NULL;
|
||||
struct block_range_iter iter;
|
||||
struct block_range *entry;
|
||||
struct annotated_branch *branch;
|
||||
|
||||
/*
|
||||
* Sanity; NULL isn't executable and the CPU cannot execute backwards
|
||||
@ -105,6 +110,8 @@ static void process_basic_block(struct addr_map_symbol *start,
|
||||
if (!block_range_iter__valid(&iter))
|
||||
return;
|
||||
|
||||
branch = annotation__get_branch(notes);
|
||||
|
||||
/*
|
||||
* First block in range is a branch target.
|
||||
*/
|
||||
@ -118,8 +125,8 @@ static void process_basic_block(struct addr_map_symbol *start,
|
||||
entry->coverage++;
|
||||
entry->sym = sym;
|
||||
|
||||
if (notes)
|
||||
notes->max_coverage = max(notes->max_coverage, entry->coverage);
|
||||
if (branch)
|
||||
branch->max_coverage = max(branch->max_coverage, entry->coverage);
|
||||
|
||||
} while (block_range_iter__next(&iter));
|
||||
|
||||
@ -315,9 +322,153 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
|
||||
struct perf_annotate *ann)
|
||||
{
|
||||
if (!ann->use_stdio2)
|
||||
return symbol__tty_annotate(&he->ms, evsel, &ann->opts);
|
||||
return symbol__tty_annotate(&he->ms, evsel);
|
||||
|
||||
return symbol__tty_annotate2(&he->ms, evsel, &ann->opts);
|
||||
return symbol__tty_annotate2(&he->ms, evsel);
|
||||
}
|
||||
|
||||
static void print_annotated_data_header(struct hist_entry *he, struct evsel *evsel)
|
||||
{
|
||||
struct dso *dso = map__dso(he->ms.map);
|
||||
int nr_members = 1;
|
||||
int nr_samples = he->stat.nr_events;
|
||||
|
||||
if (evsel__is_group_event(evsel)) {
|
||||
struct hist_entry *pair;
|
||||
|
||||
list_for_each_entry(pair, &he->pairs.head, pairs.node)
|
||||
nr_samples += pair->stat.nr_events;
|
||||
}
|
||||
|
||||
printf("Annotate type: '%s' in %s (%d samples):\n",
|
||||
he->mem_type->self.type_name, dso->name, nr_samples);
|
||||
|
||||
if (evsel__is_group_event(evsel)) {
|
||||
struct evsel *pos;
|
||||
int i = 0;
|
||||
|
||||
for_each_group_evsel(pos, evsel)
|
||||
printf(" event[%d] = %s\n", i++, pos->name);
|
||||
|
||||
nr_members = evsel->core.nr_members;
|
||||
}
|
||||
|
||||
printf("============================================================================\n");
|
||||
printf("%*s %10s %10s %s\n", 11 * nr_members, "samples", "offset", "size", "field");
|
||||
}
|
||||
|
||||
static void print_annotated_data_type(struct annotated_data_type *mem_type,
|
||||
struct annotated_member *member,
|
||||
struct evsel *evsel, int indent)
|
||||
{
|
||||
struct annotated_member *child;
|
||||
struct type_hist *h = mem_type->histograms[evsel->core.idx];
|
||||
int i, nr_events = 1, samples = 0;
|
||||
|
||||
for (i = 0; i < member->size; i++)
|
||||
samples += h->addr[member->offset + i].nr_samples;
|
||||
printf(" %10d", samples);
|
||||
|
||||
if (evsel__is_group_event(evsel)) {
|
||||
struct evsel *pos;
|
||||
|
||||
for_each_group_member(pos, evsel) {
|
||||
h = mem_type->histograms[pos->core.idx];
|
||||
|
||||
samples = 0;
|
||||
for (i = 0; i < member->size; i++)
|
||||
samples += h->addr[member->offset + i].nr_samples;
|
||||
printf(" %10d", samples);
|
||||
}
|
||||
nr_events = evsel->core.nr_members;
|
||||
}
|
||||
|
||||
printf(" %10d %10d %*s%s\t%s",
|
||||
member->offset, member->size, indent, "", member->type_name,
|
||||
member->var_name ?: "");
|
||||
|
||||
if (!list_empty(&member->children))
|
||||
printf(" {\n");
|
||||
|
||||
list_for_each_entry(child, &member->children, node)
|
||||
print_annotated_data_type(mem_type, child, evsel, indent + 4);
|
||||
|
||||
if (!list_empty(&member->children))
|
||||
printf("%*s}", 11 * nr_events + 24 + indent, "");
|
||||
printf(";\n");
|
||||
}
|
||||
|
||||
static void print_annotate_data_stat(struct annotated_data_stat *s)
|
||||
{
|
||||
#define PRINT_STAT(fld) if (s->fld) printf("%10d : %s\n", s->fld, #fld)
|
||||
|
||||
int bad = s->no_sym +
|
||||
s->no_insn +
|
||||
s->no_insn_ops +
|
||||
s->no_mem_ops +
|
||||
s->no_reg +
|
||||
s->no_dbginfo +
|
||||
s->no_cuinfo +
|
||||
s->no_var +
|
||||
s->no_typeinfo +
|
||||
s->invalid_size +
|
||||
s->bad_offset;
|
||||
int ok = s->total - bad;
|
||||
|
||||
printf("Annotate data type stats:\n");
|
||||
printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n",
|
||||
s->total, ok, 100.0 * ok / (s->total ?: 1), bad, 100.0 * bad / (s->total ?: 1));
|
||||
printf("-----------------------------------------------------------\n");
|
||||
PRINT_STAT(no_sym);
|
||||
PRINT_STAT(no_insn);
|
||||
PRINT_STAT(no_insn_ops);
|
||||
PRINT_STAT(no_mem_ops);
|
||||
PRINT_STAT(no_reg);
|
||||
PRINT_STAT(no_dbginfo);
|
||||
PRINT_STAT(no_cuinfo);
|
||||
PRINT_STAT(no_var);
|
||||
PRINT_STAT(no_typeinfo);
|
||||
PRINT_STAT(invalid_size);
|
||||
PRINT_STAT(bad_offset);
|
||||
printf("\n");
|
||||
|
||||
#undef PRINT_STAT
|
||||
}
|
||||
|
||||
static void print_annotate_item_stat(struct list_head *head, const char *title)
|
||||
{
|
||||
struct annotated_item_stat *istat, *pos, *iter;
|
||||
int total_good, total_bad, total;
|
||||
int sum1, sum2;
|
||||
LIST_HEAD(tmp);
|
||||
|
||||
/* sort the list by count */
|
||||
list_splice_init(head, &tmp);
|
||||
total_good = total_bad = 0;
|
||||
|
||||
list_for_each_entry_safe(istat, pos, &tmp, list) {
|
||||
total_good += istat->good;
|
||||
total_bad += istat->bad;
|
||||
sum1 = istat->good + istat->bad;
|
||||
|
||||
list_for_each_entry(iter, head, list) {
|
||||
sum2 = iter->good + iter->bad;
|
||||
if (sum1 > sum2)
|
||||
break;
|
||||
}
|
||||
list_move_tail(&istat->list, &iter->list);
|
||||
}
|
||||
total = total_good + total_bad;
|
||||
|
||||
printf("Annotate %s stats\n", title);
|
||||
printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
|
||||
total_good, 100.0 * total_good / (total ?: 1),
|
||||
total_bad, 100.0 * total_bad / (total ?: 1));
|
||||
printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
|
||||
printf("-----------------------------------------------------------\n");
|
||||
list_for_each_entry(istat, head, list)
|
||||
printf(" %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
|
||||
printf("\n");
|
||||
}
|
||||
|
||||
static void hists__find_annotations(struct hists *hists,
|
||||
@ -327,6 +478,11 @@ static void hists__find_annotations(struct hists *hists,
|
||||
struct rb_node *nd = rb_first_cached(&hists->entries), *next;
|
||||
int key = K_RIGHT;
|
||||
|
||||
if (ann->type_stat)
|
||||
print_annotate_data_stat(&ann_data_stat);
|
||||
if (ann->insn_stat)
|
||||
print_annotate_item_stat(&ann_insn_stat, "Instruction");
|
||||
|
||||
while (nd) {
|
||||
struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
|
||||
struct annotation *notes;
|
||||
@ -359,11 +515,38 @@ static void hists__find_annotations(struct hists *hists,
|
||||
continue;
|
||||
}
|
||||
|
||||
if (ann->data_type) {
|
||||
/* skip unknown type */
|
||||
if (he->mem_type->histograms == NULL)
|
||||
goto find_next;
|
||||
|
||||
if (ann->target_data_type) {
|
||||
const char *type_name = he->mem_type->self.type_name;
|
||||
|
||||
/* skip 'struct ' prefix in the type name */
|
||||
if (strncmp(ann->target_data_type, "struct ", 7) &&
|
||||
!strncmp(type_name, "struct ", 7))
|
||||
type_name += 7;
|
||||
|
||||
/* skip 'union ' prefix in the type name */
|
||||
if (strncmp(ann->target_data_type, "union ", 6) &&
|
||||
!strncmp(type_name, "union ", 6))
|
||||
type_name += 6;
|
||||
|
||||
if (strcmp(ann->target_data_type, type_name))
|
||||
goto find_next;
|
||||
}
|
||||
|
||||
print_annotated_data_header(he, evsel);
|
||||
print_annotated_data_type(he->mem_type, &he->mem_type->self, evsel, 0);
|
||||
printf("\n");
|
||||
goto find_next;
|
||||
}
|
||||
|
||||
if (use_browser == 2) {
|
||||
int ret;
|
||||
int (*annotate)(struct hist_entry *he,
|
||||
struct evsel *evsel,
|
||||
struct annotation_options *options,
|
||||
struct hist_browser_timer *hbt);
|
||||
|
||||
annotate = dlsym(perf_gtk_handle,
|
||||
@ -373,14 +556,14 @@ static void hists__find_annotations(struct hists *hists,
|
||||
return;
|
||||
}
|
||||
|
||||
ret = annotate(he, evsel, &ann->opts, NULL);
|
||||
ret = annotate(he, evsel, NULL);
|
||||
if (!ret || !ann->skip_missing)
|
||||
return;
|
||||
|
||||
/* skip missing symbols */
|
||||
nd = rb_next(nd);
|
||||
} else if (use_browser == 1) {
|
||||
key = hist_entry__tui_annotate(he, evsel, NULL, &ann->opts);
|
||||
key = hist_entry__tui_annotate(he, evsel, NULL);
|
||||
|
||||
switch (key) {
|
||||
case -1:
|
||||
@ -422,9 +605,9 @@ static int __cmd_annotate(struct perf_annotate *ann)
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!ann->opts.objdump_path) {
|
||||
if (!annotate_opts.objdump_path) {
|
||||
ret = perf_env__lookup_objdump(&session->header.env,
|
||||
&ann->opts.objdump_path);
|
||||
&annotate_opts.objdump_path);
|
||||
if (ret)
|
||||
goto out;
|
||||
}
|
||||
@ -457,8 +640,20 @@ static int __cmd_annotate(struct perf_annotate *ann)
|
||||
evsel__reset_sample_bit(pos, CALLCHAIN);
|
||||
evsel__output_resort(pos, NULL);
|
||||
|
||||
if (symbol_conf.event_group && !evsel__is_group_leader(pos))
|
||||
/*
|
||||
* An event group needs to display other events too.
|
||||
* Let's delay printing until other events are processed.
|
||||
*/
|
||||
if (symbol_conf.event_group) {
|
||||
if (!evsel__is_group_leader(pos)) {
|
||||
struct hists *leader_hists;
|
||||
|
||||
leader_hists = evsel__hists(evsel__leader(pos));
|
||||
hists__match(leader_hists, hists);
|
||||
hists__link(leader_hists, hists);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
hists__find_annotations(hists, pos, ann);
|
||||
}
|
||||
@ -469,6 +664,20 @@ static int __cmd_annotate(struct perf_annotate *ann)
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* Display group events together */
|
||||
evlist__for_each_entry(session->evlist, pos) {
|
||||
struct hists *hists = evsel__hists(pos);
|
||||
u32 nr_samples = hists->stats.nr_samples;
|
||||
|
||||
if (nr_samples == 0)
|
||||
continue;
|
||||
|
||||
if (!symbol_conf.event_group || !evsel__is_group_leader(pos))
|
||||
continue;
|
||||
|
||||
hists__find_annotations(hists, pos, ann);
|
||||
}
|
||||
|
||||
if (use_browser == 2) {
|
||||
void (*show_annotations)(void);
|
||||
|
||||
@ -495,6 +704,17 @@ static int parse_percent_limit(const struct option *opt, const char *str,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int parse_data_type(const struct option *opt, const char *str, int unset)
|
||||
{
|
||||
struct perf_annotate *ann = opt->value;
|
||||
|
||||
ann->data_type = !unset;
|
||||
if (str)
|
||||
ann->target_data_type = strdup(str);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static const char * const annotate_usage[] = {
|
||||
"perf annotate [<options>]",
|
||||
NULL
|
||||
@ -558,9 +778,9 @@ int cmd_annotate(int argc, const char **argv)
|
||||
"file", "vmlinux pathname"),
|
||||
OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules,
|
||||
"load module symbols - WARNING: use only with -k and LIVE kernel"),
|
||||
OPT_BOOLEAN('l', "print-line", &annotate.opts.print_lines,
|
||||
OPT_BOOLEAN('l', "print-line", &annotate_opts.print_lines,
|
||||
"print matching source lines (may be slow)"),
|
||||
OPT_BOOLEAN('P', "full-paths", &annotate.opts.full_path,
|
||||
OPT_BOOLEAN('P', "full-paths", &annotate_opts.full_path,
|
||||
"Don't shorten the displayed pathnames"),
|
||||
OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing,
|
||||
"Skip symbols that cannot be annotated"),
|
||||
@ -571,15 +791,15 @@ int cmd_annotate(int argc, const char **argv)
|
||||
OPT_CALLBACK(0, "symfs", NULL, "directory",
|
||||
"Look for files with symbols relative to this directory",
|
||||
symbol__config_symfs),
|
||||
OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src,
|
||||
OPT_BOOLEAN(0, "source", &annotate_opts.annotate_src,
|
||||
"Interleave source code with assembly code (default)"),
|
||||
OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
|
||||
OPT_BOOLEAN(0, "asm-raw", &annotate_opts.show_asm_raw,
|
||||
"Display raw encoding of assembly instructions (default)"),
|
||||
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
|
||||
"Specify disassembler style (e.g. -M intel for intel syntax)"),
|
||||
OPT_STRING(0, "prefix", &annotate.opts.prefix, "prefix",
|
||||
OPT_STRING(0, "prefix", &annotate_opts.prefix, "prefix",
|
||||
"Add prefix to source file path names in programs (with --prefix-strip)"),
|
||||
OPT_STRING(0, "prefix-strip", &annotate.opts.prefix_strip, "N",
|
||||
OPT_STRING(0, "prefix-strip", &annotate_opts.prefix_strip, "N",
|
||||
"Strip first N entries of source file path name in programs (with --prefix)"),
|
||||
OPT_STRING(0, "objdump", &objdump_path, "path",
|
||||
"objdump binary to use for disassembly and annotations"),
|
||||
@ -598,7 +818,7 @@ int cmd_annotate(int argc, const char **argv)
|
||||
OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
|
||||
"'always' (default), 'never' or 'auto' only applicable to --stdio mode",
|
||||
stdio__config_color, "always"),
|
||||
OPT_CALLBACK(0, "percent-type", &annotate.opts, "local-period",
|
||||
OPT_CALLBACK(0, "percent-type", &annotate_opts, "local-period",
|
||||
"Set percent type local/global-period/hits",
|
||||
annotate_parse_percent_type),
|
||||
OPT_CALLBACK(0, "percent-limit", &annotate, "percent",
|
||||
@ -606,7 +826,13 @@ int cmd_annotate(int argc, const char **argv)
|
||||
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
|
||||
"Instruction Tracing options\n" ITRACE_HELP,
|
||||
itrace_parse_synth_opts),
|
||||
|
||||
OPT_CALLBACK_OPTARG(0, "data-type", &annotate, NULL, "name",
|
||||
"Show data type annotate for the memory accesses",
|
||||
parse_data_type),
|
||||
OPT_BOOLEAN(0, "type-stat", &annotate.type_stat,
|
||||
"Show stats for the data type annotation"),
|
||||
OPT_BOOLEAN(0, "insn-stat", &annotate.insn_stat,
|
||||
"Show instruction stats for the data type annotation"),
|
||||
OPT_END()
|
||||
};
|
||||
int ret;
|
||||
@ -614,13 +840,13 @@ int cmd_annotate(int argc, const char **argv)
|
||||
set_option_flag(options, 0, "show-total-period", PARSE_OPT_EXCLUSIVE);
|
||||
set_option_flag(options, 0, "show-nr-samples", PARSE_OPT_EXCLUSIVE);
|
||||
|
||||
annotation_options__init(&annotate.opts);
|
||||
annotation_options__init();
|
||||
|
||||
ret = hists__init();
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
annotation_config__init(&annotate.opts);
|
||||
annotation_config__init();
|
||||
|
||||
argc = parse_options(argc, argv, options, annotate_usage, 0);
|
||||
if (argc) {
|
||||
@ -635,13 +861,13 @@ int cmd_annotate(int argc, const char **argv)
|
||||
}
|
||||
|
||||
if (disassembler_style) {
|
||||
annotate.opts.disassembler_style = strdup(disassembler_style);
|
||||
if (!annotate.opts.disassembler_style)
|
||||
annotate_opts.disassembler_style = strdup(disassembler_style);
|
||||
if (!annotate_opts.disassembler_style)
|
||||
return -ENOMEM;
|
||||
}
|
||||
if (objdump_path) {
|
||||
annotate.opts.objdump_path = strdup(objdump_path);
|
||||
if (!annotate.opts.objdump_path)
|
||||
annotate_opts.objdump_path = strdup(objdump_path);
|
||||
if (!annotate_opts.objdump_path)
|
||||
return -ENOMEM;
|
||||
}
|
||||
if (addr2line_path) {
|
||||
@ -650,7 +876,7 @@ int cmd_annotate(int argc, const char **argv)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
if (annotate_check_args(&annotate.opts) < 0)
|
||||
if (annotate_check_args() < 0)
|
||||
return -EINVAL;
|
||||
|
||||
#ifdef HAVE_GTK2_SUPPORT
|
||||
@ -660,6 +886,13 @@ int cmd_annotate(int argc, const char **argv)
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifndef HAVE_DWARF_GETLOCATIONS_SUPPORT
|
||||
if (annotate.data_type) {
|
||||
pr_err("Error: Data type profiling is disabled due to missing DWARF support\n");
|
||||
return -ENOTSUP;
|
||||
}
|
||||
#endif
|
||||
|
||||
ret = symbol__validate_sym_arguments();
|
||||
if (ret)
|
||||
return ret;
|
||||
@ -702,6 +935,14 @@ int cmd_annotate(int argc, const char **argv)
|
||||
use_browser = 2;
|
||||
#endif
|
||||
|
||||
/* FIXME: only support stdio for now */
|
||||
if (annotate.data_type) {
|
||||
use_browser = 0;
|
||||
annotate_opts.annotate_src = false;
|
||||
symbol_conf.annotate_data_member = true;
|
||||
symbol_conf.annotate_data_sample = true;
|
||||
}
|
||||
|
||||
setup_browser(true);
|
||||
|
||||
/*
|
||||
@ -709,7 +950,10 @@ int cmd_annotate(int argc, const char **argv)
|
||||
* symbol, we do not care about the processes in annotate,
|
||||
* set sort order to avoid repeated output.
|
||||
*/
|
||||
sort_order = "dso,symbol";
|
||||
if (annotate.data_type)
|
||||
sort_order = "dso,type";
|
||||
else
|
||||
sort_order = "dso,symbol";
|
||||
|
||||
/*
|
||||
* Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
|
||||
@ -731,7 +975,7 @@ int cmd_annotate(int argc, const char **argv)
|
||||
#ifndef NDEBUG
|
||||
perf_session__delete(annotate.session);
|
||||
#endif
|
||||
annotation_options__exit(&annotate.opts);
|
||||
annotation_options__exit();
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
@ -2320,7 +2320,7 @@ static int setup_nodes(struct perf_session *session)
|
||||
nodes[node] = set;
|
||||
|
||||
/* empty node, skip */
|
||||
if (perf_cpu_map__empty(map))
|
||||
if (perf_cpu_map__has_any_cpu_or_is_empty(map))
|
||||
continue;
|
||||
|
||||
perf_cpu_map__for_each_cpu(cpu, idx, map) {
|
||||
|
@ -333,7 +333,7 @@ static int set_tracing_func_irqinfo(struct perf_ftrace *ftrace)
|
||||
|
||||
static int reset_tracing_cpu(void)
|
||||
{
|
||||
struct perf_cpu_map *cpumap = perf_cpu_map__new(NULL);
|
||||
struct perf_cpu_map *cpumap = perf_cpu_map__new_online_cpus();
|
||||
int ret;
|
||||
|
||||
ret = set_tracing_cpumask(cpumap);
|
||||
|
@ -2265,6 +2265,12 @@ int cmd_inject(int argc, const char **argv)
|
||||
"perf inject [<options>]",
|
||||
NULL
|
||||
};
|
||||
|
||||
if (!inject.itrace_synth_opts.set) {
|
||||
/* Disable eager loading of kernel symbols that adds overhead to perf inject. */
|
||||
symbol_conf.lazy_load_kernel_maps = true;
|
||||
}
|
||||
|
||||
#ifndef HAVE_JITDUMP
|
||||
set_option_nobuild(options, 'j', "jit", "NO_LIBELF=1", true);
|
||||
#endif
|
||||
|
@ -2285,8 +2285,10 @@ static int __cmd_record(int argc, const char **argv)
|
||||
else
|
||||
ev_name = strdup(contention_tracepoints[j].name);
|
||||
|
||||
if (!ev_name)
|
||||
if (!ev_name) {
|
||||
free(rec_argv);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
rec_argv[i++] = "-e";
|
||||
rec_argv[i++] = ev_name;
|
||||
|
@ -270,7 +270,7 @@ static int record__write(struct record *rec, struct mmap *map __maybe_unused,
|
||||
|
||||
static int record__aio_enabled(struct record *rec);
|
||||
static int record__comp_enabled(struct record *rec);
|
||||
static size_t zstd_compress(struct perf_session *session, struct mmap *map,
|
||||
static ssize_t zstd_compress(struct perf_session *session, struct mmap *map,
|
||||
void *dst, size_t dst_size, void *src, size_t src_size);
|
||||
|
||||
#ifdef HAVE_AIO_SUPPORT
|
||||
@ -405,9 +405,13 @@ static int record__aio_pushfn(struct mmap *map, void *to, void *buf, size_t size
|
||||
*/
|
||||
|
||||
if (record__comp_enabled(aio->rec)) {
|
||||
size = zstd_compress(aio->rec->session, NULL, aio->data + aio->size,
|
||||
mmap__mmap_len(map) - aio->size,
|
||||
buf, size);
|
||||
ssize_t compressed = zstd_compress(aio->rec->session, NULL, aio->data + aio->size,
|
||||
mmap__mmap_len(map) - aio->size,
|
||||
buf, size);
|
||||
if (compressed < 0)
|
||||
return (int)compressed;
|
||||
|
||||
size = compressed;
|
||||
} else {
|
||||
memcpy(aio->data + aio->size, buf, size);
|
||||
}
|
||||
@ -633,7 +637,13 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
|
||||
struct record *rec = to;
|
||||
|
||||
if (record__comp_enabled(rec)) {
|
||||
size = zstd_compress(rec->session, map, map->data, mmap__mmap_len(map), bf, size);
|
||||
ssize_t compressed = zstd_compress(rec->session, map, map->data,
|
||||
mmap__mmap_len(map), bf, size);
|
||||
|
||||
if (compressed < 0)
|
||||
return (int)compressed;
|
||||
|
||||
size = compressed;
|
||||
bf = map->data;
|
||||
}
|
||||
|
||||
@ -1350,7 +1360,7 @@ static int record__open(struct record *rec)
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
try_again:
|
||||
if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
|
||||
if (evsel__fallback(pos, errno, msg, sizeof(msg))) {
|
||||
if (evsel__fallback(pos, &opts->target, errno, msg, sizeof(msg))) {
|
||||
if (verbose > 0)
|
||||
ui__warning("%s\n", msg);
|
||||
goto try_again;
|
||||
@ -1527,10 +1537,10 @@ static size_t process_comp_header(void *record, size_t increment)
|
||||
return size;
|
||||
}
|
||||
|
||||
static size_t zstd_compress(struct perf_session *session, struct mmap *map,
|
||||
static ssize_t zstd_compress(struct perf_session *session, struct mmap *map,
|
||||
void *dst, size_t dst_size, void *src, size_t src_size)
|
||||
{
|
||||
size_t compressed;
|
||||
ssize_t compressed;
|
||||
size_t max_record_size = PERF_SAMPLE_MAX_SIZE - sizeof(struct perf_record_compressed) - 1;
|
||||
struct zstd_data *zstd_data = &session->zstd_data;
|
||||
|
||||
@ -1539,6 +1549,8 @@ static size_t zstd_compress(struct perf_session *session, struct mmap *map,
|
||||
|
||||
compressed = zstd_compress_stream_to_records(zstd_data, dst, dst_size, src, src_size,
|
||||
max_record_size, process_comp_header);
|
||||
if (compressed < 0)
|
||||
return compressed;
|
||||
|
||||
if (map && map->file) {
|
||||
thread->bytes_transferred += src_size;
|
||||
@ -1912,21 +1924,13 @@ static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
|
||||
static void record__read_lost_samples(struct record *rec)
|
||||
{
|
||||
struct perf_session *session = rec->session;
|
||||
struct perf_record_lost_samples *lost;
|
||||
struct perf_record_lost_samples *lost = NULL;
|
||||
struct evsel *evsel;
|
||||
|
||||
/* there was an error during record__open */
|
||||
if (session->evlist == NULL)
|
||||
return;
|
||||
|
||||
lost = zalloc(PERF_SAMPLE_MAX_SIZE);
|
||||
if (lost == NULL) {
|
||||
pr_debug("Memory allocation failed\n");
|
||||
return;
|
||||
}
|
||||
|
||||
lost->header.type = PERF_RECORD_LOST_SAMPLES;
|
||||
|
||||
evlist__for_each_entry(session->evlist, evsel) {
|
||||
struct xyarray *xy = evsel->core.sample_id;
|
||||
u64 lost_count;
|
||||
@ -1949,6 +1953,15 @@ static void record__read_lost_samples(struct record *rec)
|
||||
}
|
||||
|
||||
if (count.lost) {
|
||||
if (!lost) {
|
||||
lost = zalloc(sizeof(*lost) +
|
||||
session->machines.host.id_hdr_size);
|
||||
if (!lost) {
|
||||
pr_debug("Memory allocation failed\n");
|
||||
return;
|
||||
}
|
||||
lost->header.type = PERF_RECORD_LOST_SAMPLES;
|
||||
}
|
||||
__record__save_lost_samples(rec, evsel, lost,
|
||||
x, y, count.lost, 0);
|
||||
}
|
||||
@ -1956,9 +1969,19 @@ static void record__read_lost_samples(struct record *rec)
|
||||
}
|
||||
|
||||
lost_count = perf_bpf_filter__lost_count(evsel);
|
||||
if (lost_count)
|
||||
if (lost_count) {
|
||||
if (!lost) {
|
||||
lost = zalloc(sizeof(*lost) +
|
||||
session->machines.host.id_hdr_size);
|
||||
if (!lost) {
|
||||
pr_debug("Memory allocation failed\n");
|
||||
return;
|
||||
}
|
||||
lost->header.type = PERF_RECORD_LOST_SAMPLES;
|
||||
}
|
||||
__record__save_lost_samples(rec, evsel, lost, 0, 0, lost_count,
|
||||
PERF_RECORD_MISC_LOST_SAMPLES_BPF);
|
||||
}
|
||||
}
|
||||
out:
|
||||
free(lost);
|
||||
@ -2216,32 +2239,6 @@ static void hit_auxtrace_snapshot_trigger(struct record *rec)
|
||||
}
|
||||
}
|
||||
|
||||
static void record__uniquify_name(struct record *rec)
|
||||
{
|
||||
struct evsel *pos;
|
||||
struct evlist *evlist = rec->evlist;
|
||||
char *new_name;
|
||||
int ret;
|
||||
|
||||
if (perf_pmus__num_core_pmus() == 1)
|
||||
return;
|
||||
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
if (!evsel__is_hybrid(pos))
|
||||
continue;
|
||||
|
||||
if (strchr(pos->name, '/'))
|
||||
continue;
|
||||
|
||||
ret = asprintf(&new_name, "%s/%s/",
|
||||
pos->pmu_name, pos->name);
|
||||
if (ret) {
|
||||
free(pos->name);
|
||||
pos->name = new_name;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static int record__terminate_thread(struct record_thread *thread_data)
|
||||
{
|
||||
int err;
|
||||
@ -2475,7 +2472,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
|
||||
if (data->is_pipe && rec->evlist->core.nr_entries == 1)
|
||||
rec->opts.sample_id = true;
|
||||
|
||||
record__uniquify_name(rec);
|
||||
evlist__uniquify_name(rec->evlist);
|
||||
|
||||
/* Debug message used by test scripts */
|
||||
pr_debug3("perf record opening and mmapping events\n");
|
||||
@ -3580,9 +3577,7 @@ static int record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cp
|
||||
if (cpu_map__is_dummy(cpus))
|
||||
return 0;
|
||||
|
||||
perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
|
||||
if (cpu.cpu == -1)
|
||||
continue;
|
||||
perf_cpu_map__for_each_cpu_skip_any(cpu, idx, cpus) {
|
||||
/* Return ENODEV is input cpu is greater than max cpu */
|
||||
if ((unsigned long)cpu.cpu > mask->nbits)
|
||||
return -ENODEV;
|
||||
@ -3989,6 +3984,8 @@ int cmd_record(int argc, const char **argv)
|
||||
# undef set_nobuild
|
||||
#endif
|
||||
|
||||
/* Disable eager loading of kernel symbols that adds overhead to perf record. */
|
||||
symbol_conf.lazy_load_kernel_maps = true;
|
||||
rec->opts.affinity = PERF_AFFINITY_SYS;
|
||||
|
||||
rec->evlist = evlist__new();
|
||||
|
@ -96,9 +96,9 @@ struct report {
|
||||
bool stitch_lbr;
|
||||
bool disable_order;
|
||||
bool skip_empty;
|
||||
bool data_type;
|
||||
int max_stack;
|
||||
struct perf_read_values show_threads_values;
|
||||
struct annotation_options annotation_opts;
|
||||
const char *pretty_printing_style;
|
||||
const char *cpu_list;
|
||||
const char *symbol_filter_str;
|
||||
@ -171,7 +171,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
|
||||
struct mem_info *mi;
|
||||
struct branch_info *bi;
|
||||
|
||||
if (!ui__has_annotation() && !rep->symbol_ipc)
|
||||
if (!ui__has_annotation() && !rep->symbol_ipc && !rep->data_type)
|
||||
return 0;
|
||||
|
||||
if (sort__mode == SORT_MODE__BRANCH) {
|
||||
@ -541,8 +541,7 @@ static int evlist__tui_block_hists_browse(struct evlist *evlist, struct report *
|
||||
evlist__for_each_entry(evlist, pos) {
|
||||
ret = report__browse_block_hists(&rep->block_reports[i++].hist,
|
||||
rep->min_percent, pos,
|
||||
&rep->session->header.env,
|
||||
&rep->annotation_opts);
|
||||
&rep->session->header.env);
|
||||
if (ret != 0)
|
||||
return ret;
|
||||
}
|
||||
@ -574,8 +573,7 @@ static int evlist__tty_browse_hists(struct evlist *evlist, struct report *rep, c
|
||||
|
||||
if (rep->total_cycles_mode) {
|
||||
report__browse_block_hists(&rep->block_reports[i++].hist,
|
||||
rep->min_percent, pos,
|
||||
NULL, NULL);
|
||||
rep->min_percent, pos, NULL);
|
||||
continue;
|
||||
}
|
||||
|
||||
@ -670,7 +668,7 @@ static int report__browse_hists(struct report *rep)
|
||||
}
|
||||
|
||||
ret = evlist__tui_browse_hists(evlist, help, NULL, rep->min_percent,
|
||||
&session->header.env, true, &rep->annotation_opts);
|
||||
&session->header.env, true);
|
||||
/*
|
||||
* Usually "ret" is the last pressed key, and we only
|
||||
* care if the key notifies us to switch data file.
|
||||
@ -745,7 +743,7 @@ static int hists__resort_cb(struct hist_entry *he, void *arg)
|
||||
if (rep->symbol_ipc && sym && !sym->annotate2) {
|
||||
struct evsel *evsel = hists_to_evsel(he->hists);
|
||||
|
||||
symbol__annotate2(&he->ms, evsel, &rep->annotation_opts, NULL);
|
||||
symbol__annotate2(&he->ms, evsel, NULL);
|
||||
}
|
||||
|
||||
return 0;
|
||||
@ -859,27 +857,47 @@ static struct task *tasks_list(struct task *task, struct machine *machine)
|
||||
return tasks_list(parent_task, machine);
|
||||
}
|
||||
|
||||
struct maps__fprintf_task_args {
|
||||
int indent;
|
||||
FILE *fp;
|
||||
size_t printed;
|
||||
};
|
||||
|
||||
static int maps__fprintf_task_cb(struct map *map, void *data)
|
||||
{
|
||||
struct maps__fprintf_task_args *args = data;
|
||||
const struct dso *dso = map__dso(map);
|
||||
u32 prot = map__prot(map);
|
||||
int ret;
|
||||
|
||||
ret = fprintf(args->fp,
|
||||
"%*s %" PRIx64 "-%" PRIx64 " %c%c%c%c %08" PRIx64 " %" PRIu64 " %s\n",
|
||||
args->indent, "", map__start(map), map__end(map),
|
||||
prot & PROT_READ ? 'r' : '-',
|
||||
prot & PROT_WRITE ? 'w' : '-',
|
||||
prot & PROT_EXEC ? 'x' : '-',
|
||||
map__flags(map) ? 's' : 'p',
|
||||
map__pgoff(map),
|
||||
dso->id.ino, dso->name);
|
||||
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
args->printed += ret;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
|
||||
{
|
||||
size_t printed = 0;
|
||||
struct map_rb_node *rb_node;
|
||||
struct maps__fprintf_task_args args = {
|
||||
.indent = indent,
|
||||
.fp = fp,
|
||||
.printed = 0,
|
||||
};
|
||||
|
||||
maps__for_each_entry(maps, rb_node) {
|
||||
struct map *map = rb_node->map;
|
||||
const struct dso *dso = map__dso(map);
|
||||
u32 prot = map__prot(map);
|
||||
maps__for_each_map(maps, maps__fprintf_task_cb, &args);
|
||||
|
||||
printed += fprintf(fp, "%*s %" PRIx64 "-%" PRIx64 " %c%c%c%c %08" PRIx64 " %" PRIu64 " %s\n",
|
||||
indent, "", map__start(map), map__end(map),
|
||||
prot & PROT_READ ? 'r' : '-',
|
||||
prot & PROT_WRITE ? 'w' : '-',
|
||||
prot & PROT_EXEC ? 'x' : '-',
|
||||
map__flags(map) ? 's' : 'p',
|
||||
map__pgoff(map),
|
||||
dso->id.ino, dso->name);
|
||||
}
|
||||
|
||||
return printed;
|
||||
return args.printed;
|
||||
}
|
||||
|
||||
static void task__print_level(struct task *task, FILE *fp, int level)
|
||||
@@ -1341,15 +1359,15 @@ int cmd_report(int argc, const char **argv)
		   "list of cpus to profile"),
	OPT_BOOLEAN('I', "show-info", &report.show_full_info,
		    "Display extended information about perf.data file"),
	OPT_BOOLEAN(0, "source", &report.annotation_opts.annotate_src,
	OPT_BOOLEAN(0, "source", &annotate_opts.annotate_src,
		    "Interleave source code with assembly code (default)"),
	OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
	OPT_BOOLEAN(0, "asm-raw", &annotate_opts.show_asm_raw,
		    "Display raw encoding of assembly instructions (default)"),
	OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
	OPT_STRING(0, "prefix", &report.annotation_opts.prefix, "prefix",
	OPT_STRING(0, "prefix", &annotate_opts.prefix, "prefix",
		   "Add prefix to source file path names in programs (with --prefix-strip)"),
	OPT_STRING(0, "prefix-strip", &report.annotation_opts.prefix_strip, "N",
	OPT_STRING(0, "prefix-strip", &annotate_opts.prefix_strip, "N",
		   "Strip first N entries of source file path name in programs (with --prefix)"),
	OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
		    "Show a column with the sum of periods"),
@@ -1401,7 +1419,7 @@ int cmd_report(int argc, const char **argv)
		   "Time span of interest (start,stop)"),
	OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
		    "Show inline function"),
	OPT_CALLBACK(0, "percent-type", &report.annotation_opts, "local-period",
	OPT_CALLBACK(0, "percent-type", &annotate_opts, "local-period",
		     "Set percent type local/global-period/hits",
		     annotate_parse_percent_type),
	OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
@@ -1426,7 +1444,14 @@ int cmd_report(int argc, const char **argv)
	if (ret < 0)
		goto exit;

	annotation_options__init(&report.annotation_opts);
	/*
	 * tasks_mode require access to exited threads to list those that are in
	 * the data file. Off-cpu events are synthesized after other events and
	 * reference exited threads.
	 */
	symbol_conf.keep_exited_threads = true;

	annotation_options__init();

	ret = perf_config(report__config, &report);
	if (ret)
@@ -1445,13 +1470,13 @@ int cmd_report(int argc, const char **argv)
	}

	if (disassembler_style) {
		report.annotation_opts.disassembler_style = strdup(disassembler_style);
		if (!report.annotation_opts.disassembler_style)
		annotate_opts.disassembler_style = strdup(disassembler_style);
		if (!annotate_opts.disassembler_style)
			return -ENOMEM;
	}
	if (objdump_path) {
		report.annotation_opts.objdump_path = strdup(objdump_path);
		if (!report.annotation_opts.objdump_path)
		annotate_opts.objdump_path = strdup(objdump_path);
		if (!annotate_opts.objdump_path)
			return -ENOMEM;
	}
	if (addr2line_path) {
@@ -1460,7 +1485,7 @@ int cmd_report(int argc, const char **argv)
		return -ENOMEM;
	}

	if (annotate_check_args(&report.annotation_opts) < 0) {
	if (annotate_check_args() < 0) {
		ret = -EINVAL;
		goto exit;
	}
@@ -1615,6 +1640,16 @@ int cmd_report(int argc, const char **argv)
		sort_order = NULL;
	}

	if (sort_order && strstr(sort_order, "type")) {
		report.data_type = true;
		annotate_opts.annotate_src = false;

#ifndef HAVE_DWARF_GETLOCATIONS_SUPPORT
		pr_err("Error: Data type profiling is disabled due to missing DWARF support\n");
		goto error;
#endif
	}

	if (strcmp(input_name, "-") != 0)
		setup_browser(true);
	else
@@ -1673,7 +1708,7 @@ int cmd_report(int argc, const char **argv)
	 * so don't allocate extra space that won't be used in the stdio
	 * implementation.
	 */
	if (ui__has_annotation() || report.symbol_ipc ||
	if (ui__has_annotation() || report.symbol_ipc || report.data_type ||
	    report.total_cycles_mode) {
		ret = symbol__annotation_init();
		if (ret < 0)
@@ -1692,7 +1727,7 @@ int cmd_report(int argc, const char **argv)
		 */
		symbol_conf.priv_size += sizeof(u32);
	}
	annotation_config__init(&report.annotation_opts);
	annotation_config__init();
	}

	if (symbol__init(&session->header.env) < 0)
@@ -1746,7 +1781,7 @@ int cmd_report(int argc, const char **argv)
	zstd_fini(&(session->zstd_data));
	perf_session__delete(session);
exit:
	annotation_options__exit(&report.annotation_opts);
	annotation_options__exit();
	free(sort_order_help);
	free(field_order_help);
	return ret;
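As the hunk above shows, the 'type'/'typeoff' sort keys depend on DWARF location support, so a quick sanity check on a given perf build is to request the new sort key directly; on a build without HAVE_DWARF_GETLOCATIONS_SUPPORT this now exits with the error message added here (the command below is only a minimal illustration):

  $ perf report -s type --stdio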
@@ -653,7 +653,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
		if ((evsel__leader(counter) != counter) ||
		    !(counter->core.leader->nr_members > 1))
			return COUNTER_SKIP;
	} else if (evsel__fallback(counter, errno, msg, sizeof(msg))) {
	} else if (evsel__fallback(counter, &target, errno, msg, sizeof(msg))) {
		if (verbose > 0)
			ui__warning("%s\n", msg);
		return COUNTER_RETRY;
@@ -1204,8 +1204,9 @@ static struct option stat_options[] = {
	OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
		   "list of cpus to monitor in system-wide"),
	OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
		     "disable CPU count aggregation", AGGR_NONE),
	OPT_BOOLEAN(0, "no-merge", &stat_config.no_merge, "Do not merge identical named events"),
		     "disable aggregation across CPUs or PMUs", AGGR_NONE),
	OPT_SET_UINT(0, "no-merge", &stat_config.aggr_mode,
		     "disable aggregation the same as -A or -no-aggr", AGGR_NONE),
	OPT_BOOLEAN(0, "hybrid-merge", &stat_config.hybrid_merge,
		    "Merge identical named hybrid events"),
	OPT_STRING('x', "field-separator", &stat_config.csv_sep, "separator",
@@ -1255,7 +1256,7 @@ static struct option stat_options[] = {
	OPT_BOOLEAN(0, "metric-no-merge", &stat_config.metric_no_merge,
		    "don't try to share events between metrics in a group"),
	OPT_BOOLEAN(0, "metric-no-threshold", &stat_config.metric_no_threshold,
		    "don't try to share events between metrics in a group "),
		    "disable adding events for the metric threshold calculation"),
	OPT_BOOLEAN(0, "topdown", &topdown_run,
		    "measure top-down statistics"),
	OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
@@ -1316,7 +1317,7 @@ static int cpu__get_cache_id_from_map(struct perf_cpu cpu, char *map)
	 * be the first online CPU in the cache domain else use the
	 * first online CPU of the cache domain as the ID.
	 */
	if (perf_cpu_map__empty(cpu_map))
	if (perf_cpu_map__has_any_cpu_or_is_empty(cpu_map))
		id = cpu.cpu;
	else
		id = perf_cpu_map__cpu(cpu_map, 0).cpu;
@@ -1622,7 +1623,7 @@ static int perf_stat_init_aggr_mode(void)
	 * taking the highest cpu number to be the size of
	 * the aggregation translate cpumap.
	 */
	if (!perf_cpu_map__empty(evsel_list->core.user_requested_cpus))
	if (!perf_cpu_map__has_any_cpu_or_is_empty(evsel_list->core.user_requested_cpus))
		nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu;
	else
		nr = 0;
@@ -2289,7 +2290,7 @@ int process_stat_config_event(struct perf_session *session,

	perf_event__read_stat_config(&stat_config, &event->stat_config);

	if (perf_cpu_map__empty(st->cpus)) {
	if (perf_cpu_map__has_any_cpu_or_is_empty(st->cpus)) {
		if (st->aggr_mode != AGGR_UNSET)
			pr_warning("warning: processing task data, aggregation mode not set\n");
	} else if (st->aggr_mode != AGGR_UNSET) {
@@ -2695,15 +2696,19 @@ int cmd_stat(int argc, const char **argv)
	 */
	if (metrics) {
		const char *pmu = parse_events_option_args.pmu_filter ?: "all";
		int ret = metricgroup__parse_groups(evsel_list, pmu, metrics,
						    stat_config.metric_no_group,
						    stat_config.metric_no_merge,
						    stat_config.metric_no_threshold,
						    stat_config.user_requested_cpu_list,
						    stat_config.system_wide,
						    &stat_config.metric_events);

		metricgroup__parse_groups(evsel_list, pmu, metrics,
					  stat_config.metric_no_group,
					  stat_config.metric_no_merge,
					  stat_config.metric_no_threshold,
					  stat_config.user_requested_cpu_list,
					  stat_config.system_wide,
					  &stat_config.metric_events);
		zfree(&metrics);
		if (ret) {
			status = ret;
			goto out;
		}
	}

	if (add_default_attributes())
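As the new option description above states, --no-merge now simply selects AGGR_NONE, i.e. it behaves the same as -A/--no-aggr. A minimal way to confirm that both spellings produce the same disaggregated counts (the event choice here is just an illustration):

  $ perf stat -A -a -e cycles sleep 1
  $ perf stat --no-merge -a -e cycles sleep 1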
@@ -147,7 +147,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
		return err;
	}

	err = symbol__annotate(&he->ms, evsel, &top->annotation_opts, NULL);
	err = symbol__annotate(&he->ms, evsel, NULL);
	if (err == 0) {
		top->sym_filter_entry = he;
	} else {
@@ -261,9 +261,9 @@ static void perf_top__show_details(struct perf_top *top)
		goto out_unlock;

	printf("Showing %s for %s\n", evsel__name(top->sym_evsel), symbol->name);
	printf(" Events Pcnt (>=%d%%)\n", top->annotation_opts.min_pcnt);
	printf(" Events Pcnt (>=%d%%)\n", annotate_opts.min_pcnt);

	more = symbol__annotate_printf(&he->ms, top->sym_evsel, &top->annotation_opts);
	more = symbol__annotate_printf(&he->ms, top->sym_evsel);

	if (top->evlist->enabled) {
		if (top->zero)
@@ -450,7 +450,7 @@ static void perf_top__print_mapped_keys(struct perf_top *top)

	fprintf(stdout, "\t[f] profile display filter (count). \t(%d)\n", top->count_filter);

	fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", top->annotation_opts.min_pcnt);
	fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", annotate_opts.min_pcnt);
	fprintf(stdout, "\t[s] annotate symbol. \t(%s)\n", name?: "NULL");
	fprintf(stdout, "\t[S] stop annotation.\n");

@@ -553,7 +553,7 @@ static bool perf_top__handle_keypress(struct perf_top *top, int c)
			prompt_integer(&top->count_filter, "Enter display event count filter");
			break;
		case 'F':
			prompt_percent(&top->annotation_opts.min_pcnt,
			prompt_percent(&annotate_opts.min_pcnt,
				       "Enter details display event filter (percent)");
			break;
		case 'K':
@@ -646,8 +646,7 @@ static void *display_thread_tui(void *arg)
	}

	ret = evlist__tui_browse_hists(top->evlist, help, &hbt, top->min_percent,
				       &top->session->header.env, !top->record_opts.overwrite,
				       &top->annotation_opts);
				       &top->session->header.env, !top->record_opts.overwrite);
	if (ret == K_RELOAD) {
		top->zero = true;
		goto repeat;
@@ -1027,8 +1026,8 @@ static int perf_top__start_counters(struct perf_top *top)

	evlist__for_each_entry(evlist, counter) {
try_again:
		if (evsel__open(counter, top->evlist->core.user_requested_cpus,
				top->evlist->core.threads) < 0) {
		if (evsel__open(counter, counter->core.cpus,
				counter->core.threads) < 0) {

			/*
			 * Specially handle overwrite fall back.
@@ -1044,7 +1043,7 @@ static int perf_top__start_counters(struct perf_top *top)
			    perf_top_overwrite_fallback(top, counter))
				goto try_again;

			if (evsel__fallback(counter, errno, msg, sizeof(msg))) {
			if (evsel__fallback(counter, &opts->target, errno, msg, sizeof(msg))) {
				if (verbose > 0)
					ui__warning("%s\n", msg);
				goto try_again;
@@ -1241,9 +1240,9 @@ static int __cmd_top(struct perf_top *top)
	pthread_t thread, thread_process;
	int ret;

	if (!top->annotation_opts.objdump_path) {
	if (!annotate_opts.objdump_path) {
		ret = perf_env__lookup_objdump(&top->session->header.env,
					       &top->annotation_opts.objdump_path);
					       &annotate_opts.objdump_path);
		if (ret)
			return ret;
	}
@@ -1299,6 +1298,7 @@ static int __cmd_top(struct perf_top *top)
		}
	}

	evlist__uniquify_name(top->evlist);
	ret = perf_top__start_counters(top);
	if (ret)
		return ret;
@@ -1536,9 +1536,9 @@ int cmd_top(int argc, const char **argv)
		   "only consider symbols in these comms"),
	OPT_STRING(0, "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
		   "only consider these symbols"),
	OPT_BOOLEAN(0, "source", &top.annotation_opts.annotate_src,
	OPT_BOOLEAN(0, "source", &annotate_opts.annotate_src,
		    "Interleave source code with assembly code (default)"),
	OPT_BOOLEAN(0, "asm-raw", &top.annotation_opts.show_asm_raw,
	OPT_BOOLEAN(0, "asm-raw", &annotate_opts.show_asm_raw,
		    "Display raw encoding of assembly instructions (default)"),
	OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
		    "Enable kernel symbol demangling"),
@@ -1549,9 +1549,9 @@ int cmd_top(int argc, const char **argv)
		   "addr2line binary to use for line numbers"),
	OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
	OPT_STRING(0, "prefix", &top.annotation_opts.prefix, "prefix",
	OPT_STRING(0, "prefix", &annotate_opts.prefix, "prefix",
		   "Add prefix to source file path names in programs (with --prefix-strip)"),
	OPT_STRING(0, "prefix-strip", &top.annotation_opts.prefix_strip, "N",
	OPT_STRING(0, "prefix-strip", &annotate_opts.prefix_strip, "N",
		   "Strip first N entries of source file path name in programs (with --prefix)"),
	OPT_STRING('u', "uid", &target->uid_str, "user", "user to profile"),
	OPT_CALLBACK(0, "percent-limit", &top, "percent",
@@ -1609,10 +1609,10 @@ int cmd_top(int argc, const char **argv)
	if (status < 0)
		return status;

	annotation_options__init(&top.annotation_opts);
	annotation_options__init();

	top.annotation_opts.min_pcnt = 5;
	top.annotation_opts.context = 4;
	annotate_opts.min_pcnt = 5;
	annotate_opts.context = 4;

	top.evlist = evlist__new();
	if (top.evlist == NULL)
@@ -1642,13 +1642,13 @@ int cmd_top(int argc, const char **argv)
		usage_with_options(top_usage, options);

	if (disassembler_style) {
		top.annotation_opts.disassembler_style = strdup(disassembler_style);
		if (!top.annotation_opts.disassembler_style)
		annotate_opts.disassembler_style = strdup(disassembler_style);
		if (!annotate_opts.disassembler_style)
			return -ENOMEM;
	}
	if (objdump_path) {
		top.annotation_opts.objdump_path = strdup(objdump_path);
		if (!top.annotation_opts.objdump_path)
		annotate_opts.objdump_path = strdup(objdump_path);
		if (!annotate_opts.objdump_path)
			return -ENOMEM;
	}
	if (addr2line_path) {
@@ -1661,7 +1661,7 @@ int cmd_top(int argc, const char **argv)
	if (status)
		goto out_delete_evlist;

	if (annotate_check_args(&top.annotation_opts) < 0)
	if (annotate_check_args() < 0)
		goto out_delete_evlist;

	if (!top.evlist->core.nr_entries) {
@@ -1787,7 +1787,7 @@ int cmd_top(int argc, const char **argv)
	if (status < 0)
		goto out_delete_evlist;

	annotation_config__init(&top.annotation_opts);
	annotation_config__init();

	symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
	status = symbol__init(NULL);
@@ -1840,7 +1840,7 @@ int cmd_top(int argc, const char **argv)
out_delete_evlist:
	evlist__delete(top.evlist);
	perf_session__delete(top.session);
	annotation_options__exit(&top.annotation_opts);
	annotation_options__exit();

	return status;
}
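perf top now feeds these flags into the shared annotate_opts instead of a per-tool copy, but the command line itself is unchanged; for example, the -M/--disassembler-style option shown above still picks the disassembly syntax used in the annotate view (routine usage, shown only as a sketch):

  $ perf top -M intel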
@@ -2470,9 +2470,8 @@ static int trace__fprintf_callchain(struct trace *trace, struct perf_sample *sam
static const char *errno_to_name(struct evsel *evsel, int err)
{
	struct perf_env *env = evsel__env(evsel);
	const char *arch_name = perf_env__arch(env);

	return arch_syscalls__strerrno(arch_name, err);
	return perf_env__arch_strerrno(env, err);
}

static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
@@ -4264,12 +4263,11 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
		printed += fprintf(fp, " %9.3f %9.2f%%\n", max, pct);

		if (trace->errno_summary && stats->nr_failures) {
			const char *arch_name = perf_env__arch(trace->host->env);
			int e;

			for (e = 0; e < stats->max_errno; ++e) {
				if (stats->errnos[e] != 0)
					fprintf(fp, "\t\t\t\t%s: %d\n", arch_syscalls__strerrno(arch_name, e + 1), stats->errnos[e]);
					fprintf(fp, "\t\t\t\t%s: %d\n", perf_env__arch_strerrno(trace->host->env, e + 1), stats->errnos[e]);
			}
		}
	}
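The strerrno lookup now goes through perf_env, but the user-visible feature is unchanged: the per-syscall errno breakdown is still produced by the summary options of perf trace. A minimal sketch of how this path gets exercised (the workload is arbitrary):

  $ perf trace --summary --errno-summary -a sleep 1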
86  tools/perf/perf-archive.sh  (Normal file → Executable file)
@@ -4,8 +4,73 @@
# Arnaldo Carvalho de Melo <acme@redhat.com>

PERF_DATA=perf.data
if [ $# -ne 0 ] ; then
	PERF_DATA=$1
PERF_SYMBOLS=perf.symbols
PERF_ALL=perf.all
ALL=0
UNPACK=0

while [ $# -gt 0 ] ; do
	if [ $1 == "--all" ]; then
		ALL=1
		shift
	elif [ $1 == "--unpack" ]; then
		UNPACK=1
		shift
	else
		PERF_DATA=$1
		UNPACK_TAR=$1
		shift
	fi
done

if [ $UNPACK -eq 1 ]; then
	if [ ! -z "$UNPACK_TAR" ]; then # tar given as an argument
		if [ ! -e "$UNPACK_TAR" ]; then
			echo "Provided file $UNPACK_TAR does not exist"
			exit 1
		fi
		TARGET="$UNPACK_TAR"
	else # search for perf tar in the current directory
		TARGET=`find . -regex "\./perf.*\.tar\.bz2"`
		TARGET_NUM=`echo -n "$TARGET" | grep -c '^'`

		if [ -z "$TARGET" -o $TARGET_NUM -gt 1 ]; then
			echo -e "Error: $TARGET_NUM files found for unpacking:\n$TARGET"
			echo "Provide the requested file as an argument"
			exit 1
		else
			echo "Found target file for unpacking: $TARGET"
		fi
	fi

	if [[ "$TARGET" =~ (\./)?$PERF_ALL.*.tar.bz2 ]]; then # perf tar generated by --all option
		TAR_CONTENTS=`tar tvf "$TARGET" | tr -s " " | cut -d " " -f 6`
		VALID_TAR=`echo "$TAR_CONTENTS" | grep "$PERF_SYMBOLS.tar.bz2" | wc -l` # check if it contains a sub-tar perf.symbols
		if [ $VALID_TAR -ne 1 ]; then
			echo "Error: $TARGET file is not valid (contains zero or multiple sub-tar files with debug symbols)"
			exit 1
		fi

		INTERSECT=`comm -12 <(ls) <(echo "$TAR_CONTENTS") | tr "\n" " "` # check for overwriting
		if [ ! -z "$INTERSECT" ]; then # prompt if file(s) already exist in the current directory
			echo "File(s) ${INTERSECT::-1} already exist in the current directory."
			while true; do
				read -p 'Do you wish to overwrite them? ' yn
				case $yn in
					[Yy]* ) break;;
					[Nn]* ) exit 1;;
					* ) echo "Please answer yes or no.";;
				esac
			done
		fi

		# unzip the perf.data file in the current working directory and debug symbols in ~/.debug directory
		tar xvf $TARGET && tar xvf $PERF_SYMBOLS.tar.bz2 -C ~/.debug

	else # perf tar generated by perf archive (contains only debug symbols)
		tar xvf $TARGET -C ~/.debug
	fi
	exit 0
fi

#
@@ -39,9 +104,18 @@ while read build_id ; do
	echo ${filename#$PERF_BUILDID_LINKDIR} >> $MANIFEST
done

tar cjf $PERF_DATA.tar.bz2 -C $PERF_BUILDID_DIR -T $MANIFEST
rm $MANIFEST $BUILDIDS || true
if [ $ALL -eq 1 ]; then # pack perf.data file together with tar containing debug symbols
	HOSTNAME=$(hostname)
	DATE=$(date '+%Y%m%d-%H%M%S')
	tar cjf $PERF_SYMBOLS.tar.bz2 -C $PERF_BUILDID_DIR -T $MANIFEST
	tar cjf $PERF_ALL-$HOSTNAME-$DATE.tar.bz2 $PERF_DATA $PERF_SYMBOLS.tar.bz2
	rm $PERF_SYMBOLS.tar.bz2 $MANIFEST $BUILDIDS || true
else # pack only the debug symbols
	tar cjf $PERF_DATA.tar.bz2 -C $PERF_BUILDID_DIR -T $MANIFEST
	rm $MANIFEST $BUILDIDS || true
fi

echo -e "Now please run:\n"
echo -e "$ tar xvf $PERF_DATA.tar.bz2 -C ~/.debug\n"
echo "wherever you need to run 'perf report' on."
echo -e "$ perf archive --unpack\n"
echo "or unpack the tar manually wherever you need to run 'perf report' on."
exit 0
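With the new flags in place, a typical round trip is: build the combined tarball on the test machine, copy it over, and unpack both perf.data and the debug symbols on the analysis machine. A short sketch, with the tarball name following the $PERF_ALL-$HOSTNAME-$DATE pattern constructed above:

  # on the machine where perf record ran
  $ perf archive --all perf.data
  # copy perf.all-<hostname>-<date>.tar.bz2 to the analysis machine, then:
  $ perf archive --unpack
  $ perf report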
@@ -39,6 +39,7 @@
#include <linux/zalloc.h>

static int use_pager = -1;
static FILE *debug_fp = NULL;

struct cmd_struct {
	const char *cmd;
@@ -162,6 +163,19 @@ static void commit_pager_choice(void)
	}
}

static int set_debug_file(const char *path)
{
	debug_fp = fopen(path, "w");
	if (!debug_fp) {
		fprintf(stderr, "Open debug file '%s' failed: %s\n",
			path, strerror(errno));
		return -1;
	}

	debug_set_file(debug_fp);
	return 0;
}

struct option options[] = {
	OPT_ARGUMENT("help", "help"),
	OPT_ARGUMENT("version", "version"),
@@ -174,6 +188,7 @@ struct option options[] = {
	OPT_ARGUMENT("list-cmds", "list-cmds"),
	OPT_ARGUMENT("list-opts", "list-opts"),
	OPT_ARGUMENT("debug", "debug"),
	OPT_ARGUMENT("debug-file", "debug-file"),
	OPT_END()
};

@@ -287,6 +302,18 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)

			(*argv)++;
			(*argc)--;
		} else if (!strcmp(cmd, "--debug-file")) {
			if (*argc < 2) {
				fprintf(stderr, "No path given for --debug-file.\n");
				usage(perf_usage_string);
			}

			if (set_debug_file((*argv)[1]))
				usage(perf_usage_string);

			(*argv)++;
			(*argc)--;

		} else {
			fprintf(stderr, "Unknown option: %s\n", cmd);
			usage(perf_usage_string);
@@ -547,5 +574,8 @@ int main(int argc, const char **argv)
	fprintf(stderr, "Failed to run command '%s': %s\n",
		cmd, str_error_r(errno, sbuf, sizeof(sbuf)));
out:
	if (debug_fp)
		fclose(debug_fp);

	return 1;
}
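The new --debug-file option redirects the messages enabled by --debug into a file instead of mixing them into stderr. A minimal usage sketch (the path and verbosity level are arbitrary examples):

  $ perf --debug verbose=2 --debug-file /tmp/perf.log record -a sleep 1
  $ cat /tmp/perf.log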
@@ -110,7 +110,7 @@
    {
        "PublicDescription": "Flushes due to memory hazards",
        "EventCode": "0x121",
        "EventName": "BPU_FLUSH_MEM_FAULT",
        "EventName": "GPC_FLUSH_MEM_FAULT",
        "BriefDescription": "Flushes due to memory hazards"
    },
    {
125  tools/perf/pmu-events/arch/arm64/ampere/ampereonex/branch.json  (Normal file)
@@ -0,0 +1,125 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "BR_IMMED_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_RETURN_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_INDIRECT_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_MIS_PRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_PRED"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, branch not taken",
|
||||
"EventCode": "0x8107",
|
||||
"EventName": "BR_SKIP_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, branch not taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, immediate branch taken",
|
||||
"EventCode": "0x8108",
|
||||
"EventName": "BR_IMMED_TAKEN_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, immediate branch taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, indirect branch excluding return retired",
|
||||
"EventCode": "0x810c",
|
||||
"EventName": "BR_INDNR_TAKEN_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, indirect branch excluding return retired"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted immediate branch",
|
||||
"EventCode": "0x8110",
|
||||
"EventName": "BR_IMMED_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted immediate branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted immediate branch",
|
||||
"EventCode": "0x8111",
|
||||
"EventName": "BR_IMMED_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted immediate branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted indirect branch",
|
||||
"EventCode": "0x8112",
|
||||
"EventName": "BR_IND_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted indirect branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted indirect branch",
|
||||
"EventCode": "0x8113",
|
||||
"EventName": "BR_IND_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted indirect branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted procedure return",
|
||||
"EventCode": "0x8114",
|
||||
"EventName": "BR_RETURN_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted procedure return"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted procedure return",
|
||||
"EventCode": "0x8115",
|
||||
"EventName": "BR_RETURN_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted procedure return"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted indirect branch excluding return",
|
||||
"EventCode": "0x8116",
|
||||
"EventName": "BR_INDNR_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted indirect branch excluding return"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted indirect branch excluding return",
|
||||
"EventCode": "0x8117",
|
||||
"EventName": "BR_INDNR_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted indirect branch excluding return"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted branch, taken",
|
||||
"EventCode": "0x8118",
|
||||
"EventName": "BR_TAKEN_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted branch, taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted branch, taken",
|
||||
"EventCode": "0x8119",
|
||||
"EventName": "BR_TAKEN_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted branch, taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted branch, not taken",
|
||||
"EventCode": "0x811a",
|
||||
"EventName": "BR_SKIP_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted branch, not taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, mispredicted branch, not taken",
|
||||
"EventCode": "0x811b",
|
||||
"EventName": "BR_SKIP_MIS_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, mispredicted branch, not taken"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, predicted branch",
|
||||
"EventCode": "0x811c",
|
||||
"EventName": "BR_PRED_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, predicted branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction architecturally executed, indirect branch",
|
||||
"EventCode": "0x811d",
|
||||
"EventName": "BR_IND_RETIRED",
|
||||
"BriefDescription": "Instruction architecturally executed, indirect branch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Branch Record captured.",
|
||||
"EventCode": "0x811f",
|
||||
"EventName": "BRB_FILTRATE",
|
||||
"BriefDescription": "Branch Record captured."
|
||||
}
|
||||
]
|
20  tools/perf/pmu-events/arch/arm64/ampere/ampereonex/bus.json  (Normal file)
@@ -0,0 +1,20 @@
[
    {
        "ArchStdEvent": "CPU_CYCLES"
    },
    {
        "ArchStdEvent": "BUS_CYCLES"
    },
    {
        "ArchStdEvent": "BUS_ACCESS_RD"
    },
    {
        "ArchStdEvent": "BUS_ACCESS_WR"
    },
    {
        "ArchStdEvent": "BUS_ACCESS"
    },
    {
        "ArchStdEvent": "CNT_CYCLES"
    }
]
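Once these AmpereOneX JSON files are built into perf, the events and their descriptions become visible and usable by name. A quick way to browse and count them on a matching system (the event selection below is only an example):

  $ perf list | grep -i bus_access
  $ perf stat -e BUS_ACCESS,BUS_CYCLES -a sleep 1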
206  tools/perf/pmu-events/arch/arm64/ampere/ampereonex/cache.json  (Normal file)
@@ -0,0 +1,206 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_REFILL_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_INVAL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_REFILL_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_REFILL_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_REFILL_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_REFILL_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_WB_VICTIM"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_WB_CLEAN"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_INVAL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_WB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_TLB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2I_TLB_REFILL"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2I_TLB"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "DTLB_WALK"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "ITLB_WALK"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_REFILL_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_CACHE_LMISS_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1I_CACHE_LMISS"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_CACHE_LMISS_RD"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 data or unified cache demand access",
|
||||
"EventCode": "0x8140",
|
||||
"EventName": "L1D_CACHE_RW",
|
||||
"BriefDescription": "Level 1 data or unified cache demand access"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 data or unified cache preload or prefetch",
|
||||
"EventCode": "0x8142",
|
||||
"EventName": "L1D_CACHE_PRFM",
|
||||
"BriefDescription": "Level 1 data or unified cache preload or prefetch"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 data or unified cache refill, preload or prefetch",
|
||||
"EventCode": "0x8146",
|
||||
"EventName": "L1D_CACHE_REFILL_PRFM",
|
||||
"BriefDescription": "Level 1 data or unified cache refill, preload or prefetch"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L1D_TLB_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_REFILL_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_REFILL_WR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_RD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "L2D_TLB_WR"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L1D TLB miss",
|
||||
"EventCode": "0xD600",
|
||||
"EventName": "L1D_TLB_MISS",
|
||||
"BriefDescription": "L1D TLB miss"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 prefetcher, load prefetch requests generated",
|
||||
"EventCode": "0xd606",
|
||||
"EventName": "L1_PREFETCH_LD_GEN",
|
||||
"BriefDescription": "Level 1 prefetcher, load prefetch requests generated"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 prefetcher, load prefetch fills into the level 1 cache",
|
||||
"EventCode": "0xd607",
|
||||
"EventName": "L1_PREFETCH_LD_FILL",
|
||||
"BriefDescription": "Level 1 prefetcher, load prefetch fills into the level 1 cache"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 prefetcher, load prefetch to level 2 generated",
|
||||
"EventCode": "0xd608",
|
||||
"EventName": "L1_PREFETCH_L2_REQ",
|
||||
"BriefDescription": "Level 1 prefetcher, load prefetch to level 2 generated"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L1 prefetcher, distance was reset",
|
||||
"EventCode": "0xd609",
|
||||
"EventName": "L1_PREFETCH_DIST_RST",
|
||||
"BriefDescription": "L1 prefetcher, distance was reset"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L1 prefetcher, distance was increased",
|
||||
"EventCode": "0xd60a",
|
||||
"EventName": "L1_PREFETCH_DIST_INC",
|
||||
"BriefDescription": "L1 prefetcher, distance was increased"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 1 prefetcher, table entry is trained",
|
||||
"EventCode": "0xd60b",
|
||||
"EventName": "L1_PREFETCH_ENTRY_TRAINED",
|
||||
"BriefDescription": "Level 1 prefetcher, table entry is trained"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L1 data cache refill - Read or Write",
|
||||
"EventCode": "0xd60e",
|
||||
"EventName": "L1D_CACHE_REFILL_RW",
|
||||
"BriefDescription": "L1 data cache refill - Read or Write"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 cache refill from instruction-side miss, including IMMU refills",
|
||||
"EventCode": "0xD701",
|
||||
"EventName": "L2C_INST_REFILL",
|
||||
"BriefDescription": "Level 2 cache refill from instruction-side miss, including IMMU refills"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 cache refill from data-side miss, including DMMU refills",
|
||||
"EventCode": "0xD702",
|
||||
"EventName": "L2C_DATA_REFILL",
|
||||
"BriefDescription": "Level 2 cache refill from data-side miss, including DMMU refills"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 cache prefetcher, load prefetch requests generated",
|
||||
"EventCode": "0xD703",
|
||||
"EventName": "L2_PREFETCH_REQ",
|
||||
"BriefDescription": "Level 2 cache prefetcher, load prefetch requests generated"
|
||||
}
|
||||
]
|
@ -0,0 +1,464 @@
|
||||
[
|
||||
{
|
||||
"PublicDescription": "Level 2 prefetch requests, refilled to L2 cache",
|
||||
"EventCode": "0x10A",
|
||||
"EventName": "L2_PREFETCH_REFILL",
|
||||
"BriefDescription": "Level 2 prefetch requests, refilled to L2 cache"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 prefetch requests, late",
|
||||
"EventCode": "0x10B",
|
||||
"EventName": "L2_PREFETCH_UPGRADE",
|
||||
"BriefDescription": "Level 2 prefetch requests, late"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable branch speculatively executed that hit any level of BTB",
|
||||
"EventCode": "0x110",
|
||||
"EventName": "BPU_HIT_BTB",
|
||||
"BriefDescription": "Predictable branch speculatively executed that hit any level of BTB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable conditional branch speculatively executed that hit any level of BTB",
|
||||
"EventCode": "0x111",
|
||||
"EventName": "BPU_CONDITIONAL_BRANCH_HIT_BTB",
|
||||
"BriefDescription": "Predictable conditional branch speculatively executed that hit any level of BTB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the indirect predictor",
|
||||
"EventCode": "0x112",
|
||||
"EventName": "BPU_HIT_INDIRECT_PREDICTOR",
|
||||
"BriefDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the indirect predictor"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the return predictor",
|
||||
"EventCode": "0x113",
|
||||
"EventName": "BPU_HIT_RSB",
|
||||
"BriefDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the return predictor"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable unconditional branch speculatively executed that did not hit any level of BTB",
|
||||
"EventCode": "0x114",
|
||||
"EventName": "BPU_UNCONDITIONAL_BRANCH_MISS_BTB",
|
||||
"BriefDescription": "Predictable unconditional branch speculatively executed that did not hit any level of BTB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable branch speculatively executed, unpredicted",
|
||||
"EventCode": "0x115",
|
||||
"EventName": "BPU_BRANCH_NO_HIT",
|
||||
"BriefDescription": "Predictable branch speculatively executed, unpredicted"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable branch speculatively executed that hit any level of BTB that mispredict",
|
||||
"EventCode": "0x116",
|
||||
"EventName": "BPU_HIT_BTB_AND_MISPREDICT",
|
||||
"BriefDescription": "Predictable branch speculatively executed that hit any level of BTB that mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable conditional branch speculatively executed that hit any level of BTB that (direction) mispredict",
|
||||
"EventCode": "0x117",
|
||||
"EventName": "BPU_CONDITIONAL_BRANCH_HIT_BTB_AND_MISPREDICT",
|
||||
"BriefDescription": "Predictable conditional branch speculatively executed that hit any level of BTB that (direction) mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the indirect predictor that mispredict",
|
||||
"EventCode": "0x118",
|
||||
"EventName": "BPU_INDIRECT_BRANCH_HIT_BTB_AND_MISPREDICT",
|
||||
"BriefDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the indirect predictor that mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the return predictor that mispredict",
|
||||
"EventCode": "0x119",
|
||||
"EventName": "BPU_HIT_RSB_AND_MISPREDICT",
|
||||
"BriefDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the return predictor that mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the overflow/underflow return predictor that mispredict",
|
||||
"EventCode": "0x11a",
|
||||
"EventName": "BPU_MISS_RSB_AND_MISPREDICT",
|
||||
"BriefDescription": "Predictable taken branch speculatively executed that hit any level of BTB that access the overflow/underflow return predictor that mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Predictable branch speculatively executed, unpredicted, that mispredict",
|
||||
"EventCode": "0x11b",
|
||||
"EventName": "BPU_NO_PREDICTION_MISPREDICT",
|
||||
"BriefDescription": "Predictable branch speculatively executed, unpredicted, that mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Preditable branch update the BTB region buffer entry",
|
||||
"EventCode": "0x11c",
|
||||
"EventName": "BPU_BTB_UPDATE",
|
||||
"BriefDescription": "Preditable branch update the BTB region buffer entry"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Count predict pipe stalls due to speculative return address predictor full",
|
||||
"EventCode": "0x11d",
|
||||
"EventName": "BPU_RSB_FULL_STALL",
|
||||
"BriefDescription": "Count predict pipe stalls due to speculative return address predictor full"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Macro-ops speculatively decoded",
|
||||
"EventCode": "0x11f",
|
||||
"EventName": "ICF_INST_SPEC_DECODE",
|
||||
"BriefDescription": "Macro-ops speculatively decoded"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Flushes",
|
||||
"EventCode": "0x120",
|
||||
"EventName": "GPC_FLUSH",
|
||||
"BriefDescription": "Flushes"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Flushes due to memory hazards",
|
||||
"EventCode": "0x121",
|
||||
"EventName": "GPC_FLUSH_MEM_FAULT",
|
||||
"BriefDescription": "Flushes due to memory hazards"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "ETM extout bit 0",
|
||||
"EventCode": "0x141",
|
||||
"EventName": "MSC_ETM_EXTOUT0",
|
||||
"BriefDescription": "ETM extout bit 0"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "ETM extout bit 1",
|
||||
"EventCode": "0x142",
|
||||
"EventName": "MSC_ETM_EXTOUT1",
|
||||
"BriefDescription": "ETM extout bit 1"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "ETM extout bit 2",
|
||||
"EventCode": "0x143",
|
||||
"EventName": "MSC_ETM_EXTOUT2",
|
||||
"BriefDescription": "ETM extout bit 2"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "ETM extout bit 3",
|
||||
"EventCode": "0x144",
|
||||
"EventName": "MSC_ETM_EXTOUT3",
|
||||
"BriefDescription": "ETM extout bit 3"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Bus request sn",
|
||||
"EventCode": "0x156",
|
||||
"EventName": "L2C_SNOOP",
|
||||
"BriefDescription": "Bus request sn"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L2 TXDAT LCRD blocked",
|
||||
"EventCode": "0x169",
|
||||
"EventName": "L2C_DAT_CRD_STALL",
|
||||
"BriefDescription": "L2 TXDAT LCRD blocked"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L2 TXRSP LCRD blocked",
|
||||
"EventCode": "0x16a",
|
||||
"EventName": "L2C_RSP_CRD_STALL",
|
||||
"BriefDescription": "L2 TXRSP LCRD blocked"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L2 TXREQ LCRD blocked",
|
||||
"EventCode": "0x16b",
|
||||
"EventName": "L2C_REQ_CRD_STALL",
|
||||
"BriefDescription": "L2 TXREQ LCRD blocked"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Early mispredict",
|
||||
"EventCode": "0xD100",
|
||||
"EventName": "ICF_EARLY_MIS_PRED",
|
||||
"BriefDescription": "Early mispredict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "FEQ full cycles",
|
||||
"EventCode": "0xD101",
|
||||
"EventName": "ICF_FEQ_FULL",
|
||||
"BriefDescription": "FEQ full cycles"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction FIFO Full",
|
||||
"EventCode": "0xD102",
|
||||
"EventName": "ICF_INST_FIFO_FULL",
|
||||
"BriefDescription": "Instruction FIFO Full"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "L1I TLB miss",
|
||||
"EventCode": "0xD103",
|
||||
"EventName": "L1I_TLB_MISS",
|
||||
"BriefDescription": "L1I TLB miss"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "ICF sent 0 instructions to IDR this cycle",
|
||||
"EventCode": "0xD104",
|
||||
"EventName": "ICF_STALL",
|
||||
"BriefDescription": "ICF sent 0 instructions to IDR this cycle"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "PC FIFO Full",
|
||||
"EventCode": "0xD105",
|
||||
"EventName": "ICF_PC_FIFO_FULL",
|
||||
"BriefDescription": "PC FIFO Full"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Stall due to BOB ID",
|
||||
"EventCode": "0xD200",
|
||||
"EventName": "IDR_STALL_BOB_ID",
|
||||
"BriefDescription": "Stall due to BOB ID"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to LOB entries",
|
||||
"EventCode": "0xD201",
|
||||
"EventName": "IDR_STALL_LOB_ID",
|
||||
"BriefDescription": "Dispatch stall due to LOB entries"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to SOB entries",
|
||||
"EventCode": "0xD202",
|
||||
"EventName": "IDR_STALL_SOB_ID",
|
||||
"BriefDescription": "Dispatch stall due to SOB entries"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to IXU scheduler entries",
|
||||
"EventCode": "0xD203",
|
||||
"EventName": "IDR_STALL_IXU_SCHED",
|
||||
"BriefDescription": "Dispatch stall due to IXU scheduler entries"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to FSU scheduler entries",
|
||||
"EventCode": "0xD204",
|
||||
"EventName": "IDR_STALL_FSU_SCHED",
|
||||
"BriefDescription": "Dispatch stall due to FSU scheduler entries"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to ROB entries",
|
||||
"EventCode": "0xD205",
|
||||
"EventName": "IDR_STALL_ROB_ID",
|
||||
"BriefDescription": "Dispatch stall due to ROB entries"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to flush",
|
||||
"EventCode": "0xD206",
|
||||
"EventName": "IDR_STALL_FLUSH",
|
||||
"BriefDescription": "Dispatch stall due to flush"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to WFI",
|
||||
"EventCode": "0xD207",
|
||||
"EventName": "IDR_STALL_WFI",
|
||||
"BriefDescription": "Dispatch stall due to WFI"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Number of SWOB drains triggered by timeout",
|
||||
"EventCode": "0xD208",
|
||||
"EventName": "IDR_STALL_SWOB_TIMEOUT",
|
||||
"BriefDescription": "Number of SWOB drains triggered by timeout"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Number of SWOB drains triggered by system register or special-purpose register read-after-write or specific special-purpose register writes that cause SWOB drain",
|
||||
"EventCode": "0xD209",
|
||||
"EventName": "IDR_STALL_SWOB_RAW",
|
||||
"BriefDescription": "Number of SWOB drains triggered by system register or special-purpose register read-after-write or specific special-purpose register writes that cause SWOB drain"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Number of SWOB drains triggered by system register write when SWOB full",
|
||||
"EventCode": "0xD20A",
|
||||
"EventName": "IDR_STALL_SWOB_FULL",
|
||||
"BriefDescription": "Number of SWOB drains triggered by system register write when SWOB full"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to L1 instruction cache miss",
|
||||
"EventCode": "0xD20B",
|
||||
"EventName": "STALL_FRONTEND_CACHE",
|
||||
"BriefDescription": "Dispatch stall due to L1 instruction cache miss"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to L1 data cache miss",
|
||||
"EventCode": "0xD20D",
|
||||
"EventName": "STALL_BACKEND_CACHE",
|
||||
"BriefDescription": "Dispatch stall due to L1 data cache miss"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Dispatch stall due to lack of any core resource",
|
||||
"EventCode": "0xD20F",
|
||||
"EventName": "STALL_BACKEND_RESOURCE",
|
||||
"BriefDescription": "Dispatch stall due to lack of any core resource"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instructions issued by the scheduler",
|
||||
"EventCode": "0xD300",
|
||||
"EventName": "IXU_NUM_UOPS_ISSUED",
|
||||
"BriefDescription": "Instructions issued by the scheduler"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Any uop issued was canceled for any reason",
|
||||
"EventCode": "0xD301",
|
||||
"EventName": "IXU_ISSUE_CANCEL",
|
||||
"BriefDescription": "Any uop issued was canceled for any reason"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "A load wakeup to the scheduler has been canceled",
|
||||
"EventCode": "0xD302",
|
||||
"EventName": "IXU_LOAD_CANCEL",
|
||||
"BriefDescription": "A load wakeup to the scheduler has been canceled"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "The scheduler had to cancel one slow Uop due to resource conflict",
|
||||
"EventCode": "0xD303",
|
||||
"EventName": "IXU_SLOW_CANCEL",
|
||||
"BriefDescription": "The scheduler had to cancel one slow Uop due to resource conflict"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXA",
|
||||
"EventCode": "0xD304",
|
||||
"EventName": "IXU_IXA_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXA"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXA Par 0",
|
||||
"EventCode": "0xD305",
|
||||
"EventName": "IXU_IXA_PAR0_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXA Par 0"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXA Par 1",
|
||||
"EventCode": "0xD306",
|
||||
"EventName": "IXU_IXA_PAR1_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXA Par 1"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXB",
|
||||
"EventCode": "0xD307",
|
||||
"EventName": "IXU_IXB_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXB Par 0",
|
||||
"EventCode": "0xD308",
|
||||
"EventName": "IXU_IXB_PAR0_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXB Par 0"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXB Par 1",
|
||||
"EventCode": "0xD309",
|
||||
"EventName": "IXU_IXB_PAR1_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXB Par 1"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXC",
|
||||
"EventCode": "0xD30A",
|
||||
"EventName": "IXU_IXC_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXC"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXC Par 0",
|
||||
"EventCode": "0xD30B",
|
||||
"EventName": "IXU_IXC_PAR0_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXC Par 0"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXC Par 1",
|
||||
"EventCode": "0xD30C",
|
||||
"EventName": "IXU_IXC_PAR1_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXC Par 1"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXD",
|
||||
"EventCode": "0xD30D",
|
||||
"EventName": "IXU_IXD_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXD"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXD Par 0",
|
||||
"EventCode": "0xD30E",
|
||||
"EventName": "IXU_IXD_PAR0_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXD Par 0"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on IXD Par 1",
|
||||
"EventCode": "0xD30F",
|
||||
"EventName": "IXU_IXD_PAR1_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on IXD Par 1"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the FSU scheduler",
|
||||
"EventCode": "0xD400",
|
||||
"EventName": "FSU_ISSUED",
|
||||
"BriefDescription": "Uops issued by the FSU scheduler"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on FSX",
|
||||
"EventCode": "0xD401",
|
||||
"EventName": "FSU_FSX_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on FSX"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on FSY",
|
||||
"EventCode": "0xD402",
|
||||
"EventName": "FSU_FSY_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on FSY"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops issued by the scheduler on FSZ",
|
||||
"EventCode": "0xD403",
|
||||
"EventName": "FSU_FSZ_ISSUED",
|
||||
"BriefDescription": "Uops issued by the scheduler on FSZ"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Uops canceled (load cancels)",
|
||||
"EventCode": "0xD404",
|
||||
"EventName": "FSU_CANCEL",
|
||||
"BriefDescription": "Uops canceled (load cancels)"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Count scheduler stalls due to divide/sqrt",
|
||||
"EventCode": "0xD405",
|
||||
"EventName": "FSU_DIV_SQRT_STALL",
|
||||
"BriefDescription": "Count scheduler stalls due to divide/sqrt"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Number of SWOB drains",
|
||||
"EventCode": "0xD500",
|
||||
"EventName": "GPC_SWOB_DRAIN",
|
||||
"BriefDescription": "Number of SWOB drains"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "GPC detected a Breakpoint instruction match",
|
||||
"EventCode": "0xD501",
|
||||
"EventName": "BREAKPOINT_MATCH",
|
||||
"BriefDescription": "GPC detected a Breakpoint instruction match"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Core progress monitor triggered",
|
||||
"EventCode": "0xd502",
|
||||
"EventName": "GPC_CPM_TRIGGER",
|
||||
"BriefDescription": "Core progress monitor triggered"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Fill buffer full",
|
||||
"EventCode": "0xD601",
|
||||
"EventName": "OFB_FULL",
|
||||
"BriefDescription": "Fill buffer full"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Load satisified from store forwarded data",
|
||||
"EventCode": "0xD605",
|
||||
"EventName": "LD_FROM_ST_FWD",
|
||||
"BriefDescription": "Load satisified from store forwarded data"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Store retirement pipe stall",
|
||||
"EventCode": "0xD60C",
|
||||
"EventName": "LSU_ST_RETIRE_STALL",
|
||||
"BriefDescription": "Store retirement pipe stall"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "LSU detected a Watchpoint data match",
|
||||
"EventCode": "0xD60D",
|
||||
"EventName": "WATCHPOINT_MATCH",
|
||||
"BriefDescription": "LSU detected a Watchpoint data match"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Counts cycles that MSC is telling GPC to stall commit due to ETM ISTALL feature",
|
||||
"EventCode": "0xda00",
|
||||
"EventName": "MSC_ETM_COMMIT_STALL",
|
||||
"BriefDescription": "Counts cycles that MSC is telling GPC to stall commit due to ETM ISTALL feature"
|
||||
}
|
||||
]
|
@ -0,0 +1,47 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "EXC_UNDEF"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_SVC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_PABORT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_DABORT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_IRQ"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_FIQ"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_HVC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TRAP_PABORT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TRAP_DABORT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TRAP_OTHER"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TRAP_IRQ"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TRAP_FIQ"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_TAKEN"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_RETURN"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "EXC_SMC"
|
||||
}
|
||||
]
|
@ -0,0 +1,128 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "SW_INCR"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "ST_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "LD_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "ST_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "LDST_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "DP_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "ASE_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "VFP_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "PC_WRITE_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_IMMED_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_RETURN_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "CRYPTO_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "ISB_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "DSB_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "DMB_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "RC_LD_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "RC_ST_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "INST_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "CID_WRITE_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "PC_WRITE_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "INST_SPEC"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "TTBR_WRITE_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "BR_MIS_PRED_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "OP_RETIRED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "OP_SPEC"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Operation speculatively executed - ASE Scalar",
|
||||
"EventCode": "0xd210",
|
||||
"EventName": "ASE_SCALAR_SPEC",
|
||||
"BriefDescription": "Operation speculatively executed - ASE Scalar"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Operation speculatively executed - ASE Vector",
|
||||
"EventCode": "0xd211",
|
||||
"EventName": "ASE_VECTOR_SPEC",
|
||||
"BriefDescription": "Operation speculatively executed - ASE Vector"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Barrier speculatively executed, CSDB",
|
||||
"EventCode": "0x7f",
|
||||
"EventName": "CSDB_SPEC",
|
||||
"BriefDescription": "Barrier speculatively executed, CSDB"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Prefetch sent to L2.",
|
||||
"EventCode": "0xd106",
|
||||
"EventName": "ICF_PREFETCH_DISPATCH",
|
||||
"BriefDescription": "Prefetch sent to L2."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Prefetch response received but was dropped since we don't support inflight upgrades.",
|
||||
"EventCode": "0xd107",
|
||||
"EventName": "ICF_PREFETCH_DROPPED_NO_UPGRADE",
|
||||
"BriefDescription": "Prefetch response received but was dropped since we don't support inflight upgrades."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Prefetch request missed TLB.",
|
||||
"EventCode": "0xd108",
|
||||
"EventName": "ICF_PREFETCH_DROPPED_TLB_MISS",
|
||||
"BriefDescription": "Prefetch request missed TLB."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Prefetch request dropped since duplicate was found in TLB.",
|
||||
"EventCode": "0xd109",
|
||||
"EventName": "ICF_PREFETCH_DROPPED_DUPLICATE",
|
||||
"BriefDescription": "Prefetch request dropped since duplicate was found in TLB."
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Prefetch request dropped since it was found in cache.",
|
||||
"EventCode": "0xd10a",
|
||||
"EventName": "ICF_PREFETCH_DROPPED_CACHE_HIT",
|
||||
"BriefDescription": "Prefetch request dropped since it was found in cache."
|
||||
}
|
||||
]
|
@@ -0,0 +1,14 @@
[
    {
        "ArchStdEvent": "LDREX_SPEC"
    },
    {
        "ArchStdEvent": "STREX_PASS_SPEC"
    },
    {
        "ArchStdEvent": "STREX_FAIL_SPEC"
    },
    {
        "ArchStdEvent": "STREX_SPEC"
    }
]
@@ -0,0 +1,41 @@
[
    {
        "ArchStdEvent": "LD_RETIRED"
    },
    {
        "ArchStdEvent": "MEM_ACCESS_RD"
    },
    {
        "ArchStdEvent": "MEM_ACCESS_WR"
    },
    {
        "ArchStdEvent": "LD_ALIGN_LAT"
    },
    {
        "ArchStdEvent": "ST_ALIGN_LAT"
    },
    {
        "ArchStdEvent": "MEM_ACCESS"
    },
    {
        "ArchStdEvent": "MEMORY_ERROR"
    },
    {
        "ArchStdEvent": "LDST_ALIGN_LAT"
    },
    {
        "ArchStdEvent": "MEM_ACCESS_CHECKED"
    },
    {
        "ArchStdEvent": "MEM_ACCESS_CHECKED_RD"
    },
    {
        "ArchStdEvent": "MEM_ACCESS_CHECKED_WR"
    },
    {
        "PublicDescription": "Flushes due to memory hazards",
        "EventCode": "0x121",
        "EventName": "BPU_FLUSH_MEM_FAULT",
        "BriefDescription": "Flushes due to memory hazards"
    }
]
442
tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json
Normal file
@ -0,0 +1,442 @@
|
||||
[
|
||||
{
|
||||
"MetricName": "branch_miss_pred_rate",
|
||||
"MetricExpr": "BR_MIS_PRED / BR_PRED",
|
||||
"BriefDescription": "Branch predictor misprediction rate. May not count branches that are never resolved because they are in the misprediction shadow of an earlier branch",
|
||||
"MetricGroup": "branch",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "bus_utilization",
|
||||
"MetricExpr": "BUS_ACCESS / (BUS_CYCLES * 1)",
|
||||
"BriefDescription": "Core-to-uncore bus utilization",
|
||||
"MetricGroup": "Bus",
|
||||
"ScaleUnit": "100percent of bus cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_miss_ratio",
|
||||
"MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
|
||||
"BriefDescription": "This metric measures the ratio of level 1 data cache accesses missed to the total number of level 1 data cache accesses. This gives an indication of the effectiveness of the level 1 data cache.",
|
||||
"MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
|
||||
"ScaleUnit": "1per cache access"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1i_cache_miss_ratio",
|
||||
"MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
|
||||
"BriefDescription": "This metric measures the ratio of level 1 instruction cache accesses missed to the total number of level 1 instruction cache accesses. This gives an indication of the effectiveness of the level 1 instruction cache.",
|
||||
"MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
|
||||
"ScaleUnit": "1per cache access"
|
||||
},
|
||||
{
|
||||
"MetricName": "Miss_Ratio;l1d_cache_read_miss",
|
||||
"MetricExpr": "L1D_CACHE_LMISS_RD / L1D_CACHE_RD",
|
||||
"BriefDescription": "L1D cache read miss rate",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "1per cache read access"
|
||||
},
|
||||
{
|
||||
"MetricName": "l2_cache_miss_ratio",
|
||||
"MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
|
||||
"BriefDescription": "This metric measures the ratio of level 2 cache accesses missed to the total number of level 2 cache accesses. This gives an indication of the effectiveness of the level 2 cache, which is a unified cache that stores both data and instruction. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a unified cache.",
|
||||
"MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
|
||||
"ScaleUnit": "1per cache access"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1i_cache_read_miss_rate",
|
||||
"MetricExpr": "L1I_CACHE_LMISS / L1I_CACHE",
|
||||
"BriefDescription": "L1I cache read miss rate",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "1per cache access"
|
||||
},
|
||||
{
|
||||
"MetricName": "l2d_cache_read_miss_rate",
|
||||
"MetricExpr": "L2D_CACHE_LMISS_RD / L2D_CACHE_RD",
|
||||
"BriefDescription": "L2 cache read miss rate",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "1per cache read access"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_miss_mpki",
|
||||
"MetricExpr": "(L1D_CACHE_LMISS_RD * 1e3) / INST_RETIRED",
|
||||
"BriefDescription": "Misses per thousand instructions (data)",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "1MPKI"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1i_cache_miss_mpki",
|
||||
"MetricExpr": "(L1I_CACHE_LMISS * 1e3) / INST_RETIRED",
|
||||
"BriefDescription": "Misses per thousand instructions (instruction)",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "1MPKI"
|
||||
},
|
||||
{
|
||||
"MetricName": "simd_percentage",
|
||||
"MetricExpr": "ASE_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures advanced SIMD operations as a percentage of total operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "crypto_percentage",
|
||||
"MetricExpr": "CRYPTO_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures crypto operations as a percentage of operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "gflops",
|
||||
"MetricExpr": "VFP_SPEC / (duration_time * 1e9)",
|
||||
"BriefDescription": "Giga-floating point operations per second",
|
||||
"MetricGroup": "InstructionMix"
|
||||
},
|
||||
{
|
||||
"MetricName": "integer_dp_percentage",
|
||||
"MetricExpr": "DP_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures scalar integer operations as a percentage of operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "ipc",
|
||||
"MetricExpr": "INST_RETIRED / CPU_CYCLES",
|
||||
"BriefDescription": "This metric measures the number of instructions retired per cycle.",
|
||||
"MetricGroup": "General",
|
||||
"ScaleUnit": "1per cycle"
|
||||
},
|
||||
{
|
||||
"MetricName": "load_percentage",
|
||||
"MetricExpr": "LD_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures load operations as a percentage of operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "load_store_spec_rate",
|
||||
"MetricExpr": "LDST_SPEC / INST_SPEC",
|
||||
"BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speclatively executed",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "retired_mips",
|
||||
"MetricExpr": "INST_RETIRED / (duration_time * 1e6)",
|
||||
"BriefDescription": "Millions of instructions per second",
|
||||
"MetricGroup": "InstructionMix"
|
||||
},
|
||||
{
|
||||
"MetricName": "spec_utilization_mips",
|
||||
"MetricExpr": "INST_SPEC / (duration_time * 1e6)",
|
||||
"BriefDescription": "Millions of instructions per second",
|
||||
"MetricGroup": "PEutilization"
|
||||
},
|
||||
{
|
||||
"MetricName": "pc_write_spec_rate",
|
||||
"MetricExpr": "PC_WRITE_SPEC / INST_SPEC",
|
||||
"BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speclatively executed",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "store_percentage",
|
||||
"MetricExpr": "ST_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures store operations as a percentage of operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "scalar_fp_percentage",
|
||||
"MetricExpr": "VFP_SPEC / INST_SPEC",
|
||||
"BriefDescription": "This metric measures scalar floating point operations as a percentage of operations speculatively executed.",
|
||||
"MetricGroup": "Operation_Mix",
|
||||
"ScaleUnit": "100percent of operations"
|
||||
},
|
||||
{
|
||||
"MetricName": "retired_rate",
|
||||
"MetricExpr": "OP_RETIRED / OP_SPEC",
|
||||
"BriefDescription": "Of all the micro-operations issued, what percentage are retired(committed)",
|
||||
"MetricGroup": "General",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "wasted",
|
||||
"MetricExpr": "1 - (OP_RETIRED / (CPU_CYCLES * #slots))",
|
||||
"BriefDescription": "Of all the micro-operations issued, what proportion are lost",
|
||||
"MetricGroup": "General",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "wasted_rate",
|
||||
"MetricExpr": "1 - OP_RETIRED / OP_SPEC",
|
||||
"BriefDescription": "Of all the micro-operations issued, what percentage are not retired(committed)",
|
||||
"MetricGroup": "General",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_backend_cache_rate",
|
||||
"MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
|
||||
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and cache miss",
|
||||
"MetricGroup": "Stall",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_backend_resource_rate",
|
||||
"MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
|
||||
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and resource full",
|
||||
"MetricGroup": "Stall",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_backend_tlb_rate",
|
||||
"MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
|
||||
"BriefDescription": "Proportion of cycles stalled and no operations issued to backend and TLB miss",
|
||||
"MetricGroup": "Stall",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_frontend_cache_rate",
|
||||
"MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
|
||||
"BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss",
|
||||
"MetricGroup": "Stall",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_frontend_tlb_rate",
|
||||
"MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
|
||||
"BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss",
|
||||
"MetricGroup": "Stall",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "dtlb_walk_ratio",
|
||||
"MetricExpr": "DTLB_WALK / L1D_TLB",
|
||||
"BriefDescription": "This metric measures the ratio of data TLB Walks to the total number of data TLB accesses. This gives an indication of the effectiveness of the data TLB accesses.",
|
||||
"MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
|
||||
"ScaleUnit": "1per TLB access"
|
||||
},
|
||||
{
|
||||
"MetricName": "itlb_walk_ratio",
|
||||
"MetricExpr": "ITLB_WALK / L1I_TLB",
|
||||
"BriefDescription": "This metric measures the ratio of instruction TLB Walks to the total number of instruction TLB accesses. This gives an indication of the effectiveness of the instruction TLB accesses.",
|
||||
"MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
|
||||
"ScaleUnit": "1per TLB access"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "backend_bound"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "frontend_bound",
|
||||
"MetricExpr": "100 - (retired_fraction + slots_lost_misspeculation_fraction + backend_bound)"
|
||||
},
|
||||
{
|
||||
"MetricName": "slots_lost_misspeculation_fraction",
|
||||
"MetricExpr": "(OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots)",
|
||||
"BriefDescription": "Fraction of slots lost due to misspeculation",
|
||||
"DefaultMetricgroupName": "TopdownL1",
|
||||
"MetricGroup": "Default;TopdownL1",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "retired_fraction",
|
||||
"MetricExpr": "OP_RETIRED / (CPU_CYCLES * #slots)",
|
||||
"BriefDescription": "Fraction of slots retiring, useful work",
|
||||
"DefaultMetricgroupName": "TopdownL1",
|
||||
"MetricGroup": "Default;TopdownL1",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "backend_core",
|
||||
"MetricExpr": "(backend_bound / 100) - backend_memory",
|
||||
"BriefDescription": "Fraction of slots the CPU was stalled due to backend non-memory subsystem issues",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "backend_memory",
|
||||
"MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE) / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of slots the CPU was stalled due to backend memory subsystem issues (cache/tlb miss)",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "branch_mispredict",
|
||||
"MetricExpr": "(BR_MIS_PRED_RETIRED / GPC_FLUSH) * slots_lost_misspeculation_fraction",
|
||||
"BriefDescription": "Fraction of slots lost due to branch misprediciton",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "1percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "frontend_bandwidth",
|
||||
"MetricExpr": "frontend_bound - frontend_latency",
|
||||
"BriefDescription": "Fraction of slots the CPU did not dispatch at full bandwidth - able to dispatch partial slots only (1, 2, or 3 uops)",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "1percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "frontend_latency",
|
||||
"MetricExpr": "(STALL_FRONTEND - ((STALL_SLOT_FRONTEND - ((frontend_bound / 100) * CPU_CYCLES * #slots)) / #slots)) / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of slots the CPU was stalled due to frontend latency issues (cache/tlb miss); nothing to dispatch",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "other_miss_pred",
|
||||
"MetricExpr": "slots_lost_misspeculation_fraction - branch_mispredict",
|
||||
"BriefDescription": "Fraction of slots lost due to other/non-branch misprediction misspeculation",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "1percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "pipe_utilization",
|
||||
"MetricExpr": "100 * ((IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6))",
|
||||
"BriefDescription": "Fraction of execute slots utilized",
|
||||
"MetricGroup": "TopdownL2",
|
||||
"ScaleUnit": "1percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "d_cache_l2_miss_rate",
|
||||
"MetricExpr": "STALL_BACKEND_MEM / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to data L2 cache miss",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "d_cache_miss_rate",
|
||||
"MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to data cache miss",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "d_tlb_miss_rate",
|
||||
"MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to data TLB miss",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "fsu_pipe_utilization",
|
||||
"MetricExpr": "FSU_ISSUED / (CPU_CYCLES * 2)",
|
||||
"BriefDescription": "Fraction of FSU execute slots utilized",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "i_cache_miss_rate",
|
||||
"MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to instruction cache miss",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "i_tlb_miss_rate",
|
||||
"MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to instruction TLB miss",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "ixu_pipe_utilization",
|
||||
"MetricExpr": "IXU_NUM_UOPS_ISSUED / (CPU_CYCLES * #slots)",
|
||||
"BriefDescription": "Fraction of IXU execute slots utilized",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_recovery_rate",
|
||||
"MetricExpr": "IDR_STALL_FLUSH / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled due to flush recovery",
|
||||
"MetricGroup": "TopdownL3",
|
||||
"ScaleUnit": "100percent of slots"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_fsu_sched_rate",
|
||||
"MetricExpr": "IDR_STALL_FSU_SCHED / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled and FSU was full",
|
||||
"MetricGroup": "TopdownL4",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_ixu_sched_rate",
|
||||
"MetricExpr": "IDR_STALL_IXU_SCHED / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled and IXU was full",
|
||||
"MetricGroup": "TopdownL4",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_lob_id_rate",
|
||||
"MetricExpr": "IDR_STALL_LOB_ID / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled and LOB was full",
|
||||
"MetricGroup": "TopdownL4",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_rob_id_rate",
|
||||
"MetricExpr": "IDR_STALL_ROB_ID / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled and ROB was full",
|
||||
"MetricGroup": "TopdownL4",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "stall_sob_id_rate",
|
||||
"MetricExpr": "IDR_STALL_SOB_ID / CPU_CYCLES",
|
||||
"BriefDescription": "Fraction of cycles the CPU was stalled and SOB was full",
|
||||
"MetricGroup": "TopdownL4",
|
||||
"ScaleUnit": "100percent of cycles"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_access_demand",
|
||||
"MetricExpr": "L1D_CACHE_RW / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache access - demand",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_access_prefetces",
|
||||
"MetricExpr": "L1D_CACHE_PRFM / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache access - prefetch",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_RW / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses_read",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_RD / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses - read",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_demand_misses_write",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_WR / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache demand misses - write",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "l1d_cache_prefetch_misses",
|
||||
"MetricExpr": "L1D_CACHE_REFILL_PRFM / L1D_CACHE",
|
||||
"BriefDescription": "L1D cache prefetch misses",
|
||||
"MetricGroup": "Cache",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "ase_scalar_mix",
|
||||
"MetricExpr": "ASE_SCALAR_SPEC / OP_SPEC",
|
||||
"BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) scalar operations",
|
||||
"MetricGroup": "Instructions",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
},
|
||||
{
|
||||
"MetricName": "ase_vector_mix",
|
||||
"MetricExpr": "ASE_VECTOR_SPEC / OP_SPEC",
|
||||
"BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) vector operations",
|
||||
"MetricGroup": "Instructions",
|
||||
"ScaleUnit": "100percent of cache acceses"
|
||||
}
|
||||
]
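
The metric and metric-group names defined above are what 'perf list' and 'perf stat -M' pick up once the JSON is built into perf. A minimal usage sketch, assuming a perf binary built with these AmpereOneX files and run on a matching system ('./workload' is just a placeholder):

  # Whole-system top-down level 1 breakdown for one second
  $ perf stat -a -M TopdownL1 -- sleep 1

  # A single metric by name
  $ perf stat -M branch_miss_pred_rate -- ./workload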
|
170
tools/perf/pmu-events/arch/arm64/ampere/ampereonex/mmu.json
Normal file
@ -0,0 +1,170 @@
|
||||
[
|
||||
{
|
||||
"PublicDescription": "Level 2 data translation buffer allocation",
|
||||
"EventCode": "0xD800",
|
||||
"EventName": "MMU_D_OTB_ALLOC",
|
||||
"BriefDescription": "Level 2 data translation buffer allocation"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data TLB translation cache hit on S1L2 walk cache entry",
|
||||
"EventCode": "0xd801",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S1L2_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S1L2 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data TLB translation cache hit on S1L1 walk cache entry",
|
||||
"EventCode": "0xd802",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S1L1_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S1L1 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data TLB translation cache hit on S1L0 walk cache entry",
|
||||
"EventCode": "0xd803",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S1L0_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S1L0 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data TLB translation cache hit on S2L2 walk cache entry",
|
||||
"EventCode": "0xd804",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S2L2_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S2L2 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data TLB translation cache hit on S2L1 walk cache entry",
|
||||
"EventCode": "0xd805",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S2L1_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S2L1 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data TLB translation cache hit on S2L0 walk cache entry",
|
||||
"EventCode": "0xd806",
|
||||
"EventName": "MMU_D_TRANS_CACHE_HIT_S2L0_WALK",
|
||||
"BriefDescription": "Data TLB translation cache hit on S2L0 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data-side S1 page walk cache lookup",
|
||||
"EventCode": "0xd807",
|
||||
"EventName": "MMU_D_S1_WALK_CACHE_LOOKUP",
|
||||
"BriefDescription": "Data-side S1 page walk cache lookup"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data-side S1 page walk cache refill",
|
||||
"EventCode": "0xd808",
|
||||
"EventName": "MMU_D_S1_WALK_CACHE_REFILL",
|
||||
"BriefDescription": "Data-side S1 page walk cache refill"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data-side S2 page walk cache lookup",
|
||||
"EventCode": "0xd809",
|
||||
"EventName": "MMU_D_S2_WALK_CACHE_LOOKUP",
|
||||
"BriefDescription": "Data-side S2 page walk cache lookup"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Data-side S2 page walk cache refill",
|
||||
"EventCode": "0xd80a",
|
||||
"EventName": "MMU_D_S2_WALK_CACHE_REFILL",
|
||||
"BriefDescription": "Data-side S2 page walk cache refill"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data-side S1 table walk fault",
|
||||
"EventCode": "0xD80B",
|
||||
"EventName": "MMU_D_S1_WALK_FAULT",
|
||||
"BriefDescription": "Data-side S1 table walk fault"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data-side S2 table walk fault",
|
||||
"EventCode": "0xD80C",
|
||||
"EventName": "MMU_D_S2_WALK_FAULT",
|
||||
"BriefDescription": "Data-side S2 table walk fault"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Data-side table walk steps or descriptor fetches",
|
||||
"EventCode": "0xD80D",
|
||||
"EventName": "MMU_D_WALK_STEPS",
|
||||
"BriefDescription": "Data-side table walk steps or descriptor fetches"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Level 2 instruction translation buffer allocation",
|
||||
"EventCode": "0xD900",
|
||||
"EventName": "MMU_I_OTB_ALLOC",
|
||||
"BriefDescription": "Level 2 instruction translation buffer allocation"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S1L2 walk cache entry",
|
||||
"EventCode": "0xd901",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S1L2_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S1L2 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S1L1 walk cache entry",
|
||||
"EventCode": "0xd902",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S1L1_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S1L1 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S1L0 walk cache entry",
|
||||
"EventCode": "0xd903",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S1L0_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S1L0 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S2L2 walk cache entry",
|
||||
"EventCode": "0xd904",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S2L2_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S2L2 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S2L1 walk cache entry",
|
||||
"EventCode": "0xd905",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S2L1_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S2L1 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction TLB translation cache hit on S2L0 walk cache entry",
|
||||
"EventCode": "0xd906",
|
||||
"EventName": "MMU_I_TRANS_CACHE_HIT_S2L0_WALK",
|
||||
"BriefDescription": "Instruction TLB translation cache hit on S2L0 walk cache entry"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction-side S1 page walk cache lookup",
|
||||
"EventCode": "0xd907",
|
||||
"EventName": "MMU_I_S1_WALK_CACHE_LOOKUP",
|
||||
"BriefDescription": "Instruction-side S1 page walk cache lookup"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction-side S1 page walk cache refill",
|
||||
"EventCode": "0xd908",
|
||||
"EventName": "MMU_I_S1_WALK_CACHE_REFILL",
|
||||
"BriefDescription": "Instruction-side S1 page walk cache refill"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction-side S2 page walk cache lookup",
|
||||
"EventCode": "0xd909",
|
||||
"EventName": "MMU_I_S2_WALK_CACHE_LOOKUP",
|
||||
"BriefDescription": "Instruction-side S2 page walk cache lookup"
|
||||
},
|
||||
{
|
||||
"PublicDescrition": "Instruction-side S2 page walk cache refill",
|
||||
"EventCode": "0xd90a",
|
||||
"EventName": "MMU_I_S2_WALK_CACHE_REFILL",
|
||||
"BriefDescription": "Instruction-side S2 page walk cache refill"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction-side S1 table walk fault",
|
||||
"EventCode": "0xD90B",
|
||||
"EventName": "MMU_I_S1_WALK_FAULT",
|
||||
"BriefDescription": "Instruction-side S1 table walk fault"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction-side S2 table walk fault",
|
||||
"EventCode": "0xD90C",
|
||||
"EventName": "MMU_I_S2_WALK_FAULT",
|
||||
"BriefDescription": "Instruction-side S2 table walk fault"
|
||||
},
|
||||
{
|
||||
"PublicDescription": "Instruction-side table walk steps or descriptor fetches",
|
||||
"EventCode": "0xD90D",
|
||||
"EventName": "MMU_I_WALK_STEPS",
|
||||
"BriefDescription": "Instruction-side table walk steps or descriptor fetches"
|
||||
}
|
||||
]
|
@ -0,0 +1,41 @@
[
{
"ArchStdEvent": "STALL_FRONTEND",
"Errata": "Errata AC03_CPU_29",
"BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_BACKEND"
},
{
"ArchStdEvent": "STALL",
"Errata": "Errata AC03_CPU_29",
"BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_SLOT_BACKEND"
},
{
"ArchStdEvent": "STALL_SLOT_FRONTEND",
"Errata": "Errata AC03_CPU_29",
"BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_SLOT"
},
{
"ArchStdEvent": "STALL_BACKEND_MEM"
},
{
"PublicDescription": "Frontend stall cycles, TLB",
"EventCode": "0x815c",
"EventName": "STALL_FRONTEND_TLB",
"BriefDescription": "Frontend stall cycles, TLB"
},
{
"PublicDescription": "Backend stall cycles, TLB",
"EventCode": "0x8167",
"EventName": "STALL_BACKEND_TLB",
"BriefDescription": "Backend stall cycles, TLB"
}
]
14
tools/perf/pmu-events/arch/arm64/ampere/ampereonex/spe.json
Normal file
@ -0,0 +1,14 @@
[
{
"ArchStdEvent": "SAMPLE_POP"
},
{
"ArchStdEvent": "SAMPLE_FEED"
},
{
"ArchStdEvent": "SAMPLE_FILTRATE"
},
{
"ArchStdEvent": "SAMPLE_COLLISION"
}
]
@ -107,7 +107,7 @@
"EventName": "hnf_qos_hh_retry",
"EventidCode": "0xe",
"NodeType": "0x5",
"BriefDescription": "Counts number of times a HighHigh priority request is protocolretried at the HN‑F.",
"BriefDescription": "Counts number of times a HighHigh priority request is protocolretried at the HN-F.",
"Unit": "arm_cmn",
"Compat": "(434|436|43c|43a).*"
},
@ -42,3 +42,4 @@
0x00000000480fd010,v1,hisilicon/hip08,core
0x00000000500f0000,v1,ampere/emag,core
0x00000000c00fac30,v1,ampere/ampereone,core
0x00000000c00fac40,v1,ampere/ampereonex,core
@ -11,8 +11,7 @@
#
# Multiple PVRs could map to a single JSON file.
#

# Power8 entries
0x004[bcd][[:xdigit:]]{4},1,power8,core
0x0066[[:xdigit:]]{4},1,power8,core
0x004e[[:xdigit:]]{4},1,power9,core
0x0080[[:xdigit:]]{4},1,power10,core
@ -99,6 +99,11 @@
"EventName": "PM_INST_FROM_L2MISS",
"BriefDescription": "The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss."
},
{
"EventCode": "0x0003C0000000C040",
"EventName": "PM_DATA_FROM_L2MISS_DSRC",
"BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss."
},
{
"EventCode": "0x000380000010C040",
"EventName": "PM_INST_FROM_L2MISS_ALL",
@ -161,9 +166,14 @@
},
{
"EventCode": "0x000780000000C040",
"EventName": "PM_INST_FROM_L3MISS",
"EventName": "PM_INST_FROM_L3MISS_DSRC",
"BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss."
},
{
"EventCode": "0x0007C0000000C040",
"EventName": "PM_DATA_FROM_L3MISS_DSRC",
"BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss."
},
{
"EventCode": "0x000780000010C040",
"EventName": "PM_INST_FROM_L3MISS_ALL",
@ -981,7 +991,7 @@
},
{
"EventCode": "0x0003C0000000C142",
"EventName": "PM_MRK_DATA_FROM_L2MISS",
"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
"BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction."
},
{
@ -1046,12 +1056,12 @@
},
{
"EventCode": "0x000780000000C142",
"EventName": "PM_MRK_INST_FROM_L3MISS",
"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
"BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
},
{
"EventCode": "0x0007C0000000C142",
"EventName": "PM_MRK_DATA_FROM_L3MISS",
"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
"BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
},
{
@ -15,3 +15,5 @@
#
#MVENDORID-MARCHID-MIMPID,Version,Filename,EventType
0x489-0x8000000000000007-0x[[:xdigit:]]+,v1,sifive/u74,core
0x5b7-0x0-0x0,v1,thead/c900-legacy,core
0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core
172
tools/perf/pmu-events/arch/riscv/starfive/dubhe-80/common.json
Normal file
@ -0,0 +1,172 @@
|
||||
[
|
||||
{
|
||||
"EventName": "ACCESS_MMU_STLB",
|
||||
"EventCode": "0x1",
|
||||
"BriefDescription": "access MMU STLB"
|
||||
},
|
||||
{
|
||||
"EventName": "MISS_MMU_STLB",
|
||||
"EventCode": "0x2",
|
||||
"BriefDescription": "miss MMU STLB"
|
||||
},
|
||||
{
|
||||
"EventName": "ACCESS_MMU_PTE_C",
|
||||
"EventCode": "0x3",
|
||||
"BriefDescription": "access MMU PTE-Cache"
|
||||
},
|
||||
{
|
||||
"EventName": "MISS_MMU_PTE_C",
|
||||
"EventCode": "0x4",
|
||||
"BriefDescription": "miss MMU PTE-Cache"
|
||||
},
|
||||
{
|
||||
"EventName": "ROB_FLUSH",
|
||||
"EventCode": "0x5",
|
||||
"BriefDescription": "ROB flush (all kinds of exceptions)"
|
||||
},
|
||||
{
|
||||
"EventName": "BTB_PREDICTION_MISS",
|
||||
"EventCode": "0x6",
|
||||
"BriefDescription": "BTB prediction miss"
|
||||
},
|
||||
{
|
||||
"EventName": "ITLB_MISS",
|
||||
"EventCode": "0x7",
|
||||
"BriefDescription": "ITLB miss"
|
||||
},
|
||||
{
|
||||
"EventName": "SYNC_DEL_FETCH_G",
|
||||
"EventCode": "0x8",
|
||||
"BriefDescription": "SYNC delivery a fetch-group"
|
||||
},
|
||||
{
|
||||
"EventName": "ICACHE_MISS",
|
||||
"EventCode": "0x9",
|
||||
"BriefDescription": "ICache miss"
|
||||
},
|
||||
{
|
||||
"EventName": "BPU_BR_RETIRE",
|
||||
"EventCode": "0xA",
|
||||
"BriefDescription": "condition branch instruction retire"
|
||||
},
|
||||
{
|
||||
"EventName": "BPU_BR_MISS",
|
||||
"EventCode": "0xB",
|
||||
"BriefDescription": "condition branch instruction miss"
|
||||
},
|
||||
{
|
||||
"EventName": "RET_INS_RETIRE",
|
||||
"EventCode": "0xC",
|
||||
"BriefDescription": "return instruction retire"
|
||||
},
|
||||
{
|
||||
"EventName": "RET_INS_MISS",
|
||||
"EventCode": "0xD",
|
||||
"BriefDescription": "return instruction miss"
|
||||
},
|
||||
{
|
||||
"EventName": "INDIRECT_JR_MISS",
|
||||
"EventCode": "0xE",
|
||||
"BriefDescription": "indirect JR instruction miss (inlcude without target)"
|
||||
},
|
||||
{
|
||||
"EventName": "IBUF_VAL_ID_NORDY",
|
||||
"EventCode": "0xF",
|
||||
"BriefDescription": "IBUF valid while ID not ready"
|
||||
},
|
||||
{
|
||||
"EventName": "IBUF_NOVAL_ID_RDY",
|
||||
"EventCode": "0x10",
|
||||
"BriefDescription": "IBUF not valid while ID ready"
|
||||
},
|
||||
{
|
||||
"EventName": "REN_INT_PHY_REG_NORDY",
|
||||
"EventCode": "0x11",
|
||||
"BriefDescription": "REN integer physical register file is not ready"
|
||||
},
|
||||
{
|
||||
"EventName": "REN_FP_PHY_REG_NORDY",
|
||||
"EventCode": "0x12",
|
||||
"BriefDescription": "REN floating point physical register file is not ready"
|
||||
},
|
||||
{
|
||||
"EventName": "REN_CP_NORDY",
|
||||
"EventCode": "0x13",
|
||||
"BriefDescription": "REN checkpoint is not ready"
|
||||
},
|
||||
{
|
||||
"EventName": "DEC_VAL_ROB_NORDY",
|
||||
"EventCode": "0x14",
|
||||
"BriefDescription": "DEC is valid and ROB is not ready"
|
||||
},
|
||||
{
|
||||
"EventName": "OOD_FLUSH_LS_DEP",
|
||||
"EventCode": "0x15",
|
||||
"BriefDescription": "out of order flush due to load/store dependency"
|
||||
},
|
||||
{
|
||||
"EventName": "BRU_RET_IJR_INS",
|
||||
"EventCode": "0x16",
|
||||
"BriefDescription": "BRU retire an IJR instruction"
|
||||
},
|
||||
{
|
||||
"EventName": "ACCESS_DTLB",
|
||||
"EventCode": "0x17",
|
||||
"BriefDescription": "access DTLB"
|
||||
},
|
||||
{
|
||||
"EventName": "MISS_DTLB",
|
||||
"EventCode": "0x18",
|
||||
"BriefDescription": "miss DTLB"
|
||||
},
|
||||
{
|
||||
"EventName": "LOAD_INS_DCACHE",
|
||||
"EventCode": "0x19",
|
||||
"BriefDescription": "load instruction access DCache"
|
||||
},
|
||||
{
|
||||
"EventName": "LOAD_INS_MISS_DCACHE",
|
||||
"EventCode": "0x1A",
|
||||
"BriefDescription": "load instruction miss DCache"
|
||||
},
|
||||
{
|
||||
"EventName": "STORE_INS_DCACHE",
|
||||
"EventCode": "0x1B",
|
||||
"BriefDescription": "store/amo instruction access DCache"
|
||||
},
|
||||
{
|
||||
"EventName": "STORE_INS_MISS_DCACHE",
|
||||
"EventCode": "0x1C",
|
||||
"BriefDescription": "store/amo instruction miss DCache"
|
||||
},
|
||||
{
|
||||
"EventName": "LOAD_SCACHE",
|
||||
"EventCode": "0x1D",
|
||||
"BriefDescription": "load access SCache"
|
||||
},
|
||||
{
|
||||
"EventName": "STORE_SCACHE",
|
||||
"EventCode": "0x1E",
|
||||
"BriefDescription": "store access SCache"
|
||||
},
|
||||
{
|
||||
"EventName": "LOAD_MISS_SCACHE",
|
||||
"EventCode": "0x1F",
|
||||
"BriefDescription": "load miss SCache"
|
||||
},
|
||||
{
|
||||
"EventName": "STORE_MISS_SCACHE",
|
||||
"EventCode": "0x20",
|
||||
"BriefDescription": "store miss SCache"
|
||||
},
|
||||
{
|
||||
"EventName": "L2C_PF_REQ",
|
||||
"EventCode": "0x21",
|
||||
"BriefDescription": "L2C data-prefetcher request"
|
||||
},
|
||||
{
|
||||
"EventName": "L2C_PF_HIT",
|
||||
"EventCode": "0x22",
|
||||
"BriefDescription": "L2C data-prefetcher hit"
|
||||
}
|
||||
]
|
@ -0,0 +1,68 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "FW_MISALIGNED_LOAD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_MISALIGNED_STORE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ACCESS_LOAD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ACCESS_STORE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ILLEGAL_INSN"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SET_TIMER"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_IPI_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_IPI_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_FENCE_I_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_FENCE_I_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_ASID_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_VMID_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_VMID_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_ASID_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_ASID_RECEIVED"
|
||||
}
|
||||
]
|
@ -0,0 +1,67 @@
|
||||
[
|
||||
{
|
||||
"EventName": "L1_ICACHE_ACCESS",
|
||||
"EventCode": "0x00000001",
|
||||
"BriefDescription": "L1 instruction cache access"
|
||||
},
|
||||
{
|
||||
"EventName": "L1_ICACHE_MISS",
|
||||
"EventCode": "0x00000002",
|
||||
"BriefDescription": "L1 instruction cache miss"
|
||||
},
|
||||
{
|
||||
"EventName": "ITLB_MISS",
|
||||
"EventCode": "0x00000003",
|
||||
"BriefDescription": "I-UTLB miss"
|
||||
},
|
||||
{
|
||||
"EventName": "DTLB_MISS",
|
||||
"EventCode": "0x00000004",
|
||||
"BriefDescription": "D-UTLB miss"
|
||||
},
|
||||
{
|
||||
"EventName": "JTLB_MISS",
|
||||
"EventCode": "0x00000005",
|
||||
"BriefDescription": "JTLB miss"
|
||||
},
|
||||
{
|
||||
"EventName": "L1_DCACHE_READ_ACCESS",
|
||||
"EventCode": "0x0000000c",
|
||||
"BriefDescription": "L1 data cache read access"
|
||||
},
|
||||
{
|
||||
"EventName": "L1_DCACHE_READ_MISS",
|
||||
"EventCode": "0x0000000d",
|
||||
"BriefDescription": "L1 data cache read miss"
|
||||
},
|
||||
{
|
||||
"EventName": "L1_DCACHE_WRITE_ACCESS",
|
||||
"EventCode": "0x0000000e",
|
||||
"BriefDescription": "L1 data cache write access"
|
||||
},
|
||||
{
|
||||
"EventName": "L1_DCACHE_WRITE_MISS",
|
||||
"EventCode": "0x0000000f",
|
||||
"BriefDescription": "L1 data cache write miss"
|
||||
},
|
||||
{
|
||||
"EventName": "LL_CACHE_READ_ACCESS",
|
||||
"EventCode": "0x00000010",
|
||||
"BriefDescription": "LL Cache read access"
|
||||
},
|
||||
{
|
||||
"EventName": "LL_CACHE_READ_MISS",
|
||||
"EventCode": "0x00000011",
|
||||
"BriefDescription": "LL Cache read miss"
|
||||
},
|
||||
{
|
||||
"EventName": "LL_CACHE_WRITE_ACCESS",
|
||||
"EventCode": "0x00000012",
|
||||
"BriefDescription": "LL Cache write access"
|
||||
},
|
||||
{
|
||||
"EventName": "LL_CACHE_WRITE_MISS",
|
||||
"EventCode": "0x00000013",
|
||||
"BriefDescription": "LL Cache write miss"
|
||||
}
|
||||
]
|
@ -0,0 +1,68 @@
|
||||
[
|
||||
{
|
||||
"ArchStdEvent": "FW_MISALIGNED_LOAD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_MISALIGNED_STORE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ACCESS_LOAD"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ACCESS_STORE"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_ILLEGAL_INSN"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SET_TIMER"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_IPI_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_IPI_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_FENCE_I_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_FENCE_I_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_SFENCE_VMA_ASID_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_VMID_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_GVMA_VMID_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_RECEIVED"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_ASID_SENT"
|
||||
},
|
||||
{
|
||||
"ArchStdEvent": "FW_HFENCE_VVMA_ASID_RECEIVED"
|
||||
}
|
||||
]
|
@ -0,0 +1,72 @@
|
||||
[
|
||||
{
|
||||
"EventName": "INST_BRANCH_MISPREDICT",
|
||||
"EventCode": "0x00000006",
|
||||
"BriefDescription": "Mispredicted branch instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_BRANCH",
|
||||
"EventCode": "0x00000007",
|
||||
"BriefDescription": "Retired branch instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_JMP_MISPREDICT",
|
||||
"EventCode": "0x00000008",
|
||||
"BriefDescription": "Indirect branch mispredict"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_JMP",
|
||||
"EventCode": "0x00000009",
|
||||
"BriefDescription": "Retired jmp instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_STORE",
|
||||
"EventCode": "0x0000000b",
|
||||
"BriefDescription": "Retired store instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_ALU",
|
||||
"EventCode": "0x0000001d",
|
||||
"BriefDescription": "Retired ALU instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_LDST",
|
||||
"EventCode": "0x0000001e",
|
||||
"BriefDescription": "Retired Load/Store instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_VECTOR",
|
||||
"EventCode": "0x0000001f",
|
||||
"BriefDescription": "Retired Vector instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_CSR",
|
||||
"EventCode": "0x00000020",
|
||||
"BriefDescription": "Retired CSR instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_SYNC",
|
||||
"EventCode": "0x00000021",
|
||||
"BriefDescription": "Retired sync instructions (AMO/LR/SC instructions)"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_UNALIGNED_ACCESS",
|
||||
"EventCode": "0x00000022",
|
||||
"BriefDescription": "Retired Store/Load instructions with unaligned memory access"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_ECALL",
|
||||
"EventCode": "0x00000025",
|
||||
"BriefDescription": "Retired ecall instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_LONG_JP",
|
||||
"EventCode": "0x00000026",
|
||||
"BriefDescription": "Retired long jump instructions"
|
||||
},
|
||||
{
|
||||
"EventName": "INST_FP",
|
||||
"EventCode": "0x0000002a",
|
||||
"BriefDescription": "Retired FPU instructions"
|
||||
}
|
||||
]
|
@ -0,0 +1,80 @@
|
||||
[
|
||||
{
|
||||
"EventName": "LSU_SPEC_FAIL",
|
||||
"EventCode": "0x0000000a",
|
||||
"BriefDescription": "LSU speculation fail"
|
||||
},
|
||||
{
|
||||
"EventName": "IDU_RF_PIPE_FAIL",
|
||||
"EventCode": "0x00000014",
|
||||
"BriefDescription": "Instruction decode unit launch pipeline failed in RF state"
|
||||
},
|
||||
{
|
||||
"EventName": "IDU_RF_REG_FAIL",
|
||||
"EventCode": "0x00000015",
|
||||
"BriefDescription": "Instruction decode unit launch register file fail in RF state"
|
||||
},
|
||||
{
|
||||
"EventName": "IDU_RF_INSTRUCTION",
|
||||
"EventCode": "0x00000016",
|
||||
"BriefDescription": "retired instruction count of Instruction decode unit in RF (Register File) stage"
|
||||
},
|
||||
{
|
||||
"EventName": "LSU_4K_STALL",
|
||||
"EventCode": "0x00000017",
|
||||
"BriefDescription": "LSU stall times for long distance data access (Over 4K)",
|
||||
"PublicDescription": "This stall occurs when translate virtual address with page offset over 4k"
|
||||
},
|
||||
{
|
||||
"EventName": "LSU_OTHER_STALL",
|
||||
"EventCode": "0x00000018",
|
||||
"BriefDescription": "LSU stall times for other reasons (except the 4k stall)"
|
||||
},
|
||||
{
|
||||
"EventName": "LSU_SQ_OTHER_DIS",
|
||||
"EventCode": "0x00000019",
|
||||
"BriefDescription": "LSU store queue discard others"
|
||||
},
|
||||
{
|
||||
"EventName": "LSU_SQ_DATA_DISCARD",
|
||||
"EventCode": "0x0000001a",
|
||||
"BriefDescription": "LSU store queue discard data (uops)"
|
||||
},
|
||||
{
|
||||
"EventName": "BRANCH_DIRECTION_MISPREDICTION",
|
||||
"EventCode": "0x0000001b",
|
||||
"BriefDescription": "Branch misprediction in BTB"
|
||||
},
|
||||
{
|
||||
"EventName": "BRANCH_DIRECTION_PREDICTION",
|
||||
"EventCode": "0x0000001c",
|
||||
"BriefDescription": "All branch prediction in BTB",
|
||||
"PublicDescription": "This event including both successful prediction and failed prediction in BTB"
|
||||
},
|
||||
{
|
||||
"EventName": "INTERRUPT_ACK_COUNT",
|
||||
"EventCode": "0x00000023",
|
||||
"BriefDescription": "acknowledged interrupt count"
|
||||
},
|
||||
{
|
||||
"EventName": "INTERRUPT_OFF_CYCLE",
|
||||
"EventCode": "0x00000024",
|
||||
"BriefDescription": "PLIC arbitration time when the interrupt is not responded",
|
||||
"PublicDescription": "The arbitration time is recorded while meeting any of the following:\n- CPU is M-mode and MIE == 0\n- CPU is S-mode and delegation and SIE == 0\n"
|
||||
},
|
||||
{
|
||||
"EventName": "IFU_STALLED_CYCLE",
|
||||
"EventCode": "0x00000027",
|
||||
"BriefDescription": "Number of stall cycles of the instruction fetch unit (IFU)."
|
||||
},
|
||||
{
|
||||
"EventName": "IDU_STALLED_CYCLE",
|
||||
"EventCode": "0x00000028",
|
||||
"BriefDescription": "hpcp_backend_stall Number of stall cycles of the instruction decoding unit (IDU) and next-level pipeline unit."
|
||||
},
|
||||
{
|
||||
"EventName": "SYNC_STALL",
|
||||
"EventCode": "0x00000029",
|
||||
"BriefDescription": "Sync instruction stall cycle fence/fence.i/sync/sfence"
|
||||
}
|
||||
]
|
@ -69,12 +69,6 @@
"MetricName": "C9_Pkg_Residency",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Uncore frequency per die [GHZ]",
"MetricExpr": "tma_info_system_socket_clks / #num_dies / duration_time / 1e9",
"MetricGroup": "SoC",
"MetricName": "UNCORE_FREQ"
},
{
"BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
"MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
@ -809,6 +803,13 @@
"ScaleUnit": "100%",
"Unit": "cpu_atom"
},
{
"BriefDescription": "Uncore frequency per die [GHZ]",
"MetricExpr": "tma_info_system_socket_clks / #num_dies / duration_time / 1e9",
"MetricGroup": "SoC",
"MetricName": "UNCORE_FREQ",
"Unit": "cpu_core"
},
{
"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execution ports for ALU operations.",
"MetricExpr": "(cpu_core@UOPS_DISPATCHED.PORT_0@ + cpu_core@UOPS_DISPATCHED.PORT_1@ + cpu_core@UOPS_DISPATCHED.PORT_5_11@ + cpu_core@UOPS_DISPATCHED.PORT_6@) / (5 * tma_info_core_core_clks)",
@ -1838,7 +1839,7 @@
},
{
"BriefDescription": "Average number of parallel data read requests to external memory",
"MetricExpr": "UNC_ARB_DAT_OCCUPANCY.RD / cpu_core@UNC_ARB_DAT_OCCUPANCY.RD\\,cmask\\=1@",
"MetricExpr": "UNC_ARB_DAT_OCCUPANCY.RD / UNC_ARB_DAT_OCCUPANCY.RD@cmask\\=1@",
"MetricGroup": "Mem;MemoryBW;SoC",
"MetricName": "tma_info_system_mem_parallel_reads",
"PublicDescription": "Average number of parallel data read requests to external memory. Accounts for demand loads and L1/L2 prefetches",
101
tools/perf/pmu-events/arch/x86/amdzen4/memory-controller.json
Normal file
@ -0,0 +1,101 @@
|
||||
[
|
||||
{
|
||||
"EventName": "umc_mem_clk",
|
||||
"PublicDescription": "Number of memory clock cycles.",
|
||||
"EventCode": "0x00",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_act_cmd.all",
|
||||
"PublicDescription": "Number of ACTIVATE commands sent.",
|
||||
"EventCode": "0x05",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_act_cmd.rd",
|
||||
"PublicDescription": "Number of ACTIVATE commands sent for reads.",
|
||||
"EventCode": "0x05",
|
||||
"RdWrMask": "0x1",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_act_cmd.wr",
|
||||
"PublicDescription": "Number of ACTIVATE commands sent for writes.",
|
||||
"EventCode": "0x05",
|
||||
"RdWrMask": "0x2",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_pchg_cmd.all",
|
||||
"PublicDescription": "Number of PRECHARGE commands sent.",
|
||||
"EventCode": "0x06",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_pchg_cmd.rd",
|
||||
"PublicDescription": "Number of PRECHARGE commands sent for reads.",
|
||||
"EventCode": "0x06",
|
||||
"RdWrMask": "0x1",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_pchg_cmd.wr",
|
||||
"PublicDescription": "Number of PRECHARGE commands sent for writes.",
|
||||
"EventCode": "0x06",
|
||||
"RdWrMask": "0x2",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_cas_cmd.all",
|
||||
"PublicDescription": "Number of CAS commands sent.",
|
||||
"EventCode": "0x0a",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_cas_cmd.rd",
|
||||
"PublicDescription": "Number of CAS commands sent for reads.",
|
||||
"EventCode": "0x0a",
|
||||
"RdWrMask": "0x1",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_cas_cmd.wr",
|
||||
"PublicDescription": "Number of CAS commands sent for writes.",
|
||||
"EventCode": "0x0a",
|
||||
"RdWrMask": "0x2",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_data_slot_clks.all",
|
||||
"PublicDescription": "Number of clocks used by the data bus.",
|
||||
"EventCode": "0x14",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_data_slot_clks.rd",
|
||||
"PublicDescription": "Number of clocks used by the data bus for reads.",
|
||||
"EventCode": "0x14",
|
||||
"RdWrMask": "0x1",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
},
|
||||
{
|
||||
"EventName": "umc_data_slot_clks.wr",
|
||||
"PublicDescription": "Number of clocks used by the data bus for writes.",
|
||||
"EventCode": "0x14",
|
||||
"RdWrMask": "0x2",
|
||||
"PerPkg": "1",
|
||||
"Unit": "UMCPMC"
|
||||
}
|
||||
]
|
@ -330,5 +330,89 @@
|
||||
"MetricGroup": "data_fabric",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "6.103515625e-5MiB"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_data_bus_utilization",
|
||||
"BriefDescription": "Memory controller data bus utilization.",
|
||||
"MetricExpr": "d_ratio(umc_data_slot_clks.all / 2, umc_mem_clk)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_cas_cmd_rate",
|
||||
"BriefDescription": "Memory controller CAS command rate.",
|
||||
"MetricExpr": "d_ratio(umc_cas_cmd.all * 1000, umc_mem_clk)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_cas_cmd_read_ratio",
|
||||
"BriefDescription": "Ratio of memory controller CAS commands for reads.",
|
||||
"MetricExpr": "d_ratio(umc_cas_cmd.rd, umc_cas_cmd.all)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_cas_cmd_write_ratio",
|
||||
"BriefDescription": "Ratio of memory controller CAS commands for writes.",
|
||||
"MetricExpr": "d_ratio(umc_cas_cmd.wr, umc_cas_cmd.all)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_mem_read_bandwidth",
|
||||
"BriefDescription": "Estimated memory read bandwidth.",
|
||||
"MetricExpr": "(umc_cas_cmd.rd * 64) / 1e6 / duration_time",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "1MB/s"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_mem_write_bandwidth",
|
||||
"BriefDescription": "Estimated memory write bandwidth.",
|
||||
"MetricExpr": "(umc_cas_cmd.wr * 64) / 1e6 / duration_time",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "1MB/s"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_mem_bandwidth",
|
||||
"BriefDescription": "Estimated combined memory bandwidth.",
|
||||
"MetricExpr": "(umc_cas_cmd.all * 64) / 1e6 / duration_time",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "1MB/s"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_cas_cmd_read_ratio",
|
||||
"BriefDescription": "Ratio of memory controller CAS commands for reads.",
|
||||
"MetricExpr": "d_ratio(umc_cas_cmd.rd, umc_cas_cmd.all)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1",
|
||||
"ScaleUnit": "100%"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_cas_cmd_rate",
|
||||
"BriefDescription": "Memory controller CAS command rate.",
|
||||
"MetricExpr": "d_ratio(umc_cas_cmd.all * 1000, umc_mem_clk)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_activate_cmd_rate",
|
||||
"BriefDescription": "Memory controller ACTIVATE command rate.",
|
||||
"MetricExpr": "d_ratio(umc_act_cmd.all * 1000, umc_mem_clk)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1"
|
||||
},
|
||||
{
|
||||
"MetricName": "umc_precharge_cmd_rate",
|
||||
"BriefDescription": "Memory controller PRECHARGE command rate.",
|
||||
"MetricExpr": "d_ratio(umc_pchg_cmd.all * 1000, umc_mem_clk)",
|
||||
"MetricGroup": "memory_controller",
|
||||
"PerPkg": "1"
|
||||
}
|
||||
]
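
The bandwidth metrics above scale CAS command counts by the 64-byte access size directly in the MetricExpr ((umc_cas_cmd.* * 64) / 1e6 / duration_time), and the 1MB/s ScaleUnit does the rest. A usage sketch, assuming a Zen 4 machine whose kernel exposes the UMC PMU events named here:

  # System-wide estimated DRAM read/write bandwidth for one second
  $ perf stat -a -M umc_mem_read_bandwidth,umc_mem_write_bandwidth -- sleep 1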
|
||||
|
@ -1862,6 +1862,12 @@
"MetricName": "uncore_frequency",
"ScaleUnit": "1GHz"
},
{
"BriefDescription": "Intel(R) Ultra Path Interconnect (UPI) data receive bandwidth (MB/sec)",
"MetricExpr": "UNC_UPI_RxL_FLITS.ALL_DATA * 7.111111111111111 / 1e6 / duration_time",
"MetricName": "upi_data_receive_bw",
"ScaleUnit": "1MB/s"
},
{
"BriefDescription": "Intel(R) Ultra Path Interconnect (UPI) data transmit bandwidth (MB/sec)",
"MetricExpr": "UNC_UPI_TxL_FLITS.ALL_DATA * 7.111111111111111 / 1e6 / duration_time",
@@ -23,26 +23,47 @@
"UMask": "0x10"
},
{
-"BriefDescription": "FP_ARITH_DISPATCHED.PORT_0",
+"BriefDescription": "FP_ARITH_DISPATCHED.PORT_0 [This event is alias to FP_ARITH_DISPATCHED.V0]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.PORT_0",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
{
-"BriefDescription": "FP_ARITH_DISPATCHED.PORT_1",
+"BriefDescription": "FP_ARITH_DISPATCHED.PORT_1 [This event is alias to FP_ARITH_DISPATCHED.V1]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.PORT_1",
"SampleAfterValue": "2000003",
"UMask": "0x2"
},
{
-"BriefDescription": "FP_ARITH_DISPATCHED.PORT_5",
+"BriefDescription": "FP_ARITH_DISPATCHED.PORT_5 [This event is alias to FP_ARITH_DISPATCHED.V2]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.PORT_5",
"SampleAfterValue": "2000003",
"UMask": "0x4"
},
{
"BriefDescription": "FP_ARITH_DISPATCHED.V0 [This event is alias to FP_ARITH_DISPATCHED.PORT_0]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.V0",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
{
"BriefDescription": "FP_ARITH_DISPATCHED.V1 [This event is alias to FP_ARITH_DISPATCHED.PORT_1]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.V1",
"SampleAfterValue": "2000003",
"UMask": "0x2"
},
{
"BriefDescription": "FP_ARITH_DISPATCHED.V2 [This event is alias to FP_ARITH_DISPATCHED.PORT_5]",
"EventCode": "0xb3",
"EventName": "FP_ARITH_DISPATCHED.V2",
"SampleAfterValue": "2000003",
"UMask": "0x4"
},
{
"BriefDescription": "Counts number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired; some instructions will count twice as noted below. Each count represents 2 computation operations, one for each element. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform 2 calculations per element.",
"EventCode": "0xc7",
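This hunk adds V0/V1/V2 names for the port-numbered FP dispatch events and cross-references them as aliases (same EventCode 0xb3 and UMask), so either spelling selects the same counter once the updated JSON is installed. A small sketch, with ./my_workload standing in for whatever you want to measure:

 $ perf stat -e fp_arith_dispatched.port_0 -- ./my_workload
 $ perf stat -e fp_arith_dispatched.v0 -- ./my_workload    # same counter via the alias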
@@ -1,20 +1,4 @@
[
-{
-"BriefDescription": "AMX retired arithmetic BF16 operations.",
-"EventCode": "0xce",
-"EventName": "AMX_OPS_RETIRED.BF16",
-"PublicDescription": "Number of AMX-based retired arithmetic bfloat16 (BF16) floating-point operations. Counts TDPBF16PS FP instructions. SW to use operation multiplier of 4",
-"SampleAfterValue": "1000003",
-"UMask": "0x2"
-},
-{
-"BriefDescription": "AMX retired arithmetic integer 8-bit operations.",
-"EventCode": "0xce",
-"EventName": "AMX_OPS_RETIRED.INT8",
-"PublicDescription": "Number of AMX-based retired arithmetic integer operations of 8-bit width source operands. Counts TDPB[SS,UU,US,SU]D instructions. SW should use operation multiplier of 8.",
-"SampleAfterValue": "1000003",
-"UMask": "0x1"
-},
{
"BriefDescription": "This event is deprecated. Refer to new event ARITH.DIV_ACTIVE",
"CounterMask": "1",
@@ -505,7 +489,7 @@
"UMask": "0x1"
},
{
-"BriefDescription": "INT_MISC.UNKNOWN_BRANCH_CYCLES",
+"BriefDescription": "Bubble cycles of BAClear (Unknown Branch).",
"EventCode": "0xad",
"EventName": "INT_MISC.UNKNOWN_BRANCH_CYCLES",
"MSRIndex": "0x3F7",
@@ -4825,11 +4825,11 @@
"Unit": "M3UPI"
},
{
-"BriefDescription": "Number of allocations into the CRS Egress used to queue up requests destined to the mesh (AD Bouncable)",
+"BriefDescription": "Number of allocations into the CRS Egress used to queue up requests destined to the mesh (AD Bounceable)",
"EventCode": "0x47",
"EventName": "UNC_MDF_CRS_TxR_INSERTS.AD_BNC",
"PerPkg": "1",
-"PublicDescription": "AD Bouncable : Number of allocations into the CRS Egress",
+"PublicDescription": "AD Bounceable : Number of allocations into the CRS Egress",
"UMask": "0x1",
"Unit": "MDF"
},
@@ -4861,11 +4861,11 @@
"Unit": "MDF"
},
{
-"BriefDescription": "Number of allocations into the CRS Egress used to queue up requests destined to the mesh (BL Bouncable)",
+"BriefDescription": "Number of allocations into the CRS Egress used to queue up requests destined to the mesh (BL Bounceable)",
"EventCode": "0x47",
"EventName": "UNC_MDF_CRS_TxR_INSERTS.BL_BNC",
"PerPkg": "1",
-"PublicDescription": "BL Bouncable : Number of allocations into the CRS Egress",
+"PublicDescription": "BL Bounceable : Number of allocations into the CRS Egress",
"UMask": "0x4",
"Unit": "MDF"
},
@@ -1185,6 +1185,36 @@
"UMask": "0x70ff010",
"Unit": "IIO"
},
{
"BriefDescription": ": IOTLB Hits to a 1G Page",
"EventCode": "0x40",
"EventName": "UNC_IIO_IOMMU0.1G_HITS",
"PerPkg": "1",
"PortMask": "0x0000",
"PublicDescription": ": IOTLB Hits to a 1G Page : Counts if a transaction to a 1G page, on its first lookup, hits the IOTLB.",
"UMask": "0x10",
"Unit": "IIO"
},
{
"BriefDescription": ": IOTLB Hits to a 2M Page",
"EventCode": "0x40",
"EventName": "UNC_IIO_IOMMU0.2M_HITS",
"PerPkg": "1",
"PortMask": "0x0000",
"PublicDescription": ": IOTLB Hits to a 2M Page : Counts if a transaction to a 2M page, on its first lookup, hits the IOTLB.",
"UMask": "0x8",
"Unit": "IIO"
},
{
"BriefDescription": ": IOTLB Hits to a 4K Page",
"EventCode": "0x40",
"EventName": "UNC_IIO_IOMMU0.4K_HITS",
"PerPkg": "1",
"PortMask": "0x0000",
"PublicDescription": ": IOTLB Hits to a 4K Page : Counts if a transaction to a 4K page, on its first lookup, hits the IOTLB.",
"UMask": "0x4",
"Unit": "IIO"
},
{
"BriefDescription": ": Context cache hits",
"EventCode": "0x40",
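The hunk above adds IOTLB first-lookup hit events per page size (1G/2M/4K) to the IIO IOMMU unit. A rough counting sketch, assuming perf exposes the JSON names as lowercase aliases and that the platform has these IIO uncore PMUs; uncore events are counted system-wide, hence -a:

 $ perf stat -a -e unc_iio_iommu0.4k_hits,unc_iio_iommu0.2m_hits,unc_iio_iommu0.1g_hits -- sleep 1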
@@ -1846,6 +1846,12 @@
"MetricName": "uncore_frequency",
"ScaleUnit": "1GHz"
},
{
"BriefDescription": "Intel(R) Ultra Path Interconnect (UPI) data receive bandwidth (MB/sec)",
"MetricExpr": "UNC_UPI_RxL_FLITS.ALL_DATA * 7.111111111111111 / 1e6 / duration_time",
"MetricName": "upi_data_receive_bw",
"ScaleUnit": "1MB/s"
},
{
"BriefDescription": "Intel(R) Ultra Path Interconnect (UPI) data transmit bandwidth (MB/sec)",
"MetricExpr": "UNC_UPI_TxL_FLITS.ALL_DATA * 7.111111111111111 / 1e6 / duration_time",
@@ -19,7 +19,7 @@
"BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo schedule.",
"EventCode": "0x28",
"EventName": "CORE_POWER.LVL2_TURBO_LICENSE",
-"PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server microarchtecture). This includes high current AVX 512-bit instructions.",
+"PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server microarchitecture). This includes high current AVX 512-bit instructions.",
"SampleAfterValue": "200003",
"UMask": "0x20"
},
@@ -519,7 +519,7 @@
"BriefDescription": "Cycles when Reservation Station (RS) is empty for the thread",
"EventCode": "0x5e",
"EventName": "RS_EVENTS.EMPTY_CYCLES",
-"PublicDescription": "Counts cycles during which the reservation station (RS) is empty for this logical processor. This is usually caused when the front-end pipeline runs into stravation periods (e.g. branch mispredictions or i-cache misses)",
+"PublicDescription": "Counts cycles during which the reservation station (RS) is empty for this logical processor. This is usually caused when the front-end pipeline runs into starvation periods (e.g. branch mispredictions or i-cache misses)",
"SampleAfterValue": "1000003",
"UMask": "0x1"
},
@@ -38,7 +38,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CLFLUSH",
"PerPkg": "1",
-"PublicDescription": "Coherent Ops : CLFlush : Counts the number of coherency related operations servied by the IRP",
+"PublicDescription": "Coherent Ops : CLFlush : Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x80",
"Unit": "IRP"
},
@@ -65,7 +65,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.WBMTOI",
"PerPkg": "1",
-"PublicDescription": "Coherent Ops : WbMtoI : Counts the number of coherency related operations servied by the IRP",
+"PublicDescription": "Coherent Ops : WbMtoI : Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x40",
"Unit": "IRP"
},
@@ -454,7 +454,7 @@
"EventCode": "0x11",
"EventName": "UNC_I_TRANSACTIONS.WRITES",
"PerPkg": "1",
-"PublicDescription": "Inbound Transaction Count : Writes : Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID. : Trackes only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
+"PublicDescription": "Inbound Transaction Count : Writes : Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID. : Tracks only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
"UMask": "0x2",
"Unit": "IRP"
},
@@ -7,7 +7,7 @@ GenuineIntel-6-56,v11,broadwellde,core
GenuineIntel-6-4F,v22,broadwellx,core
GenuineIntel-6-55-[56789ABCDEF],v1.20,cascadelakex,core
GenuineIntel-6-9[6C],v1.04,elkhartlake,core
-GenuineIntel-6-CF,v1.01,emeraldrapids,core
+GenuineIntel-6-CF,v1.02,emeraldrapids,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
GenuineIntel-6-B6,v1.00,grandridge,core
@@ -15,7 +15,7 @@ GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v33,haswell,core
GenuineIntel-6-3F,v28,haswellx,core
GenuineIntel-6-7[DE],v1.19,icelake,core
-GenuineIntel-6-6[AC],v1.21,icelakex,core
+GenuineIntel-6-6[AC],v1.23,icelakex,core
GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v24,ivytown,core
GenuineIntel-6-2D,v24,jaketown,core
@@ -26,7 +26,7 @@ GenuineIntel-6-1[AEF],v4,nehalemep,core
GenuineIntel-6-2E,v4,nehalemex,core
GenuineIntel-6-A7,v1.01,rocketlake,core
GenuineIntel-6-2A,v19,sandybridge,core
-GenuineIntel-6-8F,v1.16,sapphirerapids,core
+GenuineIntel-6-8F,v1.17,sapphirerapids,core
GenuineIntel-6-AF,v1.00,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v57,skylake,core
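The mapfile.csv hunks above only bump table versions (emeraldrapids, icelakex, sapphirerapids). Each row maps a CPUID identifier pattern to the JSON event-table directory and event type that perf builds in. A rough way to see which identifier your machine resolves to, assuming a perf built with these tables (/tmp/hdr.data is just an example path):

 $ perf record -o /tmp/hdr.data -- true
 $ perf report -i /tmp/hdr.data --header-only | grep cpuid
 # cpuid : GenuineIntel-6-8F-8    <- illustrative output; matched against the first column above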