mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-04 04:06:26 +00:00
d99b312572
1227 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Arnaldo Carvalho de Melo
|
a6e8a58de6 |
perf disasm: Allow configuring what disassemblers to use
The perf tools annotation code used for a long time parsing the output of
binutils's objdump (or its reimplementations, like llvm's) to then parse and
augment it with samples, allow navigation, etc.

More recently disassemblers from the capstone and llvm (libraries, not parsing
the output of tools using those libraries to mimic binutils's objdump output)
were introduced.

So when all those methods are available, there is a static preference for a
series of attempts of disassembling a binary, with the 'llvm, capstone,
objdump' sequence being hard coded.

This patch allows users to change that sequence, specifying via a 'perf config'
'annotate.disassemblers' entry which and in what order disassemblers should be
attempted.

As alluded to in the comments in the source code of this series, this
flexibility is useful for users and developers alike, eliminating the
requirement to rebuild the tool with some specific set of libraries to see how
the output of disassembling would be for one of these methods.

root@x1:~# rm -f ~/.perfconfig
root@x1:~# perf annotate -v --stdio2 update_load_avg
<SNIP>
symbol__disassemble: filename=/usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux, sym=update_load_avg, start=0xffffffffb6148fe0, en>
annotating [0x6ff7170] /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux : [0x7407ca0] update_load_avg
Disassembled with llvm
annotate.disassemblers=llvm,capstone,objdump
Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period]
update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
Percent       0xffffffff81148fe0 <update_load_avg>:
   1.61         pushq   %r15
                pushq   %r14
   1.00         pushq   %r13
                movl    %edx,%r13d
   1.90         pushq   %r12
                pushq   %rbp
                movq    %rsi,%rbp
                pushq   %rbx
                movq    %rdi,%rbx
                subq    $0x18,%rsp
  15.14         movl    0x1a4(%rdi),%eax

root@x1:~# perf config annotate.disassemblers=capstone
root@x1:~# cat ~/.perfconfig
# this file is auto-generated.
[annotate]
        disassemblers = capstone
root@x1:~#
root@x1:~# perf annotate -v --stdio2 update_load_avg
<SNIP>
Disassembled with capstone
annotate.disassemblers=capstone
Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period]
update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
Percent       0xffffffff81148fe0 <update_load_avg>:
   1.61         pushq   %r15
                pushq   %r14
   1.00         pushq   %r13
                movl    %edx,%r13d
   1.90         pushq   %r12
                pushq   %rbp
                movq    %rsi,%rbp
                pushq   %rbx
                movq    %rdi,%rbx
                subq    $0x18,%rsp
  15.14         movl    0x1a4(%rdi),%eax

root@x1:~# perf config annotate.disassemblers=objdump,capstone
root@x1:~# perf config annotate.disassemblers
annotate.disassemblers=objdump,capstone
root@x1:~# cat ~/.perfconfig
# this file is auto-generated.
[annotate]
        disassemblers = objdump,capstone
root@x1:~# perf annotate -v --stdio2 update_load_avg
Executing: objdump --start-address=0xffffffff81148fe0 \
                   --stop-address=0xffffffff811497aa \
                   -d --no-show-raw-insn -S -C "$1"
Disassembled with objdump
annotate.disassemblers=objdump,capstone
Samples: 66 of event 'cpu_atom/cycles/P', 10000 Hz, Event count (approx.): 5185444, [percent: local period]
update_load_avg() /usr/lib/debug/lib/modules/6.11.4-201.fc40.x86_64/vmlinux
Percent
              Disassembly of section .text:

              ffffffff81148fe0 <update_load_avg>:
              #define DO_ATTACH 0x4
              ffffffff81148fe0 <update_load_avg>:
              #define DO_ATTACH 0x4
              #define DO_DETACH 0x8

              /* Update task and its cfs_rq load average */
              static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
              {
   1.61         push   %r15
                push   %r14
   1.00         push   %r13
                mov    %edx,%r13d
   1.90         push   %r12
                push   %rbp
                mov    %rsi,%rbp
                push   %rbx
                mov    %rdi,%rbx
                sub    $0x18,%rsp
              }

              /* rq->task_clock normalized against any time this cfs_rq has spent throttled */
              static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
              {
              if (unlikely(cfs_rq->throttle_count))
  15.14         mov    0x1a4(%rdi),%eax
root@x1:~#

After adding a way to select the disassembler from the command line a 'perf
test' comparing the output of the various disassemblers should be introduced,
to test these codebases.

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Link: https://lore.kernel.org/r/20241111151734.1018476-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
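The ordered fallback described in the commit above can be illustrated with a small Python sketch. This is not perf's implementation (perf is written in C); the function names, the `backends` mapping and the error handling are assumptions made for illustration. Only the three disassembler names and the default 'llvm, capstone, objdump' order come from the commit message.

```python
# Hypothetical sketch of an ordered disassembler fallback; not perf's code.
DEFAULT_ORDER = ["llvm", "capstone", "objdump"]  # default order per the commit message


def parse_disassemblers(value, known=DEFAULT_ORDER):
    """Split a string such as 'objdump,capstone' into an ordered list of names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    for name in names:
        if name not in known:
            raise ValueError(f"unknown disassembler: {name}")
    return names


def disassemble(symbol, order, backends):
    """Try each backend in order and return (name, output) of the first success.

    `backends` maps a name to a callable; a missing entry stands for a
    disassembler that was not built in (an assumption of this sketch).
    """
    for name in order:
        backend = backends.get(name)
        if backend is None:
            continue  # not available in this build, try the next one
        try:
            return name, backend(symbol)
        except RuntimeError:
            continue  # this backend failed, fall through to the next one
    raise RuntimeError("no disassembler succeeded")
```

With `annotate.disassemblers=objdump,capstone`, such a loop would try objdump first and only fall back to capstone if objdump is unavailable or fails.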
Ian Rogers
|
6d5d90a6ab |
perf docs: Document tool and hwmon events
Add a few paragraphs on tool and hwmon events. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20241109003759.473460-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Graham Woodward
|
35f5aa9ccc |
perf arm-spe: Update --itrace help text
The --itrace help now needs updating to reflect that the --itrace=b argument synthesises branches as well as branch misses. Signed-off-by: Graham Woodward <graham.woodward@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: nd@arm.com Cc: mike.leach@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241025143009.25419-5-graham.woodward@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Arnaldo Carvalho de Melo
|
915a377627 |
perf test: Document the -w/--workload option
Wasn't documented so far, mention that it is mostly used in the shell regression tests. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: James Clark <james.clark@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/r/20241020021842.1752770-4-acme@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
8838abf626 |
perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORT
In Makefile.config for unwinding the name dwarf implies either libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
5eb2242513 |
perf libdw: Remove unnecessary defines
As HAVE_DWARF_GETLOCATIONS_SUPPORT and HAVE_DWARF_CFI_SUPPORT always match HAVE_DWARF_SUPPORT remove the macros and use HAVE_DWARF_SUPPORT. If building the file is guarded by CONFIG_DWARF then remove all ifs. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241017001354.56973-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Madadi Vineeth Reddy
|
cd912ab3b6 |
perf sched timehist: Add pre-migration wait time option
Pre-migration wait time is the time that a task unnecessarily spends on the
runqueue of a CPU without getting switched-in there. In terms of tracepoints,
it is the time between sched:sched_wakeup and sched:sched_migrate_task.

Let's say a task woke up on CPU2, then it got migrated to CPU4 and then it's
switched-in to CPU4. So, here the pre-migration wait time is the time that it
was waiting on the runqueue of CPU2 after it was woken up.

The general pattern for pre-migration to occur is:
sched:sched_wakeup
sched:sched_migrate_task
sched:sched_switch

The sched:sched_waking event is used to capture the wakeup time, as it aligns
with the existing code and only introduces a negligible time difference.

Pre-migrations are generally not useful and they increase migrations. This
metric would be helpful in testing patches mainly related to wakeup and
load-balancer code paths, as better wakeup logic would choose an optimal CPU
where the task would be switched-in, thereby reducing pre-migrations.

The sample output(s) when -P or --pre-migrations is used:
=================
           time    cpu  task name                       wait time  sch delay   run time  pre-mig time
                        [tid/pid]                          (msec)     (msec)     (msec)     (msec)
--------------- ------  ------------------------------  ---------  ---------  ---------  ---------
   38456.720806 [0001]  schbench[28634/28574]               4.917      4.768      1.004      0.000
   38456.720810 [0001]  rcu_preempt[18]                     3.919      0.003      0.004      0.000
   38456.721800 [0006]  schbench[28779/28574]              23.465     23.465      1.999      0.000
   38456.722800 [0002]  schbench[28773/28574]              60.371     60.237      3.955     60.197
   38456.722806 [0001]  schbench[28634/28574]               0.004      0.004      1.996      0.000
   38456.722811 [0001]  rcu_preempt[18]                     1.996      0.005      0.005      0.000
   38456.723800 [0000]  schbench[28833/28574]               4.000      4.000      3.999      0.000
   38456.723800 [0004]  schbench[28762/28574]              42.951     42.839      3.999     39.867
   38456.723802 [0007]  schbench[28812/28574]              43.947     43.817      3.999     40.866
   38456.723804 [0001]  schbench[28587/28574]               7.935      7.822      0.993      0.000

Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Link: https://lore.kernel.org/r/20241004170756.18064-1-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
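The wakeup, migrate, switch pattern from the commit above can be sketched in Python as follows. The event tuple layout and the function name are hypothetical; real perf processes tracepoint samples in C. The sketch only shows the arithmetic: the pre-migration time is the gap between a task's wakeup and its migration, provided the migration happens before the task is switched in.

```python
# Hypothetical sketch: accumulate pre-migration wait time per task.
def pre_migration_times(events):
    """events: iterable of (timestamp_sec, event_name, tid), in time order.

    For sched_switch the tid is taken to be the task being switched in
    (an assumption of this sketch). Returns {tid: total pre-migration msec}.
    """
    last_wakeup = {}  # tid -> timestamp of the pending wakeup
    totals = {}
    for ts, name, tid in events:
        if name == "sched_waking":
            last_wakeup[tid] = ts
        elif name == "sched_migrate_task" and tid in last_wakeup:
            # Migrated before running: the wait since wakeup was spent
            # on the runqueue of the original CPU.
            waited_ms = (ts - last_wakeup.pop(tid)) * 1000.0
            totals[tid] = totals.get(tid, 0.0) + waited_ms
        elif name == "sched_switch":
            # Switched in without a migration: no pre-migration time.
            last_wakeup.pop(tid, None)
    return totals
```

A task that is woken and switched in on the same CPU contributes nothing, which matches the 0.000 entries in the sample output.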
Thomas Falcon
|
48966a5a48 |
perf report: Display columns Predicted/Abort/Cycles in --branch-history
The original commit message:

"
Use current sort mechanism but the real .se_cmp() just returns 0 so
that new columns "Predicted", "Abort" and "Cycles" are created in display
but actually these keys are not the sort keys.

For example:

Overhead Source:Line  Symbol   Shared Object Predicted Abort Cycles
........ ............ ........ ............. ......... ..... ......
  38.25% div.c:45     [.] main div               97.6%     0      3
"

Update missed commit from series "perf report: Show branch flags/cycles
in --branch-history callgraph view" to apply to current repository so that
new columns described above are visible.

Link to original series:
https://lore.kernel.org/lkml/1477876794-30749-1-git-send-email-yao.jin@linux.intel.com/

Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20241010184046.203822-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Yoshihiro Furudera
|
f7ef062fe1 |
perf list: update option desc in man page
There is a difference between the SYNOPSIS section of the help message and the man page (tools/perf/Documentation/perf-list.txt) for the perf list command. After checking, we found that the help message reflected the latest specifications. Therefore, revised the SYNOPSIS section of the man page to match the help message. Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Liang Link: https://lore.kernel.org/r/20241003002404.2592094-1-fj5100bi@fujitsu.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
James Clark
|
9943581c64 |
perf scripting python: Add function to get a config value
This can be used to get config values like which objdump Perf uses for disassembly. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ruidong Tian <tianruidong@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: John Garry <john.g.garry@oracle.com> Cc: scclevenger@os.amperecomputing.com Link: https://lore.kernel.org/r/20240916135743.1490403-4-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Aditya Gupta
|
35439fe4e2 |
perf check: Fix inconsistencies in feature names
Fix two inconsistencies in feature names as discussed in [1]: 1. Rename "dwarf-unwind-support" to "dwarf-unwind" 2. 'get_cpuid' feature and 'HAVE_AUXTRACE_SUPPORT' names don't look related, change the feature name to 'auxtrace' to match the macro name, as 'get_cpuid' string is not used anywhere to check the feature presence [1]: https://lore.kernel.org/linux-perf-users/ZoRw5we4HLSTZND6@x1/ Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240904190132.415212-7-adityag@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Aditya Gupta
|
98ad0b7732 |
perf check: Introduce 'check' subcommand
Currently the presence of a feature is checked with a combination of
perf version --build-options and greps, such as:

perf version --build-options | grep " on .* HAVE_FEATURE"

Instead of this, introduce a subcommand "perf check feature", with which
scripts can test for presence of a feature, such as:

perf check feature HAVE_FEATURE

The 'perf check feature' command is expected to have an exit status of 0 if
the feature is built-in, and 1 if it's not built-in or if the feature is not
known.

Multiple features can also be passed as a comma-separated list, in which case
the exit status will be 0 only if all of the passed features are built-in.
For example, the command below will have an exit status of 0 only if both
libtraceevent and bpf are enabled, else 1 in all other cases:

perf check feature libtraceevent,bpf

The arguments are case-insensitive.

An array 'supported_features' has also been introduced that can be used by
other commands like 'perf version --build-options', so that new features can
be added in one place, with the array

Committer testing:

$ perf check feature libtraceevent,bpf
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check feature libtraceevent
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
$ perf check feature bpf
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check -q feature bpf && echo "BPF support is present"
BPF support is present
$ perf check -q feature Bogus && echo "Bogus support is present"
$

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-3-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
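The exit-status rule of 'perf check feature' from the commit above (0 only when every listed feature is built in, 1 otherwise, names compared case-insensitively) can be modelled with a short Python sketch. The function name and the `built_in` dictionary are hypothetical and do not reflect how perf stores its 'supported_features' array; treating an empty argument as a failure is also an assumption.

```python
# Hypothetical model of the exit status of 'perf check feature <list>'.
def check_features(arg, built_in):
    """Return 0 if all comma-separated features in `arg` are built in, else 1.

    `built_in` maps lowercase feature names to True/False. Unknown names
    count as not built in, as the commit message describes.
    """
    names = [name.strip().lower() for name in arg.split(",") if name.strip()]
    if not names:
        return 1  # assumption: nothing to check is treated as a failure
    return 0 if all(built_in.get(name, False) for name in names) else 1
```

For example, `check_features("libtraceevent,BPF", {"libtraceevent": True, "bpf": True})` evaluates to 0, while an unknown name such as "Bogus" yields 1.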
Yang Jihong
|
9b3a48bbe2 |
perf sched timehist: Add --prio option
The --prio option is used to only show events for the given task priority(ies).
The default is to show events for all priority tasks, which is consistent with
the previous behavior.
Testcase:
# perf sched record nice -n 9 perf bench sched messaging -l 10000
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 3.435 [sec]
[ perf record: Woken up 270 times to write data ]
[ perf record: Captured and wrote 618.688 MB perf.data (5729036 samples) ]
# perf sched timehist -h
Usage: perf sched timehist [<options>]
-C, --cpu <cpu> list of cpus to profile
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-g, --call-graph Display call chains if present (default on)
-I, --idle-hist Show idle events only
-i, --input <file> input file name
-k, --vmlinux <file> vmlinux pathname
-M, --migrations Show migration events
-n, --next Show next task
-p, --pid <pid[,pid...]>
analyze events only for given process id(s)
-s, --summary Show only syscall summary with statistics
-S, --with-summary Show all syscalls and summary with statistics
-t, --tid <tid[,tid...]>
analyze events only for given thread id(s)
-V, --cpu-visual Add CPU visual
-v, --verbose be more verbose (show symbol address, etc)
-w, --wakeups Show wakeup events
--kallsyms <file>
kallsyms pathname
--max-stack <n> Maximum number of functions to display backtrace.
--prio <prio> analyze events only for given task priority(ies)
--show-prio Show task priority
--state Show task state when sched-out
--symfs <directory>
Look for files with symbols relative to this directory
--time <str> Time span for analysis (start,stop)
# perf sched timehist --prio 140
Samples of sched_switch event do not have callchains.
Invalid prio string
# perf sched timehist --show-prio --prio 129
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
2090450.765421 [0002] sched-messaging[1229618] 129 0.000 0.000 0.029
2090450.765445 [0007] sched-messaging[1229616] 129 0.000 0.062 0.043
2090450.765448 [0014] sched-messaging[1229619] 129 0.000 0.000 0.032
2090450.765478 [0013] sched-messaging[1229617] 129 0.000 0.065 0.048
2090450.765503 [0014] sched-messaging[1229622] 129 0.000 0.000 0.017
2090450.765550 [0002] sched-messaging[1229624] 129 0.000 0.000 0.021
2090450.765562 [0007] sched-messaging[1229621] 129 0.000 0.071 0.028
2090450.765570 [0005] sched-messaging[1229620] 129 0.000 0.064 0.066
2090450.765583 [0001] sched-messaging[1229625] 129 0.000 0.001 0.031
2090450.765595 [0013] sched-messaging[1229623] 129 0.000 0.060 0.028
2090450.765637 [0014] sched-messaging[1229628] 129 0.000 0.000 0.019
2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
<SNIP>
# perf sched timehist --show-prio --prio 0,120-129
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
2090450.763231 [0000] perf[1229608] 120 0.000 0.000 0.000
2090450.763235 [0000] migration/0[15] 0 0.000 0.001 0.003
2090450.763263 [0001] perf[1229608] 120 0.000 0.000 0.000
2090450.763268 [0001] migration/1[21] 0 0.000 0.001 0.004
2090450.763302 [0002] perf[1229608] 120 0.000 0.000 0.000
2090450.763309 [0002] migration/2[27] 0 0.000 0.001 0.007
2090450.763338 [0003] perf[1229608] 120 0.000 0.000 0.000
2090450.763343 [0003] migration/3[33] 0 0.000 0.001 0.004
2090450.763459 [0004] perf[1229608] 120 0.000 0.000 0.000
2090450.763469 [0004] migration/4[39] 0 0.000 0.002 0.010
2090450.763496 [0005] perf[1229608] 120 0.000 0.000 0.000
2090450.763501 [0005] migration/5[45] 0 0.000 0.001 0.004
2090450.763613 [0006] perf[1229608] 120 0.000 0.000 0.000
2090450.763622 [0006] migration/6[51] 0 0.000 0.001 0.008
2090450.763652 [0007] perf[1229608] 120 0.000 0.000 0.000
2090450.763660 [0007] migration/7[57] 0 0.000 0.001 0.008
<SNIP>
2090450.765665 [0001] <idle> 120 0.031 0.031 0.081
2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
2090450.765667 [0000] s1-perf[8235/7168] 120 0.008 0.000 0.004
2090450.765684 [0013] <idle> 120 0.028 0.028 0.088
2090450.765685 [0001] sched-messaging[
|
||
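The --prio argument shown in the testcase above accepts single values and ranges such as 0,120-129, and rejects 140 with "Invalid prio string". A minimal Python sketch of such a parser follows. It is not the code from the commit; the function name is hypothetical, and the upper bound of 140 (valid priorities 0 to 139) is an assumption inferred from the rejected value in the testcase.

```python
# Hypothetical parser for a priority list such as "0,120-129".
MAX_PRIO = 140  # assumption: valid priorities are 0..139


def parse_prio_list(spec, max_prio=MAX_PRIO):
    """Return the set of priorities selected by `spec`, or raise ValueError."""
    prios = set()
    for part in spec.split(","):
        low, sep, high = part.partition("-")
        try:
            start = int(low)
            end = int(high) if sep else start
        except ValueError:
            raise ValueError("Invalid prio string") from None
        if not (0 <= start <= end < max_prio):
            raise ValueError("Invalid prio string")
        prios.update(range(start, end + 1))
    return prios
```

A filter built on this would then keep only the events whose task priority is in the returned set.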
Yang Jihong
|
3fcd740990 |
perf sched timehist: Add --show-prio option
The --show-prio option is used to display the priority of task.
It is disabled by default, which is consistent with original behavior.
The display format is xxx (priority does not change during task running)
or xxx->yyy (priority changes during task running)
Testcase:
# perf sched record nice -n 9 true
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.497 MB perf.data ]
# perf sched timehist -h
Usage: perf sched timehist [<options>]
-C, --cpu <cpu> list of cpus to profile
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-g, --call-graph Display call chains if present (default on)
-I, --idle-hist Show idle events only
-i, --input <file> input file name
-k, --vmlinux <file> vmlinux pathname
-M, --migrations Show migration events
-n, --next Show next task
-p, --pid <pid[,pid...]>
analyze events only for given process id(s)
-s, --summary Show only syscall summary with statistics
-S, --with-summary Show all syscalls and summary with statistics
-t, --tid <tid[,tid...]>
analyze events only for given thread id(s)
-V, --cpu-visual Add CPU visual
-v, --verbose be more verbose (show symbol address, etc)
-w, --wakeups Show wakeup events
--kallsyms <file>
kallsyms pathname
--max-stack <n> Maximum number of functions to display backtrace.
--show-prio Show task priority
--state Show task state when sched-out
--symfs <directory>
Look for files with symbols relative to this directory
--time <str> Time span for analysis (start,stop)
# perf sched timehist
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
23952.006537 [0000] perf[534] 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0.000 0.014 0.056
23952.006899 [0001] perf[534] 0.000 0.000 0.000
23952.006947 [0001] migration/1[22] 0.000 0.015 0.047
23952.007138 [0002] perf[534] 0.000 0.000 0.000
<SNIP>
# perf sched timehist --show-prio
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
23952.006537 [0000] perf[534] 120 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0 0.000 0.014 0.056
23952.006899 [0001] perf[534] 120 0.000 0.000 0.000
<SNIP>
23952.034843 [0003] nice[535] 120->129 0.189 0.024 23.314
<SNIP>
23952.053838 [0005] rcu_preempt[16] 120 3.993 0.000 0.023
23952.053990 [0005] <idle> 120 0.023 0.023 0.152
23952.054137 [0006] <idle> 120 1.427 1.427 17.855
23952.054278 [0007] <idle> 120 0.506 0.506 1.650
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819033016.2427235-2-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
6f9d8d1de2 |
perf script: Add branch counters
It's useful to print the branch counter information for each jump in
the brstackinsn when it's available.

Add a new field 'brcntr' to display the branch counter information.
By default, the abbreviation will be used to indicate the branch
counter. In the verbose mode, the real event name is shown.

$ perf script -F +brstackinsn,+brcntr

# Branch counter abbr list:
# branch-instructions:ppp = A
# branch-misses = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit)
f3+31:
0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5]
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC
0000000000401766 insn: 8b 45 fc
0000000000401769 insn: 83 e0 01
000000000040176c insn: 85 c0
000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC
0000000000401776 insn: 83 45 fc 01
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC

$ perf script -F +brstackinsn,+brcntr -v

tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit)
f3+31:
0000000000401774 insn: eb 04 br_cntr: branch-instructions:ppp 2 branch-misses 0 # PRED 5 cycles [5]
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [6] 2.00 IPC
0000000000401766 insn: 8b 45 fc
0000000000401769 insn: 83 e0 01
000000000040176c insn: 85 c0
000000000040176e insn: 74 06 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [7] 4.00 IPC
0000000000401776 insn: 83 45 fc 01
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 7 cycles [14] 0.43 IPC

Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
20d6f55528 |
perf report: Display the branch counter histogram
Reuse the existing --total-cycles option to display the branch counters.

Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display the logged
branch counter events. They are shown right after all the cycle-related
annotations.

Extend the 'struct block_info' to store and pass the branch counter
related information.

The annotation_br_cntr_entry() is to print the histogram of each branch
counter event. If the number of logged events is less than 4, the exact
number of the abbr name is printed. Otherwise, '+' is used to stand for
more than 3 events. Assume the number of logged events is less than 4.

The annotation_br_cntr_abbr_list() prints the branch counter's
abbreviation list. Press 'B' to display the list in the TUI mode.

$ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter
$ perf report --total-cycles --stdio

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }'
# Event count (approx.): 1610046
#
# Branch counter abbr list:
# branch-instructions:ppp = A
# branch-misses = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
#
# Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter  [Program Block Range]
# ...............  ..............  ...........  ..........  ..............  ..................
#
           57.55%            2.5M        0.00%           3  |A   |-   |     ...
           25.27%            1.1M        0.00%           2  |AA  |-   |     ...
           15.61%          667.2K        0.00%           1  |A   |-   |     ...
            0.16%            6.9K        0.81%         575  |A   |-   |     ...
            0.16%            6.8K        1.38%         977  |AA  |-   |     ...
            0.16%            6.8K        0.04%          28  |AA  |B   |     ...
            0.15%            6.6K        1.33%         946  |A   |-   |     ...
            0.11%            4.5K        0.06%          46  |AAA+|-   |     ...
            0.10%            4.4K        0.88%         624  |A   |-   |     ...
            0.09%            3.7K        0.74%         524  |AAA+|B   |     ...

With -v applied,

# Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter  [Program Block Range]
# ...............  ..............  ...........  ..........  ..............  ..................
#
           57.55%            2.5M        0.00%           3  A=1 ,B=-        ...
           25.27%            1.1M        0.00%           2  A=2 ,B=-        ...
           15.61%          667.2K        0.00%           1  A=1 ,B=-        ...
            0.16%            6.9K        0.81%         575  A=1 ,B=-        ...
            0.16%            6.8K        1.38%         977  A=2 ,B=-        ...
            0.16%            6.8K        0.04%          28  A=2 ,B=1        ...
            0.15%            6.6K        1.33%         946  A=1 ,B=-        ...
            0.11%            4.5K        0.06%          46  A=3+,B=-        ...
            0.10%            4.4K        0.88%         624  A=1 ,B=-        ...
            0.09%            3.7K        0.74%         524  A=3+,B=1        ...

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
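The histogram rule from the commit above (print the abbreviation once per logged event when there are fewer than 4, otherwise mark the cell with '+', and use '-' when no event occurred) can be sketched in Python. The function names and the cell width of 4 are assumptions for illustration; the real annotation_br_cntr_entry() is C code inside perf.

```python
# Hypothetical rendering of one branch counter cell, e.g. "A   ", "AAA+", "-   ".
def br_cntr_cell(abbr, count, width=4):
    """Render the histogram cell for one event with the given logged count."""
    if count == 0:
        text = "-"                 # no event occurred
    elif count < 4:
        text = abbr * count        # exact number of occurrences
    else:
        text = abbr * 3 + "+"      # more than 3 events
    return text.ljust(width)       # width of 4 is an assumption of this sketch


def br_cntr_row(counts):
    """counts: list of (abbr, count) pairs. Returns e.g. '|AA  |B   |'."""
    return "|" + "|".join(br_cntr_cell(abbr, n) for abbr, n in counts) + "|"
```

For instance, `br_cntr_row([("A", 2), ("B", 1)])` gives `|AA  |B   |`, which has the same shape as the cells in the report output.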
Weilin Wang
|
169f18fd98 |
perf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents
TPEBS (Timed PEBS (Precise Event-Based Sampling)) is a new feature of the Intel PMU, available from the Granite Rapids microarchitecture. It will be used in new TMA (Top-Down Microarchitecture Analysis) releases. Add a related introduction to the documents while adding new code to support it in 'perf stat'. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-8-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Weilin Wang
|
d546e3acf3 |
perf stat: Add command line option for enabling TPEBS recording
With this command line option, TPEBS recording is turned off in 'perf stat' on default. It will only be turned on when this option is given in 'perf stat' command. Example with --record-tpebs: perf stat -M tma_split_loads -C1-4 --record-tpebs sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.044 MB - ] Performance counter stats for 'CPU(s) 1-4': 53,259,156,071 cpu_core/TOPDOWN.SLOTS/ # 1.6 % tma_split_loads (50.00%) 15,867,565,250 cpu_core/topdown-retiring/ (50.00%) 15,655,580,731 cpu_core/topdown-mem-bound/ (50.00%) 11,738,022,218 cpu_core/topdown-bad-spec/ (50.00%) 6,151,265,424 cpu_core/topdown-fe-bound/ (50.00%) 20,445,917,581 cpu_core/topdown-be-bound/ (50.00%) 6,925,098,013 cpu_core/L1D_PEND_MISS.PENDING/ (50.00%) 3,838,653,421 cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/ (50.00%) 4,797,059,783 cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/ (50.00%) 11,931,916,714 cpu_core/CPU_CLK_UNHALTED.THREAD/ (50.00%) 102,576,164 cpu_core/MEM_LOAD_COMPLETED.L1_MISS_ANY/ (50.00%) 64,071,854 cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/ (50.00%) 3 cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/R 1.003049679 seconds time elapsed Example without --record-tpebs: perf stat -M tma_contested_accesses -C1 sleep 1 Performance counter stats for 'CPU(s) 1': 50,203,891 cpu_core/TOPDOWN.SLOTS/ # 0.0 % tma_contested_accesses (63.60%) 10,040,777 cpu_core/topdown-retiring/ (63.60%) 6,890,729 cpu_core/topdown-mem-bound/ (63.60%) 2,756,463 cpu_core/topdown-bad-spec/ (63.60%) 10,828,288 cpu_core/topdown-fe-bound/ (63.60%) 28,350,432 cpu_core/topdown-be-bound/ (63.60%) 98 cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM/ (63.70%) 577,520 cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/ (54.62%) 313,339 cpu_core/MEMORY_ACTIVITY.STALLS_L3_MISS/ (54.62%) 14,155 cpu_core/MEM_LOAD_RETIRED.L1_MISS/ (45.54%) 0 cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD/ (36.30%) 8,468,077 cpu_core/CPU_CLK_UNHALTED.THREAD/ (45.38%) 198 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/ (45.38%) 8,324 
cpu_core/MEM_LOAD_RETIRED.FB_HIT/ (45.38%) 3,388,031,520 TSC 23,226,785 cpu_core/CPU_CLK_UNHALTED.REF_TSC/ (54.46%) 80 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/ (54.46%) 0 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/R 0 cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/R 1,006,816,667 ns duration_time 1.002537737 seconds time elapsed Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-7-weilin.wang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
043da846c2 |
perf docs: Refine the description for the buffer size
The current description of the AUX trace buffer size is misleading. When a user specifies the option '-m,512M', it represents a size value in bytes (512MiB), not 512M pages (512M x 4KiB with a page size of 4KiB). Make the document clear that the normal buffer and the AUX tracing buffer share the same semantics. Sync the documents for consistent text. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240812093459.2575278-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
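As a rough illustration of the buffer-size entry above, the sketch below converts an `-m` style size specification to bytes and shows how far apart the two readings of `512M` are. The helper name `mmap_spec_to_bytes`, the 4 KiB page size, and the treatment of a bare number as a page count are assumptions made for this example; this is not perf's actual option parser.

```python
# Hypothetical helper, not perf's parser: interprets a size specification
# the way the commit message describes it.
PAGE_SIZE = 4096  # assumed 4 KiB page size

UNITS = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def mmap_spec_to_bytes(spec, page_size=PAGE_SIZE):
    """Return the buffer size in bytes for a size specification string."""
    spec = spec.strip()
    suffix = spec[-1].upper()
    if suffix in UNITS:
        # A unit suffix means the number is a size in bytes.
        return int(spec[:-1]) * UNITS[suffix]
    # Assumption: a bare number is a count of pages.
    return int(spec) * page_size


size_bytes = mmap_spec_to_bytes("512M")        # intended meaning: 512 MiB
size_pages = size_bytes // PAGE_SIZE           # the same buffer, in pages
misread_bytes = 512 * UNITS["M"] * PAGE_SIZE   # misreading: 512M pages

print(size_bytes, size_pages, misread_bytes)
```

Under these assumptions the intended reading is 131072 pages, while the misreading would ask for 2 TiB.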
Martin Liška
|
e6b56ae7c2 |
perf script: add --addr2line option
Similarly to other subcommands (like report, top), it would be handy to provide a path for addr2line command. Signed-off-by: Martin Liska <martin.liska@hey.com> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/eadc3e36-029d-4848-9d69-272fe5a83a26@foxlink.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
ce533c9bc6 |
perf annotate: Add --skip-empty option
Like in 'perf report', we want to hide empty events in the 'perf annotate' output. This is consistent when the option is set in perf report. For example, the following command would use 3 events including dummy. $ perf mem record -a -- perf test -w noploop $ perf evlist cpu/mem-loads,ldlat=30/P cpu/mem-stores/P dummy:u Just using perf annotate with --group will show the all 3 events. $ perf annotate --group --stdio | head Percent | Source code & Disassembly of ... -------------------------------------------------------------- : 0 0xe060 <_dl_relocate_object>: 0.00 0.00 0.00 : e060: pushq %rbp 0.00 0.00 0.00 : e061: movq %rsp, %rbp 0.00 0.00 0.00 : e064: pushq %r15 0.00 0.00 0.00 : e066: movq %rdi, %r15 0.00 0.00 0.00 : e069: pushq %r14 0.00 0.00 0.00 : e06b: pushq %r13 0.00 0.00 0.00 : e06d: movl %edx, %r13d Now with --skip-empty, it'll hide the last dummy event. $ perf annotate --group --stdio --skip-empty | head Percent | Source code & Disassembly of ... ------------------------------------------------------ : 0 0xe060 <_dl_relocate_object>: 0.00 0.00 : e060: pushq %rbp 0.00 0.00 : e061: movq %rsp, %rbp 0.00 0.00 : e064: pushq %r15 0.00 0.00 : e066: movq %rdi, %r15 0.00 0.00 : e069: pushq %r14 0.00 0.00 : e06b: pushq %r13 0.00 0.00 : e06d: movl %edx, %r13d Committer testing: root@x1:~# perf evlist cpu_atom/mem-loads,ldlat=30/P cpu_atom/mem-stores/P dummy:u root@x1:~# Before: root@x1:~# perf annotate --group --stdio2 do_lookup_x | head -25 Samples: 20 of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P, dummy:u', 4000 Hz, Event count (approx.): 769079, [percent: local period] do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2 Percent 0x9900 <do_lookup_x>: pushq %rbp movq %rsp,%rbp pushq %r15 pushq %r14 pushq %r13 pushq %r12 pushq %rbx subq $0x88,%rsp movq %rdi,-0x50(%rbp) movl 8(%r9),%edi movq 0x10(%rbp),%r12 movq 0x28(%rbp),%r10 movq %rdx,-0x70(%rbp) movq %rcx,-0x58(%rbp) movq %rdi,%r11 0.00 5.73 0.00 movq %r8,-0x68(%rbp) movq (%r9),%r8 movl %esi,%eax 8.30 
0.00 0.00 movl 0x30(%rbp),%r9d movl %esi,%r15d shrl $6, %eax movq %r8,%r13 root@x1:~# After: root@x1:~# perf annotate --group --skip-empty --stdio2 do_lookup_x | head -25 Samples: 20 of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P', 4000 Hz, Event count (approx.): 769079, [percent: local period] do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2 Percent 0x9900 <do_lookup_x>: pushq %rbp movq %rsp,%rbp pushq %r15 pushq %r14 pushq %r13 pushq %r12 pushq %rbx subq $0x88,%rsp movq %rdi,-0x50(%rbp) movl 8(%r9),%edi movq 0x10(%rbp),%r12 movq 0x28(%rbp),%r10 movq %rdx,-0x70(%rbp) movq %rcx,-0x58(%rbp) movq %rdi,%r11 0.00 5.73 movq %r8,-0x68(%rbp) movq (%r9),%r8 movl %esi,%eax 8.30 0.00 movl 0x30(%rbp),%r9d movl %esi,%r15d shrl $6, %eax movq %r8,%r13 root@x1:~# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240803211332.1107222-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
13159a139d |
perf mem: Update documentation for new options
Add a common options section and move some items to the section. Also add description of new options to report options. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/20240802180913.1023886-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
3dee4b83a6 |
perf record: Add --setup-filter option
To allow BPF filters for unprivileged users it needs to pin the BPF objects to BPF-fs first. Let's add a new option to pin and unpin the objects easily. I'm not sure 'perf record' is a right place to do this but I don't have a better idea right now. $ sudo perf record --setup-filter pin The above command would pin BPF program and maps for the filter when the system has BPF-fs (usually at /sys/fs/bpf/). To unpin the objects, users can run the following command (as root). $ sudo perf record --setup-filter unpin Committer testing: root@number:~# perf record --setup-filter pin root@number:~# ls -la /sys/fs/bpf/perf_filter/ total 0 drwxr-xr-x. 2 root root 0 Jul 31 10:43 . drwxr-xr-t. 3 root root 0 Jul 31 10:43 .. -rw-rw-rw-. 1 root root 0 Jul 31 10:43 dropped -rw-rw-rw-. 1 root root 0 Jul 31 10:43 filters -rwxrwxrwx. 1 root root 0 Jul 31 10:43 perf_sample_filter -rw-rw-rw-. 1 root root 0 Jul 31 10:43 pid_hash -rw-------. 1 root root 0 Jul 31 10:43 sample_f_rodata root@number:~# ls -la /sys/fs/bpf/perf_filter/perf_sample_filter -rwxrwxrwx. 1 root root 0 Jul 31 10:43 /sys/fs/bpf/perf_filter/perf_sample_filter root@number:~# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20240703223035.2024586-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
74ae366c37 |
perf ftrace profile: Add -s/--sort option
The -s/--sort option is to sort the output by given column. $ sudo perf ftrace profile -s max sync | head # Total (us) Avg (us) Max (us) Count Function 6301.811 6301.811 6301.811 1 __do_sys_sync 6301.328 6301.328 6301.328 1 ksys_sync 5320.300 1773.433 2858.819 3 iterate_supers 2755.875 17.012 2610.633 162 sync_fs_one_sb 2728.351 682.088 2610.413 4 ext4_sync_fs [ext4] 2603.654 2603.654 2603.654 1 jbd2_log_wait_commit [jbd2] 4750.615 593.827 2597.427 8 schedule 2164.986 26.728 2115.673 81 sync_inodes_one_sb 2143.842 26.467 2115.438 81 sync_inodes_sb Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/lkml/20240729004127.238611-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
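The sort behaviour described in the `perf ftrace profile -s` entry above can be sketched as follows. The rows are a subset of the sample output from the commit message. The `Row` type, the `sort_profile` helper, and the column keys other than `max` are assumptions made for this example, as is the descending order, which is inferred from the sample output rather than from the perf source.

```python
from collections import namedtuple

# Hypothetical row type mirroring the columns of the sample output.
Row = namedtuple("Row", ["total", "avg", "max", "count", "function"])

rows = [
    Row(5320.300, 1773.433, 2858.819, 3, "iterate_supers"),
    Row(4750.615, 593.827, 2597.427, 8, "schedule"),
    Row(6301.811, 6301.811, 6301.811, 1, "__do_sys_sync"),
    Row(2755.875, 17.012, 2610.633, 162, "sync_fs_one_sb"),
]


def sort_profile(rows, key="total"):
    """Sort profile rows by the given column, largest value first."""
    if key not in Row._fields:
        raise ValueError(f"unknown sort key: {key}")
    return sorted(rows, key=lambda r: getattr(r, key), reverse=True)


for r in sort_profile(rows, "max"):
    print(f"{r.total:10.3f} {r.avg:10.3f} {r.max:10.3f} {r.count:5d}  {r.function}")
```

Sorting by `max` places `schedule` below `sync_fs_one_sb` even though its total time is larger, which matches the ordering in the sample output.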
Namhyung Kim
|
0f223813ed |
perf ftrace: Add 'profile' command
The 'perf ftrace profile' command is to get function execution profiles using function-graph tracer so that users can see the total, average, max execution time as well as the number of invocations easily. The following is a profile for the perf_event_open syscall. $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \ perf stat -e cycles -C1 true 2> /dev/null | head # Total (us) Avg (us) Max (us) Count Function 65.611 65.611 65.611 1 __x64_sys_perf_event_open 30.527 30.527 30.527 1 anon_inode_getfile 30.260 30.260 30.260 1 __anon_inode_getfile 29.700 29.700 29.700 1 alloc_file_pseudo 17.578 17.578 17.578 1 d_alloc_pseudo 17.382 17.382 17.382 1 __d_alloc 16.738 16.738 16.738 1 kmem_cache_alloc_lru 15.686 15.686 15.686 1 perf_event_alloc 14.012 7.006 11.264 2 obj_cgroup_charge # Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/lkml/20240729004127.238611-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
c77800894b |
perf ftrace: Add 'tail' option to --graph-opts
The 'graph-tail' option is to print function name as a comment at the end. This is useful when a large function is mixed with other functions (possibly from different CPUs). For example, $ sudo perf ftrace -- perf stat true ... 1) | get_unused_fd_flags() { 1) | alloc_fd() { 1) 0.178 us | _raw_spin_lock(); 1) 0.187 us | expand_files(); 1) 0.169 us | _raw_spin_unlock(); 1) 1.211 us | } 1) 1.503 us | } $ sudo perf ftrace --graph-opts tail -- perf stat true ... 1) | get_unused_fd_flags() { 1) | alloc_fd() { 1) 0.099 us | _raw_spin_lock(); 1) 0.083 us | expand_files(); 1) 0.081 us | _raw_spin_unlock(); 1) 0.601 us | } /* alloc_fd */ 1) 0.751 us | } /* get_unused_fd_flags */ Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/lkml/20240729004127.238611-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
d27087c76e |
perf docs: Document cross compilation
Record the commands for cross compilation using two methods. The first method relies on Multiarch. The second approach is to explicitly specify the PKG_CONFIG variables, which is widely used in build systems (like Buildroot, Yocto, etc.). Co-developed-by: James Clark <james.clark@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Ian Rogers <irogers@google.com> Cc: amadio@gentoo.org Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20240717082211.524826-7-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Madadi Vineeth Reddy
|
306f921e87 |
perf sched map: Add --fuzzy-name option for fuzzy matching in task names
The --fuzzy-name option can be used if fuzzy name matching is required. For example, "taskname" can be matched to any string that contains "taskname" as its substring. Sample output for --task-name wdav --fuzzy-name ============= . *A0 . . . . - . 131040.641346 secs A0 => wdavdaemon:62509 . A0 *B0 . . . - . 131040.641378 secs B0 => wdavdaemon:62274 . *- B0 . . . - . 131040.641379 secs *C0 . B0 . . . . . 131040.641572 secs C0 => wdavdaemon:62283 C0 . B0 . *D0 . . . 131040.641572 secs D0 => wdavdaemon:62277 C0 . B0 . D0 . *E0 . 131040.641578 secs E0 => wdavdaemon:62270 *- . B0 . D0 . E0 . 131040.641581 secs Suggested-by: Chen Yu <yu.c.chen@intel.com> Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Link: https://lore.kernel.org/r/20240707182716.22054-4-vineethr@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
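The matching rule from the `--fuzzy-name` entry above, combined with the comma-separated `--task-name` list from the neighbouring `perf sched map` entry, can be sketched like this. The function `task_selected` is a hypothetical name for illustration; the real implementation lives in perf's C code and may differ in detail.

```python
def task_selected(comm, task_names, fuzzy=False):
    """Decide whether a task's command name passes the task-name filter.

    comm       -- command name of the task, e.g. "wdavdaemon"
    task_names -- comma-separated names without whitespace, e.g. "perf,wdav"
    fuzzy      -- when True, a name matches if it is a substring of comm
    """
    names = task_names.split(",")
    if fuzzy:
        return any(name in comm for name in names)
    return comm in names


# "wdav" only matches "wdavdaemon" when fuzzy matching is requested.
print(task_selected("wdavdaemon", "wdav"))               # exact match only
print(task_selected("wdavdaemon", "wdav", fuzzy=True))   # substring match
print(task_selected("perf", "perf,wdavdaemon"))          # CSV list, exact match
```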
Madadi Vineeth Reddy
|
9cc0afed6f |
perf sched map: Add support for multiple task names using CSV
To track the scheduling patterns of multiple tasks simultaneously, multiple task names can be specified using a comma separator without any whitespace. Sample output for --task-name perf,wdavdaemon ============= . *A0 . . . . - . 131040.641346 secs A0 => wdavdaemon:62509 . A0 *B0 . . . - . 131040.641378 secs B0 => wdavdaemon:62274 . *- B0 . . . - . 131040.641379 secs *C0 . B0 . . . . . 131040.641572 secs C0 => wdavdaemon:62283 ... . *- . . . . . . 131041.395649 secs . . . . . . . *X2 131041.403969 secs X2 => perf:70211 . . . . . . . *- 131041.404006 secs Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/r/20240707182716.22054-3-vineethr@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Madadi Vineeth Reddy
|
3116d60910 |
perf sched map: Add task-name option to filter the output map
By default, perf sched map prints sched-in events for all the tasks which may not be required all the time as it prints lot of symbols and rows to the terminal. With --task-name option, one could specify the specific task name for which the map has to be shown. This would help in analyzing the CPU usage patterns easier for that specific task. Since multiple PID's might have the same task name, using task-name filter would be more useful for debugging. For other tasks, instead of printing the symbol, '-' is printed and the same '.' is used to represent idle. '-' is used instead of symbol for other tasks because it helps in clear visualization of task of interest and secondly the symbol itself doesn't mean anything because the sched-in of that symbol will not be printed(first sched-in contains pid and the corresponding symbol). When using the --task-name option, the sched-out time is represented by a '*-'. Since not all task sched-in events are printed, the sched-out time of the relevant task might be lost. This representation ensures that the sched-out time of the interested task is not overlooked. 6.10.0-rc1 ========== *A0 131040.639793 secs A0 => migration/0:19 *. 131040.639801 secs . => swapper:0 . *B0 131040.639830 secs B0 => migration/1:24 . *. 131040.639836 secs . . *C0 131040.640108 secs C0 => migration/2:30 . . *. 131040.640163 secs . . . *D0 131040.640386 secs D0 => migration/3:36 . . . *. 131040.640395 secs 6.10.0-rc1 + patch (--task-name wdavdaemon) ============= . *A0 . . . . - . 131040.641346 secs A0 => wdavdaemon:62509 . A0 *B0 . . . - . 131040.641378 secs B0 => wdavdaemon:62274 - *- B0 . . . - . 131040.641379 secs *C0 . B0 . . . . . 131040.641572 secs C0 => wdavdaemon:62283 C0 . B0 . *D0 . . . 131040.641572 secs D0 => wdavdaemon:62277 C0 . B0 . D0 . *E0 . 131040.641578 secs E0 => wdavdaemon:62270 *- . B0 . D0 . E0 . 131040.641581 secs . . B0 . D0 . *- . 
131040.641583 secs Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/r/20240707182716.22054-2-vineethr@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Madadi Vineeth Reddy
|
a7cacaa088 |
perf sched replay: Fix -r/--repeat command line option for infinity
Currently, the -r/--repeat option accepts values from 0 and complains for -1. The help section specifies: -r, --repeat <n> repeat the workload replay N times (-1: infinite) The -r -1 option raises an error because replay_repeat is defined as an unsigned int. In the current implementation, the workload is repeated n times when -r <n> is used, except when n is 0. When -r is set to 0, the workload is also repeated once. This happens because when -r=0, the run_one_test function is not called. (Note that mutex unlocking, which is essential for child threads spawned to emulate the workload, happens in run_one_test.) However, mutex unlocking is still performed in the destroy_tasks function. Thus, -r=0 results in the workload running once coincidentally. To clarify and maintain the existing logic for -r >= 1 (which runs the workload the specified number of times) and to fix the issue with infinite runs, make -r=0 perform an infinite run. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20240628071821.15264-1-vineethr@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
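The repeat semantics that the `perf sched replay` entry above settles on (0 means run forever, N >= 1 means run exactly N times) can be sketched as below. `replay_rounds` is a hypothetical helper written for this example; it models only the counting, not the mutex handling in `run_one_test` and `destroy_tasks` that the commit message discusses.

```python
import itertools


def replay_rounds(repeat):
    """Yield round indices for a workload replay.

    repeat == 0 -> endless iterator (infinite replay)
    repeat >= 1 -> exactly `repeat` rounds
    """
    if repeat < 0:
        # The option is backed by an unsigned int, so negatives are rejected.
        raise ValueError("repeat must be >= 0")
    return itertools.count() if repeat == 0 else iter(range(repeat))


# Three rounds when -r 3 is given.
print(list(replay_rounds(3)))

# With -r 0 the iterator never ends; take only the first five rounds here.
print(list(itertools.islice(replay_rounds(0), 5)))
```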
Fernand Sieber
|
d363c2a880 |
perf: Timehist account sch delay for scheduled out running
When using perf timehist, sch delay is only computed for a waking task, not for a preempted task. This patch changes sch delay to account for both. This makes sense as testing a scheduling policy needs to consider the effect of scheduling delay globally, not only for waking tasks. Example of a `perf timehist` report before the patch for `stress` tasks competing with each other. First column is wait time, second column sch delay, third column runtime. 1.492060 [0000] s stress[81] 1.999 0.000 2.000 R next: stress[83] 1.494060 [0000] s stress[83] 2.000 0.000 2.000 R next: stress[81] 1.496060 [0000] s stress[81] 2.000 0.000 2.000 R next: stress[83] 1.498060 [0000] s stress[83] 2.000 0.000 1.999 R next: stress[81] After the patch, it looks like this (note that the sch delay column is no longer zero): 1.492060 [0000] s stress[81] 1.999 1.999 2.000 R next: stress[83] 1.494060 [0000] s stress[83] 2.000 2.000 2.000 R next: stress[81] 1.496060 [0000] s stress[81] 2.000 2.000 2.000 R next: stress[83] 1.498060 [0000] s stress[83] 2.000 2.000 1.999 R next: stress[81] Signed-off-by: Fernand Sieber <sieberf@amazon.com> Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240618090339.87482-1-sieberf@amazon.com |
||
Ravi Bangoria
|
b739759c4e |
perf doc: Add AMD IBS usage document
Add a perf man page document that describes how to exploit AMD IBS with Linux perf. A brief intro about IBS and simple one-liner examples will help new users get started. This is not meant to be an exhaustive IBS guide. Users should refer to the latest AMD64 Architecture Programmer's Manual for a detailed description of IBS. Usage: $ man perf-amd-ibs Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: ananth.narayan@amd.com Cc: sandipan.das@amd.com Cc: santosh.shukla@amd.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240620054104.815-1-ravi.bangoria@amd.com |
||
Nick Forrington
|
f7d4485fce |
perf lock info: Display both map and thread by default
Change "perf lock info" argument handling to: Display both map and thread info (rather than an error) when neither are specified. Display both map and thread info (rather than just thread info) when both are requested. Signed-off-by: Nick Forrington <nick.forrington@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240513091413.738537-2-nick.forrington@arm.com |
||
Ian Rogers
|
af75201634 |
perf top: Allow filters on events
Allow filters to be added to perf top events. One use is to work around issues with: ``` $ perf top --uid="$(id -u)" ``` which tries to scan /proc to find processes belonging to the uid and can fail if such a pid terminates between the scan and the perf_event_open, reporting: ``` Error: The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:P). /bin/dmesg | grep -i perf may provide additional information. ``` A similar filter: ``` $ perf top -e cycles:P --filter "uid == $(id -u)" ``` doesn't fail this way. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240524205227.244375-4-irogers@google.com |
||
Ian Rogers
|
d92aa899fe |
perf bpf filter: Add uid and gid terms
Allow the BPF filter to use the uid and gid terms determined by the bpf_get_current_uid_gid BPF helper. For example, the following will record the cpu-clock event system wide discarding samples that don't belong to the current user. $ perf record -e cpu-clock --filter "uid == $(id -u)" -a sleep 0.1 Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240524205227.244375-3-irogers@google.com |
||
Ian Rogers
|
a93c83eca4 |
perf docs: Fix typos
Assorted typo fixes. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240521223555.858859-1-irogers@google.com |
||
Madadi Vineeth Reddy
|
6fe61cb4ae |
perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency
Rename 'Switches' to 'Count' and document metrics shown for perf
sched latency output. Also add options possible with perf sched
latency.
Initially, after seeing the output of 'perf sched latency', the term
'Switches' seemed like it's the number of context switches-in for a
particular task, but upon going through the code, it was observed that
it's actually keeping track of number of times a delay was calculated so
that it is used in calculation of the average delay.
Actually, the switches here are a subset of the number of context switches-in
because there are some cases where the count is not incremented in the
switch-in handler 'add_sched_in_event'. For example, when a task is
switched-in while its state is not ready to run (!= THREAD_WAIT_CPU).
commit
|
||
Arnaldo Carvalho de Melo
|
8c618b58c8 |
perf test: Reintroduce -p/--parallel and make -S/--sequential the default
We can't default to doing parallel tests as there are tests that compete for the same resources and thus clash, for instance tests that put in place 'perf probe' probes, that clean the probes without regard to other tests' needs, ARM64 coresight tests, Intel PT ones, etc. So reintroduce -p/--parallel and make -S/--sequential the default. We need to come up with infrastructure that states which tests can't run in parallel because they need exclusive access to some resource, something as simple as "probes" that would then prevent 'perf probe' tests from running while another such test is running, or make the tests more resilient. Until then, we can't use parallel mode as the default. While at it, document all these options in the 'perf test' man page. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reported-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Ziwm18BqIn_vc1vn@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
eb4d27cf9a |
perf docs: Document bpf event modifier
Document that 'b' is used as a modifier to make an event use a BPF
counter.
Fixes:
|
||
Namhyung Kim
|
7043dc5286 |
perf report: Add weight[123] output fields
Add weight1, weight2 and weight3 fields to -F/--fields and their aliases like 'ins_lat', 'p_stage_cyc' and 'retire_lat'. Note that they are in the sort keys too but the difference is that output fields will sum up the weight values and display the average. In the sort key, users can see the distribution of weight value and I think it's confusing we have local vs. global weight for the same weight. For example, I experiment with mem-loads events to get the weights. On my laptop, it seems only weight1 field is supported. $ perf mem record -- perf test -w noploop Let's look at the noploop function only. It has 7 samples. $ perf script -F event,ip,sym,weight | grep noploop # event weight ip sym cpu/mem-loads,ldlat=30/P: 43 55b3c122bffc noploop cpu/mem-loads,ldlat=30/P: 48 55b3c122bffc noploop cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight cpu/mem-loads,ldlat=30/P: 59 55b3c122bffc noploop cpu/mem-loads,ldlat=30/P: 33 55b3c122bffc noploop cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight When you use the 'weight' sort key, it'd show entries with a separate weight value separately. Also note that the first entry has 3 samples with weight value 38, so they are displayed together and the weight value is the sum of 3 samples (114 = 38 * 3). $ perf report -n -s +weight | grep -e Weight -e noploop # Overhead Samples Command Shared Object Symbol Weight 0.53% 3 perf perf [.] noploop 114 0.18% 1 perf perf [.] noploop 59 0.18% 1 perf perf [.] noploop 48 0.18% 1 perf perf [.] noploop 43 0.18% 1 perf perf [.] noploop 33 If you use 'local_weight' sort key, you can see the actual weight. $ perf report -n -s +local_weight | grep -e Weight -e noploop # Overhead Samples Command Shared Object Symbol Local Weight 0.53% 3 perf perf [.] noploop 38 0.18% 1 perf perf [.] noploop 59 0.18% 1 perf perf [.] noploop 48 0.18% 1 perf perf [.] noploop 43 0.18% 1 perf perf [.] 
noploop 33 But when you use the -F/--fields option instead, you can see the average weight for the whole noploop function (as it won't group samples by weight value and uses the default 'comm,dso,sym' sort keys). $ perf report -n -F +weight | grep -e Weight -e noploop Warning: --fields weight shows the average value unlike in the --sort key. # Overhead Samples Weight1 Command Shared Object Symbol 1.23% 7 42.4 perf perf [.] noploop The weight1 field shows the average value: (38 * 3 + 59 + 48 + 43 + 33) / 7 = 42.4 Also it'd show the warning that 'weight' field has the average value. Using 'weight1' can remove the warning. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240411181718.2367948-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
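The arithmetic in the `perf report` weight entry above can be reproduced with the seven sample weights listed in its `perf script` output. This is only a worked example of the three views (summed `weight` sort key, per-sample `local_weight`, averaged output field); the variable names are chosen for this sketch and do not correspond to perf internals.

```python
from collections import defaultdict

# Weights of the seven noploop samples from the commit message.
weights = [43, 48, 38, 38, 59, 33, 38]

# Sort-key view: samples with the same weight value form one entry.
entries = defaultdict(lambda: {"samples": 0, "weight_sum": 0})
for w in weights:
    entries[w]["samples"] += 1
    entries[w]["weight_sum"] += w

# 'weight' sort key shows the sum per entry, 'local_weight' the value itself.
for local_weight, e in sorted(entries.items(),
                              key=lambda kv: kv[1]["samples"], reverse=True):
    print(f"samples={e['samples']} weight={e['weight_sum']} "
          f"local_weight={local_weight}")

# Output-field view: one row for the whole function, showing the average.
average = sum(weights) / len(weights)
print(f"samples={len(weights)} weight1={average:.1f}")
```

The entry for weight 38 sums to 114 across its three samples, and the average over all seven samples rounds to 42.4, matching the figures in the commit message.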
Andi Kleen
|
d812044688 |
perf script: Add capstone support for '-F +brstackdisasm'
Support capstone output for the '-F +brstackinsn' branch dump. The new output is enabled with the new field 'brstackdisasm'. This was possible before with --xed, but now also allow it for users that don't have xed using the builtin capstone support. Before: perf record -b emacs -Q --batch '()' perf script -F +brstackinsn ... emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [3] 00007f0ab2d17105 insn: 73 51 00007f0ab2d17107 insn: 48 89 c1 00007f0ab2d1710a insn: 48 39 ca 00007f0ab2d1710d insn: 73 96 00007f0ab2d1710f insn: 48 8d 04 11 00007f0ab2d17113 insn: 48 d1 e8 00007f0ab2d17116 insn: 49 8d 34 c1 00007f0ab2d1711a insn: 44 3a 06 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 insn: 73 51 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 insn: 48 8d 50 01 00007f0ab2d1715c insn: eb 92 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 insn: 48 39 ca 00007f0ab2d170f3 insn: 73 b0 # PRED 1 cycles [9] 2.00 IPC After (perf must be compiled with capstone): perf script -F +brstackdisasm ... 
emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [3] 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 00007f0ab2d17107 movq %rax, %rcx 00007f0ab2d1710a cmpq %rcx, %rdx 00007f0ab2d1710d jae intel_check_word.constprop.0+0x75 00007f0ab2d1710f leaq (%rcx, %rdx), %rax 00007f0ab2d17113 shrq $1, %rax 00007f0ab2d17116 leaq (%r9, %rax, 8), %rsi 00007f0ab2d1711a cmpb (%rsi), %r8b 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 leaq 1(%rax), %rdx 00007f0ab2d1715c jmp intel_check_word.constprop.0+0xc0 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 cmpq %rcx, %rdx 00007f0ab2d170f3 jae intel_check_word.constprop.0+0x75 # PRED 1 cycles [9] 2.00 IPC Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-3-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
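The IPC figures in the branch dump above are consistent with dividing the number of instructions in each block by the cycle count printed for that block, with the bracketed number acting as a running cycle total. A sketch under that assumption, using instruction counts read off the example output; the function name is hypothetical and this is not code from perf itself:

```python
def annotate_blocks(blocks, cycles_so_far=0):
    """Return (cycles, running_total, ipc) for each (instructions, cycles) block."""
    rows = []
    for instructions, cycles in blocks:
        cycles_so_far += cycles
        rows.append((cycles, cycles_so_far, instructions / cycles))
    return rows

# (instructions, cycles) per block, counted from the example output.
# The first block is left out because the dump shows only its final branch;
# its 3 cycles are passed in as the starting total instead.
blocks = [(9, 3), (1, 1), (2, 1), (2, 1)]

for cycles, total, ipc in annotate_blocks(blocks, cycles_so_far=3):
    print(f"{cycles} cycles [{total}] {ipc:.2f} IPC")
```

Under these assumptions the printed rows line up with the `# PRED` annotations in the example: totals of 6, 7, 8 and 9 cycles and IPC values of 3.00, 1.00, 2.00 and 2.00.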
James Clark
|
36f65f9b7a |
perf docs arm_spe: Clarify more SPE requirements related to KPTI
The question of exactly when KPTI needs to be disabled comes up a lot because it doesn't always need to be done. Add the relevant kernel function and some examples that describe the behavior. Also describe the interrupt requirement and that no error message will be printed if this isn't met. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240312132508.423320-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Changbin Du
|
659663f0bc |
perf: script: prefer capstone to XED
perf can now show assembly instructions for x86 using libcapstone, and capstone is better in general. Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-6-changbin.du@huawei.com |
||
Changbin Du
|
6750ba4b64 |
perf: script: add raw|disasm arguments to --insn-trace option
Now '--insn-trace' accepts an argument to specify the output format: - raw: display raw instructions. - disasm: display mnemonic instructions (if capstone is installed). $ sudo perf script --insn-trace=raw ls 1443864 [006] 2275506.209908875: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: 48 89 e7 ls 1443864 [006] 2275506.209908875: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: e8 e8 0c 00 00 ls 1443864 [006] 2275506.209908875: 7f216b426df0 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: f3 0f 1e fa $ sudo perf script --insn-trace=disasm ls 1443864 [006] 2275506.209908875: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rdi ls 1443864 [006] 2275506.209908875: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) callq _dl_start+0x0 ls 1443864 [006] 2275506.209908875: 7f216b426df0 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) illegal instruction ls 1443864 [006] 2275506.209908875: 7f216b426df4 _dl_start+0x4 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) pushq %rbp ls 1443864 [006] 2275506.209908875: 7f216b426df5 _dl_start+0x5 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rbp ls 1443864 [006] 2275506.209908875: 7f216b426df8 _dl_start+0x8 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) pushq %r15 Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-5-changbin.du@huawei.com |
||
Changbin Du
|
9941723438 |
perf: script: add field 'disasm' to display mnemonic instructions
In addition to the 'insn' field, this adds a new field 'disasm' to display mnemonic instructions instead of the raw code. $ sudo perf script -F +disasm perf-exec 1443864 [006] 2275506.209848: psb: psb offs: 0 0 [unknown] ([unknown]) perf-exec 1443864 [006] 2275506.209848: cbr: cbr: 41 freq: 4100 MHz (114%) 0 [unknown] ([unknown]) ls 1443864 [006] 2275506.209905: 1 branches:uH: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rdi ls 1443864 [006] 2275506.209908: 1 branches:uH: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) callq _dl_start+0x0 Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-4-changbin.du@huawei.com |
||
Namhyung Kim
|
39d14c0dd6 |
Merge branch 'perf-tools' into perf-tools-next
To get some fixes in the perf test and JSON metrics into the development branch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Yicong Yang
|
cbc917a1b0 |
perf stat: Support per-cluster aggregation
Some platforms have 'cluster' topology and CPUs in the cluster will
share resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2
cache (for Intel Jacobsville). Parsing and building the cluster
topology has been supported since [1].
perf stat already supports aggregation for other topologies like
die or socket, etc. It'll be useful to aggregate per-cluster to find
problems like L3T bandwidth contention.
This patch adds support for the "--per-cluster" option for per-cluster
aggregation. It also updates the docs and the related test. The output will
look like:
[root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
Performance counter stats for 'system wide':
S56-D0-CLS158 4 1,321,521,570 LLC-load
S56-D0-CLS594 4 794,211,453 LLC-load
S56-D0-CLS1030 4 41,623 LLC-load
S56-D0-CLS1466 4 41,646 LLC-load
S56-D0-CLS1902 4 16,863 LLC-load
S56-D0-CLS2338 4 15,721 LLC-load
S56-D0-CLS2774 4 22,671 LLC-load
[...]
On a legacy system without clusters or cluster support, the output will
look like:
[root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1
Performance counter stats for 'system wide':
S56-D0-CLS0 64 18,011,485 cycles
S7182-D0-CLS0 64 16,548,835 cycles
Note that this patch doesn't mix the cluster information in the outputs
of --per-core to avoid breaking any tools/scripts using it.
Note that perf recently gained support for "--per-cache" aggregation, but it's not
the same as per-cluster aggregation, although cluster CPUs may share some cache
resources. For example on my machine all clusters within a die share the
same L3 cache:
$ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-31
$ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
0-3
[1] commit
|
||
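The difference between the cache domain and the cluster in the commit message above can be made concrete by expanding the two sysfs CPU lists and comparing them. A minimal sketch in Python, assuming the usual comma-and-range list format (for example `0-3,8`); `parse_cpu_list` is a hypothetical helper, and the input strings are the values quoted in the message rather than values read from a live system:

```python
def parse_cpu_list(text):
    """Expand a CPU list string such as '0-3,8,10-11' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        # A single CPU such as '8' has no upper bound, so reuse the lower one.
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus

# Values from the example in the commit message.
l3_cpus = parse_cpu_list("0-31")      # cache/index3/shared_cpu_list
cluster_cpus = parse_cpu_list("0-3")  # topology/cluster_cpus_list

print("cluster is a strict subset of the L3 domain:", cluster_cpus < l3_cpus)
print("CPUs sharing L3:", len(l3_cpus), "CPUs in cluster:", len(cluster_cpus))
```

On the machine described in the message, the cluster of cpu0 covers 4 CPUs while its L3 domain covers 32, which is why per-cache and per-cluster aggregation give different groupings there.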
Adrian Hunter
|
0bdfbd04c6 |
perf tools: Make it possible to see perf's kernel and module memory mappings
Dump kmaps if using 'perf --debug kmaps' or verbose > 2 (e.g. -vvv) for tools 'perf script' and 'perf report' if there is no browser. Example: $ perf --debug kmaps script 2>&1 >/dev/null | grep kvm.intel build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20] Map: 0-3a3 4f5d8 [kvm_intel].modinfo Map: 0-5240 5f280 [kvm_intel]__versions Map: 0-30 64 [kvm_intel].note.Linux Map: 0-14 644c0 [kvm_intel].orc_header Map: 0-5297 43680 [kvm_intel].rodata Map: 0-5bee 3b837 [kvm_intel].text.unlikely Map: 0-7e0 41430 [kvm_intel].noinstr.text Map: 0-2080 713c0 [kvm_intel].bss Map: 0-26 705c8 [kvm_intel].data..read_mostly Map: 0-5888 6a4c0 [kvm_intel].data Map: 0-22 70220 [kvm_intel].data.once Map: 0-40 705f0 [kvm_intel].data..percpu Map: 0-1685 41d20 [kvm_intel].init.text Map: 0-4b8 6fd60 [kvm_intel].init.data Map: 0-380 70248 [kvm_intel]__dyndbg Map: 0-8 70218 [kvm_intel].exit.data Map: 0-438 4f980 [kvm_intel]__param Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1 Map: 0-3657 493b8 [kvm_intel].rodata.str1.8 Map: 0-e0 70640 [kvm_intel].data..ro_after_init Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko The example above shows how the module section mappings are all wrong except for the main .text mapping at 0xffffffffc13a7000. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Like Xu <like.xu.linux@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240208085326.13432-2-adrian.hunter@intel.com |
||
Ben Gainey
|
acfd65c894 |
tools: perf: Expose sample ID / stream ID to python scripts
perf script exposes the evsel_name to python scripts as part of the data passed to the sample or tracepoint handler function, and it passes the id and stream_id to the throttled/unthrottled handler functions. This makes matching throttle events and samples difficult. To make this possible, this change exposes the sample id and stream_id values to the script. Signed-off-by: Ben Gainey <ben.gainey@arm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: will@kernel.org Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240123103137.1890779-2-ben.gainey@arm.com |
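With both values available on each side, a script can group samples and throttle events by the same key. A minimal sketch of that matching logic, written as plain Python so it runs without perf; the record layout (dicts with `id` and `stream_id` keys), the sample data and the helper names are assumptions for illustration, not the interface perf passes to handlers:

```python
from collections import defaultdict

def stream_key(record):
    """Key shared by samples and throttle events from the same event stream."""
    return (record["id"], record["stream_id"])

def group_by_stream(samples, throttle_events):
    """Collect samples and throttle events under their (id, stream_id) key."""
    grouped = defaultdict(lambda: {"samples": [], "throttles": []})
    for sample in samples:
        grouped[stream_key(sample)]["samples"].append(sample)
    for event in throttle_events:
        grouped[stream_key(event)]["throttles"].append(event)
    return dict(grouped)

# Hypothetical records standing in for what handler functions would collect.
samples = [
    {"id": 11, "stream_id": 11, "time": 100},
    {"id": 12, "stream_id": 12, "time": 105},
    {"id": 11, "stream_id": 11, "time": 130},
]
throttle_events = [{"id": 11, "stream_id": 11, "time": 120}]

for key, records in group_by_stream(samples, throttle_events).items():
    print(key, len(records["samples"]), "samples,",
          len(records["throttles"]), "throttle events")
```

In a real script the two lists would be filled from the sample handler and the throttled/unthrottled handlers respectively; the exact field names should be checked against the perf version in use.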