mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-07 13:43:51 +00:00
bebc6082da
8015 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Andrei Vagin
|
33974a414c |
perf trace: Call machine__exit() at exit
Otherwise 'perf trace' leaves a temporary file /tmp/perf-vdso.so-XXXXXX. $ perf trace -o log true $ ls -l /tmp/perf-vdso.* -rw------- 1 root root 8192 Nov 8 03:08 /tmp/perf-vdso.so-5bCpD0 Signed-off-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vasily Averin <vvs@virtuozzo.com> Link: http://lkml.kernel.org/r/20171108002246.8924-1-avagin@openvz.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
a271bfaf30 |
perf tools: Fix eBPF event specification parsing
Looks like I've reached the new level of stupidity, adding missing braces. Committer testing: Given the following eBPF C filter, that will add a record when it returns true, i.e. when the tv_nsec variable is > 2000ns, should be built and installed via sys_bpf(), but fails to do so before this patch: # cat filter.c #include <uapi/linux/bpf.h> #define SEC(NAME) __attribute__((section(NAME), used)) SEC("func=hrtimer_nanosleep rqtp->tv_nsec") int func(void *ctx, int err, long nsec) { return nsec > 1000; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf trace -e nanosleep,filter.c usleep 1 invalid or unsupported event: 'filter.c' Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # And works again after it is applied, the nothing is inserted when the co # perf trace -e *sleep,filter.c usleep 1 0.000 ( 0.066 ms): usleep/23994 nanosleep(rqtp: 0x7ffead94a0d0) = 0 # perf trace -e *sleep,filter.c usleep 2 0.000 ( 0.008 ms): usleep/24378 nanosleep(rqtp: 0x7fffa021ba50) ... 0.008 ( ): perf_bpf_probe:func:(ffffffffb410cb30) tv_nsec=2000) 0.000 ( 0.066 ms): usleep/24378 ... [continued]: nanosleep()) = 0 # The intent of |
||
Jiri Olsa
|
b6af53b7d6 |
perf tools: Add "reject" option for parse-events.l
Arnaldo reported broken builds in some distros using a newer flex release, 2.6.4, found in Alpine Linux 3.6 and Edge, with flex not spotting the REJECT macro: CC /tmp/build/perf/util/parse-events-flex.o util/parse-events.l: In function 'parse_events_lex': /tmp/build/perf/util/parse-events-flex.c:4734:16: error: \ 'reject_used_but_not_detected' undeclared (first use in this function) It's happening because we put the REJECT under another USER_REJECT macro in following commit: |
||
Ingo Molnar
|
294cbd05e3 |
Merge branch 'linus' into perf/urgent, to pick up dependent commits
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Greg Kroah-Hartman
|
b24413180f |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Jiri Olsa
|
9445464bb8 |
perf tools: Unwind properly location after REJECT
We have defined YY_USER_ACTION to keep trace of the column location during events parsing, but we need to clean it up when we call REJECT. When REJECT is called, the lexer shrinks the text and re-runs the matching, so we need to address it in resuming the previous location value to keep it correct for error display, like: Before: $ perf stat -e 'cpu/uops_executed.core,krava/' true event syntax error: '..38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;1\ 1;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;\ 21;38;50 �' \___ unknown term After: $ ./perf stat -e 'cpu/uops_executed.core,krava/' true event syntax error: '..cuted.core,krava/' \___ unknown term Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Andi Kleen <ak@linux.intel.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vug2hchlny30jfsfrumbym26@git.kernel.org Link: http://lkml.kernel.org/r/20171009140944.GD28623@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ravi Bangoria
|
331c7cb307 |
perf symbols: Fix memory corruption because of zero length symbols
Perf top is often crashing at very random locations on powerpc. After
investigating, I found the crash only happens when sample is of zero
length symbol. Powerpc kernel has many such symbols which does not
contain length details in vmlinux binary and thus start and end
addresses of such symbols are same.
Structure
struct sym_hist {
u64 nr_samples;
u64 period;
struct sym_hist_entry addr[0];
};
has last member 'addr[]' of size zero. 'addr[]' is an array of addresses
that belongs to one symbol (function). If function consist of 100
instructions, 'addr' points to an array of 100 'struct sym_hist_entry'
elements. For zero length symbol, it points to the *empty* array, i.e.
no members in the array and thus offset 0 is also invalid for such
array.
static int __symbol__inc_addr_samples(...)
{
...
offset = addr - sym->start;
h = annotation__histogram(notes, evidx);
h->nr_samples++;
h->addr[offset].nr_samples++;
h->period += sample->period;
h->addr[offset].period += sample->period;
...
}
Here, when 'addr' is same as 'sym->start', 'offset' becomes 0, which is
valid for normal symbols but *invalid* for zero length symbols and thus
updating h->addr[offset] causes memory corruption.
Fix this by adding one dummy element for zero length symbols.
Link: https://lkml.org/lkml/2016/10/10/148
Fixes:
|
||
Li Zhijian
|
74f8e22c15 |
perf test shell trace+probe_libc_inet_pton.sh: Be compatible with Debian/Ubuntu
In debian/ubuntu, libc.so is located at a different place, /lib/x86_64-linux-gnu/libc-2.23.so, so it outputs like this when testing: PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.040 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.040/0.040/0.040/0.000 ms 0.000 probe_libc:inet_pton:(7f0e2db741c0)) __GI___inet_pton (/lib/x86_64-linux-gnu/libc-2.23.so) getaddrinfo (/lib/x86_64-linux-gnu/libc-2.23.so) [0xffffa9d40f34ff4d] (/bin/ping) Fix up the libc path to make sure this test works in more OSes. Committer testing: When this test fails one can use 'perf test -v', i.e. in verbose mode, where it'll show the expected backtrace, so, after applying this test: On Fedora 26: # perf test -v ping 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 23322 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms 0.000 probe_libc:inet_pton:(7fe344310d80)) __GI___inet_pton (/usr/lib64/libc-2.25.so) getaddrinfo (/usr/lib64/libc-2.25.so) _init (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Li Zhijian <lizhijian@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philip Li <philip.li@intel.com> Link: http://lkml.kernel.org/r/1508315649-18836-1-git-send-email-lizhijian@cn.fujitsu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
3d8bba9535 |
perf xyarray: Fix wrong processing when closing evsel fd
In current xyarray code, xyarray__max_x() returns max_y, and xyarray__max_y()
returns max_x.
It's confusing and for code logic it looks not correct.
Error happens when closing evsel fd. Let's see this scenario:
1. Allocate an fd (pseudo-code)
perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
{
evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int));
}
xyarray__new(int xlen, int ylen, size_t entry_size)
{
size_t row_size = ylen * entry_size;
struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
xy->entry_size = entry_size;
xy->row_size = row_size;
xy->entries = xlen * ylen;
xy->max_x = xlen;
xy->max_y = ylen;
......
}
So max_x is ncpus, max_y is nthreads and row_size = nthreads * 4.
2. Use perf syscall and get the fd
int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
for (cpu = 0; cpu < cpus->nr; cpu++) {
for (thread = 0; thread < nthreads; thread++) {
int fd, group_fd;
fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
group_fd, flags);
FD(evsel, cpu, thread) = fd;
}
}
static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
{
return &xy->contents[x * xy->row_size + y * xy->entry_size];
}
These codes don't have issues. The issue happens in the closing of fd.
3. Close fd.
void perf_evsel__close_fd(struct perf_evsel *evsel)
{
int cpu, thread;
for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++)
for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) {
close(FD(evsel, cpu, thread));
FD(evsel, cpu, thread) = -1;
}
}
Since xyarray__max_x() returns max_y (nthreads) and xyarry__max_y()
returns max_x (ncpus), so above code is actually to be:
for (cpu = 0; cpu < nthreads; cpu++)
for (thread = 0; thread < ncpus; ++thread) {
close(FD(evsel, cpu, thread));
FD(evsel, cpu, thread) = -1;
}
It's not correct!
This change is introduced by "475fb533fb7d" ("perf evsel: Fix buffer overflow
while freeing events")
This fix is to let xyarray__max_x() return max_x (ncpus) and
let xyarry__max_y() return max_y (nthreads)
Committer note:
This was also fixed by Ravi Bangoria, who provided the same patch,
noticing the problem with 'perf record':
<quote Ravi>
I see 'perf record -p <pid>' crashes with following log:
*** Error in `./perf': free(): invalid next size (normal): 0x000000000298b340 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7fd85c87e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f7fd85d137a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7fd85d553c]
./perf(perf_evsel__close+0xb4)[0x4b7614]
./perf(perf_evlist__delete+0x100)[0x4ab180]
./perf(cmd_record+0x1d9)[0x43a5a9]
./perf[0x49aa2f]
./perf(main+0x631)[0x427841]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7fd8571830]
./perf(_start+0x29)[0x427a59]
</>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Fixes:
|
||
Namhyung Kim
|
7f0cd23615 |
perf buildid-list: Fix crash when processing PERF_RECORD_NAMESPACE
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events. It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
+elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
+python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
#2 0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
#3 perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
file_offset=file_offset@entry=1136) at util/session.c:1522
#4 0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
data_offset=<optimized out>, session=0x0) at util/session.c:1899
#5 perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
#6 0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
at builtin-buildid-list.c:83
#7 cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
#8 0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
at perf.c:296
#9 0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
(gdb)
Fix it by adding a stub event handler for namespace event.
Committer testing:
Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:
# perf record -a --namespaces sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
# perf buildid-list | wc -l
38
# perf buildid-list | head -5
e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
#
It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:
# perf buildid-list -H
Segmentation fault (core dumped)
^C
#
Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes:
|
||
Taeung Song
|
3f50f614d6 |
perf record: Fix documentation for a inexistent option '-l'
'perf record' had a '-l' option that meant "scale counter values" a very long time ago, but it currently belongs to 'perf stat' as '-c'. So remove it. I found this problem in the below case. $ perf record -e cycles -l sleep 3 Error: unknown switch `l Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1507907412-19813-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
29479bfe83 |
perf tools: Check wether the eBPF file exists in event parsing
Adding the check wether the eBPF file exists, to consider it as eBPF input file. This way we can differentiate eBPF events from events that end up with same suffix as eBPF file. Before: $ perf stat -e 'cpu/uops_executed.core/' true bpf: builtin compilation failed: -95, try external compiler WARNING: unable to get correct kernel building directory. Hint: Set correct kbuild directory using 'kbuild-dir' option in [llvm] section of ~/.perfconfig or set it to "" to suppress kbuild detection. event syntax error: 'cpu/uops_executed.core/' \___ Failed to load cpu/uops_executed.c from source: 'version' section incorrect or lost After: $ perf stat -e 'cpu/uops_executed.core/' true Performance counter stats for 'true': 181,533 cpu/uops_executed.core/:u 0.002795447 seconds time elapsed If user makes type in the eBPF file, we prioritize the event syntax and show following warning: $ perf stat -e 'krava.c//' true event syntax error: 'krava.c//' \___ Cannot find PMU `krava.c'. Missing kernel support? Reported-and-Tested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20171013083736.15037-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
d0e35234f6 |
perf hists: Add extra integrity checks to fmt_free()
Make sure the struct perf_hpp_fmt is properly unhooked before we free it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Changbin Du <changbin.du@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20171013083736.15037-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
70b01dfd76 |
perf hists: Fix crash in perf_hpp__reset_output_field()
Du Changbin reported crash [1] when calling perf_hpp__reset_output_field() after unregistering field via perf_hpp__column_unregister(). This ends up in calling following list_del* sequence on the same format: perf_hpp__column_unregister: list_del(&format->list); perf_hpp__reset_output_field: list_del_init(&fmt->list); where the later list_del_init might touch already freed formats. Fixing this by replacing list_del() with list_del_init() in perf_hpp__column_unregister(). [1] http://marc.info/?l=linux-kernel&m=149059595826019&w=2 Reported-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20171013083736.15037-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Mark Rutland
|
66ec11919a |
perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU
Currently, perf record is broken on arm/arm64 systems when the PMU is specified explicitly as part of the event, e.g. $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true In such cases, perf record fails to open events unless perf_event_paranoid is set to -1, even if the PMU in question supports mode exclusion. Further, even when perf_event_paranoid is toggled, no samples are recorded. This is an unintended side effect of commit: |
||
Mark Santaniello
|
e9516c0813 |
perf script: Add missing separator for "-F ip,brstack" (and brstackoff)
Prior to commit |
||
Ravi Bangoria
|
c1fbc0cf81 |
perf callchain: Compare dsos (as well) for CCKEY_FUNCTION
Two functions from different binaries can have same start address. Thus, comparing only start address in match_chain() leads to inconsistent callchains. Fix this by adding a check for dsos as well. Ex, https://www.spinics.net/lists/linux-perf-users/msg04067.html Reported-by: Alexander Pozdneev <pozdneyev@gmail.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: zhangmengting@huawei.com Link: http://lkml.kernel.org/r/20171005091234.5874-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ingo Molnar
|
1addcd55bc |
perf/urgent fixes:
- Fix syscalltbl build failure (Akemi Yagi) - Fix attr.exclude_kernel setting for default cycles:p, this time for !root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo) - Sync kernel ABI headers with tooling headers (Ingo Molnar) - Remove misleading debug messages with --call-graph option (Mengting Zhang) - Revert vmlinux symbol resolution patches for s390x (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlnNSQgACgkQ1lAW81NS qkAijg//THaO1ErOm7Ha/Xyh6gs2h4OfAQ8t+FmvEjQKrd6zIjG3/WKzfzoD7AY2 Sy8VfiCM3mno/VnrH/9Ty/6jtIn06aV4Ljm3UgBYQJXHSHUmiVaK/M+ddzZahH2j 3/MBs0vK1kOjzv4s9LWs5a9nwCJrCbsdpZs2HmdoX90/NhEq41T6VsFK5zA+bgqr 4Wjm3livbxCihpAhEhB31IED1YLn8Yhoas/GxC/o4bIUFtnSe8sWS8pZi4Mu9cwI /OUo6iOCR7DDsLW5GSJkzvJnUSGEEDFX/brgibZjuDL9rLSm5tG1gmiIue4VUU12 OLUltRCOP3C3y3rZLeB0W4S4fB2R1vV56/dOIZcXThXR4etpS+cMTv+lmxFdvFkx cxyXM2KJDr313q43b49zYZMjKvsRbt4zEjJgis8OIQUBKHE9tu7MQ9nkAOZE1F1c VE49F2Q4u+LmoN5KNf6d44h12FO3o+8XBr9B4yOTghZmk3rp6fWh8I++UxfuICXO xHD6nN1oDmziB4T3+SfbGjbScTwyYScn4RAWS4bVUkMJZ57Y44cdikr12f/IPKO1 MS20DWwkXaJrB2PEJxu7F7T+6bUJLMGLJH3CRnFBkxIHV8y/oEc+gLYFj1h4k/59 3dAmXpfhcQ+B6MIqHoWsVEXbpH5iBaJR6Jywp57xyrcY0TATRuk= =Tg+2 -----END PGP SIGNATURE----- Merge tag 'perf-urgent-for-mingo-4.14-20170928' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix syscalltbl build failure (Akemi Yagi) - Fix attr.exclude_kernel setting for default cycles:p, this time for !root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo) - Sync kernel ABI headers with tooling headers (Ingo Molnar) - Remove misleading debug messages with --call-graph option (Mengting Zhang) - Revert vmlinux symbol resolution patches for s390x (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Thomas Richter
|
5357413f5c |
perf test: Fix vmlinux failure on s390x part 2
On s390x perf test 1 failed. It turned out that commit |
||
Thomas Richter
|
b28503a3fe |
perf test: Fix vmlinux failure on s390x
On s390x perf test 1 failed. It turned out that commit |
||
Akemi Yagi
|
090657c9fb |
perf tools: Fix syscalltbl build failure
The build of kernel v4.14-rc1 for i686 fails on RHEL 6 with the error in tools/perf: util/syscalltbl.c:157: error: expected ';', ',' or ')' before '__maybe_unused' mv: cannot stat `util/.syscalltbl.o.tmp': No such file or directory Fix it by placing/moving: #include <linux/compiler.h> outside of #ifdef HAVE_SYSCALL_TABLE block. Signed-off-by: Akemi Yagi <toracat@elrepo.org> Cc: Alan Bartlett <ajb@elrepo.org> Link: http://lkml.kernel.org/r/oq41r8$1v9$1@blaine.gmane.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Mengting Zhang
|
9789e7e93f |
perf report: Fix debug messages with --call-graph option
With --call-graph option, perf report can display call chains using type, min percent threshold, optional print limit and order. And the default call-graph parameter is 'graph,0.5,caller,function,percent'. Before this patch, 'perf report --call-graph' shows incorrect debug messages as below: # perf report --call-graph Invalid callchain mode: 0.5 Invalid callchain order: 0.5 Invalid callchain sort key: 0.5 Invalid callchain config key: 0.5 Invalid callchain mode: caller Invalid callchain mode: function Invalid callchain order: function Invalid callchain mode: percent Invalid callchain order: percent Invalid callchain sort key: percent That is because in function __parse_callchain_report_opt(),each field of the call-graph parameter is passed to parse_callchain_{mode,order, sort_key,value} in turn until it meets the matching value. For example, the order field "caller" is passed to parse_callchain_mode() firstly and obviously it doesn't match any mode field. Therefore parse_callchain_mode() will shows the debug message "Invalid callchain mode: caller", which could confuse users. The patch fixes this issue by moving the warning out of the function parse_callchain_{mode,order,sort_key,value}. Signed-off-by: Mengting Zhang <zhangmengting@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/1506154694-39691-1-git-send-email-zhangmengting@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
f1e52f14a6 |
perf evsel: Fix attr.exclude_kernel setting for default cycles:p
Yet another fix for probing the max attr.precise_ip setting: it is not
enough settting attr.exclude_kernel for !root users, as they _can_
profile the kernel if the kernel.perf_event_paranoid sysctl is set to
-1, so check that as well.
Testing it:
As non root:
$ sysctl kernel.perf_event_paranoid
kernel.perf_event_paranoid = 2
$ perf record sleep 1
$ perf evlist -v
cycles:uppp: ..., exclude_kernel: 1, ... precise_ip: 3, ...
Now as non-root, but with kernel.perf_event_paranoid set set to the
most permissive value, -1:
$ sysctl kernel.perf_event_paranoid
kernel.perf_event_paranoid = -1
$ perf record sleep 1
$ perf evlist -v
cycles:ppp: ..., exclude_kernel: 0, ... precise_ip: 3, ...
$
I.e. non-root, default kernel.perf_event_paranoid: :uppp modifier = not allowed to sample the kernel,
non-root, most permissible kernel.perf_event_paranoid: :ppp = allowed to sample the kernel.
In both cases, use the highest available precision: attr.precise_ip = 3.
Reported-and-Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes:
|
||
Arnaldo Carvalho de Melo
|
89975bd335 |
perf tools: Get all of tools/{arch,include}/ in the MANIFEST
Now that I'm switching the container builds from using a local volume pointing to the kernel repository with the perf sources, instead getting a detached tarball to be able to use a container cluster, some places broke because I forgot to put some of the required files in tools/perf/MANIFEST, namely some bitsperlong.h files. So, to fix it do the same as for tools/build/ and pack the whole tools/arch/ directory. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wmenpjfjsobwdnfde30qqncj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Michal Hocko
|
0ee931c4e3 |
mm: treewide: remove GFP_TEMPORARY allocation flag
GFP_TEMPORARY was introduced by commit
|
||
Ingo Molnar
|
b130a699c0 |
perf/urgent fixes:
- Fix TUI progress bar when delta from new total from that of the previous update is greater than the progress "step" (screen width progress bar block)) (Jiri Olsa) - Make tools/lib/api make DEBUG=1 build use -D_FORTIFY_SOURCE=2 not to cripple debuginfo, just like tools/perf/ does (Jiri Olsa) - Avoid leaking the 'perf.data' file to workloads started from the 'perf record' command line by using the O_CLOEXEC open flag (Jiri Olsa) - Fix building when libunwind's 'unwind.h' file is present in the include path, clashing with tools/perf/util/unwind.h (Milian Wolff) - Check per .perfconfig section entry flag, not just per section (Taeung Song) - Support running perf binaries with a dash in their name, needed to run perf as an AppImage (Milian Wolff) - Wait for the right child by using waitpid() when running workloads from 'perf stat', also to fix using perf as an AppImage (Milian Wolff) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlm4HVkACgkQ1lAW81NS qkByEhAAsNfQRKGQIeudLdEWx63wyZviU0KQ2zeNurbpEMHsttcHgQciYvqQmyCn FZ+zm21vcNBKd0pqFwPL0WPJzYSnudRT23cG2NFSLlFX7RNZhzgp0X1a75kACXCH oJKpu/D4YDDS8J+xLjApJUWaVOYW39yAG3Cdzq2IBYHvbvPg/ovrBkxrwKLhJJTE ZdQIjt+DbGbEUvgOMAji/BmpgjnV5/goz736KoOIiWso3LpAsEb5kiLBMghnSTyR M4Hxl7NHS+3f7J8QpTTVlcL4oxI7RgYSQbjnqdwhff4LRrTfS3txRcit20KCMwF/ u+n1JBgR7I3ogUoO1jXyi0IaDdi77Vr7EckO5Yd+8shGIiXICwx1pMl88NvxBNHN 6YB8sL8/Fbw6q7d/o5iHwbrkuLZDWaE7fU3kj31l+W5jSIM1orCwcUG+IPU8e2UQ M40vtebGEVWKkYl/UqGNGh9d6zlN8du8HW+uT0l9Hebon6YQFt/qDnP0PRQ5QlFi p8AozBXbCUxjenUknuaHN4QtIJhshOhzw2YRKxghWeVHqP/yJ8lrPy646NosKXMj qPHezIvrEarERDxn8sMlHic2edOfcbiol6GjZmxeDutvLJzc3i54+tfwVuIUG3TV o123i3Xt7vld6xaENxPhCvfEqo/IRRuvqAxAIvLRq+OLwc3H2ec= =d3gz -----END PGP SIGNATURE----- Merge tag 'perf-urgent-for-mingo-4.14-20170912' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix TUI progress bar when delta from new total from that of the previous update is greater than the progress "step" (screen width progress bar block)) (Jiri Olsa) - Make tools/lib/api make DEBUG=1 build use -D_FORTIFY_SOURCE=2 not to cripple debuginfo, just like tools/perf/ does (Jiri Olsa) - Avoid leaking the 'perf.data' file to workloads started from the 'perf record' command line by using the O_CLOEXEC open flag (Jiri Olsa) - Fix building when libunwind's 'unwind.h' file is present in the include path, clashing with tools/perf/util/unwind.h (Milian Wolff) - Check per .perfconfig section entry flag, not just per section (Taeung Song) - Support running perf binaries with a dash in their name, needed to run perf as an AppImage (Milian Wolff) - Wait for the right child by using waitpid() when running workloads from 'perf stat', also to fix using perf as an AppImage (Milian Wolff) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
e6328a7abe |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling updates from Ingo Molnar: "Perf tooling updates and fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB perf stat: Only auto-merge events that are PMU aliases perf test: Add test case for PERF_SAMPLE_PHYS_ADDR perf script: Support physical address perf mem: Support physical address perf sort: Add sort option for physical address perf tools: Support new sample type for physical address perf vendor events powerpc: Remove duplicate events perf intel-pt: Fix syntax in documentation of config option perf test powerpc: Fix 'Object code reading' test perf trace: Support syscall name globbing perf syscalltbl: Support glob matching on syscall names perf report: Calculate the average cycles of iterations |
||
Milian Wolff
|
dfc9eec771 |
perf stat: Wait for the correct child
When packaging the perf userland application into an AppImage, the wait() call in perf stat returned too early. It turned out that some other child process exited, but not the one perf stat launched: $ sudo strace -e fork,execve,clone,wait4 -f ./perf-x86_64.AppImage stat sleep 1 execve("./perf-git.3a73b7f9-x86_64.AppImage", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x7ffec1bbf050 /* 18 vars */) = 0 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3912 strace: Process 3912 attached [pid 3912] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6a6e7efe50) = 3914 strace: Process 3914 attached [pid 3912] +++ exited with 0 +++ [pid 3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3912, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- [pid 3914] clone(strace: Process 3915 attached child_stack=0x7f6a6d9fefb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6a6d9ff9d0, tls=0x7f6a6d9ff700, child_tidptr=0x7f6a6d9ff9d0) = 3915 [pid 3911] execve("/tmp/.mount_perf-g6VYMpl/AppRun", ["./perf-git.3a73b7f9-x86_64.AppIm"..., "stat", "sleep", "1"], 0x14aab70 /* 21 vars */) = 0 [pid 3911] clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f4ae113c4d0) = 3916 strace: Process 3916 attached [pid 3911] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3912 [pid 3916] execve("/usr/libexec/perf-core/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/tmp/./sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/home/milian/.bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/usr/lib/icecream/libexec/icecc/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/ssd2/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/home/milian/.bin/kf5/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/ssd2/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/home/milian/projects/compiled/other/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/home/milian/projects/compiled/kf5/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/usr/local/sbin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/usr/local/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */) = -1 ENOENT (No such file or directory) [pid 3916] execve("/usr/bin/sleep", ["sleep", "1"], 0x27d3650 /* 22 vars */ Performance counter stats for 'sleep 1': <not counted> task-clock <not counted> context-switches <not counted> cpu-migrations <not counted> page-faults <not counted> cycles <not counted> instructions <not counted> branches <not counted> branch-misses 0.000047194 seconds time elapsed [pid 3916] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=3911, si_uid=0} --- [pid 3916] +++ killed by SIGTERM +++ [pid 3911] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3916, si_uid=0, si_status=SIGTERM, si_utime=0, si_stime=0} --- [pid 3915] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3914, si_uid=0} --- [pid 3911] +++ exited with 0 +++ [pid 3915] --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=3914, si_uid=0} --- [pid 3915] +++ exited with 0 +++ +++ exited with 0 +++ This patch uses waitpid instead to ensure the call waits for the debuggee application launched by 'perf stat'. This fixes 'perf stat' when launched from an AppImage: $ ./perf-x86_64.AppImage stat sleep 1 Performance counter stats for 'sleep 1': 0.357235 task-clock (msec) # 0.000 CPUs utilized 1 context-switches # 0.003 M/sec 0 cpu-migrations # 0.000 K/sec 50 page-faults # 0.140 M/sec 1269602 cycles # 3.554 GHz 654278 instructions # 0.52 insn per cycle 129963 branches # 363.803 M/sec 7082 branch-misses # 5.45% of all branches 1.000633420 seconds time elapsed Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170912152523.4497-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Milian Wolff
|
3192f1ed3d |
perf tools: Support running perf binaries with a dash in their name
Previously the part behind "perf-" was interpreted as an internal perf command. If the suffix could not be handled, the execution was stopped. This makes it impossible to launch perf binaries that got renamed to have the `perf-` prefix. This is e.g. the case for appimages (e.g. "perf-x86_64.AppImage"), but would also apply to all other scenarios where users symlink or rename perf themselves: Status quo with the broken behavior: $ ln -s ./perf ./perf-custom-suffix $ ./perf-custom-suffix list cannot handle custom-suffix internally$ Also note the missing newline at the end of the error message. With this patch applied, the above works properly: $ ./perf-custom-suffix list List of pre-defined events (to be used in -e): ... Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170911111422.31903-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Taeung Song
|
cba225d6ee |
perf config: Check not only section->from_system_config but also item's
Currently section->from_system_config is being checked multiple times. item->from_system_config should be checked instead, when iterating thru the items in a section. Fix it. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1504754325-9724-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
a82bfd041d |
perf ui progress: Fix progress update
We currently update the 'next' variable only with a single step value. But it's possible the 'adv' update is bigger than single 'step' value. This would leave 'next' value under counted and force unnecessary ui_progress__ops->update calls. Calculate the amount of steps we need for 'adv' update and increase the 'next' with that amounts of steps. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170908120510.22515-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
4d286c89e4 |
perf ui progress: Make sure we always define step value
Unlikely, but we could have ui_progress__init being called with total < 16, which would set the next and step variables to 0. That would force unnecessary ui_progress__ops->update calls because 'next' would never raise. Forcing the next and step values to be always > 0. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170908120510.22515-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jiri Olsa
|
cd6379ebb5 |
perf tools: Open perf.data with O_CLOEXEC flag
Do not carry the perf.data file descriptor into the workload process and close it when perf executes the workload. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20170908084621.31595-2-jolsa@kernel.org [ Add definitions for O_CLOEXEC for older systems ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Milian Wolff
|
df90cc41d6 |
perf tests: Fix compile when libunwind's unwind.h is available
When cross compiling perf and I want to link against a self-compiled libunwind, I usually make the custom path where the libunwind headers exist visible by adding the libunwind prefix to the include path when compiling perf, i.e.: ~~~~~ $ ls $HOME/projects/compiled/other/include/ libunwind-coredump.h libunwind.h libunwind-x86_64.h libunwind-common.h libunwind-dynamic.h libunwind-ptrace.h unwind.h $ make EXTRA_CFLAGS="-I$HOME/projects/compiled/other/include/ ~~~~~~ Note the `unwind.h` header from libunwind which leads to compile errors when compiling tests/dwarf-unwind.c, since it shadows perf's util/unwind.h: ~~~~~ tests/dwarf-unwind.c:41:32: error: ‘struct unwind_entry’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror] static int unwind_entry(struct unwind_entry *entry, void *arg) ^~~~~~~~~~~~ tests/dwarf-unwind.c: In function ‘unwind_entry’: tests/dwarf-unwind.c:44:22: error: dereferencing pointer to incomplete type ‘struct unwind_entry’ char *symbol = entry->sym ? entry->sym->name : NULL; ^~ tests/dwarf-unwind.c: In function ‘unwind_thread’: tests/dwarf-unwind.c:92:8: error: implicit declaration of function ‘unwind__get_entries’; did you mean ‘unwind_entry’? [-Werror=implicit-function-declaration] err = unwind__get_entries(unwind_entry, &cnt, thread, ^~~~~~~~~~~~~~~~~~~ unwind_entry tests/dwarf-unwind.c:92:8: error: nested extern declaration of ‘unwind__get_entries’ [-Werror=nested-externs] ~~~~~~ Fix this compile error by specificing an explicit include of perf's unwind.h in the util folder. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170906150209.12579-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Linus Torvalds
|
bafb0762cb |
Char/Misc drivers for 4.14-rc1
Here is the big char/misc driver update for 4.14-rc1. Lots of different stuff in here, it's been an active development cycle for some reason. Highlights are: - updated binder driver, this brings binder up to date with what shipped in the Android O release, plus some more changes that happened since then that are in the Android development trees. - coresight updates and fixes - mux driver file renames to be a bit "nicer" - intel_th driver updates - normal set of hyper-v updates and changes - small fpga subsystem and driver updates - lots of const code changes all over the driver trees - extcon driver updates - fmc driver subsystem upadates - w1 subsystem minor reworks and new features and drivers added - spmi driver updates Plus a smattering of other minor driver updates and fixes. All of these have been in linux-next with no reported issues for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWa1+Ew8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yl26wCgquufNylfhxr65NbJrovduJYzRnUAniCivXg8 bePIh/JI5WxWoHK+wEbY =hYWx -----END PGP SIGNATURE----- Merge tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big char/misc driver update for 4.14-rc1. Lots of different stuff in here, it's been an active development cycle for some reason. Highlights are: - updated binder driver, this brings binder up to date with what shipped in the Android O release, plus some more changes that happened since then that are in the Android development trees. - coresight updates and fixes - mux driver file renames to be a bit "nicer" - intel_th driver updates - normal set of hyper-v updates and changes - small fpga subsystem and driver updates - lots of const code changes all over the driver trees - extcon driver updates - fmc driver subsystem upadates - w1 subsystem minor reworks and new features and drivers added - spmi driver updates Plus a smattering of other minor driver updates and fixes. All of these have been in linux-next with no reported issues for a while" * tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits) ANDROID: binder: don't queue async transactions to thread. ANDROID: binder: don't enqueue death notifications to thread todo. ANDROID: binder: Don't BUG_ON(!spin_is_locked()). ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl ANDROID: binder: push new transactions to waiting threads. ANDROID: binder: remove proc waitqueue android: binder: Add page usage in binder stats android: binder: fixup crash introduced by moving buffer hdr drivers: w1: add hwmon temp support for w1_therm drivers: w1: refactor w1_slave_show to make the temp reading functionality separate drivers: w1: add hwmon support structures eeprom: idt_89hpesx: Support both ACPI and OF probing mcb: Fix an error handling path in 'chameleon_parse_cells()' MCB: add support for SC31 to mcb-lpc mux: make device_type const char: virtio: constify attribute_group structures. Documentation/ABI: document the nvmem sysfs files lkdtm: fix spelling mistake: "incremeted" -> "incremented" perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file nvmem: include linux/err.h from header ... |
||
Arnaldo Carvalho de Melo
|
eba9fac017 |
perf annotate browser: Help for cycling thru hottest instructions with TAB/shift+TAB
The popup help accessed via 'h' wasn't mentioning about TAB and shift-TAB, just about 'H', which goes to the hottest line, while the former two are the hotkeys for actually cycling thru the hottest lines. Reported-by: Flavio Bruno Leitner <fbl@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5ppym6odizfj1ifa4t7neiku@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
63ce8449bc |
perf stat: Only auto-merge events that are PMU aliases
Peter reported that when he explicitely asked for multiple events with
the same name on the command line it got coalesced into just one line,
i.e.:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
3,269,652 cycles
0.000884123 seconds time elapsed
#
And while there is the --no-merges option to disable that auto-merging,
this is a blunt change in behaviour for such explicit request, so change
the code so that this auto merging is done only when handling the multi
PMU aliases with the same name that introduced this coalescing,
restoring the previous behaviour for the explicit case:
# perf stat -e cycles -e cycles -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,472,837 cycles
1,472,837 cycles
1,472,837 cycles
0.001764870 seconds time elapsed
#
Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes:
|
||
Kan Liang
|
fc33dccba3 |
perf test: Add test case for PERF_SAMPLE_PHYS_ADDR
Extend sample-parsing test cases to support new sample type PERF_SAMPLE_PHYS_ADDR. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-6-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
49d58f04eb |
perf script: Support physical address
Display the physical address at the tail if it is available. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-5-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
c35aeb9dfe |
perf mem: Support physical address
Add option phys-data in "perf mem" to record/report physical address. The default mem sort order for physical address is changed accordingly. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
8780fb25ab |
perf sort: Add sort option for physical address
Add a new sort option "phys_daddr" for --mem-mode sort. With this option applied, perf can sort and report by sample's physical address. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
3b0a5daa06 |
perf tools: Support new sample type for physical address
Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address. Add new option --phys-data to record sample physical address. Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com [ Added missing printing in evsel.c patch sent by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Sukadev Bhattiprolu
|
2a118e1bd2 |
perf vendor events powerpc: Remove duplicate events
Some POWER PMU event names have multiple/alternate event codes. These alternate event codes were listed in the POWER9 JSON files for reference. But the perf tool does not seem to handle duplicates cleanly. 'perf list' shows such duplicate events only once, but 'perf stat' ends up counting the first event code twice, multiplexing if necessary and we end up with double the event counts. Remove the duplicate event codes from the JSON files for now. Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170830231506.GB20351@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jack Henschel
|
4fb2053920 |
perf intel-pt: Fix syntax in documentation of config option
As specified in tools/perf/Documentation/perf-config.txt, perf configuration items must be in 'key = value' format, otherwise the following error message occurs: $ perf record -e intel_pt//u -- ls bad config file line 2 in ~/.perfconfig $ cat .perfconfig [intel-pt] mispred-all Changing to assigning a value to the key 'mispred-all' fixes the issue: $ perf record -e intel_pt//u -- ls [ perf record: Woken up 1 times to write data ] [ perf record: Capured and wrote 0.031 MB perf.data] $ cat .perfconfig [intel-pt] mispred-all = true Signed-off-by: Jack Henschel <jackdev@mailbox.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170831080535.2157-1-jackdev@mailbox.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ravi Bangoria
|
9a805d8648 |
perf test powerpc: Fix 'Object code reading' test
'Object code reading' test always fails on powerpc guest. Two reasons for the failure are: 1. When elf section is too big (size beyond 'unsigned int' max value). objdump fails to disassemble from such section. This was fixed with commit 0f6329bd7fc ("binutils/objdump: Fix disassemble for huge elf sections") in binutils. 2. When the sample is from hypervisor. Hypervisor symbols can not be resolved within guest and thus thread__find_addr_map() fails for such symbols. Fix this by ignoring hypervisor symbols in the test. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1504170896-7876-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
27702bcfe8 |
perf trace: Support syscall name globbing
So now we can use: # perf trace -e pkey_* 532.784 ( 0.006 ms): pkey/16018 pkey_alloc(init_val: DISABLE_WRITE) = -1 EINVAL Invalid argument 532.795 ( 0.004 ms): pkey/16018 pkey_mprotect(start: 0x7f380d0a6000, len: 4096, prot: READ|WRITE, pkey: -1) = 0 532.801 ( 0.002 ms): pkey/16018 pkey_free(pkey: -1 ) = -1 EINVAL Invalid argument ^C[root@jouet ~]# Or '-e epoll*', '-e *msg*', etc. Combining syscall names with perf events, tracepoints, etc, continues to be valid, i.e. this is possible: # perf probe -L sys_nanosleep <SyS_nanosleep@/home/acme/git/linux/kernel/time/hrtimer.c:0> 0 SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp, struct timespec __user *, rmtp) { struct timespec64 tu; 5 if (get_timespec64(&tu, rqtp)) 6 return -EFAULT; if (!timespec64_valid(&tu)) 9 return -EINVAL; 11 current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; 12 current->restart_block.nanosleep.rmtp = rmtp; 13 return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC); } # perf probe my_probe="sys_nanosleep:12 rmtp" Added new event: probe:my_probe (on sys_nanosleep:12 with rmtp) You can now use it in all perf tools, such as: perf record -e probe:my_probe -aR sleep 1 # # perf trace -e probe:my_probe/max-stack=5/,*sleep sleep 1 0.427 ( 0.003 ms): sleep/16690 nanosleep(rqtp: 0x7ffefc245090) ... 0.430 ( ): probe:my_probe:(ffffffffbd112923) rmtp=0) sys_nanosleep ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep_nocancel (/usr/lib64/libc-2.25.so) 0.427 (1000.208 ms): sleep/16690 ... [continued]: nanosleep()) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-elycoi8wy6y0w9dkj7ox1mzz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
89be3f8ab7 |
perf syscalltbl: Support glob matching on syscall names
With two new methods, one to find the first match, returning its syscall id and its index in whatever internal database it keeps the syscall into, then one to find the next match, if any. Implemented only on arches where we actually read the syscall table from the kernel sources, i.e. x86-64 for now, all the others use the libaudit method for which this returns -1, i.e. just stubs were added, with the actual implementation using whatever libaudit functions for matching that may be available. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i0sj4rxk1a63pfe9gl8z8irs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
c4ee06251d |
perf report: Calculate the average cycles of iterations
The branch history code has a loop detection function. With this, we can get the number of iterations by calculating the removed loops. While it would be nice for knowing the average cycles of iterations. This patch adds up the cycles in branch entries of removed loops and save the result to the next branch entry (e.g. branch entry A). Finally it will display the iteration number and average cycles at the "from" of branch entry A. For example: perf record -g -j any,save_type ./div perf report --branch-history --no-children --stdio --22.63%--main div.c:42 (RET CROSS_2M) compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2) | --10.73%--compute_flag div.c:27 (RET CROSS_2M) rand rand.c:28 (cycles:1) rand rand.c:28 (RET CROSS_2M) __random random.c:298 (cycles:1) __random random.c:297 (COND_BWD CROSS_2M) __random random.c:295 (cycles:1) __random random.c:295 (COND_BWD CROSS_2M) __random random.c:295 (cycles:1) __random random.c:295 (RET CROSS_2M) Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1502111115-18305-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Li Bin
|
b2f7605076 |
perf symbols: Fix plt entry calculation for ARM and AARCH64
On x86, the plt header size is as same as the plt entry size, and can be identified from shdr's sh_entsize of the plt. But we can't assume that the sh_entsize of the plt shdr is always the plt entry size in all architecture, and the plt header size may be not as same as the plt entry size in some architecure. On ARM, the plt header size is 20 bytes and the plt entry size is 12 bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils implementation. The plt section is as follows: Disassembly of section .plt: 000004a0 <__cxa_finalize@plt-0x14>: 4a0: e52de004 push {lr} ; (str lr, [sp, #-4]!) 4a4: e59fe004 ldr lr, [pc, #4] ; 4b0 <_init+0x1c> 4a8: e08fe00e add lr, pc, lr 4ac: e5bef008 ldr pc, [lr, #8]! 4b0: 00008424 .word 0x00008424 000004b4 <__cxa_finalize@plt>: 4b4: e28fc600 add ip, pc, #0, 12 4b8: e28cca08 add ip, ip, #8, 20 ; 0x8000 4bc: e5bcf424 ldr pc, [ip, #1060]! ; 0x424 000004c0 <printf@plt>: 4c0: e28fc600 add ip, pc, #0, 12 4c4: e28cca08 add ip, ip, #8, 20 ; 0x8000 4c8: e5bcf41c ldr pc, [ip, #1052]! ; 0x41c On AARCH64, the plt header size is 32 bytes and the plt entry size is 16 bytes. The plt section is as follows: Disassembly of section .plt: 0000000000000560 <__cxa_finalize@plt-0x20>: 560: a9bf7bf0 stp x16, x30, [sp,#-16]! 564: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 568: f944be11 ldr x17, [x16,#2424] 56c: 9125e210 add x16, x16, #0x978 570: d61f0220 br x17 574: d503201f nop 578: d503201f nop 57c: d503201f nop 0000000000000580 <__cxa_finalize@plt>: 580: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 584: f944c211 ldr x17, [x16,#2432] 588: 91260210 add x16, x16, #0x980 58c: d61f0220 br x17 0000000000000590 <__gmon_start__@plt>: 590: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 594: f944c611 ldr x17, [x16,#2440] 598: 91262210 add x16, x16, #0x988 59c: d61f0220 br x17 NOTES: In addition to ARM and AARCH64, other architectures, such as s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider this issue. Signed-off-by: Li Bin <huawei.libin@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: David Tolnay <dtolnay@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: zhangmengting@huawei.com Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Li Bin
|
2c29461e27 |
perf probe: Fix kprobe blacklist checking condition
The commit |