1655 Commits

Author SHA1 Message Date
Ian Rogers
0ddce121b0 perf test: Avoid system wide when not privileged
Switch the test program to sleep that makes more sense for system wide
events. Only enable system wide when root or not paranoid. This avoids
failures under some testing conditions like ARM cloud.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230930060206.2353141-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-04 22:28:07 -07:00
Athira Rajeev
ee33a0ef84 perf test: Fix parse-events tests to skip parametrized events
Testcase "Parsing of all PMU events from sysfs" parse events for
all PMUs, and not just cpu. In case of powerpc, the PowerVM
environment supports events from hv_24x7 and hv_gpci PMU which
is of example format like below:

- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/

The value for "?" needs to be filled in depending on system
configuration. It is better to skip these parametrized events
in this test as it is done in:
'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
parametrized events")' which handled a simialr instance with
"all PMU test".

Fix parse-events test to skip parametrized events since
it needs proper setup of the parameters.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230927181703.80936-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-27 21:11:53 -07:00
Jing Zhang
7fded33c69 perf test: Add pmu-event test for "Compat" and new event_field.
Add new event test for uncore system event which is used to verify the
functionality of "Compat" matching multiple identifiers and the new event
fields "EventidCode" and "NodeType".

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: Zhuo Song <zhuo.song@linux.alibaba.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/1695794391-34817-6-git-send-email-renyu.zj@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-27 21:02:41 -07:00
Jing Zhang
3bb59e759c perf test: Make matching_pmu effective
The perf_pmu_test_event.matching_pmu didn't work. No matter what its
value is, it does not affect the test results. So let matching_pmu be
used for matching perf_pmu_test_pmu.pmu.name.

Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: Zhuo Song <zhuo.song@linux.alibaba.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/1695794391-34817-5-git-send-email-renyu.zj@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-27 21:02:25 -07:00
Athira Rajeev
cefff1f33d tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
Running shellcheck on some of the shell scripts, throws
below warning on shellcheck v0.6. Example:

   In tests/shell/coresight/asm_pure_loop.sh line 14:
   DATA="$DATD/perf-$TEST-$DATV.data"
          ^---^ SC2153: Possible misspelling: DATD may not be assigned, but DATA is.

Here, DATD is exported from "lib/coresight.sh" and this
warning can be ignored. Use "shellcheck disable=" to ignore
this check.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230907171540.36736-4-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:47:12 -07:00
Athira Rajeev
55312ca752 tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh tetscase
Running shellcheck on stat+shadow_stat.sh generates below
warning

    In tests/shell/stat+csv_summary.sh line 26:
    while read _num _event _run _pct
           ^--^ SC2034: _num appears unused. Verify use (or export if used externally).
                ^----^ SC2034: _event appears unused. Verify use (or export if used externally).
                       ^--^ SC2034: _run appears unused. Verify use (or export if used externally).
                            ^--^ SC2034: _pct appears unused. Verify use (or export if used externally).

This variable is intentionally unused since it is
needed to parse through the output. commit used "_"
as a prefix for this throw away variable. But this
stil shows warning with shellcheck v0.6. Fix this
by only using "_" instead of prefix and variable name.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230907171540.36736-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:47:12 -07:00
Athira Rajeev
35de80c709 tests/shell: Fix shellcheck SC1090 to handle the location of sourced files
Running shellcheck on some of the shell scripts throws
below error:

	In tests/shell/coresight/unroll_loop_thread_10.sh line 8:
	. "$(dirname $0)"/../lib/coresight.sh
	  ^-- SC1090: Can't follow non-constant source. Use a directive to specify location.

This happens on shellcheck version "0.6.0". Fix shellcheck
warning for SC1090 using "shellcheck source="i option to mention
the location of sourced files.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230907171540.36736-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:47:12 -07:00
Veronika Molnarova
29441ab3a3 perf test lock_contention.sh: Skip test if not enough CPUs
Machines with less then 4 CPUs weren't consistently triggering lock
events required for the test.

Skip the test on those machines. The limit of 4 CPUs is set as it
generates around 100 lock events for a test.

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20230919150419.23193-2-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-21 11:55:13 -07:00
Veronika Molnarova
fa52d995d1 perf test stat+shadow_stat.sh: Add threshold for rounding errors
The test was failing in specific scenarios due to imperfection of FP
arithmetics. The `bc` command wasn't correctly rounding the result of
division causing the failure.

Replace the `bc` with `awk` which should work with more decimal places
and add a threshold to catch any possible rounding errors.  The
acceptable rounding error is set to 0.01 when the test passes with a
warning message.

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20230919150419.23193-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-21 11:53:45 -07:00
Ian Rogers
b4f48f34f9 perf test: Detect off-cpu support from build options
Use perf version to detect whether BPF skeletons were enabled in a
build rather than a failing perf record.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:53 -07:00
Ian Rogers
c2ac838ef7 perf test: Ensure EXTRA_TESTS is covered in build test
Add to run variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:53 -07:00
Ian Rogers
c67c631d52 perf test: Update build test for changed BPF skeleton defaults
Fix a target name and set BUILD_BPF_SKEL to 0 rather than 1.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:45 -07:00
Yang Jihong
a132b784db perf test: Fix test-record-dummy-C0 failure for supported PERF_FORMAT_LOST feature kernel
For kernel that supports PERF_FORMAT_LOST, attr->read_format has
PERF_FORMAT_LOST bit. Update expected value of
attr->read_format of test-record-dummy-C0 for this scenario.

Before:

  # ./perf test 17 -vv
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 1609441
  <SNIP>
  running './tests/attr/test-record-dummy-C0'
    'PERF_TEST_ATTR=/tmp/tmpm3s60aji ./perf record -o /tmp/tmpm3s60aji/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
  expected read_format=4, got 20
  FAILED './tests/attr/test-record-dummy-C0' - match failure
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!

After:

  # ./perf test 17 -vv
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 1609441
  <SNIP>
  running './tests/attr/test-record-dummy-C0'
    'PERF_TEST_ATTR=/tmp/tmppa9vxcb7 ./perf record -o /tmp/tmppa9vxcb7/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
  <SNIP>
  test child finished with 0
  ---- end ----
  Setup struct perf_event_attr: Ok

Reported-and-Tested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230916091641.776031-1-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-16 22:36:36 -07:00
James Clark
3d0f5f456a perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it
iterates over all PMUs, so move it to pmus.c

At the same time rename it to perf_pmus__find_core_pmu() to match the
naming convention in this file.

list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now
that it's called from the same compilation unit. This is with -O2
(specifically -O1 -ftree-vrp -finline-functions
-finline-small-functions) which allow the bounds of the array
access to be determined at compile time. list_prepare_entry() subtracts
the offset of the 'list' member in struct perf_pmu from &core_pmus,
which isn't a struct perf_pmu. The compiler sees that pmu results in
&core_pmus - 8 and refuses to compile. At runtime this works because
list_for_each_entry_continue() always adds the offset back again before
dereferencing ->next, but it's technically undefined behavior. With
-fsanitize=undefined an additional warning is generated.

Using list_first_entry_or_null() to get the first entry here avoids
doing &core_pmus - 8 but has the same result and fixes both the compile
warning and the undefined behavior warning. There are other uses of
list_prepare_entry() in pmus.c, but the compiler doesn't seem to be
able to see that they can also be called with &core_pmus, so I won't
change any at this time.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230913153355.138331-2-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
Ruidong Tian
bb35084796 perf test: Update cs_etm testcase for Arm ETE
Add ETE as one of the supported device types in perf cs_etm testcase.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230911065541.91293-1-tianruidong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
a1ebf7718e perf test: Add a test for strcmp_cpuid_str() expression
Test that the new expression builtin returns a match when the current
escaped CPU ID is given, and that it doesn't match when "0x0" is given.

The CPU ID in test__expr() has to be changed to perf_pmu__getcpuid()
which returns the CPU ID string, rather than the raw CPU ID that
get_cpuid() returns because that can't be used with strcmp_cpuid_str().
It doesn't affect the is_intel test because both versions contain
"Intel".

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
8a55c1e2c9 perf util: Add a function for replacing characters in a string
It finds all occurrences of a single character and replaces them with
a multi character string. This will be used in a test in a following
commit.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
d19a353cdd perf test: Check result of has_event(cycles) test
Currently the function always returns 0, so even when the has_event()
test fails, the test still passes. Fix it by returning ret instead.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
e44b47b931 perf test shell lock_contention: Add cgroup aggregation and filter tests
Add cgroup aggregation and filter tests.

  $ sudo ./perf test -v contention
   84: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 222423
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  Testing perf lock contention --cgroup-filter
  Testing perf lock contention CSV output
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok

Committer testing:

  [root@quaco ~]# uname -a
  Linux quaco 6.4.10-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 11 12:20:29 UTC 2023 x86_64 GNU/Linux
  [root@quaco ~]# perf test -v contention
   84: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 452625
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  Testing perf lock contention --cgroup-filter
  Testing perf lock contention CSV output
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok
  [root@quaco ~]#

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Yang Jihong
d50ad02cb3 perf test: Add perf_event_attr test for record dummy event
If only dummy event is recorded, tracking event is not needed.
Add this test scenario.

Test result:

  # ./perf test list 2>&1 | grep 'Setup struct perf_event_attr'
   17: Setup struct perf_event_attr
  # ./perf test 17 -v
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 720198
  <SNIP>
  running './tests/attr/test-record-dummy-C0'
  <SNIP>
  test child finished with 0
  ---- end ----
  Setup struct perf_event_attr: Ok

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230904023340.12707-7-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:31:54 -03:00
Yang Jihong
23b97c7ee9 perf test: Add test case for record sideband events
Add a new test case to record sideband events for all CPUs when tracing
selected CPUs

Test result:

  # ./perf test list 2>&1 | grep 'perf record sideband tests'
   95: perf record sideband tests
  # ./perf test 95
   95: perf record sideband tests                                      : Ok

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230904023340.12707-6-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:31:49 -03:00
Yang Jihong
1285ab300d perf record: Move setting tracking events before record__init_thread_masks()
User space tasks can migrate between CPUs, so when tracing selected CPUs,
sideband for all CPUs is needed. In this case set the cpu map of the evsel
to all online CPUs. This may modify the original cpu map of the evlist.

Therefore, need to check whether the preceding scenario exists before
record__init_thread_masks().

Dummy tracking has been set in record__open(), move it before
record__init_thread_masks() and add a helper for unified processing.

The sys_perf_event_open invoked is as follows:

  # perf --debug verbose=3 record -e cpu-clock -D 100 true
  <SNIP>
  Opening: cpu-clock
  ------------------------------------------------------------
  perf_event_attr:
    type                             1 (PERF_TYPE_SOFTWARE)
    size                             136
    config                           0 (PERF_COUNT_SW_CPU_CLOCK)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 10318  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 10318  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 10318  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 10318  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 10318  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 10318  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 10318  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 10318  cpu 7  group_fd -1  flags 0x8 = 13
  Opening: dummy:u
  ------------------------------------------------------------
  perf_event_attr:
    type                             1 (PERF_TYPE_SOFTWARE)
    size                             136
    config                           0x9 (PERF_COUNT_SW_DUMMY)
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|IDENTIFIER
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    mmap                             1
    comm                             1
    enable_on_exec                   1
    task                             1
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid 10318  cpu 0  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 10318  cpu 1  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 10318  cpu 2  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 10318  cpu 3  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 10318  cpu 4  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 10318  cpu 5  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 10318  cpu 6  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 10318  cpu 7  group_fd -1  flags 0x8 = 21
  <SNIP>

'perf test' needs to update base-record & system-wide-dummy attr expected values
for test-record-C0:

1. Because a dummy sideband event is added to the sampling of specified
   CPUs. When evlist contains evsel of different sample_type,
   evlist__config() will change the default PERF_SAMPLE_ID bit to
   PERF_SAMPLE_IDENTIFICATION bit.
   The attr sample_type expected value of base-record and system-wide-dummy
   in test-record-C0 needs to be updated.

2. The perf record uses evlist__add_aux_dummy() instead of
   evlist__add_dummy() to add a dummy event.
   The expected value of system-wide-dummy attr needs to be updated.

The 'perf test' result is as follows:

  # ./perf test list  2>&1 | grep 'Setup struct perf_event_attr'
   17: Setup struct perf_event_attr
  # ./perf test 17
   17: Setup struct perf_event_attr                                    : Ok

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230904023340.12707-4-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:31:37 -03:00
Ian Rogers
0d3f0e6f94 perf parse-events: Introduce 'struct parse_events_terms'
parse_events_terms() existed in function names but was passed a
'struct list_head'.

As many parse_events functions take an evsel_config list as well as a
parse_event_term list, and the naming head_terms and head_config is
inconsistent, there's a potential to switch the lists and get errors.

Introduce a 'struct parse_events_terms', that just wraps a list_head, to
avoid this. Add the regular init/exit functions and transition the code
to use them.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-11 10:26:36 -03:00
Ian Rogers
9ea150a8d0 perf parse-events: Fixes relating to no_value terms
A term may have no value in which case it is assumed to have a value
of 1. It doesn't just apply to alias/event terms so change the
parse_events_term__to_strbuf assert.

Commit 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long
terms without value") made it so that no_value terms could only be for a
single bit. Prior to commit 64199ae4b8a3 ("perf parse-events: Fix
propagation of term's no_value when cloning") this missed a test case
where config1 had no_value.

Fixes: 64199ae4b8a36038 ("perf parse-events: Fix propagation of term's no_value when cloning")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-02 08:12:15 -03:00
Ian Rogers
f0005f1732 perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the
number present.

Add a test of the property.

This will be used in future Intel metrics.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830073026.1829912-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
d2045f8715 perf jevents: Use "default_core" for events with no Unit
The JSON Unit field encodes the name of the PMU to match the events
to. When no name is given it has meant the "cpu" core PMU except for
tests.

On ARM, Intel hybrid and s390 the core PMU is named differently which
means that using "cpu" for this case causes the events not to get
matched to the PMU.

Introduce a new "default_core" string for this case and in the
pmu__name_match force all core PMUs to match this name.

Fixes: 2e255b4f9f41f137 ("perf jevents: Group events by PMU")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230826062203.1058041-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim
a84260e314 perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
It has system-wide test and cpu-list test but the cpu-list test fails
sometimes.  It runs sleep command on CPU1 and measure both user.slice
and system.slice cgroups by default (on systemd-based systems).

But if the system was idle enough, sometime the system.slice gets no
count and it makes the test failing.  Maybe that's because it only looks
at the CPU1, let's add CPU0 to increase the chance it finds some tasks.

Fixes: 7901086014bbaa3a ("perf test: Add a new test for perf stat cgroup BPF counter")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim
68ca249c96 perf test shell stat_bpf_counters: Fix test on Intel
As of now, bpf counters (bperf) don't support event groups.  But the
default perf stat includes topdown metrics if supported (on recent Intel
machines) which require groups.  That makes perf stat exiting.

  $ sudo perf stat --bpf-counter true
  bpf managed perf events do not yet support groups.

Actually the test explicitly uses cycles event only, but it missed to
pass the option when it checks the availability of the command.

Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim
11f5710d96 perf test shell record_bpf_filter: Skip 6.2 kernel
The BPF sample filtering requires two kernel changes below:

 * bpf_cast_to_kernel_ctx() kfunc (added in v6.2)

 * setting perf_sample_data->sample_flags (finished in v6.3)

The perf tools can check bpf_cast_to_kernel_ctx() easily so it can
refuse BPF filters on those old kernels (v6.1 and earlier).

But checking sample_flags appears to be difficult so current code won't
work on v6.2 kernel.  That's unfortunate but I don't know what's the
correct way to handle it.

For now, let's skip v6.2 kernels explicitly (if failed) in the test.

Fixes: 9575ecdd198a50e9 ("perf test: Add perf record sample filtering test")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Ian Rogers
b7823045ec perf pmu: Make id const and add missing free
The struct pmu id is initialized from pmu_id that is read into allocated
memory from a file, as such it needs free-ing in pmu__delete().

Make the id value const so that we can remove casts in tests.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:23:34 -03:00
Ian Rogers
970ef02e98 perf parse-events: Make term's config const
This avoids casts in tests. Use zfree in a few places to avoid
warnings about a freeing a const pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:22:34 -03:00
Ian Rogers
c091ee9089 perf pmu: Remove logic for PMU name being NULL
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:22:16 -03:00
Ian Rogers
eeb6b12992 perf jevents: Don't append Unit to desc
Unit with the PMU name is appended to desc in jevents.py, but on
hybrid platforms it causes the desc to differ from the regular
non-hybrid system with a PMU of 'cpu'. Having differing descs means
the events don't deduplicate. To make the perf list output not differ,
append the Unit on again in the perf list printing code.

On x86 reduces the binary size by 409,600 bytes or about 4%. Update
pmu-events test expectations to match the differently generated
pmu-events.c code.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824183212.374787-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 19:21:27 -03:00
Ian Rogers
8d4b6d37ea perf pmu: Lazily load sysfs aliases
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:10:27 -03:00
Ian Rogers
e6ff1eed35 perf pmu: Lazily add JSON events
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.

Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:08:47 -03:00
Ian Rogers
7c52f10c0d perf pmu: Cache JSON events table
Cache the JSON events table so that finding it isn't done per
event/alias.

Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.

Update usage to always use the PMU variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:07:39 -03:00
Ian Rogers
edb217ff14 perf pmu: Parse sysfs events directly from a file
Rather than read a sysfs events file into a 256 byte char buffer, pass
the FILE* directly to the lex/yacc parser.

This avoids there being a maximum events file size.

While changing the API, constify some arguments to remove unnecessary
casts.

Allocating the read buffer decreases the performance of pmu-scan by
around 3%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:59 -03:00
Ian Rogers
3d5045492a perf pmu-events: Add pmu_events_table__find_event()
jevents stores events sorted by name. Add a find function that will
binary search event names avoiding the need to linearly search through
events.

Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000
is returned. If the event is found but no callback function given, 0 is
returned.

This allows the find function also act as a test for existence.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:22 -03:00
Ian Rogers
e3edd6cf63 perf pmu-events: Reduce processed events by passing PMU
Pass the PMU to pmu_events_table__for_each_event so that entries that
don't match don't need to be processed by callback.

If a NULL PMU is passed then all PMUs are processed.

'perf bench internals pmu-scan's "Average PMU scanning" performance is
reduced by about 5% on an Intel tigerlake.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:00:09 -03:00
Ian Rogers
2e255b4f9f perf jevents: Group events by PMU
Prior to this change a cpuid would map to a list of events where the PMU
would be encoded alongside the event information. This change breaks
apart each group of events so that there is a group per PMU. A new table
is added with the PMU's name and the list of events, the original table
now holding an array of these per PMU tables.

These changes are to make it easier to get per PMU information about
events, rather than the current approach of scanning all events. The
perf binary size with BPF skeletons on x86 is reduced by about 1%. The
unidentified PMU is now always expanded to "cpu".

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:51:03 -03:00
Ian Rogers
4000519eb0 perf pmu-events: Add extra underscore to function names
Add extra underscore before "for" of pmu_events_table_for_each_event
and pmu_metrics_table_for_each_metric.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:43:19 -03:00
Ian Rogers
c3245d2093 perf pmu: Abstract alias/event struct
In order to be able to lazily compute aliases/events for a PMU, move
the struct perf_pmu_alias into pmu.c.

Add perf_pmu__find_event and perf_pmu__for_each_event that take a
callback that is called for the found event or for each event.

The layout of struct pmu and the event/alias list is unchanged but the
API is altered so that aliases are no longer directly accessed, allowing
for later changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:42:46 -03:00
Ian Rogers
5040264121 perf pmu: Make the loading of formats lazy
The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.

Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.

For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.

Before:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
    Average PMU scanning took: 4725.300 usec (+- 260.599 usec)

After:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
    Average PMU scanning took: 4520.960 usec (+- 251.272 usec)

Committer testing:

On a AMD Ryzen 5950x:

Before:

  $ perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
    Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
    Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
    Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
    Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
    Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
    Average PMU scanning took: 1623.213 usec (+- 23.998 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3276.82 msec task-clock:u                     #    0.990 CPUs utilized               ( +-  0.82% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1008      page-faults:u                    #  307.615 /sec                        ( +-  0.04% )
         12049614778      cycles:u                         #    3.677 GHz                         ( +-  0.07% )  (83.34%)
           117507478      stalled-cycles-frontend:u        #    0.98% frontend cycles idle        ( +-  0.33% )  (83.32%)
            27106761      stalled-cycles-backend:u         #    0.22% backend cycles idle         ( +-  9.55% )  (83.36%)
         33294953848      instructions:u                   #    2.76  insn per cycle
                                                           #    0.00  stalled cycles per insn     ( +-  0.03% )  (83.31%)
          6849825049      branches:u                       #    2.090 G/sec                       ( +-  0.03% )  (83.37%)
            71533903      branch-misses:u                  #    1.04% of all branches             ( +-  0.20% )  (83.30%)

              3.3088 +- 0.0302 seconds time elapsed  ( +-  0.91% )

  $

After:

  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
    Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
    Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
    Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
    Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
    Average PMU scanning took: 1562.814 usec (+- 22.729 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3170.86 msec task-clock:u                     #    0.992 CPUs utilized               ( +-  0.22% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1010      page-faults:u                    #  318.526 /sec                        ( +-  0.04% )
         11890047674      cycles:u                         #    3.750 GHz                         ( +-  0.14% )  (83.27%)
           119090499      stalled-cycles-frontend:u        #    1.00% frontend cycles idle        ( +-  0.46% )  (83.40%)
            32502449      stalled-cycles-backend:u         #    0.27% backend cycles idle         ( +-  8.32% )  (83.30%)
         33119141261      instructions:u                   #    2.79  insn per cycle
                                                    #    0.00  stalled cycles per insn     ( +-  0.01% )  (83.37%)
          6812816561      branches:u                       #    2.149 G/sec                       ( +-  0.01% )  (83.29%)
            70157855      branch-misses:u                  #    1.03% of all branches             ( +-  0.28% )  (83.38%)

             3.19710 +- 0.00826 seconds time elapsed  ( +-  0.26% )

  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:38:04 -03:00
Ian Rogers
838a8c5f40 perf pmu: Pass PMU rather than aliases and format
Pass the pmu so the aliases and format list can be better abstracted
and later lazily loaded.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-23 14:27:58 -03:00
Ian Rogers
804fee5d0f perf pmu: Avoid passing format list to perf_pmu__config_terms()
Abstract the format list better, hiding it in the PMU, by changing
perf_pmu__config_terms() the PMU rather than the format list in the PMU.

Change the PMU test to pass a dummy PMU for this purpose. Changing the
test allows perf_pmu__del_formats() to become static.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-23 08:49:35 -03:00
Alexandre Ghiti
10da1b8ed7 perf tests mmap-basic: Adapt for riscv
riscv now supports mmaping hardware counters to userspace so adapt the
test to run on this architecture.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Rémi Denis-Courmont <remi@remlab.net>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20230802080328.1213905-11-alexghiti@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-16 08:52:47 -03:00
Adrian Hunter
82b0a10390 perf dlfilter: Add al_cleanup()
Add perf_dlfilter_fns.al_cleanup() to do addr_location__exit() on data
passed via perf_dlfilter_fns.resolve_address().

Add dlfilter-test-api-v2 to the "dlfilter C API" test to test it.

Update documentation, clarifying that data returned by APIs should not
be dereferenced after filter_event() and filter_event_early() return.

Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230731091857.10681-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:49 -03:00
Namhyung Kim
9575ecdd19 perf test: Add perf record sample filtering test
$ sudo ./perf test 'sample filter' -v
   94: perf record sample filtering (by BPF) tests                     :
  --- start ---
  test child forked, pid 3817527
  Checking BPF-filter privilege
  Basic bpf-filter test
  Basic bpf-filter test [Success]
  Failing bpf-filter test
  Error: task-clock event does not have PERF_SAMPLE_CPU
  Failing bpf-filter test [Success]
  Group bpf-filter test
  Error: task-clock event does not have PERF_SAMPLE_CPU
  Error: task-clock event does not have PERF_SAMPLE_CODE_PAGE_SIZE
  Group bpf-filter test [Success]
  test child finished with 0
  ---- end ----
  perf record sample filtering (by BPF) tests: Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230811025822.3859771-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:48 -03:00
Ian Rogers
3d6dfae889 perf parse-events: Remove BPF event support
New features like the BPF --filter support in perf record have made the
BPF event functionality somewhat redundant. As shown by commit
fcb027c1a4f6 ("perf tools: Revert enable indices setting syntax for BPF
map") and commit 14e4b9f4289a ("perf trace: Raw augmented syscalls fix
libbpf 1.0+ compatibility") the BPF event support hasn't been well
maintained and it adds considerable complexity in areas like event
parsing, not least as '/' is a separator for event modifiers as well as
in paths.

This patch removes support in the event parser for BPF events and then
the associated functions are removed. This leads to the removal of whole
source files like bpf-loader.c.  Removing support means that augmented
syscalls in perf trace is broken, this will be fixed in a later commit
adding support using BPF skeletons.

The removal of BPF events causes an unused label warning from flex
generated code, so update build to ignore it:

  ```
  util/parse-events-flex.c:2704:1: error: label ‘find_rule’ defined but not used [-Werror=unused-label]
  2704 | find_rule: /* we branch to this label when backing up */
  ```

Committer notes:

Extracted from a larger patch that was also removing the support for
linking with libllvm and libclang, that were an alternative to using an
external clang execution to compile the .c event source code into BPF
bytecode.

Testing it:

  # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
  event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c'
                        \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'home'

  Initial error:
  event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c'
                        \___ Cannot find PMU `home'. Missing kernel support?
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230810184853.2860737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:48 -03:00
Ian Rogers
56b11a2126 perf bpf: Remove support for embedding clang for compiling BPF events (-e foo.c)
This never was in the default build for perf, is difficult to maintain
as it uses clang/llvm internals so ditch it, keeping, for now, the
external compilation of .c BPF into .o bytecode and its subsequent
loading, that is also going to be removed, do it separately to help
bisection and to properly document what is being removed and why.

Committer notes:

Extracted from a larger patch and removed some leftovers, namely
deleting these now unused feature tests:

    tools/build/feature/test-clang.cpp
    tools/build/feature/test-cxx.cpp
    tools/build/feature/test-llvm-version.cpp
    tools/build/feature/test-llvm.cpp

Testing the use of BPF events after applying this patch:

To use the external clang/llvm toolchain to compile a .c event and then
use libbpf to load it, to get the syscalls:sys_enter_open* tracepoints
and read the filename pointer, putting it into the ring buffer right
after the usual tracepoint payload for 'perf trace' to then print it:

  [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.c,open* --max-events=10
     0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12
     0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
     0.063 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
     0.082 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
   250.124 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12
   250.521 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.pressure", flags: RDONLY|CLOEXEC) = 12
   251.047 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.current", flags: RDONLY|CLOEXEC) = 12
   251.162 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.min", flags: RDONLY|CLOEXEC) = 12
   251.242 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.low", flags: RDONLY|CLOEXEC) = 12
   251.353 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/memory.swap.current", flags: RDONLY|CLOEXEC) = 12
  [root@quaco ~]#

Same thing, but with a prebuilt .o BPF bytecode:

  [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.o,open* --max-events=10
     0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12
     0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
     0.083 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
     0.062 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4
   249.985 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12
   466.763 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/energy_uj") = 13
   467.145 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj") = 13
   467.311 thermald/1234 openat(dfd: CWD, filename: "/sys/class/thermal/thermal_zone2/temp") = 13
   500.040 cgroupify/24006 openat(dfd: 4, filename: ".", flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 5
   500.295 cgroupify/24006 openat(dfd: 4, filename: "24616/cgroup.procs") = 5
  [root@quaco ~]#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/lkml/ZNZWsAXg2px1sm2h@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:48 -03:00