- Fix two boot issues caused by the recent head.S rework when !KASLR
- Fix calculation of crashkernel memory reservation
- Fix bogus error check in PMU IRQ probing code
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmMRwE8QHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNP6lB/4m0NNn4PQJ5FmTBuNurwB+CvZyC/pl/exL
reXG0hwI+ApX1xj6aD7tZ3tB+mi6EG0+3PBLuJ4KuSNecwwT/au5ZxAipsc+IRs0
EI4loeItzTaevaVTyydsTMgXYAJlCHlMDxEI0QIW3/datH10bZ1GzhEDL/Ebflfu
Mo4g0/+eysGmSHtj7kL5IU4bE+CCXnXycphgCLAn0qcdxuF/iZphu/GgB4Az6gSS
9ks3HJkO+X1UVb8qK0prVDsalX2z5tTWbxQ7XACJ4KEVEiSqh7ei3B283cF+Zeip
3PSg6gkPC5YZmLjv431B/pAycsx/x8KeXY0yDAadTRmOQYROMm0j
=a7xW
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"It's a lot smaller than last week, with the star of the show being a
couple of fixes to head.S addressing a boot regression introduced by
the recent overhaul of that code in non-default configurations (i.e.
KASLR disabled).
The first of those two resolves the issue reported (and bisected) by
Mikulus in the wait_on_bit() thread.
Summary:
- Fix two boot issues caused by the recent head.S rework when !KASLR
- Fix calculation of crashkernel memory reservation
- Fix bogus error check in PMU IRQ probing code"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Reserve enough pages for the initial ID map
perf/arm_pmu_platform: fix tests for platform_get_irq() failure
arm64: head: Ignore bogus KASLR displacement on non-relocatable kernels
arm64/kexec: Fix missing extra range for crashkres_low.
The parent field in struct acpi_device is, in fact, redundant,
because the dev.parent field in it effectively points to the same
object and it is used by the driver core.
Accordingly, the parent field can be dropped from struct acpi_device
and for this purpose define acpi_dev_parent() to retrieve a parent
struct acpi_device pointer from the dev.parent field in struct
acpi_device. Next, update all of the users of the parent field
in struct acpi_device to use acpi_dev_parent() instead of it and
drop it.
While at it, drop the ACPI_IS_ROOT_DEVICE() macro that is only used
in one place in a confusing way.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com>
Counter info encoding format is defined by the SBI specificaiton.
KVM implementation of SBI PMU extension will also leverage this definition.
Move the definition to common sbi header file from the sbi pmu driver.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220711174632.4186047-5-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Some of the SBI PMU calls does not pass 64bit arguments
correctly and not under RV32 compile time flags. Currently,
this doesn't create any incorrect results as RV64 ignores
any value in the additional register and qemu doesn't support
raw events.
Fix those SBI calls in order to set correct values for RV32.
Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220711174632.4186047-4-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Currently, riscv_pmu_event_set_period updates the userpage mapping.
However, the caller of riscv_pmu_event_set_period should update
the userpage mapping because the counter can not be updated/started
from set_period function in counter overflow path.
Invoke the perf_event_update_userpage at the caller so that it
doesn't get invoked twice during counter start path.
Fixes: f5bfa23f57 ("RISC-V: Add a perf core library for pmu drivers")
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220711174632.4186047-3-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The arm_spe_pmu driver will enable SYS_PMSCR_EL1.CX in order to add CONTEXT
packets into the traces, if the owner of the perf event runs with required
capabilities i.e CAP_PERFMON or CAP_SYS_ADMIN via perfmon_capable() helper.
The value of this bit is computed in the arm_spe_event_to_pmscr() function
but the check for capabilities happens in the pmu event init callback i.e
arm_spe_pmu_event_init(). This suggests that the value of the CX bit should
remain consistent for the duration of the perf session.
However, the function arm_spe_event_to_pmscr() may be called later during
the event start callback i.e arm_spe_pmu_start() when the "current" process
is not the owner of the perf session, hence the CX bit setting is currently
not consistent.
One way to fix this, is by caching the required value of the CX bit during
the initialization of the PMU event, so that it remains consistent for the
duration of the session. It uses currently unused 'event->hw.flags' element
to cache perfmon_capable() value, which can be referred during event start
callback to compute SYS_PMSCR_EL1.CX. This ensures consistent availability
of context packets in the trace as per event owner capabilities.
Drop BIT(SYS_PMSCR_EL1_CX_SHIFT) check in arm_spe_pmu_event_init(), because
now CX bit cannot be set in arm_spe_event_to_pmscr() with perfmon_capable()
disabled.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Fixes: d5d9696b03 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Reported-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220714061302.2715102-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In pmu_sbi_setup_irqs(), we should call of_node_put() for the 'cpu'
when breaking out of for_each_of_cput_node() as its refcount will
be automatically increased and decreased during the iteration.
Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20220715130330.443363-1-windhl@126.com
Signed-off-by: Will Deacon <will@kernel.org>
HNS3(HiSilicon Network System 3) PMU is RCiEP device in HiSilicon SoC NIC,
supports collection of performance statistics such as bandwidth, latency,
packet rate and interrupt rate.
NIC of each SICL has one PMU device for it. Driver registers each PMU
device to perf, and exports information of supported events, filter mode of
each event, bdf range, hardware clock frequency, identifier and so on via
sysfs.
Each PMU device has its own registers of control, counters and interrupt,
and it supports 8 hardware events, each hardward event has its own
registers for configuration, counters and interrupt.
Filter options contains:
config - select event
port - select physical port of nic
tc - select tc(must be used with port)
func - select PF/VF
queue - select queue of PF/VF(must be used with func)
intr - select interrupt number(must be used with func)
global - select all functions of IO DIE
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220628063419.38514-3-huangguangbin2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Update driver to export formatting and event information to sysfs so it
can be used by the perf user space tools with the syntaxes:
perf stat -e cpu/event=0x05
perf stat -e cpu/event=0x05,firmware=0x1/
63-bit is used to distinguish hardware events from firmware. Firmware
events are defined by "RISC-V Supervisor Binary Interface
Specification".
perf stat -e cpu/event=0x05,firmware=0x1/
is equivalent to
perf stat -e r8000000000000005
Suggested-by: João Mário Domingos <joao.mario@tecnico.ulisboa.pt>
Signed-off-by: Nikita Shubin <n.shubin@yadro.com>
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc
Link: https://lore.kernel.org/r/20220628114625.166665-2-nikita.shubin@maquefel.me
Signed-off-by: Will Deacon <will@kernel.org>
Currently, when the CPU is doing suspend to ram, we don't
save pmu counter register and its content will be lost.
To ensure perf profiling is not affected by suspend to ram,
this patch is based on arm_pmu CPU_PM notifier and implements riscv
pmu pm notifier. In the pm notifier, we stop the counter and update
the counter value before suspend and start the counter after resume.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Link: https://lore.kernel.org/r/20220705091920.27432-1-eric.lin@sifive.com
Signed-off-by: Will Deacon <will@kernel.org>
The existing offset of TAD_PRF and TAD_PFC registers are incorrect.
Hence, fix with the right register offsets.
Also, drop read of TAD_PRF register in tad_pmu_event_counter_start()
since we don't have to preserve any bit fields and always write
an updated value.
Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20220614171356.773967-1-tanmay@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
- Initial support for the ARMv9 Scalable Matrix Extension (SME). SME
takes the approach used for vectors in SVE and extends this to provide
architectural support for matrix operations. No KVM support yet, SME
is disabled in guests.
- Support for crashkernel reservations above ZONE_DMA via the
'crashkernel=X,high' command line option.
- btrfs search_ioctl() fix for live-lock with sub-page faults.
- arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring
coherent I/O traffic, support for Arm's CMN-650 and CMN-700
interconnect PMUs, minor driver fixes, kerneldoc cleanup.
- Kselftest updates for SME, BTI, MTE.
- Automatic generation of the system register macros from a 'sysreg'
file describing the register bitfields.
- Update the type of the function argument holding the ESR_ELx register
value to unsigned long to match the architecture register size
(originally 32-bit but extended since ARMv8.0).
- stacktrace cleanups.
- ftrace cleanups.
- Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
avoid executable mappings in kexec/hibernate code, drop TLB flushing
from get_clear_flush() (and rename it to get_clear_contig()),
ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmKH19IACgkQa9axLQDI
XvEFWg//bf0p6zjeNaOJmBbyVFsXsVyYiEaLUpFPUs3oB+81s2YZ+9i1rgMrNCft
EIDQ9+/HgScKxJxnzWf68heMdcBDbk76VJtLALExbge6owFsjByQDyfb/b3v/bLd
ezAcGzc6G5/FlI1IP7ct4Z9MnQry4v5AG8lMNAHjnf6GlBS/tYNAqpmj8HpQfgRQ
ZbhfZ8Ayu3TRSLWL39NHVevpmxQm/bGcpP3Q9TtjUqg0r1FQ5sK/LCqOksueIAzT
UOgUVYWSFwTpLEqbYitVqgERQp9LiLoK5RmNYCIEydfGM7+qmgoxofSq5e2hQtH2
SZM1XilzsZctRbBbhMit1qDBqMlr/XAy/R5FO0GauETVKTaBhgtj6mZGyeC9nU/+
RGDljaArbrOzRwMtSuXF+Fp6uVo5spyRn1m8UT/k19lUTdrV9z6EX5Fzuc4Mnhed
oz4iokbl/n8pDObXKauQspPA46QpxUYhrAs10B/ELc3yyp/Qj3jOfzYHKDNFCUOq
HC9mU+YiO9g2TbYgCrrFM6Dah2E8fU6/cR0ZPMeMgWK4tKa+6JMEINYEwak9e7M+
8lZnvu3ntxiJLN+PrPkiPyG+XBh2sux1UfvNQ+nw4Oi9xaydeX7PCbQVWmzTFmHD
q7UPQ8220e2JNCha9pULS8cxDLxiSksce06DQrGXwnHc1Ir7T04=
=0DjE
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Initial support for the ARMv9 Scalable Matrix Extension (SME).
SME takes the approach used for vectors in SVE and extends this to
provide architectural support for matrix operations. No KVM support
yet, SME is disabled in guests.
- Support for crashkernel reservations above ZONE_DMA via the
'crashkernel=X,high' command line option.
- btrfs search_ioctl() fix for live-lock with sub-page faults.
- arm64 perf updates: support for the Hisilicon "CPA" PMU for
monitoring coherent I/O traffic, support for Arm's CMN-650 and
CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup.
- Kselftest updates for SME, BTI, MTE.
- Automatic generation of the system register macros from a 'sysreg'
file describing the register bitfields.
- Update the type of the function argument holding the ESR_ELx register
value to unsigned long to match the architecture register size
(originally 32-bit but extended since ARMv8.0).
- stacktrace cleanups.
- ftrace cleanups.
- Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
avoid executable mappings in kexec/hibernate code, drop TLB flushing
from get_clear_flush() (and rename it to get_clear_contig()),
ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits)
arm64/sysreg: Generate definitions for FAR_ELx
arm64/sysreg: Generate definitions for DACR32_EL2
arm64/sysreg: Generate definitions for CSSELR_EL1
arm64/sysreg: Generate definitions for CPACR_ELx
arm64/sysreg: Generate definitions for CONTEXTIDR_ELx
arm64/sysreg: Generate definitions for CLIDR_EL1
arm64/sve: Move sve_free() into SVE code section
arm64: Kconfig.platforms: Add comments
arm64: Kconfig: Fix indentation and add comments
arm64: mm: avoid writable executable mappings in kexec/hibernate code
arm64: lds: move special code sections out of kernel exec segment
arm64/hugetlb: Implement arm64 specific huge_ptep_get()
arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
arm64: kdump: Do not allocate crash low memory if not needed
arm64/sve: Generate ZCR definitions
arm64/sme: Generate defintions for SVCR
arm64/sme: Generate SMPRI_EL1 definitions
arm64/sme: Automatically generate SMPRIMAP_EL2 definitions
arm64/sme: Automatically generate SMIDR_EL1 defines
arm64/sme: Automatically generate defines for SMCR
...
The debugfs code is lazy, and since it only keeps the bottom byte of
each connect_info register to save space, it also treats the whole thing
as the device_type since the other bits were reserved anyway. Upon
closer inspection, though, this is no longer true on newer IP versions,
so let's be good and decode the exact field properly. This should help
it not get confused when a Component Aggregation Layer is present (which
is already implied if Node IDs are found for both device addresses
represented by the next two lines of the table).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6a13a6128a28cfe2eec6d09cf372a167ec9c3b65.1652274773.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Carefully considering the bounds of an array is all well and good,
until you forget that that array also contains a NULL sentinel at
the end and dereference it. So close...
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/bebba768156aa3c0757140457bdd0fec10819388.1652217788.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Make sure to check the pmu type first and then check event->attr.disabled.
Doing so would avoid reading the disabled attribute of an event that is
not handled by TAD PMU.
Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
Link: https://lore.kernel.org/r/20220510102657.487539-1-tanmay@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
On HiSilicon Hip09 platform, there is a CPA (Coherency Protocol Agent) on
each SICL (Super IO Cluster) which implements packet format translation,
route parsing and traffic statistics.
CPA PMU has 8 PMU counters and interrupt is supported to handle counter
overflow. Let's support its driver under the framework of HiSilicon PMU
driver.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220415102352.6665-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
If a PMU is in a SICL (Super IO cluster), it is not appropriate to
associate this PMU with a CPU die. So we associate it with all CPUs
online, rather than CPUs in the nearest SCCL.
As the firmware of Hip09 platform hasn't been published yet, change
of PMU driver will not influence backwards compatibility between
driver and firmware.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220415102352.6665-2-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
In order to acquire more accurate latency, Armv8.8[1] has defined the
CountSize field to 16-bit saturating counters when it's 0b0011.
Let's support this new feature and expose its to user under sysfs.
[1] https://developer.arm.com/documentation/ddi0487/latest
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20220429063307.63251-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Add the identifiers, events, and subtleties for CMN-700. Highlights
include yet more options for doubling up CHI channels, which finally
grows event IDs beyond 8 bits for XPs, and a new set of CML gateway
nodes adding support for CXL as well as CCIX, where the Link Agent is
now internal to the CMN mesh so we gain regular PMU events for that too.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/cf892baa0d0258ea6cd6544b15171be0069a083a.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
So far, DNs and HN-Fs have each had one event ralated to occupancy
trackers which are filtered by a separate field. CMN-700 raises the
stakes by introducing two more sets of HN-F events with corresponding
additional filter fields. Prepare for this by refactoring our filter
selection and tracking logic to account for multiple filter types
coexisting on the same node. This need not affect the uAPI, which can
just continue to encode any per-event filter setting in the "occupid"
config field, even if it's technically not the most accurate name for
some of them.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1aa47ba0455b144c416537f6b0e58dc93b467a00.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add the identifiers and events for CMN-650, which slots into its
evolutionary position between CMN-600 and the 700-series products.
Imagine CMN-600 made bigger, and with most of the rough edges smoothed
off, but that then balanced out by some bonkers PMU functionality for
the new HN-P enhancement in CMN-650r2.
Most of the CXG events are actually common to newer revisions of CMN-600
too, so they're arguably a little late; oh well.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b0adc5824db53f71a2b561c293e2120390106536.1650320598.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
This will presumably trip up some tools that try to parse the comments
as kernel doc when they're not.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 4905ec2fb7 ("RISC-V: Add sscofpmf extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
--
These recently landed in for-next, but I'm trying to avoid rewriting
history as there's a lot in flight right now.
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220322220147.11407-1-palmer@rivosinc.com
Signed-off-by: Will Deacon <will@kernel.org>
In the case where there is only a cycle counter available (i.e.
PMCR_EL0.N is 0) and an event other than CPU cycles is opened, the open
should fail as the event can never possibly be scheduled. However, the
event validation when an event is opened is skipped when the group
leader is opened. Fix this by always validating the group leader events.
Reported-by: Al Grant <al.grant@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220408203330.4014015-1-robh@kernel.org
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Fix:
In file included from <command-line>:0:0:
In function ‘ddr_perf_counter_enable’,
inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2:
././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \
declared with attribute error: FIELD_PREP: mask is not constant
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
...
See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de
Signed-off-by: Will Deacon <will@kernel.org>
The Marvell CN10K DRAM Subsystem (DSS) performance monitor is only
present on Marvell CN10K SoCs. Hence add a dependency on ARCH_THUNDER,
to prevent asking the user about this driver when configuring a kernel
without Cavium Thunder (incl. Marvell CN10K) SoC support,
Fixes: 68fa55f0e0 ("perf/marvell: cn10k DDR perf event core ownership")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/18bfd6e1bcf67db7ea656d684a8bbb68261eeb54.1648559364.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
The bug is here:
return cluster;
The list iterator value 'cluster' will *always* be set and non-NULL
by list_for_each_entry(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
is found.
To fix the bug, return 'cluster' when found, otherwise return NULL.
Cc: stable@vger.kernel.org
Fixes: 21bdbb7102 ("perf: add qcom l2 cache perf events driver")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Link: https://lore.kernel.org/r/20220327055733.4070-1-xiam0nd.tong@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
* Support for Sv57-based virtual memory.
* Various improvements for the MicroChip PolarFire SOC and the
associated Icicle dev board, which should allow upstream kernels to
boot without any additional modifications.
* An improved memmove() implementation.
* Support for the new Ssconfpmf and SBI PMU extensions, which allows for
a much more useful perf implementation on RISC-V systems.
* Support for restartable sequences.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmI96FcTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiQBFD/425+6xmoOru6Wiki3Ja0fqQToNrQyW
IbmE/8AxUP7UxMvJSNzvQm8deXgklzvmegXCtnjwZZins971vMzzDSI83k/zn8I7
m5thVC9z01BjodV+pvIp/44hS6FesolOLzkVHksX0Zh6h0iidrc34Qf5HrqvvNfN
CZ/4K1+E9ig5r9qZp4WdvocCXj+FzwF/30GjKoW9vwA599CEG/dCo+TNN9GKD6XS
k+xiUGwlIRA+kCLSPFCi7ev9XPr1tCmQB7uB8Igcvr7Y3mWl8HKfajQVXBnXNRC3
ifbDxpx1elJiLPyf7Rza8jIDwDhLQdxBiwPgDgP9h9R4x0uF4efq8PzLzFlFmaE+
9Z9thfykBb5dXYDFDje9bAOXvKnGk7Iqxdsz0qWo/ChEQawX1+11bJb0TNN8QTT9
YvlQfUXgb1dmEcj5yG2uVE1Y8L7YNLRMsZU3W3FbmPJZoavSOuU4x0yCGeLyv597
76af3nuBJ5v80Db97gu6St+HIACeevKflsZUf/8GS/p7d1DlvmrWzQUMEycxPTG9
UZpZak58jh7AqQ9JbLnavhwmeacY50vpZOw6QHGAHSN+8daCPlOHDG7Ver7Z+kNj
+srJ7iKMvLnnaEjGNgavfxdqTOme1gv4LWs/JdHYMkpphqVN92xBDJnhXTPRVZiQ
0x39vK86qtB46A==
=Omc6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for Sv57-based virtual memory.
- Various improvements for the MicroChip PolarFire SOC and the
associated Icicle dev board, which should allow upstream kernels to
boot without any additional modifications.
- An improved memmove() implementation.
- Support for the new Ssconfpmf and SBI PMU extensions, which allows
for a much more useful perf implementation on RISC-V systems.
- Support for restartable sequences.
* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
rseq/selftests: Add support for RISC-V
RISC-V: Add support for restartable sequence
MAINTAINERS: Add entry for RISC-V PMU drivers
Documentation: riscv: Remove the old documentation
RISC-V: Add sscofpmf extension support
RISC-V: Add perf platform driver based on SBI PMU extension
RISC-V: Add RISC-V SBI PMU extension definitions
RISC-V: Add a simple platform driver for RISC-V legacy perf
RISC-V: Add a perf core library for pmu drivers
RISC-V: Add CSR encodings for all HPMCOUNTERS
RISC-V: Remove the current perf implementation
RISC-V: Improve /proc/cpuinfo output for ISA extensions
RISC-V: Do no continue isa string parsing without correct XLEN
RISC-V: Implement multi-letter ISA extension probing framework
RISC-V: Extract multi-letter extension names from "riscv, isa"
RISC-V: Minimal parser for "riscv, isa" strings
RISC-V: Correctly print supported extensions
riscv: Fixed misaligned memory access. Fixed pointer comparison.
MAINTAINERS: update riscv/microchip entry
riscv: dts: microchip: add new peripherals to icicle kit device tree
...
Including:
- IOMMU Core changes:
- Removal of aux domain related code as it is basically dead
and will be replaced by iommu-fd framework
- Split of iommu_ops to carry domain-specific call-backs
separatly
- Cleanup to remove useless ops->capable implementations
- Improve 32-bit free space estimate in iova allocator
- Intel VT-d updates:
- Various cleanups of the driver
- Support for ATS of SoC-integrated devices listed in
ACPI/SATC table
- ARM SMMU updates:
- Fix SMMUv3 soft lockup during continuous stream of events
- Fix error path for Qualcomm SMMU probe()
- Rework SMMU IRQ setup to prepare the ground for PMU support
- Minor cleanups and refactoring
- AMD IOMMU driver:
- Some minor cleanups and error-handling fixes
- Rockchip IOMMU driver:
- Use standard driver registration
- MSM IOMMU driver:
- Minor cleanup and change to standard driver registration
- Mediatek IOMMU driver:
- Fixes for IOTLB flushing logic
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmI8OUkACgkQK/BELZcB
GuNz9xAAvlgEg3byMx1y6LY9IctVGyLsegweVGM4+m6XR7qvT5Llc1E2Yw4Gooe4
EAceOihDKW2T9VnMlz9g/cG7Modrx60chcB22KKfxDXPl6yF3R89EMd7DE43T6n/
KPrP9+EsBnI8QSXyYu9ZowioX4CYwWhWD0dKHKAwDvw0BWHHUJ4hTaoHqEoIqLdP
vubeHziIok/g1sylSpJjTzV7r/Na8Q3TGcb/Mi5qC8uiyiyx40vtaduMGNW+/ToN
EqOKszxPmHfHv/xf0IHo0eUZ2L/JAe0mAlZzOb09f5F2sXJrbC05jlmRaDmSjT+u
iEc1r2By/0xo6iOuQC3wD6kTvwwO/ecpNYGhXYXdTbtLquYfL5PSXjRHEU9gf2BO
i/llPDsnytPvm/hnmbi26ChNR6yrDPz5bkoCUl5mnB1jZcaZtIURN7cRlEPPZUWo
62VDNdqWDB6AvALc1/SwYdJX/i5eaBf+niS7/BJ/IkLp2oJxFzrGsU8SRJFHNYsa
zdFIUUoTw647Ul6derSpGzHow169/RwVKYPiXMsaA8/viPNjpBOtfg56abn1WLW6
N4CtwNu6tt+sPfftFdFx2cDEMW2zpWg5doMddBfEx9FAk0HJ4WLZiTpaO2PxcLyd
kCAsGHj+ViAZHINVKFV4nQN/V9yQtcIc4UPmSGJBtKCK3KUYujw=
=bcqr
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- IOMMU Core changes:
- Removal of aux domain related code as it is basically dead and
will be replaced by iommu-fd framework
- Split of iommu_ops to carry domain-specific call-backs separatly
- Cleanup to remove useless ops->capable implementations
- Improve 32-bit free space estimate in iova allocator
- Intel VT-d updates:
- Various cleanups of the driver
- Support for ATS of SoC-integrated devices listed in ACPI/SATC
table
- ARM SMMU updates:
- Fix SMMUv3 soft lockup during continuous stream of events
- Fix error path for Qualcomm SMMU probe()
- Rework SMMU IRQ setup to prepare the ground for PMU support
- Minor cleanups and refactoring
- AMD IOMMU driver:
- Some minor cleanups and error-handling fixes
- Rockchip IOMMU driver:
- Use standard driver registration
- MSM IOMMU driver:
- Minor cleanup and change to standard driver registration
- Mediatek IOMMU driver:
- Fixes for IOTLB flushing logic
* tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits)
iommu/amd: Improve amd_iommu_v2_exit()
iommu/amd: Remove unused struct fault.devid
iommu/amd: Clean up function declarations
iommu/amd: Call memunmap in error path
iommu/arm-smmu: Account for PMU interrupts
iommu/vt-d: Enable ATS for the devices in SATC table
iommu/vt-d: Remove unused function intel_svm_capable()
iommu/vt-d: Add missing "__init" for rmrr_sanity_check()
iommu/vt-d: Move intel_iommu_ops to header file
iommu/vt-d: Fix indentation of goto labels
iommu/vt-d: Remove unnecessary prototypes
iommu/vt-d: Remove unnecessary includes
iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO
iommu/vt-d: Remove domain and devinfo mempool
iommu/vt-d: Remove iova_cache_get/put()
iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()
iommu/vt-d: Remove intel_iommu::domains
iommu/mediatek: Always tlb_flush_all when each PM resume
iommu/mediatek: Add tlb_lock in tlb_flush_all
iommu/mediatek: Remove the power status checking in tlb flush all
...
The sscofpmf extension allows counter overflow and filtering for
programmable counters. Enable the perf driver to handle the overflow
interrupt. The overflow interrupt is a hart local interrupt.
Thus, per cpu overflow interrupts are setup as a child under the root
INTC irq domain.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
RISC-V SBI specification added a PMU extension that allows to configure
start/stop any pmu counter. The RISC-V perf can use most of the generic
perf features except interrupt overflow and event filtering based on
privilege mode which will be added in future.
It also allows to monitor a handful of firmware counters that can provide
insights into firmware activity during a performance analysis.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The old RISC-V perf implementation allowed counting of only
cycle/instruction counters using perf. Restore that feature by implementing
a simple platform driver under a separate config to provide backward
compatibility. Any existing software stack will continue to work as it is.
However, it provides an easy way out in future where we can remove the
legacy driver.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Implement a perf core library that can support all the essential perf
features in future. It can also accommodate any type of PMU implementation
in future. Currently, both SBI based perf driver and legacy driver
implemented uses the library. Most of the common perf functionalities
are kept in this core library wile PMU specific driver can implement PMU
specific features. For example, the SBI specific functionality will be
implemented in the SBI specific driver.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When compiling the Marvell CN10K DDR PMU driver with CONFIG_OF=n, the
build fails:
| drivers/perf/marvell_cn10k_ddr_pmu.c:723:35: error: 'cn10k_ddr_pmu_of_match' undeclared here (not in a function); did you mean 'cn10k_ddr_pmu_driver'?
Use `of_match_ptr()` to avoid referencing the non-existent match table
in this configuration.
Link: https://lore.kernel.org/r/202203091424.Vfe8J4W9-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
Support for the CPU PMUs on the Apple M1.
* for-next/perf-m1:
drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
drivers/perf: arm_pmu: Handle 47 bit counters
irqchip/apple-aic: Move PMU-specific registers to their own include file
arm64: dts: apple: Add t8303 PMU nodes
arm64: dts: apple: Add t8103 PMU interrupt affinities
irqchip/apple-aic: Wire PMU interrupts
irqchip/apple-aic: Parse FIQ affinities from device-tree
dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts
dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts
dt-bindings: arm-pmu: Document Apple PMU compatible strings
Add a new, weird and wonderful driver for the equally weird Apple
PMU HW. Although the PMU itself is functional, we don't know much
about the events yet, so this can be considered as yet another
random number generator...
Nonetheless, it can reliably count at least cycles and instructions
in the usually wonky big-little way. For anything else, it of course
supports raw event numbers.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
The current ARM PMU framework can only deal with 32 or 64bit counters.
Teach it about a 47bit flavour.
Yes, this is odd.
Reviewed-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
As DDR perf event counters are not per core, so they should be accessed
only by one core at a time. Select new core when previously owning core
is going offline.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-5-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
CN10k DSS h/w perfmon does not support event overflow interrupt, so
periodic timer is being used. Each event counter is 48bit, which in worst
case scenario can increment at maximum 5.6 GT/s. At this rate it may take
many hours to overflow these counters. Therefore polling period for
overflow is set to 100 sec, which can be changed using sysfs parameter.
Two fixed event counters starts counting from zero on overflow, so
overflow condition is when new count less than previous count. While
eight programmable event counters freezes at maximum value. Also individual
counter cannot be restarted, so need to restart all eight counters.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-4-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
Marvell CN10k DRAM Subsystem (DSS) supports eight event counters for
monitoring performance and software can program each counter to monitor
any of the defined performance event. Performance events are for
interface between the DDR controller and the PHY, interface between the
DDR Controller and the CHI interconnect, or within the DDR Controller.
Additionally DSS also supports two fixed performance event counters, one
for number of ddr reads and other for ddr writes.
This patch add basic support for these performance monitoring events
on CN10k.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-3-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
From CMN-650 onwards, some of the fields in the watchpoint config
registers moved subtly enough to easily overlook. Watchpoint events are
still only partially supported on newer IPs - which in itself deserves
noting - but were not intended to become any *less* functional than on
CMN-600.
Fixes: 60d1504070 ("perf/arm-cmn: Support new IP features")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e1ce4c2f1e4f73ab1c60c3a85e4037cd62dd6352.1645727871.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
While in this particular case it would not be a (critical) issue,
the pattern itself is bad and error prone in case somebody blindly
copies to their code.
Don't cast parameter to unsigned long pointer in the bit operations.
Instead copy to a local variable on stack of a proper type and use.
Note, new compilers might warn on this line for potential outbound access.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220209184758.56578-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
In some places, drivers/perf code calls bitmap_weight() to check if any
bit of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220210224933.379149-13-yury.norov@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Replace acpi_bus_get_device() that is going to be dropped with
acpi_fetch_acpi_dev().
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/10025610.nUPlyArG6x@kreacher
Signed-off-by: Will Deacon <will@kernel.org>
The kbuild helpfully reports that the Marvell CN10K TAD PMU driver emits
a warning when building with W=1 and CONFIG_OF=n:
| >> drivers/perf/marvell_cn10k_tad_pmu.c:371:34: warning: unused variable 'tad_pmu_of_match' [-Wunused-const-variable]
static const struct of_device_id tad_pmu_of_match[] = {
Guard the match table with CONFIG_OF to squash the warning.
Link: https://lore.kernel.org/r/202201292349.zRQLcDDD-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
The Marvell CN10K Last-Level cache Tag-and-data Units (LLC-TAD)
performance monitor is only present on Marvell CN10K SoCs. Hence add a
dependency on ARCH_THUNDER, to prevent asking the user about this driver
when configuring a kernel without Cavium Thunder (incl. Marvell CN10K)
SoC support.
Fixes: 036a7584be ("drivers: perf: Add LLC-TAD perf counter support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/b4662a2c767d04cca19417e0c845edea2da262ad.1641995941.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.
In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().
Link: https://lore.kernel.org/r/20211224161334.31123-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Will Deacon <will@kernel.org>
Treewide cleanup and consolidation of MSI interrupt handling in
preparation for further changes in this area which are necessary to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmHf+SETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYobzGD/wNEFl5qQo5mNZ9thP6JSJFOItm7zMc
2QgzCYOqNwAv4jL6Dqo+EHtbShYqDyWzKdKccgqNjmdIqgW8q7/fubN1OPzRsClV
CZG997AsXDGXYlQcE3tXZjkeCWnWEE2AGLnygSkFV1K/r9ALAtFfTBJAWB+UD+Zc
1P8Kxo0q0Jg+DQAMAA5bWfSSjo/Pmpr/1AFjY7+GA8BBeJJgWOyW7H1S+GYEWVOE
RaQP81Sbd6x1JkopxkNqSJ/lbNJfnPJxi2higB56Y0OYn5CuSarYbZUM7oQ2V61t
jN7pcEEvTpjLd6SJ93ry8WOcJVMTbccCklVfD0AfEwwGUGw2VM6fSyNrZfnrosUN
tGBEO8eflBJzGTAwSkz1EhiGKna4o1NBDWpr0sH2iUiZC5G6V2hUDbM+0PQJhDa8
bICwguZElcUUPOprwjS0HXhymnxghTmNHyoEP1yxGoKLTrwIqkH/9KGustWkcBmM
hNtOCwQNqxcOHg/r3MN0KxttTASgoXgNnmFliAWA7XwseRpLWc95XPQFa5sptRhc
EzwumEz17EW1iI5/NyZQcY+jcZ9BdgCqgZ9ECjZkyN4U+9G6iACUkxVaHUUs77jl
a0ISSEHEvJisFOsOMYyFfeWkpIKGIKP/bpLOJEJ6kAdrUWFvlRGF3qlav3JldXQl
ypFjPapDeB5guw==
=vKzd
-----END PGP SIGNATURE-----
Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI irq updates from Thomas Gleixner:
"Rework of the MSI interrupt infrastructure.
This is a treewide cleanup and consolidation of MSI interrupt handling
in preparation for further changes in this area which are necessary
to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space"
* tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
genirq/msi: Populate sysfs entry only once
PCI/MSI: Unbreak pci_irq_get_affinity()
genirq/msi: Convert storage to xarray
genirq/msi: Simplify sysfs handling
genirq/msi: Add abuse prevention comment to msi header
genirq/msi: Mop up old interfaces
genirq/msi: Convert to new functions
genirq/msi: Make interrupt allocation less convoluted
platform-msi: Simplify platform device MSI code
platform-msi: Let core code handle MSI descriptors
bus: fsl-mc-msi: Simplify MSI descriptor handling
soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
NTB/msi: Convert to msi_on_each_desc()
PCI: hv: Rework MSI handling
powerpc/mpic_u3msi: Use msi_for_each-desc()
powerpc/fsl_msi: Use msi_for_each_desc()
powerpc/pasemi/msi: Convert to msi_on_each_dec()
powerpc/cell/axon_msi: Convert to msi_on_each_desc()
powerpc/4xx/hsta: Rework MSI handling
...
The devm_ioremap() function does not return error pointers. It returns
NULL.
Fixes: 036a7584be ("drivers: perf: Add LLC-TAD perf counter support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217145907.GA16611@kili
Signed-off-by: Will Deacon <will@kernel.org>
The kbuild robot reports that building the SMMUv3 PMU driver with
CONFIG_OF=n results in a warning for W=1 builds:
>> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable]
static const struct of_device_id smmu_pmu_of_match[] = {
^
Guard the match table with #ifdef CONFIG_OF.
Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com
Fixes: 3f7be43561 ("perf/smmuv3: Add devicetree support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
Let the core code fiddle with the MSI descriptor retrieval.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221815.029143589@linutronix.de
* for-next/perf-smmu:
perf/smmuv3: Synthesize IIDR from CoreSight ID registers
perf/smmuv3: Add devicetree support
dt-bindings: Add Arm SMMUv3 PMCG binding
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported
to sample bandwidth, latency, buffer occupation etc.
Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is
registered as a PMU in /sys/bus/event_source/devices, so users can
select target PMU, and use filter to do further sets.
Filtering options contains:
event - select the event.
port - select target Root Ports. Information of Root Ports are
shown under sysfs.
bdf - select requester_id of target EP device.
trig_len - set trigger condition for starting event statistics.
trig_mode - set trigger mode. 0 means starting to statistic when bigger
than trigger condition, and 1 means smaller.
thr_len - set threshold for statistics.
thr_mode - set threshold mode. 0 means count when bigger than threshold,
and 1 means smaller.
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
This driver adds support for Last-level cache tag-and-data unit
(LLC-TAD) PMU that is featured in some of the Marvell's CN10K
infrastructure silicons.
The LLC is divided into 2N slices distributed across N Mesh tiles
in a single-socket configuration. The driver always configures the
same counter for all of the TADs. The user would end up effectively
reserving one of eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of an application run. The driver does not provide a way
for the user to partition TADs so that different TADs are used for
different applications.
The event counters are zeroed to start event counting to avoid any
rollover issues. TAD perf counters are 64-bit, so it's not currently
possible to overflow event counters at current mesh and core
frequencies.
To measure tad pmu events use perf tool stat command. For instance:
perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application>
perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application>
Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
The SMMU_PMCG_IIDR register was not present in older revisions of the
Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of
fields from several PIDR registers, allowing us to present a
standardized identifier to userspace.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Add device-tree support to the SMMUv3 PMCG driver.
Signed-off-by: Jay Chen <jkchen@linux.alibaba.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
In general, detailed performance analysis will require knoweldge of the
the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc.
are connected to each node. However for certain development and bringup
tasks it can be useful to have a quick overview of the CMN internal
topology to hand too. Add a debugfs file to map this out.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The second generation of CMN IPs add new node types and significantly
expand the configuration space with options for extra device ports on
edge XPs, either plumbed into the regular DTM or with extra dedicated
DTMs to monitor them, plus larger (and smaller) mesh sizes. Add basic
support for pulling this new information out of the hardware, piping
it around as necessary, and handling (most of) the new choices.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e58b495bcc7deec3882be4bac910ed0bf6979674.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In preparation for supporting newer CMN products, let's introduce a
means to differentiate the features and events which are specific to a
particular IP from those which remain common to the whole family. The
newer designs have also smoothed off some of the rough edges in terms
of discoverability, so separate out the parts of the flow which have
effectively now become CMN-600 quirks.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9f6368cdca4c821d801138939508a5bba54ccabb.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
With the value of CMN_MAX_DTMS increasing significantly, our validation
data structure is set to get quite big. Technically we could pack it at
least twice as densely, since we only need around 19 bits of information
per DTM, but that makes the code even more mind-bogglingly impenetrable,
and even half of "quite big" may still be uncomfortably large for a
stack frame (~1KB). Just move it to an off-stack allocation instead.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In cases where we do know which DTC domain a node belongs to, we can
skip initialising or reading the global count in DTCs where we know
it won't change. The machinery to achieve that is mostly in place
already, so finish hooking it up by converting the vestigial domain
tracking to propagate suitable bitmaps all the way through to events.
Note that this does not allow allocating such an unused counter to a
different event on that DTC, because that is a flippin' nightmare.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/51d930fd945ef51c81f5889ccca055c302b0a1d0.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
When multiple nodes of the same type are connected to the same XP
(particularly in CAL configurations), it seems that they are likely
to be consecutive in logical ID. Therefore, we're likely to gain a
small benefit from an easy tweak to optimise out consecutive reads
of the same set of DTM counters for an aggregated event.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Untangle DTMs from XPs into a dedicated abstraction. This helps make
things a little more obvious and robust, but primarily paves the way
for further development where new IPs can grow extra DTMs per XP.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9cca18b1b98f482df7f1aaf3d3213e7f39500423.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Refactor the places where we scan through the set of nodes to switch
from explicit array indexing to pointer-based iteration. This leads to
slightly simpler object code, but also makes the source less dense and
more pleasant for further development. It also unearths an almost-bug
in arm_cmn_event_init() where we've been depending on the "array index"
of NULL relative to cmn->dns being a sufficiently large number, yuck.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ee0c9eda9a643f46001ac43aadf3f0b1fd5660dd.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add a bit more abstraction for the places where we decompose node IDs.
This will help keep things nice and manageable when we come to add yet
more variables which affect the node ID format. Also use the opportunity
to move the rest of the low-level node management helpers back up to the
logical place they were meant to be - how they ended up buried right in
the middle of the event-related definitions is somewhat of a mystery...
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/a2242a8c3c96056c13a04ae87bf2047e5e64d2d9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Although CMN is currently (and overwhelmingly likely to remain) deployed
in arm64-only (modulo userspace) systems, the 64-bit "dependency" for
compile-testing was just laziness due to heavy reliance on readq/writeq
accessors. Since we only need one extra include for robustness in that
regard, let's pull that in, widen the compile-test coverage, and fix up
the smattering of type laziness that that brings to light.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
On a system with multiple CMN meshes, ideally we'd want to access each
PMU from within its own mesh, rather than with a long CML round-trip,
wherever feasible. Since such a system is likely to be presented as
multiple NUMA nodes, let's also hope a proximity domain is specified
for each CMN programming interface, and use that to guide our choice
of IRQ affinity to favour a node-local CPU where possible.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32438b0d016e0649d882d47d30ac2000484287b9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Attempting to migrate the PMU context after we've unregistered the PMU
device, or especially if we never successfully registered it in the
first place, is a woefully bad idea. It's also fundamentally pointless
anyway. Make sure to unregister an instance from the hotplug handler
*without* invoking the teardown callback.
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/2c221d745544774e4b07583b65b5d4d94f7e0fe4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
- Update the ACPICA code in the kernel to upstream revision 20210930
including the following changes:
* Fix system-wide resume issue caused by evaluating control
methods too early in the resume path (Rafael Wysocki).
* Add support for Windows 2020 _OSI string (Mario Limonciello).
* Add Generic Port Affinity type for SRAT (Alison Schofield).
* Add disassembly support for the NHLT ACPI table (Bob Moore).
- Avoid flushing caches before entering C3 type of idle states on
AMD processors (Deepak Sharma).
- Avoid enumerating CPUs that are not present and not online-capable
according to the platform firmware (Mario Limonciello).
- Add DMI-based mechanism to quirk IRQ overrides and use it for two
platforms (Hui Wang).
- Change the configuration of unused ACPI device objects to reflect
the D3cold power state after enumerating devices (Rafael Wysocki).
- Update MAINTAINERS information regarding ACPI (Rafael Wysocki).
- Fix typo in ACPI Kconfig (Masanari Iid).
- Use sysfs_emit() instead of snprintf() in some places (Qing Wang).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices
in general (Rafael Wysocki).
- Use acpi_device_adr() in acpi_find_child_device() instead of
evaluating _ADR (Rafael Wysocki).
- Drop duplicate device IDs from PNP device IDs list (Krzysztof
Kozlowski).
- Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
Gong).
- Use ACPI_COMPANION() to simplify code in some drivers (Rafael
Wysocki).
- Check the states of all ACPI power resources during initialization
to avoid dealing with power resources in unknown states (Rafael
Wysocki).
- Fix ACPI power resource issues related to sharing wakeup power
resources (Rafael Wysocki).
- Avoid registering redundant suspend_ops (Rafael Wysocki).
- Report battery charging state as "full" if it appears to be over
the design capacity (André Almeida).
- Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
Schaeckeler).
- Mark apei_hest_parse() static (Christoph Hellwig).
- Relax platform response timeout to 1 second after instructing it
to inject an error (Shuai Xue).
- Make the PRM code handle memory allocation and remapping failures
more gracefully and drop some unnecessary blank lines from that
code (Aubrey Li).
- Fix spelling mistake in the ACPI documentation (Colin Ian King).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGBkbESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxCf8QALHX0PjsKC9cFs+hHVoI+T9VdVa/2pY4
8vMjxYqkano4ti2xu1Fo/gaECnpgtVH+mV6iXPSAf80Wjgxnrx8K7y4qyUR4POBR
l/5jA+XuFOyBRCSIHjJrfGof3/bTCt63LuyOc1WxiFzhkIuKeiI7Bmw6pvxrcXZw
Ehe0VZK8eZGHR2py7RlZ68IJ32uAdlVKjlXyOFWfm9pjKtYit49WV/rjY0ldHBou
vq8JP4EEmCKhyYAuzRprkPR2itm5fYI3lceG3UaIq5uWCj0IlaWcuv6c7K6N6QA1
bHjCUSWPMCoHKQbZoX5MEEAjIJKdO81Rj3PGjNL1uYSz806ZmNcxav1twebm1lEa
h6GbcFDaY9USMEl1D8s7h5Lm9cM33cen4IW9Ms5nMjugkdAr4KyLOA8CgpTbb1Ih
vBrlB4CzxEqgpa+YAvJd7/r3FckauGthkciEjOCFpUxeZpKW5kweLIwFpPHKaqot
KXtE+wwacxmkwYiJwEg141Bp4oFfy6iBcDn8xOapjq89DFEAn03VuvAO33Q+2KPL
A2RtOaV2tSxxNDUss9wENI4lF5sc2JvBECXc1MZCBBNQqxTrwT562UwMhgUXD1Vy
8LvdrYAOZzS6QizeHQC27FQ/uwQ5/uEqaYfjMv+JQjaZZU7YeG3vekADqiEstfWF
UUj72AE8a/Cs
=Ni9a
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to the most recent upstream
revision, address some issues related to the ACPI power resources
management, simplify the enumeration of PCI devices having ACPI
companions, add new quirks, fix assorted problems, update the
ACPI-related information in maintainers and clean up code in several
places.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20210930
including the following changes:
- Fix system-wide resume issue caused by evaluating control
methods too early in the resume path (Rafael Wysocki).
- Add support for Windows 2020 _OSI string (Mario Limonciello).
- Add Generic Port Affinity type for SRAT (Alison Schofield).
- Add disassembly support for the NHLT ACPI table (Bob Moore).
- Avoid flushing caches before entering C3 type of idle states on AMD
processors (Deepak Sharma).
- Avoid enumerating CPUs that are not present and not online-capable
according to the platform firmware (Mario Limonciello).
- Add DMI-based mechanism to quirk IRQ overrides and use it for two
platforms (Hui Wang).
- Change the configuration of unused ACPI device objects to reflect
the D3cold power state after enumerating devices (Rafael Wysocki).
- Update MAINTAINERS information regarding ACPI (Rafael Wysocki).
- Fix typo in ACPI Kconfig (Masanari Iid).
- Use sysfs_emit() instead of snprintf() in some places (Qing Wang).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices in
general (Rafael Wysocki).
- Use acpi_device_adr() in acpi_find_child_device() instead of
evaluating _ADR (Rafael Wysocki).
- Drop duplicate device IDs from PNP device IDs list (Krzysztof
Kozlowski).
- Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
Gong).
- Use ACPI_COMPANION() to simplify code in some drivers (Rafael
Wysocki).
- Check the states of all ACPI power resources during initialization
to avoid dealing with power resources in unknown states (Rafael
Wysocki).
- Fix ACPI power resource issues related to sharing wakeup power
resources (Rafael Wysocki).
- Avoid registering redundant suspend_ops (Rafael Wysocki).
- Report battery charging state as "full" if it appears to be over
the design capacity (André Almeida).
- Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
Schaeckeler).
- Mark apei_hest_parse() static (Christoph Hellwig).
- Relax platform response timeout to 1 second after instructing it to
inject an error (Shuai Xue).
- Make the PRM code handle memory allocation and remapping failures
more gracefully and drop some unnecessary blank lines from that
code (Aubrey Li).
- Fix spelling mistake in the ACPI documentation (Colin Ian King)"
* tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (36 commits)
ACPI: glue: Use acpi_device_adr() in acpi_find_child_device()
perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly
ACPI: APEI: mark apei_hest_parse() static
ACPI: APEI: EINJ: Relax platform response timeout to 1 second
gpio-amdpt: ACPI: Use the ACPI_COMPANION() macro directly
nouveau: ACPI: Use the ACPI_COMPANION() macro directly
ACPI: resources: Add one more Medion model in IRQ override quirk
ACPI: AC: Quirk GK45 to skip reading _PSR
ACPI: PM: sleep: Do not set suspend_ops unnecessarily
ACPI: PRM: Handle memory allocation and memory remap failure
ACPI: PRM: Remove unnecessary blank lines
ACPI: PM: Turn off wakeup power resources on _DSW/_PSW errors
ACPI: PM: Fix sharing of wakeup power resources
ACPI: PM: Turn off unused wakeup power resources
ACPI: PM: Check states of power resources during initialization
ACPI: replace snprintf() in "show" functions with sysfs_emit()
ACPI: LPSS: Use ACPI_COMPANION() directly
ACPI: scan: Release PM resources blocked by unused objects
ACPI: battery: Accept charges over the design capacity as full
ACPICA: Update version to 20210930
...
- Support for the Arm8.6 timer extensions, including a self-synchronising
view of the system registers to elide some expensive ISB instructions.
- Exception table cleanup and rework so that the fixup handlers appear
correctly in backtraces.
- A handful of miscellaneous changes, the main one being selection of
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
- More mm and pgtable cleanups.
- KASAN support for "asymmetric" MTE, where tag faults are reported
synchronously for loads (via an exception) and asynchronously for
stores (via a register).
- Support for leaving the MMU enabled during kexec relocation, which
significantly speeds up the operation.
- Minor improvements to our perf PMU drivers.
- Improvements to the compat vDSO build system, particularly when
building with LLVM=1.
- Preparatory work for handling some Coresight TRBE tracing errata.
- Cleanup and refactoring of the SVE code to pave the way for SME
support in future.
- Ensure SCS pages are unpoisoned immediately prior to freeing them
when KASAN is enabled for the vmalloc area.
- Try moving to the generic pfn_valid() implementation again now that
the DMA mapping issue from last time has been resolved.
- Numerous improvements and additions to our FPSIMD and SVE selftests.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
=3R3j
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"There's the usual summary below, but the highlights are support for
the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
ability to kexec() with the MMU enabled and a second attempt at
switching to the generic pfn_valid() implementation.
Summary:
- Support for the Arm8.6 timer extensions, including a
self-synchronising view of the system registers to elide some
expensive ISB instructions.
- Exception table cleanup and rework so that the fixup handlers
appear correctly in backtraces.
- A handful of miscellaneous changes, the main one being selection of
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
- More mm and pgtable cleanups.
- KASAN support for "asymmetric" MTE, where tag faults are reported
synchronously for loads (via an exception) and asynchronously for
stores (via a register).
- Support for leaving the MMU enabled during kexec relocation, which
significantly speeds up the operation.
- Minor improvements to our perf PMU drivers.
- Improvements to the compat vDSO build system, particularly when
building with LLVM=1.
- Preparatory work for handling some Coresight TRBE tracing errata.
- Cleanup and refactoring of the SVE code to pave the way for SME
support in future.
- Ensure SCS pages are unpoisoned immediately prior to freeing them
when KASAN is enabled for the vmalloc area.
- Try moving to the generic pfn_valid() implementation again now that
the DMA mapping issue from last time has been resolved.
- Numerous improvements and additions to our FPSIMD and SVE
selftests"
[ armv8.6 timer updates were in a shared branch and already came in
through -tip in the timer pull - Linus ]
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
arm64: Select POSIX_CPU_TIMERS_TASK_WORK
arm64: Document boot requirements for FEAT_SME_FA64
arm64/sve: Fix warnings when SVE is disabled
arm64/sve: Add stub for sve_max_virtualisable_vl()
arm64: errata: Add detection for TRBE write to out-of-range
arm64: errata: Add workaround for TSB flush failures
arm64: errata: Add detection for TRBE overwrite in FILL mode
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
selftests: arm64: Factor out utility functions for assembly FP tests
arm64: vmlinux.lds.S: remove `.fixup` section
arm64: extable: add load_unaligned_zeropad() handler
arm64: extable: add a dedicated uaccess handler
arm64: extable: add `type` and `data` fields
arm64: extable: use `ex` for `exception_table_entry`
arm64: extable: make fixup_exception() return bool
arm64: extable: consolidate definitions
arm64: gpr-num: support W registers
arm64: factor out GPR numbering helpers
arm64: kvm: use kvm_exception_table_entry
arm64: lib: __arch_copy_to_user(): fold fixups into body
...
The ACPI_HANDLE() macro is a wrapper arond the ACPI_COMPANION()
macro and the ACPI handle produced by the former comes from the
ACPI device object produced by the latter, so it is way more
straightforward to evaluate the latter directly instead of passing
the handle produced by the former to acpi_bus_get_device().
Modify l2_cache_pmu_probe_cluster() accordingly (no intentional
functional impact).
While at it, rename the ACPI device pointer to adev for more
clarity.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Improve build test cover by allowing some drivers to build under
COMPILE_TEST where possible.
Some notes:
- Mostly a dependency on CONFIG_ACPI is not really required for only
building (but left untouched), but is required for TX2 which uses ACPI
functions which have no stubs
- XGENE required 64b dependency as it relies on some unsigned long perf
struct fields being 64b
- I don't see why TX2 requires NUMA to build, but left untouched
- Added an explicit dependency on GENERIC_MSI_IRQ_DOMAIN for
ARM_SMMU_V3_PMU, which is required for platform MSI functions
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-3-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
A LSL of 32 requires > 32b value to hold the result. However in
tx2_uncore_event_update(), 1UL << 32 currently only works as unsigned
long is 64b on a 64b system.
If we want to compile test for a 32b system, we need unsigned long long,
whose min size is 64b.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The PA PMU counter offset was correct in [1] and the driver has
already been verified. We want to keep the register offset using
lower case character in later version that is consistent with
the existed driver. Since there was no functional change, we
didn't do more test. However there is typo when modified the PA
PMU counter offset by mistake, so fix this bad mistake.
[1] https://www.spinics.net/lists/arm-kernel/msg865263.html
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210928123022.23467-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Russell reported that since 5.13, KVM's probing of the PMU has
started to fail on his HW. As it turns out, there is an implicit
ordering dependency between the architectural PMU probing code and
and KVM's own probing. If, due to probe ordering reasons, KVM probes
before the PMU driver, it will fail to detect the PMU and prevent it
from being advertised to guests as well as the VMM.
Obviously, this is one probing too many, and we should be able to
deal with any ordering.
Add a callback from the PMU code into KVM to advertise the registration
of a host CPU PMU, allowing for any probing order.
Fixes: 5421db1be3 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
ddr_perf_probe() misses to call ida_simple_remove() in an error path.
Jump to cpuhp_state_err to fix it.
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/20210617122614.166823-1-jingxiangfeng@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When multiple dtcs share the same IRQ number, the irq_friend which
used to refer to dtc object gets calculated incorrect which leads
to invalid pointer.
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1623946129-3290-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify IMX8_DDR_PMU_EVENT_ATTR
Reviewed by Frank Li <Frank .li@nxp.com>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-7-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify XGENE_PMU_EVENT_ATTR
Cc: Khuong Dinh <khuong@os.amperecomputing.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-6-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify L3CACHE_EVENT_ATTR
Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-5-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify L2CACHE_EVENT_ATTR
Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-4-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
With global filtering, we only allow an event to be scheduled if its
filter settings exactly match those of any existing events, therefore
it is pointless to reapply the filter in that case. Much worse, though,
is that in doing that we trample the event type of counter 0 if it's
already active, and never touch the appropriate PMEVTYPERn so the new
event is likely not counting the right thing either. Don't do that.
CC: stable@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32c80c0e46237f49ad8da0c9f8864e13c4a803aa.1623153312.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
These are only put in an array of pointers to const attribute_group
structs. Make them const like the other static attribute_group structs
to allow the compiler to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210605221514.73449-1-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Link: https://lore.kernel.org/r/20210608084816.1046485-1-chenxiaosong2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>