Commit Graph

Oliver Upton
123f42f0ad Merge branch kvm-arm64/pmu_pmcr_n into kvmarm/next
* kvm-arm64/pmu_pmcr_n:
  : User-defined PMC limit, courtesy Raghavendra Rao Ananta
  :
  : Certain VMMs may want to reserve some PMCs for host use while running a
  : KVM guest. This was a bit difficult before, as KVM advertised all
  : supported counters to the guest. Userspace can now limit the number of
  : advertised PMCs by writing to PMCR_EL0.N, as KVM's sysreg and PMU
  : emulation enforce the specified limit for handling guest accesses.
  KVM: selftests: aarch64: vPMU test for validating user accesses
  KVM: selftests: aarch64: vPMU register test for unimplemented counters
  KVM: selftests: aarch64: vPMU register test for implemented counters
  KVM: selftests: aarch64: Introduce vpmu_counter_access test
  tools: Import arm_pmuv3.h
  KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
  KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
  KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
  KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
  KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
  KVM: arm64: PMU: Introduce helpers to set the guest's PMU

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:24:19 +00:00
Oliver Upton
53ce49ea75 Merge branch kvm-arm64/mops into kvmarm/next
* kvm-arm64/mops:
  : KVM support for MOPS, courtesy of Kristina Martsenko
  :
  : MOPS adds new instructions for accelerating memcpy(), memset(), and
  : memmove() operations in hardware. This series brings virtualization
  : support for KVM guests, and allows VMs to run on asymmetric systems
  : that may have different MOPS implementations.
  KVM: arm64: Expose MOPS instructions to guests
  KVM: arm64: Add handler for MOPS exceptions

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:21:19 +00:00
Oliver Upton
a87a36436c Merge branch kvm-arm64/writable-id-regs into kvmarm/next
* kvm-arm64/writable-id-regs:
  : Writable ID registers, courtesy of Jing Zhang
  :
  : This series significantly expands the architectural feature set that
  : userspace can manipulate via the ID registers. A new ioctl is defined
  : that makes the mutable fields in the ID registers discoverable to
  : userspace.
  KVM: selftests: Avoid using forced target for generating arm64 headers
  tools headers arm64: Fix references to top srcdir in Makefile
  KVM: arm64: selftests: Test for setting ID register from userspace
  tools headers arm64: Update sysreg.h with kernel sources
  KVM: selftests: Generate sysreg-defs.h and add to include path
  perf build: Generate arm64's sysreg-defs.h and add to include path
  tools: arm64: Add a Makefile for generating sysreg-defs.h
  KVM: arm64: Document vCPU feature selection UAPIs
  KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
  KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
  KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
  KVM: arm64: Reject attempts to set invalid debug arch version
  KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
  KVM: arm64: Use guest ID register values for the sake of emulation
  KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS
  KVM: arm64: Allow userspace to get the writable masks for feature ID registers

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:21:09 +00:00
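
As a rough sketch of the new UAPI from the VMM side (the ioctl name comes
from the series above; struct reg_mask_range and the KVM_ARM_FEATURE_ID_RANGE*
macros are assumed to match the arm64 <linux/kvm.h> added here, so verify
them against the header):

  #include <err.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Sketch: fetch the per-register writable masks for the feature ID range. */
  static void get_id_reg_writable_masks(int vm_fd, uint64_t *masks)
  {
          struct reg_mask_range range = {
                  /* masks must hold KVM_ARM_FEATURE_ID_RANGE_SIZE entries */
                  .addr  = (uint64_t)(uintptr_t)masks,
                  .range = KVM_ARM_FEATURE_ID_RANGE,
          };

          if (ioctl(vm_fd, KVM_ARM_GET_REG_WRITABLE_MASKS, &range))
                  err(1, "KVM_ARM_GET_REG_WRITABLE_MASKS");
  }

A set bit in an entry indicates an ID register field that userspace may
change with KVM_SET_ONE_REG before the vCPU runs.
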
Oliver Upton
70c7b704ca KVM: selftests: Avoid using forced target for generating arm64 headers
The 'prepare' target that generates the arm64 sysreg headers had no
prerequisites, so it wound up forcing a rebuild of all KVM selftests on
each invocation. Add a rule for the generated headers and just have
dependents use that for a prerequisite.

Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Fixes: 9697d84cc3 ("KVM: selftests: Generate sysreg-defs.h and add to include path")
Tested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Link: https://lore.kernel.org/r/20231027005439.3142015-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:20:39 +00:00
Oliver Upton
fbb075c116 tools headers arm64: Fix references to top srcdir in Makefile
Aishwarya reports that KVM selftests for arm64 fail with the following
error:

 | make[4]: Entering directory '/tmp/kci/linux/tools/testing/selftests/kvm'
 | Makefile:270: warning: overriding recipe for target
 | '/tmp/kci/linux/build/kselftest/kvm/get-reg-list'
 | Makefile:265: warning: ignoring old recipe for target
 | '/tmp/kci/linux/build/kselftest/kvm/get-reg-list'
 | make -C ../../../../tools/arch/arm64/tools/
 | make[5]: Entering directory '/tmp/kci/linux/tools/arch/arm64/tools'
 | Makefile:10: ../tools/scripts/Makefile.include: No such file or directory
 | make[5]: *** No rule to make target '../tools/scripts/Makefile.include'.
 |  Stop.

It would appear that this only affects builds from the top-level
Makefile (e.g. make kselftest-all), as $(srctree) is set to ".". Work
around the issue by shadowing the kselftest naming scheme for the source
tree variable.

Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Fixes: 0359c946b1 ("tools headers arm64: Update sysreg.h with kernel sources")
Link: https://lore.kernel.org/r/20231027005439.3142015-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:20:39 +00:00
Oliver Upton
54b44ad26c Merge branch kvm-arm64/sgi-injection into kvmarm/next
* kvm-arm64/sgi-injection:
  : vSGI injection improvements + fixes, courtesy Marc Zyngier
  :
  : Avoid linearly searching for vSGI targets using a compressed MPIDR to
  : index a cache. While at it, fix some egregious bugs in KVM's mishandling
  : of vcpuid (user-controlled value) and vcpu_idx.
  KVM: arm64: Clarify the ordering requirements for vcpu/RD creation
  KVM: arm64: vgic-v3: Optimize affinity-based SGI injection
  KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available
  KVM: arm64: Build MPIDR to vcpu index cache at runtime
  KVM: arm64: Simplify kvm_vcpu_get_mpidr_aff()
  KVM: arm64: Use vcpu_idx for invalidation tracking
  KVM: arm64: vgic: Use vcpu_idx for the debug information
  KVM: arm64: vgic-v2: Use cpuid from userspace as vcpu_id
  KVM: arm64: vgic-v3: Refactor GICv3 SGI generation
  KVM: arm64: vgic-its: Treat the collection target address as a vcpu_id
  KVM: arm64: vgic: Make kvm_vgic_inject_irq() take a vcpu pointer

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:19:13 +00:00
Oliver Upton
df26b77915 Merge branch kvm-arm64/stage2-vhe-load into kvmarm/next
* kvm-arm64/stage2-vhe-load:
  : Setup stage-2 MMU from vcpu_load() for VHE
  :
  : Unlike nVHE, there is no need to switch the stage-2 MMU around on guest
  : entry/exit in VHE mode as the host is running at EL2. Despite this, KVM
  : reloads the stage-2 MMU on every guest entry, which is needless.
  :
  : This series moves the setup of the stage-2 MMU context to vcpu_load()
  : when running in VHE mode. This is likely to be a win across the board,
  : but also allows us to remove an ISB on the guest entry path for systems
  : with one of the speculative AT errata.
  KVM: arm64: Move VTCR_EL2 into struct s2_mmu
  KVM: arm64: Load the stage-2 MMU context in kvm_vcpu_load_vhe()
  KVM: arm64: Rename helpers for VHE vCPU load/put
  KVM: arm64: Reload stage-2 for VMID change on VHE
  KVM: arm64: Restore the stage-2 context in VHE's __tlb_switch_to_host()
  KVM: arm64: Don't zero VTTBR in __tlb_switch_to_host()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:56 +00:00
Oliver Upton
51e6079614 Merge branch kvm-arm64/nv-trap-fixes into kvmarm/next
* kvm-arm64/nv-trap-fixes:
  : NV trap forwarding fixes, courtesy Miguel Luis and Marc Zyngier
  :
  :  - Explicitly define the effects of HCR_EL2.NV on EL2 sysregs in the
  :    NV trap encoding
  :
  :  - Make EL2 registers that access AArch32 guest state UNDEF or RAZ/WI
  :    where appropriate for NV guests
  KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
  KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
  KVM: arm64: Refine _EL2 system register list that require trap reinjection
  arm64: Add missing _EL2 encodings
  arm64: Add missing _EL12 encodings

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:46 +00:00
Oliver Upton
25a35c1a3d Merge branch kvm-arm64/smccc-filter-cleanups into kvmarm/next
* kvm-arm64/smccc-filter-cleanups:
  : Cleanup the management of KVM's SMCCC maple tree
  :
  : Avoid the cost of maintaining the SMCCC filter maple tree if userspace
  : hasn't written a rule to the filter. While at it, rip out the now
  : unnecessary VM flag to indicate whether or not the SMCCC filter was
  : configured.
  KVM: arm64: Use mtree_empty() to determine if SMCCC filter configured
  KVM: arm64: Only insert reserved ranges when SMCCC filter is used
  KVM: arm64: Add a predicate for testing if SMCCC filter is configured

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:37 +00:00
Oliver Upton
7ff7dfe946 Merge branch kvm-arm64/pmevtyper-filter into kvmarm/next
* kvm-arm64/pmevtyper-filter:
  : Fixes to KVM's handling of the PMUv3 exception level filtering bits
  :
  :  - NSH (count at EL2) and M (count at EL3) should be stateful when the
  :    respective EL is advertised in the ID registers but have no effect on
  :    event counting.
  :
  :  - NSU and NSK modify the event filtering of EL0 and EL1, respectively.
  :    Though the kernel may not use these bits, other KVM guests might.
  :    Implement these bits exactly as written in the pseudocode if EL3 is
  :    advertised.
  KVM: arm64: Add PMU event filter bits required if EL3 is implemented
  KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertised

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:23 +00:00
Oliver Upton
d47dcb67fc Merge branch kvm-arm64/feature-flag-refactor into kvmarm/next
* kvm-arm64/feature-flag-refactor:
  : vCPU feature flag cleanup
  :
  : Clean up KVM's handling of vCPU feature flags to get rid of the
  : vCPU-scoped bitmaps and remove failure paths from kvm_reset_vcpu().
  KVM: arm64: Get rid of vCPU-scoped feature bitmap
  KVM: arm64: Remove unused return value from kvm_reset_vcpu()
  KVM: arm64: Hoist NV+SVE check into KVM_ARM_VCPU_INIT ioctl handler
  KVM: arm64: Prevent NV feature flag on systems w/o nested virt
  KVM: arm64: Hoist PAuth checks into KVM_ARM_VCPU_INIT ioctl
  KVM: arm64: Hoist SVE check into KVM_ARM_VCPU_INIT ioctl handler
  KVM: arm64: Hoist PMUv3 check into KVM_ARM_VCPU_INIT ioctl handler
  KVM: arm64: Add generic check for system-supported vCPU features

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:14 +00:00
Oliver Upton
054056bf98 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Put an upper bound on the number of I-cache invalidations by
  :    cacheline to avoid soft lockups
  :
  :  - Get rid of bogus reference count transfer for THP mappings
  :
  :  - Do a local TLB invalidation on permission fault race
  :
  :  - Fixes for page_fault_test KVM selftest
  :
  :  - Add a tracepoint for detecting MMIO instructions unsupported by KVM
  KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
  KVM: arm64: selftest: Perform ISB before reading PAR_EL1
  KVM: arm64: selftest: Add the missing .guest_prepare()
  KVM: arm64: Always invalidate TLB for stage-2 permission faults
  KVM: arm64: Do not transfer page refcount for THP adjustment
  KVM: arm64: Avoid soft lockups due to I-cache maintenance
  arm64: tlbflush: Rename MAX_TLBI_OPS
  KVM: arm64: Don't use kerneldoc comment for arm64_check_features()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:00 +00:00
Oliver Upton
d11974dc5f KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
It is a pretty well known fact that KVM does not support MMIO emulation
without valid instruction syndrome information (ESR_EL2.ISV == 0). The
current kvm_pr_unimpl() is pretty useless, as it contains zero context
to relate the event to a vCPU.

Replace it with a precise tracepoint that dumps the relevant context
so the user can make sense of what the guest is doing.

Acked-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231026205306.3045075-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:17:22 +00:00
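
A tracepoint of roughly this shape provides the missing context; the event
name and field set below are illustrative, not necessarily what the patch
adds:

  TRACE_EVENT(kvm_mmio_nisv,      /* illustrative name */
          TP_PROTO(unsigned long esr, unsigned long far, unsigned long ipa),
          TP_ARGS(esr, far, ipa),

          TP_STRUCT__entry(
                  __field(unsigned long, esr)
                  __field(unsigned long, far)
                  __field(unsigned long, ipa)
          ),

          TP_fast_assign(
                  __entry->esr = esr;
                  __entry->far = far;
                  __entry->ipa = ipa;
          ),

          /* Dump enough context to tie the access back to the faulting vCPU. */
          TP_printk("esr=%#lx far=%#lx ipa=%#lx",
                    __entry->esr, __entry->far, __entry->ipa)
  );
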
Zenghui Yu
06899aa5dd KVM: arm64: selftest: Perform ISB before reading PAR_EL1
It looks like a mistake to issue the ISB *after* reading PAR_EL1; we should
instead perform it between the AT instruction and the reads of PAR_EL1.

According to DDI0487J.a IJTYVP,

"When an address translation instruction is executed, explicit
 synchronization is required to guarantee the result is visible to
 subsequent direct reads of PAR_EL1."

Otherwise all guest_at testcases fail on my box with

==== Test Assertion Failure ====
  aarch64/page_fault_test.c:142: par & 1 == 0
  pid=1355864 tid=1355864 errno=4 - Interrupted system call
     1	0x0000000000402853: vcpu_run_loop at page_fault_test.c:681
     2	0x0000000000402cdb: run_test at page_fault_test.c:730
     3	0x0000000000403897: for_each_guest_mode at guest_modes.c:100
     4	0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1105
     5	 (inlined by) main at page_fault_test.c:1131
     6	0x0000ffffb153c03b: ?? ??:0
     7	0x0000ffffb153c113: ?? ??:0
     8	0x0000000000401aaf: _start at ??:?
  0x1 != 0x0 (par & 1 != 0)

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-2-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:12:46 +00:00
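
The corrected sequence, sketched as guest code (read_sysreg(), isb() and
GUEST_ASSERT_EQ() as provided by the selftest headers; test_va is a
placeholder for the address under test):

  uint64_t par;

  /* Stage-1 EL1 read translation of the address under test. */
  asm volatile("at s1e1r, %0" :: "r"(test_va));

  /*
   * Explicit synchronization between the AT instruction and the read of
   * PAR_EL1, as required by DDI0487J.a IJTYVP.
   */
  isb();

  par = read_sysreg(par_el1);
  GUEST_ASSERT_EQ(par & 1, 0);    /* PAR_EL1.F set would mean the AT faulted */
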
Zenghui Yu
beaf35b480 KVM: arm64: selftest: Add the missing .guest_prepare()
Running page_fault_test on a Cortex A72 fails with

Test: ro_memslot_no_syndrome_guest_cas
Testing guest mode: PA-bits:40,  VA-bits:48,  4K pages
Testing memory backing src type: anonymous
==== Test Assertion Failure ====
  aarch64/page_fault_test.c:117: guest_check_lse()
  pid=1944087 tid=1944087 errno=4 - Interrupted system call
     1	0x00000000004028b3: vcpu_run_loop at page_fault_test.c:682
     2	0x0000000000402d93: run_test at page_fault_test.c:731
     3	0x0000000000403957: for_each_guest_mode at guest_modes.c:100
     4	0x00000000004019f3: for_each_test_and_guest_mode at page_fault_test.c:1108
     5	 (inlined by) main at page_fault_test.c:1134
     6	0x0000ffff868e503b: ?? ??:0
     7	0x0000ffff868e5113: ?? ??:0
     8	0x0000000000401aaf: _start at ??:?
  guest_check_lse()

because we don't have a guest_prepare stage to check the presence of
FEAT_LSE and skip the related guest_cas testing, and we end up failing in
GUEST_ASSERT(guest_check_lse()).

Add the missing .guest_prepare() where it's indeed required.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231007124043.626-1-yuzenghui@huawei.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:12:46 +00:00
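
The FEAT_LSE check the prepare stage relies on boils down to reading
ID_AA64ISAR0_EL1.Atomic; a minimal sketch (field position per the Arm ARM,
helper name illustrative):

  /* FEAT_LSE is present when ID_AA64ISAR0_EL1.Atomic (bits [23:20]) >= 2. */
  static bool guest_has_lse(void)
  {
          uint64_t isar0 = read_sysreg(id_aa64isar0_el1);

          return ((isar0 >> 20) & 0xf) >= 2;
  }

Wiring such a check into the test's .guest_prepare() hook lets the CAS
variants be skipped instead of tripping the GUEST_ASSERT() above.
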
Oliver Upton
be097997a2 KVM: arm64: Always invalidate TLB for stage-2 permission faults
It is possible for multiple vCPUs to fault on the same IPA and attempt
to resolve the fault. One of the page table walks will actually update
the PTE and the rest will return -EAGAIN per our race detection scheme.
KVM elides the TLB invalidation on the racing threads as the return
value is nonzero.

Before commit a12ab1378a ("KVM: arm64: Use local TLBI on permission
relaxation") KVM always used broadcast TLB invalidations when handling
permission faults, which had the convenient property of making the
stage-2 updates visible to all CPUs in the system. However now we do a
local invalidation, and TLBI elision leads to the vCPU thread faulting
again on the stale entry. Remember that the architecture permits the TLB
to cache translations that precipitate a permission fault.

Invalidate the TLB entry responsible for the permission fault if the
stage-2 descriptor has been relaxed, regardless of which thread actually
did the job.

Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922223229.1608155-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:12:46 +00:00
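
A conceptual sketch of the fix, with hypothetical helper names (the real
change lives in KVM's stage-2 fault path):

  /*
   * Hypothetical sketch of the race handling; the helper names do not
   * exist in KVM and are for illustration only. Both the vCPU that
   * relaxed the PTE and any vCPU that lost the race (-EAGAIN) must
   * invalidate locally, since the winner may have used a local TLBI
   * that this CPU never observed.
   */
  static int handle_stage2_perm_fault(struct kvm_s2_mmu *mmu, u64 fault_ipa)
  {
          int ret = stage2_relax_permissions(mmu, fault_ipa);   /* hypothetical */

          if (ret && ret != -EAGAIN)
                  return ret;

          /* Invalidate the faulting entry regardless of who updated the PTE. */
          local_tlbi_ipa(mmu, fault_ipa);                       /* hypothetical */
          return 0;
  }
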
Marc Zyngier
3f7915ccc9 KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
When trapping accesses from a NV guest that tries to access
SPSR_{irq,abt,und,fiq}, make sure we handle them as RAZ/WI,
as if AArch32 wasn't implemented.

This involves a bit of repainting to make the visibility
handler more generic.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231023095444.1587322-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:25:06 +00:00
Marc Zyngier
c7d11a61c7 KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
DBGVCR32_EL2, DACR32_EL2, IFSR32_EL2 and FPEXC32_EL2 are required to
UNDEF when AArch32 isn't implemented, which is definitely the case when
running NV.

Given that this is the only case where these registers can trap,
unconditionally inject an UNDEF exception.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20231023095444.1587322-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:24:57 +00:00
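
In sys_regs.c terms this boils down to an accessor that unconditionally
injects UNDEF; a sketch following the existing handler conventions (the
function name here is illustrative):

  static bool undef_aarch32_el2(struct kvm_vcpu *vcpu,
                                struct sys_reg_params *p,
                                const struct sys_reg_desc *r)
  {
          /* AArch32 isn't implemented for NV guests: UNDEF on any access. */
          kvm_inject_undefined(vcpu);
          return false;
  }
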
Miguel Luis
04cf546505 KVM: arm64: Refine _EL2 system register list that require trap reinjection
Implement a fine-grained approach for the _EL2 sysreg range instead of
the current wide, catch-all trap. This ensures that we don't mistakenly
inject the wrong exception into the guest.

[maz: commit message massaging, dropped secure and AArch32 registers
      from the list]

Fixes: d0fc0a2519 ("KVM: arm64: nv: Add trap forwarding for HCR_EL2")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231023095444.1587322-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:24:53 +00:00
Miguel Luis
41f6c93447 arm64: Add missing _EL2 encodings
Some _EL2 encodings are missing. Add them.

Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
[maz: dropped secure encodings]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231023095444.1587322-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:24:49 +00:00
Miguel Luis
d5cb781b77 arm64: Add missing _EL12 encodings
Some _EL12 encodings are missing. Add them.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231023095444.1587322-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:24:34 +00:00
Raghavendra Rao Ananta
62708be351 KVM: selftests: aarch64: vPMU test for validating user accesses
Add a vPMU test scenario to validate userspace accesses to the
registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR}, to ensure
that KVM honors the architectural definitions of these registers
for a given PMCR.N.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-13-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe
e1cc872063 KVM: selftests: aarch64: vPMU register test for unimplemented counters
Add a new test case to the vpmu_counter_access test to check that PMU
registers and their bits for unimplemented counters are not accessible
or are RAZ, as expected.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-12-rananta@google.com
[Oliver: fix issues relating to exception return address]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe
ada1ae6826 KVM: selftests: aarch64: vPMU register test for implemented counters
Add a new test case to the vpmu_counter_access test to check if PMU
registers or their bits for implemented counters on the vCPU are
readable/writable as expected, and can be programmed to count events.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-11-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Reiji Watanabe
8d0aebe1ca KVM: selftests: aarch64: Introduce vpmu_counter_access test
Introduce the vpmu_counter_access test for arm64 platforms.
The test configures PMUv3 for a vCPU, sets PMCR_EL0.N for the vCPU,
and checks that the guest can consistently see the same number of PMU
event counters (PMCR_EL0.N) that userspace set.
The test is run with each of the PMCR_EL0.N values from 0 to 31 (for
values greater than the host's, the test expects KVM_SET_ONE_REG for
PMCR_EL0 to fail).

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-10-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:31 +00:00
Raghavendra Rao Ananta
9f4b3273df tools: Import arm_pmuv3.h
Import the kernel's include/linux/perf/arm_pmuv3.h, with the definition
of PMEVN_SWITCH() extended to include an assert() for the 'default'
case. The following patches will use macros defined in this header.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-9-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Reiji Watanabe
ea9ca904d2 KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM does not yet support userspace modifying PMCR_EL0.N (with
the previous patch, KVM ignores what is written by userspace).
Add support for userspace to limit PMCR_EL0.N.

Disallow userspace from setting PMCR_EL0.N to a value greater
than the host value, as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns success for these two cases.

Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-8-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
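
From the VMM side the knob is PMCR_EL0 accessed through KVM_{GET,SET}_ONE_REG;
a hedged sketch (PMCR_EL0_ID is a local convenience macro; the sysreg encoding
and the N field position, bits [15:11], come from the architecture; error
handling is minimal):

  #include <err.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* PMCR_EL0 is op0=3, op1=3, CRn=9, CRm=12, op2=0. */
  #define PMCR_EL0_ID     ARM64_SYS_REG(3, 3, 9, 12, 0)

  /*
   * Sketch: clamp the number of counters advertised to the guest. Must be
   * done after KVM_ARM_VCPU_INIT and before the vCPU first runs.
   */
  static void limit_pmcr_n(int vcpu_fd, uint64_t n)
  {
          uint64_t pmcr;
          struct kvm_one_reg reg = {
                  .id   = PMCR_EL0_ID,
                  .addr = (uint64_t)(uintptr_t)&pmcr,
          };

          if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
                  err(1, "KVM_GET_ONE_REG(PMCR_EL0)");

          pmcr &= ~(0x1fULL << 11);       /* clear PMCR_EL0.N */
          pmcr |= (n & 0x1f) << 11;       /* request N counters */

          if (ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg))
                  err(1, "KVM_SET_ONE_REG(PMCR_EL0)");
  }

The rules described above still apply to such a write: N cannot exceed the
host's value, and the register becomes immutable once the VM has run.
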
Raghavendra Rao Ananta
27131b199f KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
Hence, to keep KVM's PMU emulation correct, mask out the RES0 bits.
Defer this work to the point that userspace can no longer change the
number of advertised PMCs.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-7-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Raghavendra Rao Ananta
a45f41d754 KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
For unimplemented counters, the bits in PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} registers are expected to be RAZ. To honor this,
explicitly implement the {get,set}_user functions for these
registers to mask out unimplemented counters for userspace reads
and writes.

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-6-rananta@google.com
[Oliver: drop unnecessary locking]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
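
The masking reduces to building a bitmap of implemented counters from
PMCR_EL0.N; a minimal sketch (helper name hypothetical, BIT()/GENMASK() as
in the kernel):

  #include <linux/bits.h>

  /*
   * Bits set for counters the guest actually has: event counters 0..N-1
   * plus the always-present cycle counter (bit 31). Everything else in
   * PM{C,I}NTEN{SET,CLR}/PMOVS{SET,CLR} reads as zero to userspace.
   */
  static u64 implemented_counter_mask(u8 pmcr_n)
  {
          u64 mask = BIT(31);

          if (pmcr_n)
                  mask |= GENMASK(pmcr_n - 1, 0);

          return mask;
  }
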
Raghavendra Rao Ananta
4d20debf9c KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, the value is set to the same
value as the current PE on every vCPU reset. Unless the vCPU is
pinned to PEs that have the PMU associated with the guest from the
initial vCPU reset, the value might be different from the PMU's
PMCR_EL0.N on heterogeneous PMU systems.

Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
value. Track the PMCR_EL0.N per guest, as only one PMU can be set
for the guest (PMCR_EL0.N must be the same for all vCPUs of the
guest), and it is convenient for updating the value.

To achieve this, the patch introduces a helper,
kvm_arm_pmu_get_max_counters(), that reads the maximum number of
counters from the arm_pmu associated with the VM. Make the function
global, as upcoming patches will need the value when setting the
guest's PMCR_EL0.N from userspace.

KVM does not yet support userspace modifying PMCR_EL0.N.
The following patch will add support for that.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-5-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Reiji Watanabe
57fc267f1b KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
reads a vCPU's PMCR_EL0.

Currently, the PMCR_EL0 value is tracked per vCPU. The following
patches will make (only) PMCR_EL0.N tracked per guest. Having the
new helper will be useful to combine the PMCR_EL0.N field
(tracked per guest) and the other fields (tracked per vCPU)
to provide the value of PMCR_EL0.

No functional change intended.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-4-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:29 +00:00
Reiji Watanabe
4277335797 KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
Future changes to KVM's sysreg emulation will rely on having a valid PMU
instance to determine the number of implemented counters (PMCR_EL0.N).
This is earlier than when userspace is expected to modify the vPMU
device attributes, where the default is selected today.

Select the default PMU when handling KVM_ARM_VCPU_INIT such that it is
available in time for sysreg emulation.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-3-rananta@google.com
[Oliver: rewrite changelog]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:29 +00:00
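
For context, the default PMU only matters once userspace asks for the PMUv3
feature at vCPU init; a minimal userspace sketch using the standard arm64 KVM
UAPI (not code from this patch):

  #include <err.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /*
   * Sketch: create a vCPU with the PMUv3 feature so that KVM selects the
   * default PMU at KVM_ARM_VCPU_INIT time (per the commit above).
   */
  static void vcpu_init_with_pmu(int vm_fd, int vcpu_fd)
  {
          struct kvm_vcpu_init init;

          memset(&init, 0, sizeof(init));
          if (ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init))
                  err(1, "KVM_ARM_PREFERRED_TARGET");

          init.features[0] |= 1U << KVM_ARM_VCPU_PMU_V3;

          if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init))
                  err(1, "KVM_ARM_VCPU_INIT");
  }
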
Oliver Upton
ae8d3522e5 KVM: arm64: Add PMU event filter bits required if EL3 is implemented
Suzuki noticed that KVM's PMU emulation is oblivious to the NSU and NSK
event filter bits. On systems that have EL3 these bits modify the
filter behavior in non-secure EL0 and EL1, respectively. Even though the
kernel doesn't use these bits, it is entirely possible some other guest
OS does. Additionally, it would appear that these and the M bit are
required by the architecture if EL3 is implemented.

Allow the EL3 event filter bits to be set if EL3 is advertised in the
guest's ID register. Implement the behavior of NSU and NSK according to
the pseudocode, and entirely ignore the M bit for perf event creation.

Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231019185618.3442949-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 19:26:14 +00:00
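
For reference, the PMEVTYPER<n>_EL0 filter bits involved, with positions per
the architecture (the macro names below are illustrative, not necessarily
those used in the patch):

  /* PMEVTYPER<n>_EL0 event filtering control bits. */
  #define PMEVTYPER_P     (1U << 31)   /* EL1 filtering: don't count at EL1    */
  #define PMEVTYPER_U     (1U << 30)   /* EL0 filtering: don't count at EL0    */
  #define PMEVTYPER_NSK   (1U << 29)   /* NS EL1 counted iff NSK == P (EL3)    */
  #define PMEVTYPER_NSU   (1U << 28)   /* NS EL0 counted iff NSU == U (EL3)    */
  #define PMEVTYPER_NSH   (1U << 27)   /* count at EL2 (opposite polarity)     */
  #define PMEVTYPER_M     (1U << 26)   /* Secure EL3 counted iff M == P (EL3)  */
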
Oliver Upton
bc512d6a9b KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertised
The NSH bit, which filters event counting at EL2, is required by the
architecture if an implementation has EL2. Even though KVM doesn't
support nested virt yet, it makes no effort to hide the existence of EL2
from the ID registers. Userspace can, however, change the value of PFR0
to hide EL2. Align KVM's sysreg emulation with the architecture and make
NSH RES0 if EL2 isn't advertised. Keep in mind the bit is ignored when
constructing the backing perf event.

While at it, build the event type mask using explicit field definitions
instead of relying on ARMV8_PMU_EVTYPE_MASK. KVM probably should've been
doing this in the first place, as it avoids changes to the
aforementioned mask affecting sysreg emulation.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231019185618.3442949-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 19:26:14 +00:00
Reiji Watanabe
1616ca6f3c KVM: arm64: PMU: Introduce helpers to set the guest's PMU
Introduce new helper functions to set the guest's PMU
(kvm->arch.arm_pmu) either to a default probed instance or to a
caller-requested one, and use them when the guest's PMU needs to
be set. These helpers will make it easier for the following
patches to modify the relevant code.

No functional change intended.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-2-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 19:07:33 +00:00
Marc Zyngier
fe49fd940e KVM: arm64: Move VTCR_EL2 into struct s2_mmu
We currently have a global VTCR_EL2 value for each guest, even
if the guest uses NV. This implies that the guest's own S2 must
fit in the host's. This is odd, for multiple reasons:

- the PARange values and the number of IPA bits don't necessarily
  match: you can have 33 bits of IPA space, and yet you can only
  describe 32 or 36 bits of PARange

- When userspace sets the IPA space, it creates a contract with the
  kernel saying "this is the IPA space I'm prepared to handle".
  At no point does it constrain the guest's own IPA space as
  long as the guest doesn't try to use an [I]PA outside of the
  IPA space set by userspace

- We don't even try to hide the value of ID_AA64MMFR0_EL1.PARange.

And then there is the consequence of the above: if a guest tries
to create an S2 whose input address range is larger than the IPA
space defined by the host, we inject a fatal exception.

This is no good. For all intents and purposes, a guest should be
able to have the S2 it really wants, as long as the *output* address
of that S2 isn't outside of the IPA space.

For that, we need to have a per-s2_mmu VTCR_EL2 setting, which
allows us to represent the full PARange. Move the vtcr field into
the s2_mmu structure, which has no impact whatsoever, except for NV.

Note that once we are able to override ID_AA64MMFR0_EL1.PARange
from userspace, we'll also be able to restrict the size of the
shadow S2 that NV uses.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231012205108.3937270-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-23 18:48:46 +00:00
Oliver Upton
934bf871f0 KVM: arm64: Load the stage-2 MMU context in kvm_vcpu_load_vhe()
To date the VHE code has aggressively reloaded the stage-2 MMU context
on every guest entry, despite the fact that this isn't necessary. This
was probably done for consistency with the nVHE code, which needs to
switch in/out the stage-2 MMU context as both the host and guest run at
EL1.

Hoist __load_stage2() into kvm_vcpu_load_vhe(), thus avoiding a reload
on every guest entry/exit. This is likely to be beneficial to systems
with one of the speculative AT errata, as there is now one fewer context
synchronization event on the guest entry path. Additionally, it is
possible that implementations have hitched correctness mitigations on
writes to VTTBR_EL2, which are now elided on guest re-entry.

Note that __tlb_switch_to_guest() is deliberately left untouched as it
can be called outside the context of a running vCPU.

Link: https://lore.kernel.org/r/20231018233212.2888027-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20 17:52:01 +00:00
Oliver Upton
27cde4c0fe KVM: arm64: Rename helpers for VHE vCPU load/put
The names for the helpers we expose to the 'generic' KVM code are a bit
imprecise; we switch the EL0 + EL1 sysreg context and set up trap
controls that do not need to change for every guest entry/exit. Rename +
shuffle things around a bit in preparation for loading the stage-2 MMU
context on vcpu_load().

Link: https://lore.kernel.org/r/20231018233212.2888027-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20 17:52:01 +00:00
Marc Zyngier
5eba523e1e KVM: arm64: Reload stage-2 for VMID change on VHE
Naturally, a change to the VMID for an MMU implies a new value for
VTTBR. Reload on VMID change in anticipation of loading stage-2 on
vcpu_load() instead of every guest entry.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231018233212.2888027-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20 17:52:01 +00:00
Marc Zyngier
4288ff7ba1 KVM: arm64: Restore the stage-2 context in VHE's __tlb_switch_to_host()
An MMU notifier could cause us to clobber the stage-2 context loaded on
a CPU when we switch to another VM's context to invalidate. This isn't
an issue right now as the stage-2 context gets reloaded on every guest
entry, but is disastrous when moving __load_stage2() into the
vcpu_load() path.

Restore the previous stage-2 context on the way out of a TLB
invalidation if we installed something else. Deliberately do this after
TGE=1 is synchronized to keep things safe in light of the speculative AT
errata.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231018233212.2888027-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20 17:52:01 +00:00
Oliver Upton
38ce26bf26 KVM: arm64: Don't zero VTTBR in __tlb_switch_to_host()
HCR_EL2.TGE=1 is sufficient to disable stage-2 translation, so there's
no need to explicitly zero VTTBR_EL2.

Link: https://lore.kernel.org/r/20231018233212.2888027-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-20 17:52:00 +00:00
Jing Zhang
54a9ea7352 KVM: arm64: selftests: Test for setting ID register from userspace
Add tests to verify that setting ID registers from userspace is handled
correctly by KVM. Also add a test case that uses the
KVM_ARM_GET_REG_WRITABLE_MASKS ioctl to get the writable masks.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:37:33 +00:00
Jing Zhang
0359c946b1 tools headers arm64: Update sysreg.h with kernel sources
The users of sysreg.h (perf, KVM selftests) are now generating the
necessary sysreg-defs.h; sync sysreg.h with the kernel sources and
fix the KVM selftests that use macros which suffered a rename.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Oliver Upton
9697d84cc3 KVM: selftests: Generate sysreg-defs.h and add to include path
Start generating sysreg-defs.h for arm64 builds in anticipation of
updating sysreg.h to a version that depends on it.

Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Oliver Upton
e2bdd172e6 perf build: Generate arm64's sysreg-defs.h and add to include path
Start generating sysreg-defs.h in anticipation of updating sysreg.h to a
version that needs the generated output.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Oliver Upton
02e85f7466 tools: arm64: Add a Makefile for generating sysreg-defs.h
Use a common Makefile for generating sysreg-defs.h, which will soon be
needed by perf and KVM selftests. The naming scheme of the generated
macros is not expected to change, so just refer to the canonical
script/data in the kernel source rather than copying to tools.

Co-developed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231011195740.3349631-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-18 23:36:25 +00:00
Kristina Martsenko
e0bb80c62c KVM: arm64: Expose MOPS instructions to guests
Expose the Armv8.8 FEAT_MOPS feature to guests in the ID register and
allow the MOPS instructions to be run in a guest. Only expose MOPS if
the whole system supports it.

Note that guests are expected not to use these instructions on MMIO,
similarly to other instructions that report ESR_EL2.ISV==0, such as
LDP/STP.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922112508.1774352-3-kristina.martsenko@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-09 19:54:25 +00:00
Kristina Martsenko
2de451a329 KVM: arm64: Add handler for MOPS exceptions
An Armv8.8 FEAT_MOPS main or epilogue instruction will take an exception
if executed on a CPU with a different MOPS implementation option (A or
B) than the CPU where the preceding prologue instruction ran. In this
case the OS exception handler is expected to reset the registers and
restart execution from the prologue instruction.

A KVM guest may use the instructions at EL1 at times when the guest is
not able to handle the exception, expecting that the instructions will
only run on one CPU (e.g. when running UEFI boot services in the guest).
As KVM may reschedule the guest between different types of CPUs at any
time (on an asymmetric system), it needs to also handle the resulting
exception itself in case the guest is not able to. A similar situation
will also occur in the future when live migrating a guest from one type
of CPU to another.

Add handling for the MOPS exception to KVM. The handling can be shared
with the EL0 exception handler, as the logic and register layouts are
the same. The exception can be handled right after exiting a guest,
which avoids the cost of returning to the host exit handler.

Similarly to the EL0 exception handler, in case the main or epilogue
instruction is being single stepped, it makes sense to finish the step
before executing the prologue instruction, so advance the single step
state machine.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922112508.1774352-2-kristina.martsenko@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-09 19:54:25 +00:00
Oliver Upton
4202bcac5e KVM: arm64: Use mtree_empty() to determine if SMCCC filter configured
The smccc_filter maple tree is only populated if userspace attempted to
configure it. Use the state of the maple tree to determine if the filter
has been configured, eliminating the VM flag.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231004234947.207507-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-05 09:33:15 +00:00
Oliver Upton
d34b76489e KVM: arm64: Only insert reserved ranges when SMCCC filter is used
The reserved ranges are only useful for preventing userspace from
adding a rule that intersects with functions we must handle in KVM. If
userspace never writes to the SMCCC filter then this is all just wasted
work/memory.

Insert reserved ranges on the first call to KVM_ARM_VM_SMCCC_FILTER.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231004234947.207507-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-05 09:33:15 +00:00
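
Userspace opts into the filter through the VM device-attribute interface; a
hedged sketch (struct kvm_smccc_filter and the KVM_ARM_VM_SMCCC_* names are
assumed to match the existing SMCCC filter UAPI, so verify against
<linux/kvm.h>):

  #include <err.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /*
   * Sketch: forward a range of SMCCC function IDs to userspace. The first
   * such call is what now triggers insertion of KVM's reserved ranges.
   */
  static void smccc_filter_fwd_range(int vm_fd, uint32_t base, uint32_t nr)
  {
          struct kvm_smccc_filter filter = {
                  .base         = base,
                  .nr_functions = nr,
                  .action       = KVM_SMCCC_FILTER_FWD_TO_USER,
          };
          struct kvm_device_attr attr = {
                  .group = KVM_ARM_VM_SMCCC_CTRL,
                  .attr  = KVM_ARM_VM_SMCCC_FILTER,
                  .addr  = (uint64_t)(uintptr_t)&filter,
          };

          if (ioctl(vm_fd, KVM_SET_DEVICE_ATTR, &attr))
                  err(1, "KVM_ARM_VM_SMCCC_FILTER");
  }
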