- Ensure perf events programmed to count during guest execution
   are actually enabled before entering the guest in the nVHE
   configuration.
 
 - Restore out-of-range handler for stage-2 translation faults.
 
 - Several fixes to stage-2 TLB invalidations to avoid stale
   translations, possibly including partial walk caches.
 
 - Fix early handling of architectural VHE-only systems to ensure E2H is
   appropriately set.
 
 - Correct a format specifier warning in the arch_timer selftest.
 
 - Make the KVM banner message correctly handle all of the possible
   configurations.
 
 RISC-V:
 
 - Remove redundant semicolon in num_isa_ext_regs().
 
 - Fix APLIC setipnum_le/be write emulation.
 
 - Fix APLIC in_clrip[x] read emulation.
 
 x86:
 
 - Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID entries (old
   vs. new) and ultimately neglects to clear PV_UNHALT from vCPUs with HLT-exiting
   disabled.
 
 - Documentation fixes for SEV.
 
 - Fix compat ABI for KVM_MEMORY_ENCRYPT_OP.
 
 - Fix a 14-year-old goof in a declaration shared by host and guest; the enabled
   field used by Linux when running as a guest pushes the size of "struct
   kvm_vcpu_pv_apf_data" from 64 to 68 bytes.  This is really inconsequential
   because KVM never consumes anything beyond the first 64 bytes, but the
   resulting struct does not match the documentation.
 
 Selftests:
 
 - Fix spelling mistake in arch_timer selftest.

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "ARM:

   - Ensure perf events programmed to count during guest execution are
     actually enabled before entering the guest in the nVHE
     configuration

   - Restore out-of-range handler for stage-2 translation faults

   - Several fixes to stage-2 TLB invalidations to avoid stale
     translations, possibly including partial walk caches

   - Fix early handling of architectural VHE-only systems to ensure E2H
     is appropriately set

   - Correct a format specifier warning in the arch_timer selftest

   - Make the KVM banner message correctly handle all of the possible
     configurations

  RISC-V:

   - Remove redundant semicolon in num_isa_ext_regs()

   - Fix APLIC setipnum_le/be write emulation

   - Fix APLIC in_clrip[x] read emulation

  x86:

   - Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID
     entries (old vs. new) and ultimately neglects to clear PV_UNHALT
     from vCPUs with HLT-exiting disabled

   - Documentation fixes for SEV

   - Fix compat ABI for KVM_MEMORY_ENCRYPT_OP

   - Fix a 14-year-old goof in a declaration shared by host and guest;
     the enabled field used by Linux when running as a guest pushes the
     size of "struct kvm_vcpu_pv_apf_data" from 64 to 68 bytes. This is
     really inconsequential because KVM never consumes anything beyond
     the first 64 bytes, but the resulting struct does not match the
     documentation

  Selftests:

   - Fix spelling mistake in arch_timer selftest"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: arm64: Rationalise KVM banner output
  arm64: Fix early handling of FEAT_E2H0 not being implemented
  KVM: arm64: Ensure target address is granule-aligned for range TLBI
  KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
  KVM: arm64: Don't pass a TLBI level hint when zapping table entries
  KVM: arm64: Don't defer TLB invalidation when zapping table entries
  KVM: selftests: Fix __GUEST_ASSERT() format warnings in ARM's arch timer test
  KVM: arm64: Fix out-of-IPA space translation fault handling
  KVM: arm64: Fix host-programmed guest events in nVHE
  RISC-V: KVM: Fix APLIC in_clrip[x] read emulation
  RISC-V: KVM: Fix APLIC setipnum_le/be write emulation
  RISC-V: KVM: Remove second semicolon
  KVM: selftests: Fix spelling mistake "trigged" -> "triggered"
  Documentation: kvm/sev: clarify usage of KVM_MEMORY_ENCRYPT_OP
  Documentation: kvm/sev: separate description of firmware
  KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
  KVM: selftests: Check that PV_UNHALT is cleared when HLT exiting is disabled
  KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
  KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
  KVM: SVM: Return -EINVAL instead of -EBUSY on attempt to re-init SEV/SEV-ES
  ...
Linus Torvalds 2024-04-03 10:26:37 -07:00
commit 0f099dc9d1
21 changed files with 255 additions and 123 deletions

@ -46,21 +46,16 @@ SEV hardware uses ASIDs to associate a memory encryption key with a VM.
Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
defined in the CPUID 0x8000001f[ecx] field.
SEV Key Management
==================
The KVM_MEMORY_ENCRYPT_OP ioctl
===============================
The SEV guest key management is handled by a separate processor called the AMD
Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
key management interface to perform common hypervisor activities such as
encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
information, see the SEV Key Management spec [api-spec]_
The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP. If the argument
to KVM_MEMORY_ENCRYPT_OP is NULL, the ioctl returns 0 if SEV is enabled
and ``ENOTTY`` if it is disabled (on some older versions of Linux,
the ioctl runs normally even with a NULL argument, and therefore will
likely return ``EFAULT``). If non-NULL, the argument to KVM_MEMORY_ENCRYPT_OP
must be a struct kvm_sev_cmd::
The main ioctl to access SEV is KVM_MEMORY_ENCRYPT_OP, which operates on
the VM file descriptor. If the argument to KVM_MEMORY_ENCRYPT_OP is NULL,
the ioctl returns 0 if SEV is enabled and ``ENOTTY`` if it is disabled
(on some older versions of Linux, the ioctl tries to run normally even
with a NULL argument, and therefore will likely return ``EFAULT`` instead
of zero if SEV is enabled). If non-NULL, the argument to
KVM_MEMORY_ENCRYPT_OP must be a struct kvm_sev_cmd::
struct kvm_sev_cmd {
__u32 id;
@ -87,10 +82,6 @@ guests, such as launching, running, snapshotting, migrating and decommissioning.
The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
context. In a typical workflow, this command should be the first command issued.
The firmware can be initialized either by using its own non-volatile storage or
the OS can manage the NV storage for the firmware using the module parameter
``init_ex_path``. If the file specified by ``init_ex_path`` does not exist or
is invalid, the OS will create or override the file with output from PSP.
Returns: 0 on success, -negative on error
@ -434,6 +425,21 @@ issued by the hypervisor to make the guest ready for execution.
Returns: 0 on success, -negative on error
Firmware Management
===================
The SEV guest key management is handled by a separate processor called the AMD
Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
key management interface to perform common hypervisor activities such as
encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
information, see the SEV Key Management spec [api-spec]_
The AMD-SP firmware can be initialized either by using its own non-volatile
storage or the OS can manage the NV storage for the firmware using
parameter ``init_ex_path`` of the ``ccp`` module. If the file specified
by ``init_ex_path`` does not exist or is invalid, the OS will create or
override the file with PSP non-volatile storage.
References
==========

@ -193,8 +193,8 @@ data:
Asynchronous page fault (APF) control MSR.
Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
which must be in guest RAM and must be zeroed. This memory is expected
to hold a copy of the following structure::
which must be in guest RAM. This memory is expected to hold the
following structure::
struct kvm_vcpu_pv_apf_data {
/* Used for 'page not present' events delivered via #PF */
@ -204,7 +204,6 @@ data:
__u32 token;
__u8 pad[56];
__u32 enabled;
};
Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
@ -232,14 +231,14 @@ data:
as regular page fault, guest must reset 'flags' to '0' before it does
something that can generate normal page fault.
Bytes 5-7 of 64 byte memory location ('token') will be written to by the
Bytes 4-7 of 64 byte memory location ('token') will be written to by the
hypervisor at the time of APF 'page ready' event injection. The content
of these bytes is a token which was previously delivered as 'page not
present' event. The event indicates the page in now available. Guest is
supposed to write '0' to 'token' when it is done handling 'page ready'
event and to write 1' to MSR_KVM_ASYNC_PF_ACK after clearing the location;
writing to the MSR forces KVM to re-scan its queue and deliver the next
pending notification.
of these bytes is a token which was previously delivered in CR2 as
'page not present' event. The event indicates the page is now available.
Guest is supposed to write '0' to 'token' when it is done handling
'page ready' event and to write '1' to MSR_KVM_ASYNC_PF_ACK after
clearing the location; writing to the MSR forces KVM to re-scan its
queue and deliver the next pending notification.
Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
ready' APF delivery needs to be written to before enabling APF mechanism

@ -291,6 +291,21 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
blr x2
0:
mov_q x0, HCR_HOST_NVHE_FLAGS
/*
* Compliant CPUs advertise their VHE-onlyness with
* ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
* RES1 in that case. Publish the E2H bit early so that
* it can be picked up by the init_el2_state macro.
*
* Fruity CPUs seem to have HCR_EL2.E2H set to RAO/WI, but
* don't advertise it (they predate this relaxation).
*/
mrs_s x1, SYS_ID_AA64MMFR4_EL1
tbz x1, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
orr x0, x0, #HCR_E2H
1:
msr hcr_el2, x0
isb
@ -303,22 +318,10 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
mov_q x1, INIT_SCTLR_EL1_MMU_OFF
/*
* Compliant CPUs advertise their VHE-onlyness with
* ID_AA64MMFR4_EL1.E2H0 < 0. HCR_EL2.E2H can be
* RES1 in that case.
*
* Fruity CPUs seem to have HCR_EL2.E2H set to RES1, but
* don't advertise it (they predate this relaxation).
*/
mrs_s x0, SYS_ID_AA64MMFR4_EL1
ubfx x0, x0, #ID_AA64MMFR4_EL1_E2H0_SHIFT, #ID_AA64MMFR4_EL1_E2H0_WIDTH
tbnz x0, #(ID_AA64MMFR4_EL1_E2H0_SHIFT + ID_AA64MMFR4_EL1_E2H0_WIDTH - 1), 1f
mrs x0, hcr_el2
and x0, x0, #HCR_E2H
cbz x0, 2f
1:
/* Set a sane SCTLR_EL1, the VHE way */
pre_disable_mmu_workaround
msr_s SYS_SCTLR_EL12, x1
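
The reordered hunk above decides the E2H bit before the first write to HCR_EL2. Because ID_AA64MMFR4_EL1.E2H0 is a signed field, "E2H0 < 0" reduces to testing the field's top bit in the raw register value, which is what the new `tbz` does. The removed sequence first extracted the field with `ubfx` and then tested the unshifted bit position, so that test appears unable to ever fire. A rough C rendering of the new decision follows; the field position (bits 27:24) and the helper names are assumptions made for illustration, not copied from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field position, for illustration only. */
#define E2H0_SHIFT 24
#define E2H0_WIDTH 4

/* A signed ID field is negative exactly when its top bit is set. */
static bool e2h0_is_negative(uint64_t mmfr4)
{
	return (mmfr4 >> (E2H0_SHIFT + E2H0_WIDTH - 1)) & 1;
}

/*
 * Hypothetical mirror of the new flow: start from the nVHE flags and
 * OR in the E2H bit only when the CPU advertises itself as VHE-only.
 */
static uint64_t initial_hcr(uint64_t nvhe_flags, uint64_t e2h_bit,
			    uint64_t mmfr4)
{
	return e2h0_is_negative(mmfr4) ? (nvhe_flags | e2h_bit) : nvhe_flags;
}
```

Passing the E2H bit in as a parameter keeps the sketch independent of the real HCR_EL2 layout.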

@ -2597,14 +2597,11 @@ static __init int kvm_arm_init(void)
if (err)
goto out_hyp;
if (is_protected_kvm_enabled()) {
kvm_info("Protected nVHE mode initialized successfully\n");
} else if (in_hyp_mode) {
kvm_info("VHE mode initialized successfully\n");
} else {
char mode = cpus_have_final_cap(ARM64_KVM_HVHE) ? 'h' : 'n';
kvm_info("Hyp mode (%cVHE) initialized successfully\n", mode);
}
kvm_info("%s%sVHE mode initialized successfully\n",
in_hyp_mode ? "" : (is_protected_kvm_enabled() ?
"Protected " : "Hyp "),
in_hyp_mode ? "" : (cpus_have_final_cap(ARM64_KVM_HVHE) ?
"h" : "n"));
/*
* FIXME: Do something reasonable if kvm_init() fails after pKVM
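
The nested conditionals in the new `kvm_info()` call are easier to check when enumerated. The sketch below rebuilds the same format string in user space; `kvm_banner()` and its boolean parameters are hypothetical stand-ins for `in_hyp_mode`, `is_protected_kvm_enabled()` and `cpus_have_final_cap(ARM64_KVM_HVHE)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Compose the banner the same way the new kvm_info() call does. */
static void kvm_banner(char *buf, size_t len, bool in_hyp_mode,
		       bool protected_mode, bool hvhe)
{
	snprintf(buf, len, "%s%sVHE mode initialized successfully",
		 in_hyp_mode ? "" : (protected_mode ? "Protected " : "Hyp "),
		 in_hyp_mode ? "" : (hvhe ? "h" : "n"));
}
```

With these inputs the possible banners are "VHE", "Hyp nVHE", "Hyp hVHE", "Protected nVHE" and "Protected hVHE", each followed by " mode initialized successfully". The removed code could not print the protected hVHE combination.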

@ -154,7 +154,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
/* Switch to requested VMID */
__tlb_switch_to_guest(mmu, &cxt, false);
__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
TLBI_TTL_UNKNOWN);
dsb(ish);
__tlbi(vmalle1is);

@ -528,7 +528,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
kvm_clear_pte(ctx->ptep);
dsb(ishst);
__tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), ctx->level);
__tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), TLBI_TTL_UNKNOWN);
} else {
if (ctx->end - ctx->addr < granule)
return -EINVAL;
@ -843,12 +843,15 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
* Perform the appropriate TLB invalidation based on the
* evicted pte value (if any).
*/
if (kvm_pte_table(ctx->old, ctx->level))
kvm_tlb_flush_vmid_range(mmu, ctx->addr,
kvm_granule_size(ctx->level));
else if (kvm_pte_valid(ctx->old))
if (kvm_pte_table(ctx->old, ctx->level)) {
u64 size = kvm_granule_size(ctx->level);
u64 addr = ALIGN_DOWN(ctx->addr, size);
kvm_tlb_flush_vmid_range(mmu, addr, size);
} else if (kvm_pte_valid(ctx->old)) {
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
ctx->addr, ctx->level);
}
}
if (stage2_pte_is_counted(ctx->old))
@ -896,9 +899,13 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
if (kvm_pte_valid(ctx->old)) {
kvm_clear_pte(ctx->ptep);
if (!stage2_unmap_defer_tlb_flush(pgt))
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
ctx->addr, ctx->level);
if (kvm_pte_table(ctx->old, ctx->level)) {
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
TLBI_TTL_UNKNOWN);
} else if (!stage2_unmap_defer_tlb_flush(pgt)) {
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr,
ctx->level);
}
}
mm_ops->put_page(ctx->ptep);
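
The `stage2_try_break_pte()` change above rounds the walker address down to the granule covered by the evicted table entry before issuing the range invalidation. A minimal sketch of that rounding, assuming a power-of-two granule; `align_down()` is a hypothetical user-space stand-in for the kernel's `ALIGN_DOWN()` macro:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of size; size must be a power of two. */
static uint64_t align_down(uint64_t addr, uint64_t size)
{
	return addr & ~(size - 1);
}
```

For example, with a 2 MiB granule, an address of 0x40201000 rounds down to 0x40200000, so the invalidated range starts at the beginning of the region the table entry used to map rather than part-way through it.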

@ -171,7 +171,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
/* Switch to requested VMID */
__tlb_switch_to_guest(mmu, &cxt);
__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
TLBI_TTL_UNKNOWN);
dsb(ish);
__tlbi(vmalle1is);

@ -1637,7 +1637,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
if (esr_fsc_is_permission_fault(esr)) {
if (esr_fsc_is_translation_fault(esr)) {
/* Beyond sanitised PARange (which is the IPA limit) */
if (fault_ipa >= BIT_ULL(get_kvm_ipa_limit())) {
kvm_inject_size_fault(vcpu);

@ -137,11 +137,21 @@ static void aplic_write_pending(struct aplic *aplic, u32 irq, bool pending)
raw_spin_lock_irqsave(&irqd->lock, flags);
sm = irqd->sourcecfg & APLIC_SOURCECFG_SM_MASK;
if (!pending &&
((sm == APLIC_SOURCECFG_SM_LEVEL_HIGH) ||
(sm == APLIC_SOURCECFG_SM_LEVEL_LOW)))
if (sm == APLIC_SOURCECFG_SM_INACTIVE)
goto skip_write_pending;
if (sm == APLIC_SOURCECFG_SM_LEVEL_HIGH ||
sm == APLIC_SOURCECFG_SM_LEVEL_LOW) {
if (!pending)
goto skip_write_pending;
if ((irqd->state & APLIC_IRQ_STATE_INPUT) &&
sm == APLIC_SOURCECFG_SM_LEVEL_LOW)
goto skip_write_pending;
if (!(irqd->state & APLIC_IRQ_STATE_INPUT) &&
sm == APLIC_SOURCECFG_SM_LEVEL_HIGH)
goto skip_write_pending;
}
if (pending)
irqd->state |= APLIC_IRQ_STATE_PENDING;
else
@ -187,16 +197,31 @@ static void aplic_write_enabled(struct aplic *aplic, u32 irq, bool enabled)
static bool aplic_read_input(struct aplic *aplic, u32 irq)
{
bool ret;
unsigned long flags;
u32 sourcecfg, sm, raw_input, irq_inverted;
struct aplic_irq *irqd;
unsigned long flags;
bool ret = false;
if (!irq || aplic->nr_irqs <= irq)
return false;
irqd = &aplic->irqs[irq];
raw_spin_lock_irqsave(&irqd->lock, flags);
ret = (irqd->state & APLIC_IRQ_STATE_INPUT) ? true : false;
sourcecfg = irqd->sourcecfg;
if (sourcecfg & APLIC_SOURCECFG_D)
goto skip;
sm = sourcecfg & APLIC_SOURCECFG_SM_MASK;
if (sm == APLIC_SOURCECFG_SM_INACTIVE)
goto skip;
raw_input = (irqd->state & APLIC_IRQ_STATE_INPUT) ? 1 : 0;
irq_inverted = (sm == APLIC_SOURCECFG_SM_LEVEL_LOW ||
sm == APLIC_SOURCECFG_SM_EDGE_FALL) ? 1 : 0;
ret = !!(raw_input ^ irq_inverted);
skip:
raw_spin_unlock_irqrestore(&irqd->lock, flags);
return ret;
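
The new `aplic_read_input()` body returns the rectified input: the raw wire state XORed with an inversion flag that is set for active-low and falling-edge sources, and zero for delegated or inactive sources. A compact sketch of that rule follows; the enum and function names are hypothetical and only mirror the source modes named in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the APLIC source modes used in the hunk. */
enum source_mode {
	SM_INACTIVE,
	SM_EDGE_RISE,
	SM_EDGE_FALL,
	SM_LEVEL_HIGH,
	SM_LEVEL_LOW,
};

/* Rectified input: raw state, inverted for active-low style sources. */
static bool rectified_input(enum source_mode sm, bool delegated,
			    bool raw_input)
{
	bool inverted;

	/* Delegated or inactive sources always read back as zero. */
	if (delegated || sm == SM_INACTIVE)
		return false;

	inverted = (sm == SM_LEVEL_LOW || sm == SM_EDGE_FALL);
	return raw_input != inverted;
}
```

Under this rule a level-low source with its wire held low reads as asserted, which the removed code (returning the raw input bit) would have reported as deasserted.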

@ -986,7 +986,7 @@ static int copy_isa_ext_reg_indices(const struct kvm_vcpu *vcpu,
static inline unsigned long num_isa_ext_regs(const struct kvm_vcpu *vcpu)
{
return copy_isa_ext_reg_indices(vcpu, NULL);;
return copy_isa_ext_reg_indices(vcpu, NULL);
}
static int copy_sbi_ext_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)

@ -694,6 +694,7 @@ enum sev_cmd_id {
struct kvm_sev_cmd {
__u32 id;
__u32 pad0;
__u64 data;
__u32 error;
__u32 sev_fd;
@ -704,28 +705,35 @@ struct kvm_sev_launch_start {
__u32 policy;
__u64 dh_uaddr;
__u32 dh_len;
__u32 pad0;
__u64 session_uaddr;
__u32 session_len;
__u32 pad1;
};
struct kvm_sev_launch_update_data {
__u64 uaddr;
__u32 len;
__u32 pad0;
};
struct kvm_sev_launch_secret {
__u64 hdr_uaddr;
__u32 hdr_len;
__u32 pad0;
__u64 guest_uaddr;
__u32 guest_len;
__u32 pad1;
__u64 trans_uaddr;
__u32 trans_len;
__u32 pad2;
};
struct kvm_sev_launch_measure {
__u64 uaddr;
__u32 len;
__u32 pad0;
};
struct kvm_sev_guest_status {
@ -738,33 +746,43 @@ struct kvm_sev_dbg {
__u64 src_uaddr;
__u64 dst_uaddr;
__u32 len;
__u32 pad0;
};
struct kvm_sev_attestation_report {
__u8 mnonce[16];
__u64 uaddr;
__u32 len;
__u32 pad0;
};
struct kvm_sev_send_start {
__u32 policy;
__u32 pad0;
__u64 pdh_cert_uaddr;
__u32 pdh_cert_len;
__u32 pad1;
__u64 plat_certs_uaddr;
__u32 plat_certs_len;
__u32 pad2;
__u64 amd_certs_uaddr;
__u32 amd_certs_len;
__u32 pad3;
__u64 session_uaddr;
__u32 session_len;
__u32 pad4;
};
struct kvm_sev_send_update_data {
__u64 hdr_uaddr;
__u32 hdr_len;
__u32 pad0;
__u64 guest_uaddr;
__u32 guest_len;
__u32 pad1;
__u64 trans_uaddr;
__u32 trans_len;
__u32 pad2;
};
struct kvm_sev_receive_start {
@ -772,17 +790,22 @@ struct kvm_sev_receive_start {
__u32 policy;
__u64 pdh_uaddr;
__u32 pdh_len;
__u32 pad0;
__u64 session_uaddr;
__u32 session_len;
__u32 pad1;
};
struct kvm_sev_receive_update_data {
__u64 hdr_uaddr;
__u32 hdr_len;
__u32 pad0;
__u64 guest_uaddr;
__u32 guest_len;
__u32 pad1;
__u64 trans_uaddr;
__u32 trans_len;
__u32 pad2;
};
#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0)
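
The `padN` members added above make each structure's layout independent of the ABI. On 32-bit x86, 64-bit members inside a struct are only 4-byte aligned, so without explicit padding a compat process and a 64-bit kernel can disagree about member offsets. The sketch below uses `struct kvm_sev_cmd` as the example; the `sev_cmd_padded` and `sev_cmd_packed` names are hypothetical, and the packed twin exists only to show that the padded layout contains no compiler-inserted holes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same member order as the patched struct kvm_sev_cmd. */
struct sev_cmd_padded {
	uint32_t id;
	uint32_t pad0;
	uint64_t data;
	uint32_t error;
	uint32_t sev_fd;
};

/* Packed twin: any implicit padding above would make the sizes differ. */
struct __attribute__((packed)) sev_cmd_packed {
	uint32_t id;
	uint32_t pad0;
	uint64_t data;
	uint32_t error;
	uint32_t sev_fd;
};

_Static_assert(offsetof(struct sev_cmd_padded, data) == 8,
	       "data sits at offset 8 on every ABI");
_Static_assert(sizeof(struct sev_cmd_padded) == sizeof(struct sev_cmd_packed),
	       "explicit padding leaves no implicit holes");
```

The packed attribute is a GCC/Clang extension; the checks are compile-time only and generate no code.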

@ -142,7 +142,6 @@ struct kvm_vcpu_pv_apf_data {
__u32 token;
__u8 pad[56];
__u32 enabled;
};
#define KVM_PV_EOI_BIT 0
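
The size change described in the merge message can be checked directly. The sketch below models the structure before and after the `enabled` member is dropped; the `apf_data_old` and `apf_data_new` names are hypothetical, and the leading `flags` member is assumed from the documentation text, since the hunk context does not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before the fix: the trailing guest-only member spills past 64 bytes. */
struct apf_data_old {
	uint32_t flags;		/* 'page not present' events via #PF */
	uint32_t token;		/* 'page ready' events via interrupt */
	uint8_t  pad[56];
	uint32_t enabled;	/* consumed only by the Linux guest */
};

/* After the fix: exactly the 64-byte area the documentation describes. */
struct apf_data_new {
	uint32_t flags;
	uint32_t token;
	uint8_t  pad[56];
};

_Static_assert(sizeof(struct apf_data_old) == 68, "old layout is 68 bytes");
_Static_assert(sizeof(struct apf_data_new) == 64, "new layout is 64 bytes");
_Static_assert(offsetof(struct apf_data_new, token) == 4,
	       "token occupies bytes 4-7");
```

The last assertion matches the documentation correction in the same series, which changes "Bytes 5-7" to "Bytes 4-7" for the token.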

@ -65,6 +65,7 @@ static int __init parse_no_stealacc(char *arg)
early_param("no-steal-acc", parse_no_stealacc);
static DEFINE_PER_CPU_READ_MOSTLY(bool, async_pf_enabled);
static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64) __visible;
static int has_steal_clock = 0;
@ -244,7 +245,7 @@ noinstr u32 kvm_read_and_reset_apf_flags(void)
{
u32 flags = 0;
if (__this_cpu_read(apf_reason.enabled)) {
if (__this_cpu_read(async_pf_enabled)) {
flags = __this_cpu_read(apf_reason.flags);
__this_cpu_write(apf_reason.flags, 0);
}
@ -295,7 +296,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
inc_irq_stat(irq_hv_callback_count);
if (__this_cpu_read(apf_reason.enabled)) {
if (__this_cpu_read(async_pf_enabled)) {
token = __this_cpu_read(apf_reason.token);
kvm_async_pf_task_wake(token);
__this_cpu_write(apf_reason.token, 0);
@ -362,7 +363,7 @@ static void kvm_guest_cpu_init(void)
wrmsrl(MSR_KVM_ASYNC_PF_INT, HYPERVISOR_CALLBACK_VECTOR);
wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
__this_cpu_write(async_pf_enabled, true);
pr_debug("setup async PF for cpu %d\n", smp_processor_id());
}
@ -383,11 +384,11 @@ static void kvm_guest_cpu_init(void)
static void kvm_pv_disable_apf(void)
{
if (!__this_cpu_read(apf_reason.enabled))
if (!__this_cpu_read(async_pf_enabled))
return;
wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
__this_cpu_write(apf_reason.enabled, 0);
__this_cpu_write(async_pf_enabled, false);
pr_debug("disable async PF for cpu %d\n", smp_processor_id());
}

@ -189,15 +189,15 @@ static int kvm_cpuid_check_equal(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2
return 0;
}
static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcpu,
const char *sig)
static struct kvm_hypervisor_cpuid __kvm_get_hypervisor_cpuid(struct kvm_cpuid_entry2 *entries,
int nent, const char *sig)
{
struct kvm_hypervisor_cpuid cpuid = {};
struct kvm_cpuid_entry2 *entry;
u32 base;
for_each_possible_hypervisor_cpuid_base(base) {
entry = kvm_find_cpuid_entry(vcpu, base);
entry = cpuid_entry2_find(entries, nent, base, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
if (entry) {
u32 signature[3];
@ -217,22 +217,29 @@ static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcp
return cpuid;
}
static struct kvm_cpuid_entry2 *__kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu,
struct kvm_cpuid_entry2 *entries, int nent)
static struct kvm_hypervisor_cpuid kvm_get_hypervisor_cpuid(struct kvm_vcpu *vcpu,
const char *sig)
{
return __kvm_get_hypervisor_cpuid(vcpu->arch.cpuid_entries,
vcpu->arch.cpuid_nent, sig);
}
static struct kvm_cpuid_entry2 *__kvm_find_kvm_cpuid_features(struct kvm_cpuid_entry2 *entries,
int nent, u32 kvm_cpuid_base)
{
return cpuid_entry2_find(entries, nent, kvm_cpuid_base | KVM_CPUID_FEATURES,
KVM_CPUID_INDEX_NOT_SIGNIFICANT);
}
static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
{
u32 base = vcpu->arch.kvm_cpuid.base;
if (!base)
return NULL;
return cpuid_entry2_find(entries, nent, base | KVM_CPUID_FEATURES,
KVM_CPUID_INDEX_NOT_SIGNIFICANT);
}
static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
{
return __kvm_find_kvm_cpuid_features(vcpu, vcpu->arch.cpuid_entries,
vcpu->arch.cpuid_nent);
return __kvm_find_kvm_cpuid_features(vcpu->arch.cpuid_entries,
vcpu->arch.cpuid_nent, base);
}
void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
@ -266,6 +273,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
int nent)
{
struct kvm_cpuid_entry2 *best;
struct kvm_hypervisor_cpuid kvm_cpuid;
best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
if (best) {
@ -292,10 +300,12 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
if (kvm_hlt_in_guest(vcpu->kvm) && best &&
(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
kvm_cpuid = __kvm_get_hypervisor_cpuid(entries, nent, KVM_SIGNATURE);
if (kvm_cpuid.base) {
best = __kvm_find_kvm_cpuid_features(entries, nent, kvm_cpuid.base);
if (kvm_hlt_in_guest(vcpu->kvm) && best)
best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
}
if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
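
The point of the `__kvm_update_cpuid_runtime()` change is that the KVM CPUID base is looked up in the entries being installed, not taken from state cached from the previous set. A simplified user-space sketch of that lookup follows. The struct, the helper names, the scanned base range and the PV_UNHALT bit position (7) are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cpuid_entry { uint32_t function, eax, ebx, ecx, edx; };

#define PV_UNHALT_BIT 7	/* assumed bit position */

static struct cpuid_entry *find_entry(struct cpuid_entry *e, size_t n,
				      uint32_t function)
{
	for (size_t i = 0; i < n; i++)
		if (e[i].function == function)
			return &e[i];
	return NULL;
}

/* Find the hypervisor leaf carrying the signature, in the array given. */
static uint32_t find_base(struct cpuid_entry *e, size_t n, const char *sig)
{
	for (uint32_t base = 0x40000000u; base < 0x40010000u; base += 0x100) {
		struct cpuid_entry *ent = find_entry(e, n, base);
		uint32_t words[3];

		if (!ent)
			continue;
		words[0] = ent->ebx;
		words[1] = ent->ecx;
		words[2] = ent->edx;
		if (!memcmp(words, sig, sizeof(words)))
			return base;
	}
	return 0;
}

/* Clear PV_UNHALT in the new entries when HLT exiting is disabled. */
static void clear_pv_unhalt(struct cpuid_entry *e, size_t n, bool hlt_in_guest)
{
	uint32_t base = find_base(e, n, "KVMKVMKVM\0\0");
	struct cpuid_entry *feat = base ? find_entry(e, n, base | 1) : NULL;

	if (hlt_in_guest && feat)
		feat->eax &= ~(1u << PV_UNHALT_BIT);
}
```

Because the base is derived from the array passed in, the feature bit is cleared even when the signature leaf sits at a non-default base or was absent from the previously installed entries.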

@ -84,9 +84,10 @@ struct enc_region {
};
/* Called with the sev_bitmap_lock held, or on shutdown */
static int sev_flush_asids(int min_asid, int max_asid)
static int sev_flush_asids(unsigned int min_asid, unsigned int max_asid)
{
int ret, asid, error = 0;
int ret, error = 0;
unsigned int asid;
/* Check if there are any ASIDs to reclaim before performing a flush */
asid = find_next_bit(sev_reclaim_asid_bitmap, nr_asids, min_asid);
@ -116,7 +117,7 @@ static inline bool is_mirroring_enc_context(struct kvm *kvm)
}
/* Must be called with the sev_bitmap_lock held */
static bool __sev_recycle_asids(int min_asid, int max_asid)
static bool __sev_recycle_asids(unsigned int min_asid, unsigned int max_asid)
{
if (sev_flush_asids(min_asid, max_asid))
return false;
@ -143,8 +144,20 @@ static void sev_misc_cg_uncharge(struct kvm_sev_info *sev)
static int sev_asid_new(struct kvm_sev_info *sev)
{
int asid, min_asid, max_asid, ret;
/*
* SEV-enabled guests must use asid from min_sev_asid to max_sev_asid.
* SEV-ES-enabled guest can use from 1 to min_sev_asid - 1.
* Note: min ASID can end up larger than the max if basic SEV support is
* effectively disabled by disallowing use of ASIDs for SEV guests.
*/
unsigned int min_asid = sev->es_active ? 1 : min_sev_asid;
unsigned int max_asid = sev->es_active ? min_sev_asid - 1 : max_sev_asid;
unsigned int asid;
bool retry = true;
int ret;
if (min_asid > max_asid)
return -ENOTTY;
WARN_ON(sev->misc_cg);
sev->misc_cg = get_current_misc_cg();
@ -157,12 +170,6 @@ static int sev_asid_new(struct kvm_sev_info *sev)
mutex_lock(&sev_bitmap_lock);
/*
* SEV-enabled guests must use asid from min_sev_asid to max_sev_asid.
* SEV-ES-enabled guest can use from 1 to min_sev_asid - 1.
*/
min_asid = sev->es_active ? 1 : min_sev_asid;
max_asid = sev->es_active ? min_sev_asid - 1 : max_sev_asid;
again:
asid = find_next_zero_bit(sev_asid_bitmap, max_asid + 1, min_asid);
if (asid > max_asid) {
@ -179,7 +186,8 @@ static int sev_asid_new(struct kvm_sev_info *sev)
mutex_unlock(&sev_bitmap_lock);
return asid;
sev->asid = asid;
return 0;
e_uncharge:
sev_misc_cg_uncharge(sev);
put_misc_cg(sev->misc_cg);
@ -187,7 +195,7 @@ static int sev_asid_new(struct kvm_sev_info *sev)
return ret;
}
static int sev_get_asid(struct kvm *kvm)
static unsigned int sev_get_asid(struct kvm *kvm)
{
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
@ -247,21 +255,19 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
{
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
struct sev_platform_init_args init_args = {0};
int asid, ret;
int ret;
if (kvm->created_vcpus)
return -EINVAL;
ret = -EBUSY;
if (unlikely(sev->active))
return ret;
return -EINVAL;
sev->active = true;
sev->es_active = argp->id == KVM_SEV_ES_INIT;
asid = sev_asid_new(sev);
if (asid < 0)
ret = sev_asid_new(sev);
if (ret)
goto e_no_asid;
sev->asid = asid;
init_args.probe = false;
ret = sev_platform_init(&init_args);
@ -287,8 +293,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
{
unsigned int asid = sev_get_asid(kvm);
struct sev_data_activate activate;
int asid = sev_get_asid(kvm);
int ret;
/* activate ASID on the given handle */
@ -2240,8 +2246,10 @@ void __init sev_hardware_setup(void)
goto out;
}
sev_asid_count = max_sev_asid - min_sev_asid + 1;
WARN_ON_ONCE(misc_cg_set_capacity(MISC_CG_RES_SEV, sev_asid_count));
if (min_sev_asid <= max_sev_asid) {
sev_asid_count = max_sev_asid - min_sev_asid + 1;
WARN_ON_ONCE(misc_cg_set_capacity(MISC_CG_RES_SEV, sev_asid_count));
}
sev_supported = true;
/* SEV-ES support requested? */
@ -2272,7 +2280,9 @@ void __init sev_hardware_setup(void)
out:
if (boot_cpu_has(X86_FEATURE_SEV))
pr_info("SEV %s (ASIDs %u - %u)\n",
sev_supported ? "enabled" : "disabled",
sev_supported ? min_sev_asid <= max_sev_asid ? "enabled" :
"unusable" :
"disabled",
min_sev_asid, max_sev_asid);
if (boot_cpu_has(X86_FEATURE_SEV_ES))
pr_info("SEV-ES %s (ASIDs %u - %u)\n",
@ -2320,7 +2330,7 @@ int sev_cpu_init(struct svm_cpu_data *sd)
*/
static void sev_flush_encrypted_page(struct kvm_vcpu *vcpu, void *va)
{
int asid = to_kvm_svm(vcpu->kvm)->sev_info.asid;
unsigned int asid = sev_get_asid(vcpu->kvm);
/*
* Note! The address must be a kernel address, as regular page walk
@ -2638,7 +2648,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
void pre_sev_run(struct vcpu_svm *svm, int cpu)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
int asid = sev_get_asid(svm->vcpu.kvm);
unsigned int asid = sev_get_asid(svm->vcpu.kvm);
/* Assign the asid allocated with this SEV guest */
svm->asid = asid;
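
The ASID range selection moved to the top of `sev_asid_new()` can be exercised in isolation. In the sketch below, `sev_asid_range()` and `struct asid_range` are hypothetical; the arithmetic and the `-ENOTTY` result for an empty range follow the hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct asid_range { unsigned int min, max; };

/*
 * SEV guests use [min_sev_asid, max_sev_asid]; SEV-ES guests use
 * [1, min_sev_asid - 1]. An empty range means that guest type is
 * unusable on this host.
 */
static int sev_asid_range(bool es_active, unsigned int min_sev_asid,
			  unsigned int max_sev_asid, struct asid_range *out)
{
	out->min = es_active ? 1 : min_sev_asid;
	out->max = es_active ? min_sev_asid - 1 : max_sev_asid;

	return out->min > out->max ? -ENOTTY : 0;
}
```

If every ASID is reserved for SEV-ES, `min_sev_asid` exceeds `max_sev_asid` and plain SEV yields an empty range. The signed `int` variables were presumably dropped so that bitmap indices and these comparisons share one unsigned type.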

@ -735,13 +735,13 @@ TRACE_EVENT(kvm_nested_intr_vmexit,
* Tracepoint for nested #vmexit because of interrupt pending
*/
TRACE_EVENT(kvm_invlpga,
TP_PROTO(__u64 rip, int asid, u64 address),
TP_PROTO(__u64 rip, unsigned int asid, u64 address),
TP_ARGS(rip, asid, address),
TP_STRUCT__entry(
__field( __u64, rip )
__field( int, asid )
__field( __u64, address )
__field( __u64, rip )
__field( unsigned int, asid )
__field( __u64, address )
),
TP_fast_assign(
@ -750,7 +750,7 @@ TRACE_EVENT(kvm_invlpga,
__entry->address = address;
),
TP_printk("rip: 0x%016llx asid: %d address: 0x%016llx",
TP_printk("rip: 0x%016llx asid: %u address: 0x%016llx",
__entry->rip, __entry->asid, __entry->address)
);

@ -86,7 +86,7 @@ void kvm_vcpu_pmu_resync_el0(void);
*/
#define kvm_pmu_update_vcpu_events(vcpu) \
do { \
if (!has_vhe() && kvm_vcpu_has_pmu(vcpu)) \
if (!has_vhe() && kvm_arm_support_pmu_v3()) \
vcpu->arch.pmu.events = *kvm_get_pmu_events(); \
} while (0)

@ -135,8 +135,8 @@ static void guest_run_stage(struct test_vcpu_shared_data *shared_data,
irq_iter = READ_ONCE(shared_data->nr_iter);
__GUEST_ASSERT(config_iter + 1 == irq_iter,
"config_iter + 1 = 0x%lx, irq_iter = 0x%lx.\n"
" Guest timer interrupt was not trigged within the specified\n"
"config_iter + 1 = 0x%x, irq_iter = 0x%x.\n"
" Guest timer interrupt was not triggered within the specified\n"
" interval, try to increase the error margin by [-e] option.\n",
config_iter + 1, irq_iter);
}

@ -1037,8 +1037,19 @@ static inline void vcpu_set_cpuid(struct kvm_vcpu *vcpu)
void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
struct kvm_x86_cpu_property property,
uint32_t value);
void vcpu_set_cpuid_maxphyaddr(struct kvm_vcpu *vcpu, uint8_t maxphyaddr);
void vcpu_clear_cpuid_entry(struct kvm_vcpu *vcpu, uint32_t function);
static inline bool vcpu_cpuid_has(struct kvm_vcpu *vcpu,
struct kvm_x86_cpu_feature feature)
{
struct kvm_cpuid_entry2 *entry;
entry = __vcpu_get_cpuid_entry(vcpu, feature.function, feature.index);
return *((&entry->eax) + feature.reg) & BIT(feature.bit);
}
void vcpu_set_or_clear_cpuid_feature(struct kvm_vcpu *vcpu,
struct kvm_x86_cpu_feature feature,
bool set);

@ -60,7 +60,7 @@ static void guest_run(struct test_vcpu_shared_data *shared_data)
irq_iter = READ_ONCE(shared_data->nr_iter);
__GUEST_ASSERT(config_iter + 1 == irq_iter,
"config_iter + 1 = 0x%x, irq_iter = 0x%x.\n"
" Guest timer interrupt was not trigged within the specified\n"
" Guest timer interrupt was not triggered within the specified\n"
" interval, try to increase the error margin by [-e] option.\n",
config_iter + 1, irq_iter);
}

@ -133,6 +133,43 @@ static void enter_guest(struct kvm_vcpu *vcpu)
}
}
static void test_pv_unhalt(void)
{
struct kvm_vcpu *vcpu;
struct kvm_vm *vm;
struct kvm_cpuid_entry2 *ent;
u32 kvm_sig_old;
pr_info("testing KVM_FEATURE_PV_UNHALT\n");
TEST_REQUIRE(KVM_CAP_X86_DISABLE_EXITS);
/* KVM_PV_UNHALT test */
vm = vm_create_with_one_vcpu(&vcpu, guest_main);
vcpu_set_cpuid_feature(vcpu, X86_FEATURE_KVM_PV_UNHALT);
TEST_ASSERT(vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
"Enabling X86_FEATURE_KVM_PV_UNHALT had no effect");
/* Make sure KVM clears vcpu->arch.kvm_cpuid */
ent = vcpu_get_cpuid_entry(vcpu, KVM_CPUID_SIGNATURE);
kvm_sig_old = ent->ebx;
ent->ebx = 0xdeadbeef;
vcpu_set_cpuid(vcpu);
vm_enable_cap(vm, KVM_CAP_X86_DISABLE_EXITS, KVM_X86_DISABLE_EXITS_HLT);
ent = vcpu_get_cpuid_entry(vcpu, KVM_CPUID_SIGNATURE);
ent->ebx = kvm_sig_old;
vcpu_set_cpuid(vcpu);
TEST_ASSERT(!vcpu_cpuid_has(vcpu, X86_FEATURE_KVM_PV_UNHALT),
"KVM_FEATURE_PV_UNHALT is set with KVM_CAP_X86_DISABLE_EXITS");
/* FIXME: actually test KVM_FEATURE_PV_UNHALT feature */
kvm_vm_free(vm);
}
int main(void)
{
struct kvm_vcpu *vcpu;
@ -151,4 +188,6 @@ int main(void)
enter_guest(vcpu);
kvm_vm_free(vm);
test_pv_unhalt();
}