linux-next/arch/x86/kvm
Sean Christopherson 1201f226c8 KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initialization to
avoid the overhead of CPUID during runtime.  The offset, size, and metadata
for CPUID.0xD.[1..n] sub-leaves do not depend on XCR0 or XSS values, i.e.
are constant for a given CPU, and thus can be cached during module load.
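
As a rough illustration of the caching idea (a userspace sketch, not the code
from this patch), the snippet below queries each CPUID.0xD sub-leaf once and
serves later lookups from an array.  The names xstate_entry, xstate_cache_init(),
xstate_lookup(), the cpuid_count_fn hook, and the 64-entry bound are all
hypothetical; fake_cpuid_count() is a stand-in so the sketch runs without
executing the real instruction.

```c
#include <stddef.h>
#include <stdint.h>

#define XSTATE_MAX_SUBLEAF 64	/* hypothetical bound: one slot per XCR0/XSS bit */

/* Raw register output of one CPUID.0xD sub-leaf. */
struct xstate_entry {
	uint32_t eax;	/* size of the state component, in bytes */
	uint32_t ebx;	/* offset in the standard (non-compacted) format */
	uint32_t ecx;	/* bit 0: managed via XSS, bit 1: 64-byte aligned when compacted */
	uint32_t edx;
};

/* Hypothetical hook that abstracts the CPUID instruction. */
typedef void (*cpuid_count_fn)(uint32_t leaf, uint32_t subleaf,
			       struct xstate_entry *out);

static struct xstate_entry xstate_cache[XSTATE_MAX_SUBLEAF];

/* Run once at init: execute CPUID a single time per sub-leaf. */
static void xstate_cache_init(cpuid_count_fn cpuid_count)
{
	for (uint32_t i = 0; i < XSTATE_MAX_SUBLEAF; i++)
		cpuid_count(0xD, i, &xstate_cache[i]);
}

/* Runtime lookup: a plain array read, no CPUID. */
static const struct xstate_entry *xstate_lookup(uint32_t subleaf)
{
	return subleaf < XSTATE_MAX_SUBLEAF ? &xstate_cache[subleaf] : NULL;
}

/* Stand-in for CPUID: reports only an AVX-like component in sub-leaf 2. */
static unsigned int fake_cpuid_calls;

static void fake_cpuid_count(uint32_t leaf, uint32_t subleaf,
			     struct xstate_entry *out)
{
	fake_cpuid_calls++;
	out->eax = out->ebx = out->ecx = out->edx = 0;
	if (leaf == 0xD && subleaf == 2) {
		out->eax = 256;	/* component size */
		out->ebx = 576;	/* offset right after legacy area + header */
	}
}
```

Calling xstate_cache_init(fake_cpuid_count) once populates the table; every
later xstate_lookup() is then independent of how slow CPUID is.  On real x86
userspace, the hook could presumably be backed by __get_cpuid_count() from the
compiler's <cpuid.h>.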

On Intel's Emerald Rapids, CPUID is *wildly* expensive, to the point where
recomputing XSAVE offsets and sizes results in a 4x increase in latency of
nested VM-Enter and VM-Exit (nested transitions can trigger
xstate_required_size() multiple times per transition), relative to using
cached values.  The issue is easily visible by running `perf top` while
triggering nested transitions: kvm_update_cpuid_runtime() shows up at a
whopping 50%.

As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test
and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested
roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald
Rapids (EMR) takes:

  SKX 11650
  ICX 22350
  EMR 28850

Using cached values, the latency drops to:

  SKX 6850
  ICX 9000
  EMR 7900
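
For reference, the speedups implied by the two tables above can be computed
directly.  The snippet is only arithmetic on the numbers quoted in this
message; the struct and function names are made up for illustration.

```c
struct latency_sample {
	const char *cpu;
	unsigned int uncached;	/* nested CPUID roundtrip, recomputing XSAVE sizes */
	unsigned int cached;	/* same roundtrip, using cached values */
};

static const struct latency_sample samples[] = {
	{ "SKX", 11650, 6850 },
	{ "ICX", 22350, 9000 },
	{ "EMR", 28850, 7900 },
};

/* Ratio of uncached to cached latency for one CPU. */
static double speedup(const struct latency_sample *s)
{
	return (double)s->uncached / (double)s->cached;
}
```

This works out to roughly 1.7x on SKX, 2.5x on ICX, and 3.7x on EMR, the
last being in line with the ~4x figure cited above.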

The underlying issue is that CPUID itself is slow on ICX, and comically
slow on EMR.  The problem is exacerbated on CPUs which support XSAVES
and/or XSAVEC, as KVM invokes xstate_required_size() twice on each
runtime CPUID update, and because there are more supported XSAVE features
(CPUID for supported XSAVE feature sub-leaves is significantly slower).
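
To make the cost concrete, here is a minimal sketch of how a required XSAVE
area size might be derived from cached sub-leaf data instead of CPUID.  It is
modeled loosely on the behavior described above and is not the kernel's
implementation; struct xstate_sizes, align64(), and
xstate_required_size_cached() are assumed names.

```c
#include <stdbool.h>
#include <stdint.h>

#define XSAVE_LEGACY_SIZE	512	/* x87/SSE legacy region */
#define XSAVE_HDR_SIZE		64	/* XSAVE header after the legacy region */

/* Cached CPUID.0xD.i output: EAX = size, EBX = offset, ECX = flags. */
struct xstate_sizes {
	uint32_t eax;
	uint32_t ebx;
	uint32_t ecx;
};

static uint32_t align64(uint32_t v)
{
	return (v + 63) & ~63u;
}

/*
 * Walk the enabled extended features (bit 2 and up) and return the size
 * of the XSAVE area needed to hold them, reading only cached data.
 */
static uint32_t xstate_required_size_cached(const struct xstate_sizes *cache,
					    unsigned int nr_entries,
					    uint64_t features, bool compacted)
{
	uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;

	for (unsigned int i = 2; i < nr_entries && i < 64; i++) {
		uint32_t offset;

		if (!(features & (1ULL << i)))
			continue;

		if (compacted)	/* components are packed; ECX[1] requests 64B alignment */
			offset = (cache[i].ecx & 0x2) ? align64(size) : size;
		else		/* standard format: fixed offset from CPUID */
			offset = cache[i].ebx;

		if (offset + cache[i].eax > size)
			size = offset + cache[i].eax;
	}
	return size;
}
```

With XSAVES and/or XSAVEC, a runtime update would plausibly call this twice,
once in standard format for XCR0 and once in compacted format for XCR0 | XSS,
so every uncached sub-leaf read is paid for twice.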

 SKX:
  CPUID.0xD.2  = 348 cycles
  CPUID.0xD.3  = 400 cycles
  CPUID.0xD.4  = 276 cycles
  CPUID.0xD.5  = 236 cycles
  <other sub-leaves are similar>

 EMR:
  CPUID.0xD.2  = 1138 cycles
  CPUID.0xD.3  = 1362 cycles
  CPUID.0xD.4  = 1068 cycles
  CPUID.0xD.5  = 910 cycles
  CPUID.0xD.6  = 914 cycles
  CPUID.0xD.7  = 1350 cycles
  CPUID.0xD.8  = 734 cycles
  CPUID.0xD.9  = 766 cycles
  CPUID.0xD.10 = 732 cycles
  CPUID.0xD.11 = 718 cycles
  CPUID.0xD.12 = 734 cycles
  CPUID.0xD.13 = 1700 cycles
  CPUID.0xD.14 = 1126 cycles
  CPUID.0xD.15 = 898 cycles
  CPUID.0xD.16 = 716 cycles
  CPUID.0xD.17 = 748 cycles
  CPUID.0xD.18 = 776 cycles

Note, updating runtime CPUID information multiple times per nested
transition is itself a flaw, especially since CPUID is a mandatory
intercept on both Intel and AMD.  E.g. KVM doesn't need to ensure emulated
CPUID state is up-to-date while running L2.  That flaw will be fixed in a
future patch, as deferring runtime CPUID updates is more subtle than it
appears at first glance, the benefits aren't super critical to have once
the XSAVE issue is resolved, and caching CPUID output is desirable even if
KVM's updates are deferred.

Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241211013302.1347853-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-13 13:58:10 -05:00
..
mmu KVM: x86: switch hugepage recovery thread to vhost_task 2024-11-14 13:20:04 -05:00
svm The biggest change here is eliminating the awful idea that KVM had, of 2024-11-23 16:00:50 -08:00
vmx Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()" 2024-11-19 19:34:35 -05:00
.gitignore KVM: x86: use a separate asm-offsets.c file 2022-11-09 12:10:17 -05:00
cpuid.c KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init 2024-12-13 13:58:10 -05:00
cpuid.h KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init 2024-12-13 13:58:10 -05:00
debugfs.c KVM: Get rid of return value from kvm_arch_create_vm_debugfs() 2024-02-23 21:44:58 +00:00
emulate.c KVM: x86: Add X86EMUL_F_MSR and X86EMUL_F_DT_LOAD to aid canonical checks 2024-11-01 09:22:25 -07:00
fpu.h KVM: x86: Move FPU register accessors into fpu.h 2021-06-17 13:09:24 -04:00
governed_features.h KVM: x86: Use KVM-governed feature framework to track "LAM enabled" 2023-11-28 17:54:09 -08:00
hyperv.c KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops 2024-07-16 12:14:12 -04:00
hyperv.h KVM: x86: hyper-v: Remove unused inline function kvm_hv_free_pa_page() 2024-08-13 09:28:48 -04:00
i8254.c KVM: x86: Unify pr_fmt to use module name for all KVM modules 2022-12-29 15:47:35 -05:00
i8254.h KVM: x86: PIT: Preserve state of speaker port data bit 2022-06-08 13:06:20 -04:00
i8259.c KVM: x86: Fix poll command 2023-06-01 13:44:13 -07:00
ioapic.c KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking 2023-03-27 10:13:28 -04:00
ioapic.h x86/kvm: remove unused ack_notifier callbacks 2021-11-18 07:05:57 -05:00
irq_comm.c KVM: x86: Don't re-setup empty IRQ routing when KVM_CAP_SPLIT_IRQCHIP 2024-06-11 14:18:40 -07:00
irq.c KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt() 2024-09-09 20:15:01 -07:00
irq.h KVM: x86: Don't re-setup empty IRQ routing when KVM_CAP_SPLIT_IRQCHIP 2024-06-11 14:18:40 -07:00
Kconfig KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD 2024-11-19 19:34:51 -05:00
kvm_cache_regs.h KVM: x86: Add lockdep-guarded asserts on register cache usage 2024-11-01 09:22:22 -07:00
kvm_emulate.h KVM: x86: Add X86EMUL_F_MSR and X86EMUL_F_DT_LOAD to aid canonical checks 2024-11-01 09:22:25 -07:00
kvm_onhyperv.c KVM: x86/mmu: Move filling of Hyper-V's TLB range struct into Hyper-V code 2023-04-10 15:17:29 -07:00
kvm_onhyperv.h KVM: x86: Move Hyper-V partition assist page out of Hyper-V emulation context 2023-12-07 09:34:01 -08:00
kvm-asm-offsets.c KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly 2022-11-09 12:25:53 -05:00
lapic.c Merge branch 'kvm-docs-6.13' into HEAD 2024-11-13 07:18:12 -05:00
lapic.h KVM: x86: Unpack msr_data structure prior to calling kvm_apic_set_base() 2024-11-04 20:57:46 -08:00
Makefile KVM: x86: leave kvm.ko out of the build if no vendor module is requested 2024-10-06 03:53:41 -04:00
mmu.h KVM: x86: drop x86.h include from cpuid.h 2024-11-01 09:22:23 -07:00
mtrr.c KVM: x86: drop x86.h include from cpuid.h 2024-11-01 09:22:23 -07:00
pmu.c KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops 2024-07-16 12:14:12 -04:00
pmu.h KVM: x86/pmu: Introduce distinct macros for GP/fixed counter max number 2024-06-28 09:12:16 -07:00
reverse_cpuid.h x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest 2024-11-13 14:40:40 -05:00
smm.c KVM: x86: Forcibly leave nested if RSM to L2 hits shutdown 2024-09-09 20:09:49 -07:00
smm.h KVM: x86: smm: preserve interrupt shadow in SMRAM 2022-11-09 12:31:26 -05:00
trace.h KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops 2024-07-16 12:14:12 -04:00
tss.h
x86.c KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init 2024-12-13 13:58:10 -05:00
x86.h KVM: x86: model canonical checks more precisely 2024-11-01 09:22:26 -07:00
xen.c KVM: x86/xen: Initialize hrtimer in kvm_xen_init_vcpu() 2024-11-07 02:47:05 +01:00
xen.h KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled 2024-03-04 16:22:36 -08:00