mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-01-16 01:54:00 +00:00
606 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Sean Christopherson
|
3af4a9e61e |
KVM: x86: Serialize vendor module initialization (hardware setup)
Acquire a new mutex, vendor_module_lock, in kvm_x86_vendor_init() while doing hardware setup to ensure that concurrent calls are fully serialized. KVM rejects attempts to load vendor modules if a different module has already been loaded, but doesn't handle the case where multiple vendor modules are loaded at the same time, and module_init() doesn't run under the global module_mutex. Note, in practice, this is likely a benign bug as no platform exists that supports both SVM and VMX, i.e. barring a weird VM setup, one of the vendor modules is guaranteed to fail a support check before modifying common KVM state. Alternatively, KVM could perform an atomic CMPXCHG on .hardware_enable, but that comes with its own ugliness as it would require setting .hardware_enable before success is guaranteed, e.g. attempting to load the "wrong" could result in spurious failure to load the "right" module. Introduce a new mutex as using kvm_lock is extremely deadlock prone due to kvm_lock being taken under cpus_write_lock(), and in the future, under under cpus_read_lock(). Any operation that takes cpus_read_lock() while holding kvm_lock would potentially deadlock, e.g. kvm_timer_init() takes cpus_read_lock() to register a callback. In theory, KVM could avoid such problematic paths, i.e. do less setup under kvm_lock, but avoiding all calls to cpus_read_lock() is subtly difficult and thus fragile. E.g. updating static calls also acquires cpus_read_lock(). Inverting the lock ordering, i.e. always taking kvm_lock outside cpus_read_lock(), is not a viable option as kvm_lock is taken in various callbacks that may be invoked under cpus_read_lock(), e.g. x86's kvmclock_cpufreq_notifier(). The lockdep splat below is dependent on future patches to take cpus_read_lock() in hardware_enable_all(), but as above, deadlock is already is already possible. ====================================================== WARNING: possible circular locking dependency detected 6.0.0-smp--7ec93244f194-init2 #27 Tainted: G O ------------------------------------------------------ stable/251833 is trying to acquire lock: ffffffffc097ea28 (kvm_lock){+.+.}-{3:3}, at: hardware_enable_all+0x1f/0xc0 [kvm] but task is already holding lock: ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (cpu_hotplug_lock){++++}-{0:0}: cpus_read_lock+0x2a/0xa0 __cpuhp_setup_state+0x2b/0x60 __kvm_x86_vendor_init+0x16a/0x1870 [kvm] kvm_x86_vendor_init+0x23/0x40 [kvm] 0xffffffffc0a4d02b do_one_initcall+0x110/0x200 do_init_module+0x4f/0x250 load_module+0x1730/0x18f0 __se_sys_finit_module+0xca/0x100 __x64_sys_finit_module+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (kvm_lock){+.+.}-{3:3}: __lock_acquire+0x16f4/0x30d0 lock_acquire+0xb2/0x190 __mutex_lock+0x98/0x6f0 mutex_lock_nested+0x1b/0x20 hardware_enable_all+0x1f/0xc0 [kvm] kvm_dev_ioctl+0x45e/0x930 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(cpu_hotplug_lock); lock(kvm_lock); lock(cpu_hotplug_lock); lock(kvm_lock); *** DEADLOCK *** 1 lock held by stable/251833: #0: ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221130230934.1014142-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Paolo Bonzini
|
fc471e8310 |
Merge branch 'kvm-late-6.1' into HEAD
x86: * Change tdp_mmu to a read-only parameter * Separate TDP and shadow MMU page fault paths * Enable Hyper-V invariant TSC control selftests: * Use TAP interface for kvm_binary_stats_test and tsc_msrs_test Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Paolo Bonzini
|
a5496886eb |
Merge branch 'kvm-late-6.1-fixes' into HEAD
x86: * several fixes to nested VMX execution controls * fixes and clarification to the documentation for Xen emulation * do not unnecessarily release a pmu event with zero period * MMU fixes * fix Coverity warning in kvm_hv_flush_tlb() selftests: * fixes for the ucall mechanism in selftests * other fixes mostly related to compilation with clang |
||
Paolo Bonzini
|
02d9a04da4 |
Documentation: kvm: clarify SRCU locking order
Currently only the locking order of SRCU vs kvm->slots_arch_lock and kvm->slots_lock is documented. Extend this to kvm->lock since Xen emulation got it terribly wrong. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
David Woodhouse
|
af2808906a |
KVM: x86/xen: Documentation updates and clarifications
Most notably, the KVM_XEN_EVTCHN_RESET feature had escaped documentation entirely. Along with how to turn most stuff off on SHUTDOWN_soft_reset. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20221226120320.1125390-6-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Sean Christopherson
|
23e528d9bc |
KVM: Delete extra block of "};" in the KVM API documentation
Delete an extra block of code/documentation that snuck in when KVM's documentation was converted to ReST format. Fixes: 106ee47dc633 ("docs: kvm: Convert api.txt to ReST format") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221207003637.2041211-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Linus Torvalds
|
8fa590bf34 |
ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. * Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). * Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. * Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. * Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. * Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: * Second batch of the lazy destroy patches * First batch of KVM changes for kernel virtual != physical address support * Removal of a unused function x86: * Allow compiling out SMM support * Cleanup and documentation of SMM state save area format * Preserve interrupt shadow in SMM state save area * Respond to generic signals during slow page faults * Fixes and optimizations for the non-executable huge page errata fix. * Reprogram all performance counters on PMU filter change * Cleanups to Hyper-V emulation and tests * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) * Advertise several new Intel features * x86 Xen-for-KVM: ** Allow the Xen runstate information to cross a page boundary ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured ** Add support for 32-bit guests in SCHEDOP_poll * Notable x86 fixes and cleanups: ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. ** Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. ** Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. ** Advertise (on AMD) that the SMM_CTL MSR is not supported ** Remove unnecessary exports Generic: * Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: * Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. * Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. * Introduce actual atomics for clear/set_bit() in selftests * Add support for pinning vCPUs in dirty_log_perf_test. * Rename the so called "perf_util" framework to "memstress". * Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. * Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. * Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). * A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. * x86-specific selftest changes: ** Clean up x86's page table management. ** Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. ** Clean up the nEPT support checks. ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. ** Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: * Remove deleted ioctls from documentation * Clean up the docs for the x86 MSR filter. * Various fixes -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA== =QAYX -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ... |
||
Sean Christopherson
|
549a715b98 |
KVM: x86: Add proper ReST tables for userspace MSR exits/flags
Add ReST formatting to the set of userspace MSR exits/flags so that the resulting HTML docs generate a table instead of malformed gunk. This also fixes a warning that was introduced by a recent cleanup of the relevant documentation (yay copy+paste). >> Documentation/virt/kvm/api.rst:7287: WARNING: Block quote ends without a blank line; unexpected unindent. Fixes: 1ae099540e8c ("KVM: x86: Allow deflecting unknown MSR accesses to user space") Fixes: 1f158147181b ("KVM: x86: Clean up KVM_CAP_X86_USER_SPACE_MSR documentation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221207000959.2035098-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Linus Torvalds
|
a89ef2aa55 |
Add TDX guest attestation infrastructure and driver
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmOXYmwACgkQaDWVMHDJ krD8hg/+J0hUTfljmlCctwGZyqVR3Y2E722wL9oTvbgYiUAtFrARzfPF0WNwvHi5 Ywvod5hQ4unPoluthdVAD/uJqcPVhjIZ7CvNTGrS8J7ED5x5ydGLNWAL3Rn+9s6O xkz/DsV4zl+cPQ60XLsO+3Mc6RhwVs9DUthpUovl22epmgmRPCovkHWkvQsZajJq ceF/78ThfrkG4dDouaIXi1gsmKLLzU4KdHeBATMg0bgPQXFJZSGBCLaeJXWmLapq 7N3SznUqDMn4Plr/IuP4XuMA6VTVojrakCcBmw5SGVqhkVWGM1/FMg7jHSQS7Z5V 5uG7CkhTBqh17v9xKwDMPh34D51TLtNifA7jbecyL5155czFkj7BoSwEFINU/wCz agUO9NvK9j1chUnA2UGqGQigM3nWGZHMwaQjfgBWyq5gqF8HURUUrjx6XuunOfmB 1byyrDu0g48u/zaQ/RpNfewz1ZY+WylDPcqOhYaVWF1PYThStML/VMBKpdsl1Ovw nytUdQsaBIjFHQdB+snizaF93+/0FG+FTGAlDnHYmey/8plL2LYuzrcDnDYnGEXa tN3HFd2lAi4JBLmvmgF39gH+BLXuKTLweIhwTXZTn91cfire3yxiXAnLd0tuptMP aXFddxKMdMpxTqzy2X+8gJjqCr2lZ9gZkxaPsWwrBM+xrJf0p2w= =JGnq -----END PGP SIGNATURE----- Merge tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tdx updates from Dave Hansen: "This includes a single chunk of new functionality for TDX guests which allows them to talk to the trusted TDX module software and obtain an attestation report. This report can then be used to prove the trustworthiness of the guest to a third party and get access to things like storage encryption keys" * tag 'x86_tdx_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/tdx: Test TDX attestation GetReport support virt: Add TDX guest driver x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module |
||
Paolo Bonzini
|
9352e7470a |
Merge remote-tracking branch 'kvm/queue' into HEAD
x86 Xen-for-KVM: * Allow the Xen runstate information to cross a page boundary * Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured * add support for 32-bit guests in SCHEDOP_poll x86 fixes: * One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). * Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. * Clean up the MSR filter docs. * Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. * Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. * Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. * Advertise (on AMD) that the SMM_CTL MSR is not supported * Remove unnecessary exports Selftests: * Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. * Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). * Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. * Introduce actual atomics for clear/set_bit() in selftests Documentation: * Remove deleted ioctls from documentation * Various fixes |
||
Paolo Bonzini
|
eb5618911a |
KVM/arm64 updates for 6.2
- Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on. - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Add/Enable/Fix a bunch of selftests covering memslots, breakpoints, stage-2 faults and access tracking. You name it, we got it, we probably broke it. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. As a side effect, this tag also drags: - The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring series - A shared branch with the arm64 tree that repaints all the system registers to match the ARM ARM's naming, and resulting in interesting conflicts -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmOODb0PHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDztsQAInRnsgLl57/SpqhZzExNCllN6AT/bdeB3uz rnw3ScJOV174uNKp8lnPWoTvu2YUGiVtBp6tFHhDI8le7zHX438ZT8KE5mcs8p5i KfFKnb8SHV2DDpqkcy24c0Xl/6vsg1qkKrdfJb49yl5ZakRITDpynW/7tn6dXsxX wASeGFdCYeW4g2xMQzsCbtx6LgeQ8uomBmzRfPrOtZHYYxAn6+4Mj4595EC1sWxM AQnbp8tW3Vw46saEZAQvUEOGOW9q0Nls7G21YqQ52IA+ZVDK1LmAF2b1XY3edjkk pX8EsXOURfqdasBxfSfF3SgnUazoz9GHpSzp1cTVTktrPp40rrT7Ldtml0ktq69d 1malPj47KVMDsIq0kNJGnMxciXFgAHw+VaCQX+k4zhIatNwviMbSop2fEoxj22jc 4YGgGOxaGrnvmAJhreCIbr4CkZk5CJ8Zvmtfg+QM6npIp8BY8896nvORx/d4i6tT H4caadd8AAR56ANUyd3+KqF3x0WrkaU0PLHJLy1tKwOXJUUTjcpvIfahBAAeUlSR qEFrtb+EEMPgAwLfNOICcNkPZR/yyuYvM+FiUQNVy5cNiwFkpztpIctfOFaHySGF K07O2/a1F6xKL0OKRUg7hGKknF9ecmux4vHhiUMuIk9VOgNTWobHozBDorLKXMzC aWa6oGVC =iIPT -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.2 - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on. - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Add/Enable/Fix a bunch of selftests covering memslots, breakpoints, stage-2 faults and access tracking. You name it, we got it, we probably broke it. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. As a side effect, this tag also drags: - The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring series - A shared branch with the arm64 tree that repaints all the system registers to match the ARM ARM's naming, and resulting in interesting conflicts |
||
Marc Zyngier
|
86f27d849b |
Merge branch kvm-arm64/misc-6.2 into kvmarm-master/next
* kvm-arm64/misc-6.2: : . : Misc fixes for 6.2: : : - Fix formatting for the pvtime documentation : : - Fix a comment in the VHE-specific Makefile : . KVM: arm64: Fix typo in comment KVM: arm64: Fix pvtime documentation Signed-off-by: Marc Zyngier <maz@kernel.org> |
||
Marc Zyngier
|
382b5b87a9 |
Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next
* kvm-arm64/mte-map-shared: : . : Update the MTE support to allow the VMM to use shared mappings : to back the memslots exposed to MTE-enabled guests. : : Patches courtesy of Catalin Marinas and Peter Collingbourne. : . : Fix a number of issues with MTE, such as races on the tags : being initialised vs the PG_mte_tagged flag as well as the : lack of support for VM_SHARED when KVM is involved. : : Patches from Catalin Marinas and Peter Collingbourne. : . Documentation: document the ABI changes for KVM_CAP_ARM_MTE KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled arm64: mte: Lock a page for MTE tag initialisation mm: Add PG_arch_3 page flag KVM: arm64: Simplify the sanitise_mte_tags() logic arm64: mte: Fix/clarify the PG_mte_tagged semantics mm: Do not enable PG_arch_2 for all 64-bit architectures Signed-off-by: Marc Zyngier <maz@kernel.org> |
||
David Matlack
|
34e30ebbe4 |
KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns
Clarify the existing documentation about how KVM_CAP_HALT_POLL and halt_poll_ns interact to make it clear that VMs using KVM_CAP_HALT_POLL ignore halt_poll_ns. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20221201195249.3369720-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
David Matlack
|
b8b43a4c2e |
KVM: Move halt-polling documentation into common directory
Move halt-polling.rst into the common KVM documentation directory and out of the x86-specific directory. Halt-polling is a common feature and the existing documentation is already written as such. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20221201195249.3369720-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Paolo Bonzini
|
b376144595 |
Misc KVM x86 fixes and cleanups for 6.2:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up the MSR filter docs. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmOJQesSHHNlYW5qY0Bn b29nbGUuY29tAAoJEGCRIgFNDBL5IR4QAKGPbLRykY/2FohV2HDu5fDPxA2Fe9nu 5W7ZIptQu+tQtCTWKFEjcQdwYoNrLbr0hr1eGubVbIvBqJbwPQfH7G0765eOIcvy s6Zn2N24IisIoUxdkJGOL3Tt1UR7wCFbwC+ms0i4FQvMcw+TbM0BTHgJDdwR5laX mGN7ubz5iImwDFFE3Bd8Qy5I+FGL9CI60l+RzK6b7J8HYi1wOBMLU9QueF/dB7gR g+navZJAAnvN6YIkjP5j8yPBuvhDzni379ue5ATDq1ALvyyI7xlYALsxpUjCnLuo CkbvgmfmC94Vdm7pzFgsbazUN2oIjwoimjFQHP1bf8Jmd+770R282JpnwiD/ydCV Tl2ArwXA2zxVxNZm9g/XqPBwWBWWgWfYIQbuuxc065MnXCnHkY5UGGf0JNx41CDl hdtm9DHkft8+6kyBBmgkdKxd328Znljq02v3nLePUipfpDVaNj4VAUj9IpV6Lpuj GJjs4Wx7oqFwH1Im/LqZgnOGwgkSj3ObHtkYx2RSrQAQultbjuplFz2qZWP8PF6A FrJbcddKOmLINrdNOlvTd5WKCAjtV8vycjFkk+/7H67rpZdM8AI1StrzMP6gmwg4 ARozZJ2UF8nTriRYFQbFQyNm9bBTZ7YQ/HajqfhvCuZLi7i1EaImhC0F1xn7IU5S 6XhvQPvjRTgS =i6OA -----END PGP SIGNATURE----- Merge tag 'kvm-x86-fixes-6.2-1' of https://github.com/kvm-x86/linux into HEAD Misc KVM x86 fixes and cleanups for 6.2: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up the MSR filter docs. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. |
||
Javier Martinez Canillas
|
10c5e80b2c |
KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
The ioctls are missing an architecture property that is present in others. Suggested-by: Sergio Lopez Pascual <slp@redhat.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Message-Id: <20221202105011.185147-5-javierm@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Javier Martinez Canillas
|
30ee198ce4 |
KVM: Reference to kvm_userspace_memory_region in doc and comments
There are still references to the removed kvm_memory_region data structure but the doc and comments should mention struct kvm_userspace_memory_region instead, since that is what's used by the ioctl that replaced the old one and this data structure support the same set of flags. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Message-Id: <20221202105011.185147-4-javierm@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Javier Martinez Canillas
|
66a9221d73 |
KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
The documentation says that the ioctl has been deprecated, but it has been actually removed and the remaining references are just left overs. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Message-Id: <20221202105011.185147-3-javierm@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Javier Martinez Canillas
|
61e15f8712 |
KVM: Delete all references to removed KVM_SET_MEMORY_REGION ioctl
The documentation says that the ioctl has been deprecated, but it has been actually removed and the remaining references are just left overs. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Message-Id: <20221202105011.185147-2-javierm@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Sean Christopherson
|
1f15814718 |
KVM: x86: Clean up KVM_CAP_X86_USER_SPACE_MSR documentation
Clean up the KVM_CAP_X86_USER_SPACE_MSR documentation to eliminate misleading and/or inconsistent verbiage, and to actually document what accesses are intercepted by which flags. - s/will/may since not all #GPs are guaranteed to be intercepted - s/deflect/intercept to align with common KVM terminology - s/user space/userspace to align with the majority of KVM docs - Avoid using "trap" terminology, as KVM exits to userspace _before_ stepping, i.e. doesn't exhibit trap-like behavior - Actually document the flags Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220831001706.4075399-4-seanjc@google.com |
||
Sean Christopherson
|
b93d2ec34e |
KVM: x86: Reword MSR filtering docs to more precisely define behavior
Reword the MSR filtering documentatiion to more precisely define the behavior of filtering using common virtualization terminology. - Explicitly document KVM's behavior when an MSR is denied - s/handled/allowed as there is no guarantee KVM will "handle" the MSR access - Drop the "fall back" terminology, which incorrectly suggests that there is existing KVM behavior to fall back to - Fix an off-by-one error in the range (the end is exclusive) - Call out the interaction between MSR filtering and KVM_CAP_X86_USER_SPACE_MSR's KVM_MSR_EXIT_REASON_FILTER - Delete the redundant paragraph on what '0' and '1' in the bitmap means, it's covered by the sections on KVM_MSR_FILTER_{READ,WRITE} - Delete the clause on x2APIC MSR behavior depending on APIC base, this is covered by stating that KVM follows architectural behavior when emulating/virtualizing MSR accesses Reported-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220831001706.4075399-3-seanjc@google.com |
||
Sean Christopherson
|
5c8c0b3273 |
KVM: x86: Delete documentation for READ|WRITE in KVM_X86_SET_MSR_FILTER
Delete the paragraph that describes the behavior when both KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE are set for a range. There is nothing special about KVM's handling of this combination, whereas explicitly documenting the combination suggests that there is some magic behavior the user needs to be aware of. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220831001706.4075399-2-seanjc@google.com |
||
David Woodhouse
|
d8ba8ba4c8 |
KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
Closer inspection of the Xen code shows that we aren't supposed to be using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall. If we randomly set the top bit of ->state_entry_time for a guest that hasn't asked for it and doesn't expect it, that could make the runtimes fail to add up and confuse the guest. Without the flag it's perfectly safe for a vCPU to read its own vcpu_runstate_info; just not for one vCPU to read *another's*. I briefly pondered adding a word for the whole set of VMASST_TYPE_* flags but the only one we care about for HVM guests is this, so it seemed a bit pointless. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20221127122210.248427-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Peter Collingbourne
|
a4baf8d263 |
Documentation: document the ABI changes for KVM_CAP_ARM_MTE
Document both the restriction on VM_MTE_ALLOWED mappings and the relaxation for shared mappings. Signed-off-by: Peter Collingbourne <pcc@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221104011041.290951-9-pcc@google.com |
||
Paolo Bonzini
|
1e79a9e3ab |
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmN/eYwACgkQ41TmuOI4 ufjoxA/9Et38aXO/IhmUt8v0QhA4yec+sc5GSFfQSYehej/1Vqhw0DXx+ORUiRgg +rtiXJSSqkuD2dL+BDffY2xoul6nzNdVf4AbkcnrWscfWr6xwVYlPvuL0ymGI6J2 U/IPedRoKw0bHw/wHs05yV5PubrRwDFERKhtyXWYGbPJhX0w2n3IFOoKH1oWBhLW Dc8jEs6t3gDbJ71Er0xoeBUoiuu+PgZG06cpOvzBZ0KclRgjADXyISqqk8/4mu8w R+/Wf8NcrbQYV1jfCeq5zIsKC8uvnFj25UuyTLumn5vh+dNNsvE72Khe4tz7LI0I ZPZ+GZuemu7Yi12dKjw4Sw3ui0ejWH/5XL1SVB0X/xYIWrBqOot+Lq6538GCng+c tJt+zsu64VFgXCCZ8O9qO4uE4DBL70H3ThT7VZxIghSTZtY0xh3uFc64f3/3d9dy K4WTJHrmMxhXaA/rqtIa8I53JvFl8CztofZATiQQesyPuc7lZ01w1Co5el4xYaxe YknyMTq11qf/iYqVOW7sjoWW/YRuuMZ4+FhpI3o/SllVdN98iTwkk1kP3wcoBO5P bvzpm+WXHbv9OxifPrqkqv34+upbjfEmSogHudQzagBX4vl3rZRfBCdQGCAha0Uc ZYyg68kiil5sWmHI/Ln/ZjANYfbS5sF0CreuWxnmqcwKl2NSN/E= =/1yt -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-6.2-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function |
||
Claudio Imbrenda
|
d9459922a1 |
KVM: s390: pv: api documentation for asynchronous destroy
Add documentation for the new commands added to the KVM_S390_PV_COMMAND ioctl. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20221111170632.77622-3-imbrenda@linux.ibm.com Message-Id: <20221111170632.77622-3-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> |
||
Kuppuswamy Sathyanarayanan
|
6c8c1406a6 |
virt: Add TDX guest driver
TDX guest driver exposes IOCTL interfaces to service TDX guest user-specific requests. Currently, it is only used to allow the user to get the TDREPORT to support TDX attestation. Details about the TDX attestation process are documented in Documentation/x86/tdx.rst, and the IOCTL details are documented in Documentation/virt/coco/tdx-guest.rst. Operations like getting TDREPORT involves sending a blob of data as input and getting another blob of data as output. It was considered to use a sysfs interface for this, but it doesn't fit well into the standard sysfs model for configuring values. It would be possible to do read/write on files, but it would need multiple file descriptors, which would be somewhat messy. IOCTLs seem to be the best fitting and simplest model for this use case. The AMD sev-guest driver also uses the IOCTL interface to support attestation. [Bagas Sanjaya: Ack is for documentation portion] Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Kai Huang <kai.huang@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/all/20221116223820.819090-3-sathyanarayanan.kuppuswamy%40linux.intel.com |
||
Usama Arif
|
83f8a81dec |
KVM: arm64: Fix pvtime documentation
This includes table format and using reST labels for cross-referencing to vcpu.rst. Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Usama Arif <usama.arif@bytedance.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221103131210.3603385-1-usama.arif@bytedance.com |
||
Gavin Shan
|
9cb1096f85 |
KVM: arm64: Enable ring-based dirty memory tracking
Enable ring-based dirty memory tracking on ARM64: - Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL. - Enable CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP. - Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page offset. - Add ARM64 specific kvm_arch_allow_write_without_running_vcpu() to keep the site of saving vgic/its tables out of the no-running-vcpu radar. Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110104914.31280-5-gshan@redhat.com |
||
Gavin Shan
|
86bdf3ebcf |
KVM: Support dirty ring in conjunction with bitmap
ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is enabled. It's conflicting with that ring-based dirty page tracking always requires a running VCPU context. Introduce a new flavor of dirty ring that requires the use of both VCPU dirty rings and a dirty bitmap. The expectation is that for non-VCPU sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to the dirty bitmap. Userspace should scan the dirty bitmap before migrating the VM to the target. Use an additional capability to advertise this behavior. The newly added capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Suggested-by: Marc Zyngier <maz@kernel.org> Suggested-by: Peter Xu <peterx@redhat.com> Co-developed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Gavin Shan <gshan@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110104914.31280-4-gshan@redhat.com |
||
Nico Boehr
|
6973091d1b |
KVM: s390: pv: don't allow userspace to set the clock under PV
When running under PV, the guest's TOD clock is under control of the ultravisor and the hypervisor isn't allowed to change it. Hence, don't allow userspace to change the guest's TOD clock by returning -EOPNOTSUPP. When userspace changes the guest's TOD clock, KVM updates its kvm.arch.epoch field and, in addition, the epoch field in all state descriptions of all VCPUs. But, under PV, the ultravisor will ignore the epoch field in the state description and simply overwrite it on next SIE exit with the actual guest epoch. This leads to KVM having an incorrect view of the guest's TOD clock: it has updated its internal kvm.arch.epoch field, but the ultravisor ignores the field in the state description. Whenever a guest is now waiting for a clock comparator, KVM will incorrectly calculate the time when the guest should wake up, possibly causing the guest to sleep for much longer than expected. With this change, kvm_s390_set_tod() will now take the kvm->lock to be able to call kvm_s390_pv_is_protected(). Since kvm_s390_set_tod_clock() also takes kvm->lock, use __kvm_s390_set_tod_clock() instead. The function kvm_s390_set_tod_clock is now unused, hence remove it. Update the documentation to indicate the TOD clock attr calls can now return -EOPNOTSUPP. Fixes: 0f3035047140 ("KVM: s390: protvirt: Do only reset registers that are accessible") Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20221011160712.928239-2-nrb@linux.ibm.com Message-Id: <20221011160712.928239-2-nrb@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> |
||
Linus Torvalds
|
f311d498be |
ARM:
* Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS * Better handling of AArch32 ID registers on AArch64-only systems * Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering * Advertise the new kvmarm mailing list * Various minor cleanups and spelling fixes RISC-V: * Improved instruction encoding infrastructure for instructions not yet supported by binutils * Svinval support for both KVM Host and KVM Guest * Zihintpause support for KVM Guest * Zicbom support for KVM Guest * Record number of signal exits as a VCPU stat * Use generic guest entry infrastructure x86: * Misc PMU fixes and cleanups. * selftests: fixes for Hyper-V hypercall * selftests: fix nx_huge_pages_test on TDP-disabled hosts * selftests: cleanups for fix_hypercall_test -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM7OcMUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAFgf/Rqc9hrXZVdbh2OZ+gScSsFsPK1zO DISUksLcXaYVYYsvQAEg/N2BPz3XbmO4jA+z8bIUrYTA7fC98we2C4jfR+EaX/fO +/Kzf0lAgu/nQZyFzUya+1jRsZqvVbC/HmDCI2kzN4u78e/LZ7NVcMijdV/ly6ib cq0b0LLqJHe/fcpJ806JZP3p5sndQhDmlUkZ2AWZf6CUKSEFcufbbYkt+84ZK4PL N9mEqXYQ3DXClLQmIBv+NZhtGlmADkWDE4BNouw8dVxhaXH7Hw/jfBHdb6SSHMRe tQ6Src1j8AYOhf5J35SMudgkbGcMelm0yeZ7Sizk+5Ft0EmdbZsnkvsGdQ== =4RA+ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more kvm updates from Paolo Bonzini: "The main batch of ARM + RISC-V changes, and a few fixes and cleanups for x86 (PMU virtualization and selftests). ARM: - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes RISC-V: - Improved instruction encoding infrastructure for instructions not yet supported by binutils - Svinval support for both KVM Host and KVM Guest - Zihintpause support for KVM Guest - Zicbom support for KVM Guest - Record number of signal exits as a VCPU stat - Use generic guest entry infrastructure x86: - Misc PMU fixes and cleanups. - selftests: fixes for Hyper-V hypercall - selftests: fix nx_huge_pages_test on TDP-disabled hosts - selftests: cleanups for fix_hypercall_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (57 commits) riscv: select HAVE_POSIX_CPU_TIMERS_TASK_WORK RISC-V: KVM: Use generic guest entry infrastructure RISC-V: KVM: Record number of signal exits as a vCPU stat RISC-V: KVM: add __init annotation to riscv_kvm_init() RISC-V: KVM: Expose Zicbom to the guest RISC-V: KVM: Provide UAPI for Zicbom block size RISC-V: KVM: Make ISA ext mappings explicit RISC-V: KVM: Allow Guest use Zihintpause extension RISC-V: KVM: Allow Guest use Svinval extension RISC-V: KVM: Use Svinval for local TLB maintenance when available RISC-V: Probe Svinval extension form ISA string RISC-V: KVM: Change the SBI specification version to v1.0 riscv: KVM: Apply insn-def to hlv encodings riscv: KVM: Apply insn-def to hfence encodings riscv: Introduce support for defining instructions riscv: Add X register names to gpr-nums KVM: arm64: Advertise new kvmarm mailing list kvm: vmx: keep constant definition format consistent kvm: mmu: fix typos in struct kvm_arch KVM: selftests: Fix nx_huge_pages_test on TDP-disabled hosts ... |
||
Linus Torvalds
|
3604a7f568 |
This update includes the following changes:
API: - Feed untrusted RNGs into /dev/random. - Allow HWRNG sleeping to be more interruptible. - Create lib/utils module. - Setting private keys no longer required for akcipher. - Remove tcrypt mode=1000. - Reorganised Kconfig entries. Algorithms: - Load x86/sha512 based on CPU features. - Add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher. Drivers: - Add HACE crypto driver aspeed. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmM785cACgkQxycdCkmx i6dveBAAmGVYtrPmcGfA6CmzZ8ps9KdZxhjHjzLKwuqrOMulZvE2IYeUV4QtNqpQ 6NLY2+TkqL0XIbCXoByIk32lMYIlXBaJdMYdHHDTeo7E2wqZn/46SPSWeNKazyJx dkL8Oj62nqDc2s0LOi3vLvod+sENFQ69R+vkHOa0fZhX0UBsac3NIXo+74Y2A7bE 0+iQFKTWdNnoQzQ0j4q8WMiolKYh21iPZ9l5sjgMgichLCaE6PrITlRcaWrtPhey U1OmJtbTPsg+5X1r9KyLtoAXtBDONl66GQyne+p/ZYD8cMhxomjJaPlMhwWE/n4d d2KJKvoXoPPo4c+yNIS9hBav07ZriPl0q0jd2M1rd6oYTmFpaodTgIBfjvxO+wfV GoqDS8PEc42U1uwkuKC/cvfr6pB8WiybfXy+vSXBm/jUgIOO3y+eqsC8Jx9ZoQeG F+d34PYfJrJbmDRtcA6ZKdzN0OmKq7aCilx1kGKGPg0D+uq64FBo7zsT6XzTK8HL 2Za9AACPn87xLQwGrKDSBfyrlSSIJm2FaIIPayUXHEo7cyoiZwbTpXRRJ1mDR+v9 jzI+xPEXCthtjysuRmufNhTkiZUv3lZ8ORfQ0QFKR53tjZUm+dVQo0V/N/ZSXoSV SyRvXYO+ToXePAofNWl1LcO1grX/vxtFNedMkDLHXooRcnCaIYo= =rq2f -----END PGP SIGNATURE----- Merge tag 'v6.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Feed untrusted RNGs into /dev/random - Allow HWRNG sleeping to be more interruptible - Create lib/utils module - Setting private keys no longer required for akcipher - Remove tcrypt mode=1000 - Reorganised Kconfig entries Algorithms: - Load x86/sha512 based on CPU features - Add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher Drivers: - Add HACE crypto driver aspeed" * tag 'v6.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (124 commits) crypto: aspeed - Remove redundant dev_err call crypto: scatterwalk - Remove unused inline function scatterwalk_aligned() crypto: aead - Remove unused inline functions from aead crypto: bcm - Simplify obtain the name for cipher crypto: marvell/octeontx - use sysfs_emit() to instead of scnprintf() hwrng: core - start hwrng kthread also for untrusted sources crypto: zip - remove the unneeded result variable crypto: qat - add limit to linked list parsing crypto: octeontx2 - Remove the unneeded result variable crypto: ccp - Remove the unneeded result variable crypto: aspeed - Fix check for platform_get_irq() errors crypto: virtio - fix memory-leak crypto: cavium - prevent integer overflow loading firmware crypto: marvell/octeontx - prevent integer overflows crypto: aspeed - fix build error when only CRYPTO_DEV_ASPEED is enabled crypto: hisilicon/qm - fix the qos value initialization crypto: sun4i-ss - use DEFINE_SHOW_ATTRIBUTE to simplify sun4i_ss_debugfs crypto: tcrypt - add async speed test for aria cipher crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher crypto: aria - prepare generic module for optimized implementations ... |
||
Linus Torvalds
|
ef688f8b8c |
The first batch of KVM patches, mostly covering x86, which I
am sending out early due to me travelling next week. There is a lone mm patch for which Andrew gave an informal ack at https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org. I will send the bulk of ARM work, as well as other architectures, at the end of next week. ARM: * Account stage2 page table allocations in memory stats. x86: * Account EPT/NPT arm64 page table allocations in memory stats. * Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses. * Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of Hyper-V now support eVMCS fields associated with features that are enumerated to the guest. * Use KVM's sanitized VMCS config as the basis for the values of nested VMX capabilities MSRs. * A myriad event/exception fixes and cleanups. Most notably, pending exceptions morph into VM-Exits earlier, as soon as the exception is queued, instead of waiting until the next vmentry. This fixed a longstanding issue where the exceptions would incorrecly become double-faults instead of triggering a vmexit; the common case of page-fault vmexits had a special workaround, but now it's fixed for good. * A handful of fixes for memory leaks in error paths. * Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow. * Never write to memory from non-sleepable kvm_vcpu_check_block() * Selftests refinements and cleanups. * Misc typo cleanups. Generic: * remove KVM_REQ_UNHALT -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/ FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4 3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE +cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig== =/hqX -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "The first batch of KVM patches, mostly covering x86. ARM: - Account stage2 page table allocations in memory stats x86: - Account EPT/NPT arm64 page table allocations in memory stats - Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses - Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of Hyper-V now support eVMCS fields associated with features that are enumerated to the guest - Use KVM's sanitized VMCS config as the basis for the values of nested VMX capabilities MSRs - A myriad event/exception fixes and cleanups. Most notably, pending exceptions morph into VM-Exits earlier, as soon as the exception is queued, instead of waiting until the next vmentry. This fixed a longstanding issue where the exceptions would incorrecly become double-faults instead of triggering a vmexit; the common case of page-fault vmexits had a special workaround, but now it's fixed for good - A handful of fixes for memory leaks in error paths - Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow - Never write to memory from non-sleepable kvm_vcpu_check_block() - Selftests refinements and cleanups - Misc typo cleanups Generic: - remove KVM_REQ_UNHALT" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits) KVM: remove KVM_REQ_UNHALT KVM: mips, x86: do not rely on KVM_REQ_UNHALT KVM: x86: never write to memory from kvm_vcpu_check_block() KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set KVM: x86: lapic does not have to process INIT if it is blocked KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed KVM: nVMX: Make an event request when pending an MTF nested VM-Exit KVM: x86: make vendor code check for all nested events mailmap: Update Oliver's email address KVM: x86: Allow force_emulation_prefix to be written without a reload KVM: selftests: Add an x86-only test to verify nested exception queueing KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events() KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions KVM: x86: Morph pending exceptions to pending VM-Exits at queue time ... |
||
Paolo Bonzini
|
fe4d9e4abf |
KVM/arm64 updates for v6.1
- Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5hQcPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDoMUP/jra4HSmujLUB5G7Op8HxuurEecOc6xtw0Af AbDLlVc2Vs4rrdVh8GMc8D80atUAVitp8IFjdp/PzI2GTBTzWz43Gav2AbhgIJbJ xoFVHL8LkdHKyMbq10359DqGMqhIf41OFzGwhbzcx2V4pKNkSpjbCpu3bi/+Ybjg 006ZpZc7NAU0rZgw9Flb/dhn0jw7RMc3orhoDQ4tBp1P/VhvqvgFt5bWipkvvBP7 +lQK28ujG3ghST/hKRhg6ozgy5+6NEEHMuhErMYP8nIivRchX+pWF2Lb0qGH1e+U v2MZIZnIIUjyTV1vbYlxtltzfYmPuQ2MFNUBawI9tmlIOU9vJSCzeJS64uWK4KLV kbmk57OfC7rQoSNJH4jaKQp0YpIktrB9Vei97t4I7NwEmkjQj6cLTgg4tQrNqTiQ cFGeC9mE+lhFC8z1lCbna2eG631FxpPrB1SJ1/CU9wboam9dUfXGIvBPh+i2pvMZ vcxzUZJ11y+/uhp4k8i2PBwNno0iwRXd5MinwRUs2CR5vhs8qa5y7FVWKyqKpgI2 xqr4lYTixJZL3mWkYyOQuClrTbT1zkoaPldLq6M7wvO08+QV8ryMeyKT+9s/gNQU dcYSwBCWZaOZm2nN8/zjxRb7VqZVu3cwyXi9XXUWNTCgIe/Q/SDPbXU/Hwbgzf8X UsQF7e9A =aNPK -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v6.1 - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes |
||
Marc Zyngier
|
671c8c7f9f |
KVM: Document weakly ordered architecture requirements for dirty ring
Now that the kernel can expose to userspace that its dirty ring management relies on explicit ordering, document these new requirements for VMMs to do the right thing. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20220926145120.27974-5-maz@kernel.org |
||
Akhil Raj
|
7f77ebbf75 |
Delete duplicate words from kernel docs
I have deleted duplicate words like to, guest, trace, when, we Signed-off-by: Akhil Raj <lf32.dev@gmail.com> Link: https://lore.kernel.org/r/20220829065239.4531-1-lf32.dev@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net> |
||
Paolo Bonzini
|
c59fb12758 |
KVM: remove KVM_REQ_UNHALT
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return value of kvm_vcpu_block/kvm_vcpu_halt. Remove it. No functional change intended. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20220921003201.1441511-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Aaron Lewis
|
b5cb32b16c |
KVM: x86: Delete duplicate documentation for KVM_X86_SET_MSR_FILTER
Two copies of KVM_X86_SET_MSR_FILTER somehow managed to make it's way into the documentation. Remove one copy and merge the difference from the removed copy into the copy that's being kept. Fixes: fd49e8ee70b3 ("Merge branch 'kvm-sev-cgroup' into HEAD") Signed-off-by: Aaron Lewis <aaronlewis@google.com> Link: https://lore.kernel.org/r/20220712001045.2364298-2-aaronlewis@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Jacky Li
|
d8da2da21f |
crypto: ccp - Initialize PSP when reading psp data file failed
Currently the OS fails the PSP initialization when the file specified at 'init_ex_path' does not exist or has invalid content. However the SEV spec just requires users to allocate 32KB of 0xFF in the file, which can be taken care of by the OS easily. To improve the robustness during the PSP init, leverage the retry mechanism and continue the init process: Before the first INIT_EX call, if the content is invalid or missing, continue the process by feeding those contents into PSP instead of aborting. PSP will then override it with 32KB 0xFF and return SEV_RET_SECURE_DATA_INVALID status code. In the second INIT_EX call, this 32KB 0xFF content will then be fed and PSP will write the valid data to the file. In order to do this, sev_read_init_ex_file should only be called once for the first INIT_EX call. Calling it again for the second INIT_EX call will cause the invalid file content overwriting the valid 32KB 0xFF data provided by PSP in the first INIT_EX call. Co-developed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Peter Gonda <pgonda@google.com> Signed-off-by: Jacky Li <jackyli@google.com> Reported-by: Alper Gun <alpergun@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> |
||
Bagas Sanjaya
|
19a7cc817a |
KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
There is unexpected warning on KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table, which cause the table to be rendered as paragraph text instead. The warning is due to missing colon at capability name and returns keyword, as well as improper alignment on multi-line returns field. Fix the warning by adding missing colons and aligning the field. Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/ Fixes: 084cc29f8bbb03 ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Matlack <dmatlack@google.com> Cc: Ben Gardon <bgardon@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-next@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Message-Id: <20220627095151.19339-3-bagasdotme@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Bagas Sanjaya
|
b4aed4d85f |
Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
Extend heading underline for KVM_CAP_VM_DISABLE_NX_HUGE_PAGE to match the heading text length. Link: https://lore.kernel.org/lkml/20220627181937.3be67263@canb.auug.org.au/ Fixes: 084cc29f8bbb03 ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Matlack <dmatlack@google.com> Cc: Ben Gardon <bgardon@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-next@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Message-Id: <20220627095151.19339-2-bagasdotme@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
Linus Torvalds
|
7c5c3a6177 |
ARM:
* Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack * Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure * Disagregation of the vcpu flags in separate sets to better track their use model. * A fix for the GICv2-on-v3 selftest * A small set of cosmetic fixes RISC-V: * Track ISA extensions used by Guest using bitmap * Added system instruction emulation framework * Added CSR emulation framework * Added gfp_custom flag in struct kvm_mmu_memory_cache * Added G-stage ioremap() and iounmap() functions * Added support for Svpbmt inside Guest s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes x86: * Permit guests to ignore single-bit ECC errors * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Use try_cmpxchg64 instead of cmpxchg64 * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * Allow NX huge page mitigation to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs * Miscellaneous cleanups: ** MCE MSR emulation ** Use separate namespaces for guest PTEs and shadow PTEs bitmasks ** PIO emulation ** Reorganize rmap API, mostly around rmap destruction ** Do not workaround very old KVM bugs for L0 that runs with nesting enabled ** new selftests API for CPUID Generic: * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA== =PM8o -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Quite a large pull request due to a selftest API overhaul and some patches that had come in too late for 5.19. ARM: - Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack - Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure - Disagregation of the vcpu flags in separate sets to better track their use model. - A fix for the GICv2-on-v3 selftest - A small set of cosmetic fixes RISC-V: - Track ISA extensions used by Guest using bitmap - Added system instruction emulation framework - Added CSR emulation framework - Added gfp_custom flag in struct kvm_mmu_memory_cache - Added G-stage ioremap() and iounmap() functions - Added support for Svpbmt inside Guest s390: - add an interface to provide a hypervisor dump for secure guests - improve selftests to use TAP interface - enable interpretive execution of zPCI instructions (for PCI passthrough) - First part of deferred teardown - CPU Topology - PV attestation - Minor fixes x86: - Permit guests to ignore single-bit ECC errors - Intel IPI virtualization - Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS - PEBS virtualization - Simplify PMU emulation by just using PERF_TYPE_RAW events - More accurate event reinjection on SVM (avoid retrying instructions) - Allow getting/setting the state of the speaker port data bit - Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent - "Notify" VM exit (detect microarchitectural hangs) for Intel - Use try_cmpxchg64 instead of cmpxchg64 - Ignore benign host accesses to PMU MSRs when PMU is disabled - Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior - Allow NX huge page mitigation to be disabled on a per-vm basis - Port eager page splitting to shadow MMU as well - Enable CMCI capability by default and handle injected UCNA errors - Expose pid of vcpu threads in debugfs - x2AVIC support for AMD - cleanup PIO emulation - Fixes for LLDT/LTR emulation - Don't require refcounted "struct page" to create huge SPTEs - Miscellaneous cleanups: - MCE MSR emulation - Use separate namespaces for guest PTEs and shadow PTEs bitmasks - PIO emulation - Reorganize rmap API, mostly around rmap destruction - Do not workaround very old KVM bugs for L0 that runs with nesting enabled - new selftests API for CPUID Generic: - Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache - new selftests API using struct kvm_vcpu instead of a (vm, id) tuple" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits) selftests: kvm: set rax before vmcall selftests: KVM: Add exponent check for boolean stats selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test selftests: KVM: Check stat name before other fields KVM: x86/mmu: remove unused variable RISC-V: KVM: Add support for Svpbmt inside Guest/VM RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap() RISC-V: KVM: Add G-stage ioremap() and iounmap() functions KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache RISC-V: KVM: Add extensible CSR emulation framework RISC-V: KVM: Add extensible system instruction emulation framework RISC-V: KVM: Factor-out instruction emulation into separate sources RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function RISC-V: KVM: Fix variable spelling mistake RISC-V: KVM: Improve ISA extension by using a bitmap KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs() KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT KVM: x86: Do not block APIC write for non ICR registers ... |
||
Linus Torvalds
|
aad26f55f4 |
This was a moderately busy cycle for documentation, but nothing all that
earth-shaking: - More Chinese translations, and an update to the Italian translations. The Japanese, Korean, and traditional Chinese translations are more-or-less unmaintained at this point, instead. - Some build-system performance improvements. - The removal of the archaic submitting-drivers.rst document, with the movement of what useful material that remained into other docs. - Improvements to sphinx-pre-install to, hopefully, give more useful suggestions. - A number of build-warning fixes Plus the usual collection of typo fixes, updates, and more. -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmLn9OwPHGNvcmJldEBs d24ubmV0AAoJEBdDWhNsDH5YtrwIAJNZoDYJJIRuVHnFkAn5EJ4b/chnR1dSTBtn WdE/1zdAlMBWVlEGO48VZybph9Sk0v+cUGf+yviDgASQrfOhRRTkg/0u6XaBAYO0 +C2D1QDd9DggGgajxsfJfTdD3IuB78mGmCQvP17XIJW+NK1CK9rXZBnj6WC5/HJw PCHzeeVreBxOS3W9GelMYa6vjVl7dv81x4DPllnsgU2AMk0/Ce0MVjeIZ695sOeP Ki6jZgC2GsgFSK5kBC35OiDe5q+fDzlLfek34EUCn4SIbMALSUYWO1db122w5Pme Ej0+UTBhD19WH1uB/rcVKnVWugi7UEUJexZsao+nC7UrdIVtYq0= =83BG -----END PGP SIGNATURE----- Merge tag 'docs-6.0' of git://git.lwn.net/linux Pull documentation updates from Jonathan Corbet: "This was a moderately busy cycle for documentation, but nothing all that earth-shaking: - More Chinese translations, and an update to the Italian translations. The Japanese, Korean, and traditional Chinese translations are more-or-less unmaintained at this point, instead. - Some build-system performance improvements. - The removal of the archaic submitting-drivers.rst document, with the movement of what useful material that remained into other docs. - Improvements to sphinx-pre-install to, hopefully, give more useful suggestions. - A number of build-warning fixes Plus the usual collection of typo fixes, updates, and more" * tag 'docs-6.0' of git://git.lwn.net/linux: (92 commits) docs: efi-stub: Fix paths for x86 / arm stubs Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8 Docs/zh_CN: Update the translation of pci to 5.19-rc8 Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8 Docs/zh_CN: Update the translation of usage to 5.19-rc8 Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8 Docs/zh_CN: Update the translation of sparse to 5.19-rc8 Docs/zh_CN: Update the translation of kasan to 5.19-rc8 Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8 doc:it_IT: align Italian documentation docs: Remove spurious tag from admin-guide/mm/overcommit-accounting.rst Documentation: process: Update email client instructions for Thunderbird docs: ABI: correct QEMU fw_cfg spec path doc/zh_CN: remove submitting-driver reference from docs docs: zh_TW: align to submitting-drivers removal docs: zh_CN: align to submitting-drivers removal docs: ko_KR: howto: remove reference to removed submitting-drivers docs: ja_JP: howto: remove reference to removed submitting-drivers docs: it_IT: align to submitting-drivers removal docs: process: remove outdated submitting-drivers.rst ... |
||
Linus Torvalds
|
0cec3f24a7 |
arm64 updates for 5.20
- Remove unused generic cpuidle support (replaced by PSCI version) - Fix documentation describing the kernel virtual address space - Handling of some new CPU errata in Arm implementations - Rework of our exception table code in preparation for handling machine checks (i.e. RAS errors) more gracefully - Switch over to the generic implementation of ioremap() - Fix lockdep tracking in NMI context - Instrument our memory barrier macros for KCSAN - Rework of the kPTI G->nG page-table repainting so that the MMU remains enabled and the boot time is no longer slowed to a crawl for systems which require the late remapping - Enable support for direct swapping of 2MiB transparent huge-pages on systems without MTE - Fix handling of MTE tags with allocating new pages with HW KASAN - Expose the SMIDR register to userspace via sysfs - Continued rework of the stack unwinder, particularly improving the behaviour under KASAN - More repainting of our system register definitions to match the architectural terminology - Improvements to the layout of the vDSO objects - Support for allocating additional bits of HWCAP2 and exposing FEAT_EBF16 to userspace on CPUs that support it - Considerable rework and optimisation of our early boot code to reduce the need for cache maintenance and avoid jumping in and out of the kernel when handling relocation under KASLR - Support for disabling SVE and SME support on the kernel command-line - Support for the Hisilicon HNS3 PMU - Miscellanous cleanups, trivial updates and minor fixes -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmLeccUQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNCysB/4ml92RJLhVwRAofbtFfVgVz3JLTSsvob9x Z7FhNDxfM/G32wKtOHU9tHkGJ+PMVWOPajukzxkMhxmilfTyHBbiisNWVRjKQxj4 wrd07DNXPIv3bi8SWzS1y2y8ZqujZWjNJlX8SUCzEoxCVtuNKwrh96kU1jUjrkFZ kBo4E4wBWK/qW29nClGSCgIHRQNJaB/jvITlQhkqIb0pwNf3sAUzW7QoF1iTZWhs UswcLh/zC4q79k9poegdCt8chV5OBDLtLPnMxkyQFvsLYRp3qhyCSQQY/BxvO5JS jT9QR6d+1ewET9BFhqHlIIuOTYBCk3xn/PR9AucUl+ZBQd2tO4B1 =LVH0 -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Highlights include a major rework of our kPTI page-table rewriting code (which makes it both more maintainable and considerably faster in the cases where it is required) as well as significant changes to our early boot code to reduce the need for data cache maintenance and greatly simplify the KASLR relocation dance. Summary: - Remove unused generic cpuidle support (replaced by PSCI version) - Fix documentation describing the kernel virtual address space - Handling of some new CPU errata in Arm implementations - Rework of our exception table code in preparation for handling machine checks (i.e. RAS errors) more gracefully - Switch over to the generic implementation of ioremap() - Fix lockdep tracking in NMI context - Instrument our memory barrier macros for KCSAN - Rework of the kPTI G->nG page-table repainting so that the MMU remains enabled and the boot time is no longer slowed to a crawl for systems which require the late remapping - Enable support for direct swapping of 2MiB transparent huge-pages on systems without MTE - Fix handling of MTE tags with allocating new pages with HW KASAN - Expose the SMIDR register to userspace via sysfs - Continued rework of the stack unwinder, particularly improving the behaviour under KASAN - More repainting of our system register definitions to match the architectural terminology - Improvements to the layout of the vDSO objects - Support for allocating additional bits of HWCAP2 and exposing FEAT_EBF16 to userspace on CPUs that support it - Considerable rework and optimisation of our early boot code to reduce the need for cache maintenance and avoid jumping in and out of the kernel when handling relocation under KASLR - Support for disabling SVE and SME support on the kernel command-line - Support for the Hisilicon HNS3 PMU - Miscellanous cleanups, trivial updates and minor fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits) arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr} arm64: fix KASAN_INLINE arm64/hwcap: Support FEAT_EBF16 arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long arm64/hwcap: Document allocation of upper bits of AT_HWCAP arm64: enable THP_SWAP for arm64 arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52 arm64: errata: Remove AES hwcap for COMPAT tasks arm64: numa: Don't check node against MAX_NUMNODES drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node() docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags" mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON mm: kasan: Skip unpoisoning of user pages mm: kasan: Ensure the tags are visible before the tag in page->flags drivers/perf: hisi: add driver for HNS3 PMU drivers/perf: hisi: Add description for HNS3 PMU driver drivers/perf: riscv_pmu_sbi: perf format perf/arm-cci: Use the bitmap API to allocate bitmaps ... |
||
Paolo Bonzini
|
63f4b21041 |
Merge remote-tracking branch 'kvm/next' into kvm-next-5.20
KVM/s390, KVM/x86 and common infrastructure changes for 5.20 x86: * Permit guests to ignore single-bit ECC errors * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Cleanups for MCE MSR emulation s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes Generic: * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple x86: * Use try_cmpxchg64 instead of cmpxchg64 * Bugfixes * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * x86/MMU: Allow NX huge pages to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs x86 cleanups: * Use separate namespaces for guest PTEs and shadow PTEs bitmasks * PIO emulation * Reorganize rmap API, mostly around rmap destruction * Do not workaround very old KVM bugs for L0 that runs with nesting enabled * new selftests API for CPUID |
||
Paolo Bonzini
|
a4850b5590 |
KVM: s390x: Fixes and features for 5.20
* First part of deferred teardown * CPU Topology * interpretive execution for PCI instructions * PV attestation * Minor fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEoWuZBM6M3lCBSfTnuARItAMU6BMFAmLZaj0ACgkQuARItAMU 6BPmPw/+PIZ7xxoiuUCCWlFi+RsfgTH1OWGoLL2B8zQHyJJ+IS4puGqKfVdusl56 GhPPlTkBG9gbhRXUMDWnKgp4JLOIj8ngbgQ8zcD/xeH8tzm4iRy+uj7PxCe+kQGB CY3rL42HwEvFJUbLzi0/8ahLjz6t7+hq7okMdhq9/MoIJBp/w74QZb+chpbuCpi/ /nwdfon85NYTLCK73g6kjjHzzA3hQfUmheVkOt288Qp3yKLh77blJ79SRl3xHeuH PBTqsLoxMiJyFfhJQ9U5HnKcIMAD6vA4ayoj1PAZgToYt40B3k8K6HQF+07TcZuy KABAqmUMNgbapUC74+U45d1eLAl1TGbZSuaq5ymb9jyFyMxGN2Vp6K9O6Ebd+ghy NsGc+CLRPDTiwi/D+QQpGrIksqcMEy+ocGGjbeIuRQxao6C9+KladP5nRxXJpDZf Vrupw+gDjBPozmQAA50VtR+ad5rdEdbSmUvo0nIeBpQ4VJxLFnA0tXl4XOEEFr2M 214yv691KFXhVtaV0lHRKoCQ78OAix6XyjHI0rukGnqNNo7w+V3htU0oG0ifWEU9 96BpKKpjfsCRhN1h2BF8MTPMEb6AGsI2hLCm3neGxrcT7KM1chk2Z3qlGdrc3xZT BIj7H23trC6HG0vCi1rHAn+S0zmcnt5TCaRUNQF5nUeLYfV1FC0= =IB9Y -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-5.20-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390x: Fixes and features for 5.20 * First part of deferred teardown * CPU Topology * interpretive execution for PCI instructions * PV attestation * Minor fixes |
||
Pierre Morel
|
f5ecfee944 |
KVM: s390: resetting the Topology-Change-Report
During a subsystem reset the Topology-Change-Report is cleared. Let's give userland the possibility to clear the MTCR in the case of a subsystem reset. To migrate the MTCR, we give userland the possibility to query the MTCR state. We indicate KVM support for the CPU topology facility with a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20220714194334.127812-1-pmorel@linux.ibm.com> Link: https://lore.kernel.org/all/20220714194334.127812-1-pmorel@linux.ibm.com/ [frankja@linux.ibm.com: Simple conflict resolution in Documentation/virt/kvm/api.rst] Signed-off-by: Janosch Frank <frankja@linux.ibm.com> |
||
Oliver Upton
|
450a563924 |
KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats
commit 1b870fa5573e ("kvm: stats: tell userspace which values are boolean") added a new stat unit (boolean) but failed to raise KVM_STATS_UNIT_MAX. Fix by pointing UNIT_MAX at the new max value of UNIT_BOOLEAN. Fixes: 1b870fa5573e ("kvm: stats: tell userspace which values are boolean") Reported-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20220719125229.2934273-1-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |