mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-04 04:02:26 +00:00
2c9b351240
-----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaZQB0UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNkZwf/bv2jiENaLFNGPe/VqTKMQ6PHQLMG +sNHx6fJPP35gTM8Jqf0/7/ummZXcSuC1mWrzYbecZm7Oeg3vwNXHZ4LquwwX6Dv 8dKcUzLbWDAC4WA3SKhi8C8RV2v6E7ohy69NtAJmFWTc7H95dtIQm6cduV2osTC3 OEuHe1i8d9umk6couL9Qhm8hk3i9v2KgCsrfyNrQgLtS3hu7q6yOTR8nT0iH6sJR KE5A8prBQgLmF34CuvYDw4Hu6E4j+0QmIqodovg2884W1gZQ9LmcVqYPaRZGsG8S iDdbkualLKwiR1TpRr3HJGKWSFdc7RblbsnHRvHIZgFsMQiimh4HrBSCyQ== =zepX -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. 
EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. 
Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit0dc902267c
("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. 
to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
401 lines
13 KiB
C
/* SPDX-License-Identifier: GPL-2.0+ */
/*
 * Sleepable Read-Copy Update mechanism for mutual exclusion
 *
 *	Copyright (C) IBM Corporation, 2006
 *	Copyright (C) Fujitsu, 2012
 *
 * Author: Paul McKenney <paulmck@linux.ibm.com>
 *	   Lai Jiangshan <laijs@cn.fujitsu.com>
 *
 * For detailed explanation of Read-Copy Update mechanism see -
 *		Documentation/RCU/ *.txt
 *
 */

#ifndef _LINUX_SRCU_H
#define _LINUX_SRCU_H

#include <linux/mutex.h>
#include <linux/rcupdate.h>
#include <linux/workqueue.h>
#include <linux/rcu_segcblist.h>

struct srcu_struct;

#ifdef CONFIG_DEBUG_LOCK_ALLOC

int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
		       struct lock_class_key *key);

#define init_srcu_struct(ssp) \
({ \
	static struct lock_class_key __srcu_key; \
	\
	__init_srcu_struct((ssp), #ssp, &__srcu_key); \
})

#define __SRCU_DEP_MAP_INIT(srcu_name)	.dep_map = { .name = #srcu_name },
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */

int init_srcu_struct(struct srcu_struct *ssp);

#define __SRCU_DEP_MAP_INIT(srcu_name)
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

#ifdef CONFIG_TINY_SRCU
#include <linux/srcutiny.h>
#elif defined(CONFIG_TREE_SRCU)
#include <linux/srcutree.h>
#else
#error "Unknown SRCU implementation specified to kernel configuration"
#endif

void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
		void (*func)(struct rcu_head *head));
void cleanup_srcu_struct(struct srcu_struct *ssp);
int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
void synchronize_srcu(struct srcu_struct *ssp);

#define SRCU_GET_STATE_COMPLETED 0x1

/**
 * get_completed_synchronize_srcu - Return a pre-completed polled state cookie
 *
 * Returns a value that poll_state_synchronize_srcu() will always treat
 * as a cookie whose grace period has already completed.
 */
static inline unsigned long get_completed_synchronize_srcu(void)
{
	return SRCU_GET_STATE_COMPLETED;
}

unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);

// Maximum number of unsigned long values corresponding to
// not-yet-completed SRCU grace periods.
#define NUM_ACTIVE_SRCU_POLL_OLDSTATE 2

/**
 * same_state_synchronize_srcu - Are two old-state values identical?
 * @oldstate1: First old-state value.
 * @oldstate2: Second old-state value.
 *
 * The two old-state values must have been obtained from either
 * get_state_synchronize_srcu(), start_poll_synchronize_srcu(), or
 * get_completed_synchronize_srcu().  Returns @true if the two values are
 * identical and @false otherwise.  This allows structures whose lifetimes
 * are tracked by old-state values to push these values to a list header,
 * allowing those structures to be slightly smaller.
 */
static inline bool same_state_synchronize_srcu(unsigned long oldstate1, unsigned long oldstate2)
{
	return oldstate1 == oldstate2;
}

#ifdef CONFIG_NEED_SRCU_NMI_SAFE
int __srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp);
void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx) __releases(ssp);
#else
static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp)
{
	return __srcu_read_lock(ssp);
}
static inline void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
{
	__srcu_read_unlock(ssp, idx);
}
#endif /* CONFIG_NEED_SRCU_NMI_SAFE */

void srcu_init(void);

#ifdef CONFIG_DEBUG_LOCK_ALLOC

/**
 * srcu_read_lock_held - might we be in an SRCU read-side critical section?
 * @ssp: The srcu_struct structure to check
 *
 * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an SRCU
 * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
 * this assumes we are in an SRCU read-side critical section unless it can
 * prove otherwise.
 *
 * Checks debug_lockdep_rcu_enabled() to prevent false positives during boot
 * and while lockdep is disabled.
 *
 * Note that SRCU is based on its own state machine and does not
 * rely on normal RCU, so it can be called from a CPU that
 * is in the idle loop from an RCU point of view or offline.
 */
static inline int srcu_read_lock_held(const struct srcu_struct *ssp)
{
	if (!debug_lockdep_rcu_enabled())
		return 1;
	return lock_is_held(&ssp->dep_map);
}

/*
 * Annotations provide deadlock detection for SRCU.
 *
 * Similar to other lockdep annotations, except there is an additional
 * srcu_lock_sync(), which is basically an empty *write*-side critical section,
 * see lock_sync() for more information.
 */

/* Annotates a srcu_read_lock() */
static inline void srcu_lock_acquire(struct lockdep_map *map)
{
	lock_map_acquire_read(map);
}

/* Annotates a srcu_read_unlock() */
static inline void srcu_lock_release(struct lockdep_map *map)
{
	lock_map_release(map);
}

/* Annotates a synchronize_srcu() */
static inline void srcu_lock_sync(struct lockdep_map *map)
{
	lock_map_sync(map);
}

#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */

static inline int srcu_read_lock_held(const struct srcu_struct *ssp)
{
	return 1;
}

#define srcu_lock_acquire(m) do { } while (0)
#define srcu_lock_release(m) do { } while (0)
#define srcu_lock_sync(m) do { } while (0)

#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

#define SRCU_NMI_UNKNOWN	0x0
#define SRCU_NMI_UNSAFE		0x1
#define SRCU_NMI_SAFE		0x2

#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_TREE_SRCU)
void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe);
#else
static inline void srcu_check_nmi_safety(struct srcu_struct *ssp,
					 bool nmi_safe) { }
#endif


/**
 * srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing
 * @p: the pointer to fetch and protect for later dereferencing
 * @ssp: pointer to the srcu_struct, which is used to check that we
 *	really are in an SRCU read-side critical section.
 * @c: condition to check for update-side use
 *
 * If PROVE_RCU is enabled, invoking this outside of an SRCU read-side
 * critical section will result in an RCU-lockdep splat, unless @c evaluates
 * to 1.  The @c argument will normally be a logical expression containing
 * lockdep_is_held() calls.
 */
#define srcu_dereference_check(p, ssp, c) \
	__rcu_dereference_check((p), __UNIQUE_ID(rcu), \
				(c) || srcu_read_lock_held(ssp), __rcu)

/**
 * srcu_dereference - fetch SRCU-protected pointer for later dereferencing
 * @p: the pointer to fetch and protect for later dereferencing
 * @ssp: pointer to the srcu_struct, which is used to check that we
 *	really are in an SRCU read-side critical section.
 *
 * Makes srcu_dereference_check() do the dirty work.  If PROVE_RCU
 * is enabled, invoking this outside of an SRCU read-side critical
 * section will result in an RCU-lockdep splat.
 */
#define srcu_dereference(p, ssp) srcu_dereference_check((p), (ssp), 0)

/**
 * srcu_dereference_notrace - no tracing and no lockdep calls from here
 * @p: the pointer to fetch and protect for later dereferencing
 * @ssp: pointer to the srcu_struct, which is used to check that we
 *	really are in an SRCU read-side critical section.
 */
#define srcu_dereference_notrace(p, ssp) srcu_dereference_check((p), (ssp), 1)

/**
 * srcu_read_lock - register a new reader for an SRCU-protected structure.
 * @ssp: srcu_struct in which to register the new reader.
 *
 * Enter an SRCU read-side critical section.  Note that SRCU read-side
 * critical sections may be nested.  However, it is illegal to
 * call anything that waits on an SRCU grace period for the same
 * srcu_struct, whether directly or indirectly.  Please note that
 * one way to indirectly wait on an SRCU grace period is to acquire
 * a mutex that is held elsewhere while calling synchronize_srcu() or
 * synchronize_srcu_expedited().
 *
 * Note that srcu_read_lock() and the matching srcu_read_unlock() must
 * occur in the same context, for example, it is illegal to invoke
 * srcu_read_unlock() in an irq handler if the matching srcu_read_lock()
 * was invoked in process context.
 */
static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
{
	int retval;

	srcu_check_nmi_safety(ssp, false);
	retval = __srcu_read_lock(ssp);
	srcu_lock_acquire(&ssp->dep_map);
	return retval;
}

/**
 * srcu_read_lock_nmisafe - register a new reader for an SRCU-protected structure.
 * @ssp: srcu_struct in which to register the new reader.
 *
 * Enter an SRCU read-side critical section, but in an NMI-safe manner.
 * See srcu_read_lock() for more information.
 */
static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp)
{
	int retval;

	srcu_check_nmi_safety(ssp, true);
	retval = __srcu_read_lock_nmisafe(ssp);
	rcu_try_lock_acquire(&ssp->dep_map);
	return retval;
}

/* Used by tracing, cannot be traced and cannot invoke lockdep. */
static inline notrace int
srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
{
	int retval;

	srcu_check_nmi_safety(ssp, false);
	retval = __srcu_read_lock(ssp);
	return retval;
}

/**
 * srcu_down_read - register a new reader for an SRCU-protected structure.
 * @ssp: srcu_struct in which to register the new reader.
 *
 * Enter a semaphore-like SRCU read-side critical section.  Note that
 * SRCU read-side critical sections may be nested.  However, it is
 * illegal to call anything that waits on an SRCU grace period for the
 * same srcu_struct, whether directly or indirectly.  Please note that
 * one way to indirectly wait on an SRCU grace period is to acquire
 * a mutex that is held elsewhere while calling synchronize_srcu() or
 * synchronize_srcu_expedited().  But if you want lockdep to help you
 * keep this stuff straight, you should instead use srcu_read_lock().
 *
 * The semaphore-like nature of srcu_down_read() means that the matching
 * srcu_up_read() can be invoked from some other context, for example,
 * from some other task or from an irq handler.  However, neither
 * srcu_down_read() nor srcu_up_read() may be invoked from an NMI handler.
 *
 * Calls to srcu_down_read() may be nested, similar to the manner in
 * which calls to down_read() may be nested.
 */
static inline int srcu_down_read(struct srcu_struct *ssp) __acquires(ssp)
{
	WARN_ON_ONCE(in_nmi());
	srcu_check_nmi_safety(ssp, false);
	return __srcu_read_lock(ssp);
}

/**
 * srcu_read_unlock - unregister an old reader from an SRCU-protected structure.
 * @ssp: srcu_struct in which to unregister the old reader.
 * @idx: return value from corresponding srcu_read_lock().
 *
 * Exit an SRCU read-side critical section.
 */
static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
	__releases(ssp)
{
	WARN_ON_ONCE(idx & ~0x1);
	srcu_check_nmi_safety(ssp, false);
	srcu_lock_release(&ssp->dep_map);
	__srcu_read_unlock(ssp, idx);
}

/**
 * srcu_read_unlock_nmisafe - unregister an old reader from an SRCU-protected structure.
 * @ssp: srcu_struct in which to unregister the old reader.
 * @idx: return value from corresponding srcu_read_lock_nmisafe().
 *
 * Exit an SRCU read-side critical section, but in an NMI-safe manner.
 */
static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
	__releases(ssp)
{
	WARN_ON_ONCE(idx & ~0x1);
	srcu_check_nmi_safety(ssp, true);
	rcu_lock_release(&ssp->dep_map);
	__srcu_read_unlock_nmisafe(ssp, idx);
}

/* Used by tracing, cannot be traced and cannot call lockdep. */
static inline notrace void
srcu_read_unlock_notrace(struct srcu_struct *ssp, int idx) __releases(ssp)
{
	srcu_check_nmi_safety(ssp, false);
	__srcu_read_unlock(ssp, idx);
}

/**
 * srcu_up_read - unregister an old reader from an SRCU-protected structure.
 * @ssp: srcu_struct in which to unregister the old reader.
 * @idx: return value from corresponding srcu_down_read().
 *
 * Exit an SRCU read-side critical section, but not necessarily from
 * the same context as the matching srcu_down_read().
 */
static inline void srcu_up_read(struct srcu_struct *ssp, int idx)
	__releases(ssp)
{
	WARN_ON_ONCE(idx & ~0x1);
	WARN_ON_ONCE(in_nmi());
	srcu_check_nmi_safety(ssp, false);
	__srcu_read_unlock(ssp, idx);
}

/**
 * smp_mb__after_srcu_read_unlock - ensure full ordering after srcu_read_unlock
 *
 * Converts the preceding srcu_read_unlock into a two-way memory barrier.
 *
 * Call this after srcu_read_unlock, to guarantee that all memory operations
 * that occur after smp_mb__after_srcu_read_unlock will appear to happen after
 * the preceding srcu_read_unlock.
 */
static inline void smp_mb__after_srcu_read_unlock(void)
{
	/* __srcu_read_unlock has smp_mb() internally so nothing to do here. */
}

/**
 * smp_mb__after_srcu_read_lock - ensure full ordering after srcu_read_lock
 *
 * Converts the preceding srcu_read_lock into a two-way memory barrier.
 *
 * Call this after srcu_read_lock, to guarantee that all memory operations
 * that occur after smp_mb__after_srcu_read_lock will appear to happen after
 * the preceding srcu_read_lock.
 */
static inline void smp_mb__after_srcu_read_lock(void)
{
	/* __srcu_read_lock has smp_mb() internally so nothing to do here. */
}

DEFINE_LOCK_GUARD_1(srcu, struct srcu_struct,
		    _T->idx = srcu_read_lock(_T->lock),
		    srcu_read_unlock(_T->lock, _T->idx),
		    int idx)

#endif
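
The unlock helpers in this header all check `idx & ~0x1`, so the value returned by srcu_read_lock() is expected to be 0 or 1 and has to be handed back to the matching unlock. The following is a minimal user-space sketch of why that pairing matters, assuming a simplified single-threaded two-counter model. `struct toy_srcu` and every `toy_srcu_*` function are hypothetical names invented for this illustration; they are not the kernel's implementation, which additionally relies on per-CPU counters and memory barriers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; NOT the kernel's struct srcu_struct. */
struct toy_srcu {
	unsigned int idx;		/* counter slot that new readers use */
	unsigned long nreaders[2];	/* active readers per slot */
};

/* Enter a read-side section and return the slot the reader was counted in. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = sp->idx & 0x1;

	sp->nreaders[idx]++;
	return idx;
}

/* Leave a read-side section; idx must be the value returned by the lock. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(!(idx & ~0x1));		/* mirrors WARN_ON_ONCE(idx & ~0x1) */
	assert(sp->nreaders[idx] > 0);
	sp->nreaders[idx]--;
}

/*
 * Updater side of the toy model: steer new readers to the other slot and
 * return the old slot, which only pre-existing readers can still occupy.
 */
static int toy_srcu_flip(struct toy_srcu *sp)
{
	int old = sp->idx & 0x1;

	sp->idx ^= 1;
	return old;
}

/* True once every reader that was counted in slot idx has unlocked. */
static bool toy_srcu_readers_done(const struct toy_srcu *sp, int idx)
{
	return sp->nreaders[idx] == 0;
}
```

In this model a reader that entered before `toy_srcu_flip()` keeps the old slot busy until it unlocks with its own index, while readers arriving afterwards land in the other slot and do not delay the updater. Passing the wrong index to the unlock would decrement the wrong slot, which is the kind of misuse the `idx` parameter is there to prevent.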