Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a

This commit is contained in:
Neeraj Upadhyay 2024-09-09 00:09:47 +05:30
35 changed files with 831 additions and 544 deletions

View File

@ -2649,8 +2649,7 @@ those that are idle from RCU's perspective) and then Tasks Rude RCU can
be removed from the kernel. be removed from the kernel.
The tasks-rude-RCU API is also reader-marking-free and thus quite compact, The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(), consisting solely of synchronize_rcu_tasks_rude().
and rcu_barrier_tasks_rude().
Tasks Trace RCU Tasks Trace RCU
~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~

View File

@ -194,14 +194,13 @@ over a rather long period of time, but improvements are always welcome!
when publicizing a pointer to a structure that can when publicizing a pointer to a structure that can
be traversed by an RCU read-side critical section. be traversed by an RCU read-side critical section.
5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), 5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), or
call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used, call_rcu_tasks_trace() is used, the callback function may be
the callback function may be invoked from softirq context, invoked from softirq context, and in any case with bottom halves
and in any case with bottom halves disabled. In particular, disabled. In particular, this callback function cannot block.
this callback function cannot block. If you need the callback If you need the callback to block, run that code in a workqueue
to block, run that code in a workqueue handler scheduled from handler scheduled from the callback. The queue_rcu_work()
the callback. The queue_rcu_work() function does this for you function does this for you in the case of call_rcu().
in the case of call_rcu().
6. Since synchronize_rcu() can block, it cannot be called 6. Since synchronize_rcu() can block, it cannot be called
from any sort of irq context. The same rule applies from any sort of irq context. The same rule applies
@ -254,10 +253,10 @@ over a rather long period of time, but improvements are always welcome!
corresponding readers must use rcu_read_lock_trace() corresponding readers must use rcu_read_lock_trace()
and rcu_read_unlock_trace(). and rcu_read_unlock_trace().
c. If an updater uses call_rcu_tasks_rude() or c. If an updater uses synchronize_rcu_tasks_rude(),
synchronize_rcu_tasks_rude(), then the corresponding then the corresponding readers must use anything that
readers must use anything that disables preemption, disables preemption, for example, preempt_disable()
for example, preempt_disable() and preempt_enable(). and preempt_enable().
Mixing things up will result in confusion and broken kernels, and Mixing things up will result in confusion and broken kernels, and
has even resulted in an exploitable security issue. Therefore, has even resulted in an exploitable security issue. Therefore,
@ -326,11 +325,9 @@ over a rather long period of time, but improvements are always welcome!
d. Periodically invoke rcu_barrier(), permitting a limited d. Periodically invoke rcu_barrier(), permitting a limited
number of updates per grace period. number of updates per grace period.
The same cautions apply to call_srcu(), call_rcu_tasks(), The same cautions apply to call_srcu(), call_rcu_tasks(), and
call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is call_rcu_tasks_trace(). This is why there is an srcu_barrier(),
why there is an srcu_barrier(), rcu_barrier_tasks(), rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively.
rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(),
respectively.
Note that although these primitives do take action to avoid Note that although these primitives do take action to avoid
memory exhaustion when any given CPU has too many callbacks, memory exhaustion when any given CPU has too many callbacks,
@ -383,17 +380,17 @@ over a rather long period of time, but improvements are always welcome!
must use whatever locking or other synchronization is required must use whatever locking or other synchronization is required
to safely access and/or modify that data structure. to safely access and/or modify that data structure.
Do not assume that RCU callbacks will be executed on Do not assume that RCU callbacks will be executed on the same
the same CPU that executed the corresponding call_rcu(), CPU that executed the corresponding call_rcu(), call_srcu(),
call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or call_rcu_tasks(), or call_rcu_tasks_trace(). For example, if
call_rcu_tasks_trace(). For example, if a given CPU goes offline a given CPU goes offline while having an RCU callback pending,
while having an RCU callback pending, then that RCU callback then that RCU callback will execute on some surviving CPU.
will execute on some surviving CPU. (If this was not the case, (If this was not the case, a self-spawning RCU callback would
a self-spawning RCU callback would prevent the victim CPU from prevent the victim CPU from ever going offline.) Furthermore,
ever going offline.) Furthermore, CPUs designated by rcu_nocbs= CPUs designated by rcu_nocbs= might well *always* have their
might well *always* have their RCU callbacks executed on some RCU callbacks executed on some other CPUs, in fact, for some
other CPUs, in fact, for some real-time workloads, this is the real-time workloads, this is the whole point of using the
whole point of using the rcu_nocbs= kernel boot parameter. rcu_nocbs= kernel boot parameter.
In addition, do not assume that callbacks queued in a given order In addition, do not assume that callbacks queued in a given order
will be invoked in that order, even if they all are queued on the will be invoked in that order, even if they all are queued on the
@ -507,9 +504,9 @@ over a rather long period of time, but improvements are always welcome!
These debugging aids can help you find problems that are These debugging aids can help you find problems that are
otherwise extremely difficult to spot. otherwise extremely difficult to spot.
17. If you pass a callback function defined within a module to one of 17. If you pass a callback function defined within a module
call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), to one of call_rcu(), call_srcu(), call_rcu_tasks(), or
or call_rcu_tasks_trace(), then it is necessary to wait for all call_rcu_tasks_trace(), then it is necessary to wait for all
pending callbacks to be invoked before unloading that module. pending callbacks to be invoked before unloading that module.
Note that it is absolutely *not* sufficient to wait for a grace Note that it is absolutely *not* sufficient to wait for a grace
period! For example, synchronize_rcu() implementation is *not* period! For example, synchronize_rcu() implementation is *not*
@ -522,7 +519,6 @@ over a rather long period of time, but improvements are always welcome!
- call_rcu() -> rcu_barrier() - call_rcu() -> rcu_barrier()
- call_srcu() -> srcu_barrier() - call_srcu() -> srcu_barrier()
- call_rcu_tasks() -> rcu_barrier_tasks() - call_rcu_tasks() -> rcu_barrier_tasks()
- call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
- call_rcu_tasks_trace() -> rcu_barrier_tasks_trace() - call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()
However, these barrier functions are absolutely *not* guaranteed However, these barrier functions are absolutely *not* guaranteed
@ -539,7 +535,6 @@ over a rather long period of time, but improvements are always welcome!
- Either synchronize_srcu() or synchronize_srcu_expedited(), - Either synchronize_srcu() or synchronize_srcu_expedited(),
together with and srcu_barrier() together with and srcu_barrier()
- synchronize_rcu_tasks() and rcu_barrier_tasks() - synchronize_rcu_tasks() and rcu_barrier_tasks()
- synchronize_tasks_rude() and rcu_barrier_tasks_rude()
- synchronize_tasks_trace() and rcu_barrier_tasks_trace() - synchronize_tasks_trace() and rcu_barrier_tasks_trace()
If necessary, you can use something like workqueues to execute If necessary, you can use something like workqueues to execute

View File

@ -1103,7 +1103,7 @@ RCU-Tasks-Rude::
Critical sections Grace period Barrier Critical sections Grace period Barrier
N/A call_rcu_tasks_rude rcu_barrier_tasks_rude N/A N/A
synchronize_rcu_tasks_rude synchronize_rcu_tasks_rude

View File

@ -4937,6 +4937,10 @@
Set maximum number of finished RCU callbacks to Set maximum number of finished RCU callbacks to
process in one batch. process in one batch.
rcutree.csd_lock_suppress_rcu_stall= [KNL]
Do only a one-line RCU CPU stall warning when
there is an ongoing too-long CSD-lock wait.
rcutree.do_rcu_barrier= [KNL] rcutree.do_rcu_barrier= [KNL]
Request a call to rcu_barrier(). This is Request a call to rcu_barrier(). This is
throttled so that userspace tests can safely throttled so that userspace tests can safely
@ -5384,7 +5388,13 @@
Time to wait (s) after boot before inducing stall. Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL] rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set. Disable interrupts while stalling if set, but only
on the first stall in the set.
rcutorture.stall_cpu_repeat= [KNL]
Number of times to repeat the stall sequence,
so that rcutorture.stall_cpu_repeat=3 will result
in four stall sequences.
rcutorture.stall_gp_kthread= [KNL] rcutorture.stall_gp_kthread= [KNL]
Duration (s) of forced sleep within RCU Duration (s) of forced sleep within RCU
@ -5572,14 +5582,6 @@
of zero will disable batching. Batching is of zero will disable batching. Batching is
always disabled for synchronize_rcu_tasks(). always disabled for synchronize_rcu_tasks().
rcupdate.rcu_tasks_rude_lazy_ms= [KNL]
Set timeout in milliseconds RCU Tasks
Rude asynchronous callback batching for
call_rcu_tasks_rude(). A negative value
will take the default. A value of zero will
disable batching. Batching is always disabled
for synchronize_rcu_tasks_rude().
rcupdate.rcu_tasks_trace_lazy_ms= [KNL] rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
Set timeout in milliseconds RCU Tasks Set timeout in milliseconds RCU Tasks
Trace asynchronous callback batching for Trace asynchronous callback batching for

View File

@ -185,11 +185,7 @@ struct rcu_cblist {
* ---------------------------------------------------------------------------- * ----------------------------------------------------------------------------
*/ */
#define SEGCBLIST_ENABLED BIT(0) #define SEGCBLIST_ENABLED BIT(0)
#define SEGCBLIST_RCU_CORE BIT(1) #define SEGCBLIST_OFFLOADED BIT(1)
#define SEGCBLIST_LOCKING BIT(2)
#define SEGCBLIST_KTHREAD_CB BIT(3)
#define SEGCBLIST_KTHREAD_GP BIT(4)
#define SEGCBLIST_OFFLOADED BIT(5)
struct rcu_segcblist { struct rcu_segcblist {
struct rcu_head *head; struct rcu_head *head;

View File

@ -191,7 +191,10 @@ static inline void hlist_del_init_rcu(struct hlist_node *n)
* @old : the element to be replaced * @old : the element to be replaced
* @new : the new element to insert * @new : the new element to insert
* *
* The @old entry will be replaced with the @new entry atomically. * The @old entry will be replaced with the @new entry atomically from
* the perspective of concurrent readers. It is the caller's responsibility
* to synchronize with concurrent updaters, if any.
*
* Note: @old should not be empty. * Note: @old should not be empty.
*/ */
static inline void list_replace_rcu(struct list_head *old, static inline void list_replace_rcu(struct list_head *old,
@ -519,7 +522,9 @@ static inline void hlist_del_rcu(struct hlist_node *n)
* @old : the element to be replaced * @old : the element to be replaced
* @new : the new element to insert * @new : the new element to insert
* *
* The @old entry will be replaced with the @new entry atomically. * The @old entry will be replaced with the @new entry atomically from
* the perspective of concurrent readers. It is the caller's responsibility
* to synchronize with concurrent updaters, if any.
*/ */
static inline void hlist_replace_rcu(struct hlist_node *old, static inline void hlist_replace_rcu(struct hlist_node *old,
struct hlist_node *new) struct hlist_node *new)

View File

@ -34,10 +34,12 @@
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b)) #define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b)) #define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
#define RCU_SEQ_CTR_SHIFT 2
#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
/* Exported common interfaces */ /* Exported common interfaces */
void call_rcu(struct rcu_head *head, rcu_callback_t func); void call_rcu(struct rcu_head *head, rcu_callback_t func);
void rcu_barrier_tasks(void); void rcu_barrier_tasks(void);
void rcu_barrier_tasks_rude(void);
void synchronize_rcu(void); void synchronize_rcu(void);
struct rcu_gp_oldstate; struct rcu_gp_oldstate;
@ -144,11 +146,18 @@ void rcu_init_nohz(void);
int rcu_nocb_cpu_offload(int cpu); int rcu_nocb_cpu_offload(int cpu);
int rcu_nocb_cpu_deoffload(int cpu); int rcu_nocb_cpu_deoffload(int cpu);
void rcu_nocb_flush_deferred_wakeup(void); void rcu_nocb_flush_deferred_wakeup(void);
#define RCU_NOCB_LOCKDEP_WARN(c, s) RCU_LOCKDEP_WARN(c, s)
#else /* #ifdef CONFIG_RCU_NOCB_CPU */ #else /* #ifdef CONFIG_RCU_NOCB_CPU */
static inline void rcu_init_nohz(void) { } static inline void rcu_init_nohz(void) { }
static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; } static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; } static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
static inline void rcu_nocb_flush_deferred_wakeup(void) { } static inline void rcu_nocb_flush_deferred_wakeup(void) { }
#define RCU_NOCB_LOCKDEP_WARN(c, s)
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */ #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
/* /*
@ -165,6 +174,7 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
} while (0) } while (0)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func); void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void); void synchronize_rcu_tasks(void);
void rcu_tasks_torture_stats_print(char *tt, char *tf);
# else # else
# define rcu_tasks_classic_qs(t, preempt) do { } while (0) # define rcu_tasks_classic_qs(t, preempt) do { } while (0)
# define call_rcu_tasks call_rcu # define call_rcu_tasks call_rcu
@ -191,6 +201,7 @@ void rcu_tasks_trace_qs_blkd(struct task_struct *t);
rcu_tasks_trace_qs_blkd(t); \ rcu_tasks_trace_qs_blkd(t); \
} \ } \
} while (0) } while (0)
void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
# else # else
# define rcu_tasks_trace_qs(t) do { } while (0) # define rcu_tasks_trace_qs(t) do { } while (0)
# endif # endif
@ -202,8 +213,8 @@ do { \
} while (0) } while (0)
# ifdef CONFIG_TASKS_RUDE_RCU # ifdef CONFIG_TASKS_RUDE_RCU
void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks_rude(void); void synchronize_rcu_tasks_rude(void);
void rcu_tasks_rude_torture_stats_print(char *tt, char *tf);
# endif # endif
#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false) #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)

View File

@ -294,4 +294,10 @@ int smpcfd_prepare_cpu(unsigned int cpu);
int smpcfd_dead_cpu(unsigned int cpu); int smpcfd_dead_cpu(unsigned int cpu);
int smpcfd_dying_cpu(unsigned int cpu); int smpcfd_dying_cpu(unsigned int cpu);
#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
bool csd_lock_is_stuck(void);
#else
static inline bool csd_lock_is_stuck(void) { return false; }
#endif
#endif /* __LINUX_SMP_H */ #endif /* __LINUX_SMP_H */

View File

@ -129,10 +129,23 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1 #define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2 #define SRCU_STATE_SCAN2 2
/*
* Values for initializing gp sequence fields. Higher values allow wrap arounds to
* occur earlier.
* The second value with state is useful in the case of static initialization of
* srcu_usage where srcu_gp_seq_needed is expected to have some state value in its
* lower bits (or else it will appear to be already initialized within
* the call check_init_srcu_struct()).
*/
#define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
#define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)
#define __SRCU_USAGE_INIT(name) \ #define __SRCU_USAGE_INIT(name) \
{ \ { \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \ .srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL, \
.srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE, \
.srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \ .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
} }

View File

@ -54,9 +54,6 @@
* grace-period sequence number. * grace-period sequence number.
*/ */
#define RCU_SEQ_CTR_SHIFT 2
#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
/* Low-order bit definition for polled grace-period APIs. */ /* Low-order bit definition for polled grace-period APIs. */
#define RCU_GET_STATE_COMPLETED 0x1 #define RCU_GET_STATE_COMPLETED 0x1
@ -255,6 +252,11 @@ static inline void debug_rcu_head_callback(struct rcu_head *rhp)
kmem_dump_obj(rhp); kmem_dump_obj(rhp);
} }
static inline bool rcu_barrier_cb_is_done(struct rcu_head *rhp)
{
return rhp->next == rhp;
}
extern int rcu_cpu_stall_suppress_at_boot; extern int rcu_cpu_stall_suppress_at_boot;
static inline bool rcu_stall_is_suppressed_at_boot(void) static inline bool rcu_stall_is_suppressed_at_boot(void)

View File

@ -260,17 +260,6 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_ENABLED); rcu_segcblist_clear_flags(rsclp, SEGCBLIST_ENABLED);
} }
/*
* Mark the specified rcu_segcblist structure as offloaded (or not)
*/
void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
{
if (offload)
rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
else
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
}
/* /*
* Does the specified rcu_segcblist structure contain callbacks that * Does the specified rcu_segcblist structure contain callbacks that
* are ready to be invoked? * are ready to be invoked?

View File

@ -89,16 +89,7 @@ static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp) static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
{ {
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) && if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
rcu_segcblist_test_flags(rsclp, SEGCBLIST_LOCKING)) rcu_segcblist_test_flags(rsclp, SEGCBLIST_OFFLOADED))
return true;
return false;
}
static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
{
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
!rcu_segcblist_test_flags(rsclp, SEGCBLIST_RCU_CORE))
return true; return true;
return false; return false;

View File

@ -39,6 +39,7 @@
#include <linux/torture.h> #include <linux/torture.h>
#include <linux/vmalloc.h> #include <linux/vmalloc.h>
#include <linux/rcupdate_trace.h> #include <linux/rcupdate_trace.h>
#include <linux/sched/debug.h>
#include "rcu.h" #include "rcu.h"
@ -104,6 +105,20 @@ static char *scale_type = "rcu";
module_param(scale_type, charp, 0444); module_param(scale_type, charp, 0444);
MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)"); MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)");
// Structure definitions for custom fixed-per-task allocator.
struct writer_mblock {
struct rcu_head wmb_rh;
struct llist_node wmb_node;
struct writer_freelist *wmb_wfl;
};
struct writer_freelist {
struct llist_head ws_lhg;
atomic_t ws_inflight;
struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
struct writer_mblock *ws_mblocks;
};
static int nrealreaders; static int nrealreaders;
static int nrealwriters; static int nrealwriters;
static struct task_struct **writer_tasks; static struct task_struct **writer_tasks;
@ -111,6 +126,8 @@ static struct task_struct **reader_tasks;
static struct task_struct *shutdown_task; static struct task_struct *shutdown_task;
static u64 **writer_durations; static u64 **writer_durations;
static bool *writer_done;
static struct writer_freelist *writer_freelists;
static int *writer_n_durations; static int *writer_n_durations;
static atomic_t n_rcu_scale_reader_started; static atomic_t n_rcu_scale_reader_started;
static atomic_t n_rcu_scale_writer_started; static atomic_t n_rcu_scale_writer_started;
@ -120,7 +137,6 @@ static u64 t_rcu_scale_writer_started;
static u64 t_rcu_scale_writer_finished; static u64 t_rcu_scale_writer_finished;
static unsigned long b_rcu_gp_test_started; static unsigned long b_rcu_gp_test_started;
static unsigned long b_rcu_gp_test_finished; static unsigned long b_rcu_gp_test_finished;
static DEFINE_PER_CPU(atomic_t, n_async_inflight);
#define MAX_MEAS 10000 #define MAX_MEAS 10000
#define MIN_MEAS 100 #define MIN_MEAS 100
@ -143,6 +159,7 @@ struct rcu_scale_ops {
void (*sync)(void); void (*sync)(void);
void (*exp_sync)(void); void (*exp_sync)(void);
struct task_struct *(*rso_gp_kthread)(void); struct task_struct *(*rso_gp_kthread)(void);
void (*stats)(void);
const char *name; const char *name;
}; };
@ -224,6 +241,11 @@ static void srcu_scale_synchronize(void)
synchronize_srcu(srcu_ctlp); synchronize_srcu(srcu_ctlp);
} }
static void srcu_scale_stats(void)
{
srcu_torture_stats_print(srcu_ctlp, scale_type, SCALE_FLAG);
}
static void srcu_scale_synchronize_expedited(void) static void srcu_scale_synchronize_expedited(void)
{ {
synchronize_srcu_expedited(srcu_ctlp); synchronize_srcu_expedited(srcu_ctlp);
@ -241,6 +263,7 @@ static struct rcu_scale_ops srcu_ops = {
.gp_barrier = srcu_rcu_barrier, .gp_barrier = srcu_rcu_barrier,
.sync = srcu_scale_synchronize, .sync = srcu_scale_synchronize,
.exp_sync = srcu_scale_synchronize_expedited, .exp_sync = srcu_scale_synchronize_expedited,
.stats = srcu_scale_stats,
.name = "srcu" .name = "srcu"
}; };
@ -270,6 +293,7 @@ static struct rcu_scale_ops srcud_ops = {
.gp_barrier = srcu_rcu_barrier, .gp_barrier = srcu_rcu_barrier,
.sync = srcu_scale_synchronize, .sync = srcu_scale_synchronize,
.exp_sync = srcu_scale_synchronize_expedited, .exp_sync = srcu_scale_synchronize_expedited,
.stats = srcu_scale_stats,
.name = "srcud" .name = "srcud"
}; };
@ -288,6 +312,11 @@ static void tasks_scale_read_unlock(int idx)
{ {
} }
static void rcu_tasks_scale_stats(void)
{
rcu_tasks_torture_stats_print(scale_type, SCALE_FLAG);
}
static struct rcu_scale_ops tasks_ops = { static struct rcu_scale_ops tasks_ops = {
.ptype = RCU_TASKS_FLAVOR, .ptype = RCU_TASKS_FLAVOR,
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
@ -300,6 +329,7 @@ static struct rcu_scale_ops tasks_ops = {
.sync = synchronize_rcu_tasks, .sync = synchronize_rcu_tasks,
.exp_sync = synchronize_rcu_tasks, .exp_sync = synchronize_rcu_tasks,
.rso_gp_kthread = get_rcu_tasks_gp_kthread, .rso_gp_kthread = get_rcu_tasks_gp_kthread,
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_scale_stats,
.name = "tasks" .name = "tasks"
}; };
@ -326,6 +356,11 @@ static void tasks_rude_scale_read_unlock(int idx)
{ {
} }
static void rcu_tasks_rude_scale_stats(void)
{
rcu_tasks_rude_torture_stats_print(scale_type, SCALE_FLAG);
}
static struct rcu_scale_ops tasks_rude_ops = { static struct rcu_scale_ops tasks_rude_ops = {
.ptype = RCU_TASKS_RUDE_FLAVOR, .ptype = RCU_TASKS_RUDE_FLAVOR,
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
@ -333,11 +368,10 @@ static struct rcu_scale_ops tasks_rude_ops = {
.readunlock = tasks_rude_scale_read_unlock, .readunlock = tasks_rude_scale_read_unlock,
.get_gp_seq = rcu_no_completed, .get_gp_seq = rcu_no_completed,
.gp_diff = rcu_seq_diff, .gp_diff = rcu_seq_diff,
.async = call_rcu_tasks_rude,
.gp_barrier = rcu_barrier_tasks_rude,
.sync = synchronize_rcu_tasks_rude, .sync = synchronize_rcu_tasks_rude,
.exp_sync = synchronize_rcu_tasks_rude, .exp_sync = synchronize_rcu_tasks_rude,
.rso_gp_kthread = get_rcu_tasks_rude_gp_kthread, .rso_gp_kthread = get_rcu_tasks_rude_gp_kthread,
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_rude_scale_stats,
.name = "tasks-rude" .name = "tasks-rude"
}; };
@ -366,6 +400,11 @@ static void tasks_trace_scale_read_unlock(int idx)
rcu_read_unlock_trace(); rcu_read_unlock_trace();
} }
static void rcu_tasks_trace_scale_stats(void)
{
rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
}
static struct rcu_scale_ops tasks_tracing_ops = { static struct rcu_scale_ops tasks_tracing_ops = {
.ptype = RCU_TASKS_FLAVOR, .ptype = RCU_TASKS_FLAVOR,
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
@ -378,6 +417,7 @@ static struct rcu_scale_ops tasks_tracing_ops = {
.sync = synchronize_rcu_tasks_trace, .sync = synchronize_rcu_tasks_trace,
.exp_sync = synchronize_rcu_tasks_trace, .exp_sync = synchronize_rcu_tasks_trace,
.rso_gp_kthread = get_rcu_tasks_trace_gp_kthread, .rso_gp_kthread = get_rcu_tasks_trace_gp_kthread,
.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
.name = "tasks-tracing" .name = "tasks-tracing"
}; };
@ -437,13 +477,53 @@ rcu_scale_reader(void *arg)
return 0; return 0;
} }
/*
* Allocate a writer_mblock structure for the specified rcu_scale_writer
* task.
*/
static struct writer_mblock *rcu_scale_alloc(long me)
{
struct llist_node *llnp;
struct writer_freelist *wflp;
struct writer_mblock *wmbp;
if (WARN_ON_ONCE(!writer_freelists))
return NULL;
wflp = &writer_freelists[me];
if (llist_empty(&wflp->ws_lhp)) {
// ->ws_lhp is private to its rcu_scale_writer task.
wmbp = container_of(llist_del_all(&wflp->ws_lhg), struct writer_mblock, wmb_node);
wflp->ws_lhp.first = &wmbp->wmb_node;
}
llnp = llist_del_first(&wflp->ws_lhp);
if (!llnp)
return NULL;
return container_of(llnp, struct writer_mblock, wmb_node);
}
/*
* Free a writer_mblock structure to its rcu_scale_writer task.
*/
static void rcu_scale_free(struct writer_mblock *wmbp)
{
struct writer_freelist *wflp;
if (!wmbp)
return;
wflp = wmbp->wmb_wfl;
llist_add(&wmbp->wmb_node, &wflp->ws_lhg);
}
/* /*
* Callback function for asynchronous grace periods from rcu_scale_writer(). * Callback function for asynchronous grace periods from rcu_scale_writer().
*/ */
static void rcu_scale_async_cb(struct rcu_head *rhp) static void rcu_scale_async_cb(struct rcu_head *rhp)
{ {
atomic_dec(this_cpu_ptr(&n_async_inflight)); struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
kfree(rhp); struct writer_freelist *wflp = wmbp->wmb_wfl;
atomic_dec(&wflp->ws_inflight);
rcu_scale_free(wmbp);
} }
/* /*
@ -456,12 +536,14 @@ rcu_scale_writer(void *arg)
int i_max; int i_max;
unsigned long jdone; unsigned long jdone;
long me = (long)arg; long me = (long)arg;
struct rcu_head *rhp = NULL; bool selfreport = false;
bool started = false, done = false, alldone = false; bool started = false, done = false, alldone = false;
u64 t; u64 t;
DEFINE_TORTURE_RANDOM(tr); DEFINE_TORTURE_RANDOM(tr);
u64 *wdp; u64 *wdp;
u64 *wdpp = writer_durations[me]; u64 *wdpp = writer_durations[me];
struct writer_freelist *wflp = &writer_freelists[me];
struct writer_mblock *wmbp = NULL;
VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started"); VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
WARN_ON(!wdpp); WARN_ON(!wdpp);
@ -493,30 +575,34 @@ rcu_scale_writer(void *arg)
jdone = jiffies + minruntime * HZ; jdone = jiffies + minruntime * HZ;
do { do {
bool gp_succeeded = false;
if (writer_holdoff) if (writer_holdoff)
udelay(writer_holdoff); udelay(writer_holdoff);
if (writer_holdoff_jiffies) if (writer_holdoff_jiffies)
schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1); schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1);
wdp = &wdpp[i]; wdp = &wdpp[i];
*wdp = ktime_get_mono_fast_ns(); *wdp = ktime_get_mono_fast_ns();
if (gp_async) { if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
retry: if (!wmbp)
if (!rhp) wmbp = rcu_scale_alloc(me);
rhp = kmalloc(sizeof(*rhp), GFP_KERNEL); if (wmbp && atomic_read(&wflp->ws_inflight) < gp_async_max) {
if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) { atomic_inc(&wflp->ws_inflight);
atomic_inc(this_cpu_ptr(&n_async_inflight)); cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
cur_ops->async(rhp, rcu_scale_async_cb); wmbp = NULL;
rhp = NULL; gp_succeeded = true;
} else if (!kthread_should_stop()) { } else if (!kthread_should_stop()) {
cur_ops->gp_barrier(); cur_ops->gp_barrier();
goto retry;
} else { } else {
kfree(rhp); /* Because we are stopping. */ rcu_scale_free(wmbp); /* Because we are stopping. */
wmbp = NULL;
} }
} else if (gp_exp) { } else if (gp_exp) {
cur_ops->exp_sync(); cur_ops->exp_sync();
gp_succeeded = true;
} else { } else {
cur_ops->sync(); cur_ops->sync();
gp_succeeded = true;
} }
t = ktime_get_mono_fast_ns(); t = ktime_get_mono_fast_ns();
*wdp = t - *wdp; *wdp = t - *wdp;
@ -526,6 +612,7 @@ rcu_scale_writer(void *arg)
started = true; started = true;
if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) { if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
done = true; done = true;
WRITE_ONCE(writer_done[me], true);
sched_set_normal(current, 0); sched_set_normal(current, 0);
pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n", pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
scale_type, SCALE_FLAG, me, MIN_MEAS); scale_type, SCALE_FLAG, me, MIN_MEAS);
@ -551,11 +638,32 @@ rcu_scale_writer(void *arg)
if (done && !alldone && if (done && !alldone &&
atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters) atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
alldone = true; alldone = true;
if (started && !alldone && i < MAX_MEAS - 1) if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
static atomic_t dumped;
int i;
if (!atomic_xchg(&dumped, 1)) {
for (i = 0; i < nrealwriters; i++) {
if (writer_done[i])
continue;
pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
sched_show_task(writer_tasks[i]);
}
if (cur_ops->stats)
cur_ops->stats();
}
}
if (!selfreport && time_after(jiffies, jdone + HZ * (70 + me))) {
pr_info("%s: Writer %ld self-report: started %d done %d/%d->%d i %d jdone %lu.\n",
__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
selfreport = true;
}
if (gp_succeeded && started && !alldone && i < MAX_MEAS - 1)
i++; i++;
rcu_scale_wait_shutdown(); rcu_scale_wait_shutdown();
} while (!torture_must_stop()); } while (!torture_must_stop());
if (gp_async) { if (gp_async && cur_ops->async) {
rcu_scale_free(wmbp);
cur_ops->gp_barrier(); cur_ops->gp_barrier();
} }
writer_n_durations[me] = i_max + 1; writer_n_durations[me] = i_max + 1;
@ -713,6 +821,7 @@ kfree_scale_cleanup(void)
torture_stop_kthread(kfree_scale_thread, torture_stop_kthread(kfree_scale_thread,
kfree_reader_tasks[i]); kfree_reader_tasks[i]);
kfree(kfree_reader_tasks); kfree(kfree_reader_tasks);
kfree_reader_tasks = NULL;
} }
torture_cleanup_end(); torture_cleanup_end();
@ -881,6 +990,7 @@ rcu_scale_cleanup(void)
torture_stop_kthread(rcu_scale_reader, torture_stop_kthread(rcu_scale_reader,
reader_tasks[i]); reader_tasks[i]);
kfree(reader_tasks); kfree(reader_tasks);
reader_tasks = NULL;
} }
if (writer_tasks) { if (writer_tasks) {
@ -919,10 +1029,33 @@ rcu_scale_cleanup(void)
schedule_timeout_uninterruptible(1); schedule_timeout_uninterruptible(1);
} }
kfree(writer_durations[i]); kfree(writer_durations[i]);
if (writer_freelists) {
int ctr = 0;
struct llist_node *llnp;
struct writer_freelist *wflp = &writer_freelists[i];
if (wflp->ws_mblocks) {
llist_for_each(llnp, wflp->ws_lhg.first)
ctr++;
llist_for_each(llnp, wflp->ws_lhp.first)
ctr++;
WARN_ONCE(ctr != gp_async_max,
"%s: ctr = %d gp_async_max = %d\n",
__func__, ctr, gp_async_max);
kfree(wflp->ws_mblocks);
}
}
} }
kfree(writer_tasks); kfree(writer_tasks);
writer_tasks = NULL;
kfree(writer_durations); kfree(writer_durations);
writer_durations = NULL;
kfree(writer_n_durations); kfree(writer_n_durations);
writer_n_durations = NULL;
kfree(writer_done);
writer_done = NULL;
kfree(writer_freelists);
writer_freelists = NULL;
} }
/* Do torture-type-specific cleanup operations. */ /* Do torture-type-specific cleanup operations. */
@ -949,8 +1082,9 @@ rcu_scale_shutdown(void *arg)
static int __init static int __init
rcu_scale_init(void) rcu_scale_init(void)
{ {
long i;
int firsterr = 0; int firsterr = 0;
long i;
long j;
static struct rcu_scale_ops *scale_ops[] = { static struct rcu_scale_ops *scale_ops[] = {
&rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS &rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS
}; };
@ -1017,14 +1151,22 @@ rcu_scale_init(void)
} }
while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders) while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
schedule_timeout_uninterruptible(1); schedule_timeout_uninterruptible(1);
writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]), writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL);
GFP_KERNEL); writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
GFP_KERNEL); writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
writer_n_durations = if (gp_async) {
kcalloc(nrealwriters, sizeof(*writer_n_durations), if (gp_async_max <= 0) {
GFP_KERNEL); pr_warn("%s: gp_async_max = %d must be greater than zero.\n",
if (!writer_tasks || !writer_durations || !writer_n_durations) { __func__, gp_async_max);
WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
firsterr = -EINVAL;
goto unwind;
}
writer_freelists = kcalloc(nrealwriters, sizeof(writer_freelists[0]), GFP_KERNEL);
}
if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done ||
(gp_async && !writer_freelists)) {
SCALEOUT_ERRSTRING("out of memory"); SCALEOUT_ERRSTRING("out of memory");
firsterr = -ENOMEM; firsterr = -ENOMEM;
goto unwind; goto unwind;
@ -1037,6 +1179,24 @@ rcu_scale_init(void)
firsterr = -ENOMEM; firsterr = -ENOMEM;
goto unwind; goto unwind;
} }
if (writer_freelists) {
struct writer_freelist *wflp = &writer_freelists[i];
init_llist_head(&wflp->ws_lhg);
init_llist_head(&wflp->ws_lhp);
wflp->ws_mblocks = kcalloc(gp_async_max, sizeof(wflp->ws_mblocks[0]),
GFP_KERNEL);
if (!wflp->ws_mblocks) {
firsterr = -ENOMEM;
goto unwind;
}
for (j = 0; j < gp_async_max; j++) {
struct writer_mblock *wmbp = &wflp->ws_mblocks[j];
wmbp->wmb_wfl = wflp;
llist_add(&wmbp->wmb_node, &wflp->ws_lhp);
}
}
firsterr = torture_create_kthread(rcu_scale_writer, (void *)i, firsterr = torture_create_kthread(rcu_scale_writer, (void *)i,
writer_tasks[i]); writer_tasks[i]);
if (torture_init_error(firsterr)) if (torture_init_error(firsterr))

View File

@ -115,6 +115,7 @@ torture_param(int, stall_cpu_holdoff, 10, "Time to wait before starting stall (s
torture_param(bool, stall_no_softlockup, false, "Avoid softlockup warning during cpu stall."); torture_param(bool, stall_no_softlockup, false, "Avoid softlockup warning during cpu stall.");
torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling."); torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling.");
torture_param(int, stall_cpu_block, 0, "Sleep while stalling."); torture_param(int, stall_cpu_block, 0, "Sleep while stalling.");
torture_param(int, stall_cpu_repeat, 0, "Number of additional stalls after the first one.");
torture_param(int, stall_gp_kthread, 0, "Grace-period kthread stall duration (s)."); torture_param(int, stall_gp_kthread, 0, "Grace-period kthread stall duration (s).");
torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s"); torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s");
torture_param(int, stutter, 5, "Number of seconds to run/halt test"); torture_param(int, stutter, 5, "Number of seconds to run/halt test");
@ -366,8 +367,6 @@ struct rcu_torture_ops {
bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2); bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2);
unsigned long (*get_gp_state)(void); unsigned long (*get_gp_state)(void);
void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp); void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp);
unsigned long (*get_gp_completed)(void);
void (*get_gp_completed_full)(struct rcu_gp_oldstate *rgosp);
unsigned long (*start_gp_poll)(void); unsigned long (*start_gp_poll)(void);
void (*start_gp_poll_full)(struct rcu_gp_oldstate *rgosp); void (*start_gp_poll_full)(struct rcu_gp_oldstate *rgosp);
bool (*poll_gp_state)(unsigned long oldstate); bool (*poll_gp_state)(unsigned long oldstate);
@ -375,6 +374,8 @@ struct rcu_torture_ops {
bool (*poll_need_2gp)(bool poll, bool poll_full); bool (*poll_need_2gp)(bool poll, bool poll_full);
void (*cond_sync)(unsigned long oldstate); void (*cond_sync)(unsigned long oldstate);
void (*cond_sync_full)(struct rcu_gp_oldstate *rgosp); void (*cond_sync_full)(struct rcu_gp_oldstate *rgosp);
int poll_active;
int poll_active_full;
call_rcu_func_t call; call_rcu_func_t call;
void (*cb_barrier)(void); void (*cb_barrier)(void);
void (*fqs)(void); void (*fqs)(void);
@ -553,8 +554,6 @@ static struct rcu_torture_ops rcu_ops = {
.get_comp_state_full = get_completed_synchronize_rcu_full, .get_comp_state_full = get_completed_synchronize_rcu_full,
.get_gp_state = get_state_synchronize_rcu, .get_gp_state = get_state_synchronize_rcu,
.get_gp_state_full = get_state_synchronize_rcu_full, .get_gp_state_full = get_state_synchronize_rcu_full,
.get_gp_completed = get_completed_synchronize_rcu,
.get_gp_completed_full = get_completed_synchronize_rcu_full,
.start_gp_poll = start_poll_synchronize_rcu, .start_gp_poll = start_poll_synchronize_rcu,
.start_gp_poll_full = start_poll_synchronize_rcu_full, .start_gp_poll_full = start_poll_synchronize_rcu_full,
.poll_gp_state = poll_state_synchronize_rcu, .poll_gp_state = poll_state_synchronize_rcu,
@ -562,6 +561,8 @@ static struct rcu_torture_ops rcu_ops = {
.poll_need_2gp = rcu_poll_need_2gp, .poll_need_2gp = rcu_poll_need_2gp,
.cond_sync = cond_synchronize_rcu, .cond_sync = cond_synchronize_rcu,
.cond_sync_full = cond_synchronize_rcu_full, .cond_sync_full = cond_synchronize_rcu_full,
.poll_active = NUM_ACTIVE_RCU_POLL_OLDSTATE,
.poll_active_full = NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE,
.get_gp_state_exp = get_state_synchronize_rcu, .get_gp_state_exp = get_state_synchronize_rcu,
.start_gp_poll_exp = start_poll_synchronize_rcu_expedited, .start_gp_poll_exp = start_poll_synchronize_rcu_expedited,
.start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full, .start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full,
@ -740,9 +741,12 @@ static struct rcu_torture_ops srcu_ops = {
.deferred_free = srcu_torture_deferred_free, .deferred_free = srcu_torture_deferred_free,
.sync = srcu_torture_synchronize, .sync = srcu_torture_synchronize,
.exp_sync = srcu_torture_synchronize_expedited, .exp_sync = srcu_torture_synchronize_expedited,
.same_gp_state = same_state_synchronize_srcu,
.get_comp_state = get_completed_synchronize_srcu,
.get_gp_state = srcu_torture_get_gp_state, .get_gp_state = srcu_torture_get_gp_state,
.start_gp_poll = srcu_torture_start_gp_poll, .start_gp_poll = srcu_torture_start_gp_poll,
.poll_gp_state = srcu_torture_poll_gp_state, .poll_gp_state = srcu_torture_poll_gp_state,
.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
.call = srcu_torture_call, .call = srcu_torture_call,
.cb_barrier = srcu_torture_barrier, .cb_barrier = srcu_torture_barrier,
.stats = srcu_torture_stats, .stats = srcu_torture_stats,
@ -780,9 +784,12 @@ static struct rcu_torture_ops srcud_ops = {
.deferred_free = srcu_torture_deferred_free, .deferred_free = srcu_torture_deferred_free,
.sync = srcu_torture_synchronize, .sync = srcu_torture_synchronize,
.exp_sync = srcu_torture_synchronize_expedited, .exp_sync = srcu_torture_synchronize_expedited,
.same_gp_state = same_state_synchronize_srcu,
.get_comp_state = get_completed_synchronize_srcu,
.get_gp_state = srcu_torture_get_gp_state, .get_gp_state = srcu_torture_get_gp_state,
.start_gp_poll = srcu_torture_start_gp_poll, .start_gp_poll = srcu_torture_start_gp_poll,
.poll_gp_state = srcu_torture_poll_gp_state, .poll_gp_state = srcu_torture_poll_gp_state,
.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
.call = srcu_torture_call, .call = srcu_torture_call,
.cb_barrier = srcu_torture_barrier, .cb_barrier = srcu_torture_barrier,
.stats = srcu_torture_stats, .stats = srcu_torture_stats,
@ -915,11 +922,6 @@ static struct rcu_torture_ops tasks_ops = {
* Definitions for rude RCU-tasks torture testing. * Definitions for rude RCU-tasks torture testing.
*/ */
static void rcu_tasks_rude_torture_deferred_free(struct rcu_torture *p)
{
call_rcu_tasks_rude(&p->rtort_rcu, rcu_torture_cb);
}
static struct rcu_torture_ops tasks_rude_ops = { static struct rcu_torture_ops tasks_rude_ops = {
.ttype = RCU_TASKS_RUDE_FLAVOR, .ttype = RCU_TASKS_RUDE_FLAVOR,
.init = rcu_sync_torture_init, .init = rcu_sync_torture_init,
@ -927,11 +929,8 @@ static struct rcu_torture_ops tasks_rude_ops = {
.read_delay = rcu_read_delay, /* just reuse rcu's version. */ .read_delay = rcu_read_delay, /* just reuse rcu's version. */
.readunlock = rcu_torture_read_unlock_trivial, .readunlock = rcu_torture_read_unlock_trivial,
.get_gp_seq = rcu_no_completed, .get_gp_seq = rcu_no_completed,
.deferred_free = rcu_tasks_rude_torture_deferred_free,
.sync = synchronize_rcu_tasks_rude, .sync = synchronize_rcu_tasks_rude,
.exp_sync = synchronize_rcu_tasks_rude, .exp_sync = synchronize_rcu_tasks_rude,
.call = call_rcu_tasks_rude,
.cb_barrier = rcu_barrier_tasks_rude,
.gp_kthread_dbg = show_rcu_tasks_rude_gp_kthread, .gp_kthread_dbg = show_rcu_tasks_rude_gp_kthread,
.get_gp_data = rcu_tasks_rude_get_gp_data, .get_gp_data = rcu_tasks_rude_get_gp_data,
.cbflood_max = 50000, .cbflood_max = 50000,
@ -1318,6 +1317,7 @@ static void rcu_torture_write_types(void)
} else if (gp_sync && !cur_ops->sync) { } else if (gp_sync && !cur_ops->sync) {
pr_alert("%s: gp_sync without primitives.\n", __func__); pr_alert("%s: gp_sync without primitives.\n", __func__);
} }
pr_alert("%s: Testing %d update types.\n", __func__, nsynctypes);
} }
/* /*
@ -1374,17 +1374,20 @@ rcu_torture_writer(void *arg)
int i; int i;
int idx; int idx;
int oldnice = task_nice(current); int oldnice = task_nice(current);
struct rcu_gp_oldstate rgo[NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE]; struct rcu_gp_oldstate *rgo = NULL;
int rgo_size = 0;
struct rcu_torture *rp; struct rcu_torture *rp;
struct rcu_torture *old_rp; struct rcu_torture *old_rp;
static DEFINE_TORTURE_RANDOM(rand); static DEFINE_TORTURE_RANDOM(rand);
unsigned long stallsdone = jiffies; unsigned long stallsdone = jiffies;
bool stutter_waited; bool stutter_waited;
unsigned long ulo[NUM_ACTIVE_RCU_POLL_OLDSTATE]; unsigned long *ulo = NULL;
int ulo_size = 0;
// If a new stall test is added, this must be adjusted. // If a new stall test is added, this must be adjusted.
if (stall_cpu_holdoff + stall_gp_kthread + stall_cpu) if (stall_cpu_holdoff + stall_gp_kthread + stall_cpu)
stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) * HZ; stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
HZ * (stall_cpu_repeat + 1);
VERBOSE_TOROUT_STRING("rcu_torture_writer task started"); VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
if (!can_expedite) if (!can_expedite)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
@ -1401,6 +1404,16 @@ rcu_torture_writer(void *arg)
torture_kthread_stopping("rcu_torture_writer"); torture_kthread_stopping("rcu_torture_writer");
return 0; return 0;
} }
if (cur_ops->poll_active > 0) {
ulo = kzalloc(cur_ops->poll_active * sizeof(ulo[0]), GFP_KERNEL);
if (!WARN_ON(!ulo))
ulo_size = cur_ops->poll_active;
}
if (cur_ops->poll_active_full > 0) {
rgo = kzalloc(cur_ops->poll_active_full * sizeof(rgo[0]), GFP_KERNEL);
if (!WARN_ON(!rgo))
rgo_size = cur_ops->poll_active_full;
}
do { do {
rcu_torture_writer_state = RTWS_FIXED_DELAY; rcu_torture_writer_state = RTWS_FIXED_DELAY;
@ -1437,8 +1450,8 @@ rcu_torture_writer(void *arg)
rcu_torture_writer_state_getname(), rcu_torture_writer_state_getname(),
rcu_torture_writer_state, rcu_torture_writer_state,
cookie, cur_ops->get_gp_state()); cookie, cur_ops->get_gp_state());
if (cur_ops->get_gp_completed) { if (cur_ops->get_comp_state) {
cookie = cur_ops->get_gp_completed(); cookie = cur_ops->get_comp_state();
WARN_ON_ONCE(!cur_ops->poll_gp_state(cookie)); WARN_ON_ONCE(!cur_ops->poll_gp_state(cookie));
} }
cur_ops->readunlock(idx); cur_ops->readunlock(idx);
@ -1452,8 +1465,8 @@ rcu_torture_writer(void *arg)
rcu_torture_writer_state_getname(), rcu_torture_writer_state_getname(),
rcu_torture_writer_state, rcu_torture_writer_state,
cpumask_pr_args(cpu_online_mask)); cpumask_pr_args(cpu_online_mask));
if (cur_ops->get_gp_completed_full) { if (cur_ops->get_comp_state_full) {
cur_ops->get_gp_completed_full(&cookie_full); cur_ops->get_comp_state_full(&cookie_full);
WARN_ON_ONCE(!cur_ops->poll_gp_state_full(&cookie_full)); WARN_ON_ONCE(!cur_ops->poll_gp_state_full(&cookie_full));
} }
cur_ops->readunlock(idx); cur_ops->readunlock(idx);
@ -1502,19 +1515,19 @@ rcu_torture_writer(void *arg)
break; break;
case RTWS_POLL_GET: case RTWS_POLL_GET:
rcu_torture_writer_state = RTWS_POLL_GET; rcu_torture_writer_state = RTWS_POLL_GET;
for (i = 0; i < ARRAY_SIZE(ulo); i++) for (i = 0; i < ulo_size; i++)
ulo[i] = cur_ops->get_comp_state(); ulo[i] = cur_ops->get_comp_state();
gp_snap = cur_ops->start_gp_poll(); gp_snap = cur_ops->start_gp_poll();
rcu_torture_writer_state = RTWS_POLL_WAIT; rcu_torture_writer_state = RTWS_POLL_WAIT;
while (!cur_ops->poll_gp_state(gp_snap)) { while (!cur_ops->poll_gp_state(gp_snap)) {
gp_snap1 = cur_ops->get_gp_state(); gp_snap1 = cur_ops->get_gp_state();
for (i = 0; i < ARRAY_SIZE(ulo); i++) for (i = 0; i < ulo_size; i++)
if (cur_ops->poll_gp_state(ulo[i]) || if (cur_ops->poll_gp_state(ulo[i]) ||
cur_ops->same_gp_state(ulo[i], gp_snap1)) { cur_ops->same_gp_state(ulo[i], gp_snap1)) {
ulo[i] = gp_snap1; ulo[i] = gp_snap1;
break; break;
} }
WARN_ON_ONCE(i >= ARRAY_SIZE(ulo)); WARN_ON_ONCE(ulo_size > 0 && i >= ulo_size);
torture_hrtimeout_jiffies(torture_random(&rand) % 16, torture_hrtimeout_jiffies(torture_random(&rand) % 16,
&rand); &rand);
} }
@ -1522,20 +1535,20 @@ rcu_torture_writer(void *arg)
break; break;
case RTWS_POLL_GET_FULL: case RTWS_POLL_GET_FULL:
rcu_torture_writer_state = RTWS_POLL_GET_FULL; rcu_torture_writer_state = RTWS_POLL_GET_FULL;
for (i = 0; i < ARRAY_SIZE(rgo); i++) for (i = 0; i < rgo_size; i++)
cur_ops->get_comp_state_full(&rgo[i]); cur_ops->get_comp_state_full(&rgo[i]);
cur_ops->start_gp_poll_full(&gp_snap_full); cur_ops->start_gp_poll_full(&gp_snap_full);
rcu_torture_writer_state = RTWS_POLL_WAIT_FULL; rcu_torture_writer_state = RTWS_POLL_WAIT_FULL;
while (!cur_ops->poll_gp_state_full(&gp_snap_full)) { while (!cur_ops->poll_gp_state_full(&gp_snap_full)) {
cur_ops->get_gp_state_full(&gp_snap1_full); cur_ops->get_gp_state_full(&gp_snap1_full);
for (i = 0; i < ARRAY_SIZE(rgo); i++) for (i = 0; i < rgo_size; i++)
if (cur_ops->poll_gp_state_full(&rgo[i]) || if (cur_ops->poll_gp_state_full(&rgo[i]) ||
cur_ops->same_gp_state_full(&rgo[i], cur_ops->same_gp_state_full(&rgo[i],
&gp_snap1_full)) { &gp_snap1_full)) {
rgo[i] = gp_snap1_full; rgo[i] = gp_snap1_full;
break; break;
} }
WARN_ON_ONCE(i >= ARRAY_SIZE(rgo)); WARN_ON_ONCE(rgo_size > 0 && i >= rgo_size);
torture_hrtimeout_jiffies(torture_random(&rand) % 16, torture_hrtimeout_jiffies(torture_random(&rand) % 16,
&rand); &rand);
} }
@ -1617,6 +1630,8 @@ rcu_torture_writer(void *arg)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
" Dynamic grace-period expediting was disabled.\n", " Dynamic grace-period expediting was disabled.\n",
torture_type); torture_type);
kfree(ulo);
kfree(rgo);
rcu_torture_writer_state = RTWS_STOPPING; rcu_torture_writer_state = RTWS_STOPPING;
torture_kthread_stopping("rcu_torture_writer"); torture_kthread_stopping("rcu_torture_writer");
return 0; return 0;
@ -2370,7 +2385,7 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
"test_boost=%d/%d test_boost_interval=%d " "test_boost=%d/%d test_boost_interval=%d "
"test_boost_duration=%d shutdown_secs=%d " "test_boost_duration=%d shutdown_secs=%d "
"stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d " "stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d "
"stall_cpu_block=%d " "stall_cpu_block=%d stall_cpu_repeat=%d "
"n_barrier_cbs=%d " "n_barrier_cbs=%d "
"onoff_interval=%d onoff_holdoff=%d " "onoff_interval=%d onoff_holdoff=%d "
"read_exit_delay=%d read_exit_burst=%d " "read_exit_delay=%d read_exit_burst=%d "
@ -2382,7 +2397,7 @@ rcu_torture_print_module_parms(struct rcu_torture_ops *cur_ops, const char *tag)
test_boost, cur_ops->can_boost, test_boost, cur_ops->can_boost,
test_boost_interval, test_boost_duration, shutdown_secs, test_boost_interval, test_boost_duration, shutdown_secs,
stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff, stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff,
stall_cpu_block, stall_cpu_block, stall_cpu_repeat,
n_barrier_cbs, n_barrier_cbs,
onoff_interval, onoff_holdoff, onoff_interval, onoff_holdoff,
read_exit_delay, read_exit_burst, read_exit_delay, read_exit_burst,
@ -2460,19 +2475,11 @@ static struct notifier_block rcu_torture_stall_block = {
* induces a CPU stall for the time specified by stall_cpu. If a new * induces a CPU stall for the time specified by stall_cpu. If a new
* stall test is added, stallsdone in rcu_torture_writer() must be adjusted. * stall test is added, stallsdone in rcu_torture_writer() must be adjusted.
*/ */
static int rcu_torture_stall(void *args) static void rcu_torture_stall_one(int rep, int irqsoff)
{ {
int idx; int idx;
int ret;
unsigned long stop_at; unsigned long stop_at;
VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
if (rcu_cpu_stall_notifiers) {
ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
if (ret)
pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
}
if (stall_cpu_holdoff > 0) { if (stall_cpu_holdoff > 0) {
VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff"); VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff");
schedule_timeout_interruptible(stall_cpu_holdoff * HZ); schedule_timeout_interruptible(stall_cpu_holdoff * HZ);
@ -2492,12 +2499,12 @@ static int rcu_torture_stall(void *args)
stop_at = ktime_get_seconds() + stall_cpu; stop_at = ktime_get_seconds() + stall_cpu;
/* RCU CPU stall is expected behavior in following code. */ /* RCU CPU stall is expected behavior in following code. */
idx = cur_ops->readlock(); idx = cur_ops->readlock();
if (stall_cpu_irqsoff) if (irqsoff)
local_irq_disable(); local_irq_disable();
else if (!stall_cpu_block) else if (!stall_cpu_block)
preempt_disable(); preempt_disable();
pr_alert("%s start on CPU %d.\n", pr_alert("%s start stall episode %d on CPU %d.\n",
__func__, raw_smp_processor_id()); __func__, rep + 1, raw_smp_processor_id());
while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) && while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) &&
!kthread_should_stop()) !kthread_should_stop())
if (stall_cpu_block) { if (stall_cpu_block) {
@ -2509,12 +2516,42 @@ static int rcu_torture_stall(void *args)
} else if (stall_no_softlockup) { } else if (stall_no_softlockup) {
touch_softlockup_watchdog(); touch_softlockup_watchdog();
} }
if (stall_cpu_irqsoff) if (irqsoff)
local_irq_enable(); local_irq_enable();
else if (!stall_cpu_block) else if (!stall_cpu_block)
preempt_enable(); preempt_enable();
cur_ops->readunlock(idx); cur_ops->readunlock(idx);
} }
}
/*
* CPU-stall kthread. Invokes rcu_torture_stall_one() once, and then as many
* additional times as specified by the stall_cpu_repeat module parameter.
* Note that stall_cpu_irqsoff is ignored on the second and subsequent
* stall.
*/
static int rcu_torture_stall(void *args)
{
int i;
int repeat = stall_cpu_repeat;
int ret;
VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
if (repeat < 0) {
repeat = 0;
WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
}
if (rcu_cpu_stall_notifiers) {
ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
if (ret)
pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
}
for (i = 0; i <= repeat; i++) {
if (kthread_should_stop())
break;
rcu_torture_stall_one(i, i == 0 ? stall_cpu_irqsoff : 0);
}
pr_alert("%s end.\n", __func__); pr_alert("%s end.\n", __func__);
if (rcu_cpu_stall_notifiers && !ret) { if (rcu_cpu_stall_notifiers && !ret) {
ret = rcu_stall_chain_notifier_unregister(&rcu_torture_stall_block); ret = rcu_stall_chain_notifier_unregister(&rcu_torture_stall_block);

View File

@ -28,6 +28,7 @@
#include <linux/rcupdate_trace.h> #include <linux/rcupdate_trace.h>
#include <linux/reboot.h> #include <linux/reboot.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/seq_buf.h>
#include <linux/spinlock.h> #include <linux/spinlock.h>
#include <linux/smp.h> #include <linux/smp.h>
#include <linux/stat.h> #include <linux/stat.h>
@ -134,7 +135,7 @@ struct ref_scale_ops {
const char *name; const char *name;
}; };
static struct ref_scale_ops *cur_ops; static const struct ref_scale_ops *cur_ops;
static void un_delay(const int udl, const int ndl) static void un_delay(const int udl, const int ndl)
{ {
@ -170,7 +171,7 @@ static bool rcu_sync_scale_init(void)
return true; return true;
} }
static struct ref_scale_ops rcu_ops = { static const struct ref_scale_ops rcu_ops = {
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
.readsection = ref_rcu_read_section, .readsection = ref_rcu_read_section,
.delaysection = ref_rcu_delay_section, .delaysection = ref_rcu_delay_section,
@ -204,7 +205,7 @@ static void srcu_ref_scale_delay_section(const int nloops, const int udl, const
} }
} }
static struct ref_scale_ops srcu_ops = { static const struct ref_scale_ops srcu_ops = {
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
.readsection = srcu_ref_scale_read_section, .readsection = srcu_ref_scale_read_section,
.delaysection = srcu_ref_scale_delay_section, .delaysection = srcu_ref_scale_delay_section,
@ -231,7 +232,7 @@ static void rcu_tasks_ref_scale_delay_section(const int nloops, const int udl, c
un_delay(udl, ndl); un_delay(udl, ndl);
} }
static struct ref_scale_ops rcu_tasks_ops = { static const struct ref_scale_ops rcu_tasks_ops = {
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
.readsection = rcu_tasks_ref_scale_read_section, .readsection = rcu_tasks_ref_scale_read_section,
.delaysection = rcu_tasks_ref_scale_delay_section, .delaysection = rcu_tasks_ref_scale_delay_section,
@ -270,7 +271,7 @@ static void rcu_trace_ref_scale_delay_section(const int nloops, const int udl, c
} }
} }
static struct ref_scale_ops rcu_trace_ops = { static const struct ref_scale_ops rcu_trace_ops = {
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
.readsection = rcu_trace_ref_scale_read_section, .readsection = rcu_trace_ref_scale_read_section,
.delaysection = rcu_trace_ref_scale_delay_section, .delaysection = rcu_trace_ref_scale_delay_section,
@ -309,7 +310,7 @@ static void ref_refcnt_delay_section(const int nloops, const int udl, const int
} }
} }
static struct ref_scale_ops refcnt_ops = { static const struct ref_scale_ops refcnt_ops = {
.init = rcu_sync_scale_init, .init = rcu_sync_scale_init,
.readsection = ref_refcnt_section, .readsection = ref_refcnt_section,
.delaysection = ref_refcnt_delay_section, .delaysection = ref_refcnt_delay_section,
@ -346,7 +347,7 @@ static void ref_rwlock_delay_section(const int nloops, const int udl, const int
} }
} }
static struct ref_scale_ops rwlock_ops = { static const struct ref_scale_ops rwlock_ops = {
.init = ref_rwlock_init, .init = ref_rwlock_init,
.readsection = ref_rwlock_section, .readsection = ref_rwlock_section,
.delaysection = ref_rwlock_delay_section, .delaysection = ref_rwlock_delay_section,
@ -383,7 +384,7 @@ static void ref_rwsem_delay_section(const int nloops, const int udl, const int n
} }
} }
static struct ref_scale_ops rwsem_ops = { static const struct ref_scale_ops rwsem_ops = {
.init = ref_rwsem_init, .init = ref_rwsem_init,
.readsection = ref_rwsem_section, .readsection = ref_rwsem_section,
.delaysection = ref_rwsem_delay_section, .delaysection = ref_rwsem_delay_section,
@ -418,7 +419,7 @@ static void ref_lock_delay_section(const int nloops, const int udl, const int nd
preempt_enable(); preempt_enable();
} }
static struct ref_scale_ops lock_ops = { static const struct ref_scale_ops lock_ops = {
.readsection = ref_lock_section, .readsection = ref_lock_section,
.delaysection = ref_lock_delay_section, .delaysection = ref_lock_delay_section,
.name = "lock" .name = "lock"
@ -453,7 +454,7 @@ static void ref_lock_irq_delay_section(const int nloops, const int udl, const in
preempt_enable(); preempt_enable();
} }
static struct ref_scale_ops lock_irq_ops = { static const struct ref_scale_ops lock_irq_ops = {
.readsection = ref_lock_irq_section, .readsection = ref_lock_irq_section,
.delaysection = ref_lock_irq_delay_section, .delaysection = ref_lock_irq_delay_section,
.name = "lock-irq" .name = "lock-irq"
@ -489,7 +490,7 @@ static void ref_acqrel_delay_section(const int nloops, const int udl, const int
preempt_enable(); preempt_enable();
} }
static struct ref_scale_ops acqrel_ops = { static const struct ref_scale_ops acqrel_ops = {
.readsection = ref_acqrel_section, .readsection = ref_acqrel_section,
.delaysection = ref_acqrel_delay_section, .delaysection = ref_acqrel_delay_section,
.name = "acqrel" .name = "acqrel"
@ -523,7 +524,7 @@ static void ref_clock_delay_section(const int nloops, const int udl, const int n
stopopts = x; stopopts = x;
} }
static struct ref_scale_ops clock_ops = { static const struct ref_scale_ops clock_ops = {
.readsection = ref_clock_section, .readsection = ref_clock_section,
.delaysection = ref_clock_delay_section, .delaysection = ref_clock_delay_section,
.name = "clock" .name = "clock"
@ -555,7 +556,7 @@ static void ref_jiffies_delay_section(const int nloops, const int udl, const int
stopopts = x; stopopts = x;
} }
static struct ref_scale_ops jiffies_ops = { static const struct ref_scale_ops jiffies_ops = {
.readsection = ref_jiffies_section, .readsection = ref_jiffies_section,
.delaysection = ref_jiffies_delay_section, .delaysection = ref_jiffies_delay_section,
.name = "jiffies" .name = "jiffies"
@ -705,9 +706,9 @@ static void refscale_typesafe_ctor(void *rtsp_in)
preempt_enable(); preempt_enable();
} }
static struct ref_scale_ops typesafe_ref_ops; static const struct ref_scale_ops typesafe_ref_ops;
static struct ref_scale_ops typesafe_lock_ops; static const struct ref_scale_ops typesafe_lock_ops;
static struct ref_scale_ops typesafe_seqlock_ops; static const struct ref_scale_ops typesafe_seqlock_ops;
// Initialize for a typesafe test. // Initialize for a typesafe test.
static bool typesafe_init(void) static bool typesafe_init(void)
@ -768,7 +769,7 @@ static void typesafe_cleanup(void)
} }
// The typesafe_init() function distinguishes these structures by address. // The typesafe_init() function distinguishes these structures by address.
static struct ref_scale_ops typesafe_ref_ops = { static const struct ref_scale_ops typesafe_ref_ops = {
.init = typesafe_init, .init = typesafe_init,
.cleanup = typesafe_cleanup, .cleanup = typesafe_cleanup,
.readsection = typesafe_read_section, .readsection = typesafe_read_section,
@ -776,7 +777,7 @@ static struct ref_scale_ops typesafe_ref_ops = {
.name = "typesafe_ref" .name = "typesafe_ref"
}; };
static struct ref_scale_ops typesafe_lock_ops = { static const struct ref_scale_ops typesafe_lock_ops = {
.init = typesafe_init, .init = typesafe_init,
.cleanup = typesafe_cleanup, .cleanup = typesafe_cleanup,
.readsection = typesafe_read_section, .readsection = typesafe_read_section,
@ -784,7 +785,7 @@ static struct ref_scale_ops typesafe_lock_ops = {
.name = "typesafe_lock" .name = "typesafe_lock"
}; };
static struct ref_scale_ops typesafe_seqlock_ops = { static const struct ref_scale_ops typesafe_seqlock_ops = {
.init = typesafe_init, .init = typesafe_init,
.cleanup = typesafe_cleanup, .cleanup = typesafe_cleanup,
.readsection = typesafe_read_section, .readsection = typesafe_read_section,
@ -891,32 +892,34 @@ static u64 process_durations(int n)
{ {
int i; int i;
struct reader_task *rt; struct reader_task *rt;
char buf1[64]; struct seq_buf s;
char *buf; char *buf;
u64 sum = 0; u64 sum = 0;
buf = kmalloc(800 + 64, GFP_KERNEL); buf = kmalloc(800 + 64, GFP_KERNEL);
if (!buf) if (!buf)
return 0; return 0;
buf[0] = 0; seq_buf_init(&s, buf, 800 + 64);
sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
exp_idx); seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
exp_idx);
for (i = 0; i < n && !torture_must_stop(); i++) { for (i = 0; i < n && !torture_must_stop(); i++) {
rt = &(reader_tasks[i]); rt = &(reader_tasks[i]);
sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns);
if (i % 5 == 0) if (i % 5 == 0)
strcat(buf, "\n"); seq_buf_putc(&s, '\n');
if (strlen(buf) >= 800) {
pr_alert("%s", buf); if (seq_buf_used(&s) >= 800) {
buf[0] = 0; pr_alert("%s", seq_buf_str(&s));
seq_buf_clear(&s);
} }
strcat(buf, buf1);
seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns);
sum += rt->last_duration_ns; sum += rt->last_duration_ns;
} }
pr_alert("%s\n", buf); pr_alert("%s\n", seq_buf_str(&s));
kfree(buf); kfree(buf);
return sum; return sum;
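The hunk above replaces the hand-rolled sprintf()/strcat()/strlen() bookkeeping with the kernel's seq_buf API, which tracks the write position and the buffer bound itself. As a rough userspace illustration of the same bounded-append pattern (the sbuf_* names below are invented for this sketch and are not kernel API), something like the following shows why the conversion removes the overflow-prone manual length checks:

#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for a bounded string builder. */
struct sbuf {
	char *buf;
	size_t size;
	size_t used;
};

static void sbuf_init(struct sbuf *s, char *buf, size_t size)
{
	s->buf = buf;
	s->size = size;
	s->used = 0;
	buf[0] = '\0';
}

/* Append formatted text, truncating silently at the end of the buffer. */
static void sbuf_printf(struct sbuf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (s->used >= s->size)
		return;
	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->used, s->size - s->used, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	s->used = (s->used + (size_t)n < s->size) ? s->used + (size_t)n : s->size;
}

int main(void)
{
	char storage[864];	/* mirrors the 800 + 64 allocation above */
	struct sbuf s;
	int i;

	sbuf_init(&s, storage, sizeof(storage));
	sbuf_printf(&s, "Experiment #%d", 1);
	for (i = 0; i < 3; i++)
		sbuf_printf(&s, " %d:%llu", i, (unsigned long long)i * 1000);
	puts(s.buf);	/* no manual strlen()/strcat() tracking needed */
	return 0;
}

In the hunk, seq_buf_used() and seq_buf_clear() play the role of the "used" counter here, flushing with pr_alert() whenever the 800-byte threshold is crossed.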
@ -1023,7 +1026,7 @@ static int main_func(void *arg)
} }
static void static void
ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag) ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag)
{ {
pr_alert("%s" SCALE_FLAG pr_alert("%s" SCALE_FLAG
"--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag, "--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag,
@ -1078,7 +1081,7 @@ ref_scale_init(void)
{ {
long i; long i;
int firsterr = 0; int firsterr = 0;
static struct ref_scale_ops *scale_ops[] = { static const struct ref_scale_ops *scale_ops[] = {
&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops, &rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops, &rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops,
&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops, &typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,

View File

@ -137,6 +137,7 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
sdp->srcu_cblist_invoking = false; sdp->srcu_cblist_invoking = false;
sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq; sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq; sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head;
sdp->mynode = NULL; sdp->mynode = NULL;
sdp->cpu = cpu; sdp->cpu = cpu;
INIT_WORK(&sdp->work, srcu_invoke_callbacks); INIT_WORK(&sdp->work, srcu_invoke_callbacks);
@ -247,7 +248,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
mutex_init(&ssp->srcu_sup->srcu_cb_mutex); mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
mutex_init(&ssp->srcu_sup->srcu_gp_mutex); mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
ssp->srcu_idx = 0; ssp->srcu_idx = 0;
ssp->srcu_sup->srcu_gp_seq = 0; ssp->srcu_sup->srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL;
ssp->srcu_sup->srcu_barrier_seq = 0; ssp->srcu_sup->srcu_barrier_seq = 0;
mutex_init(&ssp->srcu_sup->srcu_barrier_mutex); mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0); atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
@ -258,7 +259,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
if (!ssp->sda) if (!ssp->sda)
goto err_free_sup; goto err_free_sup;
init_srcu_struct_data(ssp); init_srcu_struct_data(ssp);
ssp->srcu_sup->srcu_gp_seq_needed_exp = 0; ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL;
ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns(); ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) { if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
@ -266,7 +267,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG); WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
} }
ssp->srcu_sup->srcu_ssp = ssp; ssp->srcu_sup->srcu_ssp = ssp;
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */ smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed,
SRCU_GP_SEQ_INITIAL_VAL); /* Init done. */
return 0; return 0;
err_free_sda: err_free_sda:
@ -628,6 +630,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
if (time_after(j, gpstart)) if (time_after(j, gpstart))
jbase += j - gpstart; jbase += j - gpstart;
if (!jbase) { if (!jbase) {
ASSERT_EXCLUSIVE_WRITER(sup->srcu_n_exp_nodelay);
WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1); WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase) if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
jbase = 1; jbase = 1;
@ -1560,6 +1563,7 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
struct srcu_data *sdp; struct srcu_data *sdp;
struct srcu_struct *ssp; struct srcu_struct *ssp;
rhp->next = rhp; // Mark the callback as having been invoked.
sdp = container_of(rhp, struct srcu_data, srcu_barrier_head); sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
ssp = sdp->ssp; ssp = sdp->ssp;
if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt)) if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
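Both the srcu_barrier_head initialization added earlier in this file and the "rhp->next = rhp" assignment above rely on one convention: a barrier callback whose ->next field points at itself is "done" (never queued, or already invoked). Helpers such as rcu_barrier_cb_is_done(), used by the torture stats printing added later in this series, appear to test exactly that self-pointer. A small userspace sketch of the idea, with names invented for illustration:

#include <stdbool.h>
#include <stdio.h>

struct cb_head {
	struct cb_head *next;
};

/* Mark a callback as not in flight: point it at itself. */
static void cb_mark_done(struct cb_head *head)
{
	head->next = head;
}

static bool cb_is_done(const struct cb_head *head)
{
	return head->next == head;
}

int main(void)
{
	struct cb_head barrier_head;

	cb_mark_done(&barrier_head);	/* boot-time initialization */
	printf("in flight? %s\n", cb_is_done(&barrier_head) ? "no" : "yes");

	barrier_head.next = NULL;	/* queued: links into a callback list */
	printf("in flight? %s\n", cb_is_done(&barrier_head) ? "no" : "yes");

	cb_mark_done(&barrier_head);	/* callback body runs, marks itself done */
	printf("in flight? %s\n", cb_is_done(&barrier_head) ? "no" : "yes");
	return 0;
}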
@ -1818,6 +1822,7 @@ static void process_srcu(struct work_struct *work)
} else { } else {
j = jiffies; j = jiffies;
if (READ_ONCE(sup->reschedule_jiffies) == j) { if (READ_ONCE(sup->reschedule_jiffies) == j) {
ASSERT_EXCLUSIVE_WRITER(sup->reschedule_count);
WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1); WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay) if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
curdelay = 1; curdelay = 1;

View File

@ -34,6 +34,7 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
* @rtp_blkd_tasks: List of tasks blocked as readers. * @rtp_blkd_tasks: List of tasks blocked as readers.
* @rtp_exit_list: List of tasks in the latter portion of do_exit(). * @rtp_exit_list: List of tasks in the latter portion of do_exit().
* @cpu: CPU number corresponding to this entry. * @cpu: CPU number corresponding to this entry.
* @index: Index of this CPU in rtpcp_array of the rcu_tasks structure.
* @rtpp: Pointer to the rcu_tasks structure. * @rtpp: Pointer to the rcu_tasks structure.
*/ */
struct rcu_tasks_percpu { struct rcu_tasks_percpu {
@ -49,6 +50,7 @@ struct rcu_tasks_percpu {
struct list_head rtp_blkd_tasks; struct list_head rtp_blkd_tasks;
struct list_head rtp_exit_list; struct list_head rtp_exit_list;
int cpu; int cpu;
int index;
struct rcu_tasks *rtpp; struct rcu_tasks *rtpp;
}; };
@ -63,7 +65,7 @@ struct rcu_tasks_percpu {
* @init_fract: Initial backoff sleep interval. * @init_fract: Initial backoff sleep interval.
* @gp_jiffies: Time of last @gp_state transition. * @gp_jiffies: Time of last @gp_state transition.
* @gp_start: Most recent grace-period start in jiffies. * @gp_start: Most recent grace-period start in jiffies.
* @tasks_gp_seq: Number of grace periods completed since boot. * @tasks_gp_seq: Number of grace periods completed since boot in upper bits.
* @n_ipis: Number of IPIs sent to encourage grace periods to end. * @n_ipis: Number of IPIs sent to encourage grace periods to end.
* @n_ipis_fails: Number of IPI-send failures. * @n_ipis_fails: Number of IPI-send failures.
* @kthread_ptr: This flavor's grace-period/callback-invocation kthread. * @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
@ -76,6 +78,7 @@ struct rcu_tasks_percpu {
* @call_func: This flavor's call_rcu()-equivalent function. * @call_func: This flavor's call_rcu()-equivalent function.
* @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE). * @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE).
* @rtpcpu: This flavor's rcu_tasks_percpu structure. * @rtpcpu: This flavor's rcu_tasks_percpu structure.
* @rtpcp_array: Array of pointers to rcu_tasks_percpu structure of CPUs in cpu_possible_mask.
* @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks. * @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks.
* @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing. * @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing.
* @percpu_dequeue_lim: Number of per-CPU callback queues in use for dequeuing. * @percpu_dequeue_lim: Number of per-CPU callback queues in use for dequeuing.
@ -84,6 +87,7 @@ struct rcu_tasks_percpu {
* @barrier_q_count: Number of queues being waited on. * @barrier_q_count: Number of queues being waited on.
* @barrier_q_completion: Barrier wait/wakeup mechanism. * @barrier_q_completion: Barrier wait/wakeup mechanism.
* @barrier_q_seq: Sequence number for barrier operations. * @barrier_q_seq: Sequence number for barrier operations.
* @barrier_q_start: Most recent barrier start in jiffies.
* @name: This flavor's textual name. * @name: This flavor's textual name.
* @kname: This flavor's kthread name. * @kname: This flavor's kthread name.
*/ */
@ -110,6 +114,7 @@ struct rcu_tasks {
call_rcu_func_t call_func; call_rcu_func_t call_func;
unsigned int wait_state; unsigned int wait_state;
struct rcu_tasks_percpu __percpu *rtpcpu; struct rcu_tasks_percpu __percpu *rtpcpu;
struct rcu_tasks_percpu **rtpcp_array;
int percpu_enqueue_shift; int percpu_enqueue_shift;
int percpu_enqueue_lim; int percpu_enqueue_lim;
int percpu_dequeue_lim; int percpu_dequeue_lim;
@ -118,6 +123,7 @@ struct rcu_tasks {
atomic_t barrier_q_count; atomic_t barrier_q_count;
struct completion barrier_q_completion; struct completion barrier_q_completion;
unsigned long barrier_q_seq; unsigned long barrier_q_seq;
unsigned long barrier_q_start;
char *name; char *name;
char *kname; char *kname;
}; };
@ -182,6 +188,8 @@ module_param(rcu_task_collapse_lim, int, 0444);
static int rcu_task_lazy_lim __read_mostly = 32; static int rcu_task_lazy_lim __read_mostly = 32;
module_param(rcu_task_lazy_lim, int, 0444); module_param(rcu_task_lazy_lim, int, 0444);
static int rcu_task_cpu_ids;
/* RCU tasks grace-period state for debugging. */ /* RCU tasks grace-period state for debugging. */
#define RTGS_INIT 0 #define RTGS_INIT 0
#define RTGS_WAIT_WAIT_CBS 1 #define RTGS_WAIT_WAIT_CBS 1
@ -245,6 +253,8 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
int cpu; int cpu;
int lim; int lim;
int shift; int shift;
int maxcpu;
int index = 0;
if (rcu_task_enqueue_lim < 0) { if (rcu_task_enqueue_lim < 0) {
rcu_task_enqueue_lim = 1; rcu_task_enqueue_lim = 1;
@ -254,14 +264,9 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
} }
lim = rcu_task_enqueue_lim; lim = rcu_task_enqueue_lim;
if (lim > nr_cpu_ids) rtp->rtpcp_array = kcalloc(num_possible_cpus(), sizeof(struct rcu_tasks_percpu *), GFP_KERNEL);
lim = nr_cpu_ids; BUG_ON(!rtp->rtpcp_array);
shift = ilog2(nr_cpu_ids / lim);
if (((nr_cpu_ids - 1) >> shift) >= lim)
shift++;
WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
smp_store_release(&rtp->percpu_enqueue_lim, lim);
for_each_possible_cpu(cpu) { for_each_possible_cpu(cpu) {
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
@ -273,14 +278,30 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq); INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq);
rtpcp->cpu = cpu; rtpcp->cpu = cpu;
rtpcp->rtpp = rtp; rtpcp->rtpp = rtp;
rtpcp->index = index;
rtp->rtpcp_array[index] = rtpcp;
index++;
if (!rtpcp->rtp_blkd_tasks.next) if (!rtpcp->rtp_blkd_tasks.next)
INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks); INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
if (!rtpcp->rtp_exit_list.next) if (!rtpcp->rtp_exit_list.next)
INIT_LIST_HEAD(&rtpcp->rtp_exit_list); INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
rtpcp->barrier_q_head.next = &rtpcp->barrier_q_head;
maxcpu = cpu;
} }
pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name, rcu_task_cpu_ids = maxcpu + 1;
data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim), rcu_task_cb_adjust); if (lim > rcu_task_cpu_ids)
lim = rcu_task_cpu_ids;
shift = ilog2(rcu_task_cpu_ids / lim);
if (((rcu_task_cpu_ids - 1) >> shift) >= lim)
shift++;
WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
smp_store_release(&rtp->percpu_enqueue_lim, lim);
pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d rcu_task_cpu_ids=%d.\n",
rtp->name, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim),
rcu_task_cb_adjust, rcu_task_cpu_ids);
} }
// Compute wakeup time for lazy callback timer. // Compute wakeup time for lazy callback timer.
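The shift computed above maps CPU numbers onto the reduced set of callback queues: callers divide the CPU number by a power of two (see ideal_cpu = smp_processor_id() >> percpu_enqueue_shift in the next hunk), and the extra "shift++" guard keeps the highest CPU number inside the limit when the CPU count is not an exact multiple of the queue count. A throwaway userspace check of that arithmetic, using hypothetical counts:

#include <assert.h>
#include <stdio.h>

/* Exponent of the largest power of two that is <= x, for x >= 1. */
static int ilog2_u(unsigned int x)
{
	int r = -1;

	while (x) {
		x >>= 1;
		r++;
	}
	return r;
}

int main(void)
{
	unsigned int ncpus = 6, lim = 4;	/* hypothetical: 6 CPUs, 4 queues */
	int shift = ilog2_u(ncpus / lim);
	unsigned int cpu;

	/* Without this bump, CPU 5 >> 0 == 5 would land beyond the 4 queues. */
	if (((ncpus - 1) >> shift) >= lim)
		shift++;
	for (cpu = 0; cpu < ncpus; cpu++) {
		unsigned int queue = cpu >> shift;

		assert(queue < lim);
		printf("cpu %u -> queue %u\n", cpu, queue);
	}
	return 0;
}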
@ -339,6 +360,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
rcu_read_lock(); rcu_read_lock();
ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift); ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift);
chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask); chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask);
WARN_ON_ONCE(chosen_cpu >= rcu_task_cpu_ids);
rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu); rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu);
if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled. if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled.
raw_spin_lock_rcu_node(rtpcp); // irqs already disabled. raw_spin_lock_rcu_node(rtpcp); // irqs already disabled.
@ -348,7 +370,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
rtpcp->rtp_n_lock_retries = 0; rtpcp->rtp_n_lock_retries = 0;
} }
if (rcu_task_cb_adjust && ++rtpcp->rtp_n_lock_retries > rcu_task_contend_lim && if (rcu_task_cb_adjust && ++rtpcp->rtp_n_lock_retries > rcu_task_contend_lim &&
READ_ONCE(rtp->percpu_enqueue_lim) != nr_cpu_ids) READ_ONCE(rtp->percpu_enqueue_lim) != rcu_task_cpu_ids)
needadjust = true; // Defer adjustment to avoid deadlock. needadjust = true; // Defer adjustment to avoid deadlock.
} }
// Queuing callbacks before initialization not yet supported. // Queuing callbacks before initialization not yet supported.
@ -368,10 +390,10 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags); raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
if (unlikely(needadjust)) { if (unlikely(needadjust)) {
raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags); raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
if (rtp->percpu_enqueue_lim != nr_cpu_ids) { if (rtp->percpu_enqueue_lim != rcu_task_cpu_ids) {
WRITE_ONCE(rtp->percpu_enqueue_shift, 0); WRITE_ONCE(rtp->percpu_enqueue_shift, 0);
WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids); WRITE_ONCE(rtp->percpu_dequeue_lim, rcu_task_cpu_ids);
smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids); smp_store_release(&rtp->percpu_enqueue_lim, rcu_task_cpu_ids);
pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name); pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name);
} }
raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags); raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
@ -388,6 +410,7 @@ static void rcu_barrier_tasks_generic_cb(struct rcu_head *rhp)
struct rcu_tasks *rtp; struct rcu_tasks *rtp;
struct rcu_tasks_percpu *rtpcp; struct rcu_tasks_percpu *rtpcp;
rhp->next = rhp; // Mark the callback as having been invoked.
rtpcp = container_of(rhp, struct rcu_tasks_percpu, barrier_q_head); rtpcp = container_of(rhp, struct rcu_tasks_percpu, barrier_q_head);
rtp = rtpcp->rtpp; rtp = rtpcp->rtpp;
if (atomic_dec_and_test(&rtp->barrier_q_count)) if (atomic_dec_and_test(&rtp->barrier_q_count))
@ -396,7 +419,7 @@ static void rcu_barrier_tasks_generic_cb(struct rcu_head *rhp)
// Wait for all in-flight callbacks for the specified RCU Tasks flavor. // Wait for all in-flight callbacks for the specified RCU Tasks flavor.
// Operates in a manner similar to rcu_barrier(). // Operates in a manner similar to rcu_barrier().
static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp) static void __maybe_unused rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
{ {
int cpu; int cpu;
unsigned long flags; unsigned long flags;
@ -409,6 +432,7 @@ static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
mutex_unlock(&rtp->barrier_q_mutex); mutex_unlock(&rtp->barrier_q_mutex);
return; return;
} }
rtp->barrier_q_start = jiffies;
rcu_seq_start(&rtp->barrier_q_seq); rcu_seq_start(&rtp->barrier_q_seq);
init_completion(&rtp->barrier_q_completion); init_completion(&rtp->barrier_q_completion);
atomic_set(&rtp->barrier_q_count, 2); atomic_set(&rtp->barrier_q_count, 2);
@ -444,6 +468,8 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim); dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim);
for (cpu = 0; cpu < dequeue_limit; cpu++) { for (cpu = 0; cpu < dequeue_limit; cpu++) {
if (!cpu_possible(cpu))
continue;
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
/* Advance and accelerate any new callbacks. */ /* Advance and accelerate any new callbacks. */
@ -481,7 +507,7 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) { if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) {
raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags); raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
if (rtp->percpu_enqueue_lim > 1) { if (rtp->percpu_enqueue_lim > 1) {
WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids)); WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(rcu_task_cpu_ids));
smp_store_release(&rtp->percpu_enqueue_lim, 1); smp_store_release(&rtp->percpu_enqueue_lim, 1);
rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu(); rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
gpdone = false; gpdone = false;
@ -496,7 +522,9 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name); pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name);
} }
if (rtp->percpu_dequeue_lim == 1) { if (rtp->percpu_dequeue_lim == 1) {
for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) { for (cpu = rtp->percpu_dequeue_lim; cpu < rcu_task_cpu_ids; cpu++) {
if (!cpu_possible(cpu))
continue;
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist)); WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
@ -511,30 +539,32 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
// Advance callbacks and invoke any that are ready. // Advance callbacks and invoke any that are ready.
static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu *rtpcp) static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu *rtpcp)
{ {
int cpu;
int cpunext;
int cpuwq; int cpuwq;
unsigned long flags; unsigned long flags;
int len; int len;
int index;
struct rcu_head *rhp; struct rcu_head *rhp;
struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
struct rcu_tasks_percpu *rtpcp_next; struct rcu_tasks_percpu *rtpcp_next;
cpu = rtpcp->cpu; index = rtpcp->index * 2 + 1;
cpunext = cpu * 2 + 1; if (index < num_possible_cpus()) {
if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) { rtpcp_next = rtp->rtpcp_array[index];
rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext); if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND; cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
cpunext++;
if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work); queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
index++;
if (index < num_possible_cpus()) {
rtpcp_next = rtp->rtpcp_array[index];
if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
}
}
} }
} }
if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu)) if (rcu_segcblist_empty(&rtpcp->cblist))
return; return;
raw_spin_lock_irqsave_rcu_node(rtpcp, flags); raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq)); rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq));
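The rework above walks the callback-invocation tree by dense rtpcp_array index (index*2+1 and index*2+2) instead of by raw CPU number, so sparse possible-CPU numbering no longer leaves unreachable nodes. A hedged userspace sketch of that fan-out, ignoring the percpu_dequeue_lim check and the workqueue plumbing:

#include <stdio.h>

#define NSLOTS 7	/* hypothetical number of possible CPUs / array entries */

/* Each slot kicks slots 2*i+1 and 2*i+2, so every entry is reached from slot 0. */
static void fan_out(int index)
{
	int child = index * 2 + 1;
	int i;

	printf("slot %d: invoke callbacks\n", index);
	for (i = 0; i < 2; i++) {
		if (child + i < NSLOTS)
			fan_out(child + i);
	}
}

int main(void)
{
	fan_out(0);
	return 0;
}

In the kernel code the children run as queued work rather than recursive calls, so the fan-out proceeds in parallel waves, but the indexing is the same.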
@ -687,9 +717,7 @@ static void __init rcu_tasks_bootup_oddness(void)
#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */ #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
} }
#endif /* #ifndef CONFIG_TINY_RCU */
#ifndef CONFIG_TINY_RCU
/* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */ /* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s) static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
{ {
@ -723,6 +751,53 @@ static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
rtp->lazy_jiffies, rtp->lazy_jiffies,
s); s);
} }
/* Dump out more rcutorture-relevant state common to all RCU-tasks flavors. */
static void rcu_tasks_torture_stats_print_generic(struct rcu_tasks *rtp, char *tt,
char *tf, char *tst)
{
cpumask_var_t cm;
int cpu;
bool gotcb = false;
unsigned long j = jiffies;
pr_alert("%s%s Tasks%s RCU g%ld gp_start %lu gp_jiffies %lu gp_state %d (%s).\n",
tt, tf, tst, data_race(rtp->tasks_gp_seq),
j - data_race(rtp->gp_start), j - data_race(rtp->gp_jiffies),
data_race(rtp->gp_state), tasks_gp_state_getname(rtp));
pr_alert("\tEnqueue shift %d limit %d Dequeue limit %d gpseq %lu.\n",
data_race(rtp->percpu_enqueue_shift),
data_race(rtp->percpu_enqueue_lim),
data_race(rtp->percpu_dequeue_lim),
data_race(rtp->percpu_dequeue_gpseq));
(void)zalloc_cpumask_var(&cm, GFP_KERNEL);
pr_alert("\tCallback counts:");
for_each_possible_cpu(cpu) {
long n;
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
if (cpumask_available(cm) && !rcu_barrier_cb_is_done(&rtpcp->barrier_q_head))
cpumask_set_cpu(cpu, cm);
n = rcu_segcblist_n_cbs(&rtpcp->cblist);
if (!n)
continue;
pr_cont(" %d:%ld", cpu, n);
gotcb = true;
}
if (gotcb)
pr_cont(".\n");
else
pr_cont(" (none).\n");
pr_alert("\tBarrier seq %lu start %lu count %d holdout CPUs ",
data_race(rtp->barrier_q_seq), j - data_race(rtp->barrier_q_start),
atomic_read(&rtp->barrier_q_count));
if (cpumask_available(cm) && !cpumask_empty(cm))
pr_cont(" %*pbl.\n", cpumask_pr_args(cm));
else
pr_cont("(none).\n");
free_cpumask_var(cm);
}
#endif // #ifndef CONFIG_TINY_RCU #endif // #ifndef CONFIG_TINY_RCU
static void exit_tasks_rcu_finish_trace(struct task_struct *t); static void exit_tasks_rcu_finish_trace(struct task_struct *t);
@ -1174,6 +1249,12 @@ void show_rcu_tasks_classic_gp_kthread(void)
show_rcu_tasks_generic_gp_kthread(&rcu_tasks, ""); show_rcu_tasks_generic_gp_kthread(&rcu_tasks, "");
} }
EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread); EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
void rcu_tasks_torture_stats_print(char *tt, char *tf)
{
rcu_tasks_torture_stats_print_generic(&rcu_tasks, tt, tf, "");
}
EXPORT_SYMBOL_GPL(rcu_tasks_torture_stats_print);
#endif // !defined(CONFIG_TINY_RCU) #endif // !defined(CONFIG_TINY_RCU)
struct task_struct *get_rcu_tasks_gp_kthread(void) struct task_struct *get_rcu_tasks_gp_kthread(void)
@ -1244,13 +1325,12 @@ void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
//////////////////////////////////////////////////////////////////////// ////////////////////////////////////////////////////////////////////////
// //
// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of // "Rude" variant of Tasks RCU, inspired by Steve Rostedt's
// passing an empty function to schedule_on_each_cpu(). This approach // trick of passing an empty function to schedule_on_each_cpu().
// provides an asynchronous call_rcu_tasks_rude() API and batching of // This approach provides batching of concurrent calls to the synchronous
// concurrent calls to the synchronous synchronize_rcu_tasks_rude() API. // synchronize_rcu_tasks_rude() API. This invokes schedule_on_each_cpu()
// This invokes schedule_on_each_cpu() in order to send IPIs far and wide // in order to send IPIs far and wide and induces otherwise unnecessary
// and induces otherwise unnecessary context switches on all online CPUs, // context switches on all online CPUs, whether idle or not.
// whether idle or not.
// //
// Callback handling is provided by the rcu_tasks_kthread() function. // Callback handling is provided by the rcu_tasks_kthread() function.
// //
@ -1268,11 +1348,11 @@ static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
schedule_on_each_cpu(rcu_tasks_be_rude); schedule_on_each_cpu(rcu_tasks_be_rude);
} }
void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func); static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude, DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
"RCU Tasks Rude"); "RCU Tasks Rude");
/** /*
* call_rcu_tasks_rude() - Queue a callback rude task-based grace period * call_rcu_tasks_rude() - Queue a callback rude task-based grace period
* @rhp: structure to be used for queueing the RCU updates. * @rhp: structure to be used for queueing the RCU updates.
* @func: actual callback function to be invoked after the grace period * @func: actual callback function to be invoked after the grace period
@ -1289,12 +1369,14 @@ DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
* *
* See the description of call_rcu() for more detailed information on * See the description of call_rcu() for more detailed information on
* memory ordering guarantees. * memory ordering guarantees.
*
* This is no longer exported, and is instead reserved for use by
* synchronize_rcu_tasks_rude().
*/ */
void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func) static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
{ {
call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude); call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude);
} }
EXPORT_SYMBOL_GPL(call_rcu_tasks_rude);
/** /**
* synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period * synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period
@ -1320,26 +1402,9 @@ void synchronize_rcu_tasks_rude(void)
} }
EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude); EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude);
/**
* rcu_barrier_tasks_rude - Wait for in-flight call_rcu_tasks_rude() callbacks.
*
* Although the current implementation is guaranteed to wait, it is not
* obligated to, for example, if there are no pending callbacks.
*/
void rcu_barrier_tasks_rude(void)
{
rcu_barrier_tasks_generic(&rcu_tasks_rude);
}
EXPORT_SYMBOL_GPL(rcu_barrier_tasks_rude);
int rcu_tasks_rude_lazy_ms = -1;
module_param(rcu_tasks_rude_lazy_ms, int, 0444);
static int __init rcu_spawn_tasks_rude_kthread(void) static int __init rcu_spawn_tasks_rude_kthread(void)
{ {
rcu_tasks_rude.gp_sleep = HZ / 10; rcu_tasks_rude.gp_sleep = HZ / 10;
if (rcu_tasks_rude_lazy_ms >= 0)
rcu_tasks_rude.lazy_jiffies = msecs_to_jiffies(rcu_tasks_rude_lazy_ms);
rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude); rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude);
return 0; return 0;
} }
@ -1350,6 +1415,12 @@ void show_rcu_tasks_rude_gp_kthread(void)
show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, ""); show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, "");
} }
EXPORT_SYMBOL_GPL(show_rcu_tasks_rude_gp_kthread); EXPORT_SYMBOL_GPL(show_rcu_tasks_rude_gp_kthread);
void rcu_tasks_rude_torture_stats_print(char *tt, char *tf)
{
rcu_tasks_torture_stats_print_generic(&rcu_tasks_rude, tt, tf, "");
}
EXPORT_SYMBOL_GPL(rcu_tasks_rude_torture_stats_print);
#endif // !defined(CONFIG_TINY_RCU) #endif // !defined(CONFIG_TINY_RCU)
struct task_struct *get_rcu_tasks_rude_gp_kthread(void) struct task_struct *get_rcu_tasks_rude_gp_kthread(void)
@ -2027,6 +2098,12 @@ void show_rcu_tasks_trace_gp_kthread(void)
show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf); show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf);
} }
EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread); EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread);
void rcu_tasks_trace_torture_stats_print(char *tt, char *tf)
{
rcu_tasks_torture_stats_print_generic(&rcu_tasks_trace, tt, tf, "");
}
EXPORT_SYMBOL_GPL(rcu_tasks_trace_torture_stats_print);
#endif // !defined(CONFIG_TINY_RCU) #endif // !defined(CONFIG_TINY_RCU)
struct task_struct *get_rcu_tasks_trace_gp_kthread(void) struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
@ -2069,11 +2146,6 @@ static struct rcu_tasks_test_desc tests[] = {
/* If not defined, the test is skipped. */ /* If not defined, the test is skipped. */
.notrun = IS_ENABLED(CONFIG_TASKS_RCU), .notrun = IS_ENABLED(CONFIG_TASKS_RCU),
}, },
{
.name = "call_rcu_tasks_rude()",
/* If not defined, the test is skipped. */
.notrun = IS_ENABLED(CONFIG_TASKS_RUDE_RCU),
},
{ {
.name = "call_rcu_tasks_trace()", .name = "call_rcu_tasks_trace()",
/* If not defined, the test is skipped. */ /* If not defined, the test is skipped. */
@ -2081,6 +2153,7 @@ static struct rcu_tasks_test_desc tests[] = {
} }
}; };
#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
static void test_rcu_tasks_callback(struct rcu_head *rhp) static void test_rcu_tasks_callback(struct rcu_head *rhp)
{ {
struct rcu_tasks_test_desc *rttd = struct rcu_tasks_test_desc *rttd =
@ -2090,6 +2163,7 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
rttd->notrun = false; rttd->notrun = false;
} }
#endif // #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
static void rcu_tasks_initiate_self_tests(void) static void rcu_tasks_initiate_self_tests(void)
{ {
@ -2102,16 +2176,14 @@ static void rcu_tasks_initiate_self_tests(void)
#ifdef CONFIG_TASKS_RUDE_RCU #ifdef CONFIG_TASKS_RUDE_RCU
pr_info("Running RCU Tasks Rude wait API self tests\n"); pr_info("Running RCU Tasks Rude wait API self tests\n");
tests[1].runstart = jiffies;
synchronize_rcu_tasks_rude(); synchronize_rcu_tasks_rude();
call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
#endif #endif
#ifdef CONFIG_TASKS_TRACE_RCU #ifdef CONFIG_TASKS_TRACE_RCU
pr_info("Running RCU Tasks Trace wait API self tests\n"); pr_info("Running RCU Tasks Trace wait API self tests\n");
tests[2].runstart = jiffies; tests[1].runstart = jiffies;
synchronize_rcu_tasks_trace(); synchronize_rcu_tasks_trace();
call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback); call_rcu_tasks_trace(&tests[1].rh, test_rcu_tasks_callback);
#endif #endif
} }

View File

@ -79,9 +79,6 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *);
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
.gpwrap = true, .gpwrap = true,
#ifdef CONFIG_RCU_NOCB_CPU
.cblist.flags = SEGCBLIST_RCU_CORE,
#endif
}; };
static struct rcu_state rcu_state = { static struct rcu_state rcu_state = {
.level = { &rcu_state.node[0] }, .level = { &rcu_state.node[0] },
@ -97,6 +94,9 @@ static struct rcu_state rcu_state = {
.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work, .srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
rcu_sr_normal_gp_cleanup_work), rcu_sr_normal_gp_cleanup_work),
.srs_cleanups_pending = ATOMIC_INIT(0), .srs_cleanups_pending = ATOMIC_INIT(0),
#ifdef CONFIG_RCU_NOCB_CPU
.nocb_mutex = __MUTEX_INITIALIZER(rcu_state.nocb_mutex),
#endif
}; };
/* Dump rcu_node combining tree at boot to verify correct setup. */ /* Dump rcu_node combining tree at boot to verify correct setup. */
@ -1660,7 +1660,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
* the done tail list manipulations are protected here. * the done tail list manipulations are protected here.
*/ */
done = smp_load_acquire(&rcu_state.srs_done_tail); done = smp_load_acquire(&rcu_state.srs_done_tail);
if (!done) if (WARN_ON_ONCE(!done))
return; return;
WARN_ON_ONCE(!rcu_sr_is_wait_head(done)); WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
@ -2394,7 +2394,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
{ {
unsigned long flags; unsigned long flags;
unsigned long mask; unsigned long mask;
bool needacc = false;
struct rcu_node *rnp; struct rcu_node *rnp;
WARN_ON_ONCE(rdp->cpu != smp_processor_id()); WARN_ON_ONCE(rdp->cpu != smp_processor_id());
@ -2431,23 +2430,11 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
* to return true. So complain, but don't awaken. * to return true. So complain, but don't awaken.
*/ */
WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp)); WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
} else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
/*
* ...but NOCB kthreads may miss or delay callbacks acceleration
* if in the middle of a (de-)offloading process.
*/
needacc = true;
} }
rcu_disable_urgency_upon_qs(rdp); rcu_disable_urgency_upon_qs(rdp);
rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
/* ^^^ Released rnp->lock */ /* ^^^ Released rnp->lock */
if (needacc) {
rcu_nocb_lock_irqsave(rdp, flags);
rcu_accelerate_cbs_unlocked(rnp, rdp);
rcu_nocb_unlock_irqrestore(rdp, flags);
}
} }
} }
@ -2802,24 +2789,6 @@ static __latent_entropy void rcu_core(void)
unsigned long flags; unsigned long flags;
struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode; struct rcu_node *rnp = rdp->mynode;
/*
* On RT rcu_core() can be preempted when IRQs aren't disabled.
* Therefore this function can race with concurrent NOCB (de-)offloading
* on this CPU and the below condition must be considered volatile.
* However if we race with:
*
* _ Offloading: In the worst case we accelerate or process callbacks
* concurrently with NOCB kthreads. We are guaranteed to
* call rcu_nocb_lock() if that happens.
*
* _ Deoffloading: In the worst case we miss callbacks acceleration or
* processing. This is fine because the early stage
* of deoffloading invokes rcu_core() after setting
* SEGCBLIST_RCU_CORE. So we guarantee that we'll process
* what could have been dismissed without the need to wait
* for the next rcu_pending() check in the next jiffy.
*/
const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
if (cpu_is_offline(smp_processor_id())) if (cpu_is_offline(smp_processor_id()))
return; return;
@ -2839,17 +2808,17 @@ static __latent_entropy void rcu_core(void)
/* No grace period and unregistered callbacks? */ /* No grace period and unregistered callbacks? */
if (!rcu_gp_in_progress() && if (!rcu_gp_in_progress() &&
rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) { rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_rdp_is_offloaded(rdp)) {
rcu_nocb_lock_irqsave(rdp, flags); local_irq_save(flags);
if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
rcu_accelerate_cbs_unlocked(rnp, rdp); rcu_accelerate_cbs_unlocked(rnp, rdp);
rcu_nocb_unlock_irqrestore(rdp, flags); local_irq_restore(flags);
} }
rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check()); rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check());
/* If there are callbacks ready, invoke them. */ /* If there are callbacks ready, invoke them. */
if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) && if (!rcu_rdp_is_offloaded(rdp) && rcu_segcblist_ready_cbs(&rdp->cblist) &&
likely(READ_ONCE(rcu_scheduler_fully_active))) { likely(READ_ONCE(rcu_scheduler_fully_active))) {
rcu_do_batch(rdp); rcu_do_batch(rdp);
/* Re-invoke RCU core processing if there are callbacks remaining. */ /* Re-invoke RCU core processing if there are callbacks remaining. */
@ -3238,7 +3207,7 @@ struct kvfree_rcu_bulk_data {
struct list_head list; struct list_head list;
struct rcu_gp_oldstate gp_snap; struct rcu_gp_oldstate gp_snap;
unsigned long nr_records; unsigned long nr_records;
void *records[]; void *records[] __counted_by(nr_records);
}; };
/* /*
@ -3550,10 +3519,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
if (delayed_work_pending(&krcp->monitor_work)) { if (delayed_work_pending(&krcp->monitor_work)) {
delay_left = krcp->monitor_work.timer.expires - jiffies; delay_left = krcp->monitor_work.timer.expires - jiffies;
if (delay < delay_left) if (delay < delay_left)
mod_delayed_work(system_wq, &krcp->monitor_work, delay); mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
return; return;
} }
queue_delayed_work(system_wq, &krcp->monitor_work, delay); queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
} }
static void static void
@ -3645,7 +3614,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
// be that the work is in the pending state when // be that the work is in the pending state when
// channels have been detached following by each // channels have been detached following by each
// other. // other.
queue_rcu_work(system_wq, &krwp->rcu_work); queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
} }
} }
@ -3715,7 +3684,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING && if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
!atomic_xchg(&krcp->work_in_progress, 1)) { !atomic_xchg(&krcp->work_in_progress, 1)) {
if (atomic_read(&krcp->backoff_page_cache_fill)) { if (atomic_read(&krcp->backoff_page_cache_fill)) {
queue_delayed_work(system_wq, queue_delayed_work(system_unbound_wq,
&krcp->page_cache_work, &krcp->page_cache_work,
msecs_to_jiffies(rcu_delay_page_cache_fill_msec)); msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
} else { } else {
@ -3778,7 +3747,8 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
} }
// Finally insert and update the GP for this page. // Finally insert and update the GP for this page.
bnode->records[bnode->nr_records++] = ptr; bnode->nr_records++;
bnode->records[bnode->nr_records - 1] = ptr;
get_state_synchronize_rcu_full(&bnode->gp_snap); get_state_synchronize_rcu_full(&bnode->gp_snap);
atomic_inc(&(*krcp)->bulk_count[idx]); atomic_inc(&(*krcp)->bulk_count[idx]);
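The two-statement store above pairs with the __counted_by(nr_records) annotation added to struct kvfree_rcu_bulk_data earlier in this file: with that attribute, bounds instrumentation sizes records[] by the current value of nr_records, so the counter has to be incremented before the new slot is indexed. A userspace sketch of the pattern (the counted_by fallback macro here is purely illustrative, not the kernel's __counted_by definition):

#include <stdlib.h>

/* Use the attribute where the compiler offers it; otherwise compile it away. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define counted_by(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef counted_by
# define counted_by(member)
#endif

struct bulk_data {
	unsigned long nr_records;
	void *records[] counted_by(nr_records);
};

int main(void)
{
	struct bulk_data *b = calloc(1, sizeof(*b) + 4 * sizeof(void *));

	if (!b)
		return 1;
	/*
	 * Bump the counter first, then store: bounds checking treats
	 * records[] as nr_records entries long, so indexing slot
	 * nr_records - 1 is in bounds only after the increment.
	 */
	b->nr_records++;
	b->records[b->nr_records - 1] = b;
	free(b);
	return 0;
}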
@ -4414,6 +4384,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
{ {
unsigned long __maybe_unused s = rcu_state.barrier_sequence; unsigned long __maybe_unused s = rcu_state.barrier_sequence;
rhp->next = rhp; // Mark the callback as having been invoked.
if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) { if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
rcu_barrier_trace(TPS("LastCB"), -1, s); rcu_barrier_trace(TPS("LastCB"), -1, s);
complete(&rcu_state.barrier_completion); complete(&rcu_state.barrier_completion);
@ -5435,6 +5406,8 @@ static void __init rcu_init_one(void)
while (i > rnp->grphi) while (i > rnp->grphi)
rnp++; rnp++;
per_cpu_ptr(&rcu_data, i)->mynode = rnp; per_cpu_ptr(&rcu_data, i)->mynode = rnp;
per_cpu_ptr(&rcu_data, i)->barrier_head.next =
&per_cpu_ptr(&rcu_data, i)->barrier_head;
rcu_boot_init_percpu_data(i); rcu_boot_init_percpu_data(i);
} }
} }

View File

@ -411,7 +411,6 @@ struct rcu_state {
arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
/* Synchronize offline with */ /* Synchronize offline with */
/* GP pre-initialization. */ /* GP pre-initialization. */
int nocb_is_setup; /* nocb is setup from boot */
/* synchronize_rcu() part. */ /* synchronize_rcu() part. */
struct llist_head srs_next; /* request a GP users. */ struct llist_head srs_next; /* request a GP users. */
@ -420,6 +419,11 @@ struct rcu_state {
struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX]; struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX];
struct work_struct srs_cleanup_work; struct work_struct srs_cleanup_work;
atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */ atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */
#ifdef CONFIG_RCU_NOCB_CPU
struct mutex nocb_mutex; /* Guards (de-)offloading */
int nocb_is_setup; /* nocb is setup from boot */
#endif
}; };
/* Values for rcu_state structure's gp_flags field. */ /* Values for rcu_state structure's gp_flags field. */

View File

@ -542,6 +542,67 @@ static bool synchronize_rcu_expedited_wait_once(long tlimit)
return false; return false;
} }
/*
* Print out an expedited RCU CPU stall warning message.
*/
static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigned long j)
{
int cpu;
unsigned long mask;
int ndetected;
struct rcu_node *rnp;
struct rcu_node *rnp_root = rcu_get_root();
if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
pr_err("INFO: %s detected expedited stalls, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
return;
}
pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name);
ndetected = 0;
rcu_for_each_leaf_node(rnp) {
ndetected += rcu_print_task_exp_stall(rnp);
for_each_leaf_node_possible_cpu(rnp, cpu) {
struct rcu_data *rdp;
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
ndetected++;
rdp = per_cpu_ptr(&rcu_data, cpu);
pr_cont(" %d-%c%c%c%c", cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rnp->expmaskinit)],
"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
}
}
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures (internal RCU debug):");
rcu_for_each_node_breadth_first(rnp) {
if (rnp == rnp_root)
continue; /* printed unconditionally */
if (sync_rcu_exp_done_unlocked(rnp))
continue;
pr_cont(" l=%u:%d-%d:%#lx/%c",
rnp->level, rnp->grplo, rnp->grphi, data_race(rnp->expmask),
".T"[!!data_race(rnp->exp_tasks)]);
}
pr_cont("\n");
}
rcu_for_each_leaf_node(rnp) {
for_each_leaf_node_possible_cpu(rnp, cpu) {
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
dump_cpu_task(cpu);
}
rcu_exp_print_detail_task_stall_rnp(rnp);
}
}
/* /*
* Wait for the expedited grace period to elapse, issuing any needed * Wait for the expedited grace period to elapse, issuing any needed
* RCU CPU stall warnings along the way. * RCU CPU stall warnings along the way.
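One bit of C in the helper above that can read oddly out of context: expressions such as "O."[!!cpu_online(cpu)] index directly into a two-character string literal, so the boolean picks which character is printed for that CPU's flag. A trivial standalone illustration of the mechanics (the flag values are made up):

#include <stdio.h>

int main(void)
{
	int online = 0;		/* hypothetical per-CPU state */
	int expedited_qs = 1;

	/*
	 * A string literal is an array, so "O."[x] is 'O' when x == 0
	 * and '.' when x == 1; !! collapses any nonzero value to 1.
	 */
	printf("%c%c\n", "O."[!!online], "D."[!!expedited_qs]);
	return 0;
}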
@ -553,10 +614,8 @@ static void synchronize_rcu_expedited_wait(void)
unsigned long jiffies_stall; unsigned long jiffies_stall;
unsigned long jiffies_start; unsigned long jiffies_start;
unsigned long mask; unsigned long mask;
int ndetected;
struct rcu_data *rdp; struct rcu_data *rdp;
struct rcu_node *rnp; struct rcu_node *rnp;
struct rcu_node *rnp_root = rcu_get_root();
unsigned long flags; unsigned long flags;
trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait"));
@ -593,55 +652,7 @@ static void synchronize_rcu_expedited_wait(void)
j = jiffies; j = jiffies;
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start)); rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start));
trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall")); trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall"));
pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", synchronize_rcu_expedited_stall(jiffies_start, j);
rcu_state.name);
ndetected = 0;
rcu_for_each_leaf_node(rnp) {
ndetected += rcu_print_task_exp_stall(rnp);
for_each_leaf_node_possible_cpu(rnp, cpu) {
struct rcu_data *rdp;
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
ndetected++;
rdp = per_cpu_ptr(&rcu_data, cpu);
pr_cont(" %d-%c%c%c%c", cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rnp->expmaskinit)],
"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
}
}
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
j - jiffies_start, rcu_state.expedited_sequence,
data_race(rnp_root->expmask),
".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures (internal RCU debug):");
rcu_for_each_node_breadth_first(rnp) {
if (rnp == rnp_root)
continue; /* printed unconditionally */
if (sync_rcu_exp_done_unlocked(rnp))
continue;
pr_cont(" l=%u:%d-%d:%#lx/%c",
rnp->level, rnp->grplo, rnp->grphi,
data_race(rnp->expmask),
".T"[!!data_race(rnp->exp_tasks)]);
}
pr_cont("\n");
}
rcu_for_each_leaf_node(rnp) {
for_each_leaf_node_possible_cpu(rnp, cpu) {
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
preempt_disable(); // For smp_processor_id() in dump_cpu_task().
dump_cpu_task(cpu);
preempt_enable();
}
rcu_exp_print_detail_task_stall_rnp(rnp);
}
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
panic_on_rcu_stall(); panic_on_rcu_stall();
} }

View File

@ -16,10 +16,6 @@
#ifdef CONFIG_RCU_NOCB_CPU #ifdef CONFIG_RCU_NOCB_CPU
static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */ static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */ static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
{
return lockdep_is_held(&rdp->nocb_lock);
}
static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
{ {
@ -220,7 +216,7 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags); raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
if (needwake) { if (needwake) {
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake")); trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
wake_up_process(rdp_gp->nocb_gp_kthread); swake_up_one_online(&rdp_gp->nocb_gp_wq);
} }
return needwake; return needwake;
@ -413,14 +409,6 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
return false; return false;
} }
// In the process of (de-)offloading: no bypassing, but
// locking.
if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
rcu_nocb_lock(rdp);
*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
return false; /* Not offloaded, no bypassing. */
}
// Don't use ->nocb_bypass during early boot. // Don't use ->nocb_bypass during early boot.
if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) { if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
rcu_nocb_lock(rdp); rcu_nocb_lock(rdp);
@ -505,7 +493,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ")); trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
} }
rcu_nocb_bypass_unlock(rdp); rcu_nocb_bypass_unlock(rdp);
smp_mb(); /* Order enqueue before wake. */
// A wake up of the grace period kthread or timer adjustment // A wake up of the grace period kthread or timer adjustment
// needs to be done only if: // needs to be done only if:
// 1. Bypass list was fully empty before (this is the first // 1. Bypass list was fully empty before (this is the first
@ -616,37 +604,33 @@ static void call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *head,
} }
} }
static int nocb_gp_toggle_rdp(struct rcu_data *rdp) static void nocb_gp_toggle_rdp(struct rcu_data *rdp_gp, struct rcu_data *rdp)
{ {
struct rcu_segcblist *cblist = &rdp->cblist; struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long flags; unsigned long flags;
int ret;
rcu_nocb_lock_irqsave(rdp, flags); /*
if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && * Locking orders future de-offloaded callbacks enqueue against previous
!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) { * handling of this rdp. Ie: Make sure rcuog is done with this rdp before
* deoffloaded callbacks can be enqueued.
*/
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
/* /*
* Offloading. Set our flag and notify the offload worker. * Offloading. Set our flag and notify the offload worker.
* We will handle this rdp until it ever gets de-offloaded. * We will handle this rdp until it ever gets de-offloaded.
*/ */
rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP); list_add_tail(&rdp->nocb_entry_rdp, &rdp_gp->nocb_head_rdp);
ret = 1; rcu_segcblist_set_flags(cblist, SEGCBLIST_OFFLOADED);
} else if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && } else {
rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
/* /*
* De-offloading. Clear our flag and notify the de-offload worker. * De-offloading. Clear our flag and notify the de-offload worker.
* We will ignore this rdp until it ever gets re-offloaded. * We will ignore this rdp until it ever gets re-offloaded.
*/ */
rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP); list_del(&rdp->nocb_entry_rdp);
ret = 0; rcu_segcblist_clear_flags(cblist, SEGCBLIST_OFFLOADED);
} else {
WARN_ON_ONCE(1);
ret = -1;
} }
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
rcu_nocb_unlock_irqrestore(rdp, flags);
return ret;
} }
static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu) static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu)
@ -853,14 +837,7 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
} }
if (rdp_toggling) { if (rdp_toggling) {
int ret; nocb_gp_toggle_rdp(my_rdp, rdp_toggling);
ret = nocb_gp_toggle_rdp(rdp_toggling);
if (ret == 1)
list_add_tail(&rdp_toggling->nocb_entry_rdp, &my_rdp->nocb_head_rdp);
else if (ret == 0)
list_del(&rdp_toggling->nocb_entry_rdp);
swake_up_one(&rdp_toggling->nocb_state_wq); swake_up_one(&rdp_toggling->nocb_state_wq);
} }
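The control flow here is a flag-plus-waitqueue handshake: the rcuog kthread flips SEGCBLIST_OFFLOADED under ->nocb_lock in nocb_gp_toggle_rdp() above and then calls swake_up_one(), while the (de-)offloading side sleeps in swait_event_exclusive() until its predicate observes the flag change (rcu_nocb_rdp_deoffload_wait_cond() further down). A rough userspace analogue of that handshake using pthreads, structure only, none of it kernel API:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq = PTHREAD_COND_INITIALIZER;
static bool offloaded = true;

/* "rcuog" side: clear the flag under the lock, then wake the waiter. */
static void *toggler(void *arg)
{
	pthread_mutex_lock(&lock);
	offloaded = false;
	pthread_mutex_unlock(&lock);
	pthread_cond_signal(&wq);
	return NULL;
}

/* De-offload side: sleep until the predicate sees the change. */
int main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, toggler, NULL);
	pthread_mutex_lock(&lock);
	while (offloaded)	/* predicate re-checked after each wakeup */
		pthread_cond_wait(&wq, &lock);
	pthread_mutex_unlock(&lock);
	pthread_join(tid, NULL);
	printf("de-offload complete\n");
	return 0;
}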
@ -1030,16 +1007,11 @@ void rcu_nocb_flush_deferred_wakeup(void)
} }
EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup); EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup);
static int rdp_offload_toggle(struct rcu_data *rdp, static int rcu_nocb_queue_toggle_rdp(struct rcu_data *rdp)
bool offload, unsigned long flags)
__releases(rdp->nocb_lock)
{ {
struct rcu_segcblist *cblist = &rdp->cblist;
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
bool wake_gp = false; bool wake_gp = false;
unsigned long flags;
rcu_segcblist_offload(cblist, offload);
rcu_nocb_unlock_irqrestore(rdp, flags);
raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
// Queue this rdp for add/del to/from the list to iterate on rcuog // Queue this rdp for add/del to/from the list to iterate on rcuog
@ -1053,89 +1025,74 @@ static int rdp_offload_toggle(struct rcu_data *rdp,
return wake_gp; return wake_gp;
} }
static long rcu_nocb_rdp_deoffload(void *arg) static bool rcu_nocb_rdp_deoffload_wait_cond(struct rcu_data *rdp)
{
unsigned long flags;
bool ret;
/*
* Locking makes sure rcuog is done handling this rdp before deoffloaded
* enqueue can happen. Also it keeps the SEGCBLIST_OFFLOADED flag stable
* while the ->nocb_lock is held.
*/
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
ret = !rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
return ret;
}
static int rcu_nocb_rdp_deoffload(struct rcu_data *rdp)
{ {
struct rcu_data *rdp = arg;
struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long flags; unsigned long flags;
int wake_gp; int wake_gp;
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
/* /* CPU must be offline, unless it's early boot */
* rcu_nocb_rdp_deoffload() may be called directly if WARN_ON_ONCE(cpu_online(rdp->cpu) && rdp->cpu != raw_smp_processor_id());
* rcuog/o[p] spawn failed, because at this time the rdp->cpu
* is not online yet.
*/
WARN_ON_ONCE((rdp->cpu != raw_smp_processor_id()) && cpu_online(rdp->cpu));
pr_info("De-offloading %d\n", rdp->cpu); pr_info("De-offloading %d\n", rdp->cpu);
/* Flush all callbacks from segcblist and bypass */
rcu_barrier();
/*
* Make sure the rcuoc kthread isn't in the middle of a nocb locked
* sequence while offloading is deactivated, along with nocb locking.
*/
if (rdp->nocb_cb_kthread)
kthread_park(rdp->nocb_cb_kthread);
rcu_nocb_lock_irqsave(rdp, flags); rcu_nocb_lock_irqsave(rdp, flags);
/* WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
* Flush once and for all now. This suffices because we are WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
* running on the target CPU holding ->nocb_lock (thus having rcu_nocb_unlock_irqrestore(rdp, flags);
* interrupts disabled), and because rdp_offload_toggle()
* invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED. wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
* Thus future calls to rcu_segcblist_completely_offloaded() will
* return false, which means that future calls to rcu_nocb_try_bypass()
* will refuse to put anything into the bypass.
*/
WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
/*
* Start with invoking rcu_core() early. This way if the current thread
* happens to preempt an ongoing call to rcu_core() in the middle,
* leaving some work dismissed because rcu_core() still thinks the rdp is
* completely offloaded, we are guaranteed a nearby future instance of
* rcu_core() to catch up.
*/
rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
invoke_rcu_core();
wake_gp = rdp_offload_toggle(rdp, false, flags);
mutex_lock(&rdp_gp->nocb_gp_kthread_mutex); mutex_lock(&rdp_gp->nocb_gp_kthread_mutex);
if (rdp_gp->nocb_gp_kthread) { if (rdp_gp->nocb_gp_kthread) {
if (wake_gp) if (wake_gp)
wake_up_process(rdp_gp->nocb_gp_kthread); wake_up_process(rdp_gp->nocb_gp_kthread);
swait_event_exclusive(rdp->nocb_state_wq, swait_event_exclusive(rdp->nocb_state_wq,
!rcu_segcblist_test_flags(cblist, rcu_nocb_rdp_deoffload_wait_cond(rdp));
SEGCBLIST_KTHREAD_GP));
if (rdp->nocb_cb_kthread)
kthread_park(rdp->nocb_cb_kthread);
} else { } else {
/* /*
* No kthread to clear the flags for us or remove the rdp from the nocb list * No kthread to clear the flags for us or remove the rdp from the nocb list
* to iterate. Do it here instead. Locking doesn't look strictly necessary * to iterate. Do it here instead. Locking doesn't look strictly necessary
* but we stick to paranoia in this rare path. * but we stick to paranoia in this rare path.
*/ */
rcu_nocb_lock_irqsave(rdp, flags); raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP); rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
rcu_nocb_unlock_irqrestore(rdp, flags); raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
list_del(&rdp->nocb_entry_rdp); list_del(&rdp->nocb_entry_rdp);
} }
mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
/*
* Lock one last time to acquire latest callback updates from kthreads
* so we can later handle callbacks locally without locking.
*/
rcu_nocb_lock_irqsave(rdp, flags);
/*
* Theoretically we could clear SEGCBLIST_LOCKING after the nocb
* lock is released but how about being paranoid for once?
*/
rcu_segcblist_clear_flags(cblist, SEGCBLIST_LOCKING);
/*
* Without SEGCBLIST_LOCKING, we can't use
* rcu_nocb_unlock_irqrestore() anymore.
*/
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
/* Sanity check */
WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
return 0; return 0;
} }
@ -1145,33 +1102,42 @@ int rcu_nocb_cpu_deoffload(int cpu)
int ret = 0; int ret = 0;
cpus_read_lock(); cpus_read_lock();
mutex_lock(&rcu_state.barrier_mutex); mutex_lock(&rcu_state.nocb_mutex);
if (rcu_rdp_is_offloaded(rdp)) { if (rcu_rdp_is_offloaded(rdp)) {
if (cpu_online(cpu)) { if (!cpu_online(cpu)) {
ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp); ret = rcu_nocb_rdp_deoffload(rdp);
if (!ret) if (!ret)
cpumask_clear_cpu(cpu, rcu_nocb_mask); cpumask_clear_cpu(cpu, rcu_nocb_mask);
} else { } else {
pr_info("NOCB: Cannot CB-deoffload offline CPU %d\n", rdp->cpu); pr_info("NOCB: Cannot CB-deoffload online CPU %d\n", rdp->cpu);
ret = -EINVAL; ret = -EINVAL;
} }
} }
mutex_unlock(&rcu_state.barrier_mutex); mutex_unlock(&rcu_state.nocb_mutex);
cpus_read_unlock(); cpus_read_unlock();
return ret; return ret;
} }
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload); EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
static long rcu_nocb_rdp_offload(void *arg) static bool rcu_nocb_rdp_offload_wait_cond(struct rcu_data *rdp)
{ {
struct rcu_data *rdp = arg;
struct rcu_segcblist *cblist = &rdp->cblist;
unsigned long flags; unsigned long flags;
bool ret;
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
ret = rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
return ret;
}
static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
{
int wake_gp; int wake_gp;
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id()); WARN_ON_ONCE(cpu_online(rdp->cpu));
/* /*
* For now we only support re-offload, ie: the rdp must have been * For now we only support re-offload, ie: the rdp must have been
* offloaded on boot first. * offloaded on boot first.
@ -1184,44 +1150,17 @@ static long rcu_nocb_rdp_offload(void *arg)
pr_info("Offloading %d\n", rdp->cpu); pr_info("Offloading %d\n", rdp->cpu);
/* WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
* Can't use rcu_nocb_lock_irqsave() before SEGCBLIST_LOCKING WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
* is set.
*/
raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
/* wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
* We didn't take the nocb lock while working on the
* rdp->cblist with SEGCBLIST_LOCKING cleared (pure softirq/rcuc mode).
* Every modifications that have been done previously on
* rdp->cblist must be visible remotely by the nocb kthreads
* upon wake up after reading the cblist flags.
*
* The layout against nocb_lock enforces that ordering:
*
* __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait()
* ------------------------- ----------------------------
* WRITE callbacks rcu_nocb_lock()
* rcu_nocb_lock() READ flags
* WRITE flags READ callbacks
* rcu_nocb_unlock() rcu_nocb_unlock()
*/
wake_gp = rdp_offload_toggle(rdp, true, flags);
if (wake_gp) if (wake_gp)
wake_up_process(rdp_gp->nocb_gp_kthread); wake_up_process(rdp_gp->nocb_gp_kthread);
kthread_unpark(rdp->nocb_cb_kthread);
swait_event_exclusive(rdp->nocb_state_wq, swait_event_exclusive(rdp->nocb_state_wq,
rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)); rcu_nocb_rdp_offload_wait_cond(rdp));
/* kthread_unpark(rdp->nocb_cb_kthread);
* All kthreads are ready to work, we can finally relieve rcu_core() and
* enable nocb bypass.
*/
rcu_nocb_lock_irqsave(rdp, flags);
rcu_segcblist_clear_flags(cblist, SEGCBLIST_RCU_CORE);
rcu_nocb_unlock_irqrestore(rdp, flags);
return 0; return 0;
} }
@ -1232,18 +1171,18 @@ int rcu_nocb_cpu_offload(int cpu)
int ret = 0; int ret = 0;
cpus_read_lock(); cpus_read_lock();
mutex_lock(&rcu_state.barrier_mutex); mutex_lock(&rcu_state.nocb_mutex);
if (!rcu_rdp_is_offloaded(rdp)) { if (!rcu_rdp_is_offloaded(rdp)) {
if (cpu_online(cpu)) { if (!cpu_online(cpu)) {
ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp); ret = rcu_nocb_rdp_offload(rdp);
if (!ret) if (!ret)
cpumask_set_cpu(cpu, rcu_nocb_mask); cpumask_set_cpu(cpu, rcu_nocb_mask);
} else { } else {
pr_info("NOCB: Cannot CB-offload offline CPU %d\n", rdp->cpu); pr_info("NOCB: Cannot CB-offload online CPU %d\n", rdp->cpu);
ret = -EINVAL; ret = -EINVAL;
} }
} }
mutex_unlock(&rcu_state.barrier_mutex); mutex_unlock(&rcu_state.nocb_mutex);
cpus_read_unlock(); cpus_read_unlock();
return ret; return ret;
@ -1261,7 +1200,7 @@ lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
return 0; return 0;
/* Protect rcu_nocb_mask against concurrent (de-)offloading. */ /* Protect rcu_nocb_mask against concurrent (de-)offloading. */
if (!mutex_trylock(&rcu_state.barrier_mutex)) if (!mutex_trylock(&rcu_state.nocb_mutex))
return 0; return 0;
/* Snapshot count of all CPUs */ /* Snapshot count of all CPUs */
@ -1271,7 +1210,7 @@ lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
count += READ_ONCE(rdp->lazy_len); count += READ_ONCE(rdp->lazy_len);
} }
mutex_unlock(&rcu_state.barrier_mutex); mutex_unlock(&rcu_state.nocb_mutex);
return count ? count : SHRINK_EMPTY; return count ? count : SHRINK_EMPTY;
} }
@ -1289,9 +1228,9 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
* Protect against concurrent (de-)offloading. Otherwise nocb locking * Protect against concurrent (de-)offloading. Otherwise nocb locking
* may be ignored or imbalanced. * may be ignored or imbalanced.
*/ */
if (!mutex_trylock(&rcu_state.barrier_mutex)) { if (!mutex_trylock(&rcu_state.nocb_mutex)) {
/* /*
* But really don't insist if barrier_mutex is contended since we * But really don't insist if nocb_mutex is contended since we
* can't guarantee that it will never engage in a dependency * can't guarantee that it will never engage in a dependency
* chain involving memory allocation. The lock is seldom contended * chain involving memory allocation. The lock is seldom contended
* anyway. * anyway.
@ -1330,7 +1269,7 @@ lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
break; break;
} }
mutex_unlock(&rcu_state.barrier_mutex); mutex_unlock(&rcu_state.nocb_mutex);
return count ? count : SHRINK_STOP; return count ? count : SHRINK_STOP;
} }
@ -1396,9 +1335,7 @@ void __init rcu_init_nohz(void)
rdp = per_cpu_ptr(&rcu_data, cpu); rdp = per_cpu_ptr(&rcu_data, cpu);
if (rcu_segcblist_empty(&rdp->cblist)) if (rcu_segcblist_empty(&rdp->cblist))
rcu_segcblist_init(&rdp->cblist); rcu_segcblist_init(&rdp->cblist);
rcu_segcblist_offload(&rdp->cblist, true); rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_OFFLOADED);
rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_RCU_CORE);
} }
rcu_organize_nocb_kthreads(); rcu_organize_nocb_kthreads();
} }
@ -1446,7 +1383,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
"rcuog/%d", rdp_gp->cpu); "rcuog/%d", rdp_gp->cpu);
if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) { if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
goto end; goto err;
} }
WRITE_ONCE(rdp_gp->nocb_gp_kthread, t); WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
if (kthread_prio) if (kthread_prio)
@ -1458,7 +1395,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
t = kthread_create(rcu_nocb_cb_kthread, rdp, t = kthread_create(rcu_nocb_cb_kthread, rdp,
"rcuo%c/%d", rcu_state.abbr, cpu); "rcuo%c/%d", rcu_state.abbr, cpu);
if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__)) if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
goto end; goto err;
if (rcu_rdp_is_offloaded(rdp)) if (rcu_rdp_is_offloaded(rdp))
wake_up_process(t); wake_up_process(t);
@ -1471,13 +1408,21 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
WRITE_ONCE(rdp->nocb_cb_kthread, t); WRITE_ONCE(rdp->nocb_cb_kthread, t);
WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread); WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
return; return;
end:
mutex_lock(&rcu_state.barrier_mutex); err:
/*
* No need to protect against concurrent rcu_barrier()
* because the number of callbacks should be 0 for a non-boot CPU,
* therefore rcu_barrier() shouldn't even try to grab the nocb_lock.
* But hold nocb_mutex to avoid nocb_lock imbalance from shrinker.
*/
WARN_ON_ONCE(system_state > SYSTEM_BOOTING && rcu_segcblist_n_cbs(&rdp->cblist));
mutex_lock(&rcu_state.nocb_mutex);
if (rcu_rdp_is_offloaded(rdp)) { if (rcu_rdp_is_offloaded(rdp)) {
rcu_nocb_rdp_deoffload(rdp); rcu_nocb_rdp_deoffload(rdp);
cpumask_clear_cpu(cpu, rcu_nocb_mask); cpumask_clear_cpu(cpu, rcu_nocb_mask);
} }
mutex_unlock(&rcu_state.barrier_mutex); mutex_unlock(&rcu_state.nocb_mutex);
} }
/* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */ /* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */
@ -1653,16 +1598,6 @@ static void show_rcu_nocb_state(struct rcu_data *rdp)
#else /* #ifdef CONFIG_RCU_NOCB_CPU */ #else /* #ifdef CONFIG_RCU_NOCB_CPU */
static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
{
return 0;
}
static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
{
return false;
}
/* No ->nocb_lock to acquire. */ /* No ->nocb_lock to acquire. */
static void rcu_nocb_lock(struct rcu_data *rdp) static void rcu_nocb_lock(struct rcu_data *rdp)
{ {
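As a reading aid for the tree_nocb.h hunks above: the simplified state machine keeps a single SEGCBLIST_OFFLOADED flag that is flipped under ->nocb_lock while the rdp joins or leaves the rcuog iteration list, and the requester then merely waits for that flag to change. The stand-alone toy model below (hypothetical names and types, not kernel code) sketches that single-flag toggle and the matching wait condition under an assumed per-rdp lock:

/*
 * Toy model of the simplified nocb toggle: one OFFLOADED flag, flipped
 * under a per-rdp lock, with list membership updated in the same
 * critical section so the toggler and the polling side stay in agreement.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_rdp {
	pthread_mutex_t nocb_lock;
	bool offloaded;		/* stands in for SEGCBLIST_OFFLOADED */
	bool on_gp_list;	/* stands in for nocb_entry_rdp membership */
};

/* Shaped like nocb_gp_toggle_rdp(): flip both pieces of state under the lock. */
static void toy_toggle(struct toy_rdp *rdp)
{
	pthread_mutex_lock(&rdp->nocb_lock);
	if (!rdp->offloaded) {
		/* Offloading: join the iteration list, then set the flag. */
		rdp->on_gp_list = true;
		rdp->offloaded = true;
	} else {
		/* De-offloading: leave the list, then clear the flag. */
		rdp->on_gp_list = false;
		rdp->offloaded = false;
	}
	pthread_mutex_unlock(&rdp->nocb_lock);
}

/* Shaped like rcu_nocb_rdp_deoffload_wait_cond(): sample the flag under the lock. */
static bool toy_deoffload_done(struct toy_rdp *rdp)
{
	bool done;

	pthread_mutex_lock(&rdp->nocb_lock);
	done = !rdp->offloaded;
	pthread_mutex_unlock(&rdp->nocb_lock);
	return done;
}

int main(void)
{
	struct toy_rdp rdp = { .nocb_lock = PTHREAD_MUTEX_INITIALIZER };

	toy_toggle(&rdp);	/* offload */
	printf("offloaded=%d on_list=%d\n", rdp.offloaded, rdp.on_gp_list);
	toy_toggle(&rdp);	/* de-offload */
	printf("de-offload observed: %d\n", toy_deoffload_done(&rdp));
	return 0;
}

Built with "gcc -pthread", this prints the flag after each toggle, mirroring the offload-then-deoffload sequence that rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() drive in the hunks above; it is only a sketch of the control flow, not of the kernel's actual synchronization.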

View File

@ -24,10 +24,11 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
* timers have their own means of synchronization against the * timers have their own means of synchronization against the
* offloaded state updaters. * offloaded state updaters.
*/ */
RCU_LOCKDEP_WARN( RCU_NOCB_LOCKDEP_WARN(
!(lockdep_is_held(&rcu_state.barrier_mutex) || !(lockdep_is_held(&rcu_state.barrier_mutex) ||
(IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) || (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
rcu_lockdep_is_held_nocb(rdp) || lockdep_is_held(&rdp->nocb_lock) ||
lockdep_is_held(&rcu_state.nocb_mutex) ||
(!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) && (!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) &&
rdp == this_cpu_ptr(&rcu_data)) || rdp == this_cpu_ptr(&rcu_data)) ||
rcu_current_is_nocb_kthread(rdp)), rcu_current_is_nocb_kthread(rdp)),

View File

@ -9,6 +9,7 @@
#include <linux/kvm_para.h> #include <linux/kvm_para.h>
#include <linux/rcu_notifier.h> #include <linux/rcu_notifier.h>
#include <linux/smp.h>
////////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////////
// //
@ -370,6 +371,7 @@ static void rcu_dump_cpu_stacks(void)
struct rcu_node *rnp; struct rcu_node *rnp;
rcu_for_each_leaf_node(rnp) { rcu_for_each_leaf_node(rnp) {
printk_deferred_enter();
raw_spin_lock_irqsave_rcu_node(rnp, flags); raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu) for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
@ -379,6 +381,7 @@ static void rcu_dump_cpu_stacks(void)
dump_cpu_task(cpu); dump_cpu_task(cpu);
} }
raw_spin_unlock_irqrestore_rcu_node(rnp, flags); raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
printk_deferred_exit();
} }
} }
@ -719,6 +722,9 @@ static void print_cpu_stall(unsigned long gps)
set_preempt_need_resched(); set_preempt_need_resched();
} }
static bool csd_lock_suppress_rcu_stall;
module_param(csd_lock_suppress_rcu_stall, bool, 0644);
static void check_cpu_stall(struct rcu_data *rdp) static void check_cpu_stall(struct rcu_data *rdp)
{ {
bool self_detected; bool self_detected;
@ -791,7 +797,9 @@ static void check_cpu_stall(struct rcu_data *rdp)
return; return;
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps); rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps);
if (self_detected) { if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
} else if (self_detected) {
/* We haven't checked in, so go dump stack. */ /* We haven't checked in, so go dump stack. */
print_cpu_stall(gps); print_cpu_stall(gps);
} else { } else {

View File

@ -9726,7 +9726,7 @@ struct cgroup_subsys cpu_cgrp_subsys = {
void dump_cpu_task(int cpu) void dump_cpu_task(int cpu)
{ {
if (cpu == smp_processor_id() && in_hardirq()) { if (in_hardirq() && cpu == smp_processor_id()) {
struct pt_regs *regs; struct pt_regs *regs;
regs = get_irq_regs(); regs = get_irq_regs();

View File

@ -208,12 +208,25 @@ static int csd_lock_wait_getcpu(call_single_data_t *csd)
return -1; return -1;
} }
static atomic_t n_csd_lock_stuck;
/**
* csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long?
*
* Returns @true if a CSD-lock acquisition is stuck and has been stuck
* long enough for a "non-responsive CSD lock" message to be printed.
*/
bool csd_lock_is_stuck(void)
{
return !!atomic_read(&n_csd_lock_stuck);
}
/* /*
* Complain if too much time spent waiting. Note that only * Complain if too much time spent waiting. Note that only
* the CSD_TYPE_SYNC/ASYNC types provide the destination CPU, * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
* so waiting on other types gets much less information. * so waiting on other types gets much less information.
*/ */
static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id) static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages)
{ {
int cpu = -1; int cpu = -1;
int cpux; int cpux;
@ -229,15 +242,26 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu = csd_lock_wait_getcpu(csd); cpu = csd_lock_wait_getcpu(csd);
pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n", pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
*bug_id, raw_smp_processor_id(), cpu); *bug_id, raw_smp_processor_id(), cpu);
atomic_dec(&n_csd_lock_stuck);
return true; return true;
} }
ts2 = sched_clock(); ts2 = sched_clock();
/* How long since we last checked for a stuck CSD lock.*/ /* How long since we last checked for a stuck CSD lock.*/
ts_delta = ts2 - *ts1; ts_delta = ts2 - *ts1;
if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0)) if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) *
(!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) ||
csd_lock_timeout_ns == 0))
return false; return false;
if (ts0 > ts2) {
/* Our own sched_clock went backward; don't blame another CPU. */
ts_delta = ts0 - ts2;
pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta);
*ts1 = ts2;
return false;
}
firsttime = !*bug_id; firsttime = !*bug_id;
if (firsttime) if (firsttime)
*bug_id = atomic_inc_return(&csd_bug_count); *bug_id = atomic_inc_return(&csd_bug_count);
@ -249,9 +273,12 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */ cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
/* How long since this CSD lock was stuck. */ /* How long since this CSD lock was stuck. */
ts_delta = ts2 - ts0; ts_delta = ts2 - ts0;
pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n", pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta, firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
cpu, csd->func, csd->info); cpu, csd->func, csd->info);
(*nmessages)++;
if (firsttime)
atomic_inc(&n_csd_lock_stuck);
/* /*
* If the CSD lock is still stuck after 5 minutes, it is unlikely * If the CSD lock is still stuck after 5 minutes, it is unlikely
* to become unstuck. Use a signed comparison to avoid triggering * to become unstuck. Use a signed comparison to avoid triggering
@ -290,12 +317,13 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
*/ */
static void __csd_lock_wait(call_single_data_t *csd) static void __csd_lock_wait(call_single_data_t *csd)
{ {
unsigned long nmessages = 0;
int bug_id = 0; int bug_id = 0;
u64 ts0, ts1; u64 ts0, ts1;
ts1 = ts0 = sched_clock(); ts1 = ts0 = sched_clock();
for (;;) { for (;;) {
if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id)) if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
break; break;
cpu_relax(); cpu_relax();
} }
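To make the new complaint-interval scaling concrete: the check above lets the next "non-responsive CSD lock" message fire only after csd_lock_timeout_ns * (nmessages + 1) nanoseconds, multiplied for every message after the first by ilog2(num_online_cpus()) / 2 + 1. Below is a stand-alone sketch of that arithmetic (assuming the default csd_lock_timeout of 5000 ms and 64 online CPUs; ilog2_u() is a local stand-in for the kernel's ilog2()):

/*
 * Stand-alone illustration of the CSD-lock report backoff; the timeout
 * and CPU count are illustrative assumptions, not kernel defaults you
 * should rely on here.
 */
#include <stdio.h>

static unsigned int ilog2_u(unsigned long x)
{
	unsigned int r = 0;

	while (x >>= 1)
		r++;
	return r;
}

int main(void)
{
	unsigned long long timeout_ns = 5000ULL * 1000 * 1000;	/* assumed csd_lock_timeout=5000 ms */
	unsigned long ncpus = 64;				/* assumed num_online_cpus() */
	unsigned long nmessages;

	for (nmessages = 0; nmessages < 3; nmessages++) {
		unsigned long long wait_ns = timeout_ns * (nmessages + 1) *
			(!nmessages ? 1 : (ilog2_u(ncpus) / 2 + 1));

		printf("complaint #%lu fires after a further %llu s\n",
		       nmessages + 1, wait_ns / 1000000000ULL);
	}
	return 0;
}

With those assumptions it prints thresholds of 5 s, 40 s, and 60 s for the first three complaints, so repeat messages back off sharply as the CPU count grows.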

View File

@ -1614,6 +1614,7 @@ config SCF_TORTURE_TEST
config CSD_LOCK_WAIT_DEBUG config CSD_LOCK_WAIT_DEBUG
bool "Debugging for csd_lock_wait(), called from smp_call_function*()" bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
depends on DEBUG_KERNEL depends on DEBUG_KERNEL
depends on SMP
depends on 64BIT depends on 64BIT
default n default n
help help

View File

@ -21,12 +21,10 @@ fi
bpftrace -e 'kprobe:kvfree_call_rcu, bpftrace -e 'kprobe:kvfree_call_rcu,
kprobe:call_rcu, kprobe:call_rcu,
kprobe:call_rcu_tasks, kprobe:call_rcu_tasks,
kprobe:call_rcu_tasks_rude,
kprobe:call_rcu_tasks_trace, kprobe:call_rcu_tasks_trace,
kprobe:call_srcu, kprobe:call_srcu,
kprobe:rcu_barrier, kprobe:rcu_barrier,
kprobe:rcu_barrier_tasks, kprobe:rcu_barrier_tasks,
kprobe:rcu_barrier_tasks_rude,
kprobe:rcu_barrier_tasks_trace, kprobe:rcu_barrier_tasks_trace,
kprobe:srcu_barrier, kprobe:srcu_barrier,
kprobe:synchronize_rcu, kprobe:synchronize_rcu,

View File

@ -68,6 +68,8 @@ config_override_param "--gdb options" KcList "$TORTURE_KCONFIG_GDB_ARG"
config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG" config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG"
config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG" config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG"
config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG" config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG"
config_override_param "$config_dir/CFcommon.$(uname -m)" KcList \
"`cat $config_dir/CFcommon.$(uname -m) 2> /dev/null`"
cp $T/KcList $resdir/ConfigFragment cp $T/KcList $resdir/ConfigFragment
base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'` base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`

View File

@ -19,10 +19,10 @@ PATH=${RCUTORTURE}/bin:$PATH; export PATH
TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`" TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`"
MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2)) MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2))
HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2)) SCALE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
if test "$HALF_ALLOTED_CPUS" -lt 1 if test "$SCALE_ALLOTED_CPUS" -lt 1
then then
HALF_ALLOTED_CPUS=1 SCALE_ALLOTED_CPUS=1
fi fi
VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16)) VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16))
if test "$VERBOSE_BATCH_CPUS" -lt 2 if test "$VERBOSE_BATCH_CPUS" -lt 2
@ -90,6 +90,7 @@ usage () {
echo " --do-scftorture / --do-no-scftorture / --no-scftorture" echo " --do-scftorture / --do-no-scftorture / --no-scftorture"
echo " --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep" echo " --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep"
echo " --duration [ <minutes> | <hours>h | <days>d ]" echo " --duration [ <minutes> | <hours>h | <days>d ]"
echo " --guest-cpu-limit N"
echo " --kcsan-kmake-arg kernel-make-arguments" echo " --kcsan-kmake-arg kernel-make-arguments"
exit 1 exit 1
} }
@ -203,6 +204,21 @@ do
duration_base=$(($ts*mult)) duration_base=$(($ts*mult))
shift shift
;; ;;
--guest-cpu-limit|--guest-cpu-lim)
checkarg --guest-cpu-limit "(number)" "$#" "$2" '^[0-9]*$' '^--'
if (("$2" <= "$TORTURE_ALLOTED_CPUS" / 2))
then
SCALE_ALLOTED_CPUS="$2"
VERBOSE_BATCH_CPUS="$((SCALE_ALLOTED_CPUS/8))"
if (("$VERBOSE_BATCH_CPUS" < 2))
then
VERBOSE_BATCH_CPUS=0
fi
else
echo "Ignoring value of $2 for --guest-cpu-limit which is greater than (("$TORTURE_ALLOTED_CPUS" / 2))."
fi
shift
;;
--kcsan-kmake-arg|--kcsan-kmake-args) --kcsan-kmake-arg|--kcsan-kmake-args)
checkarg --kcsan-kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$' checkarg --kcsan-kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
kcsan_kmake_args="`echo "$kcsan_kmake_args $2" | sed -e 's/^ *//' -e 's/ *$//'`" kcsan_kmake_args="`echo "$kcsan_kmake_args $2" | sed -e 's/^ *//' -e 's/ *$//'`"
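For example (hypothetical numbers): on a host where identify_qemu_vcpus reports 16 CPUs, "--guest-cpu-limit 8" is accepted because 8 is no more than 16/2, so SCALE_ALLOTED_CPUS becomes 8 and VERBOSE_BATCH_CPUS becomes 8/8 = 1, which is then forced to 0 because it is below 2; a request for 12 would exceed 16/2 and be ignored with the message above.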
@ -425,9 +441,9 @@ fi
if test "$do_scftorture" = "yes" if test "$do_scftorture" = "yes"
then then
# Scale memory based on the number of CPUs. # Scale memory based on the number of CPUs.
scfmem=$((3+HALF_ALLOTED_CPUS/16)) scfmem=$((3+SCALE_ALLOTED_CPUS/16))
torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1" torture_bootargs="scftorture.nthreads=$SCALE_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
fi fi
if test "$do_rt" = "yes" if test "$do_rt" = "yes"
@ -471,8 +487,8 @@ for prim in $primlist
do do
if test -n "$firsttime" if test -n "$firsttime"
then then
torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot" torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$SCALE_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
mv $T/last-resdir-nodebug $T/first-resdir-nodebug || : mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
if test -f "$T/last-resdir-kasan" if test -f "$T/last-resdir-kasan"
then then
@ -520,8 +536,8 @@ for prim in $primlist
do do
if test -n "$firsttime" if test -n "$firsttime"
then then
torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot" torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$SCALE_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --trust-make
mv $T/last-resdir-nodebug $T/first-resdir-nodebug || : mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
if test -f "$T/last-resdir-kasan" if test -f "$T/last-resdir-kasan"
then then
@ -559,7 +575,7 @@ do_kcsan="$do_kcsan_save"
if test "$do_kvfree" = "yes" if test "$do_kvfree" = "yes"
then then
torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory 2G --trust-make
fi fi
if test "$do_clocksourcewd" = "yes" if test "$do_clocksourcewd" = "yes"

View File

@ -1,7 +1,5 @@
CONFIG_RCU_TORTURE_TEST=y CONFIG_RCU_TORTURE_TEST=y
CONFIG_PRINTK_TIME=y CONFIG_PRINTK_TIME=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y CONFIG_PARAVIRT=y
CONFIG_KVM_GUEST=y
CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n
CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n

View File

@ -0,0 +1,2 @@
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM_GUEST=y

View File

@ -0,0 +1 @@
CONFIG_KVM_GUEST=y

View File

@ -0,0 +1,2 @@
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM_GUEST=y

View File

@ -2,3 +2,4 @@ nohz_full=2-9
rcutorture.stall_cpu=14 rcutorture.stall_cpu=14
rcutorture.stall_cpu_holdoff=90 rcutorture.stall_cpu_holdoff=90
rcutorture.fwd_progress=0 rcutorture.fwd_progress=0
rcutree.nohz_full_patience_delay=1000

View File

@ -0,0 +1,20 @@
CONFIG_SMP=n
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=n
CONFIG_PREEMPT_DYNAMIC=n
#CHECK#CONFIG_PREEMPT_RCU=n
CONFIG_HZ_PERIODIC=n
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ_FULL=n
CONFIG_HOTPLUG_CPU=n
CONFIG_SUSPEND=n
CONFIG_HIBERNATION=n
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_PROVE_LOCKING=n
CONFIG_RCU_BOOST=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y
CONFIG_KPROBES=n
CONFIG_FTRACE=n