Event tracing is moving to SRCU in order to take advantage of the fact
that SRCU may be safely used from idle and even offline CPUs. However,
event tracing can invoke call_srcu() very early in the boot process,
even before workqueue_init_early() is invoked (let alone rcu_init()).
Therefore, call_srcu()'s attempts to queue work fail miserably.
This commit therefore detects this situation, and refrains from attempting
to queue work before rcu_init() time, but does everything else that it
would have done, and in addition, adds the srcu_struct to a global list.
The rcu_init() function now invokes a new srcu_init() function, which
is empty if CONFIG_SRCU=n. Otherwise, srcu_init() queues work for
each srcu_struct on the list. This all happens early enough in boot
that there is but a single CPU with interrupts disabled, which allows
synchronization to be dispensed with.
Of course, the queued work won't actually be invoked until after
workqueue_init() is invoked, which happens shortly after the scheduler
is up and running. This means that although call_srcu() may be invoked
any time after per-CPU variables have been set up, there is still a very
narrow window when synchronize_srcu() won't work, and this window
extends from the time that the scheduler starts until the time that
workqueue_init() returns. This can be fixed in a manner similar to
the fix for synchronize_rcu_expedited() and friends, but until someone
actually needs to use synchronize_srcu() during this window, this fix
is added churn for no benefit.
Finally, note that Tree SRCU's new srcu_init() function invokes
queue_work() rather than the queue_delayed_work() function that is
invoked post-boot. The reason is that queue_delayed_work() will (as you
would expect) post a timer, and timers have not yet been initialized.
So use of queue_work() avoids the complaints about use of uninitialized
spinlocks that would otherwise result. Besides, some delay is already
provide by the aforementioned fact that the queued work won't actually
be invoked until after the scheduler is up and running.
Requested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
1e64b15a4b ("rcu: Fix grace-period hangs due to race with CPU offline")
added spinlock_t ofl_lock to the rcu_state structure, then takes it with
preemption disabled during CPU offline, which gives the -rt patchset's
sleeping spinlock heartburn.
This commit therefore converts ->ofl_lock to raw_spinlock_t.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
The rcu_data structure's ->dynticks_fqs is incremented but never
accesses. Its ->cond_resched_completed field isn't used at all.
This commit therefore removes both fields.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit move ->dynticks from the rcu_dynticks structure to the
rcu_data structure, replacing the field of the same name. It also updates
the code to access ->dynticks from the rcu_data structure and to use the
rcu_data structure rather than following to now-gone ->dynticks field
to the now-gone rcu_dynticks structure. While in the area, this commit
also fixes up comments.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit removes ->dynticks_nesting and ->dynticks_nmi_nesting from
the rcu_dynticks structure and updates the code to access them from the
rcu_data structure.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit removes ->rcu_need_heavy_qs and ->rcu_urgent_qs from the
rcu_dynticks structure and updates the code to access them from the
rcu_data structure.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The resched_cpu() interface is quite handy, but it does acquire the
specified CPU's runqueue lock, which does not come for free. This
commit therefore substitutes the following when directing resched_cpu()
at the current CPU:
set_tsk_need_resched(current);
set_preempt_need_resched();
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Because nohz_full CPUs can leave the scheduler-clock interrupt disabled
even when in kernel mode, RCU cannot rely on rcu_check_callbacks() to
enlist the scheduler's aid in extracting a quiescent state from such CPUs.
This commit therefore more aggressively uses resched_cpu() on nohz_full
CPUs that fail to pass through a quiescent state in a timely manner.
By default, the resched_cpu() beating starts 300 milliseconds into the
quiescent state.
While in the neighborhood, add a ->last_fqs_resched field to the rcu_data
structure in order to rate-limit resched_cpu() calls from the RCU
grace-period kthread.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The jiffies_till_sched_qs value used to determine how old a grace period
must be before RCU enlists the help of the scheduler to force a quiescent
state on the holdout CPU. Currently, this defaults to HZ/10 regardless of
system size and may be set only at boot time. This can be a problem for
very large systems, because if the values of the jiffies_till_first_fqs
and jiffies_till_next_fqs kernel parameters are left at their defaults,
they are calculated to increase as the number of CPUs actually configured
on the system increases. Thus, on a sufficiently large system, RCU would
enlist the help of the scheduler before the grace-period kthread had a
chance to scan for idle CPUs, which wastes CPU time.
This commit therefore allows jiffies_till_sched_qs to be set, if desired,
but if left as default, computes is as jiffies_till_first_fqs plus twice
jiffies_till_next_fqs, thus allowing three force-quiescent-state scans
for idle CPUs. This scales with the number of CPUs, providing sensible
default values.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The ->rcu_qs_ctr counter was intended to allow providing a lightweight
report of a quiescent state to all RCU flavors. But now that there is
only one flavor of RCU in any one running kernel, there is no point in
having this feature. This commit therefore removes the ->rcu_qs_ctr
field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field
from the rcu_data structure. This results in the "rqc" option to the
rcu_fqs trace event no longer being used, so this commit also removes the
"rqc" description from the header comment.
While in the neighborhood, this commit also causes the forward-progress
request .rcu_need_heavy_qs be set one jiffies_till_sched_qs interval
later in the grace period than the first setting of .rcu_urgent_qs.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The patch making need_resched() respond to urgent RCU-QS needs used
is_idle_task(current) to detect an interrupt from idle, which does work
reasonably, but is (in theory at least) vulnerable to loops containing
need_resched() invoked from within RCU_NONIDLE() or its tracepoint
equivalent. This commit therefore moves rcu_is_cpu_rrupt_from_idle()
to a place from which rcu_check_callbacks() can invoke it and replaces
the is_idle_task(current) with rcu_is_cpu_rrupt_from_idle().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent
need for an RCU quiescent state from the force-quiescent-state processing
within the grace-period kthread to context switches and to cond_resched().
Unfortunately, such urgent needs are not communicated to need_resched(),
which is sometimes used to decide when to invoke cond_resched(), for
but one example, within the KVM vcpu_run() function. As of v4.15, this
can result in synchronize_sched() being delayed by up to ten seconds,
which can be problematic, to say nothing of annoying.
This commit therefore checks rcu_dynticks.rcu_urgent_qs from within
rcu_check_callbacks(), which is invoked from the scheduling-clock
interrupt handler. If the current task is not an idle task and is
not executing in usermode, a context switch is forced, and either way,
the rcu_dynticks.rcu_urgent_qs variable is set to false. If the current
task is an idle task, then RCU's dyntick-idle code will detect the
quiescent state, so no further action is required. Similarly, if the
task is executing in usermode, other code in rcu_check_callbacks() and
its called functions will report the corresponding quiescent state.
Reported-by: Marius Hillenbrand <mhillenb@amazon.de>
Reported-by: David Woodhouse <dwmw2@infradead.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because rcu_barrier() is a one-line wrapper function for _rcu_barrier()
and because nothing else calls _rcu_barrier(), this commit inlines
_rcu_barrier() into rcu_barrier().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that rcu_all_qs() is used only in !PREEMPT builds, move it to
tree_plugin.h so that it is defined only in those builds. This in
turn means that rcu_momentary_dyntick_idle() is only used in !PREEMPT
builds, but it is simply marked __maybe_unused in order to keep it
near the rest of the dyntick-idle code.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit removes rcu_sched_get_gp_seq(), rcu_bh_get_gp_seq(),
rcu_exp_batches_completed_sched(), rcu_sched_force_quiescent_state(),
and rcu_bh_force_quiescent_state(), which are no longer used because
rcutorture no longer does "rcu_bh" and "rcu_sched" torture types.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit saves a few lines by consolidating the RCU-sched function
definitions at the end of include/linux/rcupdate.h. This consolidation
also makes it easier to remove them all when the time comes.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit saves a few lines by consolidating the RCU-bh function
definitions at the end of include/linux/rcupdate.h. This consolidation
also makes it easier to remove them all when the time comes.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_gp_kthread() function is long and deeply indented, so this
commit pulls the loop that repeatedly invokes rcu_gp_fqs() into a new
rcu_gp_fqs_loop() function.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Consolidation of the RCU flavors into one makes increment_cpu_stall_ticks()
a trivial one-line function with only one caller. This commit therefore
inlines it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pointers to rcu_data structures should be named rdp, not rsp. This
commit therefore makes this change.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it. This commit therefore replaces rsp with
&rcu_state in rcu_cpu_starting() and rcu_init_one().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it. This commit therefore replaces rsp
with &rcu_state in rcu_barrier_callback(), rcu_barrier_func(), and
_rcu_barrier().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it. This commit therefore replaces rsp with
&rcu_state in rcu_report_qs_rnp(), force_quiescent_state(), and
rcu_check_gp_start_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point in
maintaining a pointer to it. This commit therefore replaces rsp with
&rcu_state in rcu_do_batch(), invoke_rcu_callbacks(), and __call_rcu().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it. This commit therefore replaces
rsp with &rcu_state in rcu_start_this_gp(), rcu_accelerate_cbs(),
__note_gp_changes(), rcu_gp_init(), rcu_gp_fqs(), rcu_gp_cleanup(),
rcu_gp_kthread(), and rcu_report_qs_rsp().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is less point
in maintaining a pointer to it. This commit therefore replaces rsp
with &rcu_state in print_other_cpu_stall(), print_cpu_stall(), and
check_cpu_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit removes the rsp and gpa local variables, repurposes the j
local variable and adds a gpk (GP kthread) local to improve readability.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit restructures rcutorture_get_gp_data() to take advantage of
the fact that there is only one flavor of RCU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only ever a single flavor of RCU in a given kernel
build, there isn't a whole lot of point in having a flavor-traversal
macro. This commit therefore removes it and converts calls to it to
straightline code, inlining trivial functions as appropriate.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that there is only one rcu_state structure, there is no need for the
rcu_data structure to indicate which it corresponds to. This commit
therefore removes the rcu_data structure's ->rsp field, replacing all
remaining uses of it with &rcu_state.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the Linux
kernel, so there is no need to pass it as a parameter to RCU's rcu_node
tree's accessor macros. This commit therefore removes the rsp parameter
from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp
local variables while in the area.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to
RCU's functions. This commit therefore removes the rsp parameter
from the code in kernel/rcu/tree_exp.h, and removes all of the
rsp local variables while in the area.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to
RCU's functions. This commit therefore removes the rsp parameter
from rcu_nocb_cpu_needs_barrier(), rcu_spawn_one_nocb_kthread(),
rcu_organize_nocb_kthreads(), rcu_nocb_cpu_needs_barrier(), and
rcu_nohz_full_cpu().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
print_cpu_stall_info().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
dump_blkd_tasks() and rcu_preempt_blocked_readers_cgp().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_print_detail_task_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_init_one() and rcu_dump_rcu_node_tree().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of
the Linux kernel, so there is no need to pass it as a parameter
to RCU's functions. This commit therefore removes the rsp
parameter from rcu_boot_init_percpu_data(), rcu_init_percpu_data(),
rcu_cleanup_dying_idle_cpu(), and rcu_migrate_callbacks(). While in
the neighborhood, line the last three into rcutree_prepare_cpu(),
rcu_report_dead() and rcutree_migrate_callbacks(), respectively.
This also gets rid of the for_each_rcu_flavor() calls that were in
those tree functions.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
_rcu_barrier_trace() and _rcu_barrier().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the Linux
kernel, so there is no need to pass it as a parameter to RCU's functions.
This commit therefore removes the rsp parameter from __rcu_pending(),
and also inlines it into rcu_pending(), removing the for_each_rcu_flavor()
while in the neighborhood..
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
__call_rcu_core() and __call_rcu().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
__rcu_process_callbacks(), and also inlines it into rcu_process_callbacks(),
removing the for_each_rcu_flavor() while in the neighborhood.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_check_gp_start_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
force_qs_rnp() and force_quiescent_state().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_do_batch().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_cleanup_dying_cpu() and rcu_cleanup_dead_cpu(). And, as long as
we are in the neighborhood, inlines them into rcutree_dying_cpu() and
rcutree_dead_cpu(), respectively. This also eliminates a pair of
for_each_rcu_flavor() loops.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_check_quiescent_state().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_gp_init(), rcu_gp_fqs_check_wake(), rcu_gp_fqs(), rcu_gp_cleanup(),
and rcu_gp_kthread().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_gp_slow().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
note_gp_changes().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
__note_gp_changes().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_advance_cbs().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_accelerate_cbs_unlocked().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_accelerate_cbs().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_gp_kthread_wake().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_future_gp_cleanup().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
check_cpu_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
print_cpu_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
print_other_cpu_stall().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_stall_kick_kthreads().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_dump_cpu_stacks().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_check_gp_kthread_starvation().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
record_gp_stall_check_time().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_get_root().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_gp_in_progress().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_report_qs_rdp().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_report_unblock_qs_rnp(), which is particularly appropriate in
this case given that this parameter is no longer used.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_report_qs_rsp().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
There now is only one rcu_state structure in a given build of the
Linux kernel, so there is no need to pass it as a parameter to RCU's
functions. This commit therefore removes the rsp parameter from
rcu_report_qs_rnp().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_data_p pointer references the default set of per-CPU rcu_data
structures, that is, those that call_rcu() uses, as opposed to
call_rcu_bh() and sometimes call_rcu_sched(). But there is now only one
set of per-CPU rcu_data structures, so that one set is by definition
the default, which means that the rcu_data_p pointer no longer serves
any useful purpose. This commit therefore removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_state_p pointer references the default rcu_state structure,
that is, the one that call_rcu() uses, as opposed to call_rcu_bh()
and sometimes call_rcu_sched(). But there is now only one rcu_state
structure, so that one structure is by definition the default, which
means that the rcu_state_p pointer no longer serves any useful purpose.
This commit therefore removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_state structure's ->rda field was used to find the per-CPU
rcu_data structures corresponding to that rcu_state structure. But now
there is only one rcu_state structure (creatively named "rcu_state")
and one set of per-CPU rcu_data structures (creatively named "rcu_data").
Therefore, uses of the ->rda field can always be replaced by "rcu_data,
and this commit makes that change and removes the ->rda field.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_state structure's ->call field references the corresponding RCU
flavor's call_rcu() function. However, now that there is only ever one
rcu_state structure in a given build of the Linux kernel, and that flavor
uses plain old call_rcu(), there is not a lot of point in continuing to
have the ->call field. This commit therefore removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that a given build of the Linux kernel has only one set of rcu_state,
rcu_node, and rcu_data structures, there is no point in creating a macro
to declare and compile-time initialize them. This commit therefore
just does normal declaration and compile-time initialization of these
structures. While in the area, this commit also removes #ifndefs of
the no-longer-ever-defined preprocessor macro RCU_TREE_NONCORE.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that RCU-preempt knows about preemption disabling, its implementation
of synchronize_rcu() works for synchronize_sched(), and likewise for the
other RCU-sched update-side API members. This commit therefore confines
the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines
RCU-sched's update-side API members in terms of those of RCU-preempt.
This means that any given build of the Linux kernel has only one
update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds
and RCU-sched for CONFIG_PREEMPT=n builds. This in turn means that kernels
built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
The rcu_report_exp_rdp() function is always invoked with its "wake"
argument set to "true", so this commit drops this parameter. The only
potential call site that would use "false" is in the code driving the
expedited grace period, and that code uses rcu_report_exp_cpu_mult()
instead, which therefore retains its "wake" parameter.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit updates comments and help text to account for the fact that
RCU-bh update-side functions are now simple wrappers for their RCU or
RCU-sched counterparts.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Now that the main RCU API knows about softirq disabling and softirq's
quiescent states, the RCU-bh update code can be dispensed with.
This commit therefore removes the RCU-bh update-side implementation and
defines RCU-bh's update-side API in terms of that of either RCU-preempt or
RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option.
In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect
of reducing by one the number of rcuo kthreads per CPU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
One necessary step towards consolidating the three flavors of RCU is to
make sure that the resulting consolidated "one flavor to rule them all"
correctly handles networking denial-of-service attacks. One thing that
allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every
so often, and so something similar has to happen for consolidated RCU.
This must be done carefully. For example, if a preemption-disabled
region of code takes an interrupt which does softirq processing before
returning, consolidated RCU must ignore the resulting rcu_bh_qs()
invocations -- preemption is still disabled, and that means an RCU
reader for the consolidated flavor.
This commit therefore creates a new rcu_softirq_qs() that is called only
from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region
problem. This new rcu_softirq_qs() function invokes rcu_sched_qs(),
rcu_preempt_qs(), and rcu_preempt_deferred_qs(). The latter call handles
any deferred quiescent states.
Note that __do_softirq() still invokes rcu_bh_qs(). It will continue to
do so until a later stage of cleanup when the RCU-bh flavor is removed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix !SMP issue located by kbuild test robot. ]
RCU's dyntick-idle code is written to tolerate half-interrupts, that it,
either an interrupt that invokes rcu_irq_enter() but never invokes the
corresponding rcu_irq_exit() on the one hand, or an interrupt that never
invokes rcu_irq_enter() but does invoke the "corresponding" rcu_irq_exit()
on the other. These things really did happen at one time, as evidenced
by this ca-2011 LKML post:
http://lkml.kernel.org/r/20111014170019.GE2428@linux.vnet.ibm.com
The reason why RCU tolerates half-interrupts is that usermode helpers
used exceptions to invoke a system call from within the kernel such that
the system call did a normal return (not a return from exception) to
the calling context. This caused rcu_irq_enter() to be invoked without
a matching rcu_irq_exit(). However, usermode helpers have since been
rewritten to make much more housebroken use of workqueues, kernel threads,
and do_execve(), and therefore should no longer produce half-interrupts.
No one knows of any other source of half-interrupts, but then again,
no one seems insane enough to go audit the entire kernel to verify that
half-interrupts really are a relic of the past.
This commit therefore adds a pair of WARN_ON_ONCE() calls that will
trigger in the presence of half interrupts, which the code will continue
to handle correctly. If neither of these WARN_ON_ONCE() trigger by
mid-2021, then perhaps RCU can stop handling half-interrupts, which
would be a considerable simplification.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
This commit defers reporting of RCU-preempt quiescent states at
rcu_read_unlock_special() time when any of interrupts, softirq, or
preemption are disabled. These deferred quiescent states are reported
at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
offline operation. Of course, if another RCU read-side critical
section has started in the meantime, the reporting of the quiescent
state will be further deferred.
This also means that disabling preemption, interrupts, and/or
softirqs will act as an RCU-preempt read-side critical section.
This is enforced by checking preempt_count() as needed.
Some special cases must be handled on an ad-hoc basis, for example,
context switch is a quiescent state even though both the scheduler and
do_exit() disable preemption. In these cases, additional calls to
rcu_preempt_deferred_qs() override the preemption disabling. Similar
logic overrides disabled interrupts in rcu_preempt_check_callbacks()
because in this case the quiescent state happened just before the
corresponding scheduling-clock interrupt.
In theory, this change lifts a long-standing restriction that required
that if interrupts were disabled across a call to rcu_read_unlock()
that the matching rcu_read_lock() also be contained within that
interrupts-disabled region of code. Because the reporting of the
corresponding RCU-preempt quiescent state is now deferred until
after interrupts have been enabled, it is no longer possible for this
situation to result in deadlocks involving the scheduler's runqueue and
priority-inheritance locks. This may allow some code simplification that
might reduce interrupt latency a bit. Unfortunately, in practice this
would also defer deboosting a low-priority task that had been subjected
to RCU priority boosting, so real-time-response considerations might
well force this restriction to remain in place.
Because RCU-preempt grace periods are now blocked not only by RCU
read-side critical sections, but also by disabling of interrupts,
preemption, and softirqs, it will be possible to eliminate RCU-bh and
RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels. This may
require some additional plumbing to provide the network denial-of-service
guarantees that have been traditionally provided by RCU-bh. Once these
are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
into RCU-sched. This would mean that all kernels would have but
one flavor of RCU, which would open the door to significant code
cleanup.
Moving to a single flavor of RCU would also have the beneficial effect
of reducing the NOCB kthreads by at least a factor of two.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
from Joel Fernandes. ]
[ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
response to bug reports from kbuild test robot. ]
[ paulmck: Fix bug located by kbuild test robot involving recursion
via rcu_preempt_deferred_qs(). ]
When entering or exiting irq or NMI handlers, the current code uses
->dynticks_nmi_nesting to detect if it is in the outermost handler,
that is, the one interrupting or returning to an RCU-idle context (the
idle loop or nohz_full usermode execution). When entering the outermost
handler via an interrupt (as opposed to NMI), it is necessary to invoke
rcu_dynticks_task_exit() just before the CPU is marked non-idle from an
RCU perspective and to invoke rcu_cleanup_after_idle() just after the
CPU is marked non-idle. Similarly, when exiting the outermost handler
via an interrupt, it is necessary to invoke rcu_prepare_for_idle() just
before marking the CPU idle and to invoke rcu_dynticks_task_enter()
just after marking the CPU idle.
The decision to execute these four functions is currently taken in
rcu_irq_enter() and rcu_irq_exit() as follows:
rcu_irq_enter()
/* A conditional branch with ->dynticks_nmi_nesting */
rcu_nmi_enter()
/* A conditional branch with ->dynticks */
/* A conditional branch with ->dynticks_nmi_nesting */
rcu_irq_exit()
/* A conditional branch with ->dynticks_nmi_nesting */
rcu_nmi_exit()
/* A conditional branch with ->dynticks_nmi_nesting */
/* A conditional branch with ->dynticks_nmi_nesting */
rcu_nmi_enter()
/* A conditional branch with ->dynticks */
rcu_nmi_exit()
/* A conditional branch with ->dynticks_nmi_nesting */
This works, but the conditional branches in rcu_irq_enter() and
rcu_irq_exit() are redundant with those in rcu_nmi_enter() and
rcu_nmi_exit(), respectively. Redundant branches are not something
we want in the to/from-idle fastpaths, so this commit refactors
rcu_{nmi,irq}_{enter,exit}() so they use a common inlined function passed
a constant argument as follows:
rcu_irq_enter() inlining rcu_nmi_enter_common(irq=true)
/* A conditional branch with ->dynticks */
rcu_irq_exit() inlining rcu_nmi_exit_common(irq=true)
/* A conditional branch with ->dynticks_nmi_nesting */
rcu_nmi_enter() inlining rcu_nmi_enter_common(irq=false)
/* A conditional branch with ->dynticks */
rcu_nmi_exit() inlining rcu_nmi_exit_common(irq=false)
/* A conditional branch with ->dynticks_nmi_nesting */
The combination of the constant function argument and the inlining allows
the compiler to discard the conditionals that previously controlled
execution of rcu_dynticks_task_exit(), rcu_cleanup_after_idle(),
rcu_prepare_for_idle(), and rcu_dynticks_task_enter(). This reduces both
the to-idle and from-idle path lengths by two conditional branches each,
and improves readability as well.
This commit also changes order of execution from this:
rcu_dynticks_task_exit();
rcu_dynticks_eqs_exit();
trace_rcu_dyntick();
rcu_cleanup_after_idle();
To this:
rcu_dynticks_task_exit();
rcu_dynticks_eqs_exit();
rcu_cleanup_after_idle();
trace_rcu_dyntick();
In other words, the calls to rcu_cleanup_after_idle() and
trace_rcu_dyntick() are reversed. This has no functional effect because
the real concern is whether a given call is before or after the call to
rcu_dynticks_eqs_exit(), and this patch does not change that. Before the
call to rcu_dynticks_eqs_exit(), RCU is not yet watching the current
CPU and after that call RCU is watching.
A similar switch in calling order happens on the idle-entry path, with
similar lack of effect for the same reasons.
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Applied Steven Rostedt feedback. ]
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull scheduler updates from Thomas Gleixner:
- Cleanup and improvement of NUMA balancing
- Refactoring and improvements to the PELT (Per Entity Load Tracking)
code
- Watchdog simplification and related cleanups
- The usual pile of small incremental fixes and improvements
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
watchdog: Reduce message verbosity
stop_machine: Reflow cpu_stop_queue_two_works()
sched/numa: Move task_numa_placement() closer to numa_migrate_preferred()
sched/numa: Use group_weights to identify if migration degrades locality
sched/numa: Update the scan period without holding the numa_group lock
sched/numa: Remove numa_has_capacity()
sched/numa: Modify migrate_swap() to accept additional parameters
sched/numa: Remove unused task_capacity from 'struct numa_stats'
sched/numa: Skip nodes that are at 'hoplimit'
sched/debug: Reverse the order of printing faults
sched/numa: Use task faults only if numa_group is not yet set up
sched/numa: Set preferred_node based on best_cpu
sched/numa: Simplify load_too_imbalanced()
sched/numa: Evaluate move once per node
sched/numa: Remove redundant field
sched/debug: Show the sum wait time of a task group
sched/fair: Remove #ifdefs from scale_rt_capacity()
sched/core: Remove get_cpu() from sched_fork()
sched/cpufreq: Clarify sugov_get_util()
sched/sysctl: Remove unused sched_time_avg_ms sysctl
...
When rcutorture is built in to the kernel, an earlier patch detects
that and raises the priority of RCU's kthreads to allow rcutorture's
RCU priority boosting tests to succeed.
However, if rcutorture is built as a module, those priorities must be
raised manually via the rcutree.kthread_prio kernel boot parameter.
If this manual step is not taken, rcutorture's RCU priority boosting
tests will fail due to kthread starvation. One approach would be to
raise the default priority, but that risks breaking existing users.
Another approach would be to allow runtime adjustment of RCU's kthread
priorities, but that introduces numerous "interesting" race conditions.
This patch therefore instead detects too-low priorities, and prints a
message and disables the RCU priority boosting tests in that case.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Back when RCU had a debugfs interface, there was a test version and
sequence number that allowed associating debugfs data with a particular
test run, where the test run started with modprobe and ended with rmmod,
which was how tests were run back on the old ABAT system within IBM.
But rcutorture testing no longer runs on ABAT, and there is no longer an
RCU debugfs interface, so there is no longer any need for test versions
and sequence numbers.
This commit therefore removes the rcutorture_record_test_transition()
and rcutorture_record_progress() functions, and along with them the
rcutorture_testseq and rcutorture_vernum variables that they update.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcutorture RCU priority boosting tests fail even with CONFIG_RCU_BOOST
set because rcutorture's threads run at the same priority as the default
RCU kthreads (RT class with priority of 1).
This patch checks if RCU torture is built into the kernel and if so,
assigns RT priority 1 to the RCU threads, allowing the rcutorture boost
tests to pass.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, the range of jiffies_till_{first,next}_fqs are checked and
adjusted on and on in the loop of rcu_gp_kthread on runtime.
However, it's enough to check them only when setting them, not every
time in the loop. So make them handled on a setting time via sysfs.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit adds any in-the-future ->gp_seq_needed fields to the
diagnostics for an rcutorture writer stall warning message.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, rcu_check_gp_start_stall() waits for one second after the first
request before complaining that a grace period has not yet started. This
was desirable while testing the conversion from ->future_gp_needed[] to
->gp_seq_needed, but it is a bit on the hair-trigger side for production
use under heavy load. This commit therefore makes this wait time be
exactly that of the RCU CPU stall warning, allowing easy adjustment of
both timeouts to suit the distribution or installation at hand.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_cpu_has_callbacks() function is now used in all configurations,
so this commit removes the __maybe_unused.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
These functions are in kernel/rcu/tree.c, which is not an include file,
so there is no problem dropping the "inline", especially given that these
functions are nowhere near a fastpath. This commit therefore delegates
the inlining decision to the compiler by dropping the "inline".
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_dynticks_momentary_idle() function is invoked only from
rcu_momentary_dyntick_idle(), and neither function is particularly
large. This commit therefore saves a few lines by inlining
rcu_dynticks_momentary_idle() into rcu_momentary_dyntick_idle().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The naming and comments associated with some RCU-tasks code make
the faulty assumption that context switches due to cond_resched()
are voluntary. As several people pointed out, this is not the case.
This commit therefore updates function names and comments to better
reflect current reality.
Reported-by: Byungchul Park <byungchul.park@lge.com>
Reported-by: Joel Fernandes <joel@joelfernandes.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit also adjusts some whitespace while in the area.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Revert string-breaking %s as requested by Andy Shevchenko. ]
We expect a quiescent state of TASKS_RCU when cond_resched_tasks_rcu_qs()
is called, no matter whether it actually be scheduled or not. However,
it currently doesn't report the quiescent state when the task enters
into __schedule() as it's called with preempt = true. So make it report
the quiescent state unconditionally when cond_resched_tasks_rcu_qs() is
called.
And in TINY_RCU, even though the quiescent state of rcu_bh also should
be reported when the tick interrupt comes from user, it doesn't. So make
it reported.
Lastly in TREE_RCU, rcu_note_voluntary_context_switch() should be
reported when the tick interrupt comes from not only user but also idle,
as an extended quiescent state.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Simplify rcutiny portion given no RCU-tasks for !PREEMPT. ]