mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-19 02:56:15 +07:00
doc: Update RCU data-structure documentation for rcu_segcblist
The rcu_segcblist data structure, which contains segmented lists of RCU callbacks, was recently added. This commit updates the documentation accordingly.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
parent 8e2a439753
commit aa123a748e
@ -19,6 +19,8 @@ to each other.
|
||||
The <tt>rcu_state</tt> Structure</a>
|
||||
<li> <a href="#The rcu_node Structure">
|
||||
The <tt>rcu_node</tt> Structure</a>
|
||||
<li> <a href="#The rcu_segcblist Structure">
|
||||
The <tt>rcu_segcblist</tt> Structure</a>
|
||||
<li> <a href="#The rcu_data Structure">
|
||||
The <tt>rcu_data</tt> Structure</a>
|
||||
<li> <a href="#The rcu_dynticks Structure">
|
||||
@ -841,6 +843,134 @@ for lockdep lock-class names.
|
||||
Finally, lines 64-66 produce an error if the maximum number of
|
||||
CPUs is too large for the specified fanout.
|
||||
|
||||
<h3><a name="The rcu_segcblist Structure">
|
||||
The <tt>rcu_segcblist</tt> Structure</a></h3>
|
||||
|
||||
The <tt>rcu_segcblist</tt> structure maintains a segmented list of
|
||||
callbacks as follows:
|
||||
|
||||
<pre>
|
||||
1 #define RCU_DONE_TAIL 0
|
||||
2 #define RCU_WAIT_TAIL 1
|
||||
3 #define RCU_NEXT_READY_TAIL 2
|
||||
4 #define RCU_NEXT_TAIL 3
|
||||
5 #define RCU_CBLIST_NSEGS 4
|
||||
6
|
||||
7 struct rcu_segcblist {
|
||||
8 struct rcu_head *head;
|
||||
9 struct rcu_head **tails[RCU_CBLIST_NSEGS];
|
||||
10 unsigned long gp_seq[RCU_CBLIST_NSEGS];
|
||||
11 long len;
|
||||
12 long len_lazy;
|
||||
13 };
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
The segments are as follows:
|
||||
|
||||
<ol>
|
||||
<li> <tt>RCU_DONE_TAIL</tt>: Callbacks whose grace periods have elapsed.
|
||||
These callbacks are ready to be invoked.
|
||||
<li> <tt>RCU_WAIT_TAIL</tt>: Callbacks that are waiting for the
|
||||
current grace period.
|
||||
Note that different CPUs can have different ideas about which
|
||||
grace period is current, hence the <tt>->gp_seq</tt> field.
|
||||
<li> <tt>RCU_NEXT_READY_TAIL</tt>: Callbacks waiting for the next
|
||||
grace period to start.
|
||||
<li> <tt>RCU_NEXT_TAIL</tt>: Callbacks that have not yet been
|
||||
associated with a grace period.
|
||||
</ol>
|
||||
|
||||
<p>
|
||||
The <tt>->head</tt> pointer references the first callback or
is <tt>NULL</tt> if the list contains no callbacks (which is
<i>not</i> the same as being empty, because it is the <tt>->len</tt>
field that determines whether callbacks remain).
|
||||
Each element of the <tt>->tails[]</tt> array references the
|
||||
<tt>->next</tt> pointer of the last callback in the corresponding
|
||||
segment of the list, or the list's <tt>->head</tt> pointer if
|
||||
that segment and all previous segments are empty.
|
||||
If the corresponding segment is empty but some previous segment is
|
||||
not empty, then the array element is identical to its predecessor.
|
||||
Older callbacks are closer to the head of the list, and new callbacks
|
||||
are added at the tail.
|
||||
This relationship between the <tt>->head</tt> pointer, the
|
||||
<tt>->tails[]</tt> array, and the callbacks is shown in this
|
||||
diagram:
|
||||
|
||||
</p><p><img src="nxtlist.svg" alt="nxtlist.svg" width="40%">
|
||||
|
||||
</p><p>In this figure, the <tt>->head</tt> pointer references the
|
||||
first
|
||||
RCU callback in the list.
|
||||
The <tt>->tails[RCU_DONE_TAIL]</tt> array element references
|
||||
the <tt>->head</tt> pointer itself, indicating that none
|
||||
of the callbacks is ready to invoke.
|
||||
The <tt>->tails[RCU_WAIT_TAIL]</tt> array element references callback
|
||||
CB 2's <tt>->next</tt> pointer, which indicates that
|
||||
CB 1 and CB 2 are both waiting on the current grace period,
|
||||
give or take possible disagreements about exactly which grace period
|
||||
is the current one.
|
||||
The <tt>->tails[RCU_NEXT_READY_TAIL]</tt> array element
|
||||
references the same RCU callback that <tt>->tails[RCU_WAIT_TAIL]</tt>
|
||||
does, which indicates that there are no callbacks waiting on the next
|
||||
RCU grace period.
|
||||
The <tt>->tails[RCU_NEXT_TAIL]</tt> array element references
|
||||
CB 4's <tt>->next</tt> pointer, indicating that all the
|
||||
remaining RCU callbacks have not yet been assigned to an RCU grace
|
||||
period.
|
||||
Note that the <tt>->tails[RCU_NEXT_TAIL]</tt> array element
|
||||
always references the last RCU callback's <tt>->next</tt> pointer
|
||||
unless the callback list is empty, in which case it references
|
||||
the <tt>->head</tt> pointer.
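<p>
The pointer relationships described above can be sketched in a small
userspace model.
This is only an illustration under stated assumptions, not the kernel's
implementation: the <tt>rcu_head</tt> structure is reduced to its
<tt>->next</tt> pointer, and the <tt>sketch_init()</tt>,
<tt>sketch_enqueue()</tt>, and <tt>sketch_segempty()</tt> helpers are
hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

#define RCU_DONE_TAIL		0
#define RCU_WAIT_TAIL		1
#define RCU_NEXT_READY_TAIL	2
#define RCU_NEXT_TAIL		3
#define RCU_CBLIST_NSEGS	4

/* Assumption: only the ->next pointer matters for this sketch. */
struct rcu_head {
	struct rcu_head *next;
};

struct rcu_segcblist {
	struct rcu_head *head;
	struct rcu_head **tails[RCU_CBLIST_NSEGS];
	unsigned long gp_seq[RCU_CBLIST_NSEGS];
	long len;
	long len_lazy;
};

/* Hypothetical helper: all segments empty, so every tail references ->head. */
static void sketch_init(struct rcu_segcblist *l)
{
	int i;

	l->head = NULL;
	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
		l->tails[i] = &l->head;
		l->gp_seq[i] = 0;
	}
	l->len = 0;
	l->len_lazy = 0;
}

/* Hypothetical helper: new callbacks are added at the tail (RCU_NEXT_TAIL). */
static void sketch_enqueue(struct rcu_segcblist *l, struct rcu_head *rhp)
{
	rhp->next = NULL;
	*l->tails[RCU_NEXT_TAIL] = rhp;		/* Link after the last callback. */
	l->tails[RCU_NEXT_TAIL] = &rhp->next;	/* Tail now references its ->next. */
	l->len++;
}

/*
 * Hypothetical helper: a segment is empty if its tail is identical to
 * its predecessor's tail, or to &->head for the first segment.
 */
static int sketch_segempty(const struct rcu_segcblist *l, int seg)
{
	if (seg == RCU_DONE_TAIL)
		return l->tails[seg] == &l->head;
	return l->tails[seg] == l->tails[seg - 1];
}
```

After two calls to <tt>sketch_enqueue()</tt> on a freshly initialized
list, <tt>->head</tt> references the first callback,
<tt>->tails[RCU_NEXT_TAIL]</tt> references the second callback's
<tt>->next</tt> pointer, and the other three tails still reference
<tt>->head</tt>, so that only the <tt>RCU_NEXT_TAIL</tt> segment is
non-empty.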
|
||||
|
||||
<p>
|
||||
There is one additional important special case for the
|
||||
<tt>->tails[RCU_NEXT_TAIL]</tt> array element: It can be <tt>NULL</tt>
|
||||
when this list is <i>disabled</i>.
|
||||
Lists are disabled when the corresponding CPU is offline or when
|
||||
the corresponding CPU's callbacks are offloaded to a kthread,
|
||||
both of which are described elsewhere.
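<p>
The disabled special case can be modeled as follows.
This is a minimal userspace sketch, assuming a reduced
<tt>rcu_head</tt> structure and hypothetical helpers
(<tt>sketch_init()</tt>, <tt>sketch_disable()</tt>,
<tt>sketch_is_enabled()</tt>, and <tt>sketch_try_enqueue()</tt>).
It also assumes that a list is disabled only while it holds no
callbacks, and it does not model what happens to callbacks that are
posted while the list is disabled.

```c
#include <assert.h>
#include <stddef.h>

#define RCU_DONE_TAIL		0
#define RCU_WAIT_TAIL		1
#define RCU_NEXT_READY_TAIL	2
#define RCU_NEXT_TAIL		3
#define RCU_CBLIST_NSEGS	4

/* Assumption: only the ->next pointer matters for this sketch. */
struct rcu_head {
	struct rcu_head *next;
};

struct rcu_segcblist {
	struct rcu_head *head;
	struct rcu_head **tails[RCU_CBLIST_NSEGS];
	unsigned long gp_seq[RCU_CBLIST_NSEGS];
	long len;
	long len_lazy;
};

/* Hypothetical helper: enabled and empty, every tail references ->head. */
static void sketch_init(struct rcu_segcblist *l)
{
	int i;

	l->head = NULL;
	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
		l->tails[i] = &l->head;
		l->gp_seq[i] = 0;
	}
	l->len = 0;
	l->len_lazy = 0;
}

/* Hypothetical helper: a NULL ->tails[RCU_NEXT_TAIL] marks the list disabled. */
static void sketch_disable(struct rcu_segcblist *l)
{
	assert(l->len == 0);	/* Assumption: only empty lists are disabled. */
	l->tails[RCU_NEXT_TAIL] = NULL;
}

static int sketch_is_enabled(const struct rcu_segcblist *l)
{
	return l->tails[RCU_NEXT_TAIL] != NULL;
}

/* Hypothetical helper: returns 0 on success, -1 if the list is disabled. */
static int sketch_try_enqueue(struct rcu_segcblist *l, struct rcu_head *rhp)
{
	if (!sketch_is_enabled(l))
		return -1;	/* Must not dereference the NULL tail pointer. */
	rhp->next = NULL;
	*l->tails[RCU_NEXT_TAIL] = rhp;
	l->tails[RCU_NEXT_TAIL] = &rhp->next;
	l->len++;
	return 0;
}
```

The point of the check in <tt>sketch_try_enqueue()</tt> is that code
adding callbacks cannot blindly store through
<tt>->tails[RCU_NEXT_TAIL]</tt>, because that element is
<tt>NULL</tt> for a disabled list.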
|
||||
|
||||
</p><p>CPUs advance their callbacks from the
|
||||
<tt>RCU_NEXT_TAIL</tt> to the <tt>RCU_NEXT_READY_TAIL</tt> to the
|
||||
<tt>RCU_WAIT_TAIL</tt> to the <tt>RCU_DONE_TAIL</tt> list segments
|
||||
as grace periods advance.
|
||||
|
||||
</p><p>The <tt>->gp_seq[]</tt> array records grace-period
|
||||
numbers corresponding to the list segments.
|
||||
This is what allows different CPUs to have different ideas as to
|
||||
which is the current grace period while still avoiding premature
|
||||
invocation of their callbacks.
|
||||
In particular, this allows CPUs that go idle for extended periods
|
||||
to determine which of their callbacks are ready to be invoked after
|
||||
reawakening.
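<p>
The way the <tt>->gp_seq[]</tt> array drives callback advancement can
be sketched in userspace as shown below.
This is an illustration under stated assumptions rather than the
kernel's code: <tt>sketch_advance()</tt> is a hypothetical helper, the
<tt>rcu_head</tt> structure is reduced to its <tt>->next</tt> pointer,
and grace-period numbers are compared with a plain <tt>&lt;</tt>, which
ignores counter wraparound.
The sketch covers only movement toward <tt>RCU_DONE_TAIL</tt>.
Assigning grace-period numbers to callbacks in the
<tt>RCU_NEXT_TAIL</tt> segment is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define RCU_DONE_TAIL		0
#define RCU_WAIT_TAIL		1
#define RCU_NEXT_READY_TAIL	2
#define RCU_NEXT_TAIL		3
#define RCU_CBLIST_NSEGS	4

/* Assumption: only the ->next pointer matters for this sketch. */
struct rcu_head {
	struct rcu_head *next;
};

struct rcu_segcblist {
	struct rcu_head *head;
	struct rcu_head **tails[RCU_CBLIST_NSEGS];
	unsigned long gp_seq[RCU_CBLIST_NSEGS];
	long len;
	long len_lazy;
};

/*
 * Hypothetical helper: move callbacks whose recorded grace period is
 * no later than "seq" (the most recently completed grace period, as
 * seen by this CPU) into the RCU_DONE_TAIL segment.
 */
static void sketch_advance(struct rcu_segcblist *l, unsigned long seq)
{
	int i, j;

	/* Fold each segment whose grace period has completed into DONE. */
	for (i = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++) {
		if (seq < l->gp_seq[i])
			break;	/* This segment is still waiting. */
		l->tails[RCU_DONE_TAIL] = l->tails[i];
	}
	if (i == RCU_WAIT_TAIL)
		return;		/* Nothing was ready to advance. */

	/* The folded segments are now empty: tails match RCU_DONE_TAIL. */
	for (j = RCU_WAIT_TAIL; j < i; j++)
		l->tails[j] = l->tails[RCU_DONE_TAIL];

	/* Slide any still-waiting segments down into the vacated slots. */
	for (j = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++, j++) {
		if (l->tails[j] == l->tails[RCU_NEXT_TAIL])
			break;	/* No waiting callbacks remain to slide. */
		l->tails[j] = l->tails[i];
		l->gp_seq[j] = l->gp_seq[i];
	}
}
```

Because each segment carries its own grace-period number, a CPU that
was idle for several grace periods can call such a function once with
the latest completed number and have every eligible segment folded
into <tt>RCU_DONE_TAIL</tt> in a single pass.
Note that the <tt>->len</tt> count is not touched: advancing only
changes which segment a callback belongs to.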
|
||||
|
||||
</p><p>The <tt>->len</tt> counter contains the number of
callbacks in <tt>->head</tt>, and the
<tt>->len_lazy</tt> counter contains the number of those callbacks that
are known to only free memory, and whose invocation can therefore
be safely deferred.
|
||||
|
||||
<p><b>Important note</b>: It is the <tt>->len</tt> field that
|
||||
determines whether or not there are callbacks associated with
|
||||
this <tt>rcu_segcblist</tt> structure, <i>not</i> the <tt>->head</tt>
|
||||
pointer.
|
||||
The reason for this is that all the ready-to-invoke callbacks
|
||||
(that is, those in the <tt>RCU_DONE_TAIL</tt> segment) are extracted
|
||||
all at once at callback-invocation time.
|
||||
If callback invocation must be postponed, for example, because a
|
||||
high-priority process just woke up on this CPU, then the remaining
|
||||
callbacks are placed back on the <tt>RCU_DONE_TAIL</tt> segment.
|
||||
Either way, the <tt>->len</tt> and <tt>->len_lazy</tt> counts
|
||||
are adjusted after the corresponding callbacks have been invoked, and so
|
||||
again it is the <tt>->len</tt> count that accurately reflects whether
|
||||
or not there are callbacks associated with this <tt>rcu_segcblist</tt>
|
||||
structure.
|
||||
Of course, off-CPU sampling of the <tt>->len</tt> count requires
|
||||
the use of appropriate synchronization, for example, memory barriers.
|
||||
This synchronization can be a bit subtle, particularly in the case
|
||||
of <tt>rcu_barrier()</tt>.
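<p>
The reason that <tt>->head</tt> cannot be used as the emptiness test
can be seen in the following userspace sketch.
It is a simplified model, not the kernel's implementation:
<tt>sketch_extract_done()</tt> and <tt>sketch_has_cbs()</tt> are
hypothetical helpers, the <tt>rcu_head</tt> structure is reduced to its
<tt>->next</tt> pointer, and no synchronization is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define RCU_DONE_TAIL		0
#define RCU_WAIT_TAIL		1
#define RCU_NEXT_READY_TAIL	2
#define RCU_NEXT_TAIL		3
#define RCU_CBLIST_NSEGS	4

/* Assumption: only the ->next pointer matters for this sketch. */
struct rcu_head {
	struct rcu_head *next;
};

struct rcu_segcblist {
	struct rcu_head *head;
	struct rcu_head **tails[RCU_CBLIST_NSEGS];
	unsigned long gp_seq[RCU_CBLIST_NSEGS];
	long len;
	long len_lazy;
};

/*
 * Hypothetical helper: detach the whole RCU_DONE_TAIL segment at once
 * and return it as a NULL-terminated list.  The ->len count is left
 * alone, because the callbacks have not been invoked yet.
 */
static struct rcu_head *sketch_extract_done(struct rcu_segcblist *l)
{
	struct rcu_head **done_tail = l->tails[RCU_DONE_TAIL];
	struct rcu_head *done;
	int i;

	if (done_tail == &l->head)
		return NULL;		/* No ready-to-invoke callbacks. */
	done = l->head;
	l->head = *done_tail;		/* First not-yet-done callback, or NULL. */
	*done_tail = NULL;		/* Terminate the extracted list. */
	/* Any tail that referenced the extracted segment now references ->head. */
	for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
		if (l->tails[i] == done_tail)
			l->tails[i] = &l->head;
	return done;
}

/* Hypothetical helper: ->len, not ->head, says whether callbacks remain. */
static int sketch_has_cbs(const struct rcu_segcblist *l)
{
	return l->len != 0;
}
```

If every callback on the list is in the <tt>RCU_DONE_TAIL</tt> segment,
then after <tt>sketch_extract_done()</tt> returns, <tt>->head</tt> is
<tt>NULL</tt> even though the extracted callbacks have not yet been
invoked.
During that window only <tt>->len</tt> still reports that callbacks are
associated with the structure, which is why a caller in this model
would decrement <tt>->len</tt> only after invoking them.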
|
||||
|
||||
<h3><a name="The rcu_data Structure">
|
||||
The <tt>rcu_data</tt> Structure</a></h3>
|
||||
|
||||
@ -983,62 +1113,18 @@ choice.
|
||||
as follows:
|
||||
|
||||
<pre>
|
||||
1 struct rcu_head *nxtlist;
|
||||
2 struct rcu_head **nxttail[RCU_NEXT_SIZE];
|
||||
3 unsigned long nxtcompleted[RCU_NEXT_SIZE];
|
||||
4 long qlen_lazy;
|
||||
5 long qlen;
|
||||
6 long qlen_last_fqs_check;
|
||||
1 struct rcu_segcblist cblist;
|
||||
2 long qlen_last_fqs_check;
|
||||
3 unsigned long n_cbs_invoked;
|
||||
4 unsigned long n_nocbs_invoked;
|
||||
5 unsigned long n_cbs_orphaned;
|
||||
6 unsigned long n_cbs_adopted;
|
||||
7 unsigned long n_force_qs_snap;
|
||||
8 unsigned long n_cbs_invoked;
|
||||
9 unsigned long n_cbs_orphaned;
|
||||
10 unsigned long n_cbs_adopted;
|
||||
11 long blimit;
|
||||
8 long blimit;
|
||||
</pre>
|
||||
|
||||
<p>The <tt>->nxtlist</tt> pointer and the
|
||||
<tt>->nxttail[]</tt> array form a four-segment list with
|
||||
older callbacks near the head and newer ones near the tail.
|
||||
Each segment contains callbacks with the corresponding relationship
|
||||
to the current grace period.
|
||||
The pointer out of the end of each of the four segments is referenced
|
||||
by the element of the <tt>->nxttail[]</tt> array indexed by
|
||||
<tt>RCU_DONE_TAIL</tt> (for callbacks handled by a prior grace period),
|
||||
<tt>RCU_WAIT_TAIL</tt> (for callbacks waiting on the current grace period),
|
||||
<tt>RCU_NEXT_READY_TAIL</tt> (for callbacks that will wait on the next
|
||||
grace period), and
|
||||
<tt>RCU_NEXT_TAIL</tt> (for callbacks that are not yet associated
|
||||
with a specific grace period)
|
||||
respectively, as shown in the following figure.
|
||||
|
||||
</p><p><img src="nxtlist.svg" alt="nxtlist.svg" width="40%">
|
||||
|
||||
</p><p>In this figure, the <tt>->nxtlist</tt> pointer references the
|
||||
first
|
||||
RCU callback in the list.
|
||||
The <tt>->nxttail[RCU_DONE_TAIL]</tt> array element references
|
||||
the <tt>->nxtlist</tt> pointer itself, indicating that none
|
||||
of the callbacks is ready to invoke.
|
||||
The <tt>->nxttail[RCU_WAIT_TAIL]</tt> array element references callback
|
||||
CB 2's <tt>->next</tt> pointer, which indicates that
|
||||
CB 1 and CB 2 are both waiting on the current grace period.
|
||||
The <tt>->nxttail[RCU_NEXT_READY_TAIL]</tt> array element
|
||||
references the same RCU callback that <tt>->nxttail[RCU_WAIT_TAIL]</tt>
|
||||
does, which indicates that there are no callbacks waiting on the next
|
||||
RCU grace period.
|
||||
The <tt>->nxttail[RCU_NEXT_TAIL]</tt> array element references
|
||||
CB 4's <tt>->next</tt> pointer, indicating that all the
|
||||
remaining RCU callbacks have not yet been assigned to an RCU grace
|
||||
period.
|
||||
Note that the <tt>->nxttail[RCU_NEXT_TAIL]</tt> array element
|
||||
always references the last RCU callback's <tt>->next</tt> pointer
|
||||
unless the callback list is empty, in which case it references
|
||||
the <tt>->nxtlist</tt> pointer.
|
||||
|
||||
</p><p>CPUs advance their callbacks from the
|
||||
<tt>RCU_NEXT_TAIL</tt> to the <tt>RCU_NEXT_READY_TAIL</tt> to the
|
||||
<tt>RCU_WAIT_TAIL</tt> to the <tt>RCU_DONE_TAIL</tt> list segments
|
||||
as grace periods advance.
|
||||
<p>The <tt>->cblist</tt> structure is the segmented callback list
|
||||
described earlier.
|
||||
The CPU advances the callbacks in its <tt>rcu_data</tt> structure
|
||||
whenever it notices that another RCU grace period has completed.
|
||||
The CPU detects the completion of an RCU grace period by noticing
|
||||
@ -1049,16 +1135,7 @@ Recall that each <tt>rcu_node</tt> structure's
|
||||
<tt>->completed</tt> field is updated at the end of each
|
||||
grace period.
|
||||
|
||||
</p><p>The <tt>->nxtcompleted[]</tt> array records grace-period
|
||||
numbers corresponding to the list segments.
|
||||
This allows CPUs that go idle for extended periods to determine
|
||||
which of their callbacks are ready to be invoked after reawakening.
|
||||
|
||||
</p><p>The <tt>->qlen</tt> counter contains the number of
|
||||
callbacks in <tt>->nxtlist</tt>, and the
|
||||
<tt>->qlen_lazy</tt> contains the number of those callbacks that
|
||||
are known to only free memory, and whose invocation can therefore
|
||||
be safely deferred.
|
||||
<p>
|
||||
The <tt>->qlen_last_fqs_check</tt> and
|
||||
<tt>->n_force_qs_snap</tt> coordinate the forcing of quiescent
|
||||
states from <tt>call_rcu()</tt> and friends when callback
|
||||
@ -1069,6 +1146,10 @@ lists grow excessively long.
|
||||
fields count the number of callbacks invoked,
|
||||
sent to other CPUs when this CPU goes offline,
|
||||
and received from other CPUs when those other CPUs go offline.
|
||||
The <tt>->n_nocbs_invoked</tt> field is used when the CPU's callbacks
are offloaded to a kthread.
|
||||
|
||||
<p>
|
||||
Finally, the <tt>->blimit</tt> counter is the maximum number of
|
||||
RCU callbacks that may be invoked at a given time.
|
||||
|
||||
|
@ -19,7 +19,7 @@
|
||||
id="svg2"
|
||||
version="1.1"
|
||||
inkscape:version="0.48.4 r9939"
|
||||
sodipodi:docname="nxtlist.fig">
|
||||
sodipodi:docname="segcblist.svg">
|
||||
<metadata
|
||||
id="metadata94">
|
||||
<rdf:RDF>
|
||||
@ -28,7 +28,7 @@
|
||||
<dc:format>image/svg+xml</dc:format>
|
||||
<dc:type
|
||||
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
|
||||
<dc:title></dc:title>
|
||||
<dc:title />
|
||||
</cc:Work>
|
||||
</rdf:RDF>
|
||||
</metadata>
|
||||
@ -241,61 +241,51 @@
|
||||
xml:space="preserve"
|
||||
x="225"
|
||||
y="675"
|
||||
fill="#000000"
|
||||
font-family="Courier"
|
||||
font-style="normal"
|
||||
font-weight="bold"
|
||||
font-size="324"
|
||||
text-anchor="start"
|
||||
id="text64">nxtlist</text>
|
||||
id="text64"
|
||||
style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">->head</text>
|
||||
<!-- Text -->
|
||||
<text
|
||||
xml:space="preserve"
|
||||
x="225"
|
||||
y="1800"
|
||||
fill="#000000"
|
||||
font-family="Courier"
|
||||
font-style="normal"
|
||||
font-weight="bold"
|
||||
font-size="324"
|
||||
text-anchor="start"
|
||||
id="text66">nxttail[RCU_DONE_TAIL]</text>
|
||||
id="text66"
|
||||
style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">->tails[RCU_DONE_TAIL]</text>
|
||||
<!-- Text -->
|
||||
<text
|
||||
xml:space="preserve"
|
||||
x="225"
|
||||
y="2925"
|
||||
fill="#000000"
|
||||
font-family="Courier"
|
||||
font-style="normal"
|
||||
font-weight="bold"
|
||||
font-size="324"
|
||||
text-anchor="start"
|
||||
id="text68">nxttail[RCU_WAIT_TAIL]</text>
|
||||
id="text68"
|
||||
style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">->tails[RCU_WAIT_TAIL]</text>
|
||||
<!-- Text -->
|
||||
<text
|
||||
xml:space="preserve"
|
||||
x="225"
|
||||
y="4050"
|
||||
fill="#000000"
|
||||
font-family="Courier"
|
||||
font-style="normal"
|
||||
font-weight="bold"
|
||||
font-size="324"
|
||||
text-anchor="start"
|
||||
id="text70">nxttail[RCU_NEXT_READY_TAIL]</text>
|
||||
id="text70"
|
||||
style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">->tails[RCU_NEXT_READY_TAIL]</text>
|
||||
<!-- Text -->
|
||||
<text
|
||||
xml:space="preserve"
|
||||
x="225"
|
||||
y="5175"
|
||||
fill="#000000"
|
||||
font-family="Courier"
|
||||
font-style="normal"
|
||||
font-weight="bold"
|
||||
font-size="324"
|
||||
text-anchor="start"
|
||||
id="text72">nxttail[RCU_NEXT_TAIL]</text>
|
||||
id="text72"
|
||||
style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">->tails[RCU_NEXT_TAIL]</text>
|
||||
<!-- Text -->
|
||||
<text
|
||||
xml:space="preserve"
|
||||
|