memory-barriers: Rework multicopy-atomicity section
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
parent f1ab25a30c
commit 0902b1f44a
@@ -1343,13 +1343,13 @@ MULTICOPY ATOMICITY
 
 Multicopy atomicity is a deeply intuitive notion about ordering that is
 not always provided by real computer systems, namely that a given store
-is visible at the same time to all CPUs, or, alternatively, that all
-CPUs agree on the order in which all stores took place. However, use of
-full multicopy atomicity would rule out valuable hardware optimizations,
-so a weaker form called ``other multicopy atomicity'' instead guarantees
-that a given store is observed at the same time by all -other- CPUs. The
-remainder of this document discusses this weaker form, but for brevity
-will call it simply ``multicopy atomicity''.
+becomes visible at the same time to all CPUs, or, alternatively, that all
+CPUs agree on the order in which all stores become visible. However,
+support of full multicopy atomicity would rule out valuable hardware
+optimizations, so a weaker form called ``other multicopy atomicity''
+instead guarantees only that a given store becomes visible at the same
+time to all -other- CPUs. The remainder of this document discusses this
+weaker form, but for brevity will call it simply ``multicopy atomicity''.
 
 The following example demonstrates multicopy atomicity:
 
@@ -1360,24 +1360,26 @@ The following example demonstrates multicopy atomicity:
 		<general barrier>	<read barrier>
 		STORE Y=r1		LOAD X
 
-Suppose that CPU 2's load from X returns 1 which it then stores to Y and
-that CPU 3's load from Y returns 1. This indicates that CPU 2's load
-from X in some sense follows CPU 1's store to X and that CPU 2's store
-to Y in some sense preceded CPU 3's load from Y. The question is then
-"Can CPU 3's load from X return 0?"
+Suppose that CPU 2's load from X returns 1, which it then stores to Y,
+and CPU 3's load from Y returns 1. This indicates that CPU 1's store
+to X precedes CPU 2's load from X and that CPU 2's store to Y precedes
+CPU 3's load from Y. In addition, the memory barriers guarantee that
+CPU 2 executes its load before its store, and CPU 3 loads from Y before
+it loads from X. The question is then "Can CPU 3's load from X return 0?"
 
-Because CPU 3's load from X in some sense came after CPU 2's load, it
+Because CPU 3's load from X in some sense comes after CPU 2's load, it
 is natural to expect that CPU 3's load from X must therefore return 1.
-This expectation is an example of multicopy atomicity: if a load executing
-on CPU A follows a load from the same variable executing on CPU B, then
-an understandable but incorrect expectation is that CPU A's load must
-either return the same value that CPU B's load did, or must return some
-later value.
+This expectation follows from multicopy atomicity: if a load executing
+on CPU B follows a load from the same variable executing on CPU A (and
+CPU A did not originally store the value which it read), then on
+multicopy-atomic systems, CPU B's load must return either the same value
+that CPU A's load did or some later value. However, the Linux kernel
+does not require systems to be multicopy atomic.
 
-In the Linux kernel, the above use of a general memory barrier compensates
-for any lack of multicopy atomicity. Therefore, in the above example,
-if CPU 2's load from X returns 1 and its load from Y returns 0, and CPU 3's
-load from Y returns 1, then CPU 3's load from X must also return 1.
+The use of a general memory barrier in the example above compensates
+for any lack of multicopy atomicity. In the example, if CPU 2's load
+from X returns 1 and CPU 3's load from Y returns 1, then CPU 3's load
+from X must indeed also return 1.
 
 However, dependencies, read barriers, and write barriers are not always
 able to compensate for non-multicopy atomicity. For example, suppose
@@ -1396,11 +1398,11 @@ this example, it is perfectly legal for CPU 2's load from X to return 1,
 CPU 3's load from Y to return 1, and its load from X to return 0.
 
 The key point is that although CPU 2's data dependency orders its load
-and store, it does not guarantee to order CPU 1's store. Therefore,
-if this example runs on a non-multicopy-atomic system where CPUs 1 and 2
-share a store buffer or a level of cache, CPU 2 might have early access
-to CPU 1's writes. A general barrier is therefore required to ensure
-that all CPUs agree on the combined order of CPU 1's and CPU 2's accesses.
+and store, it does not guarantee to order CPU 1's store. Thus, if this
+example runs on a non-multicopy-atomic system where CPUs 1 and 2 share a
+store buffer or a level of cache, CPU 2 might have early access to CPU 1's
+writes. General barriers are therefore required to ensure that all CPUs
+agree on the combined order of multiple accesses.
 
 General barriers can compensate not only for non-multicopy atomicity,
 but can also generate additional ordering that can ensure that -all-
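
For readers who want to experiment with the first three-CPU example in the
patched text, the following is a rough userspace sketch and not part of the
commit. It assumes that a C11 seq_cst fence can stand in for the kernel's
general barrier (smp_mb()) and an acquire fence for the read barrier
(smp_rmb()); the names cpu1(), cpu2(), cpu3() and count_forbidden() are
hypothetical and chosen only for illustration. It counts how often CPU 2
reads X == 1, CPU 3 reads Y == 1, and CPU 3 then reads X == 0, which is the
outcome the text says the general barrier rules out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables from the example; both start at 0. */
static atomic_int X, Y;
/* Values observed by CPU 2 (r1) and CPU 3 (r2, r3). */
static int r1, r2, r3;

/* CPU 1: STORE X=1 */
static void *cpu1(void *unused)
{
	(void)unused;
	atomic_store_explicit(&X, 1, memory_order_relaxed);
	return NULL;
}

/* CPU 2: r1=LOAD X; <general barrier>; STORE Y=r1 */
static void *cpu2(void *unused)
{
	(void)unused;
	r1 = atomic_load_explicit(&X, memory_order_relaxed);
	/* Assumption: seq_cst fence as a stand-in for smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&Y, r1, memory_order_relaxed);
	return NULL;
}

/* CPU 3: r2=LOAD Y; <read barrier>; r3=LOAD X */
static void *cpu3(void *unused)
{
	(void)unused;
	r2 = atomic_load_explicit(&Y, memory_order_relaxed);
	/* Assumption: acquire fence as a stand-in for smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);
	r3 = atomic_load_explicit(&X, memory_order_relaxed);
	return NULL;
}

/* Run the three threads repeatedly; return how many runs ended with
 * r1 == 1 && r2 == 1 && r3 == 0. */
int count_forbidden(int iterations)
{
	void *(*body[3])(void *) = { cpu1, cpu2, cpu3 };
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t[3];

		/* Reset the shared variables before each run. */
		atomic_store(&X, 0);
		atomic_store(&Y, 0);
		for (int j = 0; j < 3; j++) {
			int rc = pthread_create(&t[j], NULL, body[j], NULL);

			assert(rc == 0);
			(void)rc;
		}
		/* Joining makes r1, r2 and r3 safe to read below. */
		for (int j = 0; j < 3; j++)
			pthread_join(t[j], NULL);
		if (r1 == 1 && r2 == 1 && r3 == 0)
			forbidden++;
	}
	return forbidden;
}
```

Calling count_forbidden(200) from a test program should return 0 under these
assumptions. Replacing the seq_cst fence with nothing more than the data
dependency from r1 to the store mirrors the second example in the text, where
the outcome is permitted; whether it actually shows up depends on the
hardware, so a zero count there proves nothing.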