mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-18 10:26:09 +07:00
4de5f89ef8
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
315 lines
13 KiB
Plaintext
315 lines
13 KiB
Plaintext
RCU Torture Test Operation
|
|
|
|
|
|
CONFIG_RCU_TORTURE_TEST
|
|
|
|
The CONFIG_RCU_TORTURE_TEST config option is available for all RCU
|
|
implementations. It creates an rcutorture kernel module that can
|
|
be loaded to run a torture test. The test periodically outputs
|
|
status messages via printk(), which can be examined via the dmesg
|
|
command (perhaps grepping for "torture"). The test is started
|
|
when the module is loaded, and stops when the module is unloaded.
|
|
|
|
|
|
MODULE PARAMETERS
|
|
|
|
This module has the following parameters:
|
|
|
|
fqs_duration Duration (in microseconds) of artificially induced bursts
|
|
of force_quiescent_state() invocations. In RCU
|
|
implementations having force_quiescent_state(), these
|
|
bursts help force races between forcing a given grace
|
|
period and that grace period ending on its own.
|
|
|
|
fqs_holdoff Holdoff time (in microseconds) between consecutive calls
|
|
to force_quiescent_state() within a burst.
|
|
|
|
fqs_stutter Wait time (in seconds) between consecutive bursts
|
|
of calls to force_quiescent_state().
|
|
|
|
gp_normal Make the fake writers use normal synchronous grace-period
|
|
primitives.
|
|
|
|
gp_exp Make the fake writers use expedited synchronous grace-period
|
|
primitives. If both gp_normal and gp_exp are set, or
|
|
if neither gp_normal nor gp_exp are set, then randomly
|
|
choose the primitive so that about 50% are normal and
|
|
50% expedited. By default, neither are set, which
|
|
gives best overall test coverage.
|
|
|
|
irqreader Says to invoke RCU readers from irq level. This is currently
|
|
done via timers. Defaults to "1" for variants of RCU that
|
|
permit this. (Or, more accurately, variants of RCU that do
|
|
-not- permit this know to ignore this variable.)
|
|
|
|
n_barrier_cbs If this is nonzero, RCU barrier testing will be conducted,
|
|
in which case n_barrier_cbs specifies the number of
|
|
RCU callbacks (and corresponding kthreads) to use for
|
|
this testing. The value cannot be negative. If you
|
|
specify this to be non-zero when torture_type indicates a
|
|
synchronous RCU implementation (one for which a member of
|
|
the synchronize_rcu() rather than the call_rcu() family is
|
|
used -- see the documentation for torture_type below), an
|
|
error will be reported and no testing will be carried out.
|
|
|
|
nfakewriters This is the number of RCU fake writer threads to run. Fake
|
|
writer threads repeatedly use the synchronous "wait for
|
|
current readers" function of the interface selected by
|
|
torture_type, with a delay between calls to allow for various
|
|
different numbers of writers running in parallel.
|
|
nfakewriters defaults to 4, which provides enough parallelism
|
|
to trigger special cases caused by multiple writers, such as
|
|
the synchronize_srcu() early return optimization.
|
|
|
|
nreaders This is the number of RCU reading threads supported.
|
|
The default is twice the number of CPUs. Why twice?
|
|
To properly exercise RCU implementations with preemptible
|
|
read-side critical sections.
|
|
|
|
onoff_interval
|
|
The number of seconds between each attempt to execute a
|
|
randomly selected CPU-hotplug operation. Defaults to
|
|
zero, which disables CPU hotplugging. In HOTPLUG_CPU=n
|
|
kernels, rcutorture will silently refuse to do any
|
|
CPU-hotplug operations regardless of what value is
|
|
specified for onoff_interval.
|
|
|
|
onoff_holdoff The number of seconds to wait until starting CPU-hotplug
|
|
operations. This would normally only be used when
|
|
rcutorture was built into the kernel and started
|
|
automatically at boot time, in which case it is useful
|
|
in order to avoid confusing boot-time code with CPUs
|
|
coming and going.
|
|
|
|
shuffle_interval
|
|
The number of seconds to keep the test threads affinitied
|
|
to a particular subset of the CPUs, defaults to 3 seconds.
|
|
Used in conjunction with test_no_idle_hz.
|
|
|
|
shutdown_secs The number of seconds to run the test before terminating
|
|
the test and powering off the system. The default is
|
|
zero, which disables test termination and system shutdown.
|
|
This capability is useful for automated testing.
|
|
|
|
stall_cpu The number of seconds that a CPU should be stalled while
|
|
within both an rcu_read_lock() and a preempt_disable().
|
|
This stall happens only once per rcutorture run.
|
|
If you need multiple stalls, use modprobe and rmmod to
|
|
repeatedly run rcutorture. The default for stall_cpu
|
|
is zero, which prevents rcutorture from stalling a CPU.
|
|
|
|
Note that attempts to rmmod rcutorture while the stall
|
|
is ongoing will hang, so be careful what value you
|
|
choose for this module parameter! In addition, too-large
|
|
values for stall_cpu might well induce failures and
|
|
warnings in other parts of the kernel. You have been
|
|
warned!
|
|
|
|
stall_cpu_holdoff
|
|
The number of seconds to wait after rcutorture starts
|
|
before stalling a CPU. Defaults to 10 seconds.
|
|
|
|
stat_interval The number of seconds between output of torture
|
|
statistics (via printk()). Regardless of the interval,
|
|
statistics are printed when the module is unloaded.
|
|
Setting the interval to zero causes the statistics to
|
|
be printed -only- when the module is unloaded, and this
|
|
is the default.
|
|
|
|
stutter The length of time to run the test before pausing for this
|
|
same period of time. Defaults to "stutter=5", so as
|
|
to run and pause for (roughly) five-second intervals.
|
|
Specifying "stutter=0" causes the test to run continuously
|
|
without pausing, which is the old default behavior.
|
|
|
|
test_boost Whether or not to test the ability of RCU to do priority
|
|
boosting. Defaults to "test_boost=1", which performs
|
|
RCU priority-inversion testing only if the selected
|
|
RCU implementation supports priority boosting. Specifying
|
|
"test_boost=0" never performs RCU priority-inversion
|
|
testing. Specifying "test_boost=2" performs RCU
|
|
priority-inversion testing even if the selected RCU
|
|
implementation does not support RCU priority boosting,
|
|
which can be used to test rcutorture's ability to
|
|
carry out RCU priority-inversion testing.
|
|
|
|
test_boost_interval
|
|
The number of seconds in an RCU priority-inversion test
|
|
cycle. Defaults to "test_boost_interval=7". It is
|
|
usually wise for this value to be relatively prime to
|
|
the value selected for "stutter".
|
|
|
|
test_boost_duration
|
|
The number of seconds to do RCU priority-inversion testing
|
|
within any given "test_boost_interval". Defaults to
|
|
"test_boost_duration=4".
|
|
|
|
test_no_idle_hz Whether or not to test the ability of RCU to operate in
|
|
a kernel that disables the scheduling-clock interrupt to
|
|
idle CPUs. Boolean parameter, "1" to test, "0" otherwise.
|
|
Defaults to omitting this test.
|
|
|
|
torture_type The type of RCU to test, with string values as follows:
|
|
|
|
"rcu": rcu_read_lock(), rcu_read_unlock() and call_rcu(),
|
|
along with expedited, synchronous, and polling
|
|
variants.
|
|
|
|
"rcu_bh": rcu_read_lock_bh(), rcu_read_unlock_bh(), and
|
|
call_rcu_bh(), along with expedited and synchronous
|
|
variants.
|
|
|
|
"rcu_busted": This tests an intentionally incorrect version
|
|
of RCU in order to help test rcutorture itself.
|
|
|
|
"srcu": srcu_read_lock(), srcu_read_unlock() and
|
|
call_srcu(), along with expedited and
|
|
synchronous variants.
|
|
|
|
"sched": preempt_disable(), preempt_enable(), and
|
|
call_rcu_sched(), along with expedited,
|
|
synchronous, and polling variants.
|
|
|
|
"tasks": voluntary context switch and call_rcu_tasks(),
|
|
along with expedited and synchronous variants.
|
|
|
|
Defaults to "rcu".
|
|
|
|
verbose Enable debug printk()s. Default is disabled.
|
|
|
|
|
|
OUTPUT
|
|
|
|
The statistics output is as follows:
|
|
|
|
rcu-torture:--- Start of test: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
|
|
rcu-torture: rtc: (null) ver: 155441 tfle: 0 rta: 155441 rtaf: 8884 rtf: 155440 rtmbe: 0 rtbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 3055767
|
|
rcu-torture: Reader Pipe: 727860534 34213 0 0 0 0 0 0 0 0 0
|
|
rcu-torture: Reader Batch: 727877838 17003 0 0 0 0 0 0 0 0 0
|
|
rcu-torture: Free-Block Circulation: 155440 155440 155440 155440 155440 155440 155440 155440 155440 155440 0
|
|
rcu-torture:--- End of test: SUCCESS: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
|
|
|
|
The command "dmesg | grep torture:" will extract this information on
|
|
most systems. On more esoteric configurations, it may be necessary to
|
|
use other commands to access the output of the printk()s used by
|
|
the RCU torture test. The printk()s use KERN_ALERT, so they should
|
|
be evident. ;-)
|
|
|
|
The first and last lines show the rcutorture module parameters, and the
|
|
last line shows either "SUCCESS" or "FAILURE", based on rcutorture's
|
|
automatic determination as to whether RCU operated correctly.
|
|
|
|
The entries are as follows:
|
|
|
|
o "rtc": The hexadecimal address of the structure currently visible
|
|
to readers.
|
|
|
|
o "ver": The number of times since boot that the RCU writer task
|
|
has changed the structure visible to readers.
|
|
|
|
o "tfle": If non-zero, indicates that the "torture freelist"
|
|
containing structures to be placed into the "rtc" area is empty.
|
|
This condition is important, since it can fool you into thinking
|
|
that RCU is working when it is not. :-/
|
|
|
|
o "rta": Number of structures allocated from the torture freelist.
|
|
|
|
o "rtaf": Number of allocations from the torture freelist that have
|
|
failed due to the list being empty. It is not unusual for this
|
|
to be non-zero, but it is bad for it to be a large fraction of
|
|
the value indicated by "rta".
|
|
|
|
o "rtf": Number of frees into the torture freelist.
|
|
|
|
o "rtmbe": A non-zero value indicates that rcutorture believes that
|
|
rcu_assign_pointer() and rcu_dereference() are not working
|
|
correctly. This value should be zero.
|
|
|
|
o "rtbe": A non-zero value indicates that one of the rcu_barrier()
|
|
family of functions is not working correctly.
|
|
|
|
o "rtbke": rcutorture was unable to create the real-time kthreads
|
|
used to force RCU priority inversion. This value should be zero.
|
|
|
|
o "rtbre": Although rcutorture successfully created the kthreads
|
|
used to force RCU priority inversion, it was unable to set them
|
|
to the real-time priority level of 1. This value should be zero.
|
|
|
|
o "rtbf": The number of times that RCU priority boosting failed
|
|
to resolve RCU priority inversion.
|
|
|
|
o "rtb": The number of times that rcutorture attempted to force
|
|
an RCU priority inversion condition. If you are testing RCU
|
|
priority boosting via the "test_boost" module parameter, this
|
|
value should be non-zero.
|
|
|
|
o "nt": The number of times rcutorture ran RCU read-side code from
|
|
within a timer handler. This value should be non-zero only
|
|
if you specified the "irqreader" module parameter.
|
|
|
|
o "Reader Pipe": Histogram of "ages" of structures seen by readers.
|
|
If any entries past the first two are non-zero, RCU is broken.
|
|
And rcutorture prints the error flag string "!!!" to make sure
|
|
you notice. The age of a newly allocated structure is zero,
|
|
it becomes one when removed from reader visibility, and is
|
|
incremented once per grace period subsequently -- and is freed
|
|
after passing through (RCU_TORTURE_PIPE_LEN-2) grace periods.
|
|
|
|
The output displayed above was taken from a correctly working
|
|
RCU. If you want to see what it looks like when broken, break
|
|
it yourself. ;-)
|
|
|
|
o "Reader Batch": Another histogram of "ages" of structures seen
|
|
by readers, but in terms of counter flips (or batches) rather
|
|
than in terms of grace periods. The legal number of non-zero
|
|
entries is again two. The reason for this separate view is that
|
|
it is sometimes easier to get the third entry to show up in the
|
|
"Reader Batch" list than in the "Reader Pipe" list.
|
|
|
|
o "Free-Block Circulation": Shows the number of torture structures
|
|
that have reached a given point in the pipeline. The first element
|
|
should closely correspond to the number of structures allocated,
|
|
the second to the number that have been removed from reader view,
|
|
and all but the last remaining to the corresponding number of
|
|
passes through a grace period. The last entry should be zero,
|
|
as it is only incremented if a torture structure's counter
|
|
somehow gets incremented farther than it should.
|
|
|
|
Different implementations of RCU can provide implementation-specific
|
|
additional information. For example, Tree SRCU provides the following
|
|
additional line:
|
|
|
|
srcud-torture: Tree SRCU per-CPU(idx=0): 0(35,-21) 1(-4,24) 2(1,1) 3(-26,20) 4(28,-47) 5(-9,4) 6(-10,14) 7(-14,11) T(1,6)
|
|
|
|
This line shows the per-CPU counter state, in this case for Tree SRCU
|
|
using a dynamically allocated srcu_struct (hence "srcud-" rather than
|
|
"srcu-"). The numbers in parentheses are the values of the "old" and
|
|
"current" counters for the corresponding CPU. The "idx" value maps the
|
|
"old" and "current" values to the underlying array, and is useful for
|
|
debugging. The final "T" entry contains the totals of the counters.
|
|
|
|
|
|
USAGE
|
|
|
|
The following script may be used to torture RCU:
|
|
|
|
#!/bin/sh
|
|
|
|
modprobe rcutorture
|
|
sleep 3600
|
|
rmmod rcutorture
|
|
dmesg | grep torture:
|
|
|
|
The output can be manually inspected for the error flag of "!!!".
|
|
One could of course create a more elaborate script that automatically
|
|
checked for such errors. The "rmmod" command forces a "SUCCESS",
|
|
"FAILURE", or "RCU_HOTPLUG" indication to be printk()ed. The first
|
|
two are self-explanatory, while the last indicates that while there
|
|
were no RCU failures, CPU-hotplug problems were detected.
|
|
|
|
However, the tools/testing/selftests/rcutorture/bin/kvm.sh script
|
|
provides better automation, including automatic failure analysis.
|
|
It assumes a qemu/kvm-enabled platform, and runs guest OSes out of initrd.
|
|
See tools/testing/selftests/rcutorture/doc/initrd.txt for instructions
|
|
on setting up such an initrd.
|