linux_dsm_epyc7002/Documentation/fault-injection/nvme-fault-injection.txt
Thomas Tai cf4182f3d0 Documentation: nvme: Documentation for nvme fault injection
Add examples to show how to use nvme fault injection.

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-by: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Signed-off-by: Karl Volz <karl.volz@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-26 08:53:43 -06:00

117 lines
4.0 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

NVMe Fault Injection
====================
Linux's fault injection framework provides a systematic way to support
error injection via debugfs in the /sys/kernel/debug directory. When
enabled, the default NVME_SC_INVALID_OPCODE with no retry will be
injected into the nvme_end_request. Users can change the default status
code and no retry flag via the debugfs. The list of Generic Command
Status can be found in include/linux/nvme.h
Following examples show how to inject an error into the nvme.
First, enable CONFIG_FAULT_INJECTION_DEBUG_FS kernel config,
recompile the kernel. After booting up the kernel, do the
following.
Example 1: Inject default status code with no retry
---------------------------------------------------
mount /dev/nvme0n1 /mnt
echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
cp a.file /mnt
Expected Result:
cp: cannot stat /mnt/a.file: Input/output error
Message from dmesg:
FAULT_INJECTION: forcing a failure.
name fault_inject, interval 1, probability 100, space 0, times 1
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc8+ #2
Hardware name: innotek GmbH VirtualBox/VirtualBox,
BIOS VirtualBox 12/01/2006
Call Trace:
<IRQ>
dump_stack+0x5c/0x7d
should_fail+0x148/0x170
nvme_should_fail+0x2f/0x50 [nvme_core]
nvme_process_cq+0xe7/0x1d0 [nvme]
nvme_irq+0x1e/0x40 [nvme]
__handle_irq_event_percpu+0x3a/0x190
handle_irq_event_percpu+0x30/0x70
handle_irq_event+0x36/0x60
handle_fasteoi_irq+0x78/0x120
handle_irq+0xa7/0x130
? tick_irq_enter+0xa8/0xc0
do_IRQ+0x43/0xc0
common_interrupt+0xa2/0xa2
</IRQ>
RIP: 0010:native_safe_halt+0x2/0x10
RSP: 0018:ffffffff82003e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
RAX: ffffffff817a10c0 RBX: ffffffff82012480 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 000000008e38ce64 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82012480
R13: ffffffff82012480 R14: 0000000000000000 R15: 0000000000000000
? __sched_text_end+0x4/0x4
default_idle+0x18/0xf0
do_idle+0x150/0x1d0
cpu_startup_entry+0x6f/0x80
start_kernel+0x4c4/0x4e4
? set_init_arg+0x55/0x55
secondary_startup_64+0xa5/0xb0
print_req_error: I/O error, dev nvme0n1, sector 9240
EXT4-fs error (device nvme0n1): ext4_find_entry:1436:
inode #2: comm cp: reading directory lblock 0
Example 2: Inject default status code with retry
------------------------------------------------
mount /dev/nvme0n1 /mnt
echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/status
echo 0 > /sys/kernel/debug/nvme0n1/fault_inject/dont_retry
cp a.file /mnt
Expected Result:
command success without error
Message from dmesg:
FAULT_INJECTION: forcing a failure.
name fault_inject, interval 1, probability 100, space 0, times 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc8+ #4
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
<IRQ>
dump_stack+0x5c/0x7d
should_fail+0x148/0x170
nvme_should_fail+0x30/0x60 [nvme_core]
nvme_loop_queue_response+0x84/0x110 [nvme_loop]
nvmet_req_complete+0x11/0x40 [nvmet]
nvmet_bio_done+0x28/0x40 [nvmet]
blk_update_request+0xb0/0x310
blk_mq_end_request+0x18/0x60
flush_smp_call_function_queue+0x3d/0xf0
smp_call_function_single_interrupt+0x2c/0xc0
call_function_single_interrupt+0xa2/0xb0
</IRQ>
RIP: 0010:native_safe_halt+0x2/0x10
RSP: 0018:ffffc9000068bec0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04
RAX: ffffffff817a10c0 RBX: ffff88011a3c9680 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 000000008e38c131 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011a3c9680
R13: ffff88011a3c9680 R14: 0000000000000000 R15: 0000000000000000
? __sched_text_end+0x4/0x4
default_idle+0x18/0xf0
do_idle+0x150/0x1d0
cpu_startup_entry+0x6f/0x80
start_secondary+0x187/0x1e0
secondary_startup_64+0xa5/0xb0