linux_dsm_epyc7002/arch/x86/kernel/apic
Ingo Molnar c1dc0b9c0c debug lockups: Improve lockup detection
When debugging a recent lockup bug I found various deficiencies
in how our current lockup detection helpers work:

 - SysRq-L is not very efficient as it uses a workqueue, hence
   it cannot punch through hard lockups and cannot see through
   most soft lockups either.

 - The SysRq-L code depends on the NMI watchdog - which is off
   by default.

 - We don't print backtraces from the RCU code's built-in
   'RCU state machine is stuck' debug code. This debug
   code tends to be one of the first (and only) mechanisms
   that show that a lockup has occurred.

This patch changes the code so that we:

 - Trigger the NMI backtrace code from SysRq-L instead of using
   a workqueue (which cannot punch through hard lockups)

 - Trigger print-all-CPU-backtraces from the RCU lockup detection
   code

Also decouple the backtrace printing code from the NMI watchdog:

 - Don't use variable-size cpumasks (they might not be initialized
   and they are a bit more fragile anyway)

 - Trigger an NMI immediately via an IPI, instead of waiting
   for the NMI tick to occur. This is a lot faster and can
   produce more relevant backtraces. It will also work if the
   NMI watchdog is disabled.

 - Don't print the 'dazed and confused' message when we print
   a backtrace from the NMI

 - Do a show_regs() plus a dump_stack() to get maximum info
   out of the dump. Worst-case we get two stacktraces - which
   is not a big deal. Sometimes, if register content is
   corrupted, the precise stack walker in show_regs() won't
   give us a full backtrace - in this case dump_stack() will
   do it.
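
The decoupled flow described above can be sketched roughly as follows.
This is a userspace simulation under stated assumptions, not the code in
nmi.c: NR_CPUS is fixed at 8, the "IPI" is an ordinary function call, and
the names backtrace_mask, nmi_handler() and trigger_all_cpu_backtrace()
as written here are illustrative stand-ins. It only shows the shape of
the idea: a fixed-size mask marks which CPUs owe a backtrace, each CPU
clears its own bit when its NMI arrives, and a handled NMI is reported as
such so no 'dazed and confused' message would be printed for it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8 /* assumed CPU count for this sketch */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Fixed-size bitmap: statically allocated, so always initialized. */
static unsigned long backtrace_mask[MASK_LONGS];

static void mask_set(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    backtrace_mask[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static bool mask_test_and_clear(int cpu)
{
    unsigned long bit = 1UL << (cpu % BITS_PER_LONG);
    unsigned long *word = &backtrace_mask[cpu / BITS_PER_LONG];
    bool was_set = (*word & bit) != 0;

    *word &= ~bit;
    return was_set;
}

static bool mask_empty(void)
{
    for (size_t i = 0; i < MASK_LONGS; i++)
        if (backtrace_mask[i])
            return false;
    return true;
}

/*
 * Stand-in for the per-CPU NMI handler. Returns true when the NMI was a
 * backtrace request, so the caller can skip the "unknown NMI" complaint.
 */
static bool nmi_handler(int cpu)
{
    if (!mask_test_and_clear(cpu))
        return false;

    /* The real code would call show_regs() and dump_stack() here. */
    printf("NMI backtrace for cpu %d\n", cpu);
    return true;
}

/*
 * Mark every CPU, then deliver the NMI right away instead of waiting
 * for a watchdog tick. Returns how many CPUs produced a backtrace.
 */
static int trigger_all_cpu_backtrace(void)
{
    int handled = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        mask_set(cpu);

    /* Simulated IPI: invoke each CPU's handler directly. */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (nmi_handler(cpu))
            handled++;

    return handled;
}
```

Calling trigger_all_cpu_backtrace() from a main() prints one line per
simulated CPU and leaves the mask empty. An NMI that arrives when the
CPU's bit is not set makes nmi_handler() return false, which is the case
where the usual unknown-NMI handling would still apply.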

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02 13:27:17 +02:00
apic_flat_64.c x86: don't call read_apic_id if !cpu_has_apic 2009-05-18 08:43:25 +02:00
apic.c x86: Remove unused variable disable_x2apic 2009-07-03 14:34:27 +02:00
bigsmp_32.c cpumask: use new cpumask functions throughout x86 2009-03-13 14:49:54 +10:30
es7000_32.c x86: Fix false positive section mismatch in es7000_32.c 2009-07-13 11:03:26 +02:00
io_apic.c x86/pci: insert ioapic resource before assigning unassigned resources 2009-07-10 13:03:14 -07:00
ipi.c x86, apic: move APIC drivers to arch/x86/kernel/apic/* 2009-02-17 18:17:36 +01:00
Makefile x86, apic: separate 32-bit setup functionality out of apic_32.c 2009-02-17 23:12:48 +01:00
nmi.c debug lockups: Improve lockup detection 2009-08-02 13:27:17 +02:00
numaq_32.c x86, apic: Fix false positive section mismatch in numaq_32.c 2009-07-13 11:03:27 +02:00
probe_32.c x86: Remove duplicated #include's 2009-06-17 19:02:35 +02:00
probe_64.c x86: x2apic, IR: Clean up X86_X2APIC and INTR_REMAP config checks 2009-04-21 09:08:25 +02:00
summit_32.c x86: Remove duplicated #include's 2009-06-17 19:02:35 +02:00
x2apic_cluster.c x86: apic/x2apic_cluster.c x86_cpu_to_logical_apicid should be static 2009-04-12 12:39:24 +02:00
x2apic_phys.c x86: add x2apic_wrmsr_fence() to x2apic flush tlb paths 2009-03-18 09:36:14 +01:00
x2apic_uv_x.c Merge branch 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-15 10:06:19 -07:00