linux_dsm_epyc7002/arch
David S. Miller e6617c6ec2 sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.
This is a compromise and a temporary workaround for bootup NMI
watchdog triggers some people see with qla2xxx devices present.

This happens when, for example:

CPU 0 is in the driver init and looping submitting mailbox commands to
load the firmware, then waiting for completion.

CPU 1 is receiving the device interrupts.  CPU 1 is where the NMI
watchdog triggers.

CPU 0 is submitting mailbox commands fast enough that by the time CPU
1 returns from the device interrupt handler, a new one is pending.
This sequence runs for more than 5 seconds.

The problematic case is CPU 1's timer interrupt running when the
barrage of device interrupts begin.  Then we have:

	timer interrupt
	return for softirq checking
	pending, thus enable interrupts

		 qla2xxx interrupt
		 return
		 qla2xxx interrupt
		 return
		 ... 5+ seconds pass
		 final qla2xxx interrupt for fw load
		 return

	run timer softirq
	return

At some point in the multi-second qla2xxx interrupt storm we trigger
the NMI watchdog on CPU 1 from the NMI interrupt handler.

The timer softirq, once we get back to running it, is smart enough to
run the timer work enough times to make up for the missed timer
interrupts.

However, the NMI watchdogs (both x86 and sparc) use the timer
interrupt count to notice the cpu is wedged.  But in the above
scenerio we'll receive only one such timer interrupt even if we last
all the way back to running the timer softirq.

The default watchdog trigger point is only 5 seconds, which is pretty
low (the softwatchdog triggers at 60 seconds).  So increase it to 30
seconds for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-03 02:35:20 -07:00
..
alpha Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
arm Merge git://git.infradead.org/mtd-2.6 2009-06-22 16:56:22 -07:00
avr32 Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
blackfin Blackfin: fix dma-mapping build errors 2009-06-22 22:31:00 -04:00
cris Merge branch 'for-linus' of git://www.jni.nu/cris 2009-06-23 10:47:01 -07:00
frv Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
h8300 h8/300: fix incorrect "select" directives in arch/h8300/Kconfig.cpu. 2009-06-23 12:50:05 -07:00
ia64 Merge branches 'acerhdf', 'acpi-pci-bind', 'bjorn-pci-root', 'bugzilla-12904', 'bugzilla-13121', 'bugzilla-13396', 'bugzilla-13533', 'bugzilla-13612', 'c3_lock', 'hid-cleanups', 'misc-2.6.31', 'pdc-leak-fix', 'pnpacpi', 'power_nocheck', 'thinkpad_acpi', 'video' and 'wmi' into release 2009-06-24 01:19:50 -04:00
m32r Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
m68k Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
m68knommu ptrace: remove PT_DTRACE from m68k, m68knommu 2009-06-18 13:03:48 -07:00
microblaze Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
mips MIPS: Cavium: Add CPU hotplugging code. 2009-06-24 18:34:40 +01:00
mn10300 MN10300: Fix the vmlinux ldscript 2009-06-22 13:34:50 -07:00
parisc Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
powerpc Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2009-06-22 11:59:51 -07:00
s390 [S390] Update default configuration. 2009-06-22 12:08:25 +02:00
sh sh: Fix up HAVE_PERF_COUNTERS typo. 2009-06-24 01:41:05 +09:00
sparc sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds. 2009-09-03 02:35:20 -07:00
um Move FAULT_FLAG_xyz into handle_mm_fault() callers 2009-06-21 13:08:22 -07:00
x86 Revert "PCI: use ACPI _CRS data by default" 2009-06-24 16:23:03 -07:00
xtensa xtensa: enable m41t80 driver in s6105_defconfig 2009-06-22 02:38:11 -07:00
.gitignore
Kconfig gcov: add gcov profiling infrastructure 2009-06-18 13:03:57 -07:00