linux_dsm_epyc7002/drivers/acpi
Chris Bainbridge add68d6aa9 ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code
In the SBS initialisation, a reentrant call to wait_event_timeout()
causes an intermittent boot stall of several minutes usually following
the "Switching to clocksource tsc" message. Another symptom of this bug
is high CPU usage from programs (Firefox, upowerd) querying the battery
state. This is caused by:

 1. drivers/acpi/sbshc.c wait_transaction_complete() calls
    wait_event_timeout():

 	if (wait_event_timeout(hc->wait, smb_check_done(hc),
 			       msecs_to_jiffies(timeout)))

 2. ___wait_event sets task state to uninterruptible

 3. ___wait_event calls the "condition" smb_check_done()

 4. smb_check_done (sbshc.c) calls through to ec_read() in
    drivers/acpi/ec.c

 5. ec_guard() is reached which calls wait_event_timeout()

 	if (wait_event_timeout(ec->wait,
 			       ec_transaction_completed(ec),
 			       guard))

    i.e. wait_event_timeout() is being called again inside the evaluation
    of the previous wait_event_timeout() condition

 6. The EC IRQ handler calls wake_up() and wakes up the sleeping task in
    ec_guard()

 7. The task is now in state running even though the wait "condition" is
    still being evaluated

 8. The "condition" check returns false so ___wait_event calls
    schedule_timeout()

 9. Since the task state is running, the scheduler immediately schedules
    it again

 10. This loop usually repeats for around 250 seconds even though the
     original wait_event_timeout was only 1000ms.

    The timeout is incorrect because each call to schedule_timeout()
    usually returns immediately, taking less than 1ms, so the jiffies
    timeout counter is not decremented. The task is now stuck in a
    running state, and so is highly likely to be immediately
    rescheduled, which takes less than a jiffy. The loop will never exit
    if all schedule_timeout() calls take less than a jiffy.
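
The timeout arithmetic described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the jiffy
length (HZ=250) and the two per-iteration costs are made-up values, and
model_schedule_timeout() and simulate_wait() are hypothetical names, not
kernel functions. It shows that when only the jiffies elapsed inside
schedule_timeout() are charged, time spent evaluating the condition is
invisible to the timeout. The numbers are not meant to reproduce the
reported 250 seconds.

```c
#include <assert.h>

/* Illustrative parameters (assumptions, not values from the kernel source). */
#define JIFFY_US       4000L  /* one jiffy at an assumed HZ=250 */
#define COND_COST_US    993L  /* cost of evaluating the condition (nested EC read) */
#define SCHED_COST_US    10L  /* cost of schedule_timeout() returning immediately */

static long now_us;  /* simulated wall clock in microseconds */

static long jiffies_now(void)
{
	return now_us / JIFFY_US;
}

/*
 * Model of schedule_timeout() for a task that is already RUNNING: it
 * returns almost at once and charges only the jiffies that ticked while
 * it was executing.
 */
static long model_schedule_timeout(long timeout)
{
	long expire = jiffies_now() + timeout;
	long left;

	now_us += SCHED_COST_US;
	left = expire - jiffies_now();
	return left < 0 ? 0 : left;
}

/* Returns the simulated wall-clock time (in us) before the wait times out. */
static long simulate_wait(long timeout_jiffies)
{
	long start = now_us;
	long left = timeout_jiffies;

	assert(timeout_jiffies >= 0);
	while (left > 0) {
		now_us += COND_COST_US;  /* condition is false; this time is never charged */
		left = model_schedule_timeout(left);
	}
	return now_us - start;
}
```

Calling simulate_wait(250) models a 1000ms timeout. Under these assumed
costs the simulated wait lasts well over ten times the nominal timeout,
because most jiffy boundaries fall inside the condition evaluation rather
than inside schedule_timeout().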

Fix this by replacing SMBus reads in the wait_event_timeout condition
with checks of a boolean value that is updated by the EC query handler.
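
The shape of this fix can be sketched in a userspace model. Everything here
is an assumption made for illustration: struct model_hc and the model_*
functions are hypothetical and are not the actual sbshc.c code. The point is
only that the wait condition reads a flag and performs no I/O, so it cannot
sleep or nest another wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host controller state; not the kernel struct. */
struct model_hc {
	bool done;        /* set by the query handler when a transaction finishes */
	int io_reads;     /* counts simulated SMBus/EC register reads */
	int status_reg;   /* simulated status register; nonzero means "done" */
};

/* Simulated register read: the only place that touches the "hardware". */
static int model_read_status(struct model_hc *hc)
{
	hc->io_reads++;
	return hc->status_reg;
}

/* Runs in the query-handler context: does the I/O and records the result. */
static void model_query_handler(struct model_hc *hc)
{
	if (model_read_status(hc))
		hc->done = true;
	/* A real handler would wake the waiter here. */
}

/* Wait condition: reads a flag only, so it can never sleep or nest a wait. */
static bool model_wait_condition(const struct model_hc *hc)
{
	return hc->done;
}
```

In this model the condition can be evaluated any number of times without
increasing io_reads; only model_query_handler() touches the simulated
register.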

Link: https://bugzilla.kernel.org/show_bug.cgi?id=107191
Link: https://lkml.org/lkml/2015/11/6/776
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-16 23:23:45 +01:00
acpica ACPICA: Debugger: Fix dead lock issue ocurred in single stepping mode 2015-10-22 02:05:06 +02:00
apei Merge branch 'x86/urgent' into core/efi, to pick up a pending EFI fix 2015-10-14 16:05:18 +02:00
pmic ACPI/PMIC: Fix typo in MODULE_DESCRIPTION in intel_pmic_crc.c 2015-03-26 21:34:51 +01:00
ac.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
acpi_apd.c ACPI: Remove clk.h include 2015-07-20 10:52:45 -07:00
acpi_cmos_rtc.c
acpi_extlog.c
acpi_ipmi.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
acpi_lpat.c ACPI / LPAT: Common table processing functions 2015-01-29 21:02:10 +08:00
acpi_lpss.c PM / PCI / ACPI: Kick devices that might have been reset by firmware 2015-10-14 02:17:34 +02:00
acpi_memhotplug.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
acpi_pad.c ACPI / PAD: power_saving_thread() is not freezable 2015-10-26 04:42:54 +01:00
acpi_platform.c device property: ACPI: Make use of the new DMA Attribute APIs 2015-11-07 01:29:22 +01:00
acpi_pnp.c ACPI / scan: constify first argument of struct acpi_scan_handler::match 2015-09-15 02:56:29 +02:00
acpi_processor.c ACPI: Add weak routines for ACPI CPU Hotplug 2015-10-12 23:08:03 +02:00
acpi_video.c ACPI / video: only register backlight for LCD device 2015-11-02 01:37:30 +01:00
battery.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
battery.h
bgrt.c
blacklist.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
bus.c ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED() 2015-09-15 03:05:45 +02:00
button.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
cm_sbs.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
container.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
cppc_acpi.c ACPI / CPPC: Fix potential memory leak 2015-10-26 04:47:02 +01:00
custom_method.c
debugfs.c ACPI: fix acpi_debugfs_init prototype 2015-08-07 02:55:18 +02:00
device_pm.c PM / PCI / ACPI: Kick devices that might have been reset by firmware 2015-10-14 02:17:34 +02:00
device_sysfs.c ACPI / property: Expose data-only subnodes via sysfs 2015-09-15 01:47:34 +02:00
dock.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
ec_sys.c ACPI / EC: Fix broken 64bit big-endian users of 'global_lock' 2015-10-04 11:36:07 +01:00
ec.c ACPI / EC: Fix a race issue in acpi_ec_guard_event() 2015-09-26 01:46:25 +02:00
event.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
fan.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
glue.c Merge branch 'acpi-pci' 2015-11-07 01:30:10 +01:00
gsi.c acpi/gsi: Cleanup acpi_register_gsi 2015-10-13 19:01:25 +02:00
hed.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
int340x_thermal.c ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED() 2015-09-15 03:05:45 +02:00
internal.h driver core update for 4.4-rc1 2015-11-04 21:50:37 -08:00
ioapic.c x86/irq, ACPI: Implement ACPI driver to support IOAPIC hotplug 2015-02-05 15:09:26 +01:00
Kconfig Merge branch 'acpi-processor' 2015-11-02 00:50:37 +01:00
Makefile ACPI: Introduce CPU performance controls using CPPC 2015-10-12 22:49:55 +02:00
nfit.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
nfit.h libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
numa.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
nvs.c
osl.c asm-generic cleanups 2015-11-06 14:22:15 -08:00
pci_irq.c ACPI, PCI, irq: Do not share PCI IRQ with ISA IRQ 2015-09-26 01:53:07 +02:00
pci_link.c ACPI / PCI: Remove duplicated penalty on SCI IRQ 2015-09-26 01:53:07 +02:00
pci_root.c PCI/ACPI: Add interface acpi_pci_root_create() 2015-10-16 22:18:51 +02:00
pci_slot.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
power.c Merge branch 'acpi-pm' 2015-09-01 03:38:43 +02:00
proc.c ACPI: change acpi_sleep_proc_init() to return void 2015-09-15 03:03:15 +02:00
processor_core.c ACPI / processor: Introduce invalid_phys_cpuid() 2015-05-13 23:28:16 +02:00
processor_driver.c CPPC: Probe for CPPC tables for each ACPI Processor object 2015-10-12 23:08:04 +02:00
processor_idle.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
processor_pdc.c ACPI / processor: Introduce invalid_logical_cpuid() 2015-05-13 23:28:14 +02:00
processor_perflib.c Merge branch 'pm-cpufreq' 2015-09-01 15:52:35 +02:00
processor_thermal.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
processor_throttling.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
property.c ACPI / property: Fix subnode lookup scope for data-only subnodes 2015-10-22 00:54:03 +02:00
reboot.c
resource.c ACPI/PCI: Enhance ACPI core to support sparse IO space 2015-10-16 22:18:51 +02:00
sbs.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
sbshc.c ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code 2015-11-16 23:23:45 +01:00
sbshc.h
scan.c Merge branch 'acpi-pci' 2015-11-07 01:30:10 +01:00
sleep.c Merge branch 'pm-sleep' 2015-11-02 00:52:19 +01:00
sleep.h ACPI / sleep: Drop acpi_suspend() which is not used 2015-03-18 12:53:21 +01:00
sysfs.c ACPI / sysfs: correctly check failing memory allocation 2015-10-26 04:57:27 +01:00
tables.c ACPI / tables: test the correct variable 2015-10-15 01:31:24 +02:00
thermal.c linux/thermal.h: rename KELVIN_TO_CELSIUS to DECI_KELVIN_TO_CELSIUS 2015-10-10 11:32:30 +08:00
utils.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
video_detect.c ACPI / video: Add a quirk to force acpi-video backlight on Dell XPS L421X 2015-11-02 01:36:31 +01:00
wakeup.c