linux_dsm_epyc7002/drivers/leds
Andrea Righi 736924801c leds: trigger: fix potential deadlock with libata
commit 27af8e2c90fba242460b01fa020e6e19ed68c495 upstream.

We have the following potential deadlock condition:

 ========================================================
 WARNING: possible irq lock inversion dependency detected
 5.10.0-rc2+ #25 Not tainted
 --------------------------------------------------------
 swapper/3/0 just changed the state of lock:
 ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
 but this lock took another, HARDIRQ-READ-unsafe lock in the past:
  (&trig->leddev_list_lock){.+.?}-{2:2}

 and interrupts could create inverse lock ordering between them.

 other info that might help us debug this:
  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&trig->leddev_list_lock);
                                local_irq_disable();
                                lock(&host->lock);
                                lock(&trig->leddev_list_lock);
   <Interrupt>
     lock(&host->lock);

  *** DEADLOCK ***

 no locks held by swapper/3/0.

 the shortest dependencies between 2nd lock and 1st lock:
  -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 {
     HARDIRQ-ON-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       rfkill_global_led_trigger_worker+0x94/0xb0
                       process_one_work+0x240/0x560
                       worker_thread+0x58/0x3d0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     IN-SOFTIRQ-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       kbd_bh+0x9e/0xc0
                       tasklet_action_common.constprop.0+0xe9/0x100
                       tasklet_action+0x22/0x30
                       __do_softirq+0xcc/0x46d
                       run_ksoftirqd+0x3f/0x70
                       smpboot_thread_fn+0x116/0x1f0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     SOFTIRQ-ON-R at:
                       lock_acquire+0x15f/0x420
                       _raw_read_lock+0x42/0x90
                       led_trigger_event+0x2b/0x70
                       rfkill_global_led_trigger_worker+0x94/0xb0
                       process_one_work+0x240/0x560
                       worker_thread+0x58/0x3d0
                       kthread+0x151/0x170
                       ret_from_fork+0x1f/0x30
     INITIAL READ USE at:
                           lock_acquire+0x15f/0x420
                           _raw_read_lock+0x42/0x90
                           led_trigger_event+0x2b/0x70
                           rfkill_global_led_trigger_worker+0x94/0xb0
                           process_one_work+0x240/0x560
                           worker_thread+0x58/0x3d0
                           kthread+0x151/0x170
                           ret_from_fork+0x1f/0x30
   }
   ... key      at: [<ffffffff83da4c00>] __key.0+0x0/0x10
   ... acquired at:
    _raw_read_lock+0x42/0x90
    led_trigger_blink_oneshot+0x3b/0x90
    ledtrig_disk_activity+0x3c/0xa0
    ata_qc_complete+0x26/0x450
    ata_do_link_abort+0xa3/0xe0
    ata_port_freeze+0x2e/0x40
    ata_hsm_qc_complete+0x94/0xa0
    ata_sff_hsm_move+0x177/0x7a0
    ata_sff_pio_task+0xc7/0x1b0
    process_one_work+0x240/0x560
    worker_thread+0x58/0x3d0
    kthread+0x151/0x170
    ret_from_fork+0x1f/0x30

 -> (&host->lock){-...}-{2:2} ops: 69 {
    IN-HARDIRQ-W at:
                     lock_acquire+0x15f/0x420
                     _raw_spin_lock_irqsave+0x52/0xa0
                     ata_bmdma_interrupt+0x27/0x200
                     __handle_irq_event_percpu+0xd5/0x2b0
                     handle_irq_event+0x57/0xb0
                     handle_edge_irq+0x8c/0x230
                     asm_call_irq_on_stack+0xf/0x20
                     common_interrupt+0x100/0x1c0
                     asm_common_interrupt+0x1e/0x40
                     native_safe_halt+0xe/0x10
                     arch_cpu_idle+0x15/0x20
                     default_idle_call+0x59/0x1c0
                     do_idle+0x22c/0x2c0
                     cpu_startup_entry+0x20/0x30
                     start_secondary+0x11d/0x150
                     secondary_startup_64_no_verify+0xa6/0xab
    INITIAL USE at:
                    lock_acquire+0x15f/0x420
                    _raw_spin_lock_irqsave+0x52/0xa0
                    ata_dev_init+0x54/0xe0
                    ata_link_init+0x8b/0xd0
                    ata_port_alloc+0x1f1/0x210
                    ata_host_alloc+0xf1/0x130
                    ata_host_alloc_pinfo+0x14/0xb0
                    ata_pci_sff_prepare_host+0x41/0xa0
                    ata_pci_bmdma_prepare_host+0x14/0x30
                    piix_init_one+0x21f/0x600
                    local_pci_probe+0x48/0x80
                    pci_device_probe+0x105/0x1c0
                    really_probe+0x221/0x490
                    driver_probe_device+0xe9/0x160
                    device_driver_attach+0xb2/0xc0
                    __driver_attach+0x91/0x150
                    bus_for_each_dev+0x81/0xc0
                    driver_attach+0x1e/0x20
                    bus_add_driver+0x138/0x1f0
                    driver_register+0x91/0xf0
                    __pci_register_driver+0x73/0x80
                    piix_init+0x1e/0x2e
                    do_one_initcall+0x5f/0x2d0
                    kernel_init_freeable+0x26f/0x2cf
                    kernel_init+0xe/0x113
                    ret_from_fork+0x1f/0x30
  }
  ... key      at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10
  ... acquired at:
    __lock_acquire+0x9da/0x2370
    lock_acquire+0x15f/0x420
    _raw_spin_lock_irqsave+0x52/0xa0
    ata_bmdma_interrupt+0x27/0x200
    __handle_irq_event_percpu+0xd5/0x2b0
    handle_irq_event+0x57/0xb0
    handle_edge_irq+0x8c/0x230
    asm_call_irq_on_stack+0xf/0x20
    common_interrupt+0x100/0x1c0
    asm_common_interrupt+0x1e/0x40
    native_safe_halt+0xe/0x10
    arch_cpu_idle+0x15/0x20
    default_idle_call+0x59/0x1c0
    do_idle+0x22c/0x2c0
    cpu_startup_entry+0x20/0x30
    start_secondary+0x11d/0x150
    secondary_startup_64_no_verify+0xa6/0xab

This lockdep splat is reported after:
commit e918188611 ("locking: More accurate annotations for read_lock()")

To clarify:
 - read-locks are recursive only in interrupt context (when
   in_interrupt() returns true)
 - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call
   write_lock(&trig->leddev_list_lock) that would be blocked by CPU0
   that holds trig->leddev_list_lock in read-mode
 - when CPU1 (ata_ac_complete()) tries to read-lock
   trig->leddev_list_lock, it would be blocked by the write-lock waiter
   on CPU2 (because we are not in interrupt context, so the read-lock is
   not recursive)
 - at this point if an interrupt happens on CPU0 and
   ata_bmdma_interrupt() is executed it will try to acquire host->lock,
   that is held by CPU1, that is currently blocked by CPU2, so:

   * CPU0 blocked by CPU1
   * CPU1 blocked by CPU2
   * CPU2 blocked by CPU0

     *** DEADLOCK ***

The deadlock scenario is better represented by the following schema
(thanks to Boqun Feng <boqun.feng@gmail.com> for the schema and the
detailed explanation of the deadlock condition):

 CPU 0:                          CPU 1:                        CPU 2:
 -----                           -----                         -----
 led_trigger_event():
   read_lock(&trig->leddev_list_lock);
 				<workqueue>
 				ata_hsm_qc_complete():
 				  spin_lock_irqsave(&host->lock);
 								write_lock(&trig->leddev_list_lock);
 				  ata_port_freeze():
 				    ata_do_link_abort():
 				      ata_qc_complete():
 					ledtrig_disk_activity():
 					  led_trigger_blink_oneshot():
 					    read_lock(&trig->leddev_list_lock);
 					    // ^ not in in_interrupt() context, so could get blocked by CPU 2
 <interrupt>
   ata_bmdma_interrupt():
     spin_lock_irqsave(&host->lock);

Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so
that no interrupt can happen in between, preventing the deadlock
condition.

Apply the same change to led_trigger_blink_setup() as well, since the
same deadlock scenario can also happen in power_supply_update_bat_leds()
-> led_trigger_blink() -> led_trigger_blink_setup() (workqueue context),
and potentially prevent other similar usages.

Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/
Fixes: eb25cb9956 ("leds: convert IDE trigger to common disk trigger")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:28:41 +01:00
..
trigger ledtrig-cpu: Limit to 8 CPUs 2020-09-30 19:15:40 +02:00
Kconfig leds: Add driver for Acer Iconia Tab A500 2020-09-26 21:56:42 +02:00
led-class-flash.c
led-class-multicolor.c leds: multicolor: Introduce a multicolor class definition 2020-07-22 14:41:29 +02:00
led-class.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
led-core.c leds: disallow /sys/class/leds/*:multi:* for now 2020-08-03 14:00:06 +02:00
led-triggers.c leds: trigger: fix potential deadlock with libata 2021-02-03 23:28:41 +01:00
leds-88pm860x.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-aat1290.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-acer-a500.c leds: Add driver for Acer Iconia Tab A500 2020-09-26 21:56:42 +02:00
leds-adp5520.c
leds-an30259a.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-apu.c
leds-ariel.c
leds-as3645a.c
leds-asic3.c
leds-aw2013.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-bcm6328.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-bcm6358.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-bd2802.c
leds-blinkm.c
leds-clevo-mail.c
leds-cobalt-qube.c
leds-cobalt-raq.c
leds-cpcap.c leds: various: use device_get_match_data 2020-09-26 21:56:39 +02:00
leds-cr0014114.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-da903x.c leds: da903x: fix use-after-free on unbind 2020-06-22 10:37:58 +02:00
leds-da9052.c
leds-dac124s085.c
leds-el15203000.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-fsg.c
leds-gpio-register.c
leds-gpio.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-hp6xx.c
leds-ip30.c leds: ip30: compile if COMPILE_TEST=y 2020-09-26 21:56:38 +02:00
leds-ipaq-micro.c
leds-is31fl32xx.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-is31fl319x.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-ktd2692.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-lm355x.c leds: drop redundant struct-device pointer casts 2020-06-22 10:38:00 +02:00
leds-lm3530.c
leds-lm3532.c leds: lm3532: Fix warnings for undefined parameters 2020-09-30 18:53:28 +02:00
leds-lm3533.c leds: lm3533: fix use-after-free on unbind 2020-06-22 10:37:59 +02:00
leds-lm3601x.c leds: Replace HTTP links with HTTPS ones 2020-07-22 14:48:40 +02:00
leds-lm3642.c leds: drop redundant struct-device pointer casts 2020-06-22 10:38:00 +02:00
leds-lm3692x.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-lm3697.c leds: lm3697: Fix out-of-bound access 2020-10-05 20:36:09 +02:00
leds-lm36274.c leds: lm36274: Fix warning for undefined parameters 2020-09-30 18:53:28 +02:00
leds-locomo.c
leds-lp50xx.c leds: lp50xx: Fix an error handling path in 'lp50xx_probe_dt()' 2020-12-30 11:53:22 +01:00
leds-lp55xx-common.c leds: various: fix OF node leaks 2020-09-26 21:56:39 +02:00
leds-lp55xx-common.h leds: lp55xx: Add multicolor framework support to lp55xx 2020-07-22 14:42:06 +02:00
leds-lp3944.c
leds-lp3952.c
leds-lp5521.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-lp5523.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-lp5562.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-lp8501.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-lp8788.c
leds-lp8860.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-lt3593.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-max8997.c
leds-max77650.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-max77693.c leds: various: use dev_of_node(dev) instead of dev->of_node 2020-09-26 21:56:39 +02:00
leds-mc13783.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-menf21bmc.c
leds-mlxcpld.c
leds-mlxreg.c
leds-mt6323.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-net48xx.c
leds-netxbig.c leds: netxbig: add missing put_device() call in netxbig_leds_get_of_pdata() 2020-12-30 11:53:21 +01:00
leds-nic78bx.c
leds-ns2.c leds: ns2: do not guard OF match pointer with of_match_ptr 2020-09-30 19:22:58 +02:00
leds-ot200.c
leds-pca955x.c leds: pca955x: Add an IBM software implementation of the PCA9552 chip 2020-08-17 22:24:21 +02:00
leds-pca963x.c leds: pca963x: use struct led_init_data when registering 2020-09-30 19:15:42 +02:00
leds-pca9532.c leds: pca9532: read pwm settings from device tree 2020-09-30 18:53:28 +02:00
leds-pm8058.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-powernv.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-pwm.c leds: pwm: Remove platform_data support 2020-10-07 12:02:58 +02:00
leds-rb532.c
leds-regulator.c
leds-s3c24xx.c leds: s3c24xx: Remove unused machine header include 2020-08-17 18:04:06 +02:00
leds-sc27xx-bltc.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-sgm3140.c leds: sgm3140: Simplify with dev_err_probe() 2020-09-09 11:20:09 +02:00
leds-spi-byte.c leds: various: use only available OF children 2020-09-26 21:56:39 +02:00
leds-ss4200.c
leds-sunfire.c
leds-syscon.c leds: parse linux,default-trigger DT property in LED core 2020-09-26 21:56:43 +02:00
leds-tca6507.c leds: tca6507: remove binding comment 2020-09-30 19:15:41 +02:00
leds-ti-lmu-common.c
leds-tlc591xx.c leds: tlc591xx: fix leak of device node iterator 2020-09-30 19:20:46 +02:00
leds-tps6105x.c
leds-turris-omnia.c leds: turris-omnia: check for LED_COLOR_ID_RGB instead LED_COLOR_ID_MULTI 2020-12-30 11:53:22 +01:00
leds-wm831x-status.c leds: wm831x-status: fix use-after-free on unbind 2020-06-22 10:38:00 +02:00
leds-wm8350.c
leds-wrap.c
leds.h
Makefile leds: Add driver for Acer Iconia Tab A500 2020-09-26 21:56:42 +02:00
TODO leds: TODO: Add documentation about possible subsystem improvements 2020-09-30 19:15:33 +02:00
uleds.c