powerpc/pseries: Don't enforce MSI affinity with kdump
commit f9619d5e5174867536b7e558683bc4408eab833f upstream.

Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as designed for managed IRQs. This can be a problem with multi-queue block devices driven by blk-mq: such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point.

This is a regression caused by commit 9ea69a55b3 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()").

Note that this only happens with the XIVE interrupt controller, because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter.

The issue comes from a combination of factors:
- a discrepancy between the number of queues detected by the multi-queue block driver, which was used to create the MSI vectors, and the single-queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue)
- CPU #0 being offline (i.e. kdump always succeeds with CPU #0 online)

Given that I couldn't reproduce on x86, which seems to always have CPU #0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach: fine-grained affinity is for performance, and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only.

Fixes: 9ea69a55b3 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9c55f22a0
parent ac022fbee6
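For context on the "single queue enforced by blk-mq during kdump" mentioned in the message, below is a minimal sketch of the queue clamp applied by blk_mq_alloc_tag_set(), paraphrased from v5.10-era block/blk-mq.c. It is illustrative rather than a verbatim excerpt, and the helper name kdump_clamp_tag_set is hypothetical: in the real kernel this logic sits inline in blk_mq_alloc_tag_set().

/*
 * Sketch of the kdump clamp in blk_mq_alloc_tag_set() (block/blk-mq.c,
 * v5.10-era; simplified, not verbatim). Under kdump, blk-mq falls back
 * to a single hardware queue, so a multi-queue device that allocated
 * one MSI vector per queue ends up using only one of them.
 */
#include <linux/blk-mq.h>
#include <linux/crash_dump.h>

static void kdump_clamp_tag_set(struct blk_mq_tag_set *set)	/* hypothetical helper */
{
	if (is_kdump_kernel()) {
		set->nr_hw_queues = 1;	/* single hardware queue only */
		set->nr_maps = 1;
		set->queue_depth = min(64U, set->queue_depth);
	}
}

Combined with CPU #0 being offline, the one queue that survives this clamp can end up paired with an MSI vector whose managed affinity mask contains no online CPU, which is exactly the hang described above.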
arch/powerpc/platforms/pseries/msi.c

@@ -4,6 +4,7 @@
  * Copyright 2006-2007 Michael Ellerman, IBM Corp.
  */
 
+#include <linux/crash_dump.h>
 #include <linux/device.h>
 #include <linux/irq.h>
 #include <linux/msi.h>
@@ -458,8 +459,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping_affinity(NULL, hwirq,
-						   entry->affinity);
+		/*
+		 * Depending on the number of online CPUs in the original
+		 * kernel, it is likely for CPU #0 to be offline in a kdump
+		 * kernel. The associated IRQs in the affinity mappings
+		 * provided by irq_create_affinity_masks() are thus not
+		 * started by irq_startup(), as per-design for managed IRQs.
+		 * This can be a problem with multi-queue block devices driven
+		 * by blk-mq : such a non-started IRQ is very likely paired
+		 * with the single queue enforced by blk-mq during kdump (see
+		 * blk_mq_alloc_tag_set()). This causes the device to remain
+		 * silent and likely hangs the guest at some point.
+		 *
+		 * We don't really care for fine-grained affinity when doing
+		 * kdump actually : simply ignore the pre-computed affinity
+		 * masks in this case and let the default mask with all CPUs
+		 * be used when creating the IRQ mappings.
+		 */
+		if (is_kdump_kernel())
+			virq = irq_create_mapping(NULL, hwirq);
+		else
+			virq = irq_create_mapping_affinity(NULL, hwirq,
+							   entry->affinity);
 
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
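For reference, is_kdump_kernel() — the reason for the new <linux/crash_dump.h> include — boils down to checking whether the crashed kernel handed over an ELF core header. A simplified sketch, assuming CONFIG_CRASH_DUMP=y (v5.10-era include/linux/crash_dump.h; not a verbatim copy):

/*
 * Simplified sketch of is_kdump_kernel() (include/linux/crash_dump.h,
 * CONFIG_CRASH_DUMP=y; not verbatim). A kdump kernel is identified by
 * the ELF core header address passed via the elfcorehdr= parameter;
 * ELFCORE_ADDR_MAX means no header was handed over, i.e. a regular boot.
 */
static inline bool is_kdump_kernel(void)
{
	return elfcorehdr_addr != ELFCORE_ADDR_MAX;
}

This is why the fix is safe outside of kdump: on a regular boot the check is false and the affinity-aware mapping path from commit 9ea69a55b3 is used unchanged.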