linux_dsm_epyc7002/drivers
Mike Marciniszyn de427b662b IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
commit 5de61a47eb9064cbbc5f3360d639e8e34a690a54 upstream.

A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000000000000000
  PGD 0 P4D 0
  Oops: 0000 1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
  RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
  RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
  R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
  FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NO_NUMA_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation and remove the other pcibus_to_node() calls in the probe path
and use dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.

Fixes: 4730f4a6c6 ("IB/hfi1: Activate the dummy netdev")
Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-14 08:41:58 +02:00
..
accessibility speakup: fix uninitialized flush_lock 2020-12-30 11:53:44 +01:00
acpi ACPI: processor: Fix build when CONFIG_ACPI_PROCESSOR=m 2021-04-14 08:41:58 +02:00
amba amba: Fix resource leak for drivers without .remove 2021-03-04 11:38:02 +01:00
android binder: add flag to clear buffer on txn complete 2020-12-30 11:54:09 +01:00
ata ata: ahci_brcm: Add back regulators management 2021-03-04 11:37:45 +01:00
atm atm: idt77252: fix null-ptr-dereference 2021-03-30 14:31:50 +02:00
auxdisplay auxdisplay: ht16k33: Fix refresh rate handling 2021-03-04 11:38:00 +01:00
base driver core: clear deferred probe reason on probe retry 2021-04-07 15:00:13 +02:00
bcma
block xen-blkback: don't leak persistent grants from xen_blkbk_map() 2021-03-30 14:32:09 +02:00
bluetooth Bluetooth: btqca: Add valid le states quirk 2021-03-11 14:17:22 +01:00
bus bus: ti-sysc: Fix warning on unbind if reset is not deasserted 2021-04-10 13:36:07 +02:00
cdrom
char tpm, tpm_tis: Decorate tpm_get_timeouts() with request_locality() 2021-03-09 11:11:10 +01:00
clk clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1 clk 2021-03-30 14:31:58 +02:00
clocksource clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined 2021-03-04 11:37:57 +01:00
connector
counter counter: stm32-timer-cnt: fix ceiling miss-alignment with reload register 2021-03-25 09:04:16 +01:00
cpufreq cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdev 2021-03-30 14:31:49 +02:00
cpuidle
crypto crypto: sun4i-ss - initialize need_fallback 2021-03-04 11:38:32 +01:00
dax device-dax: Fix default return code of range_parse() 2021-03-04 11:38:15 +01:00
dca
devfreq
dio
dma dmaengine: idxd: set DMA channel to be private 2021-03-04 11:37:57 +01:00
dma-buf dmabuf: fix use-after-free of dmabuf's file->f_inode 2021-01-12 20:18:24 +01:00
edac EDAC/amd64: Do not load on family 0x15, model 0x13 2021-03-07 12:34:08 +01:00
eisa
extcon extcon: Fix error handling in extcon_dev_register 2021-04-07 15:00:11 +02:00
firewire firewire: nosy: Fix a use-after-free bug in nosy_ioctl() 2021-04-07 15:00:11 +02:00
firmware firmware/efi: Fix a use after bug in efi_mem_reserve_persistent 2021-03-25 09:04:18 +01:00
fpga fpga: Specify HAS_IOMEM dependency for FPGA_DFL 2020-12-01 18:46:24 +01:00
fsi fsi: Aspeed: Add mutex to protect HW access 2020-12-30 11:53:46 +01:00
gnss
gpio gpiolib: acpi: Add missing IRQF_ONESHOT 2021-03-30 14:31:49 +02:00
gpu drm/i915: Fix invalid access to ACPI _DSM objects 2021-04-14 08:41:58 +02:00
greybus
hid HID: logitech-dj: add support for the new lightspeed connection iteration 2021-03-17 17:06:24 +01:00
hsi HSI: Fix PM usage counter unbalance in ssi_hw_init 2021-03-04 11:37:52 +01:00
hv Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind() 2021-03-04 11:37:46 +01:00
hwmon hwmon: (dell-smm) Add XPS 15 L502X to fan control blacklist 2021-02-26 10:13:00 +01:00
hwspinlock
hwtracing coresight: etm4x: Handle accesses to TRCSTALLCTLR 2021-03-04 11:38:37 +01:00
i2c i2c: rcar: optimize cacheline to minimize HW race condition 2021-03-17 17:06:22 +01:00
i3c i3c master: fix missing destroy_workqueue() on error in i3c_master_register 2021-01-06 14:56:53 +01:00
ide ide/falconide: Fix module unload 2021-03-04 11:38:21 +01:00
idle intel_idle: Build fix 2020-12-03 10:00:23 +01:00
iio iio: hid-sensor-temperature: Fix issues of timestamp channel 2021-03-25 09:04:16 +01:00
infiniband IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS 2021-04-14 08:41:58 +02:00
input Input: applespi - don't wait for responses to commands indefinitely. 2021-03-17 17:06:24 +01:00
interconnect interconnect: imx8mq: Use icc_sync_state 2021-01-27 11:55:29 +01:00
iommu iommu/amd: Fix performance counter initialization 2021-03-17 17:06:24 +01:00
ipack
irqchip irqchip/ingenic: Add support for the JZ4760 2021-03-30 14:31:50 +02:00
isdn mISDN: fix crash in fritzpci 2021-04-10 13:36:08 +02:00
leds leds: trigger: fix potential deadlock with libata 2021-02-03 23:28:41 +01:00
lightnvm lightnvm: fix memory leak when submit fails 2021-01-27 11:55:22 +01:00
macintosh macintosh/adb-iop: Use big-endian autopoll mask 2021-03-04 11:37:42 +01:00
mailbox mailbox: sprd: correct definition of SPRD_OUTBOX_FIFO_FULL 2021-03-04 11:38:15 +01:00
mcb
md dm table: Fix zoned model check and zone sectors check 2021-03-30 14:32:06 +02:00
media media: rc: compile rc-cec.c into rc-core 2021-03-17 17:06:20 +01:00
memory memory: ti-aemif: Drop child node when jumping out loop 2021-03-04 11:37:25 +01:00
memstick memstick: r592: Fix error return in r592_probe() 2020-12-30 11:53:34 +01:00
message
mfd mfd: gateworks-gsc: Fix interrupt type 2021-03-04 11:38:40 +01:00
misc habanalabs: Call put_pid() when releasing control device 2021-03-30 14:31:50 +02:00
mmc mmc: cqhci: Fix random crash when remove mmc module/card 2021-03-17 17:06:28 +01:00
most
mtd mtd: spi-nor: hisi-sfc: Put child node np on error path 2021-03-04 11:38:37 +01:00
mux
net net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits 2021-04-14 08:41:57 +02:00
nfc nfc: s3fwrn5: Release the nfc firmware 2020-12-30 11:53:53 +01:00
ntb
nubus
nvdimm libnvdimm/dimm: Avoid race between probe and available_slots_show() 2021-02-10 09:29:17 +01:00
nvme nvmet-tcp: fix kmap leak when data digest in use 2021-04-07 15:00:06 +02:00
nvmem nvmem: qcom-spmi-sdam: Fix uninitialized pdev pointer 2021-03-04 11:38:39 +01:00
of of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS 2021-03-09 11:11:15 +01:00
opp opp: Correct debug message in _opp_add_static_v2() 2021-03-04 11:37:27 +01:00
oprofile
parisc
parport
pci PCI: rpadlpar: Fix potential drc_name corruption in store functions 2021-03-25 09:04:16 +01:00
pcmcia
perf perf/arm-cmn: Move IRQs when migrating context 2021-03-04 11:37:44 +01:00
phy phy: lantiq: rcu-usb2: wait after clock enable 2021-03-04 11:38:24 +01:00
pinctrl pinctrl: rockchip: fix restore error in resume 2021-04-07 15:00:11 +02:00
platform platform/x86: intel_pmc_core: Ignore GBE LTR on Tiger Lake platforms 2021-04-10 13:36:09 +02:00
pnp
power power: supply: smb347-charger: Fix interrupt usage if interrupt is unavailable 2021-03-04 11:37:59 +01:00
powercap
pps
ps3 powerpc/ps3: use dma_mapping_error() 2020-12-30 11:53:53 +01:00
ptp ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation 2021-04-10 13:36:09 +02:00
pwm pwm: iqs620a: Fix overflow and optimize calculations 2021-03-04 11:38:17 +01:00
rapidio
ras
regulator regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck 2021-03-30 14:31:51 +02:00
remoteproc remoteproc/mediatek: Fix kernel test robot warning 2021-03-07 12:34:15 +01:00
reset
rpmsg
rtc rtc: zynqmp: depend on HAS_IOMEM 2021-03-04 11:38:03 +01:00
s390 s390/qeth: schedule TX NAPI on QAOB completion 2021-03-25 09:04:13 +01:00
sbus
scsi scsi: qla2xxx: Fix broken #endif placement 2021-04-07 15:00:05 +02:00
sfi
sh
siox
slimbus slimbus: qcom: fix potential NULL dereference in qcom_slim_prg_slew() 2020-12-30 11:53:47 +01:00
soc soc: qcom-geni-se: Cleanup the code to remove proxy votes 2021-04-07 15:00:13 +02:00
soundwire soundwire: intel: fix possible crash when no device is detected 2021-03-04 11:38:22 +01:00
spi spi: cadence: set cqspi to the driver_data field of struct device 2021-03-25 09:04:04 +01:00
spmi spmi: spmi-pmic-arb: Fix hw_irq overflow 2021-03-04 11:38:40 +01:00
ssb
staging staging: rtl8192e: Change state information from u16 to u8 2021-04-07 15:00:13 +02:00
target scsi: target: pscsi: Clean up after failure in pscsi_map_sg() 2021-04-10 13:36:09 +02:00
tc
tee optee: simplify i2c access 2021-03-04 11:37:28 +01:00
thermal thermal/core: Add NULL pointer check before using cooling device stats 2021-04-07 15:00:06 +02:00
thunderbolt thunderbolt: Increase runtime PM reference count on DP tunnel discovery 2021-03-25 09:04:15 +01:00
tty soc: qcom-geni-se: Cleanup the code to remove proxy votes 2021-04-07 15:00:13 +02:00
uio
usb usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable 2021-04-07 15:00:13 +02:00
vdpa vdpa/mlx5: fix param validation in mlx5_vdpa_get_config() 2021-03-04 11:37:17 +01:00
vfio vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends 2021-04-07 15:00:11 +02:00
vhost vhost: Fix vhost_vq_reset() 2021-04-07 15:00:05 +02:00
video drivers: video: fbcon: fix NULL dereference in fbcon_cursor() 2021-04-07 15:00:13 +02:00
virt virt: vbox: Do not use wait_event_interruptible when called from kernel context 2021-03-04 11:37:18 +01:00
virtio virtio_ring: Fix two use after free bugs 2020-12-30 11:54:00 +01:00
visorbus
vlynq
vme
w1 w1: w1_therm: Fix conversion result for negative temperatures 2021-03-04 11:37:18 +01:00
watchdog watchdog: mei_wdt: request stop on unregister 2021-03-04 11:38:36 +01:00
xen xen/evtchn: Change irq_info lock to raw_spinlock_t 2021-04-14 08:41:57 +02:00
zorro
Kconfig
Makefile vdpa: mlx5: fix vdpa/vhost dependencies 2020-12-02 04:09:56 -05:00