linux_dsm_epyc7002/drivers/edac
Robert Richter 216aa145aa EDAC/mc: Fix use-after-free and memleaks during device removal
A test kernel with the options DEBUG_TEST_DRIVER_REMOVE, KASAN and
DEBUG_KMEMLEAK set, revealed several issues when removing an mci device:

1) Use-after-free:

On 27.11.19 17:07:33, John Garry wrote:
> [   22.104498] BUG: KASAN: use-after-free in
> edac_remove_sysfs_mci_device+0x148/0x180

The use-after-free is caused by the mci_for_each_dimm() macro called in
edac_remove_sysfs_mci_device(). The iterator was introduced with

  c498afaf7d ("EDAC: Introduce an mci_for_each_dimm() iterator").

The iterator loop calls device_unregister(&dimm->dev), which removes
the sysfs entry of the device, but also frees the dimm struct in
dimm_attr_release(). When incrementing the loop in mci_for_each_dimm(),
the dimm struct is accessed again, after having been freed already.

The fix is to free all the mci device's subsequent dimm and csrow
objects at a later point, in _edac_mc_free(), when the mci device itself
is being freed.

This keeps the data structures intact and the mci device can be
fully used until its removal. The change allows the safe usage of
mci_for_each_dimm() to release dimm devices from sysfs.

2) Memory leaks:

Following memory leaks have been detected:

 # grep edac /sys/kernel/debug/kmemleak | sort | uniq -c
       1     [<000000003c0f58f9>] edac_mc_alloc+0x3bc/0x9d0      # mci->csrows
      16     [<00000000bb932dc0>] edac_mc_alloc+0x49c/0x9d0      # csr->channels
      16     [<00000000e2734dba>] edac_mc_alloc+0x518/0x9d0      # csr->channels[chn]
       1     [<00000000eb040168>] edac_mc_alloc+0x5c8/0x9d0      # mci->dimms
      34     [<00000000ef737c29>] ghes_edac_register+0x1c8/0x3f8 # see edac_mc_alloc()

All leaks are from memory allocated by edac_mc_alloc().

Note: The test above shows that edac_mc_alloc() was called here from
ghes_edac_register(), thus both functions show up in the stack trace
but the module causing the leaks is edac_mc. The comments with the data
structures involved were made manually by analyzing the objdump.

The data structures listed above and created by edac_mc_alloc() are
not properly removed during device removal, which is done in
edac_mc_free().

There are two paths implemented to remove the device depending on device
registration, _edac_mc_free() is called if the device is not registered
and edac_unregister_sysfs() otherwise.

The implemenations differ. For the sysfs case, the mci device removal
lacks the removal of subsequent data structures (csrows, channels,
dimms). This causes the memory leaks (see mci_attr_release()).

 [ bp: Massage commit message. ]

Fixes: c498afaf7d ("EDAC: Introduce an mci_for_each_dimm() iterator")
Fixes: faa2ad09c0 ("edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs.")
Fixes: 7a623c0390 ("edac: rewrite the sysfs code to use struct device")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: John Garry <john.garry@huawei.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200212120340.4764-3-rrichter@marvell.com
2020-02-13 13:28:52 +01:00
..
altera_edac.c EDAC/altera: Use the Altera System Manager driver 2019-11-22 10:18:29 +01:00
altera_edac.h edac: altera: Move Stratix10 SDRAM ECC to peripheral 2019-07-25 14:28:42 -04:00
amd64_edac_dbg.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
amd64_edac_inj.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
amd64_edac.c Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-01-27 09:19:35 -08:00
amd64_edac.h EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh 2020-01-16 17:09:23 +01:00
amd76x_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
amd8111_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8111_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8131_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8131_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
armada_xp_edac.c ARM: 8891/1: EDAC: armada_xp: Add support for more SoCs 2019-08-29 07:58:01 +01:00
aspeed_edac.c EDAC/aspeed: Remove unneeded semicolon 2019-12-19 07:27:09 +01:00
bluefield_edac.c EDAC, mellanox: Add ECC support for BlueField DDR4 2019-08-08 12:57:01 -03:00
cell_edac.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
cpc925_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
debugfs.c ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper 2019-08-29 07:58:01 +01:00
e7xxx_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
e752x_edac.c EDAC: Fix indentation issues in several EDAC drivers 2018-11-10 16:56:16 +01:00
edac_device_sysfs.c edac: move EDAC device definitions to drivers/edac/edac_device.h 2016-12-15 08:54:51 -02:00
edac_device.c EDAC/device: Rework error logging API 2019-10-09 13:01:42 +02:00
edac_device.h EDAC/device: Rework error logging API 2019-10-09 13:01:42 +02:00
edac_mc_sysfs.c EDAC/mc: Fix use-after-free and memleaks during device removal 2020-02-13 13:28:52 +01:00
edac_mc.c EDAC/mc: Fix use-after-free and memleaks during device removal 2020-02-13 13:28:52 +01:00
edac_mc.h EDAC: Prefer 'unsigned int' to bare use of 'unsigned' 2019-09-03 19:21:19 +02:00
edac_module.c treewide: Fix function prototypes for module_param_call() 2017-10-31 15:30:37 +01:00
edac_module.h ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper 2019-08-29 07:58:01 +01:00
edac_pci_sysfs.c edac: move documentation from edac_pci*.c to edac_pci.h 2016-12-15 08:54:51 -02:00
edac_pci.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
edac_pci.h edac: move documentation from edac_pci*.c to edac_pci.h 2016-12-15 08:54:51 -02:00
fsl_ddr_edac.c EDAC, fsl_ddr: Add LS1021A to the list of supported hardware 2018-12-19 11:57:45 +01:00
fsl_ddr_edac.h EDAC, fsl_ddr: Add LS1021A to the list of supported hardware 2018-12-19 11:57:45 +01:00
ghes_edac.c EDAC/ghes: Do not warn when incrementing refcount on 0 2019-11-22 09:53:08 +01:00
highbank_l2_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
highbank_mc_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
i7core_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
i10nm_base.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
i3000_edac.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
i3200_edac.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
i5000_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
i5100_edac.c EDAC: remove set but not used variable 'ecc_loc' 2019-12-16 13:54:02 -08:00
i5400_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
i7300_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
i82443bxgx_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82860_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82875p_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82975x_edac.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
ie31200_edac.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
Kconfig A garden variety of small fixes all over the place. 2020-01-27 09:16:22 -08:00
layerscape_edac.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
Makefile ARM updates for 5.4-rc1: 2019-09-22 09:39:09 -07:00
mce_amd.c EDAC/mce_amd: Make fam_ops static global 2020-01-16 21:52:48 +01:00
mce_amd.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mpc85xx_edac.c EDAC, mpc85xx: Add T2080 l2-cache support 2017-02-03 10:36:35 +01:00
mpc85xx_edac.h EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx 2016-09-01 10:28:00 +02:00
mv64x60_edac.c EDAC, mv64x60: Fix an error handling path 2018-01-09 20:14:23 +01:00
mv64x60_edac.h edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
octeon_edac-l2c.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
octeon_edac-lmc.c EDAC, octeon: Fix an uninitialized variable warning 2017-11-27 11:57:26 +01:00
octeon_edac-pc.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
octeon_edac-pci.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
pasemi_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
pnd2_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
pnd2_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
ppc4xx_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
ppc4xx_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
qcom_edac.c EDAC, qcom_edac: Remove irq_handled local variable 2018-11-06 12:03:16 +01:00
r82600_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
sb_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
sifive_edac.c A garden variety of small fixes all over the place. 2020-01-27 09:16:22 -08:00
skx_base.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
skx_common.c EDAC: skx_common: downgrade message importance on missing PCI device 2019-12-10 14:14:43 -08:00
skx_common.h EDAC, skx: Retrieve and print retry_rd_err_log registers 2019-10-18 15:27:58 -07:00
synopsys_edac.c EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller 2018-11-06 10:38:27 +01:00
thunderx_edac.c EDAC, thunderx: Fix memory leak in thunderx_l2c_threaded_isr() 2018-10-13 13:58:06 +02:00
ti_edac.c EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function 2019-11-09 10:32:32 +01:00
wq.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
x38_edac.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
xgene_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00