linux_dsm_epyc7002/drivers/gpu/drm/amd/powerplay/smumgr
Ivan Mironov 7808363154 drm/amd/powerplay: Fix NULL dereference in lock_bus() on Vega20 w/o RAS
I updated my system with Radeon VII from kernel 5.6 to kernel 5.7, and
the following started to happen on each boot:

	...
	BUG: kernel NULL pointer dereference, address: 0000000000000128
	...
	CPU: 9 PID: 1940 Comm: modprobe Tainted: G            E     5.7.2-200.im0.fc32.x86_64 #1
	Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 1407 04/02/2020
	RIP: 0010:lock_bus+0x42/0x60 [amdgpu]
	...
	Call Trace:
	 i2c_smbus_xfer+0x3d/0xf0
	 i2c_default_probe+0xf3/0x130
	 i2c_detect.isra.0+0xfe/0x2b0
	 ? kfree+0xa3/0x200
	 ? kobject_uevent_env+0x11f/0x6a0
	 ? i2c_detect.isra.0+0x2b0/0x2b0
	 __process_new_driver+0x1b/0x20
	 bus_for_each_dev+0x64/0x90
	 ? 0xffffffffc0f34000
	 i2c_register_driver+0x73/0xc0
	 do_one_initcall+0x46/0x200
	 ? _cond_resched+0x16/0x40
	 ? kmem_cache_alloc_trace+0x167/0x220
	 ? do_init_module+0x23/0x260
	 do_init_module+0x5c/0x260
	 __do_sys_init_module+0x14f/0x170
	 do_syscall_64+0x5b/0xf0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
	...

The error appears when an i2c device driver tries to probe for devices
using the adapter registered by `smu_v11_0_i2c_eeprom_control_init()`.
The code supporting this adapter requires `adev->psp.ras.ras` to be
non-NULL, which is true only when `amdgpu_ras_init()` detects HW support
by calling `amdgpu_ras_check_supported()`.

Before 9015d60c9e, the adapter was registered by

	-> amdgpu_device_ip_init()
	  -> amdgpu_ras_recovery_init()
	    -> amdgpu_ras_eeprom_init()
	      -> smu_v11_0_i2c_eeprom_control_init()

after verifying that `adev->psp.ras.ras` is not NULL in
`amdgpu_ras_recovery_init()`. Currently it is registered
unconditionally by

	-> amdgpu_device_ip_init()
	  -> pp_sw_init()
	    -> hwmgr_sw_init()
	      -> vega20_smu_init()
	        -> smu_v11_0_i2c_eeprom_control_init()

The fix simply adds a HW support check (ras == NULL => no support) before
calling `smu_v11_0_i2c_eeprom_control_{init,fini}()`.
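
A minimal user-space sketch of the guard described above, not the actual
patch. All `model_*` names and both structs are hypothetical stand-ins
introduced for illustration; only the idea that a NULL `adev->psp.ras.ras`
means "no RAS support" is taken from this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RAS context; contents are irrelevant here. */
struct ras_ctx {
	int unused;
};

/* Hypothetical stand-in for the device; not the real amdgpu_device. */
struct model_device {
	struct ras_ctx *ras;         /* models adev->psp.ras.ras; NULL => no HW support */
	bool i2c_adapter_registered; /* true once the EEPROM I2C adapter exists */
};

/* Models smu_v11_0_i2c_eeprom_control_init(): registers the adapter. */
void model_eeprom_control_init(struct model_device *dev)
{
	dev->i2c_adapter_registered = true;
}

/* Models smu_v11_0_i2c_eeprom_control_fini(): unregisters the adapter. */
void model_eeprom_control_fini(struct model_device *dev)
{
	dev->i2c_adapter_registered = false;
}

/* Guarded init: register the adapter only when RAS is supported. */
void model_smu_init(struct model_device *dev)
{
	if (dev->ras != NULL)
		model_eeprom_control_init(dev);
}

/* Guarded fini: mirror the init check so init and fini stay paired. */
void model_smu_fini(struct model_device *dev)
{
	if (dev->ras != NULL)
		model_eeprom_control_fini(dev);
}

/*
 * Models an i2c client driver probing the bus: without a registered
 * adapter, probing never reaches the adapter's lock_bus() callback.
 */
bool model_probe_reaches_lock_bus(const struct model_device *dev)
{
	return dev->i2c_adapter_registered;
}
```

In this model a device whose `ras` pointer is NULL never registers the
adapter, so no probe can reach the code path that dereferences it.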

Please note that there is a chance that a similar fix is also required for
CHIP_ARCTURUS. I do not know whether any actual Arcturus hardware without
RAS exists, and whether calling `smu_i2c_eeprom_init()` makes any sense
when there is no HW support.

Cc: stable@vger.kernel.org
Fixes: 9015d60c9e ("drm/amdgpu: Move EEPROM I2C adapter to amdgpu_device")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Tested-by: Bjorn Nostvold <bjorn.nostvold@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 01:59:27 -04:00
ci_smumgr.c drm/powerplay: label internally used symbols as static 2020-07-01 01:59:23 -04:00
ci_smumgr.h
fiji_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
fiji_smumgr.h
iceland_smumgr.c drm/amd/powerplay: avoid calling SMU7 specific SMU message implemention 2020-04-01 14:44:44 -04:00
iceland_smumgr.h
Makefile drm/amd/powerplay: add the smu manager for vega20 (v2) 2018-08-27 11:10:26 -05:00
polaris10_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
polaris10_smumgr.h
smu7_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
smu7_smumgr.h drm/amd/powerpaly: drop unused APIs 2020-04-01 14:44:44 -04:00
smu8_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
smu8_smumgr.h drm/amd/pp: Rename file name cz_* to smu8_* 2018-03-15 09:58:56 -05:00
smu9_smumgr.c drm/amd/powerplay: enable pp one vf mode for vega10 2019-12-11 15:22:07 -05:00
smu9_smumgr.h drm/amdgpu/pp: switch smu callback type for get_argument() 2018-07-16 11:39:28 -05:00
smu10_smumgr.c drm/amdgpu: add apu flags (v2) 2020-05-22 13:41:53 -04:00
smu10_smumgr.h drm/amd/pp: Replace rv_* with smu10_* 2018-03-15 09:57:12 -05:00
smumgr.c drm/amd/powerplay: added mutex protection on msg issuing 2020-04-01 14:44:45 -04:00
tonga_smumgr.c drm/powerplay: label internally used symbols as static 2020-07-01 01:59:23 -04:00
tonga_smumgr.h
vega10_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
vega10_smumgr.h drm/amdgpu: implement ENABLED_SMC_FEATURES_MASK sensor for vega10 2018-09-26 21:09:10 -05:00
vega12_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
vega12_smumgr.h drm/amdgpu/powerplay: add smu smc_table_manager callback for vega12 2018-09-26 21:09:09 -05:00
vega20_smumgr.c drm/amd/powerplay: Fix NULL dereference in lock_bus() on Vega20 w/o RAS 2020-07-01 01:59:27 -04:00
vega20_smumgr.h drm/amd/powerplay: Add interface to lock SMU HW I2C. 2019-08-27 08:17:42 -05:00
vegam_smumgr.c drm/amd/powerplay: unified interfaces for message issuing and response checking 2020-04-01 14:44:45 -04:00
vegam_smumgr.h drm/amd/powerplay: add smumgr support for VEGAM (v2) 2018-05-15 13:44:04 -05:00