linux_dsm_epyc7002/arch/x86/kernel
Yazen Ghannam 006c077041 x86/mce: Handle varying MCA bank counts
Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a
CPU. Currently, this number is the same for all CPUs and a warning is
shown if there is a difference. The number of banks is overwritten with
the MCG_CAP[Count] value of each following CPU that boots.

According to the Intel SDM and AMD APM, the MCG_CAP[Count] value gives
the number of banks that are available to a "processor implementation".
The AMD BKDGs/PPRs further clarify that this value is per core. This
value has historically been the same for every core in the system, but
that is not an architectural requirement.

Future AMD systems may have different MCG_CAP[Count] values per core,
so the assumption that all CPUs will have the same MCG_CAP[Count] value
will no longer be valid.

Also, the first CPU to boot will allocate the struct mce_banks[] array
using the number of banks based on its MCG_CAP[Count] value. The machine
check handler and other functions use the global number of banks to
iterate and index into the mce_banks[] array. So it's possible to use an
out-of-bounds index on an asymmetric system where a following CPU sees a
MCG_CAP[Count] value greater than its predecessors.

Thus, allocate the mce_banks[] array to the maximum number of banks.
This will avoid the potential out-of-bounds index since the value of
mca_cfg.banks is capped to MAX_NR_BANKS.

Set the value of mca_cfg.banks equal to the max of the previous value
and the value for the current CPU. This way mca_cfg.banks will always
represent the max number of banks detected on any CPU in the system.

This will ensure that all CPUs will access all the banks that are
visible to them. A CPU that can access fewer than the max number of
banks will find the registers of the extra banks to be read-as-zero.

Furthermore, print the resulting number of MCA banks in use. Do this in
mcheck_late_init() so that the final value is printed after all CPUs
have been initialized.

Finally, get bank count from target CPU when doing injection with mce-inject
module.

 [ bp: Remove out-of-bounds example, passify and cleanup commit message. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20180727214009.78289-1-Yazen.Ghannam@amd.com
2019-03-27 13:12:49 +01:00
..
acpi treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
apic treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
cpu x86/mce: Handle varying MCA bank counts 2019-03-27 13:12:49 +01:00
fpu x86/fpu: Move init_xstate_size() to __init section 2019-02-08 14:32:34 +01:00
kprobes x86/kprobes: Prohibit probing on IRQ handlers directly 2019-02-13 08:16:39 +01:00
.gitignore
alternative.c Merge branch 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 08:45:46 -08:00
amd_gart_64.c x86/amd_gart: fix unmapping of non-GART mappings 2019-01-05 08:27:32 +01:00
amd_nb.c x86/amd_nb: Add PCI device IDs for family 17h, model 30h 2018-11-07 21:36:09 +01:00
apb_timer.c
aperture_64.c x86/gart: Rewrite early_gart_iommu_check() comment 2018-11-05 21:18:31 +01:00
apm_32.c x86/APM: Fix build warning when PROC_FS is not enabled 2018-09-15 10:16:25 +02:00
asm-offsets_32.c x86/entry/32: Load task stack from x86_tss.sp1 in SYSENTER handler 2018-07-20 01:11:36 +02:00
asm-offsets_64.c x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella 2018-09-03 16:50:36 +02:00
asm-offsets.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
audit_64.c
bootflag.c
check.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
cpuid.c x86/cpuid: Allow cpuid_read() to schedule 2018-03-27 12:01:48 +02:00
crash_dump_32.c
crash_dump_64.c x86: Fix various typos in comments 2018-12-03 10:49:13 +01:00
crash.c x86/kexec: Fix a kexec_file_load() failure 2019-01-15 12:12:50 +01:00
devicetree.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
doublefault.c
dumpstack_32.c x86/dumpstack: Unify show_regs() 2018-03-08 12:04:59 +01:00
dumpstack_64.c x86/dumpstack: Unify show_regs() 2018-03-08 12:04:59 +01:00
dumpstack.c x86/process: Don't mix user/kernel regs in 64bit __show_regs() 2018-09-06 14:33:12 +02:00
e820.c Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
early_printk.c efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation 2019-02-04 08:27:30 +01:00
early-quirks.c On GEM side: 2018-07-20 12:29:24 +10:00
ebda.c
eisa.c x86/EISA: Don't probe EISA bus for Xen PV guests 2018-09-11 23:36:50 +02:00
espfix_64.c x86/espfix: Document use of _PAGE_GLOBAL 2018-04-09 18:27:33 +02:00
ftrace_32.S
ftrace_64.S x86/ftrace: Do not call function graph from dynamic trampolines 2018-12-19 22:43:37 -05:00
ftrace.c The biggest change for this release is in the histogram code. 2019-03-11 17:01:32 -07:00
head32.c x86/boot: Mostly revert commit ae7e1238e6 ("Add ACPI RSDP address to setup_header") 2018-11-20 09:43:10 +01:00
head64.c x86/boot: Mostly revert commit ae7e1238e6 ("Add ACPI RSDP address to setup_header") 2018-11-20 09:43:10 +01:00
head_32.S x86/pgtable/32: Allocate 8k page-tables when PTI is enabled 2018-07-20 01:11:41 +02:00
head_64.S xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH 2018-12-13 13:41:49 -05:00
hpet.c x86/hpet: Remove unused FSEC_PER_NSEC define 2018-12-04 12:17:21 +01:00
hw_breakpoint.c x86/hw_breakpoints, kprobes: Remove kprobes ifdeffery 2019-01-30 11:52:21 +01:00
i8237.c x86/i8237: Register device based on FADT legacy boot flag 2018-04-27 16:44:29 +02:00
i8253.c
i8259.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
idt.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
ima_arch.c x86/ima: retry detecting secure boot mode 2018-12-11 07:19:45 -05:00
io_delay.c
ioport.c x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm() 2018-04-02 20:16:12 +02:00
irq_32.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irq_64.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irq_work.c
irq.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irqflags.S x86/paravirt: Make native_save_fl() extern inline 2018-07-03 10:56:27 +02:00
irqinit.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
itmt.c
jailhouse.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
kdebugfs.c
kexec-bzimage64.c Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-03-10 17:32:04 -07:00
kgdb.c x86/kernel: Mark expected switch-case fall-throughs 2019-01-26 11:19:13 +01:00
ksysfs.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
kvm.c KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error 2019-01-25 19:11:33 +01:00
kvmclock.c x86: kvmguest: use TSC clocksource if invariant TSC is exposed 2019-02-20 22:48:52 +01:00
ldt.c x86/ldt: Remove unused variable in map_ldt_struct() 2018-11-06 21:35:11 +01:00
livepatch.c
machine_kexec_32.c x86/kexec: Allocate 8k PGDs for PTI 2018-07-30 13:53:48 +02:00
machine_kexec_64.c x86/kdump: Export the SME mask to vmcoreinfo 2019-01-11 16:09:25 +01:00
Makefile jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
mmconf-fam10h_64.c
module.c x86: Add support for 64-bit place relative relocations 2018-09-27 17:56:47 +02:00
mpparse.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
msr.c x86: Clean up 'sizeof x' => 'sizeof(x)' 2018-10-29 07:13:28 +01:00
nmi_selftest.c
nmi.c
paravirt_patch_32.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
paravirt_patch_64.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
paravirt-spinlocks.c x86/paravirt: Use a single ops structure 2018-09-03 16:50:35 +02:00
paravirt.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
pci-calgary_64.c x86/calgary: remove the mapping_error dma_map_ops method 2018-12-06 06:56:46 -08:00
pci-dma.c dma-mapping: bypass indirect calls for dma-direct 2018-12-13 21:06:18 +01:00
pci-iommu_table.c x86/iommu: Use NULL instead of 0 2018-08-02 14:33:19 +02:00
pci-swiotlb.c dma-direct: merge swiotlb_dma_ops into the dma_direct code 2018-12-13 21:06:17 +01:00
pcspeaker.c x86/platform/pcspeaker: Use PTR_ERR_OR_ZERO() to fix ptr_ret.cocci warning 2018-07-24 09:46:42 +02:00
perf_regs.c perf/x86: Store user space frame-pointer value on a sample 2018-05-25 08:11:12 +02:00
platform-quirks.c x86/i8237: Register device based on FADT legacy boot flag 2018-04-27 16:44:29 +02:00
pmem.c
probe_roms.c
process_32.c Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-12-26 17:37:51 -08:00
process_64.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-12-26 18:08:18 -08:00
process.c x86/speculation: Add PR_SPEC_DISABLE_NOEXEC 2019-01-29 22:11:49 +01:00
process.h x86/speculation: Change misspelled STIPB to STIBP 2018-12-06 11:49:15 +01:00
ptrace.c x86/fsgsbase/64: Fix the base write helper functions 2018-12-18 14:26:09 +01:00
pvclock.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
quirks.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
reboot_fixups_32.c
reboot.c
relocate_kernel_32.S
relocate_kernel_64.S
resource.c
rtc.c x86: Convert x86_platform_ops to timespec64 2018-05-19 14:03:14 +02:00
setup_percpu.c memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
setup.c Merge branch 'mount.part1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-01-05 13:25:58 -08:00
signal_compat.c signal: Add TRAP_UNK si_code for undiagnosted trap exceptions 2018-04-25 10:40:56 -05:00
signal.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
smp.c x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d 2018-08-05 09:53:13 +02:00
smpboot.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-07 16:36:57 -08:00
stacktrace.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
step.c
sys_x86_64.c x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT 2018-11-01 12:59:25 +01:00
sysfb_efi.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
sysfb_simplefb.c
sysfb.c
tboot.c iommu/vtd: Cleanup dma_remapping.h header 2018-11-12 14:22:56 +01:00
tce_64.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
time.c Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 19:07:25 +01:00
tls.c
tls.h
topology.c x86/xen: Disable CPU0 hotplug for Xen PV 2018-09-12 21:15:02 +02:00
trace_clock.c
tracepoint.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
traps.c Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-07 17:09:28 -08:00
tsc_msr.c x86/cpu: Sanitize FAM6_ATOM naming 2018-10-02 10:14:32 +02:00
tsc_sync.c
tsc.c x86/tsc: Make calibration refinement more robust 2018-11-06 21:53:15 +01:00
umip.c signal/x86: Use force_sig_fault where appropriate 2018-09-21 15:30:54 +02:00
unwind_frame.c x86/unwind: Handle NULL pointer calls better in frame unwinder 2019-03-06 23:03:26 +01:00
unwind_guess.c
unwind_orc.c x86/unwind: Add hardcoded ORC entry for NULL 2019-03-06 23:03:26 +01:00
uprobes.c x86/kernel: Mark expected switch-case fall-throughs 2019-01-26 11:19:13 +01:00
verify_cpu.S
vm86_32.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
vmlinux.lds.S x86/build: Use the single-argument OUTPUT_FORMAT() linker script command 2019-01-16 12:21:53 +01:00
vsmp_64.c x86/vsmp: Remove dependency on pv_irq_ops 2018-11-06 21:35:11 +01:00
x86_init.c x86/acpi, x86/boot: Take RSDP address for boot params if available 2018-10-10 10:44:22 +02:00