linux_dsm_epyc7002/arch/x86/kernel/cpu
Fenghua Yu 29e9bf1841 x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog
Thermal throttle and power limit events are not defined as MCE errors in x86
architecture and should not generate MCE errors in mcelog.

Current kernel generates fake software defined MCE errors for these events.
This may confuse users because they may think the machine has real MCE errors
while actually only thermal throttle or power limit events happen.

To make it worse, buggy firmware on some platforms may falsely generate
the events. Therefore, kernel reports MCE errors which users think as real
hardware errors. Although the firmware bugs should be fixed, on the other hand,
kernel should not report MCE errors either.

So mcelog is not a good mechanism to report these events. To report the events, we count them in respective counters (core_power_limit_count,
package_power_limit_count, core_throttle_count, and package_throttle_count) in
/sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the counters
for each event on each CPU. Please note that all CPU's on one package report
duplicate counters. It's user application's responsibity to retrieve a package
level counter for one package.

This patch doesn't report package level power limit, core level power limit, and
package level thermal throttle events in mcelog. When the events happen, only
report them in respective counters in sysfs.

Since core level thermal throttle has been legacy code in kernel for a while and
users accepted it as MCE error in mcelog, core level thermal throttle is still
reported in mcelog. In the mean time, the event is counted in a counter in sysfs
as well.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Borislav Petkov <bp@amd64.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-12-14 16:25:26 -08:00
..
mcheck x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog 2011-12-14 16:25:26 -08:00
mtrr x86/mtrr: Resolve inconsistency with Intel processor manual 2011-12-05 15:06:15 +01:00
.gitignore Update .gitignore files for generated targets 2008-10-20 11:24:31 -07:00
amd.c x86: Fix boot failures on older AMD CPU's 2011-12-04 11:57:09 -08:00
bugs_64.c x86/cpu: Clean up various files a bit 2009-07-11 11:24:09 +02:00
bugs.c x86-32, fpu: Fix DNA exception during check_fpu() 2011-06-30 17:29:47 -07:00
centaur.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
common.c Merge branch 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-10-28 05:29:07 -07:00
cpu.h x86: Add a BSP cpu_dev helper 2011-08-05 12:26:49 -07:00
cyrix.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
hypervisor.c x86, hyper: Change hypervisor detection order 2011-07-08 16:22:29 -07:00
intel_cacheinfo.c x86: cache_info: Update calculation of AMD L3 cache indices 2011-09-12 19:29:03 +02:00
intel.c x86, intel: Use c->microcode for Atom errata check 2011-10-14 13:16:38 +02:00
Makefile Merge branch 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-10-28 05:29:07 -07:00
mkcapflags.pl x86: generate names for /proc/cpuinfo from <asm/cpufeature.h> 2008-08-27 19:23:22 -07:00
mshyperv.c x86: Hyper-V: Integrate the clocksource with Hyper-V detection code 2011-09-08 10:33:59 +02:00
perf_event_amd_ibs.c perf, x86: Force IBS LVT offset assignment for family 10h 2011-12-05 09:32:59 +01:00
perf_event_amd.c perf, x86: Share IBS macros between perf and oprofile 2011-10-10 06:57:11 +02:00
perf_event_intel_ds.c perf/x86: Fix PEBS instruction unwind 2011-11-14 13:01:20 +01:00
perf_event_intel_lbr.c x86, perf: Clean up perf_event cpu code 2011-09-26 12:58:00 +02:00
perf_event_intel.c perf, x86: Disable PEBS on SandyBridge chips 2011-12-05 09:32:38 +01:00
perf_event_p4.c perf: Don't use -ENOSPC for out of PMU resources 2011-11-14 13:01:24 +01:00
perf_event_p6.c x86, perf: Clean up perf_event cpu code 2011-09-26 12:58:00 +02:00
perf_event.c perf/x86: Enable raw event access to Intel offcore events 2011-11-14 13:03:44 +01:00
perf_event.h perf, intel: Use GO/HO bits in perf-ctr 2011-10-10 06:56:42 +02:00
perfctr-watchdog.c perf, x86: Add new AMD family 15h msrs to perfctr reservation code 2011-02-16 13:30:50 +01:00
powerflags.c x86: generate names for /proc/cpuinfo from <asm/cpufeature.h> 2008-08-27 19:23:22 -07:00
proc.c x86, microcode: Correct microcode revision format 2011-10-19 15:47:48 +02:00
rdrand.c x86, random: Verify RDRAND functionality and allow it to be disabled 2011-07-31 14:02:19 -07:00
scattered.c Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 13:01:08 -07:00
sched.c sched: x86: Name old_perf in a unique way 2009-09-16 11:21:07 +02:00
topology.c x86, cpu: Split addon_cpuid_features.c 2010-07-19 19:02:41 -07:00
transmeta.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
umc.c x86: move various CPU initialization objects into .cpuinit.rodata 2009-03-12 13:13:07 +01:00
vmware.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00