// SPDX-License-Identifier: GPL-2.0-only
/*
 * (c) 2005-2016 Advanced Micro Devices, Inc.
 *
 * Written by Jacob Shin - AMD, Inc.
 * Maintained by: Borislav Petkov <bp@alien8.de>
 *
 * All MC4_MISCi registers are shared between cores on a node.
 */
#include <linux/interrupt.h>
#include <linux/notifier.h>
#include <linux/kobject.h>
x86, mce: trivial clean up for mce_amd_64.c
Fix for the following:
WARNING: Use #include <linux/percpu.h> instead of <asm/percpu.h>
+#include <asm/percpu.h>
ERROR: Macros with multiple statements should be enclosed in a do - while
loop
+#define THRESHOLD_ATTR(_name, _mode, _show, _store) \
+{ \
+ .attr = {.name = __stringify(_name), .mode = _mode }, \
+ .show = _show, \
+ .store = _store, \
+};
WARNING: usage of NR_CPUS is often wrong - consider using cpu_possible(),
num_possible_cpus(), for_each_possible_cpu(), etc
+ if (cpu >= NR_CPUS)
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-04-08 17:31:18 +07:00
#include <linux/percpu.h>
#include <linux/errno.h>
#include <linux/sched.h>
#include <linux/sysfs.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 15:04:11 +07:00
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/cpu.h>
#include <linux/smp.h>
#include <linux/string.h>

#include <asm/amd_nb.h>
#include <asm/traps.h>
#include <asm/apic.h>
#include <asm/mce.h>
#include <asm/msr.h>
#include <asm/trace/irq_vectors.h>

#include "internal.h"

#define NR_BLOCKS 5
#define THRESHOLD_MAX 0xFFF
#define INT_TYPE_APIC 0x00020000
#define MASK_VALID_HI 0x80000000
#define MASK_CNTP_HI 0x40000000
#define MASK_LOCKED_HI 0x20000000
#define MASK_LVTOFF_HI 0x00F00000
#define MASK_COUNT_EN_HI 0x00080000
#define MASK_INT_TYPE_HI 0x00060000
#define MASK_OVERFLOW_HI 0x00010000
#define MASK_ERR_COUNT_HI 0x00000FFF
#define MASK_BLKPTR_LO 0xFF000000
#define MCG_XBLK_ADDR 0xC0000400

/* Deferred error settings */
#define MSR_CU_DEF_ERR 0xC0000410
#define MASK_DEF_LVTOFF 0x000000F0
#define MASK_DEF_INT_TYPE 0x00000006
#define DEF_LVT_OFF 0x2
#define DEF_INT_TYPE_APIC 0x2

/* Scalable MCA: */

/* Threshold LVT offset is at MSR0xC0000410[15:12] */
#define SMCA_THR_LVT_OFF 0xF000
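As a rough illustration of how the `MASK_*_HI` values above carve up the upper 32 bits of a thresholding MISC register, the sketch below decodes three fields in plain userspace C. The `demo_*` helper names and the shift amount of 20 are assumptions derived from the mask values themselves; they are not functions from this file.

```c
#include <stdint.h>

/* Mask values copied from the defines above. */
#define MASK_LVTOFF_HI    0x00F00000
#define MASK_INT_TYPE_HI  0x00060000
#define MASK_ERR_COUNT_HI 0x00000FFF
#define INT_TYPE_APIC     0x00020000

/* Hypothetical helper: the LVT offset occupies bits [23:20] of the high word. */
unsigned int demo_lvt_offset(uint32_t hi)
{
	return (hi & MASK_LVTOFF_HI) >> 20;
}

/* Hypothetical helper: true when the interrupt type field selects APIC. */
int demo_int_type_is_apic(uint32_t hi)
{
	return (hi & MASK_INT_TYPE_HI) == INT_TYPE_APIC;
}

/* Hypothetical helper: the error counter occupies bits [11:0]. */
unsigned int demo_err_count(uint32_t hi)
{
	return hi & MASK_ERR_COUNT_HI;
}
```

Comparing the masked field against `INT_TYPE_APIC`, rather than testing a single bit, keeps the check correct if the other bit of the two-bit field is also set.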

static bool thresholding_irq_en;

static const char * const th_names[] = {
	"load_store",
	"insn_fetch",
	"combined_unit",
	"decode_unit",
	"northbridge",
	"execution_unit",
};

static const char * const smca_umc_block_names[] = {
	"dram_ecc",
	"misc_umc"
};

struct smca_bank_name {
	const char *name;	/* Short name for sysfs */
	const char *long_name;	/* Long name for pretty-printing */
};

static struct smca_bank_name smca_names[] = {
	[SMCA_LS] = { "load_store", "Load Store Unit" },
	[SMCA_LS_V2] = { "load_store", "Load Store Unit" },
	[SMCA_IF] = { "insn_fetch", "Instruction Fetch Unit" },
	[SMCA_L2_CACHE] = { "l2_cache", "L2 Cache" },
	[SMCA_DE] = { "decode_unit", "Decode Unit" },
x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.
This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:
WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
...
Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.
Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.
Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.
Don't try to find the block address on reserved banks.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14.x
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 17:18:58 +07:00
	[SMCA_RESERVED] = { "reserved", "Reserved" },
	[SMCA_EX] = { "execution_unit", "Execution Unit" },
	[SMCA_FP] = { "floating_point", "Floating Point Unit" },
	[SMCA_L3_CACHE] = { "l3_cache", "L3 Cache" },
	[SMCA_CS] = { "coherent_slave", "Coherent Slave" },
x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units
The existing CS, PSP, and SMU SMCA bank types will see new versions (as
indicated by their McaTypes) in future SMCA systems.
Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the
same names as the older versions, since they are logically the same to
the user. SMCA systems won't mix and match IP blocks with different
McaType versions in the same system, so there isn't a need to
distinguish them. The MCA_IPID register is saved when logging an MCA
error, and that can be used to triage the error.
Also, add the new error descriptions to edac_mce_amd. Some error types
(positions in the list) are overloaded compared to the previous
McaTypes. Therefore, just create new lists of the error descriptions to
keep things simple even if some of the error descriptions are the same
between versions.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Shirish S <Shirish.S@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190201225534.8177-3-Yazen.Ghannam@amd.com
2019-02-02 05:55:52 +07:00
	[SMCA_CS_V2] = { "coherent_slave", "Coherent Slave" },
	[SMCA_PIE] = { "pie", "Power, Interrupts, etc." },
	[SMCA_UMC] = { "umc", "Unified Memory Controller" },
	[SMCA_PB] = { "param_block", "Parameter Block" },
	[SMCA_PSP] = { "psp", "Platform Security Processor" },
	[SMCA_PSP_V2] = { "psp", "Platform Security Processor" },
	[SMCA_SMU] = { "smu", "System Management Unit" },
	[SMCA_SMU_V2] = { "smu", "System Management Unit" },
	[SMCA_MP5] = { "mp5", "Microprocessor 5 Unit" },
	[SMCA_NBIO] = { "nbio", "Northbridge IO Unit" },
	[SMCA_PCIE] = { "pcie", "PCI Express Unit" },
};

static const char *smca_get_name(enum smca_bank_types t)
{
	if (t >= N_SMCA_BANK_TYPES)
		return NULL;

	return smca_names[t].name;
}

const char *smca_get_long_name(enum smca_bank_types t)
{
	if (t >= N_SMCA_BANK_TYPES)
		return NULL;

	return smca_names[t].long_name;
}
EXPORT_SYMBOL_GPL(smca_get_long_name);
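The two accessors above follow a common pattern: a table indexed by enum value through designated initializers, guarded by a bounds check against the enum's terminating count. A minimal userspace sketch of that pattern follows; the `demo_*` enum, table and function are hypothetical stand-ins, and the parameter is taken as `unsigned int` here so that the range check is unambiguous.

```c
#include <stddef.h>

/* Hypothetical stand-in for enum smca_bank_types; the last member is the count. */
enum demo_unit { DEMO_UNIT_LS, DEMO_UNIT_IF, DEMO_UNIT_UMC, DEMO_N_UNITS };

struct demo_unit_name {
	const char *name;	/* short name */
	const char *long_name;	/* long name */
};

/* Designated initializers tie each entry to its enum value. */
static const struct demo_unit_name demo_unit_names[] = {
	[DEMO_UNIT_LS]  = { "load_store", "Load Store Unit" },
	[DEMO_UNIT_IF]  = { "insn_fetch", "Instruction Fetch Unit" },
	[DEMO_UNIT_UMC] = { "umc",        "Unified Memory Controller" },
};

/* Returns NULL for any value at or beyond the terminating count. */
const char *demo_get_long_name(unsigned int t)
{
	if (t >= DEMO_N_UNITS)
		return NULL;

	return demo_unit_names[t].long_name;
}
```

Callers are expected to handle the `NULL` return, which is what an out-of-range type yields instead of an out-of-bounds read.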

static enum smca_bank_types smca_get_bank_type(unsigned int bank)
{
	struct smca_bank *b;

	if (bank >= MAX_NR_BANKS)
		return N_SMCA_BANK_TYPES;

	b = &smca_banks[bank];
	if (!b->hwid)
		return N_SMCA_BANK_TYPES;

	return b->hwid->bank_type;
}

static struct smca_hwid smca_hwid_mcatypes[] = {
	/* { bank_type, hwid_mcatype, xec_bitmap } */

	/* Reserved type */
	{ SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },

	/* ZN Core (HWID=0xB0) MCA types */
	{ SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1FFFFF },
	{ SMCA_LS_V2, HWID_MCATYPE(0xB0, 0x10), 0xFFFFFF },
	{ SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
	{ SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
	{ SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF },
	/* HWID 0xB0 MCATYPE 0x4 is Reserved */
	{ SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0xFFF },
	{ SMCA_FP, HWID_MCATYPE(0xB0, 0x6), 0x7F },
	{ SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },

	/* Data Fabric MCA types */
	{ SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF },
	{ SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0x1F },
	{ SMCA_CS_V2, HWID_MCATYPE(0x2E, 0x2), 0x3FFF },

	/* Unified Memory Controller MCA type */
	{ SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0xFF },

	/* Parameter Block MCA type */
	{ SMCA_PB, HWID_MCATYPE(0x05, 0x0), 0x1 },

	/* Platform Security Processor MCA type */
	{ SMCA_PSP, HWID_MCATYPE(0xFF, 0x0), 0x1 },
	{ SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1), 0x3FFFF },

	/* System Management Unit MCA type */
	{ SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 },
	{ SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1), 0x7FF },

	/* Microprocessor 5 Unit MCA type */
	{ SMCA_MP5, HWID_MCATYPE(0x01, 0x2), 0x3FF },

	/* Northbridge IO Unit MCA type */
	{ SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },

	/* PCI Express Unit MCA type */
	{ SMCA_PCIE, HWID_MCATYPE(0x46, 0x0), 0x1F },
};
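The table above keys each bank type on a combined (HWID, McaType) value. The sketch below shows how such a key could be built and matched with a linear scan. It is only an illustration: `DEMO_HWID_MCATYPE` assumes the hardware ID sits in the upper 16 bits and the McaType in the lower 16, which is not shown in this file, and the `demo_*` names and the three-entry table are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: HWID in the upper 16 bits, McaType in the lower 16 bits. */
#define DEMO_HWID_MCATYPE(hwid, mcatype) (((uint32_t)(hwid) << 16) | (mcatype))

/* Hypothetical stand-in for the bank type enum; the last member is the count. */
enum demo_bank_type { DEMO_RESERVED, DEMO_LS, DEMO_UMC, DEMO_N_TYPES };

struct demo_hwid {
	enum demo_bank_type bank_type;
	uint32_t hwid_mcatype;
	uint32_t xec_bitmap;
};

/* Three rows borrowed from the table above. */
static const struct demo_hwid demo_table[] = {
	{ DEMO_RESERVED, DEMO_HWID_MCATYPE(0x00, 0x0), 0x0 },
	{ DEMO_LS,       DEMO_HWID_MCATYPE(0xB0, 0x0), 0x1FFFFF },
	{ DEMO_UMC,      DEMO_HWID_MCATYPE(0x96, 0x0), 0xFF },
};

/* Linear scan; DEMO_N_TYPES signals "no matching entry". */
enum demo_bank_type demo_lookup(uint32_t hwid, uint32_t mcatype)
{
	uint32_t key = DEMO_HWID_MCATYPE(hwid, mcatype);
	size_t i;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if (demo_table[i].hwid_mcatype == key)
			return demo_table[i].bank_type;
	}

	return DEMO_N_TYPES;
}
```

Under this assumed encoding, a bank whose identification register reads as all zeroes produces the key `0`, which matches the Reserved row rather than falling through to "unknown".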

struct smca_bank smca_banks[MAX_NR_BANKS];
EXPORT_SYMBOL_GPL(smca_banks);

/*
 * In SMCA enabled processors, we can have multiple banks for a given IP type.
 * So to define a unique name for each bank, we use a temp c-string to append
 * the MCA_IPID[InstanceId] to type's name in get_name().
 *
 * InstanceId is 32 bits which is 8 characters. Make sure MAX_MCATYPE_NAME_LEN
 * is greater than 8 plus 1 (for underscore) plus length of longest type name.
 */
#define MAX_MCATYPE_NAME_LEN 30
static char buf_mcatype[MAX_MCATYPE_NAME_LEN];
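The sizing rule in the comment above can be checked with a small worked example. The longest short names in `smca_names[]` are 14 characters (for instance `execution_unit`), so the worst case is 14 + 1 + 8 = 23 characters plus the terminating NUL, which fits in 30 bytes. The sketch below assumes the name is built as `<type>_<instance id in hex>`; the `demo_format_bank_name` helper and the exact format string are assumptions for illustration, not code from this file.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_MCATYPE_NAME_LEN 30

/*
 * Hypothetical helper: append a 32-bit instance ID, printed in hex, to a
 * type name. Returns the length snprintf() would have written, so a result
 * >= len indicates truncation.
 */
int demo_format_bank_name(char *buf, size_t len, const char *type_name,
			  uint32_t instance_id)
{
	return snprintf(buf, len, "%s_%x", type_name, (unsigned int)instance_id);
}
```

Checking the return value against the buffer size is a cheap way to notice if a future, longer type name no longer fits.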

static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks);
static DEFINE_PER_CPU(unsigned int, bank_map);	/* see which banks are on */

/* Map of banks that have more than MCA_MISC0 available. */
static DEFINE_PER_CPU(u32, smca_misc_banks_map);

static void amd_threshold_interrupt(void);
static void amd_deferred_error_interrupt(void);

static void default_deferred_error_interrupt(void)
{
	pr_err("Unexpected deferred interrupt at vector %x\n", DEFERRED_ERROR_VECTOR);
}
void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt;

static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
{
	u32 low, high;

	/*
	 * For SMCA enabled processors, BLKPTR field of the first MISC register
	 * (MCx_MISC0) indicates presence of additional MISC regs set (MISC1-4).
	 */
	if (rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high))
		return;

	if (!(low & MCI_CONFIG_MCAX))
		return;

	if (rdmsr_safe(MSR_AMD64_SMCA_MCx_MISC(bank), &low, &high))
		return;

	if (low & MASK_BLKPTR_LO)
		per_cpu(smca_misc_banks_map, cpu) |= BIT(bank);
}
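The last step of the function above records, one bit per bank, whether the BLKPTR field of the first MISC register is non-zero. A userspace sketch of just that bookkeeping is shown below; the MSR reads and the per-CPU variable are replaced by plain arguments, and `demo_update_misc_banks_map` and `DEMO_BIT` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Mask value copied from the defines earlier in the file. */
#define MASK_BLKPTR_LO 0xFF000000

/* Hypothetical stand-in for the kernel's BIT() macro. */
#define DEMO_BIT(n) (UINT32_C(1) << (n))

/*
 * Hypothetical helper: given the current bitmap, a bank number below 32 and
 * the low 32 bits read from that bank's first MISC register, return the
 * updated bitmap. A non-zero BLKPTR field marks the bank as having
 * additional MISC registers.
 */
uint32_t demo_update_misc_banks_map(uint32_t map, unsigned int bank,
				    uint32_t misc0_lo)
{
	if (misc0_lo & MASK_BLKPTR_LO)
		map |= DEMO_BIT(bank);

	return map;
}
```

Later code can then test a single bit of the map instead of re-reading the register for every block of a bank.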

static void smca_configure(unsigned int bank, unsigned int cpu)
{
	unsigned int i, hwid_mcatype;
	struct smca_hwid *s_hwid;
	u32 high, low;
	u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank);

	/* Set appropriate bits in MCA_CONFIG */
	if (!rdmsr_safe(smca_config, &low, &high)) {
		/*
		 * OS is required to set the MCAX bit to acknowledge that it is
		 * now using the new MSR ranges and new registers under each
		 * bank. It also means that the OS will configure deferred
		 * errors in the new MCx_CONFIG register. If the bit is not set,
		 * uncorrectable errors will cause a system panic.
		 *
		 * MCA_CONFIG[MCAX] is bit 32 (0 in the high portion of the MSR.)
		 */
		high |= BIT(0);

		/*
		 * SMCA sets the Deferred Error Interrupt type per bank.
		 *
		 * MCA_CONFIG[DeferredIntTypeSupported] is bit 5, and tells us
		 * if the DeferredIntType bit field is available.
		 *
		 * MCA_CONFIG[DeferredIntType] is bits [38:37] ([6:5] in the
		 * high portion of the MSR). OS should set this to 0x1 to enable
		 * APIC based interrupt. First, check that no interrupt has been
		 * set.
		 */
		if ((low & BIT(5)) && !((high >> 5) & 0x3))
			high |= BIT(5);

		wrmsr(smca_config, low, high);
	}

	smca_set_misc_banks_map(bank, cpu);

x86/mce/AMD: Allow any CPU to initialize the smca_banks array
Current SMCA implementations have the same banks on each CPU with the
non-core banks only visible to a "master thread" on each die. Practically,
this means the smca_banks array, which describes the banks, only needs to
be populated once by a single master thread.
CPU 0 seemed like a good candidate to do the populating. However, it's
possible that CPU 0 is not enabled in which case the smca_banks array won't
be populated.
Rather than try to figure out another master thread to do the populating,
we should just allow any CPU to populate the array.
Drop the CPU 0 check and return early if the bank was already initialized.
Also, drop the WARNing about an already initialized bank, since this will
be a common, expected occurrence.
The smca_banks array is only populated at boot time and CPUs are brought
online sequentially. So there's no need for locking around the array.
If the first CPU up is a master thread, then it will populate the array
with all banks, core and non-core. Every CPU afterwards will return
early. If the first CPU up is not a master thread, then it will populate
the array with all core banks. The first CPU afterwards that is a master
thread will skip populating the core banks and continue populating the
non-core banks.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Jack Miller <jack@codezen.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170724101228.17326-4-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24 17:12:28 +07:00
	/* Return early if this bank was already initialized. */
x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
Each logical CPU in Scalable MCA systems controls a unique set of MCA
banks in the system. These banks are not shared between CPUs. The bank
types and ordering will be the same across CPUs on currently available
systems.
However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.
In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.
For example:
Full system:
Bank  |  Type seen on CPU0  |  Type seen on CPU1
------------------------------------------------
 0    |         LS          |          LS
 1    |         UMC         |          UMC
 2    |         CS          |          CS
System with hardware disabled:
Bank  |  Type seen on CPU0  |  Type seen on CPU1
------------------------------------------------
 0    |         LS          |          LS
 1    |         UMC         |          RAZ
 2    |         CS          |          CS
For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.
This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.
This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.
For example:
Bank  |  Type seen on CPU0  |  Type seen on CPU1
------------------------------------------------
 0    |         LS          |          LS
 1    |         RAZ         |          UMC
 2    |         CS          |          CS
Extend the smca_banks[] entry check to return if the entry is a
non-reserved type. Otherwise, continue so that CPUs that encounter a
known bank type can update smca_banks[].
Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
2019-11-21 21:15:08 +07:00
|
|
|
if (smca_banks[bank].hwid && smca_banks[bank].hwid->hwid_mcatype != 0)
|
2016-09-12 14:59:34 +07:00
|
|
|
return;
|
|
|
|
|
2019-10-31 20:04:48 +07:00
|
|
|
if (rdmsr_safe(MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) {
|
2016-09-12 14:59:34 +07:00
|
|
|
pr_warn("Failed to read MCA_IPID for bank %d\n", bank);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2016-11-02 18:48:01 +07:00
|
|
|
hwid_mcatype = HWID_MCATYPE(high & MCI_IPID_HWID,
|
|
|
|
(high & MCI_IPID_MCATYPE) >> 16);
|
2016-09-12 14:59:34 +07:00
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(smca_hwid_mcatypes); i++) {
|
2016-11-02 18:48:01 +07:00
|
|
|
s_hwid = &smca_hwid_mcatypes[i];
|
|
|
|
if (hwid_mcatype == s_hwid->hwid_mcatype) {
|
|
|
|
smca_banks[bank].hwid = s_hwid;
|
2017-05-19 16:39:15 +07:00
|
|
|
smca_banks[bank].id = low;
|
2017-01-24 01:35:08 +07:00
|
|
|
smca_banks[bank].sysfs_id = s_hwid->count++;
|
2016-09-12 14:59:34 +07:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
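The enumeration rule implemented above (a known bank type is final, a Reserved entry may still be overwritten by a later CPU) can be sketched in user space. This is only an illustration: `struct hwid_entry`, `known_types`, `banks` and `configure_bank()` are hypothetical stand-ins, and the type IDs 1..3 are made up rather than real HWID/McaType values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BANKS 3

/* Hypothetical stand-in for the kernel's hwid table entry. */
struct hwid_entry {
	const char *name;
	uint32_t hwid_mcatype;	/* 0 stands for Reserved/RAZ */
};

/* Made-up type IDs, for illustration only. */
static const struct hwid_entry known_types[] = {
	{ "reserved", 0 },
	{ "ls",       1 },
	{ "umc",      2 },
	{ "cs",       3 },
};

/* Global table shared by all CPUs, like smca_banks[] in the source. */
static const struct hwid_entry *banks[NUM_BANKS];

/* Record the type one CPU sees for a bank; true if an entry was written. */
static bool configure_bank(unsigned int bank, uint32_t seen)
{
	size_t i;

	if (bank >= NUM_BANKS)
		return false;

	/* Return early only if a known (non-Reserved) type is recorded. */
	if (banks[bank] && banks[bank]->hwid_mcatype != 0)
		return false;

	for (i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		if (known_types[i].hwid_mcatype == seen) {
			banks[bank] = &known_types[i];
			return true;
		}
	}

	return false;
}
```

With this rule, a first CPU that reads bank 1 as RAZ no longer prevents a later CPU from recording it as UMC, while a later RAZ reading cannot downgrade a known type.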
|
|
|
|
|
2008-12-17 08:34:04 +07:00
|
|
|
struct thresh_restart {
|
2009-04-08 17:31:18 +07:00
|
|
|
struct threshold_block *b;
|
|
|
|
int reset;
|
2010-10-25 21:03:35 +07:00
|
|
|
int set_lvt_off;
|
|
|
|
int lvt_off;
|
2009-04-08 17:31:18 +07:00
|
|
|
u16 old_limit;
|
2008-12-17 08:34:04 +07:00
|
|
|
};
|
|
|
|
|
2013-03-15 04:10:40 +07:00
|
|
|
static inline bool is_shared_bank(int bank)
|
|
|
|
{
|
2016-01-26 02:41:49 +07:00
|
|
|
/*
|
|
|
|
* Scalable MCA provides for only one core to have access to the MSRs of
|
|
|
|
* a shared bank.
|
|
|
|
*/
|
|
|
|
if (mce_flags.smca)
|
|
|
|
return false;
|
|
|
|
|
2013-03-15 04:10:40 +07:00
|
|
|
/* Bank 4 is for northbridge reporting and is thus shared */
|
|
|
|
return (bank == 4);
|
|
|
|
}
|
|
|
|
|
2015-01-23 15:32:01 +07:00
|
|
|
static const char *bank4_names(const struct threshold_block *b)
|
2012-05-04 22:05:27 +07:00
|
|
|
{
|
|
|
|
switch (b->address) {
|
|
|
|
/* MSR4_MISC0 */
|
|
|
|
case 0x00000413:
|
|
|
|
return "dram";
|
|
|
|
|
|
|
|
case 0xc0000408:
|
|
|
|
return "ht_links";
|
|
|
|
|
|
|
|
case 0xc0000409:
|
|
|
|
return "l3_cache";
|
|
|
|
|
|
|
|
default:
|
|
|
|
WARN(1, "Funny MSR: 0x%08x\n", b->address);
|
|
|
|
return "";
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2012-04-16 23:01:53 +07:00
|
|
|
static bool lvt_interrupt_supported(unsigned int bank, u32 msr_high_bits)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* bank 4 supports APIC LVT interrupts implicitly since forever.
|
|
|
|
*/
|
|
|
|
if (bank == 4)
|
|
|
|
return true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* IntP: interrupt present; if this bit is set, the thresholding
|
|
|
|
* bank can generate APIC LVT interrupts
|
|
|
|
*/
|
|
|
|
return msr_high_bits & BIT(28);
|
|
|
|
}
|
|
|
|
|
2010-10-25 21:03:37 +07:00
|
|
|
static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
|
|
|
|
{
|
|
|
|
int msr = (hi & MASK_LVTOFF_HI) >> 20;
|
|
|
|
|
|
|
|
if (apic < 0) {
|
|
|
|
pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
|
|
|
|
"for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
|
|
|
|
b->bank, b->block, b->address, hi, lo);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (apic != msr) {
|
2016-01-26 02:41:51 +07:00
|
|
|
/*
|
|
|
|
* On SMCA CPUs, LVT offset is programmed at a different MSR, and
|
|
|
|
* the BIOS provides the value. The original field where LVT offset
|
|
|
|
* was set is reserved. Return early here:
|
|
|
|
*/
|
|
|
|
if (mce_flags.smca)
|
|
|
|
return 0;
|
|
|
|
|
2010-10-25 21:03:37 +07:00
|
|
|
pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
|
|
|
|
"for bank %d, block %d (MSR%08X=0x%x%08x)\n",
|
|
|
|
b->cpu, apic, b->bank, b->block, b->address, hi, lo);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2016-03-07 20:02:21 +07:00
|
|
|
/* Reprogram MCx_MISC MSR behind this threshold bank. */
|
2009-03-18 07:10:25 +07:00
|
|
|
static void threshold_restart_bank(void *_tr)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2008-12-17 08:34:04 +07:00
|
|
|
struct thresh_restart *tr = _tr;
|
2010-10-25 21:03:36 +07:00
|
|
|
u32 hi, lo;
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2010-10-25 21:03:36 +07:00
|
|
|
rdmsr(tr->b->address, lo, hi);
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2010-10-25 21:03:36 +07:00
|
|
|
if (tr->b->threshold_limit < (hi & THRESHOLD_MAX))
|
2008-12-17 08:34:04 +07:00
|
|
|
tr->reset = 1; /* limit cannot be lower than err count */
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2008-12-17 08:34:04 +07:00
|
|
|
if (tr->reset) { /* reset err count and overflow bit */
|
2010-10-25 21:03:36 +07:00
|
|
|
hi =
|
|
|
|
(hi & ~(MASK_ERR_COUNT_HI | MASK_OVERFLOW_HI)) |
|
2008-12-17 08:34:04 +07:00
|
|
|
(THRESHOLD_MAX - tr->b->threshold_limit);
|
|
|
|
} else if (tr->old_limit) { /* change limit w/o reset */
|
2010-10-25 21:03:36 +07:00
|
|
|
int new_count = (hi & THRESHOLD_MAX) +
|
2008-12-17 08:34:04 +07:00
|
|
|
(tr->old_limit - tr->b->threshold_limit);
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2010-10-25 21:03:36 +07:00
|
|
|
hi = (hi & ~MASK_ERR_COUNT_HI) |
|
2005-11-05 23:25:53 +07:00
|
|
|
(new_count & THRESHOLD_MAX);
|
|
|
|
}
|
|
|
|
|
2012-04-16 23:01:53 +07:00
|
|
|
/* clear IntType */
|
|
|
|
hi &= ~MASK_INT_TYPE_HI;
|
|
|
|
|
|
|
|
if (!tr->b->interrupt_capable)
|
|
|
|
goto done;
|
|
|
|
|
2010-10-25 21:03:35 +07:00
|
|
|
if (tr->set_lvt_off) {
|
2010-10-25 21:03:37 +07:00
|
|
|
if (lvt_off_valid(tr->b, tr->lvt_off, lo, hi)) {
|
|
|
|
/* set new lvt offset */
|
|
|
|
hi &= ~MASK_LVTOFF_HI;
|
|
|
|
hi |= tr->lvt_off << 20;
|
|
|
|
}
|
2010-10-25 21:03:35 +07:00
|
|
|
}
|
|
|
|
|
2012-04-16 23:01:53 +07:00
|
|
|
if (tr->b->interrupt_enable)
|
|
|
|
hi |= INT_TYPE_APIC;
|
|
|
|
|
|
|
|
done:
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2010-10-25 21:03:36 +07:00
|
|
|
hi |= MASK_COUNT_EN_HI;
|
|
|
|
wrmsr(tr->b->address, lo, hi);
|
2005-11-05 23:25:53 +07:00
|
|
|
}
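The two counter updates in threshold_restart_bank() can be looked at in isolation. The sketch below operates on a plain 32-bit value instead of an MSR; the mask values are assumptions standing in for THRESHOLD_MAX, MASK_ERR_COUNT_HI and MASK_OVERFLOW_HI, and the helper names are hypothetical. The guard that forces a reset when the limit is below the raw count is not modelled.

```c
#include <stdint.h>

#define THRESHOLD_MAX	0xFFFu		/* assumed 12-bit counter maximum */
#define ERR_COUNT_MASK	0x00000FFFu	/* assumed error count field */
#define OVERFLOW_MASK	0x00010000u	/* assumed overflow bit */

/*
 * Reset path: clear count and overflow, then preload the counter so that
 * it reaches THRESHOLD_MAX after `limit` further errors.
 */
static uint32_t reset_count(uint32_t hi, uint32_t limit)
{
	return (hi & ~(ERR_COUNT_MASK | OVERFLOW_MASK)) |
	       (THRESHOLD_MAX - limit);
}

/*
 * Limit change without reset: shift the raw count by the difference of
 * the limits so that errors already counted are preserved.
 */
static uint32_t rebase_count(uint32_t hi, uint32_t old_limit,
			     uint32_t new_limit)
{
	uint32_t new_count = (hi & THRESHOLD_MAX) + (old_limit - new_limit);

	return (hi & ~ERR_COUNT_MASK) | (new_count & THRESHOLD_MAX);
}
```

For example, with a limit of 10 the reset path preloads the count field with 0xFF5, so the tenth error brings the counter to 0xFFF.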
|
|
|
|
|
2010-10-25 21:03:35 +07:00
|
|
|
static void mce_threshold_block_init(struct threshold_block *b, int offset)
|
|
|
|
{
|
|
|
|
struct thresh_restart tr = {
|
|
|
|
.b = b,
|
|
|
|
.set_lvt_off = 1,
|
|
|
|
.lvt_off = offset,
|
|
|
|
};
|
|
|
|
|
|
|
|
b->threshold_limit = THRESHOLD_MAX;
|
|
|
|
threshold_restart_bank(&tr);
|
|
|
|
}
|
|
|
|
|
2015-05-06 18:58:58 +07:00
|
|
|
static int setup_APIC_mce_threshold(int reserved, int new)
|
2010-10-25 21:03:37 +07:00
|
|
|
{
|
|
|
|
if (reserved < 0 && !setup_APIC_eilvt(new, THRESHOLD_APIC_VECTOR,
|
|
|
|
APIC_EILVT_MSG_FIX, 0))
|
|
|
|
return new;
|
|
|
|
|
|
|
|
return reserved;
|
|
|
|
}
|
|
|
|
|
2015-05-06 18:58:56 +07:00
|
|
|
static int setup_APIC_deferred_error(int reserved, int new)
|
|
|
|
{
|
|
|
|
if (reserved < 0 && !setup_APIC_eilvt(new, DEFERRED_ERROR_VECTOR,
|
|
|
|
APIC_EILVT_MSG_FIX, 0))
|
|
|
|
return new;
|
|
|
|
|
|
|
|
return reserved;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
|
|
|
|
{
|
|
|
|
u32 low = 0, high = 0;
|
|
|
|
int def_offset = -1, def_new;
|
|
|
|
|
|
|
|
if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high))
|
|
|
|
return;
|
|
|
|
|
|
|
|
def_new = (low & MASK_DEF_LVTOFF) >> 4;
|
|
|
|
if (!(low & MASK_DEF_LVTOFF)) {
|
|
|
|
pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n");
|
|
|
|
def_new = DEF_LVT_OFF;
|
|
|
|
low = (low & ~MASK_DEF_LVTOFF) | (DEF_LVT_OFF << 4);
|
|
|
|
}
|
|
|
|
|
|
|
|
def_offset = setup_APIC_deferred_error(def_offset, def_new);
|
|
|
|
if ((def_offset == def_new) &&
|
|
|
|
(deferred_error_int_vector != amd_deferred_error_interrupt))
|
|
|
|
deferred_error_int_vector = amd_deferred_error_interrupt;
|
|
|
|
|
2017-12-04 23:54:38 +07:00
|
|
|
if (!mce_flags.smca)
|
|
|
|
low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
|
|
|
|
|
2015-05-06 18:58:56 +07:00
|
|
|
wrmsr(MSR_CU_DEF_ERR, low, high);
|
|
|
|
}
|
|
|
|
|
2019-06-08 03:18:04 +07:00
|
|
|
static u32 smca_get_block_address(unsigned int bank, unsigned int block,
|
|
|
|
unsigned int cpu)
|
2018-02-21 17:19:00 +07:00
|
|
|
{
|
|
|
|
if (!block)
|
|
|
|
return MSR_AMD64_SMCA_MCx_MISC(bank);
|
|
|
|
|
2019-06-08 03:18:04 +07:00
|
|
|
if (!(per_cpu(smca_misc_banks_map, cpu) & BIT(bank)))
|
|
|
|
return 0;
|
2018-02-21 17:19:00 +07:00
|
|
|
|
2019-06-08 03:18:04 +07:00
|
|
|
return MSR_AMD64_SMCA_MCx_MISCy(bank, block - 1);
|
2018-02-21 17:19:00 +07:00
|
|
|
}
|
|
|
|
|
2018-05-17 23:32:33 +07:00
|
|
|
static u32 get_block_address(u32 current_addr, u32 low, u32 high,
|
2019-06-08 03:18:04 +07:00
|
|
|
unsigned int bank, unsigned int block,
|
|
|
|
unsigned int cpu)
|
2016-03-07 20:02:19 +07:00
|
|
|
{
|
|
|
|
u32 addr = 0, offset = 0;
|
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS))
|
2018-02-21 17:18:59 +07:00
|
|
|
return addr;
|
|
|
|
|
2018-02-21 17:19:00 +07:00
|
|
|
if (mce_flags.smca)
|
2019-06-08 03:18:04 +07:00
|
|
|
return smca_get_block_address(bank, block, cpu);
|
2016-03-07 20:02:19 +07:00
|
|
|
|
|
|
|
/* Fall back to method we used for older processors: */
|
|
|
|
switch (block) {
|
|
|
|
case 0:
|
2016-04-30 19:33:55 +07:00
|
|
|
addr = msr_ops.misc(bank);
|
2016-03-07 20:02:19 +07:00
|
|
|
break;
|
|
|
|
case 1:
|
|
|
|
offset = ((low & MASK_BLKPTR_LO) >> 21);
|
|
|
|
if (offset)
|
|
|
|
addr = MCG_XBLK_ADDR + offset;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
addr = ++current_addr;
|
|
|
|
}
|
|
|
|
return addr;
|
|
|
|
}
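The legacy (non-SMCA) address walk above can be checked against the bank 4 MSR numbers that appear in bank4_names(). The following is a sketch only: the three constants are assumed values for the legacy MC0_MISC MSR, MCG_XBLK_ADDR and MASK_BLKPTR_LO, and `legacy_block_address()` is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

#define MC0_MISC	0x00000403u	/* assumed legacy MC0_MISC MSR */
#define XBLK_BASE	0xc0000400u	/* assumed value of MCG_XBLK_ADDR */
#define BLKPTR_MASK	0xff000000u	/* assumed value of MASK_BLKPTR_LO */

/* Returns the MSR address of a threshold block, or 0 if there is none. */
static uint32_t legacy_block_address(uint32_t current_addr, uint32_t low,
				     unsigned int bank, unsigned int block)
{
	uint32_t offset;

	switch (block) {
	case 0:
		/* Block 0 is the bank's own MISC register. */
		return MC0_MISC + 4 * bank;
	case 1:
		/* Block 1 is located via the block pointer in MISC0. */
		offset = (low & BLKPTR_MASK) >> 21;
		return offset ? XBLK_BASE + offset : 0;
	default:
		/* Further blocks follow the previous one directly. */
		return current_addr + 1;
	}
}
```

Under these assumptions, bank 4 with a block pointer of 1 yields 0x00000413, 0xc0000408 and 0xc0000409 for blocks 0, 1 and 2, which matches the "dram", "ht_links" and "l3_cache" cases.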
|
|
|
|
|
2016-01-26 02:41:52 +07:00
|
|
|
static int
|
|
|
|
prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
|
|
|
|
int offset, u32 misc_high)
|
|
|
|
{
|
|
|
|
unsigned int cpu = smp_processor_id();
|
2017-05-19 16:39:15 +07:00
|
|
|
u32 smca_low, smca_high;
|
2016-01-26 02:41:52 +07:00
|
|
|
struct threshold_block b;
|
|
|
|
int new;
|
|
|
|
|
|
|
|
if (!block)
|
|
|
|
per_cpu(bank_map, cpu) |= (1 << bank);
|
|
|
|
|
|
|
|
memset(&b, 0, sizeof(b));
|
|
|
|
b.cpu = cpu;
|
|
|
|
b.bank = bank;
|
|
|
|
b.block = block;
|
|
|
|
b.address = addr;
|
|
|
|
b.interrupt_capable = lvt_interrupt_supported(bank, misc_high);
|
|
|
|
|
|
|
|
if (!b.interrupt_capable)
|
|
|
|
goto done;
|
|
|
|
|
|
|
|
b.interrupt_enable = 1;
|
|
|
|
|
2016-05-11 19:58:25 +07:00
|
|
|
if (!mce_flags.smca) {
|
|
|
|
new = (misc_high & MASK_LVTOFF_HI) >> 20;
|
|
|
|
goto set_offset;
|
|
|
|
}
|
2016-05-11 19:58:24 +07:00
|
|
|
|
2016-05-11 19:58:25 +07:00
|
|
|
/* Gather LVT offset for thresholding: */
|
|
|
|
if (rdmsr_safe(MSR_CU_DEF_ERR, &smca_low, &smca_high))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
new = (smca_low & SMCA_THR_LVT_OFF) >> 12;
|
|
|
|
|
|
|
|
set_offset:
|
2016-01-26 02:41:52 +07:00
|
|
|
offset = setup_APIC_mce_threshold(offset, new);
|
2018-11-27 20:41:37 +07:00
|
|
|
if (offset == new)
|
|
|
|
thresholding_irq_en = true;
|
2016-01-26 02:41:52 +07:00
|
|
|
|
|
|
|
done:
|
|
|
|
mce_threshold_block_init(&b, offset);
|
|
|
|
|
|
|
|
out:
|
|
|
|
return offset;
|
|
|
|
}
|
|
|
|
|
2019-03-25 23:34:22 +07:00
|
|
|
bool amd_filter_mce(struct mce *m)
|
|
|
|
{
|
|
|
|
enum smca_bank_types bank_type = smca_get_bank_type(m->bank);
|
|
|
|
struct cpuinfo_x86 *c = &boot_cpu_data;
|
|
|
|
u8 xec = (m->status >> 16) & 0x3F;
|
|
|
|
|
|
|
|
/* See Family 17h Models 10h-2Fh Erratum #1114. */
|
|
|
|
if (c->x86 == 0x17 &&
|
|
|
|
c->x86_model >= 0x10 && c->x86_model <= 0x2F &&
|
|
|
|
bank_type == SMCA_IF && xec == 10)
|
|
|
|
return true;
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-01-16 22:10:40 +07:00
|
|
|
/*
|
2019-03-25 23:34:22 +07:00
|
|
|
* Turn off thresholding banks for the following conditions:
|
|
|
|
* - MC4_MISC thresholding is not supported on Family 0x15.
|
|
|
|
* - Prevent possible spurious interrupts from the IF bank on Family 0x17
|
|
|
|
* Models 0x10-0x2F due to Erratum #1114.
|
2019-01-16 22:10:40 +07:00
|
|
|
*/
|
2019-09-29 00:02:29 +07:00
|
|
|
static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
|
2019-01-16 22:10:40 +07:00
|
|
|
{
|
2019-03-25 23:34:22 +07:00
|
|
|
int i, num_msrs;
|
2019-01-16 22:10:40 +07:00
|
|
|
u64 hwcr;
|
|
|
|
bool need_toggle;
|
2019-03-25 23:34:22 +07:00
|
|
|
u32 msrs[NR_BLOCKS];
|
2019-01-16 22:10:40 +07:00
|
|
|
|
2019-03-25 23:34:22 +07:00
|
|
|
if (c->x86 == 0x15 && bank == 4) {
|
|
|
|
msrs[0] = 0x00000413; /* MC4_MISC0 */
|
|
|
|
msrs[1] = 0xc0000408; /* MC4_MISC1 */
|
|
|
|
num_msrs = 2;
|
|
|
|
} else if (c->x86 == 0x17 &&
|
|
|
|
(c->x86_model >= 0x10 && c->x86_model <= 0x2F)) {
|
|
|
|
|
|
|
|
if (smca_get_bank_type(bank) != SMCA_IF)
|
|
|
|
return;
|
|
|
|
|
|
|
|
msrs[0] = MSR_AMD64_SMCA_MCx_MISC(bank);
|
|
|
|
num_msrs = 1;
|
|
|
|
} else {
|
2019-01-16 22:10:40 +07:00
|
|
|
return;
|
2019-03-25 23:34:22 +07:00
|
|
|
}
|
2019-01-16 22:10:40 +07:00
|
|
|
|
|
|
|
rdmsrl(MSR_K7_HWCR, hwcr);
|
|
|
|
|
|
|
|
/* McStatusWrEn has to be set */
|
|
|
|
need_toggle = !(hwcr & BIT(18));
|
|
|
|
if (need_toggle)
|
|
|
|
wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
|
|
|
|
|
|
|
|
/* Clear CntP bit safely */
|
2019-03-25 23:34:22 +07:00
|
|
|
for (i = 0; i < num_msrs; i++)
|
2019-01-16 22:10:40 +07:00
|
|
|
msr_clear_bit(msrs[i], 62);
|
|
|
|
|
|
|
|
/* restore old settings */
|
|
|
|
if (need_toggle)
|
|
|
|
wrmsrl(MSR_K7_HWCR, hwcr);
|
|
|
|
}
|
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
/* cpu init entry point, called from mce.c with preempt off */
|
2009-02-21 14:35:51 +07:00
|
|
|
void mce_amd_feature_init(struct cpuinfo_x86 *c)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2016-09-12 14:59:31 +07:00
|
|
|
unsigned int bank, block, cpu = smp_processor_id();
|
2019-06-08 03:18:05 +07:00
|
|
|
u32 low = 0, high = 0, address = 0;
|
2016-01-26 02:41:52 +07:00
|
|
|
int offset = -1;
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
|
|
|
|
for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
|
2016-09-12 14:59:34 +07:00
|
|
|
if (mce_flags.smca)
|
2017-05-19 16:39:15 +07:00
|
|
|
smca_configure(bank, cpu);
|
2016-09-12 14:59:34 +07:00
|
|
|
|
2019-03-25 23:34:22 +07:00
|
|
|
disable_err_thresholding(c, bank);
|
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
for (block = 0; block < NR_BLOCKS; ++block) {
|
2019-06-08 03:18:04 +07:00
|
|
|
address = get_block_address(address, low, high, bank, block, cpu);
|
2016-03-07 20:02:19 +07:00
|
|
|
if (!address)
|
|
|
|
break;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
|
|
|
if (rdmsr_safe(address, &low, &high))
|
2007-02-13 19:26:23 +07:00
|
|
|
break;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2010-10-08 17:08:34 +07:00
|
|
|
if (!(high & MASK_VALID_HI))
|
|
|
|
continue;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2007-02-13 19:26:23 +07:00
|
|
|
if (!(high & MASK_CNTP_HI) ||
|
|
|
|
(high & MASK_LOCKED_HI))
|
2006-06-26 18:58:53 +07:00
|
|
|
continue;
|
|
|
|
|
2016-01-26 02:41:52 +07:00
|
|
|
offset = prepare_threshold_block(bank, block, address, offset, high);
|
2006-06-26 18:58:53 +07:00
|
|
|
}
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
2015-05-06 18:58:56 +07:00
|
|
|
|
|
|
|
if (mce_flags.succor)
|
|
|
|
deferred_error_interrupt_enable(c);
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2016-11-18 05:57:27 +07:00
|
|
|
int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
|
|
|
|
{
|
|
|
|
u64 dram_base_addr, dram_limit_addr, dram_hole_base;
|
|
|
|
/* We start from the normalized address */
|
|
|
|
u64 ret_addr = norm_addr;
|
|
|
|
|
|
|
|
u32 tmp;
|
|
|
|
|
|
|
|
u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask;
|
|
|
|
u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets;
|
|
|
|
u8 intlv_addr_sel, intlv_addr_bit;
|
|
|
|
u8 num_intlv_bits, hashed_bit;
|
|
|
|
u8 lgcy_mmio_hole_en, base = 0;
|
|
|
|
u8 cs_mask, cs_id = 0;
|
|
|
|
bool hash_enabled = false;
|
|
|
|
|
|
|
|
/* Read D18F0x1B4 (DramOffset), check if base 1 is used. */
|
|
|
|
if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
/* Remove HiAddrOffset from normalized address, if enabled: */
|
|
|
|
if (tmp & BIT(0)) {
|
|
|
|
u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8;
|
|
|
|
|
|
|
|
if (norm_addr >= hi_addr_offset) {
|
|
|
|
ret_addr -= hi_addr_offset;
|
|
|
|
base = 1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Read D18F0x110 (DramBaseAddress). */
|
|
|
|
if (amd_df_indirect_read(nid, 0, 0x110 + (8 * base), umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
/* Check if address range is valid. */
|
|
|
|
if (!(tmp & BIT(0))) {
|
|
|
|
pr_err("%s: Invalid DramBaseAddress range: 0x%x.\n",
|
|
|
|
__func__, tmp);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
lgcy_mmio_hole_en = tmp & BIT(1);
|
|
|
|
intlv_num_chan = (tmp >> 4) & 0xF;
|
|
|
|
intlv_addr_sel = (tmp >> 8) & 0x7;
|
|
|
|
dram_base_addr = (tmp & GENMASK_ULL(31, 12)) << 16;
|
|
|
|
|
|
|
|
/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
|
|
|
|
if (intlv_addr_sel > 3) {
|
|
|
|
pr_err("%s: Invalid interleave address select %d.\n",
|
|
|
|
__func__, intlv_addr_sel);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Read D18F0x114 (DramLimitAddress). */
|
|
|
|
if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
intlv_num_sockets = (tmp >> 8) & 0x1;
|
|
|
|
intlv_num_dies = (tmp >> 10) & 0x3;
|
|
|
|
dram_limit_addr = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);
|
|
|
|
|
|
|
|
intlv_addr_bit = intlv_addr_sel + 8;
|
|
|
|
|
|
|
|
/* Re-use intlv_num_chan by setting it equal to log2(#channels) */
|
|
|
|
switch (intlv_num_chan) {
|
|
|
|
case 0: intlv_num_chan = 0; break;
|
|
|
|
case 1: intlv_num_chan = 1; break;
|
|
|
|
case 3: intlv_num_chan = 2; break;
|
|
|
|
case 5: intlv_num_chan = 3; break;
|
|
|
|
case 7: intlv_num_chan = 4; break;
|
|
|
|
|
|
|
|
case 8: intlv_num_chan = 1;
|
|
|
|
hash_enabled = true;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
pr_err("%s: Invalid number of interleaved channels %d.\n",
|
|
|
|
__func__, intlv_num_chan);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
num_intlv_bits = intlv_num_chan;
|
|
|
|
|
|
|
|
if (intlv_num_dies > 2) {
|
|
|
|
pr_err("%s: Invalid number of interleaved nodes/dies %d.\n",
|
|
|
|
__func__, intlv_num_dies);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
num_intlv_bits += intlv_num_dies;
|
|
|
|
|
|
|
|
/* Add a bit if sockets are interleaved. */
|
|
|
|
num_intlv_bits += intlv_num_sockets;
|
|
|
|
|
|
|
|
/* Assert num_intlv_bits <= 4 */
|
|
|
|
if (num_intlv_bits > 4) {
|
|
|
|
pr_err("%s: Invalid interleave bits %d.\n",
|
|
|
|
__func__, num_intlv_bits);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (num_intlv_bits > 0) {
|
|
|
|
u64 temp_addr_x, temp_addr_i, temp_addr_y;
|
|
|
|
u8 die_id_bit, sock_id_bit, cs_fabric_id;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Read FabricBlockInstanceInformation3_CS[BlockFabricID].
|
|
|
|
* This is the fabric id for this coherent slave. Use
|
|
|
|
* umc/channel# as instance id of the coherent slave
|
|
|
|
* for FICAA.
|
|
|
|
*/
|
|
|
|
if (amd_df_indirect_read(nid, 0, 0x50, umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
cs_fabric_id = (tmp >> 8) & 0xFF;
|
|
|
|
die_id_bit = 0;
|
|
|
|
|
|
|
|
/* If interleaved over more than 1 channel: */
|
|
|
|
if (intlv_num_chan) {
|
|
|
|
die_id_bit = intlv_num_chan;
|
|
|
|
cs_mask = (1 << die_id_bit) - 1;
|
|
|
|
cs_id = cs_fabric_id & cs_mask;
|
|
|
|
}
|
|
|
|
|
|
|
|
sock_id_bit = die_id_bit;
|
|
|
|
|
|
|
|
/* Read D18F1x208 (SystemFabricIdMask). */
|
|
|
|
if (intlv_num_dies || intlv_num_sockets)
|
|
|
|
if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
/* If interleaved over more than 1 die. */
|
|
|
|
if (intlv_num_dies) {
|
|
|
|
sock_id_bit = die_id_bit + intlv_num_dies;
|
|
|
|
die_id_shift = (tmp >> 24) & 0xF;
|
|
|
|
die_id_mask = (tmp >> 8) & 0xFF;
|
|
|
|
|
|
|
|
cs_id |= ((cs_fabric_id & die_id_mask) >> die_id_shift) << die_id_bit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If interleaved over more than 1 socket. */
|
|
|
|
if (intlv_num_sockets) {
|
|
|
|
socket_id_shift = (tmp >> 28) & 0xF;
|
|
|
|
socket_id_mask = (tmp >> 16) & 0xFF;
|
|
|
|
|
|
|
|
cs_id |= ((cs_fabric_id & socket_id_mask) >> socket_id_shift) << sock_id_bit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The pre-interleaved address consists of XXXXXXIIIYYYYY
|
|
|
|
* where III is the ID for this CS, and XXXXXXYYYYY are the
|
|
|
|
* address bits from the post-interleaved address.
|
|
|
|
* "num_intlv_bits" has been calculated to tell us how many "I"
|
|
|
|
* bits there are. "intlv_addr_bit" tells us how many "Y" bits
|
|
|
|
* there are (where "I" starts).
|
|
|
|
*/
|
|
|
|
temp_addr_y = ret_addr & GENMASK_ULL(intlv_addr_bit-1, 0);
|
|
|
|
temp_addr_i = (cs_id << intlv_addr_bit);
|
|
|
|
temp_addr_x = (ret_addr & GENMASK_ULL(63, intlv_addr_bit)) << num_intlv_bits;
|
|
|
|
ret_addr = temp_addr_x | temp_addr_i | temp_addr_y;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Add dram base address */
|
|
|
|
ret_addr += dram_base_addr;
|
|
|
|
|
|
|
|
/* If legacy MMIO hole enabled */
|
|
|
|
if (lgcy_mmio_hole_en) {
|
|
|
|
if (amd_df_indirect_read(nid, 0, 0x104, umc, &tmp))
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
dram_hole_base = tmp & GENMASK(31, 24);
|
|
|
|
if (ret_addr >= dram_hole_base)
|
|
|
|
ret_addr += (BIT_ULL(32) - dram_hole_base);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (hash_enabled) {
|
|
|
|
/* Save some parentheses and grab ls-bit at the end. */
|
|
|
|
hashed_bit = (ret_addr >> 12) ^
|
|
|
|
(ret_addr >> 18) ^
|
|
|
|
(ret_addr >> 21) ^
|
|
|
|
(ret_addr >> 30) ^
|
|
|
|
cs_id;
|
|
|
|
|
|
|
|
hashed_bit &= BIT(0);
|
|
|
|
|
|
|
|
if (hashed_bit != ((ret_addr >> intlv_addr_bit) & BIT(0)))
|
|
|
|
ret_addr ^= BIT(intlv_addr_bit);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Is the calculated system address above the DRAM limit address? */
|
|
|
|
if (ret_addr > dram_limit_addr)
|
|
|
|
goto out_err;
|
|
|
|
|
|
|
|
*sys_addr = ret_addr;
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
out_err:
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(umc_normaddr_to_sysaddr);
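Two bit manipulations in umc_normaddr_to_sysaddr() are easier to follow on plain integers: re-inserting the interleave ("I") bits into the normalized address, and the optional hash fix-up of the lowest interleave bit. The sketch below is a user-space approximation; `genmask64()`, `insert_cs_id()` and `apply_hash()` are hypothetical names, and none of the Data Fabric register reads are modelled.

```c
#include <stdint.h>

/* Contiguous 64-bit mask covering bits h..l inclusive (0 <= l <= h <= 63). */
static uint64_t genmask64(unsigned int h, unsigned int l)
{
	return (~UINT64_C(0) >> (63 - h)) & (~UINT64_C(0) << l);
}

/*
 * Rebuild XXXXXXIIIYYYYY from a normalized address XXXXXXYYYYY.
 * intlv_addr_bit: position of the lowest "I" bit (8..11 in the source).
 * num_intlv_bits: number of "I" bits (1..4 in the source).
 */
static uint64_t insert_cs_id(uint64_t norm, uint8_t cs_id,
			     unsigned int intlv_addr_bit,
			     unsigned int num_intlv_bits)
{
	uint64_t y = norm & genmask64(intlv_addr_bit - 1, 0);
	uint64_t i = (uint64_t)cs_id << intlv_addr_bit;
	uint64_t x = (norm & genmask64(63, intlv_addr_bit)) << num_intlv_bits;

	return x | i | y;
}

/* Flip the lowest interleave bit if it disagrees with the hash. */
static uint64_t apply_hash(uint64_t addr, uint8_t cs_id,
			   unsigned int intlv_addr_bit)
{
	uint64_t hashed = ((addr >> 12) ^ (addr >> 18) ^ (addr >> 21) ^
			   (addr >> 30) ^ cs_id) & 1;

	if (hashed != ((addr >> intlv_addr_bit) & 1))
		addr ^= UINT64_C(1) << intlv_addr_bit;

	return addr;
}
```

For instance, inserting cs_id 3 as two bits at bit 8 keeps the low byte of the address, places 0b11 in bits 9:8, and moves everything above two positions up.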
|
|
|
|
|
2017-12-18 18:37:13 +07:00
|
|
|
bool amd_mce_is_memory_error(struct mce *m)
|
|
|
|
{
|
|
|
|
/* ErrCodeExt[20:16] */
|
|
|
|
u8 xec = (m->status >> 16) & 0x1f;
|
|
|
|
|
|
|
|
if (mce_flags.smca)
|
2018-02-21 17:18:57 +07:00
|
|
|
return smca_get_bank_type(m->bank) == SMCA_UMC && xec == 0x0;
|
2017-12-18 18:37:13 +07:00
|
|
|
|
|
|
|
return m->bank == 4 && xec == 0x8;
|
|
|
|
}
|
|
|
|
|
2017-05-19 16:39:14 +07:00
|
|
|
static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
|
2015-05-06 18:58:53 +07:00
|
|
|
{
|
|
|
|
struct mce m;
|
|
|
|
|
|
|
|
mce_setup(&m);
|
|
|
|
|
|
|
|
m.status = status;
|
2017-05-19 16:39:14 +07:00
|
|
|
m.misc = misc;
|
2017-01-24 01:35:09 +07:00
|
|
|
m.bank = bank;
|
|
|
|
m.tsc = rdtsc();
|
2015-05-06 18:58:54 +07:00
|
|
|
|
2016-09-12 14:59:39 +07:00
|
|
|
if (m.status & MCI_STATUS_ADDRV) {
|
2017-05-19 16:39:14 +07:00
|
|
|
m.addr = addr;
|
2015-05-06 18:58:53 +07:00
|
|
|
|
2016-09-12 14:59:39 +07:00
|
|
|
/*
|
|
|
|
* Extract [55:<lsb>] where lsb is the least significant
|
|
|
|
* *valid* bit of the address bits.
|
|
|
|
*/
|
|
|
|
if (mce_flags.smca) {
|
|
|
|
u8 lsb = (m.addr >> 56) & 0x3f;
|
|
|
|
|
|
|
|
m.addr &= GENMASK_ULL(55, lsb);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-09-12 14:59:37 +07:00
|
|
|
if (mce_flags.smca) {
|
|
|
|
rdmsrl(MSR_AMD64_SMCA_MCx_IPID(bank), m.ipid);
|
|
|
|
|
|
|
|
if (m.status & MCI_STATUS_SYNDV)
|
|
|
|
rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m.synd);
|
|
|
|
}
|
2016-09-12 14:59:28 +07:00
|
|
|
|
2015-05-06 18:58:54 +07:00
|
|
|
mce_log(&m);
|
2015-05-06 18:58:53 +07:00
|
|
|
}
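The SMCA address masking in __log_error() reads the position of the least significant valid bit from bits [61:56] of the address value and keeps only bits [55:lsb]. A minimal sketch of that step follows; `smca_valid_addr()` is a hypothetical helper, and the guard for an lsb above 55 is an addition of this sketch that the source does not contain.

```c
#include <stdint.h>

/* Keep bits [55:lsb] of an SMCA address value; lsb comes from bits [61:56]. */
static uint64_t smca_valid_addr(uint64_t mca_addr)
{
	unsigned int lsb = (mca_addr >> 56) & 0x3f;

	/* Assumption of this sketch: treat an out-of-range lsb as "no address". */
	if (lsb > 55)
		return 0;

	return mca_addr & (~UINT64_C(0) >> 8) & (~UINT64_C(0) << lsb);
}
```

With lsb equal to 6, for example, the low six bits and the top byte are cleared and the rest of the address is passed through unchanged.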
|
|
|
|
|
2018-11-10 05:13:13 +07:00
|
|
|
asmlinkage __visible void __irq_entry smp_deferred_error_interrupt(struct pt_regs *regs)
|
2015-05-06 18:58:56 +07:00
|
|
|
{
|
|
|
|
entering_irq();
|
|
|
|
trace_deferred_error_apic_entry(DEFERRED_ERROR_VECTOR);
|
2017-08-28 13:47:28 +07:00
|
|
|
inc_irq_stat(irq_deferred_error_count);
|
|
|
|
deferred_error_int_vector();
|
2015-05-06 18:58:56 +07:00
|
|
|
trace_deferred_error_apic_exit(DEFERRED_ERROR_VECTOR);
|
|
|
|
exiting_ack_irq();
|
|
|
|
}
|
|
|
|
|
2017-05-19 16:39:14 +07:00
|
|
|
/*
|
|
|
|
* Returns true if the logged error is deferred. False, otherwise.
|
|
|
|
*/
|
|
|
|
static inline bool
|
|
|
|
_log_error_bank(unsigned int bank, u32 msr_stat, u32 msr_addr, u64 misc)
|
2015-05-06 18:58:56 +07:00
|
|
|
{
|
2017-05-19 16:39:14 +07:00
|
|
|
u64 status, addr = 0;
|
2015-05-06 18:58:56 +07:00
|
|
|
|
2017-05-19 16:39:14 +07:00
|
|
|
rdmsrl(msr_stat, status);
|
|
|
|
if (!(status & MCI_STATUS_VAL))
|
|
|
|
return false;
|
2016-05-11 19:58:23 +07:00
|
|
|
|
2017-05-19 16:39:14 +07:00
|
|
|
if (status & MCI_STATUS_ADDRV)
|
|
|
|
rdmsrl(msr_addr, addr);
|
2015-05-06 18:58:56 +07:00
|
|
|
|
2017-05-19 16:39:14 +07:00
|
|
|
__log_error(bank, status, addr, misc);
|
2015-05-06 18:58:56 +07:00
|
|
|
|
2017-06-13 23:28:28 +07:00
|
|
|
wrmsrl(msr_stat, 0);
|
2017-05-19 16:39:14 +07:00
|
|
|
|
|
|
|
return status & MCI_STATUS_DEFERRED;
|
2015-05-06 18:58:56 +07:00
|
|
|
}
|
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
/*
|
2017-05-19 16:39:14 +07:00
|
|
|
* We have three scenarios for checking for Deferred errors:
|
|
|
|
*
|
|
|
|
* 1) Non-SMCA systems check MCA_STATUS and log error if found.
|
|
|
|
* 2) SMCA systems check MCA_STATUS. If error is found then log it and also
|
|
|
|
* clear MCA_DESTAT.
|
|
|
|
* 3) SMCA systems check MCA_DESTAT, if error was not found in MCA_STATUS, and
|
|
|
|
* log it.
|
2005-11-05 23:25:53 +07:00
|
|
|
*/
|
2017-05-19 16:39:14 +07:00
|
|
|
static void log_error_deferred(unsigned int bank)
|
|
|
|
{
|
|
|
|
bool defrd;
|
|
|
|
|
|
|
|
defrd = _log_error_bank(bank, msr_ops.status(bank),
|
|
|
|
msr_ops.addr(bank), 0);
|
|
|
|
|
|
|
|
if (!mce_flags.smca)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* Clear MCA_DESTAT if we logged the deferred error from MCA_STATUS. */
|
|
|
|
if (defrd) {
|
|
|
|
wrmsrl(MSR_AMD64_SMCA_MCx_DESTAT(bank), 0);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Only deferred errors are logged in MCA_DE{STAT,ADDR} so just check
|
|
|
|
* for a valid error.
|
|
|
|
*/
|
|
|
|
_log_error_bank(bank, MSR_AMD64_SMCA_MCx_DESTAT(bank),
|
|
|
|
MSR_AMD64_SMCA_MCx_DEADDR(bank), 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* APIC interrupt handler for deferred errors */
|
|
|
|
static void amd_deferred_error_interrupt(void)
|
|
|
|
{
|
|
|
|
unsigned int bank;
|
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank)
|
2017-05-19 16:39:14 +07:00
|
|
|
log_error_deferred(bank);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void log_error_thresholding(unsigned int bank, u64 misc)
|
|
|
|
{
|
|
|
|
_log_error_bank(bank, msr_ops.status(bank), msr_ops.addr(bank), misc);
|
|
|
|
}
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2017-06-13 23:28:29 +07:00
|
|
|
static void log_and_reset_block(struct threshold_block *block)
|
|
|
|
{
|
|
|
|
struct thresh_restart tr;
|
|
|
|
u32 low = 0, high = 0;
|
|
|
|
|
|
|
|
if (!block)
|
|
|
|
return;
|
|
|
|
|
|
|
|
if (rdmsr_safe(block->address, &low, &high))
|
|
|
|
return;
|
|
|
|
|
|
|
|
if (!(high & MASK_OVERFLOW_HI))
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* Log the MCE which caused the threshold event. */
|
|
|
|
log_error_thresholding(block->bank, ((u64)high << 32) | low);
|
|
|
|
|
|
|
|
/* Reset threshold block after logging error. */
|
|
|
|
memset(&tr, 0, sizeof(tr));
|
|
|
|
tr.b = block;
|
|
|
|
threshold_restart_bank(&tr);
|
|
|
|
}
|
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
/*
|
2017-05-19 16:39:14 +07:00
|
|
|
* Threshold interrupt handler will service THRESHOLD_APIC_VECTOR. The interrupt
|
|
|
|
* goes off when error_count reaches threshold_limit.
|
2005-11-05 23:25:53 +07:00
|
|
|
*/
|
2009-02-12 19:49:31 +07:00
|
|
|
static void amd_threshold_interrupt(void)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2017-06-13 23:28:29 +07:00
|
|
|
struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
|
|
|
|
unsigned int bank, cpu = smp_processor_id();
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
|
2014-10-02 19:48:19 +07:00
|
|
|
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
|
2007-02-13 19:26:23 +07:00
|
|
|
continue;
|
2014-10-02 19:48:19 +07:00
|
|
|
|
2017-06-13 23:28:29 +07:00
|
|
|
first_block = per_cpu(threshold_banks, cpu)[bank]->blocks;
|
|
|
|
if (!first_block)
|
|
|
|
continue;
|
2016-11-16 04:13:53 +07:00
|
|
|
|
2017-06-13 23:28:29 +07:00
|
|
|
/*
|
|
|
|
* The first block is also the head of the list. Check it first
|
|
|
|
* before iterating over the rest.
|
|
|
|
*/
|
|
|
|
log_and_reset_block(first_block);
|
|
|
|
list_for_each_entry_safe(block, tmp, &first_block->miscj, miscj)
|
|
|
|
log_and_reset_block(block);
|
2017-05-19 16:39:14 +07:00
|
|
|
}
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Sysfs Interface
|
|
|
|
*/
|
|
|
|
|
|
|
|
struct threshold_attr {
|
2006-06-26 18:58:56 +07:00
|
|
|
struct attribute attr;
|
2009-04-08 17:31:18 +07:00
|
|
|
ssize_t (*show) (struct threshold_block *, char *);
|
|
|
|
ssize_t (*store) (struct threshold_block *, const char *, size_t count);
|
2005-11-05 23:25:53 +07:00
|
|
|
};
|
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
#define SHOW_FIELDS(name) \
|
|
|
|
static ssize_t show_ ## name(struct threshold_block *b, char *buf) \
|
|
|
|
{ \
|
2012-04-27 17:31:34 +07:00
|
|
|
return sprintf(buf, "%lu\n", (unsigned long) b->name); \
|
2006-06-26 18:58:56 +07:00
|
|
|
}
|
2005-11-05 23:25:53 +07:00
|
|
|
SHOW_FIELDS(interrupt_enable)
|
|
|
|
SHOW_FIELDS(threshold_limit)
|
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
static ssize_t
|
2009-04-14 15:26:30 +07:00
|
|
|
store_interrupt_enable(struct threshold_block *b, const char *buf, size_t size)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2008-12-17 08:34:04 +07:00
|
|
|
struct thresh_restart tr;
|
2009-04-08 17:31:18 +07:00
|
|
|
unsigned long new;
|
|
|
|
|
2012-04-16 23:01:53 +07:00
|
|
|
if (!b->interrupt_capable)
|
|
|
|
return -EINVAL;
|
|
|
|
|
2014-08-09 04:24:03 +07:00
|
|
|
if (kstrtoul(buf, 0, &new) < 0)
|
2005-11-05 23:25:53 +07:00
|
|
|
return -EINVAL;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
b->interrupt_enable = !!new;
|
|
|
|
|
2010-10-25 21:03:35 +07:00
|
|
|
memset(&tr, 0, sizeof(tr));
|
2009-04-08 17:31:18 +07:00
|
|
|
tr.b = b;
|
|
|
|
|
2009-03-18 07:10:25 +07:00
|
|
|
smp_call_function_single(b->cpu, threshold_restart_bank, &tr, 1);
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2009-04-14 15:26:30 +07:00
|
|
|
return size;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
static ssize_t
|
2009-04-14 15:26:30 +07:00
|
|
|
store_threshold_limit(struct threshold_block *b, const char *buf, size_t size)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2008-12-17 08:34:04 +07:00
|
|
|
struct thresh_restart tr;
|
2009-04-08 17:31:18 +07:00
|
|
|
unsigned long new;
|
|
|
|
|
2014-08-09 04:24:03 +07:00
|
|
|
if (kstrtoul(buf, 0, &new) < 0)
|
2005-11-05 23:25:53 +07:00
|
|
|
return -EINVAL;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
if (new > THRESHOLD_MAX)
|
|
|
|
new = THRESHOLD_MAX;
|
|
|
|
if (new < 1)
|
|
|
|
new = 1;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2010-10-25 21:03:35 +07:00
|
|
|
memset(&tr, 0, sizeof(tr));
|
2008-12-17 08:34:04 +07:00
|
|
|
tr.old_limit = b->threshold_limit;
|
2005-11-05 23:25:53 +07:00
|
|
|
b->threshold_limit = new;
|
2008-12-17 08:34:04 +07:00
|
|
|
tr.b = b;
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2009-03-18 07:10:25 +07:00
|
|
|
smp_call_function_single(b->cpu, threshold_restart_bank, &tr, 1);
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2009-04-14 15:26:30 +07:00
|
|
|
return size;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2008-12-17 08:34:04 +07:00
|
|
|
static ssize_t show_error_count(struct threshold_block *b, char *buf)
|
|
|
|
{
|
2012-04-27 17:53:59 +07:00
|
|
|
u32 lo, hi;
|
|
|
|
|
|
|
|
rdmsr_on_cpu(b->cpu, b->address, &lo, &hi);
|
2009-03-18 07:10:25 +07:00
|
|
|
|
2012-04-27 17:53:59 +07:00
|
|
|
return sprintf(buf, "%u\n", ((hi & THRESHOLD_MAX) -
|
|
|
|
(THRESHOLD_MAX - b->threshold_limit)));
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2012-04-27 20:37:25 +07:00
|
|
|
static struct threshold_attr error_count = {
|
|
|
|
.attr = {.name = __stringify(error_count), .mode = 0444 },
|
|
|
|
.show = show_error_count,
|
|
|
|
};
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
#define RW_ATTR(val) \
|
|
|
|
static struct threshold_attr val = { \
|
|
|
|
.attr = {.name = __stringify(val), .mode = 0644 }, \
|
|
|
|
.show = show_## val, \
|
|
|
|
.store = store_## val, \
|
2005-11-05 23:25:53 +07:00
|
|
|
};
|
|
|
|
|
2006-06-26 18:58:56 +07:00
|
|
|
RW_ATTR(interrupt_enable);
|
|
|
|
RW_ATTR(threshold_limit);
|
2005-11-05 23:25:53 +07:00
|
|
|
|
|
|
|
static struct attribute *default_attrs[] = {
|
|
|
|
&threshold_limit.attr,
|
|
|
|
&error_count.attr,
|
2012-04-16 23:20:36 +07:00
|
|
|
NULL, /* possibly interrupt_enable if supported, see below */
|
|
|
|
NULL,
|
2005-11-05 23:25:53 +07:00
|
|
|
};
|
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
#define to_block(k) container_of(k, struct threshold_block, kobj)
|
|
|
|
#define to_attr(a) container_of(a, struct threshold_attr, attr)
|
2005-11-05 23:25:53 +07:00
|
|
|
|
|
|
|
static ssize_t show(struct kobject *kobj, struct attribute *attr, char *buf)
|
|
|
|
{
|
2006-06-26 18:58:53 +07:00
|
|
|
struct threshold_block *b = to_block(kobj);
|
2005-11-05 23:25:53 +07:00
|
|
|
struct threshold_attr *a = to_attr(attr);
|
|
|
|
ssize_t ret;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
ret = a->show ? a->show(b, buf) : -EIO;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
static ssize_t store(struct kobject *kobj, struct attribute *attr,
|
|
|
|
const char *buf, size_t count)
|
|
|
|
{
|
2006-06-26 18:58:53 +07:00
|
|
|
struct threshold_block *b = to_block(kobj);
|
2005-11-05 23:25:53 +07:00
|
|
|
struct threshold_attr *a = to_attr(attr);
|
|
|
|
ssize_t ret;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
ret = a->store ? a->store(b, buf, count) : -EIO;
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2010-01-19 08:58:23 +07:00
|
|
|
static const struct sysfs_ops threshold_ops = {
|
2009-04-08 17:31:18 +07:00
|
|
|
.show = show,
|
|
|
|
.store = store,
|
2005-11-05 23:25:53 +07:00
|
|
|
};
|
|
|
|
|
2020-02-14 01:01:34 +07:00
|
|
|
static void threshold_block_release(struct kobject *kobj);
|
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
static struct kobj_type threshold_ktype = {
|
2009-04-08 17:31:18 +07:00
|
|
|
.sysfs_ops = &threshold_ops,
|
|
|
|
.default_attrs = default_attrs,
|
2020-02-14 01:01:34 +07:00
|
|
|
.release = threshold_block_release,
|
2005-11-05 23:25:53 +07:00
|
|
|
};
|
|
|
|
|
2016-09-12 14:59:35 +07:00
|
|
|
static const char *get_name(unsigned int bank, struct threshold_block *b)
|
|
|
|
{
|
2018-02-21 17:18:57 +07:00
|
|
|
enum smca_bank_types bank_type;
|
2016-09-12 14:59:35 +07:00
|
|
|
|
|
|
|
if (!mce_flags.smca) {
|
|
|
|
if (b && bank == 4)
|
|
|
|
return bank4_names(b);
|
|
|
|
|
|
|
|
return th_names[bank];
|
|
|
|
}
|
|
|
|
|
2018-02-21 17:18:57 +07:00
|
|
|
bank_type = smca_get_bank_type(bank);
|
|
|
|
if (bank_type >= N_SMCA_BANK_TYPES)
|
2016-09-12 14:59:35 +07:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
if (b && bank_type == SMCA_UMC) {
|
|
|
|
if (b->block < ARRAY_SIZE(smca_umc_block_names))
|
|
|
|
return smca_umc_block_names[b->block];
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2017-01-24 01:35:08 +07:00
|
|
|
if (smca_banks[bank].hwid->count == 1)
|
|
|
|
return smca_get_name(bank_type);
|
|
|
|
|
2016-09-12 14:59:35 +07:00
|
|
|
snprintf(buf_mcatype, MAX_MCATYPE_NAME_LEN,
|
2016-11-04 03:12:33 +07:00
|
|
|
"%s_%x", smca_get_name(bank_type),
|
2017-01-24 01:35:08 +07:00
|
|
|
smca_banks[bank].sysfs_id);
|
2016-09-12 14:59:35 +07:00
|
|
|
return buf_mcatype;
|
|
|
|
}
|
|
|
|
|
2020-02-04 19:28:41 +07:00
|
|
|
static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb,
|
|
|
|
unsigned int bank, unsigned int block,
|
|
|
|
u32 address)
|
2006-06-26 18:58:53 +07:00
|
|
|
{
|
|
|
|
struct threshold_block *b = NULL;
|
2009-04-08 17:31:18 +07:00
|
|
|
u32 low, high;
|
|
|
|
int err;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS))
|
2006-06-26 18:58:53 +07:00
|
|
|
return 0;
|
|
|
|
|
2009-03-18 07:10:25 +07:00
|
|
|
if (rdmsr_safe_on_cpu(cpu, address, &low, &high))
|
2007-02-13 19:26:23 +07:00
|
|
|
return 0;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
|
|
|
if (!(high & MASK_VALID_HI)) {
|
|
|
|
if (block)
|
|
|
|
goto recurse;
|
|
|
|
else
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2007-02-13 19:26:23 +07:00
|
|
|
if (!(high & MASK_CNTP_HI) ||
|
|
|
|
(high & MASK_LOCKED_HI))
|
2006-06-26 18:58:53 +07:00
|
|
|
goto recurse;
|
|
|
|
|
|
|
|
b = kzalloc(sizeof(struct threshold_block), GFP_KERNEL);
|
|
|
|
if (!b)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2009-04-08 17:31:18 +07:00
|
|
|
b->block = block;
|
|
|
|
b->bank = bank;
|
|
|
|
b->cpu = cpu;
|
|
|
|
b->address = address;
|
|
|
|
b->interrupt_enable = 0;
|
2012-04-16 23:01:53 +07:00
|
|
|
b->interrupt_capable = lvt_interrupt_supported(bank, high);
|
2009-04-08 17:31:18 +07:00
|
|
|
b->threshold_limit = THRESHOLD_MAX;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2015-02-03 00:02:41 +07:00
|
|
|
if (b->interrupt_capable) {
|
2012-04-16 23:20:36 +07:00
|
|
|
threshold_ktype.default_attrs[2] = &interrupt_enable.attr;
|
2015-02-03 00:02:41 +07:00
|
|
|
b->interrupt_enable = 1;
|
|
|
|
} else {
|
2012-04-16 23:20:36 +07:00
|
|
|
threshold_ktype.default_attrs[2] = NULL;
|
2015-02-03 00:02:41 +07:00
|
|
|
}
|
2012-04-16 23:20:36 +07:00
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
INIT_LIST_HEAD(&b->miscj);
|
|
|
|
|
2020-02-04 19:28:41 +07:00
|
|
|
if (tb->blocks)
|
|
|
|
list_add(&b->miscj, &tb->blocks->miscj);
|
|
|
|
else
|
|
|
|
tb->blocks = b;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2020-02-04 19:28:41 +07:00
|
|
|
err = kobject_init_and_add(&b->kobj, &threshold_ktype, tb->kobj, get_name(bank, b));
|
2006-06-26 18:58:53 +07:00
|
|
|
if (err)
|
|
|
|
goto out_free;
|
|
|
|
recurse:
|
2019-06-08 03:18:04 +07:00
|
|
|
address = get_block_address(address, low, high, bank, ++block, cpu);
|
2016-03-07 20:02:19 +07:00
|
|
|
if (!address)
|
|
|
|
return 0;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2020-02-04 19:28:41 +07:00
|
|
|
err = allocate_threshold_blocks(cpu, tb, bank, block, address);
|
2006-06-26 18:58:53 +07:00
|
|
|
if (err)
|
|
|
|
goto out_free;
|
|
|
|
|
2008-01-30 19:29:58 +07:00
|
|
|
if (b)
|
|
|
|
kobject_uevent(&b->kobj, KOBJ_ADD);
|
2007-12-20 00:23:20 +07:00
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
return err;
|
|
|
|
|
|
|
|
out_free:
|
|
|
|
if (b) {
|
2007-12-20 23:13:05 +07:00
|
|
|
/* kobject_put() may free b via threshold_block_release(), so unlink first and do not kfree() again. */
|
|
|
|
list_del(&b->miscj);
|
|
|
|
kobject_put(&b->kobj);
|
|
|
|
}
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
x86: delete __cpuinit usage from all x86 files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.
This removes all the arch/x86 uses of the __cpuinit macros from
all C files. x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-06-19 05:23:59 +07:00
|
|
|
static int __threshold_add_blocks(struct threshold_bank *b)
|
2012-05-02 22:16:59 +07:00
|
|
|
{
|
|
|
|
struct list_head *head = &b->blocks->miscj;
|
|
|
|
struct threshold_block *pos = NULL;
|
|
|
|
struct threshold_block *tmp = NULL;
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
err = kobject_add(&b->blocks->kobj, b->kobj, b->blocks->kobj.name);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
list_for_each_entry_safe(pos, tmp, head, miscj) {
|
|
|
|
|
|
|
|
err = kobject_add(&pos->kobj, b->kobj, pos->kobj.name);
|
|
|
|
if (err) {
|
|
|
|
list_for_each_entry_safe_reverse(pos, tmp, head, miscj)
|
|
|
|
kobject_del(&pos->kobj);
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2013-06-19 05:23:59 +07:00
|
|
|
static int threshold_create_bank(unsigned int cpu, unsigned int bank)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2012-01-27 06:49:14 +07:00
|
|
|
struct device *dev = per_cpu(mce_device, cpu);
|
2012-05-02 22:16:59 +07:00
|
|
|
struct amd_northbridge *nb = NULL;
|
2012-05-02 21:20:49 +07:00
|
|
|
struct threshold_bank *b = NULL;
|
2016-09-12 14:59:35 +07:00
|
|
|
const char *name = get_name(bank, NULL);
|
2012-05-02 21:20:49 +07:00
|
|
|
int err = 0;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2016-12-27 04:58:20 +07:00
|
|
|
if (!dev)
|
|
|
|
return -ENODEV;
|
|
|
|
|
2013-03-15 04:10:40 +07:00
|
|
|
if (is_shared_bank(bank)) {
|
2012-05-02 22:16:59 +07:00
|
|
|
nb = node_to_amd_nb(amd_get_nb_id(cpu));
|
|
|
|
|
|
|
|
/* threshold descriptor already initialized on this node? */
|
2012-10-01 13:42:05 +07:00
|
|
|
if (nb && nb->bank4) {
|
2012-05-02 22:16:59 +07:00
|
|
|
/* yes, use it */
|
|
|
|
b = nb->bank4;
|
|
|
|
err = kobject_add(b->kobj, &dev->kobj, name);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
per_cpu(threshold_banks, cpu)[bank] = b;
|
2017-05-19 16:39:13 +07:00
|
|
|
refcount_inc(&b->cpus);
|
2012-05-02 22:16:59 +07:00
|
|
|
|
|
|
|
err = __threshold_add_blocks(b);
|
|
|
|
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
|
2005-11-05 23:25:53 +07:00
|
|
|
if (!b) {
|
|
|
|
err = -ENOMEM;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2012-01-17 05:40:28 +07:00
|
|
|
b->kobj = kobject_create_and_add(name, &dev->kobj);
|
2012-05-02 21:20:49 +07:00
|
|
|
if (!b->kobj) {
|
|
|
|
err = -EINVAL;
|
2007-12-20 00:23:20 +07:00
|
|
|
goto out_free;
|
2012-05-02 21:20:49 +07:00
|
|
|
}
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2013-03-15 04:10:40 +07:00
|
|
|
if (is_shared_bank(bank)) {
|
2017-05-19 16:39:13 +07:00
|
|
|
refcount_set(&b->cpus, 1);
|
2012-05-02 22:16:59 +07:00
|
|
|
|
|
|
|
/* nb is already initialized, see above */
|
2012-10-01 13:42:05 +07:00
|
|
|
if (nb) {
|
|
|
|
WARN_ON(nb->bank4);
|
|
|
|
nb->bank4 = b;
|
|
|
|
}
|
2012-05-02 22:16:59 +07:00
|
|
|
}
|
|
|
|
|
2020-02-04 19:28:41 +07:00
|
|
|
err = allocate_threshold_blocks(cpu, b, bank, 0, msr_ops.misc(bank));
|
|
|
|
if (err)
|
|
|
|
goto out_free;
|
|
|
|
|
|
|
|
per_cpu(threshold_banks, cpu)[bank] = b;
|
|
|
|
|
|
|
|
return 0;
|
2006-06-26 18:58:53 +07:00
|
|
|
|
2012-05-02 22:16:59 +07:00
|
|
|
out_free:
|
2006-06-26 18:58:53 +07:00
|
|
|
kfree(b);
|
2012-05-02 22:16:59 +07:00
|
|
|
|
|
|
|
out:
|
2005-11-05 23:25:53 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2020-02-14 01:01:34 +07:00
|
|
|
static void threshold_block_release(struct kobject *kobj)
|
|
|
|
{
|
|
|
|
kfree(to_block(kobj));
|
|
|
|
}
|
|
|
|
|
|
|
|
static void deallocate_threshold_block(unsigned int cpu, unsigned int bank)
|
2006-06-26 18:58:53 +07:00
|
|
|
{
|
|
|
|
struct threshold_block *pos = NULL;
|
|
|
|
struct threshold_block *tmp = NULL;
|
|
|
|
struct threshold_bank *head = per_cpu(threshold_banks, cpu)[bank];
|
|
|
|
|
|
|
|
if (!head)
|
|
|
|
return;
|
|
|
|
|
|
|
|
list_for_each_entry_safe(pos, tmp, &head->blocks->miscj, miscj) {
|
|
|
|
list_del(&pos->miscj);
|
2020-02-14 01:01:34 +07:00
|
|
|
kobject_put(&pos->kobj);
|
2006-06-26 18:58:53 +07:00
|
|
|
}
|
|
|
|
|
2020-02-14 01:01:34 +07:00
|
|
|
kobject_put(&head->blocks->kobj);
|
2006-06-26 18:58:53 +07:00
|
|
|
}
|
|
|
|
|
2012-05-02 22:16:59 +07:00
|
|
|
static void __threshold_remove_blocks(struct threshold_bank *b)
|
|
|
|
{
|
|
|
|
struct threshold_block *pos = NULL;
|
|
|
|
struct threshold_block *tmp = NULL;
|
|
|
|
|
|
|
|
kobject_del(b->kobj);
|
|
|
|
|
|
|
|
list_for_each_entry_safe(pos, tmp, &b->blocks->miscj, miscj)
|
|
|
|
kobject_del(&pos->kobj);
|
|
|
|
}
|
|
|
|
|
2006-07-30 17:03:37 +07:00
|
|
|
static void threshold_remove_bank(unsigned int cpu, int bank)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2012-05-02 22:16:59 +07:00
|
|
|
struct amd_northbridge *nb;
|
2005-11-05 23:25:53 +07:00
|
|
|
struct threshold_bank *b;
|
|
|
|
|
|
|
|
b = per_cpu(threshold_banks, cpu)[bank];
|
|
|
|
if (!b)
|
|
|
|
return;
|
2012-05-02 22:16:59 +07:00
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
if (!b->blocks)
|
|
|
|
goto free_out;
|
|
|
|
|
2013-03-15 04:10:40 +07:00
|
|
|
if (is_shared_bank(bank)) {
|
2017-05-19 16:39:13 +07:00
|
|
|
if (!refcount_dec_and_test(&b->cpus)) {
|
2012-05-02 22:16:59 +07:00
|
|
|
__threshold_remove_blocks(b);
|
|
|
|
per_cpu(threshold_banks, cpu)[bank] = NULL;
|
|
|
|
return;
|
|
|
|
} else {
|
|
|
|
/*
|
|
|
|
* the last CPU on this node using the shared bank is
|
|
|
|
* going away, remove that bank now.
|
|
|
|
*/
|
|
|
|
nb = node_to_amd_nb(amd_get_nb_id(cpu));
|
|
|
|
nb->bank4 = NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2006-06-26 18:58:53 +07:00
|
|
|
deallocate_threshold_block(cpu, bank);
|
|
|
|
|
|
|
|
free_out:
|
x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c . Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former. At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other. This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.
Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined. Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.
This patch fixes bug #11337.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 03:23:09 +07:00
|
|
|
kobject_del(b->kobj);
|
2007-12-20 23:13:05 +07:00
|
|
|
kobject_put(b->kobj);
|
2006-06-26 18:58:53 +07:00
|
|
|
kfree(b);
|
|
|
|
per_cpu(threshold_banks, cpu)[bank] = NULL;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2016-11-11 00:44:44 +07:00
|
|
|
int mce_threshold_remove_device(unsigned int cpu)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2006-06-26 18:58:56 +07:00
|
|
|
unsigned int bank;
|
2005-11-05 23:25:53 +07:00
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) {
|
2008-01-30 19:33:40 +07:00
|
|
|
if (!(per_cpu(bank_map, cpu) & (1U << bank)))
|
2005-11-05 23:25:53 +07:00
|
|
|
continue;
|
|
|
|
threshold_remove_bank(cpu, bank);
|
|
|
|
}
|
2013-03-15 04:10:41 +07:00
|
|
|
kfree(per_cpu(threshold_banks, cpu));
|
2016-11-11 00:44:42 +07:00
|
|
|
per_cpu(threshold_banks, cpu) = NULL;
|
2016-11-11 00:44:44 +07:00
|
|
|
return 0;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
2016-11-11 00:44:41 +07:00
|
|
|
/* create dir/files for all valid threshold banks */
|
2016-11-11 00:44:44 +07:00
|
|
|
int mce_threshold_create_device(unsigned int cpu)
|
2005-11-05 23:25:53 +07:00
|
|
|
{
|
2016-11-11 00:44:41 +07:00
|
|
|
unsigned int bank;
|
|
|
|
struct threshold_bank **bp;
|
|
|
|
int err = 0;
|
|
|
|
|
2016-11-11 00:44:43 +07:00
|
|
|
bp = per_cpu(threshold_banks, cpu);
|
|
|
|
if (bp)
|
|
|
|
return 0;
|
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
bp = kcalloc(per_cpu(mce_num_banks, cpu), sizeof(struct threshold_bank *),
|
2016-11-11 00:44:41 +07:00
|
|
|
GFP_KERNEL);
|
|
|
|
if (!bp)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
per_cpu(threshold_banks, cpu) = bp;
|
|
|
|
|
2019-06-08 03:18:05 +07:00
|
|
|
for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) {
|
2016-11-11 00:44:41 +07:00
|
|
|
if (!(per_cpu(bank_map, cpu) & (1U << bank)))
|
|
|
|
continue;
|
|
|
|
err = threshold_create_bank(cpu, bank);
|
|
|
|
if (err)
|
2016-11-11 00:44:42 +07:00
|
|
|
goto err;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
2016-11-11 00:44:42 +07:00
|
|
|
return err;
|
|
|
|
err:
|
2016-11-11 00:44:44 +07:00
|
|
|
mce_threshold_remove_device(cpu);
|
2016-11-11 00:44:41 +07:00
|
|
|
return err;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static __init int threshold_init_device(void)
|
|
|
|
{
|
2006-06-26 18:58:56 +07:00
|
|
|
unsigned lcpu = 0;
|
2005-11-05 23:25:53 +07:00
|
|
|
|
|
|
|
/* to hit CPUs online before the notifier is up */
|
|
|
|
for_each_online_cpu(lcpu) {
|
2016-11-11 00:44:44 +07:00
|
|
|
int err = mce_threshold_create_device(lcpu);
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2005-11-05 23:25:53 +07:00
|
|
|
if (err)
|
2006-06-26 18:58:50 +07:00
|
|
|
return err;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
2009-04-08 17:31:18 +07:00
|
|
|
|
2018-11-27 20:41:37 +07:00
|
|
|
if (thresholding_irq_en)
|
|
|
|
mce_threshold_vector = amd_threshold_interrupt;
|
|
|
|
|
2006-06-26 18:58:50 +07:00
|
|
|
return 0;
|
2005-11-05 23:25:53 +07:00
|
|
|
}
|
2012-06-07 18:58:50 +07:00
|
|
|
/*
|
|
|
|
* There are 3 functions which need to be _initcalled in a logical sequence:
|
|
|
|
* 1. xen_late_init_mcelog
|
|
|
|
* 2. mcheck_init_device
|
|
|
|
* 3. threshold_init_device
|
|
|
|
*
|
|
|
|
* xen_late_init_mcelog must register xen_mce_chrdev_device before
|
|
|
|
* native mce_chrdev_device registration if running under xen platform;
|
|
|
|
*
|
|
|
|
* mcheck_init_device must run before threshold_init_device so that
|
|
|
|
* mce_device is initialized; otherwise a NULL pointer dereference causes a panic.
|
|
|
|
*
|
|
|
|
* So we use the following _initcalls:
|
|
|
|
* 1. device_initcall(xen_late_init_mcelog);
|
|
|
|
* 2. device_initcall_sync(mcheck_init_device);
|
|
|
|
* 3. late_initcall(threshold_init_device);
|
|
|
|
*
|
|
|
|
* When running under Xen, the initcall order is 1, 2, 3;
|
|
|
|
* on bare metal, we skip 1 and do only 2 and 3.
|
|
|
|
*/
|
|
|
|
late_initcall(threshold_init_device);
|