linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-03 13:06:43 +07:00

Author	SHA1	Message	Date
Borislav Petkov	f8d5549df2	EDAC, ghes: Do not enable it by default Leave it to the user to decide whether to enable this or not. Otherwise, platform-specific drivers won't initialize (currently, EDAC supports only a single platform driver loaded). Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-27 14:15:38 +02:00
Borislav Petkov	bffc7dece9	EDAC: Rename report status accessors Change them to have the edac_ prefix. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:15:02 +02:00
Borislav Petkov	fee27d7d97	EDAC: Delete edac_stub.c Move the remaining functionality to edac_mc.c. Convert "edac_report=" to a module parameter. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:48 +02:00
Borislav Petkov	a06b85ff07	EDAC: Update Kconfig help text Remove the old URLs. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:44 +02:00
Borislav Petkov	e3c4ff6d8c	EDAC: Remove EDAC_MM_EDAC Move all the EDAC core functionality behind CONFIG_EDAC and get rid of that indirection. Update defconfigs which had it. While at it, fix dependencies such that EDAC depends on RAS for the tracepoints. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-edac@vger.kernel.org	2017-04-10 17:14:41 +02:00
Borislav Petkov	be1d162948	EDAC: Issue tracepoint only when it is defined ... and this happens only when CONFIG_RAS is enabled. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:38 +02:00
Borislav Petkov	8c22b4fece	EDAC: Move edac_op_state to edac_mc.c ... as part of moving stuff away from edac_stub.c Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:29 +02:00
Borislav Petkov	d3116a0837	EDAC: Remove edac_err_assert ... and the glue around it. It is not needed anymore. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:21 +02:00
Borislav Petkov	97bb6c17ad	EDAC: Get rid of edac_handlers Use mc_devices list instead to check whether we have EDAC driver instances successfully registered with EDAC core. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:14:17 +02:00
Borislav Petkov	db47d5f856	x86/nmi, EDAC: Get rid of DRAM error reporting thru PCI SERR NMI Apparently, some machines used to report DRAM errors through a PCI SERR NMI. This is why we have a call into EDAC in the NMI handler. See `c0d1217202` ("drivers/edac: add new nmi rescan"). From looking at the patch above, that's two drivers: e752x_edac.c and e7xxx_edac.c. Now, I wanna say those are old machines which are probably decommissioned already. Tony says that "[t]the newest CPU supported by either of those drivers is the Xeon E7520 (a.k.a. "Nehalem") released in Q1'2010. Possibly some folks are still using these ... but people that hold onto h/w for 7 years generally cling to old s/w too ... so I'd guess it unlikely that we will get complaints for breaking these in upstream." So even if there is a small number still in use, we did load EDAC with edac_op_state == EDAC_OPSTATE_POLL by default (we still do, in fact) which means a default EDAC setup without any parameters supplied on the command line or otherwise would never even log the error in the NMI handler because we're polling by default: inline int edac_handler_set(void) { if (edac_op_state == EDAC_OPSTATE_POLL) return 0; return atomic_read(&edac_handlers); } So, long story short, I'd like to get rid of that nastiness called edac_stub.c and confine all the EDAC drivers solely to drivers/edac/. If we ever have to do stuff like that again, it should be notifiers we're using and not some insanity like this one. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com>	2017-04-10 17:13:48 +02:00
Borislav Petkov	76f6a26ce9	EDAC, highbank: Align Makefile directives ... like the rest of the file. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-10 17:10:43 +02:00
Sergey Temerkhanov	5195c206fd	EDAC, thunderx: Remove unused code Remove unused code reserved for upcoming CPUs. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113834.17153-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-07 11:49:32 +02:00
Sergey Temerkhanov	3d2d8c0f84	EDAC, thunderx: Change LMC index calculation Shift the node number by 3 bits instead of 8 allowing proper functioning with default EDAC_MAX_MCS. Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David Daney <david.daney@cavium.com> Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170406113755.17082-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-07 11:47:44 +02:00
Thor Thayer	25b223ddfe	EDAC, altera: Fix peripheral warnings for Cyclone5 The peripherals' RAS functionality only exist on the Arria10 SoCFPGA. The Cyclone5 initialization generates EDAC warnings when the peripherals aren't found in the device tree. Fix by checking for Arria10 in the init functions. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1491415262-5018-1-git-send-email-thor.thayer@linux.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-06 11:42:38 +02:00
Jan Glauber	621c4fe3cc	EDAC, thunderx: Fix L2C MCI interrupt disable Fix a typo that disabled the MCI interrupts using the wrong bitmask. Signed-off-by: Jan Glauber <jglauber@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170405102739.6301-1-jglauber@cavium.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-04-05 14:46:05 +02:00
Sergey Temerkhanov	41003396f9	EDAC, thunderx: Add Cavium ThunderX EDAC driver Add support for Cavium ThunderX EDAC capable on-chip peripherals, namely the DRAM controller (LMC), cache coherent processor interconnect (CCPI) and level 2 cache blocks (L2C-TAD, L2C-MCI, L2C-CBC) Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Cc: David.Daney@cavium.com Cc: Jan.Glauber@cavium.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170324222837.60583-1-s.temerkhanov@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-27 11:43:56 +02:00
Qiuxu Zhuo	819f60fb7d	EDAC, pnd2_edac: Fix reported DIMM number DIMM number passed to edac_mc_handle_error() was accidentally hardcoded to zero. Pass in the correct daddr->dimm value. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-26 09:36:28 +02:00
Borislav Petkov	cd1be315ac	EDAC, pnd2_edac: Fix !EDAC_DEBUG build Provide debugfs function stubs when EDAC_DEBUG is not enabled so that we don't fail the build: drivers/edac/pnd2_edac.c: In function ‘pnd2_init’: drivers/edac/pnd2_edac.c:1521:2: error: implicit declaration of function ‘setup_pnd2_debug’ [-Werror=implicit-function-declaration] setup_pnd2_debug(); ^ drivers/edac/pnd2_edac.c: In function ‘pnd2_exit’: drivers/edac/pnd2_edac.c:1529:2: error: implicit declaration of function ‘teardown_pnd2_debug’ [-Werror=implicit-function-declaration] teardown_pnd2_debug(); ^ Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-23 12:56:23 +01:00
Borislav Petkov	1c5bf78114	EDAC: Select DEBUG_FS The debugfs.c functionality relies on DEBUG_FS so select it. Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-23 12:56:09 +01:00
Tony Luck	5c71ad17f9	EDAC, pnd2_edac: Add new EDAC driver for Intel SoC platforms Initial target for this driver is the Intel Apollo Lake platform and Denverton micro-server, they use the same internal memory controller IP called Pondicherry2. Memory controller registers are not in PCI config space like earlier Intel memory controllers. For Apollo Lake platform they are accessed via a "side-band" interface, for Denverton micro-server they are access via PCI config space and memory map I/O. This driver is for Apollo Lake and Denverton, but only the Denverton is fully enabled while we wait for the sideband driver. Apollo lake driver and initial cut at Denverton driver by Tony Luck. Extensive cleanup, refactoring and basic verification by Qiuxu Zhuo. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170308174539.14432-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-16 12:40:52 +01:00
Jérémy Lefaure	e61555c29c	EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used as if it returned a boolean true if the width if 8. Fix the tests where MTR_DRAM_WIDTH is misused. Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-09 09:25:29 +01:00
Colin Ian King	4bd035eae2	EDAC, xgene: Fix wrongly spelled "procesing" Fix spelling mistake in dev_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Loc Ho <lho@apm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170223002609.9440-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-03-06 17:08:19 +01:00
Linus Torvalds	60c906bab1	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The main changes in this cycle were: - Assign notifier chain priorities for all RAS related handlers to make the ordering explicit (Borislav Petkov) - Improve the AMD MCA banks sysfs output (Yazen Ghannam) - Various cleanups and restructuring of the x86 RAS code (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority x86/ras: Get rid of mce_process_work() EDAC/mce/amd: Dump TSC value EDAC/mce/amd: Unexport amd_decode_mce() x86/ras/amd/inj: Change dependency x86/ras: Flip the TSC-adding logic x86/ras/amd: Make sysfs names of banks more user-friendly x86/ras/therm_throt: Do not log a fake MCE for thermal events x86/ras/inject: Make it depend on X86_LOCAL_APIC=y	2017-02-20 12:47:44 -08:00
Yazen Ghannam	75bf2f6478	EDAC, mce_amd: Print IPID and Syndrome on a separate line Currently, the IPID and Syndrome are printed on the same line as the Address. There are cases when we can have a valid Syndrome but not a valid Address. For example, the MCA_SYND register can be used to hold more detailed error info that the hardware folks can use. It's not just DRAM ECC syndromes. There are some error types that aren't related to memory that may have valid syndromes, like some errors related to links in the Data Fabric, etc. In these cases, the IPID and Syndrome are not printed at the same log level as the rest of the stanza, so users won't see them on the console. Console: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Dmesg: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b , Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Print the IPID first and on a new line. The IPID should always be printed on SMCA systems. The Syndrome will then be printed with the IPID and at the same log level when valid: [Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over\|CE\|MiscV\|-\|-\|-\|-\|SyndV\|-]: 0xd82000000002080b [Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000 [Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2 Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-16 15:39:32 +01:00
Borislav Petkov	e62d2ca9d0	EDAC, amd64: Bump driver version Last time we did that was when we enabled Bulldozer. Now, we enabled Zen so it is only natural ... :-) Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>	2017-02-14 11:58:05 +01:00
Wei Yongjun	279fa58035	EDAC, fsl_ddr: Make locally used symbols static Fix the following sparse warnings: drivers/edac/fsl_ddr_edac.c:148:1: warning: symbol 'dev_attr_inject_data_hi' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:150:1: warning: symbol 'dev_attr_inject_data_lo' was not declared. Should it be static? drivers/edac/fsl_ddr_edac.c:152:1: warning: symbol 'dev_attr_inject_ctrl' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170209150424.15124-1-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-09 17:40:54 +01:00
Chris Packham	321d17c19b	EDAC, mpc85xx: Add T2080 l2-cache support The L2 cache controller on the T2080 SoC has similar capabilities to the others already supported by the mpc85xx_edac driver. Add it to the list of compatible devices. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170201231624.28843-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>	2017-02-03 10:36:35 +01:00
Yazen Ghannam	1bd9900b83	EDAC, amd64: Add x86cpuid sanity check during init Match one of the devices in amd64_cpuids[] before loading the module. This is an additional sanity check against users trying to load amd64_edac_mod on unsupported systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com [ Get rid of err_ret label, make it a bit more readable this way. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:43:06 +01:00
Yazen Ghannam	4688c9b42d	EDAC, amd64: Don't treat ECC disabled as failure Having ECC disabled on a node doesn't necessarily mean that it's disabled for the entire system. So let's return a non-failing code when ECC is disabled on a node. This way we can skip initialization for the node but still continue with the remaining nodes. After probing all instances, make sure we have at least one MC device allocated. This issue is seen and fix tested on Fam15h and Fam17h MCM systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-8-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:38:49 +01:00
Yazen Ghannam	d7fc9d77ac	EDAC: Add routine to check if MC devices list is empty We need to know if any MC devices have been allocated. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-7-git-send-email-Yazen.Ghannam@amd.com [ Prettify text. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 14:36:47 +01:00
Yazen Ghannam	df64636fa4	EDAC, amd64: Remove unused printing macros amd64_{debug,notice} don't have any users, so remove them. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:20:06 +01:00
Yazen Ghannam	11ab1cae58	EDAC, amd64: Rework messages in ecc_enabled() Print the node number when informing that DRAM ECC is disabled so that we can show which nodes have DRAM ECC disabled. Also, print more detailed system information as edac_dbg(), so as to not bother general users. Switch amd64_notice to amd64_info to match the message above it. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-5-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:18:33 +01:00
Yazen Ghannam	234365f56e	EDAC, amd64: Move global code out of instance functions We have a few functions that register/unregister an ECC error decoding routine. These functions are called when we init/remove instances. However, they are global and so don't need to be registered/unregistered multiple times. So move them out of the init/remove instance functions and into the module init/exit routines. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:08:10 +01:00
Yazen Ghannam	2b9b2c4659	EDAC, amd64: Free unused memory when init_one_instance() fails Jump to memory freeing routines when init_one_instance() fails. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-3-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:03:40 +01:00
Yazen Ghannam	67d7fd306e	EDAC, mce_amd: Give more context to deferred error message Users may not be familiar with the concept of deferred errors. There is no action for users to take on this type of error, so give more context in the error message to make this more clear. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-28 13:03:29 +01:00
Borislav Petkov	58fb24cb95	EDAC, i7300: Test for the second channel properly REDMEMB[17] is the ECC_Locator bit, which, when set, identifies the CS[3:2] as the simbols in error. And thus the second channel. The macro computing it was wrong so get rid of it (it was used at one place only) and get rid of the conditional too. Generates better code this way anyway. Signed-off-by: Borislav Petkov <bp@suse.de> Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-01-26 11:35:23 +01:00
Borislav Petkov	9026cc82b6	x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:57 +01:00
Borislav Petkov	0bceab677d	EDAC/mce/amd: Dump TSC value Dump the TSC value of the time when the MCE got logged. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:56 +01:00
Borislav Petkov	1fbcd90903	EDAC/mce/amd: Unexport amd_decode_mce() It is not used outside of the driver anymore. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-24 09:14:55 +01:00
Nicolas Iooss	127c1225bf	EDAC, sb_edac: Get rid of ->show_interleave_mode() Function sbridge_register_mci() sets pvt->info.show_interleave_mode to knl_show_interleave_mode() on Knight's Landing and show_interleave_mode() anywhere else. Merge show_interleave_mode() and knl_show_interleave_mode() in a single implementation and use it without an indirect function pointer. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170122172806.10412-1-nicolas.iooss_linux@m4x.org [ Call it get_intlv_mode_str(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-23 11:39:48 +01:00
Aaron Miller	4fb6fde74d	EDAC: Expose per-DIMM error counts in sysfs The old csrowX sysfs directories have per-csrow error counters, but the new dimmX directories do not currently expose error counts. EDAC already keeps these counts, add them to sysfs so per-DIMM counts are still available when CONFIG_EDAC_LEGACY_SYSFS=n. Signed-off-by: Aaron Miller <aaronmiller@fb.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161103220153.3997328-1-aaronmiller@fb.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-19 10:29:40 +01:00
Yazen Ghannam	2287c63643	EDAC, amd64: Save and return err code from probe_one_instance() We should save the return code from probe_one_instance() so that it can be returned from the module init function. Otherwise, we'll be returning the -ENOMEM from above. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484322741-41884-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-16 11:53:39 +01:00
Arvind Yadav	f5c61277f6	EDAC, i82975x: Add ioremap_nocache() error handling If ioremap_nocache() fails, it will return NULL. Which will then cause a NULL-pointer dereference. Handle the returned value properly. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484549092-11349-1-git-send-email-arvind.yadav.cs@gmail.com [ Boris: massage commit message and improve error message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-16 11:09:22 +01:00
Ben Dooks	628ea92f09	EDAC: Make dev_attr_sdram_scrub_rate static The dev_attr_sdram_scrub_rate is not declared in a header or used anywhere else, so make it static to fix the following warning: drivers/edac/edac_mc_sysfs.c:816:1: warning: symbol 'dev_attr_sdram_scrub_rate' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1465407356-7357-1-git-send-email-ben.dooks@codethink.co.uk Signed-off-by: Borislav Petkov <bp@suse.de>	2017-01-06 15:54:16 +01:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Mauro Carvalho Chehab	66c222a02f	edac: fix kernel-doc tags at the drivers/edac_*.h Some kernel-doc tags don't provide good descriptions or use a different style. Adjust them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:58:10 -02:00
Mauro Carvalho Chehab	6634fbb6b6	driver-api: create an edac.rst file with EDAC documentation Currently, there's no device driver documentation for the EDAC subsystem at the driver-api book. Fill in the blanks for the structures and functions that misses documentation, uniform the word on the existing ones, and add a new edac.rst file at driver-api, in order to document the EDAC subsystem. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab	e01aa14cf2	edac: move documentation from edac_mc.c to edac_core.h Several functions are documented at edac_mc.c. As we'll be including edac_core.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:52 -02:00
Mauro Carvalho Chehab	fdaf0b3505	edac: move documentation from edac_pci*.c to edac_pci.h Several functions are documented at edac_pci.c and edac_pci_sysfs.c. As we'll be including edac_pci.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. As several of those kernel-doc macros are not in the right format, fix them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	5336f75499	edac: move documentation from edac_device to edac_core.h Several functions are documented at edac_device.c. As we'll be including edac_core.h at drivers-api book, move those, in order for the kernel-doc markups be part of the API documentation book. As several of those kernel-doc macros are not in the right format, fix them. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	78d88e8a3d	edac: rename edac_core.h to edac_mc.h Now, all left at edac_core.h are at drivers/edac/edac_mc.c, so rename it to edac_mc.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	6d8ef24724	edac: move EDAC device definitions to drivers/edac/edac_device.h The edac_core.h header contain data structures and function definitions for both EDAC MC and EDAC device. Let's move the devices ones to a separate header file, as part of a header reorganization. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:51 -02:00
Mauro Carvalho Chehab	0b892c7177	edac: move EDAC PCI definitions to drivers/edac/edac_pci.h The edac_core.h header contain data structures and function definitions for the 3 parts of EDAC: MC, PCI and device. Let's move the PCI ones to a separate header file, as part of a header reorganization. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:50 -02:00
Mauro Carvalho Chehab	7230617537	edac: edac_core.h: remove prototype for edac_pci_reset_delay_period() This function doesn't exist. So, remove its prototype. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:49 -02:00
Mauro Carvalho Chehab	a2c223b5ed	edac: edac_core.h: get rid of unused kobj_complete This element of struct edac_pci_ctl_info is never used. So, get rid of it. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2016-12-15 08:54:49 -02:00
Pan Bian	0de2788447	EDAC, amd64: Fix improper return value When the call to zalloc_cpumask_var() fails, returning "false" seems improper. The real value of macro "false" is 0, and 0 means no error. Return -ENOMEM instead. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189071 Signed-off-by: Pan Bian <bianpan2016@163.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1480831638-5361-1-git-send-email-bianpan201604@163.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-12-04 10:51:42 +01:00
Borislav Petkov	5246c54007	EDAC, amd64: Improve amd64-specific printing macros Prefix the warn and error macros with the respective string so that callers don't have to say "Error" or "Warning". We save us string length this way in the actual calls. While at it, shorten the calls in reserve_mc_sibling_devs(). Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>	2016-12-01 11:35:07 +01:00
Yazen Ghannam	95d3af6bd1	EDAC, amd64: Autoload amd64_edac_mod on Fam17h systems Add Fam17h to the list of families to autoload amd64_edac_mod. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-18-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:49 +01:00
Yazen Ghannam	713ad54675	EDAC, amd64: Define and register UMC error decode function How we need to decode UMC errors is different from how we decode bus errors, so let's define a new function for this. We also need a way to determine the UMC channel since we're not guaranteed that there is a fixed relation between channel and MCA bank. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480359593-80369-1-git-send-email-Yazen.Ghannam@amd.com [ Fold in decode_synd_reg(), simplify. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:48 +01:00
Yazen Ghannam	d27f3a348e	EDAC, amd64: Determine EDAC capabilities on Fam17h systems We need to determine the EDAC capabilities from all UMCs on the node. We should only check UMCs that are enabled and make sure they all agree. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-15-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:05:47 +01:00
Yazen Ghannam	2d09d8f301	EDAC, amd64: Determine EDAC MC capabilities on Fam17h The UMCs on Fam17h are independent memory controllers so we need to read the capabilities from all UMCs and make sure they agree. Once we determine what capabilities are available we should save them for convenience. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480431116-94683-1-git-send-email-Yazen.Ghannam@amd.com [ Simplify f17h_determine_edac_ctl_cap(), preinit edac_mode in init_csrows(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 18:04:54 +01:00
Yazen Ghannam	07ed82ef93	EDAC, amd64: Add Fam17h debug output Read a few more UMC registers and provide debug output in order to be as similar as possible to older AMD systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480344621-14966-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded K8 check and comments, fixup others. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-29 17:16:09 +01:00
Yazen Ghannam	8051c0af3c	EDAC, amd64: Add Fam17h scrubber support Fam17h has new register offsets and fields for setting up the DRAM scrubber so add support for this. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-17-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:12 +01:00
Yazen Ghannam	a6c14dce85	EDAC, mce_amd: Don't report poison bit on Fam15h, bank 4 MCA_STATUS[43] has been defined as "Poison" or "Reserved" for every bank since Fam15h except for Fam15h, bank 4 in which case it's defined as part of the McaStatSubCache bitfield. Filter out that case. Reported-by: Dean Liberty <Dean.Liberty@amd.com> Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479478222-19896-1-git-send-email-Yazen.Ghannam@amd.com [ Split an almost unparseable ternary conditional, add a comment. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:12 +01:00
Yazen Ghannam	b64ce7cd7f	EDAC, amd64: Read MC registers on AMD Fam17h Fam17h has a different set of registers and bitfields. Most of these registers are read through SMN (System Management Network) rather than PCI config space. Also, the derivation of various values is now different. Update amd64_edac to read the appropriate registers and extract the correct values for Fam17h. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-12-git-send-email-Yazen.Ghannam@amd.com [ Save us the indentation level in read_mc_regs(), add defines ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:50:11 +01:00
Yazen Ghannam	936fc3afaa	EDAC, amd64: Reserve correct PCI devices on AMD Fam17h Fam17h needs PCI device functions 0 and 6 instead of 1 and 2 as on older systems. Update struct amd64_pvt to hold the new functions and reserve them if on Fam17h. Also, allocate an array of UMC structs within our newly allocated PVT struct. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-11-git-send-email-Yazen.Ghannam@amd.com [ init_one_instance() error handling, shorten lines, unbreak >80 cols lines. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-28 17:49:40 +01:00
Yazen Ghannam	f1cbbec9fc	EDAC, amd64: Add AMD Fam17h family type and ops Add a family type and associated ops for Fam17h. Define a struct to hold all the UMC registers that we need. Make this a part of struct amd64_pvt in order to maximize code reuse in the rest of the driver. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-10-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-24 21:24:09 +01:00
Yazen Ghannam	196b79fcc8	EDAC, amd64: Extend ecc_enabled() to Fam17h Update the ecc_enabled() function to work on Fam17h. This entails reading a different set of registers and using the SMN (System Management Network) rather than PCI devices. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-9-git-send-email-Yazen.Ghannam@amd.com [ Fixup ecc_en assignment and get_umc_base(). ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-24 21:07:03 +01:00
Borislav Petkov	627bc29ed9	Merge tip:ras/core to pick up dependent changes tip:ras/core contains the respective Fam17h x86 RAS bits which amd64_edac is going to use. So merge it into the EDAC branch. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-23 21:13:40 +01:00
Yazen Ghannam	044e7a414b	EDAC, amd64: Don't force-enable ECC checking on newer systems It's not recommended for the OS to try and force-enable ECC checking. This is considered a firmware task since it includes memory training, etc, so don't change ECC settings on Fam17h or newer systems and inform the user. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479850816-1595-1-git-send-email-Yazen.Ghannam@amd.com [ Put the "forcing" message in an else branch. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-23 19:04:11 +01:00
Yazen Ghannam	d12a969ebb	EDAC, amd64: Add Deferred Error type Currently, deferred errors are classified as correctable in EDAC. Add a new error type for deferred errors so that they are correctly reported to the user. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-7-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:57:19 +01:00
Yazen Ghannam	e70984d9eb	EDAC, amd64: Rename __log_bus_error() to be more specific We only use __log_bus_error() to log DRAM ECC errors, so let's change the name to reflect this. We'll also use this function for DRAM ECC errors on Fam17h, but we'll call it from a different function than decode_bus_error(). Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:42:55 +01:00
Yazen Ghannam	e7934b70d7	EDAC, amd64: Change target of pci_name from F2 to F3 AMD Fam17h will not be using PCI function 2 for EDAC, but will continue to use function 3. So let's get the name of F3 instead of F2 to support Fam17h and previous families. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-5-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 10:24:20 +01:00
Yazen Ghannam	5c332202f8	EDAC, mce_amd: Rename nb_bus_decoder to dram_ecc_decoder nb_bus_decoder() is only used for DRAM ECC errors so rename it so that the name is more generic and descriptive. Also, call it for DRAM ECC errors on SMCA systems. [ Boris: rename it to real function name with a verb in it. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-21 09:43:15 +01:00
Yanjiang Jin	27bda205ba	EDAC, mpc85xx: Implement remove method for the platform driver If we execute the below steps without this patch: modprobe mpc85xx_edac [The first insmod, everything is well.] modprobe -r mpc85xx_edac modprobe mpc85xx_edac [insmod again, error happens.] We would get the error messages as below: BUG: recent printk recursion! Oops: Kernel access of bad area, sig: 11 [#48] Modules linked in: mpc85xx_edac edac_core softdog [last unloaded: mpc85xx_edac] CPU: 5 PID: 14773 Comm: modprobe Tainted: G D C 4.8.3-rt2 .vsnprintf .vscnprintf .vprintk_emit .printk .edac_pci_add_device .mpc85xx_pci_err_probe .platform_drv_probe .driver_probe_device .__driver_attach .bus_for_each_dev .driver_attach .bus_add_driver .driver_register .__platform_register_drivers .mpc85xx_mc_init .do_one_initcall .do_init_module .load_module .SyS_finit_module system_call Address this by cleaning up properly when removing the platform driver. Tested on a T4240QDS board. Signed-off-by: Yanjiang Jin <yanjiang.jin@windriver.com> Acked-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: york.sun@nxp.com Link: http://lkml.kernel.org/r/1479351380-17109-2-git-send-email-yanjiang.jin@windriver.com [ Boris: massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-17 11:19:50 +01:00
Colin Ian King	8176170e03	EDAC, xgene: Fix spelling mistake in error messages Trivial fix to spelling mistake "Mutilple" to "Multiple" in error messages. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Loc Ho <lho@apm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161114231104.5585-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-15 11:10:02 +01:00
Borislav Petkov	c73e8833be	EDAC, mc: Fix locking around mc_devices list When accessing the mc_devices list of memory controller descriptors, we need to hold mem_ctls_mutex. This was not always the case, fix that. Make all external callers call a version which grabs the mutex since the last is local to edac_mc.c. Reported-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-11-14 13:26:11 +01:00
Borislav Petkov	c09a8c40e0	x86/RAS: Hide SMCA bank names Add accessor functions and hide the smca_names array. Also, add a sanity-check to bank HWID assignment in get_smca_bank_info(). Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20161104152317.5r276t35df53qk76@pd.tnic Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:15 +01:00
Borislav Petkov	a9a1c0ee04	x86/RAS: Rename smca_bank_names to smca_names Make it differ more from struct smca_bank_name for better readability. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:14 +01:00
Borislav Petkov	1ce9cd7f9f	x86/RAS: Simplify SMCA HWID descriptor struct Call it simply smca_hwid and call local variables "hwid". More readable. Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: http://lkml.kernel.org/r/20161103125556.15482-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-08 17:10:14 +01:00
Thor Thayer	90e493d7d5	EDAC, altera: Disable IRQs while injecting SDRAM errors Disable IRQs while injecting SDRAM errors. The RT patches exposed a spinlock deadlock where the spinlock taken for the regmap write deadlocked with the IRQ clear regmap write. Error injection is not normally enabled for ECC but only for testing. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1476906827-9412-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-22 20:14:03 +02:00
Wei Yongjun	240ea9214a	EDAC, skx_edac: Fix non static symbol warnings Fix the following sparse warnings: drivers/edac/skx_edac.c:266:25: warning: symbol 'skx_cpuids' was not declared. Should it be static? drivers/edac/skx_edac.c:1040:12: warning: symbol 'skx_init' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1477147098-2842-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-22 20:10:38 +02:00
Piotr Luc	9a9260ca92	EDAC, sb_edac: Add Knights Mill support Add Knights Mill (KNM) to the list of CPU models supported by sb_edac. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20161013153105.2517-6-piotr.luc@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-19 12:37:44 +02:00
Dave Hansen	20f4d69243	EDAC, {sb,skx}_edac: Use Intel model macros instead of open-coding them We now have symbolic names for a bunch of Intel CPU models via asm/intel-family.h. The original conversion missed the EDAC drivers. Convert them. Signed-off-by: Dave Hansen <dave.hansen@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160929204321.9FAE5F84@viggo.jf.intel.com [ Remove comment, macro name is descriptive enough. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-10-19 12:32:40 +02:00
Linus Torvalds	19fe416532	* Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO buffers (Thor Thayer) * Split the memory controller part out of mpc85xx and share it with a * new Freescale ARM Layerscape driver (York Sun) * amd64_edac fixes (Yazen Ghannam) * Misc cleanups, refactoring and fixes all over the place -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJX80EwAAoJEBLB8Bhh3lVKWwYP/1bbODQ7o+XhO8IaCDffYk30 8y4WdSnI0/QcP8JbSvFA7y6Zn4L0BbrbYhKLRDAg9c34V2bMaqonCnkDtT6YatUb 6l0H/hQ/Cah9AOm5PJLYg6O9s+ZBT8zA5b+F2Z9kUsuB6LSnVhp9skNrH6KPlm0U 4pFaLnHQenQPZbuRCfRxPU49ZuKtBZtQDkJLJlHXwn7e1qZy2Q4tMnnEtsY6U2ea t3Hj+F8g+cdoiTQXOceCcOTR8GqDI6szgzn7vpXAGYvljBndszauAkxO7by79jg1 I8AQfgwoBF5CYL2Q0pzT1maHmmG2sydeRAHIvhmGxiEfFz1abWhriXbS33c32q8a iFiVMAUIaSKpB/sB+5w5ymuBctI1mX5EQVW+8Xl2Gxt+olnhdJMocHnvQdYkfsYm Ka8LcbaiK6ZQTbs/cIMOc2paE0AFPu5uXKHCPeZlhQAxOBvSPuDAv0+qUB/of5Uq 1SPidtsTmCI7X2hrdHAH9hLEkSjq68v3kqL5YnZL3H4gA3WohQEmX9ybjk097Kus WWEhdi/PSFX0qQKotMUUDuxfNcKI6PZH9p+i2dN6tNCkiTDdb0Eo5lCXN7RVVhvq qfE0Fcc4uDzh5MUS5jT58MWpA1cfdu9jbAf2BwFIU/poJcaeqy/SMyzCL+1D2/u6 dmDAtQbKUUwiltB8QzQd =pcI8 -----END PGP SIGNATURE----- Merge tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "A lot of movement in the EDAC tree this time around, coarse summary below: - Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO buffers (Thor Thayer) - split the memory controller part out of mpc85xx and share it with a new Freescale ARM Layerscape driver (York Sun) - amd64_edac fixes (Yazen Ghannam) - misc cleanups, refactoring and fixes all over the place" * tag 'edac_for_4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (37 commits) EDAC, altera: Add IRQ Flags to disable IRQ while handling EDAC, altera: Correct EDAC IRQ error message EDAC, amd64: Autoload module using x86_cpu_id EDAC, sb_edac: Remove NULL pointer check on array pci_tad EDAC: Remove NO_IRQ from powerpc-only drivers EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe() EDAC, fsl_ddr: Add entry to MAINTAINERS EDAC: Move Doug Thompson to CREDITS EDAC, I3000: Orphan driver EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul() EDAC, layerscape: Add Layerscape EDAC support EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed EDAC, fsl_ddr: Add support for little endian EDAC, fsl_ddr: Add missing DDR DRAM types EDAC, fsl_ddr: Rename macros and names EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx EDAC, mpc85xx: Replace printk() with pr_* format EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1 EDAC, altera: Rename MC trigger to common name EDAC, altera: Rename device trigger to common name ...	2016-10-04 12:06:26 -07:00
Thor Thayer	a29d64a45e	EDAC, altera: Add IRQ Flags to disable IRQ while handling Add the IRQF_ONESHOT and IRQF_TRIGGER_HIGH flags to disable the IRQ while executing the IRQ handler. Remove the IRQF_SHARED because these are not shared IRQs in the domain. Exposed when flooding IRQs. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1474582419-7053-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-23 12:03:34 +02:00
Thor Thayer	3763569f4c	EDAC, altera: Correct EDAC IRQ error message Correct the error message sent out in the case of a single bit error IRQ allocation. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1474582419-7053-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-23 11:52:39 +02:00
Yazen Ghannam	d6efab74f6	EDAC, amd64: Autoload module using x86_cpu_id Reinstate driver autoloading now that PCI dependency is gone. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1473984445-1726-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-21 12:48:15 +02:00
Yazen Ghannam	a884675b87	x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly Bank 4 is reserved on family 0x17 and shouldn't generate any MCE records. However, broken hardware and software is not something unheard of so warn about bank 4 errors. They shouldn't be coming from bank 4 naturally but users can still use mce_amd_inj to simulate errors from it for testing purposed. Also, avoid special handling in the injector mce_amd_inj like it is being done on the older families. [ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:14 +02:00
Yazen Ghannam	4b711f92c9	x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems The MCA_SYND and MCA_IPID registers contain valuable information and should be included in MCE output. The MCA_SYND register contains syndrome and other error information, and the MCA_IPID register will uniquely identify the MCA bank's type without having to rely on system software. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:13 +02:00
Yazen Ghannam	5896820e0a	x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types Scalable MCA defines a number of IP types. An MCA bank on an SMCA system is defined as one of these IP types. A bank's type is uniquely identified by the combination of the HWID and MCATYPE values read from its MCA_IPID register. Add the required tables in order to be able to lookup error descriptions based on a bank's type and the error's extended error code. [ bp: Align comments, simplify a bit. ] Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:10 +02:00
Yazen Ghannam	856095b179	EDAC/mce_amd: Use SMCA prefix for error descriptions arrays The error descriptions defined for Fam17h can be reused for other SMCA systems, so their names should reflect this. Change f17h prefix to smca for error descriptions. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-4-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:09 +02:00
Yazen Ghannam	c019b951e1	EDAC/mce_amd: Add missing SMCA error descriptions Add missing SMCA error descriptions to the error descriptions arrays. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1472673994-12235-3-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:09 +02:00
Yazen Ghannam	b300e87300	EDAC/mce_amd: Print syndrome register value on SMCA systems Print SyndV bit status and print the raw value of the MCA_SYND register. Further decoding of the syndrome from struct mce.synd can be done in other places where appropriate, e.g. DRAM ECC. Boris: make the error stanza more compact by putting the error address and syndrome on the same line: [Hardware Error]: Corrected error, no action required. [Hardware Error]: CPU:2 (17:0:0) MC4_STATUS[-\|CE\|-\|PCC\|AddrV\|-\|-\|SyndV\|CECC]: 0x96204100001e0117 [Hardware Error]: Error Addr: 0x000000007f4c52e3, Syndrome: 0x0000000000000000 [Hardware Error]: Invalid IP block specified. [Hardware Error]: cache level: L3/GEN, tx: DATA, mem-tx: RD Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1467633035-32080-2-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-13 15:23:07 +02:00
Colin Ian King	c7c35407cd	EDAC, sb_edac: Remove NULL pointer check on array pci_tad pvt->pci_tad is a NUM_CHANNELS array of struct pci_dev pointers and hence cannot be NULL, so the NULL pointer check on pci_tad is redundant. Remove it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160908083801.14766-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-12 20:15:43 +02:00
Michael Ellerman	372095723a	EDAC: Remove NO_IRQ from powerpc-only drivers We'd like to eventually remove NO_IRQ on powerpc, so remove usages of it from powerpc-only drivers. The pdata structs are kzalloc'ed, so we don't need to initialise those to 0, we can just drop the assignments entirely. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/1473674436-19467-1-git-send-email-mpe@ellerman.id.au Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-12 13:04:56 +02:00
Wei Yongjun	43fa9ba632	EDAC, fsl_ddr: Fix error return code in fsl_mc_err_probe() Return negative error code from the edac_mc_add_mc() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1473350284-26482-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-09 19:27:22 +02:00
York Sun	f47ae798d8	EDAC, fsl_ddr: Replace simple_strtoul() with kstrtoul() Replace obsolete simple_strtoul() with kstrtoul(). Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471990593-27536-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:03 +02:00
York Sun	eeb3d68b6c	EDAC, layerscape: Add Layerscape EDAC support Add DDR EDAC driver for ARM-based compatible controllers. Both big-endian and little-endian are supported, as specified in device tree. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471990465-27443-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:03 +02:00
York Sun	55764ed37e	EDAC, fsl_ddr: Fix IRQ dispose warning when module is removed When compiled as a module, removing it causes kernel warnings when irq_dispose_mapping() is called. Instead of calling irq_of_parse_and_map(), use platform_get_irq() to acquire the IRQ number. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-8-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:02 +02:00
York Sun	339fdff14c	EDAC, fsl_ddr: Add support for little endian Get endianness from device tree. Both big endian and little endian are supported. Default to big endian for backwards compatibility to MPC85xx. Signed-off-by: York Sun <york.sun@nxp.com> Acked-by: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-7-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:02 +02:00
York Sun	4e2c3252d2	EDAC, fsl_ddr: Add missing DDR DRAM types The compatible DDR controllers may support DDR, DDR2, DDR3, DDR4 DRAM. An individual controller doesn't support all of them. The EDAC driver reads SDRAM_CFG to determine which mode is configured. Add DDR4 and drop the defines used only in the mtype assignment. Signed-off-by: York Sun <york.sun@nxp.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: morbidrsa@gmail.com Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-6-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:01 +02:00
York Sun	d43a9fb202	EDAC, fsl_ddr: Rename macros and names Use FSL-specific prefix for macros, variables and functions. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-5-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:01 +02:00
York Sun	ea2eb9a8b6	EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx The mpc85xx-compatible DDR controllers are used on ARM-based SoCs too. Carve out the DDR part from the mpc85xx EDAC driver in preparation to support both architectures. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470946525-3410-1-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:28:00 +02:00
York Sun	88857ebe71	EDAC, mpc85xx: Replace printk() with pr_* format Replace printk() with pr_err/pr_warn/pr_info macros. Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-3-git-send-email-york.sun@nxp.com [ Boris: unbreak strings for easier greppability. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:27:59 +02:00
York Sun	9e6a03a044	EDAC, mpc85xx: Drop setting/clearing RFXE bit in HID1 On e500v1, read fault exception enable (RFXE) controls whether assertion of core_fault_in causes a machine check interrupt. Assertion of core_fault_in can result from uncorrectable data error, such as an L2 multi-bit ECC error. It can also occur from a system error if logic on the integrated device signals a fault for nonfatal errors. RFXE bit is cleared out of reset, and should be left clear for normal operation. Assertion of core_fault_in does not cause a machine check. RFXE is set specifically for RIO (Rapid IO) and PCI for book E to catch the errors by machine check. With this bit set, the EDAC driver can't get the interrupt in case of uncorrectable error. So this bit is cleared in favor of EDAC. However, the benefit of catching such uncorrectable error doesn't outweigh the other errors which may hang the system. Besides, e500v2 has different errors masked by RFXE, and e500mc doesn't support this bit. It is more reasonable to leave RFXE as is in the EDAC driver, and leave the uncorrectable errors triggering machine check for e500v1. Suggested-by: Scott Wood <oss@buserror.net> Signed-off-by: York Sun <york.sun@nxp.com> Cc: Johannes Thumshirn <morbidrsa@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: oss@buserror.net Cc: stuart.yoder@nxp.com Link: http://lkml.kernel.org/r/1470779760-16483-2-git-send-email-york.sun@nxp.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 10:27:59 +02:00
Thor Thayer	b8978badc4	EDAC, altera: Rename MC trigger to common name Rename the Memory Controller debug trigger to the same common name as the EDAC devices. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471622666-15197-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 09:06:42 +02:00
Thor Thayer	f399f34bdb	EDAC, altera: Rename device trigger to common name The L2 and OCRAM devices have different ecc trigger names than the other EDAC devices (FIFO peripherals). Make them all the same and remove the character array from the device structure. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1471622666-15197-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-09-01 08:55:24 +02:00
Tony Luck	4ec656bdf4	EDAC, skx_edac: Add EDAC driver for Skylake This is an entirely new driver instead of yet another set of patches to sb_edac.c because: 1) Mapping from PCI devices to socket/memory controller is significantly different. Skylake scatters devices on a socket across a number of PCI buses. 2) There is an extra level of interleaving via the "mcroute" register that would be a little messy to squeeze into the old driver. 3) Validation is getting too expensive. Changes to sb_edac need to be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and Knights Landing. Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-21 10:58:34 -07:00
Tillmann Heidsieck	6fa06b0d9e	EDAC, mpc85xx: Fix PCIe error capture According to the reference manual of MPC8572 and T4240, bit 31 of PEX_ERR_CAP_STAT is W1C (write 1 to clear). Add the corresponding write to PEX_ERR_CAP_STAT in order to fix the PCIe error capture. Tested on a T4240 processor. Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de> Acked-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160815190849.29327-1-theidsieck@leenox.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-18 10:17:40 +02:00
Bhaktipriya Shridhar	7bb8b77779	EDAC, wq: Remove deprecated create_singlethread_workqueue() Replace the deprecated create_singlethread_workqueue() with alloc_ordered_workqueue() with WQ_MEM_RECLAIM. This is the identity conversion. It's not recommended to stall it from memory pressure. Hence, WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160813164124.GA9077@Karyakshetra Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-15 07:21:29 +02:00
Wei Yongjun	9bcd919eb8	EDAC, altera: Make a10_eccmgr_ic_ops static Fix the following sparse warning: drivers/edac/altera_edac.c:1649:23: warning: symbol 'a10_eccmgr_ic_ops' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Reviewed-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lkml <linux-kernel@vger.kernel.org> Link: http://lkml.kernel.org/r/1470836667-11822-1-git-send-email-weiyj.lk@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-10 18:37:06 +02:00
Thor Thayer	911049845d	EDAC, altera: Add Arria10 SD-MMC EDAC support Add Altera Arria10 SD-MMC FIFO memory EDAC support. The SD-MMC is a dual port RAM implementation which is different than any of the other peripherals and therefore requires additional code. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1470753653-23465-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-10 14:43:14 +02:00
Yazen Ghannam	dc0a50a841	EDAC, amd64: Fix channel decode on Fam15hMod60h systems Fam15hMod60h systems are using the channel decode of Fam15hMod30h which gives incorrect results. Fam15hMod60h systems should use the generic channel decode method plus a couple more cases. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1470236355-30039-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:42 +02:00
Thor Thayer	485fe9e24e	EDAC, altera: Add Arria10 QSPI support Add Altera Arria10 QSPI FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-9-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:39 +02:00
Thor Thayer	c609581d1f	EDAC, altera: Add Arria10 USB support Add Altera Arria10 USB FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-8-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:37 +02:00
Thor Thayer	e8263793b7	EDAC, altera: Add Arria10 DMA support Add Altera Arria10 DMA FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-7-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:36 +02:00
Thor Thayer	c6882fb2e8	EDAC, altera: Add Arria10 NAND support Add Altera Arria10 NAND FIFO memory support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1468512408-5156-6-git-send-email-tthayer@opensource.altera.com [ Reformat loop in altr_edac_a10_probe() for better readability. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:59:35 +02:00
Lukasz Odzioba	c5b48fa7e2	EDAC, sb_edac: Fix channel reporting on Knights Landing On Intel Xeon Phi Knights Landing processor family the channels of the memory controller have untypical arrangement - MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the channel name incorrectly. We missed this change earlier, so the code already contains similar comment, but the translation function is incorrect. Without this patch: errors in DIMM_A and DIMM_D were reported in DIMM_D errors in DIMM_B and DIMM_E were reported in DIMM_E errors in DIMM_C and DIMM_F were reported in DIMM_F Correct this. Hubert Chrzaniuk: - rebased to 4.8 - comments and code cleanup Fixes: `d0cdf90031` ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support") Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Cc: lukasz.odzioba@intel.com Cc: mchehab@kernel.org Cc: <stable@vger.kernel.org> # v4.5.. Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com> [ Boris: Simplify a bit by removing char mc. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-08-08 05:52:08 +02:00
Linus Torvalds	c79a14defb	* Altera Arria10 ethernet FIFO buffer support (Thor Thayer) * Minor cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXlbvGAAoJEBLB8Bhh3lVK9g0P/16Z5/WIqrrLjJegcZ3qKb4m u2b5FBwbTDQyj/smZmfZfhZUleO7HDayL188fHPKO11my+Mvjyws8IZCvHshy7+3 b6Krno8NXZuNYtm32srSQQoLE4eN5pookOPMe2KJcRFghwnDqOo0ZfMZ8Lt6RVGg YtqiVZ3DwaqoHUbvxkHvdJuMbtVEZIEC6SzK9t4URmdncAx+3/BTLqDSYc4LaL+B tI6REgMy+QT1IUMImuUbOA5ktd78K+D26YKF9NuTSWNQvwzjSYy4y6WmDsjYpZF6 h6YWgGwERIy6gws5IdvkBMKX0s1RV7/sY6xFBsEQrX+IFywyY5OcI1gPRttD6VAn ORJ4j7WNH8Phqld9YxjEX8yTwGxlxge88J/lb0i9s5jXPMX7ioxAqAjdpsJj6PlA sNDP0fTyCECqME73cWaAS9bdrzB2dgOxEDfcWKWR24vZeHfnh72owD1k+J0iWu9c 0R2usPB1Yjbo6ODGpqG1R+u09LR+RpNe38oT5GcsFrj6/EaGz0kn2QulaNtylsDo J5qkm02X1h3+F8/UHrzKavOGk82IQzLCbcW7eHPmyzkH8PTe1semLk9jHnXN5sK3 uL2nirZQVIF6QLbYxf0Uf2oSVs05xc7u9ZT+QOl6hqh/6/K7DWC45SoABwMXDaHH jXr+y4vtr5IR8GQfZ/nd =mdxx -----END PGP SIGNATURE----- Merge tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "This last cycle, Thor was busy adding Arria10 eth FIFO support to the altera_edac driver along with other improvements. We have two cleanups/fixes too. Summary: - Altera Arria10 ethernet FIFO buffer support (Thor Thayer) - Minor cleanups" * tag 'edac_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: ARM: dts: Add Arria10 Ethernet EDAC devicetree entry EDAC, altera: Add Arria10 Ethernet EDAC support EDAC, altera: Add Arria10 ECC memory init functions Documentation: dt: socfpga: Add Arria10 Ethernet binding EDAC, altera: Drop some ifdeffery EDAC, altera: Add panic flag check to A10 IRQ EDAC, altera: Check parent status for Arria10 EDAC block EDAC, altera: Make all private data structures static EDAC: Correct channel count limit EDAC, amd64_edac: Init opstate at the proper time during init EDAC, altera: Handle Arria10 SDRAM child node EDAC, altera: Add ECC Manager IRQ controller support Documentation: dt: socfpga: Add interrupt-controller to ecc-manager	2016-07-27 13:40:47 -07:00
Tony Luck	0ba169ac36	EDAC, sb_edac: Fix Knights Landing In commit `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") I broke Knights Landing because I failed to notice that it called a wrapper macro "sbridge_get_all_devices_knl" instead of "sbridge_get_all_devices" like all the other types. Now that we include the processor type in the pci_id_table structure we can skip the wrappers and just have the sbridge_get_all_devices() check the type to decide whether to allow duplicate devices and controllers to have registers spread across buses. Fixes: `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") Tested-by: Lukasz Odzioba <lukasz.odzioba@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-16 06:11:59 +09:00
Thor Thayer	ab8c1e0fb0	EDAC, altera: Add Arria10 Ethernet EDAC support Add Altera Arria10 Ethernet FIFO memory EDAC support. Update to support a common compatibility string for all Ethernet FIFOs in the DT. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-8-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-25 11:31:34 +02:00
Thor Thayer	1166fde93d	EDAC, altera: Add Arria10 ECC memory init functions In preparation for additional memory module ECCs, add the memory initialization functions and helpers. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-7-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 21:16:38 +02:00
Thor Thayer	6b300fb953	EDAC, altera: Drop some ifdeffery Make the IRQ and check_deps() functions available to all the memory buffers by moving them outside of the OCRAM only area. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-5-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 14:18:33 +02:00
Thor Thayer	2b083d65ff	EDAC, altera: Add panic flag check to A10 IRQ In preparation for additional memory module ECCs, the IRQ function will check a panic flag before doing a kernel panic on double bit errors. OCRAM uncorrectable errors cause a panic because sleep/resume functions and FPGA contents during sleep are stored in OCRAM. ECCs on peripheral FIFO buffers will not cause a kernel panic on DBERRs because the packet can be retried and therefore recovered. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:43 +02:00
Thor Thayer	44ec9b307e	EDAC, altera: Check parent status for Arria10 EDAC block In preparation for the Arria10 ECC modules, check the status of the parent in the device tree to ensure the block is enabled. Skip if no parent phandle is set in the device tree. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:42 +02:00
Thor Thayer	1cf7037724	EDAC, altera: Make all private data structures static The device private data structures are used only here so make them static. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1466603939-7526-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-24 11:58:42 +02:00
Borislav Petkov	bba142957e	EDAC: Correct channel count limit `c44696fff0` ("EDAC: Remove arbitrary limit on number of channels") lifted the arbitrary limit on memory controller channels in EDAC. However, the dynamic channel attributes dynamic_csrow_dimm_attr and dynamic_csrow_ce_count_attr remained 6. This wasn't a problem except channels 6 and 7 weren't visible in sysfs on machines with more than 6 channels after the conversion to static attr groups with `2c1946b6d6` ("EDAC: Use static attribute groups for managing sysfs entries") [ without that, we're exploding in edac_create_sysfs_mci_device() because we're dereferencing out of the bounds of the dynamic_csrow_dimm_attr array. ] Add attributes for channels 6 and 7 along with a guard for the future, should more channels be required and/or to sanity check for misconfigured machines. We still need to check against the number of channels present on the MC first, as Thor reported. Signed-off-by: Borislav Petkov <bp@suse.de> Reported-by: Hironobu Ishii <ishii.hironobu@jp.fujitsu.com> Tested-by: Thor Thayer <tthayer@opensource.altera.com> Cc: <stable@vger.kernel.org> # 4.2	2016-06-16 10:06:35 +02:00
Borislav Petkov	6ba92fea1b	EDAC, amd64_edac: Init opstate at the proper time during init It is useless to do it if we're loaded on unsupported hardware so do that only after we have detected at least 1 supported AMD northbridge. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-16 01:13:18 +02:00
Thor Thayer	ab564cb51e	EDAC, altera: Handle Arria10 SDRAM child node Separate the device match arrays for each platform to prevent CycloneV matches when calling of_platform_populate() on the Arria10 ECC manager node. If the SDRAM is a child node of ECC manager, call probe function via of_platform_populate(). Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1464193783-5071-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-08 13:23:09 +02:00
Thor Thayer	13ab8448d2	EDAC, altera: Add ECC Manager IRQ controller support To better support child devices, the ECC manager needs to be implemented as an IRQ controller. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1465331757-10227-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-08 13:23:01 +02:00
Tony Luck	665f05e0b8	EDAC, sb_edac: Readd accidentally dropped Broadwell-D support In commit `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") we switched from using PCI ids to determine which platform we are running on to using CPU model instead. I forgot that Broadwell-DE has its own distinct model number different from Broadwell-EP or -EX. Fixing this isn't just adding a line to the array of cpuids - the exising code assumed a 1:1 mapping between entries in that array and the "enum type" values. Added the type to pci_id_table structure to remove this dependency and allows two Broadwell cpu models. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `2c1ea4c700` ("EDAC, sb_edac: Use cpu family/model in driver detection") Link: http://lkml.kernel.org/r/b3cffe40dec6dfe0235a5d52a504f0ba86a07ce7.1464902605.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 17:28:21 +02:00
Nicholas Krause	fbedcaf43f	EDAC: Fix workqueues poll period resetting After the workqueue cleanup, we're registering workqueues based on the presence of an ->edac_check function. When that is the case, we're setting OP_RUNNING_POLL. But we forgot to check that in edac_mc_reset_delay_period(), leading to: BUG: unable to handle kernel paging request at 0000000000015d10 IP: [ .. ] queued_spin_lock_slowpath PGD 3ffcc8067 PUD 3ffc56067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: ... CPU: 1 PID: 2792 Comm: edactest Not tainted 4.6.0-dirty #1 Hardware name: HP ProLiant MicroServer, BIOS O41 10/01/2013 Stack: Call Trace: ? _raw_spin_lock_irqsave ? lock_timer_base.isra.34 ? del_timer ? try_to_grab_pending ? mod_delayed_work_on ? edac_mc_reset_delay_period ? edac_set_poll_msec ? param_attr_store ? module_attr_store ? kernfs_fop_write ? __vfs_write ? __vfs_read ? __alloc_fd ? vfs_write ? SyS_write ? entry_SYSCALL_64_fastpath Code: RIP [ .. ] queued_spin_lock_slowpath RSP <> CR2: 0000000000015d10 ---[ end trace 3f286bc71cca15d1 ]--- Kernel panic - not syncing: Fatal exception Fix it. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Cc: <stable@vger.kernel.org> # 4.5 Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1463697958-13406-1-git-send-email-xerofoify@gmail.com [ Rewrite commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 11:14:27 +02:00
Tony Luck	c7103f650a	EDAC, sb_edac: Fix rank lookup on Broadwell Broadwell made a small change to the rank target register moving the target rank ID field up from bits 16:19 to bits 20:23. Also found that the offset field grew by one bit in the IVY_BRIDGE to HASWELL transition, so fix the RIR_OFFSET() macro too. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org # v3.19+ Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-06-03 10:05:49 +02:00
Linus Torvalds	16bf834805	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits) gitignore: fix wording mfd: ab8500-debugfs: fix "between" in printk memstick: trivial fix of spelling mistake on management cpupowerutils: bench: fix "average" treewide: Fix typos in printk IB/mlx4: printk fix pinctrl: sirf/atlas7: fix printk spelling serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/ w1: comment spelling s/minmum/minimum/ Blackfin: comment spelling s/divsor/divisor/ metag: Fix misspellings in comments. ia64: Fix misspellings in comments. hexagon: Fix misspellings in comments. tools/perf: Fix misspellings in comments. cris: Fix misspellings in comments. c6x: Fix misspellings in comments. blackfin: Fix misspelling of 'register' in comment. avr32: Fix misspelling of 'definitions' in comment. treewide: Fix typos in printk Doc: treewide : Fix typos in DocBook/filesystem.xml ...	2016-05-17 17:05:30 -07:00
Linus Torvalds	1cc3880a3c	* Altera Arria10 L2 cache and On-Chip RAM ECC handling. (Thor Thayer) * Remove ad-hoc buffering of MCE records in sb_edac and i7core_edac. (Tony Luck) * Do not register sb_edac with pci_register_driver(). (Tony Luck) * Add support for Skylake to ie31200_edac. (Jason Baron) * Do not register amd64_edac with pci_register_driver(). (Borislav Petkov) + the usual round of cleanups and fixes all over the place. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXOaHaAAoJEBLB8Bhh3lVKRdoP/jDewaX4GAxgcKaK9Kx7VjMe i42E/7syh5iLoyFgZ1hUjvqiQHN4RyWloUYyNHbNxGS9uXCMXIUoJQd2FjOY442s oeEaoMCfZsJLzKQti3fUPK9GugZ57BSZxfn6NlJPpyG4FxSirOcZFCxGaNuyqqrh 0M+gFvbWfZcVDLE0FI5CUYZWRxsk//+pLM4ZkQZFA8Eo1tGVc3r0zEEAlL/uEMDW P0ATvRHNUeib9YQGTdSD7cNUpRX4SX+T8lDUiaVm+tHL5ES3b4rayMBdbVMrahfc Gnxu9GtO5gWXEY6QDDpOx+VkbfFmDFodx833psJ3MVD8evEFHHdinkgDaptLrrV6 92ZDKR5s3W6tKXkcmGuExrtc17UgjLcRZCebXbv+5FJlVrslzvQ9ESOTRyiZBrCD ZpFi2TZhpYU1uEWuoBZCbWNXW2pcSt7/bQ9bYUvfrvNfgPzPnblubJuVKRcfY0WB x2vj0PNnckpoPRvskV7GEe0Y/JISzAxBQUK6XO+GJgMgz5M+23SEaSVU5yeyf26e x/yD5yImQGN84AxfRMMvbR2JvpVLN3vdFWtWigneht160erVA/Qgw5Gcrpw53tZr zPD3fGaetrNgS8BvJAy/c6xHV1j6A1RGCGThy8Ivqep9WD90a5JhR1NRjZp6m8sf aB0A1zTP7e+hRFkZGahO =ud2P -----END PGP SIGNATURE----- Merge tag 'edac_for_4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: "It was pretty busy in EDAC land this time: - Altera Arria10 L2 cache and On-Chip RAM ECC handling (Thor Thayer) - Remove ad-hoc buffering of MCE records in sb_edac and i7core_edac (Tony Luck) - Do not register sb_edac with pci_register_driver() (Tony Luck) - Add support for Skylake to ie31200_edac (Jason Baron) - Do not register amd64_edac with pci_register_driver() (Borislav Petkov) ... plus the usual round of cleanups and fixes all over the place" * tag 'edac_for_4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (25 commits) EDAC, amd64_edac: Drop pci_register_driver() use EDAC, ie31200_edac: Add Skylake support EDAC, sb_edac: Use cpu family/model in driver detection EDAC, i7core: Remove double buffering of error records EDAC, amd64_edac: Issue driver banner only on success ARM: socfpga: Initialize Arria10 OCRAM ECC on startup EDAC: Increment correct counter in edac_inc_ue_error() EDAC, sb_edac: Remove double buffering of error records EDAC: Fix used after kfree() error in edac_unregister_sysfs() EDAC, altera: Avoid unused function warnings EDAC, altera: Remove useless casts ARM: socfpga: Enable Arria10 OCRAM ECC on startup EDAC, altera: Add Arria10 OCRAM ECC support Documentation: dt: socfpga: Add Altera Arria10 OCRAM binding EDAC, altera: Make OCRAM ECC dependency check generic EDAC, altera: Add register offset for ECC Enable EDAC, altera: Extract error inject operations to a struct fops ARM: socfpga: Enable Arria10 L2 cache ECC on startup EDAC, altera: Add Arria10 L2 Cache ECC handling Documentation, dt, socfpga: Add Altera Arria10 L2 cache binding ...	2016-05-16 18:44:39 -07:00
Yazen Ghannam	a348ed83d9	EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA Use X86_FEATURE_SMCA when detecting if SMCA is available instead of directly using CPUID 0x80000007_EBX. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462971509-3856-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-05-12 09:08:23 +02:00
Borislav Petkov	3f37a36b62	EDAC, amd64_edac: Drop pci_register_driver() use - remove homegrown instances counting. - take F3 PCI device from amd_nb caching instead of F2 which was used with the PCI core. With those changes, the driver doesn't need to register a PCI driver and relies on the northbridges caching which we do anyway on AMD. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com>	2016-05-09 20:41:16 +02:00
Jason Baron	953dee9bbd	EDAC, ie31200_edac: Add Skylake support Skylake adjusts some register locations, but otherwise follows the existing model quite closely. I was able to verify that the 'ce_count' increments when 'bad dimms' are used. The accounting of 'ce_count' and 'ue_count' is the primary functionality of interest for us. Tested on Intel(R) Xeon(R) CPU E3-1260L v5 @ 2.90GHz. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1462547927-22679-1-git-send-email-jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-05-06 18:50:14 +02:00
Tony Luck	2c1ea4c700	EDAC, sb_edac: Use cpu family/model in driver detection Instead of picking a random PCI ID from the dozen or so we need to access, just use x86_match_cpu() to pick based on CPU model number. The choosing of PCI devices has been problematic in the past, see `11249e7399` ("sb_edac: Fix detection on SNB machines") which fixed problems introduced by `d0585cd815` ("sb_edac: Claim a different PCI device"). This is especially ugly if future hardware might not even have EDAC-relevant registers in PCI config space and we would still be required to choose some "random" PCI devices to scan for just so our driver loads. Is this cleaner/clearer? It deletes much more code than it adds. Only tested on Broadwell. The driver loads/unloads and loads again. Still decodes errors too. Signed-off-by: Tony Luck <tony.luck@intel.com> Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-05-02 19:44:43 +02:00
Tony Luck	5359534505	EDAC, i7core: Remove double buffering of error records In the bad old days the functions from x86_mce_decoder_chain could be called in machine check context. So we used to carefully copy them and defer processing until later. But in `f29a7aff4b` ("x86/mce: Avoid potential deadlock due to printk() in MCE context") we switched the logging code to save the record in a genpool, and call the functions that registered to be notified later from a work queue. So drop all the double buffering and do all the work we want to do as soon as i7core_mce_check_error() is called. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/29ab2c370915c6e132fc5d88e7b72cb834bedbfe.1461855008.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-29 16:41:24 +02:00
Tony Luck	c4fc1956fa	EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback Both of these drivers can return NOTIFY_BAD, but this terminates processing other callbacks that were registered later on the chain. Since the driver did nothing to log the error it seems wrong to prevent other interested parties from seeing it. E.g. neither of them had even bothered to check the type of the error to see if it was a memory error before the return NOTIFY_BAD. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-29 15:43:10 +02:00
Borislav Petkov	de0336b30d	EDAC, amd64_edac: Issue driver banner only on success ... and don't mislead users into thinking that the driver has loaded successfully. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-27 12:30:26 +02:00
Emmanouil Maroudas	993f88f1cc	EDAC: Increment correct counter in edac_inc_ue_error() Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of ce_noinfo_count. Signed-off-by: Emmanouil Maroudas <emmanouil.maroudas@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `4275be6355` ("edac: Change internal representation to work with layers") Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 18:10:09 +02:00
Tony Luck	ad08c4e974	EDAC, sb_edac: Remove double buffering of error records In the bad old days the functions from x86_mce_decoder_chain could be called in machine check context. So we used to carefully copy them and defer processing until later. But in `f29a7aff4b` ("x86/mce: Avoid potential deadlock due to printk() in MCE context") we switched the logging code to save the record in a genpool, and call the functions that registered to be notified later from a work queue. So drop all the double buffering and do all the work we want to do as soon as sbridge_mce_check_error() is called. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: patrickg@supermicro.com Link: http://lkml.kernel.org/r/100025611cd780d9bca72792b2b2146760da53e0.1460756761.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 14:02:02 +02:00
Tony Luck	ab67b6c22d	EDAC: Fix used after kfree() error in edac_unregister_sysfs() Code flow looks like this: device_unregister(&mci->dev); -> kobject_put+0x25/0x50 -> kobject_cleanup+0x77/0x190 -> device_release+0x32/0xa0 -> mci_attr_release+0x36/0x70 -> kfree(mci); bus_unregister(mci->bus); Fix is to grab a local copy of "mci->bus" and use that when we call bus_unregister(). Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/21d595b0ab3d718d9cb206647f4ec91c05e62ec4.1461261078.git.tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 13:23:54 +02:00
Arnd Bergmann	1aa6eb5c5b	EDAC, altera: Avoid unused function warnings The recently added Arria10 OCRAM ECC support caused some new harmless warnings about unused functions when it is disabled: drivers/edac/altera_edac.c:1067:20: error: 'altr_edac_a10_ecc_irq' defined but not used [-Werror=unused-function] drivers/edac/altera_edac.c:658:12: error: 'altr_check_ecc_deps' defined but not used [-Werror=unused-function] This rearranges the code slightly to have those two functions inside of the same #ifdef that hides their callers. It also manages to avoid a forward declaration of the IRQ handler in the process. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Alan Tull <atull@opensource.altera.com> Cc: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Fixes: `c7b4be8db8` ("EDAC, altera: Add Arria10 OCRAM ECC support") Link: http://lkml.kernel.org/r/1460837650-1237650-2-git-send-email-arnd@arndb.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 12:00:12 +02:00
Arnd Bergmann	2c911f6cac	EDAC, altera: Remove useless casts The altera EDAC driver refers to its per-device data using a cast to '(void *)', which makes the pointer non-const, though both the source and destination are actually const. Removing the annotation makes the reference (almost) fit into a single line for improved readability, and ensures that it is actually defined as const. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Alan Tull <atull@opensource.altera.com> Cc: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1460837650-1237650-1-git-send-email-arnd@arndb.de Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-23 11:54:42 +02:00
Tony Luck	ea5dfb5fae	x86 EDAC, sb_edac.c: Take account of channel hashing when needed Haswell and Broadwell can be configured to hash the channel interleave function using bits [27:12] of the physical address. On those processor models we must check to see if hashing is enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and act accordingly. Based on a patch by patrickg <patrickg@supermicro.com> Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00
Tony Luck	ff15e95c82	x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address In commit: `eb1af3b71f` ("Fix computation of channel address") I switched the "sck_way" variable from holding the log2 value read from the h/w to instead be the actual number. Unfortunately it is needed in log2 form when used to shift the address. Tested-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `eb1af3b71f` ("Fix computation of channel address") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-22 10:10:01 +02:00
Masanari Iida	c19ca6cb4c	treewide: Fix typos in printk This patch fix spelling typos found in printk within various part of the kernel sources. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-04-18 11:23:24 +02:00
Thor Thayer	c7b4be8db8	EDAC, altera: Add Arria10 OCRAM ECC support Add Arria10 On-Chip RAM ECC handling. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1459992174-8015-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-07 12:42:56 +02:00
Thor Thayer	aa1f06dcc0	EDAC, altera: Make OCRAM ECC dependency check generic In preparation for the Arria10 peripheral ECCs, move the OCRAM ECC dependency check into the general ECC area since this same function can be used by other memories. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1459450087-24792-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-02 13:47:15 +02:00
Thor Thayer	943ad91798	EDAC, altera: Add register offset for ECC Enable In preparation for the Arria10 peripheral ECCs, add a register offset from the ECC base to index to the ECC enable register. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1459450087-24792-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-02 13:38:57 +02:00
Thor Thayer	e17ced2cb6	EDAC, altera: Extract error inject operations to a struct fops In preparation for the Arria10 peripheral ECCs, extract the inject file operations because the Arria10 IRQ trigger mechanism is different than Cyclone5/Arria5 and Arria10 L2 cache. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1459450087-24792-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-04-02 13:27:06 +02:00
Thor Thayer	588cb03ea2	EDAC, altera: Add Arria10 L2 Cache ECC handling Add a private data structure for Arria10 L2 cache ECC and the probe function for it. The Arria10 ECC device IRQs are in a shared register so the ECC Manager parent/child relationship requires a different probe function. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-8-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:34:06 +02:00
Thor Thayer	811fce4f2a	EDAC, altera: Add register offset for ECC Error Inject In preparation for the Arria10 peripheral ECCs, add a register offset from the ECC base to the private data structure to index to the error injection register. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-6-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:20:48 +02:00
Thor Thayer	27439a1a63	EDAC, altera: Abstract ECC Enable Mask in check_deps() In preparation for the Arria10 peripheral ECCs, use the ECC Enable mask in place of hard coded masks in the check dependency functions. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-5-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:17:32 +02:00
Thor Thayer	328ca7ae81	EDAC, altera: Remove platform device from check_deps() In preparation for the Arria10 peripheral ECCs, remove the platform device parameter from the check_deps() functions because it is not needed and makes the Arria10 check_deps() cleaner. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:14:21 +02:00
Thor Thayer	05b088b6f8	EDAC, altera: Move device structs and defines to the header Move the device structs and defines to altera_edac.h in preparation for adding the Arria10 L2 cache ECC. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:11:16 +02:00
Thor Thayer	3a8f21f170	EDAC, altera: Make L2C depend on L2x0 cache controller Make L2 cache depend instead of forcibly select the L2 cache support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1458576106-24505-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-29 10:06:11 +02:00
Linus Torvalds	047486d8e7	EDAC queue for 4.6 * Altera: L2 cache and On-Chip RAM support (Thor Thayer). * EDAC: Workqueue handling cleanups (Borislav Petkov). * Xgene: Register bus error handling (Loc Ho). * Misc small fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW5sS0AAoJEBLB8Bhh3lVKxPEP/j8pleG17s1HI9wfhukzuT9z epjyaYjEv1BR6+3drmFjbAYGgk7hhL8khVePEhouS/P5WQSzRHMdgimL5NMpnwSZ 6XXyR2szhz86eYMC1lQxdRnESZarbnVtqyiRG0Mv1hFbObyzM0ewiHt2lTtCa1Uj DeprrSE6QCQLwq0WSIF8MJ9LBcmGh5dXJTy0sTeATfkmgBaBQeJhiZOmJer25Jqf tEyvxkkFw0OLAy3BoI7eeI7ALmgzIXPLIWVOo0t1qTeKsURwdfap8xjteT9/5iiJ HHB+4A+8iUMSPYMoIiSr0qsyZT1CI2ncVV4VOAhm11if4gB8sCnOkioMo6bHhtbq NZrug6BnwizBBh4FFGHmTTpN8MlhLgYA8YfSkUD5jal3jM/l5VvlM9DDM72JuzLy 0hd5zaAqlqEJj4plvSx4ieuDHAkA3Qzd58U12LuN+R+nbN2EO8Svr9CvEMSLE+1r exxOYRP5xO9SPeK7pTnXJFsI09MrRmXGinRQu/LChWAm1j7NxaHkeUl7ulzxjV+5 E6Bx+mOPvYTUF32CQi4i/oIePK2kVGoExaUSkRNKeRgwF/ObQaVxjM+cYvM+VJEh 2OknzHNG/F87UvkcOdGMEy6G4E5xkHSuEYtq7bHGWI2prKn+f0nbPpWgx9d4DiE8 jo0vi58v+Irwk9H465lH =ssy9 -----END PGP SIGNATURE----- Merge tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC updates from Borislav Petkov: - Altera: L2 cache and On-Chip RAM support (Thor Thayer). - EDAC: Workqueue handling cleanups (Borislav Petkov). - Xgene: Register bus error handling (Loc Ho). - Misc small fixes. * tag 'edac_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: ARM: socfpga: Enable OCRAM ECC on startup ARM: socfpga: Enable L2 cache ECC on startup ARM: dts: Add Altera L2 Cache and OCRAM EDAC entries EDAC, altera: Add Altera L2 cache and OCRAM support EDAC: Use edac_debugfs_remove_recursive() in edac_debugfs_exit() EDAC, mpc85xx: Silence unused variable warning EDAC: Cleanup/sync workqueue functions EDAC: Kill workqueue setup/teardown functions EDAC: Balance workqueue setup and teardown arm64: Update the APM X-Gene EDAC node with the RB register resource EDAC, xgene: Add missing SoC register bus error handling Documentation, EDAC: Update xgene binding for missing register bus EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()	2016-03-16 08:36:55 -07:00
Linus Torvalds	d88bfe1d68	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "Various RAS updates: - AMD MCE support updates for future CPUs, fixes and 'SMCA' (Scalable MCA) error decoding support (Aravind Gopalakrishnan) - x86 memcpy_mcsafe() support, to enable smart(er) hardware error recovery in NVDIMM drivers, based on an extension of the x86 exception handling code. (Tony Luck)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: EDAC/sb_edac: Fix computation of channel address x86/mm, x86/mce: Add memcpy_mcsafe() x86/mce/AMD: Document some functionality x86/mce: Clarify comments regarding deferred error x86/mce/AMD: Fix logic to obtain block address x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors x86/mce: Move MCx_CONFIG MSR definitions x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries x86/mm: Expand the exception table logic to allow new handling options x86/mce/AMD: Set MCAX Enable bit x86/mce/AMD: Carve out threshold block preparation x86/mce/AMD: Fix LVT offset configuration for thresholding x86/mce/AMD: Reduce number of blocks scanned per bank x86/mce/AMD: Do not perform shared bank check for future processors x86/mce: Fix order of AMD MCE init function call	2016-03-14 18:43:51 -07:00
Luck, Tony	eb1af3b71f	EDAC/sb_edac: Fix computation of channel address Large memory Haswell-EX systems with multiple DIMMs per channel were sometimes reporting the wrong DIMM. Found three problems: 1) Debug printouts for socket and channel interleave were not interpreting the register fields correctly. The socket interleave field is a 2^X value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2, 2=3. 3=4). 2) Actual use of the socket interleave value didn't interpret as 2^X 3) Conversion of address to channel address was complicated, and wrong. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-10 18:31:55 +01:00
Aravind Gopalakrishnan	be0aec23bf	x86/mce/AMD, EDAC: Enable error decoding of Scalable MCA errors For Scalable MCA enabled processors, errors are listed per IP block. And since it is not required for an IP to map to a particular bank, we need to use HWID and McaType values from the MCx_IPID register to figure out which IP a given bank represents. We also have a new bit (TCC) in the MCx_STATUS register to indicate Task context is corrupt. Add logic here to decode errors from all known IP blocks for Fam17h Model 00-0fh and to print TCC errors. [ Minor fixups. ] Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1457021458-2522-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-08 11:48:14 +01:00
Hubert Chrzaniuk	83bdaad4d9	EDAC, sb_edac: Fix logic when computing DIMM sizes on Xeon Phi Correct a typo introduced by `d0cdf90031` ("EDAC, sb_edac: Add Knights Landing (Xeon Phi gen 2) support") As a result under some configurations DIMMs were not correctly recognized. Problem affects only Xeon Phi architecture. Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1457361045-26221-1-git-send-email-hubert.chrzaniuk@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-03-07 19:07:40 +01:00
Thor Thayer	c3eea1942a	EDAC, altera: Add Altera L2 cache and OCRAM support Add L2 Cache and On-Chip RAM EDAC support for the Altera SoCs. The SDRAM controller is using the Memory Controller model. Each type of ECC is individually configurable. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: galak@codeaurora.org Cc: grant.likely@linaro.org Cc: ijc+devicetree@hellion.org.uk Cc: linux-arm-kernel@lists.infradead.org Cc: linux@arm.linux.org.uk Cc: linux-doc@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: pawel.moll@arm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1455132384-17108-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-11 12:23:06 +01:00
Thor Thayer	9bf4f00567	EDAC: Use edac_debugfs_remove_recursive() in edac_debugfs_exit() debugfs_remove() is used to remove a file or a directory from the debugfs filesystem on an EDAC device exit. However edac_debugfs might not be empty. This is similar to `30f84a891b` ("EDAC: Use edac_debugfs_remove_recursive()") which changed the EDAC MCI code to use edac_debugfs_remove_recursive(). Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1455064165-3816-1-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-10 10:37:46 +01:00
Sudip Mukherjee	f2b59ac66f	EDAC, mpc85xx: Silence unused variable warning We were getting this build warning: drivers/edac/mpc85xx_edac.c:1247:6: warning: unused variable 'pvr' pvr is only used if CONFIG_FSL_SOC_BOOKE is defined. Declare it __maybe_unused. Suggested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1454427573-7994-1-git-send-email-sudipm.mukherjee@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 18:53:15 +01:00
Borislav Petkov	06e912d4d4	EDAC: Cleanup/sync workqueue functions They're both running only when ->edac_check is initialized so remove that check from the workqueue function itself. Synchronize/generalize the ->op_state check between the two. Kill useless comments, while at it. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:38:50 +01:00
Borislav Petkov	626a7a4dba	EDAC: Kill workqueue setup/teardown functions We have the generic wrappers now, use those. edac_pci_workq_setup() had an unused argument anyway. Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:38:43 +01:00
Borislav Petkov	0966760619	EDAC: Balance workqueue setup and teardown We use the ->edac_check function pointers to determine whether we need to setup a polling workqueue. However, the destroy path is not balanced and we might try to teardown an unitialized workqueue. Balance init and destroy paths by looking at ->edac_check in both cases. Set op_state to OP_OFFLINE before destroying anything. Reported-by: Zhiqiang Hou <Zhiqiang.Hou@freescale.com> Cc: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2016-02-02 11:04:29 +01:00
Loc Ho	4d67e3ce75	EDAC, xgene: Add missing SoC register bus error handling Add missing register bus error handling for APM X-Gene EDAC SoC and fix a checking condition for CE error promoted to UE. Signed-off-by: Loc Ho <lho@apm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: patches@apm.com Link: http://lkml.kernel.org/r/1453495625-28006-3-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-25 11:17:22 +01:00
Dan Carpenter	6f3508f61c	EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr() dct_sel_base_off is declared as a u64 but we're only using the lower 32 bits because of a shift wrapping bug. This can possibly truncate the upper 16 bits of DctSelBaseOffset[47:26], causing us to misdecode the CS row. Fixes: `c8e518d567` ('amd64_edac: Sanitize f10_get_base_addr_offset') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20160120095451.GB19898@mwanda Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-25 11:17:14 +01:00
Geliang Tang	1cac5503fb	EDAC, i5100: Use to_delayed_work() Use to_delayed_work() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/58c0e319c7263a10b692100c657c06c42814aecf.1451659910.git.geliangtang@163.com Signed-off-by: Borislav Petkov <bp@suse.de>	2016-01-01 18:31:34 +01:00
Hubert Chrzaniuk	45f4d3ab3e	EDAC, sb_edac: Set fixed DIMM width on Xeon Knights Landing Knights Landing does not come with register that could be used to fetch DIMM width. However the value is fixed for this architecture so it can be hardcoded. Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449840082-18673-1-git-send-email-hubert.chrzaniuk@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:58:32 +01:00
Borislav Petkov	c4cf3b454e	EDAC: Rework workqueue handling Hide the EDAC workqueue pointer in a separate compilation unit and add accessors for the workqueue manipulations needed. Remove edac_pci_reset_delay_period() which wasn't used by anything. It seems it got added without a user with `91b99041c1` ("drivers/edac: updated PCI monitoring") Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:43 +01:00
Borislav Petkov	e136fa016f	EDAC: Make edac_device workqueue setup/teardown functions static They're not used anywhere else. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:42 +01:00
Borislav Petkov	d4538000ca	EDAC: Remove edac_get_sysfs_subsys() error handling It cannot fail now. We either load EDAC core after having successfully initialized edac_subsys or we don't. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:41 +01:00
Borislav Petkov	a97d262701	EDAC: Unexport and make edac_subsys static ... and use the accessor instead. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:40 +01:00
Borislav Petkov	733476cf20	EDAC: Rip out the edac_subsys reference counting This was really dumb - reference counting for the main EDAC sysfs object. While we could've simply registered it as the first thing in the module init path and then hand it around to what needs it. Do that and rip out all the code around it, thus simplifying the whole handling significantly. Move the edac_subsys node back to edac_module.c. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:39 +01:00
Borislav Petkov	fcd5c4dd82	EDAC: Robustify workqueues destruction EDAC workqueue destruction is really fragile. We cancel delayed work but if it is still running and requeues itself, we still go ahead and destroy the workqueue and the queued work explodes when workqueue core attempts to run it. Make the destruction more robust by switching op_state to offline so that requeuing stops. Cancel any pending work synchronously too. EDAC i7core: Driver loaded. general protection fault: 0000 [#1] SMP CPU 12 Modules linked in: Supported: Yes Pid: 0, comm: kworker/0:1 Tainted: G IE 3.0.101-0-default #1 HP ProLiant DL380 G7 RIP: 0010:[<ffffffff8107dcd7>] [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0 < ... regs ...> Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600) Stack: ... Call Trace: call_timer_fn run_timer_softirq __do_softirq call_softirq do_softirq irq_exit smp_apic_timer_interrupt apic_timer_interrupt intel_idle cpuidle_idle_call cpu_idle Code: ... RIP __queue_work RSP <...> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org>	2015-12-11 16:56:39 +01:00
Borislav Petkov	12e26969b3	EDAC, mc_sysfs: Fix freeing bus' name I get the splat below when modprobing/rmmoding EDAC drivers. It happens because bus->name is invalid after bus_unregister() has run. The Code: section below corresponds to: .loc 1 1108 0 movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus .loc 1 1109 0 popq %rbx # .loc 1 1108 0 movq (%rax), %rdi # _7->name, jmp kfree # and %rax has some funky stuff 2030203020312030 which looks a lot like something walked over it. Fix that by saving the name ptr before doing stuff to string it points to. general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ #48 Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011 task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000 RIP: 0010:[<ffffffffa019da92>] [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP: 0018:ffff88030da3fe28 EFLAGS: 00010292 RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286 RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110 R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68 R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000 FS: 00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0 Stack: Call Trace: i7core_unregister_mci.isra.9 i7core_remove pci_device_remove __device_release_driver driver_detach bus_remove_driver pci_unregister_driver i7core_exit SyS_delete_module system_call_fastpath 0x7fc9bf426536 Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b RIP [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core] RSP <ffff88030da3fe28> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: <stable@vger.kernel.org> # v3.6.. Fixes: `7a623c0390` ("edac: rewrite the sysfs code to use struct device")	2015-12-11 16:56:38 +01:00
Scott Wood	666db563d3	EDAC, mpc85xx: Make mpc85xx-pci-edac a platform device Originally the mpc85xx-pci-edac driver bound directly to the PCI controller node. Commit `905e75c46d` ("powerpc/fsl-pci: Unify pci/pcie initialization code") turned the PCI controller code into a platform device. Since we can't have two drivers binding to the same device, the EDAC code was changed to be called into as a library-style submodule. However, this doesn't work if the EDAC driver is built as a module. Commit 8d8fcba6d1ea ("EDAC: Rip out the edac_subsys reference counting") exposed another problem with this approach -- mpc85xx_pci_err_probe() was being called in the same early boot phase that the PCI controller is initialized, rather than in the device_initcall phase that the EDAC layer expects. This caused a crash on boot. To fix this, the PCI controller code now creates a child platform device specifically for EDAC, which the mpc85xx-pci-edac driver binds to. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Daniel Axtens <dja@axtens.net> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Jia Hongtao <B38951@freescale.com> Cc: Jiri Kosina <jkosina@suse.com> Cc: Kim Phillips <kim.phillips@freescale.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Masanari Iida <standby24x7@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: http://lkml.kernel.org/r/1449774432-18593-1-git-send-email-scottwood@freescale.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:16 +01:00
Jim Snow	d0cdf90031	EDAC, sb_edac: Add Knights Landing (Xeon Phi gen 2) support Knights Landing is the next generation architecture for HPC market. KNL introduces concept of a tile and CHA - Cache/Home Agent for memory accesses. Some things are fixed in KNL: () There's single DIMM slot per channel () There's 2 memory controllers with 3 channels each, however, from EDAC standpoint, it is presented as single memory controller with 6 channels. In order to represent 2 MCs w/ 3 CH, it would require major redesign of EDAC core driver. Basically, two functionalities are added/extended: () during driver initialization KNL topology is being recognized, i.e. which channels are populated with what DIMM sizes (knl_get_dimm_capacity function) () handle MCE errors - channel swizzling Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Jim Snow <jim.m.snow@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-5-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-05 19:00:52 +01:00
Jim Snow	c1979ba254	EDAC, sb_edac: Add support for duplicate device IDs Add options to sbridge_get_all_devices() to allow for duplicate device IDs and devices that are scattered across mulitple PCI buses. Signed-off-by: Jim Snow <jim.m.snow@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-4-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-05 18:57:41 +01:00
Jim Snow	c59f9c06bd	EDAC, sb_edac: Virtualize several hard-coded functions SAD limit, interleave mode and DRAM related functionalities are now virtualized, so that overriding them is easier. Signed-off-by: Jim Snow <jim.m.snow@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-3-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-05 18:54:45 +01:00
Thierry Reding	768ce42cce	EDAC, mv64x60: Use platform_register/unregister_drivers() These new helpers simplify implementing multi-driver modules and properly handle failure to register one driver by unregistering all previously registered drivers. Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1449073138-10852-2-git-send-email-thierry.reding@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-03 12:05:40 +01:00
Thierry Reding	d54051f1cc	EDAC, mpc85xx: Use platform_register/unregister_drivers() These new helpers simplify implementing multi-driver modules and properly handle failure to register one driver by unregistering all previously registered drivers. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1449136632-11680-1-git-send-email-thierry.reding@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-03 12:03:36 +01:00
Borislav Petkov	e8f937da74	EDAC, pci: Remove old disabled code Remove an unused edac_pci_find() function iterating over edac_pci_list. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-11-18 14:00:05 +01:00
Linus Torvalds	9cf5c095b6	asm-generic cleanups The asm-generic changes for 4.4 are mostly a series from Christoph Hellwig to clean up various abuses of headers in there. The patch to rename the io-64-nonatomic-.h headers caused some conflicts with new users, so I added a workaround that we can remove in the next merge window. The only other patch is a warning fix from Marek Vasut -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAVjzaf2CrR//JCVInAQImmhAA20fZ91sUlnA5skKNPT1phhF6Z7UF2Sx5 nPKcHQD3HA3lT1OKfPBYvCo+loYflvXFLaQThVylVcnE/8ecAEMtft4nnGW2nXvh sZqHIZ8fszTB53cynAZKTjdobD1wu33Rq7XRzg0ugn1mdxFkOzCHW/xDRvWRR5TL rdQjzzgvn2PNlqFfHlh6cZ5ykShM36AIKs3WGA0H0Y/aYsE9GmDOAUp41q1mLXnA 4lKQaIxoeOa+kmlsUB0wEHUecWWWJH4GAP+CtdKzTX9v12bGNhmiKUMCETG78BT3 uL8irSqaViNwSAS9tBxSpqvmVUsa5aCA5M3MYiO+fH9ifd7wbR65g/wq39D3Pc01 KnZ3BNVRW5XSA3c86pr8vbg/HOynUXK8TN0lzt6rEk8bjoPBNEDy5YWzy0t6reVe wX65F+ver8upjOKe9yl2Jsg+5Kcmy79GyYjLUY3TU2mZ+dIdScy/jIWatXe/OTKZ iB4Ctc4MDe9GDECmlPOWf98AXqsBUuKQiWKCN/OPxLtFOeWBvi4IzvFuO8QvnL9p jZcRDmIlIWAcDX/2wMnLjV+Hqi3EeReIrYznxTGnO7HHVInF555GP51vFaG5k+SN smJQAB0/sostmC1OCCqBKq5b6/li95/No7+0v0SUhJJ5o76AR5CcNsnolXesw1fu vTUkB/I66Hk= =dQKG -----END PGP SIGNATURE----- Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cleanups from Arnd Bergmann: "The asm-generic changes for 4.4 are mostly a series from Christoph Hellwig to clean up various abuses of headers in there. The patch to rename the io-64-nonatomic-.h headers caused some conflicts with new users, so I added a workaround that we can remove in the next merge window. The only other patch is a warning fix from Marek Vasut" * tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: temporarily add back asm-generic/io-64-nonatomic.h asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations gpio-mxc: stop including <asm-generic/bug> n_tracesink: stop including <asm-generic/bug> n_tracerouter: stop including <asm-generic/bug> mlx5: stop including <asm-generic/kmap_types.h> hifn_795x: stop including <asm-generic/kmap_types.h> drbd: stop including <asm-generic/kmap_types.h> move count_zeroes.h out of asm-generic move io-64-nonatomic.h out of asm-generic	2015-11-06 14:22:15 -08:00
Linus Torvalds	b831ef2cad	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS changes from Ingo Molnar: "The main system reliability related changes were from x86, but also some generic RAS changes: - AMD MCE error injection subsystem enhancements. (Aravind Gopalakrishnan) - Fix MCE and CPU hotplug interaction bug. (Ashok Raj) - kcrash bootup robustness fix. (Baoquan He) - kcrash cleanups. (Borislav Petkov) - x86 microcode driver rework: simplify it by unmodularizing it and other cleanups. (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init() x86/mce: Add a Scalable MCA vendor flags bit MAINTAINERS: Unify the microcode driver section x86/microcode/intel: Move #ifdef DEBUG inside the function x86/microcode/amd: Remove maintainers from comments x86/microcode: Remove modularization leftovers x86/microcode: Merge the early microcode loader x86/microcode: Unmodularize the microcode driver x86/mce: Fix thermal throttling reporting after kexec kexec/crash: Say which char is the unrecognized x86/setup/crash: Check memblock_reserve() retval x86/setup/crash: Cleanup some more x86/setup/crash: Remove alignment variable x86/setup: Cleanup crashkernel reservation functions x86/amd_nb, EDAC: Rename amd_get_node_id() x86/setup: Do not reserve crashkernel high memory if low reservation failed x86/microcode/amd: Do not overwrite final patch levels x86/microcode/amd: Extract current patch level read to a function x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts ...	2015-11-03 17:51:33 -08:00
Tan Xiaojun	990995bad1	EDAC: Fix PAGES_TO_MiB macro misuse The PAGES_TO_MiB macro is used for unit conversion but the trace_mc_event() tracepoint expects a page address. Fix that. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1445341538-24271-1-git-send-email-tanxiaojun@huawei.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-10-22 22:57:30 +02:00
Aravind Gopalakrishnan	1a6775c1a2	x86/amd_nb, EDAC: Rename amd_get_node_id() This function doesn't give us the "Node ID" as the function name suggests. Rather, it receives a PCI device as argument, checks the available F3 PCI device IDs in the system and returns the index of the matching Bus/Device IDs. Rename it to amd_pci_dev_to_node_id(). No functional change is introduced. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1445246268-26285-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:55 +02:00
Dinh Nguyen	941fd2e709	EDAC, altera: SoCFPGA EDAC should not look for ECC_CORR_EN The bootloader may or may not enable the ECC_CORR_EN bit. By not enabling ECC_CORR_EN, when error happens, it is the user's responsibility to perform a full SDRAM scrub. Remove the check for ECC_CORR_EN. Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Thor Thayer <tthayer@opensource.altera.com> Link: http://lkml.kernel.org/r/1444864456-21778-1-git-send-email-dinguyen@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-10-15 11:57:23 +02:00
Christoph Hellwig	2f8e2c8777	move io-64-nonatomic*.h out of asm-generic These are not implementations of default architecture code but helpers for drivers. Move them to the place they belong to. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Darren Hart <dvhart@linux.intel.com> Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2015-10-15 00:21:07 +02:00
Tan Xiaojun	30f84a891b	EDAC: Use edac_debugfs_remove_recursive() debugfs_remove() is used to remove a file or a directory from the debugfs filesystem, but mci->debugfs might not empty. This can be triggered by the following sequence: 1) Enable CONFIG_EDAC_DEBUG 2) insmod an EDAC module (like i3000_edac or similar) 3) rmmod this module 4) we can see files remaining under <debugfs_mountpoint>/edac/ like "fake_inject", for example. Removing edac_core then, causes a NULL pointer dereference. Reported-by: Yun Wu (Abel) <wuyun.wu@huawei.com> Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1444787364-104353-1-git-send-email-tanxiaojun@huawei.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-10-14 18:50:32 +02:00
Luis de Bethencourt	90b3b37383	EDAC, ppc4xx_edac: Fix module autoload for OF platform driver This platform driver has an OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20150917114619.GA13145@goodgumbo.baconseed.org Signed-off-by: Borislav Petkov <bp@suse.de>	2015-10-03 12:19:42 +02:00
Aravind Gopalakrishnan	1a8bc7707e	EDAC, amd64_edac: Update copyright and remove changelog Git provides us all the changelogs anyway. So trim the comments section here. Update the copyrights info while at it. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-29 13:41:04 +02:00
Aravind Gopalakrishnan	da92110dfd	EDAC, amd64_edac: Extend scrub rate support to F15hM60h The scrub rate control register has moved to function 2 in PCI config space and is at a different offset on family 0x15, models 0x60 and later. The minimum recommended scrub rate has also changed. (Refer to D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG). Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate this. Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com [ Cleanup conditionals. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-29 13:25:33 +02:00
Toshi Kani	d0c9c93019	EDAC: Don't allow empty DIMM labels Updating dimm_label to an empty string does not make much sense. Change the sysfs dimm_label store operation to fail a request when an input string is empty. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: elliott@hpe.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1443124767.25474.172.camel@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-28 16:39:05 +02:00
Toshi Kani	438470b84c	EDAC: Fix sysfs dimm_label store operation Sysfs "dimm_label" and "chX_dimm_label" nodes have the following issues in their store operation: 1) A newline-terminated input string causes redundant newlines: # echo "test" > /sys/bus/mc0/devices/dimm0/dimm_label # cat /sys/bus/mc0/devices/dimm0/dimm_label test # od -bc /sys/bus/mc0/devices/dimm0/dimm_label 0000000 164 145 163 164 012 012 t e s t \n \n 0000006 2) The original label string (31 characters) cannot be stored due to an improper size check: # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0" > /sys/bus/mc0/devices/dimm0/dimm_label # cat /sys/bus/mc0/devices/dimm0/dimm_label # od -bc /sys/bus/mc0/devices/dimm0/dimm_label 0000000 012 012 \n \n 0000002 3) An input string longer than the buffer size results a wrong label info as it allows a retry with the remaining string: # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0_TEST" > /sys/bus/mc0/devices/dimm0/dimm_label # cat /sys/bus/mc0/devices/dimm0/dimm_label _TEST Fix these issues by making the following changes: 1) Replace a newline character at the end by setting a null. It also assures that the string is null-terminated in the label buffer. 2) Check the label buffer size with 'sizeof(dimm->label)'. 3) Fail a request if its string exceeds the label buffer size. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Robert Elliott <elliott@hpe.com> Link: http://lkml.kernel.org/r/1443121564.25474.160.camel@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-25 19:45:59 +02:00
Toshi Kani	1ea62c59c8	EDAC: Fix sysfs dimm_label show operation After `7d375bffa5` ("sb_edac: Fix support for systems with two home agents per socket") sysfs "dimm_label" and "chX_dimm_label" show their label string without a newline "\n" at the end. [root@orange ~]# cat /sys/bus/mc0/devices/dimm0/dimm_label CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]# [root@orange ~]# cat /sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]# The label strings now have 31 characters, which are the same as EDAC_MC_LABEL_LEN. Since the snprintf()s in channel_dimm_label_show() and dimmdev_label_show() limit the whole length by EDAC_MC_LABEL_LEN, the newline in the format "%s\n" is ignored. [root@orange ~]# od -bc /sys/bus/mc0/devices/dimm0/dimm_label 0000000 103 120 125 137 123 162 143 111 104 043 060 137 110 141 043 060 C P U _ S r c I D # 0 _ H a # 0 0000020 137 103 150 141 156 043 060 137 104 111 115 115 043 060 000 _ C h a n # 0 _ D I M M # 0 \0 0000037 Fix it by using 'sizeof(dimm->label) + 1' as the whole length in the snprintf()s in channel_dimm_label_show() and dimmdev_label_show(). Reported-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Link: http://lkml.kernel.org/r/1442933883-21587-2-git-send-email-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-25 19:19:21 +02:00
Loc Ho	f864b79ba2	EDAC, xgene: Add SoC support Add support for the SoC component. Signed-off-by: Loc Ho <lho@apm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: ijc+devicetree@hellion.org.uk Cc: jcm@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: patches@apm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1443055261-8613-4-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-25 15:41:46 +02:00
Loc Ho	9bc1c0c0ec	EDAC, xgene: Fix possible sprintf() overflow issue Replace sprintf() with snprintf() to avoid possible string array overflow. Signed-off-by: Loc Ho <lho@apm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: ijc+devicetree@hellion.org.uk Cc: jcm@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: patches@apm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1443116287-11752-1-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-25 15:39:52 +02:00
Loc Ho	9347473c7d	EDAC, xgene: Add L3 support Add EDAC support for the L3 component. Signed-off-by: Loc Ho <lho@apm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: ijc+devicetree@hellion.org.uk Cc: jcm@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: patches@apm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1443055261-8613-3-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-25 15:36:31 +02:00
Seth Jennings	2900ea6096	EDAC, sb_edac: Fix TAD presence check for sbridge_mci_bind_devs() In commit `7d375bffa5` ("sb_edac: Fix support for systems with two home agents per socket") NUM_CHANNELS was changed to 8 and the channel space was renumerated to handle EN, EP, and EX configurations. The *_mci_bind_devs() functions - except for sbridge_mci_bind_devs() - got a new device presence check in the form of saw_chan_mask. However, sbridge_mci_bind_devs() still uses the NUM_CHANNELS for loop. With the increase in NUM_CHANNELS, this loop fails at index 4 since SB only has 4 TADs. This results in the following error on SB machines: EDAC sbridge: Some needed devices are missing EDAC sbridge: Couldn't find mci handler EDAC sbridge: Couldn't find mci handle This patch adapts the saw_chan_mask logic for sbridge_mci_bind_devs() as well. After this patch: EDAC MC0: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#0: DEV 0000:3f:0e.0 (POLLED) EDAC MC1: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#1: DEV 0000:7f:0e.0 (POLLED) Signed-off-by: Seth Jennings <sjenning@redhat.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Acked-by: Tony Luck <tony.luck@intel.com> Tested-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # v4.2 Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1438798561-10180-1-git-send-email-sjenning@redhat.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-24 20:40:50 +02:00
Aravind Gopalakrishnan	58a9c251c9	EDAC, ghes_edac: Remove redundant memory_type array We already have edac_mem_types[] that enumerates the different kinds of memory. So, use that and remove the redundant memory_type[] array here. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1442436811-23382-2-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-23 16:59:25 +02:00
Borislav Petkov	09bd1b4f81	EDAC, xgene: Convert to debugfs wrappers Drop CONFIG_EDAC_DEBUG ifdeffery too, while at it. Tested-by: Loc Ho <lho@apm.com> Cc: linux-edac@vger.kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-23 11:37:45 +02:00
Borislav Petkov	52019e406c	EDAC, i5100: Convert to debugfs wrappers This driver creates its debugfs hierarchy under the toplevel debugfs dir - see i5100_init() - so make it use edac_debugfs_create_dir_at( , NULL) because we're not breaking userspace. Oh well. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-22 18:21:12 +02:00
Borislav Petkov	bba3b31e44	EDAC, altera: Convert to debugfs wrappers Use the EDAC-specific wrappers. Drop CONFIG_EDAC_DEBUG ifdeffery. Cc: Thor Thayer <tthayer@opensource.altera.com> Cc: <linux-edac@vger.kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-22 18:20:56 +02:00
Borislav Petkov	4397bcb4fa	EDAC: Add debugfs wrappers Later patches will convert EDAC users to those. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-22 18:10:22 +02:00
Borislav Petkov	7ac8bf9bc9	EDAC: Carve out debugfs functionality ... into a separate compilation unit and drop a couple of CONFIG_EDAC_DEBUG ifdefferies. Rename edac_create_debug_nodes() to edac_create_debugfs_nodes(), while at it. No functionality change. Cc: <linux-edac@vger.kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-09-22 12:29:46 +02:00
Linus Torvalds	d9b44fe30f	edac updates for v4.3-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV8giFAAoJEAhfPr2O5OEVYxsP/imxjTa1utbeYToA+oqut9Yw HbMB6yjscfZF/CwP/rB/T5jNTTww6pvavBntw29NewTmPIfREuvYMvvOtCkyKdQf yEVeME0cvJAiSdtiqeWoAMJYm3VMDKPh0p/cncvhcpip0ORU3prV9XEqNo1byvEU S4laHNfxgDbTHptVhvaT7Li/yN3KyenlaT61K3dYn0LtbPf6uxSsvqX+ExuZSix7 ZWS+wQrzX35hQHvL7Ax1iZv15uTfrC2kD2bwZrgetvl6wVdwBKx9vbWosaZ+XffV gv4dxJG5zNV2nFPfFUxqQZ++OcieuqP1yeWtwbuBPx3qibDKTb/IJYX8VvHAAc8n EgaDwFsYH6YnLsxNL58+W9PYigVyc2VS2Nfqz9uRuXmuryoyDbCpcgVmZYdmgYKn c1Rc4DzOqvOWrv5jdtSxYmdmAMqgf0LjlMzUmZ7shJYWprsqjgwqxevPXmH9xb7s myBpM604KofiWGPGqVNVYIdt8AldXc1QNnyCxGHcAAdIzEOn+7WEdeWqVQ665Vya x2Cp9h1IURPtcniw7Dx/0nqBu64Y6OMGz9W6H0SxBUWc2cC6QZZ4k9zOg2krtFkt c/S6RCYdIoxIcwTPfhyoBfCAHueSfUaJ8sIAWMrQLFUCgALN2f1WslIYEJVGcPOT UeQzSo1pe9wyZu5hQzji =DrsG -----END PGP SIGNATURE----- Merge tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac Pull edac updates from Mauro Carvalho Chehab: "Two EDAC fixes for Intel systems (Haswell and Ivy Bridge)" * tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell sb_edac: look harder for DDRIO on Haswell systems	2015-09-11 16:21:12 -07:00
Aristeu Rozanski	12f0721c5a	sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell dimm_dev_type has been incorrectly determined in sb_edac. This patch fixes it for Ivy Bridge and Haswell only since nothing like exists for Sandy Bridge. We tested this patch in multiple systems matching the results with the installed memory modules. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-09-08 20:33:48 -03:00
Aristeu Rozanski	7179385afe	sb_edac: look harder for DDRIO on Haswell systems In case the memory banks are populated so the first channel isn't used, the DDRIO PCI device won't be visible and it won't be possible to determine the memory type. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-09-08 20:32:13 -03:00
Linus Torvalds	2d678b68e8	Minor stuff: AMD MCE decoding correction and xgene_edac cleanup. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV5aADAAoJEBLB8Bhh3lVKHM8P/RLgWShqZ0CwXz+2v9e8wZxr Uwo3xhx0jqUTefBNZqMGgJKq7DexNXWa/40vHydPO4S7yNRka4aftev01OVr2tVG 3v/ggqxV3SRYkTyhQhjzs39MDeb6+jZlMPz17DmM75iry1og9iT213jPIk6aAt9G OoIGgWWT6rtsDa55+Yat/15qc4ajE5mmJnwej/Ybr/B3qI4/Zm3XGVqopbg2axrw CtmcK7+3OJ2mvbBTGCoNAL+xtcCRXsJvch8+eo1PhJDjA1lY37lCdZa2TaD0RvsU /YFDotPNYf09RJW+f8FvMkj4P3/Upwo0w97e1nhqLRYemmezdSBd+iU3KWHCdWkf wj1XFtdpDEDuDzpo2zX5KOA/KgB7ozcRKsm9NzcQHNnbt5UCXmHIWuK3+3ED/RdM Pbm4SrLzFX0Fp/08/Vd55CSZ0bP8+bduP5su8yBfX+oUuNY3cvNO2e7zmSRaifON CI+hpjfjpaDOmydK4/RbtPvB3YkcykeB7Jk/gyDTD2ZKjxoQw8YMH5+7Xy/cMQYA aKnnbE0O14+fnp7JrZJ/ER2FlcSOMiWBhgPJ2wi+0BqcJZhpwvhkEMmClsMtrQss u7rX9B5+GqeAQ145X+67CK3VwckyD6ZO6rwP7q6VEiknmkMAO9ML4DFUhCO7YeHx 8s5Cpy94v2rPext0O2Uw =23M/ -----END PGP SIGNATURE----- Merge tag 'edac_for_4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC fixes from Borislav Petkov: "Two minor fixlets this time: AMD MCE decoding correction and xgene_edac cleanup" * tag 'edac_for_4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, mce_amd: Don't emit 'CE' for Deferred error EDAC, xgene: Drop owner assignment from platform_driver	2015-09-01 18:34:22 -07:00
Linus Torvalds	3959df1dfb	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "MCE handling updates, but also some generic drivers/edac/ changes to better organize the Kconfig space" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ras: Move AMD MCE injector to arch/x86/ras/ x86/mce: Add a wrapper around mce_log() for injection x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check() RAS: Add a menuconfig option with descriptive text x86/mce: Reenable CMCI banks when swiching back to interrupt mode x86/mce: Clear Local MCE opt-in before kexec x86/mce: Remove unused function declarations x86/mce: Kill drain_mcelog_buffer() x86/mce: Avoid potential deadlock due to printk() in MCE context x86/mce: Remove the MCE ring for Action Optional errors x86/mce: Don't use percpu workqueues x86/mce: Provide a lockless memory pool to save error records x86/mce: Reuse one of the u16 padding fields in 'struct mce'	2015-08-31 20:20:30 -07:00
Borislav Petkov	6c36dfe949	x86/ras: Move AMD MCE injector to arch/x86/ras/ This is an x86-specific module and would benefit from being closer to the arch code. Move it there. Update copyright while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1439396985-12812-14-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-13 10:12:54 +02:00
Borislav Petkov	eef4dfa0cb	x86/mce: Kill drain_mcelog_buffer() This used to flush out MCEs logged during early boot and which were in the MCA registers from a previous system run. No need for that now, since we've moved to a genpool. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-13 10:12:52 +02:00
Chen, Gong	fd4cf79fcc	x86/mce: Remove the MCE ring for Action Optional errors Use unified genpool to save Action Optional error events and put Action Optional error handling in the same notification chain as MCE error decoding. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> [ Fold in subsequent patch from Boris for early boot logging. ] Signed-off-by: Tony Luck <tony.luck@intel.com> [ Correct a lot. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-13 10:12:51 +02:00
Michael Walle	5c16179b55	EDAC, ppc4xx: Access mci->csrows array elements properly The commit `de3910eb79` ("edac: change the mem allocation scheme to make Documentation/kobject.txt happy") changed the memory allocation for the csrows member. But ppc4xx_edac was forgotten in the patch. Fix it. Signed-off-by: Michael Walle <michael@walle.cc> Cc: <stable@vger.kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc Signed-off-by: Borislav Petkov <bp@suse.de>	2015-08-13 06:02:19 +02:00
Aravind Gopalakrishnan	99e1dfb7d2	EDAC, mce_amd: Don't emit 'CE' for Deferred error Currently, when decoding an MCE, we display 'CE' for a Deferred error, like this: [Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over\|CE\|MiscV\|-\|AddrV\|Deferred\|-\|UECC]: 0xdc04b00095080813 When the 'UC' bit in the MCx_STATUS register is clear, the error status is either a Corrected error or Deferred error as determined by the 'Deferred' bit. So do not print 'CE' on a deferred error. Refer to AMD Error Scope Hierarchy table in a newer BKDG (example: 49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features"). Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1436788382-6463-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-07-14 06:32:53 +02:00
Krzysztof Kozlowski	ca12bb14fe	EDAC, xgene: Drop owner assignment from platform_driver platform_driver does not need to set an owner because platform_driver_register() will set it. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Loc Ho <lho@apm.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Link: http://lkml.kernel.org/r/1436507243-11159-1-git-send-email-k.kozlowski@samsung.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-07-10 14:16:24 +02:00
Linus Torvalds	b53343fc6c	A build fix for octeon_edac from Aaro Koskinen. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVlPv5AAoJEBLB8Bhh3lVKlxIP/35j4O/i4+xgw2CRQvAflJgz haUZsguT7QRXqbVmQdAMqyCrz77jnIIFIJEcoqpvpcHSVuNblbjBS6xc7u1UAfmF h7M+XIFZuIX7mp6svVJujN+VgP1Obgsx860bG335HIJZZzPfOfs31pisxUddXNpC deuP4opLlKcykioFF4GXhoLzDBoc2PLfPGczNRzw9HUisN2F+SK4NEa7iehp8RBX UCQTtwsLFuBzi4AHqAfrNKgDiEQ+DZ529sVh5bqAk/U8VDqUulKqIjLh0G75tYm2 SauVE2woOn4WqUTG8CJBjCTauL+BIc2LTcPh6qIzWxsgZnyeHCve/1QBBOMi/yk2 FAJYg2VDhPqj0OyUqC7JUXicB3ZKviWYTZ3knv9mIF8m98N8K4kAwGPuUkTc3Z2P fBBusvzb+6/OTN2WyQaLx8DjG/+Ha/o0AEeWE6S4A9+urCrJIx+Dkelt3dngdrVe kbHyvKmTPy13U1zEo5jmMZkjf7xNfotpckTIwCz3l00Z5AwPPVjC5KylhPvlwM7L BpftYJsPRJrIK0zkuU3Yfcrhy8U2QcQAV14IMEW1r5pZwpLH+EfoIHAbh8PWJc5C toporRXFc2R7MT1+0DA4faNzfwXXBLbT5BYRzNfTZPClAeSqKE0OPj4Ytv+JWyO5 kx3HK/kWiSH2SxQ1Sid1 =7C57 -----END PGP SIGNATURE----- Merge tag 'edac_urgent_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC fix from Borislav Petkov: "A build fix for octeon_edac from Aaro Koskinen" * tag 'edac_urgent_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, octeon: Fix broken build due to model helper renames	2015-07-03 12:10:12 -07:00
Aaro Koskinen	75a15a7864	EDAC, octeon: Fix broken build due to model helper renames Commit `debe6a623d` ("MIPS: OCTEON: Update octeon-model.h code for new SoCs.") renamed some SoC model helper functions, but forgot to update the EDAC drivers resulting in build failures. Fix that. Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Acked-by: David Daney <david.daney@cavium.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/1435747132-10954-1-git-send-email-aaro.koskinen@nokia.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-07-02 10:46:28 +02:00
Linus Torvalds	8d7804a2f0	Driver core patches for 4.2-rc1 Here is the driver core / firmware changes for 4.2-rc1. A number of small changes all over the place in the driver core, and in the firmware subsystem. Nothing really major, full details in the shortlog. Some of it is a bit of churn, given that the platform driver probing changes was found to not work well, so they were reverted. All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlWNoCQACgkQMUfUDdst+ym4JACdFrrXoMt2pb8nl5gMidGyM9/D jg8AnRgdW8ArDA/xOarULd/X43eA3J3C =Al2B -----END PGP SIGNATURE----- Merge tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the driver core / firmware changes for 4.2-rc1. A number of small changes all over the place in the driver core, and in the firmware subsystem. Nothing really major, full details in the shortlog. Some of it is a bit of churn, given that the platform driver probing changes was found to not work well, so they were reverted. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits) Revert "base/platform: Only insert MEM and IO resources" Revert "base/platform: Continue on insert_resource() error" Revert "of/platform: Use platform_device interface" Revert "base/platform: Remove code duplication" firmware: add missing kfree for work on async call fs: sysfs: don't pass count == 0 to bin file readers base:dd - Fix for typo in comment to function driver_deferred_probe_trigger(). base/platform: Remove code duplication of/platform: Use platform_device interface base/platform: Continue on insert_resource() error base/platform: Only insert MEM and IO resources firmware: use const for remaining firmware names firmware: fix possible use after free on name on asynchronous request firmware: check for file truncation on direct firmware loading firmware: fix __getname() missing failure check drivers: of/base: move of_init to driver_init drivers/base: cacheinfo: fix annoying typo when DT nodes are absent sysfs: disambiguate between "error code" and "failure" in comments driver-core: fix build for !CONFIG_MODULES driver-core: make __device_attach() static ...	2015-06-26 15:07:37 -07:00
Linus Torvalds	da996f7310	edac updates for v4.2-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVjFrRAAoJEAhfPr2O5OEVD70P/A17My+ABfqme1snLyJuiQsz Uuu1JpS4ZcCxEdc4J/nCxexlbwr7b5/2bkSWASIQaOYMOcgM7CkBs5Z7fIDjpgOt UMCMCnmVW6A/nMeVKfmcDri0BDzW73Yt8jT8Urtp4yz0k5wRV5fBHHK0JH2l8Zj6 ND9KPJQqA96cshe4TiLBJj+stZhRJhPG/FeQ5GwTs6j4mn+lyu7bGJ/V5h45iYkG 6frYP8UJMYR6g1VXl3ohTlyBJfEFdoG5Uz0H5oyqfRuFJtEuXNSCQVy2xWscBUc5 A4l/brTXXDhHgo6mq7tFDO2BldiIMWHZhLbDvzoAlVOK5s14q30DhSF8q25CP3+t uszwV6mihGerr1VjUEm3n1Uus5wuAXsgn75k1dnqbL/3aut0UYkOS9skml3ioWiV PQB+Ix3Bjz7k9Hwi4Rd2KZEhwvPAWXBKyRjScq9DYjo3+Er8ClgoJTcADiYXPTRM ctR2WY6PLkMhTdePBMJbK51ZljMGKZj0boDFHofZceQWVaDod8dSFn5HFjQkc3nu Ty9Qb3LMDPZeensVg5bOlIgXsLoUoo7zlwhdxVRAyc6duSJndKAJoQL6l1RTvXYM UQbEdMQZURGzTRyGui+tX9+NBER0n7mJtw9qUKWaxi3GlTGH8ETg/ga0COM0SEkD Ax+ZvqYeyJSh4/F2nGFV =bjTk -----END PGP SIGNATURE----- Merge tag 'edac/v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac Pull edac updates from Mauro Carvalho Chehab: "Some fixes and additions to the EDAC driver used on modern Intel x86 CPUs. It includes support for Broadwell EP/EX platforms and fixes for motherboards with more than 2 CPU sockets" * tag 'edac/v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: sb_edac: support for Broadwell -EP and -EX sb_edac: Fix support for systems with two home agents per socket sb_edac: Fix a typo and a thinko in address handling for Haswell EDAC: Remove arbitrary limit on number of channels	2015-06-25 18:22:20 -07:00
Borislav Petkov	cda9459da7	EDAC, mce_amd_inj: Set MISCV on injection When during injection we populate MCi_MISC by writing into misc, we need to set the MiscV bit in the corresponding MCi_STATUS register which denotes that there's valid info in the MCi_MISC register. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:17:38 +02:00
Borislav Petkov	6d1e9bf5b0	EDAC, mce_amd_inj: Move bit preparations before the injection We do get_online_cpus() and then start noodling with the bits. Do that before we grab the hotplug lock. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:17:37 +02:00
Borislav Petkov	f2f3dca1b7	EDAC, mce_amd_inj: Cleanup and simplify README Save us an indentation level, widen to 80 cols, make the text more succinct and slender. Use i as the bank variable, same as what the documentation uses. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:17:34 +02:00
Alan Tull	6f2b6422d4	EDAC, altera: Do not allow suspend when EDAC is enabled Suspend-to-RAM and EDAC support are mutually exclusive on SOCFPGA. If EDAC is enabled, it will prevent the platform from going into suspend. The reason is that the IRQ vectors for OCRAM reside on DDR and in Suspend-to-RAM mode we're executing out of OCRAM. If an ECC error occurs, we can't handle it so it was decided to make them mutually exclusive. Signed-off-by: Alan Tull <atull@opensource.altera.com> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: dinh.linux@gmail.com Cc: dougthompson@xmission.com Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Cc: tthayer@opensource.altera.com Link: http://lkml.kernel.org/r/1433512155-9906-1-git-send-email-dinguyen@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:16:12 +02:00
kbuild test robot	de2776787f	EDAC, mce_amd_inj: Make inj_type static It is used there only anyway. Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: kbuild-all@01.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20150605112426.GA97073@lkp-sb04 Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:16:11 +02:00
Thor Thayer	73bcc942f4	EDAC, altera: Add Arria10 EDAC support The Arria10 SDRAM and ECC system differs significantly from the Cyclone5 and Arria5 SoCs. This patch adds support for the Arria10 SoC. 1) IRQ handler needs to support SHARED IRQ 2) Support sberr and dberr address reporting. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: galak@codeaurora.org Cc: grant.likely@linaro.org Cc: ijc+devicetree@hellion.org.uk Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: m.chehab@samsung.com Cc: mark.rutland@arm.com Cc: pawel.moll@arm.com Cc: robh+dt@kernel.org Cc: tthayer.linux@gmail.com Link: http://lkml.kernel.org/r/1433428128-7292-4-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:16:09 +02:00
Thor Thayer	143f4a5ac5	EDAC, altera: Refactor for Altera CycloneV SoC The Arria10 SoC uses a completely different SDRAM controller from the earlier CycloneV and ArriaV SoCs. This patch abstracts the SDRAM bits for the CycloneV/ArriaV SoCs in preparation for the Arria10 support. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: galak@codeaurora.org Cc: grant.likely@linaro.org Cc: ijc+devicetree@hellion.org.uk Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: m.chehab@samsung.com Cc: mark.rutland@arm.com Cc: pawel.moll@arm.com Cc: robh+dt@kernel.org Cc: tthayer.linux@gmail.com Link: http://lkml.kernel.org/r/1433428128-7292-3-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:16:08 +02:00
Thor Thayer	f9ae487e04	EDAC, altera: Generalize driver to use DT Memory size The Arria10 SOC uses a completely different SDRAM controller from the earlier CycloneV and ArriaV SoCs. The memory size is calculated in the bootloader and passed via the device tree. Using this device tree size is more generic than using the register fields to calculate the memory size for different SDRAM controllers. Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: dinguyen@opensource.altera.com Cc: galak@codeaurora.org Cc: grant.likely@linaro.org Cc: ijc+devicetree@hellion.org.uk Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: m.chehab@samsung.com Cc: mark.rutland@arm.com Cc: pawel.moll@arm.com Cc: robh+dt@kernel.org Cc: tthayer.linux@gmail.com Link: http://lkml.kernel.org/r/1433428128-7292-2-git-send-email-tthayer@opensource.altera.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:16:07 +02:00
Aravind Gopalakrishnan	99e21fea47	EDAC, mce_amd_inj: Add README file Provide information about each injection file and its usage for ease of use and in-band documentation. This is a good idea adapted from ftrace. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1433277362-10911-7-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 18:15:51 +02:00
Aravind Gopalakrishnan	4c6034e8e1	EDAC, mce_amd_inj: Add individual permissions field to dfs_node Add per-file permissions to the dfs_fls[] array. In a later patch, we will add a README file that needs different permissions. Hence the move here to add a perm field. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1433277362-10911-6-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-24 15:17:18 +02:00
Aravind Gopalakrishnan	0451d14d05	EDAC, mce_amd_inj: Modify flags attribute to use string arguments Use strings such as "hw" or "sw" to indicate the type of error injection to be performed. Current flags attribute derives the meanings of values that can be programmed into it from asm/mce.h. Moving to defined strings for the attribute allows this module to be self-sufficient and removes the dependency. Also, we can introduce new flags as and when needed without having to worry about conflicting with the flags already defined in asm/mce.h. Also, modify do_inject() to use the newly defined injection_type enum to figure out the injection mechanism we need to use Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1433277362-10911-4-git-send-email-Aravind.Gopalakrishnan@amd.com [ Use strstrip() return value. ] Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-03 16:47:51 +02:00
Aravind Gopalakrishnan	685d46d72b	EDAC, mce_amd_inj: Read out number of MCE banks from the hardware The number of banks for a given processor is encoded in MSR_IA32_MCG_CAP[7:0]. So obtain the value from that MSR and use it for sanity checking in inj_bank_set() instead of doing a family/model check. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Link: http://lkml.kernel.org/r/1432753418-2985-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-03 16:18:22 +02:00
Aravind Gopalakrishnan	e7f2ea1dbe	EDAC, mce_amd_inj: Use MCE_INJECT_GET macro for bank node too inj_bank_get() is generic enough that we can use the MCE_INJECT_GET macro instead. No functionality change. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mchehab@osg.samsung.com Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1433277362-10911-2-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-03 16:16:21 +02:00
Tony Luck	fa2ce64f85	sb_edac: support for Broadwell -EP and -EX Basic support for the single socket Broadwell-DE processor was added back in commit `1f39581a9a` sb_edac: Add support for Broadwell-DE processor This patch extends Broadwell support to cover the two socket "-EP" and four socket "-EX" versions of Broadwell. Only tested on the 2 socket - but this code is largely cloned from the Haswell path. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-06-03 10:10:59 -03:00
Tony Luck	7d375bffa5	sb_edac: Fix support for systems with two home agents per socket First noticed a problem on a 4 socket machine where EDAC only reported half the DIMMS. Tracked this down to the code that assumes that systems with two home agents only have two memory channels on each agent. This is true on 2 sockect ("-EP") machines. But four socket ("-EX") machines have four memory channels on each home agent. The old code would have had problems on two socket systems as it did a shuffling trick to make the internals of the code think that the channels from the first agent were '0' and '1', with the second agent providing '2' and '3'. But the code didn't uniformly convert from {ha,channel} tuples to this internal representation. New code always considers up to eight channels. On a machine with a single home agent these map easily to edac channels 0, 1, 2, 3. On machines with two home agents we map using: edac_channel = 4*ha# + channel So on a -EP machine where each home agent supports only two channels we'll fill in channels 0, 1, 4, 5, and on a -EX machine we use all of 0, 1, 2, 3, 4, 5, 6, 7. [mchehab@osg.samsung.com: fold a fixup patch as per Tony's request and fixed a few CodingStyle issues] Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-06-03 10:10:52 -03:00
Tony Luck	bb89e7141a	sb_edac: Fix a typo and a thinko in address handling for Haswell typo: "a7mode" chooses whether to use bits {8, 7, 9} or {8, 7, 6} in the algorithm to spread access between memory resources. But the non-a7mode path was incorrectly using GET_BITFIELD(addr, 7, 9) and so picking bits {9, 8, 7} thinko: BIT(1) of the dram_rule registers chooses whether to just use the {8, 7, 6} (or {8, 7, 9}) bits mentioned above as they are, or to XOR them with bits {18, 17, 16} but the code inverted the test. We need the additional XOR when dram_rule{1} == 0. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-06-03 10:10:47 -03:00
Tony Luck	c44696fff0	EDAC: Remove arbitrary limit on number of channels Currently set to "6", but the reset of the code will dynamically allocate as needed. We need to go to "8" today, but drop the check completely to save doing this again when we need even larger numbers. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2015-06-03 10:10:22 -03:00
Arnd Bergmann	451bb7fbcc	EDAC, xgene: Fix cpuid abuse The new x-gene EDAC driver incorrectly tried to figure out the version of one of its IP blocks by looking at the version of the CPU core, which is only vagely related. This removes the incorrect code and instead uses the version of the IP block in the compatible string where it belongs. Found using build testing on x86, which does not provide the arm64 cpuid interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> [ Changed subnode to "apm,xgene-edac-pmd-v2", adjusted check. ] Signed-off-by: Loc Ho <lho@apm.com> Cc: devicetree@vger.kernel.org Cc: dougthompson@xmission.com Cc: ijc+devicetree@hellion.org.uk Cc: jcm@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: mchehab@osg.samsung.com Cc: patches@apm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/3195065.IK73o60xya@wuerfel Signed-off-by: Borislav Petkov <bp@suse.de>	2015-06-02 19:07:50 +02:00
York Sun	2ce39109a5	EDAC, mpc85xx: Extend error address to 64 bit Extend err_addr to cover 64 bits for DDR errors. Signed-off-by: York Sun <yorksun@freescale.com> Acked-by: Johannes Thumshirn <morbidrsa@gmail.com> Cc: Mingkai.hu@freescale.com Link: http://lkml.kernel.org/r/1431425022-44766-2-git-send-email-Wenbin.Song@freescale.com Signed-off-by: songwenbin <wenbin.song@freescale.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-05-31 12:51:08 +02:00
York Sun	74210267a5	EDAC, mpc8xxx: Adapt for FSL SoC Remove mpc83xx and mpc85xx as dependency. Signed-off-by: York Sun <yorksun@freescale.com> Acked-by: Johannes Thumshirn <morbidrsa@gmail.com> Cc: Mingkai.hu@freescale.com Link: http://lkml.kernel.org/r/1431425022-44766-1-git-send-email-Wenbin.Song@freescale.com Signed-off-by: songwenbin <wenbin.song@freescale.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2015-05-31 12:50:31 +02:00
Borislav Petkov	cc14c5a808	EDAC, edac_stub: Drop arch-specific include <asm/edac.h> contains only the arch-specific scrubbing function and is thus not needed in edac_stub.c. Kill it. Signed-off-by: Borislav Petkov <bp@suse.de>	2015-05-29 22:01:00 +02:00
Loc Ho	0d4429301c	EDAC: Add APM X-Gene SoC EDAC driver Add support for the APM X-Gene SoC EDAC driver. Signed-off-by: Loc Ho <lho@apm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: dougthompson@xmission.com Cc: ijc+devicetree@hellion.org.uk Cc: jcm@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: mchehab@osg.samsung.com Cc: patches@apm.com Cc: robh+dt@kernel.org Link: http://lkml.kernel.org/r/1432337580-3750-5-git-send-email-lho@apm.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-05-29 11:39:24 +02:00

... 3 4 5 6 7 ...

1531 Commits