linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-27 18:45:02 +07:00

Author	SHA1	Message	Date
Borislav Petkov	b487c33e55	amd64_edac: Fix node id signedness A node id can never be negative since we use it as an index into the DRAM ranges array. This also makes one of the BUG_ON conditions redundant. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:28 +01:00
Borislav Petkov	d88977a9c4	amd64_edac: Drop redundant declarations Those were moved to the mce_amd.h header. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:27 +01:00
Borislav Petkov	df71a05324	amd64_edac: Enable driver on F15h Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	a3b7db09a6	amd64_edac: Adjust ECC symbol size to F15h F15h has the same ECC symbol size options as F10h revD and later so adjust checks to that. Simplify code a bit. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	87b3e0e6e4	amd64_edac: Simplify scrubrate setting Drop per-instance variable and compute min scrubrate dynamically. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:25 +01:00
Borislav Petkov	41d8bfaba7	amd64_edac: Improve DRAM address mapping Drop static tables which map the bits in F2x80 to a chip select size in favor of functions doing the mapping with some bit fiddling. Also, add F15 support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:24 +01:00
Borislav Petkov	5a5d237169	amd64_edac: Sanitize ->read_dram_ctl_register This function is relevant for F10h and higher, and it has only one callsite so drop its function pointer from the low_ops struct. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:23 +01:00
Borislav Petkov	b15f0fcab1	amd64_edac: Adjust sys_addr to chip select conversion routine to F15h F15h sys_addr to chip select mapping is almost identical to F10h's so reuse that. Rename functions on that path accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	355fba6005	amd64_edac: Beef up early exit reporting Add paranoid checks for the sys address before going off and decoding it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	614ec9d853	amd64_edac: Revamp online spare handling Replace per-DCT macros with smarter ones, drop hack and look for the spare rank on all chip selects on a channel. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	5d4b58e84a	amd64_edac: Fix channel interleave removal Remove the channel interleave select bit properly. See F2x110[DctSelIntLvAddr] for details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	e2f79dbdfb	amd64_edac: Correct node interleaving removal When node interleaving is enabled, a subset of the addr[14:12] bits has to be removed in order to get the normalized DCT address of the DRAM channel. The actual number of bits to remove is determined by F1x[1, 0][7C:40][IntlvEn]. Do this correctly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	95b0ef55cd	amd64_edac: Add support for interleaved region swapping On revC3 and revE Fam10h machines and later, non-interleaved graphics framebuffer memory under the 16G mark can be swapped with a region located at the bottom of memory so that the GPU can use the interleaved region and thus two channels. Add support for that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	700466249f	amd64_edac: Unify get_error_address The address bits from MC4_STATUS differ only between K8 and the rest so no need for a per-family method. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	f192c7b16c	amd64_edac: Simplify decoding path Use the struct mce directly instead of copying from it into a custom struct err_regs. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	7d20d14da1	amd64_edac: Adjust channel counting to F15h The only difference is that F10h used to sport ganged DCTs and F15h doesn't so adjust the F10h routine and reuse it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	5980bb9cd8	amd64_edac: Cleanup old defines cruft Remove unused defines, drop family names from define names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:18 +01:00
Borislav Petkov	bcd781f46a	amd64_edac: Cleanup NBSH cruft Remove reporting of errors with UC bit set - this is done by the MCE decoding code anyway and this driver deals with DRAM ECC errors only. UC (NB uncorrectable error) doesn't necessarily mean it is a DRAM error. Remove unused macros while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:17 +01:00
Borislav Petkov	a97fa68ec4	amd64_edac: Cleanup NBCFG handling The fact whether we are chipkill capable or not does not have any bearing when computing the channel index on a ganged DCT configuration so remove that. Also, simplify debug statements. Finally, remove old error injection leftovers, while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:16 +01:00
Borislav Petkov	c9f4f26eae	amd64_edac: Cleanup NBCTL code Remove family names from macro names, drop single bit defines and comment their meaning instead. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:15 +01:00
Borislav Petkov	78da121e15	amd64_edac: Cleanup DCT Select Low/High code Shorten macro names, remove family name from macros, fix macro arguments, shorten debug strings. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:15 +01:00
Borislav Petkov	cb32850744	amd64_edac: Cleanup Dram Configuration registers handling * Restrict DCT ganged mode check since only Fam10h supports it * Adjust DRAM type detection for BD since it only supports DDR3 * Remove second and thus unneeded DCLR read in k8_early_channel_count() - we do that in read_mc_regs() * Cleanup comments and remove family names from register macros * Remove unused defines There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	525a1b20a6	amd64_edac: Cleanup DBAM handling Do not read DBAM regs twice and simplify code around them. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	f678b8ccce	amd64_edac: Replace huge bitmasks with a macro Replace hard to read hex constants with a continuous masks macro. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:14 +01:00
Borislav Petkov	c8e518d567	amd64_edac: Sanitize f10_get_base_addr_offset This function maps the system address to the normalized DCT address. Document what the code does for more clarity and wrap insane bitmasks in a more understandable macro which generates them. Also, reduce number of arguments passed to the function. Finally, rename this function to what it actually does. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:13 +01:00
Borislav Petkov	229a7a11ac	amd64_edac: Sanitize channel extraction Cleanup and simplify f10_determine_channel(); make it more readable. Also drop f10_map_intlv_en_to_shift() in favor of simply counting the bits in F1x124[DramIntlvEn] which is equivalent. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:13 +01:00
Borislav Petkov	11c75eadaf	amd64_edac: Cleanup chipselect handling Add a struct representing the DRAM chip select base/limit register pairs. Concentrate all CS handling in a single function. Also, add CS looping macros for cleaner, more readable code. While at it, adjust code to F15h. Finally, do smaller macro names cleanups (remove family names from register macros) and debug messages clarification. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:12 +01:00
Borislav Petkov	bc21fa5787	amd64_edac: Cleanup DHAR handling Adjust to F15h, simplify code, fixup macros. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:12 +01:00
Borislav Petkov	7f19bf755c	amd64_edac: Remove DRAM base/limit subfields caching Add a struct representing the DRAM base/limit range pairs and remove all cached subfields. Replace them with accessor functions, which actually saves us some space: text data bss dec hex filename 14712 1577 336 16625 40f1 drivers/edac/amd64_edac_mod.o.after 14831 1609 336 16776 4188 drivers/edac/amd64_edac_mod.o.before Also, it simplifies the code a lot allowing to merge the K8 and F10h routines. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:11 +01:00
Borislav Petkov	b2b0c60543	amd64_edac: Add support for F15h DCT PCI config accesses F15h "multiplexes" between the configuration space of the two DRAM controllers by toggling D18F1x10C[DctCfgSel] while F10h has a different set of registers for DCT0, and DCT1 in extended PCI config space. Add DCT configuration space accessors per family thus wrapping all the different access prerequisites. Clean up code while at it, shorten names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:11 +01:00
Borislav Petkov	b6a280bb96	EDAC: Shut up sysfs registration debug code Raise the debug level of these routines so that their output get issued out only when the highest debug level is selected. Otherwise, don't pollute driver debug output. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:10 +01:00
Chris Metcalf	5c77075548	drivers/edac: provide support for tile architecture Add tile support for the EDAC driver, which provides unified system error (memory, PCI, etc.) reporting. For now, the TILEPro port reports memory correctable error (CE) only. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2011-03-10 13:30:14 -05:00
Grant Likely	000061245a	dt/powerpc: Eliminate users of of_platform_{,un}register_driver Get rid of old users of of_platform_driver in arch/powerpc. Most of_platform_driver users can be converted to use the platform_bus directly. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2011-02-28 01:36:39 -07:00
Arvind R	cb60a42269	edac: correct i82975x error-info reported to edac-core fix the totally wrong info w.r.t page,row,dimm-label previously reported to edac-core by i82975x driver Signed-off-by: Arvind R. <arvino55@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-02-17 16:47:04 +01:00
Arvind R	da95b3d21f	edac: correct i82975x mci initialisation corrected mtype, and added dev_name,scrubmode initialisers in i82975x struct mem_ctl initialisation Signed-off-by: Arvind R. <arvino55@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-02-17 16:46:22 +01:00
Arvind R	7ba9957581	edac: correct commented info wrong comments in i82975x driver corrected Signed-off-by: Arvind R. <arvino55@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-02-17 16:44:45 +01:00
Jiri Kosina	0a9d59a246	Merge branch 'master' into for-next	2011-02-15 10:24:31 +01:00
Borislav Petkov	4d7963648f	amd64_edac: Fix DIMMs per DCTs output amd64_debug_display_dimm_sizes() reports the distribution of the DIMMs on each DRAM controller and its chip select sizes. Thus, the last don't have anything to do with whether we're running in ganged DCT mode or not - their sizes don't change all of a sudden. Fix that by removing the ganged-check and dump DCT0's config for DCT1 when in ganged mode since they're identical. Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-02-10 14:41:49 +01:00
Arvind R	25527885e3	edac: i82975x author/maintainer email address change edac-i82975x author/maintainer email address change Signed-off-by: Arvind R. <arvino55@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-24 16:17:51 +01:00
Jesper Juhl	42b16b3fbb	Kill off warning: ‘inline’ is not at beginning of declaration Fix a bunch of warning: ‘inline’ is not at beginning of declaration messages when building a 'make allyesconfig' kernel with -Wextra. These warnings are trivial to kill, yet rather annoying when building with -Wextra. The more we can cut down on pointless crap like this the better (IMHO). A previous patch to do this for a 'allnoconfig' build has already been merged. This just takes the cleanup a little further. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-19 15:43:08 +01:00
Linus Torvalds	008d23e485	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.	2011-01-13 10:05:56 -08:00
Linus Torvalds	128283a47e	Merge branch 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, MCE: Fix NB error formatting EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit EDAC, MCE: Enable MCE decoding on F15h EDAC, MCE: Allow F15h bank 6 MCE injection EDAC, MCE: Shorten error report formatting EDAC, MCE: Overhaul error fields extraction macros EDAC, MCE: Add F15h FP MCE decoder EDAC, MCE: Add F15 EX MCE decoder EDAC, MCE: Add an F15h NB MCE decoder EDAC, MCE: No F15h LS MCE decoder EDAC, MCE: Add F15h CU MCE decoder EDAC, MCE: Add F15h IC MCE decoder EDAC, MCE: Add F15h DC MCE decoder EDAC, MCE: Select extended error code mask	2011-01-07 14:54:03 -08:00
Borislav Petkov	6d5db46687	EDAC, MCE: Fix NB error formatting Minor formatting fixup since the information which core was associated with the MCE is not always valid. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:26 +01:00
Randy Dunlap	50adbbd8a8	EDAC, MCE: Use BIT_64() to eliminate warnings on 32-bit Building for X86_32 produces shift count warnings, so use BIT_64() to eliminate the warnings. drivers/edac/mce_amd.c:778: warning: left shift count >= width of type drivers/edac/mce_amd.c:778: warning: left shift count >= width of type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: bluesmoke-devel@lists.sourceforge.net Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:25 +01:00
Borislav Petkov	bad11e0318	EDAC, MCE: Enable MCE decoding on F15h Now that everything is inplace, enable MCE decoding on F15h. Make initcall routine a bit more readable. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:24 +01:00
Borislav Petkov	1b07ca47ff	EDAC, MCE: Allow F15h bank 6 MCE injection F15h adds a sixth MCE bank: adjust bank number check in the injection code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:23 +01:00
Borislav Petkov	fa7ae8cc8c	EDAC, MCE: Shorten error report formatting Shorten up MCi_STATUS flags and add BD's new deferred and poison types. Also, simplify formatting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:22 +01:00
Borislav Petkov	6245288232	EDAC, MCE: Overhaul error fields extraction macros Make macro names shorter thus making code shorter and more clear. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:21 +01:00
Borislav Petkov	b8f85c477b	EDAC, MCE: Add F15h FP MCE decoder Add decoder for FP MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:20 +01:00
Borislav Petkov	8259a7e572	EDAC, MCE: Add F15 EX MCE decoder Integrate the single FIROB signature into an expanded table along with the new BD MCE types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:19 +01:00
Borislav Petkov	05cd667d66	EDAC, MCE: Add an F15h NB MCE decoder by (almost) reusing the F10h one since the signatures are the same. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:18 +01:00
Borislav Petkov	b18434cad1	EDAC, MCE: No F15h LS MCE decoder F15h BD doesn't generate LS MCEs so warn about it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:17 +01:00
Borislav Petkov	70fdb494aa	EDAC, MCE: Add F15h CU MCE decoder MCE bank 2 is redefined from a BU to a CU (Combined Unit) bank on F15h. Add a decoder function for CU MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:16 +01:00
Borislav Petkov	86039cd401	EDAC, MCE: Add F15h IC MCE decoder Add support for decoding F15h IC MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:15 +01:00
Borislav Petkov	25a4f8b059	EDAC, MCE: Add F15h DC MCE decoder Add a decoder for F15h DC MCEs to support the new types of DC MCEs introduced by the BD microarchitecture. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:14 +01:00
Borislav Petkov	2be64bfac7	EDAC, MCE: Select extended error code mask F15h enlarges the extended error code of an MCE to a 5-bit field (MCi_STATUS[20:16]). Add a mask variable which default 0xf is overridden on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:54:12 +01:00
Borislav Petkov	a135cef79a	amd64_edac: Disable DRAM ECC injection on K8 K8 does not allow for an atomic RMW to a cacheline as F10h does so disable the error injection interface for it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:46 +01:00
Borislav Petkov	390944439f	EDAC: Fixup scrubrate manipulation Make the ->{get\|set}_sdram_scrub_rate return the actual scrub rate bandwidth it succeeded setting and remove superfluous arg pointer used for that. A negative value returned still means that an error occurred while setting the scrubrate. Document this for future reference. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:38:31 +01:00
Borislav Petkov	360b7f3c60	amd64_edac: Remove two-stage initialization Now that all prerequisites are in place, drop the two-stage driver instances initialization in favor of the following simple init sequence: 1. Probe PCI device: we only test ECC capabilities here and if none exit early. 2. If the hw supports ECC and it is/can be enabled, we init the per-node instance. Remove "amd64_" prefix from static functions touched, while at it. There actually should be no visible functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:03 +01:00
Borislav Petkov	2299ef7114	amd64_edac: Check ECC capabilities initially Rework the code to check the hardware ECC capabilities at PCI probing time. We do all further initialization only if we actually can/have ECC enabled. While at it: 0. Fix function naming. 1. Simplify/clarify debug output. 2. Remove amd64_ prefix from the static functions 3. Reorganize code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:02 +01:00
Borislav Petkov	ae7bb7c679	amd64_edac: Carve out ECC-related hw settings This is in preparation for the init path reorganization where we want only to 1) test whether a particular node supports ECC 2) can it be enabled and only then do the necessary allocation/initialization. For that, we need to decouple the ECC settings of the node from the instance's descriptor. The should be no functional change introduced by this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:34:00 +01:00
Borislav Petkov	f1db274e1b	amd64_edac: Remove PCI ECS enabling functions PCI ECS is being enabled by default since 2.6.26 on AMD so this code is just superfluous now, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:59 +01:00
Borislav Petkov	027dbd6f5d	amd64_edac: Remove explicit Kconfig PCI dependency AMD_NB pulls in the dependency on PCI. Clarify/fix help text while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:58 +01:00
Borislav Petkov	cc4d8860fc	amd64_edac: Allocate driver instances dynamically Remove static allocation in favor of dynamically allocating space for as many driver instances as northbridges present on the system. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:57 +01:00
Borislav Petkov	24f9a7fe3f	amd64_edac: Rework printk macros Add a macro per printk level, shorten up error messages. Add relevant information to KERN_INFO level. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:56 +01:00
Borislav Petkov	8d5b5d9c7b	amd64_edac: Rename CPU PCI devices Rename variables representing PCI devices to their BKDG names for faster search and shorter, clearer code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:54 +01:00
Borislav Petkov	b8cfa02f83	amd64_edac: Concentrate per-family init even more Move the remaining per-family init code into the proper place and simplify the rest of the initialization. Reorganize error handling in amd64_init_one_instance(). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:53 +01:00
Borislav Petkov	bbd0c1f675	amd64_edac: Cleanup the CPU PCI device reservation Shorten code and clarify comments, return proper -E* values on error. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:52 +01:00
Borislav Petkov	0092b20d4c	amd64_edac: Simplify CPU family detection Concentrate CPU family detection in the per-family init function. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:51 +01:00
Borislav Petkov	395ae783b3	amd64_edac: Add per-family init function Run a per-family init function which does all the settings based on the family this driver instance is running on. Move the scrubrate calculation in it and simplify code. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:50 +01:00
Borislav Petkov	9f56da0e3c	amd64_edac: Use cached extended CPU model ... instead of computing it needlessly again. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:49 +01:00
Borislav Petkov	3ab0e7dc2e	amd64_edac: Remove F11h support F11h doesn't support DRAM ECC so whack it away. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-01-07 11:33:47 +01:00
Linus Torvalds	42cbd8efb0	Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code	2011-01-06 10:50:28 -08:00
David Sterba	e7bf068aa3	i7core_edac: fix typos in comments Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-12-28 01:20:51 +01:00
Jiri Kosina	4b7bd36470	Merge branch 'master' into for-next Conflicts: MAINTAINERS arch/arm/mach-omap2/pm24xx.c drivers/scsi/bfa/bfa_fcpim.c Needed to update to apply fixes for which the old branch was too outdated.	2010-12-22 18:57:02 +01:00
Borislav Petkov	e726f3c368	amd64_edac: Fix interleaving check When matching error address to the range contained by one memory node, we're in valid range when node interleaving 1. is disabled, or 2. enabled and when the address bits we interleave on match the interleave selector on this node (see the "Node Interleaving" section in the BKDG for an enlightening example). Thus, when we early-exit, we need to reverse the compound logic statement properly. Cc: <stable@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-12-08 19:52:54 +01:00
Andrei Konovalov	76f04f2591	EDAC: Correct MiB_TO_PAGES() macro This corrects the misprint introduced when moving '#if PAGE_SHIFT' from i7core_edac.c to edac_core.h (commit `e9144601d3`) Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Andrei Konovalov <akonovalov@mvista.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-12-08 19:52:53 +01:00
Borislav Petkov	bb31b3122c	EDAC: Fix workqueue-related crashes `00740c5854` changed edac_core to un-/register a workqueue item only if a lowlevel driver supplies a polling routine. Normally, when we remove a polling low-level driver, we go and cancel all the queued work. However, the workqueue unreg happens based on the ->op_state setting, and edac_mc_del_mc() sets this to OP_OFFLINE _before_ we cancel the work item, leading to NULL ptr oops on the workqueue list. Fix it by putting the unreg stuff in proper order. Cc: <stable@kernel.org> #36.x Reported-and-tested-by: Tobias Karnat <tobias.karnat@googlemail.com> LKML-Reference: <1291201307.3029.21.camel@Tobias-Karnat> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-12-08 19:52:27 +01:00
Axel Lin	df4b2a30e0	EDAC, MCE: Fix edac_init_mce_inject error handling Otherwise, variable i will be -1 inside the latest iteration of the while loop. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-22 15:35:32 +01:00
Tracey Dent	f570e1dd84	EDAC: Remove deprecated kbuild goal definitions Change EDAC's Makefile to use <modules>-y instead of <modules>-objs because -objs is deprecated and not mentioned in Documentation/kbuild/makefiles.txt. [bp: Fixup commit message] [bp: Fixup indentation] Signed-off-by: Tracey Dent <tdent48227@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-22 15:35:31 +01:00
Hans Rosenfeld	9653a5c76c	x86, amd-nb: Cleanup AMD northbridge caching code Support more than just the "Misc Control" part of the northbridges. Support more flags by turning "gart_supported" into a single bit flag that is stored in a flags member. Clean up related code by using a set of functions (amd_nb_num(), amd_nb_has_feature() and node_to_amd_nb()) instead of accessing the NB data structures directly. Reorder the initialization code and put the GART flush words caching in a separate function. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:05 +01:00
Hans Rosenfeld	eec1d4fa00	x86, amd-nb: Complete the rename of AMD NB and related code Not only the naming of the files was confusing, it was even more so for the function and variable names. Renamed the K8 NB and NUMA stuff that is also used on other AMD platforms. This also renames the CONFIG_K8_NUMA option to CONFIG_AMD_NUMA and the related file k8topology_64.c to amdtopology_64.c. No functional changes intended. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-11-18 15:53:04 +01:00
Uwe Kleine-König	b595076a18	tree-wide: fix comment/printk typos "gadget", "through", "command", "maintain", "maintain", "controller", "address", "between", "initiali[zs]e", "instead", "function", "select", "already", "equal", "access", "management", "hierarchy", "registration", "interest", "relative", "memory", "offset", "already", Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-01 15:38:34 -04:00
Linus Torvalds	da62aa69c1	Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (34 commits) i7core_edac: return -ENODEV when devices were already probed i7core_edac: properly terminate pci_dev_table i7core_edac: Avoid PCI refcount to reach zero on successive load/reload i7core_edac: Fix refcount error at PCI devices i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL i7core_edac: Fix an oops at i7core probe i7core_edac: Remove unused member channels in i7core_pvt i7core_edac: Remove unused arg csrow from get_dimm_config i7core_edac: Reduce args of i7core_register_mci i7core_edac: Introduce i7core_unregister_mci i7core_edac: Use saved pointers i7core_edac: Check probe counter in i7core_remove i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed i7core_edac: Fix error path of i7core_register_mci i7core_edac: Fix order of lines in i7core_register_mci i7core_edac: Always do get/put for all devices i7core_edac: Introduce i7core_pci_ctl_create/release i7core_edac: Introduce free_i7core_dev i7core_edac: Introduce alloc_i7core_dev i7core_edac: Reduce args of i7core_get_onedevice ...	2010-10-26 10:13:48 -07:00
Linus Torvalds	229aebb873	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing\|ed] => interest[ing\|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c	2010-10-24 13:41:39 -07:00
Linus Torvalds	8de547e182	Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac * 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits) i7300_edac: Properly initialize per-csrow memory size V4L/DVB: i7300_edac: better initialize page counts MAINTAINERS: Add maintainer for i7300-edac driver i7300-edac: CodingStyle cleanup i7300_edac: Improve comments i7300_edac: Cleanup: reorganize the file contents i7300_edac: Properly detect channel on CE errors i7300_edac: enrich FBD error info for corrected errors i7300_edac: enrich FBD error info for fatal errors i7300_edac: pre-allocate a buffer used to prepare err messages i7300_edac: Fix MTR x4/x8 detection logic i7300_edac: Make the debug messages coherent with the others i7300_edac: Cleanup: remove get_error_info logic i7300_edac: Add a code to cleanup error registers i7300_edac: Add support for reporting FBD errors i7300_edac: Properly detect the type of error correction i7300_edac: Detect if the device is on single mode i7300_edac: Adds detection for enhanced scrub mode on x8 i7300_edac: Clear the error bit after reading i7300_edac: Add error detection code for global errors ...	2010-10-24 13:06:57 -07:00
Mauro Carvalho Chehab	76a7bd8113	i7core_edac: return -ENODEV when devices were already probed Due to the nature of i7core, we need to probe and attach all PCI devices used by this driver during the first time probe is called. However, PCI core will call the probe routine one time for each CPU socket. If we return -EINVAL to those calls, it would seem that the driver fails, when, in fact, there's no more devices left to initialize. Changing the return code to -ENODEV solves this issue. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:36:19 -02:00
Mauro Carvalho Chehab	3c52cc57cc	i7core_edac: properly terminate pci_dev_table At pci_xeon_fixup(), it waits for a null-terminated table, while at i7core_get_all_devices, it just do a for 0..ARRAY_SIZE. As other tables are zero-terminated, change it to be terminate with 0 as well, and fixes a bug where it may be running out of the table elements. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:31:50 -02:00
Mauro Carvalho Chehab	a3e1541637	i7core_edac: Avoid PCI refcount to reach zero on successive load/reload That's a nasty bug that took me a lot of time to track, and whose solution took just one line to solve. The best fragrances and the worse poisons are shipped on the smalest bottles. The drivers/pci/quick.c implements the pci_get_device function. The normal behavior is that you call it, the function returns you a pdev pointer and increment pdev->kobj.kref.refcount of the pci device. However, if you want to keep searching an object, you need to pass the previous pdev function to the search. When you use a not null pointer to pdev "from" field, pci_get_device will decrement pdev->kobj.kref.refcount, assuming that the driver won't be using the previous pdev. The solution is simple: we just need to call pci_dev_get() manually, for the pdev's that the driver will actually use. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab	79daef2099	i7core_edac: Fix refcount error at PCI devices Probably due to a bug or some testing logic at PCI level, device refcount for <bus>:00.0 device is decremented at the end of the pci_get_device, made by i7core_get_all_devices(). The fact is that the first versions of the driver relied on those devices to probe for Nehalem, but the current versions don't use it at all. So, let's just remove those devices from the driver, making it simpler and fixing the bug. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab	88ef5ea976	i7core_edac: it is safe to i7core_unregister_mci() when mci=NULL i7core_unregister_mci() checks internally when mci=NULL. There's no need to test it outside. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Mauro Carvalho Chehab	6d37d240f2	i7core_edac: Fix an oops at i7core probe changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init of the priv pointer to the end of the probe routine. However, we need them before that, otherwise, we hit an OOPS: [ 67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0 [ 67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 67.759685] IP: [<ffffffffa017e484>] i7core_probe+0x979/0x130c [i7core_edac] [ 67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0 [ 67.771178] Oops: 0000 [#1] SMP [ 67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map [ 67.782213] CPU 1 [ 67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Hidetoshi Seto	21b6806a8c	i7core_edac: Remove unused member channels in i7core_pvt Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Hidetoshi Seto	2e5185f7ff	i7core_edac: Remove unused arg csrow from get_dimm_config A local is enough. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:41 -02:00
Hidetoshi Seto	aace42831a	i7core_edac: Reduce args of i7core_register_mci We can check the number of channels in i7core_register_mci. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	1c6edbbe25	i7core_edac: Introduce i7core_unregister_mci In i7core_probe, when setup of mci for 2nd or later socket failed, we should cleanup prepared mci for 1st socket or so before "put" of all devices. So let have i7core_unregister_mci that can be shared between here and i7core_remove. While here fix a typo "hanler". Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	73589c80cd	i7core_edac: Use saved pointers We already have saved pointers. Use shorter ones. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	71fe01706d	i7core_edac: Check probe counter in i7core_remove Prevent i7core_remove from running multiple times. Otherwise value proved will be negative and something will be wrong. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	2896637b86	i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failed Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	628c5ddfb0	i7core_edac: Fix error path of i7core_register_mci Release resources properly. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:40 -02:00
Hidetoshi Seto	5939813b9c	i7core_edac: Fix order of lines in i7core_register_mci The flag is_registered is not initialized until mci_bind_devs() is called. Refer it properly. The mci->dev and mci->edac_check is required in edac_mc_add_mc(), so prepare them just before the call. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:39 -02:00
Hidetoshi Seto	64c10f6e0e	i7core_edac: Always do get/put for all devices We already do 'get' for all sockets at once. So do 'put' in the same way. And let args of the 'get' function to void since it handles only the single, static and known size table pci_dev_table[]. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:39 -02:00
Hidetoshi Seto	a3aa0a4ab5	i7core_edac: Introduce i7core_pci_ctl_create/release Have a couple of method. while here sort out lines in the i7core_register_mci() a bit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:39 -02:00
Hidetoshi Seto	2aa9be448d	i7core_edac: Introduce free_i7core_dev Have a method to make a couple with alloc_i7core_dev() previously introduced. Using in pair will help proper resource handling. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:39 -02:00
Hidetoshi Seto	848b2f7ed6	i7core_edac: Introduce alloc_i7core_dev It's nice to have a method for a single purpose. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Hidetoshi Seto	b197cba071	i7core_edac: Reduce args of i7core_get_onedevice Since we need to pass the index of the entry, pass the table itself instead of passing individual members of the table. While here make it static. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Hidetoshi Seto	45b7c981ae	i7core_edac: Fix the logic in i7core_remove() commit 47251b4d960bdfa648b0d06dbc6d445f41cb3906 have changed the logic for unexplained reasons. It looks strange that it can release i7core_dev without calling i7core_put_devices() that releases i7core_dev->pdev. Fix the part. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab	54a08ab153	i7core_edac: Don't do the legacy PCI probe by default The legacy PCI probe sometimes cause hangs. Better to have it disabled by default, and have a parameter to enable it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab	accf74fff3	i7core_edac: don't use a freed mci struct This is a nasty bug. Since kobject count will be reduced by zero by edac_mc_del_mc(), and this triggers the kobj release method, the mci memory will be freed automatically. So, all we have left is ctl_name, as shown by enabling debug: [ 80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link [ 80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance [ 80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing [ 80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0 [ 80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs [ 80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free() [ 80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj() [ 80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices() Also, kfree(mci) shouldn't happen at the kobj.release, as it happens when edac_remove_sysfs_mci_device() is called, but the logic is: edac_remove_sysfs_mci_device(mci); edac_printk(KERN_INFO, EDAC_MC, "Removed device %d for %s %s: DEV %s\n", mci->mc_idx, mci->mod_name, mci->ctl_name, edac_dev_name(mci)); So, as the edac_printk() needs the mci struct, this generates an OOPS. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab	bbc560ae67	edac_core: Print debug messages at release calls This is important to track a nasty bug at the free logic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:38 -02:00
Mauro Carvalho Chehab	ac99768c53	edac_core: Don't let free(mci) happen while using it A very nasty bug were happening on edac core, due to the way mci objects are freed. mci memory is freed when kobject count reaches zero, by edac_mci_control_release(). However, from the logs, this is clearly happening before the final usage of mci struct: [15799.607454] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing [15799.618773] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 769: edac_inst_grp_release() [15799.627326] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 894: edac_remove_mci_instance_attributes() end of seeking for group all_channel_counts [15799.640887] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 877: edac_remove_mci_instance_attributes() sysfs_attrib = ffffffffa01d7240 [15799.653412] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link [15799.663753] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	6fe1108f14	edac_core: Do a better job with node removal Make sure we remove groups at the right order Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	39300e7143	i7core_edac: explicitly remove PCI devices from the devices list Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	41ba6c1058	i7core_edac: MCE NMI handling should stop first Otherwise, a NMI may happen causing a race condition and a panic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	6ee7dd5044	i7core_edac: Initialize all priv vars before start polling Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	3cfd01468b	i7core_edac: Improve debug to seek for register/remove errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:37 -02:00
Mauro Carvalho Chehab	e9144601d3	i7core_edac: move #if PAGE_SHIFT to edac_core.h Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:36 -02:00
Mauro Carvalho Chehab	1288c18f48	i7core_edac: Properly mark const static vars as such There are two groups of sysfs attributes: one for rdimm and another for udimm. Instead of changing dynamically the unique static struct for handling udimm's, declare two vars and make them constant. This avoids the risk of having two or more memory controllers, each needing a different set of attributes. While here, use const on all places where it is applicable. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> edac_core: use const for constant sysfs arguments Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:14 -02:00
Mauro Carvalho Chehab	18c29002f9	i7core_edac: move static vars to the beginning of the file While here, don't initialize probed with 0. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:12 -02:00
Mauro Carvalho Chehab	939747bd68	i7core_edac: Be sure that the edac pci handler will be properly released With multi-sockets, more than one edac pci handler is enabled. Be sure to un-register all instances. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-10-24 11:20:12 -02:00
Linus Torvalds	c029e405bd	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (21 commits) EDAC, MCE: Fix shift warning on 32-bit EDAC, MCE: Add a BIT_64() macro EDAC, MCE: Enable MCE decoding on F12h EDAC, MCE: Add F12h NB MCE decoder EDAC, MCE: Add F12h IC MCE decoder EDAC, MCE: Add F12h DC MCE decoder EDAC, MCE: Add support for F11h MCEs EDAC, MCE: Enable MCE decoding on F14h EDAC, MCE: Fix FR MCEs decoding EDAC, MCE: Complete NB MCE decoders EDAC, MCE: Warn about LS MCEs on F14h EDAC, MCE: Adjust IC decoders to F14h EDAC, MCE: Adjust DC decoders to F14h EDAC, MCE: Rename files EDAC, MCE: Rework MCE injection EDAC: Export edac sysfs class to users. EDAC, MCE: Pass complete MCE info to decoders EDAC, MCE: Sanitize error codes EDAC, MCE: Remove unused function parameter EDAC, MCE: Add HW_ERR prefix ...	2010-10-21 14:04:58 -07:00
Linus Torvalds	2f0384e5fc	Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd_nb: Enable GART support for AMD family 0x15 CPUs x86, amd: Use compute unit information to determine thread siblings x86, amd: Extract compute unit information for AMD CPUs x86, amd: Add support for CPUID topology extension of AMD CPUs x86, nmi: Support NMI watchdog on newer AMD CPU families x86, mtrr: Assume SYS_CFG[Tom2ForceMemTypeWB] exists on all future AMD CPUs x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB x86, k8-gart: Decouple handling of garts and northbridges x86, cacheinfo: Fix dependency of AMD L3 CID x86, kvm: add new AMD SVM feature bits x86, cpu: Fix allowed CPUID bits for KVM guests x86, cpu: Update AMD CPUID feature bits x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bit x86, AMD: Remove needless CPU family check (for L3 cache info) x86, tsc: Remove CPU frequency calibration on AMD	2010-10-21 13:01:08 -07:00
Borislav Petkov	525906bc89	EDAC, MCE: Fix shift warning on 32-bit Fix drivers/edac/mce_amd.c:262: warning: left shift count >= width of type on 32-bit builds. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:07 +02:00
Borislav Petkov	cf1d2200db	EDAC, MCE: Add a BIT_64() macro Add a macro for 64-bit vectors to use when accessing MSR contents. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:06 +02:00
Borislav Petkov	fda7561f43	EDAC, MCE: Enable MCE decoding on F12h Turn on MCE decoding on F12h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:06 +02:00
Borislav Petkov	cb9d5ecdff	EDAC, MCE: Add F12h NB MCE decoder F12h is completely covered by the generic path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:05 +02:00
Borislav Petkov	e7281eb37d	EDAC, MCE: Add F12h IC MCE decoder ... which is the same as for K8 and F10h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:05 +02:00
Borislav Petkov	9be0bb1072	EDAC, MCE: Add F12h DC MCE decoder F12h DC MCE signatures are a subset of F10h's so reuse them. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:04 +02:00
Borislav Petkov	f0157b3afd	EDAC, MCE: Add support for F11h MCEs F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5 bank errors. Reuse functionality from the other families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:04 +02:00
Borislav Petkov	9530d608ef	EDAC, MCE: Enable MCE decoding on F14h Now that all decoders have been taught about F14h, models < 0x10 MCEs, enable decoding on this family of CPUs. Also, issue a short informational message upon boot that MCE decoding gets enabled. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:03 +02:00
Borislav Petkov	fe4ea2623b	EDAC, MCE: Fix FR MCEs decoding Those are N/A on K8, so don't decode them there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:03 +02:00
Borislav Petkov	5ce88f6ea6	EDAC, MCE: Complete NB MCE decoders Add support for decoding F14h BU MCEs and improve decoding of the remaining families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:02 +02:00
Borislav Petkov	ded5062328	EDAC, MCE: Warn about LS MCEs on F14h F14h CPUs do not generate LS MCEs so exit early and warn the user in case this path is ever hit that something else might be going haywire. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:02 +02:00
Borislav Petkov	dd53bce4e8	EDAC, MCE: Adjust IC decoders to F14h Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical so use one function for both. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:01 +02:00
Borislav Petkov	888ab8e6eb	EDAC, MCE: Adjust DC decoders to F14h Add a per-family data cache decoders. Since there is a certain overlap between the different DC MCE signatures, reuse functionality between the families as far as possible. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:00 +02:00
Borislav Petkov	47ca08a40b	EDAC, MCE: Rename files Drop "edac_" string from the filenames since they're prefixed with edac/ in their pathname anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:48:00 +02:00
Borislav Petkov	9cdeb404a1	EDAC, MCE: Rework MCE injection Add sysfs injection facilities for testing of the MCE decoding code. Remove large parts of amd64_edac_dbg.c, as a result, which did only NB MCE injection anyway and the new injection code supports that functionality already. Add an injection module so that MCE decoding code in production kernels like those in RHEL and SLES can be tested. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:59 +02:00
Borislav Petkov	30e1f7a812	EDAC: Export edac sysfs class to users. Move toplevel sysfs class to the stub and make it available to non-modularized code too. Add proper refcounting of its users and move the registration functionality into the reference counting routines. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:59 +02:00
Borislav Petkov	7cfd4a8744	EDAC, MCE: Pass complete MCE info to decoders ... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:58 +02:00
Borislav Petkov	6337583d7d	EDAC, MCE: Sanitize error codes Clean up error codes names, shorten to mnemonics, add RRRR boundary checking. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:58 +02:00
Borislav Petkov	0ee8efa8f4	EDAC, MCE: Remove unused function parameter Remove remains from previous functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:57 +02:00
Borislav Petkov	c9f281fd96	EDAC, MCE: Add HW_ERR prefix .. so that the user knows what she's looking at there in dmesg. Also, fix a minor cosmetic output inconsistency. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:57 +02:00
Borislav Petkov	ca755e0a49	EDAC: Fix error return We should return a negative value when we cannot get the toplevel edac sysfs class. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-10-21 14:47:56 +02:00
Justin P. Mattock	631dd1a885	Update broken web addresses in the kernel. The patch below updates broken web addresses in the kernel Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mike Frysinger <vapier.adi@gmail.com> Acked-by: Ben Pfaff <blp@cs.stanford.edu> Acked-by: Hans J. Koch <hjk@linutronix.de> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-10-18 11:03:14 +02:00
Marcin Slusarz	64aab720bd	i7core_edac: fix panic in udimm sysfs attributes registration Array of udimm sysfs attributes was not ended with NULL marker, leading to dereference of random memory. EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2 BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4 IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name RIP: 0010:[<ffffffff81330b36>] [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 (...) Call Trace: [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1 [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2 [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557 [<ffffffff81428901>] i7core_probe+0xccf/0xec0 RIP [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 ---[ end trace 20de320855b81d78 ]--- Kernel panic - not syncing: Attempted to kill init! Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-01 10:50:58 -07:00
Borislav Petkov	00740c5854	amd64_edac: Fix driver module removal `f4347553b3` removed the edac polling mechanism in favor of using a notifier chain for conveying MCE information to edac. However, the module removal path didn't test whether the driver had setup the polling function workqueue at all and the rmmod process was hanging in the kernel at try_to_del_timer_sync() in the cancel_delayed_work() path, trying to cancel an uninitialized work struct. Fix that by adding a balancing check to the workqueue removal path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-09-27 12:52:58 +02:00
Mauro Carvalho Chehab	e6649cc629	i7300_edac: Properly initialize per-csrow memory size Due to the current edac-core limits, we cannot represent a per-channel memory size, for FB-DIMM drivers. So, we need to sum-up all values for each slot, in order to properly represent the total amount of memory found by the i7300 driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-09-24 14:16:12 -03:00
Mauro Carvalho Chehab	1aa4a7b6b0	V4L/DVB: i7300_edac: better initialize page counts It is still somewhat fake, as the pages may not be on this exact order, and may even be used in mirror mode, but this is a best guess than the other random fake values. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-09-24 14:16:12 -03:00
Andreas Herrmann	23ac4ae827	x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-20 14:22:58 -07:00
Andreas Herrmann	900f9ac9f1	x86, k8-gart: Decouple handling of garts and northbridges So far we only provide num_k8_northbridges. This is required in different areas (e.g. L3 cache index disable, GART). But not all AMD CPUs provide a GART. Thus it is useful to split off the GART handling from the generic caching of AMD northbridge misc devices. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160254.GC4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-17 13:26:21 -07:00
Mauro Carvalho Chehab	9c6f6b65d2	i7300-edac: CodingStyle cleanup Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:06 -03:00
Mauro Carvalho Chehab	d091a6eb17	i7300_edac: Improve comments This is basically a cleanup patch, improving the comments for each function. While here, do a few cleanups. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:05 -03:00
Mauro Carvalho Chehab	b4552aceb3	i7300_edac: Cleanup: reorganize the file contents This change should do no functional change. It just rearranges the contents of the c file, in order to make easier to understand and maintain it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:04 -03:00
Mauro Carvalho Chehab	37b69cf91c	i7300_edac: Properly detect channel on CE errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:03 -03:00
Mauro Carvalho Chehab	32f9472613	i7300_edac: enrich FBD error info for corrected errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:02 -03:00
Mauro Carvalho Chehab	8199d8cc65	i7300_edac: enrich FBD error info for fatal errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:57:01 -03:00
Mauro Carvalho Chehab	85580ea4f7	i7300_edac: pre-allocate a buffer used to prepare err messages Instead of dynamically allocating a buffer for it where needed, just allocate it once. As we'll use the same buffer also during fatal and non-fatal errors, is is very risky to dynamically allocate it during an error. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:59 -03:00
Mauro Carvalho Chehab	28c2ce7c8b	i7300_edac: Fix MTR x4/x8 detection logic Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:58 -03:00
Mauro Carvalho Chehab	3b330f6758	i7300_edac: Make the debug messages coherent with the others Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:57 -03:00
Mauro Carvalho Chehab	f427742248	i7300_edac: Cleanup: remove get_error_info logic As the error logic in this driver came from i5400 driver, it were using one function to get errors, and another to display. Let's make it simpler and avoid doing it into two steps. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:56 -03:00
Mauro Carvalho Chehab	e432760509	i7300_edac: Add a code to cleanup error registers Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:55 -03:00
Mauro Carvalho Chehab	57021918aa	i7300_edac: Add support for reporting FBD errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:54 -03:00
Mauro Carvalho Chehab	15154c57c6	i7300_edac: Properly detect the type of error correction Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:52 -03:00
Mauro Carvalho Chehab	bb81a21637	i7300_edac: Detect if the device is on single mode Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:51 -03:00
Mauro Carvalho Chehab	d7de2bdb0e	i7300_edac: Adds detection for enhanced scrub mode on x8 While here, do some cleanup by adding some macros to check for device features. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:50 -03:00
Mauro Carvalho Chehab	86002324cf	i7300_edac: Clear the error bit after reading Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:49 -03:00
Mauro Carvalho Chehab	5de6e07ed7	i7300_edac: Add error detection code for global errors There's no mention at the datasheet about how to enable global error reporting. So, I'm assuming that those errors are always enabled. Maybe I'm plain wrong about that ;) Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:48 -03:00
Mauro Carvalho Chehab	3e57eef64c	i7300_edac: Better name PCI devices Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:47 -03:00
Mauro Carvalho Chehab	116389ed21	i7300_edac: Add a FIXME note about the error correction type Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:45 -03:00
Mauro Carvalho Chehab	c3af2eaf7a	i7300_edac: add global error registers Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:44 -03:00
Mauro Carvalho Chehab	af3d8831e7	i7300_edac: display info if ECC is enabled or not Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:43 -03:00
Mauro Carvalho Chehab	fcaf780b2a	i7300_edac: start a driver for i7300 chipset (Clarksboro) Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-08-30 14:56:42 -03:00
Borislav Petkov	37b7370a8d	amd64_edac: Do not report error overflow as a separate error When the Overflow MCi_STATUS bit is set, EDAC reports the lost error with a "no information available" message which often puzzles users parsing the dmesg. This doesn't make much sense since this error has been lost anyway so no need for reporting it separately. Thus, report the overflow bit setting in the MCE dump instead. While at it, remove reporting of MiscV and ErrorEnable (en) which are superfluous. Now it looks like this: [ 1501.650024] MC4_STATUS: Corrected error, other errors lost: yes, CPU context corrupt: no, CECC Error [ 1501.666887] Northbridge Error, node 2 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-26 12:46:03 +02:00
Borislav Petkov	e045c29126	MCE, AMD: Limit MCE decoding to current families for now Limit MCE error decoding to current and older families only (K8-F11h). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-24 18:06:54 +02:00
Linus Torvalds	58d4ea65b9	Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6 * 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: mmc_spi: Fix unterminated of_match_table of/sparc: fix build regression from of_device changes of/device: Replace struct of_device with struct platform_device	2010-08-12 09:11:31 -07:00
Anton Vorontsov	cd1542c819	edac: mpc85xx: add support for new MPCxxx/Pxxxx EDAC controllers Simply add proper IDs into the device table. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Kulikov Vasiliy	b425d5c82d	edac: i5400: improve handling of pci_enable_device() return value -EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Kulikov Vasiliy	44aa80f005	edac: i5000: improve handling of pci_enable_device() return value -EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Christoph Egger	bd1688dcdf	edac: add wissing pieces from MPC85xx -> FSL_SOC_BOOKE In `5753c082f6` ("powerpc/85xx: Kconfig cleanup") menuconfig MPC85xx was replaced by FSL_SOC_BOOKE but some references insider the code were not adjusted accordingly. This patch adresses these missing pieces. Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Grant Likely	2dc1158137	of/device: Replace struct of_device with struct platform_device of_device is just an alias for platform_device, so remove it entirely. Also replace to_of_device() with to_platform_device() and update comment blocks. This patch was initially generated from the following semantic patch, and then edited by hand to pick up the bits that coccinelle didn't catch. @@ @@ -struct of_device +struct platform_device Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Reviewed-by: David S. Miller <davem@davemloft.net>	2010-08-06 09:25:50 -06:00
Borislav Petkov	c4799c7570	amd64_edac: Minor formatting fix EDAC MC3: CE page 0xc32281, offset 0x8a0, grain 0, syndrome 0x1, row 2, channel 1, label "": amd64_edac EDAC MC3: CE - no information available: amd64_edacError Overflow Add the missing space before "Error Overflow" on the second line. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:16:01 +02:00
Borislav Petkov	962b70a1eb	amd64_edac: Fix operator precendence error The bitwise AND is of higher precedence, make that explicit. Cc: <stable@kernel.org> # 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-04 11:15:09 +02:00
Borislav Petkov	eba042a81e	edac, mc: Improve scrub rate handling Fortify the interface to not accept negative values, remove memctrl_int_store() as a result. Also, sanitize bandwidth setting by making the argument a simple u32 instead of strange u32 pointer being passed around for no obvious reason. Then, fix error handling and teach it to return proper error values. Finally, make code more readable, simplify debug messages. Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Arthur Jones <ajones@riverbed.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:06 +02:00
Borislav Petkov	bc57117856	amd64_edac: Correct scrub rate setting Exit early when setting scrub rate on unknown/unsupported families. Cc: <stable@kernel.org> # 32.x 33.x 34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:05 +02:00
Borislav Petkov	9975a5f22a	amd64_edac: Fix DCT base address selector The correct check is to verify whether in high range we're below 4GB and not to extract the DctSelBaseAddr again. See "2.8.5 Routing DRAM Requests" in the F10h BKDG. Cc: <stable@kernel.org> # .32.x .33.x .34.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:04 +02:00
Borislav Petkov	f4347553b3	amd64_edac: Remove polling mechanism Switch to reusing the mcheck core's machine check polling mechanism instead of duplicating functionality by using the EDAC polling routine. Correct formatting while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:03 +02:00
Borislav Petkov	695426506e	amd64_edac: Remove unneeded defines All F2x110-related bit defines are used at only one place so replace them with simple BIT() macros. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:01 +02:00
Borislav Petkov	935ab88e34	edac: Remove EDAC_DEBUG_VERBOSE This option differs from EDAC_DEBUG only by printing the file and line of where the debug statement is placed, which contains unneeded information. So remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-08-03 16:14:00 +02:00
Borislav Petkov	ad6a32e969	amd64_edac: Sanitize syndrome extraction Remove the two syndrome extraction macros and add a single function which does the same thing but with proper typechecking. While at it, make sure to cache ECC syndrome size and dump it in debug output. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-08-03 16:13:31 +02:00
Anton Vorontsov	952e1c6632	edac: mpc85xx: fix coldplug/hotplug module autoloading The MPC85xx EDAC driver is missing module device aliases, so the driver won't load automatically on boot. This patch fixes the issue by adding proper MODULE_DEVICE_TABLE() macros. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-27 14:32:06 -07:00
Daniel J Blueman	ab08937400	quiesce EDAC initialisation on desktop/mobile i7 Don't print failure to detect Core i7 EDAC facilities to the console at boot time, most often occurring on Core i7 desktops and laptops. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-26 08:17:44 -07:00
Anton Vorontsov	5528e229f0	edac: mpc85xx: add support for MPC8569 EDAC controllers Simply add a proper ID into the device table. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-20 16:25:40 -07:00
Anton Vorontsov	1cd8521e7d	edac: mpc85xx: fix MPC85xx dependency Since commit `5753c082f6` ("powerpc/85xx: Kconfig cleanup"), there is no MPC85xx Kconfig symbol anymore, so the driver became non-selectable. This patch fixes the issue by switching to PPC_85xx symbol. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-20 16:25:40 -07:00
Linus Torvalds	62fd985717	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: MAINTAINERS: Add an entry for i7core_edac i7core_edac: Avoid doing multiple probes for the same card i7core_edac: Properly discover the first QPI device	2010-07-04 20:12:06 -07:00
Mauro Carvalho Chehab	2d95d8158b	i7core_edac: Avoid doing multiple probes for the same card As Nehalem/Nehalem-EP/Westmere devices uses several devices for the same functionality (memory controller), the default way of proping devices doesn't work. So, instead of a per-device probe, all devices should be probed at once. This means that we should block any new attempt of probe, otherwise, it will try to register the same device several times. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-07-02 18:04:29 -03:00
Mauro Carvalho Chehab	bda142890e	i7core_edac: Properly discover the first QPI device On Nehalem/Nehalem-EP/Westmere, the first QPI device is the last PCI bus. The last bus is generally at 0x3f or 0xff, but there are also other systems using different setups. For example, HP Z800 has 0x7f as the last bus. This patch adds a logic to discover the last bus, dynamically detecting it at runtime. Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-07-02 18:04:05 -03:00
Borislav Petkov	41c310447f	amd64_edac: Fix syndrome calculation on K8 When calculating the DCT channel from the syndrome we need to know the syndrome type (x4 vs x8). On F10h, this is read out from extended PCI cfg space register F3x180 while on K8 we only support x4 syndromes and don't have extended PCI config space anyway. Make the code accessing F3x180 F10h only and fall back to x4 syndromes on everything else. Cc: <stable@kernel.org> # .33.x .34.x Reported-by: Jeffrey Merkey <jeffmerkey@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-07-02 17:32:34 +02:00
Linus Torvalds	9a9620db07	Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits) i7core_edac: Better describe the supported devices Add support for Westmere to i7core_edac driver i7core_edac: don't free on success i7core_edac: Add support for X5670 Always call i7core_[ur]dimm_check_mc_ecc_err i7core_edac: fix memory leak of i7core_dev EDAC: add __init to i7core_xeon_pci_fixup i7core_edac: Fix wrong device id for channel 1 devices i7core: add support for Lynnfield alternate address i7core_edac: Add initial support for Lynnfield i7core_edac: do not export static functions edac: fix i7core build edac: i7core_edac produces undefined behaviour on 32bit i7core_edac: Use a more generic approach for probing PCI devices i7core_edac: PCI device is called NONCORE, instead of NOCORE i7core_edac: Fix ringbuffer maxsize i7core_edac: First store, then increment i7core_edac: Better parse "any" addrmask i7core_edac: Use a lockless ringbuffer edac: Create an unique instance for each kobj ...	2010-06-04 15:39:54 -07:00
Anatolij Gustschin	a26f95fed3	of/edac: fix build breakage in drivers Fixes build errors in EDAC drivers caused by the OF device_node pointer being moved into struct device Signed-off-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2010-06-02 21:02:41 -06:00
Joe Perches	63ae96be98	drivers/edac: convert logging messages direct uses of __FILE__ to %s, __FILE Reduces text by eliminating multiple __FILE__ uses. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Joe Perches <joe@perches.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Tim Small <tim@buttersideup.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:52 -07:00
Grant Likely	cf9b59e9d3	Merge remote branch 'origin' into secretlab/next-devicetree Merging in current state of Linus' tree to deal with merge conflicts and build failures in vio.c after merge. Conflicts: drivers/i2c/busses/i2c-cpm.c drivers/i2c/busses/i2c-mpc.c drivers/net/gianfar.c Also fixed up one line in arch/powerpc/kernel/vio.c to use the correct node pointer. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2010-05-22 00:36:56 -06:00
Grant Likely	4018294b53	of: Remove duplicate fields from of_platform_driver .name, .match_table and .owner are duplicated in both of_platform_driver and device_driver. This patch is a removes the extra copies from struct of_platform_driver and converts all users to the device_driver members. This patch is a pretty mechanical change. The usage model doesn't change and if any drivers have been missed, or if anything has been fixed up incorrectly, then it will fail with a compile time error, and the fixup will be trivial. This patch looks big and scary because it touches so many files, but it should be pretty safe. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sean MacLennan <smaclennan@pikatech.com>	2010-05-22 00:10:40 -06:00
Mauro Carvalho Chehab	52707f918c	i7core_edac: Better describe the supported devices Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 20:43:52 -03:00
Vernon Mauery	bd9e19ca46	Add support for Westmere to i7core_edac driver This adds new PCI IDs for the Westmere's memory controller devices and modifies the i7core_edac driver to be able to probe both Nehalem and Westmere processors. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 20:23:56 -03:00
Roman Fietze	ee6583f6e8	PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments This fixes all occurrences of pci_enable_device and pci_disable_device in all comments. There are no code changes involved. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-05-18 14:59:08 -07:00
Tony Luck	d4d1ef4515	i7core_edac: don't free on success Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 14:47:31 -03:00
Mauro Carvalho Chehab	ac1ececea9	i7core_edac: Add support for X5670 As reported by Vernon Mauery <vernux@us.ibm.com>, X5670 (Westmere-EP) uses a different register for one of the uncore PCI devices. Add support for it. Those are the PCI ID's on this new chipset: fe:00.0 0600: 8086:2c70 (rev 02) fe:00.1 0600: 8086:2d81 (rev 02) fe:02.0 0600: 8086:2d90 (rev 02) fe:02.1 0600: 8086:2d91 (rev 02) fe:02.2 0600: 8086:2d92 (rev 02) fe:02.3 0600: 8086:2d93 (rev 02) fe:02.4 0600: 8086:2d94 (rev 02) fe:02.5 0600: 8086:2d95 (rev 02) fe:03.0 0600: 8086:2d98 (rev 02) fe:03.1 0600: 8086:2d99 (rev 02) fe:03.2 0600: 8086:2d9a (rev 02) fe:03.4 0600: 8086:2d9c (rev 02) fe:04.0 0600: 8086:2da0 (rev 02) fe:04.1 0600: 8086:2da1 (rev 02) fe:04.2 0600: 8086:2da2 (rev 02) fe:04.3 0600: 8086:2da3 (rev 02) fe:05.0 0600: 8086:2da8 (rev 02) fe:05.1 0600: 8086:2da9 (rev 02) fe:05.2 0600: 8086:2daa (rev 02) fe:05.3 0600: 8086:2dab (rev 02) fe:06.0 0600: 8086:2db0 (rev 02) fe:06.1 0600: 8086:2db1 (rev 02) fe:06.2 0600: 8086:2db2 (rev 02) fe:06.3 0600: 8086:2db3 (rev 02) (as usual, the same PCI devices repeat at ff: bus) The PCI device 8086:2c70 is shown as: fe:00.0 Host bridge: Intel Corporation QuickPath Architecture Generic Non-core Registers (rev 02) So, for this device to be recognized, it is only a matter of adding this new PCI ID to the driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 13:15:42 -03:00
Vernon Mauery	8a311e179e	Always call i7core_[ur]dimm_check_mc_ecc_err This fixes an error in function i7core_check_error In commit `ca9c90ba09` which converts the driver to use double buffering, there is a change in the logic. Before, if mce_count was zero, it skipped over a couple of statements and finished out with a call to the check_mc_ecc_err function. The current code checks to see if mce_count is 0 and then exits. This change reverts the behavior back to the original where if there are no errors to report, we skip to the end and call the check_mc_ecc_err function. This fix allows the driver to work again on my Nehalem based blades again. Signed-off-by: Vernon Mauery <vernux@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 12:43:23 -03:00
Alexander Beregalov	2a6fae3267	i7core_edac: fix memory leak of i7core_dev Free already allocated i7core_dev. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 11:45:20 -03:00
Jiri Slaby	71753e0141	EDAC: add __init to i7core_xeon_pci_fixup It's called only from an __init function and is the only user of pcibios_scan_specific_bus which will be marked as __devinit in the next patch. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-18 11:45:19 -03:00
Mauro Carvalho Chehab	508fa179f8	i7core_edac: Fix wrong device id for channel 1 devices Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 12:18:31 -03:00
Mauro Carvalho Chehab	f05da2f785	i7core: add support for Lynnfield alternate address Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 12:18:29 -03:00
Mauro Carvalho Chehab	52a2e4fc37	i7core_edac: Add initial support for Lynnfield Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 12:18:28 -03:00
Randy Dunlap	3b918c12df	edac: fix i7core build Fix build warning (missing header file) and build error when CONFIG_SMP=n. drivers/edac/i7core_edac.c:860: error: implicit declaration of function 'msleep' drivers/edac/i7core_edac.c:1700: error: 'struct cpuinfo_x86' has no member named 'phys_proc_id' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:32 -03:00
Alan Cox	486dd09f12	edac: i7core_edac produces undefined behaviour on 32bit Fix the shifts up Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:32 -03:00
Mauro Carvalho Chehab	de06eeef58	i7core_edac: Use a more generic approach for probing PCI devices Currently, only one PCI set of tables is allowed. This prevents using the driver for other devices like Lynnfield, with have a different set of PCI ID's. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:31 -03:00
Mauro Carvalho Chehab	fd3826549d	i7core_edac: PCI device is called NONCORE, instead of NOCORE Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:31 -03:00
Mauro Carvalho Chehab	321ece4dda	i7core_edac: Fix ringbuffer maxsize Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:31 -03:00
Mauro Carvalho Chehab	6e103be1c7	i7core_edac: First store, then increment Fix ringbuffer store logic. While here, add a few comments to the code and remove the undesired printk that could otherwise be called during NMI time. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:31 -03:00
Mauro Carvalho Chehab	4f87fad1d3	i7core_edac: Better parse "any" addrmask Instead of accepting just "any", accept also "any\n" Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:30 -03:00
Mauro Carvalho Chehab	ca9c90ba09	i7core_edac: Use a lockless ringbuffer Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:30 -03:00
Mauro Carvalho Chehab	b968759ee7	edac: Create an unique instance for each kobj Current code only works when there's just one memory controller, since we need one kobj for each instance. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:49:30 -03:00
Mauro Carvalho Chehab	f338d73691	i7core_edac: Convert UDIMM error counters into a proper sysfs group Instead of displaying 3 values at the same var, break it into 3 different sysfs nodes: /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1 /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2 For registered dimms, however, the error counters are already being displayed at: /sys/devices/system/edac/mc/mc0/csrow*/ce_count So, there's no need to add any extra sysfs nodes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:02 -03:00
Mauro Carvalho Chehab	c419d921e6	edac: Don't create csrow entries on instance groups Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:02 -03:00
Mauro Carvalho Chehab	cc301b3ae3	edac: store/show methods for device groups weren't working Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:02 -03:00
Mauro Carvalho Chehab	a5538e531f	i7core_edac: Add support for sysfs addrmatch group Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:01 -03:00
Mauro Carvalho Chehab	9fa2fc2e2d	edac_core: Allow the creation of sysfs groups Currently, all sysfs nodes are stored at /sys/.*/mc. (regex) However, sometimes it is needed to create attribute groups. This patch extends edac_core to allow groups creation. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:01 -03:00
Mauro Carvalho Chehab	4af91889e0	i7core_edac: Avoid printing a warning when debug is disabled Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:00 -03:00
Mauro Carvalho Chehab	4253868034	i7core_edac: We need to use list_for_each_entry_safe to avoid errors Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:00 -03:00
Mauro Carvalho Chehab	22e6bcbdcf	i7core_edac: change remove module strategy The old remove module stragegy didn't work on devices with multiple cores, since only one PCI device is used to open all mc's, due to Nehalem nature. Also, it were based at pdev value. However, this doesn't point to the pci device used at mci->dev. So, instead, it unregisters all devices at once, deleting them from the device list. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:00 -03:00
Mauro Carvalho Chehab	0f062792b4	i7core_edac: remove static counter for max sockets The number of sockets is now fully dynamic. Get rid of this obsolete var. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:45:00 -03:00
Mauro Carvalho Chehab	13d6e9b653	i7core_edac: at remove, don't remove all pci devices at once Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:59 -03:00
Mauro Carvalho Chehab	d88b85072f	i7core_edac: Fix a bug when printing error counts with RDIMMs Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:59 -03:00
Mauro Carvalho Chehab	d4c277957f	i7core_edac: a few fixes for multiple mc's Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:59 -03:00
Mauro Carvalho Chehab	6c6aa3afdb	i7core_edac: sanity check: print a warning if a mcelog is ignored In thesis, the other mc controller should handle it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:58 -03:00
Mauro Carvalho Chehab	f47429494f	i7core_edac: create one mc per socket/QPI Instead of creating just one memory controller, create one per socket (e. g. per Quick Link Path Interconnect). This better reflects the Nehalem architecture. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:58 -03:00
Mauro Carvalho Chehab	66607706ce	Dynamically allocate memory for PCI devices Instead of using a static table assuming always 2 CPU sockets, allocate space dynamically for Nehalem PCI devs. This patch is part of a series of patches that changes i7core_edac to allow more than 2 sockets and to properly report one memory controller per socket.	2010-05-10 11:44:58 -03:00
Mauro Carvalho Chehab	a55456f344	i7core: temporary workaround to allow it to compile against 2.6.30 Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:58 -03:00
Mauro Carvalho Chehab	3a3bb4a647	i7core_edac: Improve corrected_error_counts output for RDIMM Just cosmetics. instead of showing something like: socket 0, channel 2dimm0: 1 dimm1: 0 dimm2: 0 socket 1, channel 2dimm0: 0 dimm1: 0 dimm2: 0 Show: socket 0, channel 2 RDIMM0: 1 RDIMM1: 0 RDIMM2: 0 socket 0, channel 2 RDIMM0: 0 RDIMM1: 0 RDIMM2: 0 This is more synthetic and easier to parse. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:58 -03:00
Keith Mannthey	bc2d7245ff	i7core_edac: Probe on Xeons eariler On the Xeon 55XX series cpus the pci deives are not exposed via acpi so we much explicitly probe them to make the usable as a Linux PCI device. This moves the detection of this state to before pci_register_driver is called. Its present position was not working on my systems, the driver would complain about not finding a specific device. This patch allows the driver to load on my systems. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:57 -03:00
Mauro Carvalho Chehab	14d2c08343	i7core: Use registered memories per processor Instead of assuming that the entire machine has either registered or unregistered memories, do it at CPU socket based. While here, fix a bug at i7core_mce_output_error(), where the we're using m->cpu directly as if it would represent a socket. Instead, the proper socket_id is given by cpu_data[m->cpu].phys_proc_id. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> ---	2010-05-10 11:44:57 -03:00
Mauro Carvalho Chehab	b4e8f0b6ea	i7core_edac: Use Device 3 function 2 to report errors with RDIMM's Nehalem and upper chipsets provide an special device that has corrected memory error counters detected with registered dimms. This device is only seen if there are registered memories plugged. After this patch, on a machine fully equiped with RDIMM's, it will use the Device 3 function 2 to count corrected errors instead on relying at mcelog. For unregistered DIMMs, it will keep the old behavior, counting errors via mcelog. This patch were developed together with Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:56 -03:00
Keith Mannthey	61053fdedb	i7core_edac: Fix ecc enable shift From: Keith Mannthey <kmannth@us.ibm.com> Simple correction to a shift value. ECC_ENABLED is bit 4 of MC_STATUS, Dev 3 Fun 0 Offset 0x4c This correctly identifies the state of the ECC at the machine. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:56 -03:00
Mauro Carvalho Chehab	3ef288a983	i7core_edac: Print an error message if pci register fails Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:56 -03:00
Mauro Carvalho Chehab	b990538a78	i7core_edac: CodingSyle fixes/cleanups No functional changes. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:56 -03:00
Mauro Carvalho Chehab	4157d9f554	i7core_edac: fix error injection There were two stupid error injection bugs introduced by wrong cut-and-paste: one at socket store, and another at the error inject register. The last one were causing the code to not work at all. While here, adds debug messages to allow seeing what registers are being set while sending error injection. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:55 -03:00
Mauro Carvalho Chehab	2068def56c	i7core_edac: fix error codes for sysfs error injection interface Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:55 -03:00
Mauro Carvalho Chehab	276b824c30	i7core_edac: some fixes at error injection code Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:54 -03:00
Mauro Carvalho Chehab	17cb7b0cf7	i7core_edac: Some cleanups at displayed info Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:54 -03:00
Mauro Carvalho Chehab	086271a037	i7core: remove some uneeded noisy debug messages Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:54 -03:00
Mauro Carvalho Chehab	3a7dde7fcd	i7core: add socket info at the debug msg Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:53 -03:00
Mauro Carvalho Chehab	ec6df24c15	i7core: better document i7core_get_active_channels() Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:53 -03:00
Mauro Carvalho Chehab	c77720b954	i7core: fix get_devices routine for Xeon55xx i7core_get_devices() were preparet to get just the first found device of each type. Due to that, on Xeon 55xx, only socket 1 were retrived. Rework i7core_get_devices() to clean it and to properly support Xeon 55xx. While here, fix a small typo. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:53 -03:00
Mauro Carvalho Chehab	a639539fa2	i7core: enrich error information based on memory transaction type Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:53 -03:00
Mauro Carvalho Chehab	c5d3452869	i7core: check if the memory error is fatal or non-fatal Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:53 -03:00
Mauro Carvalho Chehab	310cbb7284	i7core: fix probing on Xeon55xx Xeon55xx fails to probe with this error message: EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1660: MC: drivers/edac/i7core_edac.c: i7core_init() EDAC i7core: Device not found: dev 00:00.0 PCI ID 8086:2c41 i7core_edac: probe of 0000:00:14.0 failed with error -22 This is due to the fact that, on Xeon35xx (and i7core), device 00.0 has PCI ID 8086:2c40. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:52 -03:00
Mauro Carvalho Chehab	f237fcf2b7	i7core_edac: some fixes at memory error parser m->bank is not related to the memory bank but, instead, to the MCA Error register bank. Fix it accordingly. While here, improves the comments for Nehalem bank. A later fix is needed, in order to get bank/rank information from MCA error log. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:52 -03:00
Mauro Carvalho Chehab	8a2f118e3a	i7core_edac: decode mcelog error and send it via edac interface Enriches mcelog error by using the encoded information at MCE status and misc registers (IA32_MCx_STATUS, IA32_MCx_MISC). Some fixes are still needed here, in order to properly fill the EDAC fields. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:52 -03:00
Mauro Carvalho Chehab	ba6c5c62ee	i7core_edac: maps all sockets as if ther are one MC controller Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:52 -03:00
Mauro Carvalho Chehab	67166af4ab	i7core_edac: add support for more than one MC socket Some Nehalem architectures have more than one MC socket. Socket 0 is located at bus 255. Currently, it is using up to 2 sockets, but increasing it to a larger number is just a matter of increasing MAX_SOCKETS definition. This seems to be required for properly support of Xeon 55xx. Still needs testing with Xeon 55xx. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:51 -03:00
Mauro Carvalho Chehab	d1fd4fb69e	i7core_edac: Add a code to probe Xeon 55xx bus This code changes the detection procedure of i7core_edac. Instead of directly probing for MC registers, it probes for another register found on Nehalem. If found, it tries to pick the first MC PCI BUS. This should work fine with Xeon 35xx, but, on Xeon 55xx, this is at bus 254 and 255 that are not properly detected by the non-legacy PCI methods. The new detection code scans specifically at buses 254 and 255 for the Xeon 55xx devices. This code has not tested yet. After working, a change at the code will be needed, since the i7core is not yet ready for working with 2 sets of MC. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:51 -03:00
Mauro Carvalho Chehab	e9bd2e7379	i7core_edac: Adds write unlock to MC registers The public Intel Xeon 5500 volume 2 datasheet describes, on page 53, session 2.6.7 a register that can lock/unlock Memory Controller the configuration register, called MC_CFG_CONTROL. Adds support for it in the hope that software error injection would work. With my tests with Xeon 35xx, there's still something missing. With a program that does sequencial bit writes at dev 0.0, sometimes, it produces error injection, after unblocking the MC_CFG_CONTROL (and, sometimes, it just locks my testing machine). I'll try later to discover by trial and error what's the register that solves this issue on Xeon 35xx. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab	d5381642ab	i7core_edac: Add edac_mce glue Adds a glue code to allow i7core to work with mcelog. With the glue, i7core registers itself on edac_mce. At mce, when an error is detected, it calls all registered drivers (in this case, i7core), for EDAC error handling. TODO: It currently just prints the MCE error log using about the same format as mce panic messages. The error message should be enhanced with mcelog userspace info and converted into the proper EDAC format, to feed the EDAC error counts. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab	963c5ba359	edac/Kconfig: edac_mce can't be module Since mcelog is bool, edac_mce glue should also be bool, or otherwise will not work. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:50 -03:00
Mauro Carvalho Chehab	696e409dbd	edac_mce: Add an interface driver to report mce errors via edac edac_mce module is an interface module that gets mcelog data and forwards to any registered edac module that expects to receive data via mce. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:49 -03:00
Mauro Carvalho Chehab	41fcb7feed	i7core_edac: CodingStyle fixes Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab	eb94fc402f	i7core_edac: fill csrows edac sysfs info csrows is still fake, since we can't identify its representation with Nehalem registers. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab	5566cb7c91	i7core_edac: Memory info fixes and preparation for properly filling cswrow data Now, memory size is properly displayed: EDAC i7core: DOD Max limits: DIMMS: 2, 1-ranked, 8-banked EDAC i7core: DOD Max rows x colums = 0x4000 x 0x400 EDAC i7core: Memory channel configuration: EDAC i7core: Ch0 phy rd0, wr0 (0x063f7c31): 2 ranks, UDIMMs EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400 EDAC i7core: dimm 1 (0x00001288) 1024 Mb offset: 4, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400 EDAC i7core: Ch1 phy rd1, wr1 (0x063f7c31): 2 ranks, UDIMMs EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400 EDAC i7core: Ch2 phy rd3, wr3 (0x063f7c31): 2 ranks, UDIMMs EDAC i7core: dimm 0 (0x00000288) 1024 Mb offset: 0, numbank: 8, numrank: 1, numrow: 0x4000, numcol: 0x400 Still, as the way to retrieve csrows info is not known, it does a mapping of what's available to csrows basic unit at edac core. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab	854d334997	i7core_edac: Get more info about the memory DIMMs Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:48 -03:00
Mauro Carvalho Chehab	7dd6953c5f	i7core_edac: Add more information about each active dimm Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab	b7c761512c	i7core_edac: Improve error handling Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab	1c6fed808f	i7core_edac: Properly fill struct csrow_info Thanks-to: Aristeu Rozanski <aris@redhat.com> for part of the code Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab	ef708b53b9	i7core_edac: Add additional tests for error detection Properly check the number of channels and improve probing error detection Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:47 -03:00
Mauro Carvalho Chehab	442305b152	i7core_edac: Add a memory check routine, based on device 3 function 4 This function appears only on Xeon 5500 datasheet. Yet, testing with a Xeon 3503 showed that this is also implemented on other Nehalem processors. At the first read, MC_TEST_ERR_RCV1 and MC_TEST_ERR_RCV0 can contain any value. Modify CE error logic to update the error count only after the second read. An alternative approach would be to do a write at rcv0 and rcv1 registers, but it seemed better to keep they untouched, since BIOS might eventually assume that they are exclusive for their usage. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab	87d1d272ba	i7core_edac: need mci->edac_check, otherwise module removal doesn't work There are some locking troubles with edac_core: if you don't declare an edac_check, module may suffer from soft lock. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab	7b029d03c3	i7core_edac: A few fixes at error injection code Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab	f122a89222	i7core_edac: Show read/write virtual/physical channel association Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:46 -03:00
Mauro Carvalho Chehab	8f33190757	i7core_edac: Registers all supported MC functions Now, it will try to register on all supported Memory Controller functions. It should be noticed that dev3, function 2 is present only on chips with Registered DIMM's, according to the datasheet. So, the driver doesn't return -ENODEV is all functions but this one were successfully registered and enabled: EDAC i7core: Registered device 8086:2c18 fn=3 0 EDAC i7core: Registered device 8086:2c19 fn=3 1 EDAC i7core: Device not found: PCI ID 8086:2c1a (dev 3, func 2) EDAC i7core: Registered device 8086:2c1c fn=3 4 EDAC i7core: Registered device 8086:2c20 fn=4 0 EDAC i7core: Registered device 8086:2c21 fn=4 1 EDAC i7core: Registered device 8086:2c22 fn=4 2 EDAC i7core: Registered device 8086:2c23 fn=4 3 EDAC i7core: Registered device 8086:2c28 fn=5 0 EDAC i7core: Registered device 8086:2c29 fn=5 1 EDAC i7core: Registered device 8086:2c2a fn=5 2 EDAC i7core: Registered device 8086:2c2b fn=5 3 EDAC i7core: Registered device 8086:2c30 fn=6 0 EDAC i7core: Registered device 8086:2c31 fn=6 1 EDAC i7core: Registered device 8086:2c32 fn=6 2 EDAC i7core: Registered device 8086:2c33 fn=6 3 EDAC i7core: Driver loaded. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab	0b2b7b7ec0	i7core_edac: Add more status functions to EDAC driver This patch were co-authored with Aristeu Rozanski. Signed-off-by: Aristeu Sergio <arozansk@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab	194a40feab	i7core_edac: Add error insertion code for Nehalem Implements set_inject_error() with the low-level code needed to inject memory errors at Nehalem, and adds some sysfs nodes to allow error injection The next patch will add an API for error injection. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:45 -03:00
Mauro Carvalho Chehab	a0c36a1f0f	i7core_edac: Add an EDAC memory controller driver for Nehalem chipsets This driver is meant to support i7 core/i7core extreme desktop processors and Xeon 35xx/55xx series with integrated memory controller. It is likely that it can be expanded in the future to work with other processor series based at the same Memory Controller design. For now, it has just a few MCH status reads. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>	2010-05-10 11:44:45 -03:00
Borislav Petkov	35d824b28f	edac, mce: Fix wrong mask and macro usage Correct two mishaps which prevented reporting error type (CECC vs UECC) and extended error description. Cc: <stable@kernel.org> # 32.x, 33.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-30 10:15:39 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Borislav Petkov	5b89d2f9ac	edac, mce: Filter out invalid values Print the CPU associated with the error only when the field is valid. Cc: <stable@kernel.org> # .32.x .33.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-03-22 16:33:31 +01:00
Peter Tyser	8004fd2ad6	edac: e752x: add dram scrubbing support Add support to scrub DRAM using the e752x integrated memory scrubbing engine. The e7320/7520/e7525 chipsets support scrubbing at one rate while the i3100 chipset supports a normal and fast rate. A similar patch was originally sent back in 2008: http://sourceforge.net/mailarchive/forum.php?thread_name=1204835866.25206.70.camel@localhost.localdomain&forum_name=bluesmoke-devel This version has the following updates: - Use 16-bit PCI config cycles to access MCHSCRB register e7320/7520/e7525 docs say register is 16bits wide, i3100 says 8. I tested 16bits on the i3100 to be safe. - Recalcuate and round actual scrub rates The changes have been tested on an i3100-based board. Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:40 -08:00
Konstantin Olifer	8de5c1a165	edac: e752x fsb ecc FSB parity is only supported on the Xeon processor. Previously it was incorrectly enabled for the Celeron as well. Signed-off-by: Konstantin Olifer <kolifer@gmail.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:40 -08:00
H Hartley Sweeten	66ed3f7516	edac: mpc85xx use resource_size instead of raw math Use resource_size() instead of arithmetic. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Dave Jiang <djiang@mvista.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Kumar Gala <galak@gate.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:40 -08:00
Peter Tyser	dcca7c3d00	edac: mpc85xx improve SDRAM error reporting Add the ability to detect the specific data line or ECC line which failed when printing out SDRAM single-bit errors. An example of a single-bit SDRAM ECC error is below: EDAC MPC85xx MC1: Err Detect Register: 0x80000004 EDAC MPC85xx MC1: Faulty data bit: 59 EDAC MPC85xx MC1: Expected Data / ECC: 0x7f80d000_409effa0 / 0x6d EDAC MPC85xx MC1: Captured Data / ECC: 0x7780d000_409effa0 / 0x6d EDAC MPC85xx MC1: Err addr: 0x00031ca0 EDAC MPC85xx MC1: PFN: 0x00000031 Knowning which specific data or ECC line caused an error can be useful in tracking down hardware issues such as improperly terminated signals, loose pins, etc. Note that this feature is only currently enabled for 64-bit wide data buses, 32-bit wide bus support should be added. I don't have any 32-bit wide systems to test on. If someone has one and is willing to give this patch a shot with the check for a 64-bit data bus removed it would be much appreciated and I can re-submit with both 32 and 64 bit buses supported. Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Dave Jiang <djiang@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:40 -08:00
Peter Tyser	21768639be	edac: mpc85xx mask ecc syndrome correctly With a 64-bit wide data bus only the lowest 8-bits of the ECC syndrome are relevant. With a 32-bit wide data bus only the lowest 16-bits are relevant on most architectures. Without this change, the ECC syndrome displayed can be mildly confusing, eg: EDAC MPC85xx MC1: syndrome: 0x25252525 When in reality the ECC syndrome is 0x25. A variety of Freescale manuals say a variety of different things about how to decode the CAPTURE_ECC (syndrome) register. I don't have a system with a 32-bit bus to test on, but I believe the change is correct. It'd be good to get an ACK from someone at Freescale about this change though. Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Dave Jiang <djiang@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:40 -08:00
Emese Revfy	52cf25d0ab	Driver core: Constify struct sysfs_ops in struct kobj_type Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:49 -08:00
Linus Torvalds	eaa5eec739	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Simplify ECC override handling	2010-03-03 09:25:37 -08:00
Linus Torvalds	0a135ba14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.	2010-03-03 07:34:18 -08:00
Borislav Petkov	d95cf4de6a	amd64_edac: Simplify ECC override handling No need for clearing ecc_enable_override and checking it in two places. Instead, simply check it during probing and act accordingly. Also, rename the flag bitfields according to the functionality they actually represent. What is more, make sure original BIOS ECC settings are restored when the module is unloaded. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-03-01 19:25:12 +01:00
Tejun Heo	a29d8b8e2d	percpu: add __percpu sparse annotations to what's left Add __percpu sparse annotations to places which didn't make it in one of the previous patches. All converions are trivial. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Neil Brown <neilb@suse.de>	2010-02-17 11:17:38 +09:00
Linus Torvalds	676ad58553	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Do not falsely trigger kerneloops	2010-02-11 14:07:13 -08:00
Peter Tyser	f8c63345b4	edac: mpc85xx fix build regression by removing unused debug code Some unused, unsupported debug code existed in the mpc85xx EDAC driver that resulted in a build failure when CONFIG_EDAC_DEBUG was defined: drivers/edac/mpc85xx_edac.c: In function 'mpc85xx_mc_err_probe': drivers/edac/mpc85xx_edac.c:1031: error: implicit declaration of function 'edac_mc_register_mcidev_debug' drivers/edac/mpc85xx_edac.c:1031: error: 'debug_attr' undeclared (first use in this function) drivers/edac/mpc85xx_edac.c:1031: error: (Each undeclared identifier is reported only once drivers/edac/mpc85xx_edac.c:1031: error: for each function it appears in.) Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-11 13:59:42 -08:00
Peter Tyser	cff9279e4e	edac: mpc85xx fix bad page calculation Commit `b484625172` ("edac: mpc85xx add mpc83xx support") accidentally broke how a chip select's first and last page addresses are calculated. The page addresses are being shifted too far right by PAGE_SHIFT. This results in errors such as: EDAC MPC85xx MC1: Err addr: 0x003075c0 EDAC MPC85xx MC1: PFN: 0x00000307 EDAC MPC85xx MC1: PFN out of range! EDAC MC1: INTERNAL ERROR: row out of range (4 >= 4) EDAC MC1: CE - no information available: INTERNAL ERROR The vaule of PAGE_SHIFT is already being taken into consideration during the calculation of the 'start' and 'end' variables, thus it is not necessary to account for it again when setting a chip select's first and last page address. Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Ira W. Snyder <iws@ovro.caltech.edu> Cc: Kumar Gala <galak@gate.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-11 13:59:42 -08:00
Borislav Petkov	cab4d27764	amd64_edac: Do not falsely trigger kerneloops An unfortunate "WARNING" in the message amd64_edac dumps when the system doesn't support DRAM ECC or ECC checking is not enabled in the BIOS used to trigger kerneloops which qualified the message as an OOPS thus misleading the users. See, e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/422536 http://bugzilla.kernel.org/show_bug.cgi?id=15238 Downgrade the message level to KERN_NOTICE and fix the formulation. Cc: stable@kernel.org # .32.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com>	2010-02-11 20:32:14 +01:00
Tamas Vincze	118f3e1afd	edac: i5000_edac critical fix panic out of bounds EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4) Kernel panic - not syncing: EDAC MC0: Uncorrected Error (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. This happens because FERR_NF_FBD bit 28 is not updated on i5000. Due to that, both bits 28 and 29 may be equal to one, returning channel = 3. As this value is invalid, EDAC core generates the panic. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14568 Signed-off-by: Tamas Vincze <tom@vincze.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-16 12:15:38 -08:00
Roel Kluin	926311fd7d	amd64_edac: Ensure index stays within bounds in amd64_get_scrub_rate Add a missing iterator variable thus fixing the conditional of the for-loop in amd64_get_scrub_rate(). Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2010-01-15 10:45:58 +01:00
Borislav Petkov	5213c32f9d	edac, pci: remove pesky debug printk Do not spam the logs needlessly with the sole info that edac_pci_dev_parity_clear is being called. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:09 +01:00
Borislav Petkov	92389102b6	amd64_edac: restrict PCI config space access Do not access F2x19[0,4] on K8 since they're undefined there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	43f5e68733	amd64_edac: fix forcing module load/unload Clear the override flag after force-loading the module. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:08 +01:00
Borislav Petkov	56b34b91e2	amd64_edac: make driver loading more robust Currently, the module does not initialize fully when the DIMMs aren't ECC but remains still loaded. Propagate the error when no instance of the driver is properly initialized and prevent further loading. Reorganize and polish error handling in amd64_edac_init() while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	8f68ed9728	amd64_edac: fix driver instance freeing Fix use-after-free errors by pushing all memory-freeing calls to the end of amd64_remove_one_instance(). Reported-by: Darren Jenkins <darrenrjenkins@gmail.com> LKML-Reference: <1261370306.11354.52.camel@ICE-BOX> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Borislav Petkov	603adaf6b3	amd64_edac: fix K8 chip select reporting Fix the case when amd64_debug_display_dimm_sizes() reports only half the amount of DRAM on it because it doesn't account for when the single DCT operates in 128-bit mode and merges chip selects from different DIMMs. Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> LKML-Reference: <200912112202.48173.johannes.hirte@fem.tu-ilmenau.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-24 11:07:07 +01:00
Linus Torvalds	661e338f72	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: edac, mce, amd: silence GART TLB errors edac, mce: correct corenum reporting	2009-12-16 10:09:43 -08:00
Borislav Petkov	256f7276af	edac, mce, amd: silence GART TLB errors Although reporting of benign GART TLB errors is disabled in __mcheck_cpu_apply_quirks, those are still being logged, and, as a result, trip up amd64_edac. Pull up reporting check so that machines with loaded edac module bail out early and don't spit fragments into dmesg. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-16 17:48:39 +01:00
Nils Carlson	bbead2104e	edac: i5100 add 6 ranks per channel Add support for 6 ranks per channel to the i5100 chipset. I have tested the patch as far as possible with correctible errors and things appear good. The DIMM mapping is correct for our board, but boards may differ. Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se> Acked-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:20:12 -08:00
Nils Carlson	295439f2a3	edac: i5100 add scrubbing Addscrubbing to the i5100 chipset. The i5100 chipset only supports one scrubbing rate, which is not constant but dependent on memory load. The rate returned by this driver is an estimate based on some experimentation, but is substantially closer to the truth than the speed supplied in the documentation. Also, scrubbing is done once, and then a done-bit is set. This means that to accomplish continuous scrubbing a re-enabling mechanism must be used. I have created the simplest possible such mechanism in the form of a work-queue which will check every five minutes. This interval is quite arbitrary but should be sufficient for all sizes of system memory. Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:20:12 -08:00
Nils Carlson	b18dfd05f9	edac: i5100 clean controller to channel terms The i5100 driver uses the word controller instead of channel in a lot of places, this is simply a cleanup of the patch. Signed-off-by: Nils Carlson <nils.carlson@ludd.ltu.se> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:20:12 -08:00
Borislav Petkov	35d8069234	edac, mce: correct corenum reporting Fix core number reporting with NB MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-15 15:52:13 +01:00
Borislav Petkov	505422517d	x86, msr: Add support for non-contiguous cpumasks The current rd/wrmsr_on_cpus helpers assume that the supplied cpumasks are contiguous. However, there are machines out there like some K8 multinode Opterons which have a non-contiguous core enumeration on each node (e.g. cores 0,2 on node 0 instead of 0,1), see http://www.gossamer-threads.com/lists/linux/kernel/1160268. This patch fixes out-of-bounds writes (see URL above) by adding per-CPU msr structs which are used on the respective cores. Additionally, two helpers, msrs_{alloc,free}, are provided for use by the callers of the MSR accessors. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091211171440.GD31998@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-11 10:59:21 -08:00
Borislav Petkov	df5b1606bd	amd64_edac: bump driver version This was long overdue ... Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:14 +01:00
Andrew Morton	18ba54ac12	amd64_edac: fix use-uninitialised bug drivers/edac/amd64_edac.c: In function 'amd64_edac_init': drivers/edac/amd64_edac.c:2840: warning: 'ret' may be used uninitialized in this function Cc: Doug Thompson <dougthompson@xmission.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:13 +01:00
Borislav Petkov	bdc30a0c8c	amd64_edac: correct sys address to chip select mapping The routine does the reverse mapping of the error address of a CECC back to the node id, DRAM controller and chip select of the DIMM which caused the error. We should lookup the channel using the syndromes _only_ when the DCTs are ganged so fix that. Also, add an early exit when there's an error while scanning for the csrow thus decreasing indentation levels for better readability. Finally, fixup comments. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:38:12 +01:00
Borislav Petkov	bfc04aec7d	amd64_edac: add a leaner syndrome decoding algorithm Instead of using the whole syndrome tables for channel decoding, use a set of eigenvectors with which the tables can be generated to search for the syndrome in error. The algorithm operates independently of symbol size and can be used for both x4 and x8 syndromes. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-08 13:37:59 +01:00
Borislav Petkov	986a42a250	amd64_edac: remove early hw support check The .probe_valid_hardware low_ops member checked whether the DCTs are in DDR3 mode and bailed out if so. Now that all the needed changes for DDR3 support is in place, remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	6b4c0bdeb0	amd64_edac: detect DDR3 memory type Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	239642fe19	edac: add memory types strings for debugging Instead of using deeply-nested conditionals for dumping the DIMM type in debug mode, add a strings array of the supported DIMM types. This is useful in cases where an edac driver supports multiple DRAM types and is only defined in debug builds. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:31 +01:00
Borislav Petkov	cec7924f56	edac, mce: update AMD F10h revD check F10h revD start with model number 8. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:30 +01:00
Borislav Petkov	1f6bcee75e	amd64_edac: remove unneeded extract_error_address wrapper Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:30 +01:00
Borislav Petkov	44e9e2ee21	amd64_edac: rename StinkyIdentifier SystemAddress -> sys_addr Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:30 +01:00
Borislav Petkov	ad858bfa14	amd64_edac: remove superfluous dbg printk Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00
Borislav Petkov	1433eb9903	amd64_edac: enhance address to DRAM bank mapping Add cs mode to cs size mapping tables for DDR2 and DDR3 and F10 and all K8 flavors and remove klugdy table of pseudo values. Add a low_ops->dbam_to_cs member which is family-specific and replaces low_ops->dbam_map_to_pages since the pages calculation is a one liner now. Further cleanups, while at it: - shorten family name defines - align amd64_family_types struct members Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00
Borislav Petkov	d16149e8c3	amd64_edac: cleanup f10_early_channel_count Do not read DCLR[01] again since this is done in amd64_read_mc_registers() earlier. There can be more than two physical DIMMs present so clamp the channels value to max 2. Also, do not report DCT data width - it is also done earlier. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:29 +01:00
Borislav Petkov	8566c4df16	amd64_edac: dump DIMM sizes on K8 too Extend f10_debug_display_dimm_sizes to dump the logical DIMMs configuration on K8 revF too. Remove the ganged arg since we print the DCT operating mode (ganged vs unganged) earlier. Also, DCT csrow configuration is relevant therefore dump it as KERN_DEBUG instead of only on debug builds. Remove misleading DIMM output since there's no reliable way of mapping of chip selects to actual physical DIMMs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:28 +01:00
Borislav Petkov	8de1d91e62	amd64_edac: cleanup rest of amd64_dump_misc_regs Clarify bitfields description, add PCI config function/offset names to registers for easy reference, simplify code layout, remove unneeded info. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:28 +01:00
Borislav Petkov	68798e1760	amd64_edac: cleanup DRAM cfg low debug output Carve out the register-specific debug statements into a separate function, clarify meanings of the single bitfields in the register, remove irrelevant output and macros. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:28 +01:00
Borislav Petkov	6ba5dcdc44	amd64_edac: wrap-up pci config read error handling Add a pci config read wrapper for signaling pci config space access errors instead of them being visible only on a debug build. This is important on amd64_edac since it uses all those pci config register values to access the DRAM/DIMM configuration of the nodes. In addition, the wrapper makes a _lot_ (look at the diffstat!) of error handling code superfluous and improves much of the overall code readability by removing error handling details out of the way. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:27 +01:00
Borislav Petkov	f6d6ae9657	amd64_edac: unify MCGCTL ECC switching Unify almost identical code into one function and remove NUMA-specific usage (specifically cpumask_of_node()) in favor of generic topology methods. Remove unused defines, while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:27 +01:00
Rusty Russell	ba578cb34a	cpumask: use modern cpumask style in drivers/edac/amd64_edac.c cpumask_t -> struct cpumask, and don't put one on the stack. (Note: this is actually on the stack unless CONFIG_CPUMASK_OFFSTACK=y). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:27 +01:00
Borislav Petkov	e97f8bb8ce	amd64_edac: make DRAM regions output more human-readable Do not shift the TOP_MEM and TOP_MEM2 values by 23 but rather save the whole 64-bit value read from the MSR. Although the TOP_MEM/TOP_MEM2 bits are only a subset of the 64bit register, the values are correct since the remaining bits are Read-As-Zero and no shifting is needed. Also, cleanup DRAM base/limit debug output. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:27 +01:00
Borislav Petkov	72381bd55e	amd64_edac: clarify DRAM CTL debug reporting Make debug info formulations about the DRAM and DCT configuration of the machine more human readable. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-12-07 19:14:26 +01:00
Ingo Molnar	26fb20d008	Merge branch 'perf/mce' into perf/core Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-03 20:11:06 +01:00
Borislav Petkov	17adea01b9	amd64_edac: fix CECCs reporting Shift error type bits properly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-11-04 14:04:06 +01:00
Li Hong	a3c4c58085	amd64_edac: fix a wrong goto clause in amd64_edac.c In amd64_edac_init(void) in amd64_edac.c, cache_k8_northbridges() is called before pci_register_driver. If it fails, should exit with err directly. Signed-off-by: Li Hong <lihong.hi@gmail.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-11-04 14:02:32 +01:00
Keith Mannthey	c2494ace99	edac: i5100 fix initialization code Allow csrows to properly initialize when the topology only has active channels on 2 and 3. This new check allows proper detection and initialization in this topology. Only checking the first mrt that represented channels 0 and 1 is not sufficient. I also fixed up the related debug information path. I can submit as a 2nd patch if needed. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Acked-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:30 -07:00
Ira W. Snyder	0616fb003d	edac: i5400 fix missing CONFIG_PCI define When building without CONFIG_PCI the edac_pci_idx variable is unused, causing a build-time warning. Wrap the variable in #ifdef CONFIG_PCI, just like the rest of the PCI support. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:30 -07:00
Jeff Roberson	156edd4aaa	edac: i5400 fix csrow mapping The i5400 EDAC driver has several bugs with chip-select row computation which most likely lead to bugs in detailed error reporting. Attempts to contact the authors have gone mostly unanswered so I am presenting my diff here. I do not subscribe to lkml and would appreciate being kept in the cc. The most egregious problem was miscalculating the addresses of MTR registers after register 0 by assuming they are 32bit rather than 16. This caused the driver to miss half of the memories. Most motherboards tend to have only 8 dimm slots and not 16, so this may not have been noticed before. Further, the row calculations multiplied the number of dimms several times, ultimately ending up with a maximum row of 32. The chipset only supports 4 dimms in each of 4 channels, so csrow could not be higher than 4 unless you use a row per-rank with dual-rank dimms. I opted to eliminate this behavior as it is confusing to the user and the error reporting works by slot and not rank. This gives a much clearer view of memory by slot and channel in /sys. Signed-off-by: Jeff Roberson <jroberson@jroberson.net> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:30 -07:00
Borislav Petkov	4997811e3b	amd64_edac: fix DRAM base and limit extraction masks, v2 This is a proper fix as a follow-up to `66216a7` and `916d11b`. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-16 18:51:22 +02:00
Borislav Petkov	fb2531953f	mce, edac: Use an atomic notifier for MCEs decoding Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-12 12:24:45 +02:00
Linus Torvalds	624235c5b3	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pci: Correct spelling in a comment x86: Simplify bound checks in the MTRR code x86: EDAC: carve out AMD MCE decoding logic initcalls: Add early_initcall() for modules x86: EDAC: MCE: Fix MCE decoding callback logic	2009-10-08 12:06:36 -07:00
Borislav Petkov	94baaee494	amd64_edac: beef up DRAM error injection When injecting DRAM ECC errors (F3xBC_x8), EccVector[15:0] is a bitmask of which bits should be error injected when written to and holds the payload of 16-bit DRAM word when read, respectively. Add /sysfs members to show the DRAM ECC section/word/vector. Fail wrong injection values entered over /sysfs instead of truncating them. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:51:28 +02:00
Borislav Petkov	66216a7a15	amd64_edac: fix DRAM base and limit extraction On Fam10h and above, F1x[1, 0][7C:40] are DRAM Base/Limit registers which specify the destination node of a DRAM address. Those address boundaries are being extracted into ->dram_base[] and ->dram_limit[]. Correct the extraction masks to match the respective address bits. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:51:15 +02:00
Borislav Petkov	9d858bb10a	amd64_edac: fix chip select handling Different processor families support a different number of chip selects. Handle this in a family-dependent way with the proper values assigned at init time (see amd64_set_dct_base_and_mask). Remove _DCSM_COUNT defines since they're used at one place and originate from public documentation. CC: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:50:50 +02:00
Keith Mannthey	2cff18c22c	amd64_edac: simple fix to allow reporting of CECC errors This allows the errors to be further decoded and mapped to csrows. Tested with ECC debug dimms and an Rev F cpu based system. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:49:58 +02:00
Borislav Petkov	8edc544589	amd64_edac: fix K8 intlv_sel check The check when DRAM interleaving is enabled should be done against the pvt->dram_IntlvSel field and not against the ->dram_limit. Simplify first loop and fixup printk formatting while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:49:43 +02:00
Borislav Petkov	72f158fe6f	amd64_edac: fix interleave enable tests The pvt->dram_IntlvEn saves the 3 "Interleave Enable" bits already right-shifted by 8 so the check in find_mc_by_sys_addr() by shifting the values to the left 8 bits is wrong. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:48:08 +02:00
Borislav Petkov	916d11b2b5	amd64_edac: fix DRAM base and limit address extraction K8 DRAM base and limit addresses from F1x40 +8i and F1x44 + 8i, where i in (0..7) are both bits 39-24 and therefore the shifting should be done by 24 and not by 8. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:47:51 +02:00
Borislav Petkov	3011b20da9	amd64_edac: fix driver instance lookup table allocation Allocate memory statically for 8-node machines max for simplicity instead of relying on MAX_NUMNODES which is 0 on !CONFIG_NUMA builds. Spotted by Jan Beulich. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-10-07 16:47:34 +02:00
Borislav Petkov	0d18b2e34b	x86: EDAC: carve out AMD MCE decoding logic This converts the MCE decoding logic into a standalone config option which can be built-in or a module, the first one being the default for MCEs happening early on in the boot process. This, beyond being separated in a cleaner way, also saves RAM by making the decoding logic modular. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091002133148.GD28682@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-02 15:42:19 +02:00
Ingo Molnar	f436f8bb73	x86: EDAC: MCE: Fix MCE decoding callback logic Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-02 15:42:18 +02:00
Jesper Dangaard Brouer	458e5ff13e	edac: core: remove completion-wait for complete with rcu_barrier Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c and edac_pci.c. They all use a wait_for_completion() scheme, but this scheme it not 100% safe on multiple CPUs. See the _rcu_barrier() implementation which explains why extra precausion is needed. The patch adds a comment about rcu_barrier() and as a precausion calls rcu_barrier(). A maintainer needs to look at removing the wait_for_completion code. [dougthompson@xmission.com: remove the wait_for_completion code] Signed-off-by Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:05 -07:00
Jason Uhlenkott	dd8ef1db87	edac: i3200 memory controller driver A driver for the Intel 3200 and 3210 memory controllers. It has only had light testing so far, and currently makes no attempt to decode error addresses at anything finer than csrow granularity. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Julia Lawall	30a61fff3a	edac: fix resource size calculation Use the function resource_size, which reduces the chance of introducing off-by-one errors in calculating the resource size. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ struct resource *res; @@ - (res->end - res->start) + 1 + resource_size(res) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Ira W. Snyder	b484625172	edac: mpc85xx add mpc83xx support Add support for the Freescale MPC83xx memory controller to the existing driver for the Freescale MPC85xx memory controller. The only difference between the two processors are in the CS_BNDS register parsing code, which has been changed so it will work on both processors. The L2 cache controller does not exist on the MPC83xx, but the OF subsystem will not use the driver if the device is not present in the OF device tree. I had to change the nr_pages calculation to make the math work out. I checked it on my board and did the math by hand for a 64GB 85xx using 64K pages. In both cases, nr_pages * PAGE_SIZE comes out to the correct value. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Kumar Gala <galak@gate.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Yang Shi	a014554e66	edac: mpc85xx add P2020DS support Based on Kumar's new compatible types patch, add P2020 into MPC85xx EDAC compatible lists so that EDAC can recognize P2020 meomry controller and L2 cache controller and export the relevant fields to sysfs. EDAC MPC85xx DDR3 support is needed if DDR3 memory stick is installed on a P2020DS board so that EDAC core can recognize DDR3 memory type. Signed-off-by: Yang Shi <yang.shi@windriver.com> Acked-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Anand Gadiyar	411c940385	trivial: fix typo "for for" in multiple files trivial: fix typo "for for" in multiple files Signed-off-by: Anand Gadiyar <gadiyar@ti.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-09-21 15:14:54 +02:00
Borislav Petkov	06724535f8	amd64_edac: check NB MCE bank enable on the current node properly The old code was using smp_call_function_many which skips the current cpu if it is in the supplied cpumask. Switch to the rdmsr_on_cpus() interface which takes care of that. In addition, add get_cpus_on_this_dct_cpumask helper which computes a cpumask of all the cores on a node and thus on a DCT. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-16 13:05:46 +02:00
Wan Wei	57a30854c8	amd64_edac: Rewrite unganged mode code of f10_early_channel_count Simplify the procedure by checking if there is any DIMM in each channel. This patch will fix the bugs such as when there is no DIMMs under certain node, two DIMMs in the same channel, and only one DIMM in each channel of the node. Borislav: minor fixups Signed-off-by: Wan Wei <wanwei@mail.dawning.com.cn> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-16 12:42:55 +02:00
Borislav Petkov	be3468e8ff	amd64_edac: cleanup amd64_check_ecc_enabled Simplify code flow and make sure return value is always valid since further driver init depends on it. Carve out long warning string and make code more readable. Shorten some names, while at it. There should be no functional change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-16 12:40:38 +02:00
Andreas Herrmann	6a8126911a	x86, EDAC: Provide function to return NodeId of a CPU Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: H. Peter Anvin <hpa@zytor.com>	2009-09-16 11:33:40 +02:00
Ingo Molnar	b9183f9b99	amd64_edac: build driver only on AMD hardware -tip testing found the following build failure (config attached): drivers/built-in.o: In function `amd64_check': amd64_edac.c:(.text+0x3e9491): undefined reference to `amd_decode_nb_mce' drivers/built-in.o: In function `amd64_init_2nd_stage': amd64_edac.c:(.text+0x3e9b46): undefined reference to `amd_report_gart_errors' amd64_edac.c:(.text+0x3e9b55): undefined reference to `amd_register_ecc_decoder' drivers/built-in.o: In function `amd64_nbea_store': amd64_edac_dbg.c:(.text+0x3ea22e): undefined reference to `amd_decode_nb_mce' drivers/built-in.o: In function `amd64_remove_one_instance': amd64_edac.c:(.devexit.text+0x3eea): undefined reference to `amd_report_gart_errors' amd64_edac.c:(.devexit.text+0x3ef6): undefined reference to `amd_unregister_ecc_decoder' the AMD EDAC code has a dependency on CONFIG_CPU_SUP_AMD facilities. The patch below solves the problem here. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-16 11:31:57 +02:00
Borislav Petkov	53bd5fedca	EDAC, AMD: decode FR MCEs See Fam10h BKDG (31116, rev. 3.28), Table 101. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:37 +02:00
Borislav Petkov	f9350efd6f	EDAC, AMD: decode load store MCEs See Fam10h BKDG (31116, rev. 3.28), Table 100. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:33 +02:00
Borislav Petkov	56cad2d6fb	EDAC, AMD: decode bus unit MCEs ... according to Table 69, Fam10h BKDG (31116, rev. 3.28). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:30 +02:00
Borislav Petkov	ab5535e70f	EDAC, AMD: decode instruction cache MCEs See Fam10h BKDG (31116, rev. 3.28), Table 95 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:27 +02:00
Borislav Petkov	5196624136	EDAC, AMD: decode data cache MCEs Those get reported in MC0_STATUS, see Table 92, F10h BKDG (31116, rev. 3.28) for more details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:23 +02:00
Borislav Petkov	d93cc222ad	EDAC, AMD: carve out decoding of MCi_STATUS ErrorCode This is the MCE error code from the MCi_STATUS banks, bits [15:0] which describe what type of error was encountered: GART TLB, Memory or Bus error. The semantics of those bits are identical across all MCE banks so decode those separately, irrespectively of MCE type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:20 +02:00
Borislav Petkov	b69b29de65	EDAC, AMD: carve out MCi_STATUS decoding The MCi_STATUS registers have most field definitions in common so decode them in the general path. Do not pass ecc_type along and compute it in __amd64_decode_bus_error instead. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 19:01:07 +02:00
Borislav Petkov	549d042df2	x86, mce: pass mce info to EDAC for decoding Move NB decoder along with required defines to EDAC MCE core. Add registration routines for further decoding of the MCE info in the AMD64 EDAC module. CC: Andi Kleen <andi@firstfloor.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:59:17 +02:00
Borislav Petkov	ecaf5606de	amd64_edac: cleanup amd64_decode_bus_error Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:58:37 +02:00
Borislav Petkov	b7225e4fc1	amd64_edac: remove memory and GART TLB error decoders Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:58:29 +02:00
Borislav Petkov	5110dbdeab	amd64_edac: cleanup/complete NB MCE decoding * don't dump info which mcheck already does * update to newest BKDG * mv amd64_process_error_info -> amd64_decode_nb_mce * shorten error struct names * remove redundant info ptr in amd64_process_error_info * remove unused ErrorCodeExt[19:16] (MCx_STATUS) defines Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:58:25 +02:00
Borislav Petkov	ef44cc4c22	amd64_edac: cleanup amd64_process_error_info * mv amd64_error_info_regs -> err_regs * remove redundant info ptr Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:58:18 +02:00
Borislav Petkov	1c43f2e24d	EDAC: beef up ErrorCodeExt error signatures Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:58:14 +02:00
Borislav Petkov	b70ef01016	EDAC: move MCE error descriptions to EDAC core This is in preparation of adding AMD-specific MCE decoding functionality to the EDAC core. The error decoding macros originate from the AMD64 EDAC driver albeit in a simplified and cleaned up version here. While at it, add macros to generate the error description strings and use them in the error type decoders directly which removes a bunch of code and makes the decoding functions much more readable. Also, fix strings and shorten macro names. Remove superfluous htlink_msgs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-09-14 18:57:48 +02:00
Doug Thompson	c2718348b4	amd64_edac: print debug statements only on error Add forgotten return calls for the successful cases. Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-08-04 12:10:06 +02:00
Doug Thompson	126b67b8d2	amd64_edac: fix ECC checking On the good path of BIOS enabled ECC and no override, the value returned is 1 by omission and thus is deemed failing by the probe-function. Allow proper module initialization by clearing the retval explicitly. Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-08-03 16:54:20 +02:00
Lu Zhihe	3d768213a6	edac: x38 fix mchbar high register addr Intel X38 MCHBAR is a 64bits register, base from 0x48, so its higher base is 0x4C. Signed-off-by: Lu Zhihe <tombowfly@gmail.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> [2.6.30.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-29 19:10:34 -07:00
Wan Wei	4afcd2dcc6	amd64_edac: read the right F2 maskoffset reg Signed-off-by: Wan Wei <onewayforever@gmail.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-07-27 14:42:24 +02:00
Yang Shi	b1cfebc923	edac: add DDR3 memory type for MPC85xx EDAC Since some new MPC85xx SOCs support DDR3 memory now, so add DDR3 memory type for MPC85xx EDAC. Signed-off-by: Yang Shi <yang.shi@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-30 18:55:59 -07:00
Borislav Petkov	37da045067	amd64_edac: misc small cleanups - cleanup debug calls - shorten function names - cleanup error exit paths Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-26 13:06:41 +02:00
Borislav Petkov	30c875cbc1	amd64_edac: fix ecc_enable_override handling amd64_check_ecc_enabled() returns non-zero status when ECC checking/correcting is disabled and this fails further loading of the driver even when 'ecc_enable_override' boot param is used. Fix that by clearing return status in that case. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-26 13:06:41 +02:00
Borislav Petkov	584fcff428	amd64_edac: check only ECC bit in amd64_determine_edac_cap Checking whether the machine is using ECC enabled DRAM is done through testing the DimmEccEn bit in the DRAM Cfg Low register (F2x[1,0]90). Do that instead of testing all bits from the DimmEccEn upwards. Also, remove mci->edac_cap assignment and use value returned from amd64_determine_edac_cap(). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-26 13:06:40 +02:00
GeunSik Lim	e24aca672f	edac: Kconfig: fix the meaning of EDAC abbreviation Fix the meaning of EDAC(Error Detection And Correction) correctly. [akpm@linux-foundation.org: add missing space] Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-18 13:03:57 -07:00
Mike Frysinger	20ea8fad9e	edac: add missing __devexit_p() The remove function uses __devexit, so the .remove assignment needs __devexit_p() to fix a build error with hotplug disabled. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-18 13:03:57 -07:00
Harry Ciao	1dc9b70d7d	edac: add edac_device_alloc_index() Add edac_device_alloc_index(), because for MAPLE platform there may exist several EDAC driver modules that could make use of edac_device_ctl_info structure at the same time. The index allocation for these structures should be taken care of by EDAC core. [akpm@linux-foundation.org: cleanups] Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-18 13:03:56 -07:00
Harry Ciao	2a9036afff	edac: add CPC925 Memory Controller driver Introduce IBM CPC925 EDAC driver, which makes use of ECC, CPU and HyperTransport Link error detections and corrections on the IBM CPC925 Bridge and Memory Controller. [akpm@linux-foundation.org: cleanup] Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-18 13:03:56 -07:00
Martin Olsson	98a1708de1	trivial: fix typos s/paramter/parameter/ and s/excute/execute/ in documentation and source comments. Signed-off-by: Martin Olsson <martin@minimum.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:46 +02:00
Borislav Petkov	9456ffffcf	EDAC: do not enable modules by default Prevent EDAC compilation units from being built by default and let the user explicitly select the needed modules. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:19:41 +02:00
Borislav Petkov	3d37329045	amd64_edac: do not enable module by default While at it, fix a link failure when !K8_NB. Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:19:40 +02:00
Doug Thompson	7d6034d321	amd64_edac: add module registration routines Also, link into Kbuild by adding Kconfig and Makefile entries. Borislav: - Kconfig/Makefile splitting - use zero-sized arrays for the sysfs attrs if not enabled - rename sysfs attrs to more conform values - shorten CONFIG_ names - make multiple structure members assignment vertically aligned - fix/cleanup comments - fix function return value patterns - fix err labels - fix a memleak bug caught by Ingo - remove the NUMA dependency and use num_k8_northbrides for initializing a driver instance per NB. - do not copy the pvt contents into the mci struct in amd64_init_2nd_stage() and save it in the mci->pvt_info void ptr instead. - cleanup debug calls - simplify amd64_setup_pci_device() Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:19:28 +02:00
Doug Thompson	f9431992b6	amd64_edac: add ECC reporting initializers Borislav: - convert to the new {rd\|wr}msr_on_cpus interfaces. - convert pvt->old_mcgctl to a bitmask thus saving some bytes - fix/cleanup comments - fix function return value patterns - add a proper bugfix found by Doug to amd64_check_ecc_enabled where we missed checking for the ECC enabled bit in NB CFG. - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:19:01 +02:00
Doug Thompson	0ec449ee95	amd64_edac: add EDAC core-related initializers Borislav: - add a amd64_free_mc_sibling_devices() helper instead of opencoding the release-path. - fix/cleanup comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:19:00 +02:00
Doug Thompson	d27bf6fa36	amd64_edac: add error decoding logic Borislav: - fold amd64_error_info_valid() into its only user - fix/cleanup comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:59 +02:00
Doug Thompson	b1289d6f9d	amd64_edac: add ECC chipkill syndrome mapping table Borislav: - fix comments - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:58 +02:00
Doug Thompson	4d37607adb	amd64_edac: add per-family descriptors Borislav: - fix comments - fix function return value patterns Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:57 +02:00
Doug Thompson	f71d0a0500	amd64_edac: add F10h-and-later methods-p3 Borislav: - compute dct_sel_base_off in f10_match_to_this_node() correctly since it cannot be assumed that the Reserved bits are zero and they have to be masked out instead. - cleanup, remove StinkyIdentifiers, simplify logic - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:56 +02:00
Doug Thompson	6163b5d4fb	amd64_edac: add F10h-and-later methods-p2 Borislav: - fix a wrong negation in f10_determine_base_addr_offset() - fix a wrong mask in f10_determine_base_addr_offset() which should select DctSelBaseAddr[31:11] and not [31:16] as it was before - remove StinkyIdentifiers, trivially simplify code. - fix/cleanup comments - fix function return value patterns Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:56 +02:00
Doug Thompson	1afd3c98b5	amd64_edac: add F10h-and-later methods-p1 Borislav: Fail f10_early_channel_count() if error encountered while reading a NB register since those cached register contents are accessed afterwards. - fix/cleanup comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:55 +02:00
Doug Thompson	ddff876d20	amd64_edac: add k8-specific methods Borislav: - fix/cleanup/move comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:54 +02:00
Doug Thompson	94be4bff21	amd64_edac: assign DRAM chip select base and mask in a family-specific way Borislav: - cleanup/fix comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:53 +02:00
Doug Thompson	2da11654ea	amd64_edac: add helper to dump relevant registers Borislav: - cleanup/fix comments - fix function return value patterns - cleanup dbg calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:52 +02:00
Doug Thompson	93c2df58b5	amd64_edac: add DRAM address type conversion facilities Borislav: - cleanup/fix comments, add BKDG refs - fix function return value patterns - cleanup dbg calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:51 +02:00
Doug Thompson	e2ce7255e8	amd64_edac: add functionality to compute the DRAM hole Borislav: - cleanup/fix comments, add BKDG refs - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:50 +02:00
Doug Thompson	6775763a23	amd64_edac: add sys addr to memory controller mapping helpers Borislav: - cleanup comments - cleanup debug calls - simplify find_mc_by_sys_addr's exit path Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:49 +02:00
Doug Thompson	2bc6541872	amd64_edac: add memory scrubber interface Borislav: - fix/cleanup comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:49 +02:00
Doug Thompson	b52401cece	amd64_edac: add MCA error types Borislav: - cleanup comments Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:48 +02:00
Doug Thompson	eb919690be	amd64_edac: add DRAM error injection logic using sysfs Borislav: - rename sysfs attrs to more conform names - cleanup/fix comments according to BKDG text - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:47 +02:00
Doug Thompson	fd3d6780f7	amd64_edac: add debugging/testing code This is for dumping different registers and testing the address mapping logic using the ECC syndromes. Borislav: - split sysfs attrs per file - use more conform names for the sysfs attrs - fix function return value patterns - cleanup/fix comments - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:46 +02:00
Doug Thompson	cfe40fdb4a	amd64_edac: add driver header Borislav: - remove register bit descriptions (complete text in BKDG) - cleanup and remove excessive/superfluous comments Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:45 +02:00
Borislav Petkov	d357cbb445	edac: fold __func__ into edac_debug_printk This shortens debugfX() calls a bit. Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> CC: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2009-06-10 12:18:44 +02:00
Harry Ciao	715fe7af9f	edac: AMD8111 & AMD8131 Kconfig fixup The amd8111_edac.c driver will fail allmodconfig on architectures other than PPC, introduce Kconfig dependency to avoid this, since both AMD8111 and AMD8131 chips are only adopted on Maple so far. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-29 08:40:03 -07:00
Harry Ciao	56ec0c7b88	edac: AMD8111 & AMD8131 use dev_name() The "bus_id" member in the device structure has been obsolete, use dev_name() instead. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-29 08:40:03 -07:00
Dave Jiang	55e5750b3e	edac: ppc mpc85xx fix mc err detect Error found by Jeff Haran. The error detect register is 0s when no errors are detected. The check code is incorrect, so reverse check sense. Reported-by: Jeff Haran <jharan@Brocade.COM> Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-21 13:41:51 -07:00
Jean Delvare	fbeb438474	edac: use to_delayed_work() The edac-core driver includes code which assumes that the work_struct which is included in every delayed_work is the first member of that structure. This is currently the case but might change in the future, so use to_delayed_work() instead, which doesn't make such an assumption. linux-2.6.30-rc1 has the to_delayed_work() function that will allow this patch to work Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-13 15:04:34 -07:00
Jeff Haran	e6da46b273	edac: fix local pci_write_bits32 Fix the edac local pci_write_bits32 to properly note the 'escape' mask if all ones in a 32-bit word. Currently no consumer of this function uses that mask, so there is no danger to existing code. Signed-off-by: Jeff Haran <jharan@Brocade.COM> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-13 15:04:33 -07:00
Harry Ciao	58b4ce6f24	edac: AMD8111 driver Kconfig & Makefile Introduce Kconfig and Makefile options for AMD8111 EDAC driver. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:04 -07:00
Harry Ciao	e876558415	edac: AMD8131 driver Kconfig & Makefile Introduce Kconfig and Makefile options for AMD8131 EDAC driver. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:04 -07:00
Harry Ciao	28d16272b1	edac: AMD8131 driver source file Introduce AMD8131 EDAC driver source file, which makes use of error detections on the PCI-X Bridge Controllers on the AMD8131 HyperTransport PCI-X Tunnel. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:04 -07:00
Harry Ciao	a35a281880	edac: AMD8131 driver header file Introduce AMD8131 EDAC driver header file, which adds register and bits definitions for the PCI-X Bridge Controller on the AMD8131 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Harry Ciao	8641a3845d	edac: Add edac_pci_alloc_index() Add edac_pci_alloc_index(), because for MAPLE platform there may exist several EDAC driver modules that could make use of edac_pci_ctl_info structure at the same time. The index allocation for these structures should be taken care of by EDAC core. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Harry Ciao	697dab6484	edac: AMD8111 driver source file Introduce AMD8111 EDAC driver source file, which makes use of error detections on the LPC Bridge Controller and PCI Bridge Controller on the AMD8111 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Harry Ciao	ec2cf2e272	edac: AMD8111 driver header file Introduce AMD8111 EDAC driver header file, which adds register and bits definitions for the LPC Bridge Controller and PCI Bridge Controller on the AMD8111 HyperTransport I/O Hub. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Grant Erickson	dba7a77c0e	edac: new ppc4xx driver module This adds support for an EDAC memory controller adaptation driver for the "ibm,sdram-4xx-ddr2" ECC controller realized in the AMCC PowerPC 405EX[r]. At present, this driver has been developed and tested against the controller realization in the AMCC PPC405EX[r] on the AMCC Kilauea and Haleakala boards (256 MiB w/o ECC memory soldered onto the board) and a proprietary board based on those designs (128 MiB ECC memory, also soldered onto the board). In the future, dynamic feature detection and handling needs to be added for the other realizations of this controller found in the 440SP, 440SPe, 460EX, 460GT and 460SX. Eventually, this driver will likely be evolved and adapted to the above variant realizations of this controller as well as broken apart to handle the other known ECC-capable controllers prevalent in other PPC4xx processors: - IBM SDRAM (405GP, 405CR and 405EP) "ibm,sdram-4xx" - IBM DDR1 (440GP, 440GX, 440EP and 440GR) "ibm,sdram-4xx-ddr" - Denali DDR1/DDR2 (440EPX and 440GRX) "denali,sdram-4xx-ddr2" [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Grant Erickson <gerickson@nuovations.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Doug Thompson	4577ca5568	edac: remove EDAC's experimental status After 3 years, this is a patch to remove the EXPERIMENTAL tag on EDAC. We now have many module drivers submitters in EDAC and believe EDAC is no longer EXPERIMENTAL Signed-off-by: Doug Thompson <dougthompson@xmission.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:03 -07:00
Hitoshi Mitake	cc18e3cd53	edac: add more verbose debug info A patch for making a debugging information more verbose for use in development debugging. By enabling the new option "More verbose debugging", information about source file and line number will be added to debugging message. This is sample output, EDAC MC0: Giving out device to 'e7xxx_edac' 'E7205': DEV 0000:00:00.0 EDAC DEBUG: in drivers/edac/edac_pci.c, line at 48: edac_pci_alloc_ctl_info() EDAC DEBUG: in drivers/edac/edac_pci.c, line at 334: edac_pci_add_device() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:05:02 -07:00
Kay Sievers	031d551859	edac: struct device - replace bus_id with dev_name(), dev_set_name() Cc: dougthompson@xmission.com Cc: bluesmoke-devel@lists.sourceforge.net Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>	2009-03-24 16:38:21 -07:00
Stephen Rothwell	4712fff9be	powerpc: More printing warning fixes for the l64 to ll64 conversion These are all powerpc specific drivers. res.start in fsl_elbc_nand.c needs to be cast since it may be either 32 or 64 bit. Thanks to Scott Wood for noticing. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Arnd Bergmann <arnd@arndb.de> call_edac bits in particular Acked-by: Olof Johansson <olof@lixom.net> pasemi_nand peices Acked-by: Scott Wood <scottwood@freescale.com> fsl_elbc fixes Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-01-28 17:15:52 +11:00
Mauro Carvalho Chehab	8375d4909a	edac: driver for i5400 MCH (update) Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Mauro Carvalho Chehab	920c8df6ac	edac: driver for i5400 MCH (Seaburg) EDAC driver for i5400 MCH (Seaburg) This driver adds support for i5400 MCH chipset. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Ben Woodard <woodard@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Kumar Gala	29d6cf26a7	edac: fix mpc85xx and add mpc8536 mpc8560 All other compatibles that are uniquely identifying the processor use a prefix of the form fsl,mpc85...'. We add support for it so we can deprecate the older 'fsl,85...' that was improperly used here. Additionally added mpc8536 & mpc8560 to the compatible lists. This patch is based on Nate's 8572 patch. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Dave Jiang <djiang@mvista.com> Cc: Nate Case <ncase@xes-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Kay Sievers	281efb17d8	edac: struct device: replace bus_id with dev_name(), dev_set_name() This patch is part of a larger patch series which will remove the "char bus_id[20]" name string from struct device. The device name is managed in the kobject anyway, and without any size limitation, and just needlessly copied into "struct device". [akpm@linux-foundation.org: coding-style fixes] Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Arjan van de Ven	1dca00bd02	pci: use pci_ioremap_bar() in drivers/edac Use the newly introduced pci_ioremap_bar() function in drivers/edac. pci_ioremap_bar() just takes a pci device and a bar number, with the goal of making it really hard to get wrong, while also having a central place to stick sanity checks. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Linus Torvalds	3c92ec8ae9	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions	2008-12-28 16:54:33 -08:00
Harry Ciao	d519c8d9cc	edac: fix edac core deadlock when removing a device When deleting an edac device, we have to wait for its edac_dev.work to be completed before deleting the whole edac_dev structure. Since we have no idea which work in current edac_poller's workqueue is the work we are conerned about, we wait for all work in the edac_poller's workqueue to be proceseed. This is done via flush_cpu_workqueue() which inserts a wq_barrier into the tail of the workqueue and then sleeping on the completion of this wq_barrier. The edac_poller will wake up sleepers when it is found. EDAC core creates only one kernel worker thread, edac_poller, to run the works of all current edac devices. They share the same callback function of edac_device_workq_function(), which would grab the mutex of device_ctls_mutex first before it checks the device. This is exactly where edac_poller and rmmod would have a great chance to deadlock. In below call trace of rmmod > ... > edac_device_del_device > edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue, device_ctls_mutex would have already been grabbed by edac_device_del_device(). So, on one hand rmmod would sleep on the completion of a wq_barrier, holding device_ctls_mutex; on the other hand edac_poller would be blocked on the same mutex when it's running any one of works of existing edac evices(Note, this edac_dev.work is likely to be totally irrelevant to the one that is being removed right now)and never would have a chance to run the work of above wq_barrier to wake rmmod up. edac_device_workq_teardown() should not be called within the critical region of device_ctls_mutex. Just like is done in edac_pci_del_device() and edac_mc_del_mc(), where edac_pci_workq_teardown() and edac_mc_workq_teardown() are called after related mutex are released. Moreover, an edac_dev.work should check first if it is being removed. If this is the case, then it should bail out immediately. Since not all of existing edac devices are to be removed, this "shutting flag" should be contained to edac device being removed. The current edac_dev.op_state can be used to serve this purpose. The original deadlock problem and the solution have been witnessed and tested on actual hardware. Without the solution, rmmod an edac driver would result in below deadlock: root@localhost:/root> rmmod mv64x60_edac EDAC DEBUG: mv64x60_dma_err_remove() EDAC DEBUG: edac_device_del_device() EDAC DEBUG: find_edac_device_by_dev() (hang for a moment) INFO: task edac-poller:2030 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. edac-poller D 00000000 0 2030 2 Call Trace: [df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable) [df159e80] [c000a024] __switch_to+0x6c/0xa0 [df159ea0] [c03587d8] schedule+0x2f4/0x4d8 [df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174 [df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core] [df159f60] [c003beb4] run_workqueue+0x114/0x218 [df159f90] [c003c674] worker_thread+0x5c/0xc8 [df159fd0] [c004106c] kthread+0x5c/0xa0 [df159ff0] [c0013538] original_kernel_thread+0x44/0x60 INFO: task rmmod:2062 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. rmmod D 0ff2c9fc 0 2062 1839 Call Trace: [df119c00] [c0437a74] 0xc0437a74 (unreliable) [df119cc0] [c000a024] __switch_to+0x6c/0xa0 [df119ce0] [c03587d8] schedule+0x2f4/0x4d8 [df119d40] [c03591dc] schedule_timeout+0xb0/0xf4 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-23 15:58:21 -08:00
Benjamin Krill	def434c231	powerpc/cell: add QPACE as a separate Cell platform Since the QPACE (Chromodynamics Parallel Computing on the Cell Broadband Engine) platform doesn't use a iommu, doesn't have PCI devices and a MPIC much lesser setup and configurations are needed. So far all devices are detected as OF device. A notifier function is used to set the dma_ops for the of_platform bus. Further this patch splits the PPC_CELL_NATIVE into PPC_CELL_COMMON which are parts that are shared with the QPACE platform and the rest. Signed-off-by: Benjamin Krill <ben@codiert.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2008-12-22 22:19:19 +01:00
Jarkko Lavinen	09a81269c7	i82875p_edac: fix module remove Fix module removal bugs of i82875p_edac. Also i82975x_edac code seems to have the same module removal bugs as in i82875p_edac. The problems were: 1. In module removal i82875p_remove_one() is never called. Variable i82875p_registered is newer changed from 1, which guarantees i82875p_remove_one() is not called (and even if it were called, it would be called in wrong order). As a result, the edac_mc workque is not stopped and keeps probing. If kernel debugging options are not enabled, user may not notice anything going wrong. if debugging options are enabled and I do "rmmod i82875p_edac", I get: edac debug: edac_pci_workq_function() checking BUG: unable to handle kernel paging request at f882d16f ... call trace: [<f8834df3>] ? edac_mc_workq_function+0x55/0x7e [edac_core] [<c0233974>] ? run_workqueue+0xd7/0x1a5 [<c023392f>] ? run_workqueue+0x92/0x1a5 [<f8834d9e>] ? edac_mc_workq_function+0x0/0x7e [edac_core] [<c0233af9>] ? worker_thread+0xb7/0xc3 [<c0236a7b>] ? autoremove_wake_function+0x0/0x33 [<c0233a42>] ? worker_thread+0x0/0xc3 [<c0236809>] ? kthread+0x3b/0x61 [<c02367ce>] ? kthread+0x0/0x61 [<c0204587>] ? kernel_thread_helper+0x7/0x10 Fix for this is to get rid of needles variable i82875p_registered altogether and run i82875p_remove_one() before pci_unregister_driver(). 2. edac_mc_del_mc() uses mci after freeing mci edac_mc_del_mc() calls calls edac_remove_sysfs_mci_device(). The kobject refcount of mci drops to 0 and mci is freed. After this mci is accessed via debug print and i82875p_remove_one() still uses mci->pvt and tries to free mci again with edac_mc_free(). The fix for this is add kobject_get(&mci->edac_mci_kobj) after edac_mc_alloc(). Then the mci is still available after returning from edac_mc_del_mc() with refcount 1, and mci->pvt is still available. When i82875p_remove_one() finally calls edac_mc_free(), this will cause kobject_put() and mci is released properly. Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:25 -08:00
Jarkko Lavinen	307d114441	i82875p_edac: fix overflow device resource setup When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26 or later, the module load fails due to BAR 0 collision. On 2.6.25 the module loads just fine. The overflow device on the MB seems to be hidden and its resources are not allocated at normal PCI bus init. Log shows the missing resource problem: EDAC DEBUG: i82875p_probe1() PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff] pci 0000:00:06.0: device not available because of BAR 0 [0xfecf0000-0xfecf0fff] collisions EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow device The patch below fixes this by calling pci_bus_assign_resources() after the overflow device is revealed and added to the bus. With this patch I am again able to load and use the module. Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:25 -08:00
Darrick J. Wong	f0f7e0dc73	i5000-edac: hold reference to mci kobject It turns out that edac_mc_del_mc will kobject_put the last kref on the mci object. If the timing is just right, that means that the mci object is freed before before i5000_remove_one has a chance to free the resources associated with it, causing a null pointer exceptions when unloading the driver. Insert a kobject_{get,put} pair so that this doesn't happen. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-12 17:17:16 -08:00
Benjamin Herrenschmidt	992b692dcf	edac: fix enabling of polling cell module The edac driver on cell turned out to be not enabled because of a missing op_state. This patch introduces it. Verified to work on top of Ben's next branch. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Osterkamp <jens@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:46 -07:00
Hitoshi Mitake	df8bc08c19	edac x38: new MC driver module I wrote a new module for Intel X38 chipset. This chipset is very similar to Intel 3200 chipset, but there are some different points, so I copyed i3200_edac.c and modified. This is Intel's web page describing this chipset. http://www.intel.com/Products/Desktop/Chipsets/X38/X38-overview.htm I've tested this new module with broken memory, and it seems to be working well. Signed-off-by: Hitoshi Mitake <mitake@clustcom.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-30 11:38:45 -07:00
Benjamin Herrenschmidt	3b274f44d2	edac cell: fix incorrect edac_mode The cell_edac driver is setting the edac_mode field of the csrow's to an incorrect value, causing the sysfs show routine for that field to go out of an array bound and Oopsing the kernel when used. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> [2.6.27.x, 2.6.26.x. 2.6.25.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:52:40 -07:00
Aristeu Rozanski	8360e81b5d	edac i5000: fix thermal issues Make the Thermal messages (temperature got past Tmid) be displayed only once because: 1) it's the BIOS job to configure and handle the memory throttling 2) if the BIOS is broken or is aware about the condition, flooding the system logs won't help anything. 3) According to the specification update for Intel 5000 MCHs, all the revisions of this MCH have problems on the thermal sensors, making not automatic (a.k.a. intelligent thermal throttling) impossible. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:48 -07:00
Aristeu Rozanski	c066740739	edac i5000: fix error messages Update the i5000_edac messages, making everything pass through the EDAC (so the log controls will work) and being more specific about the errors. Also, it makes the miscellaneous errors optional and disabled by default. As I didn't found anywhere information about M23ERR-M26ERR (FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:48 -07:00
Andrew Kilkenny	60be75515e	edac mpc85xx: add support for mpc8572 This adds support for the dual-core MPC8572 processor. We have to support making SPR changes on each core. Also, since we can have multiple memory controllers sharing an interrupt, flag the interrupts with IRQF_SHARED. Signed-off-by: Andrew Kilkenny <akilkenny@xes-inc.com> Signed-off-by: Nate Case <ncase@xes-inc.com> Acked-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:48 -07:00
Vladislav Bogdanov	53a2fe5804	edac: make i82443bxgx_edac coexist with intel_agp Fix 443BX/GX MCH suppport in a EDAC. It makes i82443bxgx_edac coexist with intel_agp using the same approach as several other EDAC drivers. Tested on Intel's L443GX with redhat's 2.6.18 with whole EDAC subsystem backported a while ago. [root@host ~]# dmesg\|grep -iE '(AGP\|EDAC)' Linux agpgart interface v0.101 (c) Dave Jones agpgart: Detected an Intel 440GX Chipset. agpgart: AGP aperture is 64M @ 0xf8000000 EDAC MC: Ver: 2.1.0 Jun 27 2008 EDAC MC0: Giving out device to 'i82443bxgx_edac' 'I82443BXGX': DEV 0000:00:00.0 EDAC PCI0: Giving out device to module 'i82443bxgx_edac' controller 'EDAC PCI controller': DEV '0000:00:00.0' (POLLED) Signed-off-by: Vladislav Bogdanov <slava@nsys.by> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:48 -07:00
Adrian Bunk	7a8fc9b248	removed unused #include <linux/version.h>'s This patch lets the files using linux/version.h match the files that #include it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-23 12:14:12 -07:00
Dave Jiang	f87bd330ed	edac: mpc85xx fix pci ofdev 2nd pass Convert PCI err device from platform to open firmware of_dev to comply with powerpc schemes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Dave Jiang	fcb19171d1	edac: mv64x60 add pci fixup Fixup of missing bit 0 on 64360 PCIx_ERR_MASK and errata FEr-#11 and FEr-#16 for the 64460. Bit 0 must remain 0. Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Dave Jiang	596d394103	edac: mv64x60 fix get_property Update get_property() call to use of_get_property() in order to fix compile Signed-off-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Doug Thompson <dougthompson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Doug Thompson	10d33e9c36	edac: e752x fix too loud on nonmemory errors This module harvests more than just memory errors, it also harvests various bus and dma errors that the Chipset detects. Previously, it would report all such errors, which would cause output to be TOO loud. This patches therefore adds a parameter which is used to turn off NON-MEMORY error reports by default. Or the reporting can be enabled via the parameter Also did code style cleanup: less than 80 characters per line rule Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Arthur Jones	124682c785	edac: core fix added newline to sysfs dimm labels The channel DIMM label does not seem to be used much in the edac code. However, where it is used (in the core code), it is assumed to not have a newline embedded. This leaves the sysfs file newline free which looks funny when cat'ing it. Here we just add the trailing newline to the sysfs chX_dimm_label output... [Doug Thompson note: the DIMM label is one of the primary uses of EDAC. User space daemon scripts, edac-utils@sourceforge, populate the DIMM label fields, via /sys/devices/system/edac attributes, with the silk screen labels of the motherboard in use. dmidecode access BIOS tables, but BIOS tables are well known to be incorrect and useless in these respects. edac-utils will strip off any newlines before its use of the output, when displaying DIMM slot silk screen labels. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Arthur Jones	f9fc82adca	edac: core fix static to dynamic kset Static kobjects and ksets are not supported in Linux kernel. Convert the mc_kset from static to dynamic. This patch depends on my previous patch to remove the module parameter attributes from mc... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Arthur Jones	327dafb1c6	edac: core fix redundant sysfs controls to parameters /sys/devices/system/edac/mc has a few files which are duplicated in /sys/module/edac_core/parameters. Now that all the functionality is duplicated between these two locations, we remove the former kobject attributes and update the documentation. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Arthur Jones	096846e2b0	edac: core fix workq timer When updating the edac_mc_poll_msec module parameter from the sysfs /sys/module/edac_core/parameters/edac_mc_poll_msec file, we don't update the workq timers. So that, if we move from a big poll time to a small one, the small one won't take effect until the big one has timed out. Here we provide a new module parameter set method to call out to the update routine. This brings the /sys/module/edac_core/parameters functionality up to that provided by the /sys/drivers/system/edac/mc sysfs module parameter files so that we can remove them or at least link to the /sys/module files... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:49 -07:00
Arthur Jones	14cc571bb1	edac: core fix to use dynamic kobject Static kobjects are not supported in linux kernel. Convert the edac_pci_top_main_kobj from static to dynamic. This avoids the double free of the edac_pci_top_main_kobj.name that we see on module reload of the e752x edac driver (and probably others as well). In addition Greg KH <greg@kroah.com> has pointed out that this code may be cleaned up significantly. I will look at that as a follow-on patch, for now, I just want the minimum fix to get this double-free oops bug squashed... Many thanks to Greg KH for his patience in showing me what the Documentation/kobject.txt already said (oops)... Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Arthur Jones	b238e57723	edac: i5100: cleanup Some code cleanliness issues found by Andrew Morton (thanks!) which should not affect functionality, but which should help make the code more maintainable. In particular, we now: * convert all #define's w/ a parameter to static inlines * use 1UL rather than 1ULL when calculating an unsigned long * use pci_disable_device The resulting code is tested and seems to work fine... Signed-off-by: Arthur Jones <ajones@riverbed.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Arthur Jones	178d5a7422	edac: i5100 fix unmask ecc bits Explicitly unmask ECC errors we are interested in reporting. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Arthur Jones	43920a598f	edac: i5100 fix enable ecc hardware It is possible that the BIOS did not enable ECC at boot time. We check for that case and fail to load if it is true. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Arthur Jones	f7952ffcff	edac: i5100 fix missing bits The error mask we use to trigger ECC notifications is missing many bits of interest. We add these bits here so that all possible ECC errors can be reported. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Arthur Jones	8f421c595a	edac: i5100 new intel chipset driver Preliminary support for the Intel 5100 MCH. CE and UE errors are reported along with the current DIMM label information and other memory parameters. Reasons why this is preliminary: 1) This chip has 2 independent memory controllers which, for best perforance, use interleaved accesses to the DDR2 memory. This architecture does not map very well to the current edac data structures which depend on symmetric channel access to the interleaved data. Without core changes, the best I could do for now is to map both memory controllers to different csrows (first all ranks of controller 0, then all ranks of controller 1). Someone much more familiar with the edac core than I will probably need to come up with a more general data structure to handle the interleaving and de-interleaving of the two memory controllers. 2) I have not yet tackled the de-interleaving of the rank/controller address space into the physical address space of the CPU. There is nothing fundamentally missing, it is just ending up to be a lot of code, and I'd rather keep it separate for now, esp since it doesn't work yet... 3) The code depends on a particular i5100 chip select to DIMM mainboard chip select mapping. This mapping seems obvious to me in order to support dual and single ranked memory, but it is not unique and DIMM labels could be wrong on other mainboards. There is no way to query this mapping that I know of. 4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per controller, 2 ranks per DIMM are supported. I do not have hardware (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks per controller) mode. 5) The serial presence detect code should be broken out into a "real" i2c driver so that decode-dimms.pl can work. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Maxim Shchetynin	c134fd868f	powerpc/cell/edac: Log a syndrome code in case of correctable error If correctable error occurs the syndrome code was logged as 0. This patch lets EDAC to log a correct syndrome code to make problem investigation easier. Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-22 10:39:36 +10:00
Kumar Gala	f99c90094b	edac: mpc85xx: fix building as a module including of <asm/mpc85xx.h> causes build problems since it doesn't exist. Also removed warning: drivers/edac/mpc85xx_edac.c:45: warning: 'mpc85xx_ctl_name' defined but not used Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Dave Jiang <djiang@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:13 -07:00
Stephen Rothwell	17aa7e0344	dev_name introduction fall out fix Commit `06916639e2` ("driver-core: add dev_name() to help transition away from using bus_id") added a static inline dev_name() and used it in dev_printk. Unfortunately, drivers/edac/edac_core.h defines a macro called dev_name(). Rename the latter. Diagnosis by Tony Breeds and Michael Ellerman. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-05 15:08:38 -07:00
Stephen Rothwell	a94a630a4c	pasemi_edac needs to include linux/edac.h Commit `c3c52bce69` ("edac: fix module initialization on several modules 2nd time") added a call to opstate_init but did not include linux/edac.h that declares it. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 19:06:57 -07:00
Hitoshi Mitake	c3c52bce69	edac: fix module initialization on several modules 2nd time I implemented opstate_init() as a inline function in linux/edac.h. added calling opstate_init() to: i82443bxgx_edac.c i82860_edac.c i82875p_edac.c i82975x_edac.c I wrote a fixed patch of edac-fix-module-initialization-on-several-modules.patch, and tested building 2.6.25-rc7 with applying this. It was succeed. I think the patch is now correct. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Adrian Bunk	1a45027d1a	edac: remove unneeded functions and add static accessor Collection of patches, merged into one, from Adrian that do the following: 1) This patch makes the following needlessly global functions static: - edac_pci_get_log_pe() - edac_pci_get_log_npe() - edac_pci_get_panic_on_pe() - edac_pci_unregister_sysfs_instance_kobj() - edac_pci_main_kobj_setup() 2) Remove unneeded function edac_device_find() 3) Added #if 0 around function edac_pci_find() 4) make the needlessly global edac_pci_generic_check() static 5) Removed function edac_check_mc_devices() Doug Thompson modified Adrian's patches, to bettern represent the direction of EDAC, and make them one patch. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Robert P. J. Day	ff6ac2a616	edac: use the shorter LIST_HEAD for brevity Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Acked-by: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Peter Tyser	94ee1cf5a8	edac: add e752x parameter for sysbus_parity selection Add a module parameter "sysbus_parity" to allow forcing system bus parity error checking on or off. Also add support to automatically disable system bus parity errors for processors which do not support it. If the sysbus_parity parameter is specified, sysbus parity detection will be forced on or off. If it is not specified, the driver will attempt to look at the CPU identifier string and determine if the CPU supports system bus parity. A blacklist was used instead of a whitelist so that system bus parity would be enabled by default and to minimize the chances of breaking things for those people already using the driver which for some reason have a processor that does not have a valid CPU identifier string. [akpm@linux-foundation.org: coding-style fixes] Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Andrei Konovalov	5135b797c8	edac: new support for Intel 3100 chipset Add Intel 3100 chipset support to e752x EDAC driver. Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrei Konovalov <akonovalov@ru.mvista.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Jason Uhlenkott	870897a5ab	drivers/edac/i3000: document type promotion By popular request, add a comment documenting the implicit type promotion here. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Hitoshi Mitake	7ed31e0fa0	drivers/edac: i3000: missing init code There is a missing sequence of initialization code during startup. Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: Doug Thompson <dougthompson@xmisson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Doug Thompson	cd4755c2a9	drivers/edac: mpc85xx: add static scope Made a previous global variable, static in scope Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Jason Uhlenkott	f5c0454c86	drivers/edac: i3000: 64bit build Modified to run on x86_64 as well as x86 i3000_edac builds (and runs) fine on x86_64. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Bryan Boatright	6b09ff9d78	drivers/edac: pci: broken parity regression Using the EDAC code in kernel.org kernel version 2.6.23.8 I am seeing the following problem: In the kernel there is a pci device attribute located in sysfs that is checked by the EDAC PCI scanning code. If that attribute is set, PCI parity/error scannining is skipped for that device. The attribute is: broken_parity_status as is located in /sys/devices/pci<XXX>/0000:XX:YY.Z directorys for PCI devices. I don't think this check was actually implemented. I have a misbehaved card that reports a parity error every 1000 ms: Nov 25 07:28:43 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0 Nov 25 07:28:44 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0 Nov 25 07:28:45 beta kernel: EDAC PCI: Master Data Parity Error on 0000:05:01.0 Setting that card's broken_parity_status bit did not mask the error: echo "1" > /sys/bus/pci/devices/0000:05:01.0/broken_parity_status I looked through the EDAC code and did not readily see any reference to broken_parity_status at all (which makes sense based on the behavior I am seeing). I applied the following patch as a proof-of-concept and now EDAC's PCI parity error reporting behaves as documented: bryan Good regression find, bryan. It used to work. sigh. I added more logic to your patch, for more coverage of the error. Doug T Signed-off-by: Bryan Boatright <b1@omega71.com> Signed-off-by: Doug Thompson <dougthompson@xmisson.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Dave Jiang	4f4aeeabc0	drivers-edac: add marvell mv64x60 driver Marvell mv64x60 SoC support for EDAC. Used on PPC and MIPS platforms. Development and testing done on PPC Motorola prpmc2800 ATCA board. [akpm@linux-foundation.org: make mv64x60_ctl_name static] Signed-off-by: Dave Jiang <djiang@mvista.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Dave Jiang	a9a753d532	drivers-edac: add freescale mpc85xx driver EDAC chip driver support for Freescale MPC85xx platforms. PPC based. Signed-off-by: Dave Jiang <djiang@mvista.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Jason Uhlenkott	4d2b165eca	drivers-edac: i3000 replace macros with functions Replace function-like macros with functions. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Jason Uhlenkott	ce783d70b9	drivers-edac: i3000 code tidying Style cleanup, mostly just 80-column fixes. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Benjamin Herrenschmidt	48764e4143	drivers-edac: add Cell MC driver Adds driver for the Cell memory controller when used without a Hypervisor such as on the IBM Cell blades. There might still be some improvements to do to this such as finding if it's possible to properly obtain more details about the address of the error but it's good enough already to report CE counts which is our main priority at the moment. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Benjamin Herrenschmidt	1d5f726cbf	drivers-edac: add Cell XDR memory types Add the definitions for the Rambus XDR memory type used by the Cell processor. It's a pre-requisite for the followup Cell EDAC patch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Anton Blanchard	c2ae24cfd1	drivers-edac: use round_jiffies_relative When rounding a relative timeout we need to use round_jiffies_relative(). Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Doug Thompson	56e61a9c5f	drivers-edac: turn on edac device error logging ENABLE the 'logging' of CE and UE events for the EDAC_DEVICE class of error harvester in EDAC Cc: Alan Cox <alan@lxorguk.ukuu.org.uk Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:23 -08:00
Joe Perches	6f042b50e0	drivers/edac/: Spelling fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2008-02-03 17:12:34 +02:00
Paul Mackerras	bd45ac0c5d	Merge branch 'linux-2.6'	2008-01-31 11:25:51 +11:00
Kay Sievers	af5ca3f4ec	Driver core: change sysdev classes to use dynamic kobject names All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	c10997f657	Kobject: convert drivers/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	b2ed215a33	Kobject: change drivers/edac to use kobject_init_and_add Stop using kobject_register, as this way we can control the sending of the uevent properly, after everything is properly initialized. Acked-by: Doug Thompson <dougthompson@xmission.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:28 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Olof Johansson	0d08a84770	[POWERPC] pasemi: Broaden specific references to 1682M There will be more product numbers in the future than just PA6T-1682M, but they will share much of the features. Remove some of the explicit references and compatibility checks with 1682M, and replace most of them with the more generic term "PWRficient". Signed-off-by: Olof Johansson <olof@lixom.net> Acked-by: Michael Buesch <mb@bu3sch.de> Acked-by: Doug Thompson <dougthompson@xmission.com>	2007-11-29 22:30:47 -06:00
Darrick J. Wong	57510c2f93	i5000_edac: no need to __stringify() KBUILD_BASENAME The i5000_edac driver's PCI registration structure has the name ""i5000_edac"" (with extra set of double-quotes) which is probably not intentional. Get rid of __stringify. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Cc: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:41 -08:00
Stephen Rothwell	1b3e4c706c	NULL terminate the pci_device_ids in pasemi_edac Fixes: drivers/edac/pasemi_edac: struct pci_device_id is 32 bytes. The last of 1 is: 0x00 0x00 0x19 0x59 0x00 0x00 0xa0 0x0a 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 FATAL: drivers/edac/pasemi_edac: struct pci_device_id is not terminated with a NULL entry! Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Douglas Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:56 -07:00
Jiri Slaby	93043ece03	define global BIT macro define global BIT macro move all local BIT defines to the new globally define macro. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Greg Kroah-Hartman	34980ca8fa	Drivers: clean up direct setting of the name of a kset A kset should not have its name set directly, so dynamically set the name at runtime. This is needed to remove the static array in the kobject structure which will be changed in a future patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-12 14:51:02 -07:00
Aristeu Rozanski	f9b5a5d193	drivers/edac: fix e752x correct return code This patch changes the error code when dev0:fun1 was hidden by BIOS to one more appropriate. Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Mark Gross <mark.gross@intel.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:19 -07:00
Doug Thompson	3c8bb2cfa2	drivers/edac: fix printk level down to debug from emerg When EDAC is configured for EDAC DEBUGGING, the debug printk output level was set TOO high (EMERG). This patch brings it down to a DEBUG level Signed-off-by: Doug Thompson <dougthompson@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:19 -07:00
Doug Thompson	ddcc3050bd	drivers/edac: fix pasemi kconfig depends Fixed 'depends on PPC_PASEMI' in EDAC Kconfig. Module PASEMI depends ONLY on the PASEMI on PPC. Was previously enabled for ALL PPC Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Egor N. Martovetsky <egor@pasemi.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:35:18 -07:00

... 8 9 10 11 12 ...

1065 Commits