Commit Graph

1473 Commits

Author SHA1 Message Date
Michael Jin
8960de4a5c EDAC, amd64: Add Family 17h, models 10h-2fh support
Add new device IDs for family 17h, models 10h-2fh.

This is required by amd64_edac_mod in order to properly detect PCI
device functions 0 and 6.

Signed-off-by: Michael Jin <mikhail.jin@gmail.com>
Reviewed-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180816192840.31166-1-mikhail.jin@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-08-27 06:17:21 +02:00
Takashi Iwai
b748f2de4b EDAC: Add missing MEM_LRDDR4 entry in edac_mem_types[]
The edac_mem_types[] array misses a MEM_LRDDR4 entry, which leads to
NULL pointer dereference when accessed via sysfs or such.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180810141426.8918-1-tiwai@suse.de
Fixes: 1e8096bb20 ("EDAC: Add LRDDR4 DRAM type")
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-08-17 15:13:34 +02:00
Masayoshi Mizuma
190bd6e98a EDAC, sb_edac: Add support for systems with segmented PCI buses
Extend the driver to check whether segment number and bus number matches
when deciding how to group memory controller PCI devices to CPU sockets.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180724190213.26359-1-msys.mizuma@gmail.com
[ Cleanup commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-07-25 11:17:15 +02:00
Kees Cook
6663484b4e EDAC, thunderx: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], switch to
using a kmalloc-allocated buffer instead of stack space. This should be
fine since the existing routine is allocating memory too.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Jan Glauber <jglauber@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180629184850.GA37464@beast
Link: https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com [1]
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-07-09 11:33:02 +02:00
Johan Hovold
6c974d4dfa EDAC, i7core: Fix memleaks and use-after-free on probe and remove
Make sure to free and deregister the addrmatch and chancounts devices
allocated during probe in all error paths. Also fix use-after-free in a
probe error path and in the remove success path where the devices were
being put before before deregistration.

Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 356f0a3086 ("i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy")
Link: http://lkml.kernel.org/r/20180612124335.6420-2-johan@kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-06-18 13:26:05 +02:00
Johan Hovold
4708aa85d5 EDAC: Fix memleak in module init error path
Make sure to use put_device() to free the initialised struct device so
that resources managed by driver core also gets released in the event of
a registration failure.

Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Denis Kirjanov <kirjanov@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 2d56b109e3 ("EDAC: Handle error path in edac_mc_sysfs_init() properly")
Link: http://lkml.kernel.org/r/20180612124335.6420-1-johan@kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-06-17 13:09:44 +02:00
Christophe JAILLET
9d72fe1ce8 EDAC, altera: Fix an error handling path in altr_s10_sdram_probe()
If regmap_write() fails, we should release some resources as done in all
the other error handling paths of the function.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180610174532.22071-1-christophe.jaillet@wanadoo.fr
Fixes: e9918d7faf ("EDAC, altera: Handle SDRAM Uncorrectable Errors on Stratix10")
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-06-17 12:23:33 +02:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Borislav Petkov
eaa3a1d46c EDAC, ghes: Make platform-based whitelisting x86-only
ARM machines all have DMI tables so if they request hw error reporting
through GHES, then the driver should be able to detect DIMMs and report
errors successfully (famous last words :)).

Make the platform-based list x86-specific so that ghes_edac can load on
ARM.

Reported-by: Qiang Zheng <zhengqiang10@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Tested-by: Qiang Zheng <zhengqiang10@huawei.com>
Link: https://lkml.kernel.org/r/1526039543-180996-1-git-send-email-zhengqiang10@huawei.com
2018-05-21 12:18:57 +02:00
Thor Thayer
9ef20753e0 EDAC, altera: Fix ARM64 build warning
The kbuild test robot reported the following warning:

  drivers/edac/altera_edac.c: In function 'ocram_free_mem':
  drivers/edac/altera_edac.c:1410:42: warning: cast from pointer to integer
	of different size [-Wpointer-to-int-cast]
    gen_pool_free((struct gen_pool *)other, (u32)p, size);
                                             ^

After adding support for ARM64 architectures, the unsigned long
parameter is 64 bits and causes a build warning on 64-bit configs. Fix
by casting to the correct size (unsigned long) instead of u32.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: c3eea1942a ("EDAC, altera: Add Altera L2 cache and OCRAM support")
Link: http://lkml.kernel.org/r/1526317441-4996-1-git-send-email-thor.thayer@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-15 11:01:26 +02:00
Randy Dunlap
de245ae073 EDAC, skx: Fix skx_edac build error when ACPI_NFIT=m
Prevent build error when CONFIG_ACPI_NFIT=m and CONFIG_EDAC_SKX=y by
limiting EDAC_SKX based on how ACPI_NFIT is set.

Fixes this build error:
  drivers/edac/skx_edac.o: In function `get_nvdimm_info':
  ../drivers/edac/skx_edac.c:399: undefined reference to `nfit_get_smbios_id'

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 58ca9ac146 ("EDAC, skx_edac: Detect non-volatile DIMMs")
Link: http://lkml.kernel.org/r/3af91354-8e19-d2af-1bba-ced8dce053f1@infradead.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-14 11:35:23 +02:00
Borislav Petkov
a0671c39dd EDAC, ghes: Use BIT() macro
... for improved readability. Also, add a local mask variable for the
same reason.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
2018-05-12 14:35:30 +02:00
Toshi Kani
ad0d73b324 EDAC, ghes: Add DDR4 and NVDIMM memory types
The ghes_edac driver obtains memory type from SMBIOS type 17, but it
does not recognize DDR4 and NVDIMM types.

Add support of DDR4 and NVDIMM types. NVDIMM type is denoted by memory
type DDR3/4 and non-volatile.

Reported-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180509222030.9299-1-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12 14:16:49 +02:00
Thor Thayer
e9918d7faf EDAC, altera: Handle SDRAM Uncorrectable Errors on Stratix10
On Stratix10, uncorrectable errors are routed to the SError exception
instead of the IRQ exceptions. In Stratix10, uncorrectable SErrors
must be treated as fatal and will cause a panic. Older Altera/Intel
parts printed out a message for UE so do that here using the notifier
framework.

Record the UE in sticky registers that retain the state through a reset.
Check these registers on probe and printout the error on startup.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: mchehab@kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1526079610-5527-1-git-send-email-thor.thayer@linux.intel.com
[ Remove unused var in s10_edac_dberr_handler(), reorder args. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12 12:29:41 +02:00
Thor Thayer
3dab6bd526 EDAC, altera: Add support for Stratix10 SDRAM EDAC
Support for Stratix10 SDRAM ECC requires the use of SMC calls to Secure
Monitor for accessing registers.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: catalin.marinas@arm.com
Cc: devicetree@vger.kernel.org
Cc: dinguyen@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: robh+dt@kernel.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1524854238-19394-3-git-send-email-thor.thayer@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12 11:13:39 +02:00
Alexandru Gagniuc
305d0e006a EDAC, ghes: Remove unused argument to ghes_edac_report_mem_error()
The use of the @ghes argument was removed in a previous commit, but
function signature was not updated to reflect this.

Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180430213358.8319-1-mr.nuke.me@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-12 10:59:15 +02:00
Colin Ian King
83e548bea4 EDAC, i7core: Fix spelling mistake: "redundacy" -> "redundancy"
Trivial fix to spelling mistake in err string.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: kernel-janitors@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180504113804.17103-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-04 14:59:29 +02:00
Sughosh Ganu
a66bdf5d11 EDAC, ghes: Add a null pointer check in ghes_edac_unregister()
Add a null check for ghes_pvt, before dereferencing it. The pointer
could still be null in case the return path is taken before initialising
ghes_pvt in the registration function.

Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Sughosh Ganu <sughosh.ganu@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Link: http://lkml.kernel.org/r/1524737809-24475-1-git-send-email-sughosh.ganu@arm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-05-02 13:57:49 +02:00
Borislav Petkov
cc7f3f1326 ghes, EDAC: Fix ghes_edac registration
Tony reported seeing

  "Internal error: Can't find EDAC structure"

when injecting correctable errors due to the fact that ghes_edac would
still load even if the whitelist won't hit. Drop the pr_err() in
ghes_edac_report_mem_error() for now due to the hacky way how ghes_edac
depends on ghes.c.

While at it, make ghes_edac_register() return an error if it doesn't hit
in the whitelist as it is the only sensible thing to do in that
situation.

Furthermore, move the call to it to happen last in ghes_probe() so that
GHES initializing properly does not depend on ghes_edac init at all
as latter is only reporting errors and not required for GHES's proper
functioning.

Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Tested-by: Sughosh Ganu <sughosh.ganu@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20180420182015.zao3olss4tvvlxki@agluck-desk
2018-05-02 13:57:30 +02:00
Linus Torvalds
dd972f924d * Add NVDIMM support to EDAC (Tony Luck)
* misc fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAlrDZ6EACgkQEsHwGGHe
 VUpWWA//RG9WqqdXxPa05TxiRFWMiNSH/HSGJQZsc3K7xm2TEGVdDv5thYsxXCaW
 kA0oiUJA3zd62HeUIWMeGypm+hLM8T73836aIKcfxMduxQ6DR9nk51IV4RmHsLvT
 rd4bUsrjCov6spapqS265LOxZVnpKvxZv0NVTj6pIk0pkeghWMZh63vZtrxVkLsS
 6dHsdIDBr0fMhcFBVQ67SEDpMIyMAqziPOP9sv6NNElG4Q93RnZr6KYvQdE32Iev
 Z6lIUU/sjVJEuhgp6FjJrhvjc+FmGJGERsFqo3Z8RT9P26Cj3wq0ov8IY/KSBazG
 mfJFGbL+O3lppGKjjuAwMGr4R3fZMLQpgLxdYe7VNvOHB9ea7732q9MDGiNkN/u7
 z8o+dtvmPWr0Y1ir8te+7LnSQXIAwxMkKP/lMStzzuFUETQDd0RWr7GAKlbrgPOU
 +DhiQRPBFS7qpslA7BdqV6Ybr3nhDVZciDYWsvNpDMwdem05TxlFwC5l/BzP9NtM
 FmQ3B0rQ5we6gib6UZWUjW3BH9wUgFUpu/WXKJl1unMHwYNe3CvdM8yD5W7mUylh
 6SVYAmkZyuJ1Du40+YMeimroe77XV60IqoUFNEzMSFeB4/GxMG6+u2gUqNuTpHpZ
 aMmCRj9DxWGxZyDsTc49An0X/JbVyIE/Aq6heMTYScgBvCaD1VE=
 =K89j
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "Noteworthy is the NVDIMM support:

   - NVDIMM support to EDAC (Tony Luck)

   - misc fixes"

* tag 'edac_for_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, sb_edac: Remove variable length array usage
  EDAC, skx_edac: Detect non-volatile DIMMs
  firmware, DMI: Add function to look up a handle and return DIMM size
  acpi, nfit: Add function to look up nvdimm device and provide SMBIOS handle
  EDAC: Add new memory type for non-volatile DIMMs
  EDAC: Drop duplicated array of strings for memory type names
  EDAC, layerscape: Allow building for LS1021A
2018-04-05 14:21:13 -07:00
Linus Torvalds
f5a8eb632b arch: remove obsolete architecture ports
This removes the entire architecture code for blackfin, cris, frv, m32r,
 metag, mn10300, score, and tile, including the associated device drivers.
 
 I have been working with the (former) maintainers for each one to ensure
 that my interpretation was right and the code is definitely unused in
 mainline kernels. Many had fond memories of working on the respective
 ports to start with and getting them included in upstream, but also saw
 no point in keeping the port alive without any users.
 
 In the end, it seems that while the eight architectures are extremely
 different, they all suffered the same fate: There was one company
 in charge of an SoC line, a CPU microarchitecture and a software
 ecosystem, which was more costly than licensing newer off-the-shelf
 CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
 that all the SoC product lines are still around, but have not used the
 custom CPU architectures for several years at this point. In contrast,
 CPU instruction sets that remain popular and have actively maintained
 kernel ports tend to all be used across multiple licensees.
 
 The removal came out of a discussion that is now documented at
 https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
 marking any ports as deprecated but remove them all at once after I made
 sure that they are all unused. Some architectures (notably tile, mn10300,
 and blackfin) are still being shipped in products with old kernels,
 but those products will never be updated to newer kernel releases.
 
 After this series, we still have a few architectures without mainline
 gcc support:
 
 - unicore32 and hexagon both have very outdated gcc releases, but the
   maintainers promised to work on providing something newer. At least
   in case of hexagon, this will only be llvm, not gcc.
 
 - openrisc, risc-v and nds32 are still in the process of finishing their
   support or getting it added to mainline gcc in the first place.
   They all have patched gcc-7.3 ports that work to some degree, but
   complete upstream support won't happen before gcc-8.1. Csky posted
   their first kernel patch set last week, their situation will be similar.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
 PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
 Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
 yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
 FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
 sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
 ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
 cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
 pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
 X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
 57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
 3iIVHovno/JuJnTOE8LY
 =fQ8z
 -----END PGP SIGNATURE-----

Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pul removal of obsolete architecture ports from Arnd Bergmann:
 "This removes the entire architecture code for blackfin, cris, frv,
  m32r, metag, mn10300, score, and tile, including the associated device
  drivers.

  I have been working with the (former) maintainers for each one to
  ensure that my interpretation was right and the code is definitely
  unused in mainline kernels. Many had fond memories of working on the
  respective ports to start with and getting them included in upstream,
  but also saw no point in keeping the port alive without any users.

  In the end, it seems that while the eight architectures are extremely
  different, they all suffered the same fate: There was one company in
  charge of an SoC line, a CPU microarchitecture and a software
  ecosystem, which was more costly than licensing newer off-the-shelf
  CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
  seems that all the SoC product lines are still around, but have not
  used the custom CPU architectures for several years at this point. In
  contrast, CPU instruction sets that remain popular and have actively
  maintained kernel ports tend to all be used across multiple licensees.

  [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem"   - Linus ]

  The removal came out of a discussion that is now documented at
  https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
  marking any ports as deprecated but remove them all at once after I
  made sure that they are all unused. Some architectures (notably tile,
  mn10300, and blackfin) are still being shipped in products with old
  kernels, but those products will never be updated to newer kernel
  releases.

  After this series, we still have a few architectures without mainline
  gcc support:

   - unicore32 and hexagon both have very outdated gcc releases, but the
     maintainers promised to work on providing something newer. At least
     in case of hexagon, this will only be llvm, not gcc.

   - openrisc, risc-v and nds32 are still in the process of finishing
     their support or getting it added to mainline gcc in the first
     place. They all have patched gcc-7.3 ports that work to some
     degree, but complete upstream support won't happen before gcc-8.1.
     Csky posted their first kernel patch set last week, their situation
     will be similar

  [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum  - Linus ]"

This really says it all:

 2498 files changed, 95 insertions(+), 467668 deletions(-)

* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
  MAINTAINERS: UNICORE32: Change email account
  staging: iio: remove iio-trig-bfin-timer driver
  tty: hvc: remove tile driver
  tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
  serial: remove tile uart driver
  serial: remove m32r_sio driver
  serial: remove blackfin drivers
  serial: remove cris/etrax uart drivers
  usb: Remove Blackfin references in USB support
  usb: isp1362: remove blackfin arch glue
  usb: musb: remove blackfin port
  usb: host: remove tilegx platform glue
  pwm: remove pwm-bfin driver
  i2c: remove bfin-twi driver
  spi: remove blackfin related host drivers
  watchdog: remove bfin_wdt driver
  can: remove bfin_can driver
  mmc: remove bfin_sdh driver
  input: misc: remove blackfin rotary driver
  input: keyboard: remove bf54x driver
  ...
2018-04-02 20:20:12 -07:00
Arnd Bergmann
0833f7634f edac: remove tile driver
The Tile architecture is obsolete and getting removed from the kernel,
this driver appears to only be used there, and not on the ARM based
successors (Tile-Mx, BlueField), so we should remove it as well.

Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26 15:56:17 +02:00
Gustavo A. R. Silva
6fd0526652 EDAC, sb_edac: Remove variable length array usage
In preparation for enabling -Wvla, remove VLA and replace it with a
fixed-length array instead.

Also, remove max_interleave as it is no longer needed.

Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180314182131.GA25259@embeddedgus
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-03-17 05:24:55 +01:00
Tony Luck
58ca9ac146 EDAC, skx_edac: Detect non-volatile DIMMs
This just covers the topology function of the EDAC driver. We locate
which DIMM slots are populated with NVDIMMs and query the NFIT and
SMBIOS tables to get the size.

Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/20180312182430.10335-6-tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-03-15 00:33:55 +01:00
Tony Luck
001f86137d EDAC: Add new memory type for non-volatile DIMMs
There are now non-volatile versions of DIMMs. Add a new entry to "enum
mem_type" and a new string in edac_mem_types[].

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/20180312182430.10335-3-tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-03-14 12:32:06 +01:00
Tony Luck
d6dd77ebcd EDAC: Drop duplicated array of strings for memory type names
Somehow we ended up with two separate arrays of strings to describe the
"enum mem_type" values.

In edac_mc.c we have an exported list edac_mem_types[] that is used
by a couple of drivers in debug messaged.

In edac_mc_sysfs.c we have a private list that is used to display
values in:
  /sys/devices/system/edac/mc/mc*/dimm*/dimm_mem_type
  /sys/devices/system/edac/mc/mc*/csrow*/mem_type

This list was missing a value for MEM_LRDDR3.

The string values in the two lists were different :-(

Combining the lists, I kept the values so that the sysfs output
will be unchanged as some scripts may depend on that.

Reported-by: Borislav Petkov <bp@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/20180312182430.10335-2-tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-03-14 12:20:16 +01:00
Thomas Gleixner
422caa5f7a Merge branch 'ras/urgent' into ras/core
Pick up urgent fixes to apply further development changes.
2018-03-08 15:52:08 +01:00
Rasmus Villemoes
28dd6726e7 EDAC, layerscape: Allow building for LS1021A
The LS1021A has a memory controller supported by this driver. It builds
just fine, and I've done some rudimentary testing using the error
injection facility, which suggests that it is indeed working.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Acked-by: York Sun <york.sun@nxp.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180220150912.2954-1-rasmus.villemoes@prevas.dk
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-02-27 14:57:47 +01:00
Anna Karbownik
bf8486709a EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL
Commit

  3286d3eb90 ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")

decreased NUM_CHANNELS from 8 to 4, but this is not enough for Knights
Landing which supports up to 6 channels.

This caused out-of-bounds writes to pvt->mirror_mode and pvt->tolm
variables which don't pay critical role on KNL code path, so the memory
corruption wasn't causing any visible driver failures.

The easiest way of fixing it is to change NUM_CHANNELS to 6. Do that.

An alternative solution would be to restructure the KNL part of the
driver to 2MC/3channel representation.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anna Karbownik <anna.karbownik@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: jim.m.snow@intel.com
Cc: krzysztof.paliswiat@intel.com
Cc: lukasz.odzioba@intel.com
Cc: qiuxu.zhuo@intel.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Fixes: 3286d3eb90 ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
Link: http://lkml.kernel.org/r/1519312693-4789-1-git-send-email-anna.karbownik@intel.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-02-23 12:05:37 +01:00
Yazen Ghannam
68627a697c x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.

This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:

  WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
  ...

Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.

Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.

Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.

Don't try to find the block address on reserved banks.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14.x
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 17:00:54 +01:00
Linus Torvalds
d4667ca142 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
 "Here's the latest set of Spectre and PTI related fixes and updates:

  Spectre:
   - Add entry code register clearing to reduce the Spectre attack
     surface
   - Update the Spectre microcode blacklist
   - Inline the KVM Spectre helpers to get close to v4.14 performance
     again.
   - Fix indirect_branch_prediction_barrier()
   - Fix/improve Spectre related kernel messages
   - Fix array_index_nospec_mask() asm constraint
   - KVM: fix two MSR handling bugs

  PTI:
   - Fix a paranoid entry PTI CR3 handling bug
   - Fix comments

  objtool:
   - Fix paranoid_entry() frame pointer warning
   - Annotate WARN()-related UD2 as reachable
   - Various fixes
   - Add Add Peter Zijlstra as objtool co-maintainer

  Misc:
   - Various x86 entry code self-test fixes
   - Improve/simplify entry code stack frame generation and handling
     after recent heavy-handed PTI and Spectre changes. (There's two
     more WIP improvements expected here.)
   - Type fix for cache entries

  There's also some low risk non-fix changes I've included in this
  branch to reduce backporting conflicts:

   - rename a confusing x86_cpu field name
   - de-obfuscate the naming of single-TLB flushing primitives"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  x86/entry/64: Fix CR3 restore in paranoid_exit()
  x86/cpu: Change type of x86_cache_size variable to unsigned int
  x86/spectre: Fix an error message
  x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
  selftests/x86/mpx: Fix incorrect bounds with old _sigfault
  x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
  x86/speculation: Add <asm/msr-index.h> dependency
  nospec: Move array_index_nospec() parameter checking into separate macro
  x86/speculation: Fix up array_index_nospec_mask() asm constraint
  x86/debug: Use UD2 for WARN()
  x86/debug, objtool: Annotate WARN()-related UD2 as reachable
  objtool: Fix segfault in ignore_unreachable_insn()
  selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
  selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
  selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
  selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
  selftests/x86/pkeys: Remove unused functions
  selftests/x86: Clean up and document sscanf() usage
  selftests/x86: Fix vDSO selftest segfault for vsyscall=none
  x86/entry/64: Remove the unused 'icebp' macro
  ...
2018-02-14 17:02:15 -08:00
Jia Zhang
b399151cb4 x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
x86_mask is a confusing name which is hard to associate with the
processor's stepping.

Additionally, correct an indent issue in lib/cpu.c.

Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
[ Updated it to more recent kernels. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:52 +01:00
Christophe JAILLET
68fa24f912 EDAC, mv64x60: Fix an error handling path
We should not call edac_mc_del_mc() if a corresponding call to
edac_mc_add_mc() has not been performed yet.

So here, we should go to err instead of err2 to branch at the right
place of the error handling path.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180107205400.14068-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
2018-01-09 20:14:23 +01:00
Tero Kristo
86a18ee21e EDAC, ti: Add support for TI keystone and DRA7xx EDAC
TI Keystone and DRA7xx SoCs have support for EDAC on DDR3 memory that can
correct one bit errors and detect two bit errors. Add EDAC driver for this
feature which plugs into the generic kernel EDAC framework.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-omap@vger.kernel.org
Link: http://lkml.kernel.org/r/1510578490-14510-1-git-send-email-t-kristo@ti.com
[ Add SPDX tag and make _emif_get_id() use edac_printk(). ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-11-27 13:51:19 +01:00
James Hogan
544e92581a EDAC, octeon: Fix an uninitialized variable warning
Fix an uninitialized variable warning in the Octeon EDAC driver, as seen
in MIPS cavium_octeon_defconfig builds since v4.14 with Codescape GNU
Tools 2016.05-03:

  drivers/edac/octeon_edac-lmc.c In function ‘octeon_lmc_edac_poll_o2’:
  drivers/edac/octeon_edac-lmc.c:87:24: warning: ‘((long unsigned int*)&int_reg)[1]’ may \
    be used uninitialized in this function [-Wmaybe-uninitialized]
    if (int_reg.s.sec_err || int_reg.s.ded_err) {
                        ^
Iinitialise the whole int_reg variable to zero before the conditional
assignments in the error injection case.

Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Fixes: 1bc021e815 ("EDAC: Octeon: Add error injection support")
Link: http://lkml.kernel.org/r/20171113161206.20990-1-james.hogan@mips.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-11-27 11:57:26 +01:00
Linus Torvalds
1be2172e96 Modules updates for v4.15
Summary of modules changes for the 4.15 merge window:
 
 - Treewide module_param_call() cleanup, fix up set/get function
   prototype mismatches, from Kees Cook
 
 - Minor code cleanups
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJaDCyzAAoJEMBFfjjOO8FyaYQP/AwHBy6XmwwVlWDP4BqIF6hL
 Vhy3ccVLYEORvePv68tWSRPUz5n6+1Ebqanmwtkw6i8l+KwxY2SfkZql09cARc33
 2iBE4bHF98iWQmnJbF6me80fedY9n5bZJNMQKEF9VozJWwTMOTQFTCfmyJRDBmk9
 iidQj6M3idbSUOYIJjvc40VGx5NyQWSr+FFfqsz1rU5iLGRGEvA3I2/CDT0oTuV6
 D4MmFxzE2Tv/vIMa2GzKJ1LGScuUfSjf93Lq9Kk0cG36qWao8l930CaXyVdE9WJv
 bkUzpf3QYv/rDX6QbAGA0cada13zd+dfBr8YhchclEAfJ+GDLjMEDu04NEmI6KUT
 5lP0Xw0xYNZQI7bkdxDMhsj5jaz/HJpXCjPCtZBnSEKiL4OPXVMe+pBHoCJ2/yFN
 6M716XpWYgUviUOdiE+chczB5p3z4FA6u2ykaM4Tlk0btZuHGxjcSWwvcIdlPmjm
 kY4AfDV6K0bfEBVguWPJicvrkx44atqT5nWbbPhDwTSavtsuRJLb3GCsHedx7K8h
 ZO47lCQFAWCtrycK1HYw+oupNC3hYWQ0SR42XRdGhL1bq26C+1sei1QhfqSgA9PQ
 7CwWH4UTOL9fhtrzSqZngYOh9sjQNFNefqQHcecNzcEjK2vjrgQZvRNWZKHSwaFs
 fbGX8juZWP4ypbK+irTB
 =c8vb
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
 "Summary of modules changes for the 4.15 merge window:

   - treewide module_param_call() cleanup, fix up set/get function
     prototype mismatches, from Kees Cook

   - minor code cleanups"

* tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Do not paper over type mismatches in module_param_call()
  treewide: Fix function prototypes for module_param_call()
  module: Prepare to convert all module_param_call() prototypes
  kernel/module: Delete an error message for a failed memory allocation in add_module_usage()
2017-11-15 13:46:33 -08:00
Linus Torvalds
8e9a2dba86 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Another attempt at enabling cross-release lockdep dependency
     tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
     with better performance and fewer false positives. (Byungchul Park)

   - Introduce lockdep_assert_irqs_enabled()/disabled() and convert
     open-coded equivalents to lockdep variants. (Frederic Weisbecker)

   - Add down_read_killable() and use it in the VFS's iterate_dir()
     method. (Kirill Tkhai)

   - Convert remaining uses of ACCESS_ONCE() to
     READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
     driven. (Mark Rutland, Paul E. McKenney)

   - Get rid of lockless_dereference(), by strengthening Alpha atomics,
     strengthening READ_ONCE() with smp_read_barrier_depends() and thus
     being able to convert users of lockless_dereference() to
     READ_ONCE(). (Will Deacon)

   - Various micro-optimizations:

        - better PV qspinlocks (Waiman Long),
        - better x86 barriers (Michael S. Tsirkin)
        - better x86 refcounts (Kees Cook)

   - ... plus other fixes and enhancements. (Borislav Petkov, Juergen
     Gross, Miguel Bernal Marin)"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
  rcu: Use lockdep to assert IRQs are disabled/enabled
  netpoll: Use lockdep to assert IRQs are disabled/enabled
  timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
  sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
  irq_work: Use lockdep to assert IRQs are disabled/enabled
  irq/timings: Use lockdep to assert IRQs are disabled/enabled
  perf/core: Use lockdep to assert IRQs are disabled/enabled
  x86: Use lockdep to assert IRQs are disabled/enabled
  smp/core: Use lockdep to assert IRQs are disabled/enabled
  timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
  timers/nohz: Use lockdep to assert IRQs are disabled/enabled
  workqueue: Use lockdep to assert IRQs are disabled/enabled
  irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
  locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
  locking/pvqspinlock: Implement hybrid PV queued/unfair locks
  locking/rwlocks: Fix comments
  x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
  block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
  workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
  ...
2017-11-13 12:38:26 -08:00
Linus Torvalds
1ec1699122 The usual pile of bugfixes, cleanups and minor driver enhancements.
Worth noting are the changes to ghes_edac to use a whitelist of
 known-good platforms on which GHES error reporting works relatively
 reliably. By Toshi Kani and Borislav Petkov.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAloIt2sACgkQEsHwGGHe
 VUpzmw/+PETRJYIwp9TnmLHsNjVA/TPFHmjQqYc+jR9sWzTdrsZqI1JbP6GNC4l6
 0bMdcyCO5J57F8xLKkgYC5sv24m1vL8MKx0+sQSI+EXupCyqDhhclU3c2rfSj0Um
 +qpPV4MnewqJNZlenKKHC+R2wADSsijIpIiMeV682/8HCf0EB68P5m6NRMSqL+hz
 IVOD2xAyn+3dDw4ZazJgra5xSVt+xJdt+x0/bvovPapzvF81b+PCB4XXVm4J5d4f
 gi0AwDbmyLpMx2HK5vLT5JgcbGg6aUS7OhwcZLSwGuWJTTsYM5uEk+F+dj3pxNge
 hk0JSeCLm0S4f60p496VuiweIOrkk4feDbz+RAK9GJaR270K621Zb/06jCybB6aB
 htR/VQS50BgCyehYzSWw9VAOcsEvNPHKpYSGkrDwPRkFZOxfigdnKb4MN8XbGund
 bpQqx6VlgPhV32v0F1F893IIPHrcuc7qFo4KJPSGnP58+iBOeJ6iy2vlUdtMIfyL
 opjS0nqg7B6pXCcANLFRWV0iPTH4e1PWAMGSSxp64cqjuEmHPtlvQGinevBq56Ob
 HpxbtnOqen14xxMulkPPHBw3yKWQA0/teAGcSn2D5P/Qv0X3aTGsifr3Ruk/Cl0E
 wQaGBNcn+5xJ8nhBS7GdUDdyfE5o9zW8wjw3eZEphREoOM3N8ko=
 =XEah
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:
 "The usual pile of bugfixes, cleanups and minor driver enhancements.

  Worth noting are the changes to ghes_edac to use a whitelist of
  known-good platforms on which GHES error reporting works relatively
  reliably. By Toshi Kani and Borislav Petkov"

* tag 'edac_for_4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, sb_edac: Fix missing break in switch
  MAINTAINERS: Split Cavium EDAC entry and add myself
  EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode
  EDAC, skx_edac: Handle systems with segmented PCI busses
  EDAC, thunderx: Remove suspend/resume support
  EDAC, skx_edac: Fix detection of single-rank DIMMs
  EDAC, sb_edac: Don't create a second memory controller if HA1 is not present
  EDAC: Add owner check to the x86 platform drivers
  EDAC: Add helper which returns the loaded platform driver
  EDAC, ghes: Add platform check
  EDAC, ghes: Model a single, logical memory controller
  EDAC, ghes: Remove symbol exports
  EDAC: Handle return value of kasprintf()
2017-11-13 08:54:06 -08:00
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Kees Cook
e4dca7b7aa treewide: Fix function prototypes for module_param_call()
Several function prototypes for the set/get functions defined by
module_param_call() have a slightly wrong argument types. This fixes
those in an effort to clean up the calls when running under type-enforced
compiler instrumentation for CFI. This is the result of running the
following semantic patch:

@match_module_param_call_function@
declarer name module_param_call;
identifier _name, _set_func, _get_func;
expression _arg, _mode;
@@

 module_param_call(_name, _set_func, _get_func, _arg, _mode);

@fix_set_prototype
 depends on match_module_param_call_function@
identifier match_module_param_call_function._set_func;
identifier _val, _param;
type _val_type, _param_type;
@@

 int _set_func(
-_val_type _val
+const char * _val
 ,
-_param_type _param
+const struct kernel_param * _param
 ) { ... }

@fix_get_prototype
 depends on match_module_param_call_function@
identifier match_module_param_call_function._get_func;
identifier _val, _param;
type _val_type, _param_type;
@@

 int _get_func(
-_val_type _val
+char * _val
 ,
-_param_type _param
+const struct kernel_param * _param
 ) { ... }

Two additional by-hand changes are included for places where the above
Coccinelle script didn't notice them:

	drivers/platform/x86/thinkpad_acpi.c
	fs/lockd/svc.c

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-10-31 15:30:37 +01:00
Mark Rutland
332efa6374 locking/atomics, EDAC/altera: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't currently harmful.

However, for some features it is necessary to instrument reads and
writes separately, which is not possible with ACCESS_ONCE(). This
distinction is critical to correct operation.

It's possible to transform the bulk of kernel code using the Coccinelle
script below. However, this doesn't handle comments, leaving references
to ACCESS_ONCE() instances which have been removed. As a preparatory
step, this patch converts the Altera EDAC code and comments to use
{READ,WRITE}_ONCE() consistently.

----
virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-2-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:00:56 +02:00
Gustavo A. R. Silva
a8e9b186f1 EDAC, sb_edac: Fix missing break in switch
Add missing break statement in order to prevent the code from falling
through.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-10-19 10:53:42 +02:00
Luis Felipe Sandoval Castro
24281a2f4c EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode
When figuring out the size of the DIMMs and the cluster mode is SNC2 or SNC4 the
current algorithm ignores the contribution of some of the channels resulting in
EDAC never knowing of the existence of some DIMMs attached to such channels (thus
sysfs is not populated).

Instead of selectively iterating from 0 to interlv_ways when looking for all the
participants in the interleave, do an exhaustive search and iterate from 0 to
KNL_MAX_CHANNELS. The algorithm is already smart enough to consider participants
only one time.

This works fine in all KNL cluster modes and even when there are missing DIMMs
as the contribution of those channels is 0.

Signed-off-by: Luis Felipe Sandoval Castro <luis.felipe.sandoval.castro@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: arozansk@redhat.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: qiuxu.zhuo@intel.com
Link: http://lkml.kernel.org/r/1506606882-90521-1-git-send-email-luis.felipe.sandoval.castro@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-10-11 15:57:25 +02:00
Tony Luck
88ae80aa60 EDAC, skx_edac: Handle systems with segmented PCI busses
Large systems separate their PCI busses into segments since
the limit of only 256 PCI busses can be too restrictive.

Extend this driver to check whether <segment, bus-number> matches
when deciding how to group memory controller PCI devices to
CPU sockets.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Charles Rose <charles.rose@dell.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/f58abfd10bf73c8bc5adc1fe4de7408128b00625.1506358467.git.tony.luck@intel.com
[ Make skx_dev.seg an int. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-28 18:01:55 +02:00
Jan Glauber
f821fe8cc7 EDAC, thunderx: Remove suspend/resume support
The memory controller on ThunderX/OcteonTX systems does not support
power management. Therefore remove the suspend/resume callbacks.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jan Glauber <jglauber@cavium.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhangshaokun <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20170925123502.17289-2-jglauber@cavium.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-27 17:35:58 +02:00
Charles Rose
a9c0a10888 EDAC, skx_edac: Fix detection of single-rank DIMMs
Single-rank DIMMs did not get detected on Skylake/Kabylake systems due
to wrong limit check. The single rank DIMM check is a simple typo. "0"
is a legal value in this field meaning single rank.

Signed-off-by: Charles Rose <charles.rose@dell.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/66df72d327c265fbf92fe25df96daa228a35f076.1506358467.git.tony.luck@intel.com
[ Also fix debug message to print number of ranks. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Expand commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-27 17:31:38 +02:00
Qiuxu Zhuo
15cc3ae001 EDAC, sb_edac: Don't create a second memory controller if HA1 is not present
Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
server (DELL PowerEdge 730xd):

  EDAC sbridge: Some needed devices are missing
  EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
  EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Failed to register device with error -19.

The refactored sb_edac driver creates the IMC1 (the 2nd memory
controller) if any IMC1 device is present. In this case only
HA1_TA of IMC1 was present, but the driver expected to find
HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.

The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
Zhang inserted one DIMM per channel for each CPU, and did random error
address injection test with this patch:

      4024  addresses fell in TOLM hole area
     12715  addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
     12774  addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
     12798  addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
     12913  addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
     12674  addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
     12686  addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
     12882  addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
     12934  addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
    106400  addresses were injected totally.

The test result shows that all the 4 channels belong to IMC0 per CPU, so
the server really only has one IMC per CPU.

In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3'
implements either one or two IMCs. For CPUs with one IMC, IMC1 is not
used and should be ignored.

Thus, do not create a second memory controller if the key HA1 is absent.

[1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Reported-and-tested-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: e2f747b1f4 ("EDAC, sb_edac: Assign EDAC memory controller per h/w controller")
Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-27 12:15:43 +02:00
Toshi Kani
301375e764 EDAC: Add owner check to the x86 platform drivers
Change x86 EDAC platform drivers to verify the module owner at the
beginning of their module init functions. This allows them to fail their
init immediately when ghes_edac is enabled. Similar change can be made
to other edac drivers if necessary.

Also, remove ".c" from module names of pnp2_edac, sb_edac, and skx_edac.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170823225447.15608-6-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25 13:09:39 +02:00
Toshi Kani
3877c7d1e2 EDAC: Add helper which returns the loaded platform driver
Only a single EDAC platform driver can be loaded. When ghes_edac is
enabled, an EDAC platform driver still attempts to register itself and
fails in edac_mc_add_mc().

Add edac_get_owner() so that EDAC platform drivers can check the owner
first.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170823225447.15608-5-toshi.kani@hpe.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25 12:55:59 +02:00
Toshi Kani
5deed6b6a4 EDAC, ghes: Add platform check
The ghes_edac driver was introduced in 2013 [1], but it has not been
enabled by any distro yet. This driver obtains error info from firmware
interfaces (APEI), which are not properly implemented on many platforms,
as the driver says on load:

  This EDAC driver relies on BIOS to enumerate memory and get error
  reports. Unfortunately, not all BIOSes reflect the memory layout
  correctly. So, the end result of using this driver varies from vendor
  to vendor. If you find incorrect reports, please contact your hardware
  vendor to correct its BIOS.

To get out from this situation, add a platform check to selectively
enable the driver on platforms that are known to have proper APEI
firmware implementation.

"ghes_edac.force_load=1" skips this platform check.

[1]: https://lkml.kernel.org/r/cover.1360931635.git.mchehab@redhat.com

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170823225447.15608-4-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25 12:55:12 +02:00
Borislav Petkov
0fe5f281f7 EDAC, ghes: Model a single, logical memory controller
We're enumerating the DIMMs through a DMI walk and since we can't get
any more detailed topological information about which DIMMs belong to
which memory controller, convert it to a single, logical controller
which contains all the DIMMs.

The error reporting path from GHES ghes_edac_report_mem_error() doesn't
get called in NMI context but add a warning about it to catch any
changes in the future as if so, our locking scheme will be insufficient
then.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25 12:35:28 +02:00
Borislav Petkov
c9c8b4d6d0 EDAC, ghes: Remove symbol exports
They're called from builtin code so no need for the exports.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25 12:35:27 +02:00
Arvind Yadav
75f029c3a8 EDAC: Handle return value of kasprintf()
kasprintf() can fail and we must check its return value.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: linux-edac@vger.kernel.org
[ Merged into a single patch, small formatting fixups. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-21 12:18:44 +02:00
Borislav Petkov
398443471f EDAC, mce_amd: Get rid of local var in amd_filter_mce()
... and use the macro for that.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21 17:59:38 +02:00
Borislav Petkov
f3c0891c2f EDAC, mce_amd: Get rid of most struct cpuinfo_x86 uses
struct mce.cpuid contains CPUID(1).EAX which contains family, model and
stepping and thus has enough information for our purposes. Thus get rid
of some external dependencies which are not really needed.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21 17:54:57 +02:00
Borislav Petkov
4ab1784b48 EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error()
Singular fits better because it decodes a single error.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21 17:44:09 +02:00
Bhumika Goyal
b2b3e7362e EDAC: Make device_type const
Make these const as they are only stored in the type field of a device
structure, which is const.

Done using Coccinelle.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1503130946-2854-2-git-send-email-bhumirks@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-20 13:12:32 +02:00
Qiuxu Zhuo
bc8f10babc EDAC, pnd2: Properly toggle hidden state for P2SB PCI device
Properly handle hidden state of P2SB PCI device (DEV:D, FUN:0) for
Apollo Lake.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170814154905.21707-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19 10:51:11 +02:00
Qiuxu Zhuo
5fd77cb3ba EDAC, pnd2: Conditionally unhide/hide the P2SB PCI device to read BAR
On Deverton server, the P2SB PCI device (DEV:1F, FUN:1) is used by multiple
device drivers.

If it's hidden by some device driver (e.g. with the i801 I2C driver,
the commit

  9424693035 ("i2c: i801: Create iTCO device on newer Intel PCHs")

unconditionally hid the P2SB PCI device wrongly) it will make the
pnd2_edac driver read out an invalid BAR value of 0xffffffff and then
fail on ioremap().

Therefore, store the presence state of P2SB PCI device before unhiding
it for reading BAR and restore the presence state after reading BAR.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-i2c@vger.kernel.org
Link: http://lkml.kernel.org/r/20170814154845.21663-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19 10:47:24 +02:00
Qiuxu Zhuo
d84676a9e1 EDAC, pnd2: Mask off the lower four bits of a BAR
Bit[0] of BAR is always zero. Bit[2:1] and bit[3] of BAR contain the
information of 'type' and the 'prefetchable' accordingly. Therefore,
mask the lower four bits to retrieve the actual base address of a BAR.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170814154813.21619-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19 10:33:30 +02:00
Christophe JAILLET
3eaef0fa39 EDAC, thunderx: Fix error handling path in thunderx_lmc_probe()
Return the proper error value if ioremap() fails (and not 0).

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20170816045821.14165-1-christophe.jaillet@wanadoo.fr
[ Massage commit message, remove newline. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-18 19:12:08 +02:00
Christophe JAILLET
8b073d945c EDAC, altera: Fix error handling path in altr_edac_device_probe()
Return the proper error value if devm_ioremap() fails (and not 0).

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170816050506.14541-1-christophe.jaillet@wanadoo.fr
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-18 18:21:27 +02:00
Tony Luck
3e5d2bd191 EDAC, pnd2: Build in a minimal sideband driver for Apollo Lake
I've been waing a long time for the generic sideband driver to
appear. Patience has run out, so include the minimum here to
just read registers.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Patrick Geary <patrickg@supermicro.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170803210536.5662-1-tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-04 05:58:23 +02:00
Qiuxu Zhuo
039d7af651 EDAC, sb_edac: Classify memory mirroring modes
Basically, there are full memory mirroring and address range partial
memory mirroring (supported by Haswell EX and Broadwell EX) modes.

a) In full memory mirroring, the memory behind each memory controller
   is mirrored, i.e. the memory is split into two identical mirrors
   (primary and secondary), half of the memory is reserved for redundancy.

b) In address range partial memory mirroring, the memory size (range)
   of primary and secondary behind each memory controller can be user
   defined by the TAD0 register. The rest of memory ranges defined by
   TAD1/TAD2/... in that memory controller are non-mirrored.

For more detail on memory mirroring, see the following link written by Tony Luck:

  https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux

Currently the sb_edac driver only supports address decoding in full
memory mirroring and non-mirroring modes. In address range partial
memory mirroring mode, it may fail to decode an address that falls in a
non-mirroring area (the following was one of this kind of failed logs).

  mce: Uncorrected hardware memory error in user-access at 566d53a400
  Memory failure: 0x566d53a: Killing einj_mem_uc:4647 due to hardware memory corruption
  Memory failure: 0x566d53a: recovery action for dirty LRU page: Recovered
  mce: [Hardware Error]: Machine check events logged
  EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
  EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090
  EDAC sbridge MC1: TSC 4b914aa5a99dab
  EDAC sbridge MC1: ADDR 566d53a400
  EDAC sbridge MC1: MISC 1443a0c86
  EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80
  EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32)
  mce: [Hardware Error]: Machine check events logged

Therefore, classify memory mirroring modes and make the address decoding
in address range partial memory mode correct.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-02 05:40:11 +02:00
Rob Herring
2efdda4a41 EDAC, cpc925, ppc4xx: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing of
the full path string for each node.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: devicetree@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170718214339.7774-19-robh@kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-19 07:42:41 +02:00
Borislav Petkov
c54182ec0e EDAC: Get rid of mci->mod_ver
It is a write-only variable so get rid of it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Robert Richter <rric@kernel.org>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Loc Ho <lho@apm.com>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
2017-07-17 13:42:48 +02:00
Arvind Yadav
1c18be5a4e EDAC: Constify attribute_group structures
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with
const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
CC: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/776cb8265509054abd01b0b551624cc0da3b88e7.1499078335.git.arvind.yadav.cs@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-17 10:20:25 +02:00
Yazen Ghannam
fbe63acf62 EDAC, mce_amd: Use cpu_to_node() to find the node ID
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine
while the L3 to node mapping was 1:1. And Zen topology broke this. So
let's start slowly moving away from it and use the topology interfaces
instead.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-17 07:01:08 +02:00
Tony Luck
164c29244d EDAC, pnd2: Fix Apollo Lake DIMM detection
Non-existent or empty DIMM slots result in error return from
RD_REGP(). But we shouldn't give up on failure.

So long as we find at least one DIMM we can continue.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170628234407.21521-1-tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-29 10:37:50 +02:00
Jérémy Lefaure
a8c8261425 EDAC, i5000, i5400: Fix definition of NRECMEMB register
In the i5000 and i5400 drivers, the NRECMEMB register is defined as a
16-bit value, which results in wrong shifts in the code, as reported by
sparse.

In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21),
this register is a 32-bit register. A u32 value for the register fixes
the wrong shifts warnings and matches the datasheet.

Also fix the mask to access to the CAS bits [27:16] in the i5000 driver.

[1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf
[2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-29 10:33:13 +02:00
Colin Ian King
77641dacea EDAC, pnd2: Make function sbi_send() static
The function sbi_send() is local to just pnd2_edac.c and does not need
to be in global scope, so make it static.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170623084855.9197-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-26 16:13:25 +02:00
Gustavo A. R. Silva
ee514c7a23 EDAC, pnd2: Return proper error value from apl_rd_reg()
Add code comment to make it clear that the fall-through is intentional
and, OR ret with its previous value to avoid overwriting it so that
callers can check the correct return value.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170622220535.GA4896@embeddedgus
[ Massage a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-23 09:48:50 +02:00
Chris Packham
ff0abed492 EDAC, altera: Simplify calculation of total memory
Use of_address_to_resource() and resource_size() instead of manually
parsing the "reg" property from the "memory" node(s).

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170606235500.22772-3-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-14 13:49:25 +02:00
Qiuxu Zhuo
133e4455c9 EDAC, sb_edac: Avoid creating SOCK memory controller
Xiaolong Ye reported the following failure on Broadwell D server:

  EDAC sbridge: Some needed devices are missing
  EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Failed to register device with error -19.

Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per
socket) use the same PCI device IDs for IMC0 per socket, then they
share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case,
Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller
and reports above error messages, since it has no IMC1 per socket.

Avoid creating the nonexistent SOCK memory controller.

Reported-and-tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com
[ Massage. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-14 11:53:39 +02:00
Yazen Ghannam
bdf1bf1744 EDAC, mce_amd: Fix typo in SMCA error description
Fix typo in "poison consumption" error description.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-12 19:03:55 +02:00
Chris Packham
3b405e30cb EDAC, mv64x60: Sanity check edac_op_state before registering
edac_op_state is a module parameter which affects the behaviour of
the driver probe which can potentially be invoked as soon as the
platform driver registration happens. Because of this we need to
ensure that we sanity check the module parameter before calling
platform_register_drivers().

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170607215530.8604-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-09 11:55:55 +02:00
Vadim Lomovtsev
cf97825862 EDAC, thunderx: Fix a warning during l2c debugfs node creation
Compare the number of debugfs entries created by
thunderx_create_debugfs_nodes() with the requested number of entries to
properly determine whether to print a warning.

Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20170531155157.93583-1-stemerkhanov@cavium.com
Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-01 20:08:17 +02:00
Chris Packham
7d2fdaa694 EDAC, mv64x60: Check driver registration success
Check the return status of platform_driver_register() in
mv64x60_edac_init(). Only output messages and initialise the
edac_op_state if the registration is successful.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170529212142.25572-2-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-30 09:55:51 +02:00
Jason Baron
7103de0e58 EDAC, ie31200: Add Intel Kaby Lake CPU support
Kaby Lake seems to work just like Skylake.

Reported-and-tested-by: Doug Thompson <bc.tdw@recursor.net>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1495823683-32569-1-git-send-email-jbaron@akamai.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-28 19:29:40 +02:00
Chris Packham
8b9afe5946 EDAC, mv64x60: Replace in_le32()/out_le32() with readl()/writel()
To allow this driver to be used on non-powerpc platforms it needs to use
io accessors suitable for all platforms.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170518083135.28048-4-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-26 22:55:50 +02:00
Chris Packham
0b3df44eeb EDAC, mv64x60: Fix pdata->name
Change this from mpc85xx_pci_err to mv64x60_pci_err. The former is
likely a hangover from when this driver was created.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170518083135.28048-3-chris.packham@alliedtelesis.co.nz
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-26 22:54:04 +02:00
Qiuxu Zhuo
d14e3a201f EDAC, sb_edac: Bump driver version and do some cleanups
Collapse 'case:' in *_mci_bind_devs() and update driver version from
1.1.1 to 1.1.2.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170523000934.87971-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 15:00:36 +02:00
Qiuxu Zhuo
4d475dde79 EDAC, sb_edac: Check if ECC enabled when at least one DIMM is present
This is based on previous work by Patrick Geary, see Link.

Additional cleanups ontop:

 - Remove the code to read MCMTR from pci_ha1_ta and CHN_TO_HA macro,
 now that TA0 and TA1 are unified.

 - Remove get_pdev_same_bus(), since in get_dimm_config() the
 variable "pvt->pci_ta" for KNL is also ready, we can simply use
 pci_read_config_dword(pvt->pci_ta, KNL_MCMTR, &pvt->info.mcmtr) to read
 MCMTR.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/57884350.1030401@supermicro.com
Link: http://lkml.kernel.org/r/20170523000910.87925-1-qiuxu.zhuo@intel.com
[ Make __populate_dimms() return int. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 14:57:52 +02:00
Qiuxu Zhuo
3286d3eb90 EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4
We don't need this quirk anymore now that the EDAC memory controller
representation matches the hardware.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170523000834.87881-1-qiuxu.zhuo@intel.com
[ Commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 14:40:40 +02:00
Borislav Petkov
6696522957 EDAC, sb_edac: Carve out dimm-populating loop
... to slim down get_dimm_config().

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 14:37:34 +02:00
Borislav Petkov
199389acd9 EDAC, sb_edac: Fix mod_name
It is called "sb_edac.c" now.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 14:37:33 +02:00
Qiuxu Zhuo
e2f747b1f4 EDAC, sb_edac: Assign EDAC memory controller per h/w controller
Tony pointed out: "currently the driver pretends there is one big
8-channel memory controller per socket instead of 2 4-channel
controllers. This is fine with all memory controller populated with
symmetrical DIMM configurations, but runs into difficulties on
asymmetrical setups".

Restructure the driver to assign an EDAC memory controller to each real
h/w memory controller to resolve the issue.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170523000731.87793-1-qiuxu.zhuo@intel.com
[ Break some lines at convenient points. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 14:37:21 +02:00
Tony Luck
7fd562b75d EDAC, sb_edac: Don't use "Socket#" in the memory controller name
EDAC assigns logical memory controller numbers in the order that we find
memory controllers, which depends on which PCI bus they are on. Some
systems end up with MC0 on socket0, others (e.g Haswell) have MC0 on
socket3.

All this is made more confusing for users because we use the string
"Socket" while generating names for memory controllers, but the number
that we attach there is the memory controller number. E.g.

  EDAC MC0: Giving out device to module sbridge_edac.c controller
    Haswell Socket#0: DEV 0000:ff:12.0 (INTERRUPT)

Change the names to say "SrcID#%d" (where the number we use is read from
the h/w associated with the memory controller instead of some logical
number internal to the EDAC driver). New message:

  EDAC MC0: Giving out device to module sbridge_edac.c controller
    Haswell SrcID#3: DEV 0000:ff:12.0 (INTERRUPT)

Reported-by: Andrey Korolyov <andrey@xdel.ru>
Reported-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170523000603.87748-1-qiuxu.zhuo@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 11:47:11 +02:00
Qiuxu Zhuo
00cf50d90a EDAC, sb_edac: Classify PCI-IDs by topology
Each of the PCI device IDs belongs to a CPU socket, or to one of the
integrated memory controllers. Provide an enum to specify the domain of
each, and distinguish the resource number in each domain: the number
of the PCI device IDs per integrated memory controller/socket, and the
number of integrated memory controllers per socket.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170523000533.87704-1-qiuxu.zhuo@intel.com
[ Realign pci_dev_descr_knl members. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25 11:19:25 +02:00
Tobias Klauser
18caec20bf EDAC, altera: Constify irq_domain_ops
struct irq_domain_ops is not modified, so it can be made const.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Cc: Thor Thayer <thor.thayer@linux.intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170524133505.1233-1-tklauser@distanz.ch
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-24 15:46:25 +02:00
Yazen Ghannam
eb77e6b80f EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h
The wrong index into the csbases/csmasks arrays was being passed to
the function to compute the chip select sizes, which resulted in the
wrong size being computed. Address that so that the correct values are
computed and printed.

Also, redo how we calculate the number of pages in a CS row.

Reported-by: Benjamin Bennett <benbennett@gmail.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: <stable@vger.kernel.org> # 4.10.x
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com
[ Remove unneeded integer math comment, minor cleanups. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-03 16:27:36 +02:00
Borislav Petkov
f8d5549df2 EDAC, ghes: Do not enable it by default
Leave it to the user to decide whether to enable this or not. Otherwise,
platform-specific drivers won't initialize (currently, EDAC supports
only a single platform driver loaded).

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-27 14:15:38 +02:00
Borislav Petkov
bffc7dece9 EDAC: Rename report status accessors
Change them to have the edac_ prefix.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:15:02 +02:00
Borislav Petkov
fee27d7d97 EDAC: Delete edac_stub.c
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to
a module parameter.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:48 +02:00
Borislav Petkov
a06b85ff07 EDAC: Update Kconfig help text
Remove the old URLs.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:44 +02:00
Borislav Petkov
e3c4ff6d8c EDAC: Remove EDAC_MM_EDAC
Move all the EDAC core functionality behind CONFIG_EDAC and get rid of
that indirection. Update defconfigs which had it.

While at it, fix dependencies such that EDAC depends on RAS for the
tracepoints.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-edac@vger.kernel.org
2017-04-10 17:14:41 +02:00
Borislav Petkov
be1d162948 EDAC: Issue tracepoint only when it is defined
... and this happens only when CONFIG_RAS is enabled.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:38 +02:00
Borislav Petkov
8c22b4fece EDAC: Move edac_op_state to edac_mc.c
... as part of moving stuff away from edac_stub.c

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:29 +02:00
Borislav Petkov
d3116a0837 EDAC: Remove edac_err_assert
... and the glue around it. It is not needed anymore.

Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10 17:14:21 +02:00