Commit Graph

126312 Commits

Author SHA1 Message Date
Vladimir Murzin
5a7a8426b2 arm64: KVM: Use static keys for selecting the GIC backend
Currently GIC backend is selected via alternative framework and this
is fine. We are going to introduce vgic-v3 to 32-bit world and there
we don't have patching framework in hand, so we can either check
support for GICv3 every time we need to choose which backend to use or
try to optimise it by using static keys. The later looks quite
promising because we can share logic involved in selecting GIC backend
between architectures if both uses static keys.

This patch moves arm64 from alternative to static keys framework for
selecting GIC backend. For that we embed static key into vgic_global
and enable the key during vgic initialisation based on what has
already been exposed by the host GIC driver.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:35 +02:00
Dan Williams
917db484dc x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation
In commit:

  ec776ef6bb ("x86/mm: Add support for the non-standard protected e820 type")

Christoph references the original patch I wrote implementing pmem support.
The intent of the 'max_pfn' changes in that commit were to enable persistent
memory ranges to be covered by the struct page memmap by default.

However, that approach was abandoned when Christoph ported the patches [1], and
that functionality has since been replaced by devm_memremap_pages().

In the meantime, this max_pfn manipulation is confusing kdump [2] that
assumes that everything covered by the max_pfn is "System RAM".  This
results in kdump hanging or crashing.

 [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-March/000348.html
 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1351098

So fix it.

Reported-by: Zhang Yi <yizhan@redhat.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Zhang Yi <yizhan@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@vger.kernel.org> # v4.1 and later kernels
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-nvdimm@lists.01.org
Fixes: ec776ef6bb ("x86/mm: Add support for the non-standard protected e820 type")
Link: http://lkml.kernel.org/r/147448744538.34910.11287693517367139607.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 12:26:48 +02:00
Laura Abbott
ca219452c6 arm64: Correctly bounds check virt_addr_valid
virt_addr_valid is supposed to return true if and only if virt_to_page
returns a valid page structure. The current macro does math on whatever
address is given and passes that to pfn_valid to verify. vmalloc and
module addresses can happen to generate a pfn that 'happens' to be
valid. Fix this by only performing the pfn_valid check on addresses that
have the potential to be valid.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-22 10:17:22 +01:00
Andrew Banman
4f059d514f x86/platform/uv/BAU: Add UV4-specific functions
Add the UV4-specific function definitions and define an operations struct
to implement them in the BAU driver.

Many BAU MMRs, although functionally the same, have new addresses on UV4
due to hardware changes. Each MMR requires new read/write functions, but
their implementation in the driver does not change. Thus, it is enough to
enumerate them in the operations struct for the changes to take effect.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-11-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:15 +02:00
Andrew Banman
6d78059bbc x86/platform/uv/BAU: Fix payload queue setup on UV4 hardware
The BAU on UV4 does not need to maintain the payload queue tail pointer. Do
not initialize the tail pointer MMR on UV4.

Note that write_payload_tail is not an abstracted BAU function since it is
an operation specific to pre-UV4 versions. Then we must switch on the UV
version to control its usage, for which we use uvhub_version rather than
is_uv*_hub because it is quicker/more concise.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-10-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:15 +02:00
Andrew Banman
e879c1124a x86/platform/uv/BAU: Disable software timeout on UV4 hardware
Software timeouts are not currently supported on BAU for UV4. Instead, the
BAU will rely on hardware-level fairness protocols to determine broadcast
timeouts.

Do not call enable_timeouts or calculate_destination_timeout on UV4. These
functions write to pre-UV4 MMRs so they generate error messages on UV4.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-9-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:14 +02:00
Andrew Banman
58d4ab46f2 x86/platform/uv/BAU: Populate ->uvhub_version with UV4 version information
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-8-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:14 +02:00
Andrew Banman
21e3f12fc0 x86/platform/uv/BAU: Use generic function pointers
Convert the use of UV version-specific functions to their abstracted
counterparts.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-7-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:14 +02:00
Andrew Banman
5e4f96fe2a x86/platform/uv/BAU: Add generic function pointers
Many BAU functions have different implementations depending on the UV
version. Rather than switching on the uvhub_version throughout the driver,
we can define a set of operations for each version. This is especially
beneficial for UV4, which will require many new MMR read/write functions.

Currently, the set of abstracted functions are the same for UV1, UV2, and
UV3. The functions were chosen because each one will have a different
implementation for UV4. Other functions will be added as needed to handle
new implementations or to cleanup the existing differences between UV1,
UV2, and UV3, i.e. read_status and wait_completion.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-6-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:13 +02:00
Andrew Banman
60e1c842c7 x86/platform/uv/BAU: Convert uv_physnodeaddr() use to uv_gpa_to_offset()
The BAU driver should use the functions provided by uv_hub.h rather than
its own implementations. uv_physnodeaddr converts vaddrs to paddrs for
BAU MMR fields, but this is done better by uv_gpa_to_offset.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-5-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:13 +02:00
Andrew Banman
d2a57afa53 x86/platform/uv/BAU: Clean up pq_init()
The payload queue first MMR requires the physical memory address and hub
GNODE of where the payload queue resides in memory, but the associated
variables are named as if the PNODE were used. Rename gnode-related
variables and clarify the definitions of the payload queue head, last, and
tail pointers.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-4-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:13 +02:00
Andrew Banman
efa59ab3e7 x86/platform/uv/BAU: Clean up and update printks
Replace all uses of printk with the appropriate pr_*() function.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-3-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:12 +02:00
Andrew Banman
67492c86b3 x86/platform/uv/BAU: Clean up vertical alignment
Fix whitespace on blocks of code to be vertically aligned.

Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-2-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:16:12 +02:00
Ingo Molnar
baad92e344 Merge branch 'linus' into x86/platform, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-22 11:15:38 +02:00
Gu Zheng
dc6db24d24 x86/acpi: Set persistent cpuid <-> nodeid mapping when booting
The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
wq_numa_possible_cpumask will not cause any problem.
It contains 4 steps:
1. Enable apic registeration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mapping.
3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid.
4. Establish all possible cpuid <-> nodeid mapping.

This patch finishes step 4.

This patch set the persistent cpuid <-> nodeid mapping for all enabled/disabled
processors at boot time via an additional acpi namespace walk for processors.

[ tglx: Remove the unneeded exports ]

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-6-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 21:18:39 +02:00
Gu Zheng
8f54969dc8 x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping
The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
wq_numa_possible_cpumask will not cause any problem.
It contains 4 steps:
1. Enable apic registeration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mapping.
3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid.
4. Establish all possible cpuid <-> nodeid mapping.

This patch finishes step 2.

In this patch, we introduce a new static array named cpuid_to_apicid[],
which is large enough to store info for all possible cpus.

And then, we modify the cpuid calculation. In generic_processor_info(),
it simply finds the next unused cpuid. And it is also why the cpuid <-> nodeid
mapping changes with node hotplug.

After this patch, we find the next unused cpuid, map it to an apicid,
and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
mapping will be persistent.

And finally we will use this array to make cpuid <-> nodeid persistent.

cpuid <-> apicid mapping is established at local apic registeration time.
But non-present or disabled cpus are ignored.

In this patch, we establish all possible cpuid <-> apicid mapping when
registering local apic.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 21:18:38 +02:00
Gu Zheng
f7c28833c2 x86/acpi: Enable acpi to register all possible cpus at boot time
cpuid <-> nodeid mapping is firstly established at boot time. And workqueue caches
the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot time.

When doing node online/offline, cpuid <-> nodeid mapping is established/destroyed,
which means, cpuid <-> nodeid mapping will change if node hotplug happens. But
workqueue does not update wq_numa_possible_cpumask.

So here is the problem:

Assume we have the following cpuid <-> nodeid in the beginning:

  Node | CPU

------------------------
node 0 |  0-14, 60-74
node 1 | 15-29, 75-89
node 2 | 30-44, 90-104
node 3 | 45-59, 105-119

and we hot-remove node2 and node3, it becomes:

  Node | CPU
------------------------
node 0 |  0-14, 60-74
node 1 | 15-29, 75-89

and we hot-add node4 and node5, it becomes:

  Node | CPU
------------------------
node 0 |  0-14, 60-74
node 1 | 15-29, 75-89
node 4 | 30-59
node 5 | 90-119

But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and the like.

When a pool workqueue is initialized, if its cpumask belongs to a node, its
pool->node will be mapped to that node. And memory used by this workqueue will
also be allocated on that node.

static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs){
...
        /* if cpumask is contained inside a NUMA node, we belong to that node */
        if (wq_numa_enabled) {
                for_each_node(node) {
                        if (cpumask_subset(pool->attrs->cpumask,
                                           wq_numa_possible_cpumask[node])) {
                                pool->node = node;
                                break;
                        }
                }
        }

Since wq_numa_possible_cpumask is not updated, it could be mapped to an offline node,
which will lead to memory allocation failure:

 SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
  cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
  node 0: slabs: 6172, objs: 259224, free: 245741
  node 1: slabs: 3261, objs: 136962, free: 127656

It happens here:

create_worker(struct worker_pool *pool)
 |--> worker = alloc_worker(pool->node);

static struct worker *alloc_worker(int node)
{
        struct worker *worker;

        worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); --> Here, useing the wrong node.

        ......

        return worker;
}

[Solution]

There are four mappings in the kernel:
1. nodeid (logical node id)   <->   pxm
2. apicid (physical cpu id)   <->   nodeid
3. cpuid (logical cpu id)     <->   apicid
4. cpuid (logical cpu id)     <->   nodeid

1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and nodeid <-> pxm
   mapping is setup at boot time. This mapping is persistent, won't change.

2. apicid <-> nodeid mapping is setup using info in 1. The mapping is setup at boot
   time and CPU hotadd time, and cleared at CPU hotremove time. This mapping is also
   persistent.

3. cpuid <-> apicid mapping is setup at boot time and CPU hotadd time. cpuid is
   allocated, lower ids first, and released at CPU hotremove time, reused for other
   hotadded CPUs. So this mapping is not persistent.

4. cpuid <-> nodeid mapping is also setup at boot time and CPU hotadd time, and
   cleared at CPU hotremove time. As a result of 3, this mapping is not persistent.

To fix this problem, we establish cpuid <-> nodeid mapping for all the possible
cpus at boot time, and make it persistent. And according to init_cpu_to_node(),
cpuid <-> nodeid mapping is based on apicid <-> nodeid mapping and cpuid <-> apicid
mapping. So the key point is obtaining all cpus' apicid.

apicid can be obtained by _MAT (Multiple APIC Table Entry) method or found in
MADT (Multiple APIC Description Table). So we finish the job in the following steps:

1. Enable apic registeration flow to handle both enabled and disabled cpus.
   This is done by introducing an extra parameter to generic_processor_info to let the
   caller control if disabled cpus are ignored.

2. Introduce a new array storing all possible cpuid <-> apicid mapping. And also modify
   the way cpuid is calculated. Establish all possible cpuid <-> apicid mapping when
   registering local apic. Store the mapping in this array.

3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid.
   This is also done by introducing an extra parameter to these apis to let the caller
   control if disabled cpus are ignored.

4. Establish all possible cpuid <-> nodeid mapping.
   This is done via an additional acpi namespace walk for processors.

This patch finished step 1.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-3-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 21:18:38 +02:00
Tang Chen
2532fc318d x86/numa: Online memory-less nodes at boot time
For now, x86 does not support memory-less node. A node without memory
will not be onlined, and the cpus on it will be mapped to the other
online nodes with memory in init_cpu_to_node(). The reason of doing this
is to ensure each cpu has mapped to a node with memory, so that it will
be able to allocate local memory for that cpu.

But we don't have to do it in this way.

In this series of patches, we are going to construct cpu <-> node mapping
for all possible cpus at boot time, which is a persistent mapping. It means
that the cpu will be mapped to the node which it belongs to, and will never
be changed. If a node has only cpus but no memory, the cpus on it will be
mapped to a memory-less node. And the memory-less node should be onlined.

Allocate pgdats for all memory-less nodes and online them at boot
time. Then build zonelists for these nodes. As a result, when cpus on these
memory-less nodes try to allocate memory from local node, it will
automatically fall back to the proper zones in the zonelists.

Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-2-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 21:18:38 +02:00
Russell King
b60752f2b2 ARM: sa1111: provide to_sa1111_device() macro
Provide a nicer to_sa1111_device macro to convert a struct device to a
sa1111_dev.  We will need this for drivers when converting them to
dev_pm_ops, or removing shutdown methods.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-21 18:52:45 +01:00
James Hogan
554af0c396 MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
The page structures associated with the vDSO pages in the kernel image
are calculated using virt_to_page(), which uses __pa() under the hood to
find the pfn associated with the virtual address. The vDSO data pointers
however point to kernel symbols, so __pa_symbol() should really be used
instead.

Since there is no equivalent to virt_to_page() which uses __pa_symbol(),
fix init_vdso_image() to work directly with pfns, calculated with
__phys_to_pfn(__pa_symbol(...)).

This issue broke the Malta Enhanced Virtual Addressing (EVA)
configuration which has a non-default implementation of __pa_symbol().
This is because it uses a physical alias so that the kernel executes
from KSeg0 (VA 0x80000000 -> PA 0x00000000), while RAM is provided to
the kernel in the KUSeg range (VA 0x00000000 -> PA 0x80000000) which
uses the same underlying RAM.

Since there are no page structures associated with the low physical
address region, some arbitrary kernel memory would be interpreted as a
page structure for the vDSO pages and badness ensues.

Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/14229/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-21 15:56:10 +02:00
Denys Vlasenko
1827822902 x86/e820: Use much less memory for e820/e820_saved, save up to 120k
The maximum size of e820 map array for EFI systems is defined as
E820_X_MAX (E820MAX + 3 * MAX_NUMNODES).

In x86_64 defconfig, this ends up with E820_X_MAX = 320, e820 and e820_saved
are 6404 bytes each.

With larger configs, for example Fedora kernels, E820_X_MAX = 3200, e820
and e820_saved are 64004 bytes each. Most of this space is wasted.
Typical machines have some 20-30 e820 areas at most.

After previous patch, e820 and e820_saved are pointers to e280 maps.

Change them to initially point to maps which are __initdata.

At the very end of kernel init, just before __init[data] sections are freed
in free_initmem(), allocate smaller blocks, copy maps there,
and change pointers.

The late switch makes sure that all functions which can be used to change
e820 maps are no longer accessible (they are all __init functions).

Run-tested.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160918182125.21000-1-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21 15:02:12 +02:00
Denys Vlasenko
475339684e x86/e820: Prepare e280 code for switch to dynamic storage
This patch turns e820 and e820_saved into pointers to e820 tables,
of the same size as before.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160917213927.1787-2-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21 15:02:12 +02:00
Denys Vlasenko
8c2103f224 x86/e820: Mark some static functions __init
They are all called only from other __init functions in e820.c

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160917213927.1787-1-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21 15:02:11 +02:00
Ingo Molnar
580498a23b Merge branch 'linus' into x86/boot, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-21 15:01:57 +02:00
Alexandre TORGUE
5a79d59637 ARM/dts: Add EXTI controller node to stm32f429
Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: arnd@arndb.de
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: bruherrera@gmail.com
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: lee.jones@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1474387259-18926-5-git-send-email-alexandre.torgue@st.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 14:13:21 +02:00
Alexandre TORGUE
47f9151954 ARM/STM32: Select external interrupts controller
Originally-from: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: arnd@arndb.de
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: bruherrera@gmail.com
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: lee.jones@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1474387259-18926-4-git-send-email-alexandre.torgue@st.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21 14:13:21 +02:00
Russell Currey
b79331a5eb powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
Commit 5958d19a14 checks for prefetchable m64 BARs by comparing the
addresses instead of using resource flags.  This broke SR-IOV as the m64
check in pnv_pci_ioda_fixup_iov_resources() fails.

The condition in pnv_pci_window_alignment() also changed to checking
only IORESOURCE_MEM_64 instead of both IORESOURCE_MEM_64 and
IORESOURCE_PREFETCH.

Revert these cases to the previous behaviour, adding a new helper function
to do so.  This is named pnv_pci_is_m64_flags() to make it clear this
function is only looking at resource flags and should not be relied on for
non-SRIOV resources.

Fixes: 5958d19a14 ("Fix incorrect PE reservation attempt on some 64-bit BARs")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-21 14:04:13 +10:00
Max Filippov
5b8b3c860b xtensa: add default memmap and mmio32native options to defconfigs
Now that memory initialization doesn't add default memory region
specify it explicitly in the memmap command line option in case somebody
wants to boot in non-DT-enabled configuration.

While at it update earlycon access mode to mmio32native to support both
LE and BE cores transparently.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 20:43:24 -07:00
Max Filippov
6edbbd1675 xtensa: add default memmap option to common_defconfig
Now that memory initialization doesn't add default memory region specify
it explicitly in the memmap command line option.

Save common_defconfig as defconfig so that it doesn't have all option
settings in it, only those that are different from the Kconfig defaults.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 20:43:23 -07:00
Max Filippov
81da313f2c xtensa: add default memmap option to iss_defconfig
Now that memory initialization doesn't add default memory region specify
it explicitly in the memmap command line option.

Save iss_defconfig as defconfig so that it doesn't have all option
settings in it, only those that are different from the Kconfig defaults.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 20:43:23 -07:00
Max Filippov
549409b4b5 xtensa: ISS: allow simdisk to use high memory buffers
ISS kernel by default has only low memory. But it may be configured to
support high memory and started in a simulator with more than 128M of
RAM. Simdisk driver in such configuration can get IO request with a
high memory page. There may be no TLB entry for that page, only page
table entry. However simulators don't do pagewalking, so such IO request
will fail. Touch IO buffer in the buffer read/write loop so that a TLB
entry is likely there when read or write simcall is invoked.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 20:43:22 -07:00
Max Filippov
feec273a2b xtensa: ISS: define simc_exit and use it instead of inline asm
A number of ISS platform functions use inline assembly to invoke
simulator exit, not all correctly. Define simc_exit(exit_code) and use
it instead of inline assembly.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 18:53:00 -07:00
Max Filippov
70feca7199 xtensa: xtfpga: group platform_* functions together
Group platform_* functions together and turn two separate #ifdef/#ifndef
blocks into single #ifdef/#else. No functional changes.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 18:52:59 -07:00
Max Filippov
205ad548a7 xtensa: rearrange CCOUNT calibration
DT-enabled kernel should have a CPU node connected to a clock. This clock
is the CCOUNT clock. Use old platform_calibrate_ccount call as a fallback
when CPU node cannot be found or has no clock and in non-DT-enabled
configurations.

Drop no longer needed code that updates CPU clock-frequency property in
the DT; drop DT-related code from the platform_calibrate_ccount too.

Move of_clk_init to the top of time_init, so that clocks are initialized
before CCOUNT calibration is attempted.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-20 18:52:59 -07:00
Max Filippov
58c3e3ac7a xtensa: xtfpga: use clock provider, don't update DT
Instead of querying hardcoded FPGA frequency register and then updating
clock-frequency property in specificly named DT nodes in machine setup
code register a clock provider that returns fixed-rate clock, configured
by register specified in DT. This way we have less magic/hardcoded names
and use more existing common clock framework code.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
2016-09-20 18:52:51 -07:00
Josh Poimboeuf
71f5443ebb x86/dumpstack: Fix show_stack() task pointer regression
With the following commit:

  e18bcccd1a ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")

The task pointer argument to show_stack_log_lvl() in show_stack() was
inadvertently changed to 'current'.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: byungchul.park@lge.com
Cc: fweisbec@gmail.com
Cc: keescook@chromium.org
Cc: linux-tip-commits@vger.kernel.org
Cc: luto@amacapital.net
Cc: nilayvaish@gmail.com
Cc: rostedt@goodmis.org
Cc: tip-bot for Josh Poimboeuf <tipbot@zytor.com>
Fixes: e18bcccd1a ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
Link: http://lkml.kernel.org/r/20160920155340.yhewlx7vmgmov5fb@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 23:36:37 +02:00
Thomas Gleixner
464b5847e6 Merge branch 'irq/urgent' into irq/core
Merge urgent fixes so pending patches for 4.9 can be applied.
2016-09-20 23:20:32 +02:00
Ingo Molnar
2ab78a724b * Fix a boot crash reported by Mike Galbraith and Mike Krinkin. The
new EFI memory map reservation code didn't align reservations to
    EFI_PAGE_SIZE boundaries causing bogus regions to be inserted into
    the global EFI memory map - Matt Fleming
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJX4UshGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
 wI0jPVX2RA/9HiT48K98R1fkiig/V3Wh8H8y7lQbwO+qA1WZ7q/D+DYau62qI3zM
 sLgTSoV2zzjBr+4JWuGIdbtpYBjhHci0HwEcOGp5EFSZmgJGOpc7VY2VdFaYyZ2X
 xhowGLXv2jgfKoAPt9MfgZX2Qh5Jt4CRlC+UcilUJp0wy1+uD4W7XmrLM1Rg7Ug2
 8XTL/BA5QxQa3wx8+/z6xV/ADlhzf3DQ7BwI3qkx5RHSXSuy1/E2XGTnyuDt/7Oo
 KmMwQGGrrv92uSjyq5vJ53DuNxXDJ7nGtlXTT2u0WtcZbzJXsXIaCZItbCA5rqW/
 wWGgsz5XSc406m1WNv8Zs8xx9NAEP/+KXQfftwuQHFqoylG4QqYuO060V5XOxQlh
 cdRuyVUgMONZAvTzEAUC+5SaN3c+WDCL8oEnrs/tdMxSzBL9Rj7yaHMW+ozVe5Nf
 GFxMnO0l5SG9+o8hSAkkn5RDfIDzSTC1W5imqWJQXhC2BXHEdmh4wVWIG4QgPM/K
 gqaTpX72Nu04lBjCXXpfKSK5Xqv67hZOzxUDkkjxtq1PtDbL1dPpzAF3s3GFdreI
 FfQgetg4jM4EPnWDQWqWVJxQUv+uCfpDV9g5y9kMinVb2OpyGjcZpNckGHKnqIXr
 hyntPYqyy793W6dPGEjTqfcVP751qisMjp6iIIqk3h+qr7HyJSF5nqs=
 =S9gJ
 -----END PGP SIGNATURE-----

Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core

Pull EFI fix from Matt Fleming:

 * Fix a boot crash reported by Mike Galbraith and Mike Krinkin. The
   new EFI memory map reservation code didn't align reservations to
   EFI_PAGE_SIZE boundaries causing bogus regions to be inserted into
   the global EFI memory map (Matt Fleming)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:59:15 +02:00
Ingo Molnar
41a66072c3 Merge branch 'efi/urgent' into efi/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:58:59 +02:00
Ingo Molnar
7597cdc066 * Fix a boot hang on large memory machines (multiple terabyte) caused
by type conversion errors in the x86 pat code - Matt Fleming
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJX4UkgGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
 wI0jPVWaoA/9EqgJqgXP5JSoV5EFzGA0uyE65J46hhmpqry0gnJ0Jv4lf9nWVShI
 iYm3VRW2bIjF6sNKw/Z9ZFzhXY96bn3axlPwoMbwsuCJoWHwIG0dYkI+N4oQGcaR
 7xX5hrnWeRPdU28wTZNyvqWBojXJcY2Cv3mg+/1/wByL7wABljcAxmr3a0Vk1j/R
 UsAdWKz0AFYMnKKhfwcmIwTRDFXlzmCVNP1luycFNjVXALW7vHOIav7Wzn7ZMCfv
 Nd6FYlabUm5Nq/o8o7QD/KniBj/0maurBq2jnbtV/Iim6OsMpN62AsXYY06QNtrp
 IlrDb4RIYJa+Oi4VDpjoEZicBlVuEBBJaziS3bPMUlswkO2S/v7LvddIqqEm/K1V
 rqLTpcx+O66LODs1GHz2hDd/T8I5lHsR8F9ggmboQj8kTze4kowpeN4kuTB91+oK
 4gybkXiB7olrjdWCxPe2UE/2c325nHsSmuObDmfJjTxCn1zsbJY233oXLLfaiMgn
 y4URazGH7m2kTAfjbwfJQKLMAz9AOItxaY4f2nIO6BM/ZaNM59CyGQUKU9muhecd
 p/e8qJtyP4narwtDFWm5XMUybS3W46F5jAAux1AGi83vhpfbVjnvajm5wzeonTq3
 pAS3NBWMuJ0xHQsq/zkI+S5bvzErsQKwV0p+tNhpKSb0n72tDaN+CYk=
 =qMBL
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/urgent

Pull EFI fixes from Matt Fleming:

 * Fix a boot hang on large memory machines (multiple terabyte) caused
   by type conversion errors in the x86 PAT code (Matt Fleming)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:56:56 +02:00
Matt Fleming
92dc33501b x86/efi: Round EFI memmap reservations to EFI_PAGE_SIZE
Mike Galbraith reported that his machine started rebooting during boot
after,

  commit 8e80632fb2 ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")

The ESRT table on his machine is 56 bytes and at no point in the
efi_arch_mem_reserve() call path is that size rounded up to
EFI_PAGE_SIZE, nor is the start address on an EFI_PAGE_SIZE boundary.

Since the EFI memory map only deals with whole pages, inserting an EFI
memory region with 56 bytes results in a new entry covering zero
pages, and completely screws up the calculations for the old regions
that were trimmed.

Round all sizes upwards, and start addresses downwards, to the nearest
EFI_PAGE_SIZE boundary.

Additionally, efi_memmap_insert() expects the mem::range::end value to
be one less than the end address for the region.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Reported-by: Mike Krinkin <krinkin.m.u@gmail.com>
Tested-by: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 15:43:31 +01:00
Sebastian Andrzej Siewior
f1e1c9e5e3 perf/x86/intel/bts: Make sure debug store is valid
Since commit 4d4c474124 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

This happens because the code introduced in this commit dereferences the
debug store pointer unconditionally. The debug store is not guaranteed to
be available, so a NULL pointer check as on other places is required.

Fixes: 4d4c474124 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20160920131220.xg5pbdjtznszuyzb@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 16:06:09 +02:00
Matt Fleming
1297667083 x86/efi: Only map RAM into EFI page tables if in mixed-mode
Waiman reported that booting with CONFIG_EFI_MIXED enabled on his
multi-terabyte HP machine results in boot crashes, because the EFI
region mapping functions loop forever while trying to map those
regions describing RAM.

While this patch doesn't fix the underlying hang, there's really no
reason to map EFI_CONVENTIONAL_MEMORY regions into the EFI page tables
when mixed-mode is not in use at runtime.

Reported-by: Waiman Long <waiman.long@hpe.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
CC: Theodore Ts'o <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 14:53:04 +01:00
Matt Fleming
e535ec0899 x86/mm/pat: Prevent hang during boot when mapping pages
There's a mixture of signed 32-bit and unsigned 32-bit and 64-bit data
types used for keeping track of how many pages have been mapped.

This leads to hangs during boot when mapping large numbers of pages
(multiple terabytes, as reported by Waiman) because those values are
interpreted as being negative.

commit 742563777e ("x86/mm/pat: Avoid truncation when converting
cpa->numpages to address") fixed one of those bugs, but there is
another lurking in __change_page_attr_set_clr().

Additionally, the return value type for the populate_*() functions can
return negative values when a large number of pages have been mapped,
triggering the error paths even though no error occurred.

Consistently use 64-bit types on 64-bit platforms when counting pages.
Even in the signed case this gives us room for regions 8PiB
(pebibytes) in size whilst still allowing the usual negative value
error checking idiom.

Reported-by: Waiman Long <waiman.long@hpe.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
CC: Theodore Ts'o <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 14:53:00 +01:00
Russell King
cf6e4ca3e5 ARM: sa1111: add sa1111_get_irq()
Add a helper function to get the irq number for a device.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:08 +01:00
Russell King
1629c9ab36 ARM: sa1111: clean up duplication in IRQ chip implementation
Clean up the duplication in the IRQ chip implementation - we can compute
the register address from the interrupt number rather than duplicating
the code for each register.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:08 +01:00
Russell King
17cf50116e ARM: sa1111: implement a gpio_chip for SA1111 GPIOs
Add a gpio_chip instance for SA1111 GPIOs.  This allows us to use
gpiolib to lookup and manipulate SA1111 GPIOs.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:08 +01:00
Russell King
ccb7d854d6 ARM: sa1111: move irq cleanup to separate function
Move the SA1111 interrupt cleanup to a separate function, so it can be
re-used in the probe error cleanup path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:08 +01:00
Russell King
deee856a5c ARM: sa1111: use devm_clk_get()
Convert sa1111 to use devm_clk_get() to get its clock resource, and
strip out the clk_put() calls.  This simplifies the error handling
a little.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:07 +01:00
Russell King
7d53c1f012 ARM: sa1111: use devm_kzalloc()
Use devm_kzalloc() to allocate our driver data, so we can eliminate its
kfree() from the device removal and error cleanup paths.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:07 +01:00
Russell King
eac8dbf74f ARM: sa1111: ensure we only touch RAB bus type devices when removing
When removing a SA1111 device, we try to remove all child devices.
However, we must only remove our own RAB bus typed devices from the
tree, there may be other devices present which should not be touched.

This is necessary before we introduce gpiochip to SA1111 to avoid
incorrectly trying to remove the gpiochip device, leading to an oops
in __release_resource().

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-20 14:21:07 +01:00
Paul Gortmaker
dcc096c540 s390: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

The additions of uaccess.h are to deal with implict includes like:

arch/s390/kernel/traps.c: In function 'do_report_trap':
arch/s390/kernel/traps.c:56:4: error: implicit declaration of function 'extable_fixup' [-Werror=implicit-function-declaration]
arch/s390/kernel/traps.c: In function 'illegal_op':
arch/s390/kernel/traps.c:173:3: error: implicit declaration of function 'get_user' [-Werror=implicit-function-declaration]

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:38 +02:00
Sebastian Ott
ecc6410abf s390: export header for CLP ioctl
Export clp.h for usage by userspace.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:35 +02:00
Christian Borntraeger
c42d8c7dbe s390: enable UBSAN
This enables UBSAN for s390. We have to disable the null sanitizer
as s390 code does access memory via a null pointer (the prefix page).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:23 +02:00
Masahiro Yamada
f296190e41 s390/crashdump: use list_first_entry_or_null
The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:04 +02:00
Christian Borntraeger
9078a54996 s390: claim efficient unaligned access
most unaligned accesses are reasonable efficient (no kernel emulation)
on s390, let's announce it

This also
- removes the ubsan false positives for unaligned accesses on s390 with
  default config
- uses simpler arithmetic in several functions in several other areas
  of the kernel like ethernet frame classification

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:01 +02:00
Paul Gortmaker
0edfa83916 arm64: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-20 09:36:21 +01:00
Colin Ian King
adad0d02a7 kvm: svm: fix unsigned compare less than zero comparison
vm_data->avic_vm_id is a u32, so the check for a error
return (less than zero) such as -EAGAIN from
avic_get_next_vm_id currently has no effect whatsoever.
Fix this by using a temporary int for the comparison
and assign vm_data->avic_vm_id to this. I used an explicit
u32 cast in the assignment to show why vm_data->avic_vm_id
cannot be used in the assign/compare steps.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-20 09:26:30 +02:00
Paolo Bonzini
095cf55df7 KVM: x86: Hyper-V tsc page setup
Lately tsc page was implemented but filled with empty
values. This patch setup tsc page scale and offset based
on vcpu tsc, tsc_khz and  HV_X64_MSR_TIME_REF_COUNT value.

The valid tsc page drops HV_X64_MSR_TIME_REF_COUNT msr
reads count to zero which potentially improves performance.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Peter Hornyack <peterhornyack@google.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
[Computation of TSC page parameters rewritten to use the Linux timekeeper
 parameters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-20 09:26:20 +02:00
Paolo Bonzini
108b249c45 KVM: x86: introduce get_kvmclock_ns
Introduce a function that reads the exact nanoseconds value that is
provided to the guest in kvmclock.  This crystallizes the notion of
kvmclock as a thin veneer over a stable TSC, that the guest will
(hopefully) convert with NTP.  In other words, kvmclock is *not* a
paravirtualized host-to-guest NTP.

Drop the get_kernel_ns() function, that was used both to get the base
value of the master clock and to get the current value of kvmclock.
The former use is replaced by ktime_get_boot_ns(), the latter is
the purpose of get_kernel_ns().

This also allows KVM to provide a Hyper-V time reference counter that
is synchronized with the time that is computed from the TSC page.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-20 09:26:15 +02:00
Paolo Bonzini
67198ac3f3 KVM: x86: initialize kvmclock_offset
Make the guest's kvmclock count up from zero, not from the host boot
time.  The guest cannot rely on that anyway because it changes on
migration, the numbers are easier on the eye and finally it matches the
desired semantics of the Hyper-V time reference counter.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-20 09:26:13 +02:00
Paolo Bonzini
0d6dd2ff82 KVM: x86: always fill in vcpu->arch.hv_clock
We will use it in the next patches for KVM_GET_CLOCK and as a basis for the
contents of the Hyper-V TSC page.  Get the values from the Linux
timekeeper even if kvmclock is not enabled.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-20 09:25:53 +02:00
Josh Poimboeuf
c8fe460982 x86/dumpstack: Remove dump_trace() and related callbacks
All previous users of dump_trace() have been converted to use the new
unwind interfaces, so we can remove it and the related
print_context_stack() and print_context_stack_bp() callback functions.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5b97da3572b40b5a4d8e185cf2429308d0987a13.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
e18bcccd1a x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder
Convert show_trace_log_lvl() to use the new unwinder.  dump_trace() has
been deprecated.

show_trace_log_lvl() is special compared to other users of the unwinder.
It's the only place where both reliable *and* unreliable addresses are
needed.  With frame pointers enabled, most callers of the unwinder don't
want to know about unreliable addresses.  But in this case, when we're
dumping the stack to the console because something presumably went
wrong, the unreliable addresses are useful:

- They show stale data on the stack which can provide useful clues.

- If something goes wrong with the unwinder, or if frame pointers are
  corrupt or missing, all the stack addresses still get shown.

So in order to show all addresses on the stack, and at the same time
figure out which addresses are reliable, we have to do the scanning and
the unwinding in parallel.

The scanning is done with the help of get_stack_info() to traverse the
stacks.  The unwinding is done separately by the new unwinder.

In theory we could simplify show_trace_log_lvl() by instead pushing some
of this logic into the unwind code.  But then we would need some kind of
"fake" frame logic in the unwinder which would add a lot of complexity
and wouldn't be worth it in order to support only one user.

Another benefit of this approach is that once we have a DWARF unwinder,
we should be able to just plug it in with minimal impact to this code.

Another change here is that callers of show_trace_log_lvl() don't need
to provide the 'bp' argument.  The unwinder already finds the relevant
frame pointer by unwinding until it reaches the first frame after the
provided stack pointer.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/703b5998604c712a1f801874b43f35d6dac52ede.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
ec2ad9ccf1 oprofile/x86: Convert x86_backtrace() to use the new unwinder
Convert oprofile's x86_backtrace() to use the new unwinder.
dump_trace() has been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/412df8927705795e8ea60cffcf89a79e010713b1.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
49a612c6b0 x86/stacktrace: Convert save_stack_trace_*() to use the new unwinder
Convert save_stack_trace_*() to use the new unwinder.  dump_trace() has
been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/815494c627d89887db0ce56ceffd58ad16ee6c21.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Josh Poimboeuf
35f4d9b325 perf/x86: Convert perf_callchain_kernel() to use the new unwinder
Convert perf_callchain_kernel() to use the new unwinder.  dump_trace()
has been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a2df0c4f09b3d438e11b41681f10b0775a819a7f.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Josh Poimboeuf
7c7900f897 x86/unwind: Add new unwind interface and implementations
The x86 stack dump code is a bit of a mess.  dump_trace() uses
callbacks, and each user of it seems to have slightly different
requirements, so there are several slightly different callbacks floating
around.

Also there are some upcoming features which will need more changes to
the stack dump code, including the printing of stack pt_regs, reliable
stack detection for live patching, and a DWARF unwinder.  Each of those
features would at least need more callbacks and/or callback interfaces,
resulting in a much bigger mess than what we have today.

Before doing all that, we should try to clean things up and replace
dump_trace() with something cleaner and more flexible.

The new unwinder is a simple state machine which was heavily inspired by
a suggestion from Andy Lutomirski:

  https://lkml.kernel.org/r/CALCETrUbNTqaM2LRyXGRx=kVLRPeY5A3Pc6k4TtQxF320rUT=w@mail.gmail.com

It's also similar to the libunwind API:

  http://www.nongnu.org/libunwind/man/libunwind(3).html

Some if its advantages:

- Simplicity: no more callback sprawl and less code duplication.

- Flexibility: it allows the caller to stop and inspect the stack state
  at each step in the unwinding process.

- Modularity: the unwinder code, console stack dump code, and stack
  metadata analysis code are all better separated so that changing one
  of them shouldn't have much of an impact on any of the others.

Two implementations are added which conform to the new unwind interface:

- The frame pointer unwinder which is used for CONFIG_FRAME_POINTER=y.

- The "guess" unwinder which is used for CONFIG_FRAME_POINTER=n.  This
  isn't an "unwinder" per se.  All it does is scan the stack for kernel
  text addresses.  But with no frame pointers, guesses are better than
  nothing in most cases.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6dc2f909c47533d213d0505f0a113e64585bec82.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Ingo Molnar
b2c16e1efd Merge branch 'linus' into x86/asm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:21 +02:00
Jan Beulich
c907420fda locking/rwsem, x86: Drop a bogus cc clobber
With the addition of uses of GCC's condition code outputs in commit:

  35ccfb7114 ("x86, asm: Use CC_SET()/CC_OUT() in <asm/rwsem.h>")

... there's now an overlap of outputs and clobbers in __down_write_trylock().

Such overlaps are generally getting tagged with an error (occasionally
even with an ICE). I can't really tell why plain GCC 6.2 doesn't detect
this (judging by the code it is meant to), while the slightly modified
one I use does. Since condition code clobbers are never necessary on x86
(other than perhaps for documentation purposes, which doesn't really
get done consistently), remove it altogether rather than inventing
something like CC_CLOBBER (to accompany CC_SET/CC_OUT).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/57E003CC0200007800110102@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:26:41 +02:00
Alexander Shishkin
8ee83b2ab3 perf/x86/intel/pt: Add support for PTWRITE and power event tracing
The Intel PT facility grew some new functionality:

  * PTWRITE packet carries the payload of the new PTWRITE instruction
    that can be used to instrument Intel PT traces with user-supplied
    data. Packets of this type are only generated if 'ptwrite' capability
    is set and PTWEn bit is set in the event attribute's config. Flow
    update packets (FUP) can be generated on PTWRITE packets if FUPonPTW
    config bit is set. Setting these bits is not allowed if 'ptwrite'
    capability is not set.

  * PWRE, PWRX, MWAIT, EXSTOP packets communicate core power management
    events. These depend on 'power_event_tracing' capability and are
    enabled by setting PwrEvtEn bit in the event attribute.

Extend the driver capabilities and provide the proper sanity checks in the
event validation function.

[ tglx: Massaged changelog ]

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20160916134819.1978-1-alexander.shishkin@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:18:28 +02:00
Paul Gortmaker
744c193eb9 x86: Migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table related
functions.  We've now separated that content out into its own file
"extable.h" so now move over to that and avoid all the extra header content
in module.h that we don't really need to compile these files.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/20160919210418.30243-1-paul.gortmaker@windriver.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:05:43 +02:00
Prarit Bhargava
6baf3d6182 x86/tsc: Add additional Intel CPU models to the crystal quirk list
commit aa297292d7 ("x86/tsc: Enumerate SKL cpu_khz and tsc_khz via
CPUID") added code to retrieve the crystal and TSC frequency from CPUID
leaves. If the crystal freqency is enumerated as 0,the resulting TSC
frequency is 0 as well. For CPUs with a known fixed crystal frequency a
quirk list is available to set the frequency,

Kabylake and SkylakeX CPUs are missing in the list of CPUs which need this
quirk. Add them so the TSC frequency can be calculated correctly.

[ tglx: Removed the silly default case as the switch() is only invoked when
  	cpu_khz is 0. Massaged changelog. ]

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/1474289501-31717-3-git-send-email-prarit@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:00:32 +02:00
Prarit Bhargava
655e52d2b6 x86/tsc: Use cpu id defines instead of hex constants
asm/intel-family.h contains defines for cpu ids which should be used
instead of hex constants. Convert the switch case in native_calibrate_tsc()
to use the defines before adding more cpu models.

[ tglx: Massaged changelog ]

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/1474289501-31717-2-git-send-email-prarit@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:00:32 +02:00
Denys Vlasenko
cff9ab2b29 x86/apic: Get rid of apic_version[] array
The array has a size of MAX_LOCAL_APIC, which can be as large as 32k, so it
can consume up to 128k.

The array has been there forever and was never used for anything useful
other than a version mismatch check which was introduced in 2009.

There is no reason to store the version in an array. The kernel is not
prepared to handle different APIC versions anyway, so the real important
part is to detect a version mismatch and warn about it, which can be done
with a single variable as well.

[ tglx: Massaged changelog ]

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Borislav Petkov <bp@alien8.de>
CC: Brian Gerst <brgerst@gmail.com>
CC: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20160913181232.30815-1-dvlasenk@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 00:31:19 +02:00
Wanpeng Li
b0f48706a1 x86/apic: Order irq_enter/exit() calls correctly vs. ack_APIC_irq()
===============================
[ INFO: suspicious RCU usage. ]
4.8.0-rc6+ #5 Not tainted
-------------------------------
./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/2/0.

stack backtrace:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.8.0-rc6+ #5
Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
 0000000000000000 ffff8d1bd6003f10 ffffffff94446949 ffff8d1bd4a68000
 0000000000000001 ffff8d1bd6003f40 ffffffff940e9247 ffff8d1bbdfcf3d0
 000000000000080b 0000000000000000 0000000000000000 ffff8d1bd6003f70
Call Trace:
 <IRQ>  [<ffffffff94446949>] dump_stack+0x99/0xd0
 [<ffffffff940e9247>] lockdep_rcu_suspicious+0xe7/0x120
 [<ffffffff9448e0d5>] do_trace_write_msr+0x135/0x140
 [<ffffffff9406e750>] native_write_msr+0x20/0x30
 [<ffffffff9406503d>] native_apic_msr_eoi_write+0x1d/0x30
 [<ffffffff9405b17e>] smp_trace_call_function_interrupt+0x1e/0x270
 [<ffffffff948cb1d6>] trace_call_function_interrupt+0x96/0xa0
 <EOI>  [<ffffffff947200f4>] ? cpuidle_enter_state+0xe4/0x360
 [<ffffffff947200df>] ? cpuidle_enter_state+0xcf/0x360
 [<ffffffff947203a7>] cpuidle_enter+0x17/0x20
 [<ffffffff940df008>] cpu_startup_entry+0x338/0x4d0
 [<ffffffff9405bfc4>] start_secondary+0x154/0x180

This can be reproduced readily by running ftrace test case of kselftest.

Move the irq_enter() call before ack_APIC_irq(), because irq_enter() tells
the RCU susbstems to end the extended quiescent state, so that the
following trace call in ack_APIC_irq() works correctly. The same applies to
exiting_ack_irq() which calls ack_APIC_irq() after irq_exit().

[ tglx: Massaged changelog ]

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474198491-3738-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 00:31:19 +02:00
Vinson Lee
6e68b08728 x86/vdso: Use CONFIG_X86_X32_ABI to enable vdso prctl
The prctl code which references vdso_image_x32 is built when CONFIG_X86_X32
is set. This results in the following build failure:

  LD      init/built-in.o
arch/x86/built-in.o: In function `do_arch_prctl':
(.text+0x27466): undefined reference to `vdso_image_x32'

vdso_image_x32 depends on CONFIG_X86_X32_ABI. So we need to make the prctl
depend on that as well.

[ tglx: Massaged changelog ]

Fixes: 2eefd87896 ("x86/arch_prctl/vdso: Add ARCH_MAP_VDSO_*")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/1474073513-6656-1-git-send-email-vlee@freedesktop.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 00:01:48 +02:00
Linus Torvalds
7bb91e0673 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a potential weakness in IPsec CBC IV generation, as well as
  a number of issues that arose out of an OOM crash on ARM with CTR-mode
  AES"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: arm64/aes-ctr - fix NULL dereference in tail processing
  crypto: arm/aes-ctr - fix NULL dereference in tail processing
  crypto: skcipher - Fix blkcipher walk OOM crash
  crypto: echainiv - Replace chaining with multiplication
2016-09-19 12:58:34 -07:00
Sebastian Andrzej Siewior
b067a7be41 x86/apic/uv: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-19-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:33 +02:00
Sebastian Andrzej Siewior
84c9ceefec s390/mm/pfault: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-s390@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: rt@linutronix.de
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20160906170457.32393-18-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:32 +02:00
Sebastian Andrzej Siewior
e476d31291 mips/loongson/smp: Convert to hotplug state machine
Install the callbacks via the state machine.

[ tglx: Reuse the MIPS_SOC_PREPARE state ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-17-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:32 +02:00
Sebastian Andrzej Siewior
dd6d7c6f3d mips/octeon/smp: Convert to hotplug state machine
Install the callbacks via the state machine.

[ tglx: Renamed the state to MIPS_SOC_PREPARE so it can be reused by other
  	SOCs ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-16-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:32 +02:00
Sebastian Andrzej Siewior
29bd7fbc07 x86/microcode: Convert to hotplug state machine
Install the callbacks via the state machine. 
CPU_UP_CANCELED_FROZEN() is not preserved: It is only there to free memory in an
error case because it is assumed if the CPU does show up on resume it won't be
seen ever again. As per Borislav:
|IOW, you don't need mc_cpu_dead().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160907164523.46a2xnffha4bv63g@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:27 +02:00
Sebastian Andrzej Siewior
515332336b sh/SH-X3 SMP: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-sh@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-6-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:27 +02:00
Sebastian Andrzej Siewior
6b8d642239 ia64/mca: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-5-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:26 +02:00
Sebastian Andrzej Siewior
a4fa9cc220 ARM/OMAP/wakeupgen: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Cc: linux-omap@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160906170457.32393-4-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:26 +02:00
Sebastian Andrzej Siewior
657ebf7a23 ARM/shmobile: Convert to hotplug state machine
Install the callbacks via the state machine so the old notifier based
cpuhotplug infrastructure can be removed.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-sh@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: rt@linutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160906170457.32393-3-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:26 +02:00
Sebastian Andrzej Siewior
c23a7266e6 arm64/FP/SIMD: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: rt@linutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160906170457.32393-2-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19 21:44:25 +02:00
Scott Telford
bebbc4bcf3 xtensa: Tweak xuartps UART driver Rx watermark for Cadence CSP config.
Add module parameter xilinx_uartps.rx_trigger_level=32 to command line
options for CSP to set Rx watermark for xuartps driver lower than the
default value, to avoid UART overruns at 115200 bps.

Signed-off-by: Scott Telford <stelford@cadence.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-19 11:51:32 -07:00
Marcin Nowakowski
08bccf4362 MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API
Add lost Kconfig symbol.  This should have been part of 40e084a506
('MIPS: Add uprobes support.').

Fixes: 40e084a506 ('MIPS: Add uprobes support.')
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-19 18:37:43 +02:00
Aaro Koskinen
8074d78295 MIPS: Octeon: Fix platform bus probing
Commit 44a7185c2a ("of/platform: Add common method to populate
default bus") added new arch_initcall of_platform_default_populate_init()
that will override device_initcall octeon_publish_devices(). This broke
many OCTEON boards as important devices are not getting probed anymore
(e.g. on EdgeRouter Lite the USB mass storage/rootfs is missing).

Fix by changing octeon_publish_devices() to arch_initcall.

Fixes: 44a7185c2a ("of/platform: Add common method to populate default bus")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Rob Herring <robh@kernel.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14041/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-19 17:35:44 +02:00
Aaro Koskinen
3312eca519 MIPS: Octeon: mangle-port: fix build failure with VDSO code
Commit 1685ddbe35 ("MIPS: Octeon: Changes to support readq()/writeq()
usage.") added bitwise shift operations that assume that unsigned long
is always 64-bits. This broke the build of VDSO code, as it gets compiled
also in "faked" 32-bit mode. Althought the failing inline functions are
never executed in 32-bit mode, they still need to pass the compilation.
Fix by using 64-bit types explicitly.

The patch fixes the following build failure:

  CC      arch/mips/vdso/gettimeofday-o32.o
In file included from los/git/devel/linux/arch/mips/include/asm/io.h:32:0,
                 from los/git/devel/linux/arch/mips/include/asm/page.h:194,
                 from los/git/devel/linux/arch/mips/vdso/vdso.h:26,
                 from los/git/devel/linux/arch/mips/vdso/gettimeofday.c:11:
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h: In function '__should_swizzle_bits':
los/git/devel/linux/arch/mips/include/asm/mach-cavium-octeon/mangle-port.h:19:40: error: right shift count >= width of type [-Werror=shift-count-overflow]
  unsigned long did = ((unsigned long)a >> 40) & 0xff;
                                        ^~

Fixes: 1685ddbe35 ("MIPS: Octeon: Changes to support readq()/writeq() usage.")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-19 17:21:37 +02:00
Marcin Nowakowski
b244614a60 MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
cpu_has_fpu macro uses smp_processor_id() and is currently executed
with preemption enabled, that triggers the warning at runtime.

It is assumed throughout the kernel that if any CPU has an FPU, then all
CPUs would have an FPU as well, so it is safe to perform the check with
preemption enabled - change the code to use raw_ variant of the check to
avoid the warning.

Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org  # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/14125/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-19 16:18:05 +02:00
Geert Uytterhoeven
a1a9e88f70 m68k: Use IS_ENABLED() instead of checking for built-in or module
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2016-09-19 11:29:45 +02:00
Linus Torvalds
88b4ad287c Two patches fixing problems introduced with copy_from_user changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX3qRTAAoJEMsfJm/On5mBtBAP/1iiuIju0XrGfVCA/xsP1bzy
 mVwjEd5llemxeqSTy1oo51UFmJRXvDJ/sWelKwpWMXc73suje90u+H7DdZX8nbzd
 kMdOGVyF/IfvU7vVajjV5nygAFJs5dgdrU5bLxGnt+NeLa+hLEV//uK1WftLV0hz
 EhVXWvxBcAUMk2r4wLFjTzHmaJPztW7s4RVOY7ub45ZZvvuXm/fvvsAjc+fu8a8P
 ufDsCONAlKs08sNpJq+yhOVmRjzbOpvzwjksnaTwMwEHEQa3+7zelHEL7Xu6ayS3
 g1uGR9UTYSypjVpPe2XHYfroYoT1EctnzRsGoJcjyD3i4IfAOppFingk1rvQpujs
 4zjbjXZl7qR/E1eeV+gA946Ijq2SQKjTmzj/D4aH98+C7ysMdPxIY9BY2NaBs29t
 WNHBZiCAbz1e5DXdPDxT1zcE31Up94lwjZ61RmI/dQLReweDC1P2IZu+uYY90ZBN
 z6anUTj03bj63NqML66da8z/IOBYtiZ6UDt/f4xLnLwkewlg/g1WUHST3xqAGEwE
 PAeWidIBt+f7Cd6ozGThoJmqTA1KbK5TF3ZY1rkp1MNVuKgujBvYYoHgl9A0m9KI
 98kQ+I3+ghYIJTzJfuk29sU5RnbL5xz4QgYukDJn+AT9UM+8pzES4aUVcKMXvrHl
 oBvfNiXYQmvEPaDtGk7U
 =g30z
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull uaccess fixes from Guenter Roeck:
 "Two patches fixing problems introduced with copy_from_user changes"

* tag 'fixes-for-linus-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  openrisc: fix the fix of copy_from_user()
  avr32: fix 'undefined reference to `___copy_from_user'
2016-09-18 11:57:24 -07:00
Linus Torvalds
3286be9480 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A couple of small fixes to x86 perf drivers:

   - Measure L2 for HW_CACHE* events on AMD

   - Fix the address filter handling in the intel/pt driver

   - Handle the BTS disabling at the proper place"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
  perf/x86/intel/pt: Do validate the size of a kernel address filter
  perf/x86/intel/pt: Fix kernel address filter's offset validation
  perf/x86/intel/pt: Fix an off-by-one in address filter configuration
  perf/x86/intel: Don't disable "intel_bts" around "intel" event batching
2016-09-18 11:50:48 -07:00
Guenter Roeck
8e4b72054f openrisc: fix the fix of copy_from_user()
Since commit acb2505d01 ("openrisc: fix copy_from_user()"),
copy_from_user() returns the number of bytes requested, not the
number of bytes not copied.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: acb2505d01 ("openrisc: fix copy_from_user()")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-18 07:26:42 -07:00
Guenter Roeck
65c0044ca8 avr32: fix 'undefined reference to `___copy_from_user'
avr32 builds fail with:

arch/avr32/kernel/built-in.o: In function `arch_ptrace':
(.text+0x650): undefined reference to `___copy_from_user'
arch/avr32/kernel/built-in.o:(___ksymtab+___copy_from_user+0x0): undefined
reference to `___copy_from_user'
kernel/built-in.o: In function `proc_doulongvec_ms_jiffies_minmax':
(.text+0x5dd8): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `proc_dointvec_minmax_sysadmin':
sysctl.c:(.text+0x6174): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `ptrace_has_cap':
ptrace.c:(.text+0x69c0): undefined reference to `___copy_from_user'
kernel/built-in.o:ptrace.c:(.text+0x6b90): more undefined references to
`___copy_from_user' follow

Fixes: 8630c32275 ("avr32: fix copy_from_user()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Havard Skinnemoen <hskinnemoen@gmail.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-09-18 07:26:26 -07:00
Linus Torvalds
baf009f927 powerpc fixes for 4.8 #6
Fixes for code merged this cycle:
  - Fix restore of SPRs upon wake up from hypervisor state loss from Gautham R. Shenoy
  - Fix the state of root PE from Gavin Shan
  - Detach from PE on releasing PCI device from Gavin Shan
  - Fix size of NUM_CPU_FTR_KEYS on 32-bit
  - Fix missed TCE invalidations that should fallback to OPAL
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX3QI8AAoJEFHr6jzI4aWAmDEP/3k8Tuk30d+QhfVm++N18cZ7
 EFdpO25m+kJH83PVc3Lri6sj2z6/Bpm6Ib2V3gynExB9SxUCcAJXHqwioLkL9/PW
 aUiotxsPvlpfBFAjNbk2myB8JSbc/+8yJaojKYWqwX796bjUdRkI7rmXtfrjmX6U
 uhQQ9nvKNxThwY5eedMH9PCJ89BzgLefrExHUD171iR43qfaouLkUn/Ba+UIhC5m
 pepwePCTXHEPm8e328hYVSNEmqWRgL+UN2EUZKqXjITNtDSHCdwGTF8iifwTku54
 g/rrta8CgFD4x5chTROnOhJMkTD9MRoneVR8nE4QD6yMHj9k1huL8J8wlfnG/zbB
 Ym6MNKBYbGPMAoYfbxAcvWr/7XL+szNoR+p+VWl+rgf2Z08dQaI4zNiB3aimCs1g
 7yWW649Gd4gXyNygfeMCDWGZbVhQdQIHcNrcAKFuIRvkn3iPZ0cPa5GYxZ7o/32B
 oKAtZMsufGN0eC21hbLaRkyeYPdqEjyk+T734t05cfBCvScWkHeBapnX9gYOoCqZ
 ok7b8wXVqVFXZ+FSZ8Ec7YquUHBhHECpqofMgB6d9DqbWPlubwiA3g4YnjrpFDC7
 u4a4bVKVZy8fk3w7+2ibkIdud35zL0LqkB2ZNhOn3IYM/yBD0zgUs+bIqruTKZ2+
 AYapeGjmf+SBD3ytGtab
 =MG5t
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Fixes for code merged this cycle:

   - Fix restore of SPRs upon wake up from hypervisor state loss from
     Gautham R  Shenoy
   - Fix the state of root PE from Gavin Shan
   - Detach from PE on releasing PCI device from Gavin Shan
   - Fix size of NUM_CPU_FTR_KEYS on 32-bit
   - Fix missed TCE invalidations that should fallback to OPAL"

* tag 'powerpc-4.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv/pci: Fix missed TCE invalidations that should fallback to OPAL
  powerpc/powernv: Detach from PE on releasing PCI device
  powerpc/powernv: Fix the state of root PE
  powerpc/kernel: Fix size of NUM_CPU_FTR_KEYS on 32-bit
  powerpc/powernv: Fix restore of SPRs upon wake up from hypervisor state loss
2016-09-17 12:52:01 -07:00
Linus Torvalds
008f08d64a ARM: SoC fixes
Here are a couple of bugfixes for v4.8-rc. Most of them have
 actually been around for a while this time but for some reason
 didn't get applied early on. The shmobile regulator fix is the
 only one that isn't completely obvious.
 
 device tree changes:
 - archtimer interrupts must be level triggered (multiple platforms)
 - fix for USB and MMC clocks on STiH410
 - fix split DT repository in case of raspberry-pi 3
 - A new use of skeleton.dtsi on arm64 has crept in after that
   was removed.
 
 defconfig updates:
 - xilinx vdma has a new Kconfig symbol name
 - keystone requires CONFIG_NOP_USB_XCEIV since v4.8-rc1
 
 code fixes:
 - fix regulator quirk on shmobile
 - suspend-to-ram regression on EXYNOS
 
 maintainer updates:
 - Javier Martinez Canillas is now a reviewer for Samsung EXYNOS
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAV9wHNGCrR//JCVInAQKPBhAA4HWz5YoE1FwatmrZ7LyLgl+SD7ezDuGC
 w/oo01kGYSq9vN8E7rTWqTW/lylTgt7adX3E6wNPIIfVg9dx9TnJ0HofH3TjHku4
 K7HeqapNqqqWh3VF8xFZO6YkKi09uhsX+j8NOAGlhd50A4OrOA1xh1NtpIakLX7z
 TYBpbjW1TB3SwNiq7CbC1PJUKzTfP49hQmf6dUdKajLZ2Wova4H0bonyo45DhanZ
 JiZyZlR9NnieVcftAP+kGFskM8q2hyZPqtoCar/mWrmerWMUG3n1MWj9LyDTVsVc
 zd7wBcX9dLOe26qGW88MUnbUBC/R2nZ+VDzMwyoYoIHlHALDcn2ldDotLDVIRp6A
 xGMejt06Q2qG8zX4SCjyq9hu2LeyBRWHkRTaoAD2tsT5SD4KNMi3GeYnq9Su+iYw
 hXrCOrua1pMDhWsU5RMGrfPXKbZSkkcvvt1MAoUn5h7xTqLQ1+PKLIUVOPnAR6Ns
 lHR86oo1kAoXDPbKZRnMbHSQ76kW+nWF+vDOJ7ozXNwZtcmXZiqfKxs/RDVecqFL
 kJMPJBPUGW5FAakarLb68f8XVsiHQr3ujofTyA77cUACqLBidbhxbfq+5NMWyck/
 zXPLk4HEGBlg9v8g17g1MxdttS+Na9pzNVfE23CuGKc147QIh1M3DeLjoIZ9gSfH
 p8cxVpe5gBs=
 =tYAb
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here are a couple of bugfixes for v4.8-rc.

  Most of them have actually been around for a while this time but for
  some reason didn't get applied early on.  The shmobile regulator fix
  is the only one that isn't completely obvious.

  Device tree changes:
   - archtimer interrupts must be level triggered (multiple platforms)
   - fix for USB and MMC clocks on STiH410
   - fix split DT repository in case of raspberry-pi 3
   - a new use of skeleton.dtsi on arm64 has crept in after that was
     removed.

  defconfig updates:
   - xilinx vdma has a new Kconfig symbol name
   - keystone requires CONFIG_NOP_USB_XCEIV since v4.8-rc1

  Code fixes:
   - fix regulator quirk on shmobile
   - suspend-to-ram regression on EXYNOS

  Maintainer updates:
   - Javier Martinez Canillas is now a reviewer for Samsung EXYNOS"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: keystone: defconfig: Fix USB configuration
  arm64: dts: Fix broken architected timer interrupt trigger
  ARM: multi_v7_defconfig: update XILINX_VDMA
  ARM64: dts: bcm: Use a symlink to R-Pi dtsi files from arch=arm
  ARM: dts: Remove use of skeleton.dtsi from bcm283x.dtsi
  ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
  ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
  ARM: shmobile: fix regulator quirk for Gen2
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support
2016-09-16 12:15:41 -07:00
Linus Torvalds
cac4662a88 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Most of this update are fixes primarily discovered from testing on the
  older StrongARM 1110 and PXA systems, as a result of recent interest
  from several people in these platforms:

   - Locomo interrupt handling incorrectly stores the handler data in
     the chip's private data slot: when Locomo is combined with an
     interrupt controller who's chip uses the chip private data, this
     leads to an oops.

   - SA1111 was missing a call to clk_disable() to clean up after a
     failed probe.

   - SA1111 and PCMCIA suspend/resume was broken:

     The PCMCIA "ds" layer was using the legacy bus suspend/resume
     methods, which the core PM code is no longer calling as a result of
     device_pm_check_callbacks() introduced in commit aa8e54b559
     ("PM / sleep: Go direct_complete if driver has no callbacks").

     SA1111 was broken due to changes to PCMCIA which makes PCMCIA
     suspend itself later than the SA1111 code expects, and resume
     before the SA1111 code has initialised access to the pcmcia
     sub-device.

   - the default SA1111 interrupt mask polarity got messed up when it
     was converted to use a dynamic interrupt base number for its
     interrupts.

   - fix platform_get_irq() error code propagation, which was causing
     problems on platforms where the interrupt may not be available at
      probe time in DT setups.

   - fix the lack of clock to PCMCIA code on PXA platforms, which was
     omitted in conversions of PXA to CCF.

   - fix an oops in the PXA PCMCIA code caused by a previous commit not
     realising that Lubbock is different from the rest of the PXA PCMCIA
     drivers.

   - ensure that SA1111 low-level PCMCIA drivers propagate their error
     codes to the main probe function, rather than the driver silently
     accepting a failure.

   - fix the sa11xx debugfs reporting of timing information, which
     always indicated zero due to the clock being a factor of 1000 out.

   - fix the polarity of the status change signal reported from the
     sockets.

  Lastly, one ARM specific commit from Stefan Agner fixing the LPAE
  cache attributes"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: pxa/lubbock: add pcmcia clock
  ARM: locomo: fix locomo irq handling
  ARM: 8612/1: LPAE: initialize cache policy correctly
  ARM: sa1111: fix missing clk_disable()
  ARM: sa1111: fix pcmcia suspend/resume
  ARM: sa1111: fix pcmcia interrupt mask polarity
  ARM: sa1111: fix error code propagation in sa1111_probe()
  pcmcia: lubbock: fix sockets configuration
  pcmcia: sa1111: fix propagation of lowlevel board init return code
  pcmcia: soc_common: fix SS_STSCHG polarity
  pcmcia: sa11xx_base: add units to the timing information
  pcmcia: sa11xx_base: fix reporting of timing information
  pcmcia: ds: fix suspend/resume
2016-09-16 12:08:13 -07:00
Jeremy Linton
85023b2e13 arm64: pmu: Hoist pmu platform device name
Move the PMU name into a common header file so it may
be referenced by other users.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 17:11:34 +01:00
Jeremy Linton
236b9b91cd arm64: pmu: Probe default hw/cache counters
ARMv8 machines can identify the micro/arch defined counters
that are available on a machine. Add all these counters to the
default armv8 perf map. At run-time disable the counters which
are not available on the given PMU.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 17:11:33 +01:00
Mark Salter
dbee3a74ef arm64: pmu: add fallback probe table
In preparation for ACPI support, add a pmu_probe_info table to
the arm_pmu_device_probe() call. This table gets used when
probing in the absence of a devicetree node for PMU.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 17:11:33 +01:00
Luiz Capitulino
4f5758fc67 kvm: x86: export TSC information to user-space
This commit exports the following information to
user-space via the newly created per-vcpu debugfs
directory:

 - TSC offset (as a signed number)
 - TSC scaling ratio
 - TSC scaling ratio fractinal bits

The original intention of this commit was to
export only the TSC offset, but the TSC scaling
information is exported for completeness.

We need to retrieve the TSC offset from user-space
in order to support the merging of host and guest
traces in trace-cmd. Today, we use the kvm_write_tsc_offset
tracepoint, but it has a number of problems (mainly,
it requires a running VM to be rebooted, ftrace setup,
and also tracepoints are not supposed to be ABIs).

The merging of host and guest traces is explained
in more detail in this thread:

 [Qemu-devel] [RFC] host and guest kernel trace merging
 https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html

This commit creates the following files in debugfs:

/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-offset
/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-scaling-ratio
/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-scaling-ratio-frac-bits

The last two are only created if TSC scaling is supported.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-16 16:57:48 +02:00
Luiz Capitulino
235539b48a kvm: add stubs for arch specific debugfs support
Two stubs are added:

 o kvm_arch_has_vcpu_debugfs(): must return true if the arch
   supports creating debugfs entries in the vcpu debugfs dir
   (which will be implemented by the next commit)

 o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
   entries in the vcpu debugfs dir

For x86, this commit introduces a new file to avoid growing
arch/x86/kvm/x86.c even more.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-16 16:57:47 +02:00
Luiz Capitulino
3e3f50262e kvm: x86: drop read_tsc_offset()
The TSC offset can now be read directly from struct kvm_arch_vcpu.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-16 16:57:45 +02:00
Luiz Capitulino
a545ab6a00 kvm: x86: add tsc_offset field to struct kvm_vcpu_arch
A future commit will want to easily read a vCPU's TSC offset,
so we store it in struct kvm_arch_vcpu_arch for easy access.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-16 16:57:45 +02:00
Arnd Bergmann
6408649115 1. A recent change in populating irqchip devices from Device Tree
broke Suspend to RAM on Exynos boards due to lack of probing of
    PMU (Power Management Unit) driver.  Multiple drivers attach to
    the PMU's DT node: irqchip, clock controller and PMU platform
    driver for handling suspend.  The new irqchip code marked the
    PMU's DT node as OF_POPULATED but we need to attach to this
    node also PMU platform driver.
 
 2. Add Javier as additional reviewer for Exynos patches.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX25oaAAoJEME3ZuaGi4PXDa4P/0KG/BAzf++1/43zP9I3PAhM
 od3QOLDrnR16ly2AL8QbJWq574ZvNa8ssZ5m3QrWgt4JEGVudc97OpByupTrgm51
 xlfceTcsyG95OV1Psm20NJrIK2o1FEDIr4Bdh0/ixjOs/zajLyzUtx0qQ2rHa8mQ
 hzJeJGbk4yu50jsaUDWBH2p9n2EHuCqWfvRroE/fEbGL6G120OkFpiP3AExEycnD
 U3gUMm+QorEWavf9cMtAzJcnFNKDjaT4iStgTAeCrvcJ4ttwX4x1z5PJV9M/zbs4
 XtGoE7mGfGdajhXXnFPPinUB+1MffJdttI6jHEuTWBc/Znfdgr796HnAUYp2bcPp
 WQT/OcnFN0WUeYY4dONBI5/mIuHcj1p5esOttjEvIJNOVdoHevnl5KUhaAajy9DV
 0alKU9DQ49Z5j+e2lxTJ749Lj1+FfnX63EDQik95hrDZc8QKbbIe3F4/OuR6xm4L
 mPUaMSP2OazSpMVNiFoWL/qyB5ukSP0fyLR9cRggUbBUoq8A0WWT0QRMYaaATnQS
 fzEeEoerHIxuqU2LKUBJJ7m3VUl0VbWt7VBU2jI73RQ13IzeH92eMh6ssMjO3k0d
 R2OAU6R8H6oaCb9qsYajzRjMxm4zEalkUqNTXB4GMM7tsGckqibdgRqF/4fI/4D4
 TtUkGN+FCEJOLzAgAntK
 =Ljgr
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into fixes

Pull "ARM: exynos: Fixes for v4.8, secound round" from Krzysztof Kozłowski:

1. A recent change in populating irqchip devices from Device Tree
   broke Suspend to RAM on Exynos boards due to lack of probing of
   PMU (Power Management Unit) driver.  Multiple drivers attach to
   the PMU's DT node: irqchip, clock controller and PMU platform
   driver for handling suspend.  The new irqchip code marked the
   PMU's DT node as OF_POPULATED but we need to attach to this
   node also PMU platform driver.

2. Add Javier as additional reviewer for Exynos patches.

* tag 'samsung-fixes-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support
2016-09-16 16:29:48 +02:00
Josh Poimboeuf
81539169f2 x86/dumpstack: Remove NULL task pointer convention
show_stack_log_lvl() and friends allow a NULL pointer for the
task_struct to indicate the current task.  This creates confusion and
can cause sneaky bugs.

Instead require the caller to pass 'current' directly.

This only changes the internal workings of the dumpstack code.  The
dump_trace() and show_stack() interfaces still allow a NULL task
pointer.  Those interfaces should also probably be fixed as well.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 16:21:39 +02:00
Matt Fleming
080fe0b790 perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.

This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,

  $ perf stat -e cache-references,cache-misses

measure completely different things.

Instead, make the AMD PMU measure instruction/data cache and TLB fill
requests to the L2 and instruction/data cache and TLB misses in the L2
when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled,
respectively. That way the events measure unified caches on both
platforms.

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 16:19:49 +02:00
Alexander Shishkin
1155bafcb7 perf/x86/intel/pt: Do validate the size of a kernel address filter
Right now, the kernel address filters in PT are prone to integer overflow
that may happen in adding filter's size to its offset to obtain the end
of the range. Such an overflow would also throw a #GP in the PT event
configuration path.

Fix this by explicitly validating the result of this calculation.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Alexander Shishkin
ddfdad991e perf/x86/intel/pt: Fix kernel address filter's offset validation
The kernel_ip() filter is used mostly by the DS/LBR code to look at the
branch addresses, but Intel PT also uses it to validate the address
filter offsets for kernel addresses, for which it is not sufficient:
supplying something in bits 64:48 that's not a sign extension of the lower
address bits (like 0xf00d000000000000) throws a #GP.

This patch adds address validation for the user supplied kernel filters.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Alexander Shishkin
95f60084ac perf/x86/intel/pt: Fix an off-by-one in address filter configuration
PT address filter configuration requires that a range is specified by
its first and last address, but at the moment we're obtaining the end
of the range by adding user specified size to its start, which is off
by one from what it actually needs to be.

Fix this and make sure that zero-sized filters don't pass the filter
validation.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Andy Lutomirski
74327a3e88 x86/process: Pin the target stack in get_wchan()
This will prevent a crash if get_wchan() runs after the task stack
is freed.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/337aeca8614024aa4d8d9c81053bbf8fcffbe4ad.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
1959a60182 x86/dumpstack: Pin the target stack when dumping it
Specifically, pin the stack in save_stack_trace_tsk() and
show_trace_log_lvl().

This will prevent a crash if the target task dies before or while
dumping its stack once we start freeing task stacks early.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cf0082cde65d1941a996d026f2b2cdbfaca17bfa.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
ff0071c036 x86/entry/64: Fix a minor comment rebase error
When I rebased my thread_info changes onto Brian's switch_to()
changes, I carefully checked that I fixed up all the code correctly,
but I missed a comment :(

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 15f4eae70d ("x86: Move thread_info into task_struct")
Link: http://lkml.kernel.org/r/089fe1e1cbe8b258b064fccbb1a5a5fd23861031.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:52 +02:00
Linus Torvalds
024c7e3756 One fix for an x86 regression in VM migration, mostly visible with
Windows because it uses RTC periodic interrupts.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJX2xpHAAoJEL/70l94x66D9NsIAIw+9oRA86qjehVnguV3fRKA
 ITZ4OGFDiXPWuxqDaw8mHHXr0RYx8KcMTzfFNbV+YL5U0cq9xYzdaNhchKPpyF+3
 7H5wL8Ku9wkYZ930kdCf5Q+LNCfg8d/wKlibPEbX0MDx4jL99kkcxLzEkmIRqFlq
 bpXaQe/KR1xCWR6gI/a6aRJWLfGuFMV82YSnk/dCSjwotbAwjJUSt+IPhLwhx28o
 7ddcxW3CxQqelJorcu2lvRiGnCvEzDhIdOvHJqibCjo3uzqbcI4PA2gs3rozbs9s
 VMEzqZpNgK0XsyKyccSw75npViIHYPkjMzxoyHMDhgiP3eIwp/tJquxAfLjK4WE=
 =h4P4
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
 "One fix for an x86 regression in VM migration, mostly visible with
  Windows because it uses RTC periodic interrupts"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: correctly reset dest_map->vector when restoring LAPIC state
2016-09-15 15:15:41 -07:00
Al Viro
1c109fabbd fix minor infoleak in get_user_ex()
get_user_ex(x, ptr) should zero x on failure.  It's not a lot of a leak
(at most we are leaking uninitialized 64bit value off the kernel stack,
and in a fairly constrained situation, at that), but the fix is trivial,
so...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ This sat in different branch from the uaccess fixes since mid-August ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15 12:54:04 -07:00
Paolo Bonzini
b0eaf4506f kvm: x86: correctly reset dest_map->vector when restoring LAPIC state
When userspace sends KVM_SET_LAPIC, KVM schedules a check between
the vCPU's IRR and ISR and the IOAPIC redirection table, in order
to re-establish the IOAPIC's dest_map (the list of CPUs servicing
the real-time clock interrupt with the corresponding vectors).

However, __rtc_irq_eoi_tracking_restore_one was forgetting to
set dest_map->vectors.  Because of this, the IOAPIC did not process
the real-time clock interrupt EOI, ioapic->rtc_status.pending_eoi
got stuck at a non-zero value, and further RTC interrupts were
reported to userspace as coalesced.

Fixes: 9e4aabe2bb
Fixes: 4d99ba898d
Cc: stable@vger.kernel.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: David Gilbert <dgilbert@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-15 18:00:32 +02:00
Roger Quadros
a6805884e2 ARM: keystone: defconfig: Fix USB configuration
Simply enabling CONFIG_KEYSTONE_USB_PHY doesn't work anymore
as it depends on CONFIG_NOP_USB_XCEIV. We need to enable
that as well.

This fixes USB on Keystone boards from v4.8-rc1 onwards.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-09-15 11:46:12 +02:00
Alexander Shishkin
cecf62352a perf/x86/intel: Don't disable "intel_bts" around "intel" event batching
At the moment, intel_bts events get disabled from intel PMU's disable
callback, which includes event scheduling transactions of said PMU,
which have nothing to do with intel_bts events.

We do want to keep intel_bts events off inside the PMI handler to
avoid filling up their buffer too soon.

This patch moves intel_bts enabling/disabling directly to the PMI
handler.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915082233.11065-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 11:25:26 +02:00
Ingo Molnar
3947f49302 x86/vdso: Only define map_vdso_randomized() if CONFIG_X86_64
... otherwise the compiler complains:

   arch/x86/entry/vdso/vma.c:252:12: warning: ‘map_vdso_randomized’ defined but not used [-Wunused-function]

But the #ifdeffery here is getting pretty ugly, so move around
vdso_addr() as well to cluster the dependencies a bit more.

It's still not particulary pretty though ...

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gorcunov@openvz.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: oleg@redhat.com
Cc: xemul@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 09:44:32 +02:00
David A. Long
3e593f6675 arm64: Improve kprobes test for atomic sequence
Kprobes searches backwards a finite number of instructions to determine if
there is an attempt to probe a load/store exclusive sequence. It stops when
it hits the maximum number of instructions or a load or store exclusive.
However this means it can run up past the beginning of the function and
start looking at literal constants. This has been shown to cause a false
positive and blocks insertion of the probe. To fix this, further limit the
backwards search to stop if it hits a symbol address from kallsyms. The
presumption is that this is the entry point to this code (particularly for
the common case of placing probes at the beginning of functions).

This also improves efficiency by not searching code that is not part of the
function. There may be some possibility that the label might not denote the
entry path to the probed instruction but the likelihood seems low and this
is just another example of how the kprobes user really needs to be
careful about what they are doing.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-15 08:33:46 +01:00
Michael Ellerman
ed7d9a1d7d powerpc/powernv/pci: Fix missed TCE invalidations that should fallback to OPAL
In commit f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE
invalidations"), we added logic to fallback to OPAL for doing TCE
invalidations if we can't do it in Linux.

Ben sent a v2 of the patch, containing these additional call sites, but
I had already applied v1 and didn't notice. So fix them now.

Fixes: f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE invalidations")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-15 17:05:11 +10:00
Ingo Molnar
91b7bd39e6 x86/vdso: Only define prctl_map_vdso() if CONFIG_CHECKPOINT_RESTORE
... otherwise the compiler complains:

  arch/x86/kernel/process_64.c:528:13: warning: ‘prctl_map_vdso’ defined but not used [-Wunused-function]

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gorcunov@openvz.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: oleg@redhat.com
Cc: xemul@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:45:39 +02:00
Andy Lutomirski
15f4eae70d x86: Move thread_info into task_struct
Now that most of the thread_info users have been cleaned up,
this is straightforward.

Most of this code was written by Linus.

Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a50eab40abeaec9cb9a9e3cbdeafd32190206654.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:13 +02:00
Linus Torvalds
d896fa20a7 um/Stop conflating task_struct::stack with thread_info
thread_info may move in the future, so use the accessors.

[ Andy Lutomirski wrote this changelog message and changed
  "task_thread_info(child)->cpu" to "task_cpu(child)". ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3439705d9838940cc82733a7335fa8c654c37db8.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:12 +02:00
Linus Torvalds
97245d0058 x86/entry: Get rid of pt_regs_to_thread_info()
It was a nice optimization while it lasted, but thread_info is moving
and this optimization will no longer work.

Quoting Linus:

    Oh Gods, Andy. That pt_regs_to_thread_info() thing made me want
    to do unspeakable acts on a poor innocent wax figure that looked
    _exactly_ like you.

[ Changelog written by Andy. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6376aa81c68798cc81631673f52bd91a3e078944.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:12 +02:00
Andy Lutomirski
b9d989c721 x86/asm: Move the thread_info::status field to thread_struct
Because sched.h and thread_info.h are a tangled mess, I turned
in_compat_syscall() into a macro.  If we had current_thread_struct()
or similar and we could use it from thread_info.h, then this would
be a bit cleaner.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ccc8a1b2f41f9c264a41f771bb4a6539a642ad72.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:12 +02:00
Ingo Molnar
d4b80afbba Merge branch 'linus' into x86/asm, to pick up recent fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:24:53 +02:00
Josh Poimboeuf
fcd709ef20 x86/dumpstack: Add recursion checking for all stacks
in_exception_stack() has some recursion checking which makes sure the
stack trace code never traverses a given exception stack more than once.
This prevents an infinite loop if corruption somehow causes a stack's
"next stack" pointer to point to itself (directly or indirectly).

The recursion checking can be useful for other stacks in addition to the
exception stack, so extend it to work for all stacks.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/95de5db4cfe111754845a5cef04e20630d01423f.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:13:15 +02:00
Josh Poimboeuf
5fe599e02e x86/dumpstack: Add support for unwinding empty IRQ stacks
When an interrupt happens in entry code while running on a software IRQ
stack, and the IRQ stack was empty, regs->sp will contain the stack end
address (e.g., irq_stack_ptr).  If the regs are passed to dump_trace(),
get_stack_info() will report STACK_TYPE_UNKNOWN, causing dump_trace() to
return prematurely without trying to go to the next stack.

Update the bounds checking for software interrupt stacks so that the
ending address is now considered part of the stack.

This means that it's now possible for the 'walk_stack' callbacks --
print_context_stack() and print_context_stack_bp() -- to be called with
an empty stack.  But that's fine; they're already prepared to deal with
that due to their on_stack() checks.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5a5e5de92dcf11e8dc6b6e8e50ad7639d067830b.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:13:15 +02:00
Josh Poimboeuf
cb76c93982 x86/dumpstack: Add get_stack_info() interface
valid_stack_ptr() is buggy: it assumes that all stacks are of size
THREAD_SIZE, which is not true for exception stacks.  So the
walk_stack() callbacks will need to know the location of the beginning
of the stack as well as the end.

Another issue is that in general the various features of a stack (type,
size, next stack pointer, description string) are scattered around in
various places throughout the stack dump code.

Encapsulate all that information in a single place with a new stack_info
struct and a get_stack_info() interface.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:13:15 +02:00
Josh Poimboeuf
9c00390757 x86/dumpstack: Simplify in_exception_stack()
in_exception_stack() does some bad, bad things just so the unwinder can
print different values for different areas of the debug exception stack.

There's no need to clarify where exactly on the stack it is.  Just print
"#DB" and be done with it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e91cb410169dd576678dd427c35efb716fd0cee1.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:13:14 +02:00
Gavin Shan
29bf282dec powerpc/powernv: Detach from PE on releasing PCI device
The PCI hotplug can be part of EEH error recovery. The @pdn and
the device's PE number aren't removed and added afterwords. The
PE number in @pdn should be set to an invalid one. Otherwise, the
PE's device count is decreased on removing devices while failing
to be increased on adding devices. It leads to unbalanced PE's
device count and make normal PCI hotplug path broken.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-15 13:53:19 +10:00
Jon Mason
f4e8715099 clk: iproc: Make clocks visible options
Make the clocks visible options that can be selected by anyone.  This
avoids the problems of:
 1) Select is a reverse dependency and is hard for people to understand
    and can sometimes be a pain to track down
 2) Build coverage goes down because configs are hidden
 3) Code bloat

Patch suggested by Stephen Boyd

Signed-off-by: Jon Mason <jonmason@broadcom.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-09-14 17:21:47 -07:00
Linus Torvalds
4cea877657 PCI updates for v4.8:
Enumeration
     Mark Haswell Power Control Unit as having non-compliant BARs (Bjorn Helgaas)
 
   Power management
     Fix bridge_d3 update on device removal (Lukas Wunner)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX2az0AAoJEFmIoMA60/r8/EoP/28mGRiKi8mlqNAR3MYN3F0n
 VSIm7WyxWNawH1gRJXKQBzNqgMJnj4qRGXSIvP3AIYyBDcJs/X7j91/eKOARNfQr
 55A+gfSz4jUKlw+0WgPY8/U2/xQ4yoom1zhbsAYcIVeljZo/3JUg+wHpPjhIMkH0
 2slTerHRDExrS43jxQi225toiEaO6lcY8EVmHCDo+jYlQz3sCEwIXg9hn1rwTbvG
 sJI0zyUwHF+oWowgJqlwYxsbPPnelPAN5YAx7KrHuVmBdL0Bgo3oIRtbb3JZZ9Up
 L9bQ6NpRjSARvijaZ2TAhueqIIDv2HGgvwNB01l4Yggw7Sm1dFCuUS6vj/e5tpZA
 xntE3F6s2Z+I4I1D7pAX3jMYCdYx/QltiTCeGRp8pJv+f4ewW3jcel3FAksY3BEg
 0NCjDrGFqGYai4hGRROpt/aXlW/Pn53eQLlu4Xg2qgkj0NMh0ODMrTjMnABB39ae
 eGqIXab7WeVBxt10eU19J1u1RTqpUO2LJW+cMnvYdCfKAYby/gj8SD8vIsn3oDjZ
 hQS/4fSHurc7LZwDmwOfaiHlGnvcQWV9EKwgScS0v8AxPQnC8pNUgYpzZcXY8Q6I
 YXtyK7suFriSZPS0Qs4FrdfJrmTBaBQ55aZu9aftb5v4YqacN5qZo+HaXiONBy3v
 RzsFK6xIbnIgb35g8vKy
 =PQml
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Here are two changes for v4.8.  The first fixes a "[Firmware Bug]: reg
  0x10: invalid BAR (can't size)" warning on Haswell, and the second
  fixes a problem in some new runtime suspend functionality we merged
  for v4.8.  Summary:

  Enumeration:
    Mark Haswell Power Control Unit as having non-compliant BARs (Bjorn Helgaas)

  Power management:
    Fix bridge_d3 update on device removal (Lukas Wunner)"

* tag 'pci-v4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix bridge_d3 update on device removal
  PCI: Mark Haswell Power Control Unit as having non-compliant BARs
2016-09-14 14:06:30 -07:00
Arnd Bergmann
d20ced23c7 Merge branch 'dt/irq-fix' into fixes
* dt/irq-fix:
  arm64: dts: Fix broken architected timer interrupt trigger
2016-09-14 22:47:36 +02:00
Marc Zyngier
f2a89d3b2b arm64: dts: Fix broken architected timer interrupt trigger
The ARM architected timer specification mandates that the interrupt
associated with each timer is level triggered (which corresponds to
the "counter >= comparator" condition).

A number of DTs are being remarkably creative, declaring the interrupt
to be edge triggered. A quick look at the TRM for the corresponding ARM
CPUs clearly shows that this is wrong, and I've corrected those.
For non-ARM designs (and in the absence of a publicly available TRM),
I've made them active low as well, which can't be completely wrong
as the GIC cannot disinguish between level low and level high.

The respective maintainers are of course welcome to prove me wrong.

While I was at it, I took the liberty to fix a couple of related issue,
such as some spurious affinity bits on ThunderX, and their complete
absence on ls1043a (both of which seem to be related to copy-pasting
from other DTs).

Acked-by: Duc Dang <dhdang@apm.com>
Acked-by: Carlo Caione <carlo@endlessm.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-09-14 22:47:22 +02:00
Fabian Frederick
7ccb8e633c ARM: multi_v7_defconfig: update XILINX_VDMA
Commit fde57a7c44
("dmaengine: xilinx: Rename driver and config")

renamed config XILINX_VDMA to config XILINX_DMA
Update defconfig accordingly.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-09-14 22:40:59 +02:00
Dmitry Safonov
6846351052 x86/signal: Add SA_{X32,IA32}_ABI sa_flags
Introduce new flags that defines which ABI to use on creating sigframe.
Those flags kernel will set according to sigaction syscall ABI,
which set handler for the signal being delivered.

So that will drop the dependency on TIF_IA32/TIF_X32 flags on signal deliver.
Those flags will be used only under CONFIG_COMPAT.

Similar way ARM uses sa_flags to differ in which mode deliver signal
for 26-bit applications (look at SA_THIRYTWO).

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-7-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:11 +02:00
Dmitry Safonov
cc87324b3d x86/ptrace: Down with test_thread_flag(TIF_IA32)
As the task isn't executing at the moment of {GET,SET}REGS,
return regset that corresponds to code selector, rather than
value of TIF_IA32 flag.
I.e. if we ptrace i386 elf binary that has just changed it's
code selector to __USER_CS, than GET_REGS will return
full x86_64 register set.

Note, that this will work only if application has changed it's CS.
If the application does 32-bit syscall with __USER_CS, ptrace
will still return 64-bit register set. Which might be still confusing
for tools that expect TS_COMPACT to be exposed [1, 2].

So this this change should make PTRACE_GETREGSET more reliable and
this will be another step to drop TIF_{IA32,X32} flags.

[1]: https://sourceforge.net/p/strace/mailman/message/30471411/
[2]: https://lkml.org/lkml/2012/1/18/320

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: Pedro Alves <palves@redhat.com>
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-6-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:11 +02:00
Dmitry Safonov
90954e7b94 x86/coredump: Use pr_reg size, rather that TIF_IA32 flag
Killed PR_REG_SIZE and PR_REG_PTR macro as we can get regset size
from regset view.
I wish I could also kill PRSTATUS_SIZE nicely.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: 0x7f454c46@gmail.com
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-5-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:10 +02:00
Dmitry Safonov
2eefd87896 x86/arch_prctl/vdso: Add ARCH_MAP_VDSO_*
Add API to change vdso blob type with arch_prctl.
As this is usefull only by needs of CRIU, expose
this interface under CONFIG_CHECKPOINT_RESTORE.

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-4-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:09 +02:00
Dmitry Safonov
576ebfefd3 x86/vdso: Replace calculate_addr in map_vdso() with addr
That will allow to specify address where to map vDSO blob.
For the randomized vDSO mappings introduce map_vdso_randomized()
which will simplify calls to map_vdso.

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-3-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:09 +02:00
Dmitry Safonov
e38447ee1f x86/vdso: Unmap vdso blob on vvar mapping failure
If remapping of vDSO blob failed on vvar mapping,
we need to unmap previously mapped vDSO blob.

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-2-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14 21:28:08 +02:00
Linus Torvalds
77e5bdf9f7 Merge branch 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess fixes from Al Viro:
 "Fixes for broken uaccess primitives - mostly lack of proper zeroing
  in copy_from_user()/get_user()/__get_user(), but for several
  architectures there's more (broken clear_user() on frv and
  strncpy_from_user() on hexagon)"

* 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
  avr32: fix copy_from_user()
  microblaze: fix __get_user()
  microblaze: fix copy_from_user()
  m32r: fix __get_user()
  blackfin: fix copy_from_user()
  sparc32: fix copy_from_user()
  sh: fix copy_from_user()
  sh64: failing __get_user() should zero
  score: fix copy_from_user() and friends
  score: fix __get_user/get_user
  s390: get_user() should zero on failure
  ppc32: fix copy_from_user()
  parisc: fix copy_from_user()
  openrisc: fix copy_from_user()
  nios2: fix __get_user()
  nios2: copy_from_user() should zero the tail of destination
  mn10300: copy_from_user() should zero on access_ok() failure...
  mn10300: failing __get_user() and get_user() should zero
  mips: copy_from_user() must zero the destination on access_ok() failure
  ARC: uaccess: get_user to zero out dest in cause of fault
  ...
2016-09-14 09:35:05 -07:00
Linus Torvalds
b8f26e880c xen: regression fix for 4.8-rc6
- Fix SMP boot in arm guests.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX2VU5AAoJEFxbo/MsZsTR9LYIAI5VUqMXq2eeItorp2XZfZ24
 t5X+Noob+6NwiCWML2LvLVuoyPhsg1ADbOGRR08SkWWThxOtzrNaB1IvudKMxZ9Q
 c6BxxTVcAQ3lvs2PxvS0s3UI/GeF1yuolpdmNkoOkCc3hoNJ4H8J5RDJguEJzkWy
 OVFiMCkpTbQoJ/kAzlOVoBYV5BuSlEzc86fmS1wmdqmLC/YEAc9mnEB12Qjo8w6u
 IQ/lH9p5GXhLco0NrowxfxsNT0bIj8keaA1WozkDT8i4KFcFE4pw0i96Szdrraou
 hr3tZidPOMWxBEEDTY13Xp9+4RSZmVlVJfCc87jO71Jknk0dyfboGcic7362Mvo=
 =V1SL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.8b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen regression fix from David Vrabel:
 "Fix SMP boot in arm guests"

* tag 'for-linus-4.8b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/xen: fix SMP guests boot
2016-09-14 08:42:51 -07:00
Josh Poimboeuf
cfeeed279d x86/dumpstack: Allow preemption in show_stack_log_lvl() and dump_trace()
show_stack_log_lvl() and dump_trace() are already preemption safe:

- If they're running in irq or exception context, preemption is already
  disabled and the percpu stack pointers can be trusted.

- If they're running with preemption enabled, they must be running on
  the task stack anyway, so it doesn't matter if they're comparing the
  stack pointer against a percpu stack pointer from this CPU or another
  one: either way it won't match.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a0ca0b1044eca97d4f0ec7c1619cf80b3b65560d.1473371307.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-14 17:23:30 +02:00
Vitaly Kuznetsov
de75abbe01 arm/xen: fix SMP guests boot
Commit 88e957d6e4 ("xen: introduce xen_vcpu_id mapping") broke SMP
ARM guests on Xen. When FIFO-based event channels are in use (this is
the default), evtchn_fifo_alloc_control_block() is called on
CPU_UP_PREPARE event and this happens before we set up xen_vcpu_id
mapping in xen_starting_cpu. Temporary fix the issue by setting direct
Linux CPU id <-> Xen vCPU id mapping for all possible CPUs at boot. We
don't currently support kexec/kdump on Xen/ARM so these ids always
match.

In future, we have several ways to solve the issue, e.g.:

- Eliminate all hypercalls from CPU_UP_PREPARE, do them from the
  starting CPU. This can probably be done for both x86 and ARM and, if
  done, will allow us to get Xen's idea of vCPU id from CPUID/MPIDR on
  the starting CPU directly, no messing with ACPI/device tree
  required.

- Save vCPU id information from ACPI/device tree on ARM and use it to
  initialize xen_vcpu_id mapping. This is the same trick we currently
  do on x86.

Reported-by: Julien Grall <julien.grall@arm.com>
Tested-by: Wei Chen <Wei.Chen@arm.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-09-14 14:39:13 +01:00
Gavin Shan
6eaed1665f powerpc/powernv: Fix the state of root PE
The PE for root bus (root PE) can be removed because of PCI hot
remove in EEH recovery path for fenced PHB error. We need update
@phb->root_pe_populated accordingly so that the root PE can be
populated again in forthcoming PCI hot add path. Also, the PE
shouldn't be destroyed as it's global and reserved resource.

Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-14 11:40:09 +10:00
Al Viro
8630c32275 avr32: fix copy_from_user()
really ugly, but apparently avr32 compilers turns access_ok() into
something so bad that they want it in assembler.  Left that way,
zeroing added in inline wrapper.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:18 -04:00
Al Viro
e98b9e37ae microblaze: fix __get_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:17 -04:00
Al Viro
d0cf385160 microblaze: fix copy_from_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:17 -04:00
Al Viro
c90a3bc506 m32r: fix __get_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:16 -04:00
Al Viro
8f035983dd blackfin: fix copy_from_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:16 -04:00
Al Viro
917400cecb sparc32: fix copy_from_user()
Cc: stable@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:15 -04:00
Al Viro
6e050503a1 sh: fix copy_from_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:15 -04:00
Al Viro
c685238922 sh64: failing __get_user() should zero
It could be done in exception-handling bits in __get_user_b() et.al.,
but the surgery involved would take more knowledge of sh64 details
than I have or _want_ to have.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:14 -04:00
Al Viro
b615e3c746 score: fix copy_from_user() and friends
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:14 -04:00
Al Viro
c2f18fa4cb score: fix __get_user/get_user
* should zero on any failure
* __get_user() should use __copy_from_user(), not copy_from_user()

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:13 -04:00
Al Viro
fd2d2b191f s390: get_user() should zero on failure
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:13 -04:00
Al Viro
224264657b ppc32: fix copy_from_user()
should clear on access_ok() failures.  Also remove the useless
range truncation logics.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:50:02 -04:00
Al Viro
aace880fee parisc: fix copy_from_user()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:44 -04:00
Al Viro
acb2505d01 openrisc: fix copy_from_user()
... that should zero on faults.  Also remove the <censored> helpful
logics wrt range truncation copied from ppc32.  Where it had ever
been needed only in case of copy_from_user() *and* had not been merged
into the mainline until a month after the need had disappeared.
A decade before openrisc went into mainline, I might add...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:44 -04:00
Al Viro
2e29f50ad5 nios2: fix __get_user()
a) should not leave crap on fault
b) should _not_ require access_ok() in any cases.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:43 -04:00
Al Viro
e33d1f6f72 nios2: copy_from_user() should zero the tail of destination
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:43 -04:00
Al Viro
ae7cc577ec mn10300: copy_from_user() should zero on access_ok() failure...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:42 -04:00
Al Viro
43403eabf5 mn10300: failing __get_user() and get_user() should zero
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:42 -04:00
Al Viro
e69d700535 mips: copy_from_user() must zero the destination on access_ok() failure
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:41 -04:00
Vineet Gupta
05d9d0b96e ARC: uaccess: get_user to zero out dest in cause of fault
Al reported potential issue with ARC get_user() as it wasn't clearing
out destination pointer in case of fault due to bad address etc.

Verified using following

| {
|  	u32 bogus1 = 0xdeadbeef;
|	u64 bogus2 = 0xdead;
|	int rc1, rc2;
|
|  	pr_info("Orig values %x %llx\n", bogus1, bogus2);
|	rc1 = get_user(bogus1, (u32 __user *)0x40000000);
|	rc2 = get_user(bogus2, (u64 __user *)0x50000000);
|	pr_info("access %d %d, new values %x %llx\n",
|		rc1, rc2, bogus1, bogus2);
| }

| [ARCLinux]# insmod /mnt/kernel-module/qtn.ko
| Orig values deadbeef dead
| access -14 -14, new values 0 0

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:41 -04:00
Al Viro
8ae95ed4ae metag: copy_from_user() should zero the destination on access_ok() failure
Cc: stable@vger.kernel.org
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:40 -04:00
Al Viro
a5e541f796 ia64: copy_from_user() should zero the destination on access_ok() failure
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:40 -04:00
Al Viro
f35c1e0671 hexagon: fix strncpy_from_user() error return
It's -EFAULT, not -1 (and contrary to the comment in there,
__strnlen_user() can return 0 - on faults).

Cc: stable@vger.kernel.org
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:39 -04:00
Al Viro
3b8767a8f0 frv: fix clear_user()
It should check access_ok().  Otherwise a bunch of places turn into
trivially exploitable rootholes.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:39 -04:00
Al Viro
eb47e0293b cris: buggered copy_from_user/copy_to_user/clear_user
* copy_from_user() on access_ok() failure ought to zero the destination
* none of those primitives should skip the access_ok() check in case of
small constant size.

Cc: stable@vger.kernel.org
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13 17:49:38 -04:00
Linus Torvalds
5924bbecd0 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Three fixes:

   - AMD microcode loading fix with randomization

   - an lguest tooling fix

   - and an APIC enumeration boundary condition fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Fix num_processors value in case of failure
  tools/lguest: Don't bork the terminal in case of wrong args
  x86/microcode/AMD: Fix load of builtin microcode with randomized memory
2016-09-13 12:52:45 -07:00
Linus Torvalds
ee319d5834 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This contains:

   - a set of fixes found by directed-random perf fuzzing efforts by
     Vince Weaver, Alexander Shishkin and Peter Zijlstra

   - a cqm driver crash fix

   - an AMD uncore driver use after free fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix PEBSv3 record drain
  perf/x86/intel/bts: Kill a silly warning
  perf/x86/intel/bts: Fix BTS PMI detection
  perf/x86/intel/bts: Fix confused ordering of PMU callbacks
  perf/core: Fix aux_mmap_count vs aux_refcount order
  perf/core: Fix a race between mmap_close() and set_output() of AUX events
  perf/x86/amd/uncore: Prevent use after free
  perf/x86/intel/cqm: Check cqm/mbm enabled state in event init
  perf/core: Remove WARN from perf_event_read()
2016-09-13 12:47:29 -07:00
Linus Torvalds
7c2c114416 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "This contains a Xen fix, an arm64 fix and a race condition /
  robustization set of fixes related to ExitBootServices() usage and
  boundary conditions"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Use efi_exit_boot_services()
  efi/libstub: Use efi_exit_boot_services() in FDT
  efi/libstub: Introduce ExitBootServices helper
  efi/libstub: Allocate headspace in efi_get_memory_map()
  efi: Fix handling error value in fdt_find_uefi_params
  efi: Make for_each_efi_memory_desc_in_map() cope with running on Xen
2016-09-13 12:02:00 -07:00
Masahiro Yamada
f148b41e8b x86: Clean up various simple wrapper functions
Remove unneeded variables and assignments.

While we are here, let's fix the following as well:

  - Remove unnecessary parentheses
  - Remove unnecessary unsigned-suffix 'U' from constant values
  - Reword the comment in set_apic_id() (suggested by Thomas Gleixner)

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Travis <travis@sgi.com>
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steffen Persvold <sp@numascale.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1473573502-27954-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-13 20:42:58 +02:00
Andy Lutomirski
85063fac1f x86/entry/64: Clean up and document espfix64 stack setup
The espfix64 setup code was a bit inscrutible and contained an
unnecessary push of RAX.  Remove that push, update all the stack
offsets to match, and document the whole mess.

Reported-By: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e5459eb10cf1175c8b36b840bc425f210d045f35.1473717910.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-13 20:34:16 +02:00
Ingo Molnar
5465fe0fc3 * Refactor the EFI memory map code into architecture neutral files
and allow drivers to permanently reserve EFI boot services regions
    on x86, as well as ARM/arm64 - Matt Fleming
 
  * Add ARM support for the EFI esrt driver - Ard Biesheuvel
 
  * Make the EFI runtime services and efivar API interruptible by
    swapping spinlocks for semaphores - Sylvain Chouleur
 
  * Provide the EFI identity mapping for kexec which allows kexec to
    work on SGI/UV platforms with requiring the "noefi" kernel command
    line parameter - Alex Thorlton
 
  * Add debugfs node to dump EFI page tables on arm64 - Ard Biesheuvel
 
  * Merge the EFI test driver being carried out of tree until now in
    the FWTS project - Ivan Hu
 
  * Expand the list of flags for classifying EFI regions as "RAM" on
    arm64 so we align with the UEFI spec - Ard Biesheuvel
 
  * Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32)
    or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot
    services function table for direct calls, alleviating us from
    having to maintain the custom function table - Lukas Wunner
 
  * Miscellaneous cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJX0tCTGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
 wI0jPVWLVBAAn/iM91Vmhggdk3t0wCMJzrNGonw61VJ9TZJVbCUJyiH0qdDUThhj
 R4rO+6Vf6yOuyswu+mGmae61tfsjwJHH+IPpB8nRLIGQRwzoxk+aGC7FzmQ0ISVO
 wIdv5shsmeWhFAyNB1D4hzlp1NxOZaqcU/0cfUVGe4HmK0Js3tUpWWx8VgJ7yvW+
 X1PBbfyChArGqiwV6FJz/mJxRAgByUfhvYMcX9HhQkou6F4U5Y8l3vlhUMbuAZAi
 ZfG2LWGYCQ+F4XKxMq2QDAtdUgBzlYWw6W60o55x9WO4cEVSzNVRgedto5o1Zea9
 2QGEr94gim+e5cJ/HeDIEmbWZhAqIdcNDqXSSBd1CDVQytp4PNAn6rxk+2S9kxoe
 T9Mk523HEabo+AZvDAPPJlzcsnIe83JYy69M1xFvcP25ebk7y2BwQtd1jwWPrPDQ
 Q/llzF93aezUFR/guvIw0oHckhQl0ZkNedL9Tq4+UKL0ibp2X4gSX636/x4PkBSP
 5+pyfmO1SAqTiiMQGQMnp4+ngPQeQrxkmVnh1P7cKlTNXg1IoS03t46Xn2Pj10cd
 3KneVDeN9DKIAOn7wPKuPnjTho+9FH36xbwTaIgbt0cWuFFfu090rmqOQfjAJEDN
 foHzsMZ7S6CmeOJnj97NNR8sMQDcc+p9bh1KXpJIHaZAgrKmvqPZpMk=
 =G7L6
 -----END PGP SIGNATURE-----

Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core

Pull EFI updates from Matt Fleming:

"* Refactor the EFI memory map code into architecture neutral files
   and allow drivers to permanently reserve EFI boot services regions
   on x86, as well as ARM/arm64 - Matt Fleming

 * Add ARM support for the EFI esrt driver - Ard Biesheuvel

 * Make the EFI runtime services and efivar API interruptible by
   swapping spinlocks for semaphores - Sylvain Chouleur

 * Provide the EFI identity mapping for kexec which allows kexec to
   work on SGI/UV platforms with requiring the "noefi" kernel command
   line parameter - Alex Thorlton

 * Add debugfs node to dump EFI page tables on arm64 - Ard Biesheuvel

 * Merge the EFI test driver being carried out of tree until now in
   the FWTS project - Ivan Hu

 * Expand the list of flags for classifying EFI regions as "RAM" on
   arm64 so we align with the UEFI spec - Ard Biesheuvel

 * Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32)
   or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot
   services function table for direct calls, alleviating us from
   having to maintain the custom function table - Lukas Wunner

 * Miscellaneous cleanups and fixes"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-13 20:21:55 +02:00
Paul Burton
801f823dc2 MIPS: c-r4k: Fix size calc when avoiding IPIs for small icache flushes
Commit f70ddc07b6 ("MIPS: c-r4k: Avoid small flush_icache_range SMP
calls") adds checks to force use of hit-type cache ops for small icache
flushes where they are globalised & index-type cache ops aren't, in
order to avoid the overhead of IPIs in those cases. However it
calculated the size of the region being flushed incorrectly, subtracting
the end address from the start address rather than the reverse. This
would have led to an overflow with size wrapping round to some large
value, and likely to the special case for avoiding IPIs not actually
being hit.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Fixes: f70ddc07b6 ("MIPS: c-r4k: Avoid small flush_icache_range SMP calls")
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14211/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 17:37:20 +02:00
Huacai Chen
3cbc6fc9c9 MIPS: Add a missing ".set pop" in an early commit
Commit 842dfc11ea ("MIPS: Fix build with binutils 2.24.51+") missing
a ".set pop" in macro fpu_restore_16even, so add it.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Acked-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Steven J . Hill <Steven.Hill@caviumnetworks.com>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # 3.18+
Patchwork: https://patchwork.linux-mips.org/patch/14210/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 17:25:11 +02:00
Matt Redfearn
951c39cd3b MIPS: paravirt: Fix undefined reference to smp_bootstrap
If the paravirt machine is compiles without CONFIG_SMP, the following
linker error occurs

arch/mips/kernel/head.o: In function `kernel_entry':
(.ref.text+0x10): undefined reference to `smp_bootstrap'

due to the kernel entry macro always including SMP startup code.
Wrap this code in CONFIG_SMP to fix the error.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 3.16+
Patchwork: https://patchwork.linux-mips.org/patch/14212/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 16:45:15 +02:00
Borislav Petkov
7cc4ef8ed1 x86/RAS/mce_amd_inj: Fix some W= warnings
In particular:

  arch/x86/ras/mce_amd_inj.c: In function ‘prepare_msrs’:
  arch/x86/ras/mce_amd_inj.c:249:13: warning: declaration of ‘i_mce’ shadows a global declaration [-Wshadow]
    struct mce i_mce = *(struct mce *)info;
               ^~~~~

  arch/x86/ras/mce_amd_inj.c: In function ‘init_mce_inject’:
  arch/x86/ras/mce_amd_inj.c:453:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < ARRAY_SIZE(dfs_fls); i++) {

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160912075941.24699-16-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:14 +02:00
Yazen Ghannam
a884675b87 x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE
records. However, broken hardware and software is not something unheard
of so warn about bank 4 errors. They shouldn't be coming from bank 4
naturally but users can still use mce_amd_inj to simulate errors from it
for testing purposed.

Also, avoid special handling in the injector mce_amd_inj like it is
being done on the older families.

[ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ]

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Aravind Gopalakrishnan  <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:14 +02:00
Yazen Ghannam
4f29b73bae x86/mce/AMD: Extract the error address on SMCA systems
The MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr
in bits [55:0] and the least significant bit of the address in bits
[61:56]. We should extract the valid ErrorAddr bits from the MCA_ADDR
register rather than saving the raw value to struct mce.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1473275643-1721-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:13 +02:00
Yazen Ghannam
4b711f92c9 x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems
The MCA_SYND and MCA_IPID registers contain valuable information and
should be included in MCE output. The MCA_SYND register contains
syndrome and other error information, and the MCA_IPID register will
uniquely identify the MCA bank's type without having to rely on system
software.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:13 +02:00
Yazen Ghannam
5828c46f2c x86/mce/AMD: Save MCA_IPID in MCE struct on SMCA systems
The MCA_IPID register uniquely identifies a bank's type and instance
on Scalable MCA systems. We should save the value of this register
in struct mce along with the other relevant error information. This
ensures that we can decode errors without relying on system software to
correlate the bank to the type.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:12 +02:00
Yazen Ghannam
66ef269dbb x86/mce/AMD: Ensure the deferred error interrupt is of type APIC on SMCA systems
The Deferred Error Interrupt Type is set per bank on Scalable MCA
systems. This is done in a bitfield in the MCA_CONFIG register of each
bank. We should set its type to APIC-based interrupt and not assume BIOS
has set it for us.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472737486-1720-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:11 +02:00
Yazen Ghannam
87a6d4091b x86/mce/AMD: Update sysfs bank names for SMCA systems
Define a bank's sysfs filename based on its IP type and InstanceId.

Credits go to Aravind  for:
 * The general idea and proto- get_name().
 * Defining smca_umc_block_names[] and buf_mcatype[].

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473193490-3291-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:11 +02:00
Yazen Ghannam
5896820e0a x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types
Scalable MCA defines a number of IP types. An MCA bank on an SMCA
system is defined as one of these IP types. A bank's type is uniquely
identified by the combination of the HWID and MCATYPE values read from
its MCA_IPID register.

Add the required tables in order to be able to lookup error descriptions
based on a bank's type and the error's extended error code.

[ bp: Align comments, simplify a bit. ]

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472741832-1690-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:10 +02:00
Yazen Ghannam
cfee4f6f0b x86/mce/AMD: Read MSRs on the CPU allocating the threshold blocks
Scalable MCA systems allow non-core MCA banks to only be accessible by
certain CPUs. The MSRs for these banks are Read-as-Zero on other CPUs.

During allocate_threshold_blocks(), get_block_address() can be scheduled
on CPUs other than the one allocating the block. This causes the MSRs to
be read on the wrong CPU and results in incorrect behavior.

Add a @cpu parameter to get_block_address() and pass this in to ensure
that the MSRs are only read on the CPU that is allocating the block.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472673994-12235-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:08 +02:00
Yazen Ghannam
bad744b7f2 x86/RAS: Add syndrome support to mce_amd_inj
Add a debugfs file which holds the error syndrome (written into
MCA_SYND) of an injected error. Only write it on SMCA systems. Update
README file, while at it.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1467633035-32080-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:07 +02:00
Yazen Ghannam
db819d60f6 x86/mce: Add support for new MCA_SYND register
Syndrome information is no longer contained in MCA_STATUS for SMCA
systems but in a new register - MCA_SYND.

Add a synd field to struct mce to hold MCA_SYND register value. Add it
to the end of struct mce to maintain compatibility with old versions of
mcelog. Also, add it to the respective tracepoint.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1467633035-32080-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:06 +02:00
Yazen Ghannam
74ab0e7a83 x86/mce/AMD: Use msr_ops.misc() in allocate_threshold_blocks()
Change MSR_IA32_MCx_MISC() macro to msr_ops.misc() because SMCA machines
define a different set of MSRs and msr_ops will give you the correct
MISC register.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1468269447-8808-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13 15:23:06 +02:00
Paolo Bonzini
ad53e35ae5 Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
Paul Mackerras writes:

    The highlights are:

    * Reduced latency for interrupts from PCI pass-through devices, from
      Suresh Warrier and me.
    * Halt-polling implementation from Suraj Jitindar Singh.
    * 64-bit VCPU statistics, also from Suraj.
    * Various other minor fixes and improvements.
2016-09-13 15:20:55 +02:00
Paul Burton
b03c1e3b8e MIPS: Remove compact branch policy Kconfig entries
Commit c1a0e9bc88 ("MIPS: Allow compact branch policy to be changed")
added Kconfig entries allowing for the compact branch policy used by the
compiler for MIPSr6 kernels to be specified. This can be useful for
debugging, particularly in systems where compact branches have recently
been introduced.

Unfortunately mainline gcc 5.x supports MIPSr6 but not the
-mcompact-branches compiler flag, leading to MIPSr6 kernels failing to
build with gcc 5.x with errors such as:

  mipsel-linux-gnu-gcc: error: unrecognized command line option '-mcompact-branches=optimal'
  make[2]: *** [kernel/bounds.s] Error 1

Fixing this by hiding the Kconfig entry behind another seems to be more
hassle than it's worth, as MIPSr6 & compact branches have been around
for a while now and if policy does need to be set for debug it can be
done easily enough with KCFLAGS. Therefore remove the compact branch
policy Kconfig entries & their handling in the Makefile.

This reverts commit c1a0e9bc88 ("MIPS: Allow compact branch policy to
be changed").

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: c1a0e9bc88 ("MIPS: Allow compact branch policy to be changed")
Cc: stable <stable@vger.kernel.org> # v4.4+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14241/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 14:14:50 +02:00
James Hogan
ac7e385f2b MIPS: MAAR: Fix address alignment
The alignment of MIPS MAAR region addresses isn't quite right.

- It rounds an already 64 KiB aligned start address up to the next
  64 KiB boundary, e.g. 0x80000000 is rounded up to 0x80010000.

- It assumes the end address is already on a 64 KiB boundary and doesn't
  round it down. Should that not be the case it will hit the second
  BUG_ON() in write_maar_pair().

Both cases are addressed by rounding up and down to 64 KiB boundaries in
the more traditional way of adding 0xffff (for rounding up) and masking
off the low 16 bits.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13858/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 14:13:26 +02:00
James Hogan
58cae9b0f0 MIPS: Fix memory regions reaching top of physical
Memory regions added with add_memory_region() at the top of the physical
address space will have their end address overflow to 0. This causes
them to be rejected as invalid, and would cause various other issues
later on.

This causes issues on Malta and Boston platforms when wanting to use all
2GB of RAM on a 32-bit kernel, either via highmem (using physical
addresses 0x90000000..0xFFFFFFFF), or with the Malta Enhanced Virtual
Addressing (EVA) layout which exposes the whole 0x80000000..0xFFFFFFFF
physical address range to kernel mode at 0x00000000..0x7FFFFFFF.

Due to the abundance of these non-overflow assumptions and the fact that
memblock already avoids the arithmetic overflow by limiting the size of
new memory regions without the arch code knowing it (in particular
mem_init_free_highmem() will trigger a page dump due to nonzero mapcount
on the last page), it is simpler and safer to just limit the size of the
region in a similar way to memblock but at the arch level to allow most
of the RAM to be used without arithmetic overflows.

Therefore we detect this case specifically and reduce the size of the
region slightly to avoid the arithmetic overflows and cause the last
page to be ignored.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13857/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 14:13:26 +02:00
Marcin Nowakowski
2809328f6e MIPS: uprobes: fix incorrect uprobe brk handling
When a uprobe-replacement breakpoint instruction is handled, a notifier
is called with DIE_UPROBE argument, but a corresponding exception notify
handler for MIPS attempts to handle DIE_BREAK instead. As a result
the breakpoint instruction isn't handled by the uprobe code and the probed
application terminates with SIGTRAP.
Fix this by changing arch_uprobe_exception_notify code to handle
DIE_UPROBE as a pre-singlestep condition instead of DIE_BREAK.

Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13884/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 14:13:26 +02:00
Amitoj Kaur Chawla
e3b23148fd MIPS: ath79: Fix test for error return of clk_register_fixed_factor().
clk_register_fixed_factor returns an ERR_PTR in case of an error and
should have an IS_ERR check instead of a null check.

The Coccinelle semantic patch used to find this issue is as follows:
@@
expression e;
statement S;
@@

*e = clk_register_fixed_factor(...);
if (!e) S

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Cc: julia.lawall@lip6.fr
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13894/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-13 14:13:26 +02:00
Ard Biesheuvel
2db34e78f1 crypto: arm64/aes-ctr - fix NULL dereference in tail processing
The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 49788fe2a1 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Cc: stable@vger.kernel.org
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-13 18:44:59 +08:00
Ard Biesheuvel
f82e90b286 crypto: arm/aes-ctr - fix NULL dereference in tail processing
The AES-CTR glue code avoids calling into the blkcipher API for the
tail portion of the walk, by comparing the remainder of walk.nbytes
modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight
into the tail processing block if they are equal. This tail processing
block checks whether nbytes != 0, and does nothing otherwise.

However, in case of an allocation failure in the blkcipher layer, we
may enter this code with walk.nbytes == 0, while nbytes > 0. In this
case, we should not dereference the source and destination pointers,
since they may be NULL. So instead of checking for nbytes != 0, check
for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in
non-error conditions.

Fixes: 86464859cc ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Cc: stable@vger.kernel.org
Reported-by: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-09-13 18:44:59 +08:00
Linus Walleij
eb8994172a Linux 4.8-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXsSTlAAoJEHm+PkMAQRiGZaYH/1MarD897Cju7AgQj8UYQYXB
 dsG3iuoJMj0ji/V466lmHZnS0pwmpJV3qBp7znaFI5jLoTYD2+BkBDY/FLQLqQH7
 vkRKOCLCbppKkDDsgd3dS3FFOnJC724iqnSsPegp2lz/8OS2dIs8i82PgKX3zQjJ
 PqkcMOEt3YlTV18/LiGvQhXWHJ/x8oIvcO0KuLcz3adX0B15+qZyJo1Ss3Z0HL/V
 CqcSVaLgDe0fAAMCexUcKxRNtvGrMENxYRE9YXJFxu1do2H4opXmt0qGun0yxSN4
 8/Y74BB80yUak8lM4mHMCy2G2lDmB32f188yPN45Hg6+F71UJChXXKUnyzUhx14=
 =QQX1
 -----END PGP SIGNATURE-----

Merge tag 'v4.8-rc2' into devel

Linux 4.8-rc2
2016-09-13 10:31:40 +02:00
Markus Elfring
aad9e5ba24 KVM: PPC: e500: Rename jump labels in kvmppc_e500_tlb_init()
Adjust jump labels according to the current Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-13 14:32:47 +10:00
Rafael J. Wysocki
e323218fb9 The schedutil cpufreq governor will be switched from tristate to bool. Fix
defconfigs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXvSdgAAoJEME3ZuaGi4PXnb0P/AzhZPDeMXtsZzXJXa30IM9E
 Ha0/FWOYmYMNDUrQesndFJ1tgb766/8T6uZwHNUjgp1a4fdmgmFsgVVN3UPIhJgh
 MjUu7qWymbTRYuQ3fyJ82zK7Dg9pREYA1hNFp+qHlUEeYSeGELqynHArXyy/ykLN
 UkGhfkBvbwkuIgrlNQSqaIareieg4yx9oviCmJD91ea0yWSKLdazkH6BvhHo3t7d
 oyAcT8p3s9/CDMJXpPZ4J0KrDvbTfS37pwzOr42bLeYIBjH7JAP/ti6y6HJoJXw7
 d+YkuBkYkpC9RWxYaWiZzgkVJVdYvUC2b6a78hJElWPRwI/s9jO+DcVT7hHMZpLy
 XkFavo+gBiE5Cvw9L2ikuHY1Y6+VT0/OB6X930Nq+nSDa6Xg09gJFp84oNm+xSSr
 TQiB1uqZEYIQc8S5+dWehpZHAqtQKOD9EO8tEAT+R7OkA6wKjxUPV7ekpAVCbjai
 Og72V6HoRNnnYra3nNJ9WwXuPAh04yZCMsL0zP9N5qxsUiGr3fFSqcKMYTpb1U8b
 /0pYZWQ7dmab8EkyQByO71iGehrBp86zMRi5tfJokqzqfzbW7H+17KYUzH6D1pXY
 FSYhps6x9SjG0bmr83LGtzlFWdErLKoymPoke/xdu2tlhu0mlgnj5Tm36yJag9AR
 gmPg1VshohG2WVGHz6dE
 =So6k
 -----END PGP SIGNATURE-----

Merge tag 'samsung-defconfig-schedutil-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into pm-cpufreq-sched

The schedutil cpufreq governor will be switched from tristate to bool. Fix
defconfigs.

* tag 'samsung-defconfig-schedutil-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: multi_v7_defconfig: Don't attempt to enable schedutil governor as module
  ARM: exynos_defconfig: Don't attempt to enable schedutil governor as module
2016-09-13 02:53:37 +02:00
Al Stone
c12f29a5d4 x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries()
This patch has no functional change; it is purely cosmetic, though
it does make it a wee bit easier to understand the code.  Before, the
count of LAPICs was being stored in the variable 'x2count' and the
count of X2APICs was being stored in the variable 'count'.  This
patch swaps that so that the routine acpi_parse_madt_lapic_entries()
will now consistently use x2count to refer to X2APIC info, and count
to refer to LAPIC info.

Signed-off-by: Al Stone <ahs3@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:19:59 +02:00
Al Stone
0f61aaa413 x86: ACPI: remove extraneous white space after semicolon
Signed-off-by: Al Stone <ahs3@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:19:59 +02:00
David S. Miller
b20b378d49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mediatek/mtk_eth_soc.c
	drivers/net/ethernet/qlogic/qed/qed_dcbx.c
	drivers/net/phy/Kconfig

All conflicts were cases of overlapping commits.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-12 15:52:44 -07:00
Linus Torvalds
ac059c4fa7 * s390: nested virt fixes (new 4.8 feature)
* x86: fixes for 4.8 regressions
 * ARM: two small bugfixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJX1xaBAAoJEL/70l94x66Db/8H/AuKWs6QbvUNnA+tWITdmFbi
 ah7R2BoRVak4h6UGYC4OsY9BTuWVeaCx2+8yLIqSdfWoYUJzwiGFPwkuUK/hVSkK
 /9gZO8SsREAm/1gVck2X2vzATYbfCxKcRsSq0/ocmIQEpRYWKGoJWS6zeiwLZ9kq
 H/KsAvsGc83VdlPXrhLVQa7js/Tl/M5JlBEbm6i8nIsvhoqZkES4rgwHKBtkxTyU
 8oepfNQnYTb2hZhJE0aW+9z3V0O+NKbGW75bKOzrbJBoEUGeHuPpLnP0mZIyEGPU
 grQkfuWMmfEaWrGahNA9ARpsNcy4Zp0BF5mgJ03OAmgfY+KmJroVk95IvcP9jF4=
 =+uuA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 - s390: nested virt fixes (new 4.8 feature)
 - x86: fixes for 4.8 regressions
 - ARM: two small bugfixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm-arm: Unmap shadow pagetables properly
  x86, clock: Fix kvm guest tsc initialization
  arm: KVM: Fix idmap overlap detection when the kernel is idmap'ed
  KVM: lapic: adjust preemption timer correctly when goes TSC backward
  KVM: s390: vsie: fix riccbd
  KVM: s390: don't use current->thread.fpu.* when accessing registers
2016-09-12 14:30:14 -07:00
Daniel Thompson
91ef84428a irqchip/gic-v3: Reset BPR during initialization
Currently, when running on FVP, CPU 0 boots up with its BPR changed from
the reset value. This renders it impossible to (preemptively) prioritize
interrupts on CPU 0.

This is harmless on normal systems since Linux typically does not
support preemptive interrupts. It does however cause problems in
systems with additional changes (such as patches for NMI simulation).

Many thanks to Andrew Thoelke for suggesting the BPR as having the
potential to harm preemption.

Suggested-by: Andrew Thoelke <andrew.thoelke@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-09-12 19:46:19 +01:00
Russell King
32b6377693 This series of 4 patches optimizes the ARM PLT generation code that
is invoked at module load time, to get rid of the O(n^2) algorithm
 that results in pathological load times of 10 seconds or more for
 large modules on certain STB platforms
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQF8BAABCgBmBQJXxbknXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
 ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXQ5Q0QyQTBEQTZBRDhGNzMzMDE3NUUyQkJD
 MjM3MjA3RTk1NzRGQTdEAAoJEMI3IH6VdPp9Q7cH/jg3kSsKl/tt07AIUtQdW99/
 mCCG8NXExi5sUi9L/0GXmQVSpOlitm2Cf6U3kOGoWA6uGyNr0mfAqtulMLIwerYN
 r344ZZ6/SgKhIhpIHerteDFeeSJKGVsNmMsrqoStqKUa7/SV6waAnq38m2DYB8KX
 iKhuPnsE0vOAypnf4MYtf8LARP1gSRNS5bfcO+S7+ySQDg3M02WEfkr/0n8jHwqT
 tkuWxI3yspt3SrYLGCMFN+X4TFUueRWO+wv+hp2Y4yuFxoIKiOqauDmnZ8uyWAhO
 GSAI9dhhEW4X2Wo8EpSAuM8B+TxVzYfy2wS0ziwVrPuulRvxkSvpmAyIu2wgGcw=
 =kOZH
 -----END PGP SIGNATURE-----

Merge tag 'arm-plt-optimizations-for-v4.9' of git://git.linaro.org/people/ard.biesheuvel/linux-arm into devel-stable

This series of 4 patches optimizes the ARM PLT generation code that
is invoked at module load time, to get rid of the O(n^2) algorithm
that results in pathological load times of 10 seconds or more for
large modules on certain STB platforms
2016-09-12 16:02:45 +01:00
Russell King
1a57c286d8 ARM: pxa/lubbock: add pcmcia clock
Add the required PCMCIA clock for the SA1111 "1800" device.  This clock
is used to compute timing information for the PCMCIA interface in the
SoC device, rather than the SA1111.  Hence, the provision of this clock
is a convenience for the driver and does not reflect the hardware, so
this must not be copied into DT.

Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 12:12:31 +01:00
Russell King
07f56e6646 ARM: locomo: fix locomo irq handling
Accidentally booting Collie on Assabet reveals that the locomo driver
incorrectly overwrites gpio-sa1100's chip data for its parent interrupt,
leading to oops in sa1100_gpio_unmask() and sa1100_update_edge_regs()
when "gpio: sa1100: convert to use IO accessors" is applied.  Fix locomo
to use the handler data rather than chip data for its parent interrupt.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 12:12:31 +01:00
Stefan Agner
6b3142b2b8 ARM: 8612/1: LPAE: initialize cache policy correctly
The cachepolicy variable gets initialized using a masked pmd
value. So far, the pmd has been masked with flags valid for the
2-page table format, but the 3-page table format requires a
different mask. On LPAE, this lead to a wrong assumption of what
initial cache policy has been used. Later a check forces the
cache policy to writealloc and prints the following warning:
Forcing write-allocate cache policy for SMP

This patch introduces a new definition PMD_SECT_CACHE_MASK for
both page table formats which masks in all cache flags in both
cases.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-12 12:12:30 +01:00
Russell King
87d5dd62c0 ARM: sa1111: fix missing clk_disable()
SA1111 forgets to call clk_disable() in the probe error cleanup path.
Add the necessary call.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 11:04:05 +01:00
Russell King
06dfe5cc0c ARM: sa1111: fix pcmcia suspend/resume
SA1111 PCMCIA was broken when PCMCIA switched to using dev_pm_ops for
the PCMCIA socket class.  PCMCIA used to handle suspend/resume via the
socket hosting device, which happened at normal device suspend/resume
time.

However, the referenced commit changed this: much of the resume now
happens much earlier, in the noirq resume handler of dev_pm_ops.

However, on SA1111, the PCMCIA device is not accessible as the SA1111
has not been resumed at _noirq time.  It's slightly worse than that,
because the SA1111 has already been put to sleep at _noirq time, so
suspend doesn't work properly.

Fix this by converting the core SA1111 code to use dev_pm_ops as well,
and performing its own suspend/resume at noirq time.

This fixes these errors in the kernel log:

pcmcia_socket pcmcia_socket0: time out after reset
pcmcia_socket pcmcia_socket1: time out after reset

and the resulting lack of PCMCIA cards after a S2RAM cycle.

Fixes: d7646f7632 ("pcmcia: use dev_pm_ops for class pcmcia_socket_class")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 11:04:05 +01:00
Russell King
7c0091ecea ARM: sa1111: fix pcmcia interrupt mask polarity
The polarity of the high IRQs was being calculated using
SA1111_IRQMASK_HI(), but this assumes a Linux interrupt number, not a
hardware interrupt number.  Hence, the resulting mask was incorrect.
Fix this.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 11:04:04 +01:00
Russell King
cb034407ec ARM: sa1111: fix error code propagation in sa1111_probe()
Ensure that we propagate the platform_get_irq() error code out of the
probe function.  This allows probe deferrals to work correctly should
platform_get_irq() not be able to resolve the interrupt in a DT
environment at probe time.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-09-12 11:04:03 +01:00
Mark Rutland
e506236a7b arm64/kvm: use alternative auto-nop
Make use of the new alternative_if and alternative_else_nop_endif and
get rid of our open-coded NOP sleds, making the code simpler to read.

Note that for __kvm_call_hyp the branch to __vhe_hyp_call has been moved
out of the alternative sequence, and in the default case there will be
four additional NOPs executed.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-12 10:46:07 +01:00
Mark Rutland
6ba3b554f5 arm64: use alternative auto-nop
Make use of the new alternative_if and alternative_else_nop_endif and
get rid of our homebew NOP sleds, making the code simpler to read.

Note that for cpu_do_switch_mm the ret has been moved out of the
alternative sequence, and in the default case there will be three
additional NOPs executed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-12 10:46:07 +01:00
Mark Rutland
792d47379f arm64: alternative: add auto-nop infrastructure
In some cases, one side of an alternative sequence is simply a number of
NOPs used to balance the other side. Keeping track of this manually is
tedious, and the presence of large chains of NOPs makes the code more
painful to read than necessary.

To ameliorate matters, this patch adds a new alternative_else_nop_endif,
which automatically balances an alternative sequence with a trivial NOP
sled.

In many cases, we would like a NOP-sled in the default case, and
instructions patched in in the presence of a feature. To enable the NOPs
to be generated automatically for this case, this patch also adds a new
alternative_if, and updates alternative_else and alternative_endif to
work with either alternative_if or alternative_endif.

Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: use new nops macro to generate nop sequences]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-12 10:45:34 +01:00
Max Filippov
bf15f86b34 xtensa: initialize MMU before jumping to reset vector
When reset is simulated MMU need to be brought into its initial state,
because that's what bootloaders/OS kernels assume. This is especially
important for MMUv3 because TLB state when the kernel is running is
significatly different from its reset state.

With this change it is possible to boot linux and get back to U-Boot
repeatedly.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-11 23:53:23 -07:00
Max Filippov
ea951c34ea xtensa: fix icountlevel setting in cpu_reset
icountlevel SR value specifies lowest intlevel that does not do
instruction counting, so to disable instruction counting completely it
must be set to 0, not to 15.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-11 23:53:22 -07:00
Max Filippov
4f2056873f xtensa: extract common CPU reset code into separate function
platform_restart implementatations do the same thing to reset CPU.
Don't duplicate that code, move it to a function and call it from
platform_restart.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-09-11 23:53:22 -07:00
Michael Ellerman
ffed15d3ce powerpc/kernel: Fix size of NUM_CPU_FTR_KEYS on 32-bit
The number of CPU feature keys is meant to map 1:1 to the number of CPU
feature flags defined in cputable.h, and the latter must fit in an
unsigned long.

In commit 4db7327194 ("powerpc: Add option to use jump label for
cpu_has_feature()"), I incorrectly defined NUM_CPU_FTR_KEYS to 64.

There should be no real adverse consequences of this bug, other than us
allocating too many keys.

Fix it by using BITS_PER_LONG.

Fixes: 4db7327194 ("powerpc: Add option to use jump label for cpu_has_feature()")
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-12 12:48:28 +10:00
Gautham R. Shenoy
bd00a240dc powerpc/powernv: Fix restore of SPRs upon wake up from hypervisor state loss
pnv_wakeup_tb_loss() currently expects cr4 to be "eq" if the CPU is
waking up from a complete hypervisor state loss. Hence, it currently
restores the SPR contents only if cr4 is "eq".

However, after commit bcef83a00d ("powerpc/powernv: Add platform
support for stop instruction"), on ISA v3.0 CPUs, the function
pnv_restore_hyp_resource() sets cr4 to contain the result of the
comparison between the state the CPU has woken up from and the first
deep stop state before calling pnv_wakeup_tb_loss().

Thus if the CPU woke up from a state that is deeper than the first
deep stop state, cr4 will have "gt" set and hence, pnv_wakeup_tb_loss()
will fail to restore the SPRs on waking up from such a state.

Fix the code in pnv_wakeup_tb_loss() to restore the SPR states when cr4
is "eq" or "gt".

Fixes: bcef83a00d ("powerpc/powernv: Add platform support for stop instruction")
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-12 12:45:50 +10:00
Markus Elfring
90235dc19e KVM: PPC: e500: Use kmalloc_array() in kvmppc_e500_tlb_init()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:13:01 +10:00
Markus Elfring
b0ac477bc4 KVM: PPC: e500: Replace kzalloc() calls by kcalloc() in two functions
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kcalloc".

  Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

  This issue was detected also by using the Coccinelle software.

* Replace the specification of data structures by pointer dereferences
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:56 +10:00
Markus Elfring
cfb60813fb KVM: PPC: e500: Delete an unnecessary initialisation in kvm_vcpu_ioctl_config_tlb()
The local variable "g2h_bitmap" will be set to an appropriate value
a bit later. Thus omit the explicit initialisation at the beginning.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:51 +10:00
Markus Elfring
46d4e74792 KVM: PPC: e500: Less function calls in kvm_vcpu_ioctl_config_tlb() after error detection
The kfree() function was called in two cases by the
kvm_vcpu_ioctl_config_tlb() function during error handling
even if the passed data structure element contained a null pointer.

* Split a condition check for memory allocation failures.

* Adjust jump targets according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:46 +10:00
Markus Elfring
f3c0ce86a8 KVM: PPC: e500: Use kmalloc_array() in kvm_vcpu_ioctl_config_tlb()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:40 +10:00
Suresh Warrier
65e7026a6c KVM: PPC: Book3S HV: Counters for passthrough IRQ stats
Add VCPU stat counters to track affinity for passthrough
interrupts.

pthru_all: Counts all passthrough interrupts whose IRQ mappings are
           in the kvmppc_passthru_irq_map structure.
pthru_host: Counts all cached passthrough interrupts that were injected
	    from the host through kvm_set_irq (i.e. not handled in
	    real mode).
pthru_bad_aff: Counts how many cached passthrough interrupts have
               bad affinity (receiving CPU is not running VCPU that is
	       the target of the virtual interrupt in the guest).

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:34 +10:00
Paul Mackerras
5d375199ea KVM: PPC: Book3S HV: Set server for passed-through interrupts
When a guest has a PCI pass-through device with an interrupt, it
will direct the interrupt to a particular guest VCPU.  In fact the
physical interrupt might arrive on any CPU, and then get
delivered to the target VCPU in the emulated XICS (guest interrupt
controller), and eventually delivered to the target VCPU.

Now that we have code to handle device interrupts in real mode
without exiting to the host kernel, there is an advantage to having
the device interrupt arrive on the same sub(core) as the target
VCPU is running on.  In this situation, the interrupt can be
delivered to the target VCPU without any exit to the host kernel
(using a hypervisor doorbell interrupt between threads if
necessary).

This patch aims to get passed-through device interrupts arriving
on the correct core by setting the interrupt server in the real
hardware XICS for the interrupt to the first thread in the (sub)core
where its target VCPU is running.  We do this in the real-mode H_EOI
code because the H_EOI handler already needs to look at the
emulated ICS state for the interrupt (whereas the H_XIRR handler
doesn't), and we know we are running in the target VCPU context
at that point.

We set the server CPU in hardware using an OPAL call, regardless of
what the IRQ affinity mask for the interrupt says, and without
updating the affinity mask.  This amounts to saying that when an
interrupt is passed through to a guest, as a matter of policy we
allow the guest's affinity for the interrupt to override the host's.

This is inspired by an earlier patch from Suresh Warrier, although
none of this code came from that earlier patch.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:28 +10:00
Suresh Warrier
366274f59c KVM: PPC: Book3S HV: Update irq stats for IRQs handled in real mode
When a passthrough IRQ is handled completely within KVM real
mode code, it has to also update the IRQ stats since this
does not go through the generic IRQ handling code.

However, the per CPU kstat_irqs field is an allocated (not static)
field and so cannot be directly accessed in real mode safely.

The function this_cpu_inc_rm() is introduced to safely increment
per CPU fields (currently coded for unsigned integers only) that
are allocated and could thus be vmalloced also.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:23 +10:00
Suresh Warrier
644abbb254 KVM: PPC: Book3S HV: Tunable to disable KVM IRQ bypass
Add a  module parameter kvm_irq_bypass for kvm_hv.ko to
disable IRQ bypass for passthrough interrupts. The default
value of this tunable is 1 - that is enable the feature.

Since the tunable is used by built-in kernel code, we use
the module_param_cb macro to achieve this.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:18 +10:00
Suresh Warrier
af893c7dc9 KVM: PPC: Book3S HV: Dump irqmap in debugfs
Dump the passthrough irqmap structure associated with a
guest as part of /sys/kernel/debug/powerpc/kvm-xics-*.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:13 +10:00
Suresh Warrier
f7af5209b8 KVM: PPC: Book3S HV: Complete passthrough interrupt in host
In existing real mode ICP code, when updating the virtual ICP
state, if there is a required action that cannot be completely
handled in real mode, as for instance, a VCPU needs to be woken
up, flags are set in the ICP to indicate the required action.
This is checked when returning from hypercalls to decide whether
the call needs switch back to the host where the action can be
performed in virtual mode. Note that if h_ipi_redirect is enabled,
real mode code will first try to message a free host CPU to
complete this job instead of returning the host to do it ourselves.

Currently, the real mode PCI passthrough interrupt handling code
checks if any of these flags are set and simply returns to the host.
This is not good enough as the trap value (0x500) is treated as an
external interrupt by the host code. It is only when the trap value
is a hypercall that the host code searches for and acts on unfinished
work by calling kvmppc_xics_rm_complete.

This patch introduces a special trap BOOK3S_INTERRUPT_HV_RM_HARD
which is returned by KVM if there is unfinished business to be
completed in host virtual mode after handling a PCI passthrough
interrupt. The host checks for this special interrupt condition
and calls into the kvmppc_xics_rm_complete, which is made an
exported function for this reason.

[paulus@ozlabs.org - moved logic to set r12 to BOOK3S_INTERRUPT_HV_RM_HARD
 in book3s_hv_rmhandlers.S into the end of kvmppc_check_wake_reason.]

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:12:07 +10:00
Suresh Warrier
e3c13e56a4 KVM: PPC: Book3S HV: Handle passthrough interrupts in guest
Currently, KVM switches back to the host to handle any external
interrupt (when the interrupt is received while running in the
guest). This patch updates real-mode KVM to check if an interrupt
is generated by a passthrough adapter that is owned by this guest.
If so, the real mode KVM will directly inject the corresponding
virtual interrupt to the guest VCPU's ICS and also EOI the interrupt
in hardware. In short, the interrupt is handled entirely in real
mode in the guest context without switching back to the host.

In some rare cases, the interrupt cannot be completely handled in
real mode, for instance, a VCPU that is sleeping needs to be woken
up. In this case, KVM simply switches back to the host with trap
reason set to 0x500. This works, but it is clearly not very efficient.
A following patch will distinguish this case and handle it
correctly in the host. Note that we can use the existing
check_too_hard() routine even though we are not in a hypercall to
determine if there is unfinished business that needs to be
completed in host virtual mode.

The patch assumes that the mapping between hardware interrupt IRQ
and virtual IRQ to be injected to the guest already exists for the
PCI passthrough interrupts that need to be handled in real mode.
If the mapping does not exist, KVM falls back to the default
existing behavior.

The KVM real mode code reads mappings from the mapped array in the
passthrough IRQ map without taking any lock.  We carefully order the
loads and stores of the fields in the kvmppc_irq_map data structure
using memory barriers to avoid an inconsistent mapping being seen by
the reader. Thus, although it is possible to miss a map entry, it is
not possible to read a stale value.

[paulus@ozlabs.org - get irq_chip from irq_map rather than pimap,
 pulled out powernv eoi change into a separate patch, made
 kvmppc_read_intr get the vcpu from the paca rather than being
 passed in, rewrote the logic at the end of kvmppc_read_intr to
 avoid deep indentation, simplified logic in book3s_hv_rmhandlers.S
 since we were always restoring SRR0/1 anyway, get rid of the cached
 array (just use the mapped array), removed the kick_all_cpus_sync()
 call, clear saved_xirr PACA field when we handle the interrupt in
 real mode, fix compilation with CONFIG_KVM_XICS=n.]

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-12 10:11:00 +10:00
Russell King
3521a0f05d Input: jornada720_ts - get rid of mach/irqs.h and mach/hardware.h includes
Switch the jornada720 touchscreen driver to obtain its gpio from
the platform device via gpiolib and derive the interrupt from the
GPIO, rather than via a hard-coded interrupt number obtained from
the mach/irqs.h and mach/hardware.h headers.

Tested-by: Adam Wysocki <armlinux@chmurka.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-09-10 10:47:48 -07:00
Linus Torvalds
98ac9a608d Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "nvdimm fixes for v4.8, two of them are tagged for -stable:

   - Fix devm_memremap_pages() to use track_pfn_insert().  Otherwise,
     DAX pmd mappings end up with an uncached pgprot, and unusable
     performance for the device-dax interface.  The device-dax interface
     appeared in 4.7 so this is tagged for -stable.

   - Fix a couple VM_BUG_ON() checks in the show_smaps() path to
     understand DAX pmd entries.  This fix is tagged for -stable.

   - Fix a mis-merge of the nfit machine-check handler to flip the
     polarity of an if() to match the final version of the patch that
     Vishal sent for 4.8-rc1.  Without this the nfit machine check
     handler never detects / inserts new 'badblocks' entries which
     applications use to identify lost portions of files.

   - For test purposes, fix the nvdimm_clear_poison() path to operate on
     legacy / simulated nvdimm memory ranges.  Without this fix a test
     can set badblocks, but never clear them on these ranges.

   - Fix the range checking done by dax_dev_pmd_fault().  This is not
     tagged for -stable since this problem is mitigated by specifying
     aligned resources at device-dax setup time.

  These patches have appeared in a next release over the past week.  The
  recent rebase you can see in the timestamps was to drop an invalid fix
  as identified by the updated device-dax unit tests [1].  The -mm
  touches have an ack from Andrew"

[1]: "[ndctl PATCH 0/3] device-dax test for recent kernel bugs"
   https://lists.01.org/pipermail/linux-nvdimm/2016-September/006855.html

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm: allow legacy (e820) pmem region to clear bad blocks
  nfit, mce: Fix SPA matching logic in MCE handler
  mm: fix cache mode of dax pmd mappings
  mm: fix show_smap() for zone_device-pmd ranges
  dax: fix mapping size check
2016-09-10 09:58:52 -07:00
Kan Liang
cd34cd97b7 perf/x86/intel/uncore: Add Skylake server uncore support
This patch implements the uncore monitoring driver for Skylake server.
The uncore subsystem in Skylake server is similar to previous
server. There are some differences in config register encoding and pci
device IDs. Besides, Skylake introduces many new boxes to reflect the
MESH architecture changes.

The control registers for IIO and UPI have been extended to 64 bit. This
patch also introduces event_mask_ext to handle the high 32 bit mask.

The CHA box number could vary for different machines. This patch gets
the CHA box number by counting the CHA register space during
initialization at runtime.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1471378190-17276-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:18:52 +02:00
Harry Pan
2668c61956 perf/x86/rapl: Enable Apollo Lake RAPL support
This patch enables RAPL counters (energy consumption counters)
support for Intel Apollo Lake (Goldmont) processors (Model 92):

RAPL of Goldmont, unlikes ESU increment of Silvermont/Airmont,
it likes the Haswell microarchitecture in 1/2^ESU joules and
supports power domains in PP0/PP1/PKG/RAM.

ESU and power domains refer to Intel Software Developers' Manual,
Vol. 3C, Order No. 325384, Table 35-12.

Usage example:

  $ perf list
  $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10

Signed-off-by: Harry Pan <harry.pan@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@alien8.de
Cc: gs0622@gmail.com
Cc: hpa@zytor.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/1473325738-730-1-git-send-email-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:18:52 +02:00
Ingo Molnar
5006921837 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:17:54 +02:00
Peter Zijlstra
8ef9b8455a perf/x86/intel: Fix PEBSv3 record drain
Alexander hit the WARN_ON_ONCE(!event) on his Skylake while running
the perf fuzzer.

This means the PEBSv3 record included a status bit for an inactive
event, something that _should_ not happen.

Move the code that filters the status bits against our known PEBS
events up a spot to guarantee we only deal with events we know about.

Further add "continue" statements to the WARN_ON_ONCE()s such that
we'll not die nor generate silly events in case we ever do hit them
again.

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: stable@vger.kernel.org
Fixes: a3d86542de ("perf/x86/intel/pebs: Add PEBSv3 decoding")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:39 +02:00
Alexander Shishkin
ef9ef3befa perf/x86/intel/bts: Kill a silly warning
At the moment, intel_bts will WARN() out if there is more than one
event writing to the same ring buffer, via SET_OUTPUT, and will only
send data from one event to a buffer.

There is no reason to have this warning in, so kill it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-6-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:38 +02:00
Alexander Shishkin
4d4c474124 perf/x86/intel/bts: Fix BTS PMI detection
Since BTS doesn't have a dedicated PMI status bit, the driver needs to
take extra care to check for the condition that triggers it to avoid
spurious NMI warnings.

Regardless of the local BTS context state, the only way of knowing that
the NMI is ours is to compare the write pointer against the interrupt
threshold.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160906132353.19887-5-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-10 11:15:38 +02:00