Commit Graph

6158 Commits

Author SHA1 Message Date
Andy Lutomirski
7bbcdb1ca4 x86/head: Pass a real pt_regs and trapnr to early_fixup_exception()
early_fixup_exception() is limited by the fact that it doesn't have a
real struct pt_regs.  Change both the 32-bit and 64-bit asm and the
C code to pass and accept a real pt_regs.

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: KVM list <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/e3fb680fcfd5e23e38237e8328b64a25cc121d37.1459605520.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:44 +02:00
Borislav Petkov
782511b00f x86/cpufeature: Replace cpu_has_xsaves with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <kvm@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459801503-15600-11-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:42 +02:00
Borislav Petkov
d366bf7eb9 x86/cpufeature: Replace cpu_has_xsave with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1459801503-15600-10-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:41 +02:00
Borislav Petkov
01f8fd7379 x86/cpufeature: Replace cpu_has_fxsr with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459801503-15600-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:41 +02:00
Borislav Petkov
93984fbd4e x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: iommu@lists.linux-foundation.org
Cc: linux-pm@vger.kernel.org
Cc: oprofile-list@lists.sf.net
Link: http://lkml.kernel.org/r/1459801503-15600-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:41 +02:00
Borislav Petkov
59e21e3d00 x86/cpufeature: Replace cpu_has_tsc with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Link: http://lkml.kernel.org/r/1459801503-15600-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:41 +02:00
Borislav Petkov
a402a8dffc x86/cpufeature: Replace cpu_has_fpu with boot_cpu_has() usage
Use static_cpu_has() in the timing-sensitive paths in fpstate_init() and
fpu__copy().

While at it, simplify the use in init_cyrix() and get rid of the ternary
operator.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459801503-15600-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:40 +02:00
Borislav Petkov
dda9edf7c1 x86/cpufeature: Replace cpu_has_xmm with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459801503-15600-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:40 +02:00
Borislav Petkov
da154e82af x86/cpufeature: Replace cpu_has_avx with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459801503-15600-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:40 +02:00
Borislav Petkov
1f4dd7938e x86/cpufeature: Replace cpu_has_aes with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459801503-15600-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:39 +02:00
Borislav Petkov
abcfdfe07d x86/cpufeature: Replace cpu_has_avx2 with boot_cpu_has() usage
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459801503-15600-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:37:39 +02:00
Ingo Molnar
95a8e746f8 Merge branch 'x86/urgent' into x86/asm to pick up dependent fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:36:44 +02:00
Ingo Molnar
d8d1c35139 Merge branch 'x86/mm' into x86/asm to resolve conflict and to create common base
Conflicts:
	arch/x86/include/asm/cpufeature.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:36:19 +02:00
Ingo Molnar
cb44d0cfc2 Merge branch 'x86/cpu' into x86/asm, to merge more patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 11:15:39 +02:00
Andy Lutomirski
1ed95e52d9 x86/vdso: Remove direct HPET access through the vDSO
Allowing user code to map the HPET is problematic.  HPET
implementations are notoriously buggy, and there are probably many
machines on which even MMIO reads from bogus HPET addresses are
problematic.

We have a report that the Dell Precision M2800 with:

  ACPI: HPET 0x00000000C8FE6238 000038 (v01 DELL   CBX3  01072009 AMI. 00000005)

is either so slow when accessing the HPET or actually hangs in some
regard, causing soft lockups to be reported if users do unexpected
things to the HPET.

The vclock HPET code has also always been a questionable speedup.
Accessing an HPET is exceedingly slow (on the order of several
microseconds), so the added overhead in requiring a syscall to read
the HPET is a small fraction of the total code of accessing it.

To avoid future problems, let's just delete the code entirely.

In the long run, this could actually be a speedup.  Waiman Long as a
patch to optimize the case where multiple CPUs contend for the HPET,
but that won't help unless all the accesses are mediated by the
kernel.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Waiman Long <waiman.long@hpe.com>
Link: http://lkml.kernel.org/r/d2f90bba98db9905041cff294646d290d378f67a.1460074438.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 10:28:34 +02:00
Borislav Petkov
96e5d28ae7 x86/cpu: Add Erratum 88 detection on AMD
Erratum 88 affects old AMD K8s, where a SWAPGS fails to cause an input
dependency on GS. Therefore, we need to MFENCE before it.

But that MFENCE is expensive and unnecessary on the remaining x86 CPUs
out there so patch it out on the CPUs which don't require it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/aec6b2df1bfc56101d4e9e2e5d5d570bf41663c6.1460075211.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 10:20:42 +02:00
Andy Lutomirski
7a5d670487 x86/cpu: Probe the behavior of nulling out a segment at boot time
AMD and Intel do different things when writing zero to a segment
selector.  Since neither vendor documents the behavior well and it's
easy to test the behavior, try nulling fs to see what happens.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/61588ba0e0df35beafd363dc8b68a4c5878ef095.1460075211.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-13 10:20:41 +02:00
Borislav Petkov
16bf92261b x86/cpufeature: Remove cpu_has_pse
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459266123-21878-11-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:10 +02:00
Borislav Petkov
c109bf9599 x86/cpufeature: Remove cpu_has_pge
Use static_cpu_has() in __flush_tlb_all() due to the time-sensitivity of
this one.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459266123-21878-10-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:09 +02:00
Borislav Petkov
054efb6467 x86/cpufeature: Remove cpu_has_xmm2
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459266123-21878-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:09 +02:00
Borislav Petkov
906bf7fda2 x86/cpufeature: Remove cpu_has_clflush
Use the fast variant in the DRM code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dri-devel@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Link: http://lkml.kernel.org/r/1459266123-21878-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:09 +02:00
Borislav Petkov
b8291adc19 x86/cpufeature: Remove cpu_has_gbpages
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459266123-21878-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:08 +02:00
Borislav Petkov
62436a4d36 x86/cpufeature: Remove cpu_has_x2apic
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459266123-21878-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:08 +02:00
Borislav Petkov
ab4a56fa2c x86/cpufeature: Remove cpu_has_osxsave
Use boot_cpu_has() instead.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Link: http://lkml.kernel.org/r/1459266123-21878-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:08 +02:00
Borislav Petkov
0c9f3536cc x86/cpufeature: Remove cpu_has_hypervisor
Use boot_cpu_has() instead.

Tested-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: sparmaintainer@unisys.com
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1459266123-21878-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:35:07 +02:00
Borislav Petkov
7b5e74e637 x86/cpufeature: Remove cpu_has_arch_perfmon
Use boot_cpu_has() instead.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: oprofile-list@lists.sf.net
Link: http://lkml.kernel.org/r/1459266123-21878-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:33:17 +02:00
Borislav Petkov
568a58e5df x86/mm/pat, x86/cpufeature: Remove cpu_has_pat
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Link: http://lkml.kernel.org/r/1459266123-21878-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 13:32:43 +02:00
Toshi Kani
88ba281108 x86/xen, pat: Remove PAT table init code from Xen
Xen supports PAT without MTRRs for its guests.  In order to
enable WC attribute, it was necessary for xen_start_kernel()
to call pat_init_cache_modes() to update PAT table before
starting guest kernel.

Now that the kernel initializes PAT table to the BIOS handoff
state when MTRR is disabled, this Xen-specific PAT init code
is no longer necessary.  Delete it from xen_start_kernel().

Also change __init_cache_modes() to a static function since
PAT table should not be tweaked by other modules.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: elliott@hpe.com
Cc: paul.gortmaker@windriver.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1458769323-24491-7-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 12:23:27 +02:00
Toshi Kani
edfe63ec97 x86/mtrr: Fix Xorg crashes in Qemu sessions
A Xorg failure on qemu32 was reported as a regression [1] caused by
commit 9cd25aac1f ("x86/mm/pat: Emulate PAT when it is disabled").

This patch fixes the Xorg crash.

Negative effects of this regression were the following two failures [2]
in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were
triggered by the fact that its virtual CPU does not support MTRRs.

 #1. copy_process() failed in the check in reserve_pfn_range()

    copy_process
     copy_mm
      dup_mm
       dup_mmap
        copy_page_range
         track_pfn_copy
          reserve_pfn_range

 A WC map request was tracked as WC in memtype, which set a PTE as
 UC (pgprot) per __cachemode2pte_tbl[].  This led to this error in
 reserve_pfn_range() called from track_pfn_copy(), which obtained
 a pgprot from a PTE.  It converts pgprot to page_cache_mode, which
 does not necessarily result in the original page_cache_mode since
 __cachemode2pte_tbl[] redirects multiple types to UC.

 #2. error path in copy_process() then hit WARN_ON_ONCE in
     untrack_pfn().

     x86/PAT: Xorg:509 map pfn expected mapping type uncached-
     minus for [mem 0xfd000000-0xfdffffff], got write-combining
      Call Trace:
     dump_stack
     warn_slowpath_common
     ? untrack_pfn
     ? untrack_pfn
     warn_slowpath_null
     untrack_pfn
     ? __kunmap_atomic
     unmap_single_vma
     ? pagevec_move_tail_fn
     unmap_vmas
     exit_mmap
     mmput
     copy_process.part.47
     _do_fork
     SyS_clone
     do_syscall_32_irqs_on
     entry_INT80_32

These negative effects are caused by two separate bugs, but they
can be addressed in separate patches.  Fixing the pat_init() issue
described below addresses the root cause, and avoids Xorg to hit
these cases.

When the CPU does not support MTRRs, MTRR does not call pat_init(),
which leaves PAT enabled without initializing PAT.  This pat_init()
issue is a long-standing issue, but manifested as issue #1 (and then
hit issue #2) with the above-mentioned commit because the memtype
now tracks cache attribute with 'page_cache_mode'.

This pat_init() issue existed before the commit, but we used pgprot
in memtype.  Hence, we did not have issue #1 before.  But WC request
resulted in WT in effect because WC pgrot is actually WT when PAT
is not initialized.  This is not how it was designed to work.  When
PAT is set to disable properly, WC is converted to UC.  The use of
WT can result in a system crash if the target range does not support
WT.  Fortunately, nobody ran into such issue before.

To fix this pat_init() issue, PAT code has been enhanced to provide
pat_disable() interface.  Call this interface when MTRRs are disabled.
By setting PAT to disable properly, PAT bypasses the memtype check,
and avoids issue #1.

  [1]: https://lkml.org/lkml/2016/3/3/828
  [2]: https://lkml.org/lkml/2016/3/4/775

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: elliott@hpe.com
Cc: konrad.wilk@oracle.com
Cc: paul.gortmaker@windriver.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1458769323-24491-5-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 12:23:26 +02:00
Toshi Kani
224bb1e5d6 x86/mm/pat: Add pat_disable() interface
In preparation for fixing a regression caused by:

  9cd25aac1f ("x86/mm/pat: Emulate PAT when it is disabled")

... PAT needs to provide an interface that prevents the OS from
initializing the PAT MSR.

PAT MSR initialization must be done on all CPUs using the specific
sequence of operations defined in the Intel SDM.  This requires MTRRs
to be enabled since pat_init() is called as part of MTRR init
from mtrr_rendezvous_handler().

Make pat_disable() as the interface that prevents the OS from
initializing the PAT MSR.  MTRR will call this interface when it
cannot provide the SDM-defined sequence to initialize PAT.

This also assures that pat_disable() called from pat_bsp_init()
will set the PAT table properly when CPU does not support PAT.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: konrad.wilk@oracle.com
Cc: paul.gortmaker@windriver.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1458769323-24491-3-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 12:23:25 +02:00
Toshi Kani
02f037d641 x86/mm/pat: Add support of non-default PAT MSR setting
In preparation for fixing a regression caused by:

  9cd25aac1f ("x86/mm/pat: Emulate PAT when it is disabled")'

... PAT needs to support a case that PAT MSR is initialized with a
non-default value.

When pat_init() is called and PAT is disabled, it initializes the
PAT table with the BIOS default value. Xen, however, sets PAT MSR
with a non-default value to enable WC. This causes inconsistency
between the PAT table and PAT MSR when PAT is set to disable on Xen.

Change pat_init() to handle the PAT disable cases properly.  Add
init_cache_modes() to handle two cases when PAT is set to disable.

 1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
 2. CPU does not support PAT: Set PAT table to be consistent with
    PWT and PCD bits in a PTE.

Note, __init_cache_modes(), renamed from pat_init_cache_modes(),
will be changed to a static function in a later patch.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: elliott@hpe.com
Cc: konrad.wilk@oracle.com
Cc: paul.gortmaker@windriver.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1458769323-24491-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-29 12:23:25 +02:00
Borislav Petkov
5f870a3f71 x86/thread_info: Merge two !__ASSEMBLY__ sections
We have

  #ifndef __ASSEMBLY__
  ...
  #endif

  #ifndef __ASSEMBLY__
  ...
  #endif

Merge the two.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1459189217-25532-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 11:12:10 +02:00
Vladimir Zapolskiy
4a6772f514 x86/cpufreq: Remove duplicated TDP MSR macro definitions
The list of CPU model specific registers contains two copies of TDP
registers, remove the one, which is out of numerical order in the
list.

Fixes: 6a35fc2d6c ("cpufreq: intel_pstate: get P1 from TAR when available")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Kristen Carlson
 Accardi <kristen@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: http://lkml.kernel.org/r/1459018020-24577-1-git-send-email-vladimir_zapolskiy@mentor.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 11:12:10 +02:00
Borislav Petkov
8196dab4fc x86/cpu: Get rid of compute_unit_id
It is cpu_core_id anyway.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1458917557-8757-3-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Peter Zijlstra
ee6825c80e x86/topology: Fix AMD core count
It turns out AMD gets x86_max_cores wrong when there are compute
units.

The issue is that Linux assumes:

	nr_logical_cpus = nr_cores * nr_siblings

But AMD reports its CU unit as 2 cores, but then sets num_smp_siblings
to 2 as well.

Boris: fixup ras/mce_amd_inj.c too, to compute the Node Base Core
properly, according to the new nomenclature.

Fixes: 1f12e32f4c ("x86/topology: Create logical package id")
Reported-by: Xiong Zhou <jencce.kernel@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Herrmann <aherrmann@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20160317095220.GO6344@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-29 10:45:04 +02:00
Linus Torvalds
3fa2fe2ce0 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains various perf fixes on the kernel side, plus three
  hw/event-enablement late additions:

   - Intel Memory Bandwidth Monitoring events and handling
   - the AMD Accumulated Power Mechanism reporting facility
   - more IOMMU events

  ... and a final round of perf tooling updates/fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  perf llvm: Use strerror_r instead of the thread unsafe strerror one
  perf llvm: Use realpath to canonicalize paths
  perf tools: Unexport some methods unused outside strbuf.c
  perf probe: No need to use formatting strbuf method
  perf help: Use asprintf instead of adhoc equivalents
  perf tools: Remove unused perf_pathdup, xstrdup functions
  perf tools: Do not include stringify.h from the kernel sources
  tools include: Copy linux/stringify.h from the kernel
  tools lib traceevent: Remove redundant CPU output
  perf tools: Remove needless 'extern' from function prototypes
  perf tools: Simplify die() mechanism
  perf tools: Remove unused DIE_IF macro
  perf script: Remove lots of unused arguments
  perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
  perf machine: Rename perf_event__preprocess_sample to machine__resolve
  perf tools: Add cpumode to struct perf_sample
  perf tests: Forward the perf_sample in the dwarf unwind test
  perf tools: Remove misplaced __maybe_unused
  perf list: Fix documentation of :ppp
  perf bench numa: Fix assertion for nodes bitfield
  ...
2016-03-24 10:02:14 -07:00
Linus Torvalds
d88f48e128 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - fix hotplug bugs
   - fix irq live lock
   - fix various topology handling bugs
   - fix APIC ACK ordering
   - fix PV iopl handling
   - fix speling
   - fix/tweak memcpy_mcsafe() return value
   - fix fbcon bug
   - remove stray prototypes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/msr: Remove unused native_read_tscp()
  x86/apic: Remove declaration of unused hw_nmi_is_cpu_stuck
  x86/oprofile/nmi: Add missing hotplug FROZEN handling
  x86/hpet: Use proper mask to modify hotplug action
  x86/apic/uv: Fix the hotplug notifier
  x86/apb/timer: Use proper mask to modify hotplug action
  x86/topology: Use total_cpus not nr_cpu_ids for logical packages
  x86/topology: Fix Intel HT disable
  x86/topology: Fix logical package mapping
  x86/irq: Cure live lock in fixup_irqs()
  x86/tsc: Prevent NULL pointer deref in calibrate_delay_is_known()
  x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt()
  x86/iopl: Fix iopl capability check on Xen PV
  x86/iopl/64: Properly context-switch IOPL on Xen PV
  selftests/x86: Add an iopl test
  x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe()
  x86/video: Don't assume all FB devices are PCI devices
  arch/x86/irq: Purge useless handler declarations from hw_irq.h
  x86: Fix misspellings in comments
2016-03-24 09:47:32 -07:00
Prarit Bhargava
9da77666d6 x86/msr: Remove unused native_read_tscp()
After e76b027 ("x86,vdso: Use LSL unconditionally for vgetcpu")
native_read_tscp() is unused in the kernel. The function can be removed like
native_read_tsc() was.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1458687968-9106-1-git-send-email-prarit@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-23 12:34:17 +01:00
Linus Torvalds
a24e3d414e Merge branch 'akpm' (patches from Andrew)
Merge third patch-bomb from Andrew Morton:

 - more ocfs2 changes

 - a few hotfixes

 - Andy's compat cleanups

 - misc fixes to fatfs, ptrace, coredump, cpumask, creds, eventfd,
   panic, ipmi, kgdb, profile, kfifo, ubsan, etc.

 - many rapidio updates: fixes, new drivers.

 - kcov: kernel code coverage feature.  Like gcov, but not
   "prohibitively expensive".

 - extable code consolidation for various archs

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits)
  ia64/extable: use generic search and sort routines
  x86/extable: use generic search and sort routines
  s390/extable: use generic search and sort routines
  alpha/extable: use generic search and sort routines
  kernel/...: convert pr_warning to pr_warn
  drivers: dma-coherent: use memset_io for DMA_MEMORY_IO mappings
  drivers: dma-coherent: use MEMREMAP_WC for DMA_MEMORY_MAP
  memremap: add MEMREMAP_WC flag
  memremap: don't modify flags
  kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE
  mm/mprotect.c: don't imply PROT_EXEC on non-exec fs
  ipc/sem: make semctl setting sempid consistent
  ubsan: fix tree-wide -Wmaybe-uninitialized false positives
  kfifo: fix sparse complaints
  scripts/gdb: account for changes in module data structure
  scripts/gdb: add cmdline reader command
  scripts/gdb: add version command
  kernel: add kcov code coverage
  profile: hide unused functions when !CONFIG_PROC_FS
  hpwdt: use nmi_panic() when kernel panics in NMI handler
  ...
2016-03-22 17:09:14 -07:00
Ard Biesheuvel
29934b0fb8 x86/extable: use generic search and sort routines
Replace the arch specific versions of search_extable() and
sort_extable() with calls to the generic ones, which now support
relative exception tables as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski
f970165bee x86/compat: remove is_compat_task()
x86's is_compat_task always checked the current syscall type, not the
task type.  It has no non-arch users any more, so just remove it to
avoid confusion.

On x86, nothing should really be checking the task ABI.  There are
legitimate users for the syscall ABI and for the mm ABI.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Huaitong Han
b9baba8614 KVM, pkeys: expose CPUID/CR4 to guest
X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:
CPUID.7.0.ECX[3]:PKU. X86_FEATURE_OSPKE is software support for pkeys,
enumerated with CPUID.7.0.ECX[4]:OSPKE, and it reflects the setting of
CR4.PKE(bit 22).

This patch disables CPUID:PKU without ept, because pkeys is not yet
implemented for shadow paging.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 16:38:17 +01:00
Huaitong Han
be94f6b710 KVM, pkeys: add pkeys support for permission_fault
Protection keys define a new 4-bit protection key field (PKEY) in bits
62:59 of leaf entries of the page tables, the PKEY is an index to PKRU
register(16 domains), every domain has 2 bits(write disable bit, access
disable bit).

Static logic has been produced in update_pkru_bitmask, dynamic logic need
read pkey from page table entries, get pkru value, and deduce the correct
result.

[ Huaitong: Xiao helps to modify many sections. ]

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 16:23:37 +01:00
Huaitong Han
2d344105f5 KVM, pkeys: introduce pkru_mask to cache conditions
PKEYS defines a new status bit in the PFEC. PFEC.PK (bit 5), if some
conditions is true, the fault is considered as a PKU violation.
pkru_mask indicates if we need to check PKRU.ADi and PKRU.WDi, and
does cache some conditions for permission_fault.

[ Huaitong: Xiao helps to modify many sections. ]

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 16:21:06 +01:00
Xiao Guangrong
9e90199c25 x86: pkey: introduce write_pkru() for KVM
KVM will use it to switch pkru between guest and host.

CC: Ingo Molnar <mingo@redhat.com>
CC: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 16:21:05 +01:00
Huang Rui
01fe03ff1c x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag
AMD CPU family 15h model 0x60 introduces a mechanism for measuring
accumulated power. It is used to report the processor power consumption
and support for it is indicated by CPUID Fn8000_0007_EDX[12].

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wan Zongshun <Vincent.Wan@amd.com>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
[ Resolved conflict and moved the synthetic CPUID slot to 19. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21 09:35:29 +01:00
Vikas Shivappa
33c3cc7acf perf/x86/mbm: Add Intel Memory B/W Monitoring enumeration and init
The MBM init patch enumerates the Intel MBM (Memory b/w monitoring)
and initializes the perf events and datastructures for monitoring the
memory b/w.

Its based on original patch series by Tony Luck and Kanaka Juvva.

Memory bandwidth monitoring (MBM) provides OS/VMM a way to monitor
bandwidth from one level of cache to another. The current patches
support L3 external bandwidth monitoring. It supports both 'local
bandwidth' and 'total bandwidth' monitoring for the socket. Local
bandwidth measures the amount of data sent through the memory controller
on the socket and total b/w measures the total system bandwidth.

Extending the cache quality of service monitoring (CQM) we add two
more events to the perf infrastructure:

  intel_cqm_llc/local_bytes - bytes sent through local socket memory controller
  intel_cqm_llc/total_bytes - total L3 external bytes sent

The tasks are associated with a Resouce Monitoring ID (RMID) just like
in CQM and OS uses a MSR write to indicate the RMID of the task during
scheduling.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fenghua.yu@intel.com
Cc: h.peter.anvin@intel.com
Cc: ravi.v.shankar@intel.com
Cc: vikas.shivappa@intel.com
Link: http://lkml.kernel.org/r/1457652732-4499-4-git-send-email-vikas.shivappa@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-21 09:08:19 +01:00
Linus Torvalds
643ad15d47 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 protection key support from Ingo Molnar:
 "This tree adds support for a new memory protection hardware feature
  that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

  There's a background article at LWN.net:

      https://lwn.net/Articles/643797/

  The gist is that protection keys allow the encoding of
  user-controllable permission masks in the pte.  So instead of having a
  fixed protection mask in the pte (which needs a system call to change
  and works on a per page basis), the user can map a (handful of)
  protection mask variants and can change the masks runtime relatively
  cheaply, without having to change every single page in the affected
  virtual memory range.

  This allows the dynamic switching of the protection bits of large
  amounts of virtual memory, via user-space instructions.  It also
  allows more precise control of MMU permission bits: for example the
  executable bit is separate from the read bit (see more about that
  below).

  This tree adds the MM infrastructure and low level x86 glue needed for
  that, plus it adds a high level API to make use of protection keys -
  if a user-space application calls:

        mmap(..., PROT_EXEC);

  or

        mprotect(ptr, sz, PROT_EXEC);

  (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
  this special case, and will set a special protection key on this
  memory range.  It also sets the appropriate bits in the Protection
  Keys User Rights (PKRU) register so that the memory becomes unreadable
  and unwritable.

  So using protection keys the kernel is able to implement 'true'
  PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
  PROT_READ as well.  Unreadable executable mappings have security
  advantages: they cannot be read via information leaks to figure out
  ASLR details, nor can they be scanned for ROP gadgets - and they
  cannot be used by exploits for data purposes either.

  We know about no user-space code that relies on pure PROT_EXEC
  mappings today, but binary loaders could start making use of this new
  feature to map binaries and libraries in a more secure fashion.

  There is other pending pkeys work that offers more high level system
  call APIs to manage protection keys - but those are not part of this
  pull request.

  Right now there's a Kconfig that controls this feature
  (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
  (like most x86 CPU feature enablement code that has no runtime
  overhead), but it's not user-configurable at the moment.  If there's
  any serious problem with this then we can make it configurable and/or
  flip the default"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
  mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
  x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
  mm/core, x86/mm/pkeys: Add execute-only protection keys support
  x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
  x86/mm/pkeys: Allow kernel to modify user pkey rights register
  x86/fpu: Allow setting of XSAVE state
  x86/mm: Factor out LDT init from context init
  mm/core, x86/mm/pkeys: Add arch_validate_pkey()
  mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
  x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
  x86/mm/pkeys: Add Kconfig prompt to existing config option
  x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
  x86/mm/pkeys: Dump PKRU with other kernel registers
  mm/core, x86/mm/pkeys: Differentiate instruction fetches
  x86/mm/pkeys: Optimize fault handling in access_error()
  mm/core: Do not enforce PKEY permissions on remote mm access
  um, pkeys: Add UML arch_*_access_permitted() methods
  mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
  x86/mm/gup: Simplify get_user_pages() PTE bit handling
  ...
2016-03-20 19:08:56 -07:00
Linus Torvalds
24b5e20f11 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main changes are:

   - Use separate EFI page tables when executing EFI firmware code.
     This isolates the EFI context from the rest of the kernel, which
     has security and general robustness advantages.  (Matt Fleming)

   - Run regular UEFI firmware with interrupts enabled.  This is already
     the status quo under other OSs.  (Ard Biesheuvel)

   - Various x86 EFI enhancements, such as the use of non-executable
     attributes for EFI memory mappings.  (Sai Praneeth Prakhya)

   - Various arm64 UEFI enhancements.  (Ard Biesheuvel)

   - ... various fixes and cleanups.

  The separate EFI page tables feature got delayed twice already,
  because it's an intrusive change and we didn't feel confident about
  it - third time's the charm we hope!"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  x86/mm/pat: Fix boot crash when 1GB pages are not supported by the CPU
  x86/efi: Only map kernel text for EFI mixed mode
  x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tables
  x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()
  efi/arm*: Perform hardware compatibility check
  efi/arm64: Check for h/w support before booting a >4 KB granular kernel
  efi/arm: Check for LPAE support before booting a LPAE kernel
  efi/arm-init: Use read-only early mappings
  efi/efistub: Prevent __init annotations from being used
  arm64/vmlinux.lds.S: Handle .init.rodata.xxx and .init.bss sections
  efi/arm64: Drop __init annotation from handle_kernel_image()
  x86/mm/pat: Use _PAGE_GLOBAL bit for EFI page table mappings
  efi/runtime-wrappers: Run UEFI Runtime Services with interrupts enabled
  efi: Reformat GUID tables to follow the format in UEFI spec
  efi: Add Persistent Memory type name
  efi: Add NV memory attribute
  x86/efi: Show actual ending addresses in efi_print_memmap
  x86/efi/bgrt: Don't ignore the BGRT if the 'valid' bit is 0
  efivars: Use to_efivar_entry
  efi: Runtime-wrapper: Get rid of the rtc_lock spinlock
  ...
2016-03-20 18:58:18 -07:00
Linus Torvalds
26660a4046 Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull 'objtool' stack frame validation from Ingo Molnar:
 "This tree adds a new kernel build-time object file validation feature
  (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
  It was written by and is maintained by Josh Poimboeuf.

  The motivation: there's a category of hard to find kernel bugs, most
  of them in assembly code (but also occasionally in C code), that
  degrades the quality of kernel stack dumps/backtraces.  These bugs are
  hard to detect at the source code level.  Such bugs result in
  incorrect/incomplete backtraces most of time - but can also in some
  rare cases result in crashes or other undefined behavior.

  The build time correctness checking is done via the new 'objtool'
  user-space utility that was written for this purpose and which is
  hosted in the kernel repository in tools/objtool/.  The tool's (very
  simple) UI and source code design is shaped after Git and perf and
  shares quite a bit of infrastructure with tools/perf (which tooling
  infrastructure sharing effort got merged via perf and is already
  upstream).  Objtool follows the well-known kernel coding style.

  Objtool does not try to check .c or .S files, it instead analyzes the
  resulting .o generated machine code from first principles: it decodes
  the instruction stream and interprets it.  (Right now objtool supports
  the x86-64 architecture.)

  From tools/objtool/Documentation/stack-validation.txt:

   "The kernel CONFIG_STACK_VALIDATION option enables a host tool named
    objtool which runs at compile time.  It has a "check" subcommand
    which analyzes every .o file and ensures the validity of its stack
    metadata.  It enforces a set of rules on asm code and C inline
    assembly code so that stack traces can be reliable.

    Currently it only checks frame pointer usage, but there are plans to
    add CFI validation for C files and CFI generation for asm files.

    For each function, it recursively follows all possible code paths
    and validates the correct frame pointer state at each instruction.

    It also follows code paths involving special sections, like
    .altinstructions, __jump_table, and __ex_table, which can add
    alternative execution paths to a given instruction (or set of
    instructions).  Similarly, it knows how to follow switch statements,
    for which gcc sometimes uses jump tables."

  When this new kernel option is enabled (it's disabled by default), the
  tool, if it finds any suspicious assembly code pattern, outputs
  warnings in compiler warning format:

    warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
    warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
    warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
    warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer

  ... so that scripts that pick up compiler warnings will notice them.
  All known warnings triggered by the tool are fixed by the tree, most
  of the commits in fact prepare the kernel to be warning-free.  Most of
  them are bugfixes or cleanups that stand on their own, but there are
  also some annotations of 'special' stack frames for justified cases
  such entries to JIT-ed code (BPF) or really special boot time code.

  There are two other long-term motivations behind this tool as well:

   - To improve the quality and reliability of kernel stack frames, so
     that they can be used for optimized live patching.

   - To create independent infrastructure to check the correctness of
     CFI stack frames at build time.  CFI debuginfo is notoriously
     unreliable and we cannot use it in the kernel as-is without extra
     checking done both on the kernel side and on the build side.

  The quality of kernel stack frames matters to debuggability as well,
  so IMO we can merge this without having to consider the live patching
  or CFI debuginfo angle"

* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  objtool: Only print one warning per function
  objtool: Add several performance improvements
  tools: Copy hashtable.h into tools directory
  objtool: Fix false positive warnings for functions with multiple switch statements
  objtool: Rename some variables and functions
  objtool: Remove superflous INIT_LIST_HEAD
  objtool: Add helper macros for traversing instructions
  objtool: Fix false positive warnings related to sibling calls
  objtool: Compile with debugging symbols
  objtool: Detect infinite recursion
  objtool: Prevent infinite recursion in noreturn detection
  objtool: Detect and warn if libelf is missing and don't break the build
  tools: Support relative directory path for 'O='
  objtool: Support CROSS_COMPILE
  x86/asm/decoder: Use explicitly signed chars
  objtool: Enable stack metadata validation on 64-bit x86
  objtool: Add CONFIG_STACK_VALIDATION option
  objtool: Add tool to perform compile-time stack metadata validation
  x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
  sched: Always inline context_switch()
  ...
2016-03-20 18:23:21 -07:00