linux_dsm_epyc7002/arch/x86/kernel
Suresh Siddha 304bceda6a x86, fpu: use non-lazy fpu restore for processors supporting xsave
Fundamental model of the current Linux kernel is to lazily init and
restore FPU instead of restoring the task state during context switch.
This changes that fundamental lazy model to the non-lazy model for
the processors supporting xsave feature.

Reasons driving this model change are:

i. Newer processors support optimized state save/restore using xsaveopt and
xrstor by tracking the INIT state and MODIFIED state during context-switch.
This is faster than modifying the cr0.TS bit which has serializing semantics.

ii. Newer glibc versions use SSE for some of the optimized copy/clear routines.
With certain workloads (like boot, kernel-compilation etc), application
completes its work with in the first 5 task switches, thus taking upto 5 #DNA
traps with the kernel not getting a chance to apply the above mentioned
pre-load heuristic.

iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit
and thus will not work correctly in the presence of lazy restore. Non-lazy
state restore is needed for enabling such features.

Some data on a two socket SNB system:
 * Saved 20K DNA exceptions during boot on a two socket SNB system.
 * Saved 50K DNA exceptions during kernel-compilation workload.
 * Improved throughput of the AVX based checksumming function inside the
   kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts
   pair.

Also now kernel_fpu_begin/end() relies on the patched
alternative instructions. So move check_fpu() which uses the
kernel_fpu_begin/end() after alternative_instructions().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com
Merge 32-bit boot fix from,
Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-18 15:52:11 -07:00
..
acpi ACPI/x86: revert 'x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler' 2012-07-30 21:10:16 -04:00
apic x86, apic: fix broken legacy interrupts in the logical apic mode 2012-08-14 09:52:20 -07:00
cpu x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
.gitignore
alternative.c x86/alternatives: Fix p6 nops on non-modular kernels 2012-08-22 12:09:49 +02:00
amd_gart_64.c X86 & IA64: adapt for dma_map_ops changes 2012-03-28 16:36:31 +02:00
amd_nb.c Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 16:07:45 -07:00
apb_timer.c Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-07-23 10:34:47 -07:00
aperture_64.c x86/gart: Fix kmemleak warning 2012-06-06 11:58:38 +02:00
apm_32.c x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
asm-offsets_32.c x86: Generate system call tables and unistd_*.h from tables 2011-11-17 13:35:37 -08:00
asm-offsets_64.c x32: If configured, add x32 system calls to system call tables 2012-02-20 12:52:06 -08:00
asm-offsets.c x86, efi: EFI boot stub support 2011-12-12 14:26:10 -08:00
audit_64.c
bootflag.c
check.c x86: kernel/check.c simple_strtoul cleanup 2012-05-15 15:36:41 -07:00
cpuid.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
crash_dump_32.c x86: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:15 +08:00
crash_dump_64.c
crash.c x86, nmi: Wire up NMI handlers to new routines 2011-10-10 06:56:57 +02:00
devicetree.c irq_domain/x86: Convert x86 (embedded) to use common irq_domain 2012-02-23 14:37:47 -07:00
doublefault_32.c
dumpstack_32.c x86: Move call to print_modules() out of show_regs() 2012-06-20 14:33:48 +02:00
dumpstack_64.c x86: Move call to print_modules() out of show_regs() 2012-06-20 14:33:48 +02:00
dumpstack.c x86: Move call to print_modules() out of show_regs() 2012-06-20 14:33:48 +02:00
e820.c firmware_map: make firmware_map_add_early() argument consistent with firmware_map_add_hotplug() 2012-07-30 17:25:17 -07:00
early_printk.c Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()" 2012-07-22 15:47:52 +02:00
early-quirks.c
entry_32.S x86: get rid of calling do_notify_resume() when returning to kernel mode 2012-06-01 13:01:51 -04:00
entry_64.S Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
ftrace.c ftrace: Use breakpoint method to update ftrace caller 2012-05-31 23:12:19 -04:00
head32.c x86, realmode: Move ACPI wakeup to unified realmode code 2012-05-08 11:46:05 -07:00
head64.c x86, realmode: Move ACPI wakeup to unified realmode code 2012-05-08 11:46:05 -07:00
head_32.S Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-29 20:14:53 -07:00
head_64.S Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-29 20:14:53 -07:00
head.c memblock, x86: Replace memblock_x86_reserve/free_range() with generic ones 2011-07-14 11:47:53 -07:00
hpet.c x86: hpet: Fix copy-and-paste mistake in earlier change 2012-05-25 15:32:29 +02:00
hw_breakpoint.c
i386_ksyms_32.c
i387.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
i8237.c
i8253.c x86: Use common i8253 clockevent 2011-07-01 10:37:14 +02:00
i8259.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
io_delay.c
ioport.c
irq_32.c x86: Use common threadinfo allocator 2012-05-08 14:08:44 +02:00
irq_64.c x86: Add stack top margin for stack overflow checking 2011-12-07 09:27:11 +01:00
irq_work.c
irq.c x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask 2012-08-22 10:36:08 +02:00
irqinit.c x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR 2012-06-27 19:29:13 -07:00
jump_label.c jump_label, x86: Fix section mismatch 2011-12-06 20:41:02 +01:00
kdebugfs.c arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case 2012-07-26 15:07:20 +02:00
kgdb.c x86: Fix kernel-doc warnings 2012-06-18 10:53:18 +02:00
kprobes-common.h x86/kprobes: Split out optprobe related code to kprobes-opt.c 2012-03-06 09:49:49 +01:00
kprobes-opt.c x86/kprobes: Split out optprobe related code to kprobes-opt.c 2012-03-06 09:49:49 +01:00
kprobes.c x86: Avoid double stack traces with show_regs() 2012-05-09 11:44:42 +02:00
kvm.c KVM guest: switch to apic_set_eoi_write, apic_write 2012-07-16 12:51:44 +03:00
kvmclock.c x86: kvmclock: remove check_and_clear_guest_paused warning 2012-06-11 23:18:33 -03:00
ldt.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
machine_kexec_32.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
machine_kexec_64.c
Makefile Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-29 20:14:53 -07:00
microcode_amd.c x86, microcode, AMD: Fix broken ucode patch size check 2012-08-22 16:10:41 -07:00
microcode_core.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 12:04:44 -07:00
microcode_intel.c x86/microcode: Ensure that module is only loaded on supported Intel CPUs 2012-05-07 14:37:14 +02:00
mmconf-fam10h_64.c
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-07-24 13:34:56 -07:00
mpparse.c Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-29 20:14:53 -07:00
msr.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
nmi_selftest.c x86/nmi: Clean up register_nmi_handler() usage 2012-06-20 14:23:17 +02:00
nmi.c x86: Save cr2 in NMI in case NMIs take a page fault (for i386) 2012-06-08 18:51:12 -04:00
paravirt_patch_32.c
paravirt_patch_64.c
paravirt-spinlocks.c
paravirt.c x86, pvops: Remove hooks for {rd,wr}msr_safe_regs 2012-06-07 11:41:08 -07:00
pci-calgary_64.c x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
pci-dma.c iommu: Remove group_mf 2012-06-25 13:48:30 +02:00
pci-iommu_table.c
pci-nommu.c X86: integrate CMA with DMA-mapping subsystem 2012-05-21 15:09:38 +02:00
pci-swiotlb.c X86 & IA64: adapt for dma_map_ops changes 2012-03-28 16:36:31 +02:00
pcspeaker.c
probe_roms.c x86/pci/probe_roms: Add missing __iomem annotation to pci_map_biosrom() 2012-09-05 10:52:25 +02:00
process_32.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
process_64.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
process.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
ptrace.c x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels 2012-09-18 15:51:48 -07:00
pvclock.c
quirks.c x86/PCI: move fixup hooks from __init to __devinit 2012-06-12 09:10:54 -06:00
reboot_fixups_32.c
reboot.c Merge branch 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 12:25:47 -07:00
relocate_kernel_32.S kexec, x86: Fix incorrect jump back address if not preserving context 2011-07-21 11:19:28 +02:00
relocate_kernel_64.S kexec, x86: Fix incorrect jump back address if not preserving context 2011-07-21 11:19:28 +02:00
resource.c
rtc.c x86/rtc, mrst: Don't register a platform RTC device for for Intel MID platforms 2011-12-05 17:09:21 +01:00
setup_percpu.c x86: Add read_mostly declaration/definition to variables from smp.h 2012-06-14 12:42:11 +02:00
setup.c Merge branch 'x86/cleanups' into x86/apic 2012-06-15 14:17:01 +02:00
signal.c x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels 2012-09-18 15:51:48 -07:00
smp.c x86/reboot: Update nonmi_ipi parameter 2012-05-14 11:49:38 +02:00
smpboot.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
stacktrace.c x86: Swap save_stack_trace_regs parameters 2011-06-14 22:48:51 -04:00
step.c x86-64: Add user_64bit_mode paravirt op 2011-08-04 16:13:49 -07:00
sys_i386_32.c
sys_x86_64.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
syscall_32.c x86, syscall: Re-fix typo in comment 2011-11-18 16:25:07 -08:00
syscall_64.c x32: If configured, add x32 system calls to system call tables 2012-02-20 12:52:06 -08:00
tboot.c x86, realmode: fixes compilation issue in tboot.c 2012-05-08 15:04:27 -07:00
tce_64.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
test_nx.c
test_rodata.c x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c 2012-04-20 13:51:38 -07:00
time.c MCA: delete all remaining traces of microchannel bus support. 2012-05-17 19:06:13 -04:00
tls.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 14:28:26 -07:00
tls.h
topology.c x86: Fix files explicitly requiring export.h for EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:30:35 -04:00
traps.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00
tsc_sync.c x86/tsc: Reduce the TSC sync check time for core-siblings 2012-02-22 11:49:40 +01:00
tsc.c x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
uprobes.c uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-08 12:22:27 +02:00
verify_cpu.S
vm86_32.c x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
vmlinux.lds.S x86, realmode: Move ACPI wakeup to unified realmode code 2012-05-08 11:46:05 -07:00
vsmp_64.c x86/apic/x2apic: Limit the vector reservation to the user specified mask 2012-07-06 11:00:22 +02:00
vsyscall_64.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 12:04:44 -07:00
vsyscall_emu_64.S x86-64: Rework vsyscall emulation and add vsyscall= parameter 2011-08-10 19:26:46 -05:00
vsyscall_trace.h x86-64: Add vsyscall:emulate_vsyscall trace event 2011-08-04 16:13:53 -07:00
x86_init.c Revert "x86/platform: Add a wallclock_init func to x86_platforms ops" 2012-05-24 23:16:14 +02:00
x8664_ksyms_64.c x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature 2012-06-29 15:33:34 -07:00
xsave.c x86, fpu: use non-lazy fpu restore for processors supporting xsave 2012-09-18 15:52:11 -07:00