linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 01:26:29 +07:00

Author	SHA1	Message	Date
Konrad Rzeszutek Wilk	e5fd47bfab	xen/pm_idle: Make pm_idle be default_idle under Xen. The idea behind commit `d91ee5863b` ("cpuidle: replace xen access to x86 pm_idle and default_idle") was to have one call - disable_cpuidle() which would make pm_idle not be molested by other code. It disallows cpuidle_idle_call to be set to pm_idle (which is excellent). But in the select_idle_routine() and idle_setup(), the pm_idle can still be set to either: amd_e400_idle, mwait_idle or default_idle. This depends on some CPU flags (MWAIT) and in AMD case on the type of CPU. In case of mwait_idle we can hit some instances where the hypervisor (Amazon EC2 specifically) sets the MWAIT and we get: Brought up 2 CPUs invalid opcode: 0000 [#1] SMP Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1 RIP: e030:[<ffffffff81015d1d>] [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 ... Call Trace: [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8 [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10 RIP [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4 RSP <ffff8801d28ddf10> In the case of amd_e400_idle we don't get so spectacular crashes, but we do end up making an MSR which is trapped in the hypervisor, and then follow it up with a yield hypercall. Meaning we end up going to hypervisor twice instead of just once. The previous behavior before v3.0 was that pm_idle was set to default_idle regardless of select_idle_routine/idle_setup. We want to do that, but only for one specific case: Xen. This patch does that. Fixes RH BZ #739499 and Ubuntu #881076 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-03 10:49:58 -08:00
Avi Kivity	15d6aba24d	x86: Demacro CONFIG_PARAVIRT cpu accessors Recently, we had a build failure on !CONFIG_PARAVIRT due to a callback ->wbinvd() clashing with a macro wbinvd(). While we worked around the issue, avoid it in the future by changing the macro (and a few surrounding ones) to an inline function. Signed-off-by: Avi Kivity <avi@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/1303632711-21662-1-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-04-24 13:20:36 +02:00
Borislav Petkov	2b15cd96e5	x86, system.h: Drop unused __SAVE/__RESTORE macros Those are unused since at least the beginning of git history. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1298044056-31104-1-git-send-email-bp@amd64.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-20 13:38:47 +01:00
Linus Torvalds	6ba74014c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits) phy/marvell: add 88ec048 support igb: Program MDICNFG register prior to PHY init e1000e: correct MAC-PHY interconnect register offset for 82579 hso: Add new product ID can: Add driver for esd CAN-USB/2 device l2tp: fix export of header file for userspace can-raw: Fix skb_orphan_try handling Revert "net: remove zap_completion_queue" net: cleanup inclusion phy/marvell: add 88e1121 interface mode support u32: negative offset fix net: Fix a typo from "dev" to "ndev" igb: Use irq_synchronize per vector when using MSI-X ixgbevf: fix null pointer dereference due to filter being set for VLAN 0 e1000e: Fix irq_synchronize in MSI-X case e1000e: register pm_qos request on hardware activation ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice net: Add getsockopt support for TCP thin-streams cxgb4: update driver version cxgb4: add new PCI IDs ... Manually fix up conflicts in: - drivers/net/e1000e/netdev.c: due to pm_qos registration infrastructure changes - drivers/net/phy/marvell.c: conflict between adding 88ec048 support and cleaning up the IDs - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req conflict (registration change vs marking it static)	2010-08-04 11:47:58 -07:00
Alexander Duyck	7475271004	x86: Drop CONFIG_MCORE2 check around setting of NET_IP_ALIGN This patch removes the CONFIG_MCORE2 check from around NET_IP_ALIGN. It is based on a suggestion from Andi Kleen. The assumption is that there are not any x86 cores where unaligned access is really slow, and this change would allow for a performance improvement to still exist on configurations that are not necessarily optimized for Core 2. Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-01 22:45:54 -07:00
Alexander Duyck	ea812ca1b0	x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch x86 architectures can handle unaligned accesses in hardware, and it has been shown that unaligned DMA accesses can be expensive on Nehalem architectures. As such we should overwrite NET_IP_ALIGN to resolve this issue. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-30 14:34:09 -07:00
Andi Kleen	124482935f	x86: Fix vsyscall on gcc 4.5 with -Os This fixes the -Os breaks with gcc 4.5 bug. rdtsc_barrier needs to be force inlined, otherwise user space will jump into kernel space and kill init. This also addresses http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44129 I believe. Signed-off-by: Andi Kleen <ak@linux.intel.com> LKML-Reference: <20100618210859.GA10913@basil.fritz.box> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>	2010-06-18 14:16:31 -07:00
Linus Torvalds	0a135ba14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.	2010-03-03 07:34:18 -08:00
Serge E. Hallyn	cf9db6c41f	x86-32: Make AT_VECTOR_SIZE_ARCH=2 Both x86-32 and x86-64 with 32-bit compat use ARCH_DLINFO_IA32, which defines two saved_auxv entries. But system.h only defines AT_VECTOR_SIZE_ARCH as 2 for CONFIG_IA32_EMULATION, not for CONFIG_X86_32. Fix that. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> LKML-Reference: <20100209023502.GA15408@us.ibm.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-02-09 16:05:08 -08:00
Tejun Heo	32032df6c2	Merge branch 'master' into percpu Conflicts: arch/powerpc/platforms/pseries/hvCall.S include/linux/percpu.h	2010-01-05 09:17:33 +09:00
Andy Isaacson	814e2c84a7	x86: Factor duplicated code out of __show_regs() into show_regs_common() Unify x86_32 and x86_64 implementations of __show_regs() header, standardizing on the x86_64 format string in the process. Also, 32-bit will now call print_modules. Signed-off-by: Andy Isaacson <adi@hexapodia.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Robert Hancock <hancockrwd@gmail.com> Cc: Richard Zidlicky <rz@linux-m68k.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20091208082942.GA27174@hexapodia.org> [ v2: resolved conflict ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-09 10:17:58 +01:00
Ingo Molnar	64b028b226	x86: Clean up the loadsegment() macro Make it readable in the source too, not just in the assembly output. No change in functionality. Cc: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:38:52 +01:00
Brian Gerst	79b0379cee	x86: Optimize loadsegment() Zero the input register in the exception handler instead of using an extra register to pass in a zero value. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:33:58 +01:00
Masami Hiramatsu	c12a229bc5	x86: Remove unused thread_return label from switch_to() Remove unused thread_return label from switch_to() macro on x86-64. Since this symbol cuts into schedule(), backtrace at the latter half of schedule() was always shown as thread_return(). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20091105160359.5181.26225.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 11:57:13 +01:00
Rusty Russell	dd17c8f729	percpu: remove per_cpu__ prefix. Now that the return from alloc_percpu is compatible with the address of per-cpu vars, it makes sense to hand around the address of per-cpu variables. To make this sane, we remove the per_cpu__ prefix we used created to stop people accidentally using these vars directly. Now we have sparse, we can use that (next patch). tj: * Updated to convert stuff which were missed by or added after the original patch. * Kill per_cpu_var() macro. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org>	2009-10-29 22:34:15 +09:00
Jeremy Fitzhardinge	1ea0d14e48	x86/i386: Make sure stack-protector segment base is cache aligned The Intel Optimization Reference Guide says: In Intel Atom microarchitecture, the address generation unit assumes that the segment base will be 0 by default. Non-zero segment base will cause load and store operations to experience a delay. - If the segment base isn't aligned to a cache line boundary, the max throughput of memory operations is reduced to one [e]very 9 cycles. [...] Assembly/Compiler Coding Rule 15. (H impact, ML generality) For Intel Atom processors, use segments with base set to 0 whenever possible; avoid non-zero segment base address that is not aligned to cache line boundary at all cost. We can't avoid having a non-zero base for the stack-protector segment, but we can make it cache-aligned. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> LKML-Reference: <4AA01893.6000507@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-03 21:30:51 +02:00
Akinobu Mita	57594742a2	x86: Introduce set_desc_base() and set_desc_limit() Rename set_base()/set_limit to set_desc_base()/set_desc_limit() and rewrite them in C. These are naturally introduced by the idea of get_desc_base()/get_desc_limit(). The conversion actually found the bug in apm_32.c: bad_bios_desc is written at run-time, but it is defined const variable. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> LKML-Reference: <20090718151105.GC11294@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-07-19 18:27:52 +02:00
Jeremy Fitzhardinge	2fb6b2a048	x86: add forward decl for tss_struct Its the correct thing to do before using the struct in a prototype. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:07:49 +01:00
Jeremy Fitzhardinge	389d1fb11e	x86: unify chunks of kernel/process.c With x86-32 and -64 using the same mechanism for managing the tss io permissions bitmap, large chunks of process.c are trivially unifyable, including: - exit_thread - flush_thread - __switch_to_xtra (along with tsc enable/disable) and as bonus pickups: - sys_fork - sys_vfork (Note: asmlinkage expands to empty on x86-64) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-02 12:07:48 +01:00
Ingo Molnar	ab639f3593	Merge branch 'core/percpu' into x86/core	2009-02-13 09:45:09 +01:00
Tejun Heo	5c79d2a517	x86: fix x86_32 stack protector bugs Impact: fix x86_32 stack protector Brian Gerst found out that %gs was being initialized to stack_canary instead of stack_canary - 20, which basically gave the same canary value for all threads. Fixing this also exposed the following bugs. * cpu_idle() didn't call boot_init_stack_canary() * stack canary switching in switch_to() was being done too late making the initial run of a new thread use the old stack canary value. Fix all of them and while at it update comment in cpu_idle() about calling boot_init_stack_canary(). Reported-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 11:33:49 +01:00
Tejun Heo	60a5317ff0	x86: implement x86_32 stack protector Impact: stack protector for x86_32 Implement stack protector for x86_32. GDT entry 28 is used for it. It's set to point to stack_canary-20 and have the length of 24 bytes. CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs to the stack canary segment on entry. As %gs is otherwise unused by the kernel, the canary can be anywhere. It's defined as a percpu variable. x86_32 exception handlers take register frame on stack directly as struct pt_regs. With -fstack-protector turned on, gcc copies the whole structure after the stack canary and (of course) doesn't copy back on return thus losing all changed. For now, -fno-stack-protector is added to all files which contain those functions. We definitely need something better. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-10 00:42:01 +01:00
Tejun Heo	ccbeed3a05	x86: make lazy %gs optional on x86_32 Impact: pt_regs changed, lazy gs handling made optional, add slight overhead to SAVE_ALL, simplifies error_code path a bit On x86_32, %gs hasn't been used by kernel and handled lazily. pt_regs doesn't have place for it and gs is saved/loaded only when necessary. In preparation for stack protector support, this patch makes lazy %gs handling optional by doing the followings. * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs. * Save and restore %gs along with other registers in entry_32.S unless LAZY_GS. Note that this unfortunately adds "pushl $0" on SAVE_ALL even when LAZY_GS. However, it adds no overhead to common exit path and simplifies entry path with error code. * Define different user_gs accessors depending on LAZY_GS and add lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS. The lazy__gs() ops are used to save, load and clear %gs lazily. Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly. xen and lguest changes need to be verified. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-10 00:42:00 +01:00
Tejun Heo	d9a89a26e0	x86: add %gs accessors for x86_32 Impact: cleanup On x86_32, %gs is handled lazily. It's not saved and restored on kernel entry/exit but only when necessary which usually is during task switch but there are few other places. Currently, it's done by calling savesegment() and loadsegment() explicitly. Define get_user_gs(), set_user_gs() and task_user_gs() and use them instead. While at it, clean up register access macros in signal.c. This cleans up code a bit and will help future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-10 00:41:58 +01:00
Ingo Molnar	74b6eb6b93	Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core	2009-01-28 23:13:53 +01:00
Tejun Heo	67e68bde02	x86: update canary handling during switch Impact: cleanup In switch_to(), instead of taking offset to irq_stack_union.stack, make it a proper percpu access using __percpu_arg() and per_cpu_var(). Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-21 17:26:05 +09:00
Brian Gerst	947e76cdc3	x86: move stack_canary into irq_stack Impact: x86_64 percpu area layout change, irq_stack now at the beginning Now that the PDA is empty except for the stack canary, it can be removed. The irqstack is moved to the start of the per-cpu section. If the stack protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack. tj: * updated subject * dropped asm relocation of irq_stack_ptr * updated comments a bit * rebased on top of stack canary changes Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-20 12:29:20 +09:00
Tejun Heo	b4a8f7a262	x86: conditionalize stack canary handling in hot path Impact: no unnecessary stack canary swapping during context switch There's no point in moving stack_canary around during context switch if it's not enabled. Conditionalize it. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-20 12:29:19 +09:00
Ingo Molnar	b2b062b816	Merge branch 'core/percpu' into stackprotector Conflicts: arch/x86/include/asm/pda.h arch/x86/include/asm/system.h Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-18 18:37:14 +01:00
Brian Gerst	87b2640658	x86-64: Use absolute displacements for per-cpu accesses. Accessing memory through %gs should not use rip-relative addressing. Adding a P prefix for the argument tells gcc to not add (%rip) to the memory references. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-19 00:38:59 +09:00
Brian Gerst	c6f5e0acd5	x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-01-19 00:38:58 +09:00
Benjamin LaHaise	7106a5ab89	x86-64: remove locked instruction from switch_to() Impact: micro-optimization The patch below removes an unnecessary locked instruction from switch_to(). TIF_FORK is only ever set in copy_thread() on initial process creation, and gets cleared during the first scheduling of the process. As such, it is safe to use an unlocked test for the flag within switch_to(). Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 05:05:33 +01:00
Ingo Molnar	a9de18eb76	Merge branch 'linus' into stackprotector Conflicts: arch/x86/include/asm/pda.h kernel/fork.c	2008-12-31 08:31:57 +01:00
Ingo Molnar	fa623d1b02	Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core	2008-12-23 16:27:23 +01:00
Jaswinder Singh	aab02f0ae2	x86: process_64.c declare __switch_to() and sys_arch_prctl before they get used Impact: cleanup In asm/system.h moved out __switch_to from CONFIG_X86_32 as it is common for both 32 and 64 bit. In asm/pctl.h defined sys_arch_prctl Signed-off-by: Jaswinder Singh <jaswinder@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:10:27 +01:00
Ivan Vecera	d3ec5cae09	x86: call machine_shutdown and stop all CPUs in native_machine_halt Impact: really halt all CPUs on halt Function machine_halt (resp. native_machine_halt) is empty for x86 architectures. When command 'halt -f' is invoked, the message "System halted." is displayed but this is not really true because all CPUs are still running. There are also similar inconsistencies for other arches (some uses power-off for halt or forever-loop with IRQs enabled/disabled). IMO there should be used the same approach for all architectures OR what does the message "System halted" really mean? This patch fixes it for x86. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-11 14:50:02 +01:00
H. Peter Anvin	1965aae3c9	x86: Fix ASM_X86__ header guards Change header guards named "ASM_X86__" to "_ASM_X86_" since: a. the double underscore is ugly and pointless. b. no leading underscore violates namespace constraints. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-10-22 22:55:23 -07:00
Al Viro	bb8985586b	x86, um: ... and asm-x86 move Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-10-22 22:55:20 -07:00

38 Commits