linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-05 07:56:48 +07:00

Author	SHA1	Message	Date
Eduardo Habkost	3ce672d484	KVM: SVM: init_vmcb(): remove redundant save->cr0 initialization The svm_set_cr0() call will initialize save->cr0 properly even when npt is enabled, clearing the NW and CD bits as expected, so we don't need to initialize it manually for npt_enabled anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:21 +02:00
Eduardo Habkost	18fa000ae4	KVM: SVM: Reset cr0 properly on vcpu reset svm_vcpu_reset() was not properly resetting the contents of the guest-visible cr0 register, causing the following issue: https://bugzilla.redhat.com/show_bug.cgi?id=525699 Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine with paging enabled, making the vcpu get a pagefault exception while trying to run it. Instead of setting vmcb->save.cr0 directly, the new code just resets kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0(). kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:21 +02:00
Eduardo Habkost	fa40052ca0	KVM: VMX: Use macros instead of hex value on cr0 initialization This should have no effect, it is just to make the code clearer. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:21 +02:00
Glauber Costa	afbcf7ab8d	KVM: allow userspace to adjust kvmclock offset When we migrate a kvm guest that uses pvclock between two hosts, we may suffer a large skew. This is because there can be significant differences between the monotonic clock of the hosts involved. When a new host with a much larger monotonic time starts running the guest, the view of time will be significantly impacted. Situation is much worse when we do the opposite, and migrate to a host with a smaller monotonic clock. This proposed ioctl will allow userspace to inform us what is the monotonic clock value in the source host, so we can keep the time skew short, and more importantly, never goes backwards. Userspace may also need to trigger the current data, since from the first migration onwards, it won't be reflected by a simple call to clock_gettime() anymore. [marcelo: future-proof abi with a flags field] [jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it] Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:19 +02:00
Jan Kiszka	6be7d3062b	KVM: SVM: Cleanup NMI singlestep Push the NMI-related singlestep variable into vcpu_svm. It's dealing with an AMD-specific deficit, nothing generic for x86. Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> arch/x86/include/asm/kvm_host.h \| 1 - arch/x86/kvm/svm.c \| 12 +++++++----- 2 files changed, 7 insertions(+), 6 deletions(-) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:19 +02:00
Jan Kiszka	94fe45da48	KVM: x86: Fix guest single-stepping while interruptible Commit 705c5323 opened the doors of hell by unconditionally injecting single-step flags as long as guest_debug signaled this. This doesn't work when the guest branches into some interrupt or exception handler and triggers a vmexit with flag reloading. Fix it by saving cs:rip when user space requests single-stepping and restricting the trace flag injection to this guest code position. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:19 +02:00
Ed Swierk	ffde22ac53	KVM: Xen PV-on-HVM guest support Support for Xen PV-on-HVM guests can be implemented almost entirely in userspace, except for handling one annoying MSR that maps a Xen hypercall blob into guest address space. A generic mechanism to delegate MSR writes to userspace seems overkill and risks encouraging similar MSR abuse in the future. Thus this patch adds special support for the Xen HVM MSR. I implemented a new ioctl, KVM_XEN_HVM_CONFIG, that lets userspace tell KVM which MSR the guest will write to, as well as the starting address and size of the hypercall blobs (one each for 32-bit and 64-bit) that userspace has loaded from files. When the guest writes to the MSR, KVM copies one page of the blob from userspace to the guest. I've tested this patch with a hacked-up version of Gerd's userspace code, booting a number of guests (CentOS 5.3 i386 and x86_64, and FreeBSD 8.0-RC1 amd64) and exercising PV network and block devices. [jan: fix i386 build warning] [avi: future proof abi with a flags field] Signed-off-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:18 +02:00
Jan Kiszka	94c30d9ca6	KVM: x86: Drop unneeded CONFIG_HAS_IOMEM check This (broken) check dates back to the days when this code was shared across architectures. x86 has IOMEM, so drop it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:18 +02:00
Marcelo Tosatti	9fb41ba896	KVM: VMX: fix handle_pause declaration There's no kvm_run argument anymore. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:18 +02:00
Zachary Amsden	6b7d7e762b	KVM: x86: Harden against cpufreq If cpufreq can't determine the CPU khz, or cpufreq is not compiled in, we should fallback to the measured TSC khz. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:18 +02:00
Mark Langsdorf	565d0998ec	KVM: SVM: Support Pause Filter in AMD processors New AMD processors (Family 0x10 models 8+) support the Pause Filter Feature. This feature creates a new field in the VMCB called Pause Filter Count. If Pause Filter Count is greater than 0 and intercepting PAUSEs is enabled, the processor will increment an internal counter when a PAUSE instruction occurs instead of intercepting. When the internal counter reaches the Pause Filter Count value, a PAUSE intercept will occur. This feature can be used to detect contended spinlocks, especially when the lock holding VCPU is not scheduled. Rescheduling another VCPU prevents the VCPU seeking the lock from wasting its quantum by spinning idly. Experimental results show that most spinlocks are held for less than 1000 PAUSE cycles or more than a few thousand. Default the Pause Filter Counter to 3000 to detect the contended spinlocks. Processor support for this feature is indicated by a CPUID bit. On a 24 core system running 4 guests each with 16 VCPUs, this patch improved overall performance of each guest's 32 job kernbench by approximately 3-5% when combined with a scheduler algorithm that caused the VCPU to sleep for a brief period. Further performance improvement may be possible with a more sophisticated yield algorithm. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:17 +02:00
Zhai, Edwin	4b8d54f972	KVM: VMX: Add support for Pause-Loop Exiting New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution control fields: PLE_Gap - upper bound on the amount of time between two successive executions of PAUSE in a loop. PLE_Window - upper bound on the amount of time a guest is allowed to execute in a PAUSE loop. If the time between this execution of PAUSE and the previous one exceeds the PLE_Gap, the processor considers this PAUSE to belong to a new loop. Otherwise, the processor determines the total execution time of this loop (since the 1st PAUSE in this loop), and triggers a VM exit if the total time exceeds the PLE_Window. * Refer to SDM volume 3b section 21.6.13 & 22.1.3. Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where one VP is scheduled out after taking a spinlock, and other VPs waiting for the same lock are then scheduled in only to waste CPU time. Our tests indicate that most spinlocks are held for less than 2^12 cycles. Performance tests show that with 2X LP over-commitment we can get +2% perf improvement for kernel build (even more perf gain with more LPs). Signed-off-by: Zhai Edwin <edwin.zhai@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:17 +02:00
Joerg Roedel	d36f19e9ec	KVM: SVM: Remove nsvm_printk debugging code With all important information now delivered through tracepoints we can safely remove the nsvm_printk debugging code for nested svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:17 +02:00
Joerg Roedel	532a46b989	KVM: SVM: Add tracepoint for skinit instruction This patch adds a tracepoint for the event that the guest executed the SKINIT instruction. This information is important because SKINIT is an SVM extension not yet implemented by nested SVM and we may need this information for debugging hypervisors that do not yet run on nested SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:16 +02:00
Joerg Roedel	ec1ff79084	KVM: SVM: Add tracepoint for invlpga instruction This patch adds a tracepoint for the event that the guest executed the INVLPGA instruction. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:16 +02:00
Joerg Roedel	236649de33	KVM: SVM: Add tracepoint for #vmexit because intr pending This patch adds a special tracepoint for the event that a nested #vmexit is injected because kvm wants to inject an interrupt into the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:16 +02:00
Joerg Roedel	17897f3668	KVM: SVM: Add tracepoint for injected #vmexit This patch adds a tracepoint for a nested #vmexit that gets re-injected to the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:15 +02:00
Joerg Roedel	d8cabddf7e	KVM: SVM: Add tracepoint for nested #vmexit This patch adds a tracepoint for every #vmexit we get from a nested guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:15 +02:00
Joerg Roedel	0ac406de8f	KVM: SVM: Add tracepoint for nested vmrun This patch adds a dedicated kvm tracepoint for a nested vmrun. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:15 +02:00
Joerg Roedel	cd3ff653ae	KVM: SVM: Move INTR vmexit out of atomic code The nested SVM code emulates a #vmexit caused by a request to open the irq window right in the request function. This is a bug because the request function runs with preemption and interrupts disabled but the #vmexit emulation might sleep. This can cause a schedule()-while-atomic bug and is fixed with this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:15 +02:00
Alexander Graf	8d23c46624	KVM: SVM: Notify nested hypervisor of lost event injections If event_inj is valid on a #vmexit the host CPU would write the contents to exit_int_info, so the hypervisor knows that the event wasn't injected. We don't do this in nested SVM by now which is a bug and fixed by this patch. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:14 +02:00
Glauber Costa	e3267cbbbf	KVM: x86: include pvclock MSRs in msrs_to_save For a while now, we are issuing a rdmsr instruction to find out which msrs in our save list are really supported by the underlying machine. However, it fails to account for kvm-specific msrs, such as the pvclock ones. This patch moves them to the beginning of the list, and skips testing them. Cc: stable@kernel.org Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:14 +02:00
Jan Kiszka	91586a3b7d	KVM: x86: Rework guest single-step flag injection and filtering Push TF and RF injection and filtering on guest single-stepping into the vendor get/set_rflags callbacks. This makes the whole mechanism more robust wrt user space IOCTL order and instruction emulations. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:14 +02:00
Marcelo Tosatti	a68a6a7282	KVM: x86: disable paravirt mmu reporting Disable paravirt MMU capability reporting, so that new (or rebooted) guests switch to native operation. Paravirt MMU is a burden to maintain and does not bring significant advantages compared to shadow anymore. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:14 +02:00
Jan Kiszka	355be0b930	KVM: x86: Refactor guest debug IOCTL handling Much of so far vendor-specific code for setting up guest debug can actually be handled by the generic code. This also fixes a minor deficit in the SVM part /wrt processing KVM_GUESTDBG_ENABLE. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:14 +02:00
Juan Quintela	201d945bcf	KVM: remove pre_task_link setting in save_state_to_tss16 Now, also remove pre_task_link setting in save_state_to_tss16. commit `b237ac37a1` Author: Gleb Natapov <gleb@redhat.com> Date: Mon Mar 30 16:03:24 2009 +0300 KVM: Fix task switch back link handling. CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:13 +02:00
Zachary Amsden	3230bb4707	KVM: Fix hotplug of CPUs Both VMX and SVM require per-cpu memory allocation, which is done at module init time, for only online cpus. Backend was not allocating enough structure for all possible CPUs, so new CPUs coming online could not be hardware enabled. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:13 +02:00
Zachary Amsden	e6732a5af9	KVM: Fix printk name error in svm.c Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:13 +02:00
Zachary Amsden	0cca790753	KVM: Kill the confusing tsc_ref_khz and ref_freq variables They are globals, not clearly protected by any ordering or locking, and vulnerable to various startup races. Instead, for variable TSC machines, register the cpufreq notifier and get the TSC frequency directly from the cpufreq machinery. Not only is it always right, it is also perfectly accurate, as no error prone measurement is required. On such machines, when a new CPU is brought online, it isn't clear what frequency it will start with, and it may not correspond to the reference, thus in hardware_enable we clear the cpu_tsc_khz variable to zero and make sure it is set before running on a VCPU. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:12 +02:00
Zachary Amsden	b820cc0ca2	KVM: Separate timer initialization into an independent function Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:12 +02:00
Joerg Roedel	e935d48e1b	KVM: SVM: Remove remaining occurrences of rdtscll This patch replaces them with native_read_tsc() which can also be used in expressions and saves a variable on the stack in this case. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:12 +02:00
Joerg Roedel	33527ad7e1	KVM: SVM: don't copy exit_int_info on nested vmrun The exit_int_info field is only written by the hardware and never read. So it does not need to be copied on a vmrun emulation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:11 +02:00
Joerg Roedel	7fcdb5103d	KVM: SVM: reorganize svm_interrupt_allowed This patch reorganizes the logic in svm_interrupt_allowed to make it better to read. This is important because the logic is a lot more complicated with Nested SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:11 +02:00
Huang Weiyi	bfc33beaed	KVM: remove duplicated #include Remove duplicated #include('s) in arch/x86/kvm/lapic.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:10 +02:00
Alexander Graf	10474ae894	KVM: Activate Virtualization On Demand X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on demand activation of virtualization. This means, that instead virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. So using this, KVM can be easily autoloaded, while keeping other hypervisors usable. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:10 +02:00
Marcelo Tosatti	e8b3433a5c	KVM: SVM: remove needless mmap_sem acquisition from nested_svm_map nested_svm_map unnecessarily takes mmap_sem around gfn_to_page, since gfn_to_page / get_user_pages are responsible for it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:10 +02:00
Mohammed Gamal	80ced186d1	KVM: VMX: Enhance invalid guest state emulation - Change returned handle_invalid_guest_state() to return relevant exit codes - Move triggering the emulation from vmx_vcpu_run() to vmx_handle_exit() - Return to userspace instead of repeatedly trying to emulate instructions that have already failed Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-12-03 09:32:09 +02:00
Mohammed Gamal	abcf14b560	KVM: x86 emulator: Add pusha and popa instructions This adds pusha and popa instructions (opcodes 0x60-0x61), this enables booting MINIX with invalid guest state emulation on. [marcelo: remove unused variable] Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:09 +02:00
Mohammed Gamal	94677e61fd	KVM: x86 emulator: Add missing decoder flags for 'or' instructions Add missing decoder flags for or instructions (0xc-0xd). Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:09 +02:00
Avi Kivity	bfd99ff5d4	KVM: Move assigned device code to own file Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:09 +02:00
Avi Kivity	367e1319b2	KVM: Return -ENOTTY on unrecognized ioctls Not the incorrect -EINVAL. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	680b3648ba	KVM: Drop kvm->irq_lock lock from irq injection path The only thing it protects now is interrupt injection into lapic and this can work lockless. Even now with kvm->irq_lock in place access to lapic is not entirely serialized since vcpu access doesn't take kvm->irq_lock. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	eba0226bdf	KVM: Move IO APIC to its own lock This allows removal of irq_lock from the injection path. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	136bdfeee7	KVM: Move irq ack notifier list to arch independent code Mask irq notifier list is already there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:07 +02:00
Gleb Natapov	3e71f88bc9	KVM: Maintain back mapping from irqchip/pin to gsi Maintain back mapping from irqchip/pin to gsi to speedup interrupt acknowledgment notifications. [avi: build fix on non-x86/ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:07 +02:00
Gleb Natapov	1a6e4a8c27	KVM: Move irq sharing information to irqchip level This removes assumptions that max GSIs is smaller than number of pins. Sharing is tracked on pin level not GSI level. [avi: no PIC on ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Gleb Natapov	79c727d437	KVM: Call pic_clear_isr() on pic reset to reuse logic there Also move call of ack notifiers after pic state change. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Avi Kivity	851ba6922a	KVM: Don't pass kvm_run arguments They're just copies of vcpu->run, which is readily accessible. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Mohammed Gamal	d8769fedd4	KVM: x86 emulator: Introduce No64 decode option Introduces a new decode option "No64", which is used for instructions that are invalid in long mode. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:05 +02:00
Mohammed Gamal	0934ac9d13	KVM: x86 emulator: Add 'push/pop sreg' instructions [avi: avoid buffer overflow] Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:05 +02:00
Avi Kivity	58988b07cf	Merge remote branch 'tip/x86/entry' into kvm-updates/2.6.33 Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:30:06 +02:00
Hidetoshi Seto	fe5ed91ddc	x86, mce: don't restart timer if disabled Even it is in error path unlikely taken, add_timer_on() at CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-02 21:27:32 -08:00
Jan Beulich	99063c0bce	x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h This at once also gets the alignment specification right for x86-64. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8F80200007800022708@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 11:39:45 +01:00
Jan Beulich	01be50a308	x86/alternatives: Check replacementlen <= instrlen at build time Having run into the run-(boot-)time check a couple of times lately, I finally took time to find a build-time check so that one doesn't need to analyze the register/stack dump and resolve this (through manual lookup in vmlinux) to the offending construct. The assembler will emit a message like "Error: value of <num> too large for field of 1 bytes at <offset>", which while not pointing out the source location still makes analysis quite a bit easier. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 11:39:45 +01:00
Masami Hiramatsu	e859cf8656	x86: Fix comments of register/stack access functions Fix typos and some redundant comments of register/stack access functions in asm/ptrace.h. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Wenji Huang <wenji.huang@oracle.com> Cc: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> LKML-Reference: <20091201000222.7669.7477.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu> Suggested-by: Wenji Huang <wenji.huang@oracle.com>	2009-12-02 10:22:22 +01:00
Suresh Siddha	6d20792e85	x86: Remove unnecessary mdelay() from cpu_disable_common() fixup_irqs() already has a mdelay(). Remove the extra and unnecessary mdelay() from cpu_disable_common(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.232177348@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:02 +01:00
Suresh Siddha	1c83995b6c	x86, ioapic: Document another case when level irq is seen as an edge In the case when cpu goes offline, fixup_irqs() will forward any unhandled interrupt on the offlined cpu to the new cpu destination that is handling the corresponding interrupt. This interrupt forwarding is done via IPI's. Hence, in this case also level-triggered io-apic interrupt will be seen as an edge interrupt in the cpu's APIC IRR. Document this scenario in the code which handles this case by doing an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE. Requested-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:01 +01:00
Suresh Siddha	c29d9db338	x86, ioapic: Fix the EOI register detection mechanism Maciej W. Rozycki reported: > 82093AA I/O APIC has its version set to 0x11 and it > does not support the EOI register. Similarly I/O APICs > integrated into the 82379AB south bridge and the 82374EB/SB > EISA component. IO-APIC versions below 0x20 don't support EOI register. Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic version as 0x2. This is an error with documentation and these ICH chips use io-apic's of version 0x20 and indeed has a working EOI register for the io-apic. Fix the EOI register detection mechanism to check for version 0x20 and beyond. And also, a platform can potentially have io-apic's with different versions. Make the EOI register check per io-apic. Reported-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:01 +01:00
Maciej W. Rozycki	ca64c47cec	x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq When the level-triggered interrupt is seen as an edge interrupt, we try to clear the remoteIRR explicitly (using either an io-apic eoi register when present or through the idea of changing trigger mode of the io-apic RTE to edge and then back to level). But this explicit try also needs to happen before we try to migrate the irq. Otherwise irq migration attempt will fail anyhow, as it postpones the irq migration to a later attempt when it sees the remoteIRR in the io-apic RTE still set. Signed-off-by: "Maciej W. Rozycki" <macro@linux-mips.org> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:00 +01:00
Frederic Weisbecker	1cedae7290	hw-breakpoints: Keep track of user disabled breakpoints When we disable a breakpoint through dr7, we unregister it right away, making us lose track of its corresponding address register value. It means that the following sequence would be unsupported: - set address in dr0 - enable it through dr7 - disable it through dr7 - enable it through dr7 because we lost the address register value when we disabled the breakpoint. Don't unregister the disabled breakpoints but rather disable them. Reported-by: "K.Prasad" <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259735536-9236-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 09:59:03 +01:00
David S. Miller	ff9c38bba3	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/ht.c	2009-12-01 22:13:38 -08:00
Herbert Xu	8386324381	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2009-12-01 15:16:22 +08:00
H. Peter Anvin	ccef086454	x86, mm: Correct the implementation of is_untracked_pat_range() The semantics the PAT code expects of is_untracked_pat_range() is "is this range completely contained inside the untracked region." This means that checkin `8a27138924` was technically wrong, because the implementation was needlessly confusing. The sane interface is for it to take a semiclosed range like just about everything else (as evidenced by the sheer number of "- 1"'s removed by that patch) so change the actual implementation to match. Reported-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-30 21:33:51 -08:00
Helight.Xu	9eaa192d89	x86: Fix a section mismatch in arch/x86/kernel/setup.c copy_edd() should be __init. warning msg: WARNING: vmlinux.o(.text+0x7759): Section mismatch in reference from the function copy_edd() to the variable .init.data:boot_params The function copy_edd() references the variable __initdata boot_params. This is often because copy_edd lacks a __initdata annotation or the annotation of boot_params is wrong. Signed-off-by: ZhenwenXu <helight.xu@gmail.com> LKML-Reference: <4B139F8F.4000907@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-30 11:16:49 -08:00
Thomas Gleixner	b8b7d791a8	x86: Use -maccumulate-outgoing-args for sane mcount prologues commit `746357d` (x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage) uses -mtune=generic to work around the function prologue problem with mcount on -march=pentium-mmx and others. Jakub pointed out that we can use -maccumulate-outgoing-args instead which is selected by -mtune=generic and prevents the problem without losing the -march specific optimizations. Pointed-out-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@kernel.org	2009-11-28 15:08:30 +01:00
Thomas Gleixner	18ed61da98	x86: hpet: Make WARN_ON understandable Andrew complained rightly that the WARN_ON in hpet_next_event() is confusing and the code comment not really helpful. Change it to WARN_ONCE and print the reason in clear text. Change the comment to explain what kind of hardware wreckage we deal with. Pointed-out-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>	2009-11-27 20:37:41 +01:00
Joerg Roedel	492667dacc	x86/amd-iommu: Remove amd_iommu_pd_table The data that was stored in this table is now available in dev->archdata.iommu. So this table is no longer necessary. This patch removes the remaining uses of that variable and removes it from the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:37 +01:00
Joerg Roedel	8eed983334	x86/amd-iommu: Move reset_iommu_command_buffer out of locked code This patch removes the ugly construct where the iommu->lock must be released before calling the reset_iommu_command_buffer function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:37 +01:00
Joerg Roedel	b00d3bcff4	x86/amd-iommu: Cleanup DTE flushing code This patch cleans up the code to flush device table entries in the IOMMU. With this change the driver can get rid of the iommu_queue_inv_dev_entry() function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:36 +01:00
Joerg Roedel	3fa43655d8	x86/amd-iommu: Introduce iommu_flush_device() function This patch adds a function to flush a DTE entry for a given struct device and replaces iommu_queue_inv_dev_entry calls with this function where appropriate. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:35 +01:00
Joerg Roedel	7f760ddd70	x86/amd-iommu: Cleanup attach/detach_device code This patch cleans up the attach_device and detach_device paths and fixes reference counting while at it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:35 +01:00
Joerg Roedel	7c392cbe98	x86/amd-iommu: Keep devices per domain in a list This patch introduces a list to each protection domain which keeps all devices associated with the domain. This can be used later to optimize certain functions and to completely remove the amd_iommu_pd_table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:34 +01:00
Joerg Roedel	241000556f	x86/amd-iommu: Add device bind reference counting This patch adds a reference count to each device to count how often the device was bound to that domain. This is important for single devices that act as an alias for a number of others. These devices must stay bound to their domains until all devices that alias to it are unbound from the same domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:33 +01:00
Joerg Roedel	657cbb6b6c	x86/amd-iommu: Use dev->arch->iommu to store iommu related information This patch changes IOMMU code to use dev->archdata->iommu to store information about the alias device and the domain the device is attached to. This allows the driver to get rid of the amd_iommu_pd_table in the future. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:32 +01:00
Joerg Roedel	8793abeb78	x86/amd-iommu: Remove support for domain sharing This patch makes device isolation mandatory and removes support for the amd_iommu=share option. This simplifies the code in several places. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:32 +01:00
Joerg Roedel	171e7b3739	x86/amd-iommu: Rearrange dma_ops related functions This patch rearranges two dma_ops related functions so that their forward declarations are no longer necessary. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:31 +01:00
Joerg Roedel	308973d3b9	x86/amd-iommu: Move some pte allocation functions in the right section This patch moves alloc_pte() and fetch_pte() into the page table handling code section so that the forward declarations for them could be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:30 +01:00
Joerg Roedel	87a64d5238	x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc This function doesn't use the parameter anymore so it can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:30 +01:00
Joerg Roedel	98fc5a693b	x86/amd-iommu: Use get_device_id and check_device where appropriate The logic of these two functions is reimplemented (at least in parts) in places in the code. This patch removes these code duplications and uses the functions instead. As a side effect it moves check_device() to the helper function code section. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:29 +01:00
Joerg Roedel	71c70984e5	x86/amd-iommu: Move find_protection_domain to helper functions This is a helper function and when its placed in the helper function section we can remove its forward declaration. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:28 +01:00
Joerg Roedel	94f6d190ee	x86/amd-iommu: Simplify get_device_resources() With the previous changes the get_device_resources function can be simplified even more. The only important information for the callers is the protection domain. This patch renames the function to get_domain() and let it only return the protection domain for a device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:21 +01:00
Joerg Roedel	15898bbcb4	x86/amd-iommu: Let domain_for_device handle aliases If there is no domain associated to a device yet and the device has an alias device which already has a domain, the original device needs to have the same domain as the alias device. This patch changes domain_for_device to handle this situation and directly assigns the alias device domain to the device in this situation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:09 +01:00
Joerg Roedel	f3be07da53	x86/amd-iommu: Remove iommu specific handling from dma_ops path This patch finishes the removal of all iommu specific handling code in the dma_ops path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:08 +01:00
Joerg Roedel	cd8c82e875	x86/amd-iommu: Remove iommu parameter from __(un)map_single With the prior changes this parameter is no longer required. This patch removes it from the function and all callers. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:08 +01:00
Joerg Roedel	576175c250	x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs Since the assumption that an dma_ops domain is only bound to one IOMMU was given up we need to make alloc_new_range aware of it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:01 +01:00
Joerg Roedel	680525e06d	x86/amd-iommu: Remove iommu parameter from dma_ops_domain_(un)map The parameter is unused in these function so remove it from the parameter list. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:31 +01:00
Joerg Roedel	f99c0f1c75	x86/amd-iommu: Use check_device in get_device_resources Every call-place of get_device_resources calls check_device before it. So call it from get_device_resources directly and simplify the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:30 +01:00
Joerg Roedel	420aef8a3a	x86/amd-iommu: Use check_device for amd_iommu_dma_supported The check_device logic needs to include the dma_supported checks to be really sure. Merge the dma_supported logic into check_device and use it to implement dma_supported. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:30 +01:00
Joerg Roedel	318afd41d2	x86/amd-iommu: Make np-cache a global flag The non-present cache flag was IOMMU local until now which doesn't make sense. Make this a global flag so we can remove the last user of 'struct iommu' in the map/unmap path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:29 +01:00
Joerg Roedel	09b4280439	x86/amd-iommu: Reimplement flush_all_domains_on_iommu() This patch reimplements the function flush_all_domains_on_iommu to use the global protection domain list. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:28 +01:00
Joerg Roedel	e3306664eb	x86/amd-iommu: Reimplement amd_iommu_flush_all_domains() This patch reimplements the amd_iommu_flush_all_domains function to use the global protection domain list instead of flushing every domain on every IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:28 +01:00
Joerg Roedel	aeb26f5533	x86/amd-iommu: Implement protection domain list This patch adds code to keep a global list of all protection domains. This allows to simplify the resume code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:27 +01:00
Joerg Roedel	601367d76b	x86/amd-iommu: Remove iommu_flush_domain function This iommu_flush_tlb_pde function does essentially the same. So the iommu_flush_domain function is redundant and can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:26 +01:00
Joerg Roedel	dcd1e92e40	x86/amd-iommu: Use __iommu_flush_pages for tlb flushes This patch re-implements iommu_flush_tlb functions to use the __iommu_flush_pages logic. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:26 +01:00
Joerg Roedel	6de8ad9b9e	x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs This patch extends the iommu_flush_pages function to flush the TLB entries on all IOMMUs the domain has devices on. This basically gives up the former assumption that dma_ops domains are only bound to one IOMMU in the system. For dma_ops domains this is still true but not for IOMMU-API managed domains. Giving this assumption up for dma_ops domains too allows code simplification. Further it splits out the main logic into a generic function which can be used by iommu_flush_tlb too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:18 +01:00
Joerg Roedel	0518a3a458	x86/amd-iommu: Add function to complete a tlb flush This patch adds a function to the AMD IOMMU driver which completes all queued commands on all IOMMUs a specific domain has devices attached on. This is required in a later patch when per-domain flushing is implemented. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:50 +01:00
Joerg Roedel	c459611424	x86/amd-iommu: Add per IOMMU reference counting This patch adds reference counting for protection domains per IOMMU. This allows a smarter TLB flushing strategy. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:50 +01:00
Joerg Roedel	bb52777ec4	x86/amd-iommu: Add an index field to struct amd_iommu This patch adds an index field to struct amd_iommu which can be used to look it up in an array. This index will be used in struct protection_domain to keep track which protection domain has devices behind which IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:49 +01:00
Joerg Roedel	bf3118c127	x86/amd-iommu: Update copyright headers This patch updates the copyright headers in the relevant AMD IOMMU driver files to match the date of the latest changes. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:49 +01:00
Joerg Roedel	6a9401a7ac	x86/amd-iommu: Separate internal interface definitions This patch moves all function declarations which are only used inside the driver code to a separate header file. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:48 +01:00

9429 Commits