linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-12 22:56:48 +07:00

Author	SHA1	Message	Date
Gleb Natapov	1a6e4a8c27	KVM: Move irq sharing information to irqchip level This removes assumptions that max GSIs is smaller than number of pins. Sharing is tracked on pin level not GSI level. [avi: no PIC on ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Gleb Natapov	79c727d437	KVM: Call pic_clear_isr() on pic reset to reuse logic there Also move call of ack notifiers after pic state change. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Avi Kivity	851ba6922a	KVM: Don't pass kvm_run arguments They're just copies of vcpu->run, which is readily accessible. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:06 +02:00
Mohammed Gamal	d8769fedd4	KVM: x86 emulator: Introduce No64 decode option Introduces a new decode option "No64", which is used for instructions that are invalid in long mode. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:05 +02:00
Mohammed Gamal	0934ac9d13	KVM: x86 emulator: Add 'push/pop sreg' instructions [avi: avoid buffer overflow] Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:05 +02:00
Avi Kivity	58988b07cf	Merge remote branch 'tip/x86/entry' into kvm-updates/2.6.33 Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:30:06 +02:00
Hidetoshi Seto	fe5ed91ddc	x86, mce: don't restart timer if disabled Even it is in error path unlikely taken, add_timer_on() at CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-02 21:27:32 -08:00
Jan Beulich	99063c0bce	x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h This at once also gets the alignment specification right for x86-64. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8F80200007800022708@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 11:39:45 +01:00
Jan Beulich	01be50a308	x86/alternatives: Check replacementlen <= instrlen at build time Having run into the run-(boot-)time check a couple of times lately, I finally took time to find a build-time check so that one doesn't need to analyze the register/stack dump and resolve this (through manual lookup in vmlinux) to the offending construct. The assembler will emit a message like "Error: value of <num> too large for field of 1 bytes at <offset>", which while not pointing out the source location still makes analysis quite a bit easier. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 11:39:45 +01:00
Masami Hiramatsu	e859cf8656	x86: Fix comments of register/stack access functions Fix typos and some redundant comments of register/stack access functions in asm/ptrace.h. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Wenji Huang <wenji.huang@oracle.com> Cc: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> LKML-Reference: <20091201000222.7669.7477.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu> Suggested-by: Wenji Huang <wenji.huang@oracle.com>	2009-12-02 10:22:22 +01:00
Suresh Siddha	6d20792e85	x86: Remove unnecessary mdelay() from cpu_disable_common() fixup_irqs() already has a mdelay(). Remove the extra and unnecessary mdelay() from cpu_disable_common(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.232177348@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:02 +01:00
Suresh Siddha	1c83995b6c	x86, ioapic: Document another case when level irq is seen as an edge In the case when cpu goes offline, fixup_irqs() will forward any unhandled interrupt on the offlined cpu to the new cpu destination that is handling the corresponding interrupt. This interrupt forwarding is done via IPI's. Hence, in this case also level-triggered io-apic interrupt will be seen as an edge interrupt in the cpu's APIC IRR. Document this scenario in the code which handles this case by doing an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE. Requested-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:01 +01:00
Suresh Siddha	c29d9db338	x86, ioapic: Fix the EOI register detection mechanism Maciej W. Rozycki reported: > 82093AA I/O APIC has its version set to 0x11 and it > does not support the EOI register. Similarly I/O APICs > integrated into the 82379AB south bridge and the 82374EB/SB > EISA component. IO-APIC versions below 0x20 don't support EOI register. Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic version as 0x2. This is an error with documentation and these ICH chips use io-apic's of version 0x20 and indeed has a working EOI register for the io-apic. Fix the EOI register detection mechanism to check for version 0x20 and beyond. And also, a platform can potentially have io-apic's with different versions. Make the EOI register check per io-apic. Reported-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:01 +01:00
Maciej W. Rozycki	ca64c47cec	x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq When the level-triggered interrupt is seen as an edge interrupt, we try to clear the remoteIRR explicitly (using either an io-apic eoi register when present or through the idea of changing trigger mode of the io-apic RTE to edge and then back to level). But this explicit try also needs to happen before we try to migrate the irq. Otherwise irq migration attempt will fail anyhow, as it postpones the irq migration to a later attempt when it sees the remoteIRR in the io-apic RTE still set. Signed-off-by: "Maciej W. Rozycki" <macro@linux-mips.org> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 10:11:00 +01:00
Frederic Weisbecker	1cedae7290	hw-breakpoints: Keep track of user disabled breakpoints When we disable a breakpoint through dr7, we unregister it right away, making us lose track of its corresponding address register value. It means that the following sequence would be unsupported: - set address in dr0 - enable it through dr7 - disable it through dr7 - enable it through dr7 because we lost the address register value when we disabled the breakpoint. Don't unregister the disabled breakpoints but rather disable them. Reported-by: "K.Prasad" <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259735536-9236-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-02 09:59:03 +01:00
David S. Miller	ff9c38bba3	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/ht.c	2009-12-01 22:13:38 -08:00
Herbert Xu	8386324381	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2009-12-01 15:16:22 +08:00
H. Peter Anvin	ccef086454	x86, mm: Correct the implementation of is_untracked_pat_range() The semantics the PAT code expect of is_untracked_pat_range() is "is this range completely contained inside the untracked region." This means that checkin `8a27138924` was technically wrong, because the implementation needlessly confusing. The sane interface is for it to take a semiclosed range like just about everything else (as evidenced by the sheer number of "- 1"'s removed by that patch) so change the actual implementation to match. Reported-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-30 21:33:51 -08:00
Helight.Xu	9eaa192d89	x86: Fix a section mismatch in arch/x86/kernel/setup.c copy_edd() should be __init. warning msg: WARNING: vmlinux.o(.text+0x7759): Section mismatch in reference from the function copy_edd() to the variable .init.data:boot_params The function copy_edd() references the variable __initdata boot_params. This is often because copy_edd lacks a __initdata annotation or the annotation of boot_params is wrong. Signed-off-by: ZhenwenXu <helight.xu@gmail.com> LKML-Reference: <4B139F8F.4000907@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-30 11:16:49 -08:00
Thomas Gleixner	b8b7d791a8	x86: Use -maccumulate-outgoing-args for sane mcount prologues commit `746357d` (x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage) uses -mtune=generic to work around the function prologue problem with mcount on -march=pentium-mmx and others. Jakub pointed out that we can use -maccumulate-outgoing-args instead which is selected by -mtune=generic and prevents the problem without losing the -march specific optimizations. Pointed-out-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@kernel.org	2009-11-28 15:08:30 +01:00
Thomas Gleixner	18ed61da98	x86: hpet: Make WARN_ON understandable Andrew complained rightly that the WARN_ON in hpet_next_event() is confusing and the code comment not really helpful. Change it to WARN_ONCE and print the reason in clear text. Change the comment to explain what kind of hardware wreckage we deal with. Pointed-out-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>	2009-11-27 20:37:41 +01:00
Joerg Roedel	492667dacc	x86/amd-iommu: Remove amd_iommu_pd_table The data that was stored in this table is now available in dev->archdata.iommu. So this table is not longer necessary. This patch removes the remaining uses of that variable and removes it from the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:37 +01:00
Joerg Roedel	8eed983334	x86/amd-iommu: Move reset_iommu_command_buffer out of locked code This patch removes the ugly contruct where the iommu->lock must be released while before calling the reset_iommu_command_buffer function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:37 +01:00
Joerg Roedel	b00d3bcff4	x86/amd-iommu: Cleanup DTE flushing code This patch cleans up the code to flush device table entries in the IOMMU. With this chance the driver can get rid of the iommu_queue_inv_dev_entry() function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:36 +01:00
Joerg Roedel	3fa43655d8	x86/amd-iommu: Introduce iommu_flush_device() function This patch adds a function to flush a DTE entry for a given struct device and replaces iommu_queue_inv_dev_entry calls with this function where appropriate. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:35 +01:00
Joerg Roedel	7f760ddd70	x86/amd-iommu: Cleanup attach/detach_device code This patch cleans up the attach_device and detach_device paths and fixes reference counting while at it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:35 +01:00
Joerg Roedel	7c392cbe98	x86/amd-iommu: Keep devices per domain in a list This patch introduces a list to each protection domain which keeps all devices associated with the domain. This can be used later to optimize certain functions and to completly remove the amd_iommu_pd_table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:34 +01:00
Joerg Roedel	241000556f	x86/amd-iommu: Add device bind reference counting This patch adds a reference count to each device to count how often the device was bound to that domain. This is important for single devices that act as an alias for a number of others. These devices must stay bound to their domains until all devices that alias to it are unbound from the same domain. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:33 +01:00
Joerg Roedel	657cbb6b6c	x86/amd-iommu: Use dev->arch->iommu to store iommu related information This patch changes IOMMU code to use dev->archdata->iommu to store information about the alias device and the domain the device is attached to. This allows the driver to get rid of the amd_iommu_pd_table in the future. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:32 +01:00
Joerg Roedel	8793abeb78	x86/amd-iommu: Remove support for domain sharing This patch makes device isolation mandatory and removes support for the amd_iommu=share option. This simplifies the code in several places. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:32 +01:00
Joerg Roedel	171e7b3739	x86/amd-iommu: Rearrange dma_ops related functions This patch rearranges two dma_ops related functions so that their forward declarations are not longer necessary. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:31 +01:00
Joerg Roedel	308973d3b9	x86/amd-iommu: Move some pte allocation functions in the right section This patch moves alloc_pte() and fetch_pte() into the page table handling code section so that the forward declarations for them could be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:30 +01:00
Joerg Roedel	87a64d5238	x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc This function doesn't use the parameter anymore so it can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:30 +01:00
Joerg Roedel	98fc5a693b	x86/amd-iommu: Use get_device_id and check_device where appropriate The logic of these two functions is reimplemented (at least in parts) in places in the code. This patch removes these code duplications and uses the functions instead. As a side effect it moves check_device() to the helper function code section. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:29 +01:00
Joerg Roedel	71c70984e5	x86/amd-iommu: Move find_protection_domain to helper functions This is a helper function and when its placed in the helper function section we can remove its forward declaration. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:28 +01:00
Joerg Roedel	94f6d190ee	x86/amd-iommu: Simplify get_device_resources() With the previous changes the get_device_resources function can be simplified even more. The only important information for the callers is the protection domain. This patch renames the function to get_domain() and let it only return the protection domain for a device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:20:21 +01:00
Joerg Roedel	15898bbcb4	x86/amd-iommu: Let domain_for_device handle aliases If there is no domain associated to a device yet and the device has an alias device which already has a domain, the original device needs to have the same domain as the alias device. This patch changes domain_for_device to handle this situation and directly assigns the alias device domain to the device in this situation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:09 +01:00
Joerg Roedel	f3be07da53	x86/amd-iommu: Remove iommu specific handling from dma_ops path This patch finishes the removal of all iommu specific handling code in the dma_ops path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:08 +01:00
Joerg Roedel	cd8c82e875	x86/amd-iommu: Remove iommu parameter from __(un)map_single With the prior changes this parameter is not longer required. This patch removes it from the function and all callers. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:08 +01:00
Joerg Roedel	576175c250	x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs Since the assumption that an dma_ops domain is only bound to one IOMMU was given up we need to make alloc_new_range aware of it. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:17:01 +01:00
Joerg Roedel	680525e06d	x86/amd-iommu: Remove iommu parameter from dma_ops_domain_(un)map The parameter is unused in these function so remove it from the parameter list. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:31 +01:00
Joerg Roedel	f99c0f1c75	x86/amd-iommu: Use check_device in get_device_resources Every call-place of get_device_resources calls check_device before it. So call it from get_device_resources directly and simplify the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:30 +01:00
Joerg Roedel	420aef8a3a	x86/amd-iommu: Use check_device for amd_iommu_dma_supported The check_device logic needs to include the dma_supported checks to be really sure. Merge the dma_supported logic into check_device and use it to implement dma_supported. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:30 +01:00
Joerg Roedel	318afd41d2	x86/amd-iommu: Make np-cache a global flag The non-present cache flag was IOMMU local until now which doesn't make sense. Make this a global flag so we can remove the lase user of 'struct iommu' in the map/unmap path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:29 +01:00
Joerg Roedel	09b4280439	x86/amd-iommu: Reimplement flush_all_domains_on_iommu() This patch reimplements the function flush_all_domains_on_iommu to use the global protection domain list. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:28 +01:00
Joerg Roedel	e3306664eb	x86/amd-iommu: Reimplement amd_iommu_flush_all_domains() This patch reimplementes the amd_iommu_flush_all_domains function to use the global protection domain list instead of flushing every domain on every IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:28 +01:00
Joerg Roedel	aeb26f5533	x86/amd-iommu: Implement protection domain list This patch adds code to keep a global list of all protection domains. This allows to simplify the resume code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:27 +01:00
Joerg Roedel	601367d76b	x86/amd-iommu: Remove iommu_flush_domain function This iommu_flush_tlb_pde function does essentially the same. So the iommu_flush_domain function is redundant and can be removed. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:26 +01:00
Joerg Roedel	dcd1e92e40	x86/amd-iommu: Use __iommu_flush_pages for tlb flushes This patch re-implements iommu_flush_tlb functions to use the __iommu_flush_pages logic. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:26 +01:00
Joerg Roedel	6de8ad9b9e	x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs This patch extends the iommu_flush_pages function to flush the TLB entries on all IOMMUs the domain has devices on. This basically gives up the former assumption that dma_ops domains are only bound to one IOMMU in the system. For dma_ops domains this is still true but not for IOMMU-API managed domains. Giving this assumption up for dma_ops domains too allows code simplification. Further it splits out the main logic into a generic function which can be used by iommu_flush_tlb too. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 14:16:18 +01:00
Joerg Roedel	0518a3a458	x86/amd-iommu: Add function to complete a tlb flush This patch adds a function to the AMD IOMMU driver which completes all queued commands an all IOMMUs a specific domain has devices attached on. This is required in a later patch when per-domain flushing is implemented. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:50 +01:00
Joerg Roedel	c459611424	x86/amd-iommu: Add per IOMMU reference counting This patch adds reference counting for protection domains per IOMMU. This allows a smarter TLB flushing strategy. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:50 +01:00
Joerg Roedel	bb52777ec4	x86/amd-iommu: Add an index field to struct amd_iommu This patch adds an index field to struct amd_iommu which can be used to lookup it up in an array. This index will be used in struct protection_domain to keep track which protection domain has devices behind which IOMMU. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:49 +01:00
Joerg Roedel	bf3118c127	x86/amd-iommu: Update copyright headers This patch updates the copyright headers in the relevant AMD IOMMU driver files to match the date of the latest changes. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:49 +01:00
Joerg Roedel	6a9401a7ac	x86/amd-iommu: Separate internal interface definitions This patch moves all function declarations which are only used inside the driver code to a seperate header file. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-27 11:45:48 +01:00
Frederic Weisbecker	5fa10b28e5	hw-breakpoints: Use struct perf_event_attr to define user breakpoints In-kernel user breakpoints are created using functions in which we pass breakpoint parameters as individual variables: address, length and type. Although it fits well for x86, this just does not scale across archictectures that may support this api later as these may have more or different needs. Pass in a perf_event_attr structure instead because it is meant to evolve as much as possible into a generic hardware breakpoint parameter structure. Reported-by: K.Prasad <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259294154-5197-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-27 06:22:58 +01:00
Xiaotian Feng	dd4377b02d	x86/pat: Trivial: don't create debugfs for memtype if pat is disabled If pat is disabled (boot with nopat), there's no need to create debugfs for it, it's empty all the time. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1259236428-16329-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 15:05:03 +01:00
Jack Steiner	918bc960dc	x86: SGI UV: Map low MMR ranges Explicitly mmap the UV chipset MMR address ranges used to access blade-local registers. Although these same MMRs are also mmaped at higher addresses, the low range is more convenient when accessing blade-local registers. The low range addresses always alias to the local blade regardless of the blade id. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091125162018.GA25445@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:52:36 +01:00
Brian Gerst	8ec6993d9f	x86, 64-bit: Set data segments to null after switching to 64-bit mode This prevents kernel threads from inheriting non-null segment selectors, and causing optimizations in __switch_to() to be ineffective. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Tim Blechmann <tim@klingt.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jan Beulich <JBeulich@novell.com> LKML-Reference: <1259165856-3512-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:44:30 +01:00
Ingo Molnar	64b028b226	x86: Clean up the loadsegment() macro Make it readable in the source too, not just in the assembly output. No change in functionality. Cc: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:38:52 +01:00
Brian Gerst	79b0379cee	x86: Optimize loadsegment() Zero the input register in the exception handler instead of using an extra register to pass in a zero value. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:33:58 +01:00
Hidetoshi Seto	767df1bdd8	x86, mce: Add __cpuinit to hotplug callback functions The mce_disable_cpu() and mce_reenable_cpu() are called only from mce_cpu_callback() which is marked as __cpuinit. So these functions can be __cpuinit too. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4B0E3C4E.4090809@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:29:41 +01:00
Mike Travis	9b3660a55a	x86: Limit number of per cpu TSC sync messages Limit the number of per cpu TSC sync messages by only printing to the console if an error occurs, otherwise print as a DEBUG message. The info message "Skipping synchronization ..." is only printed after the last cpu has booted. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091118002222.181053000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 10:17:45 +01:00
Frederic Weisbecker	2c31b7958f	x86/hw-breakpoints: Don't lose GE flag while disabling a breakpoint When we schedule out a breakpoint from the cpu, we also incidentally remove the "Global exact breakpoint" flag from the breakpoint control register. It makes us losing the fine grained precision about the origin of the instructions that may trigger breakpoint exceptions for the other breakpoints running in this cpu. Reported-by: Prasad <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259211878-6013-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 09:29:22 +01:00
Frederic Weisbecker	605bfaee90	hw-breakpoints: Simplify error handling in breakpoint creation requests This simplifies the error handling when we create a breakpoint. We don't need to check the NULL return value corner case anymore since we have improved perf_event_create_kernel_counter() to always return an error code in the failure case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1259210142-5714-3-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 09:29:21 +01:00
Ilya Loginov	2d4dc890b5	block: add helpers to run flush_dcache_page() against a bio and a request's pages Mtdblock driver doesn't call flush_dcache_page for pages in request. So, this causes problems on architectures where the icache doesn't fill from the dcache or with dcache aliases. The patch fixes this. The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid pointless empty cache-thrashing loops on architectures for which flush_dcache_page() is a no-op. Every architecture was provided with this flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is equal 1 or do nothing otherwise. See "fix mtd_blkdevs problem with caches on some architectures" discussion on LKML for more information. Signed-off-by: Ilya Loginov <isloginov@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Peter Horton <phorton@bitbox.co.uk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:16:19 +01:00
Ingo Molnar	67f2de0bf9	x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks This warning: [ 847.140022] rb_producer D 0000000000000000 5928 519 2 0x00000000 [ 847.203627] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/517 [ 847.207360] caller is show_stack_log_lvl+0x2e/0x241 [ 847.210364] Pid: 517, comm: khungtaskd Not tainted 2.6.32-rc8-tip+ #13761 [ 847.213395] Call Trace: [ 847.215847] [<ffffffff81413bde>] debug_smp_processor_id+0x1f0/0x20a [ 847.216809] [<ffffffff81015eae>] show_stack_log_lvl+0x2e/0x241 [ 847.220027] [<ffffffff81018512>] show_stack+0x1c/0x1e [ 847.223365] [<ffffffff8107b7db>] sched_show_task+0xe4/0xe9 [ 847.226694] [<ffffffff8112f21f>] check_hung_task+0x140/0x199 [ 847.230261] [<ffffffff8112f4a8>] check_hung_uninterruptible_tasks+0x1b7/0x20f [ 847.233371] [<ffffffff8112f500>] ? watchdog+0x0/0x50 [ 847.236683] [<ffffffff8112f54e>] watchdog+0x4e/0x50 [ 847.240034] [<ffffffff810cee56>] kthread+0x97/0x9f [ 847.243372] [<ffffffff81012aea>] child_rip+0xa/0x20 [ 847.246690] [<ffffffff81e43494>] ? restore_args+0x0/0x30 [ 847.250019] [<ffffffff81e43083>] ? _spin_lock+0xe/0x10 [ 847.253351] [<ffffffff810cedbf>] ? kthread+0x0/0x9f [ 847.256833] [<ffffffff81012ae0>] ? child_rip+0x0/0x20 Happens because on preempt-RCU, khungd calls show_stack() with preemption enabled. Make sure we are not preemptible while walking the IRQ and exception stacks on 64-bit. (32-bit stack dumping is preemption safe.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 08:29:10 +01:00
Ingo Molnar	b803090615	x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details Make the initialization more readable, plus tidy up a few small visual details as well. No change in functionality. LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-26 08:24:33 +01:00
Tejun Heo	28b4e0d86a	x86: Rename global percpu symbol dr7 to cpu_dr7 Percpu symbols now occupy the same namespace as other global symbols and as such short global symbols without subsystem prefix tend to collide with local variables. dr7 percpu variable used by x86 was hit by this. Rename it to cpu_dr7. The rename also makes it more consistent with its fellow cpu_debugreg percpu variable. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20091125115856.GA17856@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>	2009-11-25 14:30:04 +01:00
FUJITA Tomonori	273bee27fa	x86: Fix iommu=soft boot option iommu=soft boot option forces the kernel to use swiotlb. ( This has the side-effect of enabling the swiotlb over the GART if this boot option is provided. This is the desired behavior of the swiotlb boot option and works like that for all other hw-IOMMU drivers. ) Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: yinghai@kernel.org LKML-Reference: <20091125084611O.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-25 10:12:51 +01:00
Lin Ming	2263576cfc	ACPICA: Add post-order callback to acpi_walk_namespace The existing interface only has a pre-order callback. This change adds an additional parameter for a post-order callback which will be more useful for bus scans. ACPICA BZ 779. Also update the external calls to acpi_walk_namespace. http://www.acpica.org/bugzilla/show_bug.cgi?id=779 Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2009-11-24 21:31:10 -05:00
Bjorn Helgaas	f6e1d8cc38	x86/PCI: MMCONFIG: add lookup function This patch factors out the search for an MMCONFIG region, which was previously implemented in both mmconfig_32 and mmconfig_64. No functional change. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:36 -08:00
Bjorn Helgaas	8c57786ad3	x86/PCI: MMCONFIG: clean up printks No functional change; just tidy up printks and make them more consistent with the rest of PCI. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:30 -08:00
Bjorn Helgaas	ba2afbabfc	x86/PCI: MMCONFIG: add pci_mmconfig_remove() to remove MMCONFIG region This is only used internally now, but eventually will be used in the hot-remove path to remove the MMCONFIG region associated with a host bridge. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:24 -08:00
Bjorn Helgaas	ff097ddd4a	x86/PCI: MMCONFIG: manage pci_mmcfg_region as a list, not a table This changes pci_mmcfg_region from a table to a list, to make it easier to add and remove MMCONFIG regions for PCI host bridge hotplug. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:14 -08:00
Bjorn Helgaas	987c367b4e	x86/PCI: MMCONFIG: remove typeof so we can use a list This replaces "typeof(pci_mmcfg_config[0])" with the actual type because I plan to convert pci_mmcfg_config to a list, and then "pci_mmcfg_config[0]" won't mean anything. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:08 -08:00
Bjorn Helgaas	3f0f550392	x86/PCI: MMCONFIG: add virtual address to struct pci_mmcfg_region The virtual address is only used for x86_64, but it's so much simpler to manage it as part of the pci_mmcfg_region that I think it's worth wasting a pointer per MMCONFIG region on x86_32. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:30:01 -08:00
Bjorn Helgaas	2f2a8b9c90	x86/PCI: MMCONFIG: trivial is_mmconf_reserved() interface simplification Since pci_mmcfg_region contains the struct resource, no need to pass the pci_mmcfg_region and the resource start/size. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:49 -08:00
Bjorn Helgaas	56ddf4d3cf	x86/PCI: MMCONFIG: add resource to struct pci_mmcfg_region This patch adds a resource and corresponding name to the MMCONFIG structure. This makes allocation simpler (we can allocate the resource and name at the same time we allocate the pci_mmcfg_region), and gives us a way to hang onto the resource after inserting it. This will be needed so we can release and free it when hot-removing a host bridge. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:41 -08:00
Bjorn Helgaas	95cf1cf0c5	x86/PCI: MMCONFIG: use pointer to simplify pci_mmcfg_config[] structure access No functional change, but simplifies a future patch to convert the table to a list. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:34 -08:00
Bjorn Helgaas	d7e6b66fe8	x86/PCI: MMCONFIG: rename pci_mmcfg_region structure members This only renames the struct pci_mmcfg_region members; no functional change. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:24 -08:00
Bjorn Helgaas	d215a9c8b4	x86/PCI: MMCONFIG: use a private structure rather than the ACPI MCFG one This adds a struct pci_mmcfg_region with a little more information than the struct acpi_mcfg_allocation used previously. The acpi_mcfg structure is defined by the spec, so we can't change it. To begin with, struct pci_mmcfg_region is basically the same as the ACPI MCFG version, but future patches will add more information. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:17 -08:00
Bjorn Helgaas	df5eb1d67e	x86/PCI: MMCONFIG: add PCI_MMCFG_BUS_OFFSET() to factor common expression This factors out the common "bus << 20" expression used when computing the MMCONFIG address. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:11 -08:00
Bjorn Helgaas	f7ca698487	x86/PCI: MMCONFIG: reject MMCONFIG apertures at address zero Since all MMCONFIG regions go through pci_mmconfig_add(), we can test the address once there. If the caller supplies an address of zero, we never insert it in the pci_mmcfg_config[] table, so no need to test it elsewhere. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:29:03 -08:00
Bjorn Helgaas	463a5df175	x86/PCI: MMCONFIG: simplify tests for empty pci_mmcfg_config table We never set pci_mmcfg_config unless we increment pci_mmcfg_config_num, so there's no need to test both pci_mmcfg_config_num and pci_mmcfg_config. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:28:56 -08:00
Bjorn Helgaas	7da7d360ae	x86/PCI: MMCONFIG: centralize MCFG structure management This patch encapsulate pci_mmcfg_config[] updates. All alloc/free is now done in pci_mmconfig_add() and free_all_mcfg(), so all updates to pci_mmcfg_config[] and pci_mmcfg_config_num are in those two functions. This replaces the previous sequence of extend_mmcfg(), fill_one_mmcfg() with the single pci_mmconfig_add() interface. This interface is currently static but will eventually be used in the host bridge hot-add path. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:28:51 -08:00
Bjorn Helgaas	d3578ef7aa	x86/PCI: MMCONFIG: step through MCFG table, not pci_mmcfg_config[] Step through the ACPI MCFG table, not pci_mmcfg_config[]. No functional change, but simplifies future patches that encapsulate pci_mmcfg_config[]. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:28:43 -08:00
Bjorn Helgaas	e823d6ff58	x86/PCI: MMCONFIG: count MCFG structures with local variable Use a local variable, not pci_mmcfg_config_num, to count MCFG entries. No functional change, but simplifies future changes. Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:28:37 -08:00
Bjorn Helgaas	5663b1b963	x86/PCI: MMCONFIG: remove unused definitions Reviewed-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:28:30 -08:00
Yinghai Lu	67f241f457	x86/pci: seperate x86_pci_rootbus_res_quirks from amd_bus.c Those functions are used by intel_bus.c so seperate them to another file. and make amd_bus a bit smaller. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:25:59 -08:00
Jiri Kosina	7b7a785942	PCI: fix comment typo in bus_numa.h Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:25:20 -08:00
Alex Chiang	2ed7a806d8	x86/PCI: remove early PCI pr_debug statements commit db635adc turned -DDEBUG for x86/pci on when CONFIG_PCI_DEBUG is set. In general, I agree with that change. However, it exposes a bunch of very low level PCI debugging in the early x86 path, such as: 0 reading 2 from a: ffff 1 reading 2 from a: ffff 2 reading 2 from a: ffff 3 reading 2 from a: 300 3 reading 2 from 0: 1002 3 reading 2 from 2: 515e These statements add a lot of noise to the boot and aren't likely to be necessary even when handling random upstream bug reports. [In contrast, statements such as these: pci 0000:02:04.0: found [14e4:164a] class 000200 header type 00 pci 0000:02:04.0: reg 10: [mem 0xf8000000-0xf9ffffff 64bit] pci 0000:02:04.0: reg 30: [mem 0x00000000-0x0001ffff pref] are indeed useful when remote debugging users' machines] Remove the noisy printks and save electrons everywhere. Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-24 15:25:19 -08:00
Yinghai Lu	5bf65b9ba6	x86, mtrr: Fix sorting of mtrr after subtracting In some cases we can coalesce MTRR entries after cleanup; this may allow us to have more entries. As such, introduce clean_sort_range to to sort and coaelsce the MTRR entries. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B0BB9A3.5020908@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-24 13:06:16 -08:00
Thomas Renninger	e2f74f355e	[ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface This interface is mainly intended (and implemented) for ACPI _PPC BIOS frequency limitations, but other cpufreq drivers can also use it for similar use-cases. Why is this needed: Currently it's not obvious why cpufreq got limited. People see cpufreq/scaling_max_freq reduced, but this could have happened by: - any userspace prog writing to scaling_max_freq - thermal limitations - hardware (_PPC in ACPI case) limitiations Therefore export bios_limit (in kHz) to: - Point the user that it's the BIOS (broken or intended) which limits frequency - Export it as a sysfs interface for userspace progs. While this was a rarely used feature on laptops, there will appear more and more server implemenations providing "Green IT" features like allowing the service processor to limit the frequency. People want to know about HW/BIOS frequency limitations. All ACPI P-state driven cpufreq drivers are covered with this patch: - powernow-k8 - powernow-k7 - acpi-cpufreq Tested with a patched DSDT which limits the first two cores (_PPC returns 1) via _PPC, exposed by bios_limit: # echo 2200000 >cpu2/cpufreq/scaling_max_freq # cat cpu/cpufreq/scaling_max_freq 2600000 2600000 2200000 2200000 # #scaling_max_freq shows general user/thermal/BIOS limitations # cat cpu/cpufreq/bios_limit 2600000 2600000 2800000 2800000 # #bios_limit only shows the HW/BIOS limitation CC: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com> CC: Len Brown <lenb@kernel.org> CC: davej@codemonkey.org.uk CC: linux@dominikbrodowski.net Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:34 -05:00
Rusty Russell	1cce76c2ac	[CPUFREQ] use an enum for speedstep processor identification The "unsigned int processor" everywhere confused Rusty, leading to breakage when he passed in smp_processor_id(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:34 -05:00
Krzysztof Helt	db2820dd54	[CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used Set the transition latency to value smaller than CPUFREQ_ETERNAL so governors other than "performance" work (like the "ondemand" one). The value is found in "AMD PowerNow! Technology Platform Design Guide for Embedded Processors" dated December 2000 (AMD doc #24267A). There is the answer to one of FAQs on page 40 which states that suggested complete transition period is 200 us. Tested on K6-2+ CPU with K6-3 core (model 13, stepping 4). Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:33 -05:00
Rusty Russell	b8cbe7e82e	[CPUFREQ] cpumask: don't put a cpumask on the stack in x86...cpufreq/powernow-k8.c It's still mugging the current process's cpumask, but as comment in `1ff6e97f1d` says, it's not a trivial fix. So, at least we can use a cpumask_var_t to do the Wrong Thing the Right Way :) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: cpufreq@vger.kernel.org Cc: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:33 -05:00
Harald Welte	d77b819745	[CPUFREQ] Enable ACPI PDC handshake for VIA/Centaur CPUs In commit `0de51088e6`, we introduced the use of acpi-cpufreq on VIA/Centaur CPU's by removing a vendor check for VENDOR_INTEL. However, as it turns out, at least the Nano CPU's also need the PDC (processor driver capabilities) handshake in order to activate the methods required for acpi-cpufreq. Since arch_acpi_processor_init_pdc() contains another vendor check for Intel, the PDC is not initialized on VIA CPU's. The resulting behavior of a current mainline kernel on such systems is: acpi-cpufreq loads and it indicates CPU frequency changes. However, the CPU stays at a single frequency This trivial patch ensures that init_intel_pdc() is called on Intel and VIA/Centaur CPU's alike. Signed-off-by: Harald Welte <HaraldWelte@viatech.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:32 -05:00
Stephane Eranian	1261a02a0c	perf_events, x86: Fix validate_event bug The validate_event() was failing on valid event combinations. The function was assuming that if x86_schedule_event() returned 0, it meant error. But x86_schedule_event() returns the counter index and 0 is a perfectly valid value. An error is returned if the function returns a negative value. Furthermore, validate_event() was also failing for event groups because the event->pmu was not set until after hw_perf_event_init(). Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net Cc: eranian@gmail.com LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> -- arch/x86/kernel/cpu/perf_event.c \| 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)	2009-11-24 19:23:48 +01:00
Yinghai Lu	b24c2a925a	x86: Move find_smp_config() earlier and avoid bootmem usage Move the find_smp_config() call to before bootmem is initialized. Use reserve_early() instead of reserve_bootmem() in it. This simplifies the code, we only need to call find_smp_config() once and can remove the now unneeded reserve parameter from x86_init_mpparse::find_smp_config. We thus also reduce x86's dependency on bootmem allocations. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B0BB9F2.70907@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-24 12:10:51 +01:00
H. Peter Anvin	eb41c8be89	x86, platform: Change is_untracked_pat_range() to bool; cleanup init - Change is_untracked_pat_range() to return bool. - Clean up the initialization of is_untracked_pat_range() -- by default, we simply point it at is_ISA_range() directly. - Move is_untracked_pat_range to the end of struct x86_platform, since it is the newest field. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-23 17:09:59 -08:00
H. Peter Anvin	65f116f5f1	x86: Change is_ISA_range() into an inline function Change is_ISA_range() from a macro to an inline function. This makes it type safe, and also allows it to be assigned to a function pointer if necessary. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-23 17:09:59 -08:00
H. Peter Anvin	8a27138924	x86, mm: is_untracked_pat_range() takes a normal semiclosed range is_untracked_pat_range() -- like its components, is_ISA_range() and is_GRU_range(), takes a normal semiclosed interval (>=, <) whereas the PAT code called it as if it took a closed range (>=, <=). Fix. Although this is a bug, I believe it is non-manifest, simply because none of the callers will call this with non-page-aligned addresses. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-23 17:09:59 -08:00
H. Peter Anvin	55a6ca2547	x86, mm: Call is_untracked_pat_range() rather than is_ISA_range() Checkin `fd12a0d69a` made the PAT untracked range a platform configurable, but missed on occurrence of is_ISA_range() which still refers to PAT-untracked memory, and therefore should be using the configurable. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091119202341.GA4420@sgi.com>	2009-11-23 17:09:59 -08:00
Borislav Petkov	27c13ecec4	x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes display_cacheinfo() doesn't display anything anymore and it is used to detect CPU cache sizes. Rename it accordingly. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <20091121130145.GA31357@liondog.tnic> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-23 11:59:53 -08:00
Jack Steiner	fd12a0d69a	x86: UV SGI: Don't track GRU space in PAT GRU space is always mapped as WB in the page table. There is no need to track the mappings in the PAT. This also eliminates the "freeing invalid memtype" messages when the GRU space is unmapped. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091119202341.GA4420@sgi.com> [ v2: fix build failure ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 19:47:42 +01:00
Dimitri Sivanich	581f202bcd	x86: UV RTC: Always enable RTC clocksource Always enable the RTC clocksource on UV systems. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091120214826.GA20016@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 19:41:30 +01:00
Jan Beulich	0444c9bd0c	x86: Tighten conditionals on MCE related statistics irq_thermal_count is only being maintained when X86_THERMAL_VECTOR, and both X86_THERMAL_VECTOR and X86_MCE_THRESHOLD don't need extra wrapping in X86_MCE conditionals. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Yong Wang <yong.y.wang@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <4B06AFA902000078000211F8@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 19:40:06 +01:00
Cliff Wickman	e38e2af1c5	x86: SGI UV: Fix BAU initialization A memory mapped register that affects the SGI UV Broadcast Assist Unit's interrupt handling may sometimes be unintialized. Remove the condition on its initialization, as that condition can be randomly satisfied by a hardware reset. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> LKML-Reference: <E1NBGB9-0005nU-Dp@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 19:12:50 +01:00
K.Prasad	ba6909b719	hw-breakpoint: Attribute authorship of hw-breakpoint related files Attribute authorship to developers of hw-breakpoint related files. Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091123154713.GA5593@in.ibm.com> [ v2: moved it to latest -tip ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 18:18:29 +01:00
Jiri Kosina	68ee87164e	crypto: ghash-clmulni-intel - Put proper .data section in place Lbswap_mask, Lpoly and Ltwo_one should clearly belong to .data section, not .text. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-23 20:19:47 +08:00
Huang Ying	564ec0ec05	crypto: ghash-clmulni-intel - Use gas macro for PCLMULQDQ-NI and PSHUFB Old binutils do not support PCLMULQDQ-NI and PSHUFB, to make kernel can be compiled by them, .byte code is used instead of assembly instructions. But the readability and flexibility of raw .byte code is not good. So corresponding assembly instruction like gas macro is used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-23 19:55:22 +08:00
Joerg Roedel	be83129771	x86/amd-iommu: attach devices to pre-allocated domains early For some devices the ACPI table may define unity map requirements which must me met when the IOMMU is enabled. So we need to attach devices to their domains as early as possible so that these mappings are in place when needed. This patch assigns the domains right after they are allocated. Otherwise this can result in I/O page faults before a driver binds to a device and BIOS is still using it. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-23 12:54:17 +01:00
Huang Ying	b369e52123	crypto: aesni-intel - Use gas macro for AES-NI instructions Old binutils do not support AES-NI instructions, to make kernel can be compiled by them, .byte code is used instead of AES-NI assembly instructions. But the readability and flexibility of raw .byte code is not good. So corresponding assembly instruction like gas macro is used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-23 19:54:06 +08:00
Joerg Roedel	9f800de38b	x86/amd-iommu: un__init iommu_setup_msi This function may be called on the resume path and can not be dropped after booting. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-11-23 12:45:25 +01:00
Jan Beulich	0e7810be30	x86: Suppress stack overrun message for init_task init_task doesn't get its stack end location set to STACK_END_MAGIC, and hence the message is confusing rather than helpful in this case. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4B06AEFE02000078000211F4@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 11:45:34 +01:00
Ingo Molnar	6e3d8330ae	perf events: Do not generate function trace entries in perf code Decreases perf overhead when function tracing is enabled, by about 50%. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 10:19:20 +01:00
Yinghai Lu	d9c2d5ac6a	x86, numa: Use near(er) online node instead of roundrobin for NUMA CPU to node mapping is set via the following sequence: 1. numa_init_array(): Set up roundrobin from cpu to online node 2. init_cpu_to_node(): Set that according to apicid_to_node[] according to srat only handle the node that is online, and leave other cpu on node without ram (aka not online) to still roundrobin. 3. later call srat_detect_node for Intel/AMD, will use first_online node or nearby node. Problem is that setup_per_cpu_areas() is not called between 2 and 3, the per_cpu for cpu on node with ram is on different node, and could put that on node with two hops away. So try to optimize this and add find_near_online_node() and call init_cpu_to_node(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Rientjes <rientjes@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 10:06:24 +01:00
Yinghai Lu	021428ad14	x86, numa, bootmem: Only free bootmem on NUMA failure path In the NUMA bootmem setup failure path we freed nodedata_phys incorrectly. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Rientjes <rientjes@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 10:00:48 +01:00
Yinghai Lu	163d3866cf	x86: apic: Print out SRAT table APIC id in hex Make it consistent with APIC MADT print out, for big systems APIC id in hex is more readable. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 09:56:30 +01:00
Yinghai Lu	37ef2a3029	x86: Re-get cfg_new in case reuse/move irq_desc When irq_desc is moved, we need to make sure to use the right cfg_new. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 09:56:05 +01:00
Yinghai Lu	e670761f12	x86: apic: Remove not needed #ifdef Suresh made dmar_table_init() already have that protection. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 09:54:15 +01:00
Yinghai Lu	44280733e7	x86: Change crash kernel to reserve via reserve_early() use find_e820_area()/reserve_early() instead. -v2: address Eric's request, to restore original semantics. will fail, if the provided address can not be used. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <4B09E2F9.7040403@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-23 09:09:23 +01:00
Ingo Molnar	96200591a3	Merge branch 'tracing/hw-breakpoints' into perf/core Conflicts: arch/x86/kernel/kprobes.c kernel/trace/Makefile Merge reason: hw-breakpoints perf integration is looking good in testing and in reviews, plus conflicts are mounting up - so merge & resolve. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-21 14:07:23 +01:00
Masami Hiramatsu	6f5f67267d	x86: insn decoder test checks objdump version Check objdump version before using it for insn decoder build test, because some older objdump can't decode AVX code correctly. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> LKML-Reference: <20091120171314.6715.30390.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-20 23:01:04 -08:00
Masami Hiramatsu	80509e27e4	x86: Fix insn decoder test typos Fix postest_verbose to posttest_verbose, and add posttest_64bit option for CONFIG_64BIT != y, since old command just passed '-' instead of '-n' when CONFIG_64BIT is not set. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> LKML-Reference: <20091120171307.6715.66099.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-20 22:59:36 -08:00
Thomas Gleixner	746357d6a5	x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage When the kernel is compiled with -pg for tracing GCC 4.4.x inserts stack alignment of a function _before_ the mcount prologue if the -march=pentium-mmx is set and -mtune=generic is not set. This breaks the assumption of the function graph tracer which expects that the mcount prologue push %ebp mov %esp, %ebp is the first stack operation in a function because it needs to modify the function return address on the stack to trap into the tracer before returning to the real caller. The generated code is: push %edi lea 0x8(%esp),%edi and $0xfffffff0,%esp pushl -0x4(%edi) push %ebp mov %esp,%ebp so the tracer modifies the copy of the return address which is stored after the stack alignment and therefor does not trap the return which in turn breaks the call chain logic of the tracer and leads to a kernel panic. Aside of the fact that the generated code is horrible for no good reason other -march -mtune options generate the expected: push %ebp mov %esp,%ebp and $0xfffffff0,%esp which does the same and keeps everything intact. After some experimenting we found out that this problem is restricted to gcc4.4.x and to the following -march settings: i586, pentium, pentium-mmx, k6, k6-2, k6-3, winchip-c6, winchip2, c3, geode By adding -mtune=generic the code generator produces always the expected code. So forcing -mtune=generic when CONFIG_FUNCTION_GRAPH_TRACER=y is not pretty, but at the moment the only way to prevent that the kernel trips over gcc-shrooms induced code madness. Most distro kernels have CONFIG_X86_GENERIC=y anyway which forces -mtune=generic as well so it will not impact those. References: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109 http://lkml.org/lkml/2009/11/19/17 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.0911200206570.24119@localhost.localdomain> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com>, Cc: Jeff Law <law@redhat.com> Cc: gcc@gcc.gnu.org Cc: David Daney <ddaney@caviumnetworks.com> Cc: Andrew Haley <aph@redhat.com> Cc: Richard Guenther <richard.guenther@gmail.com> Cc: stable@kernel.org	2009-11-20 14:06:46 +01:00
Masami Hiramatsu	ce64c62074	x86: Instruction decoder test should generate build warning Since some instructions are not decoded correctly by older versions of objdump, it may cause false positive error in insn decoder posttest. This changes build error of insn decoder test to build warning. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091116230631.5250.41579.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-19 21:40:13 +01:00
David S. Miller	3505d1a9fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sfc/sfe4001.c drivers/net/wireless/libertas/cmd.c drivers/staging/Kconfig drivers/staging/Makefile drivers/staging/rtl8187se/Kconfig drivers/staging/rtl8192e/Kconfig	2009-11-18 22:19:03 -08:00
Jan Beulich	350f8f5631	x86: Eliminate redundant/contradicting cache line size config options Rather than having X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT (with inconsistent defaults), just having the latter suffices as the former can be easily calculated from it. To be consistent, also change X86_INTERNODE_CACHE_BYTES to X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA to account for last level cache line size (which here matters more than L1 cache line size). Finally, make sure the default value for X86_L1_CACHE_SHIFT, when X86_GENERIC is selected, is being seen before that for the individual CPU model options (other than on x86-64, where GENERIC_CPU is part of the choice construct, X86_GENERIC is a separate option on ix86). Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Ravikiran Thirumalai <kiran@scalex86.org> Acked-by: Nick Piggin <npiggin@suse.de> LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-19 04:58:34 +01:00
Thomas Gleixner	070e5c3f99	x86: vmiclock: Fix printk format clockevents.mult became u32. Fix the printk format. Pointed-out-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-11-18 12:31:06 +01:00
Thomas Gleixner	6dbfe5a57d	x86: Fixup last users of irq_chip->typename The typename member of struct irq_chip was kept for migration purposes and is obsolete since more than 2 years. Fix up the leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-11-18 11:45:29 +01:00
Rusty Russell	8dca15e408	[CPUFREQ] speedstep-ich: fix error caused by `394122ab14` "[CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.c" changed the code to mistakenly pass the current cpu as the "processor" argument of speedstep_get_frequency(), whereas it should be the type of the processor. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14340 Based on a patch by Dave Mueller. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Dominik Brodowski <linux@brodo.de> Reported-by: Dave Mueller <dave.mueller@gmx.ch> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:04 -05:00
John Villalovos	293afe44d7	[CPUFREQ] acpi-cpufreq: blacklist Intel 0f68: Fix HT detection and put in notification message Removing the SMT/HT check, since the Errata doesn't mention Hyper-Threading. Adding in a printk, so that the user knows why acpi-cpufreq refuses to load. Also, once system is blacklisted, don't repeat checks to see if blacklisted. This also causes the message to only be printed once, rather than for each CPU. Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:03 -05:00
Roel Kluin	c53614ec17	[CPUFREQ] powernow-k8: Fix test in get_transition_latency() Not makes it a bool before the comparison. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:03 -05:00
Krzysztof Helt	f7f3cad060	[CPUFREQ] longhaul: select Longhaul version 2 for capable CPUs There is a typo in the longhaul detection code so only Longhaul v1 or Longhaul v3 is selected. The Longhaul v2 is not selected even for CPUs which are capable of. Tested on PCChips Giga Pro board. Frequency changes work and the Longhaul v2 detects that the board is not capable of changing CPU voltage. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:03 -05:00
Yinghai Lu	508d85c2c6	x86: When cleaning MTRRs, do not fold WP into UC The current MTRR code treats WP as a form of UC. This really isn't desirable behaviour, except possibly in the case of severe MTRR shortage. Disable this, to allow legitimate uses of WP to remain unmolested. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-17 10:26:53 -08:00
Lin Ming	0696b711e4	timekeeping: Fix clock_gettime vsyscall time warp Since commit `0a544198` "timekeeping: Move NTP adjusted clock multiplier to struct timekeeper" the clock multiplier of vsyscall is updated with the unmodified clock multiplier of the clock source and not with the NTP adjusted multiplier of the timekeeper. This causes user space observerable time warps: new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf Add a new argument "mult" to update_vsyscall() and hand in the timekeeping internal NTP adjusted multiplier. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-11-17 11:52:34 +01:00
Ingo Molnar	a7b63425a4	Merge branch 'perf/core' into perf/probes Resolved merge conflict in tools/perf/Makefile Merge reason: we want to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 10:17:47 +01:00
Andreas Herrmann	8cc2361bd0	x86: ucode-amd: Move family check to microcde_amd.c's init function ... to avoid useless trial to load firmware on systems with unsupported AMD CPUs. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com> Cc: Mike Travis <travis@sgi.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Mohr <andi@lisas.de> Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091117070638.GA27691@alberich.amd.com> [ v2: changed BUG_ON() to WARN_ON() ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 10:10:40 +01:00
Eric W. Biederman	bb9074ff58	Merge commit 'v2.6.32-rc7' Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler gets a small bug fix and the sysctl tree where I am removing all sysctl strategy routines.	2009-11-17 01:01:34 -08:00
Ingo Molnar	123bf0e2ed	x86: gart: Clean up the code a bit Clean up various small stylistic details in the GART code. No functionality changed. Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:57:00 +01:00
FUJITA Tomonori	1f7564ca83	x86: Calgary: Remove unnecessary DMA_ERROR_CODE usage This cleans up iommu_alloc() a bit and removes unnecessary DMA_ERROR_CODE usage. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:53:21 +01:00
FUJITA Tomonori	8fd524b355	x86: Kill bad_dma_address variable This kills bad_dma_address variable, the old mechanism to enable IOMMU drivers to make dma_mapping_error() work in IOMMU's specific way. bad_dma_address variable was introduced to enable IOMMU drivers to make dma_mapping_error() work in IOMMU's specific way. However, it can't handle systems that use both swiotlb and HW IOMMU. SO we introduced dma_map_ops->mapping_error to solve that case. Intel VT-d, GART, and swiotlb already use dma_map_ops->mapping_error. Calgary, AMD IOMMU, and nommu use zero for an error dma address. This adds DMA_ERROR_CODE and converts them to use it (as SPARC and POWER does). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:53:21 +01:00
FUJITA Tomonori	42109197eb	x86: gart: Add own dma_mapping_error function GART IOMMU is the only user of bad_dma_address variable. This patch converts GART to use the newer mechanism, fill in ->mapping_error() in struct dma_map_ops, to make dma_mapping_error() work in IOMMU specific way. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:53:20 +01:00
Ingo Molnar	99f4c9de2b	Merge commit 'v2.6.32-rc7' into core/iommu Merge reason: Add fixes we'll depend on. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:51:07 +01:00
Masami Hiramatsu	35039eb6b1	x86: Show symbol name if insn decoder test failed Show symbol name if insn decoder test find a difference. This will help us to find out where the issue is. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091116230624.5250.49813.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:16:50 +01:00
Masami Hiramatsu	d65ff75fbe	x86: Add verbose option to insn decoder test Add verbose option to insn decoder test. This dumps decoded instruction when building kernel with V=1. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091116230618.5250.18762.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-17 07:16:48 +01:00
H. Peter Anvin	5bd085b5fb	x86: remove "extern" from function prototypes in <asm/proto.h> Function prototypes don't need "extern", and it is generally frowned upon to have them. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-16 13:55:31 -08:00
Kees Cook	4b0f3b81eb	x86, mm: Report state of NX protections during boot It is possible for x86_64 systems to lack the NX bit either due to the hardware lacking support or the BIOS having turned off the CPU capability, so NX status should be reported. Additionally, anyone booting NX-capable CPUs in 32bit mode without PAE will lack NX functionality, so this change provides feedback for that case as well. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1258154897-6770-6-git-send-email-hpa@zytor.com>	2009-11-16 13:44:59 -08:00
H. Peter Anvin	4763ed4d45	x86, mm: Clean up and simplify NX enablement The 32- and 64-bit code used very different mechanisms for enabling NX, but even the 32-bit code was enabling NX in head_32.S if it is available. Furthermore, we had a bewildering collection of tests for the available of NX. This patch: a) merges the 32-bit set_nx() and the 64-bit check_efer() function into a single x86_configure_nx() function. EFER control is left to the head code. b) eliminates the nx_enabled variable entirely. Things that need to test for NX enablement can verify __supported_pte_mask directly, and cpu_has_nx gives the supported status of NX. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Tejun Heo <tj@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Chris Wright <chrisw@sous-sol.org> LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com> Acked-by: Kees Cook <kees.cook@canonical.com>	2009-11-16 13:44:59 -08:00
H. Peter Anvin	583140afb9	x86, pageattr: Make set_memory_(x\|nx) aware of NX support Make set_memory_x/set_memory_nx directly aware of if NX is supported in the system or not, rather than requiring that every caller assesses that support independently. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Tim Starling <tstarling@wikimedia.org> Cc: Hannes Eder <hannes@hanneseder.net> LKML-Reference: <1258154897-6770-4-git-send-email-hpa@zytor.com> Acked-by: Kees Cook <kees.cook@canonical.com>	2009-11-16 13:44:58 -08:00
H. Peter Anvin	a7c4c0d934	x86, sleep: Always save the value of EFER Always save the value of EFER, regardless of the state of NX. Since EFER may not actually exist, use rdmsr_safe() to do so. v2: check the return value from rdmsr_safe() instead of relying on the output values being unchanged on error. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@tuxonice.net> LKML-Reference: <1258154897-6770-3-git-send-email-hpa@zytor.com> Acked-by: Kees Cook <kees.cook@canonical.com>	2009-11-16 13:44:57 -08:00
H. Peter Anvin	8a50e5135a	x86-32: Use symbolic constants, safer CPUID when enabling EFER.NX Use symbolic constants rather than hard-coded values when setting EFER.NX in head_32.S, and do a more rigorous test for the validity of the response when probing for the extended CPUID range. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1258154897-6770-2-git-send-email-hpa@zytor.com> Acked-by: Kees Cook <kees.cook@canonical.com>	2009-11-16 13:44:56 -08:00
Cyrill Gorcunov	e79c65a97c	x86: io-apic: IO-APIC MMIO should not fail on resource insertion If IO-APIC base address is 1K aligned we should not fail on resourse insertion procedure. For this sake we define IO_APIC_SLOT_SIZE constant which should cover all IO-APIC direct accessible registers. An example of a such configuration is there http://marc.info/?l=linux-kernel&m=118114792006520 \| \| Quoting the message \| \| IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 \| IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47 \| IOAPIC[2]: apic_id 4, version 32, address 0xfec80400, GSI 48-71 \| IOAPIC[3]: apic_id 5, version 32, address 0xfec84000, GSI 72-95 \| IOAPIC[4]: apic_id 8, version 32, address 0xfec84400, GSI 96-119 \| Reported-by: "Maciej W. Rozycki" <macro@linux-mips.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20091116151426.GC5653@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-16 16:37:10 +01:00
Frederic Weisbecker	3c93ca00ee	x86: Add missing might_fault() checks to copy_{to,from}_user() On x86-64, copy_[to\|from]_user() rely on assembly routines that never call might_fault(), making us missing various lockdep checks. This doesn't apply to __copy_from,to_user() that explicitly handle these calls, neither is it a problem in x86-32 where copy_to,from_user() rely on the "__" prefixed versions that also call might_fault(). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1258382538-30979-1-git-send-email-fweisbec@gmail.com> [ v2: fix module export ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-16 16:09:52 +01:00
Prarit Bhargava	303fc0870f	x86: AMD Northbridge: Verify NB's node is online Fix panic seen on some IBM and HP systems on 2.6.32-rc6: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8120bf3f>] find_next_bit+0x77/0x9c [...] [<ffffffff8120bbde>] cpumask_next_and+0x2e/0x3b [<ffffffff81225c62>] pci_device_probe+0x8e/0xf5 [<ffffffff812b9be6>] ? driver_sysfs_add+0x47/0x6c [<ffffffff812b9da5>] driver_probe_device+0xd9/0x1f9 [<ffffffff812b9f1d>] __driver_attach+0x58/0x7c [<ffffffff812b9ec5>] ? __driver_attach+0x0/0x7c [<ffffffff812b9298>] bus_for_each_dev+0x54/0x89 [<ffffffff812b9b4f>] driver_attach+0x19/0x1b [<ffffffff812b97ae>] bus_add_driver+0xd3/0x23d [<ffffffff812ba1e7>] driver_register+0x98/0x109 [<ffffffff81225ed0>] __pci_register_driver+0x63/0xd3 [<ffffffff81072776>] ? up_read+0x26/0x2a [<ffffffffa0081000>] ? k8temp_init+0x0/0x20 [k8temp] [<ffffffffa008101e>] k8temp_init+0x1e/0x20 [k8temp] [<ffffffff8100a073>] do_one_initcall+0x6d/0x185 [<ffffffff8108d765>] sys_init_module+0xd3/0x236 [<ffffffff81011ac2>] system_call_fastpath+0x16/0x1b I put in a printk and commented out the set_dev_node() call when and got this output: quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x0 quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x1 quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x2 quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x3 I.e. the issue appears to be that the HW has set val to a valid value, however, the system is only configured for a single node -- 0, the others are offline. Check to see if the node is actually online before setting the numa node for an AMD northbridge in quirk_amd_nb_node(). Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: bhavna.sarathy@amd.com Cc: jbarnes@virtuousgeek.org Cc: andreas.herrmann3@amd.com LKML-Reference: <20091112180933.12532.98685.sendpatchset@prarit.bos.redhat.com> [ v2: clean up the code and add comments ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-16 15:43:05 +01:00
Thomas Gleixner	411462f62a	x86: Fix printk format due to variable type change clockevents.mult became u32. Fix the printk format. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-11-16 11:55:20 +01:00
Hiroshi Shimamoto	62ad33f670	x86: Don't put iommu_shutdown_noop() in init section It causes kernel panic on shutdown or reboot. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> LKML-Reference: <4B00BC8E.50801@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-16 08:58:51 +01:00
Thomas Gleixner	dc186ad741	workqueue: Add debugobjects support Add debugobject support to track the life time of work_structs. While at it, remove duplicate definition of INIT_DELAYED_WORK_ON_STACK(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-11-16 01:09:48 +09:00
Ingo Molnar	39dc78b651	Merge commit 'v2.6.32-rc7' into perf/core Merge reason: pick up perf fixlets Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:50:41 +01:00
Jan Beulich	1472248583	x86-64: __copy_from_user_inatomic() adjustments This v2.6.26 commit: `ad2fc2c`: x86: fix copy_user on x86 rendered __copy_from_user_inatomic() identical to copy_user_generic(), yet didn't make the former just call the latter from an inline function. Furthermore, this v2.6.19 commit: `b885808`: [PATCH] Add proper sparse __user casts to __copy_to_user_inatomic converted the return type of __copy_to_user_inatomic() from unsigned long to int, but didn't do the same to __copy_from_user_inatomic(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: <v.mayatskih@gmail.com> LKML-Reference: <4AFD5778020000780001F8F4@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:29:47 +01:00
FUJITA Tomonori	f4131c6259	x86: Make calgary_iommu_init() static This makes calgary_iommu_init() static and moves it to remove the forward declaration. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: muli@il.ibm.com LKML-Reference: <20091114212603U.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:04:14 +01:00
FUJITA Tomonori	6959450e56	swiotlb: Remove duplicate swiotlb_force extern declarations Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: tony.luck@intel.com LKML-Reference: <1258199198-16657-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:03:10 +01:00
FUJITA Tomonori	94a15564ac	x86: Move iommu_shutdown_noop to x86_init.c iommu_init_noop() is in arch/x86/kernel/x86_init.c but iommu_shutdown_noop() in arch/x86/include/asm/iommu.h. This moves iommu_shutdown_noop() to x86_init.c for consistency. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> LKML-Reference: <1258199198-16657-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:03:10 +01:00
FUJITA Tomonori	a3b28ee109	x86: Set dma_ops to nommu_dma_ops by default We set dma_ops to nommu_dma_ops at two different places for x86_32 and x86_64. This unifies them by setting dma_ops to nommu_dma_ops by default. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> LKML-Reference: <1258199198-16657-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:03:09 +01:00
Ingo Molnar	68efa37df7	hw-breakpoints, x86: Fix modular KVM build This build error: arch/x86/kvm/x86.c:3655: error: implicit declaration of function 'hw_breakpoint_restore' Happens because in the CONFIG_KVM=m case there's no 'CONFIG_KVM' define in the kernel - it's CONFIG_KVM_MODULE in that case. Make the prototype available unconditionally. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1258114575-32655-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-14 15:32:53 +01:00
Ingo Molnar	31c997cac7	x86: Fix cpu_devs[] initialization in early_cpu_init() Yinghai Lu noticed that this commit: `0388423`: x86: Minimise printk spew from per-vendor init code mistakenly left out the initialization of cpu_devs[] in the !PROCESSOR_SELECT case. Fix it. Reported-by: Yinghai Lu <yinghai@kernel.org> Cc: Dave Jones <davej@redhat.com> LKML-Reference: <20091113203000.GA19160@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-14 10:36:50 +01:00
Roland Dreier	b01c845f0f	x86: Remove CPU cache size output for non-Intel too As Dave Jones said about the output in intel_cacheinfo.c: "They aren't useful, and pollute the dmesg output a lot (especially on machines with many cores). Also the same information can be trivially found out from userspace." Give the generic display_cacheinfo() function the same treatment. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Dave Jones <davej@redhat.com> Cc: Mike Travis <travis@sgi.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <adaocn6dp99.fsf_-_@roland-alpha.cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-14 01:51:18 +01:00
Dave Jones	0388423dba	x86: Minimise printk spew from per-vendor init code In the default case where the kernel supports all CPU vendors, we currently print out a bunch of not useful messages on every system. 32-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC 64-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls Given that "what CPUs does the kernel support" isn't useful for the "support everything" case, we can suppress these printk's. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20091113203000.GA19160@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-14 01:18:05 +01:00
Matthew Garrett	d9b263528e	x86, setup: Store the boot cursor state Add a field to store the boot cursor state and implement this for VGA on x86. This can then be used to set the default policy for the boot console. Signed-off-by: Matthew Garrett <mjg@redhat.com> LKML-Reference: <1258142222-16092-1-git-send-email-mjg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-13 14:23:11 -08:00
Dave Jones	15cd8812ab	x86: Remove the CPU cache size printk's They aren't really useful, and they pollute the dmesg output a lot (especially on machines with many cores). Also the same information can be trivially found out from userspace. Reported-by: Mike Travis <travis@sgi.com> Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091112231542.GA7129@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-13 09:14:55 +01:00
Eric W. Biederman	24a065624d	sysctl x86: Remove dead binary sysctl support Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:05:04 -08:00
Hiroshi Shimamoto	db48cccc7c	perf_event, x86: Annotate init functions and data Annotate init functions and data with __init and __initconst. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <4AFB721E.8070203@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-12 09:18:36 +01:00
Hidetoshi Seto	cffd377e58	x86, mce: Fix __init annotations The intel_init_thermal() is called from resume path, so it cannot be marked as __init. OTOH mce_banks_init() is only called from __mcheck_cpu_cap_init() which is marked as __cpuinit, so it can be also marked as __cpuinit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Yong Wang <yong.y.wang@linux.intel.com> LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-12 09:17:11 +01:00
Linus Torvalds	55871bdd03	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: x86/PCI: Adjust GFP mask handling for coherent allocations PCI ASPM: fix oops on root port removal	2009-11-11 11:34:14 -08:00
Linus Torvalds	605f37504f	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd-ucode: Check UCODE_MAGIC before loading the container file x86: Fix error return sequence in __ioremap_caller() x86: Add Phoenix/MSC BIOSes to lowmem corruption list	2009-11-11 11:29:10 -08:00
Yinghai Lu	196cf0d67a	x86: Make sure wakeup trampoline code is below 1MB Instead of using bootmem, try find_e820_area()/reserve_early(), and call acpi_reserve_memory() early, to allocate the wakeup trampoline code area below 1M. This is more reliable, and it also removes a dependency on bootmem. -v2: change function name to acpi_reserve_wakeup_memory(), as suggested by Rafael. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: pm list <linux-pm@lists.linux-foundation.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AFA210B.3020207@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-11 20:14:32 +01:00
Andreas Herrmann	9f15226e75	x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle When switching a CPU offline/online and then doing suspend/resume, ucode is not updated on this CPU. This is due to the microcode_fini_cpu() call which frees uci->mc when setting the CPU offline: static void microcode_fini_cpu_amd(int cpu) { struct ucode_cpu_info uci = ucode_cpu_info + cpu; vfree(uci->mc); uci->mc = NULL; } When the CPU is set online uci->mc is still NULL because no ucode update is required. Finally this prevents ucode update when resuming after suspend: static enum ucode_state microcode_resume_cpu(int cpu) { struct ucode_cpu_info uci = ucode_cpu_info + cpu; if (!uci->mc) return UCODE_NFOUND; ... } Fix is to check whether uci->mc is valid before microcode_resume_cpu() is called. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: dimm <dmitry.adamushko@gmail.com> LKML-Reference: <20091111190329.GF18592@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-11 20:06:54 +01:00
FUJITA Tomonori	b18485e7ac	swiotlb: Remove the swiotlb variable usage POWERPC doesn't expect it to be used. This fixes the linux-next build failure reported by Stephen Rothwell: lib/swiotlb.c: In function 'setup_io_tlb_npages': lib/swiotlb.c:114: error: 'swiotlb' undeclared (first use in this function) Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: peterz@infradead.org LKML-Reference: <20091112000258F.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-11 16:51:18 +01:00
Yong Wang	ce6b5d768c	x86: Mark the thermal init functions __init Mark the thermal init functions __init so that the init memory can be freed. Signed-off-by: Yong Wang <yong.y.wang@intel.com> LKML-Reference: <20091111075125.GA17900@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-11 12:33:32 +01:00
Randy Dunlap	e84446de5c	x86 VSDO: Fix Kconfig help COMPAT_VDSO has 2 help text blocks, but kconfig only uses the last one found, so merge the 2 blocks. It would be real nice if kconfig would warn about this. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <4AF9FB6C.70003@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-11 07:26:41 +01:00
Dimitri Sivanich	200a9ae280	x86: Remove asm/apicnum.h arch/x86/include/asm/apicnum.h is not referenced anywhere anymore. Its definitions appear in apicdef.h. Remove it. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Mike Travis <travis@sgi.com> LKML-Reference: <20091110195835.GA4393@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 22:07:35 +01:00
Dave Jones	e02e0e1a13	x86: Fix typo in Intel CPU cache size descriptor I double-checked the datasheet. One of the existing descriptors has a typo: it should be 2MB not 2038 KB. Signed-off-by: Dave Jones <davej@redhat.com> Cc: <stable@kernel.org> # .3x.x: `85160b9`: x86: Add new Intel CPU cache size descriptors Cc: <stable@kernel.org> # .3x.x LKML-Reference: <20091110200120.GA27090@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 21:52:32 +01:00
Dave Jones	85160b92fb	x86: Add new Intel CPU cache size descriptors The latest rev of Intel doc AP-485 details new cache descriptors that we don't yet support. 12MB, 18MB and 24MB 24-way assoc L3 caches. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20091110184924.GA20337@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 20:06:16 +01:00
Ingo Molnar	b4941a9a60	x86: Add iommu_init to x86_init_ops, fix build Most of the time x86_init.h is included in pci-dma.c - but not always, leading to this rare build failure: arch/x86/kernel/pci-dma.c:296: error: 'x86_init' undeclared (first use in this function) So include asm/x86_init.h explicitly. Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 14:37:58 +01:00
FUJITA Tomonori	72d03802b8	x86, 32-bit: Fix swiotlb boot crash Ingo Molnar reported this boot crash: [ 8.655620] pata_amd 0000:00:06.0: version 0.4.1 [ 8.660286] BUG: unable to handle kernel NULL pointer dereference at 00000034 [ 8.663572] IP: [<c100617b>] dma_supported+0x3b/0xa4 [ 8.663572] *pde = 00000000 Initialize dma_ops properly in the 32-bit case. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 14:11:32 +01:00
FUJITA Tomonori	75f1cdf1dd	x86: Handle HW IOMMU initialization failure gracefully If HW IOMMU initialization fails (Intel VT-d often does this, typically due to BIOS bugs), we fall back to nommu. It doesn't work for the majority since nowadays we have more than 4GB memory so we must use swiotlb instead of nommu. The problem is that it's too late to initialize swiotlb when HW IOMMU initialization fails. We need to allocate swiotlb memory earlier from bootmem allocator. Chris explained the issue in detail: http://marc.info/?l=linux-kernel&m=125657444317079&w=2 The current x86 IOMMU initialization sequence is too complicated and handling the above issue makes it more hacky. This patch changes x86 IOMMU initialization sequence to handle the above issue cleanly. The new x86 IOMMU initialization sequence are: 1. we initialize the swiotlb (and setting swiotlb to 1) in the case of (max_pfn > MAX_DMA32_PFN && !no_iommu). dma_ops is set to swiotlb_dma_ops or nommu_dma_ops. if swiotlb usage is forced by the boot option, we finish here. 2. we call the detection functions of all the IOMMUs 3. the detection function sets x86_init.iommu.iommu_init to the IOMMU initialization function (so we can avoid calling the initialization functions of all the IOMMUs needlessly). 4. if the IOMMU initialization function doesn't need to swiotlb then sets swiotlb to zero (e.g. the initialization is sucessful). 5. if we find that swiotlb is set to zero, we free swiotlb resource. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-10-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:32:07 +01:00
FUJITA Tomonori	ad32e8cb86	swiotlb: Defer swiotlb init printing, export swiotlb_print_info() This enables us to avoid printing swiotlb memory info when we initialize swiotlb. After swiotlb initialization, we could find that we don't need swiotlb. This patch removes the code to print swiotlb memory info in swiotlb_init() and exports the function to do that. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: tony.luck@intel.com Cc: benh@kernel.crashing.org LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: merge up conflict ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:32:00 +01:00
FUJITA Tomonori	9d5ce73a64	x86: intel-iommu: Convert detect_intel_iommu to use iommu_init hook This changes detect_intel_iommu() to set intel_iommu_init() to iommu_init hook if detect_intel_iommu() finds the IOMMU. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-6-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: build fix for the !CONFIG_DMAR case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:31:36 +01:00
FUJITA Tomonori	ea1b0d3945	x86: amd_iommu: Convert amd_iommu_detect() to use iommu_init hook This changes amd_iommu_detect() to set amd_iommu_init to iommu_init hook if amd_iommu_detect() finds the AMD IOMMU. We can kill the code to check if we found the IOMMU in amd_iommu_init() since amd_iommu_detect() sets amd_iommu_init() only when it found the IOMMU. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-5-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:31:30 +01:00
FUJITA Tomonori	de957628ce	x86: GART: Convert gart_iommu_hole_init() to use iommu_init hook This changes gart_iommu_hole_init() to set gart_iommu_init() to iommu_init hook if gart_iommu_hole_init() finds the GART IOMMU. We can kill the code to check if we found the IOMMU in gart_iommu_init() since gart_iommu_hole_init() sets gart_iommu_init() only when it found the IOMMU. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:31:23 +01:00
FUJITA Tomonori	d7b9f7be21	x86: Calgary: Convert detect_calgary() to use iommu_init hook This changes detect_calgary() to set init_calgary() to iommu_init hook if detect_calgary() finds the Calgary IOMMU. We can kill the code to check if we found the IOMMU in init_calgary() since detect_calgary() sets init_calgary() only when it found the IOMMU. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com LKML-Reference: <1257849980-22640-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:31:15 +01:00
FUJITA Tomonori	d07c1be069	x86: Add iommu_init to x86_init_ops We call the detections functions of all the IOMMUs then all their initialization functions. The latter is pointless since we don't detect multiple different IOMMUs. What we need to do is calling the initialization function of the detected IOMMU. This adds iommu_init hook to x86_init_ops so if an IOMMU detection function can set its initialization function to the hook. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:31:07 +01:00
Andreas Herrmann	1a74357066	x86: ucode-amd: Convert printk(KERN_...) to pr_(...) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: dimm <dmitry.adamushko@gmail.com> LKML-Reference: <20091110110920.GJ30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:15:50 +01:00
Andreas Herrmann	14c569425a	x86: ucode-amd: Don't warn when no ucode is available for a CPU revision There is no point in warning when there is no ucode available for a specific CPU revision. Currently the container-file, which provides the AMD ucode patches for OS load, contains only a few ucode patches. It's already clearly indicated by the printed patch_level whenever new ucode was available and an update happened. So the warning message is of no help but rather annoying on systems with many CPUs. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: dimm <dmitry.adamushko@gmail.com> LKML-Reference: <20091110110825.GI30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:15:49 +01:00
Andreas Herrmann	d1c84f79a6	x86: ucode-amd: Load ucode-patches once and not separately of each CPU This also implies that corresponding log messages, e.g. platform microcode: firmware: requesting amd-ucode/microcode_amd.bin show up only once on module load and not when ucode is updated for each CPU. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: dimm <dmitry.adamushko@gmail.com> LKML-Reference: <20091110110723.GH30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:15:48 +01:00
Frederic Weisbecker	59d8eb53ea	hw-breakpoints: Wrap in the KVM breakpoint active state check Wrap in the cpu dr7 check that tells if we have active breakpoints that need to be restored in the cpu. This wrapper makes the check more self-explainable and also reusable for any further other uses. Reported-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>	2009-11-10 11:23:43 +01:00
Frederic Weisbecker	9f6b3c2c30	hw-breakpoints: Fix broken a.out format dump Fix the broken a.out format dump. For now we only dump the ptrace breakpoints. TODO: Dump every perf breakpoints for the current thread, not only ptrace based ones. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>	2009-11-10 11:23:05 +01:00
Xiaotian Feng	2fb8f4e6a8	x86: pat: Remove ioremap_default() Commit: `b6ff32d`: x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype() consolidated reserve_memtype() and pat_x_mtrr_type, this made ioremap_default() same as ioremap_cache(). Remove the redundant function and change the only caller to use ioremap_cache. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <1257845005-7938-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 10:34:05 +01:00
Xiaotian Feng	83ea05ea69	x86: pat: Clean up req_type special case for reserve_memtype() Commit: `b6ff32d`: x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype() consolidated code in pat_x_mtrr_type() and reserve_memtype(), which removed the special case (req_type is -1) for the PAT-enabled part. We should also change comments and the PAT-disabled part. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <1257844987-7906-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 10:34:04 +01:00
Joe Perches	41855b7754	x86: GART: pci-gart_64.c: Use correct length in strncmp Signed-off-by: Joe Perches <joe@perches.com> Cc: <stable@kernel.org> # .3x.x LKML-Reference: <1257818330.12852.72.camel@Joe-Laptop.home> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 06:05:39 +01:00
Yong Wang	a2202aa292	x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value On platforms where the BIOS handles the thermal monitor interrupt, APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS must not touch it. Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all the LVT entries except the mask bit. Essentially this results in all LVT entries including the thermal monitoring interrupt set to masked (clearing the bios programmed value for APIC_LVTTHMR). And this leads to kernel take over the thermal monitoring interrupt on AP's but not on BSP (leaving the bios programmed value only on BSP). As a result of this, we have seen system hangs when the thermal monitoring interrupt is generated. Fix this by reading the initial value of thermal LVT entry on BSP and if bios has taken over the control, then program the same value on all AP's and leave the thermal monitoring interrupt control on all the logical cpu's to the bios. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org	2009-11-10 05:57:55 +01:00
Cyrill Gorcunov	7abc075313	x86: apic: Do not use stacked physid_mask_t We should not use physid_mask_t as a stack based variable in apic code. This type depends on MAX_APICS parameter which may be huge enough. Especially it became a problem with apic NOOP driver which is portable between 32 bit and 64 bit environment (where we have really huge MAX_APICS). So apic driver should operate with pointers and a caller in turn should aware of allocation physid_mask_t variable. As a side (but positive) effect -- we may use already implemented physid_set_mask_of_physid function eliminating default_apicid_to_cpu_present completely. Note that physids_coerce and physids_promote turned into static inline from macro (since macro hides the fact that parameter is being interpreted as unsigned long, make it explicit). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091109220659.GA5568@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 05:52:07 +01:00
Andreas Herrmann	6e18da75c2	x86, amd-ucode: Remove needless log messages Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091029134742.GD30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 05:49:28 +01:00
Borislav Petkov	506f90eeae	x86, amd-ucode: Check UCODE_MAGIC before loading the container file Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091029134552.GC30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 05:46:09 +01:00
Huang Ying	fd650a6394	x86: Generate .byte code for some new instructions via gas macro It will take some time for binutils (gas) to support some newly added instructions, such as SSE4.1 instructions or the AES-NI instructions found in upcoming Intel CPU. To make the source code can be compiled by old binutils, .byte code is used instead of the assembly instruction. But the readability and flexibility of raw .byte code is not good. This patch solves the issue of raw .byte code via generating it via assembly instruction like gas macro. The syntax is as close as possible to real assembly instruction. Some helper macros such as MODRM is not a full feature implementation. It can be extended when necessary. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-09 13:52:26 -05:00
Cyrill Gorcunov	f4a70c5537	x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit In fact it's never get used on x86-64 (for 64 bit platform we use differ technique to enumerate io-units). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091108131645.GD5300@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 19:46:17 +01:00
Cyrill Gorcunov	4343fe1024	x86, ioapic: Use snrpintf while set names for IO-APIC resourses We should be ready that one day MAX_IO_APICS may raise its number. To prevent memory overwrite we're to use safe snprintf while set IO-APIC resourse name. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20091108155431.GC25940@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 17:06:23 +01:00
Cyrill Gorcunov	46dc281b1b	x86, apic: Use PAGE_SIZE instead of numbers The whole page is reserved for IO-APIC fixmap due to non-cacheable requirement. So lets note this explicitly instead of playing with numbers. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> LKML-Reference: <20091108155356.GB25940@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 17:06:22 +01:00
Jan Beulich	eb647138ac	x86/PCI: Adjust GFP mask handling for coherent allocations Rather than forcing GFP flags and DMA mask to be inconsistent, GFP flags should be determined even for the fallback device through dma_alloc_coherent_mask()/dma_alloc_coherent_gfp_flags(). This restores 64-bit behavior as it was prior to commits `8965eb1938` and `4a367f3a9d` (not sure why there are two of them), where GFP_DMA was forced on for 32-bit, but not for 64-bit, with the slight adjustment that afaict even 32-bit doesn't need this without CONFIG_ISA. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Takashi Iwai <tiwai@suse.de> LKML-Reference: <4AF18187020000780001D8AA@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-08 07:44:30 -08:00
Frederic Weisbecker	24f1e32c60	hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events This patch rebase the implementation of the breakpoints API on top of perf events instances. Each breakpoints are now perf events that handle the register scheduling, thread/cpu attachment, etc.. The new layering is now made as follows: ptrace kgdb ftrace perf syscall \ \| / / \ \| / / / Core breakpoint API / / \| / \| / Breakpoints perf events \| \| Breakpoints PMU ---- Debug Register constraints handling (Part of core breakpoint API) \| \| Hardware debug registers Reasons of this rewrite: - Use the centralized/optimized pmu registers scheduling, implying an easier arch integration - More powerful register handling: perf attributes (pinned/flexible events, exclusive/non-exclusive, tunable period, etc...) Impact: - New perf ABI: the hardware breakpoints counters - Ptrace breakpoints setting remains tricky and still needs some per thread breakpoints references. Todo (in the order): - Support breakpoints perf counter events for perf tools (ie: implement perf_bpcounter_event()) - Support from perf tools Changes in v2: - Follow the perf "event " rename - The ptrace regression have been fixed (ptrace breakpoint perf events weren't released when a task ended) - Drop the struct hw_breakpoint and store generic fields in perf_event_attr. - Separate core and arch specific headers, drop asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h - Use new generic len/type for breakpoint - Handle off case: when breakpoints api is not supported by an arch Changes in v3: - Fix broken CONFIG_KVM, we need to propagate the breakpoint api changes to kvm when we exit the guest and restore the bp registers to the host. Changes in v4: - Drop the hw_breakpoint_restore() stub as it is only used by KVM - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a module - Restore the breakpoints unconditionally on kvm guest exit: TIF_DEBUG_THREAD doesn't anymore cover every cases of running breakpoints and vcpu->arch.switch_db_regs might not always be set when the guest used debug registers. (Waiting for a reliable optimization) Changes in v5: - Split-up the asm-generic/hw-breakpoint.h moving to linux/hw_breakpoint.h into a separate patch - Optimize the breakpoints restoring while switching from kvm guest to host. We only want to restore the state if we have active breakpoints to the host, otherwise we don't care about messed-up address registers. - Add asm/hw_breakpoint.h to Kbuild - Fix bad breakpoint type in trace_selftest.c Changes in v6: - Fix wrong header inclusion in trace.h (triggered a build error with CONFIG_FTRACE_SELFTEST Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jan Kiszka <jan.kiszka@web.de> Cc: Jiri Slaby <jirislaby@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Avi Kivity <avi@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org>	2009-11-08 15:34:42 +01:00
Tejun Heo	2ae8bb75db	x86: Fix iommu=nodac parameter handling iommu=nodac should forbid dac instead of enabling it. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Matteo Frigo <athena@fftw.org> Cc: <stable@kernel.org> # .32.x and older LKML-Reference: <4AE5B52A.4050408@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 13:19:05 +01:00
FUJITA Tomonori	338bac527e	x86: Use x86_platform for iommu_shutdown This patch cleans up pci_iommu_shutdown() a bit to use x86_platform (similar to how IA64 initializes an IOMMU driver). This adds iommu_shutdown() to x86_platform to avoid calling every IOMMUs' shutdown functions in pci_iommu_shutdown() in order. The IOMMU shutdown functions are platform specific (we don't have multiple different IOMMU hardware) so the current way is pointless. An IOMMU driver sets x86_platform.iommu_shutdown to the shutdown function if necessary. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: joerg.roedel@amd.com LKML-Reference: <20091027163358F.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 13:12:26 +01:00
Randy Dunlap	0420101c07	x86: k8.h: Add struct bootnode k8.h uses struct bootnode but does not #include a header file for it, so provide a simple declaration for it. arch/x86/include/asm/k8.h:13: warning: 'struct bootnode' declared inside parameter list arch/x86/include/asm/k8.h:13: warning: its scope is only this definition or declaration, which is probably not what you want Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091028160955.d27ccb16.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 12:55:38 +01:00
Xiaotian Feng	de2a47cf2b	x86: Fix error return sequence in __ioremap_caller() kernel missed to free memtype if get_vm_area_caller failed in __ioremap_caller. This patch introduces error path to fix this and cleans up the repetitive error return sequences that contributed to the creation of the bug. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1257389031-20429-1-git-send-email-dfeng@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 12:48:58 +01:00
Rusty Russell	0d0fbbddcc	x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t This makes the declarations match the definitions, which already use 'struct cpumask'. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <200911052245.41803.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 11:58:38 +01:00
Masami Hiramatsu	c12a229bc5	x86: Remove unused thread_return label from switch_to() Remove unused thread_return label from switch_to() macro on x86-64. Since this symbol cuts into schedule(), backtrace at the latter half of schedule() was always shown as thread_return(). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20091105160359.5181.26225.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-08 11:57:13 +01:00
Simon Kagstrom	f1b291d4c4	x86: Add Phoenix/MSC BIOSes to lowmem corruption list We have a board with a Phoenix/MSC BIOS which also corrupts the low 64KB of RAM, so add an entry to the table. Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net> LKML-Reference: <20091106154404.002648d9@marrow.netinsight.se> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-11-06 14:49:39 -08:00
Bjorn Helgaas	ea7f1b6ee9	x86/PCI: remove 64-bit division The roundup() caused a build error (undefined reference to `__udivdi3'). We're aligning to power-of-two boundaries, so it's simpler to just use ALIGN() anyway, which avoids the division. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-06 13:59:34 -08:00
Eric W. Biederman	c3359fbce4	sysctl: x86 Use the compat_sys_sysctl Now that we have a generic 32bit compatibility implementation there is no need for x86 to implement it's own. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-06 03:53:58 -08:00
Frederic Weisbecker	2da3e160cb	hw-breakpoint: Move asm-generic/hw_breakpoint.h to linux/hw_breakpoint.h We plan to make the breakpoints parameters generic among architectures. For that it's better to move the asm-generic header to a generic linux header. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-11-05 23:48:01 +01:00
Linus Torvalds	7c9abfb884	Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: get_tss_base_addr() should return a gpa_t KVM: x86: Catch potential overrun in MCE setup	2009-11-05 13:24:15 -08:00
Chris Lalancette	2c75910f1a	x86: Make sure get_user_desc() doesn't sign extend. The current implementation of get_user_desc() sign extends the return value because of integer promotion rules. For the most part, this doesn't matter, because the top bit of base2 is usually 0. If, however, that bit is 1, then the entire value will be 0xffff... which is probably not what the caller intended. This patch casts the entire thing to unsigned before returning, which generates almost the same assembly as the current code but replaces the final "cltq" (sign extend) with a "mov %eax %eax" (zero-extend). This fixes booting certain guests under KVM. Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-05 13:22:18 -08:00
Linus Torvalds	9a6fc8d0f8	Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: mask extended topology info in cpuid xen/hvc: make sure console output is always emitted, with explicit polling	2009-11-05 10:58:07 -08:00
Linus Torvalds	608221fdf9	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix kthread_bind() by moving the body of kthread_bind() to sched.c sched: Disable SD_PREFER_LOCAL at node level sched: Fix boot crash by zalloc()ing most of the cpu masks sched: Strengthen buddies and mitigate buddy induced latencies	2009-11-05 10:56:47 -08:00
Linus Torvalds	411094acb7	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, fs: Fix x86 procfs stack information for threads on 64-bit x86: Add reboot quirk for 3 series Mac mini x86: Fix printk message typo in mtrr cleanup code dma-debug: Fix compile warning with PAE enabled x86/amd-iommu: Un__init function required on shutdown x86/amd-iommu: Workaround for erratum 63	2009-11-05 10:54:08 -08:00
Bjorn Helgaas	03db42adfe	x86/PCI: fix bogus host bridge window start/end alignment from _CRS PCI device BARs are guaranteed to start and end on at least a four-byte (I/O) or a sixteen-byte (MMIO) boundary because they're aligned on their size and the low BAR bits are reserved. PCI-to-PCI bridge apertures have even larger alignment restrictions. However, some BIOSes (e.g., HP DL360 BIOS P31) report host bridge windows like "[io 0x0000-0x2cfe]". This is wrong because it excludes the last port at 0x2cff: it's impossible for a downstream device to claim 0x2cfe without also claiming 0x2cff. In fact, this BIOS configures a device behind the bridge to "[io 0x2c00-0x2cff]", so we know the window actually does include 0x2cff. This patch rounds the start and end of apertures to the appropriate boundary. I experimentally determined that Windows contains a similar workaround; details here: http://bugzilla.kernel.org/show_bug.cgi?id=14337 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:46 -08:00
Bjorn Helgaas	f1db6fde09	x86/PCI: for debuggability, show host bridge windows even when ignoring _CRS We have occasional problems with PCI resource allocation, and sometimes they could be avoided by paying attention to what ACPI tells us about the host bridges. This patch doesn't change the behavior, but it prints window information that should make debugging easier. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:45 -08:00
Bjorn Helgaas	865df576e8	PCI: improve discovery/configuration messages This makes PCI resource management messages more consistent and adds a few new messages to aid debugging. Whenever we assign resources to a device, update a BAR, or change a bridge aperture, it's worth noting it. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:44 -08:00
Bjorn Helgaas	2a6bed8301	x86/PCI: print domain:bus in conventional format Use the dev_printk-like "%04x:%02x" format for printing PCI bus numbers. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:43 -08:00
Bjorn Helgaas	c7dabef8a2	vsprintf: use %pR, %pr instead of %pRt, %pRf Jesse accidentally applied v1 [1] of the patchset instead of v2 [2]. This is the diff between v1 and v2. The changes in this patch are: - tidied vsprintf stack buffer to shrink and compute size more accurately - use %pR for decoding and %pr for "raw" (with type and flags) instead of adding %pRt and %pRf [1] http://lkml.org/lkml/2009/10/6/491 [2] http://lkml.org/lkml/2009/10/13/441 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:41 -08:00
Bjorn Helgaas	af5a8ee054	x86/PCI: use -DDEBUG when CONFIG_PCI_DEBUG set We use dev_dbg() in arch/x86/pci, but there's no easy way to turn it on. Add -DDEBUG when CONFIG_PCI_DEBUG=y, just like we do in drivers/pci. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:27 -08:00
Jeremy Fitzhardinge	1ccbf5344c	xen: move Xen-testing predicates to common header Move xen_domain and related tests out of asm-x86 to xen/xen.h so they can be included whenever they are necessary. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:24 -08:00
Bjorn Helgaas	9a08f7d350	x86/PCI: allow MMCONFIG above 4GB The current whitelist requires a kernel change for every machine that has MMCONFIG regions above 4GB, even if BIOS provides a correct MCFG table. This patch expands the whitelist to include machines with a rev 1 or newer MCFG table and a DMI_BIOS_DATE of 2010 or later. That way, we only need kernel changes for new machines that provide incorrect MCFG tables. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> CC: Matthew Wilcox <willy@linux.intel.com> CC: John Keller <jpk@sgi.com> CC: Yinghai Lu <yhlu.kernel@gmail.com> CC: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> CC: Andi Kleen <andi@firstfloor.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:22 -08:00
Suresh Siddha	2992e545ea	x86/PCI/PAT: return EINVAL for pci mmap WC request for !pat_enabled Thomas Schlichter reported: > X.org uses libpciaccess which tries to mmap with write combining enabled via > /sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the > kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded > mapping with write combining enabled and does not set up suited MTRR entries. > ;-( Instead of silently mapping pci mmap region as UC minus in the case of !pat_enabled and wc request, we can return error. Eric Anholt mentioned that caller (like X) typically follows up with UC minus pci mmap request and if there is a free mtrr slot, caller will manage adding WC mtrr. Jesse Barnes says: > Older versions of libpciaccess will behave better if we do it that way > (iirc it only allocates an MTRR if the resource_wc file doesn't exist or > fails to get mapped). Reported-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:22 -08:00
Bjorn Helgaas	42887b29ce	x86/PCI: print resources consistently with %pRt This uses %pRt to print additional resource information (type, size, prefetchability, etc.) consistently. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:18 -08:00
Dave Jones	76b1a87b21	x86/PCI: Use generic cacheline sizing instead of per-vendor tests. Instead of the PCI code needing to have code to determine the cacheline size of each processor, use the data the cpu identification code should have already determined during early boot. (The vendor checks are also incomplete, and don't take into account modern CPUs) I've been carrying a variant of this code in Fedora for a while, that prints debug information. There are a number of cases where we are currently setting the PCI cacheline size to 32 bytes, when the CPU cacheline size is 64 bytes. With this patch, we set them both the same. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:12 -08:00
Jesse Barnes	ac1aa47b13	PCI: determine CLS more intelligently Till now, CLS has been determined either by arch code or as L1_CACHE_BYTES. Only x86 and ia64 set CLS explicitly and x86 doesn't always get it right. On most configurations, the chance is that firmware configures the correct value during boot. This patch makes pci_init() determine CLS by looking at what firmware has configured. It scans all devices and if all non-zero values agree, the value is used. If none is configured or there is a disagreement, pci_dfl_cache_line_size is used. arch can set the dfl value (via PCI_CACHE_LINE_BYTES or pci_dfl_cache_line_size) or override the actual one. ia64, x86 and sparc64 updated to set the default cls instead of the actual one. While at it, declare pci_cache_line_size and pci_dfl_cache_line_size in pci.h and drop private declarations from arch code. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Miller <davem@davemloft.net> Acked-by: Greg KH <gregkh@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:10 -08:00
Yinghai Lu	99935a7a59	x86/PCI: read root resources from IOH on Intel For intel systems with multi IOH, we should read peer root resources directly from PCI config space, and don't trust _CRS. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:09 -08:00
Gleb Natapov	abb3911965	KVM: get_tss_base_addr() should return a gpa_t If TSS we are switching to resides in high memory task switch will fail since address will be truncated. Windows2k3 does this sometimes when running with more then 4G Cc: stable@kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-11-04 12:42:36 -02:00
Jan Kiszka	a9e38c3e01	KVM: x86: Catch potential overrun in MCE setup We only allocate memory for 32 MCE banks (KVM_MAX_MCE_BANKS) but we allow user space to fill up to 255 on setup (mcg_cap & 0xff), corrupting kernel memory. Catch these overflows. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-11-04 12:42:35 -02:00
Stefani Seibold	89240ba059	x86, fs: Fix x86 procfs stack information for threads on 64-bit This patch fixes two issues in the procfs stack information on x86-64 linux. The 32 bit loader compat_do_execve did not store stack start. (this was figured out by Alexey Dobriyan). The stack information on a x64_64 kernel always shows 0 kbyte stack usage, because of a missing implementation of the KSTK_ESP macro which always returned -1. The new implementation now returns the right value. Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1257240160.4889.24.camel@wall-e> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:25:03 +01:00
Rusty Russell	d7d3756c5b	cpumask: Use modern cpumask style in Xen Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> LKML-Reference: <200911031458.38406.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:19:50 +01:00
Rusty Russell	6ac5c5310c	cpumask: Use modern cpumask style in arch/x86/kernel/cpu/mcheck/mce-inject.c Note that there's no freeing the cpu var, since this module has no unload function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> LKML-Reference: <200911031458.30987.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:19:01 +01:00
Rusty Russell	ce7c42710e	cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c Ingo wants the certainty of a static cpumask (rather than a cpumask_var_t), but cpumask_t will some day be undefined to avoid on-stack declarations. This is what DECLARE_BITMAP/to_cpumask() is for. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <200911031453.52394.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:17:53 +01:00
Hiroshi Shimamoto	09879b99d4	x86: Gitignore: arch/x86/lib/inat-tables.c Ignore generated file arch/x86/lib/inat-tables.c. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4AF0FBD7.7000501@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 13:11:28 +01:00
Ingo Molnar	a2e7127153	Merge commit 'v2.6.32-rc6' into perf/core Conflicts: tools/perf/Makefile Merge reason: Resolve the conflict, merge to upstream and merge in perf fixes so we can add a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-04 11:59:45 +01:00
Brian Gerst	97829de5a3	x86, 64-bit: Fix bstep_iret jump This jump should be unconditional. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1257274925-15713-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 20:50:02 +01:00
Jeremy Fitzhardinge	82d6469916	xen: mask extended topology info in cpuid A Xen guest never needs to know about extended topology, and knowing would just confuse it. This patch just zeros ebx in leaf 0xb which indicates no topology info, preventing a crash under Xen on cpus which support this leaf. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org>	2009-11-03 11:09:12 -08:00
Paul Mundt	41a48d14f6	x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread(). flush_thread() tries to do a TIF_DEBUG check before calling in to flush_thread_hw_breakpoint() (which subsequently clears the thread flag), but for some reason, the x86 code is manually clearing TIF_DEBUG immediately before the test, so this path will never be taken. This kills off the erroneous clear_tsk_thread_flag() and lets flush_thread_hw_breakpoint() actually get invoked. Presumably folks were getting lucky with testing and the free_thread_info() -> free_thread_xstate() path was taking care of the flush there. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: "K.Prasad" <prasad@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alan Stern <stern@rowland.harvard.edu> LKML-Reference: <20091005102306.GA7889@linux-sh.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-11-03 18:05:44 +01:00
Huang Ying	01dd958277	crypto: ghash-intel - Fix irq_fpu_usable usage When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the function is changed too, from mesuring whether kernel is using FPU, that is, the FPU is NOT available, to measuring whether FPU is usable, that is, the FPU is available. But the usage of irq_fpu_usable in ghash-clmulni-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-03 10:55:20 -05:00
Ingo Molnar	1d87cff407	Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent	2009-11-03 16:54:14 +01:00
Arjan van de Ven	a489ca355e	x86: Make sure we also print a Code: line for show_regs() show_regs() is called as a mini BUG() equivalent in some places, specifically for the "scheduling while atomic" case. Unfortunately right now it does not print a Code: line unlike a real bug/oops. This patch changes the x86 implementation of show_regs() so that it calls the same function as oopses do to print the registers as well as the Code: line. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20091102165915.4a980fc0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 16:50:22 +01:00
Herbert Xu	3b0d65969b	crypto: ghash-intel - Add PSHUFB macros Add PSHUFB macros instead of repeating byte sequences, suggested by Ingo. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 09:11:15 -05:00
Joerg Roedel	342688f9db	Merge branches 'amd-iommu/fixes' and 'dma-debug/fixes' into iommu/fixes	2009-11-03 12:05:40 +01:00
Mike Galbraith	6b9de613ae	sched: Disable SD_PREFER_LOCAL at node level Yanmin Zhang reported that SD_PREFER_LOCAL induces an order of magnitude increase in select_task_rq_fair() overhead while running heavy wakeup benchmarks (tbench and vmark). Since SD_BALANCE_WAKE is off at node level, turn SD_PREFER_LOCAL off as well pending further investigation. Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-03 07:24:07 +01:00
Linus Torvalds	efcd9e0b91	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Make EFI RTC function depend on 32bit again x86-64: Fix register leak in 32-bit syscall audting x86: crash_dump: Fix non-pae kdump kernel memory accesses x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium x86: Remove STACKPROTECTOR_ALL	2009-11-02 09:45:17 -08:00
Suresh Siddha	e7d23dde9b	x86_64, cpa: Use only text section in set_kernel_text_rw/ro set_kernel_text_rw()/set_kernel_text_ro() are marking pages starting from _text to __start_rodata as RW or RO. With CONFIG_DEBUG_RODATA, there might be free pages (associated with padding the sections to 2MB large page boundary) between text and rodata sections that are given back to page allocator. So we should use only use the start (__text) and end (__stop___ex_table) of the text section in set_kernel_text_rw()/set_kernel_text_ro(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024821.164525222@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 17:17:24 +01:00
Suresh Siddha	55ca3cc174	x86_64, ftrace: Make ftrace use kernel identity mapping to modify code On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA. So use the kernel identity mapping instead of the kernel text mapping to modify the kernel text. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024821.080941108@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 17:16:36 +01:00
Suresh Siddha	502f660466	x86, cpa: Fix kernel text RO checks in static_protection() Steven Rostedt reported that we are unconditionally making the kernel text mapping as read-only. i.e., if someone does cpa() to the kernel text area for setting/clearing any page table attribute, we unconditionally clear the read-write attribute for the kernel text mapping that is set at compile time. We should delay (to forbid the write attribute) and enforce only after the kernel has mapped the text as read-only. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091029024820.996634347@sbs-t61.sc.intel.com> [ marked kernel_set_to_readonly as __read_mostly ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 17:16:35 +01:00
Suresh Siddha	5231a68614	x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() To ensure that we handle all the pending interrupts (destined for this cpu that is going down) in the interrupt subsystem before the cpu goes offline, fixup_irqs() does: local_irq_enable(); mdelay(1); local_irq_disable(); Enabling interrupts is not a good thing as this cpu is already offline. So this patch replaces that logic with, mdelay(1); check APIC_IRR bits Retrigger the irq at the new destination if any interrupt has arrived via IPI. For IO-APIC level triggered interrupts, this retrigger IPI will appear as an edge interrupt. ack_apic_level() will detect this condition and IO-APIC RTE's remoteIRR is cleared using directed EOI(using IO-APIC EOI register) on Intel platforms and for others it uses the existing mask+edge logic followed by unmask+level. We can also remove mdelay() and then send spuriuous interrupts to new cpu targets for all the irqs that were handled previously by this cpu that is going offline. While it works, I have seen spurious interrupt messages (nothing wrong but still annoying messages during cpu offline, which can be seen during suspend/resume etc) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230002.043281924@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:37 +01:00
Suresh Siddha	b3ec0a37a7	x86: Use EOI register in io-apic on intel platforms IO-APIC's in intel chipsets support EOI register starting from IO-APIC version 2. Use that when ever we need to clear the IO-APIC RTE's RemoteIRR bit explicitly. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.947855317@sbs-t61.sc.intel.com> [ Marked use_eio_reg as __read_mostly, fixed small details ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:36 +01:00
Suresh Siddha	a5e74b8419	x86: Force irq complete move during cpu offline When a cpu goes offline, fixup_irqs() try to move irq's currently destined to the offline cpu to a new cpu. But this attempt will fail if the irq is recently moved to this cpu and the irq still hasn't arrived at this cpu (for non intr-remapping platforms this is when we free the vector allocation at the previous destination) that is about to go offline. This will endup with the interrupt subsystem still pointing the irq to the offline cpu, causing that irq to not work any more. Fix this by forcing the irq to complete its move (its been a long time we moved the irq to this cpu which we are offlining now) and then move this irq to a new cpu before this cpu goes offline. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:36 +01:00
Suresh Siddha	23359a88e7	x86: Remove move_cleanup_count from irq_cfg move_cleanup_count for each irq in irq_cfg is keeping track of the total number of cpus that need to free the corresponding vectors associated with the irq which has now been migrated to new destination. As long as this move_cleanup_count is non-zero (i.e., as long as we have n't freed the vector allocations on the old destinations) we were preventing the irq's further migration. This cleanup count is unnecessary and it is enough to not allow the irq migration till we send the cleanup vector to the previous irq destination, for which we already have irq_cfg's move_in_progress. All we need to make sure is that we free the vector at the old desintation but we don't need to wait till that gets freed. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:35 +01:00
Suresh Siddha	84e21493a3	x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping In the presence of interrupt-remapping, irqs will be migrated in the process context and we don't do (and there is no need to) irq_chip mask/unmask while migrating the interrupt. Similarly fix the fixup_irqs() that get called during cpu offline and avoid calling irq_chip mask/unmask for irqs that are ok to be migrated in the process context. While we didn't observe any race condition with the existing code, this change takes complete advantage of interrupt-remapping in the newer generation platforms and avoids any potential HW lockup's (that often worry Eric :) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: garyhade@us.ibm.com LKML-Reference: <20091026230001.661423939@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:35 +01:00
Suresh Siddha	7a7732bc0f	x86: Unify fixup_irqs() for 32-bit and 64-bit kernels There is no reason to have different fixup_irqs() for 32-bit and 64-bit kernels. Unify by using the superior 64-bit version for both the kernels. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.562512739@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:56:34 +01:00
Gottfried Haider	05154752cf	x86: Add reboot quirk for 3 series Mac mini Reboot does not work out of the box on my "Early 2009" Mac mini (3,1). Detect this machine via DMI as we do for recent MacBooks. Signed-off-by: Gottfried Haider <gottfried.haider@gmail.com> Cc: Ozan Çağlayan <ozan@pardus.org.tr> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 15:46:17 +01:00
Dave Jones	16121d70fd	x86: Fix printk message typo in mtrr cleanup code Trivial typo. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-02 08:36:18 +01:00
Herbert Xu	2d06ef7f42	crypto: ghash-intel - Hard-code pshufb Old gases don't have a clue what pshufb stands for so we have to hard-code it for now. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-11-01 12:49:44 -05:00
Linus Torvalds	2e2ec95235	Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: set up mmu_ops before trying to set any ptes	2009-10-29 15:03:36 -07:00
Linus Torvalds	6e958d73c2	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Do less agressive buddy clearing sched: Disable SD_PREFER_LOCAL for MC/CPU domains	2009-10-29 08:10:38 -07:00
Linus Torvalds	7811a32407	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV: Set DELIVERY_MODE=4 for vector=NMI_VECTOR in uv_hub_send_ipi() x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode() x86: Don't print number of MCE banks for every CPU x86, UV: Fix information in __uv_hub_info structure x86: Document linker script ASSERT() quirk	2009-10-29 08:10:26 -07:00
Tejun Heo	0fe1e00954	percpu: make percpu symbols in x86 unique This patch updates percpu related symbols in x86 such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/x86/kernel/cpu/common.c: rename local variable to avoid collision * arch/x86/kvm/svm.c: s/svm_data/sd/ for local variables to avoid collision * arch/x86/kernel/cpu/cpu_debug.c: s/cpu_arr/cpud_arr/ s/priv_arr/cpud_priv_arr/ s/cpu_priv_count/cpud_priv_count/ * arch/x86/kernel/cpu/intel_cacheinfo.c: s/cpuid4_info/ici_cpuid4_info/ s/cache_kobject/ici_cache_kobject/ s/index_kobject/ici_index_kobject/ * arch/x86/kernel/ds.c: s/cpu_context/cpu_ds_context/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: (kvm) Avi Kivity <avi@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: x86@kernel.org	2009-10-29 22:34:14 +09:00
Tejun Heo	c6e22f9e3e	percpu: make percpu symbols in xen unique This patch updates percpu related symbols in xen such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/x86/xen/smp.c, arch/x86/xen/time.c, arch/ia64/xen/irq_xen.c: add xen_ prefix to percpu variables * arch/ia64/xen/time.c: add xen_ prefix to percpu variables, drop processed_ prefix and make them static Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org>	2009-10-29 22:34:13 +09:00
Tejun Heo	f16250669d	percpu: make percpu symbols in cpufreq unique This patch updates percpu related symbols in cpufreq such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * drivers/cpufreq/cpufreq.c: s/policy_cpu/cpufreq_policy_cpu/ * drivers/cpufreq/freq_table.c: s/show_table/cpufreq_show_table/ * arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: s/drv_data/acfreq_data/ s/old_perf/acfreq_old_perf/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 22:34:13 +09:00
Tejun Heo	0f5e4816db	percpu: remove some sparse warnings Make the following changes to remove some sparse warnings. * Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before defining it. * Annotate pcpu_extend_area_map() that it is entered with pcpu_lock held, releases it and then reacquires it. * Make percpu related macros use unique nested variable names. * While at it, add pcpu prefix to __size_call[_return]() macros as to-be-implemented sparse annotations will add percpu specific stuff to these macros. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 22:34:12 +09:00
Ingo Molnar	9de09ace8d	Merge branch 'tracing/urgent' into tracing/core Merge reason: Pick up fixes and move base from -rc1 to -rc5. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 09:02:20 +01:00
Masami Hiramatsu	3f7e454af1	x86: Add Intel FMA instructions to x86 opcode map Add Intel FMA(FUSED-MULTIPLY-ADD) instructions to x86 opcode map for x86 instruction decoder. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204235.30545.33997.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:47 +01:00
Masami Hiramatsu	e0e492e99b	x86: AVX instruction set decoder support Add Intel AVX(Advanced Vector Extensions) instruction set support to x86 instruction decoder. This adds insn.vex_prefix field for storing VEX prefixes, and introduces some original tags for expressing opcodes attributes. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204226.30545.23451.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:46 +01:00
Masami Hiramatsu	82cb57028c	x86: Add pclmulq to x86 opcode map Add pclmulq opcode to x86 opcode map. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204219.30545.82039.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:46 +01:00
Masami Hiramatsu	04d46c1b13	x86: Merge INAT_REXPFX into INAT_PFX_* Merge INAT_REXPFX into INAT_PFX_* macro and rename it to INAT_PFX_REX. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204211.30545.58090.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:45 +01:00
Masami Hiramatsu	7f387d3f24	x86: Fix SSE opcode map bug Fix superscripts position because some superscripts of SSE opcode are not put in correct position. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> LKML-Reference: <20091027204204.30545.97296.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-29 08:47:45 +01:00
Joerg Roedel	ca0207114f	x86/amd-iommu: Un__init function required on shutdown The function iommu_feature_disable is required on system shutdown to disable the IOMMU but it is marked as __init. This may result in a panic if the memory is reused. This patch fixes this bug. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2009-10-28 18:02:26 +01:00
Jeremy Fitzhardinge	973df35ed9	xen: set up mmu_ops before trying to set any ptes xen_setup_stackprotector() ends up trying to set page protections, so we need to have vm_mmu_ops set up before trying to do so. Failing to do so causes an early boot crash. [ Impact: Fix early crash under Xen. ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-10-27 16:54:19 -07:00
Steven Rostedt	883242dd0e	tracing: allow to change permissions for text with dynamic ftrace enabled The commit `74e081797b` x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA prevents text sections from becoming read/write using set_memory_rw. The dynamic ftrace changes all text pages to read/write just before converting the calls to tracing to nops, and vice versa. I orginally just added a flag to allow this transaction when ftrace did the change, but I also found that when the CPA testing was running it would remove the read/write as well, and ftrace does not do the text conversion on boot up, and the CPA changes caused the dynamic tracer to fail on self tests. The current solution I have is to simply not to prevent change_page_attr from setting the RW bit for kernel text pages. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-10-27 13:15:11 -04:00
Andreas Herrmann	6f9b41006a	x86, apic: Clear APIC Timer Initial Count Register on shutdown Commit `a98f8fd24f` (x86: apic reset counter on shutdown) set the counter to max to avoid spurious interrupts when the timer is re-enabled. (In theory) you'll still get a spurious interrupt if spending more than 344 seconds with this interrupt disabled and then unmasking it. The right thing to do is to clear the register. This disables the interrupt from happening (at least it does on AMD hardware). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091027100138.GB30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-27 14:54:21 +01:00
Feng Tang	772be899bc	x86: Make EFI RTC function depend on 32bit again The EFI RTC functions are only available on 32 bit. commit `7bd867df` (x86: Move get/set_wallclock to x86_platform_ops) removed the 32bit dependency which leads to boot crashes on 64bit EFI systems. Add the dependency back. Solves: http://bugzilla.kernel.org/show_bug.cgi?id=14466 Tested-by: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Feng Tang <feng.tang@intel.com> LKML-Reference: <20091020125402.028d66d5@feng-desktop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-10-27 12:35:48 +01:00
Jan Beulich	81766741fe	x86-64: Fix register leak in 32-bit syscall audting Restoring %ebp after the call to audit_syscall_exit() is not only unnecessary (because the register didn't get clobbered), but in the sysenter case wasn't even doing the right thing: It loaded %ebp from a location below the top of stack (RBP < ARGOFFSET), i.e. arbitrary kernel data got passed back to user mode in the register. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: <stable@kernel.org> LKML-Reference: <4AE5CC4D020000780001BD13@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-26 16:23:26 +01:00
Jiri Slaby	72ed7de74e	x86: crash_dump: Fix non-pae kdump kernel memory accesses Non-PAE 32-bit dump kernels may wrap an address around 4G and poke unwanted space. ptes there are 32-bit long, and since pfn << PAGE_SIZE may exceed this limit, high pfn bits are cropped and wrong address mapped by kmap_atomic_pfn in copy_oldmem_page. Don't allow this behavior in non-PAE kdump kernels by checking pfns passed into copy_oldmem_page. In the case of failure, userspace process gets EFAULT. [v2] - fix comments - move ifdefs inside the function Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Simon Horman <horms@verge.net.au> Cc: Paul Mundt <lethal@linux-sh.org> LKML-Reference: <1256551903-30567-1-git-send-email-jirislaby@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-26 12:38:59 +01:00
Rusty Russell	ae1b22f6e4	x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium Commit `79e1dd05d1` "x86: Provide an alternative() based cmpxchg64()" broke lguest, even on systems which have cmpxchg8b support. The emulation code gets used until alternatives get run, but it contains native instructions, not their paravirt alternatives. The simplest fix is to turn this code off except for 386 and 486 builds. Reported-by: Johannes Stezenbach <js@sig21.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: lguest@ozlabs.org Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <200910261426.05769.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-26 12:33:02 +01:00
Alexander Potashev	4868402d95	x86, boot: Simplify setting of the PAE bit A single 'movl' is shorter than the 'xorl'-'orl' pair. No change in behaviour. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> LKML-Reference: <1256341043-4928-1-git-send-email-aspotashev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-24 11:06:38 +02:00
Arjan van de Ven	14a3f40aaf	x86: Remove STACKPROTECTOR_ALL STACKPROTECTOR_ALL has a really high overhead (runtime and stack footprint) and is not really worth it protection wise (the normal STACKPROTECTOR is in effect for all functions with buffers already), so lets just remove the option entirely. Reported-by: Dave Jones <davej@redhat.com> Reported-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Eric Sandeen <sandeen@redhat.com> LKML-Reference: <20091023073101.3dce4ebb@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-23 16:35:23 +02:00
Minchan Kim	b1258ac296	x86: Remove pfn in add_one_highpage_init() commit `cc9f7a0ccf` changed add_one_highpage_init. We don't use pfn any more. Let's remove unnecessary argument. This patch doesn't chage function behavior. This patch is based on v2.6.32-rc5. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> LKML-Reference: <20091022112722.adc8e55c.minchan.kim@barrios-desktop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-23 13:39:26 +02:00
Ingo Molnar	4331595650	Merge branch 'perf/core' into perf/probes Conflicts: tools/perf/Makefile Merge reason: - fix the conflict - pick up the pr_*() infrastructure to queue up dependent patch Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-23 08:23:20 +02:00
Linus Torvalds	422b42fa79	Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Prevent kvm_init from corrupting debugfs structures KVM: MMU: fix pointer cast KVM: use proper hrtimer function to retrieve expiration time	2009-10-22 08:26:15 +09:00
Linus Torvalds	4fe71dba2f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: aesni-intel - Fix irq_fpu_usable usage crypto: padlock-sha - Fix stack alignment	2009-10-22 08:16:01 +09:00
Ingo Molnar	9bf4e7fba8	x86, instruction decoder: Fix test_get_len build rules Add the kernel source include file as well to the include files search path, to fix this build bug: In file included from arch/x86/tools/test_get_len.c:28: arch/x86/lib/insn.c:21:26: error: linux/string.h: No such file or directory Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap<systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-21 14:42:56 +02:00
Robin Holt	02dd0a0613	x86, UV: Set DELIVERY_MODE=4 for vector=NMI_VECTOR in uv_hub_send_ipi() When sending a NMI_VECTOR IPI using the UV_HUB_IPI_INT register, we need to ensure the delivery mode field of that register has NMI delivery selected. This makes those IPIs true NMIs, instead of flat IPIs. It matters to reboot sequences and KGDB, both of which use NMI IPIs. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Cc: Martin Hicks <mort@sgi.com> Cc: <stable@kernel.org> LKML-Reference: <20091020193620.877322000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-21 13:31:13 +02:00
Masami Hiramatsu	9983d60d74	x86: Add AES opcodes to opcode map Add Intel AES opcodes to x86 opcode map. These opcodes are used in arch/x86/crypt/aesni-intel_asm.S. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap<systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-21 13:25:29 +02:00
Masami Hiramatsu	06ed6ba5ec	x86: Fix group attribute decoding bug Fix a typo in inat_get_group_attribute() which should refer inat_group_tables, not inat_escape_tables. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap<systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091020165524.4145.97333.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-21 13:25:28 +02:00
Huang Ying	13b79b9715	crypto: aesni-intel - Fix irq_fpu_usable usage When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the function is changed too, from mesuring whether kernel is using FPU, that is, the FPU is NOT available, to measuring whether FPU is usable, that is, the FPU is available. But the usage of irq_fpu_usable in aesni-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-10-20 16:20:47 +09:00
Suresh Siddha	d6cc1c3af7	x86-64: add comment for RODATA large page retainment Add a comment explaining why RODATA is aligned to 2 MB. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-20 14:46:01 +09:00
Suresh Siddha	74e081797b	x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATA CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel text/rodata/data to small 4KB pages as they are mapped with different attributes (text as RO, RODATA as RO and NX etc). On x86_64, preserve the large page mappings for kernel text/rodata/data boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the RODATA section to be hugepage aligned and having same RWX attributes for the 2MB page boundaries Extra Memory pages padding the sections will be freed during the end of the boot and the kernel identity mappings will have different RWX permissions compared to the kernel text mappings. Kernel identity mappings to these physical pages will be mapped with smaller pages but large page mappings are still retained for kernel text,rodata,data mappings. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-20 14:46:00 +09:00
Suresh Siddha	b9af7c0d44	x86-64: preserve large page mapping for 1st 2MB kernel txt with CONFIG_DEBUG_RODATA In the first 2MB, kernel text is co-located with kernel static page tables setup by head_64.S. CONFIG_DEBUG_RODATA chops this 2MB large page mapping to small 4KB pages as we mark the kernel text as RO, leaving the static page tables as RW. With CONFIG_DEBUG_RODATA disabled, OLTP run on NHM-EP shows 1% improvement with 2% reduction in system time and 1% improvement in iowait idle time. To recover this, move the kernel static page tables to .data section, so that we don't have to break the first 2MB of kernel text to small pages with CONFIG_DEBUG_RODATA. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.063193621@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-10-20 14:46:00 +09:00
Huang Ying	0e1227d356	crypto: ghash - Add PCLMULQDQ accelerated implementation PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at: http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/ Because PCLMULQDQ changes XMM state, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be defered to the cryptd kernel thread. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-10-19 11:53:06 +09:00
Frederic Weisbecker	0f8f86c7bd	Merge commit 'perf/core' into perf/hw-breakpoint Conflicts: kernel/Makefile kernel/trace/Makefile kernel/trace/trace.h samples/Makefile Merge reason: We need to be uptodate with the perf events development branch because we plan to rewrite the breakpoints API on top of perf events.	2009-10-18 01:12:33 +02:00
Ingo Molnar	bb3c3e8071	Merge commit 'v2.6.32-rc5' into perf/probes Conflicts: kernel/trace/trace_event_profile.c Merge reason: update to -rc5 and resolve conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-17 09:58:25 +02:00
Masami Hiramatsu	d1baf5a5a6	x86: Add AMD prefetch and 3DNow! opcodes to opcode map Add AMD prefetch and 3DNow! opcode including FEMMS. Since 3DNow! uses the last immediate byte as an opcode extension byte, x86 insn just treats the extenstion byte as an immediate byte instead of a part of opcode (insn_get_opcode() decodes first "0x0f 0x0f" bytes.) Users who are interested in analyzing 3DNow! opcode still can decode it by analyzing the immediate byte. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-17 09:53:59 +02:00
Masami Hiramatsu	8c95bc3e20	x86: Add MMX/SSE opcode groups to opcode map Add missing MMX/SSE opcode groups to x86 opcode map. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-17 09:53:58 +02:00
Frederik Deweerdt	8a8365c560	KVM: MMU: fix pointer cast On a 32 bits compile, commit `3da0dd433d` introduced the following warnings: arch/x86/kvm/mmu.c: In function ‘kvm_set_pte_rmapp’: arch/x86/kvm/mmu.c:770: warning: cast to pointer from integer of different size arch/x86/kvm/mmu.c: In function ‘kvm_set_spte_hva’: arch/x86/kvm/mmu.c:849: warning: cast from pointer to integer of different size The following patch uses 'unsigned long' instead of u64 to match the pointer size on both arches. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-10-16 12:30:26 -03:00
Marcelo Tosatti	ace1546487	KVM: use proper hrtimer function to retrieve expiration time hrtimer->base can be temporarily NULL due to racing hrtimer_start. See switch_hrtimer_base/lock_hrtimer_base. Use hrtimer_get_remaining which is robust against it. CC: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-10-16 12:30:25 -03:00
Robin Holt	1d21e6e3ff	x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode() Create an inline function to extract the pnode from a global physical address and then convert the broadcast assist unit to use the newly created uv_gpa_to_pnode function. The open-coded code was wrong as well - it might explain a few of our unexplained bau hangs. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Cliff Whickman <cpw@sgi.com> Cc: linux-mm@kvack.org Cc: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091016112920.GZ8903@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:51:53 +02:00
Borislav Petkov	b33a636364	x86, mce: Add a global MCE init helper Add an early initcall (pre SMP) which sets up global MCE functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:46:50 +02:00
Borislav Petkov	5e09954a9a	x86, mce: Fix up MCE naming nomenclature Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:46:49 +02:00
Ingo Molnar	6b50f5c7c7	Merge branches 'x86/mce' and 'x86/urgent' into perf/mce Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 14:42:25 +02:00
Roland Dreier	93ae5012a7	x86: Don't print number of MCE banks for every CPU The MCE initialization code explicitly says it doesn't handle asymmetric configurations where different CPUs support different numbers of MCE banks, and it prints a big warning in that case. Therefore, printing the "mce: CPU supports <x> MCE banks" message into the kernel log for every CPU is pure redundancy that clutters the log significantly for systems with lots of CPUs. Signed-off-by: Roland Dreier <rolandd@cisco.com> LKML-Reference: <adaeip473qt.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 09:20:03 +02:00
Robin Holt	036ed8ba61	x86, UV: Fix information in __uv_hub_info structure A few parts of the uv_hub_info structure are initialized incorrectly. - n_val is being loaded with m_val. - gpa_mask is initialized with a bytes instead of an unsigned long. - Handle the case where none of the alias registers are used. Lastly I converted the bau over to using the uv_hub_info->m_val which is the correct value. Without this patch, booting a large configuration hits a problem where the upper bits of the gnode affect the pnode and the bau will not operate. Signed-off-by: Robin Holt <holt@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> Cc: Cliff Whickman <cpw@sgi.com> Cc: stable@kernel.org LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 08:18:34 +02:00
Ingo Molnar	a5912f6b3e	x86: Document linker script ASSERT() quirk Older binutils breaks if ASSERT() is used without a sink for the output. For example 2.14.90.0.6 is known to be broken, the link fails with: LD .tmp_vmlinux1 ld:arch/x86/kernel/vmlinux.lds:678: parse error Document this quirk in all three files that use it. See: http://marc.info/?l=linux-kbuild&m=124930110427870&w=2 See[2]: `d2ba8b2` ("x86: Fix assert syntax in vmlinux.lds.S") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Roland McGrath <roland@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-16 07:18:46 +02:00
Cyrill Gorcunov	f88f2b4fdb	x86: apic: Allow noop operations to be called almost at any time As only apic noop is used we allow to use almost any operation caller wants (and which of them noop driver supports of course). Initially it was reported by Ingo Molnar that apic noop issue a warning for pkg id (which is actually false positive and should be eliminated). So we save checking (and warning issue) for read/write operations while allow any other ops to be freely used. Also: - fix noop_cpu_to_logical_apicid, it should be 0. - rename noop_default_phys_pkg_id to noop_phys_pkg_id (we use default_ prefix for more general routines in apic subsystem). Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> LKML-Reference: <20091015150416.GC5331@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 17:26:53 +02:00
Ingo Molnar	713490e02e	Merge branch 'tracing/core' into perf/core Merge reason: to add event filter support we need the following commits from the tracing tree: `3f6fe06`: tracing/filters: Unify the regex parsing helpers `1889d20`: tracing/filters: Provide basic regex support `737f453`: tracing/filters: Cleanup useless headers Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 11:34:00 +02:00
Ingo Molnar	b226f744d4	Merge branch 'linus' into perf/core Merge reason: pick up tools/perf/ changes from upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 08:44:44 +02:00
Ingo Molnar	db8590f504	Revert "x86: linker script syntax nits" This reverts commit `e9a63a4e55`. This breaks older binutils, where sink-less asserts are broken. See this commit for further details: `d2ba8b2`: x86: Fix assert syntax in vmlinux.lds.S Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 08:09:55 +02:00
Ingo Molnar	a0738a688d	Merge branch 'linus' into x86/urgent Merge reason: pull in latest, to be able to revert a patch there. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 08:07:30 +02:00
Linus Torvalds	601adfedba	Merge branch 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: x86: linker script syntax nits	2009-10-14 15:33:05 -07:00
Linus Torvalds	f061d83a2b	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix missing kernel-doc notation Revert "x86, timers: Check for pending timers after (device) interrupts" sched: Update the clock of runqueue select_task_rq() selected	2009-10-14 15:25:04 -07:00
Linus Torvalds	ea87644105	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/paravirt: Use normal calling sequences for irq enable/disable x86: fix kernel panic on 32 bits when profiling x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop x86, vmi: Mark VMI deprecated and schedule it for removal	2009-10-14 15:24:32 -07:00
Roland McGrath	e9a63a4e55	x86: linker script syntax nits The linker scripts grew some use of weirdly wrong linker script syntax. It happens to work, but it's not what the syntax is documented to be. Clean it up to use the official syntax. Signed-off-by: Roland McGrath <roland@redhat.com> CC: Ian Lance Taylor <iant@google.com>	2009-10-14 14:16:38 -07:00
Dimitri Sivanich	4a4de9c7d7	x86: UV RTC: Rename generic_interrupt to x86_platform_ipi Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014142257.GE11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 18:27:11 +02:00
Dimitri Sivanich	d5991ff297	x86: UV RTC: Clean up error handling Cleanup error handling in uv_rtc_setup_clock. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014142103.GD11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 18:27:10 +02:00
Dimitri Sivanich	8c28de4d01	x86: UV RTC: Add clocksource only boot option Add clocksource only boot option for UV RTC. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014141848.GC11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 18:27:10 +02:00
Dimitri Sivanich	e47938b1fa	x86: UV RTC: Fix early expiry handling Tune/fix early timer expiry handling and return correct early timeout value for set_next_event. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091014141630.GB11048@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 18:27:09 +02:00
Thomas Gleixner	05d86412ea	x86: Remove BKL from apm_32 The lock/unlock kernel pair in do_open() got there with the BKL push down and protects nothing. Remove it. Replace the lock/unlock kernel in the ioctl code with a mutex to protect standbys_pending and suspends_pending. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091010153349.365236337@linutronix.de>	2009-10-14 17:04:48 +02:00
Thomas Gleixner	ac06ea2cd0	x86: Remove BKL from microcode cycle_lock_kernel() in microcode_open() is a worthless exercise as there is nothing to wait for. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091010153349.196074920@linutronix.de>	2009-10-14 17:04:48 +02:00
Ingo Molnar	7ec13187ef	x86, apic: Fix prototype in hw_irq.h This warning: In file included from arch/x86/include/asm/ipi.h:23, from arch/x86/kernel/apic/apic_noop.c:27: arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want triggers because irq_desc is defined after hw_irq.h is included in irq.h. Since it's pointer reference only, a forward declaration of the type will solve the problem. LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 15:06:42 +02:00
Peter Zijlstra	799e2205ec	sched: Disable SD_PREFER_LOCAL for MC/CPU domains Yanmin reported that both tbench and hackbench were significantly hurt by trying to keep tasks local on these domains, esp on small cache machines. So disable it in order to promote spreading outside of the cache domains. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Mike Galbraith <efault@gmx.de> LKML-Reference: <1255083400.8802.15.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 15:02:34 +02:00
Li Hong	89ccf465ab	x86, perf_event: Rename 'performance counter interrupt' In 'cdd6c482c9ff9c55475ee7392ec8f672eddb7be6', we renamed Performance Counters -> Performance Events. The name showed up in /proc/interrupts also needs a change. I use PMI (Performance monitoring interrupt) here, since it is the official name used in Intel's documents. Signed-off-by: Li Hong <lihong.hi@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091014105039.GA22670@uhli> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 14:37:24 +02:00
Frederic Weisbecker	c44fc77084	tracing: Move syscalls metadata handling from arch to core Most of the syscalls metadata processing is done from arch. But these operations are mostly generic accross archs. Especially now that we have a common variable name that expresses the number of syscalls supported by an arch: NR_syscalls, the only remaining bits that need to reside in arch is the syscall nr to addr translation. v2: Compare syscalls symbols only after the "sys" prefix so that we avoid spurious mismatches with archs that have syscalls wrappers, in which case syscalls symbols have "SyS" prefixed aliases. (Reported by: Heiko Carstens) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org>	2009-10-14 09:53:56 +02:00
Dimitri Sivanich	9338ad6ffb	x86, apic: Move SGI UV functionality out of generic IO-APIC code Move UV specific functionality out of the generic IO-APIC code. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091013203236.GD20543@sgi.com> [ Cleaned up the code some more in their new places. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:09 +02:00
Dimitri Sivanich	6c2c502910	x86: SGI UV: Fix irq affinity for hub based interrupts This patch fixes handling of uv hub irq affinity. IRQs with ALL or NODE affinity can be routed to cpus other than their originally assigned cpu. Those with CPU affinity cannot be rerouted. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20090930160259.GA7822@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:01 +02:00
Cyrill Gorcunov	2626eb2b2f	x86, apic: Limit apic dumping, introduce new show_lapic= setup option In case if a system has a large number of cpus printing apics contents may consume a long time period. We limit such an output by 1 apic by default. But to have an ability to see all apics or some part of them we introduce "show_lapic" setup option which allow us to limit/unlimit the number of APICs being dumped. Example: apic=debug show_lapic=5, or apic=debug show_lapic=all Also move apic_verbosity checking upper that way so helper routines do not need to inspect it at all. Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.926793122@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:01 +02:00
Cyrill Gorcunov	a933c61829	x86, apic: Use apic noop driver In case if apic were disabled we may use the whole apic NOOP driver instead of sparse poking the some functions in apic driver. Also NOOP would catch any inappropriate apic operation calls (not just read/write). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.747817361@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:00 +02:00
Cyrill Gorcunov	9844ab11c7	x86, apic: Introduce the NOOP apic driver Introduce NOOP APIC driver. We should use it in case if apic was disabled due to hardware of software/firmware problems (including user requested to disable it case). The driver is attempting to catch any inappropriate apic operation call with warning issue. Also it is possible to use some apic operation like IPI calls, read/write without checking for apic presence which should make callers code easier. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.534682104@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 09:17:00 +02:00
Steven Rostedt	194ec34184	function-graph/x86: Replace unbalanced ret with jmp The function graph tracer replaces the return address with a hook to trace the exit of the function call. This hook will finish by returning to the real location the function should return to. But the current implementation uses a ret to jump to the real return location. This causes a imbalance between calls and ret. That is the original function does a call, the ret goes to the handler and then the handler does a ret without a matching call. Although the function graph tracer itself still breaks the branch predictor by replacing the original ret, by using a second ret and causing an imbalance, it breaks the predictor even more. This patch replaces the ret with a jmp to keep the calls and ret balanced. I tested this on one box and it showed a 1.7% increase in performance. Another box only showed a small 0.3% increase. But no box that I tested this on showed a decrease in performance by making this change. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013203425.042034383@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-14 08:13:53 +02:00
Linus Torvalds	80fa680d22	Merge git://git.infradead.org/~dwmw2/iommu-2.6.32 * git://git.infradead.org/~dwmw2/iommu-2.6.32: x86: Move pci_iommu_init to rootfs_initcall() Run pci_apply_final_quirks() sooner. Mark pci_apply_final_quirks() __init rather than __devinit Rename pci_init() to pci_apply_final_quirks(), move it to quirks.c intel-iommu: Yet another BIOS workaround: Isoch DMAR unit with no TLB space intel-iommu: Decode (and ignore) RHSA entries intel-iommu: Make "Unknown DMAR structure" message more informative	2009-10-13 10:04:40 -07:00
Hidetoshi Seto	8968f9d3dc	perf_event, x86, mce: Use TRACE_EVENT() for MCE logging This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in one, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events. We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might chose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there's in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrary deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There's various parsers that can be used. - Lots of scheduling/buffering/batching modes of operandi for MCE events. poll() support. mmap() support. read() support. You name it. - Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:43:38 +02:00
Ingo Molnar	9dbdd6c41c	Merge commit 'v2.6.32-rc4' into perf/core Merge reason: we were on an -rc1 base, merge up to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:31:34 +02:00
Ingo Molnar	2c96c142e9	Merge branch 'tracing/urgent' into tracing/core Merge reason: Pick up tracing/filters fix from the urgent queue, we will queue up dependent patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:24:59 +02:00
Jeremy Fitzhardinge	71999d9862	x86/paravirt: Use normal calling sequences for irq enable/disable Bastian Blank reported a boot crash with stackprotector enabled, and debugged it back to edx register corruption. For historical reasons irq enable/disable/save/restore had special calling sequences to make them more efficient. With the more recent introduction of higher-level and more general optimisations this is no longer necessary so we can just use the normal PVOP_ macros. This fixes some residual bugs in the old implementations which left edx liable to inadvertent clobbering. Also, fix some bugs in __PVOP_VCALLEESAVE which were revealed by actual use. Reported-by: Bastian Blank <bastian@waldi.eu.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: Xen-devel <xen-devel@lists.xensource.com> LKML-Reference: <4AD3BC9B.7040501@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 09:22:01 +02:00
Arnaldo Carvalho de Melo	a2e2725541	net: Introduce recvmmsg socket syscall Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-12 23:40:10 -07:00

... 5 6 7 8 9 ...

9682 Commits