linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-30 00:06:46 +07:00

Author	SHA1	Message	Date
Linus Torvalds	b1ae8d3a00	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, geode: add a VSA2 ID for General Software x86: use BOOTMEM_EXCLUSIVE on 32-bit x86, 32-bit: fix boot failure on TSC-less processors x86: fix NULL pointer deref in __switch_to x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.	2008-06-20 12:36:38 -07:00
Linus Torvalds	55017923f6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: Blackfin Serial Driver: Use timer to poll CTS PIN instead of workqueue. Blackfin arch: fix typo error in bf548 serial header file	2008-06-20 12:34:43 -07:00
Linus Torvalds	b4eea67a12	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ahci: sis can't do PMP ata_piix: add TECRA M4 to broken suspend list LIBATA: Add HAVE_PATA_PLATFORM to select PATA_PLATFORM driver sata_mv: warn on PIO with multiple DRQs sata_mv: enable async_notify for 60x1 Rev.C0 and higher libata: don't check whether to use DMA or not for no data commands ahci: jmb361 has only one port	2008-06-20 12:31:03 -07:00
Linus Torvalds	1f6ef23429	[watchdog] hpwdt: fix use of inline assembly The inline assembly in drivers/watchdog/hpwdt.c was incredibly broken, and included all the function prologue and epilogue stuff, even though it was itself then inside a C function where the compiler would add its own prologue and epilogue on top of it all. This then just _happened_ to work if you had exactly the right compiler version and exactly the right compiler flags, so that gcc just happened to not create any prologue at all (the gcc-generated epilogue wouldn't matter, since it would never be reached). But the more proper way to fix it is to simply not do this. Move the inline asm to the top level, with no surrounding function at all (the better alternative would be to remove the prologue and make it actually use proper description of the arguments to the inline asm, but that's a bigger change than the one I'm willing to make right now). Tested-by: S.Çağlar Onur <caglar@pardus.org.tr> Acked-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-20 12:25:34 -07:00
Cliff Wickman	e0c6d97c65	[IA64] SN2: security hole in sn2_ptc_proc_write Security hole in sn2_ptc_proc_write It is possible to overrun a buffer with a write to this /proc file. Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2008-06-20 12:02:00 -07:00
Ben Dooks	ac1623625c	BAST: Remove old IDE driver Remove the old BAST IDE driver, as we are now using the platform-pata support. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:35 +02:00
Christophe Niclaes	a49c06bfe4	pcmcia ide kingston compactflash's have a new manufacturer id Up to now, Kingston compactflash cards (ab)used the Toshiba Manufacturer's ID, In their new CF cards, they use a new one. Let's the ide subsystem recognize CF cards with the new id. Signed-off-by: Christophe Niclaes <cniclaes@develtech.com> Acked-by: Philippe De Muyter <phdm@macqel.be> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:34 +02:00
Kristoffer Ericson	a17bf22023	pcmcia: add another pata/ide ID Addition of Transcend 1GB 45x id so that it is properly detected. [bart: fix typo in ide-cs's ID spotted by Alan Cox] Signed-off-by: William Peters <w1ll14@gmail.com> Signed-off-by: Kristoffer Ericson <Kristoffer_e1@hotmail.com> CC: Alan Cox <alan@lxorguk.ukuu.org.uk> CC: linux-ide@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:34 +02:00
Matt Reimer	74e23386b7	pcmcia: add an pata/ide ID Add an id for: product info: "M-Systems", "CF300", "" manfid: 0x000a, 0x0000 function: 4 (fixed disk) Signed-off-by: Matt Reimer <mreimer@vpop.net> CC: Alan Cox <alan@lxorguk.ukuu.org.uk> CC: linux-ide@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:34 +02:00
Bartlomiej Zolnierkiewicz	f54feafa6d	ide: increase timeout in wait_drive_not_busy() Some ATAPI devices take longer than the current max timeout value to become ready (i.e. TEAC DV-W28ECW takes 6 ms) so increase the timeout value to 10 ms. This fixes kernel.org bugzilla bug #10887: http://bugzilla.kernel.org/show_bug.cgi?id=10887 Reported-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:33 +02:00
Sergei Shtylyov	ce42a54946	palm_bk3710: fix resource management The driver expected a virtual address in the IDE platform device's memory resource and didn't request the memory region for the register block. Fix this taking into account the fact that DaVinci SoC devices are fixed-mapped to the virtual memory early and we can get their virtual addresses using IO_ADDRESS() macro, not having to call ioremap()... While at it, also do some cosmetic changes... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-06-20 20:53:32 +02:00
Linus Torvalds	89f5b7da2a	Reinstate ZERO_PAGE optimization in 'get_user_pages()' and fix XIP KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit `557ed1fa26` ("remove ZERO_PAGE") removed the ZERO_PAGE from the VM mappings, any users of get_user_pages() will generally now populate the VM with real empty pages needlessly. We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but since fault handling no longer uses ZERO_PAGE for new anonymous pages, we now need to handle that special case in follow_page() instead. In particular, the removal of ZERO_PAGE effectively removed the core file writing optimization where we would skip writing pages that had not been populated at all, and increased memory pressure a lot by allocating all those useless newly zeroed pages. This reinstates the optimization by making the unmapped PTE case the same as for a non-existent page table, which already did this correctly. While at it, this also fixes the XIP case for follow_page(), where the caller could not differentiate between the case of a page that simply could not be used (because it had no "struct page" associated with it) and a page that just wasn't mapped. We do that by simply returning an error pointer for pages that could not be turned into a "struct page *". The error is arbitrarily picked to be EFAULT, since that was what get_user_pages() already used for the equivalent IO-mapped page case. [ Also removed an impossible test for pte_offset_map_lock() failing: that's not how that function works ] Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-20 11:18:25 -07:00
Frederic Bohe	2856922c15	Ext4: Fix online resize block group descriptor corruption This is the patch for the group descriptor table corruption during online resize pointed out by Theodore Tso. The problem was caused by the fact that the ext4 group descriptor can be either 32 or 64 bytes long. Only the 64 bytes structure was taken into account. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-06-20 11:48:48 -04:00
Oleg Nesterov	ea71a54670	sched: refactor wait_for_completion_timeout() Simplify the code and fix the boundary condition of wait_for_completion_timeout(,0). We can kill the first __remove_wait_queue() as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2008-06-20 17:15:49 +02:00
Jeremy Fitzhardinge	ebb9cfe20f	xen: don't drop NX bit Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:56:41 +02:00
Jeremy Fitzhardinge	05345b0f00	xen: mask unwanted pte bits in __supported_pte_mask [ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:56:36 +02:00
Isaku Yamahata	4653938379	xen: Use wmb instead of rmb in xen_evtchn_do_upcall(). This patch is ported one from 534:77db69c38249 of linux-2.6.18-xen.hg. Use wmb instead of rmb to enforce ordering between evtchn_upcall_pending and evtchn_pending_sel stores in xen_evtchn_do_upcall(). Cc: Samuel Thibault <samuel.thibault@eu.citrix.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:56:30 +02:00
Suresh Siddha	54481cf88b	x86: fix NULL pointer deref in __switch_to I am able to reproduce the oops reported by Simon in __switch_to() with lguest. My debug showed that there is at least one lguest specific issue (which should be present in 2.6.25 and before aswell) and it got exposed with a kernel oops with the recent fpu dynamic allocation patches. In addition to the previous possible scenario (with fpu_counter), in the presence of lguest, it is possible that the cpu's TS bit it still set and the lguest launcher task's thread_info has TS_USEDFPU still set. This is because of the way the lguest launcher handling the guest's TS bit. (look at lguest_set_ts() in lguest_arch_run_guest()). This can result in a DNA fault while doing unlazy_fpu() in __switch_to(). This will end up causing a DNA fault in the context of new process thats getting context switched in (as opossed to handling DNA fault in the context of lguest launcher/helper process). This is wrong in both pre and post 2.6.25 kernels. In the recent 2.6.26-rc series, this is showing up as NULL pointer dereferences or sleeping function called from atomic context(__switch_to()), as we free and dynamically allocate the FPU context for the newly created threads. Older kernels might show some FPU corruption for processes running inside of lguest. With the appended patch, my test system is running for more than 50 mins now. So atleast some of your oops (hopefully all!) should get fixed. Please give it a try. I will spend more time with this fix tomorrow. Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk> Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 13:26:18 +02:00
Roland Dreier	bb10ed0994	sched: fix wait_for_completion_timeout() spurious failure under heavy load It seems that the current implementaton of wait_for_completion_timeout() has a small problem under very high load for the common pattern: if (!wait_for_completion_timeout(&done, timeout)) /* handle failure / because the implementation very roughly does (lots of code deleted to show the basic flow): static inline long __sched do_wait_for_common(struct completion x, long timeout, int state) { if (x->done) return timeout; do { timeout = schedule_timeout(timeout); if (!timeout) return timeout; } while (!x->done); return timeout; } so if the system is very busy and x->done is not set when do_wait_for_common() is entered, it is possible that the first call to schedule_timeout() returns 0 because the task doing wait_for_completion doesn't get rescheduled for a long time, even if it is woken up early enough. In this case, wait_for_completion_timeout() returns 0 without even checking x->done again, and the code above falls into its failure case purely for scheduler reasons, even if the hardware event or whatever was being waited for happened early enough. It would make sense to add an extra test to do_wait_for() in the timeout case and return 1 if x->done is actually set. A quick audit (not exhaustive) of wait_for_completion_timeout() callers seems to indicate that no one actually cares about the return value in the success case -- they just test for 0 (timed out) versus non-zero (wait succeeded). Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 13:19:32 +02:00
Peter Zijlstra	8a8cde163e	sched: rt: dont stop the period timer when there are tasks wanting to run So if the group ever gets throttled, it will never wake up again. Reported-by: "Daniel K." <dk@uw.no> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Daniel K. <dk@uw.no> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 11:00:19 +02:00
Len Brown	5a87f7f5e5	Merge branch 'bugzilla-9761' into release	2008-06-20 02:47:16 -04:00
Len Brown	7b09f27891	Merge branch 'bugzilla-10695' into release	2008-06-20 02:45:05 -04:00
Dave Airlie	858a3685bc	drm: only trust core drm ioctls - driver ioctls are a mess. So driver ioctls need a full auditing before we can make this change. Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-06-20 15:42:38 +10:00
Zhenyu Wang	d3adbc0c58	drm/i915: add support for Intel series 4 chipsets. Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-06-20 12:12:56 +10:00
Zhenyu Wang	7d15ddf79e	[agp]: fixup chipset flush for new Intel G4x. Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-06-20 11:48:06 +10:00
YOSHIFUJI Hideaki	f630e43a21	ipv6: Drop packets for loopback address from outside of the box. [ Based upon original report and patch by Karsten Keil. Karsten has verified that this fixes the TAHI test case "ICMPv6 test v6LC.5.1.2 Part F". -DaveM ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-19 16:33:57 -07:00
Shan Wei	aea7427f70	ipv6: Remove options header when setsockopt's optlen is 0 Remove the sticky Hop-by-Hop options header by calling setsockopt() for IPV6_HOPOPTS with a zero option length, per RFC3542. Routing header and Destination options header does the same as Hop-by-Hop options header. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-19 16:29:39 -07:00
Jordan Crouse	ffe6e1da86	x86, geode: add a VSA2 ID for General Software General Software writes their own VSA2 module for their version of the Geode BIOS, which returns a different ID then the standard VSA2. This was causing the framebuffer driver to break for most GSW boards. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Cc: tglx@linutronix.de Cc: linux-geode@lists.infradead.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 14:19:03 +02:00
Bharath Ravi	d4abc238c9	sched, delay accounting: fix incorrect delay time when constantly waiting on runqueue This patch corrects the incorrect value of per process run-queue wait time reported by delay statistics. The anomaly was due to the following reason. When a process leaves the CPU and immediately starts waiting for CPU on the runqueue (which means it remains in the TASK_RUNNABLE state), the time of re-entry into the run-queue is never recorded. Due to this, the waiting time on the runqueue from this point of re-entry upto the next time it hits the CPU is not accounted for. This is solved by recording the time of re-entry of a process leaving the CPU in the sched_info_depart() function IF the process will go back to waiting on the run-queue. This IF condition is verified by checking whether the process is still in the TASK_RUNNABLE state. The patch was tested on 2.6.26-rc6 using two simple CPU hog programs. The values noted prior to the fix did not account for the time spent on the runqueue waiting. After the fix, the correct values were reported back to user space. Signed-off-by: Bharath Ravi <bharathravi1@gmail.com> Signed-off-by: Madhava K R <madhavakr@gmail.com> Cc: dhaval@linux.vnet.ibm.com Cc: vatsa@in.ibm.com Cc: balbir@in.ibm.com Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 14:15:28 +02:00
David Brownell	bcccc3a28e	hwmon: (lm75) sensor reading bugfix LM75 sensor reading bugfix: never save error status as valid sensor output. This could be improved, but at least this prevents certain rude failure modes. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:32 -04:00
Hans de Goede	b3aeab0cdb	hwmon: (abituguru3) update driver detection It has been reported that the abituguru3 driver fails to load after a BIOS update. This patch fixes this by loosening the detection routine so that it will work after the BIOS update too. To compensate for the now very loose detection an additional check is added on the DMI Base Board vendor string to make sure we only load on Abit motherboards, this is the same as the check in the abituguru (1 / 2) driver. Signed-of-by: Hans de Goede <j.w.r.degoede@hhs.nl> Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:32 -04:00
Marc Hulsman	25845c2264	hwmon: (w83791d) new maintainer Signed-off-by: Charles Spirakis <bezaur@gmail.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:31 -04:00
Hans de Goede	1604e78b7d	hwmon: (abituguru3) Identify Abit AW8D board as such This patch identifies the Abit AW8D board as such, and adds support for its aux5 fan connector Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:31 -04:00
Jean Delvare	125ff8087f	hwmon: Update the sysfs interface documentation * Document the characteristics of libsensors 3.0.0 and 3.0.1. * The sysfs interface is no longer subject to changes. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Juerg Haefliger <juergh at gmail.com> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:31 -04:00
Jean Delvare	ed4ec814e4	hwmon: (adt7473) Initialize max_duty_at_overheat before use data->max_duty_at_overheat is not updated in adt7473_update_device, so it might be used before it is initialized (if the user reads from sysfs file max_duty_at_crit before writing to it.) Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:31 -04:00
Jean Delvare	d38b149794	hwmon: (lm85) Fix function RANGE_TO_REG() Function RANGE_TO_REG() is broken. For a requested range of 2000 (2 degrees C), it will return an index value of 15, i.e. 80.0 degrees C, instead of the expected index value of 0. All other values are handled properly, just 2000 isn't. The bug was introduced back in November 2004 by this patch: http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commit;h=1c28d80f1992240373099d863e4996cdd5d646d0 While this can be fixed easily with the current code, I'd rather rewrite the whole function in a way which is more obviously correct. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Justin Thiessen <jthiessen@penguincomputing.com> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2008-06-19 06:50:31 -04:00
Sonic Zhang	f30ac0ce34	Blackfin Serial Driver: Use timer to poll CTS PIN instead of workqueue. This allows other threads to run when the serial driver polls the CTS PIN in a loop. Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>	2008-06-19 17:46:39 +08:00
Sonic Zhang	ec64b6c876	Blackfin arch: fix typo error in bf548 serial header file Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>	2008-06-19 17:07:15 +08:00
Bernhard Walle	d3942cff62	x86: use BOOTMEM_EXCLUSIVE on 32-bit This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for i386 and prints a error message on failure. The patch is still for 2.6.26 since it is only bug fixing. The unification of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>	2008-06-19 10:08:48 +02:00
Mikael Pettersson	df17b1d990	x86, 32-bit: fix boot failure on TSC-less processors Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6" (invalid opcode) and a kernel halt immediately after the kernel has been uncompressed. The BUG shows EIP pointing to an rdtsc instruction in native_read_tsc(), invoked from native_sched_clock(). (This error occurs so early that not even the serial console can capture it.) A bisection showed that this bug first occurs in 2.6.26-rc3-git7, via commit `9ccc906c97`: >x86: distangle user disabled TSC from unstable > >tsc_enabled is set to 0 from the command line switch "notsc" and from >the mark_tsc_unstable code. Seperate those functionalities and replace >tsc_enable with tsc_disable. This makes also the native_sched_clock() >decision when to use TSC understandable. > >Preparatory patch to solve the sched_clock() issue on 32 bit. > >Signed-off-by: Thomas Gleixner <tglx@linutronix.de> The core reason for this bug is that native_sched_clock() gets called before tsc_init(). Before the commit above, tsc_32.c used a "tsc_enabled" variable which defaulted to 0 == disabled, and which only got enabled late in tsc_init(). Thus early calls to native_sched_clock() would skip the TSC and use jiffies instead. After the commit above, tsc_32.c uses a "tsc_disabled" variable which defaults to 0, meaning that the TSC is Ok to use. Early calls to native_sched_clock() now erroneously try to use the TSC on !cpu_has_tsc processors, leading to invalid opcode exceptions. My proposed fix is to initialise tsc_disabled to a "soft disabled" state distinct from the hard disabled state set up by the "notsc" kernel option. This fixes the native_sched_clock() problem. It also allows tsc_init() to be simplified: instead of setting tsc_disabled = 1 on every error return, we just set tsc_disabled = 0 once when all checks have succeeded. I've verified that this lets my 486 boot again. I've also verified that a Core2 machine still uses the TSC as clocksource after the patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 10:08:47 +02:00
Suresh Siddha	75118a82e2	x86: fix NULL pointer deref in __switch_to Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118 > > [62060.043009] pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... / * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... } Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 10:08:45 +02:00
Jeremy Fitzhardinge	ad524d46f3	x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits. When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can potentially have the same number of physical address bits as the 64-bit host ("Enhanced Legacy PAE Paging"). This means, in theory, we could have up to 52 bits of physical address in a pte. The 32-bit kernel uses a 32-bit unsigned long to represent a pfn. This means that it can only represent physical addresses up to 32+12=44 bits wide. Rather than widening pfns everywhere, just set 2^44 as the Linux x86_32-PAE architectural limit for physical address size. This is a bugfix for two cases: 1. running a 32-bit PAE kernel on a machine with more than 64GB RAM. 2. running a 32-bit PAE Xen guest on a host machine with more than 64GB RAM In both cases, a pte could need to have more than 36 bits of physical, and masking it to 36-bits will cause fairly severe havoc. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 10:08:44 +02:00
Jason Wessel	9c106c119e	softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression The touch_nmi_watchdog() routine on x86 ultimately calls touch_softlockup_watchdog(). The problem is that to touch the softlockup watchdog, the cpu_clock code has to be called which could involve multiple cpu locks and can lead to a hard hang if one of the locks is held by a processor that is not going to return anytime soon (such as could be the case with kgdb or perhaps even with some other kind of exception). This patch causes the public version of the touch_softlockup_watchdog() to defer the cpu clock access to a later point. The test case for this problem is to use the following kernel config options: CONFIG_KGDB_TESTS=y CONFIG_KGDB_TESTS_ON_BOOT=y CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000" It should be noted that kgdb test suite and these options were not available until 2.6.26-rc2, so it was necessary to patch the kgdb test suite during the bisection. I would consider this patch a regression fix because the problem first appeared in commit `27ec440779` when some logic was added to try to periodically sync the clocks. It was possible to work around this particular problem by simply not performing the sync anytime the system was in a critical context. This was ok until commit `3e51f33fcc`, which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some multi-cpu locks to sync the clocks. It became clear that accessing this code from an nmi was the source of the lockups. Avoiding the access to the low level clock code from an code inside the NMI processing also fixed the problem with the 27ec44... commit. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 09:45:38 +02:00
Steven Rostedt	afd38009cc	rcupreempt: remove export of rcu_batches_completed_bh In rcupreempt, rcu_batches_completed_bh is defined as a static inline in the header file. This does not need to be exported, and not only that, this breaks my PPC build. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: paulus@samba.org Cc: linuxppc-dev@ozlabs.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-19 09:45:37 +02:00
Li Zefan	30e0e17819	cpuset: limit the input of cpuset.sched_relax_domain_level We allow the inputs to be [-1 ... SD_LV_MAX), and return -EINVAL for inputs outside this range. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Acked-by: Paul Jackson <pj@sgi.com> Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-19 09:45:36 +02:00
Ingo Molnar	d819c49da6	Merge branch 'linus' into sched/urgent	2008-06-19 09:37:31 +02:00
Max Krasnyansky	f18f982abf	sched: CPU hotplug events must not destroy scheduler domains created by the cpusets First issue is not related to the cpusets. We're simply leaking doms_cur. It's allocated in arch_init_sched_domains() which is called for every hotplug event. So we just keep reallocation doms_cur without freeing it. I introduced free_sched_domains() function that cleans things up. Second issue is that sched domains created by the cpusets are completely destroyed by the CPU hotplug events. For all CPU hotplug events scheduler attaches all CPUs to the NULL domain and then puts them all into the single domain thereby destroying domains created by the cpusets (partition_sched_domains). The solution is simple, when cpusets are enabled scheduler should not create default domain and instead let cpusets do that. Which is exactly what the patch does. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Cc: pj@sgi.com Cc: menage@google.com Cc: rostedt@goodmis.org Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-06-19 09:14:51 +02:00
Peter Zijlstra	15a8641ead	sched: rt-group: fix RR buglet In tick_task_rt() we first call update_curr_rt() which can dequeue a runqueue due to it running out of runtime, and then we try to requeue it, of it also having exhausted its RR quota. Obviously requeueing something that is no longer on the runqueue will not have the expected result. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Daniel K. <dk@uw.no> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 09:06:59 +02:00
Peter Zijlstra	ad2a3f13b7	sched: rt-group: heirarchy aware throttle The bandwidth throttle code dequeues a group when it runs out of quota, and re-queues it once the period rolls over and the quota gets refreshed. Sadly it failed to take the hierarchy into consideration. Share more of the enqueue/dequeue code with regular task opterations. Also, some operations like sched_setscheduler() can dequeue/enqueue tasks that are in throttled runqueues, we should not inadvertly re-enqueue empty runqueues so check for that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Daniel K. <dk@uw.no> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 09:06:57 +02:00
Peter Zijlstra	7ea56616ba	sched: rt-group: fix hierarchy Don't re-set the entity's runqueue to the wrong rq after we've set it to the right one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Daniel K. <dk@uw.no> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 09:06:56 +02:00

1 2 3 4 5 ...

98485 Commits