linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-30 10:46:42 +07:00

Author	SHA1	Message	Date
Steven Rostedt	2232c2d8e0	rcu: add support for dynamic ticks and preempt rcu The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The idle CPU will not progress the RCU through its grace period and a synchronize_rcu my get stuck. Without this patch I have a box that will not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine with this patch. This patch comes from the -rt kernel where it has been tested for several months. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:46:50 +01:00
Pavel Machek	db4315d6f5	timer_list: print relative expiry time signed Relative expiry time can get negative, so it should be signed. Signed-off-by: Pavel Machek <Pavel@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-17 17:29:38 +01:00
john stultz	e13a2e61dd	ntp: correct inconsistent interval/tick_length usage clocksource initialization and error accumulation. This corrects a 280ppm drift seen on some systems using acpi_pm, and affects other clocksources as well (likely to a lesser degree). Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-10 10:48:03 +01:00
Li Zefan	3eb056764d	time: fix typo in comments Fix typo in comments. BTW: I have to fix coding style in arch/ia64/kernel/time.c also, otherwise checkpatch.pl will be complaining. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:29 -08:00
Li Zefan	cf4fc6cb76	timekeeping: rename timekeeping_is_continuous to timekeeping_valid_for_hres Function timekeeping_is_continuous() no longer checks flag CLOCK_IS_CONTINUOUS, and it checks CLOCK_SOURCE_VALID_FOR_HRES now. So rename the function accordingly. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:29 -08:00
Li Zefan	0b858e6ff9	clockevent: simplify list operations list_for_each_safe() suffices here. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:29 -08:00
Li Zefan	818c357802	clocksource: remove redundant code Flag CLOCK_SOURCE_WATCHDOG is cleared twice. Note clocksource_change_rating() won't do anyting with the cs flag. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:29 -08:00
Miao Xie	5e2cb1018a	time: fix sysfs_show_{available,current}_clocksources() buffer overflow problem I found that there is a buffer overflow problem in the following code. Version: 2.6.24-rc2, File: kernel/time/clocksource.c:417-432 -------------------------------------------------------------------- static ssize_t sysfs_show_available_clocksources(struct sys_device dev, char buf) { struct clocksource src; char curr = buf; spin_lock_irq(&clocksource_lock); list_for_each_entry(src, &clocksource_list, list) { curr += sprintf(curr, "%s ", src->name); } spin_unlock_irq(&clocksource_lock); curr += sprintf(curr, "\n"); return curr - buf; } ----------------------------------------------------------------------- sysfs_show_current_clocksources() also has the same problem though in practice the size of current clocksource's name won't exceed PAGE_SIZE. I fix the bug by using snprintf according to the specification of the kernel (Version:2.6.24-rc2,File:Documentation/filesystems/sysfs.txt) Fix sysfs_show_available_clocksources() and sysfs_show_current_clocksources() buffer overflow problem with snprintf(). Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:03 -08:00
Thomas Gleixner	5df7fa1c62	tick-sched: add more debug information To allow better diagnosis of tick-sched related, especially NOHZ related problems, we need to know when the last wakeup via an irq happened and when the CPU left the idle state. Add two fields (idle_waketime, idle_exittime) to the tick_sched structure and add them to the timer_list output. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:45:14 +01:00
Thomas Gleixner	1001d0a9ee	timekeeping: update xtime_cache when time(zone) changes xtime_cache needs to be updated whenever xtime and or wall_to_monotic are changed. Otherwise users of xtime_cache might see a stale (and in the case of timezone changes utterly wrong) value until the next update happens. Fixup the obvious places, which miss this update. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <johnstul@us.ibm.com> Tested-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:45:13 +01:00
Venki Pallipadi	6378ddb592	time: track accurate idle time with tick_sched.idle_sleeptime Current idle time in kstat is based on jiffies and is coarse grained. tick_sched.idle_sleeptime is making some attempt to keep track of idle time in a fine grained manner. But, it is not handling the time spent in interrupts fully. Make tick_sched.idle_sleeptime accurate with respect to time spent on handling interrupts and also add tick_sched.idle_lastupdate, which keeps track of last time when idle_sleeptime was updated. This statistics will be crucial for cpufreq-ondemand governor, which can shed some conservative gaurd band that is uses today while setting the frequency. The ondemand changes that uses the exact idle time is coming soon. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:04 +01:00
john stultz	bbe4d18ac2	NTP: correct inconsistent ntp interval/tick_length usage I recently noticed on one of my boxes that when synched with an NTP server, the drift value reported for the system was ~283ppm. While in some cases, clock hardware can be that bad, it struck me as unusual as the system was using the acpi_pm clocksource, which is one of the more trustworthy and accurate clocksources on x86 hardware. I brought up another system and let it sync to the same NTP server, and I noticed a similar 280some ppm drift. In looking at the code, I found that the acpi_pm's constant frequency was being computed correctly at boot-up, however once the system was up, even without the ntp daemon running, the clocksource's frequency was being modified by the clocksource_adjust() function. Digging deeper, I realized that in the code that keeps track of how much the clocksource is skewing from the ntp desired time, we were using different lengths to establish how long an time interval was. The clocksource was being setup with the following interval: NTP_INTERVAL_LENGTH = NSEC_PER_SEC/NTP_INTERVAL_FREQ While the ntp code was using the tick_length_base value: tick_length_base ~= (tick_usec * NSEC_PER_USEC * USER_HZ) /NTP_INTERVAL_FREQ The subtle difference is: (tick_usec * NSEC_PER_USEC * USER_HZ) != NSEC_PER_SEC This difference in calculation was causing the clocksource correction code to apply a correction factor to the clocksource so the two intervals were the same, however this results in the actual frequency of the clocksource to be made incorrect. I believe this difference would affect all clocksources, although to differing degrees depending on the clocksource resolution. The issue was introduced when my HZ free ntp patch landed in 2.6.21-rc1, so my apologies for the mistake, and for not noticing it until now. The following patch, corrects the clocksource's initialization code so it uses the same interval length as the code in ntp.c. After applying this patch, the drift value for the same system went from ~283ppm to only 2.635ppm. I believe this patch to be good, however it does affect all arches and I've only tested on x86, so some caution is advised. I do think it would be a likely candidate for a stable 2.6.24.x release. Any thoughts or feedback would be appreciated. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:03 +01:00
Ingo Molnar	45fe4fe191	x86: make clockevents more robust detect zero event-device multiplicators - they then cause division-by-zero crashes if a clockevent has been initialized incorrectly. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:03 +01:00
Thomas Gleixner	4713e22ce8	clocksource: add unregister function to disable unusable clocksources On x86 the PIT might become an unusable clocksource. Add an unregister function to provide a possibilty to remove the PIT from the list of available clock sources. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:02 +01:00
Andi Kleen	1ada5cba6a	clocksource: make clocksource watchdog cycle through online CPUs This way it checks if the clocks are synchronized between CPUs too. This might be able to detect slowly drifting TSCs which only go wrong over longer time. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:02 +01:00
Parag Warudkar	1077f5a917	clocksource.c: use init_timer_deferrable for clocksource_watchdog clocksource_watchdog can use a deferrable timer - reduces wakeups from idle per second. Signed-off-by: Parag Warudkar <parag.warudkar@gmail.com> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:01 +01:00
Geert Uytterhoeven	efd9ac8630	time: fold __get_realtime_clock_ts() into getnstimeofday() - getnstimeofday() was just a wrapper around __get_realtime_clock_ts() - Replace calls to __get_realtime_clock_ts() by calls to getnstimeofday() - Fix bogus reference to get_realtime_clock_ts(), which never existed Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:01 +01:00
Thomas Gleixner	186e3cb8a4	timer: clean up tick-broadcast.c clean up tick-broadcast.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:01 +01:00
Pavel Machek	b10db7f0d2	time: more timer related cleanups I was confused by FSEC = 10^15 NSEC statement, plus small whitespace fixes. When there's copyright, there should be GPL. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:00 +01:00
Pavel Machek	4c9dc64122	time: timer cleanups Small cleanups to tick-related code. Wrong preempt count is followed by BUG(), so it is hardly KERN_WARNING. Signed-off-by: Pavel Machek <pavel@suse.cz> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:00 +01:00
Peter Zijlstra	2d44ae4d71	hrtimer: clean up cpu->base locking tricks In order to more easily allow for the scheduler to use timers, clean up the locking a bit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Peter Zijlstra	48d5e25821	sched: rt throttling vs no_hz We need to teach no_hz about the rt throttling because its tick driven. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Kay Sievers	af5ca3f4ec	Driver core: change sysdev classes to use dynamic kobject names All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Thomas Gleixner	cdc6f27d9e	clockevents: fix reprogramming decision in oneshot broadcast Resolve the following regression of a choppy, almost unusable laptop: http://lkml.org/lkml/2007/12/7/299 http://bugzilla.kernel.org/show_bug.cgi?id=9525 A previous version of the code did the reprogramming of the broadcast device in the return from idle code. This was removed, but the logic in tick_handle_oneshot_broadcast() was kept the same. When a broadcast interrupt happens we signal the expiry to all CPUs which have an expired event. If none of the CPUs has an expired event, which can happen in dyntick mode, then we reprogram the broadcast device. We do not reprogram otherwise, but this is only correct if all CPUs, which are in the idle broadcast state have been woken up. The code ignores, that there might be pending not yet expired events on other CPUs, which are in the idle broadcast state. So the delivery of those events can be delayed for quite a time. Change the tick_handle_oneshot_broadcast() function to check for CPUs, which are in broadcast state and are not woken up by the current event, and enforce the rearming of the broadcast device for those CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-12-18 18:05:58 +01:00
Thomas Gleixner	167b1de3ee	clockevents: warn once when program_event() is called with negative expiry The hrtimer problem with large relative timeouts resulting in a negative expiry time went unnoticed as there is no check in the clockevents_program_event() code. Put a check there with a WARN_ONCE to avoid such problems in the future. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-12-07 19:16:17 +01:00
Thomas Gleixner	d393820446	softlockup: fix false positives on CONFIG_NOHZ David Miller reported soft lockup false-positives that trigger on NOHZ due to CPUs idling for more than 10 seconds. The solution is touch the softlockup watchdog when we return from idle. (by definition we are not 'locked up' when we were idle) http://bugzilla.kernel.org/show_bug.cgi?id=9409 Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-28 15:52:56 +01:00
John Stultz	52bfb36050	time: add ADJ_OFFSET_SS_READ Michael Kerrisk reported that a long standing bug in the adjtimex() system call causes glibc's adjtime(3) function to deliver the wrong results if 'delta' is NULL. add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc to fix this API compatibility bug. Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761 [ mingo@elte.hu: added patch description and made it backwards compatible ] NOTE: the new flag is defined 0xa001 so that it returns -EINVAL on older kernels - this way glibc can use it safely. Suggested by Ulrich Drepper. Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-11-26 20:42:19 +01:00
David P. Reed	fa6a1a554b	ntp: fix typo that makes sync_cmos_clock erratic Fix a typo in ntp.c that has caused updating of the persistent (RTC) clock when synced to NTP to behave erratically. When debugging a freeze that arises on my AMD64 machines when I run the ntpd service, I added a number of printk's to monitor the sync_cmos_clock procedure. I discovered that it was not syncing to cmos RTC every 11 minutes as documented, but instead would keep trying every second for hours at a time. The reason turned out to be a typo in sync_cmos_clock, where it attempts to ensure that update_persistent_clock is called very close to 500 msec. after a 1 second boundary (required by the PC RTC's spec). That typo referred to "xtime" in one spot, rather than "now", which is derived from "xtime" but not equal to it. This makes the test erratic, creating a "coin-flip" that decides when update_persistent_clock is called - when it is called, which is rarely, it may be at any time during the one second period, rather than close to 500 msec, so the value written is needlessly incorrect, too. Signed-off-by: David P. Reed Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-11-17 16:27:01 +01:00
Li Zefan	8dce39c231	time: fix inconsistent function names in comments Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-05 15:12:33 -08:00
Vegard Nossum	129f1d2c53	timer_list: Fix printk format strings This makes sure printk format strings contain no more than a single line. Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-29 09:39:38 +01:00
Adrian Bunk	64e38eb082	clockevents: unexport tick_nohz_get_sleep_length This patch removes the unused EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-29 09:39:38 +01:00
Linus Torvalds	c4ec207173	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (41 commits) ACPICA: hw: Don't carry spinlock over suspend ACPICA: hw: remove use_lock flag from acpi_hw_register_{read, write} ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle ACPI: clean up acpi_enter_sleep_state_prep Hibernation: Make sure that ACPI is enabled in acpi_hibernation_finish ACPI: suppress uninitialized var warning cpuidle: consolidate 2.6.22 cpuidle branch into one patch ACPI: thinkpad-acpi: skip blanks before the data when parsing sysfs ACPI: AC: Add sysfs interface ACPI: SBS: Add sysfs alarm ACPI: SBS: Add ACPI_PROCFS around procfs handling code. ACPI: SBS: Add support for power_supply class (and sysfs) ACPI: SBS: Make SBS reads table-driven. ACPI: SBS: Simplify data structures in SBS ACPI: SBS: Split host controller (ACPI0001) from SBS driver (ACPI0002) ACPI: EC: Add new query handler to list head. ACPI: Add acpi_bus_generate_event4() function ACPI: Battery: add sysfs alarm ACPI: Battery: Add sysfs support ACPI: Battery: Misc clean-ups, no functional changes ... Fix up conflicts in drivers/misc/thinkpad_acpi.[ch] manually	2007-10-19 13:12:46 -07:00
Matthias Kaehlcke	2e1975868a	kernel/time/clocksource.c: Use list_for_each_entry instead of list_for_each kernel/time/clocksource.c: Convert list_for_each to list_for_each_entry in clocksource_resume(), sysfs_override_clocksource() and show_available_clocksources() Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Thomas Gleixner	3dfbc88464	x86: C1E late detection fix. Really switch off lapic timer Doh, I completely missed that devices marked DUMMY are not running the set_mode function. So we force broadcasting, but we keep the local APIC timer running. Let the clock event layer mark the device _after_ switching it off. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:15:13 +02:00
john stultz	b2d9323d13	Use num_possible_cpus() instead of NR_CPUS for timer distribution To avoid lock contention, we distribute the sched_timer calls across the cpus so they do not trigger at the same instant. However, I used NR_CPUS, which can cause needless grouping on small smp systems depending on your kernel config. This patch converts to using num_possible_cpus() so we spread it as evenly as possible on every machine. Briefly tested w/ NR_CPUS=255 and verified reduced contention. Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:53 -07:00
Adrian Bunk	ba2a631b14	kernel/time/timekeeping.c: cleanups - remove the no longer required __attribute__((weak)) of xtime_lock - remove the following no longer used EXPORT_SYMBOL's: - xtime - xtime_lock Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:53 -07:00
Avi Kivity	bf020cb7b3	time: simplify smp_call_function_single() call sequence smp_call_function_single() now knows how to call the function on the current cpu. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:48 -07:00
Ingo Molnar	f20bf61256	time: introduce xtime_seconds improve performance of sys_time(). sys_time() returns time in seconds, but it does so by calling do_gettimeofday() and then returning the tv_sec portion of the GTOD time. But the data structure "xtime", which is updated by every timer/scheduler tick, already offers HZ granularity time. the patch improves the sysbench oltp macrobenchmark by 4-5% on an AMD dual-core system: v2.6.23: #threads 1: transactions: 4073 (407.23 per sec.) 2: transactions: 8530 (852.81 per sec.) 3: transactions: 8321 (831.88 per sec.) 4: transactions: 8407 (840.58 per sec.) 5: transactions: 8070 (806.74 per sec.) v2.6.23 + sys_time-speedup.patch: 1: transactions: 4281 (428.09 per sec.) 2: transactions: 8910 (890.85 per sec.) 3: transactions: 8659 (865.79 per sec.) 4: transactions: 8676 (867.34 per sec.) 5: transactions: 8532 (852.91 per sec.) and by 4-5% on an Intel dual-core system too: 2.6.23: 1: transactions: 4560 (455.94 per sec.) 2: transactions: 10094 (1009.30 per sec.) 3: transactions: 9755 (975.36 per sec.) 4: transactions: 9859 (985.78 per sec.) 5: transactions: 9701 (969.72 per sec.) 2.6.23 + sys_time-speedup.patch: 1: transactions: 4779 (477.84 per sec.) 2: transactions: 10103 (1010.14 per sec.) 3: transactions: 10141 (1013.93 per sec.) 4: transactions: 10371 (1036.89 per sec.) 5: transactions: 10178 (1017.50 per sec.) (the more CPUs the system has, the more speedup this patch gives for this particular workload.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 10:01:50 -07:00
Thomas Gleixner	1595f452f3	clockevents: introduce force broadcast notifier The 64bit SMP bootup is slightly different to the 32bit one. It enables the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E systems have the C1E feature flag only set in the secondary CPU. Due to the early enable of the boot CPU local APIC timer the APIC timer is registered as a fully functional device. When we detect the wreckage during the bringup of the secondary CPU, we need to force the boot CPU into broadcast mode. Add a new notifier reason and implement the force broadcast in the clock events layer. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-14 22:57:45 +02:00
Venki Pallipadi	4a93232dab	clock events: allow replacement of broadcast timer Change the broadcast timer, if a timer with higher rating becomes available. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-12 23:04:23 +02:00
Thomas Gleixner	c8a1d398de	clockevents: fix periodic broadcast for oneshot devices The next_event member of the clock event device is used to keep track of the next periodic event. For one shot only devices it is wrong to clear the variable, as the next event will be based on it. Pointed out by Ralf Baechle Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2007-10-12 23:04:06 +02:00
Thomas Gleixner	de68d9b173	clockevents: Allow build w/o run-tine usage for migration purposes Migration aid to allow preparatory patches which introduce not yet used parts of clock events code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2007-10-12 23:04:05 +02:00
Len Brown	4f86d3a8e2	cpuidle: consolidate 2.6.22 cpuidle branch into one patch commit e5a16b1f9eec0af7cfa0830304b41c1c0833cf9f Author: Len Brown <len.brown@intel.com> Date: Tue Oct 2 23:44:44 2007 -0400 cpuidle: shrink diff processor_idle.c \| 440 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 429 insertions(+), 11 deletions(-) Signed-off-by: Len Brown <len.brown@intel.com> commit dfbb9d5aedfb18848a3e0d6f6e3e4969febb209c Author: Len Brown <len.brown@intel.com> Date: Wed Sep 26 02:17:55 2007 -0400 cpuidle: reduce diff size Reduces the cpuidle processor_idle.c diff vs 2.6.22 from this processor_idle.c \| 2006 ++++++++++++++++++++++++++----------------- 1 file changed, 1219 insertions(+), 787 deletions(-) to this: processor_idle.c \| 502 +++++++++++++++++++++++++++++++++++++++---- 1 file changed, 458 insertions(+), 44 deletions(-) ...for the purpose of making the cpuilde patch less invasive and easier to review. no functional changes. build tested only. Signed-off-by: Len Brown <len.brown@intel.com> commit 889172fc915f5a7fe20f35b133cbd205ce69bf6c Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Sep 13 13:40:05 2007 -0700 cpuidle: Retain old ACPI policy for !CONFIG_CPU_IDLE Retain the old policy in processor_idle, so that when CPU_IDLE is not configured, old C-state policy will still be used. This provides a clean gradual migration path from old ACPI policy to new cpuidle based policy. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 9544a8181edc7ecc33b3bfd69271571f98ed08bc Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Sep 13 13:39:17 2007 -0700 cpuidle: Configure governors by default Quoting Len "Do not give an option to users to shoot themselves in the foot". Remove the configurability of ladder and menu governors as they are needed for default policy of cpuidle. That way users will not be able to have cpuidle without any policy loosing all C-state power savings. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 8975059a2c1e56cfe83d1bcf031bcf4cb39be743 Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:27:07 2007 -0400 CPUIDLE: load ACPI properly when CPUIDLE is disabled Change the registration return codes for when CPUIDLE support is not compiled into the kernel. As a result, the ACPI processor driver will load properly even if CPUIDLE is unavailable. However, it may be possible to cleanup the ACPI processor driver further and eliminate some dead code paths. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit e0322e2b58dd1b12ec669bf84693efe0dc2414a8 Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:26:06 2007 -0400 CPUIDLE: remove cpuidle_get_bm_activity() Remove cpuidle_get_bm_activity() and updates governors accordingly. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 18a6e770d5c82ba26653e53d240caa617e09e9ab Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:25:58 2007 -0400 CPUIDLE: max_cstate fix Currently max_cstate is limited to 0, resulting in no idle processor power management on ACPI platforms. This patch restores the value to the array size. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 1fdc0887286179b40ce24bcdbde663172e205ef0 Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:25:40 2007 -0400 CPUIDLE: handle BM detection inside the ACPI Processor driver Update the ACPI processor driver to detect BM activity and limit state entry depth internally, rather than exposing such requirements to CPUIDLE. As a result, CPUIDLE can drop this ACPI-specific interface and become more platform independent. BM activity is now handled much more aggressively than it was in the original implementation, so some testing coverage may be needed to verify that this doesn't introduce any DMA buffer under-run issues. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 0ef38840db666f48e3cdd2b769da676c57228dd9 Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:25:14 2007 -0400 CPUIDLE: menu governor updates Tweak the menu governor to more effectively handle non-timer break events. Non-timer break events are detected by comparing the actual sleep time to the expected sleep time. In future revisions, it may be more reliable to use the timer data structures directly. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit bb4d74fca63fa96cf3ace644b15ae0f12b7df5a1 Author: Adam Belay <abelay@novell.com> Date: Tue Aug 21 18:24:40 2007 -0400 CPUIDLE: fix 'current_governor' sysfs entry Allow the "current_governor" sysfs entry to properly handle input terminated with '\n'. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit df3c71559bb69b125f1a48971bf0d17f78bbdf47 Author: Len Brown <len.brown@intel.com> Date: Sun Aug 12 02:00:45 2007 -0400 cpuidle: fix IA64 build (again) Signed-off-by: Len Brown <len.brown@intel.com> commit a02064579e3f9530fd31baae16b1fc46b5a7bca8 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Sun Aug 12 01:39:27 2007 -0400 cpuidle: Remove support for runtime changing of max_cstate Remove support for runtime changeability of max_cstate. Drivers can use use latency APIs. max_cstate can still be used as a boot time option and dmi override. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 0912a44b13adf22f5e3f607d263aed23b4910d7e Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Sun Aug 12 01:39:16 2007 -0400 cpuidle: Remove ACPI cstate_limit calls from ipw2100 ipw2100 already has code to use accetable_latency interfaces to limit the C-state. Remove the calls to acpi_set_cstate_limit and acpi_get_cstate_limit as they are redundant. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit c649a76e76be6bff1fd770d0a775798813a3f6e0 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Sun Aug 12 01:35:39 2007 -0400 cpuidle: compile fix for pause and resume functions Fix the compilation failure when cpuidle is not compiled in. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Adam Belay <adam.belay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 2305a5920fb8ee6ccec1c62ade05aa8351091d71 Author: Adam Belay <abelay@novell.com> Date: Thu Jul 19 00:49:00 2007 -0400 cpuidle: re-write Some portions have been rewritten to make the code cleaner and lighter weight. The following is a list of changes: 1.) the state name is now included in the sysfs interface 2.) detection, hotplug, and available state modifications are handled by CPUIDLE drivers directly 3.) the CPUIDLE idle handler is only ever installed when at least one cpuidle_device is enabled and ready 4.) the menu governor BM code no longer overflows 5.) the sysfs attributes are now printed as unsigned integers, avoiding negative values 6.) a variety of other small cleanups Also, Idle drivers are no longer swappable during runtime through the CPUIDLE sysfs inteface. On i386 and x86_64 most idle handlers (e.g. poll, mwait, halt, etc.) don't benefit from an infrastructure that supports multiple states, so I think using a more general case idle handler selection mechanism would be cleaner. Signed-off-by: Adam Belay <abelay@novell.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit df25b6b56955714e6e24b574d88d1fd11f0c3ee5 Author: Len Brown <len.brown@intel.com> Date: Tue Jul 24 17:08:21 2007 -0400 cpuidle: fix IA64 buid Signed-off-by: Len Brown <len.brown@intel.com> commit fd6ada4c14488755ff7068860078c437431fbccd Author: Adrian Bunk <bunk@stusta.de> Date: Mon Jul 9 11:33:13 2007 -0700 cpuidle: static make cpuidle_replace_governor() static Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit c1d4a2cebcadf2429c0c72e1d29aa2a9684c32e0 Author: Adrian Bunk <bunk@stusta.de> Date: Tue Jul 3 00:54:40 2007 -0400 cpuidle: static This patch makes the needlessly global struct menu_governor static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit dbf8780c6e8d572c2c273da97ed1cca7608fd999 Author: Andrew Morton <akpm@linux-foundation.org> Date: Tue Jul 3 00:49:14 2007 -0400 export symbol tick_nohz_get_sleep_length ERROR: "tick_nohz_get_sleep_length" [drivers/cpuidle/governors/menu.ko] undefined! ERROR: "tick_nohz_get_idle_jiffies" [drivers/cpuidle/governors/menu.ko] undefined! And please be sure to get your changes to core kernel suitably reviewed. Cc: Adam Belay <abelay@novell.com> Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 29f0e248e7017be15f99febf9143a2cef00b2961 Author: Andrew Morton <akpm@linux-foundation.org> Date: Tue Jul 3 00:43:04 2007 -0400 tick.h needs hrtimer.h It uses hrtimers. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit e40cede7d63a029e92712a3fe02faee60cc38fb4 Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:40:34 2007 -0400 cpuidle: first round of documentation updates Documentation changes based on Pavel's feedback. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 83b42be2efece386976507555c29e7773a0dfcd1 Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:39:25 2007 -0400 cpuidle: add rating to the governors and pick the one with highest rating by default Introduce a governor rating scheme to pick the right governor by default. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit d2a74b8c5e8f22def4709330d4bfc4a29209b71c Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:38:08 2007 -0400 cpuidle: make cpuidle sysfs driver governor switch off by default Make default cpuidle sysfs to show current_governor and current_driver in read-only mode. More elaborate available_governors and available_drivers with writeable current_governor and current_driver interface only appear with "cpuidle_sysfs_switch" boot parameter. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 1f60a0e80bf83cf6b55c8845bbe5596ed8f6307b Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:37:00 2007 -0400 cpuidle: menu governor: change the early break condition Change the C-state early break out algorithm in menu governor. We only look at early breakouts that result in wakeups shorter than idle state's target_residency. If such a breakout is frequent enough, eliminate the particular idle state upto a timeout period. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 45a42095cf64b003b4a69be3ce7f434f97d7af51 Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:35:38 2007 -0400 cpuidle: fix uninitialized variable in sysfs routine Fix the uninitialized usage of ret. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 80dca7cdba3e6ee13eae277660873ab9584eb3be Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:34:16 2007 -0400 cpuidle: reenable /proc/acpi//power interface for the time being Keep /proc/acpi/processor/CPU/power around for a while as powertop depends on it. It will be marked deprecated and removed in future. powertop can use cpuidle interfaces instead. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 589c37c2646c5e3813a51255a5ee1159cb4c33fc Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Jul 3 00:32:37 2007 -0400 cpuidle: menu governor and hrtimer compile fix Compile fix for menu governor. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 0ba80bd9ab3ed304cb4f19b722e4cc6740588b5e Author: Len Brown <len.brown@intel.com> Date: Thu May 31 22:51:43 2007 -0400 cpuidle: build fix - cpuidle vs ipw2100 module ERROR: "acpi_set_cstate_limit" [drivers/net/wireless/ipw2100.ko] undefined! Signed-off-by: Len Brown <len.brown@intel.com> commit d7d8fa7f96a7f7682be7c6cc0cc53fa7a18c3b58 Author: Adam Belay <abelay@novell.com> Date: Sat Mar 24 03:47:07 2007 -0400 cpuidle: add the 'menu' governor Here is my first take at implementing an idle PM governor that takes full advantage of NO_HZ. I call it the 'menu' governor because it considers the full list of idle states before each entry. I've kept the implementation fairly simple. It attempts to guess the next residency time and then chooses a state that would meet at least the break-even point between power savings and entry cost. To this end, it selects the deepest idle state that satisfies the following constraints: 1. If the idle time elapsed since bus master activity was detected is below a threshold (currently 20 ms), then limit the selection to C2-type or above. 2. Do not choose a state with a break-even residency that exceeds the expected time remaining until the next timer interrupt. 3. Do not choose a state with a break-even residency that exceeds the elapsed time between the last pair of break events, excluding timer interrupts. This governor has an advantage over "ladder" governor because it proactively checks how much time remains until the next timer interrupt using the tick infrastructure. Also, it handles device interrupt activity more intelligently by not including timer interrupts in break event calculations. Finally, it doesn't make policy decisions using the number of state entries, which can have variable residency times (NO_HZ makes these potentially very large), and instead only considers sleep time deltas. The menu governor can be selected during runtime using the cpuidle sysfs interface like so: "echo "menu" > /sys/devices/system/cpu/cpuidle/current_governor" Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com> commit a4bec7e65aa3b7488b879d971651cc99a6c410fe Author: Adam Belay <abelay@novell.com> Date: Sat Mar 24 03:47:03 2007 -0400 cpuidle: export time until next timer interrupt using NO_HZ Expose information about the time remaining until the next timer interrupt expires by utilizing the dynticks infrastructure. Also modify the main idle loop to allow dynticks to handle non-interrupt break events (e.g. DMA). Finally, expose sleep ticks information to external code. Thomas Gleixner is responsible for much of the code in this patch. However, I've made some additional changes, so I'm probably responsible if there are any bugs or oversights :) Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 2929d8996fbc77f41a5ff86bb67cdde3ca7d2d72 Author: Adam Belay <abelay@novell.com> Date: Sat Mar 24 03:46:58 2007 -0400 cpuidle: governor API changes This patch prepares cpuidle for the menu governor. It adds an optional stage after idle state entry to give the governor an opportunity to check why the state was exited. Also it makes sure the idle loop returns after each state entry, allowing the appropriate dynticks code to run. Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 3a7fd42f9825c3b03e364ca59baa751bb350775f Author: Venki Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Apr 26 00:03:59 2007 -0700 cpuidle: hang fix Prevent hang on x86-64, when ACPI processor driver is added as a module on a system that does not support C-states. x86-64 expects all idle handlers to enable interrupts before returning from idle handler. This is due to enter_idle(), exit_idle() races. Make cpuidle_idle_call() confirm to this when there is no pm_idle_old. Also, cpuidle look at the return values of attch_driver() and set current_driver to NULL if attach fails on all CPUs. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 4893339a142afbd5b7c01ffadfd53d14746e858e Author: Shaohua Li <shaohua.li@intel.com> Date: Thu Apr 26 10:40:09 2007 +0800 cpuidle: add support for max_cstate limit With CPUIDLE framework, the max_cstate (to limit max cpu c-state) parameter is ingored. Some systems require it to ignore C2/C3 and some drivers like ipw require it too. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 43bbbbe1cb998cbd2df656f55bb3bfe30f30e7d1 Author: Shaohua Li <shaohua.li@intel.com> Date: Thu Apr 26 10:40:13 2007 +0800 cpuidle: add cpuidle_fore_redetect_devices API add cpuidle_force_redetect_devices API, which forces all CPU redetect idle states. Next patch will use it. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit d1edadd608f24836def5ec483d2edccfb37b1d19 Author: Shaohua Li <shaohua.li@intel.com> Date: Thu Apr 26 10:40:01 2007 +0800 cpuidle: fix sysfs related issue Fix the cpuidle sysfs issue. a. make kobject dynamicaly allocated b. fixed sysfs init issue to avoid suspend/resume issue Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 7169a5cc0d67b263978859672e86c13c23a5570d Author: Randy Dunlap <randy.dunlap@oracle.com> Date: Wed Mar 28 22:52:53 2007 -0400 cpuidle: 1-bit field must be unsigned A 1-bit bitfield has no room for a sign bit. drivers/cpuidle/governors/ladder.c:54:16: error: dubious bitfield without explicit `signed' or `unsigned' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 4658620158dc2fbd9e4bcb213c5b6fb5d05ba7d4 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Wed Mar 28 22:52:41 2007 -0400 cpuidle: fix boot hang Patch for cpuidle boot hang reported by Larry Finger here. http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/2025.html Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Larry Finger <larry.finger@lwfinger.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit c17e168aa6e5fe3851baaae8df2fbc1cf11443a9 Author: Len Brown <len.brown@intel.com> Date: Wed Mar 7 04:37:53 2007 -0500 cpuidle: ladder does not depend on ACPI build fix for CONFIG_ACPI=n In file included from drivers/cpuidle/governors/ladder.c:21: include/acpi/processor.h:88: error: expected specifier-qualifier-list before âacpi_integerâ include/acpi/processor.h:106: error: expected specifier-qualifier-list before âacpi_integerâ include/acpi/processor.h:168: error: expected specifier-qualifier-list before âacpi_handleâ Signed-off-by: Len Brown <len.brown@intel.com> commit 8c91d958246bde68db0c3f0c57b535962ce861cb Author: Adrian Bunk <bunk@stusta.de> Date: Tue Mar 6 02:29:40 2007 -0800 cpuidle: make code static This patch makes the following needlessly global code static: - driver.c: __cpuidle_find_driver() - governor.c: __cpuidle_find_governor() - ladder.c: struct ladder_governor Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Adam Belay <abelay@novell.com> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 0c39dc3187094c72c33ab65a64d2017b21f372d2 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Wed Mar 7 02:38:22 2007 -0500 cpu_idle: fix build break This patch fixes a build breakage with !CONFIG_HOTPLUG_CPU and CONFIG_CPU_IDLE. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 8112e3b115659b07df340ef170515799c0105f82 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Tue Mar 6 02:29:39 2007 -0800 cpuidle: build fix for !CPU_IDLE Fix the compile issues when CPU_IDLE is not configured. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Adam Belay <abelay@novell.com> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> commit 1eb4431e9599cd25e0d9872f3c2c8986821839dd Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Feb 22 13:54:57 2007 -0800 cpuidle take2: Basic documentation for cpuidle Documentation for cpuidle infrastructure Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> commit ef5f15a8b79123a047285ec2e3899108661df779 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Feb 22 13:54:03 2007 -0800 cpuidle take2: Hookup ACPI C-states driver with cpuidle Hookup ACPI C-states onto generic cpuidle infrastructure. drivers/acpi/procesor_idle.c is now a ACPI C-states driver that registers as a driver in cpuidle infrastructure and the policy part is removed from drivers/acpi/processor_idle.c. We use governor in cpuidle instead. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com> commit 987196fa82d4db52c407e8c9d5dec884ba602183 Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Date: Thu Feb 22 13:52:57 2007 -0800 cpuidle take2: Core cpuidle infrastructure Announcing 'cpuidle', a new CPU power management infrastructure to manage idle CPUs in a clean and efficient manner. cpuidle separates out the drivers that can provide support for multiple types of idle states and policy governors that decide on what idle state to use at run time. A cpuidle driver can support multiple idle states based on parameters like varying power consumption, wakeup latency, etc (ACPI C-states for example). A cpuidle governor can be usage model specific (laptop, server, laptop on battery etc). Main advantage of the infrastructure being, it allows independent development of drivers and governors and allows for better CPU power management. A huge thanks to Adam Belay and Shaohua Li who were part of this mini-project since its beginning and are greatly responsible for this patchset. This patch: Core cpuidle infrastructure. Introduces a new abstraction layer for cpuidle: which manages drivers that can support multiple idles states. Drivers can be generic or particular to specific hardware/platform * allows pluging in multiple policy governors that can take idle state policy decision * The core also has a set of sysfs interfaces with which administrato can know about supported drivers and governors and switch them at run time. Signed-off-by: Adam Belay <abelay@novell.com> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-10-10 00:12:41 -04:00
Anton Blanchard	74922be148	Fix timer_stats printout of events/sec When using /proc/timer_stats on ppc64 I noticed the events/sec field wasnt accurate. Sometimes the integer part was incorrect due to rounding (we werent taking the fractional seconds into consideration). The fraction part is also wrong, we need to pad the printf statement and take the bottom three digits of 1000 times the value. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-07 16:28:43 -07:00
Thomas Gleixner	b7e113dc9d	clockevents: remove the suspend/resume workaround^Wthinko In a desparate attempt to fix the suspend/resume problem on Andrews VAIO I added a workaround which enforced the broadcast of the oneshot timer on resume. This was actually resolving the problem on the VAIO but was just a stupid workaround, which was not tackling the root cause: the assignement of lower idle C-States in the ACPI processor_idle code. The cpuidle patches, which utilize the dynamic tick feature and go faster into deeper C-states exposed the problem again. The correct solution is the previous patch, which prevents lower C-states across the suspend/resume. Remove the enforcement code, including the conditional broadcast timer arming, which helped to pamper over the real problem for quite a time. The oneshot broadcast flag for the cpu, which runs the resume code can never be set at the time when this code is executed. It only gets set, when the CPU is entering a lower idle C-State. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Len Brown <lenb@kernel.org> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-22 17:15:34 -07:00
Thomas Gleixner	5e41d0d60a	clockevents: prevent stale tick update on offline cpu Taking a cpu offline removes the cpu from the online mask before the CPU_DEAD notification is done. The clock events layer does the cleanup of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is used to avoid xtime lock contention by assigning the task of jiffies xtime updates to one CPU. If a CPU is taken offline, then this assignment becomes stale. This went unnoticed because most of the time the offline CPU went dead before the online CPU reached __cpu_die(), where the CPU_DEAD state is checked. In the case that the offline CPU did not reach the DEAD state before we reach __cpu_die(), the code in there goes to sleep for 100ms. Due to the stale time update assignment, the system is stuck forever. Take the assignment away when a cpu is not longer in the cpu_online_mask. We do this in the last call to tick_nohz_stop_sched_tick() when the offline CPU is on the way to the final play_dead() idle entry. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-09-16 15:36:43 +02:00
Thomas Gleixner	31d9b3938c	clockevents: do not shutdown the oneshot broadcast device When a cpu goes offline it is removed from the broadcast masks. If the mask becomes empty the code shuts down the broadcast device. This is wrong, because the broadcast device needs to be ready for the online cpu going idle (into a c-state, which stops the local apic timer). Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-09-16 15:36:43 +02:00
Thomas Gleixner	07eec6af44	clockevents: Enforce oneshot broadcast when broadcast mask is set on resume The jinxed VAIO refuses to resume without hitting keys on the keyboard when this is not enforced. It is unclear why the cpu ends up in a lower C State without notifying the clock events layer, but enforcing the oneshot broadcast here is safe. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-09-16 15:36:43 +02:00
Thomas Gleixner	6a669ee8a7	timekeeping: Prevent time going backwards on resume Timekeeping resume adjusts xtime by adding the slept time in seconds and resets the reference value of the clock source (clock->cycle_last). clock->cycle last is used to calculate the delta between the last xtime update and the readout of the clock source in __get_nsec_offset(). xtime plus the offset is the current time. The resume code ignores the delta which had already elapsed between the last xtime update and the actual time of suspend. If the suspend time is short, then we can see time going backwards on resume. Suspend: offs_s = clock->read() - clock->cycle_last; now = xtime + offs_s; timekeeping_suspend_time = read_rtc(); Resume: sleep_time = read_rtc() - timekeeping_suspend_time; xtime.tv_sec += sleep_time; clock->cycle_last = clock->read(); offs_r = clock->read() - clock->cycle_last; now = xtime + offs_r; if sleep_time_seconds == 0 and offs_r < offs_s, then time goes backwards. Fix this by storing the offset from the last xtime update and add it to xtime during resume, when we reset clock->cycle_last: sleep_time = read_rtc() - timekeeping_suspend_time; xtime.tv_sec += sleep_time; xtime += offs_s; /* Fixup xtime offset at suspend time */ clock->cycle_last = clock->read(); offs_r = clock->read() - clock->cycle_last; now = xtime + offs_r; Thanks to Marcelo for tracking this down on the OLPC and providing the necessary details to analyze the root cause. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com> Cc: Tosatti <marcelo@kvack.org>	2007-09-16 15:36:43 +02:00
Thomas Gleixner	3be9095063	timekeeping: access rtc outside of xtime lock Lockdep complains about the access of rtc in timekeeping_suspend inside the interrupt disabled region of the write locked xtime lock. Move the access outside. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com>	2007-09-16 15:36:43 +02:00
Tony Breeds	298a5df45d	Fix "no_sync_cmos_clock" logic inversion in kernel/time/ntp.c Seems to me that this timer will only get started on platforms that say they don't want it? Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Gabriel Paubert <paubert@iram.es> Cc: Zachary Amsden <zach@vmware.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:27 -07:00
Miao Xie	6ddfca9548	timer: remove clockevents_unregister_notifier I find a function(clockevents_unregister_notifier) which is not called by anything in tree. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-11 15:47:42 -07:00
Alexey Dobriyan	5ea473a1df	Fix leaks on /proc/{*/sched,sched_debug,timer_list,timer_stats} On every open/close one struct seq_operations leaks. Kudos to /proc/slab_allocators. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:40 -07:00
john stultz	17c38b7490	Cache xtime every call to update_wall_time This avoids xtime lag seen with dynticks, because while 'xtime' itself is still not updated often, we keep a 'xtime_cache' variable around that contains the approximate real-time that _is_ updated each time we do a 'update_wall_time()', and is thus never off by more than one tick. IOW, this restores the original semantics for 'xtime' users, as long as you use the proper abstraction functions (ie 'current_kernel_time()' or 'get_seconds()' depending on whether you want a timespec or just the seconds field). [ Updated Patch. As penance for my sins I've also yanked another #ifdef that was added to avoid the xtime lag w/ hrtimers. ] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-25 10:17:44 -07:00
john stultz	2c6b47de17	Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). This avoids use of the kernel-internal "xtime" variable directly outside of the actual time-related functions. Instead, use the helper functions that we already have available to us. This doesn't actually change any behaviour, but this will allow us to fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ (because much of the realtime information is maintained as separate offsets to 'xtime'), which has caused interfaces that use xtime directly to get a time that is out of sync with the real-time clock by up to a third of a second or so. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-25 10:09:20 -07:00
Thomas Gleixner	82644459c5	NTP: move the cmos update code into ntp.c i386 and sparc64 have the identical code to update the cmos clock. Move it into kernel/time/ntp.c as there are other architectures coming along with the same requirements. [akpm@linux-foundation.org: build fixes] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:15 -07:00
Ingo Molnar	820de5c39e	highres: improve debug output Add some more debug information to the hrtimer and clock events code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:15 -07:00
john stultz	3704540b48	tick management: spread timer interrupt After discussing w/ Thomas over IRC, it seems the issue is the sched tick fires on every cpu at the same time, causing extra lock contention. This smaller change, adds an extra offset per cpu so the ticks don't line up. This patch also drops the idle latency from 40us down to under 20us. Signed-off-by: john stultz <johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:15 -07:00
Thomas Gleixner	5590a536c0	clockevents: fix device replacement When a device is replaced by a better rated device, then the broadcast mode needs to be evaluated again. When the new device has no requirement for broadcasting, then the broadcast bits for the CPU must be cleared. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:15 -07:00
Thomas Gleixner	18de5bc4c1	clockevents: fix resume logic We need to make sure, that the clockevent devices are resumed, before the tick is resumed. The current resume logic does not guarantee this. Add CLOCK_EVT_MODE_RESUME and call the set mode functions of the clock event devices before resuming the tick / oneshot functionality. Fixup the existing users. Thanks to Nigel Cunningham for tracking down a long standing thinko, which affected the jinxed VAIO. [akpm@linux-foundation.org: xen build fix] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:15 -07:00
Tony Luck	c36c282b88	Pull ia64-clocksource into release branch	2007-07-20 11:26:47 -07:00
Bob Picco	1f564ad6d4	[IA64] remove time interpolator Remove time_interpolator code (This is generic code, but only user was ia64. It has been superseded by the CONFIG_GENERIC_TIME code). Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Peter Keilty <peter.keilty@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2007-07-20 11:23:02 -07:00
Thomas Gleixner	71120f183b	timekeeping: fixup shadow variable argument clocksource_adjust() has a clock argument, which shadows the file global clock variable. Fix this up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:52 -07:00
Tejun Heo	9281acea6a	kallsyms: make KSYM_NAME_LEN include space for trailing '\0' KSYM_NAME_LEN is peculiar in that it does not include the space for the trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating buffer. This is nonsense and error-prone. Moreover, when the caller forgets that it's very likely to subtly bite back by corrupting the stack because the last position of the buffer is always cleared to zero. This patch increments KSYM_NAME_LEN by one and updates code accordingly. * off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro is fixed. * Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together, MODULE_NAME_LEN was treated as if it didn't include space for the trailing '\0'. Fix it. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Paulo Marques <pmarques@grupopie.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:03 -07:00
Alexey Dobriyan	aa0ac36518	Remove capability.h from mm.h I forgot to remove capability.h from mm.h while removing sched.h! This patch remedies that, because the only inline function which was using CAP_something was made out of line. Cross-compile tested without regressions on: all powerpc defconfigs all mips defconfigs all m68k defconfigs all arm defconfigs all ia64 defconfigs alpha alpha-allnoconfig alpha-defconfig alpha-up arm i386 i386-allnoconfig i386-defconfig i386-up ia64 ia64-allnoconfig ia64-defconfig ia64-up m68k mips parisc parisc-allnoconfig parisc-defconfig parisc-up powerpc powerpc-up s390 s390-allnoconfig s390-defconfig s390-up sparc sparc-allnoconfig sparc-defconfig sparc-up sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up um-x86_64 x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:45 -07:00
Venki Pallipadi	c5c061b8f9	Add a flag to indicate deferrable timers in /proc/timer_stats Add a flag in /proc/timer_stats to indicate deferrable timers. This will let developers/users to differentiate between types of tiemrs in /proc/timer_stats. Deferrable timer and normal timer will appear in /proc/timer_stats as below. 10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) Also version of timer_stats changes from v0.1 to v0.2 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:45 -07:00
Andi Kleen	78c1b06574	Remove clockevents_{release,request}_device Not called by anything in tree. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:43 -07:00
Tomas Janousek	7c3f1a5732	Introduce boot based time The commits `411187fb05` (GTOD: persistent clock support) `c1d370e167` (i386: use GTOD persistent clock support) changed the monotonic time so that it no longer jumps after resume, but it's not possible to use it for boot time and process start time calculations then. Also, the uptime no longer increases during suspend. I add a variable to track the wall_to_monotonic changes, a function to get the real boot time and a function to get the boot based time from the monotonic one. [akpm@linux-foundation.org: remove exports, add comment] Signed-off-by: Tomas Janousek <tjanouse@redhat.com> Cc: Tomas Smetana <tsmetana@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:41 -07:00
Thomas Gleixner	746976a301	NTP: remove clock_was_set() call to prevent deadlock The clock_was_set() call in seconds_overflow() which happens only when leap seconds are inserted / deleted is wrong in two aspects: 1. it results in a call to on_each_cpu() with interrupts disabled 2. it is potential deadlock source vs. call_lock in smp_call_function() The only possible side effect of the removal might be, that an absolute CLOCK_REALTIME timer fires 1 second too late, in the rare case of leap second deletion and an absolute CLOCK_REALTIME timer which expires in the affected time frame. It will never fire too early. This was probably observed by the reporter of a June 30th -> July 1st hang: http://lkml.org/lkml/2007/7/3/103 A similar problem was observed by Dave Jones, who provided a screen shot with a lockdep back trace, which allowed to analyse the problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-03 13:54:27 -07:00
Ingo Molnar	c1a834dc70	timer stats: speedups Make timer-stats have almost zero overhead when enabled in the config but not used. (this way distros can enable it more easily) Also update the documentation about overhead of timer_stats - it was written for the first version which had a global lock and a linear list walk based lookup ;-) Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-01 08:18:30 -07:00
Bjorn Steinbrink	9fcc15ec3c	timer statistics: fix race Fix two races in the timer stats lookup code. One by ensuring that the initialization of a new entry is finished upon insertion of that entry. The other by cleaning up the hash table when the entries array is cleared, so that we don't have any "pre-inserted" entries. Thanks to Eric Dumazet for reminding me of the memory barriers. Signed-off-by: Bjorn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Ian Kumlien <pomac@vapor.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-01 08:18:30 -07:00
Thomas Gleixner	eaad084bb0	NOHZ: prevent multiplication overflow - stop timer for huge timeouts get_next_timer_interrupt() returns a delta of (LONG_MAX > 1) in case there is no timer pending. On 64 bit machines this results in a multiplication overflow in tick_nohz_stop_sched_tick(). Reported by: Dave Miller <davem@davemloft.net> Make the return value a constant and limit the return value to a 32 bit value. When the max timeout value is returned, we can safely stop the tick timer device. The max jiffies delta results in a 12 days timeout for HZ=1000. In the long term the get_next_timer_interrupt() code needs to be reworked to return ktime instead of jiffies, but we have to wait until the last users of the original NO_IDLE_HZ code are converted. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-29 18:11:10 -07:00
Thomas Gleixner	3528231606	NOHZ: Rate limit the local softirq pending warning output The warning in the NOHZ code, which triggers when a CPU goes idle with softirqs pending can fill up the logs quite quickly. Rate limit the output until we found the root cause of that problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:11 -07:00
Thomas Gleixner	72fcde9662	Ignore bogus ACPI info for offline CPUs Booting a SMP kernel with maxcpus=1 on a SMP system leads to a hard hang, because ACPI ignores the maxcpus setting and sends timer broadcast info for the offline CPUs. This results in a stuck for ever call to smp_call_function_single() on an offline CPU. Ignore the bogus information and print a kernel error to remind ACPI folks to fix it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:11 -07:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Thomas Gleixner	8f89441b37	clocksource: fix lock order in the resume path lockdep complains about the lock nesting of clocksource and watchdog lock in the resume path. Change the resume marker to a bit operation and remove the lock from this path. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-15 08:54:00 -07:00
Thomas Gleixner	d10ff3fb62	timekeeping fix patch got mis-applied The time keeping code move to kernel/time/timekeeping.c broke the clocksource resume logic patch, which got applied to the old file by a fuzzy application. Fix it up and move the clocksource_resume() call to the appropriate place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ tssk, tssk, everybody should use --fuzz=0 ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-14 12:13:11 -07:00
Thomas Gleixner	b52f52a093	clocksource: fix resume logic We need to make sure that the clocksources are resumed, when timekeeping is resumed. The current resume logic does not guarantee this. Add a resume function pointer to the clocksource struct, so clocksource drivers which need to reinitialize the clocksource can provide a resume function. Add a resume function, which calls the maybe available clocksource resume functions and resets the watchdog function, so a stable TSC can be used accross suspend/resume. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
David Miller	9b04bd2756	Fix printk format warnings in timer_list.c u64 and s64 are not necessarily 'long long' on some 64-bit platforms, so explicit the type to kill the compiler warnings. Also consistently use '%Lu' which is unsigned. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:50 -07:00
Daniel Walker	35c35d1afa	clocksource: spelling error in watchdog code There's more that need fixing, and fix my own subject spelling error too. Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:49 -07:00
Siddha, Suresh B	46cb4b7c88	sched: dynticks idle load balancing Fix the process idle load balancing in the presence of dynticks. cpus for which ticks are stopped will sleep till the next event wakes it up. Potentially these sleeps can be for large durations and during which today, there is no periodic idle load balancing being done. This patch nominates an owner among the idle cpus, which does the idle load balancing on behalf of the other idle cpus. And once all the cpus are completely idle, then we can stop this idle load balancing too. Checks added in fast path are minimized. Whenever there are busy cpus in the system, there will be an owner(idle cpu) doing the system wide idle load balancing. Open items: 1. Intelligent owner selection (like an idle core in a busy package). 2. Merge with rcu's nohz_cpu_mask? Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:17 -07:00
Thomas Gleixner	d3ed782458	highres/dyntick: prevent xtime lock contention While the !highres/!dyntick code assigns the duty of the do_timer() call to one specific CPU, this was dropped in the highres/dyntick part during development. Steven Rostedt discovered the xtime lock contention on highres/dyntick due to several CPUs trying to update jiffies. Add the single CPU assignement back. In the dyntick case this needs to be handled carefully, as the CPU which has the do_timer() duty must drop the assignement and let it be grabbed by another CPU, which is active. Otherwise the do_timer() calls would not happen during the long sleep. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Acked-by: Mark Lord <mlord@pobox.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:10 -07:00
Alexey Dobriyan	9d65cb4a17	Fix race between cat /proc//wchan and rmmod et al kallsyms_lookup() can go iterating over modules list unprotected which is OK for emergency situations (oops), but not OK for regular stuff like /proc//wchan. Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol name into caller-supplied buffer or return -ERANGE. All copying is done with module_mutex held, so... Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
Alexey Dobriyan	ffb4512276	Simplify kallsyms_lookup() Several kallsyms_lookup() pass dummy arguments but only need, say, module's name. Make kallsyms_lookup() accept NULLs where possible. Also, makes picture clearer about what interfaces are needed for all symbol resolving business. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:08 -07:00
john stultz	8524070b79	Move timekeeping code to timekeeping.c Move the timekeeping code out of kernel/timer.c and into kernel/time/timekeeping.c. I made no cleanups or other changes in transit. [akpm@linux-foundation.org: build fix] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:06 -07:00
john stultz	98de9e3ba2	[PATCH] fix jiffies clocksource inittime In debugging a problem w/ the -rt tree, I noticed that on systems that mark the tsc as unstable before it is registered, the TSC would still be selected and used for a short period of time. Digging in it looks to be a result of the mix of the clocksource list changes and my clocksource initialization changes. With the -rt tree, using a bad TSC, even for a short period of time can results in a hang at boot. I was not able to reproduce this hang w/ mainline, but I'm not completely certain that someone won't trip on it. This patch resolves the issue by initializing the jiffies clocksource earlier so a bad TSC won't get selected just because nothing else is yet registered. Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-04 21:12:47 -07:00
john stultz	d62ac21aa0	[PATCH] ntp: avoid time_offset overflows I've been seeing some odd NTP behavior recently on a few boxes and finally narrowed it down to time_offset overflowing when converted to SHIFT_UPDATE units (which was a side effect from my HZfreeNTP patch). This patch converts time_offset from a long to a s64 which resolves the issue. [tglx@linutronix.de: signedness fixes] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-27 09:05:15 -07:00
Thomas Gleixner	291bc047e1	[PATCH] clockevents: remove bad designed sysfs support for now The current sysfs support of clockevents does not obey the "only one value per file" rule. The real fix is not 2.6.21 material. Therefor remove the sysfs support for now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-26 14:07:23 -07:00
Thomas Gleixner	948ac6d71c	[PATCH] clocksource: Fix thinko in watchdog selection The watchdog implementation excludes low res / non continuous clocksources from being selected as a watchdog reference unintentionally. Allow using jiffies/PIT as a watchdog reference as long as no better clocksource is available. This is necessary to detect TSC breakage on systems, which have no pmtimer/hpet. The main goal of the initial patch (preventing to switch to highres/nohz when no reliable fallback clocksource is available) is still guaranteed by the checks in clocksource_watchdog(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-25 14:57:34 -07:00
James Morris	0444b3035e	[PATCH] time: fix formatting in /proc/timer_list Fix the print formatting of three unsigned long fields in /proc/timer_list, which are currently being formatted as signed long. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-23 11:01:21 -07:00
Thomas Gleixner	cd05a1f818	[PATCH] clockevents: Fix suspend/resume to disk hangs I finally found a dual core box, which survives suspend/resume without crashing in the middle of nowhere. Sigh, I never figured out from the code and the bug reports what's going on. The observed hangs are caused by a stale state transition of the clock event devices, which keeps the RCU synchronization away from completion, when the non boot CPU is brought back up. The suspend/resume in oneshot mode needs the similar care as the periodic mode during suspend to RAM. My assumption that the state transitions during the different shutdown/bringups of s2disk would go through the periodic boot phase and then switch over to highres resp. nohz mode were simply wrong. Add the appropriate suspend / resume handling for the non periodic modes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-16 19:35:25 -07:00
Thomas Gleixner	6321dd60c7	[PATCH] Save/restore periodic tick information over suspend/resume The programming of periodic tick devices needs to be saved/restored across suspend/resume - otherwise we might end up with a system coming up that relies on getting a PIT (or HPET) interrupt, while those devices default to 'no interrupts' after powerup. (To confuse things it worked to a certain degree on some systems because the lapic gets initialized as a side-effect of SMP bootup.) This suspend / resume thing was dropped unintentionally during the last-minute -mm code reshuffling. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-06 09:30:24 -08:00
john stultz	6bb74df481	[PATCH] clocksource init adjustments (fix bug #7426 ) This patch resolves the issue found here: http://bugme.osdl.org/show_bug.cgi?id=7426 The basic summary is: Currently we register most of i386/x86_64 clocksources at module_init time. Then we enable clocksource selection at late_initcall time. This causes some problems for drivers that use gettimeofday for init calibration routines (specifically the es1968 driver in this case), where durring module_init, the only clocksource available is the low-res jiffies clocksource. This may cause slight calibration errors, due to the small sampling time used. It should be noted that drivers that require fine grained time may not function on architectures that do not have better then jiffies resolution timekeeping (there are a few). However, this does not discount the reasonable need for such fine-grained timekeeping at init time. Thus the solution here is to register clocksources earlier (ideally when the hardware is being initialized), and then we enable clocksource selection at fs_initcall (before device_initcall). This patch should probably get some testing time in -mm, since clocksource selection is one of the most important issues for correct timekeeping, and I've only been able to test this on a few of my own boxes. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-05 07:57:53 -08:00
David S. Miller	3494c16676	[TICK] tick-common: Fix one-shot handling in tick_handle_periodic(). When clockevents_program_event() is given an expire time in the past, it does not update dev->next_event, so this looping code would loop forever once the first in-the-past expiration time was used. Keep advancing "next" locally to fix this bug. Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:14:15 -08:00
David S. Miller	9e203bcc10	[TIME] tick-sched: Add missing asm/irq_regs.h include. Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:13:49 -08:00
Thomas Gleixner	bc5393a6c9	[PATCH] NOHZ: Produce debug output instead of a BUG() The BUG_ON() in tick_nohz_stop_sched_tick() triggers on some boxen. Remove the BUG_ON and print information about the pending softirq to allow better debugging of the problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-19 14:18:43 -08:00
Ingo Molnar	6ba9b346e1	[PATCH] NOHZ: Fix RCU handling When a CPU is needed for RCU the tick has to continue even when it was stopped before. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-19 14:18:43 -08:00
Ingo Molnar	289f480af8	[PATCH] Add debugging feature /proc/timer_list add /proc/timer_list, which prints all currently pending (high-res) timers, all clock-event sources and their parameters in a human-readable form. Sample output: Timer List Version: v0.1 HRTIMER_MAX_CLOCK_BASES: 2 now at 4246046273872 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1273998312645738432 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0 # expires at 4246432689566 nsecs [in 386415694 nsecs] #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050 # expires at 4247018194689 nsecs [in 971920817 nsecs] #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909 # expires at 4247351358392 nsecs [in 1305084520 nsecs] #3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157 # expires at 4249097614968 nsecs [in 3051341096 nsecs] #4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888 # expires at 4251329900926 nsecs [in 5283627054 nsecs] .expires_next : 4246432689566 nsecs .hres_active : 1 .check_clocks : 0 .nr_events : 31306 .idle_tick : 4246020791890 nsecs .tick_stopped : 1 .idle_jiffies : 986504 .idle_calls : 40700 .idle_sleeps : 36014 .idle_entrytime : 4246019418883 nsecs .idle_sleeptime : 4178181972709 nsecs cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1273998312645738432 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0 # expires at 4246050084568 nsecs [in 3810696 nsecs] #1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227 # expires at 4261010635003 nsecs [in 14964361131 nsecs] #2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332 # expires at 5469485798970 nsecs [in 1223439525098 nsecs] .expires_next : 4246050084568 nsecs .hres_active : 1 .check_clocks : 0 .nr_events : 24043 .idle_tick : 4246046084568 nsecs .tick_stopped : 0 .idle_jiffies : 986510 .idle_calls : 26360 .idle_sleeps : 22551 .idle_entrytime : 4246043874339 nsecs .idle_sleeptime : 4170763761184 nsecs tick_broadcast_mask: 00000003 event_broadcast_mask: 00000001 CPU#0's local event device: Clock Event Device: lapic capabilities: 0000000e max_delta_ns: 807385544 min_delta_ns: 1443 mult: 44624025 shift: 32 set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt .installed: 1 .expires: 4246432689566 nsecs CPU#1's local event device: Clock Event Device: lapic capabilities: 0000000e max_delta_ns: 807385544 min_delta_ns: 1443 mult: 44624025 shift: 32 set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt .installed: 1 .expires: 4246050084568 nsecs Clock Event Device: hpet capabilities: 00000007 max_delta_ns: 2147483647 min_delta_ns: 3352 mult: 61496110 shift: 32 set_next_event: hpet_next_event set_mode: hpet_set_mode event_handler: handle_nextevt_broadcast Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-16 08:13:59 -08:00
Ingo Molnar	82f67cd9fc	[PATCH] Add debugging feature /proc/timer_stat Add /proc/timer_stats support: debugging feature to profile timer expiration. Both the starting site, process/PID and the expiration function is captured. This allows the quick identification of timer event sources in a system. Sample output: # echo 1 > /proc/timer_stats # cat /proc/timer_stats Timer Stats Version: v0.1 Sample period: 4.010 s 24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 11, 0 swapper sk_reset_timer (tcp_delack_timer) 6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 4, 2050 pcscd do_nanosleep (hrtimer_wakeup) 5, 4179 sshd sk_reset_timer (tcp_write_timer) 4, 2248 yum-updatesd schedule_timeout (process_timeout) 18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 3, 0 swapper sk_reset_timer (tcp_delack_timer) 1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) 2, 1 swapper e1000_up (e1000_watchdog) 1, 1 init schedule_timeout (process_timeout) 100 total events, 25.24 events/sec [ cleanups and hrtimers support from Thomas Gleixner <tglx@linutronix.de> ] [bunk@stusta.de: nr_entries can become static] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-16 08:13:59 -08:00
Thomas Gleixner	54cdfdb47f	[PATCH] hrtimers: add high resolution timer support Implement high resolution timers on top of the hrtimers infrastructure and the clockevents / tick-management framework. This provides accurate timers for all hrtimer subsystem users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-16 08:13:59 -08:00

1 2 3 4

177 Commits