linux_dsm_epyc7002/kernel
Roland Dreier bb10ed0994 sched: fix wait_for_completion_timeout() spurious failure under heavy load
It seems that the current implementaton of wait_for_completion_timeout()
has a small problem under very high load for the common pattern:

	if (!wait_for_completion_timeout(&done, timeout))
		/* handle failure */

because the implementation very roughly does (lots of code deleted to
show the basic flow):

	static inline long __sched
	do_wait_for_common(struct completion *x, long timeout, int state)
	{
		if (x->done)
			return timeout;

		do {
			timeout = schedule_timeout(timeout);

			if (!timeout)
				return timeout;

		} while (!x->done);

		return timeout;
	}

so if the system is very busy and x->done is not set when
do_wait_for_common() is entered, it is possible that the first call to
schedule_timeout() returns 0 because the task doing wait_for_completion
doesn't get rescheduled for a long time, even if it is woken up early
enough.

In this case, wait_for_completion_timeout() returns 0 without even
checking x->done again, and the code above falls into its failure case
purely for scheduler reasons, even if the hardware event or whatever was
being waited for happened early enough.

It would make sense to add an extra test to do_wait_for() in the timeout
case and return 1 if x->done is actually set.

A quick audit (not exhaustive) of wait_for_completion_timeout() callers
seems to indicate that no one actually cares about the return value in
the success case -- they just test for 0 (timed out) versus non-zero
(wait succeeded).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 13:19:32 +02:00
..
irq genirq: reenable a nobody cared disabled irq when a new driver arrives 2008-05-02 13:40:34 +02:00
power Merge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release 2008-04-30 13:58:00 -04:00
time clocksource: allow read access to available/current_clocksource 2008-05-03 18:11:48 +02:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
audit.c [patch 1/1] audit_send_reply(): fix error-path memory leak 2008-05-17 03:30:22 -04:00
audit.h [PATCH 1/2] audit: move extern declarations to audit.h 2008-04-28 06:28:04 -04:00
auditfilter.c Merge branch 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current 2008-04-29 11:41:22 -07:00
auditsc.c [PATCH] new predicate - AUDIT_FILETYPE 2008-04-28 06:28:37 -04:00
backtracetest.c x86: add a simple backtrace test module 2008-01-30 13:33:08 +01:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c capabilities: remain source compatible with 32-bit raw legacy capability support. 2008-05-31 16:36:16 -07:00
cgroup_debug.c CGroup API files: move "releasable" to cgroup_debug subsystem 2008-04-29 08:06:09 -07:00
cgroup.c cgroups: remove node_ prefix_from ns subsystem 2008-05-24 09:56:14 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
cpu.c kernel: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cpuset.c sched: CPU hotplug events must not destroy scheduler domains created by the cpusets 2008-06-19 09:14:51 +02:00
delayacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
dma.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
exec_domain.c whitespace fixes: execution domains 2007-10-18 14:37:26 -07:00
exit.c signals: fix sigqueue_free() vs __exit_signal() race 2008-05-24 09:56:10 -07:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c [PATCH] dup_fd() fixes, part 1 2008-05-16 17:22:26 -04:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
futex.c Removal of FUTEX_FD 2008-05-05 08:18:45 -07:00
hrtimer.c hrtimer: remove duplicate helper function 2008-05-03 18:11:48 +02:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
Kconfig.hz sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c is_power_of_2: kernel/kfifo.c 2007-07-16 09:05:50 -07:00
kgdb.c kgdb: use common ascii helpers and put_unaligned_be32 helper 2008-05-28 12:49:56 -05:00
kmod.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
kprobes.c kprobes: fix error checking of batch registration 2008-06-12 18:05:40 -07:00
ksysfs.c Kobject: convert remaining kobject_unregister() to kobject_put() 2008-01-24 20:40:40 -08:00
kthread.c Deprecate find_task_by_pid() 2008-04-30 08:29:48 -07:00
latencytop.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep_internals.h [PATCH] lockdep: more chains 2006-12-07 08:39:43 -08:00
lockdep_proc.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep.c Subject: lockdep: include all lock classes in all_lock_classes 2008-02-25 23:03:02 +01:00
Makefile sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
marker.c make marker_debug static 2008-04-30 08:29:49 -07:00
module.c modules: proper cleanup of kobject without CONFIG_SYSFS 2008-05-23 13:09:33 +10:00
mutex-debug.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
mutex.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex.h [PATCH] lockdep: prove mutex locking correctness 2006-07-03 15:27:04 -07:00
notifier.c ipc: re-enable msgmni automatic recomputing msgmni if set to negative 2008-04-29 08:06:13 -07:00
ns_cgroup.c cgroups: kernel/ns_cgroup.c should #include <linux/nsproxy.h> 2008-04-29 08:06:07 -07:00
nsproxy.c ipc: sysvsem: refuse clone(CLONE_SYSVSEM|CLONE_NEWIPC) 2008-04-29 08:06:14 -07:00
panic.c Taint kernel after WARN_ON(condition) 2008-04-29 08:05:59 -07:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pid.c pids: introduce change_pid() helper 2008-04-30 08:29:48 -07:00
pm_qos_params.c pm qos infrastructure and interface 2008-02-05 09:44:22 -08:00
posix-cpu-timers.c remove div_long_long_rem 2008-05-01 08:03:58 -07:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c printk: don't read beyond string arguments' terminating zero 2008-04-30 08:29:52 -07:00
profile.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
ptrace.c make generic sys_ptrace unconditional 2008-05-01 10:21:54 -07:00
rcuclassic.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupdate.c rcupdate: fix comment 2008-02-13 16:21:18 -08:00
rcupreempt_trace.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupreempt.c generic: reduce stack pressure in sched_affinity 2008-04-19 19:44:59 +02:00
rcutorture.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
relay.c splice: fix sendfile() issue with relay 2008-05-28 14:49:27 +02:00
res_counter.c memcgroup: add the max_usage member on the res_counter 2008-04-29 08:06:10 -07:00
resource.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rtmutex-tester.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rwsem.c sched: mark rwsem functions as __sched for wchan/profiling 2007-12-18 15:21:13 +01:00
sched_clock.c sched: fix sched_clock_cpu() 2008-05-29 11:29:19 +02:00
sched_debug.c revert ("sched: fair-group: SMP-nice for group scheduling") 2008-05-29 11:28:57 +02:00
sched_fair.c sched: stop wake_affine from causing serious imbalance 2008-05-29 11:29:20 +02:00
sched_features.h sched: /debug/sched_features 2008-04-19 19:45:00 +02:00
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c sched: rt: dont stop the period timer when there are tasks wanting to run 2008-06-20 11:00:19 +02:00
sched_stats.h sched, delay accounting: fix incorrect delay time when constantly waiting on runqueue 2008-06-19 14:15:28 +02:00
sched.c sched: fix wait_for_completion_timeout() spurious failure under heavy load 2008-06-20 13:19:32 +02:00
seccomp.c make seccomp zerocost in schedule 2007-07-16 09:05:50 -07:00
semaphore.c Revert "semaphore: fix" 2008-05-10 20:43:22 -07:00
signal.c posix timers: discard SI_TIMER signals on exec 2008-05-26 10:37:07 -07:00
softirq.c Fix cpu hotplug problem in softirq code 2008-05-01 08:03:58 -07:00
softlockup.c softlockup: fix task state setting 2008-02-29 18:46:53 +01:00
spinlock.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
srcu.c make srcu_readers_active() static 2008-02-06 10:41:02 -08:00
stacktrace.c [PATCH] lockdep: stacktrace subsystem, core 2006-07-03 15:27:02 -07:00
stop_machine.c stop_machine: make stop_machine_run more virtualization friendly 2008-05-23 13:09:34 +10:00
sys_ni.c timerfd: new timerfd API 2008-02-05 09:44:07 -08:00
sys.c sys_prctl(): fix return of uninitialized value 2008-05-24 09:56:13 -07:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
sysctl.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
taskstats.c Use find_task_by_vpid in taskstats 2008-04-30 08:29:48 -07:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c debugobjects: add timer specific object debugging code 2008-04-30 08:29:53 -07:00
tsacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
uid16.c asmlinkage_protect replaces prevent_tail_call 2008-04-10 17:28:26 -07:00
user_namespace.c eCryptfs: make key module subsystem respect namespaces 2008-04-29 08:06:07 -07:00
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
utsname_sysctl.c Isolate the UTS namespace's domainname and hostname back 2007-11-29 09:24:53 -08:00
utsname.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c workqueue: remove redundant function invocation 2008-05-01 08:04:02 -07:00