linux_dsm_epyc7002/kernel
Oleg Nesterov 4ab6c08336 clone(): fix race between copy_process() and de_thread()
Spotted by Hiroshi Shimamoto who also provided the test-case below.

copy_process() uses signal->count as a reference counter, but it is not one.
This test case

	#include <sys/types.h>
	#include <sys/wait.h>
	#include <unistd.h>
	#include <stdio.h>
	#include <errno.h>
	#include <pthread.h>

	void *null_thread(void *p)
	{
		for (;;)
			sleep(1);

		return NULL;
	}

	void *exec_thread(void *p)
	{
		execl("/bin/true", "/bin/true", NULL);

		return null_thread(p);
	}

	int main(int argc, char **argv)
	{
		for (;;) {
			pid_t pid;
			int ret, status;

			pid = fork();
			if (pid < 0)
				break;

			if (!pid) {
				pthread_t tid;

				pthread_create(&tid, NULL, exec_thread, NULL);
				for (;;)
					pthread_create(&tid, NULL, null_thread, NULL);
			}

			do {
				ret = waitpid(pid, &status, 0);
			} while (ret == -1 && errno == EINTR);
		}

		return 0;
	}

quickly creates an unkillable task.

If copy_process(CLONE_THREAD) races with de_thread(), the
copy_signal()->atomic_inc(signal->count) breaks the signal->notify_count
logic, and the execing thread can hang forever in kernel space.

Change copy_process() to increment count/live only when we know for sure
we can't fail.  In this case the forked thread will take care of its
reference to signal correctly.

If copy_process() fails, check the CLONE_THREAD flag.  If it is set, do
nothing: the counters were not changed and current belongs to the same
thread group.  If it is not set, ->signal must be released in any case
(and ->count must be == 1), because the forked child is the only thread
in the thread group.
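
The ordering described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel code:
struct sig_model, MODEL_CLONE_THREAD, model_copy_process() and model_demo()
are hypothetical names invented here, and the "released" flag stands in for
whatever the real error path does to free ->signal.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for CLONE_THREAD; not the kernel's value. */
#define MODEL_CLONE_THREAD 0x1u

/* Hypothetical model of the two counters discussed in the changelog. */
struct sig_model {
	atomic_int count;
	atomic_int live;
	bool released;		/* set when the error path drops the struct */
};

/*
 * Model of the fixed ordering: nothing that can fail runs after the
 * shared counters are incremented.  Returns 0 on success, -1 on failure.
 */
static int model_copy_process(struct sig_model *cur, struct sig_model *fresh,
			      unsigned int flags, bool fail)
{
	bool thread = (flags & MODEL_CLONE_THREAD) != 0;

	if (!thread) {
		/* A plain fork gets its own struct, owned by the child only. */
		atomic_init(&fresh->count, 1);
		atomic_init(&fresh->live, 1);
		fresh->released = false;
	}

	if (fail) {
		/* CLONE_THREAD: shared counters were never touched, do nothing. */
		if (!thread) {
			/* The child was the only thread, so count must be 1. */
			assert(atomic_load(&fresh->count) == 1);
			fresh->released = true;
		}
		return -1;
	}

	/* Point of no return: only now account for the new thread. */
	if (thread) {
		atomic_fetch_add(&cur->count, 1);
		atomic_fetch_add(&cur->live, 1);
	}
	return 0;
}

/* Walks the three cases; returns 0 if the model behaves as described. */
static int model_demo(void)
{
	struct sig_model group, fresh;

	atomic_init(&group.count, 1);
	atomic_init(&group.live, 1);
	group.released = false;

	/* Failed thread clone: shared counters stay unchanged. */
	if (model_copy_process(&group, &fresh, MODEL_CLONE_THREAD, true) != -1)
		return 1;
	if (atomic_load(&group.count) != 1 || atomic_load(&group.live) != 1)
		return 2;

	/* Successful thread clone: both counters go up by one. */
	if (model_copy_process(&group, &fresh, MODEL_CLONE_THREAD, false) != 0)
		return 3;
	if (atomic_load(&group.count) != 2 || atomic_load(&group.live) != 2)
		return 4;

	/* Failed fork: the child's private struct is released. */
	if (model_copy_process(&group, &fresh, 0, true) != -1)
		return 5;
	if (!fresh.released || atomic_load(&group.count) != 2)
		return 6;

	return 0;
}
```

Calling model_demo() from a test harness should return 0.  The point of the
model is that a racing observer of the shared counters (de_thread() in the
real code) never sees an increment that is later undone by a failed clone.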

We need more cleanups here, in particular signal->count should not be used
by de_thread/__exit_signal at all.  This patch only fixes the bug.

Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Tested-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26 20:06:52 -07:00
gcov gcov: enable GCOV_PROFILE_ALL for x86_64 2009-06-18 13:03:58 -07:00
irq genirq: Wake up irq thread after action has been installed 2009-08-18 17:22:43 +02:00
power headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
time clockevent: Prevent dead lock on clockevents_lock 2009-08-19 18:15:10 +02:00
trace tracing: handle broken names in ftrace filter 2009-08-18 20:39:48 -04:00
.gitignore
acct.c bsdacct: fix access to invalid filp in acct_on() 2009-06-30 18:56:00 -07:00
async.c async: Fix lack of boot-time console due to insufficient synchronization 2009-06-08 12:31:53 -07:00
audit_tree.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
audit_watch.c Audit: clean up all op= output to include string quoting 2009-06-24 00:00:52 -04:00
audit.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
audit.h Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
auditfilter.c Audit: clean up all op= output to include string quoting 2009-06-24 00:00:52 -04:00
auditsc.c Fix rule eviction order for AUDIT_DIR 2009-06-24 00:02:38 -04:00
backtracetest.c
bounds.c
capability.c
cgroup_debug.c debug cgroup: remove unneeded cgroup_lock 2009-04-02 19:04:54 -07:00
cgroup_freezer.c
cgroup.c cgroup avoid permanent sleep at rmdir 2009-07-29 19:10:35 -07:00
compat.c signals: implement sys_rt_tgsigqueueinfo 2009-04-30 19:24:24 +02:00
configs.c
cpu.c mm/init: cpu_hotplug_init() must be initialized before SLAB 2009-06-22 21:18:12 -07:00
cpuset.c cpuset,mm: update tasks' mems_allowed in time 2009-06-16 19:47:31 -07:00
cred-internals.h
cred.c CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace 2009-05-11 08:15:36 +10:00
delayacct.c
dma-coherent.c
dma.c
exec_domain.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
exit.c headers: mnt_namespace.h redux 2009-07-08 09:31:56 -07:00
extable.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
fork.c clone(): fix race between copy_process() and de_thread() 2009-08-26 20:06:52 -07:00
freezer.c sched: fix nr_uninterruptible accounting of frozen tasks really 2009-07-18 14:19:53 +02:00
futex_compat.c futex: Fix compat_futex to be same as futex for REQUEUE_PI 2009-08-10 15:41:12 +02:00
futex.c futex: Fix handling of bad requeue syscall pairing 2009-08-10 20:38:11 +02:00
groups.c groups: move code to kernel/groups.c 2009-06-16 19:47:48 -07:00
hrtimer.c hrtimer: Fix migration expiry check 2009-07-10 17:32:55 +02:00
hung_task.c
itimer.c
kallsyms.c kernel/kallsyms.c: replace deprecated __initcall with device_initcall and fix whitespace 2009-06-09 22:37:52 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.preempt
kexec.c kexec: fix omitting offset in extended crashkernel syntax 2009-07-29 19:10:34 -07:00
kfifo.c kernel/kfifo.c: replace conditional test with is_power_of_2() 2009-06-16 19:47:47 -07:00
kgdb.c sysrq, intel_fb: fix sysrq g collision 2009-05-15 07:56:24 -05:00
kmod.c headers: mnt_namespace.h redux 2009-07-08 09:31:56 -07:00
kprobes.c kprobes: Use kernel_text_address() for checking probe address 2009-07-30 16:44:06 -07:00
ksysfs.c
kthread.c update the comment in kthread_stop() 2009-07-27 12:15:46 -07:00
latencytop.c
lockdep_internals.h lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS 2009-05-12 19:59:52 +02:00
lockdep_proc.c lockdep: Fix file mode of lock_stat 2009-08-07 11:58:38 +02:00
lockdep_states.h
lockdep.c Merge branch 'linus' into tracing/core 2009-05-07 11:17:34 +02:00
Makefile Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-28 11:05:04 -07:00
marker.c
module.c module: use MODULE_SYMBOL_PREFIX with module_layout 2009-07-27 12:15:45 -07:00
mutex-debug.c
mutex-debug.h
mutex.c Merge branch 'linus' into perfcounters/core 2009-06-11 17:55:42 +02:00
mutex.h
notifier.c
ns_cgroup.c cgroups: relax ns_can_attach checks to allow attaching to grandchild cgroups 2009-04-02 19:04:53 -07:00
nsproxy.c nsproxy: extract create_nsproxy() 2009-06-18 13:03:56 -07:00
panic.c trace: stop tracer in oops_enter() 2009-07-24 15:30:45 -04:00
params.c module_param: allow 'bool' module_params to be bool, not just int. 2009-06-12 21:46:58 +09:30
perf_counter.c perf_counter: Fix typo in read() output generation 2009-08-21 18:00:35 +02:00
pid_namespace.c pidns: rewrite copy_pid_ns() 2009-06-18 13:03:55 -07:00
pid.c kmemleak: Remove alloc_bootmem annotations introduced in the past 2009-07-09 17:07:02 +01:00
pm_qos_params.c
posix-cpu-timers.c posix_cpu_timers_exit_group(): Do not use thread_group_cputimer() 2009-08-08 18:30:25 +02:00
posix-timers.c posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAW 2009-08-04 10:16:41 +02:00
printk.c printk: Add KERN_DEFAULT printk log-level 2009-06-16 11:02:28 -07:00
profile.c profile: suppress warning about large allocations when profile=1 is specified 2009-07-29 19:10:36 -07:00
ptrace.c cred_guard_mutex: do not return -EINTR to user-space 2009-07-06 13:57:04 -07:00
rcuclassic.c kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies 2009-04-03 12:23:02 +02:00
rcupdate.c RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc 2009-04-15 13:55:14 -07:00
rcupreempt_trace.c
rcupreempt.c rcu: rcu_sched_grace_period(): kill the bogus flush_signals() 2009-05-05 20:28:05 +02:00
rcutorture.c cpumask: convert rcutorture.c 2009-03-30 22:05:16 +10:30
rcutree_trace.c rcu: Add __rcu_pending tracing to hierarchical RCU 2009-04-14 11:33:43 +02:00
rcutree.c rcu: Mark Hierarchical RCU no longer experimental 2009-06-24 15:02:48 +02:00
rcutree.h kmemtrace, rcu: fix rcu_tree_trace.c data structure dependencies 2009-04-03 12:23:03 +02:00
relay.c Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-04-05 11:04:19 -07:00
res_counter.c memcg: add interface to reset limits 2009-06-18 13:03:48 -07:00
resource.c kernel/resource.c: fix sign extension in reserve_setup() 2009-06-30 18:56:00 -07:00
rtmutex_common.h rt_mutex: add proxy lock routines 2009-04-06 11:14:02 +02:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c rtmutex: Avoid deadlock in rt_mutex_start_proxy_lock() 2009-08-06 05:50:21 +02:00
rtmutex.h
rwsem.c
sched_clock.c sched: Fix fallback sched_clock()'s offset when using jiffies 2009-05-09 10:08:19 +02:00
sched_cpupri.c sched: Fix race in cpupri introduced by cpumask_var changes 2009-08-02 14:23:29 +02:00
sched_cpupri.h cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sched_debug.c sched: Hide runqueues from direct refer at source code level 2009-06-17 18:29:42 +02:00
sched_fair.c sched: Fix latencytop and sleep profiling vs group scheduling 2009-08-02 14:10:12 +02:00
sched_features.h Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-30 17:17:35 -07:00
sched_idletask.c sched, timers: move calc_load() to scheduler 2009-05-15 15:32:45 +02:00
sched_rt.c sched_rt: Fix overload bug on rt group scheduling 2009-07-10 10:43:29 +02:00
sched_stats.h sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
sched.c sched: fix load average accounting vs. cpu hotplug 2009-07-18 14:19:52 +02:00
seccomp.c x86-64: seccomp: fix 32/64 syscall hole 2009-03-02 15:41:30 -08:00
semaphore.c
signal.c do_sigaltstack: small cleanups 2009-08-01 11:18:56 -07:00
slow-work.c slow-work: use round_jiffies() for thread pool's cull and OOM timers 2009-06-16 19:47:49 -07:00
smp.c generic-ipi: fix hotplug_cfd() 2009-08-07 10:39:55 -07:00
softirq.c softirq: introduce tasklet_hrtimer infrastructure 2009-07-22 17:01:17 +02:00
softlockup.c
spinlock.c Allow rwlocks to re-enable interrupts 2009-04-02 19:05:11 -07:00
srcu.c
stacktrace.c
stop_machine.c cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sys_ni.c
sys.c groups: move code to kernel/groups.c 2009-06-16 19:47:48 -07:00
sysctl_check.c
sysctl.c Security/SELinux: seperate lsm specific mmap_min_addr 2009-08-17 15:09:11 +10:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c timer: Avoid reading uninitialized data 2009-07-18 23:11:43 +02:00
tracepoint.c tracepoints: dont update zero-sized tracepoint sections 2009-03-18 19:55:00 +01:00
tsacct.c Fix fixpoint divide exception in acct_update_integrals 2009-03-09 08:13:35 -07:00
uid16.c
up.c
user_namespace.c Fix recursive lock in free_uid()/free_user_ns() 2009-02-27 16:26:21 -08:00
user.c sched: delayed cleanup of user_struct 2009-06-15 21:30:23 -07:00
utsname_sysctl.c proc_sysctl: use CONFIG_PROC_SYSCTL around ipc and utsname proc_handlers 2009-04-02 19:05:01 -07:00
utsname.c utsns: extract creeate_uts_ns() 2009-06-18 13:03:55 -07:00
wait.c locking, sched: Give waitqueue spinlocks their own lockdep classes 2009-08-10 14:43:09 +02:00
workqueue.c ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro 2009-06-02 01:10:40 +02:00