linux_dsm_epyc7002/arch/arm/kernel
Russell King f8f2a8522a ARM: vfp: fix a hole in VFP thread migration
Fix a hole in the VFP thread migration.  Lets define two threads.

Thread 1, we'll call 'interesting_thread' which is a thread which is
running on CPU0, using VFP (so vfp_current_hw_state[0] =
&interesting_thread->vfpstate) and gets migrated off to CPU1, where
it continues execution of VFP instructions.

Thread 2, we'll call 'new_cpu0_thread' which is the thread which takes
over on CPU0.  This has also been using VFP, and last used VFP on CPU0,
but doesn't use it again.

The following code will be executed twice:

		cpu = thread->cpu;

		/*
		 * On SMP, if VFP is enabled, save the old state in
		 * case the thread migrates to a different CPU. The
		 * restoring is done lazily.
		 */
		if ((fpexc & FPEXC_EN) && vfp_current_hw_state[cpu]) {
			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
			vfp_current_hw_state[cpu]->hard.cpu = cpu;
		}
		/*
		 * Thread migration, just force the reloading of the
		 * state on the new CPU in case the VFP registers
		 * contain stale data.
		 */
		if (thread->vfpstate.hard.cpu != cpu)
			vfp_current_hw_state[cpu] = NULL;

The first execution will be on CPU0 to switch away from 'interesting_thread'.
interesting_thread->cpu will be 0.

So, vfp_current_hw_state[0] points at interesting_thread->vfpstate.
The hardware state will be saved, along with the CPU number (0) that
it was executing on.

'thread' will be 'new_cpu0_thread' with new_cpu0_thread->cpu = 0.
Also, because it was executing on CPU0, new_cpu0_thread->vfpstate.hard.cpu = 0,
and so the thread migration check is not triggered.

This means that vfp_current_hw_state[0] remains pointing at interesting_thread.

The second execution will be on CPU1 to switch _to_ 'interesting_thread'.
So, 'thread' will be 'interesting_thread' and interesting_thread->cpu now
will be 1.  The previous thread executing on CPU1 is not relevant to this
so we shall ignore that.

We get to the thread migration check.  Here, we discover that
interesting_thread->vfpstate.hard.cpu = 0, yet interesting_thread->cpu is
now 1, indicating thread migration.  We set vfp_current_hw_state[1] to
NULL.

So, at this point vfp_current_hw_state[] contains the following:

[0] = &interesting_thread->vfpstate
[1] = NULL

Our interesting thread now executes a VFP instruction, takes a fault
which loads the state into the VFP hardware.  Now, through the assembly
we now have:

[0] = &interesting_thread->vfpstate
[1] = &interesting_thread->vfpstate

CPU1 stops due to ptrace (and so saves its VFP state) using the thread
switch code above), and CPU0 calls vfp_sync_hwstate().

	if (vfp_current_hw_state[cpu] == &thread->vfpstate) {
		vfp_save_state(&thread->vfpstate, fpexc | FPEXC_EN);

BANG, we corrupt interesting_thread's VFP state by overwriting the
more up-to-date state saved by CPU1 with the old VFP state from CPU0.

Fix this by ensuring that we have sane semantics for the various state
describing variables:

1. vfp_current_hw_state[] points to the current owner of the context
   information stored in each CPUs hardware, or NULL if that state
   information is invalid.
2. thread->vfpstate.hard.cpu always contains the most recent CPU number
   which the state was loaded into or NR_CPUS if no CPU owns the state.

So, for a particular CPU to be a valid owner of the VFP state for a
particular thread t, two things must be true:

 vfp_current_hw_state[cpu] == &t->vfpstate && t->vfpstate.hard.cpu == cpu.

and that is valid from the moment a CPU loads the saved VFP context
into the hardware.  This gives clear and consistent semantics to
interpreting these variables.

This patch also fixes thread copying, ensuring that t->vfpstate.hard.cpu
is invalidated, otherwise CPU0 may believe it was the last owner.  The
hole can happen thus:

- thread1 runs on CPU2 using VFP, migrates to CPU3, exits and thread_info
  freed.
- New thread allocated from a previously running thread on CPU2, reusing
  memory for thread1 and copying vfp.hard.cpu.

At this point, the following are true:

	new_thread1->vfpstate.hard.cpu == 2
	&new_thread1->vfpstate == vfp_current_hw_state[2]

Lastly, this also addresses thread flushing in a similar way to thread
copying.  Hole is:

- thread runs on CPU0, using VFP, migrates to CPU1 but does not use VFP.
- thread calls execve(), so thread flush happens, leaving
  vfp_current_hw_state[0] intact.  This vfpstate is memset to 0 causing
  thread->vfpstate.hard.cpu = 0.
- thread migrates back to CPU0 before using VFP.

At this point, the following are true:

	thread->vfpstate.hard.cpu == 0
	&thread->vfpstate == vfp_current_hw_state[0]

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-09 17:22:12 +01:00
..
.gitignore [ARM] 5194/1: update .gitignore 2008-08-12 19:54:09 +01:00
armksyms.c Merge branch 'p2v' into devel 2011-03-16 23:35:27 +00:00
arthur.c [ARM] arm/kernel/arthur.c: add MODULE_LICENSE 2008-05-17 22:55:16 +01:00
asm-offsets.c ARM: vfp: fix a hole in VFP thread migration 2011-07-09 17:22:12 +01:00
atags.c clean up atags exporting code 2008-05-30 10:33:49 +02:00
atags.h [ARM] 4736/1: Export atags to userspace and allow kexec to use customised atags 2008-02-04 13:21:03 +00:00
bios32.c arm: bios32: Remove non exisiting machine code 2011-03-29 14:47:50 +02:00
calls.S Merge branch 'setns' 2011-05-28 10:51:01 -07:00
compat.c ARM: deprecate support for old way to pass kernel parameters 2010-07-07 16:38:36 +02:00
compat.h ARM: deprecate support for old way to pass kernel parameters 2010-07-07 16:38:36 +02:00
crash_dump.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
crunch-bits.S [ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach 2008-08-07 09:55:48 +01:00
crunch.c ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread() 2009-12-18 14:53:41 +00:00
debug.S ARM: 6826/1: Merge v6 and v7 DEBUG_LL DCC support 2011-03-28 19:01:43 +01:00
devtree.c ARM: 6953/1: DT: don't try to access physical address zero 2011-06-09 10:15:06 +01:00
dma-isa.c ARM: dma-isa: request cascade channel after registering it 2009-12-24 18:34:08 +00:00
dma.c ARM: dma: add /proc/dma support to arch/arm/kernel/dma.c 2010-04-14 13:13:30 +01:00
early_printk.c ARM: Add an earlyprintk debug console 2009-12-09 10:02:18 +00:00
ecard.c arm: Fold irq_set_chip/irq_set_handler 2011-03-29 14:47:58 +02:00
ecard.h [ARM] rpc: ecard: remove deprecated ecard_address() and relatives 2008-07-03 14:25:58 +01:00
elf.c ARM: 6878/1: fix personality flag propagation across an exec 2011-04-14 09:15:24 +01:00
entry-armv.S ARM: 6952/1: fix lockdep warning of "unannotated irqs-off" 2011-06-06 10:56:22 +01:00
entry-common.S ARM: 6952/1: fix lockdep warning of "unannotated irqs-off" 2011-06-06 10:56:22 +01:00
entry-header.S ARM: v6k: select clear exclusive code seqences according to V6 variants 2011-02-02 21:23:28 +00:00
etm.c ARM: 6838/1: etm: fix section mismatch warning 2011-03-28 19:01:17 +01:00
fiq.c ARM: 6938/1: fiq: Refactor {get,set}_fiq_regs() for Thumb-2 2011-05-26 10:31:06 +01:00
fiqasm.S ARM: 6938/1: fiq: Refactor {get,set}_fiq_regs() for Thumb-2 2011-05-26 10:31:06 +01:00
ftrace.c ARM: ftrace: graph tracer + dynamic ftrace 2010-11-19 21:43:27 +05:30
head-common.S arm/dt: Make __vet_atags also accept a dtb image 2011-05-11 15:12:32 +02:00
head-nommu.S ARM: Defer lookup of machine_type to setup.c 2011-02-15 16:36:44 +00:00
head.S Merge branches 'devel', 'devel-stable' and 'fixes' into for-linus 2011-05-27 22:59:57 +01:00
hw_breakpoint.c ARM: 6864/1: hw_breakpoint: clear DBGVCR out of reset 2011-04-10 21:13:35 +01:00
init_task.c Use new __init_task_data macro in arch init_task.c files. 2009-09-21 06:27:08 +02:00
io.c [ARM] Convert asm/io.h to linux/io.h 2008-09-06 12:10:45 +01:00
irq.c arm: Use generic show_interrupts() 2011-03-29 14:47:57 +02:00
isa.c sysctl: Drop & in front of every proc_handler. 2009-11-18 08:37:40 -08:00
iwmmxt.S ARM: pxa: add iwmmx support for PJ4 2010-12-20 23:07:36 +08:00
kgdb.c kgdb,arm: fix register dump 2010-10-29 13:14:40 -05:00
kprobes-decode.c ARM: kprobes: Tidy-up kprobes-decode.c 2011-04-28 23:41:01 -04:00
kprobes.c ARM: kprobes: Fix probing of conditionally executed instructions 2011-04-28 23:40:54 -04:00
leds.c ARM: Use struct syscore_ops instead of sysdevs for PM in common code 2011-04-24 19:16:08 +02:00
machine_kexec.c [ARM] add machine-specific hook to machine_kexec 2011-03-03 16:26:55 -05:00
Makefile Merge branches 'devel', 'devel-stable' and 'fixes' into for-linus 2011-05-27 22:59:57 +01:00
module.c ARM: 6963/1: Thumb-2: Relax relocation requirements for non-function symbols 2011-06-17 11:25:04 +01:00
perf_event_v6.c ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency 2011-03-26 10:06:09 +00:00
perf_event_v7.c ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency 2011-03-26 10:06:09 +00:00
perf_event_xscale.c ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency 2011-03-26 10:06:09 +00:00
perf_event.c ARM: 6902/1: perf: Remove erroneous check on active_events 2011-05-20 22:39:17 +01:00
pj4-cp0.c ARM: pxa: add iwmmx support for PJ4 2010-12-20 23:07:36 +08:00
pmu.c ARM: 6742/1: pmu: avoid setting IRQ affinity on UP systems 2011-02-19 11:24:05 +00:00
process.c ARM: 6867/1: Introduce THREAD_NOTIFY_COPY for copy_thread() hooks 2011-04-10 21:13:36 +01:00
ptrace.c ARM: 6883/1: ptrace: Migrate to regsets framework 2011-05-14 21:36:55 +01:00
relocate_kernel.S ARM: 6497/1: kexec: Correct data alignment for CONFIG_THUMB2_KERNEL 2010-11-30 13:44:23 +00:00
return_address.c ARM: fix some sparse errors in generic ARM code 2011-02-23 17:24:12 +00:00
sched_clock.c ARM: sched_clock: make minsec argument to clocks_calc_mult_shift() zero 2011-01-11 16:44:02 +00:00
setup.c Merge branch 'devicetree/arm-next' of git://git.secretlab.ca/git/linux-2.6 into devel-stable 2011-05-25 00:08:17 +01:00
signal.c ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls 2011-05-12 10:52:00 +01:00
signal.h ARM: Fix signal restart issues with NX and OABI compat 2009-10-25 15:39:37 +00:00
sleep.S ARM: 6825/1: kernel/sleep.S: fix Thumb2 compilation issues 2011-03-26 10:06:08 +00:00
smp_scu.c ARM: pm: add function to set WFI low-power mode for SMP CPUs 2011-02-11 12:29:18 +00:00
smp_tlb.c ARM: SMP: split out software TLB maintainence broadcasting 2010-12-20 15:09:17 +00:00
smp_twd.c ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode 2011-01-25 21:17:58 +00:00
smp.c ARM: SMP: wait for CPU to be marked active 2011-06-21 11:09:05 +01:00
stacktrace.c ARM: fix /proc/$PID/stack on SMP 2011-01-15 09:27:04 +00:00
swp_emulate.c Fix common misspellings 2011-03-31 11:26:23 -03:00
sys_arm.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
sys_oabi-compat.c ARM: 6891/1: prevent heap corruption in OABI semtimedop 2011-04-29 15:53:14 +01:00
tcm.c ARM: P2V: separate PHYS_OFFSET from platform definitions 2011-02-17 23:26:55 +00:00
tcm.h ARM: 5580/2: ARM TCM (Tightly-Coupled Memory) support v3 2009-09-15 22:11:05 +01:00
thumbee.c Fix the teehbr_read function prototype 2008-11-10 14:14:11 +00:00
time.c ARM: Use struct syscore_ops instead of sysdevs for PM in common code 2011-04-24 19:16:08 +02:00
traps.c ARM: extend Code: line by one 16-bit quantity for Thumb instructions 2011-06-09 23:55:45 +01:00
unwind.c ARM: 6468/1: backtrace: fix calculation of thread stack base 2010-11-07 16:12:37 +00:00
vmlinux.lds.S percpu: Always align percpu output section to PAGE_SIZE 2011-03-24 18:50:09 +01:00
xscale-cp0.c ARM: Convert VFP/Crunch/XscaleCP thread_release() to exit_thread() 2009-12-18 14:53:41 +00:00