linux_dsm_epyc7002/arch/arm64/mm
Mark Rutland 23529049c6 arm64: entry: fix non-NMI user<->kernel transitions
When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE
will WARN() at boot time that interrupts are enabled when we call
context_tracking_user_enter(), despite the DAIF flags indicating that
IRQs are masked.

The problem is that we're not tracking IRQ flag changes accurately, and
so lockdep believes interrupts are enabled when they are not (and
vice-versa). We can shuffle things so to make this more accurate. For
kernel->user transitions there are a number of constraints we need to
consider:

1) When we call __context_tracking_user_enter() HW IRQs must be disabled
   and lockdep must be up-to-date with this.

2) Userspace should be treated as having IRQs enabled from the PoV of
   both lockdep and tracing.

3) As context_tracking_user_enter() stops RCU from watching, we cannot
   use RCU after calling it.

4) IRQ flag tracing and lockdep have state that must be manipulated
   before RCU is disabled.

... with similar constraints applying for user->kernel transitions, with
the ordering reversed.

The generic entry code has enter_from_user_mode() and
exit_to_user_mode() helpers to handle this. We can't use those directly,
so we add arm64 copies for now (without the instrumentation markers
which aren't used on arm64). These replace the existing user_exit() and
user_exit_irqoff() calls spread throughout handlers, and the exception
unmasking is left as-is.

Note that:

* The accounting for debug exceptions from userspace now happens in
  el0_dbg() and ret_to_user(), so this is removed from
  debug_exception_enter() and debug_exception_exit(). As
  user_exit_irqoff() wakes RCU, the userspace-specific check is removed.

* The accounting for syscalls now happens in el0_svc(),
  el0_svc_compat(), and ret_to_user(), so this is removed from
  el0_svc_common(). This does not adversely affect the workaround for
  erratum 1463225, as this does not depend on any of the state tracking.

* In ret_to_user() we mask interrupts with local_daif_mask(), and so we
  need to inform lockdep and tracing. Here a trace_hardirqs_off() is
  sufficient and safe as we have not yet exited kernel context and RCU
  is usable.

* As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only
  needs to check for the latter.

* EL0 SError handling will be dealt with in a subsequent patch, as this
  needs to be treated as an NMI.

Prior to this patch, booting an appropriately-configured kernel would
result in spats as below:

| DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
| WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0
| Modules linked in:
| CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3
| Hardware name: linux,dummy-virt (DT)
| pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : check_flags.part.54+0x1dc/0x1f0
| lr : check_flags.part.54+0x1dc/0x1f0
| sp : ffff80001003bd80
| x29: ffff80001003bd80 x28: ffff66ce801e0000
| x27: 00000000ffffffff x26: 00000000000003c0
| x25: 0000000000000000 x24: ffffc31842527258
| x23: ffffc31842491368 x22: ffffc3184282d000
| x21: 0000000000000000 x20: 0000000000000001
| x19: ffffc318432ce000 x18: 0080000000000000
| x17: 0000000000000000 x16: ffffc31840f18a78
| x15: 0000000000000001 x14: ffffc3184285c810
| x13: 0000000000000001 x12: 0000000000000000
| x11: ffffc318415857a0 x10: ffffc318406614c0
| x9 : ffffc318415857a0 x8 : ffffc31841f1d000
| x7 : 647261685f706564 x6 : ffffc3183ff7c66c
| x5 : ffff66ce801e0000 x4 : 0000000000000000
| x3 : ffffc3183fe00000 x2 : ffffc31841500000
| x1 : e956dc24146b3500 x0 : 0000000000000000
| Call trace:
|  check_flags.part.54+0x1dc/0x1f0
|  lock_is_held_type+0x10c/0x188
|  rcu_read_lock_sched_held+0x70/0x98
|  __context_tracking_enter+0x310/0x350
|  context_tracking_enter.part.3+0x5c/0xc8
|  context_tracking_user_enter+0x6c/0x80
|  finish_ret_to_user+0x2c/0x13cr

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30 12:11:38 +00:00
..
cache.S arm64: mm: Use modern annotations for assembly functions 2020-01-08 12:23:38 +00:00
context.c arm64: mm: Pin down ASIDs for sharing mm with devices 2020-09-28 22:15:38 +01:00
copypage.c arm64: Avoid unnecessary clear_user_page() indirection 2020-09-04 12:46:06 +01:00
dma-mapping.c dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> 2020-10-06 07:07:06 +02:00
extable.c arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM 2020-09-18 16:35:54 +01:00
fault.c arm64: entry: fix non-NMI user<->kernel transitions 2020-11-30 12:11:38 +00:00
flush.c mm: introduce page_size() 2019-09-24 15:54:08 -07:00
hugetlbpage.c mm: remove unneeded includes of <asm/pgalloc.h> 2020-08-07 11:33:26 -07:00
init.c More arm64 updates for 5.10 2020-10-23 09:46:16 -07:00
ioremap.c mm: remove unneeded includes of <asm/pgalloc.h> 2020-08-07 11:33:26 -07:00
kasan_init.c arch, drivers: replace for_each_membock() with for_each_mem_range() 2020-10-13 18:38:35 -07:00
Makefile Merge branch 'for-next/mte' into for-next/core 2020-10-02 12:16:11 +01:00
mmap.c arm64, mm: move generic mmap layout functions to mm 2019-09-24 15:54:11 -07:00
mmu.c arm64/mm: Validate hotplug range before creating linear mapping 2020-11-13 09:45:08 +00:00
mteswap.c arm64: mte: Enable swap of tagged pages 2020-09-04 12:46:07 +01:00
numa.c memblock: use separate iterators for memory and reserved regions 2020-10-13 18:38:35 -07:00
pageattr.c arm64: mm: Fix missing-prototypes in pageattr.c 2020-09-18 14:33:46 +01:00
pgd.c mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
physaddr.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
proc.S arm64: mte: CPU feature detection and initial sysreg configuration 2020-09-03 17:26:32 +01:00
ptdump_debugfs.c arm64/mm: Hold memory hotplug lock while walking for kernel page table dump 2020-03-04 15:35:22 +00:00
ptdump.c Merge branch 'for-next/mte' into for-next/core 2020-10-02 12:16:11 +01:00