linux_dsm_epyc7002/arch/powerpc/platforms/powernv
Mark Hairgrove 3689c37d23 powerpc/powernv/npu: Use size-based ATSD invalidates
Prior to this change only two types of ATSDs were issued to the NPU:
invalidates targeting a single page and invalidates targeting the whole
address space. The crossover point happened at the configurable
atsd_threshold which defaulted to 2M. Invalidates that size or smaller
would issue per-page invalidates for the whole range.

However, the NPU supports more invalidation sizes: 64K, 2M, 1G, and all.
These invalidates target addresses aligned to their size. 2M is a common
invalidation size for GPU-enabled applications because that is a GPU
page size, so reducing the number of invalidates by 32x in that case is a
clear improvement.

ATSD latency is high in general so now we always issue a single invalidate
rather than multiple. This will over-invalidate in some cases, but for any
invalidation size over 2M it matches or improves the prior behavior.
There's also an improvement for single-page invalidates since the prior
version issued two invalidates for that case instead of one.
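
The selection rule described above (one invalidate, using the smallest supported size whose aligned window covers the whole range) can be sketched in plain C. This is an illustrative userspace sketch, not the code in npu-dma.c: the name pick_atsd_size, the ATSD_ALL sentinel, and the SZ_* macros are hypothetical, and the range is assumed to be non-empty and not to wrap around the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_64K (64ULL * 1024)
#define SZ_2M  (2ULL * 1024 * 1024)
#define SZ_1G  (1ULL * 1024 * 1024 * 1024)

/* Hypothetical sentinel meaning "invalidate the whole address space". */
#define ATSD_ALL 0ULL

/* Round addr down to a multiple of size (size must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t size)
{
    return addr & ~(size - 1);
}

/*
 * Pick the smallest supported invalidation size whose size-aligned
 * window fully contains [start, start + len). The aligned base of that
 * window is stored in *base. If no window fits, fall back to ATSD_ALL.
 */
static uint64_t pick_atsd_size(uint64_t start, uint64_t len, uint64_t *base)
{
    static const uint64_t sizes[] = { SZ_64K, SZ_2M, SZ_1G };
    uint64_t last;
    size_t i;

    assert(len > 0);
    last = start + len - 1; /* last byte of the range */

    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        uint64_t b = align_down(start, sizes[i]);

        /* The window [b, b + size) covers the range if last is inside it. */
        if (last - b < sizes[i]) {
            *base = b;
            return sizes[i];
        }
    }

    *base = 0;
    return ATSD_ALL;
}
```

Under these assumptions, a single 64K page maps to one 64K invalidate, and an aligned 2M range maps to one 2M invalidate instead of 32 per-page ones (2M / 64K = 32). A 128K range starting at 0x1F0000 straddles a 2M boundary, so the sketch falls through to a 1G invalidate, which is an instance of the over-invalidation mentioned above.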

With this change all issued ATSDs now perform a flush, so the flush
parameter has been removed from all the helpers.

To show the benefit, here are some performance numbers from a
microbenchmark which creates a 1G allocation then uses mprotect with
PROT_NONE to trigger invalidates in strides across the allocation.

One NPU (1 GPU):

         mprotect rate (GB/s)
Stride   Before      After      Speedup
64K         5.3        5.6           5%
1M         39.3       57.4          46%
2M         49.7       82.6          66%
4M        286.6      285.7           0%

Two NPUs (6 GPUs):

         mprotect rate (GB/s)
Stride   Before      After      Speedup
64K         6.5        7.4          13%
1M         33.4       67.9         103%
2M         38.7       93.1         141%
4M        356.7      354.6          -1%

Anything over 2M is roughly the same as before since both cases issue a
single ATSD.
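
The CPU side of a microbenchmark like the one described above might look roughly like the following. This is a sketch under stated assumptions, not the benchmark that produced the numbers: the function name mprotect_in_strides is hypothetical, "stride" is assumed to mean the length of each mprotect call, and the stride is assumed to be a multiple of the system page size. On its own it only exercises CPU page-table changes. ATSDs are issued only when the memory is also mapped by a GPU behind the NPU, which this sketch does not set up.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Map `total` bytes, touch them, then walk the mapping in `stride`-sized
 * steps calling mprotect(PROT_NONE) on each step.
 * Returns the elapsed time of the mprotect loop in seconds, or -1.0 on error.
 */
static double mprotect_in_strides(size_t total, size_t stride)
{
    struct timespec t0, t1;
    size_t off;
    char *buf;

    buf = mmap(NULL, total, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1.0;

    /* Touch the memory so there are populated mappings to tear down. */
    for (off = 0; off < total; off += 4096)
        buf[off] = 1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (off = 0; off < total; off += stride) {
        size_t len = (total - off < stride) ? total - off : stride;

        if (mprotect(buf + off, len, PROT_NONE) != 0) {
            munmap(buf, total);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(buf, total);
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Dividing the allocation size in GB by the returned time gives a rate comparable in form to the "mprotect rate (GB/s)" columns, for example by calling mprotect_in_strides(1UL << 30, 2UL << 20) for the 2M stride row.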

Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-04 16:55:53 +10:00
copy-paste.h powerpc/powernv: copy/paste - Mask SO bit in CR 2018-06-04 22:58:41 +10:00
eeh-powernv.c powerpc/eeh: Avoid misleading message "EEH: no capable adapters found" 2018-07-02 23:54:26 +10:00
idle.c powerpc/powernv/idle: Fix build error 2018-08-10 22:12:39 +10:00
Kconfig powerpc/powernv: Don't select the cpufreq governors 2018-09-17 21:16:26 +10:00
Makefile powerpc/powernv: Move TCE manupulation code to its own file 2018-07-16 22:53:07 +10:00
memtrace.c powerpc/memtrace: Remove memory in chunks 2018-09-19 21:58:02 +10:00
npu-dma.c powerpc/powernv/npu: Use size-based ATSD invalidates 2018-10-04 16:55:53 +10:00
ocxl.c ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action 2018-06-03 20:40:32 +10:00
opal-async.c powerpc/opal: Add opal_async_wait_response_interruptible() to opal-async 2017-11-06 20:39:28 +11:00
opal-dump.c powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler 2018-07-24 22:03:24 +10:00
opal-elog.c powerpc: Use octal numbers for file permissions 2018-01-22 05:48:33 +11:00
opal-flash.c powerpc/powernv: Always stop secondaries before reboot/shutdown 2018-04-03 22:59:57 +10:00
opal-hmi.c powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" 2018-06-05 11:33:47 +10:00
opal-imc.c powerpc/perf: Unregister thread-imc if core-imc not supported 2018-06-03 20:43:37 +10:00
opal-irqchip.c powerpc/powernv/opal: Use standard interrupts property when available 2018-08-08 00:32:38 +10:00
opal-kmsg.c powerpc/powernv: Implement and use opal_flush_console 2018-07-24 22:09:56 +10:00
opal-lpc.c powerpc: Create asm/debugfs.h and move powerpc_debugfs_root there 2017-04-11 07:46:03 +10:00
opal-memory-errors.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal-msglog.c locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() 2017-10-25 11:01:08 +02:00
opal-nvram.c powerpc/powernv: Fix NVRAM sleep in invalid context when crashing 2018-05-18 00:23:07 +10:00
opal-power.c powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform 2015-07-16 13:34:36 +10:00
opal-powercap.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-prd.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
opal-psr.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal-rtc.c powerpc: use time64_t in read_persistent_clock 2018-06-03 20:43:33 +10:00
opal-sensor-groups.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-sensor.c powernv: opal-sensor: Add support to read 64bit sensor values 2018-05-21 14:48:02 +10:00
opal-sysparam.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-tracepoints.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
opal-wrappers.S crypto/nx: Initialize 842 high and normal RxFIFO control registers 2018-08-08 00:32:34 +10:00
opal-xscom.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal.c powerpc/64s: consolidate MCE counter increment. 2018-10-03 15:40:06 +10:00
pci-cxl.c Revert "powerpc/powernv: Add support for the cxl kernel api on the real phb" 2018-07-02 23:54:32 +10:00
pci-ioda-tce.c powerpc/powernv/ioda: Allocate indirect TCE levels on demand 2018-07-16 22:53:11 +10:00
pci-ioda.c powerpc fixes for 4.19 #2 2018-08-24 09:34:23 -07:00
pci.c powerpc/powernv: Move TCE manupulation code to its own file 2018-07-16 22:53:07 +10:00
pci.h Merge branch 'topic/ppc-kvm' into next 2018-07-19 14:37:57 +10:00
powernv.h powerpc/powernv: process all OPAL event interrupts with kopald 2018-06-03 20:40:30 +10:00
rng.c powerpc: Convert to using %pOF instead of full_name 2017-08-23 22:27:04 +10:00
setup.c powerpc/powernv: Make possible for user to force a full ipl cec reboot 2018-10-03 15:39:45 +10:00
smp.c powerpc/64s: Remove POWER9 DD1 support 2018-07-16 11:37:21 +10:00
subcore-asm.S powerpc/powernv: Add support for POWER8 split core on powernv 2014-05-28 13:35:37 +10:00
subcore.c powerpc/64: Use array of paca pointers and allocate pacas individually 2018-03-30 23:34:23 +11:00
subcore.h powernv/powerpc: Add winkle support for offline cpus 2014-12-15 10:46:41 +11:00
vas-debug.c powerpc/vas: Fix cleanup when VAS is not configured 2018-03-14 20:11:37 +11:00
vas-trace.h powerpc/vas: Add a couple of trace points 2018-03-14 20:13:58 +11:00
vas-window.c ppc: Convert vas ID allocation to new IDA API 2018-08-21 23:54:19 -04:00
vas.c powerpc/vas: Fix cleanup when VAS is not configured 2018-03-14 20:11:37 +11:00
vas.h powerpc: clean the inclusion of stringify.h 2018-07-30 22:48:17 +10:00