linux_dsm_epyc7002/arch/powerpc/platforms/powernv
Alexey Kardashevskiy 11f5acce2f powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables
We store 2 multilevel tables in iommu_table - one for the hardware and
one with the corresponding userspace addresses. Before allocating
the tables, the iommu_table_group_ops::get_table_size() hook returns
the combined size of the two and the VFIO SPAPR TCE IOMMU driver adjusts
the locked_vm counter correctly. When the table is actually allocated,
the amount of allocated memory is stored in iommu_table::it_allocated_size
and used to decrement the locked_vm counter when we release the memory
used by the table; .get_table_size() and .create_table() calculate it
independently but the result is expected to be the same.
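
The accounting described above can be sketched as a small userspace model. This is only an illustration, not the kernel code: `toy_container`, `toy_charge()`, `toy_uncharge()`, `toy_leftover()` and the 64K page size are assumptions made up for the example. The charge uses the size reported before allocation, the uncharge uses the size recorded in the table, and the counter returns to zero only when the two agree.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 16 /* assumption: 64K pages */

/* Hypothetical stand-in for the container's locked_vm bookkeeping. */
struct toy_container {
	long locked_vm; /* counted in pages */
};

/* Charge before the table exists, using the size the hook reports. */
static void toy_charge(struct toy_container *c, size_t reported_bytes)
{
	c->locked_vm += (long)(reported_bytes >> TOY_PAGE_SHIFT);
}

/* Uncharge on release, using the size recorded in the table. */
static void toy_uncharge(struct toy_container *c, size_t it_allocated_size)
{
	long pages = (long)(it_allocated_size >> TOY_PAGE_SHIFT);

	assert(pages <= c->locked_vm); /* never uncharge more than was charged */
	c->locked_vm -= pages;
}

/* Pages still accounted after one create/release pair. */
static long toy_leftover(size_t reported_bytes, size_t it_allocated_size)
{
	struct toy_container c = { 0 };

	toy_charge(&c, reported_bytes);
	toy_uncharge(&c, it_allocated_size);
	return c.locked_vm;
}
```

With matching sizes `toy_leftover()` returns 0; any mismatch leaves pages accounted for memory that is no longer in use.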

However, the allocator does not add the userspace table size to
.it_allocated_size, so when we destroy the table because of VFIO PCI
unplug (i.e. the VFIO container is gone but the userspace keeps running),
we decrement locked_vm by just half the size of the memory we are
releasing.

To make things worse, since we enabled on-demand allocation of
indirect levels, it_allocated_size contains only the amount of memory
actually allocated at table creation time, which can be just a
fraction of the total. This is not a problem when incrementing locked_vm
(as the get_table_size() value is used) but it is when decrementing.

As a result, we leak locked_vm and may not be able to allocate more
IOMMU tables after a few iterations of hotplug/unplug.
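
To see how the leak turns into allocation failures, here is a minimal simulation of repeated hotplug/unplug against a fixed locked memory limit. All names and numbers are hypothetical: `toy_cycles_before_failure()` is not a kernel function, and the page counts are chosen only to keep the arithmetic easy to follow.

```c
#include <assert.h>

/*
 * Count how many plug/unplug cycles succeed before a charge would
 * exceed the limit. Stops at max_cycles so a balanced setup terminates.
 */
static unsigned int toy_cycles_before_failure(long limit_pages,
					      long charged_pages,
					      long released_pages,
					      unsigned int max_cycles)
{
	long locked_vm = 0;
	unsigned int done = 0;

	assert(released_pages <= charged_pages);

	while (done < max_cycles) {
		if (locked_vm + charged_pages > limit_pages)
			break; /* the charge would exceed the limit */

		locked_vm += charged_pages;  /* plug: charge the full size */
		locked_vm -= released_pages; /* unplug: uncharge recorded size */
		done++;
	}

	return done;
}
```

With a limit of 8 pages, a charge of 2 pages and an uncharge of only 1 page, the model gets through 7 cycles and then fails. When the uncharge matches the charge it runs until `max_cycles`.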

This sets it_allocated_size in the pnv_pci_ioda2_ops::create_table()
hook to what pnv_pci_ioda2_get_table_size() returns so from now on we
have a single place which calculates the maximum memory a table can
occupy. The original meaning of it_allocated_size is somewhat lost now
though.
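
The shape of the fix can be sketched as follows. This is a simplified model under stated assumptions, not the actual patch: `toy_table`, `toy_get_table_size()` and `toy_create_table()` are hypothetical, and the size helper simply doubles a given per-table maximum to stand for the hardware table plus the userspace copy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table carrying only the field relevant here. */
struct toy_table {
	size_t it_allocated_size;
};

/* Single place that computes the maximum memory a table can occupy:
 * one table's worst case, doubled for the userspace address copy. */
static size_t toy_get_table_size(size_t one_table_max_bytes)
{
	return 2 * one_table_max_bytes;
}

/*
 * Create hook: allocated_now_bytes models what on-demand allocation
 * actually allocated at creation time. It is deliberately NOT what
 * gets recorded; the recorded value is the same maximum that was
 * charged to locked_vm beforehand.
 */
static void toy_create_table(struct toy_table *tbl,
			     size_t one_table_max_bytes,
			     size_t allocated_now_bytes)
{
	assert(allocated_now_bytes <= toy_get_table_size(one_table_max_bytes));
	tbl->it_allocated_size = toy_get_table_size(one_table_max_bytes);
}
```

Because the recorded value comes from the same helper that produced the charge, the later uncharge matches it regardless of how much memory was populated at creation time.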

We do not ditch it_allocated_size altogether here and we do not call
get_table_size() from vfio_iommu_spapr_tce.c when decrementing
locked_vm as we may have multiple IOMMU groups per container and even
though they all are supposed to have the same get_table_size()
implementation, there is a small chance for failure or confusion.

Fixes: 090bad39b2 ("powerpc/powernv: Add indirect levels to it_userspace")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-28 11:50:02 +11:00
copy-paste.h powerpc/powernv: copy/paste - Mask SO bit in CR 2018-06-04 22:58:41 +10:00
eeh-powernv.c powerpc/powernv/eeh/npu: Fix uninitialized variables in opal_pci_eeh_freeze_status 2018-12-20 22:59:03 +11:00
idle.c powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit 2019-02-22 00:10:15 +11:00
Kconfig PCI: consolidate PCI config entry in drivers/pci 2018-11-23 11:45:34 +09:00
Makefile powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C 2019-02-26 23:55:09 +11:00
memtrace.c powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages() 2018-10-31 08:54:17 -07:00
npu-dma.c powerpc/powernv/npu: Remove redundant change_pte() hook 2019-02-22 00:10:14 +11:00
ocxl.c ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action 2018-06-03 20:40:32 +10:00
opal-async.c powerpc/opal: Add opal_async_wait_response_interruptible() to opal-async 2017-11-06 20:39:28 +11:00
opal-call.c powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C 2019-02-26 23:55:09 +11:00
opal-dump.c powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler 2018-07-24 22:03:24 +10:00
opal-elog.c powerpc: Use octal numbers for file permissions 2018-01-22 05:48:33 +11:00
opal-flash.c powerpc/powernv: Always stop secondaries before reboot/shutdown 2018-04-03 22:59:57 +10:00
opal-hmi.c powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" 2018-06-05 11:33:47 +10:00
opal-imc.c powerpc/perf: Unregister thread-imc if core-imc not supported 2018-06-03 20:43:37 +10:00
opal-irqchip.c powerpc/powernv/opal: Use standard interrupts property when available 2018-08-08 00:32:38 +10:00
opal-kmsg.c powerpc/powernv: Implement and use opal_flush_console 2018-07-24 22:09:56 +10:00
opal-lpc.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
opal-memory-errors.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal-msglog.c powerpc/powernv: Make opal log only readable by root 2019-02-27 22:11:31 +11:00
opal-nvram.c powerpc/powernv: Fix NVRAM sleep in invalid context when crashing 2018-05-18 00:23:07 +10:00
opal-power.c powerpc/powernv: Move opal_power_control_init() call in opal_init(). 2018-12-21 11:32:49 +11:00
opal-powercap.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-prd.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
opal-psr.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal-rtc.c powerpc: use time64_t in read_persistent_clock 2018-06-03 20:43:33 +10:00
opal-sensor-groups.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-sensor.c powernv: opal-sensor: Add support to read 64bit sensor values 2018-05-21 14:48:02 +10:00
opal-sysparam.c powerpc: Convert to using %pOFn instead of device_node.name 2018-10-03 15:40:01 +10:00
opal-tracepoints.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
opal-wrappers.S powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C 2019-02-26 23:55:09 +11:00
opal-xscom.c powerpc: Use sizeof(*foo) rather than sizeof(struct foo) 2018-03-20 16:47:53 +11:00
opal.c Merge branch 'topic/ppc-kvm' into next 2019-02-22 00:09:56 +11:00
pci-cxl.c Revert "powerpc/powernv: Add support for the cxl kernel api on the real phb" 2018-07-02 23:54:32 +10:00
pci-ioda-tce.c powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables 2019-02-28 11:50:02 +11:00
pci-ioda.c powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables 2019-02-28 11:50:02 +11:00
pci.c powerpc/powernv/pseries: Rework device adding to IOMMU groups 2018-12-21 16:20:46 +11:00
pci.h powerpc/powernv/npu: Add compound IOMMU groups 2018-12-21 16:20:46 +11:00
powernv.h powerpc/powernv: process all OPAL event interrupts with kopald 2018-06-03 20:40:30 +10:00
rng.c powerpc: Convert to using %pOF instead of full_name 2017-08-23 22:27:04 +10:00
setup.c powerpc/powernv: Make possible for user to force a full ipl cec reboot 2018-10-03 15:39:45 +10:00
smp.c powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit 2019-02-22 00:10:15 +11:00
subcore-asm.S powerpc/powernv: Add support for POWER8 split core on powernv 2014-05-28 13:35:37 +10:00
subcore.c powerpc/64: Use array of paca pointers and allocate pacas individually 2018-03-30 23:34:23 +11:00
subcore.h powernv/powerpc: Add winkle support for offline cpus 2014-12-15 10:46:41 +11:00
vas-debug.c powerpc/powernv/vas: Use DEFINE_SHOW_ATTRIBUTE macro 2018-11-25 17:11:22 +11:00
vas-trace.h powerpc/vas: Add a couple of trace points 2018-03-14 20:13:58 +11:00
vas-window.c ppc: Convert vas ID allocation to new IDA API 2018-08-21 23:54:19 -04:00
vas.c powerpc/vas: Fix cleanup when VAS is not configured 2018-03-14 20:11:37 +11:00
vas.h powerpc: clean the inclusion of stringify.h 2018-07-30 22:48:17 +10:00