linux_dsm_epyc7002/Documentation
Sean Christopherson 919f6cd8bb KVM: doc: Document the life cycle of a VM and its resources
The series to add memcg accounting to KVM allocations[1] states:

  There are many KVM kernel memory allocations which are tied to the
  life of the VM process and should be charged to the VM process's
  cgroup.

While it is correct to account KVM kernel allocations to the cgroup of
the process that created the VM, it's technically incorrect to state
that the KVM kernel memory allocations are tied to the life of the VM
process.  This is because the VM itself, i.e. struct kvm, is not tied to
the life of the process which created it, rather it is tied to the life
of its associated file descriptor.  In other words, kvm_destroy_vm() is
not invoked until fput() decrements its associated file's refcount to
zero.  A simple example is to fork() in Qemu and have the child sleep
indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file
descriptor *and* the rogue child is killed.

The allocations are guaranteed to be *accounted* to the process which
created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the
ioctl() with -EIO if kvm->mm != current->mm.  I.e. the child can keep
the VM "alive" but can't do anything useful with its reference.

Note that because 'struct kvm' also holds a reference to the mm_struct
of its owner, the above behavior also applies to userspace allocations.

Given that mucking with a VM's file descriptor can lead to subtle and
undesirable behavior, e.g. memcg charges persisting after a VM is shut
down, explicitly document a VM's lifecycle and its impact on the VM's
resources.

Alternatively, KVM could aggressively free resources when the creating
process exits, e.g. via mmu_notifier->release().  However, mmu_notifier
isn't guaranteed to be available, and freeing resources when the creator
exits is likely to be error prone and fragile as KVM would need to
ensure that it only freed resources that are truly out of reach. In
practice, the existing behavior shouldn't be problematic as a properly
configured system will prevent a child process from being moved out of
the appropriate cgroup hierarchy, i.e. prevent hiding the process from
the OOM killer, and will prevent an unprivileged user from being able to
to hold a reference to struct kvm via another method, e.g. debugfs.

[1]https://patchwork.kernel.org/patch/10806707/

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-28 17:29:51 +01:00
..
ABI device-dax for 5.1 2019-03-16 13:05:32 -07:00
accelerators ocxl: Document new OCXL IOCTLs 2018-06-03 20:40:33 +10:00
accounting psi: cgroup support 2018-10-26 16:26:32 -07:00
acpi ACPI: Documentation: Fix path for acpidbg tool 2019-03-07 11:28:33 +01:00
admin-guide for-5.1/block-post-20190315 2019-03-16 12:36:39 -07:00
aoe
arm ARM: 8833/1: Ensure that NEON code always compiles with Clang 2019-02-12 15:20:09 +00:00
arm64 arm64 updates for 5.1: 2019-03-10 10:17:23 -07:00
auxdisplay Doc: misc-devices: move lcd-panel-cgram.txt to auxdisplay/ 2018-04-12 16:08:02 +02:00
backlight
block block: document usage of bio iterator helpers 2019-02-15 08:40:12 -07:00
blockdev zram: idle writeback fixes and cleanup 2019-01-08 17:15:10 -08:00
bpf docs/bpf: minor casing/punctuation fixes 2019-03-02 00:40:04 +01:00
bus-devices
cdrom Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
cgroup-v1 A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
cma
connector
console Documentation: corrections to console/console.txt 2018-08-10 16:09:40 -06:00
core-api Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
cpu-freq Documentation: cpu-freq: Frequencies aren't always sorted 2018-11-07 13:29:04 +01:00
crypto crypto: skcipher - remove remnants of internal IV generators 2018-12-23 11:52:45 +08:00
dev-tools Documentation/dev-tools: Use gcc version number instead svn revision number 2019-01-14 17:25:41 -07:00
device-mapper dm cache: add support for discard passdown to the origin device 2019-03-05 14:53:52 -05:00
devicetree irqchip updates for 5.1, take #2 2019-03-21 12:30:54 +01:00
doc-guide docs: kernel-doc: typo "if ... if" -> "if ... is" 2019-02-17 15:38:47 -07:00
driver-api dmaengine updates for v5.1-rc1 2019-03-14 09:11:54 -07:00
driver-model Merge branches 'clk-optional', 'clk-devm-clkdev-register', 'clk-allwinner', 'clk-meson' and 'clk-renesas' into clk-next 2019-03-08 10:27:21 -08:00
early-userspace Correct gen_init_cpio tool's documentation 2018-11-25 12:25:53 -07:00
EDID Docs/EDID: Calculate CRC while building the code 2018-11-06 07:36:22 -07:00
extcon
fault-injection doc: fault-injection: fix macro name in example 2019-01-07 15:36:11 -07:00
fb fbdev: fbmem: convert CONFIG_FB_LOGO_CENTER into a cmd line option 2019-01-16 17:42:35 +01:00
features Documentation/features: Add csky kernel features 2019-01-07 22:22:16 +08:00
filesystems various tracing and debugging improvements, crediting fixes, some cleanup, and important fallocate fix (fixes three xfstests) and lock fix 2019-03-15 18:52:12 -07:00
firmware_class
fmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
fpga docs: fpga: add a document for FPGA Device Feature List (DFL) Framework Overview 2018-07-15 13:55:44 +02:00
gpio Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
gpu Merge drm/drm-next into drm-misc-next 2019-02-11 10:35:35 +01:00
hid HID: doc: fix wrong data structure reference for UHID_OUTPUT 2018-12-18 14:55:22 +01:00
hwmon A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
i2c i2c: gpio: fault-injector: add 'inject_panic' injector 2019-02-23 10:34:08 +01:00
ia64
ide Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
iio
infiniband Documentation/infiniband: update from locked to pinned_vm 2019-02-07 12:56:23 -07:00
input doc: Change LXR references to elixir.bootlin.com 2019-02-01 16:05:03 -07:00
interconnect interconnect: Add generic on-chip interconnect API 2019-01-22 13:37:25 +01:00
ioctl seccomp: add a return code to trap to userspace 2018-12-11 16:28:41 -08:00
isdn Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
kbuild kbuild: force all architectures except um to include mandatory-y 2019-03-17 12:56:32 +09:00
kdump kdump: Document kernel data exported in the vmcoreinfo note 2019-01-15 11:05:28 +01:00
kernel-hacking doc:it_IT: translation for kernel-hacking 2018-07-26 16:21:09 -06:00
laptops Documentation: fix lg-laptop.rst warnings 2019-02-11 08:27:47 -07:00
leds Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
lightnvm
livepatch livepatch: Remove signal sysfs attribute 2019-01-16 22:09:33 +01:00
locking Documentation/locking/lockdep: Drop last two chars of sample states 2019-03-04 12:55:18 -07:00
m68k Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
maintainer docs: Fix more broken references 2018-06-15 18:11:26 -03:00
md
media media: Documentation: fix several typos 2019-03-01 09:54:06 -05:00
memory-devices
mic
mips Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
misc-devices Documentation: add ibmvmc to toctree(index) and fix warnings 2019-01-14 08:37:17 -07:00
mmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
mtd Documentation: mtd: remove stale pxa3xx NAND controller documentation 2018-09-04 23:37:38 +02:00
namespaces
netlabel Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
networking A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
nfc
nios2
nvdimm libnvdimm/security: Add documentation for nvdimm security support 2018-12-21 12:44:41 -08:00
nvmem Documentation: nvmem: document cell tables and lookup entries 2018-09-28 15:14:54 +02:00
openrisc
parisc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
PCI pci-v4.20-changes 2018-10-25 06:50:48 -07:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf Documentation: perf: Add documentation for ThunderX2 PMU uncore driver 2018-12-06 12:29:47 +00:00
phy
platform
power PM/EM: Document the Energy Model framework 2019-01-27 12:29:37 +01:00
powerpc powerpc/fadump: Reservationless firmware assisted dump 2018-12-21 11:32:49 +11:00
pps
process A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
pti
ptp
rapidio
RCU A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
riscv perf: riscv: Add Document for Future Porting Guide 2018-06-04 14:02:11 -07:00
s390 Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
scheduler sched/doc: Document Energy Aware Scheduling 2019-01-27 12:29:37 +01:00
scsi scsi: ufs-bsg: Allow reading descriptors 2019-02-27 09:00:02 -05:00
security doc: security: Add kern-doc for lsm_hooks.h 2019-02-22 08:54:09 -07:00
serial Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
sh sh: remove board_time_init() callback 2018-12-18 16:13:04 +01:00
sound ASoC: More changes for v5.1 2019-02-28 13:30:55 +01:00
sparc
sphinx Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
sphinx-static docs: improve readability for people with poorer eyesight 2018-10-07 09:16:50 -06:00
spi pxa2xx: replace spi_master with spi_controller 2019-01-23 10:59:56 +00:00
sysctl A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
target scsi: target/core: Remove the write_pending_status() callback function 2019-02-04 21:23:59 -05:00
thermal Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
timers Docs: Correct /proc/stat path 2019-02-22 08:50:17 -07:00
trace doc: trace: Fix documentation for uprobe_profile 2019-02-21 10:28:55 -05:00
translations A fairly routine cycle for docs - lots of typo fixes, some new documents, 2019-03-09 09:56:17 -08:00
usb usb: core: add option of only authorizing internal devices 2019-02-22 09:27:55 +01:00
userspace-api x86/speculation: Add PR_SPEC_DISABLE_NOEXEC 2019-01-29 22:11:49 +01:00
virtual KVM: doc: Document the life cycle of a VM and its resources 2019-03-28 17:29:51 +01:00
vm docs: Use underscore not hyphen in label 2019-02-01 16:04:01 -07:00
w1 Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
watchdog Documentation/watchdog: Add documentation mlx-wdt driver 2019-03-02 15:28:20 +01:00
wimax
x86 x86/resctrl: Avoid confusion over the new X86_RESCTRL config 2019-02-02 10:34:52 +01:00
xilinx Documentation: xilinx: Add documentation for eemi APIs 2018-10-09 13:26:05 +02:00
xtensa xtensa: document boot parameter passing 2019-02-03 18:06:19 -08:00
.gitignore
atomic_bitops.txt
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt Documentation: remove stale firmware API reference 2018-05-14 16:44:41 +02:00
digsig.txt
DMA-API-HOWTO.txt Documentation/DMA-API-HOWTO: update dma_mask sections 2019-02-20 07:29:47 -07:00
DMA-API.txt virtio: fixes, cleanups 2019-03-10 12:47:57 -07:00
DMA-attributes.txt
DMA-ISA-LPC.txt Documentation/DMA-ISA-LPC: fix an incorrect reference 2019-02-11 08:23:07 -07:00
docutils.conf
dontdiff kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
efi-stub.txt efi_stub: update documentation on dtb= parameter 2018-09-09 14:46:44 -06:00
eisa.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst Documentation: add ibmvmc to toctree(index) and fix warnings 2019-01-14 08:37:17 -07:00
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt doc: Update removal of RCU-bh/sched update machinery 2018-08-30 10:59:48 -07:00
kobject.txt kref/kobject: Improve documentation 2018-12-06 13:57:03 +01:00
kprobes.txt kprobes/Documentation: Fix various typos 2018-06-22 11:10:55 +02:00
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt lib/lzo: separate lzo-rle from lzo 2019-03-07 18:32:03 -08:00
mailbox.txt
Makefile kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
memory-barriers.txt Documentation: Use "while" instead of "whilst" 2018-11-20 09:30:43 -07:00
men-chameleon-bus.txt
nommu-mmap.txt Documentation: nommu-map: Fix duplicate word typo 2018-06-26 09:01:27 -06:00
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt Documentation: preempt-locking: Use better example 2018-10-12 11:35:47 -06:00
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt rfkill: Fix several typos in documentation 2018-06-15 13:36:08 +02:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt static_keys.txt: Fix trivial spelling mistake 2019-02-06 16:44:16 -07:00
SubmittingPatches
svga.txt
switchtec.txt NTB: switchtec_ntb: Update switchtec documentation with prerequisites for NTB 2018-10-11 11:28:53 -05:00
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-06-08 10:24:27 -06:00
vfio.txt vfio: fix documentation 2018-05-08 09:16:41 -06:00
video-output.txt
xillybus.txt
xz.txt
zorro.txt