linux_dsm_epyc7002/kernel
Jens Axboe edafccee56 io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we
setup the io_uring. That avoids the need to do get_user_pages() for
each and every IO.

To utilize this feature, the application must call io_uring_register()
after having setup an io_uring instance, passing in
IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
an iovec array, and the nr_args should contain how many iovecs the
application wishes to map.

If successful, these buffers are now mapped into the kernel, eligible
for IO. To use these fixed buffers, the application must use the
IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
set sqe->index to the desired buffer index. sqe->addr..sqe->addr+seq->len
must point to somewhere inside the indexed buffer.

The application may register buffers throughout the lifetime of the
io_uring instance. It can call io_uring_register() with
IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
buffers, and then register a new set. The application need not
unregister buffers explicitly before shutting down the io_uring
instance.

It's perfectly valid to setup a larger buffer, and then sometimes only
use parts of it for an IO. As long as the range is within the originally
mapped region, it will work just fine.

For now, buffers must not be file backed. If file backed buffers are
passed in, the registration will fail with -1/EOPNOTSUPP. This
restriction may be relaxed in the future.

RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
arbitrary 1G per buffer size is also imposed.

Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-28 08:24:23 -07:00
..
bpf bpf: Fix syscall's stackmap lookup potential deadlock 2019-01-31 23:18:21 +01:00
cgroup Merge branch 'for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2018-12-29 10:57:20 -08:00
configs kvm_config: add CONFIG_VIRTIO_MENU 2018-10-24 20:55:56 -04:00
debug kdb: use bool for binary state indicators 2018-12-30 08:31:52 +00:00
dma swiotlb: clear io_tlb_start and io_tlb_end in swiotlb_exit 2019-01-16 09:59:17 -05:00
events perf/core: Don't WARN() for impossible ring-buffer sizes 2019-02-04 08:45:25 +01:00
gcov
irq genirq/irqdesc: Fix double increment in alloc_descs() 2019-01-18 00:43:09 +01:00
livepatch livepatch: Replace synchronize_sched() with synchronize_rcu() 2018-12-01 12:38:50 -08:00
locking futex: Handle early deadlock return correctly 2019-02-08 13:00:36 +01:00
power mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
printk Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
rcu rcutorture: Don't do busted forward-progress testing 2018-12-01 12:45:42 -08:00
sched Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-03 09:02:03 -08:00
time posix-cpu-timers: Unbreak timer rearming 2019-01-15 16:34:37 +01:00
trace Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-02-08 11:21:54 -08:00
.gitignore
acct.c
async.c
audit_fsnotify.c audit: minimize our use of audit_log_format() 2018-11-26 18:40:00 -05:00
audit_tree.c audit: minimize our use of audit_log_format() 2018-11-26 18:40:00 -05:00
audit_watch.c audit: minimize our use of audit_log_format() 2018-11-26 18:40:00 -05:00
audit.c audit: remove duplicated include from audit.c 2018-12-14 12:09:30 -05:00
audit.h audit: use current whenever possible 2018-11-26 18:41:21 -05:00
auditfilter.c
auditsc.c audit: use current whenever possible 2018-11-26 18:41:21 -05:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-10-31 08:54:14 -07:00
capability.c
compat.c make 'user_access_begin()' do 'access_ok()' 2019-01-04 12:56:09 -08:00
configs.c
context_tracking.c
cpu_pm.c
cpu.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-01-30 19:27:00 +01:00
crash_core.c kernel/crash_core.c: print timestamp using time64_t 2018-08-22 10:52:47 -07:00
crash_dump.c
cred.c cred: export get_task_cred(). 2018-12-19 13:52:44 -05:00
delayacct.c delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
dma.c
elfcore.c
exec_domain.c
exit.c kernel/exit.c: release ptraced tasks before zap_pid_ns_processes 2019-02-01 15:46:23 -08:00
extable.c
fail_function.c kernel/fail_function.c: remove meaningless null pointer check before debugfs_remove_recursive 2018-10-31 08:54:12 -07:00
fork.c Merge branch 'akpm' (patches from Andrew) 2019-01-08 18:58:29 -08:00
freezer.c PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
futex.c futex: Handle early deadlock return correctly 2019-02-08 13:00:36 +01:00
groups.c
hung_task.c kernel/hung_task.c: break RCU locks based on jiffies 2019-01-04 13:13:45 -08:00
iomem.c
irq_work.c
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
kallsyms.c kallsyms: reduce size a little on 64-bit 2018-09-10 22:54:33 +09:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt kconfig: warn no new line at end of file 2018-12-15 17:44:35 +09:00
kcov.c kernel/kcov.c: mark write_comp_data() as notrace 2019-01-04 13:13:47 -08:00
kexec_core.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
kexec_file.c kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump 2018-12-06 14:38:50 +00:00
kexec_internal.h
kexec.c kexec: add call to LSM hook in original kexec_load syscall 2018-07-16 12:31:57 -07:00
kmod.c
kprobes.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-12-26 14:45:18 -08:00
ksysfs.c
kthread.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
latencytop.c
Makefile kbuild: change filechk to surround the given command with { } 2019-01-06 09:46:51 +09:00
memremap.c mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm 2018-12-28 12:11:52 -08:00
module_signing.c modsign: use all trusted keys to verify module signature 2018-11-07 14:41:41 +01:00
module-internal.h modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
module.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
notifier.c
nsproxy.c
padata.c padata: clean an indentation issue, remove extraneous space 2018-11-16 14:11:04 +08:00
panic.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
params.c
pid_namespace.c signal: Use group_send_sig_info to kill all processes in a pid namespace 2018-09-16 16:08:25 +02:00
pid.c Fix failure path in alloc_pid() 2018-12-28 12:42:30 -08:00
profile.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ptrace.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
range.c
reboot.c kernel/reboot.c: export pm_power_off_prepare 2018-09-11 16:13:24 +01:00
relay.c relay: check return of create_buf_file() properly 2019-01-31 14:01:48 +01:00
resource.c kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable 2018-12-28 12:11:49 -08:00
rseq.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
seccomp.c seccomp: fix UAF in user-trap code 2019-01-15 09:43:12 -08:00
signal.c signal: Better detection of synchronous signals 2019-02-07 09:00:36 -06:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-01-30 19:27:00 +01:00
smpboot.c smpboot: Remove cpumask from the API 2018-07-03 09:20:44 +02:00
smpboot.h
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-25 11:43:47 -07:00
stackleak.c stackleak: Mark stackleak_track_stack() as notrace 2018-12-05 19:31:44 -08:00
stacktrace.c
stop_machine.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
sys_ni.c io_uring: add support for pre-mapped user IO buffers 2019-02-28 08:24:23 -07:00
sys.c kernel/sys.c: Clarify that UNAME26 does not generate unique versions anymore 2019-01-14 10:38:03 +12:00
sysctl_binary.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
sysctl.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
task_work.c
taskstats.c
test_kprobes.c
torture.c torture: Remove unnecessary "ret" variables 2018-12-01 12:45:35 -08:00
tracepoint.c tracing: Replace synchronize_sched() and call_rcu_sched() 2018-11-27 09:21:41 -08:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c umh: add exit routine for UMH process 2019-01-11 18:05:40 -08:00
up.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-07 23:51:16 -06:00
user-return-notifier.c
user.c userns: use irqsave variant of refcount_dec_and_lock() 2018-08-22 10:52:47 -07:00
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-08-11 02:05:53 -05:00
utsname.c
watchdog_hld.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
watchdog.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue_internal.h psi: fix aggregation idle shut-off 2019-02-01 15:46:23 -08:00
workqueue.c psi: fix aggregation idle shut-off 2019-02-01 15:46:23 -08:00