linux_dsm_epyc7002/arch/arm/tools/syscall.tbl

454 lines
16 KiB
Plaintext
Raw Normal View History

#
# Linux system call numbers and entry vectors
#
# The format is:
# <num> <abi> <name> [<entry point> [<oabi compat entry point>]]
#
# Where abi is:
# common - for system calls shared between oabi and eabi (may have compat)
# oabi - for oabi-only system calls (may have compat)
# eabi - for eabi-only system calls
#
# For each syscall number, "common" is mutually exclusive with oabi and eabi
#
0 common restart_syscall sys_restart_syscall
1 common exit sys_exit
2 common fork sys_fork
3 common read sys_read
4 common write sys_write
5 common open sys_open
6 common close sys_close
# 7 was sys_waitpid
8 common creat sys_creat
9 common link sys_link
10 common unlink sys_unlink
11 common execve sys_execve
12 common chdir sys_chdir
13 oabi time sys_time32
14 common mknod sys_mknod
15 common chmod sys_chmod
16 common lchown sys_lchown16
# 17 was sys_break
# 18 was sys_stat
19 common lseek sys_lseek
20 common getpid sys_getpid
21 common mount sys_mount
22 oabi umount sys_oldumount
23 common setuid sys_setuid16
24 common getuid sys_getuid16
25 oabi stime sys_stime32
26 common ptrace sys_ptrace
27 oabi alarm sys_alarm
# 28 was sys_fstat
29 common pause sys_pause
30 oabi utime sys_utime32
# 31 was sys_stty
# 32 was sys_gtty
33 common access sys_access
34 common nice sys_nice
# 35 was sys_ftime
36 common sync sys_sync
37 common kill sys_kill
38 common rename sys_rename
39 common mkdir sys_mkdir
40 common rmdir sys_rmdir
41 common dup sys_dup
42 common pipe sys_pipe
43 common times sys_times
# 44 was sys_prof
45 common brk sys_brk
46 common setgid sys_setgid16
47 common getgid sys_getgid16
# 48 was sys_signal
49 common geteuid sys_geteuid16
50 common getegid sys_getegid16
51 common acct sys_acct
52 common umount2 sys_umount
# 53 was sys_lock
54 common ioctl sys_ioctl
55 common fcntl sys_fcntl
# 56 was sys_mpx
57 common setpgid sys_setpgid
# 58 was sys_ulimit
# 59 was sys_olduname
60 common umask sys_umask
61 common chroot sys_chroot
62 common ustat sys_ustat
63 common dup2 sys_dup2
64 common getppid sys_getppid
65 common getpgrp sys_getpgrp
66 common setsid sys_setsid
67 common sigaction sys_sigaction
# 68 was sys_sgetmask
# 69 was sys_ssetmask
70 common setreuid sys_setreuid16
71 common setregid sys_setregid16
72 common sigsuspend sys_sigsuspend
73 common sigpending sys_sigpending
74 common sethostname sys_sethostname
75 common setrlimit sys_setrlimit
# Back compat 2GB limited rlimit
76 oabi getrlimit sys_old_getrlimit
77 common getrusage sys_getrusage
78 common gettimeofday sys_gettimeofday
79 common settimeofday sys_settimeofday
80 common getgroups sys_getgroups16
81 common setgroups sys_setgroups16
82 oabi select sys_old_select
83 common symlink sys_symlink
# 84 was sys_lstat
85 common readlink sys_readlink
86 common uselib sys_uselib
87 common swapon sys_swapon
88 common reboot sys_reboot
89 oabi readdir sys_old_readdir
90 oabi mmap sys_old_mmap
91 common munmap sys_munmap
92 common truncate sys_truncate
93 common ftruncate sys_ftruncate
94 common fchmod sys_fchmod
95 common fchown sys_fchown16
96 common getpriority sys_getpriority
97 common setpriority sys_setpriority
# 98 was sys_profil
99 common statfs sys_statfs
100 common fstatfs sys_fstatfs
# 101 was sys_ioperm
102 oabi socketcall sys_socketcall sys_oabi_socketcall
103 common syslog sys_syslog
104 common setitimer sys_setitimer
105 common getitimer sys_getitimer
106 common stat sys_newstat
107 common lstat sys_newlstat
108 common fstat sys_newfstat
# 109 was sys_uname
# 110 was sys_iopl
111 common vhangup sys_vhangup
# 112 was sys_idle
# syscall to call a syscall!
113 oabi syscall sys_syscall
114 common wait4 sys_wait4
115 common swapoff sys_swapoff
116 common sysinfo sys_sysinfo
117 oabi ipc sys_ipc sys_oabi_ipc
118 common fsync sys_fsync
119 common sigreturn sys_sigreturn_wrapper
120 common clone sys_clone
121 common setdomainname sys_setdomainname
122 common uname sys_newuname
# 123 was sys_modify_ldt
124 common adjtimex sys_adjtimex_time32
125 common mprotect sys_mprotect
126 common sigprocmask sys_sigprocmask
# 127 was sys_create_module
128 common init_module sys_init_module
129 common delete_module sys_delete_module
# 130 was sys_get_kernel_syms
131 common quotactl sys_quotactl
132 common getpgid sys_getpgid
133 common fchdir sys_fchdir
134 common bdflush sys_bdflush
135 common sysfs sys_sysfs
136 common personality sys_personality
# 137 was sys_afs_syscall
138 common setfsuid sys_setfsuid16
139 common setfsgid sys_setfsgid16
140 common _llseek sys_llseek
141 common getdents sys_getdents
142 common _newselect sys_select
143 common flock sys_flock
144 common msync sys_msync
145 common readv sys_readv
146 common writev sys_writev
147 common getsid sys_getsid
148 common fdatasync sys_fdatasync
149 common _sysctl sys_sysctl
150 common mlock sys_mlock
151 common munlock sys_munlock
152 common mlockall sys_mlockall
153 common munlockall sys_munlockall
154 common sched_setparam sys_sched_setparam
155 common sched_getparam sys_sched_getparam
156 common sched_setscheduler sys_sched_setscheduler
157 common sched_getscheduler sys_sched_getscheduler
158 common sched_yield sys_sched_yield
159 common sched_get_priority_max sys_sched_get_priority_max
160 common sched_get_priority_min sys_sched_get_priority_min
161 common sched_rr_get_interval sys_sched_rr_get_interval_time32
162 common nanosleep sys_nanosleep_time32
163 common mremap sys_mremap
164 common setresuid sys_setresuid16
165 common getresuid sys_getresuid16
# 166 was sys_vm86
# 167 was sys_query_module
168 common poll sys_poll
169 common nfsservctl
170 common setresgid sys_setresgid16
171 common getresgid sys_getresgid16
172 common prctl sys_prctl
173 common rt_sigreturn sys_rt_sigreturn_wrapper
174 common rt_sigaction sys_rt_sigaction
175 common rt_sigprocmask sys_rt_sigprocmask
176 common rt_sigpending sys_rt_sigpending
177 common rt_sigtimedwait sys_rt_sigtimedwait_time32
178 common rt_sigqueueinfo sys_rt_sigqueueinfo
179 common rt_sigsuspend sys_rt_sigsuspend
180 common pread64 sys_pread64 sys_oabi_pread64
181 common pwrite64 sys_pwrite64 sys_oabi_pwrite64
182 common chown sys_chown16
183 common getcwd sys_getcwd
184 common capget sys_capget
185 common capset sys_capset
186 common sigaltstack sys_sigaltstack
187 common sendfile sys_sendfile
# 188 reserved
# 189 reserved
190 common vfork sys_vfork
# SuS compliant getrlimit
191 common ugetrlimit sys_getrlimit
192 common mmap2 sys_mmap2
193 common truncate64 sys_truncate64 sys_oabi_truncate64
194 common ftruncate64 sys_ftruncate64 sys_oabi_ftruncate64
195 common stat64 sys_stat64 sys_oabi_stat64
196 common lstat64 sys_lstat64 sys_oabi_lstat64
197 common fstat64 sys_fstat64 sys_oabi_fstat64
198 common lchown32 sys_lchown
199 common getuid32 sys_getuid
200 common getgid32 sys_getgid
201 common geteuid32 sys_geteuid
202 common getegid32 sys_getegid
203 common setreuid32 sys_setreuid
204 common setregid32 sys_setregid
205 common getgroups32 sys_getgroups
206 common setgroups32 sys_setgroups
207 common fchown32 sys_fchown
208 common setresuid32 sys_setresuid
209 common getresuid32 sys_getresuid
210 common setresgid32 sys_setresgid
211 common getresgid32 sys_getresgid
212 common chown32 sys_chown
213 common setuid32 sys_setuid
214 common setgid32 sys_setgid
215 common setfsuid32 sys_setfsuid
216 common setfsgid32 sys_setfsgid
217 common getdents64 sys_getdents64
218 common pivot_root sys_pivot_root
219 common mincore sys_mincore
220 common madvise sys_madvise
221 common fcntl64 sys_fcntl64 sys_oabi_fcntl64
# 222 for tux
# 223 is unused
224 common gettid sys_gettid
225 common readahead sys_readahead sys_oabi_readahead
226 common setxattr sys_setxattr
227 common lsetxattr sys_lsetxattr
228 common fsetxattr sys_fsetxattr
229 common getxattr sys_getxattr
230 common lgetxattr sys_lgetxattr
231 common fgetxattr sys_fgetxattr
232 common listxattr sys_listxattr
233 common llistxattr sys_llistxattr
234 common flistxattr sys_flistxattr
235 common removexattr sys_removexattr
236 common lremovexattr sys_lremovexattr
237 common fremovexattr sys_fremovexattr
238 common tkill sys_tkill
239 common sendfile64 sys_sendfile64
240 common futex sys_futex_time32
241 common sched_setaffinity sys_sched_setaffinity
242 common sched_getaffinity sys_sched_getaffinity
243 common io_setup sys_io_setup
244 common io_destroy sys_io_destroy
245 common io_getevents sys_io_getevents_time32
246 common io_submit sys_io_submit
247 common io_cancel sys_io_cancel
248 common exit_group sys_exit_group
249 common lookup_dcookie sys_lookup_dcookie
250 common epoll_create sys_epoll_create
251 common epoll_ctl sys_epoll_ctl sys_oabi_epoll_ctl
252 common epoll_wait sys_epoll_wait sys_oabi_epoll_wait
253 common remap_file_pages sys_remap_file_pages
# 254 for set_thread_area
# 255 for get_thread_area
256 common set_tid_address sys_set_tid_address
257 common timer_create sys_timer_create
258 common timer_settime sys_timer_settime32
259 common timer_gettime sys_timer_gettime32
260 common timer_getoverrun sys_timer_getoverrun
261 common timer_delete sys_timer_delete
262 common clock_settime sys_clock_settime32
263 common clock_gettime sys_clock_gettime32
264 common clock_getres sys_clock_getres_time32
265 common clock_nanosleep sys_clock_nanosleep_time32
266 common statfs64 sys_statfs64_wrapper
267 common fstatfs64 sys_fstatfs64_wrapper
268 common tgkill sys_tgkill
269 common utimes sys_utimes_time32
270 common arm_fadvise64_64 sys_arm_fadvise64_64
271 common pciconfig_iobase sys_pciconfig_iobase
272 common pciconfig_read sys_pciconfig_read
273 common pciconfig_write sys_pciconfig_write
274 common mq_open sys_mq_open
275 common mq_unlink sys_mq_unlink
276 common mq_timedsend sys_mq_timedsend_time32
277 common mq_timedreceive sys_mq_timedreceive_time32
278 common mq_notify sys_mq_notify
279 common mq_getsetattr sys_mq_getsetattr
280 common waitid sys_waitid
281 common socket sys_socket
282 common bind sys_bind sys_oabi_bind
283 common connect sys_connect sys_oabi_connect
284 common listen sys_listen
285 common accept sys_accept
286 common getsockname sys_getsockname
287 common getpeername sys_getpeername
288 common socketpair sys_socketpair
289 common send sys_send
290 common sendto sys_sendto sys_oabi_sendto
291 common recv sys_recv
292 common recvfrom sys_recvfrom
293 common shutdown sys_shutdown
294 common setsockopt sys_setsockopt
295 common getsockopt sys_getsockopt
296 common sendmsg sys_sendmsg sys_oabi_sendmsg
297 common recvmsg sys_recvmsg
298 common semop sys_semop sys_oabi_semop
299 common semget sys_semget
ipc: rename old-style shmctl/semctl/msgctl syscalls The behavior of these system calls is slightly different between architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION symbol. Most architectures that implement the split IPC syscalls don't set that symbol and only get the modern version, but alpha, arm, microblaze, mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag. For the architectures that so far only implement sys_ipc(), i.e. m68k, mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior when adding the split syscalls, so we need to distinguish between the two groups of architectures. The method I picked for this distinction is to have a separate system call entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl() does not. The system call tables of the five architectures are changed accordingly. As an additional benefit, we no longer need the configuration specific definition for ipc_parse_version(), it always does the same thing now, but simply won't get called on architectures with the modern interface. A small downside is that on architectures that do set ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points that are never called. They only add a few bytes of bloat, so it seems better to keep them compared to adding yet another Kconfig symbol. I considered adding new syscall numbers for the IPC_64 variants for consistency, but decided against that for now. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-01-01 04:22:40 +07:00
300 common semctl sys_old_semctl
301 common msgsnd sys_msgsnd
302 common msgrcv sys_msgrcv
303 common msgget sys_msgget
ipc: rename old-style shmctl/semctl/msgctl syscalls The behavior of these system calls is slightly different between architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION symbol. Most architectures that implement the split IPC syscalls don't set that symbol and only get the modern version, but alpha, arm, microblaze, mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag. For the architectures that so far only implement sys_ipc(), i.e. m68k, mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior when adding the split syscalls, so we need to distinguish between the two groups of architectures. The method I picked for this distinction is to have a separate system call entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl() does not. The system call tables of the five architectures are changed accordingly. As an additional benefit, we no longer need the configuration specific definition for ipc_parse_version(), it always does the same thing now, but simply won't get called on architectures with the modern interface. A small downside is that on architectures that do set ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points that are never called. They only add a few bytes of bloat, so it seems better to keep them compared to adding yet another Kconfig symbol. I considered adding new syscall numbers for the IPC_64 variants for consistency, but decided against that for now. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-01-01 04:22:40 +07:00
304 common msgctl sys_old_msgctl
305 common shmat sys_shmat
306 common shmdt sys_shmdt
307 common shmget sys_shmget
ipc: rename old-style shmctl/semctl/msgctl syscalls The behavior of these system calls is slightly different between architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION symbol. Most architectures that implement the split IPC syscalls don't set that symbol and only get the modern version, but alpha, arm, microblaze, mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag. For the architectures that so far only implement sys_ipc(), i.e. m68k, mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior when adding the split syscalls, so we need to distinguish between the two groups of architectures. The method I picked for this distinction is to have a separate system call entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl() does not. The system call tables of the five architectures are changed accordingly. As an additional benefit, we no longer need the configuration specific definition for ipc_parse_version(), it always does the same thing now, but simply won't get called on architectures with the modern interface. A small downside is that on architectures that do set ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points that are never called. They only add a few bytes of bloat, so it seems better to keep them compared to adding yet another Kconfig symbol. I considered adding new syscall numbers for the IPC_64 variants for consistency, but decided against that for now. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-01-01 04:22:40 +07:00
308 common shmctl sys_old_shmctl
309 common add_key sys_add_key
310 common request_key sys_request_key
311 common keyctl sys_keyctl
312 common semtimedop sys_semtimedop_time32 sys_oabi_semtimedop
313 common vserver
314 common ioprio_set sys_ioprio_set
315 common ioprio_get sys_ioprio_get
316 common inotify_init sys_inotify_init
317 common inotify_add_watch sys_inotify_add_watch
318 common inotify_rm_watch sys_inotify_rm_watch
319 common mbind sys_mbind
320 common get_mempolicy sys_get_mempolicy
321 common set_mempolicy sys_set_mempolicy
322 common openat sys_openat
323 common mkdirat sys_mkdirat
324 common mknodat sys_mknodat
325 common fchownat sys_fchownat
326 common futimesat sys_futimesat_time32
327 common fstatat64 sys_fstatat64 sys_oabi_fstatat64
328 common unlinkat sys_unlinkat
329 common renameat sys_renameat
330 common linkat sys_linkat
331 common symlinkat sys_symlinkat
332 common readlinkat sys_readlinkat
333 common fchmodat sys_fchmodat
334 common faccessat sys_faccessat
335 common pselect6 sys_pselect6_time32
336 common ppoll sys_ppoll_time32
337 common unshare sys_unshare
338 common set_robust_list sys_set_robust_list
339 common get_robust_list sys_get_robust_list
340 common splice sys_splice
341 common arm_sync_file_range sys_sync_file_range2
342 common tee sys_tee
343 common vmsplice sys_vmsplice
344 common move_pages sys_move_pages
345 common getcpu sys_getcpu
346 common epoll_pwait sys_epoll_pwait
347 common kexec_load sys_kexec_load
348 common utimensat sys_utimensat_time32
349 common signalfd sys_signalfd
350 common timerfd_create sys_timerfd_create
351 common eventfd sys_eventfd
352 common fallocate sys_fallocate
353 common timerfd_settime sys_timerfd_settime32
354 common timerfd_gettime sys_timerfd_gettime32
355 common signalfd4 sys_signalfd4
356 common eventfd2 sys_eventfd2
357 common epoll_create1 sys_epoll_create1
358 common dup3 sys_dup3
359 common pipe2 sys_pipe2
360 common inotify_init1 sys_inotify_init1
361 common preadv sys_preadv
362 common pwritev sys_pwritev
363 common rt_tgsigqueueinfo sys_rt_tgsigqueueinfo
364 common perf_event_open sys_perf_event_open
365 common recvmmsg sys_recvmmsg_time32
366 common accept4 sys_accept4
367 common fanotify_init sys_fanotify_init
368 common fanotify_mark sys_fanotify_mark
369 common prlimit64 sys_prlimit64
370 common name_to_handle_at sys_name_to_handle_at
371 common open_by_handle_at sys_open_by_handle_at
372 common clock_adjtime sys_clock_adjtime32
373 common syncfs sys_syncfs
374 common sendmmsg sys_sendmmsg
375 common setns sys_setns
376 common process_vm_readv sys_process_vm_readv
377 common process_vm_writev sys_process_vm_writev
378 common kcmp sys_kcmp
379 common finit_module sys_finit_module
380 common sched_setattr sys_sched_setattr
381 common sched_getattr sys_sched_getattr
382 common renameat2 sys_renameat2
383 common seccomp sys_seccomp
384 common getrandom sys_getrandom
385 common memfd_create sys_memfd_create
386 common bpf sys_bpf
387 common execveat sys_execveat
388 common userfaultfd sys_userfaultfd
389 common membarrier sys_membarrier
390 common mlock2 sys_mlock2
391 common copy_file_range sys_copy_file_range
392 common preadv2 sys_preadv2
393 common pwritev2 sys_pwritev2
394 common pkey_mprotect sys_pkey_mprotect
395 common pkey_alloc sys_pkey_alloc
396 common pkey_free sys_pkey_free
397 common statx sys_statx
398 common rseq sys_rseq
399 common io_pgetevents sys_io_pgetevents_time32
400 common migrate_pages sys_migrate_pages
401 common kexec_file_load sys_kexec_file_load
# 402 is unused
403 common clock_gettime64 sys_clock_gettime
404 common clock_settime64 sys_clock_settime
405 common clock_adjtime64 sys_clock_adjtime
406 common clock_getres_time64 sys_clock_getres
407 common clock_nanosleep_time64 sys_clock_nanosleep
408 common timer_gettime64 sys_timer_gettime
409 common timer_settime64 sys_timer_settime
410 common timerfd_gettime64 sys_timerfd_gettime
411 common timerfd_settime64 sys_timerfd_settime
412 common utimensat_time64 sys_utimensat
413 common pselect6_time64 sys_pselect6
414 common ppoll_time64 sys_ppoll
416 common io_pgetevents_time64 sys_io_pgetevents
417 common recvmmsg_time64 sys_recvmmsg
418 common mq_timedsend_time64 sys_mq_timedsend
419 common mq_timedreceive_time64 sys_mq_timedreceive
420 common semtimedop_time64 sys_semtimedop
421 common rt_sigtimedwait_time64 sys_rt_sigtimedwait
422 common futex_time64 sys_futex
423 common sched_rr_get_interval_time64 sys_sched_rr_get_interval
424 common pidfd_send_signal sys_pidfd_send_signal
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
428 common open_tree sys_open_tree
429 common move_mount sys_move_mount
430 common fsopen sys_fsopen
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
434 common pidfd_open sys_pidfd_open
clone3-v5.3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhhgAKCRCRxhvAZXjc or7kAP9VzDcQaK/WoDd2ezh2C7Wh5hNy9z/qJVCa6Tb+N+g1UgEAxbhFUg55uGOA JNf7fGar5JF5hBMIXR+NqOi1/sb4swg= =ELWo -----END PGP SIGNATURE----- Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull clone3 system call from Christian Brauner: "This adds the clone3 syscall which is an extensible successor to clone after we snagged the last flag with CLONE_PIDFD during the 5.2 merge window for clone(). It cleanly supports all of the flags from clone() and thus all legacy workloads. There are few user visible differences between clone3 and clone. First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse this flag. Second, the CSIGNAL flag is deprecated and will cause EINVAL to be reported. It is superseeded by a dedicated "exit_signal" argument in struct clone_args thus freeing up even more flags. And third, clone3 gives CLONE_PIDFD a dedicated return argument in struct clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr argument. The clone3 uapi is designed to be easy to handle on 32- and 64 bit: /* uapi */ struct clone_args { __aligned_u64 flags; __aligned_u64 pidfd; __aligned_u64 child_tid; __aligned_u64 parent_tid; __aligned_u64 exit_signal; __aligned_u64 stack; __aligned_u64 stack_size; __aligned_u64 tls; }; and a separate kernel struct is used that uses proper kernel typing: /* kernel internal */ struct kernel_clone_args { u64 flags; int __user *pidfd; int __user *child_tid; int __user *parent_tid; int exit_signal; unsigned long stack; unsigned long stack_size; unsigned long tls; }; The system call comes with a size argument which enables the kernel to detect what version of clone_args userspace is passing in. clone3 validates that any additional bytes a given kernel does not know about are set to zero and that the size never exceeds a page. A nice feature is that this patchset allowed us to cleanup and simplify various core kernel codepaths in kernel/fork.c by making the internal _do_fork() function take struct kernel_clone_args even for legacy clone(). This patch also unblocks the time namespace patchset which wants to introduce a new CLONE_TIMENS flag. Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and xtensa. These were the architectures that did not require special massaging. Other architectures treat fork-like system calls individually and after some back and forth neither Arnd nor I felt confident that we dared to add clone3 unconditionally to all architectures. We agreed to leave this up to individual architecture maintainers. This is why there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3 which any architecture can set once it has implemented support for clone3. The patch also adds a cond_syscall(clone3) for architectures such as nios2 or h8300 that generate their syscall table by simply including asm-generic/unistd.h. The hope is to get rid of __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon" * tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: handle arches who do not yet define clone3 arch: wire-up clone3() syscall fork: add clone3
2019-07-12 00:09:44 +07:00
435 common clone3 sys_clone3
open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). It also only contains u64s (even though ->mode arguably should be a u16) to avoid having padding fields which are never used in the future. Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). After input from Florian Weimer, the new open_how and flag definitions are inside a separate header from uapi/linux/fcntl.h, to avoid problems that glibc has with importing that header. /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. Another possible avenue of future work would be some kind of CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace which openat2(2) flags and fields are supported by the current kernel (to avoid userspace having to go through several guesses to figure it out). [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit 629e014bb834 ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ [6]: https://youtu.be/ggD-eb3yPVs Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18 19:07:59 +07:00
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd