linux_dsm_epyc7002/fs
Linus Torvalds 45bcc21a50 pipe: do FASYNC notifications for every pipe IO, not just state changes
commit fe67f4dd8daa252eb9aa7acb61555f3cc3c1ce4c upstream.

It turns out that the SIGIO/FASYNC situation is almost exactly the same
as the EPOLLET case was: user space really wants to be notified after
every operation.

Now, in a perfect world it should be sufficient to only notify user
space on "state transitions" when the IO state changes (ie when a pipe
goes from unreadable to readable, or from unwritable to writable).  User
space should then do as much as possible - fully emptying the buffer or
what not - and we'll notify it again the next time the state changes.
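
As a rough user-space illustration of the "do as much as possible" pattern described above (not part of this commit), the sketch below arms SIGIO on a file descriptor and drains it until the kernel reports that nothing is left. The helper names `enable_sigio`, `on_sigio`, `drain_fd` and the `sigio_seen` flag are hypothetical, chosen only for this example.

```c
#define _GNU_SOURCE /* may be needed for F_SETOWN / O_ASYNC on some libc setups */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, checked and cleared by the main loop. */
volatile sig_atomic_t sigio_seen = 0;

/* Keep the handler trivial; the actual reading happens outside signal context. */
void on_sigio(int signo)
{
	(void)signo;
	sigio_seen = 1;
}

/*
 * Install the SIGIO handler and ask the kernel to signal this process
 * when fd sees I/O.  Also makes fd non-blocking so that drain_fd()
 * can stop on EAGAIN.  Returns 0 on success, -1 on failure.
 */
int enable_sigio(int fd)
{
	struct sigaction sa = { 0 };
	int flags;

	sa.sa_handler = on_sigio;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	if (sigaction(SIGIO, &sa, NULL) < 0)
		return -1;
	if (fcntl(fd, F_SETOWN, getpid()) < 0)
		return -1;
	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Read from a non-blocking fd until it is empty (EAGAIN) or at EOF.
 * Returns the total number of bytes consumed, or -1 on a real error.
 */
ssize_t drain_fd(int fd)
{
	char buf[4096];
	ssize_t total = 0;

	assert(fd >= 0);
	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			break;		/* EOF: all writers closed */
		if (errno == EINTR)
			continue;	/* interrupted, retry */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			break;		/* nothing left to read right now */
		return -1;
	}
	return total;
}
```

A caller would typically run `enable_sigio()` once on the read end of the pipe and then call `drain_fd()` whenever `sigio_seen` is set. Because the loop only stops on EAGAIN or EOF, it does not depend on receiving a separate SIGIO for each write.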

But as with EPOLLET, we have at least one case (stress-ng) where the
kernel sent SIGIO due to the pipe being marked for asynchronous
notification, but the user space signal handler then didn't actually
necessarily read it all before returning (it read more than what was
written, but since there could be multiple writes, it could leave data
pending).

The user space code then expected to get another SIGIO for subsequent
writes - even though the pipe had been readable the whole time - and
would only then read more.

This is arguably a user space bug - and Colin King already fixed the
stress-ng code in question - but the kernel regression rules are clear:
it doesn't matter if kernel people think that user space did something
silly and wrong.  What matters is that it used to work.

So if user space depends on specific historical kernel behavior, it's a
regression when that behavior changes.  It's on us: we were silly to
have that non-optimal historical behavior, and our old kernel behavior
was what user space was tested against.

Because of how the FASYNC notification was tied to wakeup behavior, this
was first broken by commits f467a6a664 and 1b6b26ae70 ("pipe: fix
and clarify pipe read/write wakeup logic"), but at the time it seems
nobody noticed.  Probably because the stress-ng problem case ends up
being timing-dependent too.

It was then unwittingly fixed by commit 3a34b13a88ca ("pipe: make pipe
writes always wake up readers") only to be broken again by commit
3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal
loads").

And at that point the kernel test robot noticed the performance
regression in the stress-ng.sigio.ops_per_sec case.  So the "Fixes" tag
below is somewhat ad hoc, but it matches when the issue was noticed.

Fix it for good (knock wood) by simply making the kill_fasync() case
separate from the wakeup case.  FASYNC is quite rare, and we clearly
shouldn't even try to use the "avoid unnecessary wakeups" logic for it.

Link: https://lore.kernel.org/lkml/20210824151337.GC27667@xsang-OptiPlex-9020/
Fixes: 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Oliver Sang <oliver.sang@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05 19:00:51 +02:00
9p fs: 9p: add generic splice_write file operation 2020-12-01 21:40:47 +01:00
adfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix tracepoint string placement with built-in AFS 2021-07-28 14:35:41 +02:00
aufs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
autofs autofs: harden ioctl table 2020-10-16 11:11:22 -07:00
befs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: fix race between marking inode needs to be logged and log syncing 2024-07-05 19:00:50 +02:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph init: add dsm gpl source 2024-07-05 18:00:04 +02:00
cifs cifs: create sd context must be a multiple of 8 2024-07-05 18:53:59 +02:00
coda
configfs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
cramfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
crypto fscrypt: fix derivation of SipHash keys on big endian CPUs 2021-07-14 16:56:53 +02:00
debugfs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
devpts
dlm fs: dlm: fix memory leak when fenced 2021-07-14 16:55:59 +02:00
ecryptfs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-11-25 16:55:02 +01:00
efs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
erofs erofs: fix error return code in erofs_read_superblock() 2021-07-14 16:56:53 +02:00
exfat init: add dsm gpl source 2024-07-05 18:00:04 +02:00
exportfs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
ext2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
ext4 ext4: fix potential htree corruption when growing large_dir directories 2024-07-05 18:52:29 +02:00
f2fs f2fs: Show casefolding support only when supported 2021-07-25 14:36:17 +02:00
fat init: add dsm gpl source 2024-07-05 18:00:04 +02:00
freevxfs
fscache
fuse init: add dsm gpl source 2024-07-05 18:00:04 +02:00
gfs2 gfs2: Fix error handling in init_statfs 2021-07-14 16:55:38 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:16:12 +02:00
hfsplus init: add dsm gpl source 2024-07-05 18:00:04 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
hugetlbfs hugetlbfs: fix mount mode command line processing 2021-07-28 14:35:46 +02:00
iomap init: add dsm gpl source 2024-07-05 18:00:04 +02:00
isofs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
jbd2 ext4: fix debug format string warning 2021-05-19 10:13:19 +02:00
jffs2 jffs2: check the validity of dstlen in jffs2_zlib_compress() 2021-05-11 14:47:36 +02:00
jfs fs/jfs: Fix missing error code in lmLogInit() 2021-07-20 16:05:40 +02:00
kernfs kernfs: wire up ->splice_read and ->splice_write 2021-01-27 11:55:29 +01:00
lockd init: add dsm gpl source 2024-07-05 18:00:04 +02:00
minix [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
nfs init: add dsm gpl source 2024-07-05 18:00:04 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:53:45 +01:00
nfsd init: add dsm gpl source 2024-07-05 18:00:04 +02:00
nilfs2 nilfs2: fix memory leak in nilfs_sysfs_delete_device_group 2021-06-30 08:47:24 -04:00
nls
notify init: add dsm gpl source 2024-07-05 18:00:04 +02:00
ntfs ntfs: fix validity check for file name attribute 2021-07-14 16:55:38 +02:00
ocfs2 ocfs2: issue zeroout to EOF blocks 2024-07-05 18:03:15 +02:00
omfs fs: omfs: use kmemdup() rather than kmalloc+memcpy 2020-09-22 23:39:45 -04:00
openpromfs
orangefs orangefs: fix orangefs df output. 2021-07-20 16:05:48 +02:00
overlayfs ovl: fix uninitialized pointer read in ovl_lookup_real_one() 2024-07-05 18:56:55 +02:00
proc init: add dsm gpl source 2024-07-05 18:00:04 +02:00
pstore init: add dsm gpl source 2024-07-05 18:00:04 +02:00
qnx4 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
qnx6 [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
quota init: add dsm gpl source 2024-07-05 18:00:04 +02:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-16 11:11:22 -07:00
reiserfs reiserfs: check directory items on read from disk 2024-07-05 18:52:32 +02:00
romfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:13:10 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2020-10-02 12:02:30 +02:00
sysv [PATCH] reduce boilerplate in fsid handling 2020-09-18 16:45:50 -04:00
tracefs
ubifs ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode 2021-07-20 16:05:51 +02:00
udf init: add dsm gpl source 2024-07-05 18:00:04 +02:00
ufs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
unicode unicode: Add utf8_casefold_hash 2020-09-10 14:03:31 -07:00
vboxsf vboxsf: Add support for the atomic_open directory-inode op 2024-07-05 18:54:41 +02:00
verity fs-verity: use smp_load_acquire() for ->i_verity_info 2020-07-21 16:02:41 -07:00
xfs xfs: fix return of uninitialized value in variable error 2021-05-14 09:50:34 +02:00
zonefs zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() 2021-03-25 09:04:05 +01:00
aio.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
anon_inodes.c
attr.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot 2020-10-16 11:11:21 -07:00
binfmt_elf.c fs: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c
block_dev.c block: fix a race between del_gendisk and BLKRRPART 2021-06-03 09:00:45 +02:00
buffer.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: fix core_pattern parse error 2020-12-06 10:19:07 -08:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c dax: fix ENOMEM handling in grab_mapping_entry() 2021-07-14 16:56:13 +02:00
dcache.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-11 14:47:12 +02:00
exec.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
fcntl.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
fhandle.c
file_table.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
file.c kernel/io_uring: cancel io_uring before task works 2021-01-30 13:55:18 +01:00
filesystems.c
fs_context.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c
fs_struct.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
fs_types.c
fs-writeback.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
internal.h init: add dsm gpl source 2024-07-05 18:00:04 +02:00
io_uring.c io_uring: only assign io_uring_enter() SQPOLL error in actual error case 2024-07-05 18:56:00 +02:00
io-wq.c io_uring: fix false WARN_ONCE 2021-07-19 09:44:51 +02:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
ioctl.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
Kconfig init: add dsm gpl source 2024-07-05 18:00:04 +02:00
Kconfig.binfmt
kernel_read_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-22 10:48:22 -08:00
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile init: add dsm gpl source 2024-07-05 18:00:04 +02:00
mbcache.c
mount.h init: add dsm gpl source 2024-07-05 18:00:04 +02:00
mpage.c
namei.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
namespace.c fs: warn about impending deprecation of mandatory locks 2024-07-05 18:56:00 +02:00
no-block.c
nsfs.c
open.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
pipe.c pipe: do FASYNC notifications for every pipe IO, not just state changes 2024-07-05 19:00:51 +02:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
read_write.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:05:59 +02:00
signalfd.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
splice.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
stack.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
stat.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
statfs.c Add a "nosymfollow" mount option. 2020-08-27 16:06:47 -04:00
super.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
sync.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
syno_acl_api.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
syno_acl.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
syno_acl.h init: add dsm gpl source 2024-07-05 18:00:04 +02:00
timerfd.c
userfaultfd.c userfaultfd: do not untag user pointers 2021-07-28 14:35:46 +02:00
utimes.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
xattr.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00