linux_dsm_epyc7002/fs/ext4
Andrea Arcangeli ea51d132db ext4: avoid hangs in ext4_da_should_update_i_disksize()
If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
2011-12-13 21:41:15 -05:00
..
acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
acl.h fs: take the ACL checks to common code 2011-07-25 14:30:23 -04:00
balloc.c ext4: fix up a undefined error in ext4_free_blocks in debugging code 2011-11-21 12:09:19 -05:00
bitmap.c ext4: Change unsigned long to unsigned int 2008-11-05 00:14:04 -05:00
block_validity.c ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
dir.c ext4: Use ext4_error_file() to print the pathname to the corrupted inode 2011-01-10 12:10:55 -05:00
ext4_extents.h ext4: Fix bigalloc quota accounting and i_blocks value 2011-09-09 19:04:51 -04:00
ext4_jbd2.c jbd2: add debugging information to jbd2_journal_dirty_metadata() 2011-09-04 10:18:14 -04:00
ext4_jbd2.h ext4: Fix ext4_should_writeback_data() for no-journal mode 2011-08-13 11:25:18 -04:00
ext4.h Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-11-02 10:06:20 -07:00
extents.c ext4: Fix crash due to getting bogus eh_depth value on big-endian systems 2011-12-12 11:00:56 -05:00
file.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-11-02 10:06:20 -07:00
fsync.c ext4: optimize locking for end_io extent conversion 2011-10-31 10:56:32 -04:00
hash.c ext4: Add support for non-native signed/unsigned htree hash algorithms 2008-10-28 13:21:44 -04:00
ialloc.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
indirect.c ext4: enforce bigalloc restrictions (e.g., no online resizing, etc.) 2011-09-09 18:36:51 -04:00
inode.c ext4: avoid hangs in ext4_da_should_update_i_disksize() 2011-12-13 21:41:15 -05:00
ioctl.c ext4: add __user decoration to calls of copy_{from,to}_user() 2011-10-18 10:59:51 -04:00
Kconfig ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured 2009-12-21 10:54:09 -05:00
Makefile ext4: move ext4_ind_* functions from inode.c to indirect.c 2011-06-27 19:40:50 -04:00
mballoc.c ext4: fix a wrong comment in __mb_check_buddy() 2011-10-26 08:48:54 -04:00
mballoc.h ext4: fix a typo in struct ext4_allocation_context 2011-10-31 18:55:50 -04:00
migrate.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
mmp.c ext4: Fix comparison endianness problem in MMP initialization 2011-10-18 10:53:51 -04:00
move_extent.c ext4: add some tracepoints in ext4/extents.c 2011-09-09 19:18:51 -04:00
namei.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
page-io.c ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten 2011-10-31 17:30:44 -04:00
resize.c ext4: Rename ext4_free_blks_{count,set}() to refer to clusters 2011-09-09 19:08:51 -04:00
super.c ext4: display the correct mount option in /proc/mounts for [no]init_itable 2011-12-12 22:06:18 -05:00
symlink.c ext4: symlink must be handled via filesystem specific operation 2010-05-16 02:00:00 -04:00
truncate.h ext4: move common truncate functions to header file 2011-06-27 19:16:04 -04:00
xattr_security.c security: new security_inode_init_security API adds function callback 2011-07-18 12:29:38 -04:00
xattr_trusted.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr_user.c ext4: constify xattr_handler 2010-05-21 18:31:19 -04:00
xattr.c ext4: fix race in xattr block allocation path 2011-10-29 10:15:35 -04:00
xattr.h fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00