linux_dsm_epyc7002/fs/xfs
Dave Chinner 110dc24ad2 xfs: log vector rounding leaks log space
The addition of direct formatting of log items into the CIL
linear buffer added alignment restrictions that the start of each
vector needed to be 64 bit aligned. Hence padding was added in
xlog_finish_iovec() to round up the vector length to ensure the next
vector started with the correct alignment.

This adds a small number of bytes to the size of
the linear buffer that is otherwise unused. The issue is that we
then use the linear buffer size to determine the log space used by
the log item, and this includes the unused space. Hence when we
account for space used by the log item, it's more than is actually
written into the iclogs, and hence we slowly leak this space.

This results on log hangs when reserving space, with threads getting
stuck with these stack traces:

Call Trace:
[<ffffffff81d15989>] schedule+0x29/0x70
[<ffffffff8150d3a2>] xlog_grant_head_wait+0xa2/0x1a0
[<ffffffff8150d55d>] xlog_grant_head_check+0xbd/0x140
[<ffffffff8150ee33>] xfs_log_reserve+0x103/0x220
[<ffffffff814b7f05>] xfs_trans_reserve+0x2f5/0x310
.....

The 4 bytes is significant. Brain Foster did all the hard work in
tracking down a reproducable leak to inode chunk allocation (it went
away with the ikeep mount option). His rough numbers were that
creating 50,000 inodes leaked 11 log blocks. This turns out to be
roughly 800 inode chunks or 1600 inode cluster buffers. That
works out at roughly 4 bytes per cluster buffer logged, and at that
I started looking for a 4 byte leak in the buffer logging code.

What I found was that a struct xfs_buf_log_format structure for an
inode cluster buffer is 28 bytes in length. This gets rounded up to
32 bytes, but the vector length remains 28 bytes. Hence the CIL
ticket reservation is decremented by 32 bytes (via lv->lv_buf_len)
for that vector rather than 28 bytes which are written into the log.

The fix for this problem is to separately track the bytes used by
the log vectors in the item and use that instead of the buffer
length when accounting for the log space that will be used by the
formatted log item.

Again, thanks to Brian Foster for doing all the hard work and long
hours to isolate this leak and make finding the bug relatively
simple.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-05-20 08:18:09 +10:00
..
Kconfig xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
kmem.c xfs: use NOIO contexts for vm_map_ram 2014-03-07 16:19:14 +11:00
kmem.h xfs: simplify kmem_{zone_}zalloc 2013-11-06 16:31:27 -06:00
Makefile xfs: abstract the differences in dir2/dir3 via an ops vector 2013-10-30 13:37:38 -05:00
mrlock.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
time.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.h xfs: add CRC infrastructure 2012-11-19 20:11:24 -06:00
xfs_acl.c xfs: return -E2BIG if hit the maximum size limits of ACLs 2014-02-07 15:26:11 +11:00
xfs_acl.h xfs: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
xfs_ag.h xfs: Use defines for CRC offsets in all cases 2014-02-27 15:15:27 +11:00
xfs_alloc_btree.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_alloc_btree.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_alloc.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_alloc.h xfs: create a shared header file for format-related information 2013-10-23 14:11:30 -05:00
xfs_aops.c xfs: don't map ranges that span EOF for direct IO 2014-04-17 08:15:19 +10:00
xfs_aops.h direct-io: Implement generic deferred AIO completions 2013-09-04 09:23:46 -04:00
xfs_attr_inactive.c xfs: vectorise encoding/decoding directory headers 2013-10-30 13:47:22 -05:00
xfs_attr_leaf.c xfs: remote attribute overwrite causes transaction overrun 2014-05-06 07:37:31 +10:00
xfs_attr_leaf.h xfs: unify directory/attribute format definitions 2013-10-23 14:21:40 -05:00
xfs_attr_list.c xfs: remote attribute overwrite causes transaction overrun 2014-05-06 07:37:31 +10:00
xfs_attr_remote.c xfs: remote attribute overwrite causes transaction overrun 2014-05-06 07:37:31 +10:00
xfs_attr_remote.h xfs: unify directory/attribute format definitions 2013-10-23 14:21:40 -05:00
xfs_attr_sf.h
xfs_attr.c xfs: remote attribute overwrite causes transaction overrun 2014-05-06 07:37:31 +10:00
xfs_attr.h xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00
xfs_bit.c xfs: fix static and extern sparse warnings 2013-10-30 13:59:56 -05:00
xfs_bit.h
xfs_bmap_btree.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_bmap_btree.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_bmap_util.c xfs: remove XFS_TRANS_RESERVE in collapse range 2014-05-20 08:15:57 +10:00
xfs_bmap_util.h xfs: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate 2014-02-24 10:58:19 +11:00
xfs_bmap.c xfs: collapse range is delalloc challenged 2014-04-17 08:15:25 +10:00
xfs_bmap.h xfs: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate 2014-02-24 10:58:19 +11:00
xfs_btree.c xfs: add helper for updating checksums on xfs_bufs 2014-02-27 15:18:23 +11:00
xfs_btree.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_buf_item.c xfs: remove XFS_TRANS_DEBUG dead code 2014-02-07 15:26:11 +11:00
xfs_buf_item.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_buf.c xfs: fix buffer use after free on IO error 2014-04-17 08:15:28 +10:00
xfs_buf.h xfs: add helper for updating checksums on xfs_bufs 2014-02-27 15:18:23 +11:00
xfs_cksum.h xfs: add CRC infrastructure 2012-11-19 20:11:24 -06:00
xfs_da_btree.c Merge branch 'xfs-bug-fixes-for-3.15-3' into for-next 2014-04-04 08:07:35 +11:00
xfs_da_btree.h xfs: remote attribute overwrite causes transaction overrun 2014-05-06 07:37:31 +10:00
xfs_da_format.c xfs: fix static and extern sparse warnings 2013-10-30 13:59:56 -05:00
xfs_da_format.h xfs: convert directory vector functions to constants 2013-10-30 13:48:41 -05:00
xfs_dinode.h xfs: Use defines for CRC offsets in all cases 2014-02-27 15:15:27 +11:00
xfs_dir2_block.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_dir2_data.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_dir2_leaf.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_dir2_node.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_dir2_priv.h xfs: vectorise encoding/decoding directory headers 2013-10-30 13:47:22 -05:00
xfs_dir2_readdir.c xfs: reinstate the ilock in xfs_readdir 2013-12-18 15:52:36 -06:00
xfs_dir2_sf.c xfs: xfs_dir2_block_to_sf temp buffer allocation fails 2013-12-11 14:59:20 -06:00
xfs_dir2.c xfs: allocate xfs_da_args to reduce stack footprint 2014-02-27 16:51:26 +11:00
xfs_dir2.h xfs: convert directory vector functions to constants 2013-10-30 13:49:18 -05:00
xfs_discard.c xfs: don't perform discard if the given range length is less than block size 2013-12-10 10:00:33 -06:00
xfs_discard.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot_buf.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_dquot_item.c xfs: remove the quotaoff log format from the quotaoff log item 2013-12-13 11:34:08 +11:00
xfs_dquot_item.h xfs: remove the quotaoff log format from the quotaoff log item 2013-12-13 11:34:08 +11:00
xfs_dquot.c xfs: use tr_qm_dqalloc log reservation for dquot alloc 2014-02-07 14:55:54 +11:00
xfs_dquot.h xfs: create a shared header file for format-related information 2013-10-23 14:11:30 -05:00
xfs_error.c xfs: print useful caller information in xfs_error_report 2014-02-27 15:21:37 +11:00
xfs_error.h xfs: add xfs_verifier_error() 2014-02-27 15:21:07 +11:00
xfs_export.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_export.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_extent_busy.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_extent_busy.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_extfree_item.c xfs: format log items write directly into the linear CIL buffer 2013-12-13 11:34:02 +11:00
xfs_extfree_item.h xfs: split out EFI/EFD log item format definition 2013-08-12 16:07:13 -05:00
xfs_file.c These are regression and bug fixes for ext4. 2014-04-20 20:43:47 -07:00
xfs_filestream.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_filestream.h xfs: xfs_filestreams.h doesn't need __KERNEL__ 2013-08-12 17:00:11 -05:00
xfs_format.h xfs: Use defines for CRC offsets in all cases 2014-02-27 15:15:27 +11:00
xfs_fs.h xfs: add the inode directory type support to XFS_IOC_FSGEOM 2013-10-08 14:28:09 -05:00
xfs_fsops.c xfs: growfs overruns AGFL buffer on V4 filesystems 2013-12-10 10:04:27 -06:00
xfs_fsops.h xfs: ensure log covering transactions are synchronous 2011-01-11 20:28:17 -06:00
xfs_globals.c xfs: add background scanning to clear eofblocks inodes 2012-11-08 15:34:59 -06:00
xfs_ialloc_btree.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_ialloc_btree.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_ialloc.c Merge branch 'xfs-bug-fixes-for-3.15-2' into for-next 2014-03-13 19:13:05 +11:00
xfs_ialloc.h xfs: introduce a common helper xfs_icluster_size_fsb 2013-12-13 15:51:48 +11:00
xfs_icache.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_icache.h xfs: update #2 for v3.12-rc1 2013-09-12 16:13:41 -07:00
xfs_icreate_item.c xfs: format log items write directly into the linear CIL buffer 2013-12-13 11:34:02 +11:00
xfs_icreate_item.h xfs: separate icreate log format definitions from xfs_icreate_item.h 2013-08-12 16:10:35 -05:00
xfs_inode_buf.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_inode_buf.h xfs: create a shared header file for format-related information 2013-10-23 14:11:30 -05:00
xfs_inode_fork.c Merge branch 'xfs-extent-list-locking-fixes' into for-next 2014-01-09 16:03:18 -06:00
xfs_inode_fork.h xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_inode_item.c xfs: remove the inode log format from the inode log item 2013-12-13 11:34:05 +11:00
xfs_inode_item.h xfs: remove the inode log format from the inode log item 2013-12-13 11:34:05 +11:00
xfs_inode.c xfs: fix tmpfile/selinux deadlock and initialize security 2014-04-17 08:15:30 +10:00
xfs_inode.h xfs: fix tmpfile/selinux deadlock and initialize security 2014-04-17 08:15:30 +10:00
xfs_inum.h xfs: move xfsagino_t to xfs_types.h 2012-05-14 16:20:54 -05:00
xfs_ioctl32.c xfs: underflow bug in xfs_attrlist_by_handle() 2013-12-10 09:59:37 -06:00
xfs_ioctl32.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_ioctl.c new helper: readlink_copy() 2014-04-01 23:19:15 -04:00
xfs_ioctl.h xfs: consolidate extent swap code 2013-08-12 16:56:06 -05:00
xfs_iomap.c xfs: always use unwritten extents for direct I/O writes 2014-02-10 10:27:43 +11:00
xfs_iomap.h xfs: get rid of count from xfs_iomap_write_allocate() 2013-10-01 15:42:34 -05:00
xfs_iops.c xfs: initialize default acls for ->tmpfile() 2014-05-06 07:34:28 +10:00
xfs_iops.h xfs: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
xfs_itable.c xfs: use xfs_icluster_size_fsb in xfs_bulkstat 2013-12-13 15:51:48 +11:00
xfs_itable.h
xfs_linux.h xfs: add xfs_verifier_error() 2014-02-27 15:21:07 +11:00
xfs_log_cil.c xfs: log vector rounding leaks log space 2014-05-20 08:18:09 +10:00
xfs_log_format.h xfs: create a shared header file for format-related information 2013-10-23 14:11:30 -05:00
xfs_log_priv.h xfs: decouple log and transaction headers 2013-10-23 16:17:44 -05:00
xfs_log_recover.c Merge branch 'xfs-for-linus-v3.13-rc5' into for-next 2013-12-18 10:36:58 -06:00
xfs_log_recover.h
xfs_log_rlimit.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_log.c xfs: fully support v5 format filesystems 2014-05-05 16:18:37 +10:00
xfs_log.h xfs: log vector rounding leaks log space 2014-05-20 08:18:09 +10:00
xfs_message.c xfs: decouple log and transaction headers 2013-10-23 16:17:44 -05:00
xfs_message.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_mount.c xfs: fully support v5 format filesystems 2014-05-05 16:18:37 +10:00
xfs_mount.h xfs: increase inode cluster size for v5 filesystems 2013-11-18 09:29:36 -06:00
xfs_mru_cache.c xfs: convert to alloc_workqueue() 2011-02-01 11:42:43 +01:00
xfs_mru_cache.h
xfs_qm_bhv.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_qm_syscalls.c xfs: make quota metadata truncation behavior consistent to user space 2013-12-06 14:06:15 -06:00
xfs_qm.c xfs: use xfs_ilock_data_map_shared in xfs_qm_dqiterate 2013-12-18 16:06:38 -06:00
xfs_qm.h xfs: integrate xfs_quota_priv header file to xfs_qm 2013-12-06 14:16:33 -06:00
xfs_quota_defs.h xfs: split dquot buffer operations out 2013-10-23 14:28:35 -05:00
xfs_quota.h xfs: split dquot buffer operations out 2013-10-23 14:28:35 -05:00
xfs_quotaops.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_rtalloc.c xfs: use tr_growrtalloc for growing rt files 2014-02-07 14:53:50 +11:00
xfs_rtalloc.h xfs: split xfs_rtalloc.c for userspace sanity 2013-10-23 17:16:32 -05:00
xfs_rtbitmap.c xfs: fix static and extern sparse warnings 2013-10-30 13:59:56 -05:00
xfs_sb.c xfs: fully support v5 format filesystems 2014-05-05 16:18:37 +10:00
xfs_sb.h xfs: Use defines for CRC offsets in all cases 2014-02-27 15:15:27 +11:00
xfs_shared.h xfs: add O_TMPFILE support 2014-01-06 13:50:06 -06:00
xfs_stats.c xfs: use common code for quota statistics 2012-03-14 11:09:06 -05:00
xfs_stats.h xfs: use common code for quota statistics 2012-03-14 11:09:06 -05:00
xfs_super.c Major changes for 3.14 include support for the newly added ZERO_RANGE 2014-04-04 15:39:39 -07:00
xfs_super.h xfs: xfs_sync_data is redundant. 2012-10-17 12:01:25 -05:00
xfs_symlink_remote.c xfs: modify verifiers to differentiate CRC from other errors 2014-02-27 15:23:10 +11:00
xfs_symlink.c Merge branch 'xfs-O_TMPFILE-support' into for-next 2014-03-13 19:14:43 +11:00
xfs_symlink.h xfs: push down inactive transaction mgmt for remote symlinks 2013-10-08 14:53:02 -05:00
xfs_sysctl.c xfs: Convert use of typedef ctl_table to struct ctl_table 2013-06-17 17:42:25 -05:00
xfs_sysctl.h xfs: add background scanning to clear eofblocks inodes 2012-11-08 15:34:59 -06:00
xfs_trace.c xfs: decouple inode and bmap btree header files 2013-10-23 16:28:49 -05:00
xfs_trace.h xfs: zeroing space needs to punch delalloc blocks 2014-04-14 18:15:11 +10:00
xfs_trans_ail.c xfs: trace AIL manipulations 2013-11-06 12:41:51 -06:00
xfs_trans_buf.c xfs: don't leak EFSBADCRC to userspace 2014-03-07 16:19:14 +11:00
xfs_trans_dquot.c xfs: fix the comment explaining xfs_trans_dqlockedjoin 2013-12-04 14:26:57 -06:00
xfs_trans_extfree.c xfs: decouple log and transaction headers 2013-10-23 16:17:44 -05:00
xfs_trans_inode.c xfs: open code inc_inode_iversion when logging an inode 2013-11-18 09:42:08 -06:00
xfs_trans_priv.h xfs: decouple log and transaction headers 2013-10-23 16:17:44 -05:00
xfs_trans_resv.c Merge branch 'xfs-O_TMPFILE-support' into for-next 2014-03-13 19:14:43 +11:00
xfs_trans_resv.h Merge branch 'xfs-O_TMPFILE-support' into for-next 2014-03-13 19:14:43 +11:00
xfs_trans_space.h xfs: get rid of XFS_IALLOC_BLOCKS macros 2013-12-13 15:51:48 +11:00
xfs_trans.c xfs: convert xfs_log_commit_cil() to void 2014-02-07 15:26:07 +11:00
xfs_trans.h xfs: format log items write directly into the linear CIL buffer 2013-12-13 11:34:02 +11:00
xfs_types.h xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_vnode.h xfs: remove unused FI_ flags 2013-12-04 14:11:05 -06:00
xfs_xattr.c xfs: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
xfs.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00