linux_dsm_epyc7002/fs/btrfs
Filipe Manana ebb929060a Btrfs: avoid fallback to transaction commit during fsync of files with holes
When we are doing a full fsync (bit BTRFS_INODE_NEEDS_FULL_SYNC set) of a
file that has holes and has file extent items spanning two or more leafs,
we can end up falling to back to a full transaction commit due to a logic
bug that leads to failure to insert a duplicate file extent item that is
meant to represent a hole between the last file extent item of a leaf and
the first file extent item in the next leaf. The failure (EEXIST error)
leads to a transaction commit (as most errors when logging an inode do).

For example, we have the two following leafs:

Leaf N:

  -----------------------------------------------
  | ..., ..., ..., (257, FILE_EXTENT_ITEM, 64K) |
  -----------------------------------------------
  The file extent item at the end of leaf N has a length of 4Kb,
  representing the file range from 64K to 68K - 1.

Leaf N + 1:

  -----------------------------------------------
  | (257, FILE_EXTENT_ITEM, 72K), ..., ..., ... |
  -----------------------------------------------
  The file extent item at the first slot of leaf N + 1 has a length of
  4Kb too, representing the file range from 72K to 76K - 1.

During the full fsync path, when we are at tree-log.c:copy_items() with
leaf N as a parameter, after processing the last file extent item, that
represents the extent at offset 64K, we take a look at the first file
extent item at the next leaf (leaf N + 1), and notice there's a 4K hole
between the two extents, and therefore we insert a file extent item
representing that hole, starting at file offset 68K and ending at offset
72K - 1. However we don't update the value of *last_extent, which is used
to represent the end offset (plus 1, non-inclusive end) of the last file
extent item inserted in the log, so it stays with a value of 68K and not
with a value of 72K.

Then, when copy_items() is called for leaf N + 1, because the value of
*last_extent is smaller then the offset of the first extent item in the
leaf (68K < 72K), we look at the last file extent item in the previous
leaf (leaf N) and see it there's a 4K gap between it and our first file
extent item (again, 68K < 72K), so we decide to insert a file extent item
representing the hole, starting at file offset 68K and ending at offset
72K - 1, this insertion will fail with -EEXIST being returned from
btrfs_insert_file_extent() because we already inserted a file extent item
representing a hole for this offset (68K) in the previous call to
copy_items(), when processing leaf N.

The -EEXIST error gets propagated to the fsync callback, btrfs_sync_file(),
which falls back to a full transaction commit.

Fix this by adjusting *last_extent after inserting a hole when we had to
look at the next leaf.

Fixes: 4ee3fad34a ("Btrfs: fix fsync after hole punching when using no-holes feature")
Cc: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-05-16 14:31:13 +02:00
..
tests btrfs: get fs_info from block group in search_free_space_info 2019-04-29 19:02:46 +02:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c btrfs: simplify workqueue name when allocating 2019-02-25 14:13:24 +01:00
async-thread.h
backref.c Btrfs: do not start a transaction during fiemap 2019-04-29 19:02:51 +02:00
backref.h
btrfs_inode.h Btrfs: improve performance on fsync of files with multiple hardlinks 2019-04-29 19:02:52 +02:00
check-integrity.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
check-integrity.h
compression.c btrfs: Check the compression level before getting a workspace 2019-05-03 18:21:25 +02:00
compression.h btrfs: change set_level() to bound the level passed in 2019-02-25 14:13:32 +01:00
ctree.c btrfs: ctree: Dump the leaf before BUG_ON in btrfs_set_item_key_safe 2019-04-29 19:02:52 +02:00
ctree.h btrfs: track DIO bytes in flight 2019-04-29 19:25:37 +02:00
dedupe.h
delayed-inode.c btrfs: get fs_info from eb in btrfs_leaf_free_space 2019-04-29 19:02:30 +02:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c btrfs: remove unused parameter fs_info from btrfs_add_delayed_extent_op 2019-04-29 19:02:51 +02:00
delayed-ref.h btrfs: remove unused parameter fs_info from btrfs_add_delayed_extent_op 2019-04-29 19:02:51 +02:00
dev-replace.c btrfs: get fs_info from device in btrfs_rm_dev_replace_free_srcdev 2019-04-29 19:02:48 +02:00
dev-replace.h btrfs: get fs_info from trans in btrfs_run_dev_replace 2019-04-29 19:02:43 +02:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
disk-io.c btrfs: track DIO bytes in flight 2019-04-29 19:25:37 +02:00
disk-io.h btrfs: get fs_info from trans in btrfs_create_tree 2019-04-29 19:02:41 +02:00
export.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
export.h
extent_io.c btrfs: remove unused parameter fs_info from emit_last_fiemap_cache 2019-04-29 19:02:51 +02:00
extent_io.h btrfs: Remove bio_offset argument from submit_bio_hook 2019-04-29 19:02:47 +02:00
extent_map.c btrfs: Optimize unallocated chunks discard 2019-04-29 19:02:38 +02:00
extent_map.h btrfs: Remove impossible condition from mergable_maps 2019-02-25 14:13:21 +01:00
extent-tree.c btrfs: extent-tree: Fix a bug that btrfs is unable to add pinned bytes 2019-05-16 14:31:13 +02:00
file-item.c btrfs: Document btrfs_csum_one_bio 2019-04-29 19:02:52 +02:00
file.c btrfs: don't double unlock on error in btrfs_punch_hole 2019-05-03 18:21:36 +02:00
free-space-cache.c btrfs: get fs_info from block group in btrfs_find_space_cluster 2019-04-29 19:02:46 +02:00
free-space-cache.h btrfs: get fs_info from block group in btrfs_find_space_cluster 2019-04-29 19:02:46 +02:00
free-space-tree.c btrfs: get fs_info from block group in search_free_space_info 2019-04-29 19:02:46 +02:00
free-space-tree.h btrfs: get fs_info from block group in search_free_space_info 2019-04-29 19:02:46 +02:00
inode-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
inode-map.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
inode-map.h
inode.c btrfs: Use kvmalloc for allocating compressed path context 2019-05-02 13:48:19 +02:00
ioctl.c btrfs: drop local copy of inode i_mode 2019-04-29 19:02:53 +02:00
Kconfig
locking.c btrfs: trace: Introduce trace events for all btrfs tree locking events 2019-04-29 19:02:43 +02:00
locking.h btrfs: merge btrfs_set_lock_blocking_rw with it's caller 2019-02-25 14:13:28 +01:00
lzo.c btrfs: change set_level() to bound the level passed in 2019-02-25 14:13:32 +01:00
Makefile
math.h
ordered-data.c btrfs: track DIO bytes in flight 2019-04-29 19:25:37 +02:00
ordered-data.h btrfs: Remove redundant inode argument from btrfs_add_ordered_sum 2019-04-29 19:02:40 +02:00
orphan.c
print-tree.c btrfs: get fs_info from eb in btrfs_leaf_free_space 2019-04-29 19:02:30 +02:00
print-tree.h
props.c btrfs: use the existing reserved items for our first prop for inheritance 2019-05-09 11:18:14 +02:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: get fs_info from trans in btrfs_create_tree 2019-04-29 19:02:41 +02:00
qgroup.h btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record 2019-02-25 14:13:39 +01:00
raid56.c for-5.1-rc2-tag 2019-03-26 10:32:13 -07:00
raid56.h
rcu-string.h
reada.c btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
ref-verify.c btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
relocation.c btrfs: extent-tree: Use btrfs_ref to refactor btrfs_free_extent() 2019-04-29 19:02:49 +02:00
root-tree.c Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path 2019-05-09 11:25:27 +02:00
scrub.c btrfs: get fs_info from device in btrfs_scrub_cancel_dev 2019-04-29 19:02:47 +02:00
send.c Btrfs: fix race between send and deduplication that lead to failures and crashes 2019-04-29 19:02:52 +02:00
send.h
struct-funcs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
super.c btrfs: drop unused parameter in mount_subvol 2019-04-29 19:02:35 +02:00
sysfs.c btrfs: sysfs: don't leak memory when failing add fsid 2019-05-16 14:31:12 +02:00
sysfs.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
transaction.c Btrfs: remove no longer used member num_dirty_bgs from transaction 2019-04-29 19:02:43 +02:00
transaction.h Btrfs: remove no longer used member num_dirty_bgs from transaction 2019-04-29 19:02:43 +02:00
tree-checker.c btrfs: tree-checker: Allow error injection for tree-checker 2019-04-29 19:02:52 +02:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: open code now trivial btrfs_set_lock_blocking 2019-02-25 14:13:27 +01:00
tree-log.c Btrfs: avoid fallback to transaction commit during fsync of files with holes 2019-05-16 14:31:13 +02:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
volumes.c btrfs: get fs_info from device in btrfs_rm_dev_replace_free_srcdev 2019-04-29 19:02:48 +02:00
volumes.h btrfs: get fs_info from device in btrfs_rm_dev_replace_free_srcdev 2019-04-29 19:02:48 +02:00
xattr.c btrfs: start transaction in xattr_handler_set_prop 2019-04-29 19:02:54 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: change set_level() to bound the level passed in 2019-02-25 14:13:32 +01:00
zstd.c btrfs: zstd: remove indirect calls for local functions 2019-04-29 19:02:18 +02:00