linux_dsm_epyc7002/fs/nilfs2
Andreas Rohner 70f2fe3a26 nilfs2: fix segctor bug that causes file system corruption
There is a bug in the function nilfs_segctor_collect, which results in
active data being written to a segment, that is marked as clean.  It is
possible, that this segment is selected for a later segment
construction, whereby the old data is overwritten.

The problem shows itself with the following kernel log message:

  nilfs_sufile_do_cancel_free: segment 6533 must be clean

Usually a few hours later the file system gets corrupted:

  NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
  NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)

The issue can be reproduced with a file system that is nearly full and
with the cleaner running, while some IO intensive task is running.
Although it is quite hard to reproduce.

This is what happens:

 1. The cleaner starts the segment construction
 2. nilfs_segctor_collect is called
 3. sc_stage is on NILFS_ST_SUFILE and segments are freed
 4. sc_stage is on NILFS_ST_DAT current segment is full
 5. nilfs_segctor_extend_segments is called, which
    allocates a new segment
 6. The new segment is one of the segments freed in step 3
 7. nilfs_sufile_cancel_freev is called and produces an error message
 8. Loop around and the collection starts again
 9. sc_stage is on NILFS_ST_SUFILE and segments are freed
    including the newly allocated segment, which will contain active
    data and can be allocated at a later time
10. A few hours later another segment construction allocates the
    segment and causes file system corruption

This can be prevented by simply reordering the statements.  If
nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments
the freed segments are marked as dirty and cannot be allocated any more.

Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-15 14:19:42 +07:00
..
alloc.c nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
alloc.h nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
bmap.c nilfs2: get rid of NILFS_I_NILFS 2011-05-10 22:21:56 +09:00
bmap.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
btnode.c nilfs2: use mark_buffer_dirty to mark btnode or meta data dirty 2011-05-10 22:21:57 +09:00
btnode.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
btree.c nilfs2: fix missing block address termination in btree node shrinking 2011-06-11 15:51:15 +09:00
btree.h
cpfile.c nilfs2: fix timing issue between rmcp and chcp ioctls 2012-07-30 17:25:19 -07:00
cpfile.h nilfs2: use iget for all metadata files 2010-10-23 09:24:38 +09:00
dat.c nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
dat.h nilfs2: use iget for all metadata files 2010-10-23 09:24:38 +09:00
dir.c [readdir] convert nilfs2 2013-06-29 12:56:36 +04:00
direct.c nilfs2: record used amount of each checkpoint in checkpoint list 2011-03-08 14:58:31 +09:00
direct.h
export.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
gcinode.c nilfs2: ensure proper cache clearing for gc-inodes 2012-06-20 14:39:35 -07:00
ifile.c ] nilfs2: use atomic64_t type for inodes_count and blocks_count fields in nilfs_root struct 2013-07-03 16:08:01 -07:00
ifile.h nilfs2: implement calculation of free inodes count 2013-07-03 16:08:01 -07:00
inode.c truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
ioctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
Kconfig fs/nilfs2: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:39:04 -08:00
Makefile nilfs2: get rid of GCDAT inode 2010-10-23 09:24:38 +09:00
mdt.c nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption 2013-04-30 17:04:04 -07:00
mdt.h nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
namei.c fs: encode_fh: return FILEID_INVALID if invalid fid_type 2013-02-26 02:46:10 -05:00
nilfs.h nilfs2: drop vmtruncate 2012-12-20 18:40:54 -05:00
page.c nilfs2: fix issue with race condition of competition between segments for dirty blocks 2013-09-30 14:31:02 -07:00
page.h nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption 2013-04-30 17:04:04 -07:00
recovery.c nilfs2: drop vmtruncate 2012-12-20 18:40:54 -05:00
segbuf.c nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection 2013-08-23 09:51:22 -07:00
segbuf.h
segment.c nilfs2: fix segctor bug that causes file system corruption 2014-01-15 14:19:42 +07:00
segment.h nilfs2: get rid of private page allocator 2011-05-10 22:21:44 +09:00
sufile.c nilfs2: add omitted comments for different structures in driver implementation 2012-07-30 17:25:19 -07:00
sufile.h nilfs2: get rid of NILFS_I_NILFS 2011-05-10 22:21:56 +09:00
super.c git simplify nilfs check for busy subtree 2013-09-03 22:52:50 -04:00
the_nilfs.c ] nilfs2: use atomic64_t type for inodes_count and blocks_count fields in nilfs_root struct 2013-07-03 16:08:01 -07:00
the_nilfs.h ] nilfs2: use atomic64_t type for inodes_count and blocks_count fields in nilfs_root struct 2013-07-03 16:08:01 -07:00