linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-26 10:40:53 +07:00

Author	SHA1	Message	Date
Miklos Szeredi	2aa15890f3	mm: prevent concurrent unmap_mapping_range() on the same inode Michael Leun reported that running parallel opens on a fuse filesystem can trigger a "kernel BUG at mm/truncate.c:475" Gurudas Pai reported the same bug on NFS. The reason is, unmap_mapping_range() is not prepared for more than one concurrent invocation per inode. For example: thread1: going through a big range, stops in the middle of a vma and stores the restart address in vm_truncate_count. thread2: comes in with a small (e.g. single page) unmap request on the same vma, somewhere before restart_address, finds that the vma was already unmapped up to the restart address and happily returns without doing anything. Another scenario would be two big unmap requests, both having to restart the unmapping and each one setting vm_truncate_count to its own value. This could go on forever without any of them being able to finish. Truncate and hole punching already serialize with i_mutex. Other callers of unmap_mapping_range() do not, and it's difficult to get i_mutex protection for all callers. In particular ->d_revalidate(), which calls invalidate_inode_pages2_range() in fuse, may be called with or without i_mutex. This patch adds a new mutex to 'struct address_space' to prevent running multiple concurrent unmap_mapping_range() on the same mapping. [ We'll hopefully get rid of all this with the upcoming mm preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex lockbreak" patch in particular. But that is for 2.6.39 ] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Michael Leun <lkml20101129@newton.leun.net> Reported-by: Gurudas Pai <gurudas.pai@oracle.com> Tested-by: Gurudas Pai <gurudas.pai@oracle.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-02-23 19:52:52 -08:00
Ryusuke Konishi	0ca7a5b9ac	nilfs2: fix crash after one superblock became unavailable Fixes the following kernel oops in nilfs_setup_super() which could arise if one of two super-blocks is unavailable. > BUG: unable to handle kernel NULL pointer dereference at (null) > Pid: 3529, comm: mount.nilfs2 Not tainted 2.6.37 #1 / > EIP: 0060:[<c03196bc>] EFLAGS: 00010202 CPU: 3 > EIP is at memcpy+0xc/0x1b > Call Trace: > [<f953720e>] ? nilfs_setup_super+0x6c/0xa5 [nilfs2] > [<f95369e9>] ? nilfs_get_root_dentry+0x81/0xcb [nilfs2] > [<f9537a08>] ? nilfs_mount+0x4f9/0x62c [nilfs2] > [<c02745cf>] ? kstrdup+0x36/0x3f > [<f953750f>] ? nilfs_mount+0x0/0x62c [nilfs2] > [<c0293940>] ? vfs_kern_mount+0x4d/0x12c > [<c02a5100>] ? get_fs_type+0x76/0x8f > [<c0293a68>] ? do_kern_mount+0x33/0xbf > [<c02a784a>] ? do_mount+0x2ed/0x714 > [<c02a6171>] ? copy_mount_options+0x28/0xfc > [<c02a7ce3>] ? sys_mount+0x72/0xaf > [<c0473085>] ? syscall_call+0x7/0xb Reported-by: Wakko Warner <wakko@animx.eu.org> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Wakko Warner <wakko@animx.eu.org> Cc: stable <stable@kernel.org> [2.6.37, 2.6.36] LKML-Reference: <20110121024918.GA29598@animx.eu.org>	2011-01-22 15:22:36 +09:00
Linus Torvalds	275220f0fc	Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits) block: ensure that completion error gets properly traced blktrace: add missing probe argument to block_bio_complete block cfq: don't use atomic_t for cfq_group block cfq: don't use atomic_t for cfq_queue block: trace event block fix unassigned field block: add internal hd part table references block: fix accounting bug on cross partition merges kref: add kref_test_and_get bio-integrity: mark kintegrityd_wq highpri and CPU intensive block: make kblockd_workqueue smarter Revert "sd: implement sd_check_events()" block: Clean up exit_io_context() source code. Fix compile warnings due to missing removal of a 'ret' variable fs/block: type signature of major_to_index(int) to major_to_index(unsigned) block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p) cfq-iosched: don't check cfqg in choose_service_tree() fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors cdrom: export cdrom_check_events() sd: implement sd_check_events() sr: implement sr_check_events() ...	2011-01-13 10:45:01 -08:00
Alexey Dobriyan	57cc7215b7	headers: kobject.h redux Remove kobject.h from files which don't need it, notably, sched.h and fs.h. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-10 08:51:44 -08:00
Ryusuke Konishi	06df0f9992	nilfs2: get rid of nilfs_mount_options structure Only mount_opt member is used in the nilfs_mount_options structure, and we can simplify it. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2011-01-10 14:05:46 +09:00
Joe Perches	b004a5eb0b	fs/nilfs2/super.c: Use printf extension %pV Using %pV reduces the number of printk calls and eliminates any possible message interleaving from other printk calls. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2011-01-10 14:05:45 +09:00
Nick Piggin	fa0d7e3de6	fs: icache RCU free inodes RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:26 +11:00
Nick Piggin	b7ab39f631	fs: dcache scale dentry refcount Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Tejun Heo	d4d7762995	block: clean up blkdev_get() wrappers and their users After recent blkdev_get() modifications, open_by_devnum() and open_bdev_exclusive() are simple wrappers around blkdev_get(). Replace them with blkdev_get_by_dev() and blkdev_get_by_path(). blkdev_get_by_dev() is identical to open_by_devnum(). blkdev_get_by_path() is slightly different in that it doesn't automatically add %FMODE_EXCL to @mode. All users are converted. Most conversions are mechanical and don't introduce any behavior difference. There are several exceptions. * btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no reason to OR it explicitly on blkdev_put(). * gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in sb->s_mode. * With the above changes, sb->s_mode now always should contain FMODE_EXCL. WARN_ON_ONCE() added to kill_block_super() to detect errors. The new blkdev_get_*() functions are with proper docbook comments. While at it, add function description to blkdev_get() too. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Joern Engel <joern@lazybastard.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: reiserfs-devel@vger.kernel.org Cc: xfs-masters@oss.sgi.com Cc: Alexander Viro <viro@zeniv.linux.org.uk>	2010-11-13 11:55:18 +01:00
Tejun Heo	e525fd89d3	block: make blkdev_get/put() handle exclusive access Over time, block layer has accumulated a set of APIs dealing with bdev open, close, claim and release. * blkdev_get/put() are the primary open and close functions. * bd_claim/release() deal with exclusive open. * open/close_bdev_exclusive() are combination of open and claim and the other way around, respectively. * bd_link/unlink_disk_holder() to create and remove holder/slave symlinks. * open_by_devnum() wraps bdget() + blkdev_get(). The interface is a bit confusing and the decoupling of open and claim makes it impossible to properly guarantee exclusive access as in-kernel open + claim sequence can disturb the existing exclusive open even before the block layer knows the current open if for another exclusive access. Reorganize the interface such that, * blkdev_get() is extended to include exclusive access management. @holder argument is added and, if is @FMODE_EXCL specified, it will gain exclusive access atomically w.r.t. other exclusive accesses. * blkdev_put() is similarly extended. It now takes @mode argument and if @FMODE_EXCL is set, it releases an exclusive access. Also, when the last exclusive claim is released, the holder/slave symlinks are removed automatically. * bd_claim/release() and close_bdev_exclusive() are no longer necessary and either made static or removed. * bd_link_disk_holder() remains the same but bd_unlink_disk_holder() is no longer necessary and removed. * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev() and blkdev_get(). It also has an unexpected extra bdev_read_only() test which probably should be moved into blkdev_get(). * open_by_devnum() is modified to take @holder argument and pass it to blkdev_get(). Most of bdev open/close operations are unified into blkdev_get/put() and most exclusive accesses are tested atomically at the open time (as it should). This cleans up code and removes some, both valid and invalid, but unnecessary all the same, corner cases. open_bdev_exclusive() and open_by_devnum() can use further cleanup - rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop special features. Well, let's leave them for another day. Most conversions are straight-forward. drbd conversion is a bit more involved as there was some reordering, but the logic should stay the same. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Brown <neilb@suse.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Philipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>	2010-11-13 11:55:17 +01:00
Al Viro	e4c59d61e8	convert nilfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:53 -04:00
Linus Torvalds	ab34c02afe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (36 commits) nilfs2: eliminate sparse warning - "context imbalance" nilfs2: eliminate sparse warnings - "symbol not declared" nilfs2: get rid of bdi from nilfs object nilfs2: change license of exported header file nilfs2: add bdev freeze/thaw support nilfs2: accept 64-bit checkpoint numbers in cp mount option nilfs2: remove own inode allocator and destructor for metadata files nilfs2: get rid of back pointer to writable sb instance nilfs2: get rid of mi_nilfs back pointer to nilfs object nilfs2: see state of root dentry for mount check of snapshots nilfs2: use iget for all metadata files nilfs2: get rid of GCDAT inode nilfs2: add routines to redirect access to buffers of DAT file nilfs2: add routines to roll back state of DAT file nilfs2: add routines to save and restore bmap state nilfs2: do not allocate nilfs_mdt_info structure to gc-inodes nilfs2: allow nilfs_clear_inode to clear metadata file inodes nilfs2: get rid of snapshot mount flag nilfs2: simplify life cycle management of nilfs object nilfs2: do not allocate multiple super block instances for a device ...	2010-10-23 01:26:47 -07:00
Jiro SEKIBA	abc0b50b6b	nilfs2: eliminate sparse warnings - "symbol not declared" change nilfs_dat_commit_free and nilfs_inode_cachep static to fix following warnings fs/nilfs2/super.c:72:19: warning: symbol 'nilfs_inode_cachep' was not declared. Should it be static? fs/nilfs2/dat.c:106:6: warning: symbol 'nilfs_dat_commit_free' was not declared. Should it be static? Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:40 +09:00
Ryusuke Konishi	026a7d63d5	nilfs2: get rid of bdi from nilfs object Nilfs now can use sb->s_bdi to get backing_dev_info, so we use it instead of ns_bdi on the nilfs object and remove ns_bdi. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:39 +09:00
Ryusuke Konishi	5beb6e0b20	nilfs2: add bdev freeze/thaw support Nilfs hasn't supported the freeze/thaw feature because it didn't work due to the peculiar design that multiple super block instances could be allocated for a device. This limitation was removed by the patch "nilfs2: do not allocate multiple super block instances for a device". So now this adds the freeze/thaw support to nilfs. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:39 +09:00
Ryusuke Konishi	c05dbfc260	nilfs2: accept 64-bit checkpoint numbers in cp mount option The current implementation doesn't mount snapshots with checkpoint numbers larger than INT_MAX since it uses match_int() for parsing "cp=" mount option. This uses simple_strtoull() for the conversion to resolve the issue. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:39 +09:00
Ryusuke Konishi	2879ed66e4	nilfs2: remove own inode allocator and destructor for metadata files This finally removes own inode allocator and destructor functions for metadata files. Several routines, nilfs_mdt_new(), nilfs_mdt_new_common(), nilfs_mdt_clear(), nilfs_mdt_destroy(), and nilfs_alloc_inode_common() will be gone. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:39 +09:00
Ryusuke Konishi	032dbb3b50	nilfs2: see state of root dentry for mount check of snapshots After applied the patch that unified sb instances, root dentry of snapshots can be left in dcache even after their trees are unmounted. The orphan root dentry/inode keeps a root object, and this causes false positive of nilfs_checkpoint_is_mounted function. This resolves the issue by having nilfs_checkpoint_is_mounted test whether the root dentry is busy or not. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:38 +09:00
Ryusuke Konishi	f1e89c86fd	nilfs2: use iget for all metadata files This makes use of iget5_locked to allocate or get inode for metadata files to stop using own inode allocator. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:38 +09:00
Ryusuke Konishi	348fe8da13	nilfs2: simplify life cycle management of nilfs object This stops pre-allocating nilfs object in nilfs_get_sb routine, and stops managing its life cycle by reference counting. nilfs_find_or_create_nilfs() function, nilfs->ns_mount_mutex, nilfs_objects list, and the reference counter will be removed through the simplification. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:36 +09:00
Ryusuke Konishi	f11459ad7d	nilfs2: do not allocate multiple super block instances for a device This stops allocating multiple super block instances for a device. All snapshots and a current mode mount (i.e. latest tree) will be controlled with nilfs_root objects that are kept within an sb instance. nilfs_get_sb() is rewritten so that it always has a root object for the latest tree and snapshots make additional root objects. The root dentry of the latest tree is binded to sb->s_root even if it isn't attached on a directory. Root dentries of snapshots or the latest tree are binded to mnt->mnt_root on which they are mounted. With this patch, nilfs_find_sbinfo() function, nilfs->ns_supers list, and nilfs->ns_current back pointer, are deleted. In addition, init_nilfs() and load_nilfs() are simplified since they will be called once for a device, not repeatedly called for mount points. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:36 +09:00
Ryusuke Konishi	ab4d8f7ebf	nilfs2: split out nilfs_attach_snapshot This splits the code to attach snapshots into a separate routine for convenience sake. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:36 +09:00
Ryusuke Konishi	367ea33486	nilfs2: split out nilfs_get_root_dentry This splits the code to allocate root dentry into a separate routine for convenience in successive changes. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:35 +09:00
Ryusuke Konishi	b7c0634204	nilfs2: move inode count and block count into root object This moves sbi->s_inodes_count and sbi->s_blocks_count into nilfs_root object. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:35 +09:00
Ryusuke Konishi	e912a5b668	nilfs2: use root object to get ifile This rewrites functions using ifile so that they get ifile from nilfs_root object, and will remove sbi->s_ifile. Some functions that don't know the root object are extended to receive it from caller. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:35 +09:00
Ryusuke Konishi	8e656fd518	nilfs2: make snapshots in checkpoint tree exportable The previous export operations cannot handle multiple versions of a filesystem if they belong to the same sb instance. This adds a new type of file handle and extends export operations so that they can get the inode specified by a checkpoint number as well as an inode number and a generation number. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:34 +09:00
Ryusuke Konishi	4d8d9293dc	nilfs2: set pointer to root object in inodes This puts a pointer to nilfs_root object in the private part of on-memory inode, and makes nilfs_iget function pick up the inode with the same root object. Non-root inodes inherit its nilfs_root object from parent inode. That of the root inode is allocated through nilfs_attach_checkpoint() function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:34 +09:00
Ryusuke Konishi	0e14a3595b	nilfs2: use iget5_locked to get inode This uses iget5_locked instead of iget_locked so that gc cache can look up inodes with an inode number and an optional checkpoint number. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:33 +09:00
Ryusuke Konishi	b91c9a97c9	nilfs2: allow nilfs_destroy_inode to destroy metadata file inodes The current nilfs_destroy_inode() doesn't handle metadata file inodes including gc inodes (dummy inodes used for garbage collection). This allows nilfs_destroy_inode() to destroy inodes of metadata files. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-10-23 09:24:33 +09:00
Linus Torvalds	a2887097f2	Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits) xen-blkfront: disable barrier/flush write support Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c block: remove BLKDEV_IFL_WAIT aic7xxx_old: removed unused 'req' variable block: remove the BH_Eopnotsupp flag block: remove the BLKDEV_IFL_BARRIER flag block: remove the WRITE_BARRIER flag swap: do not send discards as barriers fat: do not send discards as barriers ext4: do not send discards as barriers jbd2: replace barriers with explicit flush / FUA usage jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier jbd: replace barriers with explicit flush / FUA usage nilfs2: replace barriers with explicit flush / FUA usage reiserfs: replace barriers with explicit flush / FUA usage gfs2: replace barriers with explicit flush / FUA usage btrfs: replace barriers with explicit flush / FUA usage xfs: replace barriers with explicit flush / FUA usage block: pass gfp_mask and flags to sb_issue_discard dm: convey that all flushes are processed as empty ...	2010-10-22 17:07:18 -07:00
Jan Blunck	d6d4c19c5f	BKL: Remove BKL from NILFS2 The BKL is only used in put_super, fill_super and remount_fs that are all three protected by the superblocks s_umount rw_semaphore. Therefore it is safe to remove the BKL entirely. Signed-off-by: Jan Blunck <jblunck@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-04 21:10:41 +02:00
Jan Blunck	db71922217	BKL: Explicitly add BKL around get_sb/fill_super This patch is a preparation necessary to remove the BKL from do_new_mount(). It explicitly adds calls to lock_kernel()/unlock_kernel() around get_sb/fill_super operations for filesystems that still uses the BKL. I've read through all the code formerly covered by the BKL inside do_kern_mount() and have satisfied myself that it doesn't need the BKL any more. do_kern_mount() is already called without the BKL when mounting the rootfs and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called from various places without BKL: simple_pin_fs(), nfs_do_clone_mount() through nfs_follow_mountpoint(), afs_mntpt_do_automount() through afs_mntpt_follow_link(). Both later functions are actually the filesystems follow_link inode operation. vfs_kern_mount() is calling the specified get_sb function and lets the filesystem do its job by calling the given fill_super function. Therefore I think it is safe to push down the BKL from the VFS to the low-level filesystems get_sb/fill_super operation. [arnd: do not add the BKL to those file systems that already don't use it elsewhere] Signed-off-by: Jan Blunck <jblunck@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-04 21:10:10 +02:00
Christoph Hellwig	f8c131f5b6	nilfs2: replace barriers with explicit flush / FUA usage Switch to the WRITE_FLUSH_FUA flag for log writes, remove the EOPNOTSUPP detection for barriers and stop setting the barrier flag for discards. tj: nilfs is now fixed to wait for discard completion. Updated this patch accordingly and dropped warning about it. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-10 12:35:39 +02:00
Linus Torvalds	145c3ae46b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: fs: brlock vfsmount_lock fs: scale files_lock lglock: introduce special lglock and brlock spin locks tty: fix fu_list abuse fs: cleanup files_lock locking fs: remove extra lookup in __lookup_hash fs: fs_struct rwlock to spinlock apparmor: use task path helpers fs: dentry allocation consolidation fs: fix do_lookup false negative mbcache: Limit the maximum number of cache entries hostfs ->follow_link() braino hostfs: dumb (and usually harmless) tpyo - strncpy instead of strlcpy remove SWRITE* I/O types kill BH_Ordered flag vfs: update ctime when changing the file's permission by setfacl cramfs: only unlock new inodes fix reiserfs_evict_inode end_writeback second call	2010-08-18 09:35:08 -07:00
Christoph Hellwig	87e99511ea	kill BH_Ordered flag Instead of abusing a buffer_head flag just add a variant of sync_dirty_buffer which allows passing the exact type of write flag required. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-18 01:09:00 -04:00
Ryusuke Konishi	af4e36318e	nilfs2: fix list corruption after ifile creation failure If nilfs_attach_checkpoint() gets a memory allocation failure during creation of ifile, it will return without removing nilfs_sb_info struct from ns_supers list. When a concurrently mounted snapshot is unmounted or another new snapshot is mounted after that, this causes kernel oops as below: > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [<f83662ff>] nilfs_find_sbinfo+0x74/0xa4 [nilfs2] > *pde = 00000000 > Oops: 0000 [#1] SMP <snip> > Call Trace: > [<f835dc29>] ? nilfs_get_sb+0x165/0x532 [nilfs2] > [<c1173c87>] ? ida_get_new_above+0x16d/0x187 > [<c109a7f8>] ? alloc_vfsmnt+0x7e/0x10a > [<c1070790>] ? kstrdup+0x2c/0x40 > [<c1089041>] ? vfs_kern_mount+0x96/0x14e > [<c108913d>] ? do_kern_mount+0x32/0xbd > [<c109b331>] ? do_mount+0x642/0x6a1 > [<c101a415>] ? do_page_fault+0x0/0x2d1 > [<c1099c00>] ? copy_mount_options+0x80/0xe2 > [<c10705d8>] ? strndup_user+0x48/0x67 > [<c109b3f1>] ? sys_mount+0x61/0x90 > [<c10027cc>] ? sysenter_do_call+0x12/0x22 This fixes the problem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: stable@kernel.org	2010-08-16 11:08:36 +09:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Al Viro	6fd1e5c994	convert nilfs2 to ->evict_inode() [folded build fix from sfr] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:25 -04:00
Ryusuke Konishi	c5ca48aabe	nilfs2: reject incompatible filesystem This forces nilfs to check compatibility of feature flags so as to reject a filesystem with unknown features when it mounts or remounts the filesystem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:16 +09:00
Ryusuke Konishi	05d0e94b66	nilfs2: get rid of nilfs_bmap_union This removes nilfs_bmap_union and finally unifies three structures and the union in bmap/btree code into one. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:14 +09:00
Ryusuke Konishi	7c01745781	nilfs2: pass remount flag to parse_options This adds is_remount argument to the parse_options() function that obtains mount options from strings. Previously, parse_options did not distinguish context whether it's called for a new mount or remount, so the caller needed additional verifications outside the function. This allows parse_options to verify options and print messages depending on the context. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:12 +09:00
Ryusuke Konishi	c6b4d57ddf	nilfs2: use seq_puts to print mount options without argument This replaces seq_printf() with seq_puts() in nilfs_show_options for mount options which have no argument. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:12 +09:00
Ryusuke Konishi	802d317754	nilfs2: add nodiscard mount option Nilfs has "discard" mount option which issues discard/TRIM commands to underlying block device, but it lacks a complementary option and has no way to disable the feature through remount. This adds "nodiscard" option to resolve this imbalance. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:12 +09:00
Ryusuke Konishi	773bc4f3b6	nilfs2: add barrier mount option Nilfs enables write barriers by default and has "nobarrier" mount option to disable this feature. But it lacks the complementary option and has no way to re-enable the feature on remount. This adds "barrier" option to resolve this imbalance. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:12 +09:00
Jiro SEKIBA	b2ac86e1a8	nilfs2: sync super blocks in turns This will sync super blocks in turns instead of syncing duplicate super blocks at the time. This will help searching valid super root when super block is written into disk before log is written, which is happen when barrier-less block devices are unmounted uncleanly. In the situation, old super block likely points to valid log. This patch introduces ns_sbwcount member to the nilfs object and adds nilfs_sb_will_flip() function; ns_sbwcount counts how many times super blocks write back to the disk. And, nilfs_sb_will_flip() decides whether flipping required or not based on the count of ns_sbwcount to sync super blocks asymmetrically. The following functions are also changed: - nilfs_prepare_super(): flips super blocks according to the argument. The argument is calculated by nilfs_sb_will_flip() function. - nilfs_cleanup_super(): sets "clean" flag to both super blocks if they point to the same checkpoint. To update both of super block information, caller of nilfs_commit_super must set the information on both super blocks. Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:11 +09:00
Jiro SEKIBA	d26493b6f0	nilfs2: introduce nilfs_prepare_super This function checks validity of super block pointers. If first super block is invalid, it will swap the super blocks. The function should be called before any super block information updates. Caller must obtain nilfs->ns_sem. Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:10 +09:00
Ryusuke Konishi	60f46b7efc	nilfs2: separate function that updates log position This moves out section that updates information of the recent log position stored in super blocks from nilfs_commit_super to a new routine named nilfs_set_log_cursor. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:10 +09:00
Ryusuke Konishi	c8a11c8a14	nilfs2: add nilfs_set_error This function marks error state and write it on super blocks. This is a preparation for making super block writeback alternately. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:10 +09:00
Ryusuke Konishi	7ecaa46cfe	nilfs2: add nilfs_cleanup_super This function write out filesystem state to super blocks in order to share the same cleanup work. This is a preparation for making super block writeback alternately. Cc: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:10 +09:00
Ryusuke Konishi	bde4e696e4	nilfs2: do not update mount time on rw->ro remount Mount time field in super block is wrongly updated when nilfs remounts the partition from read-write to read-only. This fixes the issue. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2010-07-23 10:02:10 +09:00

1 2 3

108 Commits