linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-01 11:06:44 +07:00

Author	SHA1	Message	Date
Chao Yu	817202d937	f2fs: readahead multi pages of directory for performance We have no so such readahead mechanism in ->iterate() path as the one in ->read() path, it cause low performance when we read large directory. This patch add readahead in f2fs_readdir() for better performance. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:57 +09:00
Jaegeuk Kim	d928bfbfe7	f2fs: introduce fi->i_sem to protect fi's info This patch introduces fi->i_sem to protect fi's info that includes xattr_ver, pino, i_nlink. This enables to remove i_mutex during f2fs_sync_file, resulting in performance improvement when a number of fsync calls are triggered from many concurrent threads. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-20 22:10:11 +09:00
Jaegeuk Kim	3cb5ad152b	f2fs: call f2fs_wait_on_page_writeback instead of native function If a page is on writeback, f2fs can face with deadlock due to under writepages. This is caused by merging IOs inside f2fs, so if it comes to detect, let's throw merged IOs, which is implemented by f2fs_wait_on_page_writeback. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-20 22:10:04 +09:00
Jaegeuk Kim	20f70751c6	f2fs: fix wrong kernel coding style This patch includes a simple fix to adjust coding style. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-05 10:48:53 +09:00
Jaegeuk Kim	3843154598	f2fs: introduce large directory support This patch introduces an i_dir_level field to support large directory. Previously, f2fs maintains multi-level hash tables to find a dentry quickly from a bunch of chiild dentries in a directory, and the hash tables consist of the following tree structure as below. In Documentation/filesystems/f2fs.txt, ---------------------- A : bucket B : block N : MAX_DIR_HASH_DEPTH ---------------------- level #0 \| A(2B) \| level #1 \| A(2B) - A(2B) \| level #2 \| A(2B) - A(2B) - A(2B) - A(2B) . \| . . . . level #N/2 \| A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . \| . . . . level #N \| A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) But, if we can guess that a directory will handle a number of child files, we don't need to traverse the tree from level #0 to #N all the time. Since the lower level tables contain relatively small number of dentries, the miss ratio of the target dentry is likely to be high. In order to avoid that, we can configure the hash tables sparsely from level #0 like this. level #0 \| A(2B) - A(2B) - A(2B) - A(2B) level #1 \| A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . \| . . . . level #N/2 \| A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . \| . . . . level #N \| A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) With this structure, we can skip the ineffective tree searches in lower level hash tables. This patch adds just a facility for this by introducing i_dir_level in f2fs_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-27 19:56:09 +09:00
Jaegeuk Kim	5d0c667121	f2fs: remove costly bit operations for f2fs_find_entry It turns out that a bit operation like find_next_bit is not always fast enough for f2fs_find_entry. Instead, it is pretty much simple and fast to traverse each dentries. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-27 16:25:20 +09:00
Jaegeuk Kim	1fe54f9dd3	f2fs: clean up redundant function call This patch integrates inode_[inc\|dec]_dirty_dents with inc_page_count to remove redundant calls. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:53 +09:00
Jaegeuk Kim	bd859c6598	f2fs: fix to truncate dentry pages in the error case When a new directory is allocated, if an error is occurred, we should truncate preallocated dentry pages too. This bug was reported by Andrey Tsyvarev after a while as follows. mkdir()-> f2fs_add_link()-> init_inode_metadata()-> f2fs_init_acl()-> f2fs_get_acl()-> f2fs_getxattr()-> read_all_xattrs() fails. Also there was a BUG_ON triggered after the fault in mkdir()-> f2fs_add_link()-> init_inode_metadata()-> remove_inode_page() -> f2fs_bug_on(inode->i_blocks != 0 && inode->i_blocks != 1); But, previous patch wasn't perfect to resolve that bug, so the following bug report was also submitted. kernel BUG at fs/f2fs/inode.c:274! Call Trace: [<ffffffff811fde03>] evict+0xa3/0x1a0 [<ffffffff811fe615>] iput+0xf5/0x180 [<ffffffffa01c7f63>] f2fs_mkdir+0xf3/0x150 [f2fs] [<ffffffff811f2a77>] vfs_mkdir+0xb7/0x160 [<ffffffff811f36bf>] SyS_mkdir+0x5f/0xc0 [<ffffffff81680769>] system_call_fastpath+0x16/0x1b Finally, this patch resolves all the issues like below. If an error is occurred after make_empty_dir(), 1. truncate_inode_pages() The make_bad_inode() prior to iput() will change i_mode to S_IFREG, which means that f2fs will not decrement fi->dirty_dents during f2fs_evict_inode. But, by calling it here, we can do that. 2. truncate_blocks() Preallocated dentry pages are trucated here to sync i_blocks. 3. remove_dirty_dir_inode() Remove this directory inode from the list. Reported-and-Tested-by: Andrey Tsyvarev <tsyvarev@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:52 +09:00
Jaegeuk Kim	924a2ddbd0	f2fs: fix the potential mismatch between dir's i_size and i_blocks This is the erroneous scenario. i_size on-disk i_size i_blocks __f2fs_add_link() 4096 4096 2 get_new_data_page 8192 4096 3 -ENOSPC = init_inode_metadata checkpoint - 4096 3 POR and reboot __f2fs_add_link() 4096 4096 3 page = get_new_data_page (page->index = 1 by NEW_ADDR) add a dentry to the page successfully f2fs_rmdir() f2fs_empty_dir() 4096 4096 3 f2fs_unlink() goes, since there is no valid dentry due to i_size = 4096. But, still there is one dentry in page->index = 1. So this patch moves the code to write dir->i_size into on-disk i_size in order to sync dir's i_size, on-disk i_size, and its i_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:52 +09:00
Jaegeuk Kim	e8dae60458	f2fs: move a branch for code redability This patch moves a function in f2fs_delete_entry for code readability. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:41:07 +09:00
Jaegeuk Kim	a18ff06340	f2fs: call mark_inode_dirty to flush dirty pages If a dentry page is updated, we should call mark_inode_dirty to add the inode into the dirty list, so that its dentry pages are flushed to the disk. Otherwise, the inode can be evicted without flush. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:40:34 +09:00
Chris Fries	6c311ec6c2	f2fs: clean checkpatch warnings Fixed a variety of trivial checkpatch warnings. The only delta should be some minor formatting on log strings that were split / too long. Signed-off-by: Chris Fries <cfries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-20 10:27:12 +09:00
Jaegeuk Kim	a8865372a8	f2fs: handle errors correctly during f2fs_reserve_block The get_dnode_of_data nullifies inode and node page when error is occurred. There are two cases that passes inode page into get_dnode_of_data(). 1. make_empty_dir() -> get_new_data_page() -> f2fs_reserve_block(ipage) -> get_dnode_of_data() 2. f2fs_convert_inline_data() -> __f2fs_convert_inline_data() -> f2fs_reserve_block(ipage) -> get_dnode_of_data() This patch adds correct error handling codes when get_dnode_of_data() returns an error. At first, f2fs_reserve_block() calls f2fs_put_dnode() whenever reserve_new_block returns an error. So, the rule of f2fs_reserve_block() is to nullify inode page when there is any error internally. Finally, two callers of f2fs_reserve_block() should call f2fs_put_dnode() appropriately if they got an error since successful f2fs_reserve_block(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-06 16:42:21 +09:00
Jaegeuk Kim	58bfaf44df	f2fs: introduce F2FS_INODE macro to get f2fs_inode This patch introduces F2FS_INODE that returns struct f2fs_inode * from the inode page. By using this macro, we can remove unnecessary casting codes like below. struct f2fs_inode ri = &F2FS_NODE(inode_page)->i; -> struct f2fs_inode ri = F2FS_INODE(inode_page); Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-26 20:32:48 +09:00
Chao Yu	d96b143151	f2fs: check filename length in recover_dentry In current flow, we will get Null return value of f2fs_find_entry in recover_dentry when name.len is bigger than F2FS_NAME_LEN, and then we still add this inode into its dir entry. To avoid this situation, we must check filename length before we use it. Another point is that we could remove the code of checking filename length In f2fs_find_entry, because f2fs_lookup will be called previously to ensure of validity of filename length. V2: o add WARN_ON() as Jaegeuk Kim suggested. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-26 12:50:09 +09:00
Chao Yu	deead09009	f2fs: avoid to set wrong pino of inode when rename dir When we rename a dir to new name which is not exist previous, we will set pino of parent inode with ino of child inode in f2fs_set_link. It destroy consistency of pino, it should be fixed. Thanks for previous work of Shu Tan. Signed-off-by: Shu Tan <shu.tan@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:42:51 +09:00
Chao Yu	4f4124d0b9	f2fs: update several comments Update several comments: 1. use f2fs_{un}lock_op install of mutex_{un}lock_op. 2. update comment of get_data_block(). 3. update description of node offset. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:26:03 +09:00
Chao Yu	cfb271d485	f2fs: add unlikely() macro for compiler optimization As we know, some of our branch condition will rarely be true. So we could add 'unlikely' to let compiler optimize these code, by this way we could drop unneeded 'jump' assemble code to improve performance. change log: o add unlikely as many as possible across the whole source files at once suggested by Jaegeuk Kim. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:06 +09:00
Jaegeuk Kim	5d56b6718a	f2fs: add an option to avoid unnecessary BUG_ONs If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS in your kernel config. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-10-29 15:44:38 +09:00
Jaegeuk Kim	2ed2d5b33c	f2fs: fix a deadlock during init_acl procedure The deadlock is found through the following scenario. sys_mkdir() -> f2fs_add_link() -> __f2fs_add_link() -> init_inode_metadata() : lock_page(inode); -> f2fs_init_acl() -> f2fs_set_acl() -> f2fs_setxattr(..., NULL) : This NULL page incurs a deadlock at update_inode_page(). So, likewise f2fs_init_security(), this patch adds a parameter to transfer the locked inode page to f2fs_setxattr(). Found by Linux File System Verification project (linuxtesting.org). Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-10-28 13:39:09 +09:00
Jaegeuk Kim	cbd56e7d20	f2fs: fix handling orphan inodes This patch fixes mishandling of the sbi->n_orphans variable. If users request lots of f2fs_unlink(), check_orphan_space() could be contended. In such the case, sbi->n_orphans can be read incorrectly so that f2fs_unlink() would fall into the wrong state which results in the failure of add_orphan_inode(). So, let's increment sbi->n_orphans virtually prior to the actual orphan inode stuffs. After that, let's release sbi->n_orphans by calling release_orphan_inode or remove_orphan_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Jaegeuk Kim	1cd14cafc6	f2fs: update file name in the inode block during f2fs_rename The error is reproducible by: 0. mkfs.f2fs /dev/sdb1 & mount 1. touch test1 2. touch test2 3. mv test1 test2 4. umount 5. dumpt.f2fs -i 4 /dev/sdb1 After this, when we retrieve the inode->i_name of test2 by dump.f2fs, we get test1 instead of test2. This is because f2fs didn't update the file name during the f2fs_rename. So, this patch fixes that. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Gu Zheng	4559071063	f2fs: introduce help function F2FS_NODE() Introduce help function F2FS_NODE() to simplify the conversion of node_page to f2fs_node. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:02 +09:00
Jaegeuk Kim	99b072bb38	f2fs: fix readdir incorrectness In the previous Al Viro's readdir patch set, there occurs a bug when running xfstest: 006 as follows. [Error output] alpha size = 4, name length = 6, total files = 4096, nproc=1 1023 files created rm: cannot remove `/mnt/f2fs/permname.15150/a': Directory not empty [Correct output] alpha size = 4, name length = 6, total files = 4096, nproc=1 4097 files created This bug is due to the misupdate of directory position in ctx. So, this patch fixes this. [AV: fixed a braino] CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-08 13:35:48 +04:00
Linus Torvalds	3f490f7f99	This patch-set includes the following major enhancement patches. o remount_fs callback function o restore parent inode number to enhance the fsync performance o xattr security labels o reduce the number of redundant lock/unlock data pages o avoid frequent write_inode calls The other minor bug fixes are as follows. o endian conversion bugs o various bugs in the roll-forward recovery routine -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJR0h8YAAoJEEAUqH6CSFDSJNMP/1V+e/PaLa8YsHuw5eWPT3Fe RYFXJv7SXadHqgQInjwD7JOF8BC9DlUyknYUiXFUZgvKsgMHz+OTCNl+1hLTbUXt e6Rhn7vU/fMu3TEtj+v6g8JbpXdXH8TSbtvh9LpoczRS98GYpZuckP0BtQsVkTL3 jIq6WD8JRkb2DvpDl7viTT0Eq0T61CSmtOwOIvfhhiVxggvDWcnR1mTM9Tymdi4J NKtFkjCsKP0Z/7IZZUJAczGEHjsT+O6JDwp8+KVWuZT4BSuchoX4MYAY5wX527Ne rZvkolbbfnAFCC3BtETr0DPOkpxnHmJ6dEveIxjZ9cux12CAFA/Ww1QAL4ygiDkd Avn8BBEJnfnuzeOUkE0by+9hdF9LOU3CVSxiDhWJegYB16z+c9pSBD9xvlKhKk5B QNsjptB6m0CftAq7vIDsryL60uJ2cSHFxPqfwAHEpNngiU/NohTFSZE0sUMbLUNh FI6NrHoT7yld1HcB6cvL1lnUqIENFbNhDSSDcTdlN49IJJap4oqtgCnmMMIwbUCO vR2/26k5W7NwG+K6XN2IX13AsayzQahxTv8in/+LV0bfjHo6/1VzzGcqAmXJbDQw dLrNAeWaaIJi8J/zJiENMbFKXTj8Wax9jxKsW+4towQuyEt4ADvyt1c5gX3VR42T x8+YEargsdBf6FAhtN+H =qFcy -----END PGP SIGNATURE----- Merge tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches: - remount_fs callback function - restore parent inode number to enhance the fsync performance - xattr security labels - reduce the number of redundant lock/unlock data pages - avoid frequent write_inode calls The other minor bug fixes are as follows. - endian conversion bugs - various bugs in the roll-forward recovery routine" * tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (56 commits) f2fs: fix to recover i_size from roll-forward f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes() f2fs: remove reusing any prefree segments f2fs: code cleanup and simplify in func {find/add}_gc_inode f2fs: optimize the init_dirty_segmap function f2fs: fix an endian conversion bug detected by sparse f2fs: fix crc endian conversion f2fs: add remount_fs callback support f2fs: recover wrong pino after checkpoint during fsync f2fs: optimize do_write_data_page() f2fs: make locate_dirty_segment() as static f2fs: remove unnecessary parameter "offset" from __add_sum_entry() f2fs: avoid freqeunt write_inode calls f2fs: optimise the truncate_data_blocks_range() range f2fs: use the F2FS specific flags in f2fs_ioctl() f2fs: sync dir->i_size with its block allocation f2fs: fix i_blocks translation on various types of files f2fs: set sb->s_fs_info before calling parse_options() f2fs: support xattr security labels f2fs: fix iget/iput of dir during recovery ...	2013-07-02 09:42:38 -07:00
Al Viro	6f7f231e7b	[readdir] convert f2fs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:56:46 +04:00
Jaegeuk Kim	354a3399dc	f2fs: recover wrong pino after checkpoint during fsync If a file is linked, f2fs loose its parent inode number so that fsync calls for the linked file should do checkpoint all the time. But, if we can recover its parent inode number after the checkpoint, we can adjust roll-forward mechanism for the further fsync calls, which is able to improve the fsync performance significatly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-14 09:04:45 +09:00
Jaegeuk Kim	699489bbbe	f2fs: sync dir->i_size with its block allocation If new dentry block is allocated and its i_size is updated, we should update its inode block together in order to sync i_size and its block allocation. Otherwise, we can loose additional dentry block due to the unconsistent i_size. Errorneous Scenario ------------------- In the recovery routine, - recovery_dentry \| - __f2fs_add_link \| \| - get_new_data_page \| \| \| - i_size_write(new_i_size) \| \| \| - mark_inode_dirty_sync(dir) \| \| - update_parent_metadata \| \| \| - mark_inode_dirty(dir) \| - write_checkpoint - sync_dirty_dir_inodes - filemap_flush(dentry_blocks) - f2fs_write_data_page - skip to write the last dentry block due to index < i_size In the above flow, new_i_size is not updated to its inode block so that the last dentry block will be lost accordingly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-12 08:18:35 +09:00
Jaegeuk Kim	8ae8f1627f	f2fs: support xattr security labels This patch adds the support of security labels for f2fs, which will be used by Linus Security Models (LSMs). Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules: "Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. The framework is licensed under the terms of the GNU General Public License and is standard part of the Linux kernel since Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted modules in the official kernel.". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-11 16:01:03 +09:00
Jaegeuk Kim	83d5d6f66b	f2fs: cover cp_file information with ilock If a file is linked with other files, it should be checkpointed at every fsync calls. For this, we use set_cp_file() with FADVISE_CP_BIT, but previously we didn't cover the flag by the global lock. This patch fixes that the inode page stores this correctly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:06 +09:00
Namjae Jeon	4777f86b7c	f2fs: remove unneeded initializations in f2fs_parent_dir There is no need to initialize few pointers in f2fs_parent_dir as the values are not checked and instead directly initialized values are used. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:05 +09:00
Jaegeuk Kim	44a83ff6a8	f2fs: update inode page after creation I found a bug when testing power-off-recovery as follows. [Bug Scenario] 1. create a file 2. fsync the file 3. reboot w/o any sync 4. try to recover the file - found its fsync mark - found its dentry mark : try to recover its dentry - get its file name - get its parent inode number : here we got zero value The reason why we get the wrong parent inode number is that we didn't synchronize the inode page with its newly created inode information perfectly. Especially, previous f2fs stores fi->i_pino and writes it to the cached node page in a wrong order, which incurs the zero-valued i_pino during the recovery. So, this patch modifies the creation flow to fix the synchronization order of inode page with its inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:02 +09:00
Jaegeuk Kim	64aa7ed98d	f2fs: change get_new_data_page to pass a locked node page This patch is for passing a locked node page to get_dnode_of_data. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:01 +09:00
Linus Torvalds	942d33da99	f2fs updates for v3.10 This patch-set includes the following major enhancement patches. o introduce a new gloabl lock scheme o add tracepoints on several major functions o fix the overall cleaning process focused on victim selection o apply the block plugging to merge IOs as much as possible o enhance management of free nids and its list o enhance the readahead mode for node pages o address several cretical deadlock conditions o reduce lock_page calls The other minor bug fixes and enhancements are as follows. o calculation mistakes: overflow o bio types: READ, READA, and READ_SYNC o fix the recovery flow, data races, and null pointer errors -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRijCLAAoJEEAUqH6CSFDSg9kQAIqxmQzCUvCN3HcyVe8bGhKz 8xhKrAY6ySRCKMuBbFRQsNrXUhckE3A44DgzYm5/gQikr/c8zhbqPVrtZ968eCKb wm3J+Re/uwZr5eOXlJEaHIiSkMDtERN7Cu2oYJWZi2B9wCSZcgvoWQ3c3LUVk6yF GFdi1Y00ll5tFKbEGbXSsfdul9P8jp0MmuMnWBBQZF3TrjETXMdThA5FXN0yTf9s XkcGE9vTCCPk8p7P3YmGGw6CwlaL8oallm0//iL4nMNpJzveq2C09IlY2BNrxU3L iTNXeIBdbhwXpnh2zq26Cy+cIEDIp0oXYui5BYdr/LWyWU3T/INa+hjUUszsESxF 51LIUA1rA9nX/BSmj2QomswZ3lt4u5jl6rSBFKv3NG1KsFrAdb8S4tHukRSTSxAJ gzpY6kLT1+bgciA16F5W4yhzMYPN5hPa8s6hx4LHlpoqQICQsurjtS9KW7vncLFt ttmCMn8ehHcTzKRNNqYaBerCtSB3Z3G/uAy1y+DB7Zx2h2mqhCBXRalyRvs7RKvK d5OyYCpHntxuzDwVuivnr9Ddp30LUP1WqexxK+ykn99Ji3leMmffHP8Oari8w96b RxSbjoo8hOgoS5xZ4v3AaqtLDlBpxC6oWJzDaq/fJeKxOx22Z5BDFUM9mBGxrouJ AATl8b+cW/aTZ4l7WOPU =Hqii -----END PGP SIGNATURE----- Merge tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches. - introduce a new gloabl lock scheme - add tracepoints on several major functions - fix the overall cleaning process focused on victim selection - apply the block plugging to merge IOs as much as possible - enhance management of free nids and its list - enhance the readahead mode for node pages - address several cretical deadlock conditions - reduce lock_page calls The other minor bug fixes and enhancements are as follows. - calculation mistakes: overflow - bio types: READ, READA, and READ_SYNC - fix the recovery flow, data races, and null pointer errors" * tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits) f2fs: cover free_nid management with spin_lock f2fs: optimize scan_nat_page() f2fs: code cleanup for scan_nat_page() and build_free_nids() f2fs: bugfix for alloc_nid_failed() f2fs: recover when journal contains deleted files f2fs: continue to mount after failing recovery f2fs: avoid deadlock during evict after f2fs_gc f2fs: modify the number of issued pages to merge IOs f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry. f2fs: fix inconsistent using of NM_WOUT_THRESHOLD f2fs: check truncation of mapping after lock_page f2fs: enhance alloc_nid and build_free_nids flows f2fs: add a tracepoint on f2fs_new_inode f2fs: check nid == 0 in add_free_nid f2fs: add REQ_META about metadata requests for submit f2fs: give a chance to merge IOs by IO scheduler f2fs: avoid frequent background GC f2fs: add tracepoints to debug checkpoint request f2fs: add tracepoints for write page operations f2fs: add tracepoints to debug the block allocation ...	2013-05-08 15:11:48 -07:00
Jaegeuk Kim	c718379b6b	f2fs: give a chance to merge IOs by IO scheduler Previously, background GC submits many 4KB read requests to load victim blocks and/or its (i)node blocks. ... f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb61, blkaddr = 0x3b964ed f2fs_gc : block_rq_complete: 8,16 R () 499854968 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb6f, blkaddr = 0x3b964ee f2fs_gc : block_rq_complete: 8,16 R () 499854976 + 8 [0] f2fs_gc : f2fs_readpage: ino = 1, page_index = 0xb79, blkaddr = 0x3b964ef f2fs_gc : block_rq_complete: 8,16 R () 499854984 + 8 [0] ... However, by the fact that many IOs are sequential, we can give a chance to merge the IOs by IO scheduler. In order to do that, let's use blk_plug. ... f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c6, blkaddr = 0x2e6ee f2fs_gc : f2fs_iget: ino = 143 f2fs_gc : f2fs_readpage: ino = 143, page_index = 0x1c7, blkaddr = 0x2e6ef <idle> : block_rq_complete: 8,16 R () 1519616 + 8 [0] <idle> : block_rq_complete: 8,16 R () 1519848 + 8 [0] <idle> : block_rq_complete: 8,16 R () `1520432` + 96 [0] <idle> : block_rq_complete: 8,16 R () 1520536 + 104 [0] <idle> : block_rq_complete: 8,16 R () 1521008 + 112 [0] <idle> : block_rq_complete: 8,16 R () 1521440 + 152 [0] <idle> : block_rq_complete: 8,16 R () 1521688 + 144 [0] <idle> : block_rq_complete: 8,16 R () 1522128 + 192 [0] <idle> : block_rq_complete: 8,16 R () 1523256 + 328 [0] ... Note that this issue should be addressed in checkpoint, and some readahead flows too. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-26 10:35:10 +09:00
Al Viro	0ecc833bac	mode_t, whack-a-mole at 11... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-04-09 14:13:05 -04:00
Jaegeuk Kim	399368372e	f2fs: introduce a new global lock scheme In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations / DENTRY_OPS, / for directory operations / DATA_WRITE, / for data write / DATA_NEW, / for data allocation / DATA_TRUNC, / for data truncate / NODE_NEW, / for node allocation / NODE_TRUNC, / for node truncate / NODE_WRITE, / for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possbile. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an avaiable lock from the array. - returns the index of the gottern lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-09 18:21:18 +09:00
Jaegeuk Kim	5a20d339c7	f2fs: align f2fs maximum name length to linux based filesystem The maximum filename length supported in linux is 255 characters. So let's follow that. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-18 21:00:35 +09:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Jaegeuk Kim	90b2fc64f0	Merge branch 'f2fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into dev Pull f2fs cleanup patches from Al Viro: f2fs: get rid of fake on-stack dentries f2fs: switch init_inode_metadata() to passing parent and name separately f2fs: switch new_inode_page() from dentry to qstr f2fs: init_dent_inode() should take qstr Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Conflicts: fs/f2fs/recovery.c	2013-02-12 07:17:20 +09:00
Al Viro	b7f7a5e0be	f2fs: get rid of fake on-stack dentries those should never be used for a lot of reasons... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-08 02:55:04 -05:00
Al Viro	69f24eac55	f2fs: switch init_inode_metadata() to passing parent and name separately ... sure, it's tempting to just pass dentry. Except that we don't _have_ anything resembling a real dentry on one of the paths to it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-08 02:55:04 -05:00
Al Viro	c004363dd6	f2fs: switch new_inode_page() from dentry to qstr Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-08 02:55:03 -05:00
Al Viro	53dc9a6776	f2fs: init_dent_inode() should take qstr for one thing, it doesn't (and shouldn't) use anything else from dentry; for another, on some call chains the dentry is fake and should be eliminated completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-08 02:55:03 -05:00
Namjae Jeon	163799872b	f2fs: avoid redundant time update for parent directory in f2fs_delete_entry In call to f2fs_delete_entry, 'dir' time modification code is put at two places. So, remove the redundant code for timing update. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-01-14 09:43:27 +09:00
Leon Romanovsky	9836b8b949	f2fs: unify string length declarations and usage This patch is intended to unify string length declarations and usage. There are number of calls to strlen which return size_t object. The size of this object depends on compiler if it will be bigger, equal or even smaller than an unsigned int Signed-off-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-28 11:27:53 +09:00
Jaegeuk Kim	398b1ac5a5	f2fs: fix handling errors got by f2fs_write_inode Ruslan reported that f2fs hangs with an infinite loop in f2fs_sync_file(): while (sync_node_pages(sbi, inode->i_ino, &wbc) == 0) f2fs_write_inode(inode, NULL); The reason was revealed that the cold flag is not set even thought this inode is a normal file. Therefore, sync_node_pages() skips to write node blocks since it only writes cold node blocks. The cold flag is stored to the node_footer in node block, and whenever a new node page is allocated, it is set according to its file type, file or directory. But, after sudden-power-off, when recovering the inode page, f2fs doesn't recover its cold flag. So, let's assign the cold flag in more right places. One more thing: If f2fs_write_inode() returns an error due to whatever situations, there would be no dirty node pages so that sync_node_pages() returns zero. (i.e., zero means nothing was written.) Reported-by: Ruslan N. Marchenko <me@ruff.mobi> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-26 10:39:52 +09:00
Namjae Jeon	38e0abdcfb	f2fs: fix up f2fs_get_parent issue to retrieve correct parent inode number Test Case: [NFS Client] ls -lR . [NFS Server] while [ 1 ] do echo 3 > /proc/sys/vm/drop_caches done Error on NFS Client: "No such file or directory" When cache is dropped at the server, it results in lookup failure at the NFS client due to non-connection with the parent. The default path is it initiates a lookup by calculating the hash value for the name, even though the hash values stored on the disk for "." and ".." is maintained as zero, which results in failure from find_in_block due to not matching HASH values. Fix up, by using the correct hashing values for these entries. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-26 10:39:52 +09:00
Jaegeuk Kim	6666e6aa9f	f2fs: fix tracking parent inode number Previously, f2fs didn't track the parent inode number correctly which is stored in each f2fs_inode. In the case of the following scenario, a bug can be occured. Let's suppose there are one directory, "/b", and two files, "/a" and "/b/a". - pino of "/a" is ROOT_INO. - pino of "/b/a" is DIR_B_INO. Then, # sync : The inode pages of "/a" and "/b/a" contain the parent inode numbers as ROOT_INO and DIR_B_INO respectively. # mv /a /b/a : The parent inode number of "/a" should be changed to DIR_B_INO, but f2fs didn't do that. Ref. f2fs_set_link(). In order to fix this clearly, I added i_pino in f2fs_inode_info, and whenever it needs to be changed like in f2fs_add_link() and f2fs_set_link(), it is updated temporarily in f2fs_inode_info. And later, f2fs_write_inode() stores the latest information to the inode pages. For power-off-recovery, f2fs_sync_file() triggers simply f2fs_write_inode(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:45 +09:00

1 2

56 Commits