When a hole spans across page boundaries, the next write forces
a read of the block. This could end up reading existing garbage
data from the disk in ocfs2_map_page_blocks. This leads to
non-zero holes. In order to avoid this, mark the writes as new
when the holes span across page boundaries.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Signed-off-by: jlbec <jlbec@evilplan.org>
When CONFIG_DEBUG_FS=y and CONFIG_OCFS2_FS_STATS=n, we get the
following warning:
fs/ocfs2/cluster/tcp.c:213:16: warning: ‘o2net_get_func_run_time’
defined but not used
Since o2net_get_func_run_time is only called from
o2net_update_recv_stats, so move it under CONFIG_OCFS2_FS_STATS.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: jlbec <jlbec@evilplan.org>
In dlm_query_region_handler(), move the kmalloc outside the spinlock.
This allows us to use GFP_KERNEL instead of GFP_ATOMIC.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
Remove mlog(0,...) and mlog(ML_UPTODATE,...) from
fs/ocfs2/uptodate.c and fs/ocfs2/buffer_head_io.c.
The masklog UPTODATE is removed finally.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Remove mlog(0,...) and mlog(ML_BH_IO,...) from
fs/ocfs2/buffer_head_io.c.
The masklog BH_IO is removed finally.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
ocfs2_iget is used to get/create inode. Only iget5_locked
will give us an inode = NULL. So move this check ahead of
ocfs2_read_locked_inode so that we don't need to check
inode before we read and unlock inode. This is also helpful
for trace event(see the next patch).
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
As there are no such debug information in fs/ocfs2/ioctl.c,
fs/ocfs2/locks.c and fs/ocfs2/sysfile.c, ML_INODE are also
removed.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Change all the "mlog(0," in fs/ocfs2/refcounttree.c to trace events.
And finally remove masklog ML_REFCOUNT.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Since all 4 files, localalloc.c, suballoc.c, alloc.c and
resize.c, which use DISK_ALLOC are changed to trace events,
Remove masklog DISK_ALLOC totally.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
This is the 2nd step to remove the debug info of DISK_ALLOC.
So this patch removes all mlog(0,...) from localalloc.c and adds
the corresponding tracepoints. Different mlogs have different
solutions.
1. Some are replaced with trace event directly.
2. Some are replaced while some new parameters are added.
3. Some are combined into one trace events.
4. Some redundant mlogs are removed.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
This is the first try of replacing debug mlog(0,...) to
trace events. Wengang has did some work in his original
patch
http://oss.oracle.com/pipermail/ocfs2-devel/2009-November/005513.html
But he didn't finished it.
So this patch removes all mlog(0,...) from alloc.c and adds
the corresponding trace events. Different mlogs have different
solutions.
1. Some are replaced with trace event directly.
2. Some are replaced and some new parameters are added since
I think we need to know the btree owner in that case.
3. Some are combined into one trace events.
4. Some redundant mlogs are removed.
What's more, it defines some event classes so that we can use
them later.
Cc: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
About one year ago, Wengang Wang tried some first steps
to add tracepoints to ocfs2. Hiss original patch is here:
http://oss.oracle.com/pipermail/ocfs2-devel/2009-November/005512.html
But as Steven Rostedt indicated in his article
http://lwn.net/Articles/383362/, we'd better have our trace
files resides in fs/ocfs2, so I rewrited the patch using the
method Steven mentioned in that article.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.
This patch just try to remove it or change it. So:
1. if all the error paths already use mlog_errno, it is just removed.
Otherwise, it will be replaced by mlog_errno.
2. if it is used to print some return value, it is replaced with
mlog(0,...).
mlog_exit_ptr is changed to mlog(0.
All those mlog(0,...) will be replaced with trace events later.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.
So for mlog_entry_void, we just remove it.
for mlog_entry(...), we replace it with mlog(0,...), and they
will be replace by trace event later.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
tidy the trailing symlinks traversal up
Turn resolution of trailing symlinks iterative everywhere
simplify link_path_walk() tail
Make trailing symlink resolution in path_lookupat() iterative
update nd->inode in __do_follow_link() instead of after do_follow_link()
pull handling of one pathname component into a helper
fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
readlinkat(), fchownat() and fstatat() with empty relative pathnames
Allow O_PATH for symlinks
New kind of open files - "location only".
ext4: Copy fs UUID to superblock
ext3: Copy fs UUID to superblock.
vfs: Export file system uuid via /proc/<pid>/mountinfo
unistd.h: Add new syscalls numbers to asm-generic
x86: Add new syscalls for x86_64
x86: Add new syscalls for x86_32
fs: Remove i_nlink check from file system link callback
fs: Don't allow to create hardlink for deleted file
vfs: Add open by file handle support
...
The new vfs locking scheme introduced in 2.6.38 breaks NFS sillyrename
because the latter relies on being able to determine the parent
directory of the dentry in the ->iput() callback in order to send the
appropriate unlink rpc call.
Looking at the code that cares about races with dput(), there doesn't
seem to be anything that specifically uses d_parent as a test for
whether or not there is a race:
- __d_lookup_rcu(), __d_lookup() all test for d_hashed() after d_parent
- shrink_dcache_for_umount() is safe since nothing else can rearrange
the dentries in that super block.
- have_submount(), select_parent() and d_genocide() can test for a
deletion if we set the DCACHE_DISCONNECTED flag when the dentry
is removed from the parent's d_subdirs list.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org (2.6.38, needs commit c826cb7dfc "dcache.c:
create helper function for duplicated functionality" )
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This creates a helper function for he "try to ascend into the parent
directory" case, which was written out in triplicate before. With all
the locking and subtle sequence number stuff, we really don't want to
duplicate that kind of code.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pull the handling of current->total_link_count into
__do_follow_link()
* put the common "do ->put_link() if needed and path_put() the link"
stuff into a helper (put_link(nd, link, cookie))
* rename __do_follow_link() to follow_link(), while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The last remaining place (resolution of nested symlink) converted
to the loop of the same kind we have in path_lookupat() and
path_openat().
Note that we still *do* have a recursion in pathname resolution;
can't avoid it, really. However, it's strictly for nested symlinks
now - i.e. ones in the middle of a pathname.
link_path_walk() has lost the tail now - it always walks everything
except the last component.
do_follow_link() renamed to nested_symlink() and moved down.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that link_path_walk() is called without LOOKUP_PARENT
only from do_follow_link(), we can simplify the checks in
last component handling. First of all, checking if we'd
arrived to a directory is not needed - the caller will check
it anyway. And LOOKUP_FOLLOW is guaranteed to be there,
since we only get to that place with nd->depth > 0.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new helper: walk_component(). Handles everything except symlinks;
returns negative on error, 0 on success and 1 on symlinks we decided
to follow. Drops out of RCU mode on such symlinks.
link_path_walk() and do_last() switched to using that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We don't want to allow creation of private hardlinks by different application
using the fd passed to them via SCM_RIGHTS. So limit the null relative name
usage in linkat syscall to CAP_DAC_READ_SEARCH
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Just need to make sure that AF_UNIX garbage collector won't
confuse O_PATHed socket on filesystem for real AF_UNIX opened
socket.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For readlinkat() we simply allow empty pathname; it will fail unless
we have dfd equal to O_PATH-opened symlink, so we are outside of
POSIX scope here. For fchownat() and fstatat() we allow AT_EMPTY_PATH;
let the caller explicitly ask for such behaviour.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
At that point we can't do almost nothing with them. They can be opened
with O_PATH, we can manipulate such descriptors with dup(), etc. and
we can see them in /proc/*/{fd,fdinfo}/*.
We can't (and won't be able to) follow /proc/*/fd/* symlinks for those;
there's simply not enough information for pathname resolution to go on
from such point - to resolve a symlink we need to know which directory
does it live in.
We will be able to do useful things with them after the next commit, though -
readlinkat() and fchownat() will be possible to use with dfd being an
O_PATH-opened symlink and empty relative pathname. Combined with
open_by_handle() it'll give us a way to do realink-by-handle and
lchown-by-handle without messing with more redundant syscalls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>