Commit Graph

1702 Commits

Author SHA1 Message Date
Abhijith Das
1990e91765 [GFS2] Quotas non-functional - fix another bug
This patch fixes a bug where gfs2 was writing update quota usage
information to the wrong location in the quota file.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:23:01 +01:00
Abhijith Das
2a87ab0806 [GFS2] Quotas non-functional - fix bug
This patch fixes an error in the quota code where a 'struct
gfs2_quota_lvb*' was being passed to gfs2_adjust_quota() instead of a
'struct gfs2_quota_data*'. Also moved 'struct gfs2_quota_lvb' from
fs/gfs2/incore.h to include/linux/gfs2_ondisk.h as per Steve's suggestion.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:26 +01:00
Steven Whitehouse
dbb7cae2a3 [GFS2] Clean up inode number handling
This patch cleans up the inode number handling code. The main difference
is that instead of looking up the inodes using a struct gfs2_inum_host
we now use just the no_addr member of this structure. The tests relating
to no_formal_ino can then be done by the calling code. This has
advantages in that we want to do different things in different code
paths if the no_formal_ino doesn't match. In the NFS patch we want to
return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
no_formal_ino doesn't match and thus we can withdraw in this case.

In order to later fix bz #201012, we need to be able to look up an inode
without knowing no_formal_ino, as the only information that is known to
us is the on-disk location of the inode in question.

This patch will also help us to fix bz #236099 at a later date by
cleaning up a lot of the code in that area.

There are no user visible changes as a result of this patch and there
are no changes to the on-disk format either.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:24 +01:00
Steven Whitehouse
41d7db0ab4 [GFS2] Reduce size of struct gdlm_lock
This patch removes the completion (which is rather large) from struct
gdlm_lock in favour of using the wait_on_bit() functions. We don't need
to add any extra fields to the structure to do this, so we save 32 bytes
(on x86_64) per structure. This adds up to quite a lot when we may
potentially have millions of these lock structures,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: David Teigland <teigland@redhat.com>
2007-07-09 08:22:21 +01:00
Robert Peterson
cd81a4bac6 [GFS2] Addendum patch 2 for gfs2_grow
This addendum patch 2 corrects three things:

1. It fixes a stupid mistake in the previous addendum that broke gfs2.
   Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html
2. It fixes a problem that Dave Teigland pointed out regarding the
   external declarations in ops_address.h being in the wrong place.
3. It recasts a couple more %llu printks to (unsigned long long)
   as requested by Steve Whitehouse.

I would have loved to put this all in one revised patch, but there was
a rush to get some patches for RHEL5.	Therefore, the previous patches
were applied to the git tree "as is" and therefore, I'm posting another
addendum.  Sorry.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:19 +01:00
Nate Diller
0507ecf50f [GFS2] use zero_user_page
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller <nate.diller@gmail.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-07-09 08:22:17 +01:00
Robert Peterson
6c53267f05 [GFS2] Kernel changes to support new gfs2_grow command (part 2)
To avoid code redundancy, I separated out the operational "guts" into
a new function called read_rindex_entry.  Then I made two functions:
the closer-to-original gfs2_ri_update (without the special condition
checks) and gfs2_ri_update_special that's designed with that condition
in mind.  (I don't like the name, but if you have a suggestion, I'm
all ears).

Oh, and there's an added benefit:  we don't need all the ugly gotos
anymore.  ;)

This patch has been tested with gfs2_fsck_hellfire (which runs for
three and a half hours, btw).

Signed-off-By: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:14 +01:00
Robert Peterson
7ae8fa8451 [GFS2] kernel changes to support new gfs2_grow command
This is another revision of my gfs2 kernel patch that allows
gfs2_grow to function properly.

Steve Whitehouse expressed some concerns about the previous
patch and I restructured it based on his comments.
The previous patch was doing the statfs_change at file close time,
under its own transaction.  The current patch does the statfs_change
inside the gfs2_commit_write function, which keeps it under the
umbrella of the inode transaction.

I can't call ri_update to re-read the rindex file during the
transaction because the transaction may have outstanding unwritten
buffers attached to the rgrps that would be otherwise blown away.
So instead, I created a new function, gfs2_ri_total, that will
re-read the rindex file just to total the file system space
for the sake of the statfs_change.  The ri_update will happen
later, when gfs2 realizes the version number has changed, as it
happened before my patch.

Since the statfs_change is happening at write_commit time and there
may be multiple writes to the rindex file for one grow operation.
So one consequence of this restructuring is that instead of getting
one kernel message to indicate the change, you may see several.
For example, before when you did a gfs2_grow, you'd get a single
message like:

GFS2: File system extended by 247876 blocks (968MB)

Now you get something like:

GFS2: File system extended by 207896 blocks (812MB)
GFS2: File system extended by 39980 blocks (156MB)

This version has also been successfully run against the hours-long
"gfs2_fsck_hellfire" test that does several gfs2_grow and gfs2_fsck
while interjecting file system damage.  It does this repeatedly
under a variety Resource Group conditions.

Signed-off-By: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:12 +01:00
Benjamin Marzinski
b524fe646c [GFS2] flush the glock completely in inode_go_sync
Fix for bz #231910
When filemap_fdatawrite() is called on the inode mapping in data=ordered mode,
it will add the glock to the log. In inode_go_sync(), if you do the
gfs2_log_flush() before this, after the filemap_fdatawrite() call, the glock
and its associated data buffers will be on the log again. This means you can
demote a lock from exclusive, without having it flushed from the log. The
attached patch simply moves the gfs2_log_flush up to after the
filemap_fdatawrite() call.

Originally, I tried moving the gfs2_log_flush to after gfs2_meta_sync(), but
that caused me to trip the following assert.

GFS2: fsid=cypher-36:test.0: fatal: assertion "!buffer_busy(bh)" failed
GFS2: fsid=cypher-36:test.0:   function = gfs2_ail_empty_gl, file = fs/gfs2/glops.c, line = 61

It appears that gfs2_log_flush() puts some of the glocks buffers in the busy
state and the filemap_fdatawrite() call is necessary to flush them. This makes
me worry slightly that a related problem could happen because of moving the
gfs2_log_flush() after the initial filemap_fdatawrite(), but I assume that
gfs2_ail_empty_gl() would catch that case as well.

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:07 +01:00
Alexey Dobriyan
e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Christoph Lameter
a35afb830f Remove SLAB_CTOR_CONSTRUCTOR
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@ucw.cz>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-17 05:23:04 -07:00
Randy Dunlap
e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Guillaume Chazarain
3e9f45bd18 Factor outstanding I/O error handling
Cleanup: setting an outstanding error on a mapping was open coded too many
times.  Factor it out in mapping_set_error().

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
Linus Torvalds
2d56d3c43c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
  gfs2: nfs lock support for gfs2
  lockd: add code to handle deferred lock requests
  lockd: always preallocate block in nlmsvc_lock()
  lockd: handle test_lock deferrals
  lockd: pass cookie in nlmsvc_testlock
  lockd: handle fl_grant callbacks
  lockd: save lock state on deferral
  locks: add fl_grant callback for asynchronous lock return
  nfsd4: Convert NFSv4 to new lock interface
  locks: add lock cancel command
  locks: allow {vfs,posix}_lock_file to return conflicting lock
  locks: factor out generic/filesystem switch from setlock code
  locks: factor out generic/filesystem switch from test_lock
  locks: give posix_test_lock same interface as ->lock
  locks: make ->lock release private data before returning in GETLK case
  locks: create posix-to-flock helper functions
  locks: trivial removal of unnecessary parentheses
2007-05-07 12:34:24 -07:00
Linus Torvalds
5cefcab3db Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits)
  [GFS2] Uncomment sprintf_symbol calling code
  [DLM] lowcomms style
  [GFS2] printk warning fixes
  [GFS2] Patch to fix mmap of stuffed files
  [GFS2] use lib/parser for parsing mount options
  [DLM] Lowcomms nodeid range & initialisation fixes
  [DLM] Fix dlm_lowcoms_stop hang
  [DLM] fix mode munging
  [GFS2] lockdump improvements
  [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks
  [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump)
  [DLM] fs/dlm/ast.c should #include "ast.h"
  [DLM] Consolidate transport protocols
  [DLM] Remove redundant assignment
  [GFS2] Fix bz 234168 (ignoring rgrp flags)
  [DLM] change lkid format
  [DLM] interface for purge (2/2)
  [DLM] add orphan purging code (1/2)
  [DLM] split create_message function
  [GFS2] Set drop_count to 0 (off) by default
  ...
2007-05-07 12:26:27 -07:00
Christoph Lameter
50953fe9e0 slab allocators: Remove SLAB_DEBUG_INITIAL flag
I have never seen a use of SLAB_DEBUG_INITIAL.  It is only supported by
SLAB.

I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again?  The callback is
performed before each freeing of an object.

I would think that it is much easier to check the object state manually
before the free.  That also places the check near the code object
manipulation of the object.

Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on.  If there would be code in a constructor
handling SLAB_DEBUG_INITIAL then it would have to be conditional on
SLAB_DEBUG otherwise it would just be dead code.  But there is no such code
in the kernel.  I think SLUB_DEBUG_INITIAL is too problematic to make real
use of, difficult to understand and there are easier ways to accomplish the
same effect (i.e.  add debug code before kfree).

There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
clear in fs inode caches.  Remove the pointless checks (they would even be
pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.

This is the last slab flag that SLUB did not support.  Remove the check for
unimplemented flags from SLUB.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:57 -07:00
Marc Eshel
586759f03e gfs2: nfs lock support for gfs2
Add NFS lock support to GFS2.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-06 20:38:50 -04:00
Marc Eshel
9d6a8c5c21 locks: give posix_test_lock same interface as ->lock
posix_test_lock() and ->lock() do the same job but have gratuitously
different interfaces.  Modify posix_test_lock() so the two agree,
simplifying some code in the process.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06 17:39:00 -04:00
Greg Kroah-Hartman
823bccfc40 remove "struct subsystem" as it is no longer needed
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes.  The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.

Thanks to Kay for fixing the bugs in this patch.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 18:57:59 -07:00
Steven Whitehouse
37fde8ca6c [GFS2] Uncomment sprintf_symbol calling code
Now that the patch from -mm has gone upstream, we can uncomment the code
in GFS2 which uses sprintf_symbol.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Robert Peterson <rpeterso@redhat.com>
2007-05-01 09:51:39 +01:00
akpm@linux-foundation.org
f391a4ead6 [GFS2] printk warning fixes
alpha:

fs/gfs2/dir.c: In function 'gfs2_dir_read_leaf':
fs/gfs2/dir.c:1322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t'
fs/gfs2/dir.c: In function 'gfs2_dir_read':
fs/gfs2/dir.c:1455: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type '__u64'

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:48 +01:00
Steven Whitehouse
bf126aee6d [GFS2] Patch to fix mmap of stuffed files
If a stuffed file is mmaped and a page fault is generated at some offset
above the initial page, we need to create a zero page to hang the buffer
heads off before we can unstuff the file. This is a fix for bz #236087

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:46 +01:00
Josef Bacik
476c006be0 [GFS2] use lib/parser for parsing mount options
This patch converts the mount option parsing to use the kernels lib/parser stuff
like all of the other filesystems.  I tested this and it works well.  Thank you,

Signed-off-by: Josef Bacik <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:43 +01:00
Robert Peterson
5f8820960c [GFS2] lockdump improvements
The patch below consists of the following changes (in code order):

1. I fixed a minor compiler warning regarding the printing of
   a kernel symbol address.
2. I implemented a suggestion from Dave Teigland that moves
   the debugfs information for gfs2 into a subdirectory so
   we can easily expand our use of debugfs in the future.
   The current code keeps the glock information in:
   /debug/gfs2/<fs>
   With the patch, the new code keeps the glock information in:
   /debug/gfs2/<fs>/glock
   That will allow us to create more debugfs files in the future.
3. This fixes a bug whereby a failed mount attempt causes the
   debugfs file to not be deleted.  Failed mount attempts should
   always clean up after themselves, including deleting the
   debugfs file and/or directory.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:33 +01:00
Steven Whitehouse
bdd19a22f8 [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks
This patch detects when the number of entries in a leaf block or inode
block (in the case of stuffed directories) is corrupt and informs the
user. It prevents us from running off the end of the array thats been
allocated for the sorting in this case,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:30 +01:00
Robert Peterson
7a0079d9e3 [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump)
This is for Bugzilla Bug 236008: Kernel gpf doing cat /debugfs/gfs2/xxx
(lock dump) seen at the "gfs2 summit".  This also fixes the bug that caused
garbage to be printed by the "initialized at" field.  I apologize for the
kludge, but that code will all be ripped out anyway when the official
sprint_symbol function becomes available in the Linux kernel.  I also
changed some formatting so that spaces are replaced by proper tabs.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:28 +01:00
Steven Whitehouse
a43a49066d [GFS2] Fix bz 234168 (ignoring rgrp flags)
Ths following patch makes GFS2 use the rgrp flags properly. Although
there are also separate flags for both data and metadata as well, I've
not implemented these as there seems little use for them. On the
otherhand, the "noalloc" flag is generally useful for future changes we
might which to make, so this ensures that we interpret it correctly.

In addition I fixed the comment above the function which was incorrect.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:17 +01:00
Steven Whitehouse
f01963f264 [GFS2] Set drop_count to 0 (off) by default
This sets the drop_count to 0 by default which is a better default
for most people.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:05 +01:00
David Teigland
b9af8a788a [GFS2] use log_error before LM_OUT_ERROR
We always want to see the details of the error returned to gfs, but
log_debug is often turned off, so use log_error (printk).

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:02 +01:00
Robert Peterson
04b933f27b [GFS2] Red Hat bz 228540: owner references
In Testing the previously posted and accepted patch for
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540
I uncovered some gfs2 badness.  It turns out that the current
gfs2 code saves off a process pointer when glocks is taken
in both the glock and glock holder structures.  Those
structures will persist in memory long after the process has
ended; pointers to poisoned memory.

This problem isn't caused by the 228540 fix; the new capability
introduced by the fix just uncovered the problem.

I wrote this patch that avoids saving process pointers
and instead saves off the process pid.  Rather than
referencing the bad pointers, it now does process lookups.
There is special code that makes the output nicer for
printing holder information for processes that have ended.

This patch also adds a stub for the new "sprint_symbol"
function that exists in Andrew Morton's -mm patch set, but
won't go into the base kernel until 2.6.22, since it adds
functionality but doesn't fix a bug.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:55 +01:00
Benjamin Marzinski
172e045a7f [GFS2] flush the log if a transaction can't allocate space
This is a fix for bz #208514. When GFS2 frees up space, the freed blocks
aren't available for reuse until the resource group is successfully written
to the ondisk journal. So in rare cases, GFS2 operations will fail, saying
that the filesystem is out of space, when in reality, you are just waiting for
a log flush. For instance, on a 1Gig filesystem, if I continually write 10 Mb
to a file, and then truncate it, after a hundred interations, the write will
fail with -ENOSPC, even though the filesystem is just 1% full.

The attached patch calls a log flush in these cases.  I tested this patch
fairly heavily to check if there were any locking issues that I missed, and
it seems to work just fine. Also, this patch only does the log flush if
get_local_rgrp makes a complete loop of resource groups without skipping
any do to locking issues. The code would be slightly simpler if it just always
did the log flush after the first failed pass, and you could only ever have
to go through the loop twice, instead of up to three times. However, I guessed
that failing to find a rg simply do to locking issues would be common enough
to skip the log flush in that case, but I'm not certain that this is the right
way to go. Either way, I don't suppose this code will be hit all that often.

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:52 +01:00
Benjamin Marzinski
6883562588 [GFS2] Fix log entry list corruption
When glock_lo_add and rg_lo_add attempt to add an element to the log, they
check to see if has already been added before locking the log. If another
process adds that element to the log in this window between the check and
locking the log, the element will be added to the list twice. This causes
the log element list to become corrupted in such a way that the log element
can never be successfully removed from the list. This patch pulls the
list_empty() check inside the log lock, to remove this window.

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:50 +01:00
Steven Whitehouse
f35ac346bc [GFS2] Speed up lock_dlm's locking (move sprintf)
The following patch speeds up lock_dlm's locking by moving the sprintf
out from the lock acquisition path and into the lock creation path. This
reduces the amount of CPU time used in acquiring locks by a fair amount.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: David Teigland <teigland@redhat.com>
2007-05-01 09:10:47 +01:00
Steven Whitehouse
420d2a1028 [GFS2] Fix a bug on i386 due to evaluation order
Since gcc didn't evaluate the last two terms of the expression in
glock.c:1881 as a constant expression, it resulted in an error on
i386 due to the lack of a 64bit divide instruction. This adds some
brackets to fix the problem.

This was reported by Andrew Morton.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
2007-05-01 09:10:42 +01:00
Steven Whitehouse
3b8249f617 [GFS2] Fix bz 224480 and cleanup glock demotion code
This patch prevents the printing of a warning message in cases where
the fs is functioning normally by handing off responsibility for
unlinked, but still open inodes, to another node for eventual deallocation.
Also, there is now an improved system for ensuring that such requests
to other nodes do not get lost. The callback on the iopen lock is
only ever called when i_nlink == 0 and when a node is unable to deallocate
it due to it still being in use on another node. When a node receives
the callback therefore, it knows that i_nlink must be zero, so we mark
it as such (in gfs2_drop_inode) in order that it will then attempt
deallocation of the inode itself.

As an additional benefit, queuing a demote request no longer requires
a memory allocation. This simplifies the code for dealing with gfs2_holders
as it removes one special case.

There are two new fields in struct gfs2_glock. gl_demote_state is the
state which the remote node has requested and gl_demote_time is the
time when the request came in. Both fields are only valid when the
GLF_DEMOTE flag is set in gl_flags.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:39 +01:00
Josef Whiter
1de9139092 [GFS2] Fix bz 231380, unlock page before dequeing glocks in gfs2_commit_write
If we are writing a file, and in the middle of writing the file
another node attempts to get a shared lock on that file (by doing a du for
example) the process doing the writing will hang waiting on lock_page.  The
reason for this is because when we have waiters on a exclusive glock, we will go
through and flush out all dirty pages associated with that inode and release the
lock.  The problem is that when we flush the dirty pages, we could hit a page
that we have locked durring the generic_file_buffered_write part of this
operation.  This patch unlocks the page before we go to dequeue the lock and
locks it immediatly afterwards, since generic_file_buffered_write needs the page
locked when the commit_write is completed.  This patch resolves the problem,
however if somebody sees a better way to do this please don't hesistate to yell.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:37 +01:00
Josef Whiter
5c7342d894 [GFS2] fix bz 231369, gfs2 will oops if you specify an invalid mount option
If you specify an invalid mount option when trying to mount a gfs2 filesystem,
gfs2 will oops.  The attached patch resolves this problem.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:32 +01:00
Robert Peterson
7c52b166c5 [GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540)
The attached patch resolves bz 228540.  This adds the capability
for gfs2 to dump gfs2 locks through the debugfs file system.
This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
gfs2 because all the ioctls were stripped out.  Please see the bugzilla
for more history about the fix.  This patch is also attached to the bugzilla
record.

The patch is against Steve Whitehouse's latest nmw git tree kernel
(2.6.21-rc1) and has been tested on system trin-10.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:29 +01:00
Steven Whitehouse
c3f49bc209 [GFS2] Fix bz 229873, alternate test: assertion "!ip->i_inode.i_mapping->nrpages" failed
The following removes an incorrect assertion from the GFS2 glops code. This
fixes Red Hat bz 229873. Thanks to Abhijith Das for testing the patch
and confirming the fix.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>
2007-03-07 14:03:53 -05:00
akpm@linux-foundation.org
95d97b7dd7 [GFS2] build fix
fs/gfs2/glock.c:2198: error: 'THIS_MODULE' undeclared here (not in a function)

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-03-07 14:03:25 -05:00
Steven Whitehouse
631c42e170 [GFS2] go_drop_bh is never used, so remove it
The ->go_drop_bh function is never used, so this removes it and the single
caller,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:02:53 -05:00
Steven Whitehouse
04b159b132 [GFS2] Remove unused variable
Remove an unused variable.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:02:30 -05:00
Steven Whitehouse
1be3867955 [GFS2] Fix bz 229831, lookup returns wrong inode
The following patch fixes Red Hat bz 229831. Without this patch its
possible for the wrong inode to be returned in certain cases. It is a
pretty unusual event, so that its taken some time to track down. Thanks
and due to Josef Whiter who did a lot of the testing required to thrack
this down and fix it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:01:53 -05:00
Steven Whitehouse
cad5b93927 [GFS2] Fix bz 230143, incorrect flushing of rgrps
The below patch fixes a problem where we were not flushing rgrps
correctly. It only occurred in the specific case that a callback was
received for an rgrp which was dirty and when a journal log flush had
not already resulted in the rgrp being flushed anyway. This fixes Red
Hat bz 230143,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:00:14 -05:00
Wendy Cheng
fb0d3bce8e [GFS2] pass formal ino in do_filldir_main
ok, the following is the minimum changes to get NFSD going before we
settle down this issue .. would appreciate this in the tree so other NFS
related works can get done in parallel.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:58:45 -05:00
Josef Whiter
a13cbe3753 [GFS2] fix hangup when multiple processes are trying to write to the same file
This fixes a problem I encountered while running bonnie++.  When you have one
thread that opens a file and starts to write to it, and then another thread that
tries to open and write to the same file, the second thread will loop forever
trying to grab the inode lock for that inode.  Basically we come in through
generic_buffered_file_write, which calls gfs2_prepare_write, which then attempts
to grab the glock.  Because we don't own the lock, gfs2_prepare_write gets
GLR_TRYFAILED, which returns AOP_TRUNCATED_PAGE to generic_buffered_file_write.
At this point generic_buffered_file_write loops around again and immediately
retries the prepare_write.  This means that the second process never gets off of
the processor in order to allow the process that holds the lock to finish its
work and let go of the lock.  This patch makes gfs2_glock_nq schedule() if it
gets back a GLR_TRYFAILED, which resolves this problem.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:58:02 -05:00
Wendy Cheng
a7d2b2bdc9 [GFS2] NFS filehandle check
File handle checking error found in '07 NFS connectathon. The fh_type
and fh_len are not necessarily identical. Some of the client machines
could fail mount with stale filehandle without this patch.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:57:34 -05:00
Richard Fearn
d5a6751b32 [GFS2] add newline to printk message
Patch for the 2.6.20 stable tree that adds a missing newline to one of
the printk messages in fs/gfs2/ops_fstype.c.

Signed-off-by: Richard Fearn <richardfearn@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:57:10 -05:00
Josef Whiter
2e95b6653b [GFS2] fix locking mistake
This patch fixes a locking mistake in the quota code, we do a mutex_lock instead
of a mutex_unlock.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:56:41 -05:00
Tim Schmielau
cd354f1ae7 [PATCH] remove many unneeded #includes of sched.h
After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there.  Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.

To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.

Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm.  I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:54 -08:00
Josef 'Jeff' Sipek
ee9b6d61a2 [PATCH] Mark struct super_operations const
This patch is inspired by Arjan's "Patch series to mark struct
file_operations and struct inode_operations const".

Compile tested with gcc & sparse.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:47 -08:00
Arjan van de Ven
92e1d5be91 [PATCH] mark struct inode_operations const 2
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Arjan van de Ven
00977a59b9 [PATCH] mark struct file_operations const 6
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:45 -08:00
Robert P. J. Day
c376222960 [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:27 -08:00
Adrian Bunk
a2cf822274 [GFS2] make gfs2_writepages() static
On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc6-mm2:
>...
>  git-gfs2-nmw.patch
>...
>  git trees
>...

This patch makes the needlessly global gfs2_writepages() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-07 10:48:48 -05:00
Steven Whitehouse
2d72e7101c [GFS2] Unlock page on prepare_write try lock failure
When the try lock of the glock failed in prepare_write we were
incorrectly exiting this function with the page still locked.
This was resulting in further I/O to this page hanging.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-07 10:25:59 -05:00
Wendy Cheng
549ae0ac3d [GFS2] nfsd readdirplus assertion failure
Glock assertion failure found in '07 NFS connectathon. One of the NFSDs
is doing a "readdirplus" procedure call. It passes the logic into
gfs2_readdir() where it obtains its directory inode glock. This is then
followed by filehandle construction that invokes lookup code. It hits
the assertion failure while trying to obtain the inode glock again
inside gfs2_drevalidate().

This patch bypasses the recursive glock call if caller already holds the
lock.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-06 11:36:01 -05:00
Randy Dunlap
9beeb9f3c5 [DLM/GFS2] indent help text
Indent help text as expected.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:20 -05:00
Russell Cattelan
ddee76089c [GFS2] Fix unlink deadlocks
Move the glock acquisition to outside of the transactions.

Lock odering must be preserved in order to prevent ABBA
deadlocks. The current gfs2_change_nlink code would tries
to grab the glock after having started a transaction and thus is holding
the log lock. This is inconsistent with other code paths in
gfs that grab the resource group glock prior to staring
a tranactions.

One problem with this fix is that the resource group
lock is always grabbed now even if the inode still has
ref count and can not be marked for unlink.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:17 -05:00
Steven Whitehouse
61be084efc [GFS2] Put back semaphore to avoid umount problem
Dave Teigland fixed this bug a while back, but I managed to mistakenly
remove the semaphore during later development. It is required to avoid
the list of inodes changing during an invalidate_inodes call. I have
made it an rwsem since the read side will be taken frequently during
normal filesystem operation. The write site will only happen during
umount of the file system.

Also the bug only triggers when using the DLM lock manager and only then
under certain conditions as its timing related.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2007-02-05 13:38:14 -05:00
Eric Sandeen
bbb28ab759 [GFS2] more CURRENT_TIME_SEC
Whoops, quilt user error, missed this one in the previous patch.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:11 -05:00
Adrian Bunk
0011727785 [GFS2/DLM] fix GFS2 circular dependency
On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote:
> Andrew Morton napsal(a):
> >Temporarily at
> >
> >	http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
>
> Unable to select IPV6. Menuconfig doesn't offer it when INET is selected.
> When it's not it appears in the menu, but after state change it gets away.
> The same behaviour in xconfig, gconfig.
>
> $ mkdir ../a/tst
> $ make O=../a/tst menuconfig
>   HOSTCC  scripts/basic/fixdep
> [...]
>   HOSTLD  scripts/kconfig/mconf
> scripts/kconfig/mconf arch/i386/Kconfig
> Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS
> OCFS2_FS INET
>
> Maybe this is the problem?

Yes, patch below.

> regards,

cu
Adrian

<--  snip  -->

This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM
and DLM depend on instead of select SYSFS.

Since SYSFS depends on EMBEDDED this change shouldn't cause any problems
for users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:08 -05:00
Randy Dunlap
67f55897ee [GFS2/DLM] use sysfs
With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build
fails with:

WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use
kernel_subsys, they should either DEPEND on it or SELECT it.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:05 -05:00
David Teigland
ee32e4f3d3 [GFS2] make lock_dlm drop_count tunable in sysfs
We want to be able to change or disable the default drop_count (number at
which the dlm asks gfs to limit the the number of locks it's holding).
Add it to the collection of sysfs tunables for an fs.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:01 -05:00
David Teigland
2f708649ba [GFS2] increase default lock limit
Increase the number of locks at which point the dlm begins asking gfs to
reduce its lock usage.  The default value is largely arbitrary, but the
current value of 50,000 ends up limiting performance unnecessarily for too
many users.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:59 -05:00
Steven Whitehouse
8bd9572769 [GFS2] Fix list corruption in lops.c
The patch below appears to fix the list corruption that we are seeing on
occasion. Although the transaction structure is private to a single
thread, when the queued structures are dismantled during an in-core
commit, its possible for a different thread to be trying to add the same
structure to another, new, transaction at the same time.

To avoid this, this patch takes the log spinlock during this operation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:56 -05:00
Steven Whitehouse
d7c103d0bd [GFS2] Fix recursive locking attempt with NFS
In certain cases, its possible for NFS to call the lookup code while
holding the glock (when doing a readdirplus operation) so we need to
check for that and not try and lock the glock twice. This also fixes a
typo in a previous NFS related GFS2 patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:53 -05:00
Steven Whitehouse
d043e1900c [GFS2] Fix typo in glock.c
This is a one letter typo fix in glock.c, spotted by Rob Kenna.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:41 -05:00
Eric Sandeen
ddfe062783 [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2
I was looking something else up and came across this...

I don't honestly have a good reason to change it other than to make it
like every other Linux filesystem in this regard.  ;-)  It doesn't
functionally change anything, but makes some lines shorter. :)

I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but
not in timespec_t format, and only stores second resolutions?  Seems like
you're halfway to sub-second resolutions already.

I suppose if that gets implemented then all of the below should
instead be CURRENT_TIME not CURRENT_TIME_SEC.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:38 -05:00
Steven Whitehouse
90101c3186 [GFS2] Compile fix for glock.c
This one liner got missed from the previous patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:35 -05:00
Steven Whitehouse
12132933c4 [GFS2] Remove queue_empty() function
This function is not longer required since we do not do recursive
locking in the glock layer. As a result all its callers can be
replaceed with list_empty() calls.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:32 -05:00
Steven Whitehouse
b5d32bead1 [GFS2] Tidy up glops calls
This patch doesn't make any changes to the ordering of the various
operations related to glocking, but it does tidy up the calls to the
glops.c functions to make the structure more obvious.

The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be
made static within glock.c since they are called by every set of glock
operations. The xmote_th and drop_th glock operations are then made
conditional upon those two routines existing and called from the
previously mentioned functions in glock.c respectively.

Also it can be seen that the go_sync operation isn't needed since it can
easily be replaced by calls to xmote_bh and drop_bh respectively. This
results in no longer (confusingly) calling back into routines in glock.c
from glops.c and also reducing the glock operations by one member.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:26 -05:00
Steven Whitehouse
1c0f4872dc [GFS2] Remove local exclusive glock mode
Here is a patch for GFS2 to remove the local exclusive flag. In
the places it was used, mutex's are always held earlier in the
call path, so it appears redundant in the LM_ST_SHARED case.

Also, the GFS2 holders were setting local exclusive in any case where
the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock
code where the flag was tested have been replaced with tests for the
lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the
same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well
as globally exclusive).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:20 -05:00
Steven Whitehouse
6bd9c8c2fb [GFS2] Remove unused go_callback operation
This is never used, so we might as well remove it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:17 -05:00
Steven Whitehouse
e5dab552c8 [GFS2] Remove the "greedy" function from glock.[ch]
The "greedy" code was an attempt to retain glocks for a minimum length
of time when they relate to mmap()ed files. The current implementation
of this feature is not, however, ideal in that it required allocating
memory in order to do this and its overly complicated.

It also misses the mark by ignoring the other I/O operations which are
just as likely to suffer from the same problem. So the plan is to remove
this now and then add the functionality back as part of the glock state
machine at a later date (and thus take into account all the possible
users of this feature)

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:14 -05:00
Steven Whitehouse
fee852e374 [GFS2] Shrink gfs2_inode memory by half
Here is something I spotted (while looking for something entirely
different) the other day.

Rather than using a completion in each and every struct gfs2_holder,
this removes it in favour of hashed wait queues, thus saving a
considerable amount of memory both on the stack (where a number of
gfs2_holder structures are allocated) and in particular in the
gfs2_inode which has 8 gfs2_holder structures embedded within it.

As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to
1912 bytes, a saving of 576 bytes per inode (no thats not a typo!).
In actual practice we get a much better result than that since
now that a gfs2_inode is under the 2048 byte barrier, we get two
per 4k slab page effectively halving the amount of memory required
to store gfs2_inodes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:11 -05:00
Steven Whitehouse
330005c2b2 [GFS2] Remove max_atomic_write tunable
This removes an unused sysfs tunable parameter.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:08 -05:00
Steven Whitehouse
3699e3a44b [GFS2] Clean up/speed up readdir
This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurts performance badly in the
case that the getdents64/readdir call isn't being followed by
stat() and it wasn't even getting it right all the time when it
was.

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 to 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and this the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:04 -05:00
Steven Whitehouse
a8d638e30e [GFS2] Add writepages for "data=writeback" mounts
It occurred to me that although a gfs2 specific writepages for ordered
writes and journaled data would be tricky, by hooking writepages only
for "data=writeback" mounts we could take advantage of not needing
buffer heads (we don't use them on the read side, nor have we for some
time) and create much larger I/Os for the block layer.

Using blktrace both before and after, its possible to see that for large
I/Os, most of the requests generated through writepages are now 1024
sectors after this patch is applied as opposed to 8 sectors before.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:01 -05:00
Adrian Bunk
03dc6a538e [GFS2] make gfs2_change_nlink_i() static
On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-rc3-mm1:
>...
>  git-gfs2-nmw.patch
>...
>  git trees
>...

This patch makes the needlessly globlal gfs2_change_nlink_i() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:49 -05:00
Robert Peterson
7083146564 [GFS2] gfs2 knows of directories which it chooses not to display
This is for Red Hat bugzilla bug bz #222302:

Moving a virtual IP from node to node between two NFS-over-GFS2
servers was causing one of the GFS2 servers to become confused and
reference a deleted inode.  The problem was due to vfs dentries that did
not reference the gfs2_dops and therefore didn't call the gfs2 revalidate
code to revalidate a dentry after a directory had been deleted & recreated.
This patch is a crosswrite from a RHEL4 bug found in GFS1 as
bz #190756 and it is against the latest -nmw git tree.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:46 -05:00
S. Wendy Cheng
87d21e07f3 [GFS2] Fix gfs2_rename deadlock
Second round of gfs2_rename lock re-ordering to allow Anaconda adding
root partition on top of gfs2. Previous to this patch the recursive
lock detector in glock.c can be triggered due to attempting to lock
the rgrp twice. This fixes it by checking to see whether the rgrp
is already locked.

This fixes Red Hat bugzilla #221237

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:31 -05:00
Russell Cattelan
6c93fd1e57 [GFS2] BZ 217008 fsfuzzer fix.
Update the quilt header comments to match the
code changes.

Change gfs2_lookup_simple to return an error in the case
of a NULL inode.
The callers of gfs2_lookup_simple do not check for NULL
in the no entry case and such would end up dereferencing a NULL ptr.

This fixes:
http://projects.info-pull.com/mokb/MOKB-15-11-2006.html

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:28 -05:00
Steven Whitehouse
49686f7106 [GFS2] Fix ordering of page disposal vs. glock_dq
In case of unlinked files with dirty pages GFS2 wasn't clearing
the pages in quite the right order. This patch clears the pages
earlier (before the qlock_dq) to avoid the situation that the
release of the glock results in attempting to write back data that
has already been deallocated.

This fixes Red Hat bugzilla: #220117

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:24 -05:00
S. Wendy Cheng
5509826f1e [GFS2] Fix change nlink deadlock
Bugzilla 215088

Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2
partition. The gfs2_rename() apparently needs block allocation for the
new name (into the directory) where it requires rg locks. At the same
time, while updating the nlink count for the replaced file,
gfs2_change_nlink() tries to return the inode meta-data back to resource
group where it needs rg locks too. Our logic doesn't allow process to
acquire these locks recursively by the same process  (RHEL installer)
that results a BUG call. This only happens within rename code path and
only if the destination file exists before the rename operation.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:15 -05:00
Steven Whitehouse
e1d5b18ae9 [GFS2] Fail over to readpage for stuffed files
This is partially derrived from a patch written by Russell Cattelan.
It fixes a bug where there is a race between readpages and truncate
by ignoring readpages for stuffed files. This is ok because a stuffed
file will never be more than one block (minus sizeof(struct gfs2_dinode))
in size and block size is always less than page size, so we do not lose
anything efficiency-wise by not doing readahead for stuffed files. They
will have already been "read ahead" by the action of reading the inode
in, in the first place.

This is the remaining part of the fix for Red Hat bugzilla #218966
which had not yet made it upstream.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Russell Cattelan <cattelan@redhat.com>
2007-02-05 13:36:12 -05:00
Steven Whitehouse
c7b3383437 [GFS2] Fix DIO deadlock
This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs
due to trying to take the i_mutex while holding a glock. The correct
locking order is defined as i_mutex -> glock in all cases.

I've left dealing with allocating writes. I know that we need to do
that, but for now this should do the trick. We don't need to take the
i_mutex on write, because the VFS has already taken it for us. On read
we don't need it since the glock is enough protection. The reason that
I've made some of the checks into a separate function is that we'll need
to do the checks again in the allocating write case eventually, so this
is partly in preparation for this. Likewise the return value test of !=
1 might look a bit odd and thats because we'll need a third return value
in case of requiring an allocation.

I've made the change to deferred mode on the glock to ensure flushing
read caches on other nodes. I notice that (using blktrace to look at
whats going on) we appear to do a better job of large I/Os than ext3
after this patch (in terms of not splitting up the I/Os).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>
2007-02-05 13:36:09 -05:00
David Teigland
c378051177 [GFS2] don't try to lockfs after shutdown
If an fs has already been shut down, a lockfs callback should do nothing.
An fs that's been shut down can't acquire locks or do anything with
respect to the cluster.

Also, remove FIXME comment in withdraw function.  The missing bits of the
withdraw procedure are now all done by user space.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:35:44 -05:00
David Chinner
f73ca1b76c [PATCH] Revert bd_mount_mutex back to a semaphore
Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.

(XFS unlocks the semaphore from a different task, by design.  The mutex
code warns about this)

Signed-off-by: Dave Chinner <dgc@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:21 -08:00
Steven Whitehouse
1003f06953 [GFS2] Fix Kconfig
Here is a patch to fix up the Kconfig so that we don't land up with
problems when people disable the NET subsystem.  Thanks for all the hints and
suggestions that people have sent me regarding this.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Aleksandr Koltsoff <czr@iki.fi>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Chris Zubrzycki <chris@middle--earth.org>
Cc: Patrick Caulfield <pcaulfie@redhat.com>
2006-12-15 12:51:51 -05:00
Josef Sipek
81454098f7 [PATCH] struct path: convert gfs2
Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:28:45 -08:00
Linus Torvalds
1c1afa3c05 Merge master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits)
  [DLM] Clean up lowcomms
  [GFS2] Change gfs2_fsync() to use write_inode_now()
  [GFS2] Fix indent in recovery.c
  [GFS2] Don't flush everything on fdatasync
  [GFS2] Add a comment about reading the super block
  [GFS2] Mount problem with the GFS2 code
  [GFS2] Remove gfs2_check_acl()
  [DLM] fix format warnings in rcom.c and recoverd.c
  [GFS2] lock function parameter
  [DLM] don't accept replies to old recovery messages
  [DLM] fix size of STATUS_REPLY message
  [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning
  [DLM] fix add_requestqueue checking nodes list
  [GFS2] Fix recursive locking in gfs2_getattr
  [GFS2] Fix recursive locking in gfs2_permission
  [GFS2] Reduce number of arguments to meta_io.c:getbuf()
  [GFS2] Move gfs2_meta_syncfs() into log.c
  [GFS2] Fix journal flush problem
  [GFS2] mark_inode_dirty after write to stuffed file
  [GFS2] Fix glock ordering on inode creation
  ...
2006-12-07 09:13:20 -08:00
Christoph Lameter
e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Steven Whitehouse
34126f9f41 [GFS2] Change gfs2_fsync() to use write_inode_now()
This is a bit better than the previous version of gfs2_fsync()
although it would be better still if we were able to call a
function which only wrote the inode & metadata. Its no big deal
though that this will potentially write the data as well since
the VFS has already done that before calling gfs2_fsync(). I've
also added a comment to explain whats going on here.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
2006-12-07 09:13:14 -05:00
Steven Whitehouse
887bc5d00c [GFS2] Fix indent in recovery.c
As per comments from Andrew Morton and Jan Engelhardt, this fixes the
indent and removes the "static" from a variable declaration since its
not needed in this case (now allocated on the stack of the function
in question).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Andrew Morton <akpm@osdl.org>
2006-12-05 13:34:17 -05:00
David Howells
9db7372445 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/ata/libata-scsi.c
	include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 17:01:28 +00:00
Al Viro
bd01f843c3 [PATCH] severing skbuff.h -> poll.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-12-04 02:00:31 -05:00
Steven Whitehouse
33c3de3287 [GFS2] Don't flush everything on fdatasync
The gfs2_fsync() function was doing a journal flush on each
and every call. While this is correct, its also a lot of
overhead. This patch means that on fdatasync flushes we
rely on the VFS to flush the data for us and we don't do
a journal flush unless we really need to.

We have to do a journal flush for stuffed files though because
they have the data and the inode metadata in the same block.
Journaled files also need a journal flush too of course.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:37:44 -05:00
Steven Whitehouse
aac1a3c77a [GFS2] Add a comment about reading the super block
The comment explains why we use the bio functions to read
the super block.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Srinivasa Ds <srinivasa@in.ibm.com>
2006-11-30 10:37:40 -05:00
Srinivasa Ds
0da3585e1e [GFS2] Mount problem with the GFS2 code
While mounting the gfs2 filesystem,our test team had a problem and we
got this error message.
=======================================================

GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1"
GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS...
GFS2: not a GFS2 filesystem
GFS2: fsid=dasde1.0: can't read superblock: -22

==========================================================================
On debugging further we found that problem is while reading the super
block(gfs2_read_super) and comparing the magic number in it.
When I  replace the submit_bio() call(present in gfs2_read_super) with
the sb_getblk() and ll_rw_block(), mount operation succeded.
On further analysis we found that before calling submit_bio(),
bio->bi_sector was set to "sector" variable. This "sector" variable has
the same value of bh->b_blocknr(block number). Hence there is a need to
multiply this valuwith (blocksize >> 9)(9 because,sector size
2^9,samething happens in ll_rw_block also, before calling submit_bio()).
So I have developed the patch which solves this problem. Please let me
know your comments.
================================================================

Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:37:36 -05:00
Steven Whitehouse
77386e1f66 [GFS2] Remove gfs2_check_acl()
As pointed out by Adrian Bunk, the gfs2_check_acl() function is no
longer used. This patch removes it and renamed gfs2_check_acl_locked()
to gfs2_check_acl() since we only need one variant of that function now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Adrian Bunk <bunk@stusta.de>
2006-11-30 10:37:32 -05:00
Randy Dunlap
0ac230699a [GFS2] lock function parameter
Fix function parameter typing:
fs/gfs2/glock.c💯 warning: function declaration isn't a prototype

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:37:18 -05:00
Ryusuke Konishi
aed3255f22 [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning
Fix a printk format warning in fs/gfs2/log.c:
fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t'

Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:37:04 -05:00
Steven Whitehouse
dcf3dd852f [GFS2] Fix recursive locking in gfs2_getattr
The readdirplus NFS operation can result in gfs2_getattr being
called with the glock already held. In this case we do not want
to try and grab the lock again.

This fixes Red Hat bugzilla #215727

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:56 -05:00
Steven Whitehouse
300c7d75f3 [GFS2] Fix recursive locking in gfs2_permission
Since gfs2_permission may be called either from the VFS (in which case
we need to obtain a shared glock) or from GFS2 (in which case we already
have a glock) we need to test to see whether or not a lock is required.
The original test was buggy due to a potential race. This one should
be safe.

This fixes Red Hat bugzilla #217129

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:53 -05:00
Steven Whitehouse
cb4c031318 [GFS2] Reduce number of arguments to meta_io.c:getbuf()
Since the superblock and the address_space are determined by the
glock, we might as well just pass that as the argument since all
the callers already have that available.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:50 -05:00
Steven Whitehouse
a25311c8e0 [GFS2] Move gfs2_meta_syncfs() into log.c
By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start()
can be made static.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:45 -05:00
Steven Whitehouse
b004157ab5 [GFS2] Fix journal flush problem
This fixes a bug which resulted in poor performance due to flushing
the journal too often. The code path in question was via the inode_go_sync()
function in glops.c. The solution is not to flush the journal immediately
when inodes are ejected from memory, but batch up the work for glockd to
deal with later on. This means that glocks may now live on beyond the end of
the lifetime of their inodes (but not very much longer in the normal case).

Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
calculation of the number of free journal blocks.

The gfs2_logd process has been altered to be more responsive to the journal
filling up. We now wake it up when the number of uncommitted journal blocks
has reached the threshold level rather than trying to flush directly at the
end of each transaction. This again means doing fewer, but larger, log
flushes in general.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:42 -05:00
Steven Whitehouse
ae619320b2 [GFS2] mark_inode_dirty after write to stuffed file
Writes to stuffed files were not being marked dirty correctly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:36 -05:00
Steven Whitehouse
28626e2078 [GFS2] Fix glock ordering on inode creation
The lock order here should be parent -> child rather than
numeric order.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:33 -05:00
Steven Whitehouse
1a14d3a68f [GFS2] Simplify glops functions
The go_sync callback took two flags, but one of them was set on every
call, so this patch removes once of the flags and makes the previously
conditional operations (on this flag), unconditional.

The go_inval callback took three flags, each of which was set on every
call to it. This patch removes the flags and makes the operations
unconditional, which makes the logic rather more obvious.

Two now unused flags are also removed from incore.h.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:30 -05:00
Steven Whitehouse
fa2ecfc5e1 [GFS2] Fix Kconfig wrt CRC32
GFS2 requires the CRC32 library function. This was reported by
Toralf Förster.

Cc: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:24 -05:00
Steven Whitehouse
5e7d65cd9d [GFS2] Make sentinel dirents compatible with gfs1
When deleting directory entries, we set the inum.no_addr to zero
in a dirent when its the first dirent in a block and thus cannot
be merged into the previous dirent as is the usual case. In gfs1,
inum.no_formal_ino was used instead.

This patch changes gfs2 to set both inum.no_addr and inum.no_formal_ino
to zero. It also changes the test from just looking at inum.no_addr to
look at both inum.no_addr and inum.no_formal_ino and a sentinel is
now considered to be a dirent in which _either_ (or both) of them
is set to zero.

This resolves Red Hat bugzillas: #215809, #211465

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:36:20 -05:00
Steven Whitehouse
dcd2479959 [GFS2] Remove unused function from inode.c
The gfs2_glock_nq_m_atime function is unused in so far as its only
ever called with num_gh = 1, and this falls through to the
gfs2_glock_nq_atime function, so we might as well call that directly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:35:57 -05:00
Steven Whitehouse
175011cf6e [GFS2] Remove unused sysfs files
Four of the sysfs files are unused and can therefore be removed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:35:53 -05:00
Steven Whitehouse
4cf1ed8144 [GFS2] Tidy up bmap & fix boundary bug
This moves the locking for bmap into the bmap function itself
rather than using a wrapper function. It also fixes a bug where
the boundary flag was set on the wrong bh. Also the flags on
the mapped bh are reset earlier in the function to ensure that
they are 100% correct on the error path.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:35:49 -05:00
Steven Whitehouse
ab923031ce [GFS2] Fix memory allocation in glock.c
Change from GFP_KERNEL to GFP_NOFS as this was causing a
slow down when trying to push inodes from cache.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:35:46 -05:00
Russell Cattelan
61057c6bb3 [GFS2] Remove unused zero_readpage from stuffed_readpage
Stuffed files only consist of a maximum of
(gfs2 block size - sizeof(struct gfs2_dinode)) bytes. Since the
gfs2 block size is always less than page size, we will never see
a call to stuffed_readpage for anything other than the first page
in the file.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:57 -05:00
Russell Cattelan
7020933156 [GFS2] Fix race in logging code
The log lock is dropped prior to io submittion, but
this exposes a hole in which the log data structures
may be going away due to a truncate.
Store the buffer head in a local pointer prior to
dropping the lock and relay on the buffer_head lock
for consitency on the buffer head.

Signed-Off-By: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:55 -05:00
Steven Whitehouse
9e2dbdac3d [GFS2] Remove gfs2_inode_attr_in
This function wasn't really doing the right thing. There was no need
to update the inode size at this point and the updating of the
i_blocks field has now been moved to the places where di_blocks is
updated. A result of this patch and some those preceeding it is that
unlocking a glock is now a much more efficient process, since there
is no longer any requirement to copy data from the gfs2 inode into
the vfs inode at this point.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:52 -05:00
Steven Whitehouse
e7c698d74f [GFS2] Inode number is constant
Since the inode number is constant, we don't need to keep updating
it everytime we refresh the other inode fields.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:48 -05:00
Steven Whitehouse
6b124d8dba [GFS2] Only set inode flags when required
We were setting the inode flags from GFS2's flags far too often, even when they
couldn't possibly have changed. This patch reduces the amount of flag
setting going on so that we do it only when the inode is read in or
when the flags have changed. The create case is covered by the "when
the inode is read in" case.

This also fixes a bug where we didn't set S_SYNC correctly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:45 -05:00
Steven Whitehouse
2ca99501fa [GFS2] Fix page lock/glock deadlock
This fixes a race between the glock and the page lock encountered
during truncate in gfs2_readpage and gfs2_prepare_write. The gfs2_readpages
function doesn't need the same fix since it only uses a try lock anyway, so
it will fail back to gfs2_readpage in the case of a potential deadlock.

This bug was spotted by Russell Cattelan.

Cc: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:43 -05:00
Steven Whitehouse
c594d88664 [GFS2] Remove unused GL_DUMP flag
There is no way to set the GL_DUMP flag, and in any case the
same thing can be done with systemtap if required for debugging,
so this removes it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:40 -05:00
Steven Whitehouse
f6e58f01e8 [GFS2] Don't copy meta_header for rgrp in and out
The meta_header for an ondisk rgrp never changes, so there is no point
copying it in and back out to disk. Also there is no reason to keep
a copy for each rgrp in memory.

The code already checks to ensure that the header is correct before
it calls the routine to copy the data in, so that we don't even need
to check whether its correct on disk in the functions in ondisk.c

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:36 -05:00
Steven Whitehouse
294caaa3b8 [GFS2] Tidy up 0 initialisations in inode.c
We don't need to use endian conversions for 0 initialisations
when creating a new on-disk inode.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:33 -05:00
Steven Whitehouse
bfded27ba0 [GFS2] Shrink gfs2_inode (8) - i_vn
This shrinks the size of the gfs2_inode by 8 bytes by
replacing the version counter with a one bit valid/invalid
flag.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:30 -05:00
Steven Whitehouse
a9583c7983 [GFS2] Shrink gfs2_inode (7) - di_payload_format
This is almost never used. Its there for backward
compatibility with GFS1. It doesn't need its own
field since it can always be calculated from the
inode mode & flags. This saves a bit more space
in the gfs2_inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:26 -05:00
Steven Whitehouse
1a7b1eed58 [GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctime
Remove the di_[amc]time fields and use inode->i_[amc]time
fields instead. This saves 24 bytes from the gfs2_inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:23 -05:00
Steven Whitehouse
4f56110a00 [GFS2] Shrink gfs2_inode (5) - di_nlink
Remove the di_nlink field in favour of inode->i_nlink and
update the nlink handling to use the proper macros. This
saves 4 bytes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:20 -05:00
Steven Whitehouse
2933f9254a [GFS2] Shrink gfs2_inode (4) - di_uid/di_gid
Remove duplicate di_uid/di_gid fields in favour of using
inode->i_uid/inode->i_gid instead. This saves 8 bytes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:17 -05:00
Steven Whitehouse
b60623c238 [GFS2] Shrink gfs2_inode (3) - di_mode
This removes the duplicate di_mode field in favour of using the
inode->i_mode field. This saves 4 bytes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:14 -05:00
Steven Whitehouse
e7f14f4d09 [GFS2] Shrink gfs2_inode (2) - di_major/di_minor
This removes the device numbers from this structure by using
inode->i_rdev instead. It also cleans up the code in gfs2_mknod.
It results in shrinking the gfs2_inode by 8 bytes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:11 -05:00
Steven Whitehouse
af339c0241 [GFS2] Shrink gfs2_inode (1) - di_header/di_num
The metadata header doesn't need to be stored in the incore
struct gfs2_inode since its constant, and this patch removes it.
Also, there is already a field for the inode's number in the
struct gfs2_inode, so we don't need one in struct gfs2_dinode_host
as well.

This saves 28 bytes of space in the struct gfs2_inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:07 -05:00
Steven Whitehouse
4cc14f0b88 [GFS2] Change argument to gfs2_dinode_print
Change argument for gfs2_dinode_print in order to prepare
for removal of duplicate fields between struct inode and
struct gfs2_dinode_host.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:03 -05:00
Steven Whitehouse
ea744d01c6 [GFS2] Move gfs2_dinode_in to inode.c
gfs2_dinode_in() is only ever called from one place, so move it
to that place (in inode.c) and make it static.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:34:00 -05:00
Steven Whitehouse
891ea14712 [GFS2] Change argument to gfs2_dinode_in
This is a preliminary patch to enable the removal of fields
in gfs2_dinode_host which are duplicated in struct inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:57 -05:00
Steven Whitehouse
539e5d6b7a [GFS2] Change argument of gfs2_dinode_out
Everywhere this was called, a struct gfs2_inode was available,
but despite that, it was always called with a struct gfs2_dinode
as an argument. By making this change it paves the way to start
eliminating fields duplicated between the kernel's struct inode
and the struct gfs2_dinode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:54 -05:00
Al Viro
9c9ab3d541 [GFS2] gfs2 __user misannotation fix
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:49 -05:00
Al Viro
b44b84d765 [GFS2] gfs2 misc endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:46 -05:00
Al Viro
b62f963e1f [GFS2] split and annotate gfs2_quota_change
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:41 -05:00
Al Viro
bd209cc017 [GFS2] split and annotate gfs2_statfs_change
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:38 -05:00
Al Viro
b5bc9e8b06 [GFS2] split and annotate gfs2_quota
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:35 -05:00
Al Viro
629a21e7ec [GFS2] split and annotate gfs2_inum
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:32 -05:00
Al Viro
1e81c4c3e0 [GFS2] split and annotate gfs_rindex
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:29 -05:00
Al Viro
e928a76f95 [GFS2] split and annotate gfs2_meta_header
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:26 -05:00
Steven Whitehouse
2a2c98247b [GFS2] Fix crc32 calculation in recovery.c
Commit "[GFS2] split and annotate gfs2_log_head" resulted in an incorrect
checksum calculation for log headers. This patch corrects the
problem without resorting to copying the whole log header as
the previous code used to.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:17 -05:00
Al Viro
5516762261 [GFS2] split and annotate gfs2_log_head
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:14 -05:00
Al Viro
e697264709 [GFS2] split and annotate gfs2_inum_range
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:11 -05:00
Al Viro
68826664d1 [GFS2] split and annotate gfs2_rgrp
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:07 -05:00
Al Viro
f50dfaf78c [GFS2] split gfs2_sb
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:00 -05:00
Al Viro
5c6edb576f [GFS2] gfs2_dinode_host fields are host-endian
Annotated scalar fields, dropped unused ones.  Note that
it's not at all obvious that we want to convert all of them
to host-endian...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:32:55 -05:00
Al Viro
3ca68df6ee [GFS2] split gfs2_dinode into on-disk and host variants
The latter is used as part of gfs2-private part of struct inode.
It actually stores a lot of fields differently; for now the
declaration is just cloned, inode field is swtiched and changes
propagated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:32:50 -05:00
David Howells
c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Steven Whitehouse
26d83dedf6 [GFS2] Fix OOM error handling
Fix the OOM error handling in inode.c where it was possible for
a NULL pointer to be dereferenced.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-06 08:59:42 -05:00
Steven Whitehouse
4a221953ed [GFS2] Fix incorrect fs sync behaviour.
This adds a sync_fs superblock operation for GFS2 and removes
the journal flush from write_super in favour of sync_fs where it
ought to be. This is more or less identical to the way in which ext3
does this.

This bug was pointed out by Russell Cattelan <cattelan@redhat.com>

Cc: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-06 08:59:16 -05:00
Alexey Dobriyan
eb1dc33aa2 [GFS2] don't panic needlessly
First, SLAB_PANIC is unjustified. Second, all error propagating and backing out
is in place.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-06 08:58:52 -05:00
OGAWA Hirofumi
7011774db8 [PATCH] gfs2: ->readpages() fixes
This just ignore the remaining pages, and remove unneeded unlock_pages().

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-03 12:27:57 -08:00
Adrian Bunk
b7d8ac3e17 [GFS2] gfs2_dir_read_data(): fix uninitialized variable usage
In the "if (extlen)" case, "bh" was used uninitialized.

This patch changes the code to what seems to have been intended.

Spotted by the Coverity checker.

This patch also removes a pointless "bh = NULL" asignment (the variable
is never accessed again after this point).

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:16:20 -04:00
Adrian Bunk
bbbe451273 [GFS2] fs/gfs2/ops_fstype.c:fill_super_meta(): fix NULL dereference
Don't dereference new->s_root when we do know it's NULL.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:15:57 -04:00
Adrian Bunk
348acd48f0 [GFS2] fs/gfs2/dir.c:gfs2_dir_write_data(): don't use an uninitialized variable
In the "if (extlen)" case, "new" might be used uninitialized.

Looking at the code, it should be initialized to 0.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:15:31 -04:00
Adrian Bunk
b0cb66955f [GFS2] fs/gfs2/ops_fstype.c:gfs2_get_sb_meta(): remove unused variable
The Coverity checker spotted this unused variable.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:15:19 -04:00
Adrian Bunk
abbdbd2065 [GFS2] fs/gfs2/dir.c:gfs2_dir_write_data(): remove dead code
The Coverity checker spotted this obviously dead code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:14:42 -04:00
Al Viro
a2d7d021d7 [GFS2] gfs2 endianness bug: be16 assigned to be32 field
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:14:08 -04:00
Steven Whitehouse
23591256d6 [GFS2] Fix bmap to map extents properly
This fix means that bmap will map extents of the length requested
by the VFS rather than guessing at it, or just mapping one block
at a time. The other callers of gfs2_block_map are audited to ensure
they send the correct max extent lengths (i.e. set bh->b_size correctly).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:13:40 -04:00
Russell Cattelan
c312c4fdc8 [GFS2] Pass the correct value to kunmap_atomic
Pass kaddr rather than (incorrect) struct page to kunmap_atomic.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12 17:11:13 -04:00
Steven Whitehouse
fe1a698ffe [GFS2] Fix bug where lock not held
The log lock needs to be held when manipulating the counter
for the number of free journal blocks.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12 17:10:55 -04:00
Steven Whitehouse
f5c54804d9 [GFS2] Fix uninitialised variable
This fixes a bug where, in certain cases an uninitialised variable
could cause a dereference of a NULL pointer in gfs2_commit_write().
Also a typo in a comment is fixed at the same time.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12 17:10:15 -04:00
Russell Cattelan
52ae7b7935 [GFS2] Fix a size calculation error
Fix a size calculation error.
The size was incorrect being computed as a
negative length and then being passed to an
unsigned parameter.

This in turn would cause the allocator to
think it needed enough meta data to store
a gigabyte file for every file created.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12 17:09:54 -04:00
Al Viro
4b4fcaa1a9 [PATCH] misuse of strstr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:17:06 -07:00
Steven Whitehouse
7ecdb70a0e [GFS2] Fix endian bug for de_type
Missing endian conversion for the de_type field.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-03 21:03:35 -04:00
Ryan O'Hara
fcb47e0bd2 [GFS2] Initialize SELinux extended attributes at inode creation time.
This patch has gfs2_security_init declared as a static function, which
is correct. As a result, the declaration of this function in inode.h is
removed (and thus inode.h is unchanged). Also removed #include eaops.h,
which is not needed.

Signed-Off-By: Ryan O'Hara <rohara@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-03 11:57:35 -04:00
Steven Whitehouse
ddacfaf76d [GFS2] Move logging code into log.c (mostly)
This moves the logging code from meta_io.c into log.c and glops.c. As a
result the routines can now be static and all the logging code is together
in log.c, leaving meta_io.c with just metadata i/o code in it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-03 11:10:41 -04:00
Steven Whitehouse
f92a0b6ff4 [GFS2] Mark nlink cleared so VFS sees it happen
This does nothing atm, but will be required for later support of
r/o bind mounts.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 16:01:53 -04:00
Steven Whitehouse
409e185d23 [GFS2] Two redundant casts removed
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 14:20:43 -04:00
Steven Whitehouse
48516ced21 [GFS2] Remove uneeded endian conversion
In many places GFS2 was calling the endian conversion routines
for an inode even when only a single field, or a few fields might
have changed. As a result we were copying lots of data needlessly.

This patch replaces those calls with conversion of just the
required fields in each case. This should be faster and easier
to understand. There are still other places which suffer from this
problem, but this is a start in the right direction.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 12:39:19 -04:00
Steven Whitehouse
3cf1e7bed4 [GFS2] Remove duplicate sb reading code
For some reason we had two different sets of code for reading in the
superblock. This removes one of them in favour of the other. Also we
don't need the temporary buffer for the sb since we already have one
in the gfs2 sb itself.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 11:49:41 -04:00
Steven Whitehouse
2e565bb69c [GFS2] Mark metadata reads for blktrace
Mark the metadata reads so that blktrace knows what they are.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 11:38:25 -04:00
Steven Whitehouse
128e5ebaf8 [GFS2] Remove iflags.h, use FS_
Update GFS2 in the light of David Howells' patch:

[PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6]
36695673b0

which calls the filesystem independant flags FS_..._FL. As a result
we no longer need the flags.h file and the conversion routine is
moved into the GFS2 source code.

Userland programs which used to include iflags.h should now include
fs.h and use the new flag names.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 11:24:43 -04:00
Steven Whitehouse
d00223f169 [GFS2] Fix code style/indent in ops_file.c
Fix a couple of minor issues.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 10:28:05 -04:00
Andrew Morton
930cc237d6 [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix
Fix GFS for streamline-generic_file_-interfaces-and-filemap.patch

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 09:02:54 -04:00
Badari Pulavarty
9c9eb21eee [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits)
This patch removes readv() and writev() methods and replaces them with
aio_read()/aio_write() methods.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 09:02:12 -04:00
Theodore Ts'o
825f9075d7 [GFS2] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-28 08:32:51 -04:00
Theodore Ts'o
bba9dfd835 [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs)
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer.  Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer.  This is just a
cleanup, but it allows us to reuse the union 'u' for something something where
the union will actually be used.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-28 08:32:24 -04:00
Steven Whitehouse
7e18c02be7 [GFS2] Fix bug in Makefiles for lock modules
The Makefile had the wrong CONFIG_ variable in it so that in
case GFS2 was y and the lock modules were m, they were not
getting built properly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-27 12:20:06 -04:00
Steven Whitehouse
907b9bceb4 [GFS2/DLM] Fix trailing whitespace
As per Andrew Morton's request, removed trailing whitespace.

Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-25 09:26:04 -04:00
Steven Whitehouse
7276b3b0c7 [GFS2] Tidy up meta_io code
Fix a bug in the directory reading code, where we might have dereferenced
a NULL pointer in case of OOM. Updated the directory code to use the new
& improved version of gfs2_meta_ra() which now returns the first block
that was being read. Previously it was releasing it requiring following
code to grab the block again at each point it was called.

Also turned off readahead on directory lookups since we are reading a
hash table, and therefore reading the entries in order is very
unlikely. Readahead is still used for all other calls to the
directory reading function (e.g. when growing the hash table).

Removed the DIO_START constant. Everywhere this was used, it was
used to unconditionally start i/o aside from a couple of places, so
I've removed it and made the couple of exceptions to this rule into
separate functions.

Also hunted through the other DIO flags and removed them as arguments
from functions which were always called with the same combination of
arguments.

Updated gfs2_meta_indirect_buffer to be a bit more efficient and
hopefully also be a bit easier to read.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-21 17:05:23 -04:00
Steven Whitehouse
56965536b8 [GFS2] Remove unused constants
Three of the DIO constants were not being used, so remove them.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-20 15:48:09 -04:00
Steven Whitehouse
f0e522a901 [GFS2] Remove "NFS only" readdir path
This code path shouldn't be needed, so remove it for now. This
tidys things up.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19 16:41:11 -04:00
Steven Whitehouse
74669416f7 [GFS2] Use list_for_each_entry_safe_reverse in gfs2_ail1_start()
This is an attempt to fix Red Hat bz 204364. I don't hit it all
the time, but with these changes, running postmark which used to
trigger it on a regular basis no longer appears to. So I'm not
saying that its 100% certain that its fixed, but it does look
promising at the moment.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19 11:17:38 -04:00
Fabio Massimo Di Nitto
7d308590ae [GFS2] Export lm_interface to kernel headers
lm_interface.h has a few out of the tree clients such as GFS1
and userland tools.

Right now, these clients keeps a copy of the file in their build tree
that can go out of sync.

Move lm_interface.h to include/linux, export it to userland and
clean up fs/gfs2 to use the new location.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19 08:45:18 -04:00
akpm@osdl.org
f3b30912e0 [GFS2] inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-vs-gfs2
i_blksize got removed in -mm.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19 08:43:01 -04:00
Steven Whitehouse
07903c02d0 [GFS2] Tweek unlock test in readpage()
This make the unlock test a bit simpler.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-18 17:30:05 -04:00
Russell Cattelan
dc41aeedef [GFS2] Fix for mmap() bug in readpage
Fix for Red Hat bz 205307. Don't need to lock in readpage if
the higher level code has already grabbed the lock.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-18 17:26:48 -04:00
Steven Whitehouse
7a6bbacbb8 [GFS2] Map multiple blocks at once where possible
This is a tidy up of the GFS2 bmap code. The main change is that the
bh is passed to gfs2_block_map allowing the flags to be set directly
rather than having to repeat that code several times in ops_address.c.

At the same time, the extent mapping code from gfs2_extent_map has
been moved into gfs2_block_map. This allows all calls to gfs2_block_map
to map extents in the case that no allocation is taking place. As a
result reads and non-allocating writes should be faster. A quick test
with postmark appears to support this.

There is a limit on the number of blocks mapped in a single bmap
call in that it will only ever map blocks which are pointed to
from a single pointer block. So in other words, it will never try
to do additional i/o in order to satisfy read-ahead. The maximum
number of blocks is thus somewhat less than 512 (the GFS2 4k block
size minus the header divided by sizeof(u64)). I've further limited
the mapping of "normal" blocks to 32 blocks (to avoid extra work)
since readpages() will currently read a maximum of 32 blocks ahead (128k).

Some further work will probably be needed to set a suitable value
for DIO as well, but for now thats left at the maximum 512 (see
ops_address.c:gfs2_get_block_direct).

There is probably a lot more that can be done to improve bmap for GFS2,
but this is a good first step.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-18 17:18:23 -04:00
David Teigland
65952fb4e9 [GFS2] print mount errors related to sysfs
Print an error message if mount fails in setting up the sysfs files.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-18 09:43:23 -04:00
Steven Whitehouse
a8336344a5 [GFS2] Fix glock hash clearing
A one liner bug fix to prevent the return value being
wrong when more than one superblock is mounted.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-14 13:57:38 -04:00
Steven Whitehouse
faa31ce85f [GFS2] Tidy up log.c
Based upon previous feedback from lkml and also removing some
commented out debugging which is no longer needed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-13 11:13:27 -04:00
Steven Whitehouse
16feb9fec0 [GFS2] Use atomic_t rather than kref in glock.c
Use atomic_t as the ref count in glocks rather than a kref.
This is another step towards using RCU for the glock hash.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-13 10:43:37 -04:00
Steven Whitehouse
b6397893a5 [GFS2] Use hlist for glock hash chains
This results in smaller list heads, so that we can have more chains
in the same amount of memory (twice as many). I've multiplied the
size of the table by four though - this is because we are saving
memory by not having one lock per chain any more. So we land up
using about the same amount of memory for the hash table as we
did before I started these changes, the difference being that we
now have four times as many hash chains.

The reason that I say "about the same amount of memory" is that the
actual amount now depends upon the NR_CPUS and some of the config
variables, so that its not exact and in some cases we do use more
memory. Eventually we might want to scale the hash table size
according to the size of physical ram as measured on module load.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-12 10:10:01 -04:00
Steven Whitehouse
2426443460 [GFS2] Rewrite of examine_bucket()
The existing implementation of this function in glock.c was not
very efficient as it relied upon keeping a cursor element upon the
hash chain in question and moving it along. This new version improves
upon this by using the current element as a cursor. This is possible
since we only look at the "next" element in the list after we've
taken the read_lock() subsequent to calling the examiner function.
Obviously we have to eventually drop the ref count that we are then
left with and we cannot do that while holding the read_lock, so we
do that next time we drop the lock. That means either just before
we examine another glock, or when the loop has terminated.

The new implementation has several advantages: it uses only a
read_lock() rather than a write_lock(), so it can run simnultaneously
with other code, it doesn't need a "plug" element, so that it removes
a test not only from this list iterator, but from all the other glock
list iterators too. So it makes things faster and smaller.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-11 21:40:30 -04:00
Steven Whitehouse
94610610f1 [GFS2] Remove unused function from glock.c
The callback for iopen locks is unused, so this removes
it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-09 18:59:27 -04:00
Steven Whitehouse
a5e08a9ef5 [GFS2] Add consts to glock sorting function
Add back the consts which were casted away in the glock sorting
function. Also add early exit code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-09 17:07:05 -04:00
Steven Whitehouse
087efdd391 [GFS2] Make glock hash locks proportional to NR_CPUS
Make the number of locks used for hash chains in glock.c
proportional to NR_CPUS. Also move constants for the number
of hash chains into glock.c from incore.h since they are
not used outside of glock.c.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-09 16:59:11 -04:00
Steven Whitehouse
ff6af411ae [GFS2] vfree should be kfree (II)
The superblock is now created with kmalloc, not vmalloc.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-09 16:56:34 -04:00
Steven Whitehouse
37b2fa6a24 [GFS2] Move rwlocks in glock.c into their own array
This splits the rwlocks guarding the hash chains of the glock hash
table into their own array. This will reduce memory usage in some
cases due to better alignment, although the real reason for doing it
is to allow the two tables to be different sizes in future (i.e.
the locks will be sized proportionally with the max number of CPUs
and the hash chains sized proportinally with the size of physical memory)

In order to allow this, the gl_bucket member of struct gfs2_glock has
now become gl_hash, so we record the hash rather than a pointer to the
bucket itself.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-08 13:35:56 -04:00
Steven Whitehouse
9b47c11d1c [GFS2] Use void * instead of typedef for locking module interface
As requested by Jan Engelhardt, this removes the typedefs in the
locking module interface and replaces them with void *. Also
since we are changing the interface, I've added a few consts
as well.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-08 10:17:58 -04:00
Steven Whitehouse
a2c4580797 [GFS2] vfree should be kfree
This was missed in an earlier patch when changing over from vmalloc
to kmalloc for the superblock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-08 10:13:03 -04:00
Steven Whitehouse
5ce311ebdb [GFS2] Remove unused sync_lvb code from lock modules
This code is no longer used for anything and can be removed
from the locking modules. The sync_lvb function is not required
as this happens automatically with the current locking system.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 17:35:48 -04:00
Steven Whitehouse
1c089c325d [GFS2] Remove one typedef
This removes one of the typedefs from the locking interface. It
is replaced by a forward declaration of the gfs2 superblock. The
other two are not so easy to solve since in their case, they
can refer to one of two possible structures.

Cc: David Teigland <teigland@redhat.com>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 15:50:20 -04:00
Steven Whitehouse
b9201ce9a8 [GFS2] Forgot to remove unused include vmalloc.h
Excatly as the subject line says.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 14:46:39 -04:00
Steven Whitehouse
85d1da67f7 [GFS2] Move glock hash table out of superblock
There are several reasons why we want to do this:
 - Firstly its large and thus we'll scale better with multiple
   GFS2 fs mounted at the same time
 - Secondly its easier to scale its size as required (thats a plan
   for later patches)
 - Thirdly, we can use kzalloc rather than vmalloc when allocating
   the superblock (its now only 4888 bytes)
 - Fourth its all part of my plan to eventually be able to use RCU
   with the glock hash.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 14:40:21 -04:00
Steven Whitehouse
b8547856f9 [GFS2] Add gfs2 superblock to glock hash function
This is another patch preparing for sharing of the glock hash
table between different gfs2 mounts.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 13:12:27 -04:00
Steven Whitehouse
62f140c173 [GFS2] Add brackets in locking/dlm/sysfs.c
As per Jan Engelhardt's request.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 09:54:55 -04:00
David Teigland
3204a6c055 [GFS2] use snprintf for sysfs show
Use snprintf(buf, PAGE_SIZE, ...) instead of sprintf for sysfs show
methods.  Per instructions in Documentation/filesystems/sysfs.txt

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 09:43:34 -04:00
Jan Engelhardt
c53921248c [GFS2] More style changes
Remove redundant brackets

Signed-off-by: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 09:42:56 -04:00
Steven Whitehouse
7b62536141 [GFS2] Add a comment in ops_export.c
Ass a comment explaining the slightly odd construct used to
pass error values back.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 15:56:17 -04:00
Steven Whitehouse
2c1e52aa90 [GFS2] More style fixes
As per Jan Engelhardt's follow up emails, here are a few small
fixes which were missed earlier.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 15:41:57 -04:00
Steven Whitehouse
48fac17909 [GFS2] Remove unused code from quota
As per Jan Engelhardt's request, some unused code is removed and
some consts added in the quota code.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 15:17:12 -04:00
Steven Whitehouse
a67cdbd457 [GFS2] Style changes in logging code
As per Jan Engelhardt's comments, removed some unused code and
removed some brackets which were not required.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 14:41:30 -04:00
Steven Whitehouse
cca195c5c0 [GFS2] Extended attribute code style changes
As per Jan Engelhardt's request and also a few of my own. It has
been possible to add a few most const to the code as a result of
the change in gfs2_ea_name2type.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 13:15:18 -04:00
Steven Whitehouse
16910427e1 [GFS2] Style changes in rgrp.c
Change one constant plus remove a redundant !!.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 11:15:45 -04:00
Steven Whitehouse
ea67eedb21 [GFS2] Fix end of multi-line structures
As per Jan Engelhardt's request, I've added a ',' to the end of
each of the multi-line structures which didn't already have
one (most already did).

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 10:53:09 -04:00
Steven Whitehouse
f2f7ba5237 [GFS2] Make headers compile on their own
As per Jan Engelhardt's comments, this should make all the headers
compile on their own by including and/or declaring structures
early.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 10:39:21 -04:00
Steven Whitehouse
2bdbc5d739 [GFS2] Directory code style changes
As per comments from Jan Engelhardt, remove redundant casts, redundant
endian conversions, add a smattering of const and rewrite the
dirent_next function in order to avoid as many casts as possible.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05 09:34:20 -04:00
Steven Whitehouse
5acd396734 [GFS2] Some further style changes
Introduce a couple of new constants which make the NFS filehandle
sizes that GFS2 uses a bit clearer. Also fix one or two minor
issues as per Jan Engelhardt's sixth email.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 16:16:45 -04:00
Steven Whitehouse
26c1a57412 [GFS2] More code style updates
As per Jan Engelhardt's fifth email. This has most of the changes
recommended, which is the removal of casts which are not required,
some indenting fixes and similar.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 15:32:10 -04:00
Steven Whitehouse
0bd5996a00 [GFS2] Style changes in ops_address.c
As per the remainder of Jan Engelhardt's fourth email comments,
remove an cast thats not required. Also tidy up the "limit" code
in stuck_releasepage().

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 14:59:35 -04:00
Steven Whitehouse
dd538c832a [GFS2] Spelling sentinal -> sentinel
A spelling mistake (one of mine).

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 14:53:30 -04:00
Steven Whitehouse
38c60ef228 [GFS2] Use const in endian conversion routines
Use const in endian conversion and printing of on-disk structures.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 14:48:37 -04:00
Steven Whitehouse
82ffa51637 [GFS2] More style changes
As per Jan Engelhardt's fourth email, this is the first part of the
change set with a few minor style points.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 14:47:06 -04:00
Steven Whitehouse
c26687113a [GFS2] Remove a cast, tidy gfs2_inode_attr_in
The remains of the changes for Jan Engelhardt's third email. Remove
a cast and tidy up gfs2_inode_attr_in.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 13:55:48 -04:00
Steven Whitehouse
cd915493fc [GFS2] Change all types to uX style
This makes all fixed size types have consistent names.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 12:49:07 -04:00
Steven Whitehouse
a91ea69ffd [GFS2] Align all labels against LH side
This makes everything consistent.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 12:04:26 -04:00
Steven Whitehouse
75d3b817a0 [GFS2] Tidy up bmap/inode code
As per Jan Engelhardt's third set of comments, this make various
code style changes and moves the structures from format.h into
super.c, which was the only place that format.h was actually used.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 11:41:31 -04:00
Steven Whitehouse
5029996547 [GFS2] Tidy up locking code
As per Jan Engelhardt's second email, this removes some unused code,
and fixes up indenting in various places.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 09:49:55 -04:00
Steven Whitehouse
e9fc2aa091 [GFS2] Update copyright, tidy up incore.h
As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
updates the copyright message to say "version" in full rather than
"v.2". Also incore.h has been updated to remove forward structure
declarations which are not required.

The gfs2_quota_lvb structure has now had endianess annotations added
to it. Also quota.c has been updated so that we now store the
lvb data locally in endian independant format to avoid needing
a structure in host endianess too. As a result the endianess
conversions are done as required at various points and thus the
conversion routines in lvb.[ch] are no longer required. I've
moved the one remaining constant in lvb.h thats used into lm.h
and removed the unused lvb.[ch].

I have not changed the HIF_ constants. That is left to a later patch
which I hope will unify the gh_flags and gh_iflags fields of the
struct gfs2_holder.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-01 11:05:15 -04:00
Steven Whitehouse
623d93555c [GFS2] Fix releasepage bug (fixes direct i/o writes)
This patch fixes three main bugs. Firstly the direct i/o get_block
was returning the wrong return code in certain cases. Secondly, the
GFS2's releasepage function was not dealing with cases when clean,
ordered buffers were found still queued on a transaction (which can
happen depending on the ordering of journal flushes). Thirdly, the
journaling code itself needed altering to take account of the
after effects of removing the clean ordered buffers from the transactions
before a journal flush.

The releasepage bug did also show up under "normal" buffered i/o
as well, so its not just a fix for direct i/o. In fact its not
normally used in the direct i/o path at all, except when flushing
existing buffers after performing a direct i/o write, but that was
the code path that led us to spot this.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-31 12:14:44 -04:00
Steven Whitehouse
899be4d3b7 [GFS2] Add superblock into key for glock lookups
This adds the superblock as a key for glock lookups. Since the glocks
are already stored in a per-superblock table, this has no effect at
the moment. Later on this will change though.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-30 12:50:28 -04:00
Steven Whitehouse
d6a5372768 [GFS2] Use const on glock lookup key
Use const for the glock name which is being used as a lookup key
in the glock hash table.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-30 11:16:23 -04:00
Steven Whitehouse
ec45d9f583 [GFS2] Use slab properly with glocks
We can take advantage of the slab allocator to ensure that all the list
heads and the spinlock (plus one or two other fields) are initialised
by slab to speed up allocation of glocks.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-30 10:36:52 -04:00
Steven Whitehouse
5e2b0613ed [GFS2] Remove unused code from glock layer
Remove the unused sync feature from glocks. This is currently done by
calling the required functions to sync pages/blocks directly so this
code isn't needed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-30 09:38:30 -04:00
Steven Whitehouse
8fb4b536e7 [GFS2] Make glock operations const
For all the usual reasons of enforcing correctness and potentially
reducing code size, this patch makes the glock operations const.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-30 09:30:00 -04:00
Abhijith Das
8638460540 [GFS2] Allow mounting of gfs2 and gfs2meta at the same time
This patch allows the simultaneous mounting of gfs2meta and gfs2
filesystems. A restriction however is that a gfs2meta fs may only be
mounted if its corresponding gfs2 filesystem is also mounted. Also, a
gfs2 filesystem cannot be unmounted before its gfs2meta filesystem.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-25 17:19:55 -04:00
Benjamin Marzinski
5dc39fe621 [GFS2] Fix journal off-by-one error
log_refund() incorrectly assumed that if a transaction had been touched, it
always committed buffers to the incore log. Thus, when you got around to
flushing the log, you would need one more block than you committed, to account
for the header. So it automatically set reserved to 1, which had the effect of
making sdp->sd_log_blks_reserved one greater when you got to gfs2_log_flush().
However, if you don't actually commit anything to the incore log between
flushes, you don't need the header, because you aren't writing anything out.
With this patch, log_refund() only increments reservered to account for the
header if something has been committed since the last flush.

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-25 09:57:41 -04:00
Steven Whitehouse
a2242db090 [GFS2] Speed up scanning of glocks
I noticed the gfs2_scand seemed to be taking a lot of CPU,
so in order to cut that down a bit, here is a patch. Firstly
the type of a glock is a constant during its lifetime, so that
its possible to check this without needing locking. I've moved
the (common) case of testing for an inode glock outside of
the glmutex lock.

Also there was a mutex left over from when the glock cache was
master of the inode cache. That isn't required any more so I've
removed that too.

There is probably scope for further speed ups in the future
in this area.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-24 17:03:05 -04:00
Steven Whitehouse
166afccd71 [GFS2] Tidy up error handling in gfs2_releasepage()
This should clarify the logic in gfs2_releasepage() relating to
error handling as well as making the response to errors a bit
more graceful.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-24 15:59:40 -04:00
Steven Whitehouse
b8e1aabf21 [GFS2] Another list_del bug
Another case where list_del should be list_del_init.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-22 16:25:50 -04:00
Steven Whitehouse
08867605e1 [GFS2] Fix to list_del in lops.c
A list_del should have been a list_del_init in lops.c which was
resulting in incorrect status returns from list_empty().

Signed-off-by: Steven Whitheouse <swhiteho@redhat.com>
2006-08-22 11:03:57 -04:00
Steven Whitehouse
15d00c0b91 [GFS2] Fix leak of gfs2_bufdata
This fixes a memory leak of struct gfs2_bufdata and also some
problems in the ordered write handling code. It needs a bit
more testing, but I believe that the reference counting of
ordered write buffers should now be correct.

This is aimed at fixing Red Hat bugzilla: #201028 and #201082

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-18 15:51:09 -04:00
Russell Cattelan
8872187780 [GFS2] Fix a couple of refcount leaks.
recovery.c add a brelse to deal with gfs2_replay_read_block being called
twice on the same block.

add a dput to drop the ref count on the root inode.
This was causing lingering glocks and thus causing
a mount failure to hang.

Fix a endian conversion macro that was was swizzling
16bits when it should have been swizzling 32.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-10 17:18:59 -04:00
Steven Whitehouse
f4387149ec [GFS2] Fix lack of buffers in writepage bug
In some cases we can enter write page without there being buffers
attached to the page. In this case the function to add gfs2_bufdata
to the buffers fails sliently causing further failures down the
stack.

This fix ensures that we always add buffers in writepage if they
didn't already exist (mmap is one way to trigger this).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-08 13:23:19 -04:00
Steven Whitehouse
2b557f6dc7 [GFS2] Fix gfs_ prefix in locking.c
The previous patch didn't change all the gfs_ to gfs2_ so
this is the remainder.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-07 11:12:30 -04:00
David Teigland
08eac93a68 [GFS2] match plock result with correct request
When the result of a posix lock request is read it needs to be matched up
with the correct waiting request.  The owner field needs to be used in the
comparison since more than one process may be waiting for locks on the
same file.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-07 08:46:51 -04:00
David Teigland
3120ec541e [GFS2] lockproto api prefix
Use the gfs2_ prefix on the register/unregister functions for the lock
modules.  The gfs_ prefix was left from an old idea on how to share these
with gfs1.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-07 08:46:19 -04:00
Steven Whitehouse
59a1cc6bda [GFS2] Fix lock ordering bug in page fault path
Mmapped files were able to trigger a lock ordering bug. Private
maps do not need to take the glock so early on. Shared maps do
unfortunately, however we can get around that by adding a flag
into the flags for the struct gfs2_file. This only works because
we are taking an exclusive lock at this point, so we know that
nobody else can be racing with us.

Fixes Red Hat bugzilla: #201196

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-04 15:41:22 -04:00
Steven Whitehouse
899bb26450 [GFS2] Fix bug in directory code
This was a nasty bug which resulted in corruption of hash tables
in the directory code with larger directories. We forgot to
increment a pointer in the read/write routines internal to the
directory code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-01 15:28:57 -04:00
David Teigland
de9b75d31e [GFS2] add plock owner
We need to use fl_owner instead of fl_pid to track the owner of a posix
lock.  Pass the owner value out to user space where cluster plocks are
managed.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-31 15:44:29 -04:00
Steven Whitehouse
420b9e5e45 [GFS2] Tidy up in various files
Tidy up some files and remove an unused routine in meta_io.h. Also
added a bit of extra debugging in meta_io.h.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-31 15:42:17 -04:00
Steven Whitehouse
5dd9feafb3 [GFS2] Fix bug in clear_inode
We should have been waiting for lock demotion to finish in
clear_inode.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-28 14:52:33 -04:00
Steven Whitehouse
2b98a54f79 [GFS2] Fix bug in super block reading code
This gets the argument to submit_bio() correct.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27 16:37:48 -04:00
Steven Whitehouse
dd894be8df [GFS2] Change some allocations to GFP_NOFS
Some allocations in rgrp.c should have been GFP_NOFS
rather than GFP_KERNEL.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27 14:29:00 -04:00
Steven Whitehouse
f45b7ddd2b [GFS2] Use a bio to read the superblock
This means that we don't need to create a special inode just to contain
a struct address_space in order to read a single disk block. Instead
we read the disk block directly. Its slightly faster, and uses slightly
less memory, but the real reason for doing this is that it removes a
special case from the glock code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27 13:53:53 -04:00
Steven Whitehouse
ba7f72901c [GFS2] Remove page.[ch]
The remaining routines in page.c were all only used in one other
file, so they are now moved into the files where they are referenced
and made static. Thus page.[ch] are no longer required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-26 11:27:10 -04:00
Steven Whitehouse
f25ef0c1b4 [GFS2] Tidy gfs2_unstuffer_page
Tidy up gfs2_unstuffer_page by:

 a) Moving it into bmap.c
 b) Making it static
 c) Calling it directly from gfs2_unstuff_dinode
 d) Updating all callers of gfs2_unstuff_dinode due to one less
    required argument.

It doesn't change the behaviour at all.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-26 10:51:20 -04:00
Steven Whitehouse
a9e5f4d078 [GFS2] Alter direct I/O path
As per comments received, alter the GFS2 direct I/O path so that
it uses the standard read functions "out of the box". Needs a
small change to one of the VFS functions. This reduces the size
of the code quite a lot and also removes the need for one new export.

Some more work remains to be done, but this is the bones of the
thing.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-25 17:24:12 -04:00
Abhijith Das
52f341cf75 [GFS2] gfs2_set_flags double locking patch
traced the "umount hang due to spurious glock" issue that I was having
with gfs2meta. It's in the do_gfs2_set_flags function, which does a
gfs2_holder_init as well as a gfs2_glock_nq_init (increases ref count by
2 instead of 1).

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-21 02:03:21 -04:00
David Teigland
c5921fd02e [GFS2] fix typo in locking/dlm
Typo causes the error value from the wrong lock to be checked.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-21 01:57:40 -04:00
Steven Whitehouse
e0f2bf780a [GFS2] Fix endian conversion bug
Fix an endian coversion bug in log.c spotted by Kevin Anderson.

Cc: Kevin Anderson <kanderso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-17 09:36:28 -04:00
Steven Whitehouse
634ee0b9f4 [GFS2] Fix use after free bug in dir.c
Fix a use after free bug in dir.c spotted by Kevin Anderson.

Cc: Kevin Anderson <kanderso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-17 09:32:37 -04:00
Wendy Cheng
2eb168ca94 [GFS2] NFS update
Update the NFS filehandles so that they contain the file type.

Signed-off-by: Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-13 09:24:48 -04:00
Steven Whitehouse
4da3c6463e [GFS2] Fix a coupls of warnings in dir.c
Fix a couple of compiler warnings in dir.c caused by
potentially uninitialised variables.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-11 13:19:13 -04:00
Abhijith Das
b2a580d87b [PATCH] patch to init di_payload_format field in gfs2_dinode
A missing initialisation when creating a new on disk inode.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-11 09:54:17 -04:00
Steven Whitehouse
f3bba03fd1 [GFS2] Fix deadlock in memory allocation
We must not call GFP_KERNEL memory allocations while we
are holding the log lock (read or write) since that may
trigger a log flush resulting in a deadlock.

Eventually we need to fix the locking in log.c, for now
this solves the problem at the expense of freeing up memory
as fast as we would like to. This needs to be revisited
later on.

Cc: Kevin Anderson <kanderso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-11 09:50:54 -04:00
Steven Whitehouse
4340fe6253 [GFS2] Add generation number
This adds a generation number for the eventual use of NFS to the
ondisk inode. Its backward compatible with the current code since
it doesn't really matter what the generation number is to start with,
and indeed since its set to zero, due to it being taken from padding
in both the inode and rgrp header, it should be fine.

The eventual plan is to use this rather than no_formal_ino in the
NFS filehandles. At that point no_formal_ino will be unused.

At the same time we also add a releasepages call back to the
"normal" address space for gfs2 inodes. Also I've removed a
one-linrer function thats not required any more.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-11 09:46:33 -04:00
Steven Whitehouse
ffeb874b2b [GFS2] Bug fix to gfs2_readpages()
This fixes a bug where we were releasing a page incorrectly
sometimes when reading a stuffed file. This fixes the bug
that Kevin reported when using Xen.

Cc: Kevin Anderson <kanderso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-10 15:47:01 -04:00
Steven Whitehouse
dc3e130a08 [GFS2] Remove unused code from dir.c
Remove a couple of commented out, and unused lines of
code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-10 11:19:29 -04:00
Steven Whitehouse
29937ac6ca [GFS2] Fixes to scanning of glocks (again)
This really is the correct fix this time. We just ignore all
glocks associated with inodes until the inodes are pushed
from the inode cache. At that point the glocks are queued for
reclaim, so we don't need to do it here.

Also fix one or two other minor bugs.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-06 17:58:03 -04:00
Steven Whitehouse
627add2d13 [GFS2] Correct logic in glock scanner
Under certain circumstances the glock scanning logic would
demote locks which ought not to have been selected for
demotion.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-05 13:16:19 -04:00
Steven Whitehouse
fd4de2d41a [GFS2] Add cast for printk
Cast a uint64_t to unsigned long long for a printk.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-05 13:14:59 -04:00
Steven Whitehouse
ecb1460dc4 [GFS2] Make GFS2 work with lock validator
Change our one existing old-style lock initialiser to a new-style
one. This allows the lock validator to work as intended.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-05 10:41:39 -04:00
Steven Whitehouse
faac9bd0e3 [GFS2] Fix locking for Direct I/O reads
We need to hold i_mutex when doing direct i/o reads.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-05 08:24:34 -04:00
Steven Whitehouse
b0dd9308b7 [GFS2] Mark file_operations const
As per Arjan's patches:
http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=99ac48f54a91d02140c497edc31dc57d4bc5c85d
and
http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=4b6f5d20b04dcbc3d888555522b90ba6d36c4106

make the GFS2 file_operations structures const.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 13:47:02 -04:00
Steven Whitehouse
66de045d9f [GFS2] Make our address_space_operations const
As per Christoph's patch:
http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=f5e54d6e53a20cef45af7499e86164f0e0d16bb2

We mark struct address_space_operations const in GFS2.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 13:37:30 -04:00
Steven Whitehouse
0c0834a30c [GFS2] API change for gfs2_statfs
The kernel API for super_operations->statfs() has changed
so this updates GFS2 to the new API.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 11:38:01 -04:00
Andrew Morton
ccd6efd0cd [patch 1/1] gfs2: get_sb_dev() fix
Update GFS2 for dhowells API changes.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 11:23:09 -04:00
Steven Whitehouse
02630a12c7 [GFS2] Remove dependance on tty_write_message()
This removes the call in GFS2 to tty_write_message and replaces
it with a printk. As the export was added by GFS2, we remove this
as well.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 11:20:06 -04:00
Steven Whitehouse
af18ddb886 [GFS2] Eliminate one instance of __GFP_NOFAIL
This removes one instance of GFP_NOFAIL from the glock callback
function. It also fixes a bug where a , was used at a line end
rather than ; causing unintended results.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-24 15:42:21 -04:00
Steven Whitehouse
a53311d4d9 [GFS2] Use generic_file_sendfile directly
Don't use a wrapper for generic_file_sendfile but call it
directly.

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-23 16:16:29 -04:00
David Teigland
a464418425 [GFS2] gfs2/dlm: mailing list and web page
List new development mailing list and correct web page url.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-22 15:29:57 -04:00
Steven Whitehouse
bdd512aeea [GFS2] Remove unused flag
The flag GIF_MIN_INIT is no longer used or required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-22 15:26:33 -04:00
Adrian Bunk
43f5d210a0 [GFS2] [-mm patch] fs/gfs2/: make code static
This patch makes the following needlessly global code static:
- eaops.c: struct gfs2_security_eaops
- rgrp.c: gfs2_free_uninit_di()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-22 11:16:40 -04:00
Steven Whitehouse
faf450ef4a [GFS2] Remove gfs2_repermission
gfs2_repermission is just a wrapper for permission, so remove it and
call permission directly where required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-22 10:59:10 -04:00
Steven Whitehouse
d9d1ca3050 [GFS2] Fix double locking problem in rename
The rename inode operation was trying to lock the same
inode twice in the case of renaming with the source
and destination directories the same. We now test for
this and just lock once.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-21 15:38:17 -04:00
Steven Whitehouse
0d42e54220 [GFS2] Remove unused ra_state variable
As per Nick Piggin's comments on lkml, remove the unused ra_state
variable.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
2006-06-20 16:13:49 -04:00
David Woodhouse
0239c4ae8a [GFS2] Fix printk format warnings in DLM code
fs/gfs2/locking/dlm/thread.c: In function ‘process_complete’:
fs/gfs2/locking/dlm/thread.c:56: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
fs/gfs2/locking/dlm/thread.c:69: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
fs/gfs2/locking/dlm/thread.c:102: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
fs/gfs2/locking/dlm/thread.c:124: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
fs/gfs2/locking/dlm/thread.c:146: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
fs/gfs2/locking/dlm/thread.c:148: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-20 13:48:31 +01:00
David Woodhouse
695165dfba [GFS2] Fix use of bitops on unsigned int (struct gfs2_holder->gh_iflags)
fs/gfs2/glock.c: In function ‘gfs2_holder_get’:
fs/gfs2/glock.c:439: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type
fs/gfs2/glock.c: In function ‘rq_promote’:
fs/gfs2/glock.c:512: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type
fs/gfs2/glock.c:526: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type
 ...

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-20 13:44:27 +01:00
Steven Whitehouse
b61dde795f [GFS2] Always include glock in transaction
Include the glock in the transaction, even when not journaling
data in order that ordered write data will be correctly flushed
when the lock is released.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-19 10:51:11 -04:00
Steven Whitehouse
3a8476dda1 [GFS2] Remove debugging printks
A few of my printks slipped through last time. Also fix a couple of
minor bugs.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-19 09:10:39 -04:00
Steven Whitehouse
feaa7bba02 [GFS2] Fix unlinked file handling
This patch fixes the way we have been dealing with unlinked,
but still open files. It removes all limits (other than memory
for inodes, as per every other filesystem) on numbers of these
which we can support on GFS2. It also means that (like other
fs) its the responsibility of the last process to close the file
to deallocate the storage, rather than the person who did the
unlinking. Note that with GFS2, those two events might take place
on different nodes.

Also there are a number of other changes:

 o We use the Linux inode subsystem as it was intended to be
used, wrt allocating GFS2 inodes
 o The Linux inode cache is now the point which we use for
local enforcement of only holding one copy of the inode in
core at once (previous to this we used the glock layer).
 o We no longer use the unlinked "special" file. We just ignore it
completely. This makes unlinking more efficient.
 o We now use the 4th block allocation state. The previously unused
state is used to track unlinked but still open inodes.
 o gfs2_inoded is no longer needed
 o Several fields are now no longer needed (and removed) from the in
core struct gfs2_inode
 o Several fields are no longer needed (and removed) from the in core
superblock

There are a number of future possible optimisations and clean ups
which have been made possible by this patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-14 15:32:57 -04:00
Steven Whitehouse
01eb7c0796 [GFS2] Fix warning on impossible event in eattr code
The caller ensures that ea_list_i() is never called with an
invalid type, so lets BUG() if we see one. This clears up
a couple of compiler warnings too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-06 17:31:30 -04:00
Steven Whitehouse
6b61b072a8 [GFS2] Move some fields around to reduce wasted space
We can reclaim some space by moving fields in some structures
in order to allow them to pack better on 64 bit architectures.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-06 14:49:39 -04:00
Ryan O'Hara
e70409f5f3 [GFS2] Fix for selinux support
This should fix the mount problems with gfs2 and selinux.

Signed-off-by: Ryan O'Hara <rohara@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-25 17:36:15 -04:00
Steven Whitehouse
382066da25 [GFS2] Casts for printing 64bit numbers
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-24 10:22:09 -04:00
David Teigland
9229f01349 [GFS2] Cast 64 bit printk args to unsigned long long.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-24 09:21:30 -04:00
Steven Whitehouse
90cdd2083a [GFS2] Flag up issue in selinux code
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-22 10:36:25 -04:00
Ryan O'Hara
639b6d79b8 [GFS2] selinux support
This adds support to GFS2 for selinux extended attributes. There is a
known bug in gfs2_ea_get() which is believed to be independant of this
patch. Further patches will follow once that bug is fixed in order to
make GFS2 use as much of the generic eattr infrastructure as possible.

Signed-off-by: Ryan O'Hara <rohara@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-22 10:08:35 -04:00
David Teigland
d2f222e631 [GFS2] setup lock_dlm kobject earlier
Setup the lock_dlm kobject before setting up the dlm lockspace instead
of after.  We want to use the sysfs files to detect the mount without
having to wait for the dlm setup which can take a while.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-19 08:24:02 -04:00
Steven Whitehouse
320dd101e2 [GFS2] glock debugging and inode cache changes
This adds some extra debugging to glock.c and changes
inode.c's deallocation code to call the debugging code
at a suitable moment. I'm chasing down a particular bug
to do with deallocation at the moment and the code can
go again once the bug is fixed.

Also this includes the first part of some changes to unify
the Linux struct inode and GFS2's struct gfs2_inode. This
transformation will happen in small parts over the next short
period.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 16:25:27 -04:00
Steven Whitehouse
3a8a9a1034 [GFS2] Update copyright date to 2006
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 15:09:15 -04:00
Steven Whitehouse
bd8968010a [GFS2] Remove semaphore.h from C files
We no longer use semaphores, everything has been converted to
mutex or rwsem, so we don't need to include this header any more.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 14:54:58 -04:00
Steven Whitehouse
1b50259bc3 [GFS2] Drop log lock on I/O error & tidy up
This patch drops the log spinlock when an I/O error occurs
to avoid any possible problems in case of blocking or
recursion in the I/O error routine. It also has a few
cosmetic changes to tidy up various other files.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 14:10:52 -04:00
Steven Whitehouse
02f211f4d0 [GFS2] Remove bits.c from the Makefile
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 14:03:43 -04:00
Steven Whitehouse
3efd7534a8 [GFS2] Make newly moved functions static
The functions moved from bits.c can now be made static.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 14:02:52 -04:00
Steven Whitehouse
88c8ab1fcb [GFS2] Merge bits.[ch] into rgrp.c
Since they are small and will be inlined by the complier,
it makes sense to merge the contents of bits.[ch] into
rgrp.c

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 13:52:39 -04:00
Steven Whitehouse
64c14ea73b [GFS2] Fix ref count bug that used to bite us on umount
The ref count of certain glock's got elevated too far during unlink
which caused umount to fail. This fixes it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-16 13:37:11 -04:00
Steven Whitehouse
b9cb981310 [GFS2] Fix attributes setting logic
The attributes logic for immutable was wrong so that there was
not way to remove this attribute once set. This fixes the
bug.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-12 17:07:56 -04:00
Steven Whitehouse
9801f6461e [GFS2] Remove incorrect initialisation of gh_owner
The gh_owner field shouldn't be set or reset outside the glock code.
These were left over from when recursive locking was allowed. It
isn't any more, so they are not needed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-12 14:06:02 -04:00
Steven Whitehouse
e90c01e148 [GFS2] Reverse block order in build_height
The original code ordered the blocks allocated in the build_height
routine backwards causing excessive disk seeks during a read of the
metadata. This patch reverses the order to try and reduce disk seeks.

Example: A five level metadata tree, I = Inode, P = Pointers, D = Data

You need to read the blocks in the order:

I P5 P4 P3 P2 P1 D

in order to read a single data block. The new code now orders the blocks
in this way. The old code used to order them as:

I P1 P2 P3 P4 P5 D

requiring two extra seeks on average. Note that for files which are
grown by gradual extension rather than by truncate or by llseek/write
at a large offset, this doesn't apply. In the case of writing to a
file linearly, this routine will only be called upon to extend the
height of the tree by one block at a time, so the ordering is
determined by when its called rather than by the internals of the
routine itself. Optimising that part of the ordering is a much
harder problem.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-12 12:09:15 -04:00
Steven Whitehouse
fd88de569b [GFS2] Readpages support
This adds readpages support (and also corrects a small bug in
the readpage error path at the same time). Hopefully this will
improve performance by allowing GFS to submit larger lumps of
I/O at a time.

In order to simplify the setting of BH_Boundary, it currently gets
set when we hit the end of a indirect pointer block. There is
always a boundary at this point with the current allocation code.
It doesn't get all the boundaries right though, so there is still
room for improvement in this.

See comments in fs/gfs2/ops_address.c for further information about
readpages with GFS2.

Signed-off-by: Steven Whitehouse
2006-05-05 16:59:11 -04:00
Robert S Peterson
5bb76af1e0 [GFS2] Set d_ops for root inode
Well, I managed to track down the bug in gfs2 that was causing
my grief.  Below is a patch for the problem.  Please incorporate
as you see fit.  Or should I say: as you see git.

The problem was basically that you never set d_ops for the root
inode, so the wrong hash algorithm was being used.  But only for
the root directory.  Turns out that if I used subdirectories, it
used the proper hash and my files were found just fine.

Signed-off-by: Robert S Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-05 16:29:50 -04:00
Steven Whitehouse
d2d7b8a2a7 [GFS2] Fix bug in writepage()
As pointed out by Wendy Cheng, the logic in GFS2's writepage() function
wasn't quite right with respect to invalidating pages when a file has been
truncated. This patch fixes that.

CC: Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-02 12:09:42 -04:00
Steven Whitehouse
56409abbf8 [GFS2] Remove some unused code
Remove some of the unused code flagged up by Adrian Bunk.

Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse
2006-04-28 11:48:45 -04:00
Adrian Bunk
08bc2dbc73 [GFS2] [-mm patch] fs/gfs2/: possible cleanups
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 unused functions
- remove the following global function that was both unused and
  unimplemented:
  - super.c: gfs2_do_upgrade()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:59:12 -04:00
Steven Whitehouse
363275216c [GFS2] Reordering in deallocation to avoid recursive locking
Despite my earlier careful search, there was a recursive lock left
in the deallocation code. This removes it. It also should speed up
deallocation be reducing the number of locking operations which take
place by using two "try lock" operations on the two locks involved in
inode deallocation which allows us to grab the locks out of order
(compared with NFS which grabs the inode lock first and the iopen
lock later). It is ok for us to fail while doing this since if it
does fail it means that someone else is still using the inode and
thus it wouldn't be possible to deallocate anyway.

This fixes the bug reported to me by Rob Kenna.

Cc: Rob Kenna <rkenna@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:46:21 -04:00
David Teigland
e7f5c01cad [GFS2] Remove redundant casts to/from void
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-27 11:25:45 -04:00
David Teigland
6bd70aba5a [DLM] lock_dlm recover_status patch
This saves the journal recovery result and makes it visible through sysfs.
User space needs to know if the node actually recovered the journal or
tried and gave up.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 15:56:35 -04:00
Steven Whitehouse
579b78a43b [GFS2] Remove GL_NEVER_RECURSE flag
There is no point in keeping this flag since recursion is not
now allowed for any glock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 14:58:26 -04:00
Steven Whitehouse
5965b1f479 [GFS2] Don't do recursive locking in glock layer
This patch changes the last user of recursive locking so that
it no longer needs this feature and removes it from the glock
layer. This makes the glock code a lot simpler and easier to
understand. Its also a prerequsite to adding support for the
AOP_TRUNCATED_PAGE return code (or at least it is if you don't
want your brain to melt in the process)

I've left in a couple of checks just in case there is some place
else in the code which is still using this feature that I didn't
spot yet, but they can probably be removed long term.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 13:21:55 -04:00
David Teigland
3a2a9c96ac [GFS2] Update plock code in DLM locking module
We should be using fl_pid not fl_owner.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 15:45:51 -04:00
Steven Whitehouse
4bcf7091f9 [GFS2] Remove inherited flags from exported flags.
We don't need the inherited flags since this action can be
implied by setting the flags on directories where they
wouldn't otherwise make sense. It reduces the number of extra
flags by two. Also updated the list of flags to take account of
one extra ext2/3 flag.

Cc: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 13:20:27 -04:00
Steven Whitehouse
b5ea3e1ef3 [GFS2] Tidy up Makefile & Kconfig
Remove select of SYSFS as requested by Greg KH. Change whitespace to
tabs rather than spaces in places where it was incorrect and removed
'default m' as suggested by Adrian Bunk.

Reorganised Makefile as suggested by Sam Ravnborg.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 14:14:42 -04:00
Steven Whitehouse
b800a1cb39 [GFS2] Tidy up daemon.c
As per Andrew Morton's comments, remove uneeded casts and use
wait_event_interruptible() rather than open code the wait.

Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 13:13:56 -04:00
Steven Whitehouse
61e085a88c [GFS2] Tidy up dir code as per Christoph Hellwig's comments
1. Comment whitespace fix
2. Removed unused header files from dir.c
3. Split the gfs2_dir_get_buffer() function into two functions

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 10:07:13 -04:00
Steven Whitehouse
1e09ae544e [GFS2] Move BUG() back into the header file
In order to make the file and line number reporting work
correctly, this has been moved back into the header file.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:52:46 -04:00
Steven Whitehouse
1dde2dbfc7 [GFS2] Add back missing BUG()
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:39:02 -04:00
Steven Whitehouse
a74604bee2 [GFS2] sem -> mutex conversion in locking.c
Convert a semaphore to a mutex in locking.c and also tidy
up one or two loose ends.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:10:46 -04:00
David Teigland
c63e31c2cc [GFS2] journal recovery patch
This is one of the changes related to journal recovery I mentioned a
couple weeks ago.  We can get into a situation where there are only
readonly nodes currently mounting the fs, but there are journals that need
to be recovered.  Since the readonly nodes can't recover journals, the
next rw mounter needs to go through and check all journals and recover any
that are dirty (i.e. what the first node to mount the fs does).  This rw
mounter needs to skip the journals held by the existing readonly nodes.
Skipping those journals amounts to using the TRY flag on the journal locks
so acquiring the lock of a journal held by a readonly node will fail
instead of blocking indefinately.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-20 17:03:48 -04:00
Steven Whitehouse
190562bd84 [GFS2] Fix a bug: scheduling under a spinlock
At some stage, a mutex was added to gfs2_glock_put() without
checking all its call sites. Two of them were called from
under a spinlock causing random delays at various points and
crashes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-20 16:57:23 -04:00
Steven Whitehouse
fe1bdedc6c [GFS2] Use vmalloc() in dir code
When allocating memory to sort directory entries, use vmalloc()
rather than kmalloc() since for larger directories, the required
size can easily be graeter than the 128k maximum of kmalloc().

Also adding the first steps towards getting the AOP_TRUNCATED_PAGE
return code get in the glock code by flagging all places where we
request a glock and we are holding a page lock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-18 10:09:15 -04:00
Steven Whitehouse
4d8012b60e [GFS2] Fix bug which was causing postmark to fail
A typo in the directory code was causing postmark to fail
somewhere in the allocation code, since it was unable to
find newly allocated directory leaf blocks under certain
circumstances.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-12 17:39:45 -04:00
Steven Whitehouse
f4154ea039 [GFS2] Update journal accounting code.
A small update to the journaling code to change the way that
the "extra" blocks are accounted for in the journal. These are
used at a rate of one per 503 metadata blocks or one per 251
journaled data blocks (or just one if the total number of journaled
blocks in the transaction is smaller). Since we are using them at
two different rates the old method of accounting for them no longer
works and we count them up as required.

Since the "per transaction" accounting can't handle this (there is no
fixed number of header blocks per transaction) we have to account for
it in the general journal code. We now require that each transaction
reserves more blocks than it actually needs to take account of the
possible extra blocks.

Also a final fix to dir.c to ensure that all ref counts are handled
correctly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-11 14:49:06 -04:00
Steven Whitehouse
ed3865079b [GFS2] Finally get ref counting correct
The last patch missed some other instances of incorrect ref counting,
this fixes all of those too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-07 16:28:07 -04:00
Steven Whitehouse
b09e593d79 [GFS2] Fix a ref count bug and other clean ups
This fixes a ref count bug that sometimes showed up a umount time
(causing it to hang) but it otherwise mostly harmless. At the same
time there are some clean ups including making the log operations
structures const, moving a memory allocation so that its not done
in the fast path of checking to see if there is an outstanding
transaction related to a particular glock.

Removes the sd_log_wrap varaible which was updated, but never actually
used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
(which it never needed anyway). Removes the "invalidate inodes" loop
from GFS2's put_super routine. This is done in kill super anyway so
we don't need to do it here. The loop was also bogus in that if there
are any inodes "stuck" at this point its a bug and we need to know
about it rather than hide it by hanging forever.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-07 11:17:32 -04:00
Steven Whitehouse
55eccc6d00 [GFS2] Finish off ioctl support
This puts the finishing touches to the ioctl support and also
removes a couple of unused fields from GFS2's private per file
structure.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-04 14:29:30 -04:00
Steven Whitehouse
8628de0583 [GFS2] Update GFS2 for the recent pull from Linus
Some interfaces have changed. In particular one of the posix
locking functions has changed prototype, along with the
address space operation invalidatepage and the block getting
callback to the direct IO function.

In addition add the splice file operations. These will need to
be updated to support AOP_TRUNCATED_PAGE before they will be
of much use to us.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-31 16:48:41 -05:00
Steven Whitehouse
7ea9ea8322 [GFS2] Update ioctl() to new interface
This is designed as a fs independent way to set flags on a
particular inode. The values of the ioctl() and flags are
designed to be identical to the ext2/3 values. Assuming that
this plan is acceptable to people in general, the plan is to
then move other fs across to using the same set of #defines,
etc.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-31 15:01:28 -05:00
Steven Whitehouse
e3167ded1f [GFS] Fix bug in endian conversion for metadata header
In some cases 16 bit functions were being used rather than 32 bit
functions.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-30 15:46:23 -05:00
Steven Whitehouse
cd45697f0d [GFS2] Add missing {} in trans.c
A conditional had missing {} around the two following
statements. Now added.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-30 11:10:12 -05:00
Steven Whitehouse
e90deff533 [GFS2] Fix bug in directory expansion code
We didn't properly check that leaf splitting was allowed. We do
now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-29 19:02:15 -05:00
Steven Whitehouse
d0dc80dbaf [GFS2] Update debugging code
Update the debugging code in trans.c and at the same time improve
the debugging code for gfs2_holders. The new code should be pretty
fast during the normal case and provide just as much information
in case of errors (or more).

One small function from glock.c has moved to glock.h as a static inline so
that its return address won't get in the way of the debugging.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-29 14:36:49 -05:00
Steven Whitehouse
484adff8a0 [GFS2] Update locking in log.c
Replace the lock_for_trans()/lock_for_flush() functions with an rwsem.
In fact the sd_log_flush_lock becomes an rwsem (the write part of it)
and is extended slightly to cover everything that the lock_for_flush()
used to cover. The read part of the lock is instead of lock_for_trans().

This corrects the races in the original code and reduces the code size.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-29 09:12:12 -05:00
David Teigland
7aabffcab4 [DLM] Look for "nodir" in the lockspace mount options
Look for "nodir" in the hostdata mount option which is used to
set the NODIR flag in dlm_new_lockspace().

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-28 14:20:58 -05:00
Steven Whitehouse
71b86f562b [GFS2] Further updates to dir and logging code
This reduces the size of the directory code by about 3k and gets
readdir() to use the functions which were introduced in the previous
directory code update.

Two memory allocations are merged into one. Eliminates zeroing of some
buffers which were never used before they were initialised by
other data.

There is still scope for further improvement in the directory code.

On the logging side, a hand created mutex has been replaced by a
standard Linux mutex in the log allocation code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-28 14:14:04 -05:00
Steven Whitehouse
94aabbd993 [GFS2] Remove ioctl support
The various flags on inodes will in future be set and queried
via the extended attributes interface, so this interface is no
longer required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-21 11:29:00 -05:00
Steven Whitehouse
c752666c17 [GFS2] Fix bug in directory code and tidy up
Due to a typo, the dir leaf split operation was (for the first
split in a directory) writing the new hash vaules at the
wrong offset. This is now fixed.

Also some other tidy ups are included:

 - We use GFS2's hash function for dentries (see ops_dentry.c) so that
   we don't have to keep recalculating the hash values.
 - A lot of common code is eliminated between the various directory
   lookup routines.
 - Better error checking on directory lookup (previously different
   routines checked for different errors)
 - The leaf split operation has a couple of redundant operations
   removed from it, so it should be faster.

There is still further scope for further clean ups in the directory
code, and readdir in particular could do with slimming down a bit.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-20 12:30:04 -05:00
Steven Whitehouse
419c93e0b6 [GFS2] Add gfs2meta filesystem
In order to separate out the filesystem's metadata from "normal"
files and directories, a new filesystem type has been created.
It is called gfs2meta and mounting it gives access to the files
that were previously under .gfs2_admin (well still are until mkfs
is altered, which is next on the adgenda).

Its not currently possible to mount both gfs2 and gfs2meta on the
same block device at the same time. A future patch will allow that
to happen.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-02 16:33:41 -05:00
Steven Whitehouse
b4dc72911d [GFS2] Fix some bugs
Fix a bug I introduced earlier with a kfree() and usage of
a structure in the wrong order. Also try and get the counts
of the journaled data buffers "more correct". Still some work
to do in this area though.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01 17:41:58 -05:00
Steven Whitehouse
c9fd43078f [GFS2] Tidy up mount code.
We no longer lookup ".gfs2_admin" in the root directory in order to
find it, but instead use the inode number given in the superblock.
Both the root directory and the admin directory are now looked up using
the same routine, so the redundant code is removed.

Also, there is no longer a reference to the root inode in the
GFS2 super block. When required this can be retreived via
sb->s_root->d_inode instead.

Assuming that we introduce a metadata filesystem type for GFS, then
this is a first step towards that goal.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01 15:31:02 -05:00
Steven Whitehouse
e317ffcb7c [GFS2] Remove uneeded memory allocation
For every filesystem operation where we need a transaction, we
now make one less memory allocation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01 11:39:37 -05:00
Steven Whitehouse
5c676f6d35 [GFS2] Macros removal in gfs2.h
As suggested by Pekka Enberg <penberg@cs.helsinki.fi>.

The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h
The other macros are gone from gfs2.h as (although not requested
by Pekka Enberg) are a number of included header file which are now
included individually. The inode number comparison function is
now an inline function.

The DT2IF and IF2DT may be addressed in a future patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 17:23:27 -05:00
Steven Whitehouse
116ad29d98 [GFS2] Remove pointless comment from nolock/main.c
As requested by:
Pavel Machek <pavel@suse.cz>

Pavel's other comments will be dealt with in later patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 12:11:18 -05:00
Steven Whitehouse
f382894e88 [GFS2] 80 Column audit of locking modules
Requested by:
Prarit Bhargava <prarit@redhat.com>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 12:07:05 -05:00
Steven Whitehouse
568f4c9659 [GFS2] 80 Column audit of GFS2
Requested by:
Prarit Bhargava <prarit@redhat.com>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 12:00:42 -05:00
Steven Whitehouse
3a8fe9be6c [GFS2] Use BUG_ON() rather then if (...) BUG();
This issue was raised by:
Eric Sesterhenn <snakebyte@gmx.de>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 11:00:37 -05:00
Steven Whitehouse
d92a8d4808 [GFS2] Audit printk and kmalloc
All printk calls now have KERN_ set where required and a couple of
kmalloc(), memset(.., 0, ...) calls changed to kzalloc().

This is in response to comments from:
Pekka Enberg <penberg@cs.helsinki.fi> and
Eric Sesterhenn <snakebyte@gmx.de>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 10:57:14 -05:00
David Teigland
d95cb943f5 [GFS2] Patch to remove stats counters from GFS2 (II)
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-23 10:13:13 +00:00
David Teigland
6a6b3d018f [GFS2] Patch to remove stats gathering from GFS2
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-23 10:11:47 +00:00
David Teigland
8d3b35a4af [DLM] Remove support for range locks (II)
This is the second of two patches removing support for range
locks from the DLM

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-23 10:00:56 +00:00
Steven Whitehouse
91ffd7db71 [GFS2] Missed deletion of debugging code
One line which should have been deleted in the last patch
was missed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-22 16:41:45 +00:00
Steven Whitehouse
13538b8e46 [GFS2] Add list empty test to databuf_lo_add
Heinz had spotted that I'd forgotten to test in databuf_lo_add()
that the data buffer in question hadn't already been added to
the list. This was causing an infinite loop later on in the
"before commit" routine.

This means that GFS2 is now ready to be tested by everybody.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-22 11:15:03 +00:00
Steven Whitehouse
f55ab26a8f [GFS2] Use mutices rather than semaphores
As well as a number of minor bug fixes, this patch changes GFS
to use mutices rather than semaphores. This results in better
information in case there are any locking problems.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-21 12:51:39 +00:00
Steven Whitehouse
5c4e9e0366 [GFS2] Fix a case where we didn't get unstuffing right
There was a bug in the unstuffing logic which caused a crash
under certain circumstances. This is now fixed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-15 12:26:19 +00:00
Steven Whitehouse
61a30dcb58 [GFS2] Fix for lock recursion problem for internal files
Two internal files which are read through the gfs2_internal_read()
routine were already locked when the routine was called and this
do not need locking at the redapages level.

This patch introduces a struct file which is used as a sentinal
so that readpage will only perform locking in the case that the
struct file passed to it is _not_ equal to this sentinal.

Since the comments in the generic kernel code indicate that the
struct file will never be used for anything other than passing
straight through to readpage(), this should be ok.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-15 10:15:18 +00:00
Steven Whitehouse
4dd651adbb [GFS2] Fix the bugs I introduced in the last patch but one
Various endianess changes required in the directory code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-14 15:56:44 +00:00
Steven Whitehouse
d1665e4142 [GFS2] Put back O_DIRECT support
This patch adds back O_DIRECT support with various caveats
attached:

 1. Journaled data can be read via O_DIRECT since its now the
    same on disk format as normal data files.
 2. Journaled data writes with O_DIRECT will be failed sliently
    back to normal writes (should we really do this I wonder or
    should we return an error instead?)
 3. Stuffed files will be failed back to normal buffered I/O
 4. All the usual corner cases (write beyond current end of file,
    write to an unallocated block) will also revert to normal buffered I/O.

The I/O path is slightly odd as reads arrive at the page cache layer
with the lock for the file already held, but writes arrive unlocked.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-14 11:54:42 +00:00
Steven Whitehouse
fc69d0d336 [GFS2] Change ondisk format (hopefully for the last time)
There were one or two fields in structures which didn't get changed
last time back to their gfs1 sizes and alignments. One or two constants
have also changed back to their original values which were missed the
first time.

Its possible that indirect pointer blocks might need to change. If
they don't we'll have to rewrite them all on upgrade due to a change
in the amount of padding that they use.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13 16:21:47 +00:00
Steven Whitehouse
7359a19cc7 [GFS2] Fix for root inode ref count bug
Umount is now working correctly again. The bug was due to
not getting an extra ref count when mounting the fs. We
should have bumped it by two (once for the internal pointer
to the root inode from the super block and once for the
inode hanging off the dcache entry for root).

Also this patch tidys up the code dealing with looking up
and creating inodes. We now pass Linux inodes (with gfs2_inodes
attached) rather than the other way around and this reduces code
duplication in various places.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13 12:27:43 +00:00
Steven Whitehouse
18ec7d5c3f [GFS2] Make journaled data files identical to normal files on disk
This is a very large patch, with a few still to be resolved issues
so you might want to check out the previous head of the tree since
this is known to be unstable. Fixes for the various bugs will be
forthcoming shortly.

This patch removes the special data format which has been used
up till now for journaled data files. Directories still retain the
old format so that they will remain on disk compatible with earlier
releases. As a result you can now do the following with journaled
data files:

 1) mmap them
 2) export them over NFS
 3) convert to/from normal files whenever you want to (the zero length
    restriction is gone)

In addition the level at which GFS' locking is done has changed for all
files (since they all now use the page cache) such that the locking is
done at the page cache level rather than the level of the fs operations.
This should mean that things like loopback mounts and other things which
touch the page cache directly should now work.

Current known issues:

 1. There is a lock mode inversion problem related to the resource
    group hold function which needs to be resolved.
 2. Any significant amount of I/O causes an oops with an offset of hex 320
    (NULL pointer dereference) which appears to be related to a journaled data
    buffer appearing on a list where it shouldn't be.
 3. Direct I/O writes are disabled for the time being (will reappear later)
 4. There is probably a deadlock between the page lock and GFS' locks under
    certain combinations of mmap and fs operation I/O.
 5. Issue relating to ref counting on internally used inodes causes a hang
    on umount (discovered before this patch, and not fixed by it)
 6. One part of the directory metadata is different from GFS1 and will need
    to be resolved before next release.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-08 11:50:51 +00:00
Steven Whitehouse
257f9b4e97 [GFS2] Update truncate function (shrinking partial blocks)
Update the function in GFS2 which deals with truncation of
partial blocks. Some of the code is "borrowed" from ext3
since it appears to give a good model of how to do this
operation. The function is renamed gfs2_block_truncate_page
accordingly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-31 10:00:25 +00:00
Steven Whitehouse
f42faf4fa4 [GFS2] Add gfs2_internal_read()
Add the new external read function. Its temporarily in jdata.c
even though the protoype is in ops_file.h - this will change
shortly. The current implementation will change to a page cache
one when that happens.

In order to effect the above changes, the various internal inodes
now have Linux inodes attached to them. We keep the references to
the Linux inodes, rather than the gfs2_inodes in the super block.

In order to get everything to work correctly I've had to reorder
the init sequence on mount (which I should probably have done
earlier when .gfs2_admin was made visible).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 18:34:10 +00:00
Steven Whitehouse
fd2ee6bb1e [GFS2] Remove unused prototype
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 13:36:53 +00:00
Steven Whitehouse
e13940ba56 [GFS2] Make dir.c independant of jdata.c
Copy & rename various jdata functions into dir.c. The plan
being that directory metadata format will not change although
the journalled data format for "normal" files will change.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 13:31:50 +00:00
Steven Whitehouse
9b124fbb8d [GFS2] Use mpage_readpage() in gfs2_readpage()
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 11:55:32 +00:00
Steven Whitehouse
2442a098be [GFS2] Bug fix relating to endian conversion in inode.c
A two line fix to get endian conversion correct.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 11:49:32 +00:00
Steven Whitehouse
4ff14670ee [GFS2] Rename get_block and make it extern
This renames get_block to gfs2_get_block and makes it accessible
from outside ops_address.c.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 09:39:10 +00:00
Steven Whitehouse
b381beadee [GFS2] Remove unused file resize.c
The code in this file is no longer used as the rindex file is
now accessible to userspace, so can be read/written directly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 08:47:14 +00:00
Steven Whitehouse
aa6a85a971 [GFS2] Remove pointless argument relating to truncate
For some reason a function pointer was being passed through
the truncate code which only ever took one value. This removes
the function pointer and replaces it with a single call to
the function in question.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-24 10:37:06 +00:00
Steven Whitehouse
a98ab2204f [GFS2] Rename gfs2_meta_pin to gfs2_pin
Since we'll need to pin data if we are going to journal it, then
I'm renaming this function to make it less confusing. It might also
be worth moving it into lops.c since there are no users outside that
file.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 13:38:44 +00:00
Steven Whitehouse
4f3df04137 [GFS2] Change memory allocations to GFP_NOFS
I'd like to be rid of these memory allocations entirely so far as is
possible. For the moment though, mark them GFP_NOFS to make them less
harmful.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 13:20:16 +00:00
Steven Whitehouse
64fb4eb7d4 [GFS2] Remove gfs2_databuf in favour of gfs2_bufdata structure
Removing the gfs2_databuf structure and using gfs2_bufdata instead
is a step towards allowing journaling of data without requiring the
metadata header on each journaled block. The idea is to merge the
code paths for ordered data with that of journaled data, with the
log operations in lops.c tacking account of the different types of
buffers as they are presented to it. Largely the code path for
metadata will be similar too, but obviously through a different set
of log operations.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 13:14:40 +00:00
Steven Whitehouse
586dfdaaf3 [GFS2] Make the new argument to gfs2_trans_add_bh() actually do something
Passes the flag through to ensure that the correct log operations are
invoked when the flag is set.

Signed-off-by: Steven Whitehouse: <swhiteho@redhat.com>
2006-01-18 11:32:00 +00:00
Steven Whitehouse
d4e9c4c3bf [GFS2] Add an additional argument to gfs2_trans_add_bh()
This adds an extra argument to gfs2_trans_add_bh() to indicate whether the
bh being added to the transaction is metadata or data. Its currently unused
since all existing callers set it to 1 (metadata) but following patches will
make use of it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 11:19:28 +00:00
Steven Whitehouse
b96ca4fa4e [GFS2] Update init_dinode() to reduce stack usage
We no longer allocate a dinode on the stack in init_dinode()
and we no longer use gfs2_dinode_out (eliminating one copy) and
gfs2_meta_header_in (eliminating another copy). The meta_header_in
fucntion is now no longer referenced from outside gfs2_ondisk.c, so
make it static.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 10:57:10 +00:00
Steven Whitehouse
3bd7662c4d [GFS2] Remove unused code from ondisk.c/gfs2_ondisk.h
Removal of unused conversion functions from ondisk.c and
gfs2_ondisk.h

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 10:40:17 +00:00
Steven Whitehouse
666a2c534c [GFS2] Remove unused code from various files
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 10:29:04 +00:00
David Teigland
c73530a1f9 [GFS2] Remove remains of the GFS2 identify ioctl()
We don't need this ioctl, we can use stat() to gain the same
information.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18 10:11:51 +00:00
David Teigland
5ddec5b3d7 [GFS2] Only two args for kobject_uevent() in locking/dlm/mount.c
Update the dlm interface module to take account of the recently
removed third argument to kobject_uevent()

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
2006-01-18 09:34:14 +00:00
David Teigland
869d81df53 [GFS2] An update of the GFS2 lock modules
This brings the lock modules uptodate and removes the stray
.mod.c file which accidently got included in the last check in.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-17 08:47:12 +00:00
David Teigland
a8f2d64728 [GFS2] Fix typo in GFS2 Makefile
A two line fix to the GFS2 Makefile.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
2006-01-17 08:36:49 +00:00
David Teigland
29b7998d88 [GFS2] The lock modules for GFS2
This patch contains the pluggable locking modules
for GFS2.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-16 16:52:38 +00:00
David Teigland
b3b94faa5f [GFS2] The core of GFS2
This patch contains all the core files for GFS2.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-16 16:50:04 +00:00