Commit Graph

2370 Commits

Author SHA1 Message Date
Jens Axboe
3e7ee3e7b3 [PATCH] splice: fix page stealing LRU handling.
Originally from Nick Piggin, just adapted to the newer branch.

You can't check PageLRU without holding zone->lru_lock.  The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:11:04 +02:00
Jens Axboe
ad8d6f0a78 [PATCH] splice: page stealing needs to wait_on_page_writeback()
Thanks to Andrew for the good explanation of why this is so. akpm writes:

If a page is under writeback and we remove it from pagecache, it's still
going to get written to disk.  But the VFS no longer knows about that page,
nor that this page is about to modify disk blocks.

So there might be scenarios in which those
blocks-which-are-about-to-be-written-to get reused for something else.
When writeback completes, it'll scribble on those blocks.

This won't happen in ext2/ext3-style filesystems in normal mode because the
page has buffers and try_to_release_page() will fail.

But ext2 in nobh mode doesn't attach buffers at all - it just sticks the
page in a BIO, finds some new blocks, points the BIO at those blocks and
lets it rip.

While that write IO's in flight, someone could truncate the file.  Truncate
won't block on the writeout because the page isn't in pagecache any more.
So truncate will the free the blocks from the file under the page's feet.
Then something else can reallocate those blocks.  Then write data to them.

Now, the original write completes, corrupting the filesystem.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:10:32 +02:00
Jens Axboe
059a8f3734 [PATCH] splice: export generic_splice_sendpage
Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL
to right below the functions.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:06:05 +02:00
Jens Axboe
b2b39fa478 [PATCH] splice: add a SPLICE_F_MORE flag
This lets userspace indicate whether more data will be coming in a
subsequent splice call.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:41 +02:00
Jens Axboe
83f9135bdd [PATCH] splice: add comments documenting more of the code
Hopefully this will make Andrew a little more happy.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:09 +02:00
Jens Axboe
4f6f0bd2ff [PATCH] splice: improve writeback and clean up page stealing
By cleaning up the writeback logic (killing write_one_page() and the manual
set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
just keep it local in pipe_to_file().

This also adds dirty page balancing logic and O_SYNC handling.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:46 +02:00
Jens Axboe
53cd9ae886 [PATCH] splice: fix shadow[] filling logic
Clear the entire range, and don't increment pidx or we keep filling
the same position again and again.

Thanks to KAMEZAWA Hiroyuki.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:21 +02:00
Linus Torvalds
29e350944f splice: add SPLICE_F_NONBLOCK flag
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 12:46:35 -07:00
Linus Torvalds
547a77ae62 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix typo in earlier cifs_unlink change and protect one
  [CIFS] Incorrect signature sent on SMB Read
  [CIFS] Fix unlink oops when indirectly called in rename error path
  [CIFS] Fix two remaining coverity scan tool warnings.
  [CIFS] Set correct lock type on new posix unlock call
  [CIFS] Upate cifs change log
  [CIFS] Fix slow oplock break response when mounts to different
  [CIFS] Workaround various server bugs found in testing at connectathon
  [CIFS] Allow fallback for setting file size to Procom SMB server when
  [CIFS] Make POSIX CIFS Extensions SetFSInfo match exactly what we want
  [CIFS] Move noisy debug message (triggerred by some older servers) from
  [CIFS] Use correct pid on new cifs posix byte range lock call
  [CIFS] Add posix (advisory) byte range locking support to cifs client
  [CIFS] CIFS readdir perf optimizations part 1
  [CIFS] Free small buffers earlier so we exceed the cifs
  [CIFS] Fix large (ie over 64K for MaxCIFSBufSize) buffer case for wrapping
  [CIFS] Convert remaining places in fs/cifs from
  [CIFS] SessionSetup cleanup part 2
  [CIFS] fix compile error (typo) and warning in cifssmb.c
  [CIFS] Cleanup NTLMSSP session setup handling
2006-03-31 21:27:53 -08:00
Steve French
06bcfedd05 [CIFS] Fix typo in earlier cifs_unlink change and protect one
extra path.

Since cifs_unlink can also be called from rename path and there
was one report of oops am making the extra check for null inode.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 22:43:50 +00:00
Steve French
e9917a000f [CIFS] Incorrect signature sent on SMB Read
Fixes Samba bug 3621 and kernel.org bug 6147

For servers which require SMB/CIFS packet signing, we were sending the
wrong signature (all zeros) on SMB Read request.  The new cifs routine
to do signatures across an iovec was not complete - and SMB Read, unlike
the new SMBWrite2, did not fall back to the older routine (ie use
SendReceive vs. the more efficient SendReceive2 ie used the older
cifs_sign_smb vs. the disabled  cifs_sign_smb2) for calculating signatures.

This finishes up cifs_sign_smb2/cifs_calc_signature2 so that the callers
of SendReceive2 can get SMB/CIFS packet signatures.

Now that cifs_sign_smb2 is supported, we could start using it in
the write path but this smaller fix does not include the change
to use SMBWrite2 when signatures are required (which when enabled
will make more Writes more efficient and alloc less memory).
Currently Write2 is only used when signatures are not
required at the moment but after more testing we will enable
that as well).

Thanks to James Slepicka and Sam Flory for initial investigation.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 21:22:00 +00:00
Jes Sorensen
30c14e40ed [PATCH] avoid unaligned access when accessing poll stack
Commit 70674f95c0:

  [PATCH] Optimize select/poll by putting small data sets on the stack

resulted in the poll stack being 4-byte aligned on 64-bit architectures,
causing misaligned accesses to elements in the array.

This patch fixes it by declaring the stack in terms of 'long' instead
of 'char'.

Force alignment of poll and select stacks to long to avoid unaligned
access on 64 bit architectures.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:30:48 -08:00
Adrian Bunk
a244e1698a [PATCH] fs/namei.c: make lookup_hash() static
As announced, lookup_hash() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
Eric W. Biederman
3e7e241f8c [PATCH] dcache: Add helper d_hash_and_lookup
It is very common to hash a dentry and then to call lookup.  If we take fs
specific hash functions into account the full hash logic can get ugly.
Further full_name_hash as an inline function is almost 100 bytes on x86 so
having a non-inline choice in some cases can measurably decrease code size.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Herbert Poetzl
e4e5d3fc80 [PATCH] cleanup in proc_check_chroot()
proc_check_chroot() does the check in a very unintuitive way (keeping a
copy of the argument, then modifying the argument), and has uncommented
sideeffects.

Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Trond Myklebust
993dfa8776 [PATCH] fs/locks.c: Fix sys_flock() race
sys_flock() currently has a race which can result in a double free in the
multi-thread case.

Thread 1			Thread 2

sys_flock(file, LOCK_EX)
				sys_flock(file, LOCK_UN)

If Thread 2 removes the lock from inode->i_lock before Thread 1 tests for
list_empty(&lock->fl_link) at the end of sys_flock, then both threads will
end up calling locks_free_lock for the same lock.

Fix is to make flock_lock_file() do the same as posix_lock_file(), namely
to make a copy of the request, so that the caller can always free the lock.

This also has the side-effect of fixing up a reference problem in the
lockd handling of flock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
Amy Griffis
7a2bd3f7ef [PATCH] inotify: IN_DELETE events missing
IN_DELETE events are no longer generated for the removal of a file from a
watched directory.

This seems to be a result of clearing DCACHE_INOTIFY_PARENT_WATCHED in
d_delete() directly before calling fsnotify_nameremove().

Assuming the flag doesn't need to be cleared before dentry_iput(), this
should do the trick.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Acked-by: Robert Love <rml@novell.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
OGAWA Hirofumi
094e320d76 [PATCH] fat: kill reserved names
Since these names on old MSDOS is used as device, so, current fat driver
doesn't allow a user to create those names.  But many OSes and even Windows
can create those names actually, now.

This patch removes the reserved name check.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
Joe Korty
68eef3b479 [PATCH] Simplify proc/devices and fix early termination regression
Make baby-simple the code for /proc/devices.  Based on the proven design
for /proc/interrupts.

This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:

    # dd if=/proc/devices bs=1
    Character devices:
      1 mem
    27+0 records in
    27+0 records out

This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.

[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Miklos Szeredi
5ce29646eb [PATCH] locks: don't panic
Don't panic!  Just BUG_ON().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:52 -08:00
Al Viro
347f217cec [PATCH] uml: __user annotations
__user annotations (hppfs)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:51 -08:00
Steve French
f1682e94a3 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 15:43:35 +00:00
Jeff Garzik
a0f0678025 [PATCH] splice exports
Woe be unto he who builds their filesystems as modules.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
[ Obscure quote from the infamous geek bible? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 22:16:24 -08:00
Steve French
6910ab30a2 [CIFS] Fix unlink oops when indirectly called in rename error path
under heavy stress.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 03:37:08 +00:00
Steve French
d62e54abca Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 03:35:56 +00:00
Jens Axboe
5abc97aa25 [PATCH] splice: add support for SPLICE_F_MOVE flag
This enables the caller to migrate pages from one address space page
cache to another.  In buzz word marketing, you can do zero-copy file
copies!

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Jens Axboe
5274f052e7 [PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Linus Torvalds
76babde121 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (67 commits)
  [PATCH] powerpc: Remove oprofile spinlock backtrace code
  [PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
  [PATCH] powerpc: Add oprofile calltrace support
  [PATCH] for_each_possible_cpu: ppc
  [PATCH] for_each_possible_cpu: powerpc
  [PATCH] lock PTE before updating it in 440/BookE page fault handler
  [PATCH] powerpc: Kill _machine and hard-coded platform numbers
  ppc: Fix compile error in arch/ppc/lib/strcase.c
  [PATCH] git-powerpc: WARN was a dumb idea
  [PATCH] powerpc: a couple of trivial compile warning fixes
  powerpc: remove OCP references
  powerpc: Make uImage default build output for MPC8540 ADS
  powerpc: move math-emu over to arch/powerpc
  powerpc: use memparse() for mem= command line parsing
  ppc: fix strncasecmp prototype
  [PATCH] powerpc: make ISA floppies work again
  [PATCH] powerpc: Fix some initcall return values
  [PATCH] powerpc: Workaround for pSeries RTAS bug
  [PATCH] spufs: fix __init/__exit annotations
  [PATCH] powerpc: add hvc backend for rtas
  ...
2006-03-29 11:28:30 -08:00
Linus Torvalds
e71ac6032e Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
  [XFS] Cleanup in XFS after recent get_block_t interface tweaks.
  [XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
  [XFS] A change to inode chunk allocation to try allocating the new chunk
  Fixes a regression from the recent "remove ->get_blocks() support"
  [XFS] Fix compiler warning and small code inconsistencies in compat
  [XFS] We really suck at spulling.  Thanks to Chris Pascoe for fixing all
2006-03-29 08:55:36 -08:00
Oleg Nesterov
aa1757f90b [PATCH] convert sighand_cache to use SLAB_DESTROY_BY_RCU
This patch borrows a clever Hugh's 'struct anon_vma' trick.

Without tasklist_lock held we can't trust task->sighand until we locked it
and re-checked that it is still the same.

But this means we don't need to defer 'kmem_cache_free(sighand)'.  We can
return the memory to slab immediately, all we need is to be sure that
sighand->siglock can't dissapear inside rcu protected section.

To do so we need to initialize ->siglock inside ctor function,
SLAB_DESTROY_BY_RCU does the rest.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:42 -08:00
Oleg Nesterov
8fafabd86f [PATCH] remove add_parent()'s parent argument
add_parent(p, parent) is always called with parent == p->parent, and it makes
no sense to do it differently.  This patch removes this argument.

No changes in affected .o files.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:41 -08:00
Eric W. Biederman
d73d65293e [PATCH] pidhash: kill switch_exec_pids
switch_exec_pids is only called from de_thread by way of exec, and it is
only called when we are exec'ing from a non thread group leader.

Currently switch_exec_pids gives the leader the pid of the thread and
unhashes and rehashes all of the process groups.  The leader is already in
the EXIT_DEAD state so no one cares about it's pids.  The only concern for
the leader is that __unhash_process called from release_task will function
correctly.  If we don't touch the leader at all we know that
__unhash_process will work fine so there is no need to touch the leader.

For the task becomming the thread group leader, we just need to give it the
pid of the old thread group leader, add it to the task list, and attach it
to the session and the process group of the thread group.

Currently de_thread is also adding the task to the task list which is just
silly.

Currently the only leader of __detach_pid besides detach_pid is
switch_exec_pids because of the ugly extra work that was being
performed.

So this patch removes switch_exec_pids because it is doing too much, it is
creating an unnecessary special case in pid.c, duing work duplicated in
de_thread, and generally obscuring what it is going on.

The necessary work is added to de_thread, and it seems to be a little
clearer there what is going on.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Oleg Nesterov
1434261c07 [PATCH] simplify exec from init's subthread
I think it is enough to take tasklist_lock for reading while changing
child_reaper:

	Reparenting needs write_lock(tasklist_lock)

	Only one thread in a thread group can do exec()

	sighand->siglock garantees that get_signal_to_deliver()
	will not see a stale value of child_reaper.

This means that we can change child_reaper earlier, without calling
zap_other_threads() twice.

"child_reaper = current" is a NOOP when init does exec from main thread, we
don't care.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Eric W. Biederman
fef23e7fbb [PATCH] exec: allow init to exec from any thread.
After looking at the problem of init calling exec some more I figured out
an easy way to make the code work.

The actual symptom without out this patch is that all threads will die
except pid == 1, and the thread calling exec.  The thread calling exec will
wait forever for pid == 1 to die.

Since pid == 1 does not install a handler for SIGKILL it will never die.

This modifies the tests for init from current->pid == 1 to the equivalent
current == child_reaper.  And then it causes exec in the ugly case to
modify child_reaper.

The only weird symptom is that you wind up with an init process that
doesn't have the oldest start time on the box.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Paul Mackerras
bac30d1a78 Merge ../linux-2.6 2006-03-29 13:24:50 +11:00
Nathan Scott
c25366680b [XFS] Cleanup in XFS after recent get_block_t interface tweaks.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 10:44:40 +10:00
Mandy Kirkconnell
0b7e56a450 [XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
SGI-PV: 951415
SGI-Modid: xfs-linux-melb:xfs-kern:208490a

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 09:53:03 +10:00
Glen Overby
3ccb8b5f65 [XFS] A change to inode chunk allocation to try allocating the new chunk
contiguous with the most recently allocated chunk.  On a striped
filesystem, this will fill a stripe unit with inodes before allocating new
inodes in another stripe unit.

SGI-PV: 951416
SGI-Modid: xfs-linux-melb:xfs-kern:208488a

Signed-off-by: Glen Overby <overby@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 09:52:28 +10:00
Nathan Scott
3c674e7423 Fixes a regression from the recent "remove ->get_blocks() support"
change.  inode->i_blkbits should be used when making a get_block_t
request of a filesystem instead of dio->blkbits, as that does not
indicate the filesystem block size all the time (depends on request
alignment - see start of __blockdev_direct_IO).

Signed-off-by: Nathan Scott <nathans@sgi.com>
Acked-by: Badari Pulavarty <pbadari@us.ibm.com>
2006-03-29 09:26:15 +10:00
Nathan Scott
e0edd5962b [XFS] Fix compiler warning and small code inconsistencies in compat
ioctl32 land.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25590a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 08:55:47 +10:00
Nathan Scott
c41564b5af [XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all
these typos.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 08:55:14 +10:00
Alexey Dobriyan
7f927fcc2f [PATCH] Typo fixes
Fix a lot of typos.  Eyeballed by jmc@ in OpenBSD.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:08 -08:00
Arjan van de Ven
4b6f5d20b0 [PATCH] Make most file operations structs in fs/ const
This is a conversion to make the various file_operations structs in fs/
const.  Basically a regexp job, with a few manual fixups

The goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:06 -08:00
Arjan van de Ven
99ac48f54a [PATCH] mark f_ops const in the inode
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
KAMEZAWA Hiroyuki
0a94502277 [PATCH] for_each_possible_cpu: fixes for generic part
replaces for_each_cpu with for_each_possible_cpu().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
Vadim Lobanov
68c3431ae2 [PATCH] Fold select_bits_alloc/free into caller code.
Remove an unnecessary level of indirection in allocating and freeing select
bits, as per the select_bits_alloc() and select_bits_free() functions.
Both select.c and compat.c are updated.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Eric Dumazet
e4a1f129f9 [PATCH] use fget_light() in select/poll
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Andi Kleen
70674f95c0 [PATCH] Optimize select/poll by putting small data sets on the stack
Optimize select and poll by a using stack space for small fd sets

This brings back an old optimization from Linux 2.0.  Using the stack is
faster than kmalloc.  On a Intel P4 system it speeds up a select of a
single pty fd by about 13% (~4000 cycles -> ~3500)

It also saves memory because a daemon hanging in select or poll will
usually save one or two less pages.  This can add up - e.g.  if you have 10
daemons blocking in poll/select you save 40KB of memory.

I did a patch for this long ago, but it was never applied.  This version is
a reimplementation of the old patch that tries to be less intrusive.  I
only did the minimal changes needed for the stack allocation.

The cut off point before external memory is allocated is currently at
832bytes.  The system calls always allocate this much memory on the stack.

These 832 bytes are divided into 256 bytes frontend data (for the select
bitmaps of the pollfds) and the rest of the space for the wait queues used
by the low level drivers.  There are some extreme cases where this won't
work out for select and it falls back to allocating memory too early -
especially with very sparse large select bitmaps - but the majority of
processes who only have a small number of file descriptors should be ok.
[TBD: 832/256 might not be the best split for select or poll]

I suspect more optimizations might be possible, but they would be more
complicated.  One way would be to cache the select/poll context over
multiple system calls because typically the input values should be similar.
 Problem is when to flush the file descriptors out though.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Adrian Bunk
f5b95ff010 [PATCH] autofs4: proper prototype for autofs4_dentry_release()
Add a proper prototype for autofs4_dentry_release() to autofs_i.h.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:03 -08:00