/*
 * include/linux/buffer_head.h
 *
 * Everything to do with buffer_heads.
 */

#ifndef _LINUX_BUFFER_HEAD_H
#define _LINUX_BUFFER_HEAD_H

#include <linux/types.h>
#include <linux/fs.h>
#include <linux/linkage.h>
#include <linux/pagemap.h>
#include <linux/wait.h>
#include <linux/atomic.h>

#ifdef CONFIG_BLOCK

enum bh_state_bits {
	BH_Uptodate,	/* Contains valid data */
	BH_Dirty,	/* Is dirty */
	BH_Lock,	/* Is locked */
	BH_Req,		/* Has been submitted for I/O */
	BH_Uptodate_Lock,/* Used by the first bh in a page, to serialise
			  * IO completion of other buffers in the page
			  */

	BH_Mapped,	/* Has a disk mapping */
	BH_New,		/* Disk mapping was newly created by get_block */
	BH_Async_Read,	/* Is under end_buffer_async_read I/O */
	BH_Async_Write,	/* Is under end_buffer_async_write I/O */
	BH_Delay,	/* Buffer is not yet allocated on disk */
	BH_Boundary,	/* Block is followed by a discontiguity */
	BH_Write_EIO,	/* I/O error on write */
	BH_Unwritten,	/* Buffer is allocated on disk but not written */
	BH_Quiet,	/* Buffer error printks to be quiet */
	BH_Meta,	/* Buffer contains metadata */
	BH_Prio,	/* Buffer should be submitted with REQ_PRIO */
	BH_Defer_Completion, /* Defer AIO completion to workqueue */

	BH_PrivateStart,/* not a state bit, but the first bit available
			 * for private allocation by other entities
			 */
};

#define MAX_BUF_PER_PAGE (PAGE_SIZE / 512)

struct page;
struct buffer_head;
struct address_space;
typedef void (bh_end_io_t)(struct buffer_head *bh, int uptodate);

/*
 * Historically, a buffer_head was used to map a single block
 * within a page, and of course as the unit of I/O through the
 * filesystem and block layers. Nowadays the basic I/O unit
 * is the bio, and buffer_heads are used for extracting block
 * mappings (via a get_block_t call), for tracking state within
 * a page (via a page_mapping) and for wrapping bio submission
 * for backward compatibility reasons (e.g. submit_bh).
 */
struct buffer_head {
	unsigned long b_state;		/* buffer state bitmap (see above) */
	struct buffer_head *b_this_page;/* circular list of page's buffers */
	struct page *b_page;		/* the page this bh is mapped to */

	sector_t b_blocknr;		/* start block number */
	size_t b_size;			/* size of mapping */
	char *b_data;			/* pointer to data within the page */

	struct block_device *b_bdev;
	bh_end_io_t *b_end_io;		/* I/O completion */
	void *b_private;		/* reserved for b_end_io */
	struct list_head b_assoc_buffers; /* associated with another mapping */
	struct address_space *b_assoc_map;	/* mapping this buffer is
						   associated with */
	atomic_t b_count;		/* users using this buffer_head */
};
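
#if 0	/* illustration only, never compiled as part of this header */
/*
 * The b_this_page field above links all buffers of one page into a ring.
 * The sketch below shows how such a ring can be walked exactly once. It is
 * a userspace approximation: struct demo_ring_bh and both helper functions
 * are hypothetical names invented for this example and model only the two
 * fields the walk needs.
 */
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head (two fields only). */
struct demo_ring_bh {
	struct demo_ring_bh *b_this_page;	/* next buffer on the same page */
	size_t b_size;				/* bytes covered by this buffer */
};

/* Link n (n > 0) buffers into a ring: the last one points back to the first. */
static void demo_link_ring(struct demo_ring_bh *bhs, size_t n, size_t size)
{
	for (size_t i = 0; i < n; i++) {
		bhs[i].b_size = size;
		bhs[i].b_this_page = &bhs[(i + 1) % n];
	}
}

/* Visit every buffer once, starting at head and stopping on return to it. */
static size_t demo_ring_bytes(const struct demo_ring_bh *head)
{
	const struct demo_ring_bh *bh = head;
	size_t total = 0;

	do {
		total += bh->b_size;
		bh = bh->b_this_page;
	} while (bh != head);
	return total;
}
```
#endif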

/*
 * macro tricks to expand the set_buffer_foo(), clear_buffer_foo()
 * and buffer_foo() functions.
 */
#define BUFFER_FNS(bit, name)						\
static __always_inline void set_buffer_##name(struct buffer_head *bh)	\
{									\
	set_bit(BH_##bit, &(bh)->b_state);				\
}									\
static __always_inline void clear_buffer_##name(struct buffer_head *bh)	\
{									\
	clear_bit(BH_##bit, &(bh)->b_state);				\
}									\
static __always_inline int buffer_##name(const struct buffer_head *bh)	\
{									\
	return test_bit(BH_##bit, &(bh)->b_state);			\
}
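
#if 0	/* illustration only, never compiled as part of this header */
/*
 * To make the token pasting above concrete, here is a userspace sketch of
 * the same pattern. Everything prefixed demo_ or DEMO_ is hypothetical, and
 * plain (non-atomic) bit arithmetic stands in for the kernel's set_bit(),
 * clear_bit() and test_bit(), so this shows the naming scheme only, not the
 * concurrency behaviour.
 */
```c
#include <assert.h>

/* Hypothetical bit numbers, mirroring the shape of enum bh_state_bits. */
enum demo_bits { DEMO_Uptodate, DEMO_Dirty };

/* Hypothetical stand-in holding only the state word. */
struct demo_flag_bh {
	unsigned long b_state;
};

/* One invocation emits a setter, a clearer and a tester for one bit. */
#define DEMO_BUFFER_FNS(bit, name)					\
static inline void demo_set_buffer_##name(struct demo_flag_bh *bh)	\
{									\
	bh->b_state |= 1UL << DEMO_##bit;				\
}									\
static inline void demo_clear_buffer_##name(struct demo_flag_bh *bh)	\
{									\
	bh->b_state &= ~(1UL << DEMO_##bit);				\
}									\
static inline int demo_buffer_##name(const struct demo_flag_bh *bh)	\
{									\
	return (bh->b_state >> DEMO_##bit) & 1UL;			\
}

DEMO_BUFFER_FNS(Uptodate, uptodate)
DEMO_BUFFER_FNS(Dirty, dirty)
```
#endif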

/*
 * test_set_buffer_foo() and test_clear_buffer_foo()
 */
#define TAS_BUFFER_FNS(bit, name)					\
static __always_inline int test_set_buffer_##name(struct buffer_head *bh) \
{									\
	return test_and_set_bit(BH_##bit, &(bh)->b_state);		\
}									\
static __always_inline int test_clear_buffer_##name(struct buffer_head *bh) \
{									\
	return test_and_clear_bit(BH_##bit, &(bh)->b_state);		\
}									\

/*
 * Emit the buffer bitops functions. Note that there are also functions
 * of the form "mark_buffer_foo()". These are higher-level functions which
 * do something in addition to setting a b_state bit.
 */
BUFFER_FNS(Uptodate, uptodate)
BUFFER_FNS(Dirty, dirty)
TAS_BUFFER_FNS(Dirty, dirty)
BUFFER_FNS(Lock, locked)
BUFFER_FNS(Req, req)
TAS_BUFFER_FNS(Req, req)
BUFFER_FNS(Mapped, mapped)
BUFFER_FNS(New, new)
BUFFER_FNS(Async_Read, async_read)
BUFFER_FNS(Async_Write, async_write)
BUFFER_FNS(Delay, delay)
BUFFER_FNS(Boundary, boundary)
BUFFER_FNS(Write_EIO, write_io_error)
BUFFER_FNS(Unwritten, unwritten)
BUFFER_FNS(Meta, meta)
BUFFER_FNS(Prio, prio)
BUFFER_FNS(Defer_Completion, defer_completion)

#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
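
#if 0	/* illustration only, never compiled as part of this header */
/*
 * bh_offset() keeps only the low bits of the data pointer, which gives the
 * byte offset of the buffer inside its page. The sketch below reproduces
 * the arithmetic in userspace. DEMO_PAGE_SIZE, DEMO_PAGE_MASK and
 * demo_offset_in_page() are hypothetical names, and the 4 KiB page size is
 * an assumption made for the example; the real value depends on the
 * architecture and configuration.
 */
```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, mask clears the in-page bits. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Offset of an address within its page: mask off everything above it. */
static unsigned long demo_offset_in_page(uintptr_t data)
{
	return (unsigned long)data & ~DEMO_PAGE_MASK;
}
```
#endif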

/* If we *know* page->private refers to buffer_heads */
#define page_buffers(page)					\
	({							\
		BUG_ON(!PagePrivate(page));			\
		((struct buffer_head *)page_private(page));	\
	})
#define page_has_buffers(page)	PagePrivate(page)

void buffer_check_dirty_writeback(struct page *page,
				  bool *dirty, bool *writeback);

/*
 * Declarations
 */

void mark_buffer_dirty(struct buffer_head *bh);
void mark_buffer_write_io_error(struct buffer_head *bh);
void init_buffer(struct buffer_head *, bh_end_io_t *, void *);
void touch_buffer(struct buffer_head *bh);
void set_bh_page(struct buffer_head *bh,
		struct page *page, unsigned long offset);
int try_to_free_buffers(struct page *);
struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
		int retry);
void create_empty_buffers(struct page *, unsigned long,
			unsigned long b_state);
void end_buffer_read_sync(struct buffer_head *bh, int uptodate);
void end_buffer_write_sync(struct buffer_head *bh, int uptodate);
void end_buffer_async_write(struct buffer_head *bh, int uptodate);

/* Things to do with buffers at mapping->private_list */
void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode);
int inode_has_buffers(struct inode *);
void invalidate_inode_buffers(struct inode *);
int remove_inode_buffers(struct inode *inode);
int sync_mapping_buffers(struct address_space *mapping);
void clean_bdev_aliases(struct block_device *bdev, sector_t block,
			sector_t len);
static inline void clean_bdev_bh_alias(struct buffer_head *bh)
{
	clean_bdev_aliases(bh->b_bdev, bh->b_blocknr, 1);
}

void mark_buffer_async_write(struct buffer_head *bh);
void __wait_on_buffer(struct buffer_head *);
wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
			unsigned size);
struct buffer_head *__getblk_gfp(struct block_device *bdev, sector_t block,
				  unsigned size, gfp_t gfp);
void __brelse(struct buffer_head *);
void __bforget(struct buffer_head *);
void __breadahead(struct block_device *, sector_t block, unsigned int size);
struct buffer_head *__bread_gfp(struct block_device *,
				sector_t block, unsigned size, gfp_t gfp);
void invalidate_bh_lrus(void);
struct buffer_head *alloc_buffer_head(gfp_t gfp_flags);
void free_buffer_head(struct buffer_head * bh);
void unlock_buffer(struct buffer_head *bh);
void __lock_buffer(struct buffer_head *bh);
void ll_rw_block(int, int, int, struct buffer_head * bh[]);
int sync_dirty_buffer(struct buffer_head *bh);
int __sync_dirty_buffer(struct buffer_head *bh, int op_flags);
void write_dirty_buffer(struct buffer_head *bh, int op_flags);
int submit_bh(int, int, struct buffer_head *);
void write_boundary_block(struct block_device *bdev,
			sector_t bblock, unsigned blocksize);
int bh_uptodate_or_lock(struct buffer_head *bh);
int bh_submit_read(struct buffer_head *bh);
loff_t page_cache_seek_hole_data(struct inode *inode, loff_t offset,
				 loff_t length, int whence);

extern int buffer_heads_over_limit;

/*
 * Generic address_space_operations implementations for buffer_head-backed
 * address_spaces.
 */
void block_invalidatepage(struct page *page, unsigned int offset,
			  unsigned int length);
int block_write_full_page(struct page *page, get_block_t *get_block,
				struct writeback_control *wbc);
int __block_write_full_page(struct inode *inode, struct page *page,
			get_block_t *get_block, struct writeback_control *wbc,
			bh_end_io_t *handler);
int block_read_full_page(struct page*, get_block_t*);
int block_is_partially_uptodate(struct page *page, unsigned long from,
				unsigned long count);
int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
		unsigned flags, struct page **pagep, get_block_t *get_block);
int __block_write_begin(struct page *page, loff_t pos, unsigned len,
		get_block_t *get_block);
int block_write_end(struct file *, struct address_space *,
				loff_t, unsigned, unsigned,
				struct page *, void *);
int generic_write_end(struct file *, struct address_space *,
				loff_t, unsigned, unsigned,
				struct page *, void *);
void page_zero_new_buffers(struct page *page, unsigned from, unsigned to);
int cont_write_begin(struct file *, struct address_space *, loff_t,
			unsigned, unsigned, struct page **, void **,
			get_block_t *, loff_t *);
int generic_cont_expand_simple(struct inode *inode, loff_t size);
int block_commit_write(struct page *page, unsigned from, unsigned to);
int block_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf,
				get_block_t get_block);
/* Convert errno to return value from ->page_mkwrite() call */
static inline int block_page_mkwrite_return(int err)
{
	if (err == 0)
		return VM_FAULT_LOCKED;
	if (err == -EFAULT || err == -EAGAIN)
		return VM_FAULT_NOPAGE;
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	/* -ENOSPC, -EDQUOT, -EIO ... */
	return VM_FAULT_SIGBUS;
}
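
#if 0	/* illustration only, never compiled as part of this header */
/*
 * The errno translation above can be exercised outside the kernel with a
 * small table-style sketch. enum demo_fault and demo_mkwrite_return() are
 * hypothetical names; the DEMO_FAULT_* values are placeholders and are not
 * the kernel's VM_FAULT_* constants, which are defined elsewhere in the
 * kernel. Only the mapping of error codes to outcomes is reproduced.
 */
```c
#include <assert.h>
#include <errno.h>

/* Placeholder outcomes standing in for the VM_FAULT_* codes. */
enum demo_fault {
	DEMO_FAULT_LOCKED,
	DEMO_FAULT_NOPAGE,
	DEMO_FAULT_OOM,
	DEMO_FAULT_SIGBUS
};

/* Same decision order as block_page_mkwrite_return(), written as a switch. */
static enum demo_fault demo_mkwrite_return(int err)
{
	switch (err) {
	case 0:
		return DEMO_FAULT_LOCKED;
	case -EFAULT:
	case -EAGAIN:
		return DEMO_FAULT_NOPAGE;
	case -ENOMEM:
		return DEMO_FAULT_OOM;
	default:
		/* -ENOSPC, -EDQUOT, -EIO and anything else */
		return DEMO_FAULT_SIGBUS;
	}
}
```
#endif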
sector_t generic_block_bmap(struct address_space *, sector_t, get_block_t *);
int block_truncate_page(struct address_space *, loff_t, get_block_t *);
int nobh_write_begin(struct address_space *, loff_t, unsigned, unsigned,
				struct page **, void **, get_block_t*);
int nobh_write_end(struct file *, struct address_space *,
				loff_t, unsigned, unsigned,
				struct page *, void *);
int nobh_truncate_page(struct address_space *, loff_t, get_block_t *);
int nobh_writepage(struct page *page, get_block_t *get_block,
			struct writeback_control *wbc);

void buffer_init(void);

/*
|
|
|
|
* inline definitions
|
|
|
|
*/
|
|
|
|
|
|
|
|
static inline void attach_page_buffers(struct page *page,
|
|
|
|
struct buffer_head *head)
|
|
|
|
{
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 19:29:47 +07:00
|
|
|
	get_page(page);
|
2005-04-17 05:20:36 +07:00
|
|
|
	SetPagePrivate(page);
|
[PATCH] mm: split page table lock
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.
This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock. (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)
In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.
Splitting the lock is not quite for free: another cacheline access. Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS. But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.
There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 08:16:40 +07:00
|
|
|
	set_page_private(page, (unsigned long)head);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline void get_bh(struct buffer_head *bh)
|
|
|
|
{
|
|
|
|
	atomic_inc(&bh->b_count);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void put_bh(struct buffer_head *bh)
|
|
|
|
{
|
2014-03-18 00:06:10 +07:00
|
|
|
	smp_mb__before_atomic();
|
2005-04-17 05:20:36 +07:00
|
|
|
	atomic_dec(&bh->b_count);
|
|
|
|
}
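The reference counting done by get_bh() and put_bh() can be sketched in userspace with C11 atomics. This is only an illustrative model, not kernel code: struct bh_model and the model_* helpers are hypothetical names, and the default sequentially consistent ordering of the C11 operations stands in for the smp_mb__before_atomic() plus atomic_dec() pair, which it is assumed to be at least as strong as.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for struct buffer_head: only the count. */
struct bh_model {
	atomic_int b_count;
};

/* Take a reference, mirroring the atomic_inc() in get_bh(). */
static inline void model_get_bh(struct bh_model *bh)
{
	atomic_fetch_add(&bh->b_count, 1);
}

/*
 * Drop a reference. The seq_cst read-modify-write orders earlier
 * accesses before the decrement, which is the job the explicit
 * barrier does in put_bh().
 */
static inline void model_put_bh(struct bh_model *bh)
{
	atomic_fetch_sub(&bh->b_count, 1);
}

static inline int model_bh_count(struct bh_model *bh)
{
	return atomic_load(&bh->b_count);
}

/* Take two references, drop one, and return the remaining count. */
static inline int model_refcount_demo(void)
{
	struct bh_model bh;

	atomic_init(&bh.b_count, 0);
	model_get_bh(&bh);
	model_get_bh(&bh);
	model_put_bh(&bh);
	assert(model_bh_count(&bh) == 1);
	return model_bh_count(&bh);
}
```

Like put_bh(), the model does not check for underflow; in this header the NULL check is done by the brelse() wrapper and any further checking is left to the out-of-line __brelse().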
|
|
|
|
|
|
|
|
static inline void brelse(struct buffer_head *bh)
|
|
|
|
{
|
|
|
|
	if (bh)
|
|
|
|
		__brelse(bh);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void bforget(struct buffer_head *bh)
|
|
|
|
{
|
|
|
|
	if (bh)
|
|
|
|
		__bforget(bh);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline struct buffer_head *
|
|
|
|
sb_bread(struct super_block *sb, sector_t block)
|
|
|
|
{
|
2014-09-05 09:04:42 +07:00
|
|
|
	return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline struct buffer_head *
|
|
|
|
sb_bread_unmovable(struct super_block *sb, sector_t block)
|
|
|
|
{
|
|
|
|
	return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, 0);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline void
|
|
|
|
sb_breadahead(struct super_block *sb, sector_t block)
|
|
|
|
{
|
|
|
|
	__breadahead(sb->s_bdev, block, sb->s_blocksize);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline struct buffer_head *
|
|
|
|
sb_getblk(struct super_block *sb, sector_t block)
|
|
|
|
{
|
2014-09-05 09:04:42 +07:00
|
|
|
	return __getblk_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
2015-07-02 12:32:44 +07:00
|
|
|
|
|
|
|
static inline struct buffer_head *
|
|
|
|
sb_getblk_gfp(struct super_block *sb, sector_t block, gfp_t gfp)
|
|
|
|
{
|
|
|
|
	return __getblk_gfp(sb->s_bdev, block, sb->s_blocksize, gfp);
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
static inline struct buffer_head *
|
|
|
|
sb_find_get_block(struct super_block *sb, sector_t block)
|
|
|
|
{
|
|
|
|
	return __find_get_block(sb->s_bdev, block, sb->s_blocksize);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void
|
|
|
|
map_bh(struct buffer_head *bh, struct super_block *sb, sector_t block)
|
|
|
|
{
|
|
|
|
	set_buffer_mapped(bh);
|
|
|
|
	bh->b_bdev = sb->s_bdev;
|
|
|
|
	bh->b_blocknr = block;
|
2006-03-26 16:38:00 +07:00
|
|
|
	bh->b_size = sb->s_blocksize;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline void wait_on_buffer(struct buffer_head *bh)
|
|
|
|
{
|
|
|
|
	might_sleep();
|
2010-08-10 07:18:42 +07:00
|
|
|
	if (buffer_locked(bh))
|
2005-04-17 05:20:36 +07:00
|
|
|
		__wait_on_buffer(bh);
|
|
|
|
}
|
|
|
|
|
2008-08-02 17:02:13 +07:00
|
|
|
static inline int trylock_buffer(struct buffer_head *bh)
|
|
|
|
{
|
2008-10-19 10:27:00 +07:00
|
|
|
	return likely(!test_and_set_bit_lock(BH_Lock, &bh->b_state));
|
2008-08-02 17:02:13 +07:00
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
static inline void lock_buffer(struct buffer_head *bh)
|
|
|
|
{
|
|
|
|
	might_sleep();
|
2008-08-02 17:02:13 +07:00
|
|
|
	if (!trylock_buffer(bh))
|
2005-04-17 05:20:36 +07:00
|
|
|
		__lock_buffer(bh);
|
|
|
|
}
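The fast path shared by trylock_buffer() and lock_buffer() is an atomic test-and-set of one bit with acquire semantics. A minimal userspace sketch of that idea follows, under stated assumptions: struct bh_state_model, MODEL_BH_LOCK and the model_* helpers are hypothetical, the lock bit is represented as a mask rather than a bit number, and the sleeping slow path (__lock_buffer()) and waiter wakeup are left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mask for the lock bit; the kernel uses a bit number (BH_Lock). */
#define MODEL_BH_LOCK 0x1UL

/* Hypothetical stand-in for the b_state word of a buffer_head. */
struct bh_state_model {
	atomic_ulong b_state;
};

/*
 * Set the lock bit and report whether this caller set it first.
 * Returns true when the lock was acquired, false when it was already held.
 */
static inline bool model_trylock_buffer(struct bh_state_model *bh)
{
	unsigned long old = atomic_fetch_or_explicit(&bh->b_state,
						     MODEL_BH_LOCK,
						     memory_order_acquire);
	return !(old & MODEL_BH_LOCK);
}

/* Clear the lock bit with release ordering; no waiters are woken here. */
static inline void model_unlock_buffer(struct bh_state_model *bh)
{
	atomic_fetch_and_explicit(&bh->b_state, ~MODEL_BH_LOCK,
				  memory_order_release);
}

/* True while the lock bit is set. */
static inline bool model_buffer_locked(struct bh_state_model *bh)
{
	return atomic_load(&bh->b_state) & MODEL_BH_LOCK;
}
```

A caller that gets false from model_trylock_buffer() would have to wait and retry; in the header above that is the point where lock_buffer() falls back to __lock_buffer().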
|
|
|
|
|
2014-09-05 09:04:42 +07:00
|
|
|
static inline struct buffer_head *getblk_unmovable(struct block_device *bdev,
|
|
|
|
						   sector_t block,
|
|
|
|
						   unsigned size)
|
|
|
|
{
|
|
|
|
	return __getblk_gfp(bdev, block, size, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline struct buffer_head *__getblk(struct block_device *bdev,
|
|
|
|
					   sector_t block,
|
|
|
|
					   unsigned size)
|
|
|
|
{
|
|
|
|
	return __getblk_gfp(bdev, block, size, __GFP_MOVABLE);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
 * __bread() - reads a specified block and returns the bh
|
|
|
|
 * @bdev: the block_device to read from
|
|
|
|
 * @block: number of the block to read
|
|
|
|
 * @size: size (in bytes) to read
|
|
|
|
 *
|
|
|
|
 * Reads the specified block and returns the buffer head that contains it.
|
|
|
|
 * The page cache page is allocated from the movable area so that it can be
 * migrated.
|
|
|
|
 * It returns NULL if the block was unreadable.
|
|
|
|
 */
|
|
|
|
static inline struct buffer_head *
|
|
|
|
__bread(struct block_device *bdev, sector_t block, unsigned size)
|
|
|
|
{
|
|
|
|
	return __bread_gfp(bdev, block, size, __GFP_MOVABLE);
|
|
|
|
}
|
|
|
|
|
2006-08-30 01:05:54 +07:00
|
|
|
extern int __set_page_dirty_buffers(struct page *page);
|
2006-10-01 01:45:40 +07:00
|
|
|
|
|
|
|
#else /* CONFIG_BLOCK */
|
|
|
|
|
|
|
|
static inline void buffer_init(void) {}
|
|
|
|
static inline int try_to_free_buffers(struct page *page) { return 1; }
|
|
|
|
static inline int inode_has_buffers(struct inode *inode) { return 0; }
|
|
|
|
static inline void invalidate_inode_buffers(struct inode *inode) {}
|
|
|
|
static inline int remove_inode_buffers(struct inode *inode) { return 1; }
|
|
|
|
static inline int sync_mapping_buffers(struct address_space *mapping) { return 0; }
|
|
|
|
|
|
|
|
#endif /* CONFIG_BLOCK */
|
2005-04-17 05:20:36 +07:00
|
|
|
#endif /* _LINUX_BUFFER_HEAD_H */
|