Commit Graph

22853 Commits

Author SHA1 Message Date
Sage Weil
98702467f8 configfs: remove unnecessary dentry_unhash on rmdir, dir rename
configfs does not have problems with references to unlinked directories.

CC: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:54 -04:00
Sage Weil
f4ff0e25c5 fat: remove unnecessary dentry_unhash on rmdir, dir rename
fat does not have problems with references to unlinked directories.

CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:54 -04:00
Sage Weil
45adfef7d0 hpfs: remove unnecessary dentry_unhash on rmdir, dir rename
Hpfs has no problems with references to unlinked directories.

We leave one dentry_unhash call in place, in hpfs_unlink's strange path
where it tries to truncate a file because the disk is full.  I'm not sure
what the full story is there.

CC: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:54 -04:00
Sage Weil
b80d2c228d minix: remove unnecessary dentry_unhash on rmdir, dir rename
Minix has no issues with references to unlinked directories.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:54 -04:00
Sage Weil
526e7ce552 fuse: remove unnecessary dentry_unhash on rmdir, dir rename
Fuse has no problems with references to unlinked directories.

CC: Miklos Szeredi <miklos@szeredi.hu>
CC: fuse-devel@lists.sourceforge.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
42b850b280 coda: remove unnecessary dentry_unhash on rmdir, dir rename
Coda has no problems with references to unlinked directories.

CC: Jan Harkes <jaharkes@cs.cmu.edu>
CC: coda@cs.cmu.edu
CC: codalist@coda.cs.cmu.edu
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
705da2de95 afs: remove unnecessary dentry_unhash on rmdir, dir rename
afs has no problems with references to unlinked directories.

CC: David Howells <dhowells@redhat.com>
CC: linux-afs@lists.infradead.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
94f9901b6c affs: remove unnecessary dentry_unhash on rmdir, dir rename
affs has no problems with references to unlinked directories.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
86905d6d96 9p: remove unnecessary dentry_unhash on rmdir, dir rename
9p has no problems with references to unlinked directories.

CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Ron Minnich <rminnich@sandia.gov>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: v9fs-developer@lists.sourceforge.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
76cc071a06 ncpfs: fix rename over directory with dangling references
ncpfs does not handle references to unlinked directories (or so it would
seem given the ncp_rmdir check).  Since it is also possible to rename over
an empty directory, perform the same check here.

CC: Petr Vandrovec <petr@vandrovec.name>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
7ce605d93b ncpfs: document dentry_unhash usage
ncpfs returns EBUSY if there are any references to the directory.  The
dentry_unhash call only unhashes the dentry if there are no references.

CC: Petr Vandrovec <petr@vandrovec.name>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:53 -04:00
Sage Weil
55e5b7e022 ecryptfs: remove unnecessary dentry_unhash on rmdir, dir rename
ecryptfs does not have problems with references to unlinked directories.

CC: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
CC: Dustin Kirkland <kirkland@canonical.com>
CC: ecryptfs-devel@lists.launchpad.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
e41a59e055 hostfs: remove unnecessary dentry_unhash on rmdir, dir rename
hostfs does not have problems with references to unlinked directories.

CC: Jeff Dike <jdike@addtoit.com>
CC: Richard Weinberger <richard@nod.at>
CC: user-mode-linux-devel@lists.sourceforge.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
e3911785b8 hfsplus: remove unnecessary dentry_unhash on rmdir, dir rename
hfsplus does not have problems with references to unlinked directories.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
4e82d61b6a hfs: remove unnecessary dentry_unhash on rmdir, dir rename
hfs does not have problems with references to unlinked directories.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
8aaa0f5431 omfs: remove unnecessary dentry_unhash on rmdir, dir rneame
omfs does not have problems with references to unlinked directories.

CC: Bob Copeland <me@bobcopeland.com>
CC: linux-karma-devel@lists.sourceforge.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
7020739df2 udf: remove unnecessary dentry_unhash from rmdir, dir rename
udf does not have problems with references to unlinked directories.

CC: Jan Kara <jack@suse.cz>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:52 -04:00
Sage Weil
cc350c2764 reiserfs: remove unnecessary dentry_unhash from rmdir, dir rename
Reiserfs does not have problems with references to unlinked directories.

CC: reiserfs-devel@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
87161faae2 ufs: remove unnecessary dentry_unhash from rmdir, dir rename
ufs does not have problems with references to unlinked directories.

CC: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
0e54ec1c3a ubifs: remove unnecessary dentry_unhash from rmdir, dir rename
ubifs does not have problems with references to unlinked directories.

CC: Artem Bityutskiy <dedekind1@gmail.com>
CC: Adrian Hunter <adrian.hunter@nokia.com>
CC: linux-mtd@lists.infradead.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
dfb55de898 nilfs2: remove unnecessary dentry_unhash from rmdir, dir rename
nilfs2 does not have problems with references to unlinked directories.

CC: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
CC: linux-nilfs@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
64370e2f9d logfs: remove unnecessary dentry_unhash from rmdir, dir rename
logfs does not have problems with references to unlinked directories.

CC: Joern Engel <joern@logfs.org>
CC: logfs@logfs.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
44a8e6364e jfs: remove unnecessary dentry_unhash from rmdir, dir rename
jfs does not have problems with references to unlinked directories.

CC: Dave Kleikamp <shaggy@kernel.org>
CC: jfs-discussion@lists.sourceforge.net
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:51 -04:00
Sage Weil
cf0f0536fa jffs2: remove unnecessary dentry_unhash from rmdir, dir rename
jffs2 does not have problems with references to unlinked directories.

CC: David Woodhouse <dwmw2@infradead.org>
CC: linux-mtd@lists.infradead.org
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:50 -04:00
Sage Weil
873ae4d5a8 sysv: remove unnecessary dentry_unhash from rmdir, dir rename
sysv does not have problems with references to unlinked directories.

CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:50 -04:00
Sage Weil
76f0b8d2d2 bfs: remove unnecessary dentry_unhash on dir rename
Bfs does not have problems with references to unlinked directories.

CC: tigran@aivazian.fsnet.co.uk
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 01:02:50 -04:00
Christoph Hellwig
4b4563dc80 fs: cosmetic inode.c cleanups
Move the lock order description after all the includes, remove several
fairly outdated and/or incorrect comments, move Andrea's
copyright/changelog to the top where it belongs, remove the pointless
filename in the top of the file comment, and remove to useless macros.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 09:43:00 -04:00
Andreas Gruenbacher
c642808454 vfs: Improve the bio_add_page() and bio_add_pc_page() descriptions
The descriptions of bio_add_page() and bio_add_pc_page() are slightly
inconsistent; improve them.

Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 09:43:00 -04:00
Andreas Gruenbacher
55b23bde19 xattr: Fix error results for non-existent / invisible attributes
Return -ENODATA when trying to read a user.* attribute which cannot
exist: user space otherwise does not have a reasonable way to
distinguish between non-existent and inaccessible attributes.

Likewise, return -ENODATA when an unprivileged process tries to read a
trusted.* attribute: to unprivileged processes, those attributes are
invisible (listxattr() won't include them).

Related to this bug report: https://bugzilla.redhat.com/660613

Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 09:43:00 -04:00
Christoph Hellwig
aa38572954 fs: pass exact type of data dirties to ->dirty_inode
Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or
anything else, so that the filesystem can track internally if it
needs to push out a transaction for fdatasync or not.

This is just the prototype change with no user for it yet.  I plan
to push large XFS changes for the next merge window, and getting
this trivial infrastructure in this window would help a lot to avoid
tree interdependencies.

Also remove incorrect comments that ->dirty_inode can't block.  That
has been changed a long time ago, and many implementations rely on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 07:04:40 -04:00
Al Viro
d6e9bd256c Lift the check for automount points into do_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 07:03:15 -04:00
Al Viro
dea3937619 Trim excessive arguments of follow_mount_rcu()
... and kill a useless local variable in follow_dotdot_rcu(), while
we are at it - follow_mount_rcu(nd, path, inode) *always* assigned
value to *inode, and always it had been path->dentry->d_inode (aka
nd->path.dentry->d_inode, since it always got &nd->path as the second
argument).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 07:01:49 -04:00
Al Viro
287548e46a split __follow_mount_rcu() into normal and .. cases
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-27 06:51:56 -04:00
Linus Torvalds
bc9bc72e2f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: update email address
  Squashfs: add extra sanity checks at mount time
  Squashfs: add sanity checks to fragment reading at mount time
  Squashfs: add sanity checks to lookup table reading at mount time
  Squashfs: add sanity checks to id reading at mount time
  Squashfs: add sanity checks to xattr reading at mount time
  Squashfs: reverse order of filesystem table reading
  Squashfs: move table allocation into squashfs_read_table()
2011-05-26 17:27:35 -07:00
Timo Warns
3eb8e74ec7 fs/partitions/efi.c: corrupted GUID partition tables can cause kernel oops
The kernel automatically evaluates partition tables of storage devices.
The code for evaluating GUID partitions (in fs/partitions/efi.c) contains
a bug that causes a kernel oops on certain corrupted GUID partition
tables.

This bug has security impacts, because it allows, for example, to
prepare a storage device that crashes a kernel subsystem upon connecting
the device (e.g., a "USB Stick of (Partial) Death").

	crc = efi_crc32((const unsigned char *) (*gpt), le32_to_cpu((*gpt)->header_size));

computes a CRC32 checksum over gpt covering (*gpt)->header_size bytes.
There is no validation of (*gpt)->header_size before the efi_crc32 call.

A corrupted partition table may have large values for (*gpt)->header_size.
 In this case, the CRC32 computation access memory beyond the memory
allocated for gpt, which may cause a kernel heap overflow.

Validate value of GUID partition table header size.

[akpm@linux-foundation.org: fix layout and indenting]
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Olaf Hering
997c136f51 fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages
The balloon driver in a Xen guest frees guest pages and marks them as
mmio.  When the kernel crashes and the crash kernel attempts to read the
oldmem via /proc/vmcore a read from ballooned pages will generate 100%
load in dom0 because Xen asks qemu-dm for the page content.  Since the
reads come in as 8byte requests each ballooned page is tried 512 times.

With this change a hook can be registered which checks wether the given
pfn is really ram.  The hook has to return a value > 0 for ram pages, a
value < 0 on error (because the hypercall is not known) and 0 for non-ram
pages.

This will reduce the time to read /proc/vmcore.  Without this change a
512M guest with 128M crashkernel region needs 200 seconds to read it, with
this change it takes just 2 seconds.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
KOSAKI Motohiro
98bc93e505 proc: fix pagemap_read() error case
Currently, pagemap_read() has three error and/or corner case handling
mistake.

 (1) If ppos parameter is wrong, mm refcount will be leak.
 (2) If count parameter is 0, mm refcount will be leak too.
 (3) If the current task is sleeping in kmalloc() and the system
     is out of memory and oom-killer kill the proc associated task,
     mm_refcount prevent the task free its memory. then system may
     hang up.

<Quote Hugh's explain why we shold call kmalloc() before get_mm()>

  check_mem_permission gets a reference to the mm.  If we
  __get_free_page after check_mem_permission, imagine what happens if the
  system is out of memory, and the mm we're looking at is selected for
  killing by the OOM killer: while we wait in __get_free_page for more
  memory, no memory is freed from the selected mm because it cannot reach
  exit_mmap while we hold that reference.

This patch fixes the above three.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jovi Zhang <bookjovi@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Stephen Wilson <wilsons@start.ca>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
KOSAKI Motohiro
30cd890391 proc: put check_mem_permission after __get_free_page in mem_write
It whould be better if put check_mem_permission after __get_free_page in
mem_write, to be same as function mem_read.

Hugh Dickins explained the reason.

    check_mem_permission gets a reference to the mm.  If we __get_free_page
    after check_mem_permission, imagine what happens if the system is out
    of memory, and the mm we're looking at is selected for killing by the
    OOM killer: while we wait in __get_free_page for more memory, no memory
    is freed from the selected mm because it cannot reach exit_mmap while
    we hold that reference.

Reported-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Stephen Wilson <wilsons@start.ca>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Yuanhan Liu
a4dbf0ec2a proc/stat: use defined macro KMALLOC_MAX_SIZE
There is a macro for the max size kmalloc can allocate, so use it instead
of a hardcoded number.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Mike Frysinger
e130aa70f4 proc: constify status array
No need for this local array to be writable, so mark it const.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Alexey Dobriyan
0a8cb8e341 fs/proc: convert to kstrtoX()
Convert fs/proc/ from strict_strto*() to kstrto*() functions.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Jiri Slaby
57cc083ad9 coredump: add support for exe_file in core name
Now, exe_file is not proc FS dependent, so we can use it to name core
file.  So we add %E pattern for core file name cration which extract path
from mm_struct->exe_file.  Then it converts slashes to exclamation marks
and pastes the result to the core file name itself.

This is useful for environments where binary names are longer than 16
character (the current->comm limitation).  Also where there are binaries
with same name but in a different path.  Further in case the binery itself
changes its current->comm after exec.

So by doing (s/$/#/ -- # is treated as git comment):

  $ sysctl kernel.core_pattern='core.%p.%e.%E'
  $ ln /bin/cat cat45678901234567890
  $ ./cat45678901234567890
  ^Z
  $ rm cat45678901234567890
  $ fg
  ^\Quit (core dumped)
  $ ls core*

we now get:

  core.2434.cat456789012345.!root!cat45678901234567890 (deleted)

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Jiri Slaby
3864601387 mm: extract exe_file handling from procfs
Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc/<pid>/exe.  Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), built this functionality always into the
kernel.

To achieve that move that out of proc FS to the kernel/ where in fact it
should belong.  By doing that we can make dup_mm_exe_file static.  Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Ying Han
456f998ec8 memcg: add the pagefault count into memcg stats
Two new stats in per-memcg memory.stat which tracks the number of page
faults and number of major page faults.

  "pgfault"
  "pgmajfault"

They are different from "pgpgin"/"pgpgout" stat which count number of
pages charged/discharged to the cgroup and have no meaning of reading/
writing page to disk.

It is valuable to track the two stats for both measuring application's
performance as well as the efficiency of the kernel page reclaim path.
Counting pagefaults per process is useful, but we also need the aggregated
value since processes are monitored and controlled in cgroup basis in
memcg.

Functional test: check the total number of pgfault/pgmajfault of all
memcgs and compare with global vmstat value:

  $ cat /proc/vmstat | grep fault
  pgfault 1070751
  pgmajfault 553

  $ cat /dev/cgroup/memory.stat | grep fault
  pgfault 1071138
  pgmajfault 553
  total_pgfault 1071142
  total_pgmajfault 553

  $ cat /dev/cgroup/A/memory.stat | grep fault
  pgfault 199
  pgmajfault 0
  total_pgfault 199
  total_pgmajfault 0

Performance test: run page fault test(pft) wit 16 thread on faulting in
15G anon pages in 16G container.  There is no regression noticed on the
"flt/cpu/s"

Sample output from pft:

  TAG pft:anon-sys-default:
    Gb  Thr CLine   User     System     Wall    flt/cpu/s fault/wsec
    15   16   1     0.67s   233.41s    14.76s   16798.546 266356.260

  +-------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x  10     16682.962     17344.027     16913.524     16928.812      166.5362
  +  10     16695.568     16923.896     16820.604     16824.652     84.816568
  No difference proven at 95.0% confidence

[akpm@linux-foundation.org: fix build]
[hughd@google.com: shmem fix]
Signed-off-by: Ying Han <yinghan@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Dan Carpenter
1d5827235d ufs: fix truncated values handling 64 bit metadata
Originally i_lastfrag was 32 bits but then we added support for handling
64 bit metadata and it became a 64 bit variable.  That was during 2007, in
54fb996ac1 "[PATCH] ufs2 write: block allocation update".  Unfortunately
these casts got left behind so the value got truncated to 32 bit again.

[akpm@linux-foundation.org: remove now-unneeded min_t/max_t casting]
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:33 -07:00
Linus Torvalds
b7c2f03628 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  gfs2: Drop __TIME__ usage
  isdn/diva: Drop __TIME__ usage
  atm: Drop __TIME__ usage
  dlm: Drop __TIME__ usage
  wan/pc300: Drop __TIME__ usage
  parport: Drop __TIME__ usage
  hdlcdrv: Drop __TIME__ usage
  baycom: Drop __TIME__ usage
  pmcraid: Drop __DATE__ usage
  edac: Drop __DATE__ usage
  rio: Drop __DATE__ usage
  scsi/wd33c93: Drop __TIME__ usage
  scsi/in2000: Drop __TIME__ usage
  aacraid: Drop __TIME__ usage
  media/cx231xx: Drop __TIME__ usage
  media/radio-maxiradio: Drop __TIME__ usage
  nozomi: Drop __TIME__ usage
  cyclades: Drop __TIME__ usage
2011-05-26 13:19:00 -07:00
Linus Torvalds
a74b81b0af Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (28 commits)
  Ocfs2: Teach local-mounted ocfs2 to handle unwritten_extents correctly.
  ocfs2/dlm: Do not migrate resource to a node that is leaving the domain
  ocfs2/dlm: Add new dlm message DLM_BEGIN_EXIT_DOMAIN_MSG
  Ocfs2/move_extents: Set several trivial constraints for threshold.
  Ocfs2/move_extents: Let defrag handle partial extent moving.
  Ocfs2/move_extents: move/defrag extents within a certain range.
  Ocfs2/move_extents: helper to calculate the defraging length in one run.
  Ocfs2/move_extents: move entire/partial extent.
  Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode.
  Ocfs2/move_extents: helper to probe a proper region to move in an alloc group.
  Ocfs2/move_extents: helper to validate and adjust moving goal.
  Ocfs2/move_extents: find the victim alloc group, where the given #blk fits.
  Ocfs2/move_extents: defrag a range of extent.
  Ocfs2/move_extents: move a range of extent.
  Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving.
  Ocfs2/move_extents: Add basic framework and source files for extent moving.
  Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2.
  Ocfs2/refcounttree: Publicize couple of funcs from refcounttree.c
  Ocfs2: Add a new code 'OCFS2_INFO_FREEFRAG' for o2info ioctl.
  Ocfs2: Add a new code 'OCFS2_INFO_FREEINODE' for o2info ioctl.
  ...
2011-05-26 10:55:15 -07:00
Linus Torvalds
f8d613e2a6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
  xen: cleancache shim to Xen Transcendent Memory
  ocfs2: add cleancache support
  ext4: add cleancache support
  btrfs: add cleancache support
  ext3: add cleancache support
  mm/fs: add hooks to support cleancache
  mm: cleancache core ops functions and config
  fs: add field to superblock to support cleancache
  mm/fs: cleancache documentation

Fix up trivial conflict in fs/btrfs/extent_io.c due to includes
2011-05-26 10:50:56 -07:00
Linus Torvalds
8a0599dd24 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: correctly decrement the extent buffer index in xfs_bmap_del_extent
  xfs: check for valid indices in xfs_iext_get_ext and xfs_iext_idx_to_irec
  xfs: fix up asserts in xfs_iflush_fork
  xfs: do not do pointer arithmetic on extent records
  xfs: do not use unchecked extent indices in xfs_bunmapi
  xfs: do not use unchecked extent indices in xfs_bmapi
  xfs: do not use unchecked extent indices in xfs_bmap_add_extent_*
  xfs: remove if_lastex
  xfs: remove the unused XFS_BMAPI_RSVBLOCKS flag
  xfs: do not discard alloc btree blocks
  xfs: add online discard support
2011-05-26 10:49:11 -07:00
Linus Torvalds
35806b4f7c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
  jbd2: Add MAINTAINERS entry
  jbd2: fix a potential leak of a journal_head on an error path
  ext4: teach ext4_ext_split to calculate extents efficiently
  ext4: Convert ext4 to new truncate calling convention
  ext4: do not normalize block requests from fallocate()
  ext4: enable "punch hole" functionality
  ext4: add "punch hole" flag to ext4_map_blocks()
  ext4: punch out extents
  ext4: add new function ext4_block_zero_page_range()
  ext4: add flag to ext4_has_free_blocks
  ext4: reserve inodes and feature code for 'quota' feature
  ext4: add support for multiple mount protection
  ext4: ensure f_bfree returned by ext4_statfs() is non-negative
  ext4: protect bb_first_free in ext4_trim_all_free() with group lock
  ext4: only load buddy bitmap in ext4_trim_fs() when it is needed
  jbd2: Fix comment to match the code in jbd2__journal_start()
  ext4: fix waiting and sending of a barrier in ext4_sync_file()
  jbd2: Add function jbd2_trans_will_send_data_barrier()
  jbd2: fix sending of data flush on journal commit
  ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
  ...
2011-05-26 09:53:20 -07:00