linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Linus Torvalds	5909ccaa30	Make 'check_acl()' a first-class filesystem op This is stage one in flattening out the callchains for the common permission testing. Rather than have most filesystem implement their own inode->i_op->permission function that just calls back down to the VFS layers 'generic_permission()' with the per-filesystem ACL checking function, the filesystem can just expose its 'check_acl' function directly, and let the VFS layer do everything for it. This is all just preparatory - no filesystem actually enables this yet. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 11:07:44 -07:00
Linus Torvalds	cb9179ead0	Simplify exec_permission_lite(), part 3 Don't call down to the generic inode_permission() function just to call the inode-specific permission function - just do it directly. The generic inode_permission() code does things like checking MAY_WRITE and devcgroup_inode_permission(), neither of which are relevant for the light pathname walk permission checks (we always do just MAY_EXEC, and the inode is never a special device). Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 11:07:44 -07:00
Linus Torvalds	f1ac9f6bfe	Simplify exec_permission_lite() further This function is only called for path components that are already known to be directories (they have a '->lookup' method). So don't bother doing that whole S_ISDIR() testing, the whole point of the 'lite()' version is that we know that we are looking at a directory component, and that we're only checking name lookup permission. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 11:07:43 -07:00
Linus Torvalds	b7a437b08a	Simplify exec_permission_lite() logic Instead of returning EAGAIN and having the caller do something special for that case, just do the special case directly. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 11:07:43 -07:00
Linus Torvalds	e8e66ed25b	Do not call 'ima_path_check()' for each path component Not only is that a supremely timing-critical path, but it's hopefully some day going to be lockless for the common case, and ima can't do that. Plus the integrity code doesn't even care about non-regular files, so it was always a total waste of time and effort. Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 11:07:17 -07:00
Steven Whitehouse	acf7e2444a	GFS2: Be extra careful about deallocating inodes There is a potential race in the inode deallocation code if two nodes try to deallocate the same inode at the same time. Most of the issue is solved by the iopen locking. There is still a small window which is not covered by the iopen lock. This patches fixes that and also makes the deallocation code more robust in the face of any errors in the rgrp bitmaps, or erroneous iopen callbacks from other nodes. This does introduce one extra disk read, but that is generally not an issue since its the same block that must be written to later in the deallocation process. The total disk accesses therefore stay the same, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-09-08 18:00:30 +01:00
Mimi Zohar	acd0c93517	IMA: update ima_counts_put - As ima_counts_put() may be called after the inode has been freed, verify that the inode is not NULL, before dereferencing it. - Maintain the IMA file counters in may_open() properly, decrementing any counter increments on subsequent errors. Reported-by: Ciprian Docan <docan@eden.rutgers.edu> Reported-by: J.R. Okajima <hooanon05@yahoo.co.jp> Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Eric Paris <eparis@redhat.com Signed-off-by: James Morris <jmorris@namei.org>	2009-09-07 11:54:58 +10:00
Linus Torvalds	5136a6c0fd	Merge git://git.infradead.org/~dwmw2/mtd-2.6.31 * git://git.infradead.org/~dwmw2/mtd-2.6.31: JFFS2: add missing verify buffer allocation/deallocation mtd: nftl: fix offset alignments mtd: nftl: write support is broken mtd: m25p80: fix null pointer dereference bug	2009-09-05 14:57:04 -07:00
Linus Torvalds	0edfa2b1b5	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: actually enable the swapext compat handler	2009-09-05 14:25:14 -07:00
Linus Torvalds	5a09adf130	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key	2009-09-05 14:24:33 -07:00
Nicolas Pitre	9de6886ec6	ext2: fix unbalanced kmap()/kunmap() In ext2_rename(), dir_page is acquired through ext2_dotdot(). It is then released through ext2_set_link() but only if old_dir != new_dir. Failing that, the pkmap reference count is never decremented and the page remains pinned forever. Repeat that a couple times with highmem pages and all pkmap slots get exhausted, and every further kmap() calls end up stalling on the pkmap_map_wait queue at which point the whole system comes to a halt. Signed-off-by: Nicolas Pitre <nico@marvell.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-05 13:41:08 -07:00
Linus Torvalds	ac7ac9f2b9	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: ocfs2_write_begin_nolock() should handle len=0 ocfs2: invalidate dentry if its dentry_lock isn't initialized.	2009-09-05 13:38:37 -07:00
Oleg Nesterov	a2a8474c3f	exec: do not sleep in TASK_TRACED under ->cred_guard_mutex Tom Horsley reports that his debugger hangs when it tries to read /proc/pid_of_tracee/maps, this happens since "mm_for_maps: take ->cred_guard_mutex to fix the race with exec" 04b836cbf19e885f8366bccb2e4b0474346c02d commit in 2.6.31. But the root of the problem lies in the fact that do_execve() path calls tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC. The tracee must not sleep in TASK_TRACED holding this mutex. Even if we remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not hang until it is killed or the tracee resumes. With this patch do_execve() does not use ->cred_guard_mutex directly and we do not hold it throughout, instead: - introduce prepare_bprm_creds() helper, it locks the mutex and calls prepare_exec_creds() to initialize bprm->cred. - install_exec_creds() drops the mutex after commit_creds(), and thus before tracehook_report_exec()->ptrace_stop(). or, if exec fails, free_bprm() drops this mutex when bprm->cred != NULL which indicates install_exec_creds() was not called. Reported-by: Tom Horsley <tom.horsley@att.net> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-05 11:30:42 -07:00
Sunil Mushran	8379e7c46c	ocfs2: ocfs2_write_begin_nolock() should handle len=0 Bug introduced by mainline commit `e7432675f8` The bug causes ocfs2_write_begin_nolock() to oops when len=0. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-09-04 14:28:31 -07:00
Jeff Layton	9162ab2000	cifs: consolidate reconnect logic in smb_init routines There's a large cut and paste chunk of code in smb_init and small_smb_init to handle reconnects. Break it out into a separate function, clean it up and have both routines call it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-03 15:30:48 +00:00
Massimo Cirillo	bc8cec0dff	JFFS2: add missing verify buffer allocation/deallocation The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when that macro is enabled and the verify function is called. Similarly the jffs2_nor_wbuf_flash_cleanup() must free the buffer if CONFIG_JFFS2_FS_WBUF_VERIFY is enabled. The following patch fixes the problem. The following patch applies to 2.6.30 kernel. Signed-off-by: Massimo Cirillo <maxcir@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org	2009-09-03 15:01:34 +01:00
Mimi Zohar	6c1488fd58	IMA: open new file for read When creating a new file, ima_path_check() assumed the new file was being opened for write. Call ima_path_check() with the appropriate acc_mode so that the read/write counters are incremented correctly. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-03 12:06:12 +10:00
Alex Elder	988abe4075	xfs: xfs_showargs() reports group and project quotas enabled If you enable group or project quotas on an XFS file system, then the mount table presented through /proc/self/mounts erroneously shows that both options are in effect for the file system. The root of the problem is some bad logic in the xfs_showargs() function, which is used to format the file system type-specific options in effect for a file system. The problem originated in this GIT commit: Move platform specific mount option parse out of core XFS code Date: 11/22/07 Author: Dave Chinner SHA1 ID: `a67d7c5f5d` For XFS quotas, project and group quota management are mutually exclusive--only one can be in effect at a time. There are two parts to managing quotas: aggregating usage information; and enforcing limits. It is possible to have a quota in effect (aggregating usage) but not enforced. These features are recorded on an XFS mount point using these flags: XFS_PQUOTA_ACCT - Project quotas are aggregated XFS_GQUOTA_ACCT - Group quotas are aggregated XFS_OQUOTA_ENFD - Project/group quotas are enforced The code in error is in fs/xfs/linux-2.6/xfs_super.c: if (mp->m_qflags & (XFS_PQUOTA_ACCT\|XFS_OQUOTA_ENFD)) seq_puts(m, "," MNTOPT_PRJQUOTA); else if (mp->m_qflags & XFS_PQUOTA_ACCT) seq_puts(m, "," MNTOPT_PQUOTANOENF); if (mp->m_qflags & (XFS_GQUOTA_ACCT\|XFS_OQUOTA_ENFD)) seq_puts(m, "," MNTOPT_GRPQUOTA); else if (mp->m_qflags & XFS_GQUOTA_ACCT) seq_puts(m, "," MNTOPT_GQUOTANOENF); The problem is that XFS_OQUOTA_ENFD will be set in mp->m_qflags if either group or project quotas are enforced, and as a result both MNTOPT_PRJQUOTA and MNTOPT_GRPQUOTA will be shown as mount options. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>	2009-09-02 17:02:24 -05:00
David Howells	e0e817392b	CRED: Add some configurable debugging [try #6 ] Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking for credential management. The additional code keeps track of the number of pointers from task_structs to any given cred struct, and checks to see that this number never exceeds the usage count of the cred struct (which includes all references, not just those from task_structs). Furthermore, if SELinux is enabled, the code also checks that the security pointer in the cred struct is never seen to be invalid. This attempts to catch the bug whereby inode_has_perm() faults in an nfsd kernel thread on seeing cred->security be a NULL pointer (it appears that the credential struct has been previously released): http://www.kerneloops.org/oops.php?number=252883 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-02 21:29:01 +10:00
Ingo Molnar	f14eff1cc2	Merge commit 'v2.6.31-rc8' into sched/core Merge reason: bump from rc5 to rc8, but also pick up TP_perf_assign() API, a patch will be queued that depends on it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 08:20:35 +02:00
Christoph Hellwig	81e251766e	xfs: un-static xfs_inobt_lookup xfs_inobt_lookup is also used in xfs_itable.c, remove the STATIC modifier from it's declaration to fix non-debug builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 20:43:01 -05:00
Dave Kleikamp	6ab409b53d	cifs: Replace wrtPending with a real reference count Currently, cifs_close() tries to wait until all I/O is complete and then frees the file private data. If I/O does not completely in a reasonable amount of time it frees the structure anyway, leaving a potential use- after-free situation. This patch changes the wrtPending counter to a complete reference count and lets the last user free the structure. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Tested-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 22:35:01 +00:00
Jeff Layton	1b49c55661	cifs: protect GlobalOplock_Q with its own spinlock Right now, the GlobalOplock_Q is protected by the GlobalMid_Lock. That lock is also used for completely unrelated purposes (mostly for managing the global mid queue). Give the list its own dedicated spinlock (cifs_oplock_lock) and rename the list to cifs_oplock_list to eliminate the camel-case. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 22:25:29 +00:00
Jeff Layton	8e047d09ee	cifs: use tcon pointer in cifs_show_options Minor nit: we already have a tcon pointer so we don't need to dereference cifs_sb again. Also initialize the vars in the declaration. Reported-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 22:25:19 +00:00
Jeff Layton	8c58b54574	cifs: send IPv6 addr in upcall with colon delimiters Make it easier on the upcall program by adding ':' delimiters between each group of hex digits. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 22:24:10 +00:00
Christoph Hellwig	3725867dcc	xfs: actually enable the swapext compat handler Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Torsten Kaiser <just.for.lkml@googlemail.com> Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 17:00:46 -05:00
Christoph Hellwig	f4378b6eaf	xfs: actually enable the swapext compat handler Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Torsten Kaiser <just.for.lkml@googlemail.com> Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 16:55:53 -05:00
Christoph Hellwig	aa72a5cf00	xfs: simplify xfs_trans_iget xfs_trans_iget is a wrapper for xfs_iget that adds the inode to the transaction after it is read. Except when the inode already is in the inode cache, in which case it returns the existing locked inode with increment lock recursion counts. Now, no one in the tree every decrements these lock recursion counts, so any user of this gets a potential double unlock when both the original owner of the inode and the xfs_trans_iget caller unlock it. When looking back in a git bisect in the historic XFS tree there was only one place that decremented these counts, xfs_trans_iput. Introduced in commit ca25df7a840f426eb566d52667b6950b92bb84b5 by Adam Sweeney in 1993, and removed in commit 19f899a3ab155ff6a49c0c79b06f2f61059afaf3 by Steve Lord in 2003. And as long as it didn't slip through git bisects cracks never actually used in that time frame. A quick audit of the callers of xfs_trans_iget shows that no caller really relies on this behaviour fortunately - xfs_ialloc allows this inode from disk so it must not be there before, and all the RT allocator routines only every add each RT bitmap inode once. In addition to removing lots of code and reducing the size of the inode item this patch also avoids the double inode cache lookup in each create/mkdir/mknod transaction. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:46:16 -05:00
Christoph Hellwig	13e6d5cdde	xfs: merge fsync and O_SYNC handling The guarantees for O_SYNC are exactly the same as the ones we need to make for an fsync call (and given that Linux O_SYNC is O_DSYNC the equivalent is fdadatasync, but we treat both the same in XFS), except with a range data writeout. Jan Kara has started unifying these two path for filesystems using the generic helpers, and I've started to look at XFS. The actual transaction commited by xfs_fsync and xfs_write_sync_logforce has a different transaction number, but actually is exactly the same. We'll only use the fsync transaction going forward. One major difference is that xfs_write_sync_logforce never issues a cache flush unless we commit a transaction causing that as a side-effect, which is an obvious bug in the O_SYNC handling. Second all the locking and i_update_size vs i_update_core changes from `978b723712` never made it to xfs_write_sync_logforce, so we add them back. To make xfs_fsync easily usable from the O_SYNC path, the filemap_fdatawait call is moved up to xfs_file_fsync, so that we don't wait on the whole file after we already waited for our portion in xfs_write. We'll also use a plain call to filemap_write_and_wait_range instead of the previous sync_page_rang which did it in two steps including an half-hearted inode write out that doesn't help us. Once we're done with this also remove the now useless i_update_size tracking. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:57 -05:00
Dave Chinner	bd16956599	xfs: speed up free inode search Don't search too far - abort if it is outside a certain radius and simply do a linear search for the first free inode. In AGs with a million inodes this can speed up allocation speed by 3-4x. [hch: ported to the new xfs_ialloc.c world order] Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:48 -05:00
Christoph Hellwig	2187550525	xfs: rationalize xfs_inobt_lookup* Currenly we have a xfs_inobt_lookup* variant for each comparism direction, and all these get all three fields of the inobt records passed, while the common case is just looking for the inode number and we have only marginally more callers than xfs_inobt_lookup* variants. So opencode a direct call to xfs_btree_lookup for the single case where we need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_looku that just takes the inode number and the direction for all other callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:39 -05:00
Christoph Hellwig	4254b0bbb1	xfs: untangle xfs_dialloc Clarify the control flow in xfs_dialloc. Factor out a helper to go to the next node from the current one and improve the control flow by expanding composite if statements and using gotos. The xfs_ialloc_next_rec helper is borrowed from Dave Chinners dynamic allocation policy patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:29 -05:00
Dave Chinner	0b48db80ba	xfs: factor out debug checks from xfs_dialloc and xfs_difree Factor out a common helper from repeated debug checks in xfs_dialloc and xfs_difree. [hch: split out from Dave's dynamic allocation policy patches] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:18 -05:00
Christoph Hellwig	afabc24a73	xfs: improve xfs_inobt_update prototype Both callers of xfs_inobt_update have the record in form of a xfs_inobt_rec_incore_t, so just pass a pointer to it instead of the individual variables. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:45:08 -05:00
Christoph Hellwig	2e287a731e	xfs: improve xfs_inobt_get_rec prototype Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the three individual variables, too. So just change xfs_inobt_get_rec to write the output into a xfs_inobt_rec_incore_t directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:44:56 -05:00
Dave Chinner	85c0b2ab5e	xfs: factor out inode initialisation Factor out code to initialize new inode clusters into a function of it's own. This keeps xfs_ialloc_ag_alloc smaller and better structured and enables a future inode cluster initialization transaction. Also initialize the agno variable earlier in xfs_ialloc_ag_alloc to avoid repeated byte swaps. [hch: The original patch is from Dave from his unpublished inode create transaction patch series, with some modifcations by me to apply stand-alone] Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-09-01 12:44:27 -05:00
Steve French	ca43e3beee	[CIFS] Fix checkpatch warnings Also update version number to 1.61 Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 17:20:50 +00:00
Suresh Jayaraman	bdb97adcdf	PATCH] cifs: fix broken mounts when a SSH tunnel is used (try #4 ) One more try.. It seems there is a regression that got introduced while Jeff fixed all the mount/umount races. While attempting to find whether a tcp session is already existing, we were not checking whether the "port" used are the same. When a second mount is attempted with a different "port=" option, it is being ignored. Because of this the cifs mounts that uses a SSH tunnel appears to be broken. Steps to reproduce: 1. create 2 shares # SSH Tunnel a SMB session 2. ssh -f -L 6111:127.0.0.1:445 root@localhost "sleep 86400" 3. ssh -f -L 6222:127.0.0.1:445 root@localhost "sleep 86400" 4. tcpdump -i lo 6111 & 5. mkdir -p /mnt/mnt1 6. mkdir -p /mnt/mnt2 7. mount.cifs //localhost/a /mnt/mnt1 -o username=guest,ip=127.0.0.1,port=6111 #(shows tcpdump activity on port 6111) 8. mount.cifs //localhost/b /mnt/mnt2 -o username=guest,ip=127.0.0.1,port=6222 #(shows tcpdump activity only on port 6111 and not on 6222 Fix by adding a check to compare the port _only_ if the user tries to override the tcp port with "port=" option, before deciding that an existing tcp session is found. Also, clean up a bit by replacing if-else if by a switch statment while at it as suggested by Jeff. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 17:08:48 +00:00
Alexander Strakh	1b3859bc9e	[CIFS] Memory leak in ntlmv2 hash calculation in function calc_ntlmv2_hash memory is not released. 1. If in the line 333 we successfully allocate memory and assign it to pctxt variable: pctxt = kmalloc(sizeof(struct HMACMD5Context), GFP_KERNEL); then we go to line 376 and exit wihout releasing memory pointed to by pctxt variable. Add a memory releasing for pctxt variable before exit from function calc_ntlmv2_hash. Signed-off-by: Alexander Strakh <strakh@ispras.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-09-01 17:02:24 +00:00
Ian Kent	37d0892c5a	autofs4 - fix missed case when changing to use struct path In the recent change by Al Viro that changes verious subsystems to use "struct path" one case was missed in the autofs4 module which causes mounts to no longer expire. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-31 17:44:05 -10:00
Julia Lawall	a0f7bfd342	fs/xfs: Correct redundant test bp was tested for NULL a few lines before, followed by a return, and there is no intervening modification of its value. A simplified version of the semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; expression E; position p1,p2; @@ if (x == NULL \|\| ...) { ... when forall return ...; } ... when != \(x=E\\|x--\\|x++\\|--x\\|++x\\|x-=E\\|x+=E\\|x\|=E\\|x&=E\\|&x\) ( x == NULL \| x != NULL ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-31 14:46:22 -05:00
Eric Sandeen	eb00457d62	xfs: remove XFS_INO64_OFFSET Commit `a19d9f887d` removed the ino64 option but left the XFS_INO64_OFFSET define it used in place - just remove it. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-31 14:46:22 -05:00
Eric Sandeen	fef1111ecd	un-static xfs_read_agf CONFIG_XFS_DEBUG builds still need xfs_read_agf to be non-static, oops. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-31 14:46:21 -05:00
Eric Sandeen	d96f8f891f	xfs: add more statics & drop some unused functions A lot more functions could be made static, but they need forward declarations; this does some easy ones, and also found a few unused functions in the process. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-31 14:46:20 -05:00
Steve French	2920ee2b47	[CIFS] potential NULL dereference in parse_DFS_referrals() memory allocation may fail, prevent a NULL dereference Pointed out by Roel Kluin CC: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-08-31 15:27:26 +00:00
Ryusuke Konishi	b1f1b8ce0a	nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key This will fix the following preempt count underflow reported from users with the title "[NILFS users] segctord problem" (Message-ID: <949415.6494.qm@web58808.mail.re1.yahoo.com> and Message-ID: <debc30fc0908270825v747c1734xa59126623cfd5b05@mail.gmail.com>): WARNING: at kernel/sched.c:4890 sub_preempt_count+0x95/0xa0() Hardware name: HP Compaq 6530b (KR980UT#ABC) Modules linked in: bridge stp llc bnep rfcomm l2cap xfs exportfs nilfs2 cowloop loop vboxnetadp vboxnetflt vboxdrv btusb bluetooth uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4 snd_hda_codec_analog ecb iwlagn iwlcore rfkill lib80211 mac80211 snd_hda_intel snd_hda_codec ehci_hcd uhci_hcd usbcore snd_hwdep snd_pcm tg3 cfg80211 psmouse snd_timer joydev libphy ohci1394 snd_page_alloc hp_accel lis3lv02d ieee1394 led_class i915 drm i2c_algo_bit video backlight output i2c_core dm_crypt dm_mod Pid: 4197, comm: segctord Not tainted 2.6.30-gentoo-r4-64 #7 Call Trace: [<ffffffff8023fa05>] ? sub_preempt_count+0x95/0xa0 [<ffffffff802470f8>] warn_slowpath_common+0x78/0xd0 [<ffffffff8024715f>] warn_slowpath_null+0xf/0x20 [<ffffffff8023fa05>] sub_preempt_count+0x95/0xa0 [<ffffffffa04ce4db>] nilfs_btnode_prepare_change_key+0x11b/0x190 [nilfs2] [<ffffffffa04d01ad>] nilfs_btree_assign_p+0x19d/0x1e0 [nilfs2] [<ffffffffa04d10ad>] nilfs_btree_assign+0xbd/0x130 [nilfs2] [<ffffffffa04cead7>] nilfs_bmap_assign+0x47/0x70 [nilfs2] [<ffffffffa04d9bc6>] nilfs_segctor_do_construct+0x956/0x20f0 [nilfs2] [<ffffffff805ac8e2>] ? _spin_unlock_irqrestore+0x12/0x40 [<ffffffff803c06e0>] ? __up_write+0xe0/0x150 [<ffffffff80262959>] ? up_write+0x9/0x10 [<ffffffffa04ce9f3>] ? nilfs_bmap_test_and_clear_dirty+0x43/0x60 [nilfs2] [<ffffffffa04cd627>] ? nilfs_mdt_fetch_dirty+0x27/0x60 [nilfs2] [<ffffffffa04db5fc>] nilfs_segctor_construct+0x8c/0xd0 [nilfs2] [<ffffffffa04dc3dc>] nilfs_segctor_thread+0x15c/0x3a0 [nilfs2] [<ffffffffa04dbe20>] ? nilfs_construction_timeout+0x0/0x10 [nilfs2] [<ffffffff80252633>] ? add_timer+0x13/0x20 [<ffffffff802370da>] ? __wake_up_common+0x5a/0x90 [<ffffffff8025e960>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2] [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2] [<ffffffff8025e556>] kthread+0x56/0x90 [<ffffffff8020cdea>] child_rip+0xa/0x20 [<ffffffff8025e500>] ? kthread+0x0/0x90 [<ffffffff8020cde0>] ? child_rip+0x0/0x20 This problem was caused due to a missing radix_tree_preload() call in the retry path of nilfs_btnode_prepare_change_key() function. Reported-by: Eric A <eric225125@yahoo.com> Reported-by: Jerome Poulin <jeromepoulin@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Jerome Poulin <jeromepoulin@gmail.com> Cc: stable@kernel.org	2009-08-31 12:03:06 +09:00
Eric Paris	750a8870fe	inotify: update the group mask on mark addition Seperating the addition and update of marks in inotify resulted in a regression in that inotify never gets events. The inotify group mask is always 0. This mask should be updated any time a new mark is added. Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-28 12:51:14 -04:00
Eric Paris	83cb10f0ef	inotify: fix length reporting and size checking `0db501bd06` introduced a regresion in that it now sends a nul terminator but the length accounting when checking for space or reporting to userspace did not take this into account. This corrects all of the rounding logic. Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-28 11:57:55 -04:00
Brian Rogers	b962e7312a	inotify: do not send a block of zeros when no pathname is available When an event has no pathname, there's no need to pad it with a null byte and therefore generate an inotify_event sized block of zeros. This fixes a regression introduced by commit `0db501bd06` where my system wouldn't finish booting because some process was being confused by this. Signed-off-by: Brian Rogers <brian@xyzw.org> Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-28 10:03:06 -04:00
Tao Ma	a1b08e75df	ocfs2: invalidate dentry if its dentry_lock isn't initialized. In commit `a5a0a63092`, when ocfs2_attch_dentry_lock fails, we call an extra iput and reset dentry->d_fsdata to NULL. This resolve a bug, but it isn't completed and the dentry is still there. When we want to use it again, ocfs2_dentry_revalidate doesn't catch it and return true. That make future ocfs2_dentry_lock panic out. One bug is http://oss.oracle.com/bugzilla/show_bug.cgi?id=1162. The resolution is to add a check for dentry->d_fsdata in revalidate process and return false if dentry->d_fsdata is NULL, so that a new ocfs2_lookup will be called again. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-27 18:10:54 -07:00
Linus Torvalds	9c504cadc4	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: inotify: Ensure we alwasy write the terminating NULL. inotify: fix locking around inotify watching in the idr inotify: do not BUG on idr entries at inotify destruction inotify: seperate new watch creation updating existing watches	2009-08-27 12:26:02 -07:00
Linus Torvalds	cf481442f2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: update documentation pointers 9p: remove unnecessary v9fses->options which duplicates the mount string net/9p: insulate the client against an invalid error code sent by a 9p server 9p: Add missing cast for the error return value in v9fs_get_inode 9p: Remove redundant inode uid/gid assignment 9p: Fix possible regressions when ->get_sb fails. 9p: Fix v9fs show_options 9p: Fix possible memleak in v9fs_inode_from fid. 9p: minor comment fixes 9p: Fix possible inode leak in v9fs_get_inode. 9p: Check for error in return value of v9fs_fid_add	2009-08-27 12:24:08 -07:00
David Howells	9886e836a6	AFS: Stop readlink() on AFS crashing due to NULL 'file' ptr kAFS crashes when asked to read a symbolic link because page_getlink() passes a NULL file pointer to read_mapping_page(), but afs_readpage() expects a file pointer from which to extract a key. Modify afs_readpage() to request the appropriate key from the calling process's keyrings if a file struct is not supplied with one attached. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-27 12:22:08 -07:00
Steven Whitehouse	8d8291ae93	GFS2: Remove no_formal_ino generating code The inum structure used throughout GFS2 has two fields. One no_addr is the disk block number of the inode in question and is used everywhere as the inode number. The other, no_formal_ino, is used only as the generation number for NFS. Historically the no_formal_ino field was set using a complicated system of one global and one per-node file containing inode numbers in order to ensure that each no_formal_ino was unique. Also this code made no provision for what would happen when eventually the (64 bit) numbers ran out. Now I know that is pretty unlikely to happen given the large space of numbers, but it is possible nevertheless. The only guarantee required for no_formal_ino is that, for any single inode, the same number doesn't get reused too quickly. We already have a generation number which is kept in the inode and initialised from a counter in the resource group (almost no overhead, since we have to touch the resource group anyway in order to allocate an inode in the first place). Aside from ensuring that we never use the value 0 in the no_formal_ino field, we can use that counter directly. As a result of that change, we lose about 200 lines of code and also gain about 10 creates/sec on the postmark benchmark (on my test machine). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-27 15:51:07 +01:00
Eric W. Biederman	0db501bd06	inotify: Ensure we alwasy write the terminating NULL. Before the rewrite copy_event_to_user always wrote a terqminating '\0' byte to user space after the filename. Since the rewrite that terminating byte was skipped if your filename is exactly a multiple of event_size. Ouch! So add one byte to name_size before we round up and use clear_user to set userspace to zero like /dev/zero does instead of copying the strange nul_inotify_event. I can't quite convince myself len_to_zero will never exceed 16 and even if it doesn't clear_user should be more efficient and a more accurate reflection of what the code is trying to do. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-27 08:02:10 -04:00
Eric Paris	dead537dd8	inotify: fix locking around inotify watching in the idr The are races around the idr storage of inotify watches. It's possible that a watch could be found from sys_inotify_rm_watch() in the idr, but it could be removed from the idr before that code does it's removal. Move the locking and the refcnt'ing so that these have to happen atomically. Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-27 08:02:04 -04:00
Eric Paris	cf4374267f	inotify: do not BUG on idr entries at inotify destruction If an inotify watch is left in the idr when an fsnotify group is destroyed this will lead to a BUG. This is not a dangerous situation and really indicates a programming bug and leak of memory. This patch changes it to use a WARN and a printk rather than killing people's boxes. Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-27 08:02:04 -04:00
Eric Paris	52cef7555a	inotify: seperate new watch creation updating existing watches There is nothing known wrong with the inotify watch addition/modification but this patch seperates the two code paths to make them each easy to verify as correct. Signed-off-by: Eric Paris <eparis@redhat.com>	2009-08-27 08:02:04 -04:00
Steven Whitehouse	307cf6e63c	GFS2: Rename eattr.[ch] as xattr.[ch] Use the more conventional name for the extended attribute support code. Update all the places which care. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-26 18:51:04 +01:00
Steven Whitehouse	40b78a3223	GFS2: Clean up of extended attribute support This has been on my list for some time. We need to change the way in which we handle extended attributes to allow faster file creation times (by reducing the number of transactions required) and the extended attribute code is the main obstacle to this. In addition to that, the VFS provides a way to demultiplex the xattr calls which we ought to be using, rather than rolling our own. This patch changes the GFS2 code to use that VFS feature and as a result the code shrinks by a couple of hundred lines or so, and becomes easier to read. I'm planning on doing further clean up work in this area, but this patch is a good start. The cleaned up code also uses the more usual "xattr" shorthand, I plan to eliminate the use of "eattr" eventually and in the mean time it serves as a flag as to which bits of the code have been updated. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-26 18:41:32 +01:00
Linus Torvalds	e9cab24cf3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: ext3: Improve error message that changing journaling mode on remount is not possible ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED	2009-08-25 09:47:36 -07:00
Trond Myklebust	7111dc7392	NFSv4: Fix an infinite looping problem with the nfs4_state_manager Commit `76db6d9500` (nfs41: add session setup to the state manager) introduces an infinite loop possibility in the NFSv4 state manager. By first checking nfs4_has_session() before clearing the NFS4CLNT_SESSION_SETUP flag, it allows for a situation where someone sets that flag, but it never gets cleared, and so the state manager loops. In fact commit `c3fad1b1aa` (nfs41: add session reset to state manager) causes this to happen every time we get a network partition error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-24 16:28:42 -07:00
Linus Torvalds	2584e7986f	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2/dlm: Wait on lockres instead of erroring cancel requests ocfs2: Add missing lock name ocfs2: Don't oops in ocfs2_kill_sb on a failed mount ocfs2: release the buffer head in ocfs2_do_truncate. ocfs2: Handle quota file corruption more gracefully	2009-08-24 14:41:28 -07:00
Hugh Dickins	353d5c30c6	mm: fix hugetlb bug due to user_shm_unlock call 2.6.30's commit `8a0bdec194` removed user_shm_lock() calls in hugetlb_file_setup() but left the user_shm_unlock call in shm_destroy(). In detail: Assume that can_do_hugetlb_shm() returns true and hence user_shm_lock() is not called in hugetlb_file_setup(). However, user_shm_unlock() is called in any case in shm_destroy() and in the following atomic_dec_and_lock(&up->__count) in free_uid() is executed and if up->__count gets zero, also cleanup_user_struct() is scheduled. Note that sched_destroy_user() is empty if CONFIG_USER_SCHED is not set. However, the ref counter up->__count gets unexpectedly non-positive and the corresponding structs are freed even though there are live references to them, resulting in a kernel oops after a lots of shmget(SHM_HUGETLB)/shmctl(IPC_RMID) cycles and CONFIG_USER_SCHED set. Hugh changed Stefan's suggested patch: can_do_hugetlb_shm() at the time of shm_destroy() may give a different answer from at the time of hugetlb_file_setup(). And fixed newseg()'s no_id error path, which has missed user_shm_unlock() ever since it came in 2.6.9. Reported-by: Stefan Huber <shuber2@gmail.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Tested-by: Stefan Huber <shuber2@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-24 12:53:01 -07:00
Jan Kara	3c4cec6527	ext3: Improve error message that changing journaling mode on remount is not possible This patch makes the error message about changing journaling mode on remount more descriptive. Some people are going to hit this error now due to commit `bbae8bcc49` if they configure a kernel to default to data=writeback mode. The problem happens if they have data=ordered set for the root filesystem in /etc/fstab but not in the kernel command line (and they don't use initrd). Their filesystem then gets mounted as data=writeback by kernel but then their boot fails because init scripts won't be able to remount the filesystem rw. Better error message will hopefully make it easier for them to find the error in their setup and bother us less with error reports :). Signed-off-by: Jan Kara <jack@suse.cz>	2009-08-24 16:48:45 +02:00
Theodore Ts'o	6d41807614	ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED The old description for this configuration option was perhaps not completely balanced in terms of describing the tradeoffs of using a default of data=writeback vs. data=ordered. Despite the fact that old description very strongly recomended disabling this feature, all of the major distributions have elected to preserve the existing 'legacy' default, which is a strong hint that it perhaps wasn't telling the whole story. This revised description has been vetted by a number of ext3 developers as being better at informing the user about the tradeoffs of enabling or disabling this configuration feature. Cc: linux-ext4@vger.kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>	2009-08-24 16:48:32 +02:00
Bob Peterson	d34843d0c4	GFS2: Add "-o errors=panic\|withdraw" mount options This patch adds "-o errors=panic" and "-o errors=withdraw" to the gfs2 mount options. The "errors=withdraw" option is today's current behaviour, meaning to withdraw from the file system if a non-serious gfs2 error occurs. The new "errors=panic" option tells gfs2 to force a kernel panic if a non-serious gfs2 file system error occurs. This may be useful, for example, where fabric-level fencing is used that has no way to reboot (such as fence_scsi). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-24 10:44:18 +01:00
Roel Kluin	cd0120751d	GFS2: jumping to wrong label? Also a gfs2_glock_dq() is required here. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-24 10:41:44 +01:00
Mimi Zohar	6777d773a4	kernel_read: redefine offset type vfs_read() offset is defined as loff_t, but kernel_read() offset is only defined as unsigned long. Redefine kernel_read() offset as loff_t. Cc: stable@kernel.org Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-08-24 14:58:23 +10:00
Chuck Lever	5eecfde615	NFS: Handle a zero-length auth flavor list Some releases of Linux rpc.mountd (nfs-utils 1.1.4 and later) return an empty auth flavor list if no sec= was specified for the export. This is notably broken server behavior. The new auth flavor list checking added in a recent commit rejects this case. The OpenSolaris client does too. The broken mountd implementation is already widely deployed. To avoid a behavioral regression, the kernel's mount client skips flavor checking (ie reverts to the pre-2.6.32 behavior) if mountd returns an empty flavor list. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-23 23:43:57 -04:00
Linus Torvalds	8e9d78edea	Re-introduce page mapping check in mark_buffer_dirty() In commit `a8e7d49aa7` ("Fix race in create_empty_buffers() vs __set_page_dirty_buffers()"), I removed a test for a NULL page mapping unintentionally when some of the code inside __set_page_dirty() was moved to the callers. That removal generally didn't matter, since a filesystem would serialize truncation (which clears the page mapping) against writing (which marks the buffer dirty), so locking at a higher level (either per-page or an inode at a time) should mean that the buffer page would be stable. And indeed, nothing bad seemed to happen. Except it turns out that apparently reiserfs does something odd when under load and writing out the journal, and we have a number of bugzilla entries that look similar: http://bugzilla.kernel.org/show_bug.cgi?id=13556 http://bugzilla.kernel.org/show_bug.cgi?id=13756 http://bugzilla.kernel.org/show_bug.cgi?id=13876 and it looks like reiserfs depended on that check (the common theme seems to be "data=journal", and a journal writeback during a truncate). I suspect reiserfs should have some additional locking, but in the meantime this should get us back to the pre-2.6.29 behavior. Pattern-pointed-out-by: Roland Kletzing <devzero@web.de> Cc: stable@kernel.org (2.6.29 and 2.6.30) Cc: Jeff Mahoney <jeffm@suse.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-21 17:40:08 -07:00
Linus Torvalds	b57f92157e	Merge branch 'btrfs' of git://git.kernel.dk/linux-2.6-block * 'btrfs' of git://git.kernel.dk/linux-2.6-block: btrfs: fix inode rbtree corruption	2009-08-21 09:56:55 -07:00
From: Nick Piggin	03e860bd9f	btrfs: fix inode rbtree corruption Node may not be inserted over existing node. This causes inode tree corruption and I was seeing crashes in inode_tree_del which I can not reproduce after this patch. The other way to fix this would be to tie inode lifetime in the rbtree with inode while not in freeing state. I had a look at this but it is not so trivial at this point. At least this patch gets things working again. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Acked-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-08-21 10:09:44 +02:00
Amerigo Wang	939a9421eb	vfs: allow file truncations when both suid and write permissions set When suid is set and the non-owner user has write permission, any writing into this file should be allowed and suid should be removed after that. However, current kernel only allows writing without truncations, when we do truncations on that file, we get EPERM. This is a bug. Steps to reproduce this bug: % ls -l rootdir/file1 -rwsrwsrwx 1 root root 3 Jun 25 15:42 rootdir/file1 % echo h > rootdir/file1 zsh: operation not permitted: rootdir/file1 % ls -l rootdir/file1 -rwsrwsrwx 1 root root 3 Jun 25 15:42 rootdir/file1 % echo h >> rootdir/file1 % ls -l rootdir/file1 -rwxrwxrwx 1 root root 5 Jun 25 16:34 rootdir/file1 Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric Sandeen <esandeen@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Cc: Eugene Teo <eteo@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Christoph Hellwig <hch@lst.de> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <jmorris@namei.org>	2009-08-21 14:25:48 +10:00
Goldwyn Rodrigues	c795b33ba1	ocfs2/dlm: Wait on lockres instead of erroring cancel requests In case a downconvert is queued, and a flock receives a signal, BUG_ON(lockres->l_action != OCFS2_AST_INVALID) is triggered because a lock cancel triggers a dlmunlock while an AST is scheduled. To avoid this, allow a LKM_CANCEL to pass through, and let it wait on __dlm_wait_on_lockres(). Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de> Acked-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-20 18:42:34 -07:00
Jan Kara	a8b88d3d49	ocfs2: Add missing lock name There is missing name for NFSSync cluster lock. This makes lockdep unhappy because we end up passing NULL to lockdep when initializing lock key. Fix it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-20 16:41:53 -07:00
Jan Kara	e1af88a1ad	nfs: Remove reference to generic_osync_inode from a comment generic_file_direct_write() no longer calls generic_osync_inode() so remove the comment. CC: linux-nfs@vger.kernel.org CC: Neil Brown <neilb@suse.de> CC: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 19:48:08 -04:00
James Morris	ece13879e7	Merge branch 'master' into next Conflicts: security/Kconfig Manual fix. Signed-off-by: James Morris <jmorris@namei.org>	2009-08-20 09:18:42 +10:00
Trond Myklebust	7d7ea88289	NFS: Use the DNS resolver in the mount code. In the referral code, use it to look up the new server's ip address if the fs_locations attribute contains a hostname. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	e571cbf1a4	NFS: Add a dns resolver for use with NFSv4 referrals and migration The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client from one server to another in order to support filesystem migration and replication. For full protocol support, we need to add the ability to convert a DNS host name into an IP address that we can feed to the RPC client. We'll reuse the sunrpc cache, now that it has been converted to work with rpc_pipefs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	6a396f67d2	Merge branch 'nfsv4_xdr_cleanups-for-2.6.32' into nfs-for-2.6.32 Conflicts: fs/nfs/nfs4xdr.c	2009-08-19 18:21:52 -04:00
Linus Torvalds	6c30c53fd5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix oopses with doubly mounted snapshots nilfs2: missing a read lock for segment writer in nilfs_attach_checkpoint()	2009-08-19 10:40:24 -07:00
KOSAKI Motohiro	0753ba01e1	mm: revert "oom: move oom_adj value" The commit `2ff05b2b` (oom: move oom_adj value) moveed the oom_adj value to the mm_struct. It was a very good first step for sanitize OOM. However Paul Menage reported the commit makes regression to his job scheduler. Current OOM logic can kill OOM_DISABLED process. Why? His program has the code of similar to the following. ... set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom / ... if (vfork() == 0) { set_oom_adj(0); / Invoked child can be killed */ execve("foo-bar-cmd"); } .... vfork() parent and child are shared the same mm_struct. then above set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also change oom_adj for vfork() parent. Then, vfork() parent (job scheduler) lost OOM immune and it was killed. Actually, fork-setting-exec idiom is very frequently used in userland program. We must not break this assumption. Then, this patch revert commit `2ff05b2b` and related commit. Reverted commit list --------------------- - commit `2ff05b2b4e` (oom: move oom_adj value from task_struct to mm_struct) - commit `4d8b9135c3` (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE) - commit `8123681022` (oom: only oom kill exiting tasks with attached memory) - commit `933b787b57` (mm: copy over oom_adj value at fork time) Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-18 16:31:13 -07:00
Jeff Layton	89a4eb4b66	vfs: make get_sb_pseudo set s_maxbytes to value that can be cast to signed get_sb_pseudo sets s_maxbytes to ~0ULL which becomes negative when cast to a signed value. Fix it to use MAX_LFS_FILESIZE which casts properly to a positive signed value. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Steve French <smfrench@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Robert Love <rlove@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-18 16:31:12 -07:00
Ryusuke Konishi	a924586036	nilfs2: fix oopses with doubly mounted snapshots will fix kernel oopses like the following: # mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test1 # mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test2 # umount /test1 # umount /test2 BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1069 in_atomic(): 0, irqs_disabled(): 1, pid: 3886, name: umount.nilfs2 1 lock held by umount.nilfs2/3886: #0: (&type->s_umount_key#31){+.+...}, at: [<c10b398a>] deactivate_super+0x52/0x6c irq event stamp: 1219 hardirqs last enabled at (1219): [<c135c774>] __mutex_unlock_slowpath+0xf8/0x119 hardirqs last disabled at (1218): [<c135c6d5>] __mutex_unlock_slowpath+0x59/0x119 softirqs last enabled at (1214): [<c1033316>] __do_softirq+0x1a5/0x1ad softirqs last disabled at (1205): [<c1033354>] do_softirq+0x36/0x5a Pid: 3886, comm: umount.nilfs2 Not tainted 2.6.31-rc6 #55 Call Trace: [<c1023549>] __might_sleep+0x107/0x10e [<c13603c0>] do_page_fault+0x246/0x397 [<c136017a>] ? do_page_fault+0x0/0x397 [<c135e753>] error_code+0x6b/0x70 [<c136017a>] ? do_page_fault+0x0/0x397 [<c104f805>] ? __lock_acquire+0x91/0x12fd [<c1050a62>] ? __lock_acquire+0x12ee/0x12fd [<c1050a62>] ? __lock_acquire+0x12ee/0x12fd [<c1050b2b>] lock_acquire+0xba/0xdd [<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2] [<c135d4fe>] down_write+0x2a/0x46 [<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2] [<d0d17d3f>] nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2] [<c104ea2c>] ? mark_held_locks+0x43/0x5b [<c104ecb1>] ? trace_hardirqs_on_caller+0x10b/0x133 [<c104ece4>] ? trace_hardirqs_on+0xb/0xd [<d0d09ac1>] nilfs_put_super+0x2f/0xca [nilfs2] [<c10b3352>] generic_shutdown_super+0x49/0xb8 [<c10b33de>] kill_block_super+0x1d/0x31 [<c10e6599>] ? vfs_quota_off+0x0/0x12 [<c10b398f>] deactivate_super+0x57/0x6c [<c10c4bc3>] mntput_no_expire+0x8c/0xb4 [<c10c5094>] sys_umount+0x27f/0x2a4 [<c10c50c6>] sys_oldumount+0xd/0xf [<c10031a4>] sysenter_do_call+0x12/0x38 ... This turns out to be a bug brought by an -rc1 patch ("nilfs2: simplify remaining sget() use"). In the patch, a new "put resource" function, nilfs_put_sbinfo() was introduced to delay freeing nilfs_sb_info struct. But the nilfs_put_sbinfo() mistakenly used atomic_dec_and_test() function to check the reference count, and it caused the nilfs_sb_info was freed when user mounted a snapshot twice. This bug also suggests there was unseen memory leak in usual mount /umount operations for nilfs. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-08-19 02:10:13 +09:00
Wengang Wang	970343cd49	GFS2: free disk inode which is deleted by remote node -V2 this patch is for the same problem that Benjamin Marzinski fixes at commit `b94a170e96` quotation of the original problem: ---cut here--- When a file is deleted from a gfs2 filesystem on one node, a dcache entry for it may still exist on other nodes in the cluster. If this happens, gfs2 will be unable to free this file on disk. Because of this, it's possible to have a gfs2 filesystem with no files on it and no free space. With this patch, when a node receives a callback notifying it that the file is being deleted on another node, it schedules a new workqueue thread to remove the file's dcache entry. ---end cut--- after applying Benjamin's patch, I think there is still a case in which the disk inode remains even when "no space" is hit. the case is that when running d_prune_aliases() against the inode, there are one or more dentries(aliases) which have reference count number > 0. in this case the dentries won't be pruned. and even later, the reference count becomes to 0, the dentries can still be cached in memory. unfortunately, no callback come again, things come back to the state before the callback runs. thus the on disk inode remains there until in memoryinode is removed for some other reason(shrinking inode cache or unmount the volume..). this patch is to remove those dentries when their reference count becomes to 0 and the inode is deleted by remote node. for implementation, gfs2_dentry_delete() is added as dentry_operations.d_delete. the function returns true when the inode is deleted by remote node. in dput(), gfs2_dentry_delete() is called and since it returns true, the dentry is unhashed from dcache and then removed. when all dentries are removed, the in memory inode get removed so that the on disk inode is freed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-18 10:29:39 +01:00
Zhang Qiang	1154ecbd2f	nilfs2: missing a read lock for segment writer in nilfs_attach_checkpoint() 'ns_cno' of structure 'the_nilfs' must be protected from segment writer, in other words, the caller of nilfs_get_checkpoint should hold read lock for nilfs->ns_segctor_sem. This patch adds the lock/unlock operations in nilfs_attach_checkpoint() when calling nilfs_cpfile_get_checkpoint(). Signed-off-by: Zhang Qiang <zhangqiang.buaa@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-08-18 17:32:27 +09:00
Abhishek Kulkarni	4b53e4b500	9p: remove unnecessary v9fses->options which duplicates the mount string The mount options string is saved in sb->s_options. This patch removes the redundant duplicating of the mount options. Also, since we are not displaying anything special in show options, we replace v9fs_show_options with generic_show_options for now. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:42:28 -05:00
Abhishek Kulkarni	48559b4c30	9p: Add missing cast for the error return value in v9fs_get_inode Cast the error return value (ENOMEM) in v9fs_get_inode() to its correct type using ERR_PTR. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:35:08 -05:00
Jan Kara	5fd1318937	ocfs2: Don't oops in ocfs2_kill_sb on a failed mount If we fail to mount the filesystem, we have to be careful not to dereference uninitialized structures in ocfs2_kill_sb. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-17 14:32:24 -07:00
Abhishek Kulkarni	4d3297ca5b	9p: Remove redundant inode uid/gid assignment Remove a redundant update of inode's i_uid and i_gid after v9fs_get_inode() since the latter already sets up a new inode and sets the proper uid and gid values. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:58 -05:00
Abhishek Kulkarni	1b5ab3e867	9p: Fix possible regressions when ->get_sb fails. ->get_sb can fail causing some badness. this patch fixes * clear sb->fs_s_info in kill_sb. * deactivate_locked_super() calls kill_sb (v9fs_kill_super) which closes the destroys the client, clunks all its fids and closes the v9fs session. Attempting to do it twice will cause an oops. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Abhishek Kulkarni	4f4038328d	9p: Fix v9fs show_options Add the delimiter ',' before the options when they are passed and check if no option parameters are passed to prevent displaying NULL in /proc/mounts. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Abhishek Kulkarni	02bc35672b	9p: Fix possible memleak in v9fs_inode_from fid. Add missing p9stat_free in v9fs_inode_from_fid to avoid any possible leaks. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Abhishek Kulkarni	0e15597ebf	9p: minor comment fixes Fix the comments -- mostly the improper and/or missing descriptions of function parameters. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Abhishek Kulkarni	2bb541157f	9p: Fix possible inode leak in v9fs_get_inode. Add a missing iput when cleaning up if v9fs_get_inode fails after returning a valid inode. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Abhishek Kulkarni	50fb6d2bd7	9p: Check for error in return value of v9fs_fid_add Check if v9fs_fid_add was successful or not based on its return value. Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2009-08-17 16:27:57 -05:00
Linus Torvalds	c58afec8b2	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix locking in xfs_iget_cache_hit	2009-08-17 13:39:30 -07:00
Eric Paris	08e53fcb0d	inotify: start watch descriptor count at 1 The inotify_add_watch man page specifies that inotify_add_watch() will return a non-negative integer. However, historically the inotify watches started at 1, not at 0. Turns out that the inotifywait program provided by the inotify-tools package doesn't properly handle a 0 watch descriptor. In `7e790dd5` we changed from starting at 1 to starting at 0. This patch starts at 1, just like in previous kernels, but also just like in previous kernels it's possible for it to wrap back to 0. This preserves the kernel functionality exactly like it was before the patch (neither method broke the spec) Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-17 13:37:37 -07:00
Eric Paris	cd94c8bbef	inotify: tail drop inotify q_overflow events In `f44aebcc` the tail drop logic of events with no file backing (q_overflow and in_ignored) was reversed so IN_IGNORED events would never be tail dropped. This now means that Q_OVERFLOW events are NOT tail dropped. The fix is to not tail drop IN_IGNORED, but to tail drop Q_OVERFLOW. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-17 13:37:37 -07:00
Eric Paris	eef3a116be	notify: unused event private race inotify decides if private data it passed to get added to an event was used by checking list_empty(). But it's possible that the event may have been dequeued and the private event removed so it would look empty. The fix is to use the return code from fsnotify_add_notify_event rather than looking at the list. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-17 13:37:37 -07:00
Tao Ma	60e2ec4866	ocfs2: release the buffer head in ocfs2_do_truncate. In ocfs2_do_truncate, we forget to release last_eb_bh which will cause memleak. So call brelse in the end. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-17 12:50:35 -07:00
Jan Kara	ada508274b	ocfs2: Handle quota file corruption more gracefully ocfs2_read_virt_blocks() does BUG when we try to read a block from a file beyond its end. Since this can happen due to filesystem corruption, it is not really an appropriate answer. Make ocfs2_read_quota_block() check the condition and handle it by calling ocfs2_error() and returning EIO. [ Modified to print ip_blkno in the error - Joel ] Reported-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-17 12:50:12 -07:00
Steven Whitehouse	31e54b01f3	GFS2: Add sysfs link to device This adds a link from the per-gfs2 sb sysfs directory to the block device upon which the filesystem is mounted. The link is called "device", strangely enough :-) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-17 11:11:18 +01:00
Steven Whitehouse	05164e5b37	GFS2: Replace assertion with proper error handling One fewer assert, one more place we can recover gracefully if there is an error. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-17 11:06:43 +01:00
Steven Whitehouse	6050b9c74f	GFS2: Improve error handling in inode allocation A little while back, block allocation was given some improved error handling which meant that -EIO was returned in the case of there being a problem in the resource group data. In addition a message is printed explaning what went wrong and how to fix it. This extends that error handling so that it also covers inode allocation too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-17 11:05:31 +01:00
Steven Whitehouse	440d6da207	GFS2: Add some more info to uevents With each uevent, we now always include the journal ID. We can't call it JID since that is already in use by some of the individual events relating to recovery, so we use JOURNALID instead. We don't send the JOURNALID for spectator mounts, since there isn't one. Also the ADD event now has both RDONLY and SPECTATOR information to match that of the ONLINE event. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-17 11:05:00 +01:00
Steven Whitehouse	8633ecfaba	GFS2: Add online uevent to GFS2 We already have an offline uevent (used when a withdraw occurs) but no online uevent. This adds an online uevent so that userspace will be able to detect a successful mount by means other than not receiving a remove event after the add & recovery (change) uevents. It has also been added to the remount path as well - we can't use a change uevent there as older GFS2 userspace acts on change uevents according to the state that it thinks the fs is in, so we can't easily add any new ones. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-08-17 11:04:42 +01:00
Christoph Hellwig	bc990f5cb4	xfs: fix locking in xfs_iget_cache_hit The locking in xfs_iget_cache_hit currently has numerous problems: - we clear the reclaim tag without i_flags_lock which protects modifications to it - we call inode_init_always which can sleep with pag_ici_lock held (this is oss.sgi.com BZ #819) - we acquire and drop i_flags_lock a lot and thus provide no consistency between the various flags we set/clear under it This patch fixes all that with a major revamp of the locking in the function. The new version acquires i_flags_lock early and only drops it once we need to call into inode_init_always or before calling xfs_ilock. This patch fixes a bug seen in the wild where we race modifying the reclaim tag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-17 01:23:48 -05:00
Guillaume Knispel	b2add73dbf	poll/select: initialize triggered field of struct poll_wqueues The triggered field of struct poll_wqueues introduced in commit `5f820f648c` ("poll: allow f_op->poll to sleep"). It was first set to 1 in pollwake() (now __pollwake() ), tested and later set to 0 in poll_schedule_timeout(), but not initialized before. As a result when the process needs to sleep, triggered was likely to be non-zero even if pollwake() is not called before the first poll_schedule_timeout(), meaning schedule_hrtimeout_range() would not be called and an extra loop calling all ->poll() would be done. This patch initialize triggered to 0 in poll_initwait() so the ->poll() are not called twice before the process goes to sleep when it needs to. Signed-off-by: Guillaume Knispel <gknispel@proformatique.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-15 18:40:11 -07:00
Benny Halevy	cccddf4f55	nfs: nfs4xdr: optimize low level decoding do not increment decoding ptr if not needed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:26 -04:00
Benny Halevy	c0eae66ece	nfs: nfs4xdr: get rid of READ_BUF Use xdr_inline_decode instead. Open code debug printout and error return. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:23 -04:00
Benny Halevy	2460ba57c4	nfs: nfs4xdr: simplify decode_exchange_id by reusing decode_opaque_inline Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:20 -04:00
Benny Halevy	99398d0655	nfs: nfs4xdr: get rid of COPYMEM Just directly call memcpy. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:17 -04:00
Benny Halevy	e78291e4e0	nfs: nfs4xdr: introduce decode_sessionid helper Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 14:02:14 -04:00
Benny Halevy	db942bbd09	nfs: nfs4xdr: introduce decode_verifier helper Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Trond: Fixed up an 'uninitialised variable' issue in decode_readdir] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:57:58 -04:00
Benny Halevy	07d30434cf	nfs: nfs4xdr: introduce decode_opaque_fixed and decode_stateid helpers Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:26:27 -04:00
Benny Halevy	686841b3cc	nfs: nfs4xdr: introduce print_overflow_msg Part fo the nfs4xdr cleanup. READ_BUF will go away. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:38 -04:00
Benny Halevy	c816fd3406	nfs: nfs4xdr: get rid of READTIME It has no users. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:32 -04:00
Benny Halevy	3ceb4dbb99	nfs: nfs4xdr: get rid of READ64 s/READ64\(\(.)\)/p = xdr_decode_hyper(p, \1)/ s/READ64\((.*)\)/p = xdr_decode_hyper(p, &\1)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:24:13 -04:00
Benny Halevy	6f723f7710	nfs: nfs4xdr: get rid of READ32 s/READ32\((.*)\)/\1 = be32_to_cpup(p++)/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:23:58 -04:00
Benny Halevy	811652bd6e	nfs: nfs4xdr: merge xdr_encode_int+xdr_encode_opaque_fixed into xdr_encode_opaque use encode_string where appropriate. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:19:24 -04:00
Benny Halevy	345585132a	nfs: nfs4xdr: optimize low level encoding do not increment encoding ptr if not needed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:18:03 -04:00
Benny Halevy	13c65ce900	nfs: nfs4xdr: change RESERVE_SPACE macro into a static helper In order to open code and expose the result pointer assignment. Alternatively, we can open code the call to xdr_reserve_space and do the BUG_ON an the error case at the call site. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:17:17 -04:00
Benny Halevy	2220f13a8b	nfs: nfs4xdr: encode_compound_hdr does not have to round up reserved bytes This is already done by xdr_reserve_space and since encode_compound_hdr is adding a byte count to "12" which is already word aligned, the xdr level rounding will work just as well. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:16:00 -04:00
Benny Halevy	42edd69812	nfs: nfs4xdr: optimize RESERVE_SPACE in encode_create_session and encode_sequence Coalesce multilpe constant RESERVE_SPACEs into one Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:15:20 -04:00
Benny Halevy	93f0cf2594	nfs: nfs4xdr: get rid of WRITEMEM s/WRITEMEM(/p = xdr_encode_opaque_fixed(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:13:58 -04:00
Benny Halevy	b95be5a976	nfs: nfs4xdr: get rid of WRITE64 s/WRITE64/p = xdr_encode_hyper(p, / Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:13:15 -04:00
Benny Halevy	e75bc1c89e	nfs: nfs4xdr: get rid of WRITE32 s/WRITE32/*p++ = cpu_to_be32/ Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:12:55 -04:00
Steven Whitehouse	d7e623da1a	GFS2: Fix permissions on "recover" file Although this file is only ever written and not read by userspace, it seems that the utils are opening this file O_RDWR, so we need to allow that. Also fixes the whitespace which seemed to be broken. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>	2009-08-14 14:04:46 +01:00
Linus Torvalds	bc7af9ba15	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (22 commits) ocfs2: Fix possible deadlock when extending quota file ocfs2: keep index within status_map[] ocfs2: Initialize the cluster we're writing to in a non-sparse extend ocfs2: Remove redundant BUG_ON in __dlm_queue_ast() ocfs2/quota: Release lock for error in ocfs2_quota_write. ocfs2: Define credit counts for quota operations ocfs2: Remove syncjiff field from quota info ocfs2: Fix initialization of blockcheck stats ocfs2: Zero out padding of on disk dquot structure ocfs2: Initialize blocks allocated to local quota file ocfs2: Mark buffer uptodate before calling ocfs2_journal_access_dq() ocfs2: Make global quota files blocksize aligned ocfs2: Use ocfs2_rec_clusters in ocfs2_adjust_adjacent_records. ocfs2: Fix deadlock on umount ocfs2: Add extra credits and access the modified bh in update_edge_lengths. ocfs2: Fail ocfs2_get_block() immediately when a block needs allocation ocfs2: Fix error return in ocfs2_write_cluster() ocfs2: Fix compilation warning for fs/ocfs2/xattr.c ocfs2: Initialize count in aio_write before generic_write_checks ocfs2: log the actual return value of ocfs2_file_aio_write() ...	2009-08-13 11:17:40 -07:00
David S. Miller	aa11d958d1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: arch/microblaze/include/asm/socket.h	2009-08-12 17:44:53 -07:00
Linus Torvalds	78efd1ddd9	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix spin_is_locked assert on uni-processor builds xfs: check for dinode realtime flag corruption use XFS_CORRUPTION_ERROR in xfs_btree_check_sblock xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_get xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmap xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_set xfs: switch to NOFS allocation under i_lock in xfs_buf_associate_memory xfs: switch to NOFS allocation under i_lock in xfs_dir_cilookup_result xfs: switch to NOFS allocation under i_lock in xfs_da_buf_make xfs: switch to NOFS allocation under i_lock in xfs_da_state_alloc xfs: switch to NOFS allocation under i_lock in xfs_getbmap xfs: avoid memory allocation under m_peraglock in growfs code	2009-08-12 08:49:35 -07:00
Trond Myklebust	1ae88b2e44	NFS: Fix an O_DIRECT Oops... We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-12 08:21:39 -07:00
Christoph Hellwig	a8914f3a6d	xfs: fix spin_is_locked assert on uni-processor builds Without SMP or preemption spin_is_locked always returns false, so we can't do an assert with it. Instead use assert_spin_locked, which does the right thing on all builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reported-by: Johannes Engel <jcnengel@googlemail.com> Tested-by: Johannes Engel <jcnengel@googlemail.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:08:27 -05:00
Christoph Hellwig	b89d4208de	xfs: check for dinode realtime flag corruption Ramon tested XFS with a modified version of fsfuzzer and hit a NULL pointer dereference in __xfs_get_blocks due to the RT device target pointer being NULL. To fix this reject inode with the realtime bit set on a a filesystem without an RT subvolume during inode read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reported-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Tested-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:08:21 -05:00
Eric Sandeen	e0c222c411	use XFS_CORRUPTION_ERROR in xfs_btree_check_sblock In Red Hat Bug 512552 - Can't write to XFS mount during raid5 resync a user ran into corruption while resyncing a raid, and we failed a consistency test, but didn't get much more info; it'd be nice to call XFS_CORRUPTION_ERROR here so we can see the buffer contents. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:08:10 -05:00
Christoph Hellwig	ddd3a14e0f	xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_get xfs_attr_rmtval_get is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:08:01 -05:00
Christoph Hellwig	7b02ecb303	xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmap xfs_readlink_bmap is called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:53 -05:00
Christoph Hellwig	10746e47e7	xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_set xfs_attr_rmtval_set is always called with i_lock held, and i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:44 -05:00
Christoph Hellwig	36fae17a64	xfs: switch to NOFS allocation under i_lock in xfs_buf_associate_memory xfs_buf_associate_memory is used for setting up the spare buffer for the log wrap case in xlog_sync which can happen under i_lock when called from xfs_fsync. The i_lock mutex is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. There are a couple more uses of xfs_buf_associate_memory in the log recovery code that are also affected by this, but I'd rather keep the code simple than passing on a gfp_mask argument. Longer term we should just stop requiring the memoery allocation in xlog_sync by some smaller rework of the buffer layer. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:38 -05:00
Christoph Hellwig	3f52c2f0a0	xfs: switch to NOFS allocation under i_lock in xfs_dir_cilookup_result xfs_dir_cilookup_result is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:23 -05:00
Christoph Hellwig	73195ed786	xfs: switch to NOFS allocation under i_lock in xfs_da_buf_make i_lock is taken in the reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:14 -05:00
Christoph Hellwig	f41d7fb9da	xfs: switch to NOFS allocation under i_lock in xfs_da_state_alloc xfs_da_state_alloc is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:07:07 -05:00
Christoph Hellwig	ca35dcd6ca	xfs: switch to NOFS allocation under i_lock in xfs_getbmap xfs_getbmap allocates memory with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:06:59 -05:00
Christoph Hellwig	0cc6eee130	xfs: avoid memory allocation under m_peraglock in growfs code Allocate the memory for the larger m_perag array before taking the per-AG lock as the per-AG lock can be taken under the i_lock which can be taken from reclaim context. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-08-12 01:06:51 -05:00
James Morris	8b4bfc7feb	Merge branch 'master' into next	2009-08-11 08:33:01 +10:00
Trond Myklebust	f884dcaead	Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:58 -04:00
Trond Myklebust	976a6f921c	Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:50 -04:00
Jan Kara	b409d7a0ab	ocfs2: Fix possible deadlock when extending quota file In OCFS2, allocator locks rank above transaction start. Thus we cannot extend quota file from inside a transaction less we could deadlock. We solve the problem by starting transaction not already in ocfs2_acquire_dquot() but only in ocfs2_local_read_dquot() and ocfs2_global_read_dquot() and we allocate blocks to quota files before starting the transaction. In case we crash, quota files will just have a few blocks more but that's no problem since we just use them next time we extend the quota file. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-08-10 12:20:22 -07:00

1 2 3 4 5 ...

15121 Commits