linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-12 15:46:54 +07:00

Author	SHA1	Message	Date
Brian Foster	8b5279e33f	xfs: only writeback and truncate pages for the freed range xfs_free_file_space() only affects the range of the file for which space is being freed. It currently writes and truncates the page cache from the start offset of the free to EOF. Modify xfs_free_file_space() to write back and truncate page cache of just the range being freed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:39:05 +10:00
Brian Foster	f71721d061	xfs: writeback and inval. file range to be shifted by collapse The collapse range operation currently writes the entire file before starting the collapse to avoid changes in the in-core extent list due to writeback causing the extent count to change. Now that collapse range is fsb based rather than extent index based it can sustain changes in the extent list during the shift sequence without disruption. Modify xfs_collapse_file_space() to writeback and invalidate pages associated with the range of the file to be shifted. xfs_free_file_space() currently has similar behavior, but the space free need only affect the region of the file that is freed and this could change in the future. Also update the comments to reflect the current implementation. We retain the eofblocks trim permanently as a best option for dealing with delalloc extents. We don't shift delalloc extents because this scenario only occurs with post-eof preallocation (since data must be flushed such that the cache can be invalidated and data can be shifted). That means said space must also be initialized before being shifted into the accessible region of the file only to be immediately truncated off as the last part of the collapse. In other words, the eofblocks trim will happen anyways, we just run it first to ensure the file remains in a consistent state throughout the collapse. Finally, detect and fail explicitly in the event of a delalloc extent during the extent shift. The implementation does not support delalloc extents and the caller is expected to prevent this scenario in advance as is done by collapse. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:39:05 +10:00
Brian Foster	a979bdfea1	xfs: refactor single extent shift into xfs_bmse_shift_one() helper xfs_bmap_shift_extents() has a variety of conditions and error checks that make the logic difficult to follow and indent heavy. Refactor the loop body of this function into a new xfs_bmse_shift_one() helper. This simplifies the error checks, eliminates index decrement on merge hack by pushing the index increment down into the helper, and makes the code more readable by reducing multiple levels of indentation. This is a code refactor only. The behavior of extent shift and collapse range is not modified. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:39:04 +10:00
Brian Foster	ddb19e3180	xfs: refactor shift-by-merge into xfs_bmse_merge() helper The extent shift mechanism in xfs_bmap_shift_extents() is complicated and handles several different, non-deterministic scenarios. These include extent shifts, extent merges and potential btree updates in either of the former scenarios. Refactor the code to be more linear and readable. The loop logic in xfs_bmap_shift_extents() and some initial error checking are adjusted slightly. The associated btree lookup and update/delete operations are condensed into single blocks of code. This reduces the number of btree-specific blocks and facilitates the separation of the merge operation into new xfs_bmse_merge() and xfs_bmse_can_merge() helpers. This is a code refactor only. The behavior of extent shift and collapse range is not modified. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:38:09 +10:00
Brian Foster	2c845f5a5f	xfs: track collapse via file offset rather than extent index The collapse range implementation uses a transaction per extent shift. The progress of the overall operation is tracked via the current extent index of the in-core extent list. This is racy because the ilock must be dropped and reacquired for each transaction according to locking and log reservation rules. Therefore, writeback to prior regions of the file is possible and can change the extent count. This changes the extent to which the current index refers and causes the collapse to fail mid operation. To avoid this problem, the entire file is currently written back before the collapse operation starts. To eliminate the need to flush the entire file, use the file offset (fsb) to track the progress of the overall extent shift operation rather than the extent index. Modify xfs_bmap_shift_extents() to unconditionally convert the start_fsb parameter to an extent index and return the file offset of the extent where the shift left off, if further extents exist. The bulk of this function can remain based on extent index as ilock is held by the caller. xfs_collapse_file_space() now uses the fsb output as the starting point for the subsequent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:37:09 +10:00
Dave Chinner	0d085a529b	xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly XFS has been having trouble with stray delayed allocation extents beyond EOF for a long time. Recent changes to the collapse range code have triggered erroneous EBUSY errors on page invalidation for block size smaller than page size filesystems. These have been caused by dirty buffers beyond EOF on a partial page which do not get written to disk during a sync. The issue is that write-ahead in xfs_cluster_write() finds such a partial page and handles it by leaving the page dirty but pushing it into a writeback state. This used to work just fine, as the write_cache_pages() code would then find the dirty partial page in the next mapping tree lookup as the dirty tag is still set. Unfortunately, when we moved to a mark and sweep approach to writeback to fix other writeback sync issues, we broke this. The act of marking the page as under writeback now clears the TOWRITE tag in the radix tree, even though the page is still dirty. This causes the TOWRITE tag to be cleared, and hence the next lookup on the mapping tree does not find the dirty partial page and so doesn't try to write it again. This same writeback bug was found recently in ext4 and fixed in commit `1c8349a` ("ext4: fix data integrity sync in ordered mode") without communication to the wider filesystem community. We can use exactly the same fix here so the TOWRITE flag is not cleared on partial page writes. cc: stable@vger.kernel.org # dependent on `1c8349a171` Root-cause-found-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-23 15:36:27 +10:00
Brian Foster	41b9d7263e	xfs: trim eofblocks before collapse range xfs_collapse_file_space() currently writes back the entire file undergoing collapse range to settle things down for the extent shift algorithm. While this prevents changes to the extent list during the collapse operation, the writeback itself is not enough to prevent unnecessary collapse failures. The current shift algorithm uses the extent index to iterate the in-core extent list. If a post-eof delalloc extent persists after the writeback (e.g., a prior zero range op where the end of the range aligns with eof can separate the post-eof blocks such that they are not written back and converted), xfs_bmap_shift_extents() becomes confused over the encoded br_startblock value and fails the collapse. As with the full writeback, this is a temporary fix until the algorithm is improved to cope with a volatile extent list and avoid attempts to shift post-eof extents. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:53 +10:00
Dave Chinner	1669a8ca21	xfs: xfs_file_collapse_range is delalloc challenged If we have delalloc extents on a file before we run a collapse range operation, we sync the range that we are going to collapse to convert delalloc extents in that region to real extents to simplify the shift operation. However, the shift operation then assumes that the extent list is not going to change as it iterates over the extent list moving things about. Unfortunately, this isn't true because we can't hold the ILOCK over all the operations. We can prevent new IO from modifying the extent list by holding the IOLOCK, but that doesn't prevent writeback from running.... And when writeback runs, it can convert delalloc extents in the range of the file prior to the region being collapsed, and this changes the indexes of all the extents in the file. That causes the collapse range operation to Go Bad. The right fix is to rewrite the extent shift operation not to be dependent on the extent list not changing across the entire operation, but this is a fairly significant piece of work to do. Hence, as a short-term workaround for the problem, sync the entire file before starting a collapse operation to remove all delalloc ranges from the file and so avoid the problem of concurrent writeback changing the extent list. Diagnosed-and-Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:53 +10:00
Brian Foster	ca446d880c	xfs: don't log inode unless extent shift makes extent modifications The file collapse mechanism uses xfs_bmap_shift_extents() to collapse all subsequent extents down into the specified, previously punched out, region. This function performs some validation, such as whether a sufficient hole exists in the target region of the collapse, then shifts the remaining extents downward. The exit path of the function currently logs the inode unconditionally. While we must log the inode (and abort) if an error occurs and the transaction is dirty, the initial validation paths can generate errors before the transaction has been dirtied. This creates an unnecessary filesystem shutdown scenario, as the caller will cancel a transaction that has been marked dirty. Modify xfs_bmap_shift_extents() to OR the logflags bits as modifications are made to the inode bmap. Only log the inode in the exit path if logflags has been set. This ensures we only have to cancel a dirty transaction if modifications have been made and prevents an unnecessary filesystem shutdown otherwise. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:53 +10:00
Dave Chinner	7d4ea3ce63	xfs: use ranged writeback and invalidation for direct IO Now we are not doing silly things with dirtying buffers beyond EOF and using invalidation correctly, we can finally reduce the ranges of writeback and invalidation used by direct IO to match that of the IO being issued. Bring the writeback and invalidation ranges back to match the generic direct IO code - this will greatly reduce the perturbation of cached data when direct IO and buffered IO are mixed, but still provide the same buffered vs direct IO coherency behaviour we currently have. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:53 +10:00
Dave Chinner	834ffca6f7	xfs: don't zero partial page cache pages during O_DIRECT writes Similar to direct IO reads, direct IO writes are using truncate_pagecache_range to invalidate the page cache. This is incorrect due to the sub-block zeroing in the page cache that truncate_pagecache_range() triggers. This patch fixes things by using invalidate_inode_pages2_range instead. It preserves the page cache invalidation, but won't zero any pages. cc: stable@vger.kernel.org Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:52 +10:00
Chris Mason	85e584da32	xfs: don't zero partial page cache pages during O_DIRECT writes xfs is using truncate_pagecache_range to invalidate the page cache during DIO reads. This is different from the other filesystems who only invalidate pages during DIO writes. truncate_pagecache_range is meant to be used when we are freeing the underlying data structs from disk, so it will zero any partial ranges in the page. This means a DIO read can zero out part of the page cache page, and it is possible the page will stay in cache. buffered reads will find an up to date page with zeros instead of the data actually on disk. This patch fixes things by using invalidate_inode_pages2_range instead. It preserves the page cache invalidation, but won't zero any pages. [dchinner: catch error and warn if it fails. Comment.] cc: stable@vger.kernel.org Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:52 +10:00
Dave Chinner	22e757a49c	xfs: don't dirty buffers beyond EOF generic/263 is failing fsx at this point with a page spanning EOF that cannot be invalidated. The operations are: 1190 mapwrite 0x52c00 thru 0x5e569 (0xb96a bytes) 1191 mapread 0x5c000 thru 0x5d636 (0x1637 bytes) 1192 write 0x5b600 thru 0x771ff (0x1bc00 bytes) where 1190 extends EOF from 0x54000 to 0x5e569. When the direct IO write attempts to invalidate the cached page over this range, it fails with -EBUSY and so any attempt to do page invalidation fails. The real question is this: Why can't that page be invalidated after it has been written to disk and cleaned? Well, there's data on the first two buffers in the page (1k block size, 4k page), but the third buffer on the page (i.e. beyond EOF) is failing drop_buffers because it's bh->b_state == 0x3, which is BH_Uptodate | BH_Dirty. IOWs, there's dirty buffers beyond EOF. Say what? OK, set_buffer_dirty() is called on all buffers from __set_page_buffers_dirty(), regardless of whether the buffer is beyond EOF or not, which means that when we get to ->writepage, we have buffers marked dirty beyond EOF that we need to clean. So, we need to implement our own .set_page_dirty method that doesn't dirty buffers beyond EOF. This is messy because the buffer code is not meant to be shared and it has interesting locking issues on the buffer dirty bits. So just copy and paste it and then modify it to suit what we need. Note: the solution the other filesystems and generic block code use, marking the buffers clean in ->writepage, does not work for XFS. It still leaves dirty buffers beyond EOF and invalidations still fail. Hence rather than play whack-a-mole, this patch simply prevents those buffers from being dirtied in the first place. cc: <stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-09-02 12:12:51 +10:00
Linus Torvalds	52addcf9d6	Linux 3.17-rc2	2014-08-25 15:36:20 -07:00
Linus Torvalds	f01bfc977e	NFS client fixes for 3.17 Highlights: - More fixes for read/write codepath regressions - Sleeping while holding the inode lock - Stricter enforcement of page contiguity when coalescing requests - Fix up error handling in the page coalescing code - Don't busy wait on SIGKILL in the file locking code Merge tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client fixes from Trond Myklebust: "Highlights: - more fixes for read/write codepath regressions * sleeping while holding the inode lock * stricter enforcement of page contiguity when coalescing requests * fix up error handling in the page coalescing code - don't busy wait on SIGKILL in the file locking code" * tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait nfs: can_coalesce_requests must enforce contiguity nfs: disallow duplicate pages in pgio page vectors nfs: don't sleep with inode lock in lock_and_join_requests nfs: fix error handling in lock_and_join_requests nfs: use blocking page_group_lock in add_request nfs: fix nonblocking calls to nfs_page_group_lock nfs: change nfs_page_group_lock argument	2014-08-25 15:34:28 -07:00
Linus Torvalds	dd5957b78f	SH Drivers Updates For v3.17 * Confine SH_INTC to platforms that need it Merge tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas Pull SH driver fix from Simon Horman: "Confine SH_INTC to platforms that need it" * tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: sh: intc: Confine SH_INTC to platforms that need it	2014-08-25 15:29:33 -07:00
Linus Torvalds	497c01dda9	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: "Pretty much all across the field so with this we should be in reasonable shape for the upcoming -rc2" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: OCTEON: make get_system_type() thread-safe MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init' MIPS: EVA: Add new EVA header MIPS: scall64-o32: Fix indirect syscall detection MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64 MIPS: Loongson: Fix COP2 usage for preemptible kernel MIPS: NL: Fix nlm_xlp_defconfig build error MIPS: Remove race window in page fault handling MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G MIPS: Alchemy: Fix db1200 PSC clock enablement MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785 MIPS: Remove duplicated include from numa.c MIPS: Add common plat_irq_dispatch declaration MIPS: MSP71xx: remove unused plat_irq_dispatch() argument MIPS: GIC: Remove useless parens from GICBIS(). MIPS: perf: Mark pmu interupt IRQF_NO_THREAD	2014-08-25 15:28:57 -07:00
Linus Torvalds	01e9982ab3	The rewrite of the ftrace code that makes it possible to allow for separate trampolines had a design flaw with the interaction between the function and function_graph tracers. The main flaw was the simplification of the use of multiple tracers having the same filter (like function and function_graph, that use the set_ftrace_filter file to filter their code). The design assumed that the two tracers could never run simultaneously as only one tracer can be used at a time. The problem with this assumption was that the function profiler could be implemented on top of the function graph tracer, and the function profiler could run at the same time as the function tracer. This caused the assumption to be broken and when ftrace detected this failed assumption it would spit out a nasty warning and shut itself down. Instead of using a single ftrace_ops that switches between the function and function_graph callbacks, the two tracers can again use their own ftrace_ops. But instead of having a complex hierarchy of ftrace_ops, the filter fields are placed in its own structure and the ftrace_ops can carefully use the same filter. This change took a bit to be able to allow for this and currently only the global_ops can share the same filter, but this new design can easily be modified to allow for any ftrace_ops to share its filter with another ftrace_ops. The first four patches deal with the change of allowing the ftrace_ops to share the filter (and this needs to go to 3.16 as well). The fifth patch fixes a bug that was also caused by the new changes but only for archs other than x86, and only if those archs implement a direct call to the function_graph tracer which they do not do yet but will in the future. It does not need to go to stable, but needs to be fixed before the other archs update their code to allow direct calls to the function_graph trampoline. Merge tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull fix for ftrace function tracer/profiler conflict from Steven Rostedt: "The rewrite of the ftrace code that makes it possible to allow for separate trampolines had a design flaw with the interaction between the function and function_graph tracers. The main flaw was the simplification of the use of multiple tracers having the same filter (like function and function_graph, that use the set_ftrace_filter file to filter their code). The design assumed that the two tracers could never run simultaneously as only one tracer can be used at a time. The problem with this assumption was that the function profiler could be implemented on top of the function graph tracer, and the function profiler could run at the same time as the function tracer. This caused the assumption to be broken and when ftrace detected this failed assumption it would spit out a nasty warning and shut itself down. Instead of using a single ftrace_ops that switches between the function and function_graph callbacks, the two tracers can again use their own ftrace_ops. But instead of having a complex hierarchy of ftrace_ops, the filter fields are placed in its own structure and the ftrace_ops can carefully use the same filter. This change took a bit to be able to allow for this and currently only the global_ops can share the same filter, but this new design can easily be modified to allow for any ftrace_ops to share its filter with another ftrace_ops. The first four patches deal with the change of allowing the ftrace_ops to share the filter (and this needs to go to 3.16 as well). The fifth patch fixes a bug that was also caused by the new changes but only for archs other than x86, and only if those archs implement a direct call to the function_graph tracer which they do not do yet but will in the future. It does not need to go to stable, but needs to be fixed before the other archs update their code to allow direct calls to the function_graph trampoline" * tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Use current addr when converting to nop in __ftrace_replace_code() ftrace: Fix function_profiler and function tracer together ftrace: Fix up trampoline accounting with looping on hash ops ftrace: Update all ftrace_ops for a ftrace_hash_ops update ftrace: Allow ftrace_ops to use the hashes from other ops	2014-08-25 15:11:53 -07:00
Linus Torvalds	7be141d055	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A couple of EFI fixes, plus misc fixes all around the map" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/arm64: Store Runtime Services revision firmware: Do not use WARN_ON(!spin_is_locked()) x86_32, entry: Clean up sysenter_badsys declaration x86/doc: Fix the 'tlb_single_page_flush_ceiling' sysconfig path x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the variable read-mostly x86/mm: Fix RCU splat from new TLB tracepoints	2014-08-24 16:17:41 -07:00
Linus Torvalds	44744bb344	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A kprobes and a perf compat ioctl fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Handle compat ioctl kprobes: Skip kretprobe hit in NMI context to avoid deadlock	2014-08-24 16:16:55 -07:00
Linus Torvalds	959dc2587d	ARM: SoC fixes for 3.17-rc A collection of fixes from this week, it's been pretty quiet and nothing really stands out as particularly noteworthy here -- mostly minor fixes across the field: - ODROID booting was fixed due to PMIC interrupts missing in DT - A collection of i.MX fixes - Minor Tegra fix for regulators - Rockchip fix and addition of SoC-specific mailing list to make it easier to find posted patches. Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "A collection of fixes from this week, it's been pretty quiet and nothing really stands out as particularly noteworthy here -- mostly minor fixes across the field: - ODROID booting was fixed due to PMIC interrupts missing in DT - a collection of i.MX fixes - minor Tegra fix for regulators - Rockchip fix and addition of SoC-specific mailing list to make it easier to find posted patches" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: bus: arm-ccn: Fix warning message ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies MAINTAINERS: add new Rockchip SoC list ARM: dts: rockchip: readd missing mmc0 pinctrl settings ARM: dts: ODROID i2c improvements ARM: dts: Enable PMIC interrupts on ODROID ARM: dts: imx6sx: fix the pad setting for uart CTS_B ARM: dts: i.MX53: fix apparent bug in VPU clks ARM: imx: correct gpu2d_axi and gpu3d_axi clock setting ARM: dts: imx6: edmqmx6: change enet reset pin ARM: dts: vf610-twr: Fix pinctrl_esdhc1 pin definitions. ARM: imx: remove unnecessary ARCH_HAS_OPP select ARM: imx: fix TLB missing of IOMUXC base address during suspend ARM: imx6: fix SMP compilation again ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes	2014-08-24 15:57:00 -07:00
Linus Torvalds	fa7f78e02e	Fixes for the v3.17 series: - A largeish fix for the IRQ handling in the new Zynq driver. The quite verbose commit message gives the exact details. - Move some defines for gpiod flags outside an ifdef to make stub functions work again. - Various minor fixes that we can accept for -rc1. Merge tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull gpio fixes from Linus Walleij: - a largeish fix for the IRQ handling in the new Zynq driver. The quite verbose commit message gives the exact details. - move some defines for gpiod flags outside an ifdef to make stub functions work again. - various minor fixes that we can accept for -rc1. * tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio-lynxpoint: enable input sensing in resume gpio: move GPIOD flags outside #ifdef gpio: delete unneeded test before of_node_put gpio: zynq: Fix IRQ handlers gpiolib: devres: use correct structure type name in sizeof MAINTAINERS: Change maintainer for gpio-bcm-kona.c	2014-08-24 15:54:23 -07:00
Linus Torvalds	5e30ca1e44	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Intel and radeon fixes. Post KS/LC git requests from i915 and radeon stacked up. They are all fixes along with some new pci ids for radeon, and one maintainers file entry. - i915: display fixes and irq fixes - radeon: pci ids, and misc gpuvm, dpm and hdp cache" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits) MAINTAINERS: Add entry for Renesas DRM drivers drm/radeon: add additional SI pci ids drm/radeon: add new bonaire pci ids drm/radeon: add new KV pci id Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe" drm/radeon: fix active_cu mask on SI and CIK after re-init (v3) drm/radeon: fix active cu count for SI and CIK drm/radeon: re-enable selective GPUVM flushing drm/radeon: Sync ME and PFP after CP semaphore waits v4 drm/radeon: fix display handling in radeon_gpu_reset drm/radeon: fix pm handling in radeon_gpu_reset drm/radeon: Only flush HDP cache for indirect buffers from userspace drm/radeon: properly document reloc priority mask drm/i915: don't try to retrain a DP link on an inactive CRTC drm/i915: make sure VDD is turned off during system suspend drm/i915: cancel hotplug and dig_port work during suspend and unload drm/i915: fix HPD IRQ reenable work cancelation drm/i915: take display port power domain in DP HPD handler drm/i915: Don't try to enable cursor from setplane when crtc is disabled drm/i915: Skip load detect when intel_crtc->new_enable==true ...	2014-08-24 15:48:12 -07:00
Benjamin LaHaise	d856f32a86	aio: fix reqs_available handling As reported by Dan Aloni, commit `f8567a3845` ("aio: fix aio request leak when events are reaped by userspace") introduces a regression when user code attempts to perform io_submit() with more events than are available in the ring buffer. Reverting that commit would reintroduce a regression when user space event reaping is used. Fixing this bug is a bit more involved than the previous attempts to fix this regression. Since we do not have a single point at which we can count events as being reaped by user space and io_getevents(), we have to track event completion by looking at the number of events left in the event ring. So long as there are as many events in the ring buffer as there have been completion events generated, we cannot call put_reqs_available(). The code to check for this is now placed in refill_reqs_available(). A test program from Dan and modified by me for verifying this bug is available at http://www.kvack.org/~bcrl/20140824-aio_bug.c . Reported-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Acked-by: Dan Aloni <dan@kernelim.com> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: stable@vger.kernel.org # v3.16 and anything that `f8567a3845` was backported to Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-24 15:47:27 -07:00
Pawel Moll	bf87bb12bd	bus: arm-ccn: Fix warning message A message warning a user about wrong vc value was printing out port instead. Reported-by: Drew Richardson <drew.richardson@arm.com> Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Olof Johansson <olof@lixom.net>	2014-08-24 11:28:30 -07:00
Geert Uytterhoeven	12266db732	ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux On r8a7791, i2c6 (aka iic3) doesn't need pinmux, but the koelsch dts refers to non-existent pinmux configuration data: pinmux core: sh-pfc does not support function i2c6 sh-pfc e6060000.pfc: invalid function i2c6 in map table Remove it to fix this. Fixes: commit `1d41f36a68` ("ARM: shmobile: koelsch dts: Add VDD MPU regulator for DVFS") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Olof Johansson <olof@lixom.net>	2014-08-24 11:23:28 -07:00
Marcel Ziswiler	caa9eac5bc	ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies Working on Gigabit/PCIe support in U-Boot for Apalis T30 I realised that the current device tree source includes for our modules only happen to work due to referencing the on-carrier 5v0 supply from USB which is not at all available on-module. The modules actually contain TPS60150 charge pumps to generate the PMIC required 5 volts from the one and only 3.3 volt module supply. This patch fixes this. (Note: When back-porting this to v3.16 stable releases, simply drop the change to tegra30-apalis.dtsi; that file was added in v3.17) Cc: <stable@vger.kernel.org> #v3.16+ Signed-off-by: Marcel Ziswiler <marcel@ziswiler.com> Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Olof Johansson <olof@lixom.net>	2014-08-24 11:21:19 -07:00
Olof Johansson	9d0b1f345e	Pinctrl that got accidentially dropped when reorganizing the dts files and addition of the new Rockchip list to MAINTAINERS. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCAAGBQJT+hwkAAoJEPOmecmc0R2Bda4IAJ7rxypWxi8RtfxULHu1oGxZ cdmAsaedAkjTn6tye/99lv1WkLzUD+Q3AxLX6Glu8PgTkocbwtkVTP+gMbv7zwfN Z9bA6kazPL4Rskr6Am8BFKkGhXHCXChGP1J3z9Gyh07Vu2+3zsvRzyM7hySt3TqZ EA2dL/uODlukw6EPzFaqGWZLn1OuJmkHXDfnZ3lBI/GFZD9qt5DnYswMBlDlIwr6 D08jmcleRA3wpnY1HXYR2cN1sJmcEP6xVE8ApXd71X0MtumEy49ZTR+4T5/Sh/XN OPheoH3UUVtXVrAa5fsoUE2xsKs3PZgnIHk1kQklpbw+HArbN+aW0Jh6CySb65Q= =aNJj -----END PGP SIGNATURE----- Merge tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes Merge "ARM: rockchip: fix for 3.17" from Heiko Stubner: Pinctrl that got accidentially dropped when reorganizing the dts files and addition of the new Rockchip list to MAINTAINERS. * tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: MAINTAINERS: add new Rockchip SoC list ARM: dts: rockchip: readd missing mmc0 pinctrl settings Signed-off-by: Olof Johansson <olof@lixom.net>	2014-08-24 11:19:58 -07:00
Laurent Pinchart	a284e9d14e	MAINTAINERS: Add entry for Renesas DRM drivers Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Dave Airlie <airlied@gmail.com>	2014-08-24 16:37:47 +10:00
Dave Airlie	db314f2310	Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux into drm-next This pull just contains some new pci ids. * 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux: drm/radeon: add additional SI pci ids drm/radeon: add new bonaire pci ids drm/radeon: add new KV pci id	2014-08-24 15:47:46 +10:00
Heiko Stuebner	00250b5293	MAINTAINERS: add new Rockchip SoC list Add the new list that Rockchip-specific patches should also be directed to. Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2014-08-23 13:22:05 +02:00
Heiko Stuebner	1302d32c84	ARM: dts: rockchip: readd missing mmc0 pinctrl settings During the restructuring of the Rockchip Cortex-A9 dtsi files it seems like the pinctrl settings vanished at some point from the mmc0 support. This of course renders them unusable, so readd the necessary pinctrl properties. Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2014-08-23 13:21:45 +02:00
Olof Johansson	2136edf3bf	Allwinner DT changes, take 2 Only a single patch in here that fixes a DTC warning. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJT2hD6AAoJEBx+YmzsjxAgV9YP/1wML2m04CODnLmLhKyx1GyM VyBoEHb+o7P7TRyK7ePIVkGhGnQjYGMyXAI2KfOiBJpAl23uxXCXSNKc6WY+nBAw 6T6qL6w+tPIPW4B79fAjmoPksqhoVf9sAKAkdu6+ETwKMHRaF0IesPRTg9t21Wcd yHHDY4z/qaFCMV59zeOIWMxM0V0gvUa46+bXC2flyTcTz5RgyUswCOXKHLvXfK1L w8B+V34AeBtm2EzG6co1iOJVRKXlrjzUg25+adbRissNfvmVkPwYkQJNoBO6r/PO Rxf7Mq50oovPl/Oc83e3RZl3tE8ds12t+ScS0qUFq6+tBvyq53IYP+51IhDOGiR3 a6PB1nPEXtvUt09xv5SPJiRsB+BFrh5hx+qWbDvUNQL5FZrqBxa63H6qArybFQXg GaY8Zv5gdiMvAU0tdk7v4pduJkvvoG2GALGHPrcKmrpg2+qQ4LDKOJxErIEqOUi2 ERp7c0kODDz18F9OmjGYU2R7XT6Ji8h4MS/hbCiLajcboQMe3yClrcVB8ts+AUKJ NSDWl5Q0/8uHhk40FAaGYEqVLakk8vhXwdg1hpcXzIg3dJg+P09dYK5nIOBzKIok +eVXZ8xjcRnoBAumljZ3eGbTuysGDO95E56coDPE4IiAe4Esi7kl3fUS/NICoOGs MC9tjeyMoscy+hoGsP57 =PL45 -----END PGP SIGNATURE----- Merge tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes Merge "Allwinner DT changes, take 2" from Maxime Ripard: Only a single patch in here that fixes a DTC warning. * tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes Signed-off-by: Olof Johansson <olof@lixom.net>	2014-08-22 22:57:57 -07:00
Steven Rostedt (Red Hat)	39b5552cd5	ftrace: Use current addr when converting to nop in __ftrace_replace_code() In __ftrace_replace_code(), when converting the call to a nop in a function it needs to compare against the "curr" (current) value of the ftrace ops, and not the "new" one. It currently does not affect x86 which is the only arch to do the trampolines with function graph tracer, but when other archs that do depend on this code implement the function graph trampoline, it can crash. Here's an example when ARM uses the trampolines (in the future): ------------[ cut here ]------------ WARNING: CPU: 0 PID: 9 at kernel/trace/ftrace.c:1716 ftrace_bug+0x17c/0x1f4() Modules linked in: omap_rng rng_core ipv6 CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.16.0-test-10959-gf0094b28f303-dirty #52 [<c02188f4>] (unwind_backtrace) from [<c021343c>] (show_stack+0x20/0x24) [<c021343c>] (show_stack) from [<c095a674>] (dump_stack+0x78/0x94) [<c095a674>] (dump_stack) from [<c02532a0>] (warn_slowpath_common+0x7c/0x9c) [<c02532a0>] (warn_slowpath_common) from [<c02532ec>] (warn_slowpath_null+0x2c/0x34) [<c02532ec>] (warn_slowpath_null) from [<c02cbac4>] (ftrace_bug+0x17c/0x1f4) [<c02cbac4>] (ftrace_bug) from [<c02cc44c>] (ftrace_replace_code+0x80/0x9c) [<c02cc44c>] (ftrace_replace_code) from [<c02cc658>] (ftrace_modify_all_code+0xb8/0x164) [<c02cc658>] (ftrace_modify_all_code) from [<c02cc718>] (__ftrace_modify_code+0x14/0x1c) [<c02cc718>] (__ftrace_modify_code) from [<c02c7244>] (multi_cpu_stop+0xf4/0x134) [<c02c7244>] (multi_cpu_stop) from [<c02c6e90>] (cpu_stopper_thread+0x54/0x130) [<c02c6e90>] (cpu_stopper_thread) from [<c0271cd4>] (smpboot_thread_fn+0x1ac/0x1bc) [<c0271cd4>] (smpboot_thread_fn) from [<c026ddf0>] (kthread+0xe0/0xfc) [<c026ddf0>] (kthread) from [<c020f318>] (ret_from_fork+0x14/0x20) ---[ end trace dc9ce72c5b617d8f ]--- [ 65.047264] ftrace failed to modify [<c0208580>] asm_do_IRQ+0x10/0x1c [ 65.054070] actual: 85:1b:00:eb Fixes: `7413af1fb7` "ftrace: Make 
get_ftrace_addr() and get_ftrace_addr_old() global" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-08-22 21:04:35 -04:00
Steven Rostedt (Red Hat)	5f151b2401	ftrace: Fix function_profiler and function tracer together The latest rewrite of ftrace removed the separate ftrace_ops of the function tracer and the function graph tracer and had them share the same ftrace_ops. This simplified the accounting by removing the multiple layers of functions called, where the global_ops func would call a special list that would iterate over the other ops that were registered within it (like function and function graph), which itself was registered to the ftrace ops list of all functions currently active. If that sounds confusing, the code that implemented it was also confusing and its removal is a good thing. The problem with this change was that it assumed that the function and function graph tracer can never be used at the same time. This is mostly true, but there is an exception. That is when the function profiler uses the function graph tracer to profile. The function profiler can be activated the same time as the function tracer, and this breaks the assumption and the result is that ftrace will crash (it detects the error and shuts itself down, it does not cause a kernel oops). To solve this issue, a previous change allowed the hash tables for the functions traced by a ftrace_ops to be a pointer and let multiple ftrace_ops share the same hash. This allows the function and function_graph tracer to have separate ftrace_ops, but still share the hash, which is what is done. Now the function and function graph tracers have separate ftrace_ops again, and the function tracer can be run while the function_profile is active. Cc: stable@vger.kernel.org # 3.16 (apply after 3.17-rc4 is out) Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-08-22 21:04:34 -04:00
David Jeffery	92a56555bd	nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait If a SIGKILL is sent to a task waiting in __nfs_iocounter_wait, it will busy-wait or soft lockup in its while loop. nfs_wait_bit_killable won't sleep, and the loop won't exit on the error return. Stop the busy-wait by breaking out of the loop when nfs_wait_bit_killable returns an error. Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:44 -04:00
Weston Andros Adamson	78270e8fbc	nfs: can_coalesce_requests must enforce contiguity Commit `6094f83864` "nfs: allow coalescing of subpage requests" got rid of the requirement that requests cover whole pages, but it made some incorrect assumptions. It turns out that callers of this interface can map adjacent requests (by file position as seen by req_offset + req->wb_bytes) to different pages, even when they could share a page. An example is the direct I/O interface - iov_iter_get_pages_alloc may return one segment with a partial page filled and the next segment (which is adjacent in the file position) starts with a new page. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:44 -04:00
Weston Andros Adamson	bba5c1887a	nfs: disallow duplicate pages in pgio page vectors Adjacent requests that share the same page are allowed, but should only use one entry in the page vector. This avoids overrunning the page vector - it is sized based on how many bytes there are, not by request count. This fixes issues that manifest as "Redzone overwritten" bugs (the vector overrun) and hangs waiting on page read / write, as it waits on the same page more than once. This also adds bounds checking to the page vector with a graceful failure (WARN_ON_ONCE and pgio error returned to application). Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:44 -04:00
Weston Andros Adamson	7c3af97525	nfs: don't sleep with inode lock in lock_and_join_requests This handles the 'nonblock=false' case in nfs_lock_and_join_requests. If the group is already locked and blocking is allowed, drop the inode lock and wait for the group lock to be cleared before trying it all again. This should fix warnings found in peterz's tree (sched/wait branch), where might_sleep() checks are added to wait.[ch]. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:43 -04:00
Weston Andros Adamson	94970014c4	nfs: fix error handling in lock_and_join_requests This fixes handling of errors from nfs_page_group_lock in nfs_lock_and_join_requests. It now releases the inode lock and the reference to the head request. Reported-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:43 -04:00
Weston Andros Adamson	bfd484a560	nfs: use blocking page_group_lock in add_request __nfs_pageio_add_request was calling nfs_page_group_lock nonblocking, but this can return -EAGAIN which would end up passing -EIO to the application. There is no reason not to block in this path, so change the two calls to do so. Also, there is no need to check the return value of nfs_page_group_lock when nonblock=false, so remove the error handling code. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:43 -04:00
Weston Andros Adamson	bc8a309e88	nfs: fix nonblocking calls to nfs_page_group_lock nfs_page_group_lock was calling wait_on_bit_lock even when told not to block. Fix by first trying test_and_set_bit, followed by wait_on_bit_lock if and only if blocking is allowed. Return -EAGAIN if nonblocking and the test_and_set of the bit was already locked. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:42 -04:00
Weston Andros Adamson	fd2f3a06d3	nfs: change nfs_page_group_lock argument Flip the meaning of the second argument from 'wait' to 'nonblock' to match related functions. Update all five calls to reflect this change. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-22 18:04:42 -04:00
Linus Torvalds	451fd72219	pwm: Fixes for v3.17-rc2 Just one bugfix for the PWM lookup table code that would cause a PWM channel to be set to the wrong period and polarity for non-perfect matches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJT8cLVAAoJEN0jrNd/PrOhRiQP/10av7t2Ahdn3Qi2imz5j3UO lA73tzU36MEchsNoDkYnLwZik6x/3GYk7QUkPoyMFYcay1Wu/USj5hTuZl2phOtF 9tdqkKosrV7APJfpzfuoj6W2FtKBIV4iaxez+ZrqXXKj4BdKXGFvv72w4xf/EWE9 aPabqg3lvorZY42adqbqH5kbATd61FJPZktwzKfmg7O01Wnp2GL3xCPApq9CsBEQ c7i9TR1ttEQZNM6RRs7auwgRNgbuxFZkXRSP5VFbFb1TB3OMCDcGY+PXab42SYLR ztlUao93jZP9Dz7abIGHcZDgRpj7i6veu09RAH6C1Lr0ovvcTor69LlsgvaWQKKb 8CMiKGpLVF3Sg3wLwrSRgUb7FMNVc/R1lR//BtMMTxFcVNTvxc18Tl41azx3kZEt UobQ3IzpalOlJTj1ADzUwws9alcgnD5hD6SEQJwwuqEzJTB4FeTepnrr4VgAA1oM HU8+TzSdZLV0lDIl43rKj0kZ93ds3i2lM/FU6e0Z4bZT1K53J9a4iQsKJIFy83An bcT0lr1kgBTHoBvSnCLsSB6ZmWZ3rmnQ6kIWv/nfzcQKQdNLMpyEbb1xComM/SOI VjaIf/OaTa9h1QswSS2kYZpOguNiQkzRQtDsH4Pr0UgG7g7A5kofjsaP1VZRjcCq FEFj0Zi+aHYX/UKDNa72 =SmTb -----END PGP SIGNATURE----- Merge tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm fix from Thierry Reding: "Just one bugfix for the PWM lookup table code that would cause a PWM channel to be set to the wrong period and polarity for non-perfect matches" * tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: Fix period and polarity in pwm_get() for non-perfect matches	2014-08-22 14:50:21 -07:00
Michal Kazior	47e4df94d1	mac80211: fix channel switch for chanctx-based drivers The new_ctx pointer is set only for non-chanctx drivers. This yielded a crash for chanctx-based drivers during channel switch finalization: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: ieee80211_vif_use_reserved_switch+0x71c/0xb00 [mac80211] Use an adequate chanctx pointer to fix this. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-22 14:45:49 -07:00
Linus Torvalds	433ab34d26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Here are some bug fixes that have piled up during ksummit/linuxcon. 1) Fix endian problems in ibmveth, from Anton Blanchard. 2) IPV6 routing code does GFP_KERNEL allocation in atomic, fix from Benjamin Block. 3) SCTP association fixes from Daniel Borkmann. 4) When multiple VLAN headers are present we have to make sure the second and subsequent ones are pullable in the SKB otherwise we blindly dereference garbage. From Jiri Benc. 5) The argument adjustment of the signature of hlist_add_after() introduced a regression in the batman-adv code, fix from Sven Eckelmann. 6) Fix TX hang handling to avoid a panic in i40e, from Anjali Singhai Jain. 7) PTP flag test is inverted in i40e driver, from Jesse Brandeburg. 8) ATM LEC driver needs to hold RTNL mutex over MTU changes, from Chas Williams. 9) Truncate packets larger than the TPACKET_V3 format configured buffers, otherwise we overwrite past the end of said buffers. From Eric Dumazet. 10) Fix endianness bugs in qlcnic firmware handling, from Rajesh Borundia and Shahed Shaikh. 11) CXGB4 sometimes doesn't get all of the TX completion events it should resulting in SKBs getting stuck in the TX queue, from Hariprasad Shenai. 12) When the FEC chip's PTP clock is disabled, you can't access the register. 
Add necessary checks to avoid the resulting hang, from Fugang Duan" git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits) drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef net: sctp: fix suboptimal edge-case on non-active active/retrans path selection net: sctp: spare unnecessary comparison in sctp_trans_elect_best net: ethernet: broadcom: bnx2x: Remove redundant #ifdef ibmveth: Fix endian issues with rx_no_buffer statistic net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings() openvswitch: fix panic with multiple vlan headers net: ipv6: fib: don't sleep inside atomic lock net: fec: ptp: avoid register access when ipg clock is disabled cxgb4: Free completed tx skbs promptly cxgb4: Fix race condition in cleanup sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe bnx2x: Revert UNDI flushing mechanism qlcnic: Fix endianess issue in firmware load from file operation qlcnic: Fix endianess issue in FW dump template header qlcnic: Fix flash access interface to application MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver macvlan: Allow setting multicast filter on all macvlan types packet: handle too big packets for PACKET_V3 MAINTAINERS: add entry for ec_bhf driver ...	2014-08-22 14:33:18 -07:00
Steven Rostedt (Red Hat)	bce0b6c51a	ftrace: Fix up trampoline accounting with looping on hash ops Now that a ftrace_hash can be shared by multiple ftrace_ops, they can dec the rec->flags by more than once (one per those that share the ftrace_hash). This means that the tramp_hash may not have a hash item when it was added. For example, if two ftrace_ops share a hash for a ftrace record, and the first ops has a trampoline, when it adds itself it will set the rec->flags TRAMP flag and increments its nr_trampolines counter. When the second ops is added, it must clear that tramp flag but also decrement the other ops that shares its hash. As the update to the function callbacks has not yet been performed, the other ops will not have the tramp hash set yet and it can not be used to know to decrement its nr_trampolines. Luckily, the tramp_hash does not need to be used. As the ftrace_mutex is held, a ops with a trampoline to a record during an update of another ops that shares the record will have its func_hash pointing to it. Since a trampoline can only be set for a record if only one ops is attached to it, we can just check if the record has a trampoline (the FTRACE_FL_TRAMP flag is set) and then find the ops that has this record in its hashes. Also added some output to help debug when things go wrong. Cc: stable@vger.kernel.org # 3.16+ (apply after 3.17-rc4 is out) Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-08-22 15:24:12 -04:00
Rasmus Villemoes	faaa55241f	drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef Test for definedness of the macro which is actually defined (the change is hard to see: it is s/SSS/SSA/). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-22 11:31:30 -07:00
Daniel Borkmann	aa4a83ee8b	net: sctp: fix suboptimal edge-case on non-active active/retrans path selection In SCTP, selection of active (T.ACT) and retransmission (T.RET) transports is being done whenever transport control operations (UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport(). Commits `4c47af4d5e` ("net: sctp: rework multihoming retransmission path selection to rfc4960") and `a7288c4dd5` ("net: sctp: improve sctp_select_active_and_retran_path selection") have both improved it towards a more fine-grained and optimal path selection. Currently, the selection algorithm for T.ACT and T.RET is as follows: 1) Elect the two most recently used ACTIVE transports T1, T2 for T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used 2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set T.ACT<-T.PRI and T.RET<-T1 3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1 4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is still slightly suboptimal as it can lead to the following scenario: Setup: <A> <B> T1: p1p1 (10.0.10.10) <==> .'`) <==> p1p1 (10.0.10.12) <= T.PRI T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22) net.sctp.rto_min = 1000 net.sctp.path_max_retrans = 2 net.sctp.pf_retrans = 0 net.sctp.hb_interval = 1000 T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to link flapping). Here, the first time transmission is sent over PF path T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks are sent over the INACTIVE path T1 (T.PRI), which is not good. 
After the patch, it's choosing better transports in both cases by modifying step 4): 4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is the most recently used (if avail) in PF, set T.RET<-T.ACT_new This will still select a best possible path in PF if available (which can also include T.PRI/T.RET), and set both T.ACT/T.RET to it. In case sctp_assoc_control_transport() just put T.ACT_old into INACTIVE as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just for a very short while before going back ACTIVE, it will guarantee that this path will be reselected for T.ACT/T.RET since T3 (PF) is not available. Previously, this was not possible, as we would only select between T.PRI and T.RET, and a possible T3 would be NULL due to the fact that we have just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE and would select a suboptimal path when T.PRI/T.RET have worse properties. In the case that T.ACT_old permanently went to INACTIVE during this transition and there's no PF path available, plus T.PRI and T.RET are INACTIVE as well, we would now camp on T.ACT_old, but if everything is being INACTIVE there's really not much we can do except hoping for a successful HB to bring one of the transports back up again and, thus cause a new selection through sctp_assoc_control_transport(). Now both tests work fine: Case 1: 1. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET 2. T1 S(ACTIVE) T.ACT, T.RET T2 S(PF) 3. T1 S(ACTIVE) T.ACT, T.RET T2 S(INACTIVE) 5. T1 S(PF) T.ACT, T.RET T2 S(INACTIVE) [ 5.1 T1 S(INACTIVE) T.ACT, T.RET T2 S(INACTIVE) ] 6. T1 S(ACTIVE) T.ACT, T.RET T2 S(INACTIVE) 7. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET Case 2: 1. T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET 2. T1 S(PF) T2 S(ACTIVE) T.ACT, T.RET 3. T1 S(INACTIVE) T2 S(ACTIVE) T.ACT, T.RET 5. T1 S(INACTIVE) T2 S(PF) T.ACT, T.RET [ 5.1 T1 S(INACTIVE) T2 S(INACTIVE) T.ACT, T.RET ] 6. T1 S(INACTIVE) T2 S(ACTIVE) T.ACT, T.RET 7. 
T1 S(ACTIVE) T.ACT T2 S(ACTIVE) T.RET Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-22 11:31:30 -07:00
Daniel Borkmann	ea4f19c1f8	net: sctp: spare unnecessary comparison in sctp_trans_elect_best When both transports are the same, we don't have to go down that road only to realize that we will return the very same transport. We are guaranteed that curr is always non-NULL. Therefore, just short-circuit this special case. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-22 11:31:30 -07:00
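
The short-circuit described in the sctp_trans_elect_best entry above can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code: the names demo_state, demo_transport and demo_elect_best are hypothetical, and the three-level state ranking is an assumption made only so the example is self-contained. It relies on the guarantee stated in the commit message that curr is never NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified transport states; a higher value is preferred. */
enum demo_state { DEMO_INACTIVE = 0, DEMO_PF = 1, DEMO_ACTIVE = 2 };

/* Hypothetical stand-in for a transport; only the state matters here. */
struct demo_transport {
	enum demo_state state;
};

/*
 * Pick the better of two candidate transports.
 * curr must be non-NULL (the guarantee the commit message relies on);
 * best may be NULL when no candidate has been chosen yet.
 */
static struct demo_transport *demo_elect_best(struct demo_transport *curr,
					      struct demo_transport *best)
{
	assert(curr != NULL);

	/* Short-circuit: nothing to compare if there is no other candidate
	 * or if both pointers refer to the very same transport. */
	if (best == NULL || curr == best)
		return curr;

	/* Otherwise prefer the transport in the better state; ties keep best. */
	return curr->state > best->state ? curr : best;
}
```

With the early return in place, a call such as demo_elect_best(t, t) hands back t without touching the state comparison at all, which is the saving the commit message describes.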


468455 Commits