Pull NVMe fix from Christoph:
"Below is a one-liner fix from Max that unbreaks T10-DIF support, which
got broken in 4.15."
* 'nvme-4.17' of git://git.infradead.org/nvme:
nvme: fix extended data LBA supported setting
This value depands on the metadata support value, so reorder the
initialization to fit.
Fixes: b5be3b392 ("nvme: always unregister the integrity profile in __nvme_revalidate_disk")
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
The URL is broken. This patch fixes it.
Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
[wsa: shortened the URL a bit]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
When a GID entry is invalid EAGAIN is returned. This is an incorrect error
code, there is nothing that will make this GID entry valid again in
bounded time.
Some user space tools fail incorrectly if EAGAIN is returned here, and
this represents a small ABI change from earlier kernels.
The first patch in the Fixes list makes entries that were valid before
to become invalid, allowing this code to trigger, while the second patch
in the Fixes list introduced the wrong EAGAIN.
Therefore revert the return result to EINVAL which matches the historical
expectations of the ibv_query_gid_type() API of the libibverbs user space
library.
Cc: <stable@vger.kernel.org>
Fixes: 598ff6bae6 ("IB/core: Refactor GID modify code for RoCE")
Fixes: 03db3a2d81 ("IB/core: Add RoCE GID table management")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It seems the first error assignment in if branch is redundant.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Although it's not that complex, but such comment could still save
several minutes for newer reader/reviewer instead of inferring that from
the code.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ minor wording updates ]
Signed-off-by: David Sterba <dsterba@suse.com>
Since compression.h is using the SZ_* macros, and if some file includes
only compression.h without linux/sizes.h, it will cause compile error.
One example is lzo.c, if it uses BTRFS_MAX_COMPRESSED. Fix it by adding
linux/sizes.h in compression.h
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In btrfs_clone_files(), we must check the NODATASUM flag while the
inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
will change the flags after we check and we can end up with a party
checksummed file.
The race window is only a few instructions in size, between the if and
the locks which is:
3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
3835 return -EISDIR;
where the setflags must be run and toggle the NODATASUM flag (provided
the file size is 0). The clone will block on the inode lock, segflags
takes the inode lock, changes flags, releases log and clone continues.
Not impossible but still needs a lot of bad luck to hit unintentionally.
Fixes: 0e7b824c4e ("Btrfs: don't make a file partly checksummed through file clone")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Function btrfs_exclude_logged_extents may call __exclude_logged_extent
which may fail.
Propagate the failures of __exclude_logged_extent to upper caller.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of setting "parent" to ref->parent only when dealing with
a shared ref and subsequently performing another check to see
if (parent > 0), check the "node->type" directly and act accordingly.
This makes the code more streamline. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of taking only specific member of this structure, which results
in 2 extra arguments, just take the delayed_extent_op struct and
reference the arguments inside the functions. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function currently takes 7 parameters, most of which are proxies
for values from btrfs_delayed_ref_node struct which is not passed. This
patch simplifies the interface of the function by simply passing said
delayed ref node struct to the function. This enables us to:
1. Move locals variables and init code related to them from
run_delayed_tree_ref which should only be used inside
alloc_reserved_tree_block, such as skinny_metadata and the btrfs_key,
representing the extent being inserted. This removes the need for the
"ins" argument. Instead, it's replaced by a local var with a more
verbose name - extent_key.
2. Now that we have a reference to the node in alloc_reserved_tree_block
the delayed_tree_ref struct can be referenced inside the function and
this enable removing the "ref->level", "parent" and "ref_root"
arguments.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function already takes a transaction handle which contains a
reference to the fs_info. So use this and remove the extra argument.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The test failures are not clearly visible in the system log as they're
printed at INFO level. Add a new helper that is level ERROR. As this
touches almost all strings, I took the opportunity to unify them:
- decapitalize the first letter as there's a prefix and the text
continues after ":"
- glue strings split to more lines and un-indent so they fit to 80
columns
- use %llu instead of %Lu
- drop \n from the modified messages (test_msg is left untouched)
Signed-off-by: David Sterba <dsterba@suse.com>
We should remove O_NONBLOCK flag when calling sock->ops->connect()
in sctp_connect_to_sock() function.
Why?
1. up to now, sctp socket connect() function ignores the flag argument,
that means O_NONBLOCK flag does not take effect, then we should remove
it to avoid the confusion (but is not urgent).
2. for the future, there will be a patch to fix this problem, then the flag
argument will take effect, the patch has been queued at https://git.kernel.o
rg/pub/scm/linux/kernel/git/davem/net.git/commit/net/sctp?id=644fbdeacf1d3ed
d366e44b8ba214de9d1dd66a9.
But, the O_NONBLOCK flag will make sock->ops->connect() directly return
without any wait time, then the connection will not be established, DLM kernel
module will call sock->ops->connect() again and again, the bad results are,
CPU usage is almost 100%, even trigger soft_lockup problem if the related
configurations are enabled,
DLM kernel module also prints lots of messages like,
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
The upper application (e.g. ocfs2 mount command) is hanged at new_lockspace(),
the whole backtrace is as below,
tb0307-nd2:~ # cat /proc/2935/stack
[<0>] new_lockspace+0x957/0xac0 [dlm]
[<0>] dlm_new_lockspace+0xae/0x140 [dlm]
[<0>] user_cluster_connect+0xc3/0x3a0 [ocfs2_stack_user]
[<0>] ocfs2_cluster_connect+0x144/0x220 [ocfs2_stackglue]
[<0>] ocfs2_dlm_init+0x215/0x440 [ocfs2]
[<0>] ocfs2_fill_super+0xcb0/0x1290 [ocfs2]
[<0>] mount_bdev+0x173/0x1b0
[<0>] mount_fs+0x35/0x150
[<0>] vfs_kern_mount.part.23+0x54/0x100
[<0>] do_mount+0x59a/0xc40
[<0>] SyS_mount+0x80/0xd0
[<0>] do_syscall_64+0x76/0x140
[<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[<0>] 0xffffffffffffffff
So, I think we should remove O_NONBLOCK flag here, since DLM kernel module can
not handle non-block sockect in connect() properly.
Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Geert Uytterhoeven <geert@linux-m68k.org> reported:
> HOSTLD scripts/mod/modpost
> CC arch/sh/kernel/traps_32.o
> arch/sh/kernel/traps_32.c: In function 'do_divide_error':
> arch/sh/kernel/traps_32.c:606:17: error: 'code' may be used uninitialized in this function [-Werror=uninitialized]
> cc1: all warnings being treated as errors
It is clear from inspection that do_divide_error is only called with
TRAP_DIVZERO_ERROR or TRAP_DIVOVF_ERROR, as that is the way
set_exception_table_vec is called. So let gcc know the other cases
should not be considered by returning in all other cases.
This removes the warning and let's the code continue to build.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: c65626c0cd ("signal/sh: Use force_sig_fault where appropriate")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The information about a size change in this case just creates confusion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Only used in block_dev.c and the partitions code, and it should remain
that way..
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
After the recent timeout handling changes, we have two holes in
the struct. Move the timeout near the deadline, killing both,
and moving related members closer together. On my config on
x86-64, this shrinks struct request from 312 to 304 bytes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
libiscsi is the only SCSI code that return BLK_EH_HANDLED, thus trying to
bypass the normal SCSI EH code. We are going to remove this return value
at the block layer, and at least from a quick look it doesn't look too
harmful to try to send an abort for these cases, especially as the first
one should not actually be possible. If this doesn't work out iscsi
will probably need its own eh_strategy_handler instead to just do the
right thing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
[While this keeps existing behavior it seems to mismatch the comment,
maintainers please chime in!]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
NVMe always completes the request before returning from ->timeout, either
by polling for it, or by disabling the controller. Return BLK_EH_DONE so
that the block layer doesn't even try to complete it again.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command. Fix the symbolic name to reflect that a little better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch simplifies the timeout handling by relying on the request
reference counting to ensure the iterator is operating on an inflight
and truly timed out request. Since the reference counting prevents the
tag from being reallocated, the block layer no longer needs to prevent
drivers from completing their requests while the timeout handler is
operating on it: a driver completing a request is allowed to proceed to
the next state without additional syncronization with the block layer.
This also removes any need for generation sequence numbers since the
request lifetime is prevented from being reallocated as a new sequence
while timeout handling is operating on it.
To enables this a refcount is added to struct request so that request
users can be sure they're operating on the same request without it
changing while they're processing it. The request's tag won't be
released for reuse until both the timeout handler and the completion
are done with it.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: slight cleanups, added back submission side hctx lock, use cmpxchg
for completions]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Check for 0xE00 (RECOVERABLE_ERR) along with ARMFW UE (0x0)
in be_detect_error() to know whether the error is valid error or not
Fixes: 673c96e5a ("be2net: Fix UE detection logic for BE3")
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for Netgear Aircard 779S
Signed-off-by: Josh Hill <josh@joshuajhill.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
The block layer had been setting the state to in-flight prior to updating
the timer. This is the wrong order since the timeout handler could observe
the in-flight state with the older timeout, believing the request had
expired when in fact it is just getting started.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
As far as I can tell this function can't even be called any more, given
that ATA implements its own eh_strategy_handler with ata_scsi_error, which
never calls ->eh_timed_out.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch to the generic noncoherent direct mapping implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
This matches the implementation of the more commonly used unmap_single
routines and the sync_sg_for_cpu method which should provide equivalent
cache maintainance.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
Make sure all other DMA methods call nds32_dma_sync_single_for_{device,cpu}
to perform cache maintaince, and remove the consisteny_sync helper that
implemented both with entirely separate code based off an argument.
Also make sure these helpers handled highmem properly, for which code
is copy and pasted from mips.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greentime Hu <greentime@andestech.com>
Tested-by: Greentime Hu <greentime@andestech.com>
memcmp() returns int, but eprom_try_esi() cast it to unsigned char. One
can lose significant bits and get 0 from non-0 value returned by the
memcmp().
Signed-off-by: Ivan Bornyakov <brnkv.i1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This parameter has been around since commit e162b39a36 ("softlockup:
decouple hung tasks check from softlockup detection") in 2009 but was
never documented.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The are terms that seem obvious to the mm developers, but may be somewhat
obscure for, say, less involved readers.
The concepts overview can be seen as an "extended glossary" that introduces
such terms to the readers of the kernel documentation.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
After the userspace interface description for KSM and THP was split to
Documentation/admin-guide/mm, the remaining parts belong to the section
describing MM internals.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Now that we have kerneldoc comments for
memalloc_no{fs,io}_{save_restore}(), go ahead and pull them into the docs.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Although the api is documented in the source code Ted has pointed out
that there is no mention in the core-api Documentation and there are
people looking there to find answers how to use a specific API.
Requested-by: "Theodore Y. Ts'o" <tytso@mit.edu>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
bugs that deal with triggers and instances.
The first is a generic trigger bug where the triggers are not removed
from a link list properly when deleting an instance.
The second is specific to snapshots, where the snapshot is does the
snapshot to the top level buffer, when it is suppose to snapshot the
buffer associated to the instance the snapshot trigger exists in.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCWw0+4hQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qpwyAQC56/yYzfpJnpjwcI2E7j8FihLg0Nlr
bq85CcQGRm07dwD+L90disWyPxpxH/fGO4OCET1LeoaO1I/fBfECR2XXjQY=
=w4al
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.17-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"While writing selftests for a new feature, I triggered two existing
bugs that deal with triggers and instances.
- a generic trigger bug where the triggers are not removed from a
linked list properly when deleting an instance.
- a bug specific to snapshots, where the snapshot is done in the top
level buffer, when it is supposed to snapshot the buffer associated
to the instance the snapshot trigger exists in"
* tag 'trace-v4.17-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Make the snapshot trigger work with instances
tracing: Fix crash when freeing instances with event triggers
nospec quite reasonably asserts that it will never be used with an index
larger than unsigned long (that being the largest possibly index into an
C array). However, our ubi uses the convention of u64 for any large
integer, running afoul of the assertion on 32b. Reduce our index to an
unsigned long, checking for type overflow first.
drivers/gpu/drm/i915/i915_query.c: In function 'i915_query_ioctl':
include/linux/compiler.h:339:38: error: call to '__compiletime_assert_119' declared with attribute error: BUILD_BUG_ON failed: sizeof(_s) > sizeof(long)
Reported-by: kbuild-all@01.org
Fixes: 84b510e22d ("drm/i915/query: Protect tainted function pointer lookup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180522121018.15199-1-chris@chris-wilson.co.uk
(cherry picked from commit a33b1dc8a7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When there are 16 or more logical CPUs, we request for
`IWL_MAX_RX_HW_QUEUES` (16) IRQs only as we limit to that number of
IRQs, but later on we compare the number of IRQs returned to
nr_online_cpus+2 instead of max_irqs, the latter being what we
actually asked for. This ends up setting num_rx_queues to 17 which
causes lots of out-of-bounds array accesses later on.
Compare to max_irqs instead, and also add an assertion in case
num_rx_queues > IWM_MAX_RX_HW_QUEUES.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199551
Fixes: 2e5d4a8f61 ("iwlwifi: pcie: Add new configuration to enable MSIX")
Signed-off-by: Hao Wei Tee <angelsl@in04.sg>
Tested-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>