When the initial discovery fails the subsystem hasn't been setup yet
in nvme_mpath_stop, and we can't dereference ctrl->subsys.
Fixes: 0d0b660f ("nvme: add ANA support")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
For machines without the exrl instruction the BFP jit generates
code that uses an "br %r1" instruction located in the lowcore page.
Unfortunately there is a cut & paste error that puts an additional
"larl %r1,.+14" instruction in the code that clobbers the branch
target address in %r1. Remove the larl instruction.
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The memove, memset, memcpy, __memset16, __memset32 and __memset64
function have an additional indirect return branch in form of a
"bzr" instruction. These need to use expolines as well.
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches")
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
on the kernel command line as it cannot differentiate between SMT disabled
by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and
makes the sysfs interface unusable.
Rework this so that during bringup of the non boot CPUs the availability of
SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
'primary' thread then set the local cpu_smt_available marker and evaluate
this explicitely right after the initial SMP bringup has finished.
SMT evaulation on x86 is a trainwreck as the firmware has all the
information _before_ booting the kernel, but there is no interface to query
it.
Fixes: 73d5e2b472 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It turns out I had misunderstood how the x86_match_cpu() function works.
It evaluates a logical OR of the matching conditions, not logical AND.
This caused the CPU feature checks for AEGIS to pass even if only SSE2
(but not AES-NI) was supported (or vice versa), leading to potential
crashes if something tried to use the registered algs.
This patch switches the checks to a simpler method that is used e.g. in
the Camellia x86 code.
The patch also removes the MODULE_DEVICE_TABLE declarations which
actually seem to cause the modules to be auto-loaded at boot, which is
not desired. The crypto API on-demand module loading is sufficient.
Fixes: 1d373d4e8e ("crypto: x86 - Add optimized AEGIS implementations")
Fixes: 6ecc9d9ff9 ("crypto: x86 - Add optimized MORUS implementations")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it turns out, checking the TIF_NEED_RESCHED flag after each
iteration results in a significant performance regression (~10%)
when running fast algorithms (i.e., ones that use special instructions
and operate in the < 4 cycles per byte range) on in-order cores with
comparatively slow memory accesses such as the Cortex-A53.
Given the speed of these ciphers, and the fact that the page based
nature of the AEAD scatterwalk API guarantees that the core NEON
transform is never invoked with more than a single page's worth of
input, we can estimate the worst case duration of any resulting
scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
processing a page's worth of input at 4 cycles per byte results in
a delay of ~250 us, which is a reasonable upper bound.
So let's remove the yield checks from the fused AES-CCM and AES-GCM
routines entirely.
This reverts commit 7b67ae4d5c and
partially reverts commit 7c50136a8a.
Fixes: 7c50136a8a ("crypto: arm64/aes-ghash - yield NEON after every ...")
Fixes: 7b67ae4d5c ("crypto: arm64/aes-ccm - yield NEON after every ...")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Passing an array (swapper_pg_dir) as the argument to write_c0_kpgd() in
setup_pw() will become problematic if we modify __write_64bit_c0_split()
to cast its val argument to unsigned long long, because for 32-bit
kernel builds the size of a pointer will differ from the size of an
unsigned long long. This would fall foul of gcc's pointer-to-int-cast
diagnostic.
Cast the value to a long, which should be the same width as the pointer
that we ultimately want & will be sign extended if required to the
unsigned long long that __write_64bit_c0_split() ultimately needs.
Signed-off-by: Paul Burton <paul.burton@mips.com>
This is a single fix affecting X86 ACPI, and as such pretty important.
It is going to stable as well and have all the high-notch x86 platform
developers agreeing on it.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbaHj7AAoJEEEQszewGV1z8tUQAJ3XHPqbXRfrvOrFP0MOw2qj
5UOfNBgNyjpXMb3SLex+zOe80HVh2OjR2HBwZQqd4LVZcCu7ZasewrW17XZFwsN4
IerMIEBNtU8kA6HBvreU+U4owgV9FGEpZz5/Mp0TkjoIpm2XCvrYDhZ3mr+lLx+h
+lsMtgTlH5pEb/e/Hck0s9PO9mjzDq0sHlgaGEYKqf2fxUPWIeIw0uJw0fQrKFau
muUd044zD5ACIGMYnLpsjoV1WBlDmEzAUHJNeEGcQ7zeXS1LOwGJjc+XfQ3xmoZq
WqianLPoLbdP02cNgOJZK2VthQuqSiCg7VBEmYe5T2bLwChRxwQmxYkJH98AXlt/
367bI4wYcnIwxZJz/dMvb/M5ietY7qGAI6+S+TrWUt+BY1UGy2eLMLKgLLBug/VJ
T3S9qVFgYW+MP8Z9dvY0J8ABXKA+mhUd4wWstkq02s2JM7xwTIMd5cx8drh6SBqS
FE2dGmaqLK6KJxOk5vWxTiE9plDPcKllzPa5y6asZ8kut+aDBEK8Dbtmbaubjq5N
/RJkdgITcjKwIxFcPJx4PBpwStW1YR++rYiHHprxFd4FH2wnlCN7jzD+RME+j+YO
MU8U8jeO/8z5cUThawmtBKNZR3lKcp+okDpF9Oozzii6whsVFF+41Y/mWPo7c+ce
bNU+BjEH0qi26WLCt+Vm
=36C6
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"This is a single fix affecting X86 ACPI, and as such pretty important.
It is going to stable as well and have all the high-notch x86 platform
developers agreeing on it"
* tag 'gpio-v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpiolib-acpi: make sure we trigger edge events at least once on boot
The MIPS VDSO code filters out a subset of known-good flags from
KBUILD_CFLAGS to use when building VDSO libraries. When we build using
clang we need to allow the --target flag through, otherwise we'll
generally attempt to build the VDSO for the architecture of the build
machine rather than for MIPS.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20154/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Our genvdso tool performs some rather paranoid checking that the VDSO
library isn't attempting to make use of a GOT by constraining the number
of entries that the GOT is allowed to contain to the minimum 2 entries
that are always generated by binutils.
Unfortunately lld prior to revision 334390 generates a third entry,
which is unused & thus harmless but falls foul of genvdso's checks &
causes the build to fail.
Since we already check that the VDSO contains no relocations it seems
reasonable to presume that it also doesn't contain use of a GOT, which
would involve relocations. Thus rather than attempting to work around
this issue by allowing 3 GOT entries when using lld, simply remove the
GOT checks which seem overly paranoid.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20152/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Use unsigned int, because it's preferred to unsigned.
Signed-off-by: Kitone Elvis Peter <elviskitone@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
TPACKET_V3 stores variable length frames in fixed length blocks.
Blocks must be able to store a block header, optional private space
and at least one minimum sized frame.
Frames, even for a zero snaplen packet, store metadata headers and
optional reserved space.
In the block size bounds check, ensure that the frame of the
chosen configuration fits. This includes sockaddr_ll and optional
tp_reserve.
Syzbot was able to construct a ring with insuffient room for the
sockaddr_ll in the header of a zero-length frame, triggering an
out-of-bounds write in dev_parse_header.
Convert the comparison to less than, as zero is a valid snap len.
This matches the test for minimum tp_frame_size immediately below.
Fixes: f6fb8f100b ("af-packet: TPACKET_V3 flexible buffer implementation.")
Fixes: eb73190f4f ("net/packet: refine check for priv area size")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Variable cfg_val is being assigned but is never used hence it is
redundant and can be removed.
Cleans up clang warning:
warning: variable 'cfg_val' set but not used [-Wunused-but-set-variable]
[ Background info about val_bit field from alsa-devel ML thread:
tiwai: Actually this made me wonder what is the definition of val_bit.
It seems always 1 in the current code after the commit
964ca8083c. Pierre?
pbossart: This val_bit is only there for debug/test, it should be set
to one by default and has nothing to do with the lpcm_id.
This variable was set even in patches before upstream
submission and was never needed, I guess it must be a 9-yr
old issue. Good catch!
]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The kernel image is mapped into two places in the virtual address space
(addresses without KASLR, of course):
1. The kernel direct map (0xffff880000000000)
2. The "high kernel map" (0xffffffff81000000)
We actually execute out of #2. If we get the address of a kernel symbol,
it points to #2, but almost all physical-to-virtual translations point to
Parts of the "high kernel map" alias are mapped in the userspace page
tables with the Global bit for performance reasons. The parts that we map
to userspace do not (er, should not) have secrets. When PTI is enabled then
the global bit is usually not set in the high mapping and just used to
compensate for poor performance on systems which lack PCID.
This is fine, except that some areas in the kernel image that are adjacent
to the non-secret-containing areas are unused holes. We free these holes
back into the normal page allocator and reuse them as normal kernel memory.
The memory will, of course, get *used* via the normal map, but the alias
mapping is kept.
This otherwise unused alias mapping of the holes will, by default keep the
Global bit, be mapped out to userspace, and be vulnerable to Meltdown.
Remove the alias mapping of these pages entirely. This is likely to
fracture the 2M page mapping the kernel image near these areas, but this
should affect a minority of the area.
The pageattr code changes *all* aliases mapping the physical pages that it
operates on (by default). We only want to modify a single alias, so we
need to tweak its behavior.
This unmapping behavior is currently dependent on PTI being in place.
Going forward, we should at least consider doing this for all
configurations. Having an extra read-write alias for memory is not exactly
ideal for debugging things like random memory corruption and this does
undercut features like DEBUG_PAGEALLOC or future work like eXclusive Page
Frame Ownership (XPFO).
Before this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_kernel-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82e00000 2M RW NX pte
current_kernel-0xffffffff82e00000-0xffffffff83200000 4M RW PSE NX pmd
current_kernel-0xffffffff83200000-0xffffffffa0000000 462M pmd
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_user-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
After this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K pte
current_kernel-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd
current_kernel-0xffffffff82400000-0xffffffff82488000 544K ro NX pte
current_kernel-0xffffffff82488000-0xffffffff82600000 1504K pte
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82c0d000 52K RW NX pte
current_kernel-0xffffffff82c0d000-0xffffffff82dc0000 1740K pte
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K pte
current_user-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd
current_user-0xffffffff82400000-0xffffffff82488000 544K ro NX pte
current_user-0xffffffff82488000-0xffffffff82600000 1504K pte
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
[ tglx: Do not unmap on 32bit as there is only one mapping ]
Fixes: 0f561fce4d ("x86/pti: Enable global pages for shared areas")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20180802225831.5F6A2BFC@viggo.jf.intel.com
As there is precious little left in any DTS files referring to the
node "/chosen@0" as opposed to "/chosen", remove the two checks for
the former node name.
[paul.burton@mips.com:
The modified yamon-dt code only operates on
arch/mips/boot/dts/mti/sead3.dts right now, and that uses chosen
rather than chosen@0 anyway, so this should have no behavioural
effect.]
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20131/
Cc: linux-mips@linux-mips.org
ACP->SYSMEM DMA happens at every I2S->SYSMEM period
completion. Thus, there is delay of x frames till
I2S->SYSMEM reaches a period length. This delay is
communicated to user space.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Give position on ACP->SYSMEM DMA channel for
the number of bytes that have been transferred on
the basis of current descriptor under service.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In capture case we don't want ACP to SYSMEM dma
to be circular. This is because if an in place DSP
filter is applied to captured output then circular DMA
can overwrite the filter value with stale data.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Since mountpoint crossing can happen without leaving lazy mode,
root dentries do need the same protection against having their
memory freed without RCU delay as everything else in the tree.
It's partially hidden by RCU delay between detaching from the
mount tree and dropping the vfsmount reference, but the starting
point of pathwalk can be on an already detached mount, in which
case umount-caused RCU delay has already passed by the time the
lazy pathwalk grabs rcu_read_lock(). If the starting point
happens to be at the root of that vfsmount *and* that vfsmount
covers the entire filesystem, we get trouble.
Fixes: 48a066e72d ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add missing break statement in order to prevent the code from falling
through to the default case.
Addresses-Coverity-ID: 115050 ("Missing break in switch")
Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
This patch removes unused header files from common.h.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch removes unused header files from the driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch removes unused header files from the driver.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This patch converts common helper functions in to proper module
and also fixes below warning.
WARNING: sound/soc/qcom/snd-soc-sdm845: 'qcom_snd_parse_of' exported twice.
Previous export was in sound/soc/qcom/snd-soc-apq8096.ko
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value.
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
When btrfs hits error after modifying fs_devices in
btrfs_init_new_device() (such as btrfs_add_dev_item() returns error), it
leaves everything as is, but frees allocated btrfs_device. As a result,
fs_devices->devices and fs_devices->alloc_list contain already freed
btrfs_device, leading to later use-after-free bug.
Error path also messes the things like ->num_devices. While they go back
to the original value by unscanning btrfs devices, it is safe to revert
them here.
Fixes: 79787eaab4 ("btrfs: replace many BUG_ONs with proper error handling")
Signed-off-by: Naohiro Aota <naota@elisp.net>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's entirely possible that a crafted btrfs image contains overlapping
chunks.
Although we can't detect such problem by tree-checker, it's not a
catastrophic problem, current extent map can already detect such problem
and return -EEXIST.
We just only need to exit gracefully and fail the mount.
Reported-by: Xu Wen <wen.xu@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200409
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch will introduce chunk <-> dev extent mapping check, to protect
us against invalid dev extents or chunks.
Since chunk mapping is the fundamental infrastructure of btrfs, extra
check at mount time could prevent a lot of unexpected behavior (BUG_ON).
Reported-by: Xu Wen <wen.xu@gatech.edu>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200403
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200407
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If a crafted image has missing block group items, it could cause
unexpected behavior and breaks the assumption of 1:1 chunk<->block group
mapping.
Although we have the block group -> chunk mapping check, we still need
chunk -> block group mapping check.
This patch will do extra check to ensure each chunk has its
corresponding block group.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A crafted btrfs image with incorrect chunk<->block group mapping will
trigger a lot of unexpected things as the mapping is essential.
Although the problem can be caught by block group item checker
added in "btrfs: tree-checker: Verify block_group_item", it's still not
sufficient. A sufficiently valid block group item can pass the check
added by the mentioned patch but could fail to match the existing chunk.
This patch will add extra block group -> chunk mapping check, to ensure
we have a completely matching (start, len, flags) chunk for each block
group at mount time.
Here we reuse the original helper find_first_block_group(), which is
already doing the basic bg -> chunk checks, adding further checks of the
start/len and type flags.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199837
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When doing an incremental send, if we have a file in the parent snapshot
that has prealloc extents beyond EOF and in the send snapshot it got a
hole punch that partially covers the prealloc extents, the send stream,
when replayed by a receiver, can result in a file that has a size bigger
than it should and filled with zeroes past the correct EOF.
For example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "falloc -k 0 4M" /mnt/foobar
$ xfs_io -c "pwrite -S 0xea 0 1M" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send -f /tmp/1.send /mnt/snap1
$ xfs_io -c "fpunch 1M 2M" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs send -f /tmp/2.send -p /mnt/snap1 /mnt/snap2
$ stat --format %s /mnt/snap2/foobar
1048576
$ md5sum /mnt/snap2/foobar
d31659e82e87798acd4669a1e0a19d4f /mnt/snap2/foobar
$ umount /mnt
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ btrfs receive -f /mnt/1.snap /mnt
$ btrfs receive -f /mnt/2.snap /mnt
$ stat --format %s /mnt/snap2/foobar
3145728
# --> should be 1Mb and not 3Mb (which was the end offset of hole
# punch operation)
$ md5sum /mnt/snap2/foobar
117baf295297c2a995f92da725b0b651 /mnt/snap2/foobar
# --> should be d31659e82e87798acd4669a1e0a19d4f as in the original fs
This issue actually happens only since commit ffa7c4296e ("Btrfs: send,
do not issue unnecessary truncate operations"), but before that commit we
were issuing a write operation full of zeroes (to "punch" a hole) which
was extending the file size beyond the correct value and then immediately
issue a truncate operation to the correct size and undoing the previous
write operation. Since the send protocol does not support fallocate, for
extent preallocation and hole punching, fix this by not even attempting
to send a "hole" (regular write full of zeroes) if it starts at an offset
greater then or equals to the file's size. This approach, besides being
much more simple then making send issue the truncate operation, adds the
benefit of avoiding the useless pair of write of zeroes and truncate
operations, saving time and IO at the receiver and reducing the size of
the send stream.
A test case for fstests follows soon.
Fixes: ffa7c4296e ("Btrfs: send, do not issue unnecessary truncate operations")
CC: stable@vger.kernel.org # 4.17+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Don't open-code iget_failed(), don't bother with btrfs_free_path(NULL),
move handling of positive return values of btrfs_lookup_inode() from
btrfs_read_locked_inode() to btrfs_iget() and kill now obviously
pointless ASSERT() in there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We don't need to check is_bad_inode() after the call of
btrfs_read_locked_inode() - it's exactly the same as checking return
value for being non-zero.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>