Fix a regression from the 4.15 cycle that caused the system suspend
and resume overhead to increase on many systems and triggered more
serious problems on some of them (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbBoz0AAoJEILEb/54YlRxO1MP/2y26kAQLcKzaLiLVbTJW60V
p+1WEEWYG4S0gIEd5fgy5PzLoX50s6+JQ+JGxV5D/hMofyDSgeK1Ldi/jPT2+3rT
ag3XW1xYdnGz8SgCNhlDOonxZwRZiFL3SvlRRvP6jCLDFtRSpqCQpzVFD48DHgF/
y/MuJW95yRwzhvRWIlRViowHLyuGR81ymCNOZ1d9rNrv7KnxFQZdSR76hbMI9bD8
Ufm/bqc7T+TxIgft0L7584uqgfo0iptZWH1l8RpvDYBPIrR40NWibb5QEjJ+Jn9a
7nvB6Uw2ICxNTVUuEHRJ08MmhBdchykJnFEru+MH7Mse53MXB2u872kFaz9NbfXv
xBvCbQJN9eFkN8QfG1+UnX2UIVn/Fl7LxiXPwBOfWz2MxWOqFm5FCKH6ZKitnXnE
lLbAp7bSxWmHmjURALGv31vlG5rm8GyM8cKJInq9Ms34Y5OeU+/t1MQMS5X50His
eihn67CBUZmAf2Bo6KENOjuH2fHkCYe4gu2ewnThwflZhVo4cCZJjk07lC56FZai
TC9Zu1W0mr0CAauyRnT2+GAyGQCoHxBnSMpJ6ZhlGAdUPwerRDsUmpqJYl8iGOIO
WH3apiV0UkvEJy0Wp6P9/YakJ4PRjK32SwJwAGK1hA0vqRbHpPyYCoZKA16z8zOC
Rdle9Gq3QIQuRE7RWuVx
=BS+S
-----END PGP SIGNATURE-----
Merge tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a regression from the 4.15 cycle that caused the system suspend
and resume overhead to increase on many systems and triggered more
serious problems on some of them (Rafael Wysocki)"
* tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / core: Fix direct_complete handling for devices with no callbacks
While reviewing the verifier code, I recently noticed that the
following two program variants in relation to tail calls can be
loaded.
Variant 1:
# bpftool p d x i 15
0: (15) if r1 == 0x0 goto pc+3
1: (18) r2 = map[id:5]
3: (05) goto pc+2
4: (18) r2 = map[id:6]
6: (b7) r3 = 7
7: (35) if r3 >= 0xa0 goto pc+2
8: (54) (u32) r3 &= (u32) 255
9: (85) call bpf_tail_call#12
10: (b7) r0 = 1
11: (95) exit
# bpftool m s i 5
5: prog_array flags 0x0
key 4B value 4B max_entries 4 memlock 4096B
# bpftool m s i 6
6: prog_array flags 0x0
key 4B value 4B max_entries 160 memlock 4096B
Variant 2:
# bpftool p d x i 20
0: (15) if r1 == 0x0 goto pc+3
1: (18) r2 = map[id:8]
3: (05) goto pc+2
4: (18) r2 = map[id:7]
6: (b7) r3 = 7
7: (35) if r3 >= 0x4 goto pc+2
8: (54) (u32) r3 &= (u32) 3
9: (85) call bpf_tail_call#12
10: (b7) r0 = 1
11: (95) exit
# bpftool m s i 8
8: prog_array flags 0x0
key 4B value 4B max_entries 160 memlock 4096B
# bpftool m s i 7
7: prog_array flags 0x0
key 4B value 4B max_entries 4 memlock 4096B
In both cases the index masking inserted by the verifier in order
to control out of bounds speculation from a CPU via b2157399cc
("bpf: prevent out-of-bounds speculation") seems to be incorrect
in what it is enforcing. In the 1st variant, the mask is applied
from the map with the significantly larger number of entries where
we would allow to a certain degree out of bounds speculation for
the smaller map, and in the 2nd variant where the mask is applied
from the map with the smaller number of entries, we get buggy
behavior since we truncate the index of the larger map.
The original intent from commit b2157399cc is to reject such
occasions where two or more different tail call maps are used
in the same tail call helper invocation. However, the check on
the BPF_MAP_PTR_POISON is never hit since we never poisoned the
saved pointer in the first place! We do this explicitly for map
lookups but in case of tail calls we basically used the tail
call map in insn_aux_data that was processed in the most recent
path which the verifier walked. Thus any prior path that stored
a pointer in insn_aux_data at the helper location was always
overridden.
Fix it by moving the map pointer poison logic into a small helper
that covers both BPF helpers with the same logic. After that in
fixup_bpf_calls() the poison check is then hit for tail calls
and the program rejected. Latter only happens in unprivileged
case since this is the *only* occasion where a rewrite needs to
happen, and where such rewrite is specific to the map (max_entries,
index_mask). In the privileged case the rewrite is generic for
the insn->imm / insn->code update so multiple maps from different
paths can be handled just fine since all the remaining logic
happens in the instruction processing itself. This is similar
to the case of map lookups: in case there is a collision of
maps in fixup_bpf_calls() we must skip the inlined rewrite since
this will turn the generic instruction sequence into a non-
generic one. Thus the patch_call_imm will simply update the
insn->imm location where the bpf_map_lookup_elem() will later
take care of the dispatch. Given we need this 'poison' state
as a check, the information of whether a map is an unpriv_array
gets lost, so enforcing it prior to that needs an additional
state. In general this check is needed since there are some
complex and tail call intensive BPF programs out there where
LLVM tends to generate such code occasionally. We therefore
convert the map_ptr rather into map_state to store all this
w/o extra memory overhead, and the bit whether one of the maps
involved in the collision was from an unpriv_array thus needs
to be retained as well there.
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Since 4.10, commit 8003c9ae20 (KVM: LAPIC: add APIC Timer
periodic/oneshot mode VMX preemption timer support), guests using
periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
drift significantly over time.
Differences in the underlying clocks and numerical errors means the
periods of the two timers (hv and sw) are not the same. This
difference will accumulate with every expiry resulting in a large
error between the hv and sw timer.
This means the sw timer may be running slow when compared to the hv
timer. When the timer is switched from hv to sw, the now active sw
timer will expire late. The guest VCPU is reentered and it switches to
using the hv timer. This timer catches up, injecting multiple IRQs
into the guest (of which the guest only sees one as it does not get to
run until the hv timer has caught up) and thus the guest's timer rate
is low (and becomes increasing slower over time as the sw timer lags
further and further behind).
I believe a similar problem would occur if the hv timer is the slower
one, but I have not observed this.
Fix this by synchronizing the deadlines for both timers to the same
time source on every tick. This prevents the errors from accumulating.
Fixes: 8003c9ae20
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
- Close a hole which could possibly lead to the host timebase getting
out of sync.
- Three fixes relating to PTEs and TLB entries for radix guests.
- Fix a bug which could lead to an interrupt never getting delivered
to the guest, if it is pending for a guest vCPU when the vCPU gets
offlined.
-----BEGIN PGP SIGNATURE-----
iQFGBAABCgAwFiEEv0VLfXa2m9eKuaRpnZrqdyxjcZ8FAlsGTWMSHHBhdWx1c0Bv
emxhYnMub3JnAAoJEJ2a6ncsY3GfPKQH/3dopz+qjpZqvhgvqfC0wkLlGLcTxmKK
+y77M5YStFEeytYB52hyrAs4KptM1If5+BfShX4tTzGY5MGS4RMvzY7tLNzLlmFg
S/ghzlFCh4dIz+LTk58FIyFmyn7GrvJRP33FoiAPCCp1AkRL7MlSD5cu3N6fHo6P
GU5lHLLyaGEIkC4KxLQdr4smV3tKNk1k6iz4eMHwDOeLoxcLnz0LbiM7xBr/Txmu
miF68B29hU/peKM/GbtSAh5TpWY6WlcPTBUEiHXghcuYmXqgW43fjGleuL330mN4
HtSONLuapa6VNSJy3UuGBlI1puIEbUrtTPfy0UxKQG3Em7L8UnxO2wk=
=7/7b
-----END PGP SIGNATURE-----
Merge tag 'kvm-ppc-fixes-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Fixes for PPC KVM:
- Close a hole which could possibly lead to the host timebase getting
out of sync.
- Three fixes relating to PTEs and TLB entries for radix guests.
- Fix a bug which could lead to an interrupt never getting delivered
to the guest, if it is pending for a guest vCPU when the vCPU gets
offlined.
This one should be using the default LPM policy for mobile chipsets so
add the PCI ID to the driver list of supported revices.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Commit 15122ee2c5 ("arm64: Enforce BBM for huge IO/VMAP mappings")
disallowed block mappings for ioremap since that code does not honor
break-before-make. The same APIs are also used for permission updating
though and the extra checks prevent the permission updates from happening,
even though this should be permitted. This results in read-only permissions
not being fully applied. Visibly, this can occasionaly be seen as a failure
on the built in rodata test when the test data ends up in a section or
as an odd RW gap on the page table dump. Fix this by using
pgattr_change_is_safe instead of p*d_present for determining if the
change is permitted.
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Reported-by: Peter Robinson <pbrobinson@gmail.com>
Fixes: 15122ee2c5 ("arm64: Enforce BBM for huge IO/VMAP mappings")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).
To reproduce the issue: mount a filesystem with -o compress-force=zstd
and the following program will encounter return value of 1 from
ftruncate:
int main(void) {
char buf[256] = { 0 };
int ret;
int fd;
fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
if (fd == -1) {
perror("open");
return EXIT_FAILURE;
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
perror("write");
close(fd);
return EXIT_FAILURE;
}
if (fsync(fd) == -1) {
perror("fsync");
close(fd);
return EXIT_FAILURE;
}
ret = ftruncate(fd, 128);
if (ret) {
printf("ftruncate() returned %d\n", ret);
close(fd);
return EXIT_FAILURE;
}
close(fd);
return EXIT_SUCCESS;
}
Fixes: ddfae63cc8 ("btrfs: move btrfs_truncate_block out of trans handle")
CC: stable@vger.kernel.org # 4.15+
Reported-by: Jun Wu <quark@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When posted work request, it need to compute the length of
all sges of every wr and fill it into the msg_len field of
send wqe. Thus, While posting multiple wr,
tmp_len should be reinitialized to zero.
Fixes: 8b9b8d143b ("RDMA/hns: Fix the endian problem for hns")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When use cq record db for kernel, it needs to set the hr_cq->db_en
to 1 and configure the dma address of record cq db of qp context.
Fixes: 86188a8810 ("RDMA/hns: Support cq record doorbell for kernel space")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The err pointer comes from uverbs_attr_get, not from the uobject member,
which does not store an ERR_PTR.
Fixes: be934cca9e ("IB/uverbs: Add device memory registration ioctl support")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Each user_context receives a separate dpi value and thus a different
address on the doorbell bar. The qedr_mmap function needs to validate
the address and map the doorbell bar accordingly.
The current implementation always checked against dpi=0 doorbell range
leading to a wrong mapping for doorbell bar. (It entered an else case
that mapped the address differently). qedr_mmap should only be used
for doorbells, so the else was actually wrong in the first place.
This only has an affect on arm architecture and not an issue on a
x86 based architecture.
This lead to doorbells not occurring on arm based systems and left
applications that use more than one dpi (or several applications
run simultaneously ) to hang.
Fixes: ac1b36e55a ("qedr: Add support for user context verbs")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.
This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):
[ 154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[ 154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G I
[ 154.675856] -----------------------------------------------------
[ 154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[ 154.700927]
and this task is already holding:
[ 154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[ 154.718028] which would create a new lock dependency:
[ 154.723705] (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[ 154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 154.740798] (&(&cq->lock)->rlock){..-.}
[ 154.740800]
... which became SOFTIRQ-irq-safe at:
[ 154.752163] _raw_spin_lock_irqsave+0x3e/0x50
[ 154.757163] mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[ 154.762554] ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[ 154.815603] (&(&qp_table->lock)->rlock){+.+.}
[ 154.815604]
... which became SOFTIRQ-irq-unsafe at:
[ 154.827718] ...
[ 154.827720] _raw_spin_lock+0x35/0x50
[ 154.833912] mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[ 154.839302] mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]
Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.
Fixes: 6dc06c08be ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On newer PHYs, we need to select the expansion register to write with
setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
to being migrated to generic code under bcm-phy-lib.c which
unfortunately used the older implementation from the BCM54xx days.
Fix this by creating an inline stub: bcm_write_exp_sel() which adds the
correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY
and BCM7xxx PHY drivers which require setting these bits.
broadcom.c is unchanged because some PHYs even use a different selector
method, so let them specify it directly (e.g: SerDes secondary selector).
Fixes: a1cba5613e ("net: phy: Add Broadcom phy library for common interfaces")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are currently doing auxiliary control register reads with the shadow
register value 0b111 (0x7) which incidentally is also the selector value
that should be present in bits [2:0]. Fix this by using the appropriate
selector mask which is defined (MII_BCM54XX_AUXCTL_SHDWSEL_MASK).
This does not have a functional impact yet because we always access the
MII_BCM54XX_AUXCTL_SHDWSEL_MISC (0x7) register in the current code.
This might change at some point though.
Fixes: 5b4e290051 ("net: phy: broadcom: add bcm54xx_auxctl_read")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in mlx4_dbg debug message and also
change the phrasing of the message so that is is more readable
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When enabling the sub-CRQ IRQ a previous update sent a H_EOI prior
to the enablement to clear any pending interrupts that may be present
across a partition migration. This fixed a firmware bug where a
migration could erroneously indicate that a H_EOI was pending.
The H_EOI should only be sent when enabling during a mobility
event though. Doing so at other time could wrong and can produce
extra driver output when IRQs are enabled when doing TX completion.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hopefully the last fixes for 4.17. ssb is again causing problems so we
had to revert a commit and fix it better. Also a small fix to bcma and
some MAINTAINERS file updates.
ssb
* fix regression with all module PCI cards, for example using b43 and
b44 drivers
* try again fixing a MIPS linker error
bcma
* fix truncated info log messages
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJbBCW1AAoJEG4XJFUm622bRIQH/R8BlzcVdvPx330v73NVuw/+
ZJ8OkSC0/j6lIVoQUcGXePP2P5bWZNZubgf+kneIOHVdGbNBdgvHC9hUMjl0Hi+V
Hweop6UfUN5lXIT/vI1TKiP4EwQhhjqboaxaq6s5ostCgi8/gc/pIDuppolNjS/M
6hI+PaQmbN/nPrzcm9GW50FAmPsI0XX77rwyIIJ40kCElChZEe23906El/BgU5pv
3dZ6vZaTxSbCNlRAVgk/W5sqak/XGYuJumdZ3vUFLKfcLTEhqytoVdvUkaDaAGA7
S1AuakMPa0J3oBcwDf8LEnHoJwPrOOqLF4CAcV9fdsI9hIDoQx+3hDMMh7yDDg4=
=VtOu
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2018-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.17
Hopefully the last fixes for 4.17. ssb is again causing problems so we
had to revert a commit and fix it better. Also a small fix to bcma and
some MAINTAINERS file updates.
ssb
* fix regression with all module PCI cards, for example using b43 and
b44 drivers
* try again fixing a MIPS linker error
bcma
* fix truncated info log messages
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When link is down, writes to the device might fail with
-EIO. Userspace needs an indication when the status is resolved. As a
fix, tun_net_open() attempts to wake up writers - but that is only
effective if SOCKWQ_ASYNC_NOSPACE has been set in the past. This is
not the case of vhost_net which only poll for EPOLLOUT after it meets
errors during sendmsg().
This patch fixes this by making sure SOCKWQ_ASYNC_NOSPACE is set when
socket is not writable or device is down to guarantee EPOLLOUT will be
raised in either tun_chr_poll() or tun_sock_write_space() after device
is up.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Fixes: 1bd4978a88 ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang says:
====================
Fix several issues of virtio-net mergeable XDP
Please review the patches that tries to fix several issues of
virtio-net mergeable XDP.
Changes from V1:
- check against 1 before decreasing instead of resetting to 1
- typoe fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to drop refcnt to xdp_page if we see a gso packet. Otherwise
it will be leaked. Fixing this by moving the check of gso packet above
the linearizing logic. While at it, remove useless comment as well.
Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 72979a6c35 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we successfully linearize the packet, num_buf will be set to zero
which may confuse error handling path which assumes num_buf is at
least 1 and this can lead the code tries to pop the descriptor of next
buffer. Fixing this by checking num_buf against 1 before decreasing.
Fixes: 4941d472bf ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not go for the error path after successfully transmitting a
XDP buffer after linearizing. Since the error path may try to pop and
drop next packet and increase the drop counters. Fixing this by simply
drop the refcnt of original page and go for xmit path.
Fixes: 72979a6c35 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a linearized packet was redirected by XDP, we should not go for
the err path which will try to pop buffers for the next packet and
increase the drop counter. Fixing this by just drop the page refcnt
for the original page.
Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* hwsim radio dump wasn't working for the first radio
* mesh was updating statistics incorrectly
* a netlink message allocation was possibly too short
* wiphy name limit was still too long
* in certain cases regdb query could find a NULL pointer
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlsFNRQACgkQB8qZga/f
l8RTBw//Txtrv6BHZ5VHaibMwFfXB9UIuhzogcuUYzxF1qXzF4l4N2GehUGdlKPy
pjNGwYbqtD1b/mCa/BSGAHcuHXSQNmVRVOv3Vjvb6XtAPTVXiQcYWYqRA+F90IcL
gpfrl1RQKHGoZ8S/DST8YEtzgEog9hRr/WvnOCphVqnDohUM5UIv2iPF8Vp5ylQ8
cIwVa7/QgPJ5vG8EE7aPJTHnga9kNRqvlIAODq8H5QwwFTUPP431AjUHK7/nL/+l
GQF1CXA2mQldPowsNK82vS8guaIykD3wxLeuWBiHCa7EExmX5eA5NlySvvAwd1bE
2fM4/vTA5X0jNzqIYzVqZS4rbEHu7h/Kcm1QWCydl0xeKxONO0nfJj+AdWUBE2YH
g9CHEqnChIJgw43kXbN2E2WflDRL+k4yjQlvhbWIcr9yk/8pdO4mkD6qpwGbBQsn
kyeWbhB58M7IGAkTqrx9FeK7rCnPO5SZGFRD7Rpou0S5ioP7ce/xvkv5gvYN3Knu
OUsQv42mwY23cx2XEyeJ/4pXMaihw1lRyiTHSgpJjha2XdmlvvfPu5/pJcSft4jt
weQTot6ugCGimyBx/5bsJIczuHMVE1pD9ctjty5I0Xxfv/Qj78HSwFlB3RKWsFnG
fzksSC1ik5OSYBOi6vFiRO10EUBlVjNXPhP8x5LnAkBWwHp5juM=
=KWsp
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2018-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A handful of fixes:
* hwsim radio dump wasn't working for the first radio
* mesh was updating statistics incorrectly
* a netlink message allocation was possibly too short
* wiphy name limit was still too long
* in certain cases regdb query could find a NULL pointer
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAlsFBrQACgkQUa+KL4f8
d2HoJA//eG5ACgx8Lmop2y+b37PQ1M4sgQj0sf0POq1aFKxLIotKeK2V9SQE73HQ
0kcxNx7rXZT5dHBEdycf3W5LbmoK1X/Iz+GxMFR2RQ5plkAyT4h4uUQhRf9/5HVp
xrXyaJw2twOZInQ0dr1bE8IbDlL8xiH3kkhtzMmeS+xaz2aGsf5gfNq+fzhlyz2K
L8MTCBujYSR4KY3U8O7ifN3RobHtsYqAJhRej6JP7jlvxHcd3ZaOPQUHLTlVxZ8q
rUpv8Q9Ay0i/1+Yq/mr9Dj8YdszrtR9HQi0hH/nEO74POkOnrhADZsw8EoSCfRJN
pP+LNM/OrzCZjnzRwPjdv652DztBWxf9qJABz/F+EFVB5mLJkHNzQgcwMRXCd7VK
6lftkMzBWm8+uy9nLR9vBMSxifzGuEFARuIomFLM5XuaZoNcpK1xmntuzzPBzINU
Q8yBU3z9aCyfIBxVQ+dog1gg514aLLdoCK2iWmvAEpwW1T3wwI5972kjdg9pp3t6
2F0M+v/ClJEkxv9oeblJtiAZMYRXDahJ4uZ+IzbstQ1ouENrXBkr+hcIiMZFJdJB
8eTPworlU83BlxLvmgKsGXE6VLOpJjFVHcQHhBR1UX0eFnU7/Z3cXzYvSml1EJC6
kJtK83D28MzcZj1rGZiSwtMAKhyqhaKz1uJHgvpBs2QxE+VP60s=
=2BAn
-----END PGP SIGNATURE-----
Merge tag 'mfd-fixes-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fix from Lee Jones:
"A single cros_ec_spi fix correcting the handling for long-running
commands"
* tag 'mfd-fixes-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: cros_ec: Retry commands when EC is known to be busy
Pull alpha fixes from Matt Turner:
"A few small changes for alpha"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2
alpha: simplify get_arch_dma_ops
alpha: use dma_direct_ops for jensen
We have had problems displaying fbdev after a resume and as a
workaround we have had to call vmw_fb_refresh(). This has had
a number of unwanted side-effects. The root of the problem was,
however that the coalesced fbdev dirty region was not empty on
the first dirty_mark() after a resume, so a flush was never
scheduled.
Fix this by force scheduling an fbdev flush after resume, and
remove the workaround.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
The error paths were leaking opened channels.
Fix by using dedicated error paths.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Depending on whether the kernel is compiled with frame-pointer or not,
the temporary memory location used for the bp parameter in these macros
is referenced relative to the stack pointer or the frame pointer.
Hence we can never reference that parameter when we've modified either
the stack pointer or the frame pointer, because then the compiler would
generate an incorrect stack reference.
Fix this by pushing the temporary memory parameter on a known location on
the stack before modifying the stack- and frame pointers.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
The reuseport_bpf_numa test case fails there's no numa support. The
test shouldn't fail if there's no support it should be skipped.
Fixes: 3c2c3c16aa ("reuseport, bpf: add test case for bpf_get_numa_node_id")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Only CPUs which speculate can speculate. Therefore, it seems prudent
to test for cpu_no_speculation first and only then determine whether
a specific speculating CPU is susceptible to store bypass speculation.
This is underlined by all CPUs currently listed in cpu_no_speculation
were present in cpu_no_spec_store_bypass as well.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: konrad.wilk@oracle.com
Link: https://lkml.kernel.org/r/20180522090539.GA24668@light.dominikbrodowski.net
The X86_FEATURE_SSBD is an synthetic CPU feature - that is
it bit location has no relevance to the real CPUID 0x7.EBX[31]
bit position. For that we need the new CPU feature name.
Fixes: 52817587e7 ("x86/cpufeatures: Disentangle SSBD enumeration")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20180521215449.26423-2-konrad.wilk@oracle.com
Commit 001dde9400 ("mfd: cros ec: spi: Fix "in progress" error
signaling") pointed out some bad code, but its analysis and conclusion
was not 100% correct.
It *is* correct that we should not propagate result==EC_RES_IN_PROGRESS
for transport errors, because this has a special meaning -- that we
should follow up with EC_CMD_GET_COMMS_STATUS until the EC is no longer
busy. This is definitely the wrong thing for many commands, because
among other problems, EC_CMD_GET_COMMS_STATUS doesn't actually retrieve
any RX data from the EC, so commands that expected some data back will
instead start processing junk.
For such commands, the right answer is to either propagate the error
(and return that error to the caller) or resend the original command
(*not* EC_CMD_GET_COMMS_STATUS).
Unfortunately, commit 001dde9400 forgets a crucial point: that for
some long-running operations, the EC physically cannot respond to
commands any more. For example, with EC_CMD_FLASH_ERASE, the EC may be
re-flashing its own code regions, so it can't respond to SPI interrupts.
Instead, the EC prepares us ahead of time for being busy for a "long"
time, and fills its hardware buffer with EC_SPI_PAST_END. Thus, we
expect to see several "transport" errors (or, messages filled with
EC_SPI_PAST_END). So we should really translate that to a retryable
error (-EAGAIN) and continue sending EC_CMD_GET_COMMS_STATUS until we
get a ready status.
IOW, it is actually important to treat some of these "junk" values as
retryable errors.
Together with commit 001dde9400, this resolves bugs like the
following:
1. EC_CMD_FLASH_ERASE now works again (with commit 001dde9400, we
would abort the first time we saw EC_SPI_PAST_END)
2. Before commit 001dde9400, transport errors (e.g.,
EC_SPI_RX_BAD_DATA) seen in other commands (e.g.,
EC_CMD_RTC_GET_VALUE) used to yield junk data in the RX buffer; they
will now yield -EAGAIN return values, and tools like 'hwclock' will
simply fail instead of retrieving and re-programming undefined time
values
Fixes: 001dde9400 ("mfd: cros ec: spi: Fix "in progress" error signaling")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
memory-barriers.txt has been updated with the following requirement.
"When using writel(), a prior wmb() is not needed to guarantee that the
cache coherent memory writes have completed before writing to the MMIO
region."
Current writeX() and iowriteX() implementations on alpha are not
satisfying this requirement as the barrier is after the register write.
Move mb() in writeX() and iowriteX() functions to guarantee that HW
observes memory changes before performing register operations.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
The generic dma_direct implementation does the same thing as the alpha
pci-noop implementation, just with more bells and whistles. And unlike
the current code it at least has a theoretical chance to actually compile.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Make sure to invoke pci_disable_device() when errors occur in
pcnet32_probe_pci().
Signed-off-by: Bo Chen <chenbo@pdx.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
ILT entry requires 12 bit right shifted physical address.
Existing mask for ILT entry of physical address i.e.
ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit
address because upper 8 bits of 64 bit address were getting
masked which resulted in completer abort error on
PCIe bus due to invalid address.
Fix that mask to handle 64bit physical address.
Fixes: fe56b9e6a8 ("qed: Add module with basic common support")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In divasmain.c, the function divas_write() firstly invokes the function
diva_xdi_open_adapter() to open the adapter that matches with the adapter
number provided by the user, and then invokes the function diva_xdi_write()
to perform the write operation using the matched adapter. The two functions
diva_xdi_open_adapter() and diva_xdi_write() are located in diva.c.
In diva_xdi_open_adapter(), the user command is copied to the object 'msg'
from the userspace pointer 'src' through the function pointer 'cp_fn',
which eventually calls copy_from_user() to do the copy. Then, the adapter
number 'msg.adapter' is used to find out a matched adapter from the
'adapter_queue'. A matched adapter will be returned if it is found.
Otherwise, NULL is returned to indicate the failure of the verification on
the adapter number.
As mentioned above, if a matched adapter is returned, the function
diva_xdi_write() is invoked to perform the write operation. In this
function, the user command is copied once again from the userspace pointer
'src', which is the same as the 'src' pointer in diva_xdi_open_adapter() as
both of them are from the 'buf' pointer in divas_write(). Similarly, the
copy is achieved through the function pointer 'cp_fn', which finally calls
copy_from_user(). After the successful copy, the corresponding command
processing handler of the matched adapter is invoked to perform the write
operation.
It is obvious that there are two copies here from userspace, one is in
diva_xdi_open_adapter(), and one is in diva_xdi_write(). Plus, both of
these two copies share the same source userspace pointer, i.e., the 'buf'
pointer in divas_write(). Given that a malicious userspace process can race
to change the content pointed by the 'buf' pointer, this can pose potential
security issues. For example, in the first copy, the user provides a valid
adapter number to pass the verification process and a valid adapter can be
found. Then the user can modify the adapter number to an invalid number.
This way, the user can bypass the verification process of the adapter
number and inject inconsistent data.
This patch reuses the data copied in
diva_xdi_open_adapter() and passes it to diva_xdi_write(). This way, the
above issues can be avoided.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is no license information in the header of
this file.
The MODULE_LICENSE field contains ("GPL"), which means
GNU Public License v2 or later, so add a corresponding
SPDX license identifier.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adopt the SPDX license identifier headers to ease license compliance
management.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags
param can't be passed into its proto .connect where this flags is really
needed.
sctp works around it by getting flags from socket file in __sctp_connect.
It works for connecting from userspace, as inherently the user sock has
socket file and it passes f_flags as the flags param into the proto_ops
.connect.
However, the sock created by sock_create_kern doesn't have a socket file,
and it passes the flags (like O_NONBLOCK) by using the flags param in
kernel_connect, which calls proto_ops .connect later.
So to fix it, this patch defines a new proto_ops .connect for sctp,
sctp_inet_connect, which calls __sctp_connect() directly with this
flags param. After this, the sctp's proto .connect can be removed.
Note that sctp_inet_connect doesn't need to do some checks that are not
needed for sctp, which makes thing better than with inet_dgram_connect.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
If userspace faults on a kernel address, handing them the raw ESR
value on the sigframe as part of the delivered signal can leak data
useful to attackers who are using information about the underlying hardware
fault type (e.g. translation vs permission) as a mechanism to defeat KASLR.
However there are also legitimate uses for the information provided
in the ESR -- notably the GCC and LLVM sanitizers use this to report
whether wild pointer accesses by the application are reads or writes
(since a wild write is a more serious bug than a wild read), so we
don't want to drop the ESR information entirely.
For faulting addresses in the kernel, sanitize the ESR. We choose
to present userspace with the illusion that there is nothing mapped
in the kernel's part of the address space at all, by reporting all
faults as level 0 translation faults taken to EL1.
These fields are safe to pass through to userspace as they depend
only on the instruction that userspace used to provoke the fault:
EC IL (always)
ISV CM WNR (for all data aborts)
All the other fields in ESR except DFSC are architecturally RES0
for an L0 translation fault taken to EL1, so can be zeroed out
without confusing userspace.
The illusion is not entirely perfect, as there is a tiny wrinkle
where we will report an alignment fault that was not due to the memory
type (for instance a LDREX to an unaligned address) as a translation
fault, whereas if you do this on real unmapped memory the alignment
fault takes precedence. This is not likely to trip anybody up in
practice, as the only users we know of for the ESR information who
care about the behaviour for kernel addresses only really want to
know about the WnR bit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE
driver flags) inadvertently prevented the power.direct_complete flag
from being set for devices without PM callbacks and with disabled
runtime PM which also prevents power.direct_complete from being set
for their parents. That led to problems including a resume crash on
HP ZBook 14u.
Restore the previous behavior by causing power.direct_complete to be
set for those devices again, but do that in a more direct way to
avoid overlooking that case in the future.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693
Fixes: 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags)
Reported-by: Thomas Martitz <kugel@rockbox.org>
Tested-by: Thomas Martitz <kugel@rockbox.org>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Johan Hovold <johan@kernel.org>