When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also free's the qstats.
Fixes: b0ab6f9275 ("net: sched: enable per cpu qstats")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LSR instruction cannot be used to perform a zero right shift since a
0 as the immediate value (imm5) in the LSR instruction encoding means
that a shift of 32 is perfomed. See DecodeIMMShift() in the ARM ARM.
Make the JIT skip generation of the LSR if a zero-shift is requested.
This was found using american fuzzy lop.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit acf673a318 fixed a user triggerable free
memory scribble but in doing so replaced it with a different one that allows
the user to control the data and scribble even more.
sixpack_close is called by the tty layer in tty context. The tty context is
protected by sp_get() and sp_put(). However network layer activity via
sp_xmit() is not protected this way. We must therefore stop the queue
otherwise the user gets to dump a buffer mostly of their choice into freed
kernel pages.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value. All the BPF JITs fail to clear A if this is used as
the first instruction in a filter. This was found using american fuzzy
lop.
Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs. Except for ARM, the
rest have only been compile-tested.
Fixes: 3480593131 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[I stole this patch from Eric Biederman. He wrote:]
> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.
[Hannes: changed patch using netns_eq]
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice
system call and AF_UNIX sockets,
http://lists.openwall.net/netdev/2015/11/06/24
The situation was analyzed as
(a while ago) A: socketpair()
B: splice() from a pipe to /mnt/regular_file
does sb_start_write() on /mnt
C: try to freeze /mnt
wait for B to finish with /mnt
A: bind() try to bind our socket to /mnt/new_socket_name
lock our socket, see it not bound yet
decide that it needs to create something in /mnt
try to do sb_start_write() on /mnt, block (it's
waiting for C).
D: splice() from the same pipe to our socket
lock the pipe, see that socket is connected
try to lock the socket, block waiting for A
B: get around to actually feeding a chunk from
pipe to file, try to lock the pipe. Deadlock.
on 2015/11/10 by Al Viro,
http://lists.openwall.net/netdev/2015/11/10/4
The patch fixes this by removing the kern_path_create related code from
unix_mknod and executing it as part of unix_bind prior acquiring the
readlock of the socket in question. This means that A (as used above)
will sb_start_write on /mnt before it acquires the readlock, hence, it
won't indirectly block B which first did a sb_start_write and then
waited for a thread trying to acquire the readlock. Consequently, A
being blocked by C waiting for B won't cause a deadlock anymore
(effectively, both A and B acquire two locks in opposite order in the
situation described above).
Dmitry Vyukov(<dvyukov@google.com>) tested the original patch.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commands run in a vrf context are not failing as expected on a route lookup:
root@kenny:~# ip ro ls table vrf-red
unreachable default
root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
ping: Warning: source address might be selected on device other than vrf-red.
PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.
--- 10.100.1.254 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propogating a lookup failure to the user allows the command to fail as
expected:
root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
connect: No route to host
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the reset_resume() is called, the flag of SELECTIVE_SUSPEND should be
cleared and reinitialize the device, whether the SELECTIVE_SUSPEND is set
or not. If reset_resume() is called, it means the power supply is cut or the
device is reset. That is, the device wouldn't be in runtime suspend state and
the reinitialization is necessary.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry reports memleak with syskaller program.
Problem is that connector bumps skb usecount but might not invoke callback.
So move skb_get to where we invoke the callback.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since t4_alloc_mem can be failed in memory pressure,
if not properly handled, NULL dereference could be happened.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since qlcnic_alloc_mbx_args can be failed,
return value should be checked.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HiSilicon host bridge driver
Fix 32-bit config reads (Dongdong Liu)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWhZsJAAoJEFmIoMA60/r8JO8P/2zpQEewcPoFqtPFoAXXNVCg
vRnUlSVs5kz3cj3gUjbSjannWJZXFvsFgz/V+f3nxaOSLel550ccphhdS+oyh3L+
NVeQka8nnIbsVmGvNmebxNteBXa2CTGlZB4snRHQw+n1XjacqPQOMeccN09jCXmK
GBdJvs1Xs2rphGHq52cLkkqUdSCEayUiYK/4WgAzcBe8EFy5kWvbObcoBuX9/3Lm
fjnoWPXYSZFr+uyW8Q5+MztrpXJeOZV/krRZjcH2NxnLr1Xs+PnrC/NNu3BZvKnH
qGyLc3vMIpeYS2VGiwJDKzmahyKm4Elh1iJNoywHIGPf3o0WzjJgnsiryZWomytd
nVueiL8Oy0wUxoLupnFGdBIgbNvBeQSdeqcrXzjRfYHdHn3iakQTarUpVjqDUOEW
4iO4R+Xohq6X4Yhdr9RFxg2tCLk4dJebvwRNSGwTPmDnPZqzoQmg5uK84R1QrlD7
BM/ggHPryOogmeCqr7wCifkl73pMcvlK7maKUZcTgBz1E9aCeaGbz7Nc3KhUxSYV
jvP84dEBx0QN5M3523sn/TNRZsAztUaBgGJLwuLetPazOgGORZD2msMqpTCr1a3J
4TQjadvc5RWG4MBOeU9r2WdQeZdSwj/X41XVLVh3qCZaYQCz8aBGMb/PGdgWz2kZ
cBunX4VY1+S/EHu4abuw
=5MNY
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI bugfix from Bjorn Helgaas:
"Here's another fix for v4.4.
This fixes 32-bit config reads for the HiSilicon driver. Obviously
the driver is completely broken without this fix (apparently it
actually was tested internally, but got broken somehow in the process
of upstreaming it).
Summary:
HiSilicon host bridge driver
Fix 32-bit config reads (Dongdong Liu)"
* tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: hisi: Fix hisi_pcie_cfg_read() 32-bit reads
Pull sparc fixes from David Miller:
"Just some missing syscall wire ups"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Wire up mlock2 system call.
sparc: Add all necessary direct socket system calls.
Pull networking fixes from David Miller:
1) Prevent XFRM per-cpu counter updates for one namespace from being
applied to another namespace. Fix from DanS treetman.
2) Fix RCU de-reference in iwl_mvm_get_key_sta_id(), from Johannes
Berg.
3) Remove ethernet header assumption in nft_do_chain_netdev(), from
Pablo Neira Ayuso.
4) Fix cpsw PHY ident with multiple slaves and fixed-phy, from Pascal
Speck.
5) Fix use after free in sixpack_close and mkiss_close.
6) Fix VXLAN fw assertion on bnx2x, from Yuval Mintz.
7) natsemi doesn't check for DMA mapping errors, from Alexey
Khoroshilov.
8) Fix inverted test in ip6addrlbl_get(), from ANdrey Ryabinin.
9) Missing initialization of needed_headroom in geneve tunnel driver,
from Paolo Abeni.
10) Fix conntrack template leak in openvswitch, from Joe Stringer.
11) Mission initialization of wq->flags in sock_alloc_inode(), from
Nicolai Stange.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
net, socket, socket_wq: fix missing initialization of flags
drivers: net: cpsw: fix error return code
openvswitch: Fix template leak in error cases.
sctp: label accepted/peeled off sockets
sctp: use GFP_USER for user-controlled kmalloc
qlcnic: fix a loop exit condition better
net: cdc_ncm: avoid changing RX/TX buffers on MTU changes
geneve: initialize needed_headroom
ipv6: honor ifindex in case we receive ll addresses in router advertisements
addrconf: always initialize sysctl table data
ipv6/addrlabel: fix ip6addrlbl_get()
switchdev: bridge: Pass ageing time as clock_t instead of jiffies
sh_eth: fix 16-bit descriptor field access endianness too
veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
net: usb: cdc_ncm: Adding Dell DW5813 LTE AT&T Mobile Broadband Card
net: usb: cdc_ncm: Adding Dell DW5812 LTE Verizon Mobile Broadband Card
natsemi: add checks for dma mapping errors
rhashtable: Kill harmless RCU warning in rhashtable_walk_init
openvswitch: correct encoding of set tunnel action attributes
...
The GLIBC folks would like to eliminate socketcall support
eventually, and this makes sense regardless so wire them
all up.
Signed-off-by: David S. Miller <davem@davemloft.net>
In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().
So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".
But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ceb5d58b21 ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.
Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().
One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection lead to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of times and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.
Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.
Fixes: ceb5d58b21 ("net: fix sock_wake_async() rcu protection")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull block fixes from Jens Axboe:
"Make the block layer great again.
Basically three amazing fixes in this pull request, split into 4
patches. Believe me, they should go into 4.4. Two of them fix a
regression, the third and last fixes an easy-to-trigger bug.
- Fix a bad irq enable through null_blk, for queue_mode=1 and using
timer completions. Add a block helper to restart a queue
asynchronously, and use that from null_blk. From me.
- Fix a performance issue in NVMe. Some devices (Intel Pxxxx) expose
a stripe boundary, and performance suffers if we cross it. We took
that into account for merging, but not for the newer splitting
code. Fix from Keith.
- Fix a kernel oops in lightnvm with multiple channels. From Matias"
* 'for-linus' of git://git.kernel.dk/linux-block:
lightnvm: wrong offset in bad blk lun calculation
null_blk: use async queue restart helper
block: add blk_start_queue_async()
block: Split bios on chunk boundaries
Merge misc fixes from Andrew Morton:
"9 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/vmstat: fix overflow in mod_zone_page_state()
ocfs2/dlm: clear migration_pending when migration target goes down
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
ocfs2: fix flock panic issue
m32r: add io*_rep helpers
m32r: fix build failure
arch/x86/xen/suspend.c: include xen/xen.h
mm: memcontrol: fix possible memcg leak due to interrupted reclaim
ocfs2: fix BUG when calculate new backup super
Pull vfs fix from Al Viro:
"Fix for 3.15 breakage of fcntl64() in arm OABI compat. -stable
fodder"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()
mod_zone_page_state() takes a "delta" integer argument. delta contains
the number of pages that should be added or subtracted from a struct
zone's vm_stat field.
If a zone is larger than 8TB this will cause overflows. E.g. for a
zone with a size slightly larger than 8TB the line
mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);
in mm/page_alloc.c:free_area_init_core() will result in a negative
result for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB
contain 0x8xxxxxxx pages which will be sign extended to a negative
value.
Fix this by changing the delta argument to long type.
This could fix an early boot problem seen on s390, where we have a 9TB
system with only one node. ZONE_DMA contains 2GB and ZONE_NORMAL the
rest. The system is trying to allocate a GFP_DMA page but ZONE_DMA is
completely empty, so it tries to reclaim pages in an endless loop.
This was seen on a heavily patched 3.10 kernel. One possible
explaination seem to be the overflows caused by mod_zone_page_state().
Unfortunately I did not have the chance to verify that this patch
actually fixes the problem, since I don't have access to the system
right now. However the overflow problem does exist anyway.
Given the description that a system with slightly less than 8TB does
work, this seems to be a candidate for the observed problem.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have found a BUG on res->migration_pending when migrating lock
resources. The situation is as follows.
dlm_mark_lockres_migration
res->migration_pending = 1;
__dlm_lockres_reserve_ast
dlm_lockres_release_ast returns with res->migration_pending remains
because other threads reserve asts
wait dlm_migration_can_proceed returns 1
>>>>>>> o2hb found that target goes down and remove target
from domain_map
dlm_migration_can_proceed returns 1
dlm_mark_lockres_migrating returns -ESHOTDOWN with
res->migration_pending still remains.
When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
with res->migration_pending. So clear migration_pending when target is
down.
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range. pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.
Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.
This also prevents a crash from offlining memory devices with missing
sections. Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Greg KH <greg@kroah.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
m32r allmodconfig was failing with the error:
error: implicit declaration of function 'read'
On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.
At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
m32r allmodconfig is failing with:
In file included from ../include/linux/kvm_para.h:4:0,
from ../kernel/watchdog.c:26:
../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory
kvm_para.h was not included in the build.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the build warning:
arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
if (xen_pv_domain())
^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break()
once enough pages have been reclaimed, in which case, in contrast to a
full round-trip over a cgroup sub-tree, the current position stored in
mem_cgroup_reclaim_iter of the target cgroup does not get invalidated
and so is left holding the reference to the last scanned cgroup. If the
target cgroup does not get scanned again (we might have just reclaimed
the last page or all processes might exit and free their memory
voluntary), we will leak it, because there is nobody to put the
reference held by the iterator.
The problem is easy to reproduce by running the following command
sequence in a loop:
mkdir /sys/fs/cgroup/memory/test
echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
memhog 150M
echo $$ > /sys/fs/cgroup/memory/cgroup.procs
rmdir test
The cgroups generated by it will never get freed.
This patch fixes this issue by making mem_cgroup_iter avoid taking
reference to the current position. In order not to hit use-after-free
bug while running reclaim in parallel with cgroup deletion, we make use
of ->css_released cgroup callback to clear references to the dying
cgroup in all reclaim iterators that might refer to it. This callback
is called right before scheduling rcu work which will free css, so if we
access iter->position from rcu read section, we might be sure it won't
go away under us.
[hannes@cmpxchg.org: clean up css ref handling]
Fixes: 5ac8fb31ad ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [3.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When resizing, it firstly extends the last gd. Once it should backup
super in the gd, it calculates new backup super and update the
corresponding value.
But it currently doesn't consider the situation that the backup super is
already done. And in this case, it still sets the bit in gd bitmap and
then decrease from bg_free_bits_count, which leads to a corrupted gd and
trigger the BUG in ocfs2_block_group_set_bits:
BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
So check whether the backup super is done and then do the updates.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Propagate the return value of platform_get_irq on failure.
A simplified version of the semantic match that finds the two cases where
no error code is returned at all is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
identifier ret; expression e1,e2;
@@
(
if (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
reference leak on helper objects, but inadvertently introduced a leak on
the ct template.
Previously, ct_info.ct->general.use was initialized to 0 by
nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
returned successful. If an error occurred while adding the helper or
adding the action to the actions buffer, the __ovs_ct_free_action()
cleanup would use nf_ct_put() to free the entry; However, this relies on
atomic_dec_and_test(ct_info.ct->general.use). This reference must be
incremented first, or nf_ct_put() will never free it.
Fix the issue by acquiring a reference to the template immediately after
allocation.
Fixes: cae3a26275 ("openvswitch: Allow attaching helpers to ct action")
Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev->nr_luns reports the total number of luns available in a device
while dev->luns_per_chnl is the number of luns per channel.
When multiple channels are available, the offset is calculated from a
channel and lun id into a linear array. As it multiplies with
the total number of luns, we go out of bound when channel id > 0 and
causes the kernel to panic when we read a protected kernel memory area.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
- A previous patch updated the mlx4 driver to use vmalloc when there was
not enough memory to get a contiguous region large enough for our needs,
so we need kvfree() whenever we free that item. We missed one place,
so fix that now.
- A previous patch added code to match incoming packets against a specific
device, but failed to compensate for devices that have both InfiniBand
and Ethernet ports. Fix that.
- Under certain vlan conditions, the ocrdma driver would fail to bring up
any vlan interfaces and would print out a circular locking failure. Fix
that.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWgWd5AAoJELgmozMOVy/dxzwP/3cfSifjE7Wp8QxgCw78gov3
Gt2zWDKNQqgAogrykCpivgDm4/CRoNBu3TnWEHl2DwoZhIT71h0H6H0bZr2cUtrv
ywWYiMX1WuR+oRG0gOCkpEuFr5fdGC3ngSNqQXSp+8BkcOCkN7andnU5JE/TXRTp
9g274kogZzYhH7LSxc29y8UFocXvfSywAsJlvpAS370j4OmxxD4EspXm6YJgVp4x
Q8s9xfyQD5sroY8VOH36IY1CvdeDBqplQI6yvr6IJ5AyJy7jt++HywTgfFWTIbgz
KktKC/7KHTPVXzwzOAexn0Ot+e42Yns7zmHaullX+liAyxPljJ844R98IBNlc2Ji
95AWP3YAsT0QOr/C5dchbbRdrPV8nq47sSrUO+mJ5Xb3pKpgMAMPlh/UK/BP40dg
6HsTqpBtNlYl4oILWo6xyxvpsk1w4m/Lad25LNwPc4Jp4BnlXbT/E1UO9FJiN3B3
E62Vo/IBBinhORmnUzUCTEePJ07izX406OZ/mh+6UqkYkiXvZjEDblcptqGQZaC7
Ng/mV3FfrilDm//W5b6/B3pTmwB496YjbALHev9nRS1XDg/gCQsVHpCjp6c5uPEW
qQ8uft4ZITgPX+dhoc4ur0jAygVx1w0dmbMwbCD5/LXpqxMF7BSeb/EslcFmeyzS
YwQq2tw8eZbhNaickM6e
=ePss
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Three late 4.4-rc fixes.
The first two were very small in terms of number of lines, the third
is more lines of change than I like this late in the cycle, but there
are positive test results from Avagotech and from my own test setup
with the target hardware, and given the problem was a 100% failure
case, I sent it through.
- A previous patch updated the mlx4 driver to use vmalloc when there
was not enough memory to get a contiguous region large enough for
our needs, so we need kvfree() whenever we free that item. We
missed one place, so fix that now.
- A previous patch added code to match incoming packets against a
specific device, but failed to compensate for devices that have
both InfiniBand and Ethernet ports. Fix that.
- Under certain vlan conditions, the ocrdma driver would fail to
bring up any vlan interfaces and would print out a circular locking
failure. Fix that"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/be2net: Remove open and close entry points
RDMA/ocrdma: Depend on async link events from CNA
RDMA/ocrdma: Dispatch only port event when port state changes
RDMA/ocrdma: Fix vlan-id assignment in qp parameters
IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq
IB/cma: cma_match_net_dev needs to take into account port_num
If null_blk is run in NULL_IRQ_TIMER mode and with queue_mode NULL_Q_RQ,
we need to restart the queue from the hrtimer interrupt. We can't
directly invoke the request_fn from that context, so punt the queue run
to async kblockd context.
Tested-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Jens Axboe <axboe@fb.com>
We currently only have an inline/sync helper to restart a stopped
queue. If drivers need an async version, they have to roll their
own. Add a generic helper instead.
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull crypto fix from Herbert Xu:
"This fixes a bug in the algif_skcipher interface that can trigger a
kernel WARN_ON from user-space. It does so by using the new skcipher
interface which unlike the previous ablkcipher does not need to create
extra geniv objects which is what was used to trigger the WARN_ON"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_skcipher - Use new skcipher interface
Pull key handling bugfix from James Morris:
"Fix a race between keyctl_read() and keyctl_revoke()"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
KEYS: Fix race between read and revoke
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issueing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
In order to resolve the above deadlock condition, ocrdma intorduced a
patch to stop listening to administrative open/close events generated from
be2net driver. It now depends on link-state-change async-event generated from
CNA. This change leaves behind dead code which used to generate administrative
open/close events. This patch cleans-up all that dead code from be2net.
Reported-by: Doug Ledford <dledford@redhat.com>
CC: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
With this patch we stop using administrative open and close events
injected by be2net driver. These events were used to dispatch PORT_ACTIVE
and PORT_ERROR events to the IB-stack. This patch implements a logic
to receive async-link-events generated from CNA whenever link-state-change
is detected. Now on, these async-events will be used to dispatch
PORT_ACTIVE and PORT_ERROR events to IB-stack.
Depending on async-events from CNA removes the need to hold device-list-mutex
and thus breaks the busy-wait scenario.
Reported-by: Doug Ledford <dledford@redhat.com>
CC: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Dispatch only port event to IB stack when port state changes.
Don't explicitly modify qps to error. Let application listen to
port events on async event queue or let QP fail with retry-exceeded
completion error.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
vlan-id is wrongly getting as 0 when PFC is enabled.
Set vlan-id configured by user in QP parameters.
In case vlan interface is not used, flash a warning to
user to configure vlan and assign vlan-id as 0 in qp params.
Fixes: dbf727de74 ('IB/core: Use GID table in AH creation and dmac resolution')
Cc: Matan Barak <matanb@mellanox.com>
Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Accepted or peeled off sockets were missing a security label (e.g.
SELinux) which means that socket was in "unlabeled" state.
This patch clones the sock's label from the parent sock and resolves the
issue (similar to AF_BLUETOOTH protocol family).
Cc: Paul Moore <pmoore@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit cacc062152 ("sctp: use GFP_USER for user-controlled kmalloc")
missed two other spots.
For connectx, as it's more likely to be used by kernel users of the API,
it detects if GFP_USER should be used or not.
Fixes: cacc062152 ("sctp: use GFP_USER for user-controlled kmalloc")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS fixes from Ralf Baechle:
- Fix bitrot in __get_user_unaligned()
- EVA userspace accessor bug fixes.
- Fix for build issues with certain toolchains.
- Fix build error for VDSO with particular toolchain versions.
- Fix build error due to a variable that should have been removed by an
earlier patch
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix bitrot in __get_user_unaligned()
MIPS: Fix build error due to unused variables.
MIPS: VDSO: Fix build error
MIPS: CPS: drop .set mips64r2 directives
MIPS: uaccess: Take EVA into account in [__]clear_user
MIPS: uaccess: Take EVA into account in __copy_from_user()
MIPS: uaccess: Fix strlen_user with EVA
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
- A clock rate and a PHY setup fix for i.MX6Q/DL
- A couple of fixes for the reduced serial bus (sunxi-rsb) on Allwinner
- UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
- Suspend fix for Tegra124 Chromebooks
- Fix for missing implicit include that's different between ARM/ARM64
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWgDAQAAoJEIwa5zzehBx3muAP/RJD+oERDw/XiBSCl4k4kvTZ
M4ep2ExO/GPXewKXyK7rlbNWMitBwttdT5OmXuuxT7BA/ODAGk90uvZSebIHRxkh
bqnt+3njeR3M6FrTp6Np+yCh3I0hLYJoRInV3Vh8XQcml0LobIIq2BbdahIg/2JO
sSDLAzJSR15CfnnKOSexvfRFoslLaKG0ffbxmR32HiMgnbq8ZfkRr7fANsrXTzoX
oHxEV0uQXBY0d1TQ71rT/4KdTKN9QsdJI6xAjUGH95MYu+ZE0DEnW/wodd/f4e+p
dyV+qoS2LHJnhYgJho3r19icM0iyNRm0Yt0GqP7ERN/y5GGa41slE9eThMG7WQ4h
Ot5DEF2GmP7tSYhp4pqEeLqHnfou9+WzxxNP6wxGceqlg9EPuPRtzdCPStZYW4rD
6+SQCvFs0vHZlEbbAfQLhRgDpvr9enGjCcWm6ntUcgwxFy0CPmRV9g5gIeipZOwi
QfJM233tudUkDQxJYrCEgKxHy3a7T3K9LgYMM0VO1JRAEpaPIBmxZ0A+Y84WJbTE
7r+r3eDC8RFF3bLbNRb/Ogt5OHOQUdElhRXcyz/BYA/X4HOT4wyVNM12aBlmW9uN
aDB4HWE2lKV/h2pxWRR1Hem1NHYRUpcdVzkwRvncDHnxCAmb82BE2x1Ub3+fqa0v
f0eO7GjUDRdPsQXn0zZn
=kNRV
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smallish set of fixes that we've been sitting on for a while now,
flushing the queue here so they go in. Summary:
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
- A clock rate and a PHY setup fix for i.MX6Q/DL
- A couple of fixes for the reduced serial bus (sunxi-rsb) on
Allwinner
- UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
- Suspend fix for Tegra124 Chromebooks
- Fix for missing implicit include that's different between
ARM/ARM64"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
bus: sunxi-rsb: Fix peripheral IC mapping runtime address
bus: sunxi-rsb: Fix primary PMIC mapping hardware address
ARM: dts: Fix UART wakeirq for omap4 duovero parlor
ARM: OMAP2+: AM43xx: select ARM TWD timer
ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
fsl-ifc: add missing include on ARM64
ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property