Commit Graph

693160 Commits

Author SHA1 Message Date
Lorenzo Colitti
8a4b5784fa net: xfrm: don't double-hold dst when sk_policy in use.
While removing dst_entry garbage collection, commit 52df157f17
("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
changed xfrm_resolve_and_create_bundle so it returns an xdst with
a refcount of 1 instead of 0.

However, it did not delete the dst_hold performed by xfrm_lookup
when a per-socket policy is in use. This means that when a
socket policy is in use, dst entries returned by xfrm_lookup have
a refcount of 2, and are not freed when no longer in use.

Cc: Wei Wang <weiwan@google.com>
Fixes: 52df157f17 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
Tested: https://android-review.googlesource.com/417481
Tested: https://android-review.googlesource.com/418659
Tested: https://android-review.googlesource.com/424463
Tested: https://android-review.googlesource.com/452776 passes on net-next
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-08-24 13:01:14 +02:00
Eric Dumazet
2b33bc8aa2 net: dsa: use consume_skb()
Two kfree_skb() should be consume_skb(), to be friend with drop monitor
(perf record ... -e skb:kfree_skb)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 22:13:34 -07:00
David S. Miller
40e607cbee Merge branch 'nfp-fixes'
Jakub Kicinski says:

====================
nfp: fix SR-IOV deadlock and representor bugs

This series tackles the bug I've already tried to fix in commit
6d48ceb27a ("nfp: allocate a private workqueue for driver work").
I created a separate workqueue to avoid possible deadlock, and
the lockdep error disappeared, coincidentally.  The way workqueues
are operating, separate workqueue doesn't necessarily mean separate
thread of execution.  Luckily we can safely forego the lock.

Second fix changes the order in which vNIC netdevs and representors
are created/destroyed.  The fix is kept small and should be sufficient
for net because of how flower uses representors, a more thorough fix
will be targeted at net-next.

Third fix avoids leaking mapped frame buffers if FW sent a frame with
unknown portid.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:39:50 -07:00
Jakub Kicinski
1691a4c0f4 nfp: avoid buffer leak when representor is missing
When driver receives a muxed frame, but it can't find the representor
netdev it is destined to it will try to "drop" that frame, i.e. reuse
the buffer.  The issue is that the replacement buffer has already been
allocated at this point, and reusing the buffer from received frame
will leak it.  Change the code to put the new buffer on the ring
earlier and not reuse the old buffer (make the buffer parameter
to nfp_net_rx_drop() a NULL).

Fixes: 91bf82ca9e ("nfp: add support for tx/rx with metadata portid")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:39:44 -07:00
Jakub Kicinski
326ce60301 nfp: make sure representors are destroyed before their lower netdev
App start/stop callbacks can perform application initialization.
Unfortunately, flower app started using them for creating and
destroying representors.  This can lead to a situation where
lower vNIC netdev is destroyed while representors still try
to pass traffic.  This will most likely lead to a NULL-dereference
on the lower netdev TX path.

Move the start/stop callbacks, so that representors are created/
destroyed when vNICs are fully initialized.

Fixes: 5de73ee467 ("nfp: general representor implementation")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:39:44 -07:00
Jakub Kicinski
d6e1ab9ea3 nfp: don't hold PF lock while enabling SR-IOV
Enabling SR-IOV VFs will cause the PCI subsystem to schedule a
work and flush its workqueue.  Since the nfp driver schedules its
own work we can't enable VFs while holding driver load.  Commit
6d48ceb27a ("nfp: allocate a private workqueue for driver work")
tried to avoid this deadlock by creating a separate workqueue.
Unfortunately, due to the architecture of workqueue subsystem this
does not guarantee a separate thread of execution.  Luckily
we can simply take pci_enable_sriov() from under the driver lock.

Take pci_disable_sriov() from under the lock too for symmetry.

Fixes: 6d48ceb27a ("nfp: allocate a private workqueue for driver work")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:39:44 -07:00
David S. Miller
2f19f50e0f Merge branch 'dst-tag-ksz-fix'
Florian Fainelli says:

====================
net: dsa: Fix tag_ksz.c

This implements David's suggestion of providing low-level functions
to control whether skb_pad() and skb_put_padto() should be freeing
the passed skb.

We make use of it to fix a double free in net/dsa/tag_ksz.c that would
occur if we kept using skb_put_padto() in both places.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:33:49 -07:00
Florian Fainelli
4971667924 net: dsa: skb_put_padto() already frees nskb
The first call of skb_put_padto() will free up the SKB on error, but we
return NULL which tells dsa_slave_xmit() that the original SKB should be
freed so this would lead to a double free here.

The second skb_put_padto() already frees the passed sk_buff reference
upon error, so calling kfree_skb() on it again is not necessary.

Detected by CoverityScan, CID#1416687 ("USE_AFTER_FREE")

Fixes: e71cb9e009 ("net: dsa: ksz: fix skb freeing")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:33:49 -07:00
Florian Fainelli
cd0a137acb net: core: Specify skb_pad()/skb_put_padto() SKB freeing
Rename skb_pad() into __skb_pad() and make it take a third argument:
free_on_error which controls whether kfree_skb() should be called or
not, skb_pad() directly makes use of it and passes true to preserve its
existing behavior. Do exactly the same thing with __skb_put_padto() and
skb_put_padto().

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:33:49 -07:00
Stephan Gatzka
013dae5dbc net: stmmac: socfgpa: Ensure emac bit set in sys manager for MII/GMII/SGMII.
When using MII/GMII/SGMII in the Altera SoC, the phy needs to be
wired through the FPGA. To ensure correct behavior, the appropriate
bit in the System Manager FPGA Interface Group register needs to be
set.

Signed-off-by: Stephan Gatzka <stephan.gatzka@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-23 20:26:58 -07:00
Florian Fainelli
a1a50c8e4c fsl/man: Inherit parent device and of_node
Junote Cai reported that he was not able to get a DSA setup involving the
Freescale DPAA/FMAN driver to work and narrowed it down to
of_find_net_device_by_node(). This function requires the network device's
device reference to be correctly set which is the case here, though we have
lost any device_node association there.

The problem is that dpaa_eth_add_device() allocates a "dpaa-ethernet" platform
device, and later on dpaa_eth_probe() is called but SET_NETDEV_DEV() won't be
propagating &pdev->dev.of_node properly. Fix this by inherenting both the parent
device and the of_node when dpaa_eth_add_device() creates the platform device.

Fixes: 3933961682 ("fsl/fman: Add FMan MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 16:32:08 -07:00
Daniel Borkmann
33ba43ed0a bpf: fix map value attribute for hash of maps
Currently, iproute2's BPF ELF loader works fine with array of maps
when retrieving the fd from a pinned node and doing a selfcheck
against the provided map attributes from the object file, but we
fail to do the same for hash of maps and thus refuse to get the
map from pinned node.

Reason is that when allocating hash of maps, fd_htab_map_alloc() will
set the value size to sizeof(void *), and any user space map creation
requests are forced to set 4 bytes as value size. Thus, selfcheck
will complain about exposed 8 bytes on 64 bit archs vs. 4 bytes from
object file as value size. Contract is that fdinfo or BPF_MAP_GET_FD_BY_ID
returns the value size used to create the map.

Fix it by handling it the same way as we do for array of maps, which
means that we leave value size at 4 bytes and in the allocation phase
round up value size to 8 bytes. alloc_htab_elem() needs an adjustment
in order to copy rounded up 8 bytes due to bpf_fd_htab_map_update_elem()
calling into htab_map_update_elem() with the pointer of the map
pointer as value. Unlike array of maps where we just xchg(), we're
using the generic htab_map_update_elem() callback also used from helper
calls, which published the key/value already on return, so we need
to ensure to memcpy() the right size.

Fixes: bcc6b1b7eb ("bpf: Add hash of maps support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 16:32:02 -07:00
Florian Fainelli
fcd03e362b net: phy: Deal with unbound PHY driver in phy_attached_print()
Priit reported that stmmac was crashing with the trace below. This is because
phy_attached_print() is called too early right after the PHY device has been
found, but before it has a driver attached, since that is only done in
phy_probe() which occurs later.

Fix this by dealing with a possibly NULL phydev->drv point since that can
happen here, but could also happen if we voluntarily did an unbind of the
PHY device with the PHY driver.

sun7i-dwmac 1c50000.ethernet: PTP uses main clock
sun7i-dwmac 1c50000.ethernet: no reset control found
sun7i-dwmac 1c50000.ethernet: no regulator found
sun7i-dwmac 1c50000.ethernet: Ring mode enabled
sun7i-dwmac 1c50000.ethernet: DMA HW capability register supported
sun7i-dwmac 1c50000.ethernet: Normal descriptors
libphy: stmmac: probed
Unable to handle kernel NULL pointer dereference at virtual address 00000048
pgd = c0004000
[00000048] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc6-00318-g0065bd7fa384 #1
Hardware name: Allwinner sun7i (A20) Family
task: ee868000 task.stack: ee85c000
PC is at phy_attached_print+0x1c/0x8c
LR is at stmmac_mdio_register+0x12c/0x200
pc : [<c04510ac>]    lr : [<c045e6b4>]    psr: 60000013
sp : ee85ddc8  ip : 00000000  fp : c07dfb5c
r10: ee981210  r9 : 00000001  r8 : eea73000
r7 : eeaa6dd0  r6 : eeb49800  r5 : 00000000  r4 : 00000000
r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : eeb49800
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 4000406a  DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xee85c210)
Stack: (0xee85ddc8 to 0xee85e000)
ddc0:                   00000000 00000002 eeb49400 eea72000 00000000 eeb49400
dde0: c045e6b4 00000000 ffffffff eeab0810 00000000 c08051f8 ee9292c0 c016d480
de00: eea725c0 eea73000 eea72000 00000001 eea726c0 c0457d0c 00000040 00000020
de20: 00000000 c045b850 00000001 00000000 ee981200 eeab0810 eeaa6ed0 ee981210
de40: 00000000 c094a4a0 00000000 c0465180 eeaa7550 f08d0000 c9ffb90c 00000032
de60: fffffffa 00000032 ee981210 ffffffed c0a46620 fffffdfb c0a46620 c03f7be8
de80: ee981210 c0a9a388 00000000 00000000 c0a46620 c03f63e0 ee981210 c0a46620
dea0: ee981244 00000000 00000007 000000c6 c094a4a0 c03f6534 00000000 c0a46620
dec0: c03f6490 c03f49ec ee828a58 ee9217b4 c0a46620 eeaa4b00 c0a43230 c03f59fc
dee0: c08051f8 c094a49c c0a46620 c0a46620 00000000 c091c668 c093783c c03f6dfc
df00: ffffe000 00000000 c091c668 c010177c eefe0938 eefe0935 c085e200 000000c6
df20: 00000005 c0136bc8 60000013 c080b3a4 00000006 00000006 c07ce7b4 00000000
df40: c07d7ddc c07cef28 eefe0938 eefe093e c0a0b2f0 c0a641c0 c0a641c0 c0a641c0
df60: c0937834 00000007 000000c6 c094a4a0 00000000 c0900d88 00000006 00000006
df80: 00000000 c09005a8 00000000 c060ecf4 00000000 00000000 00000000 00000000
dfa0: 00000000 c060ecfc 00000000 c0107738 00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffdeffff ffffffff
[<c04510ac>] (phy_attached_print) from [<c045e6b4>] (stmmac_mdio_register+0x12c/0x200)
[<c045e6b4>] (stmmac_mdio_register) from [<c045b850>] (stmmac_dvr_probe+0x850/0x96c)
[<c045b850>] (stmmac_dvr_probe) from [<c0465180>] (sun7i_gmac_probe+0x120/0x180)
[<c0465180>] (sun7i_gmac_probe) from [<c03f7be8>] (platform_drv_probe+0x50/0xac)
[<c03f7be8>] (platform_drv_probe) from [<c03f63e0>] (driver_probe_device+0x234/0x2e4)
[<c03f63e0>] (driver_probe_device) from [<c03f6534>] (__driver_attach+0xa4/0xa8)
[<c03f6534>] (__driver_attach) from [<c03f49ec>] (bus_for_each_dev+0x4c/0x9c)
[<c03f49ec>] (bus_for_each_dev) from [<c03f59fc>] (bus_add_driver+0x190/0x214)
[<c03f59fc>] (bus_add_driver) from [<c03f6dfc>] (driver_register+0x78/0xf4)
[<c03f6dfc>] (driver_register) from [<c010177c>] (do_one_initcall+0x44/0x168)
[<c010177c>] (do_one_initcall) from [<c0900d88>] (kernel_init_freeable+0x144/0x1d0)
[<c0900d88>] (kernel_init_freeable) from [<c060ecfc>] (kernel_init+0x8/0x110)
[<c060ecfc>] (kernel_init) from [<c0107738>] (ret_from_fork+0x14/0x3c)
Code: e59021c8 e59d401c e590302c e3540000 (e5922048)
---[ end trace 39ae87c7923562d0 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Tested-By: Priit Laes <plaes@plaes.org>
Fixes: fbca164776 ("net: stmmac: Use the right logging function in stmmac_mdio_register")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:49:06 -07:00
David S. Miller
e188245d02 Merge branch 'net-sched-couple-of-chain-fixes'
Jiri Pirko says:

====================
net: sched: couple of chain fixes

Jiri Pirko (2):
  net: sched: fix use after free when tcf_chain_destroy is called
    multiple times
  net: sched: don't do tcf_chain_flush from tcf_chain_destroy
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:39:59 -07:00
Jiri Pirko
30d65e8f96 net: sched: don't do tcf_chain_flush from tcf_chain_destroy
tcf_chain_flush needs to be called with RTNL. However, on
free_tcf->
 tcf_action_goto_chain_fini->
  tcf_chain_put->
   tcf_chain_destroy->
    tcf_chain_flush
callpath, it is called without RTNL.
This issue was notified by following warning:

[  155.599052] WARNING: suspicious RCU usage
[  155.603165] 4.13.0-rc5jiri+ #54 Not tainted
[  155.607456] -----------------------------
[  155.611561] net/sched/cls_api.c:195 suspicious rcu_dereference_protected() usage!

Since on this callpath, the chain is guaranteed to be already empty
by check in tcf_chain_put, move the tcf_chain_flush call out and call it
only where it is needed - into tcf_block_put.

Fixes: db50514f9a ("net: sched: add termination action to allow goto chain")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:39:58 -07:00
Jiri Pirko
744a4cf63e net: sched: fix use after free when tcf_chain_destroy is called multiple times
The goto_chain termination action takes a reference of a chain. In that
case, there is an issue when block_put is called tcf_chain_destroy
directly. The follo-up call of tcf_chain_put by goto_chain action free
works with memory that is already freed. This was caught by kasan:

[  220.337908] BUG: KASAN: use-after-free in tcf_chain_put+0x1b/0x50
[  220.344103] Read of size 4 at addr ffff88036d1f2cec by task systemd-journal/261
[  220.353047] CPU: 0 PID: 261 Comm: systemd-journal Not tainted 4.13.0-rc5jiri+ #54
[  220.360661] Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox x86 mezzanine board, BIOS 4.6.5 08/02/2016
[  220.371784] Call Trace:
[  220.374290]  <IRQ>
[  220.376355]  dump_stack+0xd5/0x150
[  220.391485]  print_address_description+0x86/0x410
[  220.396308]  kasan_report+0x181/0x4c0
[  220.415211]  tcf_chain_put+0x1b/0x50
[  220.418949]  free_tcf+0x95/0xc0

So allow tcf_chain_destroy to be called multiple times, free only in
case the reference count drops to 0.

Fixes: 5bc1701881 ("net: sched: introduce multichain support for filters")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:39:58 -07:00
Eric Dumazet
fd6055a806 udp: on peeking bad csum, drop packets even if not at head
When peeking, if a bad csum is discovered, the skb is unlinked from
the queue with __sk_queue_drop_skb and the peek operation restarted.

__sk_queue_drop_skb only drops packets that match the queue head.

This fails if the skb was found after the head, using SO_PEEK_OFF
socket option. This causes an infinite loop.

We MUST drop this problematic skb, and we can simply check if skb was
already removed by another thread, by looking at skb->next :

This pointer is set to NULL by the  __skb_unlink() operation, that might
have happened only under the spinlock protection.

Many thanks to syzkaller team (and particularly Dmitry Vyukov who
provided us nice C reproducers exhibiting the lockup) and Willem de
Bruijn who provided first version for this patch and a test program.

Fixes: 627d2d6b55 ("udp: enable MSG_PEEK at non-zero offset")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:27:58 -07:00
Sabrina Dubroca
78362998f5 macsec: add genl family module alias
This helps tools such as wpa_supplicant can start even if the macsec
module isn't loaded yet.

Fixes: c09440f7dc ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:25:50 -07:00
David S. Miller
bfe9a6d76a Merge branch 'tipc-topology-server-fixes'
Parthasarathy Bhuvaragan says:

====================
tipc: topology server fixes

The following commits fixes two race conditions causing general
protection faults.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:25:02 -07:00
Ying Xue
fd849b7c41 tipc: fix a race condition of releasing subscriber object
No matter whether a request is inserted into workqueue as a work item
to cancel a subscription or to delete a subscription's subscriber
asynchronously, the work items may be executed in different workers.
As a result, it doesn't mean that one request which is raised prior to
another request is definitely handled before the latter. By contrast,
if the latter request is executed before the former request, below
error may happen:

[  656.183644] BUG: spinlock bad magic on CPU#0, kworker/u8:0/12117
[  656.184487] general protection fault: 0000 [#1] SMP
[  656.185160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: ip6_udp_tunnel]
[  656.187003] CPU: 0 PID: 12117 Comm: kworker/u8:0 Not tainted 4.11.0-rc7+ #6
[  656.187920] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  656.188690] Workqueue: tipc_rcv tipc_recv_work [tipc]
[  656.189371] task: ffff88003f5cec40 task.stack: ffffc90004448000
[  656.190157] RIP: 0010:spin_bug+0xdd/0xf0
[  656.190678] RSP: 0018:ffffc9000444bcb8 EFLAGS: 00010202
[  656.191375] RAX: 0000000000000034 RBX: ffff88003f8d1388 RCX: 0000000000000000
[  656.192321] RDX: ffff88003ba13708 RSI: ffff88003ba0cd08 RDI: ffff88003ba0cd08
[  656.193265] RBP: ffffc9000444bcd0 R08: 0000000000000030 R09: 000000006b6b6b6b
[  656.194208] R10: ffff8800bde3e000 R11: 00000000000001b4 R12: 6b6b6b6b6b6b6b6b
[  656.195157] R13: ffffffff81a3ca64 R14: ffff88003f8d1388 R15: ffff88003f8d13a0
[  656.196101] FS:  0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
[  656.197172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  656.197935] CR2: 00007f0b3d2e6000 CR3: 000000003ef9e000 CR4: 00000000000006f0
[  656.198873] Call Trace:
[  656.199210]  do_raw_spin_lock+0x66/0xa0
[  656.199735]  _raw_spin_lock_bh+0x19/0x20
[  656.200258]  tipc_subscrb_subscrp_delete+0x28/0xf0 [tipc]
[  656.200990]  tipc_subscrb_rcv_cb+0x45/0x260 [tipc]
[  656.201632]  tipc_receive_from_sock+0xaf/0x100 [tipc]
[  656.202299]  tipc_recv_work+0x2b/0x60 [tipc]
[  656.202872]  process_one_work+0x157/0x420
[  656.203404]  worker_thread+0x69/0x4c0
[  656.203898]  kthread+0x138/0x170
[  656.204328]  ? process_one_work+0x420/0x420
[  656.204889]  ? kthread_create_on_node+0x40/0x40
[  656.205527]  ret_from_fork+0x29/0x40
[  656.206012] Code: 48 8b 0c 25 00 c5 00 00 48 c7 c7 f0 24 a3 81 48 81 c1 f0 05 00 00 65 8b 15 61 ef f5 7e e8 9a 4c 09 00 4d 85 e4 44 8b 4b 08 74 92 <45> 8b 84 24 40 04 00 00 49 8d 8c 24 f0 05 00 00 eb 8d 90 0f 1f
[  656.208504] RIP: spin_bug+0xdd/0xf0 RSP: ffffc9000444bcb8
[  656.209798] ---[ end trace e2a800e6eb0770be ]---

In above scenario, the request of deleting subscriber was performed
earlier than the request of canceling a subscription although the
latter was issued before the former, which means tipc_subscrb_delete()
was called before tipc_subscrp_cancel(). As a result, when
tipc_subscrb_subscrp_delete() called by tipc_subscrp_cancel() was
executed to cancel a subscription, the subscription's subscriber
refcnt had been decreased to 1. After tipc_subscrp_delete() where
the subscriber was freed because its refcnt was decremented to zero,
but the subscriber's lock had to be released, as a consequence, panic
happened.

By contrast, if we increase subscriber's refcnt before
tipc_subscrb_subscrp_delete() is called in tipc_subscrp_cancel(),
the panic issue can be avoided.

Fixes: d094c4d5f5 ("tipc: add subscription refcount to avoid invalid delete")
Reported-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:25:02 -07:00
Parthasarathy Bhuvaragan
458be024ef tipc: remove subscription references only for pending timers
In commit, 139bb36f75 ("tipc: advance the time of deleting
subscription from subscriber->subscrp_list"), we delete the
subscription from the subscribers list and from nametable
unconditionally. This leads to the following bug if the timer
running tipc_subscrp_timeout() in another CPU accesses the
subscription list after the subscription delete request.

[39.570] general protection fault: 0000 [#1] SMP
::
[39.574] task: ffffffff81c10540 task.stack: ffffffff81c00000
[39.575] RIP: 0010:tipc_subscrp_timeout+0x32/0x80 [tipc]
[39.576] RSP: 0018:ffff88003ba03e90 EFLAGS: 00010282
[39.576] RAX: dead000000000200 RBX: ffff88003f0f3600 RCX: 0000000000000101
[39.577] RDX: dead000000000100 RSI: 0000000000000201 RDI: ffff88003f0d7948
[39.578] RBP: ffff88003ba03ea0 R08: 0000000000000001 R09: ffff88003ba03ef8
[39.579] R10: 000000000000014f R11: 0000000000000000 R12: ffff88003f0d7948
[39.580] R13: ffff88003f0f3618 R14: ffffffffa006c250 R15: ffff88003f0f3600
[39.581] FS:  0000000000000000(0000) GS:ffff88003ba00000(0000) knlGS:0000000000000000
[39.582] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39.583] CR2: 00007f831c6e0714 CR3: 000000003d3b0000 CR4: 00000000000006f0
[39.584] Call Trace:
[39.584]  <IRQ>
[39.585]  call_timer_fn+0x3d/0x180
[39.585]  ? tipc_subscrb_rcv_cb+0x260/0x260 [tipc]
[39.586]  run_timer_softirq+0x168/0x1f0
[39.586]  ? sched_clock_cpu+0x16/0xc0
[39.587]  __do_softirq+0x9b/0x2de
[39.587]  irq_exit+0x60/0x70
[39.588]  smp_apic_timer_interrupt+0x3d/0x50
[39.588]  apic_timer_interrupt+0x86/0x90
[39.589] RIP: 0010:default_idle+0x20/0xf0
[39.589] RSP: 0018:ffffffff81c03e58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[39.590] RAX: 0000000000000000 RBX: ffffffff81c10540 RCX: 0000000000000000
[39.591] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[39.592] RBP: ffffffff81c03e68 R08: 0000000000000000 R09: 0000000000000000
[39.593] R10: ffffc90001cbbe00 R11: 0000000000000000 R12: 0000000000000000
[39.594] R13: ffffffff81c10540 R14: 0000000000000000 R15: 0000000000000000
[39.595]  </IRQ>
::
[39.603] RIP: tipc_subscrp_timeout+0x32/0x80 [tipc] RSP: ffff88003ba03e90
[39.604] ---[ end trace 79ce94b7216cb459 ]---

Fixes: 139bb36f75 ("tipc: advance the time of deleting subscription from subscriber->subscrp_list")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:25:02 -07:00
Nogah Frankel
4eb6a3bdb4 mlxsw: spectrum_switchdev: Fix mrouter flag update
Update the value of the mrouter flag in struct mlxsw_sp_bridge_port when
it is being changed.

Fixes: c57529e1d5 ("mlxsw: spectrum: Replace vPorts with Port-VLAN")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:22:54 -07:00
Andrew Jeffery
5160a153a0 net: ftgmac100: Fix oops in probe on failure to find associated PHY
netif_napi_del() should be paired with netif_napi_add(), however no
such call takes place in ftgmac100_probe(). This triggers a NULL
pointer dereference if e.g. no PHY is found by the MDIO probe:

	 [ 2.770000] libphy: Fixed MDIO Bus: probed
	 [ 2.770000] ftgmac100 1e660000.ethernet: Generated random MAC address 66:58:c0:5a:50:b8
	 [ 2.790000] libphy: ftgmac100_mdio: probed
	 [ 2.790000] ftgmac100 1e660000.ethernet (unnamed net_device) (uninitialized): eth%d: no PHY found
	 [ 2.790000] ftgmac100 1e660000.ethernet: MII Probe failed!
	 [ 2.810000] Unable to handle kernel NULL pointer dereference at virtual address 00000004
	 [ 2.810000] pgd = 80004000
	 [ 2.810000] [00000004] *pgd=00000000
	 [ 2.810000] Internal error: Oops: 805 [#1] ARM
	 [ 2.810000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.10.17-1a4df30c39cf5ee0e3d2528c409787ccbb4a672a #1
	 [ 2.810000] Hardware name: ASpeed SoC
	 [ 2.810000] task: 9e421b60 task.stack: 9e4a0000
	 [ 2.810000] PC is at netif_napi_del+0x74/0xa4
	 [ 2.810000] LR is at ftgmac100_probe+0x290/0x674
	 [ 2.810000] pc : [<80331004>] lr : [<80292b30>] psr: 60000013
	 [ 2.810000] sp : 9e4a1d70 ip : 9e4a1d88 fp : 9e4a1d84
	 [ 2.810000] r10: 9e565000 r9 : ffffffed r8 : 00000007
	 [ 2.810000] r7 : 9e565480 r6 : 9ec072c0 r5 : 00000000 r4 : 9e5654d8
	 [ 2.810000] r3 : 9e565530 r2 : 00000000 r1 : 00000000 r0 : 9e5654d8
	 [ 2.810000] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
	 [ 2.810000] Control: 00c5387d Table: 80004008 DAC: 00000055
	 [ 2.810000] Process swapper (pid: 1, stack limit = 0x9e4a0188)
	 [ 2.810000] Stack: (0x9e4a1d70 to 0x9e4a2000)
	 [ 2.810000] 1d60: 9e565000 9e549e10 9e4a1dcc 9e4a1d88
	 [ 2.810000] 1d80: 80292b30 80330f9c ffffffff 9e4a1d98 80146058 9ec072c0 00009e10 00000000
	 [ 2.810000] 1da0: 9e549e18 9e549e10 ffffffed 805f81f4 fffffdfb 00000000 00000000 00000000
	 [ 2.810000] 1dc0: 9e4a1dec 9e4a1dd0 80243df8 802928ac 9e549e10 8062cbd8 8062cbe0 805f81f4
	 [ 2.810000] 1de0: 9e4a1e24 9e4a1df0 80242178 80243da4 803001d0 802ffa60 9e4a1e24 9e549e10
	 [ 2.810000] 1e00: 9e549e44 805f81f4 00000000 00000000 805b8840 8058a6b0 9e4a1e44 9e4a1e28
	 [ 2.810000] 1e20: 80242434 80241f04 00000000 805f81f4 80242344 00000000 9e4a1e6c 9e4a1e48
	 [ 2.810000] 1e40: 80240148 80242350 9e425bac 9e4fdc90 9e790e94 805f81f4 9e790e60 805f5640
	 [ 2.810000] 1e60: 9e4a1e7c 9e4a1e70 802425dc 802400d8 9e4a1ea4 9e4a1e80 80240ba8 802425c0
	 [ 2.810000] 1e80: 8050b6ac 9e4a1e90 805f81f4 ffffe000 805b8838 80616720 9e4a1ebc 9e4a1ea8
	 [ 2.810000] 1ea0: 80243068 80240a68 805ab24c ffffe000 9e4a1ecc 9e4a1ec0 80244a38 80242fec
	 [ 2.810000] 1ec0: 9e4a1edc 9e4a1ed0 805ab264 80244a04 9e4a1f4c 9e4a1ee0 8058ae70 805ab258
	 [ 2.810000] 1ee0: 80032c68 801e3fd8 8052f800 8041af2c 9e4a1f4c 9e4a1f00 80032f90 8058a6bc
	 [ 2.810000] 1f00: 9e4a1f2c 9e4a1f10 00000006 00000006 00000000 8052f220 805112f0 00000000
	 [ 2.810000] 1f20: 9e4a1f4c 00000006 80616720 805cf400 80616720 805b8838 80616720 00000057
	 [ 2.810000] 1f40: 9e4a1f94 9e4a1f50 8058b040 8058add0 00000006 00000006 00000000 8058a6b0
	 [ 2.810000] 1f60: 3940bf3d 00000007 f115c2e8 00000000 803fd158 00000000 00000000 00000000
	 [ 2.810000] 1f80: 00000000 00000000 9e4a1fac 9e4a1f98 803fd170 8058af38 00000000 803fd158
	 [ 2.810000] 1fa0: 00000000 9e4a1fb0 8000a5e8 803fd164 00000000 00000000 00000000 00000000
	 [ 2.810000] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
	 [ 2.810000] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 d11dcae8 af8ddec5
	 [ 2.810000] [<80331004>] (netif_napi_del) from [<80292b30>] (ftgmac100_probe+0x290/0x674)
	 [ 2.810000] [<80292b30>] (ftgmac100_probe) from [<80243df8>] (platform_drv_probe+0x60/0xc0)
	 [ 2.810000] [<80243df8>] (platform_drv_probe) from [<80242178>] (driver_probe_device+0x280/0x44c)
	 [ 2.810000] [<80242178>] (driver_probe_device) from [<80242434>] (__driver_attach+0xf0/0x104)
	 [ 2.810000] [<80242434>] (__driver_attach) from [<80240148>] (bus_for_each_dev+0x7c/0xb0)
	 [ 2.810000] [<80240148>] (bus_for_each_dev) from [<802425dc>] (driver_attach+0x28/0x30)
	 [ 2.810000] [<802425dc>] (driver_attach) from [<80240ba8>] (bus_add_driver+0x14c/0x268)
	 [ 2.810000] [<80240ba8>] (bus_add_driver) from [<80243068>] (driver_register+0x88/0x104)
	 [ 2.810000] [<80243068>] (driver_register) from [<80244a38>] (__platform_driver_register+0x40/0x54)
	 [ 2.810000] [<80244a38>] (__platform_driver_register) from [<805ab264>] (ftgmac100_driver_init+0x18/0x20)
	 [ 2.810000] [<805ab264>] (ftgmac100_driver_init) from [<8058ae70>] (do_one_initcall+0xac/0x168)
	 [ 2.810000] [<8058ae70>] (do_one_initcall) from [<8058b040>] (kernel_init_freeable+0x114/0x1cc)
	 [ 2.810000] [<8058b040>] (kernel_init_freeable) from [<803fd170>] (kernel_init+0x18/0x104)
	 [ 2.810000] [<803fd170>] (kernel_init) from [<8000a5e8>] (ret_from_fork+0x14/0x2c)
	 [ 2.810000] Code: e594205c e5941058 e2843058 e3a05000 (e5812004)
	 [ 3.210000] ---[ end trace f32811052fd3860c ]---

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 14:17:47 -07:00
Florian Fainelli
414e7d76af net/hsr: Check skb_put_padto() return value
skb_put_padto() will free the sk_buff passed as reference in case of
errors, but we still need to check its return value and decide what to
do.

Detected by CoverityScan, CID#1416688 ("CHECKED_RETURN")

Fixes: ee1c279772 ("net/hsr: Added support for HSR v1")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 13:40:23 -07:00
Wei Wang
c5cff8561d ipv6: add rcu grace period before freeing fib6_node
We currently keep rt->rt6i_node pointing to the fib6_node for the route.
And some functions make use of this pointer to dereference the fib6_node
from rt structure, e.g. rt6_check(). However, as there is neither
refcount nor rcu taken when dereferencing rt->rt6i_node, it could
potentially cause crashes as rt->rt6i_node could be set to NULL by other
CPUs when doing a route deletion.
This patch introduces an rcu grace period before freeing fib6_node and
makes sure the functions that dereference it takes rcu_read_lock().

Note: there is no "Fixes" tag because this bug was there in a very
early stage.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 11:03:19 -07:00
David S. Miller
0c8d2d95b8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2017-08-21

1) Fix memleaks when ESP takes an error path.

2) Fix null pointer dereference when creating a sub policy
   that matches the same outer flow as main policy does.
   From Koichiro Den.

3) Fix possible out-of-bound access in xfrm_migrate.
   This patch should go to the stable trees too.
   From Vladis Dronov.

4) ESP can return positive and negative error values,
   so treat both cases as an error.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 10:27:26 -07:00
Stefano Brivio
3de33e1ba0 ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt()
A packet length of exactly IPV6_MAXPLEN is allowed, we should
refuse parsing options only if the size is 64KiB or more.

While at it, remove one extra variable and one assignment which
were also introduced by the commit that introduced the size
check. Checking the sum 'offset + len' and only later adding
'len' to 'offset' doesn't provide any advantage over directly
summing to 'offset' and checking it.

Fixes: 6399f1fae4 ("ipv6: avoid overflow of offset in ip6_find_1stfragopt")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-22 10:23:26 -07:00
Linus Torvalds
6470812e22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Just a couple small fixes, two of which have to do with gcc-7:

   1) Don't clobber kernel fixed registers in __multi4 libgcc helper.

   2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
      from Thomas Petazzoni.

   3) Adjust pmd_t initializer on sparc32 to make gcc happy.

   4) If ATU isn't available, don't bark in the logs. From Tushar Dave"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
  sparc64: remove unnecessary log message
  sparc64: Don't clibber fixed registers in __multi4.
  mm: add pmd_t initializer __pmd() to work around a GCC bug.
2017-08-21 14:07:48 -07:00
Thomas Petazzoni
2dc77533f1 sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
When building the kernel for Sparc using gcc 7.x, the build fails
with:

arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    cmd |= PCI_COMMAND_IO;
        ^~

The simplified code looks like this:

unsigned int cmd;
[...]
pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
[...]
cmd |= PCI_COMMAND_IO;

I.e, the code assumes that pcic_read_config() will always initialize
cmd. But it's not the case. Looking at pcic_read_config(), if
bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
not be initialized.

As a simple fix, we initialize cmd to zero at the beginning of
pcibios_fixup_bus.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-21 13:57:22 -07:00
Linus Torvalds
05ab303b4f ARC fixes for 4.13-rc7
- PAE40 related updates
 
  - SLC errata for region ops
 
  - intc line masking by default
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZmw0DAAoJEGnX8d3iisJeoogP/R78jWBmVIRr7YBvOMEbNZAz
 r5dJHnd2jCUtqQ/rmCndwzOqLv2j77RRdH3nlwfM7YaS8LZmK0Dkz9zUGReCalQU
 z1sZfBSZuOfodWbDzfXdJVNEGGu8eSG7/s87E6K9UxOuaZJIPNMyU9qGxjDb3BTo
 dzgNmT0xgiqZYtv9Y3uciPgkddJLOxE+eMuEpxzbzejLerUUc/jRV8m5qAL8ja9w
 NanzLjo7Ec0FiczyYf1DtiONXBVl556IPQoFJtXIbsfZww8kJxFSZ+qemvmFXJOF
 cxSfZeRBOCWV9mRW36kGAeKVE9EWqFHFn/UiCfvhTCpoFXPX63Hz+nVRLEE1lhrQ
 ZaiQSuu0QgUkWP39ZpAPQjmdBlVxzv9Jsz5Dh72l3C00Hf9yw3jVCsKX/nZLpLM2
 pS8pFVnJqttOLX/6wU1JjIQDhPvzqn0V21SwCXBt2DwyXd1zuce82ioPY7K2Uefc
 4Unrso+YpIJNh8NlIe7Pvn8kEitNF7MViybofjhKPXlFXT4FqSJIV0Q/iq/L6lh8
 RAfJO3GCQQymkB03aVmfRWq+xgCS0v3K2vP50T3+XEyix98ZwH+D4ViaXs9egmLk
 EO323ebCKp8AJxICTme5qtmXs+k/CH+KeCzwSc90Mtf1SD3ohvyUeJvA5BGCCyP5
 NP5sxiH9cNKwrMIBdvuE
 =mOLm
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - PAE40 related updates

 - SLC errata for region ops

 - intc line masking by default

* tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: Mask individual IRQ lines during core INTC init
  ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
  ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
  ARC: dma: implement dma_unmap_page and sg variant
  ARCv2: SLC: Make sure busy bit is set properly for region ops
  ARC: [plat-sim] Include this platform unconditionally
  ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
  ARC: defconfig: Cleanup from old Kconfig options
2017-08-21 13:30:36 -07:00
Linus Torvalds
0b3baec853 RTC fixes for 4.13:
- ds1307: fix regmap configuration
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEXx9Viay1+e7J/aM4AyWl4gNJNJIFAlma57AACgkQAyWl4gNJ
 NJJDMw/9F3x6xub2F5nkW/uWgp4c6qL6cY0yC+fSHEhBrDl1T5gr+Cl+prQJotjU
 HNJAFRC78ERlr16o6ck5wL9Mo9rP3iOgpYuLCMFGEelSWF+DMv1fO1mrE2GWgcLZ
 phILkzzdkqqveq6f6wHuxAegltw8Wgh/DCF+tJCZI6UcEwsfWsH7yzadNSdYhiAO
 DFN7lry0mvt432ezz/ko/Evg7GOCT9SK9C1jZ6P2uB/gSoXwbhYXBWubdtKuYYVy
 5OCyyv9FWb3WX8FXFfb/oyAMI5xZ4cXqrJseoFsxZr5GamprK0hKQutaGtV9WCg8
 Tp8syz3MCH5w402VqiUARnW89o5F5bEUs+Ymh6x9+C6d2//3MQ4Bne1tEHSMhXI6
 w6R5tm/S+ybF7zKC/WjIxS+xGvqzizJoxUhUqDb7oQmz8aZQt6qhtNzwJO7zmDx6
 3c2nADmfjV28CAXYOWaIYWLp6CRwIOSGykijMvnhwvHFS+XdzXrq1uadaaMvfxDR
 iv5lRcb9nA66vsjYvU0/6cfn3U+el1cgn7g62gzj86GmElUvwW1XR2DIWs/RqDfZ
 e92B+ZwORE6u5YZtQAIvaNep2aZnwqYOQn7zwf5sNZlURDPwtHfDYpDP8xntZIFt
 twLH+DpYqoQvUtCP8u15xYsLIFiqpGHA12oBERhK5A+2nYYOJck=
 =CuI8
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC fix from Alexandre Belloni:
 "Fix regmap configuration for ds1307"

* tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: ds1307: fix regmap config
2017-08-21 13:27:51 -07:00
Linus Torvalds
e3181f2c0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix IGMP handling wrt VRF, from David Ahern.

 2) Fix timer access to freed object in dccp, from Eric Dumazet.

 3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are
    triggerable by userspace. Also from Eric Dumazet.

 4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian
    King.

 5) Correct datagram peek handling of empty SKBs, from Matthew Dawson.

 6) Fix use after free in TIPC, from Eric Dumazet.

 7) When replacing a route in ipv6 we need to reset the round robin
    pointer, from Wei Wang.

 8) Fix bug in pci_find_pcie_root_port() which was unearthed by the
    relaxed ordering changes, from Thierry Redding. I made sure to get
    an explicit ACK from Bjorn this time around :-)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  ipv6: repair fib6 tree in failure case
  net_sched: fix order of queue length updates in qdisc_replace()
  tools lib bpf: improve warning
  switchdev: documentation: minor typo fixes
  bpf, doc: also add s390x as arch to sysctl description
  net: sched: fix NULL pointer dereference when action calls some targets
  rxrpc: Fix oops when discarding a preallocated service call
  irda: do not leak initialized list.dev to userspace
  net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
  PCI: Allow PCI express root ports to find themselves
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
  net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
  ipv6: reset fn->rr_ptr when replacing route
  sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
  tipc: fix use-after-free
  tun: handle register_netdevice() failures properly
  datagram: When peeking datagrams with offset < 0 don't skip empty skbs
  bpf, doc: improve sysctl knob description
  netxen: fix incorrect loop counter decrement
  nfp: fix infinite loop on umapping cleanup
  ...
2017-08-21 13:16:27 -07:00
Oleg Nesterov
dd1c1f2f20 pids: make task_tgid_nr_ns() safe
This was reported many times, and this was even mentioned in commit
52ee2dfdd4 ("pids: refactor vnr/nr_ns helpers to make them safe") but
somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
not safe because task->group_leader points to nowhere after the exiting
task passes exit_notify(), rcu_read_lock() can not help.

We really need to change __unhash_process() to nullify group_leader,
parent, and real_parent, but this needs some cleanups.  Until then we
can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and
fix the problem.

Reported-by: Troy Kensinger <tkensinger@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-21 12:47:31 -07:00
Heiner Kallweit
03619844d8 rtc: ds1307: fix regmap config
Current max_register setting breaks reading nvram on certain chips and
also reading the standard registers on RX8130 where register map starts
at 0x10.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 11e5890b53 "rtc: ds1307: convert driver to regmap"
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-08-21 11:08:03 +02:00
Wei Wang
348a400272 ipv6: repair fib6 tree in failure case
In fib6_add(), it is possible that fib6_add_1() picks an intermediate
node and sets the node's fn->leaf to NULL in order to add this new
route. However, if fib6_add_rt2node() fails to add the new
route for some reason, fn->leaf will be left as NULL and could
potentially cause crash when fn->leaf is accessed in fib6_locate().
This patch makes sure fib6_repair_tree() is called to properly repair
fn->leaf in the above failure case.

Here is the syzkaller reported general protection fault in fib6_locate:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d7d64100 ti: ffff8801d01a0000 task.ti: ffff8801d01a0000
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] __ipv6_prefix_equal64_half include/net/ipv6.h:475 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] ipv6_prefix_equal include/net/ipv6.h:492 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate_1 net/ipv6/ip6_fib.c:1210 [inline]
RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate+0x281/0x3c0 net/ipv6/ip6_fib.c:1233
RSP: 0018:ffff8801d01a36a8  EFLAGS: 00010202
RAX: 0000000000000020 RBX: ffff8801bc790e00 RCX: ffffc90002983000
RDX: 0000000000001219 RSI: ffff8801d01a37a0 RDI: 0000000000000100
RBP: ffff8801d01a36f0 R08: 00000000000000ff R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
R13: dffffc0000000000 R14: ffff8801d01a37a0 R15: 0000000000000000
FS:  00007f6afd68c700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004c6340 CR3: 00000000ba41f000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d01a37a8 ffff8801d01a3780 ffffed003a0346f5 0000000c82a23ea0
 ffff8800b7bd7700 ffff8801d01a3780 ffff8800b6a1c940 ffffffff82a23ea0
 ffff8801d01a3920 ffff8801d01a3748 ffffffff82a223d6 ffff8801d7d64988
Call Trace:
 [<ffffffff82a223d6>] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
 [<ffffffff82a23f9d>] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
 [<ffffffff82621359>] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
 [<ffffffff8274c1d1>] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
 [<ffffffff82613ddf>] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
 [<ffffffff8274ad38>] netlink_unicast_kernel net/netlink/af_netlink.c:1206 [inline]
 [<ffffffff8274ad38>] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
 [<ffffffff8274b83e>] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
 [<ffffffff82564aff>] sock_sendmsg_nosec net/socket.c:609 [inline]
 [<ffffffff82564aff>] sock_sendmsg+0xcf/0x110 net/socket.c:619
 [<ffffffff82564d62>] sock_write_iter+0x222/0x3a0 net/socket.c:834
 [<ffffffff8178523d>] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
 [<ffffffff817853f4>] __vfs_write+0xe4/0x110 fs/read_write.c:491
 [<ffffffff81786c38>] vfs_write+0x178/0x4b0 fs/read_write.c:538
 [<ffffffff817892a9>] SYSC_write fs/read_write.c:585 [inline]
 [<ffffffff817892a9>] SyS_write+0xd9/0x1b0 fs/read_write.c:577
 [<ffffffff82c71e32>] entry_SYSCALL_64_fastpath+0x12/0x17

Note: there is no "Fixes" tag as this seems to be a bug introduced
very early.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:06:56 -07:00
Konstantin Khlebnikov
68a66d149a net_sched: fix order of queue length updates in qdisc_replace()
This important to call qdisc_tree_reduce_backlog() after changing queue
length. Parent qdisc should deactivate class in ->qlen_notify() called from
qdisc_tree_reduce_backlog() but this happens only if qdisc->q.qlen in zero.

Missed class deactivations leads to crashes/warnings at picking packets
from empty qdisc and corrupting state at reactivating this class in future.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 86a7996cc8 ("net_sched: introduce qdisc_replace() helper")
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 20:02:00 -07:00
Eric Leblond
49bf4b36fd tools lib bpf: improve warning
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:51 -07:00
Chris Packham
5a78449810 switchdev: documentation: minor typo fixes
Two typos in switchdev.txt

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:49:10 -07:00
Daniel Borkmann
d4dd2d75a2 bpf, doc: also add s390x as arch to sysctl description
Looks like this was accidentally missed, so still add s390x
as supported eBPF JIT arch to bpf_jit_enable.

Fixes: 014cd0a368 ("bpf: Update sysctl documentation to list all supported architectures")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-20 19:45:09 -07:00
Linus Torvalds
14ccee78fc Linux 4.13-rc6 2017-08-20 14:13:52 -07:00
Linus Torvalds
197e7e5213 Sanitize 'move_pages()' permission checks
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).

That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.

So change the access checks to the more common 'ptrace_may_access()'
model instead.

This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.

Famous last words.

Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-20 13:26:27 -07:00
Linus Torvalds
7f680d7ec3 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Another pile of small fixes and updates for x86:

   - Plug a hole in the SMAP implementation which misses to clear AC on
     NMI entry

   - Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
     parameter works correctly again

   - Use the proper accessor in the startup64 code for next_early_pgt to
     prevent accessing of invalid addresses and faulting in the early
     boot code.

   - Prevent CPU hotplug lock recursion in the MTRR code

   - Unbreak CPU0 hotplugging

   - Rename overly long CPUID bits which got introduced in this cycle

   - Two commits which mark data 'const' and restrict the scope of data
     and functions to file scope by making them 'static'"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Constify attribute_group structures
  x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
  x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
  x86: Fix norandmaps/ADDR_NO_RANDOMIZE
  x86/mtrr: Prevent CPU hotplug lock recursion
  x86: Mark various structures and functions as 'static'
  x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
  x86/smpboot: Unbreak CPU0 hotplug
  x86/asm/64: Clear AC on NMI entries
2017-08-20 09:36:52 -07:00
Linus Torvalds
2615a38f14 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A few small fixes for timer drivers:

   - Prevent infinite recursion in the arm architected timer driver with
     ftrace

   - Propagate error codes to the caller in case of failure in EM STI
     driver

   - Adjust a bogus loop iteration in the arm architected timer driver

   - Add a missing Kconfig dependency to the pistachio clocksource to
     prevent build failures

   - Correctly check for IS_ERR() instead of NULL in the shared timer-of
     code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
  clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
  clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
  clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
  clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
2017-08-20 09:34:24 -07:00
Linus Torvalds
e46db8d2ef Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Two fixes for the perf subsystem:

   - Fix an inconsistency of RDPMC mm struct tagging across exec() which
     causes RDPMC to fault.

   - Correct the timestamp mechanics across IOC_DISABLE/ENABLE which
     causes incorrect timestamps and total time calculations"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix time on IOC_ENABLE
  perf/x86: Fix RDPMC vs. mm_struct tracking
2017-08-20 09:20:57 -07:00
Linus Torvalds
9dae41a238 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A pile of smallish changes all over the place:

   - Add a missing ISB in the GIC V1 driver

   - Remove an ACPI version check in the GIC V3 ITS driver

   - Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid
     spurious wakeups

   - Remove the artifical limitation of ITS instances to the number of
     NUMA nodes which prevents utilizing the ITS hardware correctly

   - Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code

   - Honour the force affinity argument in the GIC-V3 driver which is
     required to make perf work correctly

   - Correctly report allocation failures in GIC-V2/V3 to avoid using
     half allocated and initialized interrupts.

   - Fixup checks against nr_cpu_ids in the generic IPI code"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/ipi: Fixup checks against nr_cpu_ids
  genirq: Restore trigger settings in irq_modify_status()
  MAINTAINERS: Remove Jason Cooper's irqchip git tree
  irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
  irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
  irqchip: brcmstb-l2: Define an irq_pm_shutdown function
  irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
  irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
  irqchip/gic-v3: Honor forced affinity setting
  irqchip/gic-v3: Report failures in gic_irq_domain_alloc
  irqchip/gic-v2: Report failures in gic_irq_domain_alloc
  irqchip/atmel-aic: Remove root argument from ->fixup() prototype
  irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
  irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
2017-08-20 09:07:56 -07:00
Linus Torvalds
e18a5ebc2d Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull watchdog fix from Thomas Gleixner:
 "A fix for the hardlockup watchdog to prevent false positives with
  extreme Turbo-Modes which make the perf/NMI watchdog fire faster than
  the hrtimer which is used to verify.

  Slightly larger than the minimal fix, which just would increase the
  hrtimer frequency, but comes with extra overhead of more watchdog
  timer interrupts and thread wakeups for all users.

  With this change we restrict the overhead to the extreme Turbo-Mode
  systems"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kernel/watchdog: Prevent false positives with turbo modes
2017-08-20 08:54:30 -07:00
Alexey Dobriyan
8fbbe2d7cc genirq/ipi: Fixup checks against nr_cpu_ids
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.

Fixes: 3b8e29a82d ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
2017-08-20 10:49:05 +02:00
Xin Long
4f8a881acc net: sched: fix NULL pointer dereference when action calls some targets
As we know in some target's checkentry it may dereference par.entryinfo
to check entry stuff inside. But when sched action calls xt_check_target,
par.entryinfo is set with NULL. It would cause kernel panic when calling
some targets.

It can be reproduce with:
  # tc qd add dev eth1 ingress handle ffff:
  # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
    -j ECN --ecn-tcp-remove

It could also crash kernel when using target CLUSTERIP or TPROXY.

By now there's no proper value for par.entryinfo in ipt_init_target,
but it can not be set with NULL. This patch is to void all these
panics by setting it with an ipt_entry obj with all members = 0.

Note that this issue has been there since the very beginning.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:25:49 -07:00
David Howells
9a19bad70c rxrpc: Fix oops when discarding a preallocated service call
rxrpc_service_prealloc_one() doesn't set the socket pointer on any new call
it preallocates, but does add it to the rxrpc net namespace call list.
This, however, causes rxrpc_put_call() to oops when the call is discarded
when the socket is closed.  rxrpc_put_call() needs the socket to be able to
reach the namespace so that it can use a lock held therein.

Fix this by setting a call's socket pointer immediately before discarding
it.

This can be triggered by unloading the kafs module, resulting in an oops
like the following:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: rxrpc_put_call+0x1e2/0x32d
PGD 0
P4D 0
Oops: 0000 [#1] SMP
Modules linked in: kafs(E-)
CPU: 3 PID: 3037 Comm: rmmod Tainted: G            E   4.12.0-fscache+ #213
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8803fc92e2c0 task.stack: ffff8803fef74000
RIP: 0010:rxrpc_put_call+0x1e2/0x32d
RSP: 0018:ffff8803fef77e08 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8803fab99ac0 RCX: 000000000000000f
RDX: ffffffff81c50a40 RSI: 000000000000000c RDI: ffff8803fc92ea88
RBP: ffff8803fef77e30 R08: ffff8803fc87b941 R09: ffffffff82946d20
R10: ffff8803fef77d10 R11: 00000000000076fc R12: 0000000000000005
R13: ffff8803fab99c20 R14: 0000000000000001 R15: ffffffff816c6aee
FS:  00007f915a059700(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 00000003fef39000 CR4: 00000000001406e0
Call Trace:
 rxrpc_discard_prealloc+0x325/0x341
 rxrpc_listen+0xf9/0x146
 kernel_listen+0xb/0xd
 afs_close_socket+0x3e/0x173 [kafs]
 afs_exit+0x1f/0x57 [kafs]
 SyS_delete_module+0x10f/0x19a
 do_syscall_64+0x8a/0x149
 entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:23:23 -07:00
Colin Ian King
b024d949a3 irda: do not leak initialized list.dev to userspace
list.dev has not been initialized and so the copy_to_user is copying
data from the stack back to user space which is a potential
information leak. Fix this ensuring all of list is initialized to
zero.

Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:21:51 -07:00