linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Florian Fainelli	e6afb1ad88	net: korina: Fix NAPI versus resources freeing Commit `beb0babfb7` ("korina: disable napi on close and restart") introduced calls to napi_disable() that were missing before, unfortunately this leaves a small window during which NAPI has a chance to run, yet we just freed resources since korina_free_ring() has been called: Fix this by disabling NAPI first then freeing resource, and make sure that we also cancel the restart task before doing the resource freeing. Fixes: `beb0babfb7` ("korina: disable napi on close and restart") Reported-by: Alexandros C. Couloumbis <alex@ozo.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-26 11:26:16 -05:00
Daniel Borkmann	628185cfdd	net, sched: fix soft lockup in tc_classify Shahar reported a soft lockup in tc_classify(), where we run into an endless loop when walking the classifier chain due to tp->next == tp which is a state we should never run into. The issue only seems to trigger under load in the tc control path. What happens is that in tc_ctl_tfilter(), thread A allocates a new tp, initializes it, sets tp_created to 1, and calls into tp->ops->change() with it. In that classifier callback we had to unlock/lock the rtnl mutex and returned with -EAGAIN. One reason why we need to drop there is, for example, that we need to request an action module to be loaded. This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning after we loaded and found the requested action, we need to redo the whole request so we don't race against others. While we had to unlock rtnl in that time, thread B's request was processed next on that CPU. Thread B added a new tp instance successfully to the classifier chain. When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN and destroying its tp instance which never got linked, we goto replay and redo A's request. This time when walking the classifier chain in tc_ctl_tfilter() for checking for existing tp instances we had a priority match and found the tp instance that was created and linked by thread B. Now calling again into tp->ops->change() with that tp was successful and returned without error. tp_created was never cleared in the second round, thus kernel thinks that we need to link it into the classifier chain (once again). tp and *back point to the same object due to the match we had earlier on. Thus for thread B's already public tp, we reset tp->next to tp itself and link it into the chain, which eventually causes the mentioned endless loop in tc_classify() once a packet hits the data path. Fix is to clear tp_created at the beginning of each request, also when we replay it. On the paths that can cause -EAGAIN we already destroy the original tp instance we had and on replay we really need to start from scratch. It seems that this issue was first introduced in commit `12186be7d2` ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup"). Fixes: `12186be7d2` ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup") Reported-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-26 11:24:10 -05:00
Tariq Toukan	eb9def61be	net/mlx4_en: Fix user prio field in XDP forward The user prio field is wrong (and overflows) in the XDP forward flow. This is a result of a bad value for num_tx_rings_p_up, which should account all XDP TX rings, as they operate for the same user prio. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 17:53:47 -05:00
Jon Paul Maloy	693c56491f	tipc: don't send FIN message from connectionless socket In commit `6f00089c73` ("tipc: remove SS_DISCONNECTING state") the check for socket type is in the wrong place, causing a closing socket to always send out a FIN message even when the socket was never connected. This is normally harmless, since the destination node for such messages most often is zero, and the message will be dropped, but it is still a wrong and confusing behavior. We fix this in this commit. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 17:53:47 -05:00
Mahesh Bandewar	e252536068	ipvlan: fix multicast processing In an IPvlan setup when master is set in loopback mode e.g. ethtool -K eth0 set loopback on where eth0 is master device for IPvlan setup. The failure is caused by the faulty logic that determines if the packet is from TX-path vs. RX-path by just looking at the mac- addresses on the packet while processing multicast packets. In the loopback-mode where this crash was happening, the packets that are sent out are reflected by the NIC and are processed on the RX path, but mac-address check tricks into thinking this packet is from TX path and falsely uses dev_forward_skb() to pass packets to the slave (virtual) devices. This patch records the path while queueing packets and eliminates logic of looking at mac-addresses for the same decision. ------------[ cut here ]------------ kernel BUG at include/linux/skbuff.h:1737! Call Trace: [<ffffffff921fbbc2>] dev_forward_skb+0x92/0xd0 [<ffffffffc031ac65>] ipvlan_process_multicast+0x395/0x4c0 [ipvlan] [<ffffffffc031a9a7>] ? ipvlan_process_multicast+0xd7/0x4c0 [ipvlan] [<ffffffff91cdfea7>] ? process_one_work+0x147/0x660 [<ffffffff91cdff09>] process_one_work+0x1a9/0x660 [<ffffffff91cdfea7>] ? process_one_work+0x147/0x660 [<ffffffff91ce086d>] worker_thread+0x11d/0x360 [<ffffffff91ce0750>] ? rescuer_thread+0x350/0x350 [<ffffffff91ce960b>] kthread+0xdb/0xe0 [<ffffffff91c05c70>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff91ce9530>] ? flush_kthread_worker+0xc0/0xc0 [<ffffffff92348b7a>] ret_from_fork+0x9a/0xd0 [<ffffffff91ce9530>] ? flush_kthread_worker+0xc0/0xc0 Fixes: `ba35f8588f` ("ipvlan: Defer multicast / broadcast processing to a work-queue") Signed-off-by: Mahesh Bandewar <maheshb@google.com> CC: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 17:53:47 -05:00
Eric Dumazet	b1227d019f	ipvlan: fix various issues in ipvlan_process_multicast() 1) netif_rx() / dev_forward_skb() should not be called from process context. 2) ipvlan_count_rx() should be called with preemption disabled. 3) We should check if ipvlan->dev is up before feeding packets to netif_rx() 4) We need to prevent device from disappearing if some packets are in the multicast backlog. 5) One kfree_skb() should be a consume_skb() eventually Fixes: `ba35f8588f` ("ipvlan: Defer multicast / broadcast processing to a work-queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 17:53:47 -05:00
Linus Torvalds	50b17cfb19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) We have to be careful to not try and place a checksum after the end of a rawv6 packet, fix from Dave Jones with help from Hannes Frederic Sowa. 2) Missing memory barriers in tcp_tasklet_func() lead to crashes, from Eric Dumazet. 3) Several bug fixes for the new XDP support in virtio_net, from Jason Wang. 4) Increase headroom in RX skbs in be2net driver to accomodate encapsulations such as geneve. From Kalesh A P. 5) Fix SKB frag unmapping on TX in mvpp2, from Thomas Petazzoni. 6) Pre-pulling UDP headers created a regression in RECVORIGDSTADDR socket option support, from Willem de Bruijn. 7) UID based routing added a potential OOPS in ip_do_redirect() when we see an SKB without a socket attached. We just need it for the network namespace which we can get from skb->dev instead. Fix from Lorenzo Colitti. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits) sctp: fix recovering from 0 win with small data chunks sctp: do not loose window information if in rwnd_over virtio-net: XDP support for small buffers virtio-net: remove big packet XDP codes virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support virtio-net: make rx buf size estimation works for XDP virtio-net: unbreak csumed packets for XDP_PASS virtio-net: correctly handle XDP_PASS for linearized packets virtio-net: fix page miscount during XDP linearizing virtio-net: correctly xmit linearized page on XDP_TX virtio-net: remove the warning before XDP linearizing mlxsw: spectrum_router: Correctly remove nexthop groups mlxsw: spectrum_router: Don't reflect dead neighs neigh: Send netevent after marking neigh as dead ipv6: handle -EFAULT from skb_copy_bits inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets net/sched: cls_flower: Mandate mask when matching on flags net/sched: act_tunnel_key: Fix setting UDP dst port in metadata under IPv6 stmmac: CSR clock configuration fix net: ipv4: Don't crash if passing a null sk to ip_do_redirect. ...	2016-12-23 11:23:25 -08:00
Marcelo Ricardo Leitner	1636098c46	sctp: fix recovering from 0 win with small data chunks Currently if SCTP closes the receive window with window pressure, mostly caused by excessive skb overhead on payload/overheads ratio, SCTP will close the window abruptly while saving the delta on rwnd_press. It will start recovering rwnd as the chunks are consumed by the application and the rwnd_press will be only recovered after rwnd reach the same value as of rwnd_press, mostly to prevent silly window syndrome. Thing is, this is very inefficient with small data chunks, as with those it will never reach back that value, and thus it will never recover from such pressure. This means that we will not issue window updates when recovering from 0 window and will rely on a sender retransmit to notice it. The fix here is to remove such threshold, as no value is good enough: it depends on the (avg) chunk sizes being used. Test with netperf -t SCTP_STREAM -- -m 1, and trigger 0 window by sending SIGSTOP to netserver, sleep 1.2, and SIGCONT. Rate limited to 845kbps, for visibility. Capture done at netserver side. Previously: 01.500751 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632372996] [a_rwnd 99153] [ 01.500752 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632372997] [SID: 0] [SS 01.517471 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 01.517483 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.517485 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS 01.517488 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.534168 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373096] [SID: 0] [SS 01.534180 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 01.534181 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373169] [SID: 0] [SS 01.534185 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap 02.525978 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 02.526021 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373009] [a_rwnd 0] [#gap (window update missed) 04.573807 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373010] [SID: 0] [SS 04.779370 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373082] [a_rwnd 859] [#g 04.789162 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373083] [SID: 0] [SS 04.789323 IP A.36925 > B.48277: sctp (1) [DATA] (B)(E) [TSN: 632373156] [SID: 0] [SS 04.789372 IP B.48277 > A.36925: sctp (1) [SACK] [cum ack 632373228] [a_rwnd 786] [#g After: 02.568957 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098728] [a_rwnd 99153] 02.568961 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098729] [SID: 0] [S 02.585631 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 02.585666 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 02.585671 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S 02.585683 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 02.602330 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098828] [SID: 0] [S 02.602359 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 02.602363 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098901] [SID: 0] [S 02.602372 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 03.600788 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 03.600830 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 0] [#ga 03.619455 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 13508] 03.619479 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 27017] 03.619497 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 40526] 03.619516 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 54035] 03.619533 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 67544] 03.619552 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 81053] 03.619570 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098741] [a_rwnd 94562] (following data transmission triggered by window updates above) 03.633504 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098742] [SID: 0] [S 03.836445 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098814] [a_rwnd 100000] 03.843125 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098815] [SID: 0] [S 03.843285 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098888] [SID: 0] [S 03.843345 IP B.50536 > A.55173: sctp (1) [SACK] [cum ack 2490098960] [a_rwnd 99894] 03.856546 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490098961] [SID: 0] [S 03.866450 IP A.55173 > B.50536: sctp (1) [DATA] (B)(E) [TSN: 2490099011] [SID: 0] [S Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 14:01:35 -05:00
Marcelo Ricardo Leitner	58b94d88de	sctp: do not loose window information if in rwnd_over It's possible that we receive a packet that is larger than current window. If it's the first packet in this way, it will cause it to increase rwnd_over. Then, if we receive another data chunk (specially as SCTP allows you to have one data chunk in flight even during 0 window), rwnd_over will be overwritten instead of added to. In the long run, this could cause the window to grow bigger than its initial size, as rwnd_over would be charged only for the last received data chunk while the code will try open the window for all packets that were received and had its value in rwnd_over overwritten. This, then, can lead to the worsening of payload/buffer ratio and cause rwnd_press to kick in more often. The fix is to sum it too, same as is done for rwnd_press, so that if we receive 3 chunks after closing the window, we still have to release that same amount before re-opening it. Log snippet from sctp_test exhibiting the issue: [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) [ 146.209232] sctp: sctp_assoc_rwnd_decrease: association:ffff88013928e000 has asoc->rwnd:0, asoc->rwnd_over:1! [ 146.209232] sctp: sctp_assoc_rwnd_decrease: asoc:ffff88013928e000 rwnd decreased by 1 to (0, 1, 114221) Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 14:01:35 -05:00
Linus Torvalds	a307d0a007	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull final vfs updates from Al Viro: "Assorted cleanups and fixes all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sg_write()/bsg_write() is not fit to be called under KERNEL_DS ufs: fix function declaration for ufs_truncate_blocks fs: exec: apply CLOEXEC before changing dumpable task flags seq_file: reset iterator to first record for zero offset vfs: fix isize/pos/len checks for reflink & dedupe [iov_iter] fix iterate_all_kinds() on empty iterators move aio compat to fs/aio.c reorganize do_make_slave() clone_private_mount() doesn't need to touch namespace_sem remove a bogus claim about namespace_sem being held by callers of mnt_alloc_id()	2016-12-23 10:52:43 -08:00
David S. Miller	e57cbe48a6	Merge branch 'virtio-net-xdp-fixes' Jason Wang says: ==================== several fixups for virtio-net XDP Merry Xmas and a Happy New year to all: This series tries to fixes several issues for virtio-net XDP which could be categorized into several parts: - fix several issues during XDP linearizing - allow csumed packet to work for XDP_PASS - make EWMA rxbuf size estimation works for XDP - forbid XDP when GUEST_UFO is support - remove big packet XDP support - add XDP support or small buffer Please see individual patches for details. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:56 -05:00
Jason Wang	bb91accf27	virtio-net: XDP support for small buffers Commit `f600b69050` ("virtio_net: Add XDP support") leaves the case of small receive buffer untouched. This will confuse the user who want to set XDP but use small buffers. Other than forbid XDP in small buffer mode, let's make it work. XDP then can only work at skb->data since virtio-net create skbs during refill, this is sub optimal which could be optimized in the future. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:56 -05:00
Jason Wang	c47a43d300	virtio-net: remove big packet XDP codes Now we in fact don't allow XDP for big packets, remove its codes. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	92502fe86c	virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support When VIRTIO_NET_F_GUEST_UFO is negotiated, host could still send UFO packet that exceeds a single page which could not be handled correctly by XDP. So this patch forbids setting XDP when GUEST_UFO is supported. While at it, forbid XDP for ECN (which comes only from GRO) too to prevent user from misconfiguration. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	5c33474d41	virtio-net: make rx buf size estimation works for XDP We don't update ewma rx buf size in the case of XDP. This will lead underestimation of rx buf size which causes host to produce more than one buffers. This will greatly increase the possibility of XDP page linearization. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	b00f70b0da	virtio-net: unbreak csumed packets for XDP_PASS We drop csumed packet when do XDP for packets. This breaks XDP_PASS when GUEST_CSUM is supported. Fix this by allowing csum flag to be set. With this patch, simple TCP works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	1830f8935f	virtio-net: correctly handle XDP_PASS for linearized packets When XDP_PASS were determined for linearized packets, we try to get new buffers in the virtqueue and build skbs from them. This is wrong, we should create skbs based on existed buffers instead. Fixing them by creating skb based on xdp_page. With this patch "ping 192.168.100.4 -s 3900 -M do" works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	56a86f84b8	virtio-net: fix page miscount during XDP linearizing We don't put page during linearizing, the would cause leaking when xmit through XDP_TX or the packet exceeds PAGE_SIZE. Fix them by put page accordingly. Also decrease the number of buffers during linearizing to make sure caller can free buffers correctly when packet exceeds PAGE_SIZE. With this patch, we won't get OOM after linearize huge number of packets. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	275be061b3	virtio-net: correctly xmit linearized page on XDP_TX After we linearize page, we should xmit this page instead of the page of first buffer which may lead unexpected result. With this patch, we can see correct packet during XDP_TX. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	73b62bd085	virtio-net: remove the warning before XDP linearizing Since we use EWMA to estimate the size of rx buffer. When rx buffer size is underestimated, it's usual to have a packet with more than one buffers. Consider this is not a bug, remove the warning and correct the comment before XDP linearizing. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:53 -05:00
Linus Torvalds	fc26901b12	befs fixes for 4.10-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYW7vZAAoJEGu/nxmHO1GNeGUIAJil3Q4ZaeOaaj5uNs4h64kc 0BAfGSwzGNgreX5PWm+jQVeh6xbAqXnYtsWIDSibpxnXOhAZcXHbpzKLTwlMl4rh qpXAAWhHcBsOKiNcg++RRmouubYtpgMoOKCgo/DzGp51mSV7/8K2mugzDRohPUsR jUDqUa9qvt65uqI5xCuK1n3aLtCQ9m3RUzDfQbH4fK/yBXpNIE83xegU1SBJKZHj uGPJpjHhc1vaba6Y8vDDBHuJR9IJxfeSnoJE0xMmGlIub40exw7P4Dek1Tc/3G+R qiqT9aGAbegkFDerps5sqOLbU4Lm4Js8Ov78l3IN1FSVdYWsptzRibjIbUidPdc= =zypk -----END PGP SIGNATURE----- Merge tag 'befs-v4.10-rc1' of git://github.com/luisbg/linux-befs Pull befs updates from Luis de Bethencourt: "A series of small fixes and adding NFS export support" * tag 'befs-v4.10-rc1' of git://github.com/luisbg/linux-befs: befs: add NFS export support befs: remove trailing whitespaces befs: remove signatures from comments befs: fix style issues in header files befs: fix style issues in linuxvfs.c befs: fix typos in linuxvfs.c befs: fix style issues in io.c befs: fix style issues in inode.c befs: fix style issues in debug.c	2016-12-23 10:46:15 -08:00
Linus Torvalds	01302aac12	drm initial fixes for 4.10 (intel, amd) -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYXIQlAAoJEAx081l5xIa+WnEP/RDeTucoQrcKj66EU7Pq5kO6 blF/mcn7RfcHnxDIWPQCASk+pyyeTpCq063H/SMvcqfsXA8M8/89INZK/zYlYCqX 5dMjCuHoQ465V4LClY0VhWpeM2EPgvVWWFHvCE0tUQQKLvJRQjZHQl0zjqDzrfAR 6lVNFE7nIbHPy6dlHw7Xe9kQmsW1GpQhsr4o31MpZIQdOwBHu2k4CzioSOwPLaDg u/RejB3dMdxfn7RKqyupSF8Z8VQ1yQE+LcTTlsc4CbfQf1YX8pijhFQShI6OBsz0 3pIS6Veucd9QH2HQEA3eFptHocJ7XK6w8aIQ93hgJ2WR9JWIXrVwLrpNQss+HcOA 9o5DriPwyYW1B5gZ86N652EWm+d5qIIJEAd4pcfAiQZDBzoVdMyUXZYcAV1yfsHI eWiJZgnxytlem7tFmFACmmQTCMOZZU1BlESVf5cSX1JQAVo7TCEhkIi6xbXDY8E9 qU3NLETXZvaMIXJnDtI27wbXvzcK35cbskIwVPgd6ax9rV8IP33WmXXbs5CRtVwC EWAyu/n2s3LOUPwZ9Sack3Whtl2c55yywUe4SARROuVhGtpzz7xpCbasg0ecBTwl jEw1HhoonQAf/y5YxDVLX1SAMbiO7pSw9cmCKN6zCC2Iius5b4uBTpzbrQbUwUnc Vk2/cX2oZBMXSk4j54sZ =Taep -----END PGP SIGNATURE----- Merge tag 'drm-fixes-for-4.10-rc1' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Some fixes came in while I was out, mostly intel and amdgpu ones, with one ast fix" Daniel Vetter says: "This should also shut up the WARN_ON(!intel_dp->lane_count) noise" * tag 'drm-fixes-for-4.10-rc1' of git://people.freedesktop.org/~airlied/linux: (35 commits) drm/amdgpu: update tile table for oland/hainan drm/amdgpu: update tile table for verde drm/amdgpu: update rev id for verde drm/amdgpu: update golden setting for verde drm/amdgpu: update rev id for oland drm/amdgpu: update golden setting for oland drm/amdgpu: update rev id for hainan drm/amdgpu: update golden setting for hainan drm/amdgpu: update rev id for pitcairn drm/amdgpu: update golden setting for pitcairn drm/amdgpu: update golden setting/tiling table of tahiti drm/i915: skip the first 4k of stolen memory on everything >= gen8 drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping drm/i915: Fix use after free in logical_render_ring_init drm/i915: disable PSR by default on HSW/BDW drm/i915: Fix setting of boost freq tunable drm/i915: tune down the fast link training vs boot fail drm/i915: Reorder phys backing storage release drm/i915/gen9: Fix PCODE polling during SAGV disabling drm/i915/gen9: Fix PCODE polling during CDCLK change notification ...	2016-12-23 10:41:53 -08:00
Linus Torvalds	296915912d	First round of -rc fixes for 4.10 kernel - Series of qedr fixes - Series of rxe fixes - One isolated i40iw fix - One isolated cma fix - One isolated cxgb4 fix -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYXAGvAAoJELgmozMOVy/dDukQAMMNarWp0U8KfNYRU5tyCBwd aIQC1gFT6GUCFys40Z6L84m1D3NpGR+vzVv3grVBeuge73b79zAOHXvVDwJCA+Jl QQLG3vZ13C3158sLDiK8zL+4Ob5OfOQ5nQ2spvDfJWpye9SD+pWFcrpqvK02ANRN kFHILk1gROBTNi46yBR5hjWOkw7Bua6XLsPxh6xoaDZ43NL0r0xgm43FTnj/19x3 0zpZYYKP+3C6U7678rqaog9zfXHvadghW5/WBJ/VgfKqEmH89ESx4J2MvbB8DxFD 1tWAOpr5TNY5jnh8mtUsceDjCzQivc/RWqAu05BspEwcavjSLFyRYr1epR0/4oAd PqLSmfORmhpJ8+5Kmn+chtXo3TT4SYGHIzSUbgbEV/ClwX/7UW+w8mfQZ3buUBq/ cQp/oRnJcsrQIEDFO3AH7P+6Sxy6t3zbSl5oKBUOI1u4RFmC7YBPqo9fQu2Z2mGk 3+AWQaPr7qgEcFzXBgLzvd4LhTYKsvmiNwrcXi9KjjwQjNEVg15qqF2YtmxEUgi9 kh3IOcGan3iSblhV/WLrxcOjlPQrPpBOVnTPhUskFtlsrD+032OxeOBpVoU3nCUt MjTYWoNTYdw4wHz0w373o0uR4+4nl4a5OmO4Fh6Drmg5hm4Bl9BWy0Kziu93Z1Ay Z2utZVWLWhBzn8yJujUz =NW9g -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs	2016-12-23 10:38:48 -08:00
Linus Torvalds	f290cbacb6	SCSI for-linus on 20161222 This is mostly stuff which missed the initial pull. There's a new driver: qedi, some ufs, ibmvscsis and ncr5380 updates plus some assorted driver fixes and also a fix for the bug where if a device goes into a blocked state between configuration and sysfs device add (which can be a long time under async probing) it would become permanently blocked. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJYXDugAAoJEAVr7HOZEZN4MO4P/2FM+U9uxjSMnHlnnEf/zVNE pqYS8vxdnoPwbuzd++uFTneLa8xMGXPmX4X0t6tZQsi9w+ZDAUYPWEjH2TPCa112 V6y9bDKTnkXyLUo78dIRIJXb0lVSi2bCYi0R9x6HA2wyU4/NRhYtCSbYTONEzHkk Y3aADV6kLQf8NNsqExFmtY1XNCc+xjSistcJwbmIkFAB8qdVFo2xo0qa6Sb1SEsX WzER/6GdZOVblEy6llmlfSG3h8XPfbxVvqdHvTLlkCUjQsOucjzhxZZ1TGcqWhWB XQGuXlfOTJN1O6ZEAFfZdiz7g/1NGjwczUP4x99S2NiPFD76C34ncRBFiYODvdYs izsctB4l4cZmBNdYQSYGYxY5emJrmv1JCvtaqLxl2sQ+XIxGRv/t3yAWisdQi3My UlGVZw8lyxPNY/+E7vuC9QvFlDW2FBTop8VPH/wvZFB3cuDssHr6Uogx9oxuX/BY LfYu7GsGCcMigyot6MKiUMlT5Od4cHDlTaQE5kfgCSCP0s8Ssi7CQv2/CnpmWln+ HU/zDC+CRzZjkcCNy7nsgUhCEWKMWcPPTQ/HMpXDMRUIMfweHDUJ/cM5tRyqb0kl n8TROG9Ptma2A2EcDYtkAqX+XqzE0RUTwDwcQhSQR/XJZEHmsVlXGyCcY3sx3UmZ 5q9bgMoRLWzsiTuZSqBR =Zofc -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull late SCSI updates from James Bottomley: "This is mostly stuff which missed the initial pull. There's a new driver: qedi, and some ufs, ibmvscsis and ncr5380 updates plus some assorted driver fixes and also a fix for the bug where if a device goes into a blocked state between configuration and sysfs device add (which can be a long time under async probing) it would become permanently blocked" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (30 commits) scsi: avoid a permanent stop of the scsi device's request queue scsi: mpt3sas: Recognize and act on iopriority info scsi: qla2xxx: Fix Target mode handling with Multiqueue changes. scsi: qla2xxx: Add Block Multi Queue functionality. scsi: qla2xxx: Add multiple queue pair functionality. scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls. scsi: qla2xxx: Only allow operational MBX to proceed during RESET. scsi: hpsa: remove memory allocate failure message scsi: Update 3ware driver email addresses scsi: zfcp: fix rport unblock race with LUN recovery scsi: zfcp: do not trace pure benign residual HBA responses at default level scsi: zfcp: fix use-after-"free" in FC ingress path after TMF scsi: libcxgbi: return error if interface is not up scsi: cxgb4i: libcxgbi: add missing module_put() scsi: cxgb4i: libcxgbi: cxgb4: add T6 iSCSI completion feature scsi: cxgb4i: libcxgbi: add active open cmd for T6 adapters scsi: cxgb4i: use cxgb4_tp_smt_idx() to get smt_idx scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework. scsi: aacraid: remove wildcard for series 9 controllers scsi: ibmvscsi: add write memory barrier to CRQ processing ...	2016-12-23 10:36:19 -08:00
Linus Torvalds	42e0372c0e	2nd round of ARC udpates for 4.10rc1 - Fix for aliasing VIPT dcache in old ARC700 cores - micro-optimization in ARC700 ProtV handler - Enable SG_CHAIN [Vladimir] - ARC HS38 core intc default to prio 1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYXHQ1AAoJEGnX8d3iisJeGpMP/2IIx1JxYklDRluSHkA3ekr3 OJBQrMBJbvRW2s4kD9brHgzZwMAJTjO4b7DGOLcd7w12fCRc0fuzRysyuW5x5Sky ovubNwneGBb2T4SDjI6VKibWBfOZARiyE5ROfHF4vmgAAALFUtGlztm4ZjyPy6Mw jN/rAQLyvNSv29gYmmOiq/lH68qJSmVzR6otBVpil4veleXIv5f093ujao2Egjb2 DPnzUBjTJ/QvmtWhinxQqdepq4G0Oeo//+lbDeWYFHIdYdwT0o0v6XzeTjr7whKI KKPPDRMMSzsNAuqGU1ufD+5E/oVruUNcnlepfbvfTgN+g3kXB0e6S+nFlkAKHcCG 3VcifR+2M73ZIS0Jhdx5Gb9uSvXA0sYg/GEhiV0PdCi86ntcp+SlpVs7POdyKmb8 zDvIZyFnLofWaOc8Oiugtb0kOfENFzrc8xXVRsLeECngAXQq8Eaz23yVqRLQaCwY uIFl7k0IVpIYLZrDOKOsN+eJrE2JW3eftZFfwktOwdA1JL6AqEvlXDWsjHTmIpcB pGJOYvIGFdlT74HSrKEnDKMlcP3dIqN888CBeNrck+wrqlGpvZWD4kytCiHpR95c xiXDQNZassmT9ispCRA/dxpPjUksxfhZgPiin93GiUSXk55WX5EomNbAiJz/oCqT pAv9Y10C3o+hifgV9r9F =1qMs -----END PGP SIGNATURE----- Merge tag 'arc-4.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull more ARC updates from Vineet Gupta: - Fix for aliasing VIPT dcache in old ARC700 cores - micro-optimization in ARC700 ProtV handler - Enable SG_CHAIN [Vladimir] - ARC HS38 core intc default to prio 1 * tag 'arc-4.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: mm: arc700: Don't assume 2 colours for aliasing VIPT dcache ARC: mm: No need to save cache version in @cpuinfo ARC: enable SG chaining ARCv2: intc: default all interrupts to priority 1 ARCv2: entry: document intr disable in hard isr ARC: ARCompact entry: elide re-reading ECR in ProtV handler	2016-12-23 10:22:47 -08:00
David S. Miller	d3a51d6cbc	Merge branch 'mlxsw-router-fixes' Jiri Pirko says: ==================== mlxsw: Router fixes Ido says: First two patches ensure we remove from the device's table neighbours that are considered to be dead by the neighbour core. The last patch removes nexthop groups from the device when they are no longer valid. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:31:20 -05:00
Ido Schimmel	58312125da	mlxsw: spectrum_router: Correctly remove nexthop groups At the end of the nexthop initialization process we determine whether the nexthop should be offloaded or not based on the NUD state of the neighbour representing it. After all the nexthops were initialized we refresh the nexthop group and potentially offload it to the device, in case some of the nexthops were resolved. Make the destruction of a nexthop group symmetric with its creation by marking all nexthops as invalid and then refresh the nexthop group to make sure it was removed from the device's tables. Fixes: `b2157149b0` ("mlxsw: spectrum_router: Add the nexthop neigh activity update") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:31:19 -05:00
Ido Schimmel	93a87e5e79	mlxsw: spectrum_router: Don't reflect dead neighs When a neighbour is considered to be dead, we should remove it from the device's table regardless of its NUD state. Without this patch, after setting a port to be administratively down we get the following errors when we periodically try to update the kernel about neighbours activity: [ 461.947268] mlxsw_spectrum 0000:03:00.0 sw1p3: Failed to find matching neighbour for IP=192.168.100.2 Fixes: `a6bf9e933d` ("mlxsw: spectrum_router: Offload neighbours based on NUD state change") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:31:19 -05:00
Ido Schimmel	53f800e3ba	neigh: Send netevent after marking neigh as dead neigh_cleanup_and_release() is always called after marking a neighbour as dead, but it only notifies user space and not in-kernel listeners of the netevent notification chain. This can cause multiple problems. In my specific use case, it causes the listener (a switch driver capable of L3 offloads) to believe a neighbour entry is still valid, and is thus erroneously kept in the device's table. Fix that by sending a netevent after marking the neighbour as dead. Fixes: `a6bf9e933d` ("mlxsw: spectrum_router: Offload neighbours based on NUD state change") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:31:18 -05:00
Dave Jones	a98f917589	ipv6: handle -EFAULT from skb_copy_bits By setting certain socket options on ipv6 raw sockets, we can confuse the length calculation in rawv6_push_pending_frames triggering a BUG_ON. RIP: 0010:[<ffffffff817c6390>] [<ffffffff817c6390>] rawv6_sendmsg+0xc30/0xc40 RSP: 0018:ffff881f6c4a7c18 EFLAGS: 00010282 RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002 RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00 RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009 R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030 R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80 Call Trace: [<ffffffff8118ba23>] ? unmap_page_range+0x693/0x830 [<ffffffff81772697>] inet_sendmsg+0x67/0xa0 [<ffffffff816d93f8>] sock_sendmsg+0x38/0x50 [<ffffffff816d982f>] SYSC_sendto+0xef/0x170 [<ffffffff816da27e>] SyS_sendto+0xe/0x10 [<ffffffff81002910>] do_syscall_64+0x50/0xa0 [<ffffffff817f7cbc>] entry_SYSCALL64_slow_path+0x25/0x25 Handle by jumping to the failure path if skb_copy_bits gets an EFAULT. Reproducer: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/types.h> #include <sys/socket.h> #include <netinet/in.h> #define LEN 504 int main(int argc, char* argv[]) { int fd; int zero = 0; char buf[LEN]; memset(buf, 0, LEN); fd = socket(AF_INET6, SOCK_RAW, 7); setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4); setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN); sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110); } Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:20:39 -05:00
Willem de Bruijn	39b2dd765e	inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within the packet. For sockets that have transport headers pulled, transport offset can be negative. Use signed comparison to avoid overflow. Fixes: `e6afc8ace6` ("udp: remove headers from UDP packets before queueing") Reported-by: Nisar Jagabar <njagabar@cloudmark.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 12:19:18 -05:00
David S. Miller	9aa340a5c3	Merge branch 'cls_flower-act_tunnel_key-fixes' Or Gerlitz says: ==================== net/sched fixes for cls_flower and act_tunnel_key This small series contain a fix to the matching flags support in flower and to the tunnel key action MD prep for IPv6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 11:59:57 -05:00
Or Gerlitz	d9724772e6	net/sched: cls_flower: Mandate mask when matching on flags When matching on flags, we should require the user to provide the mask and avoid using an all-ones mask. Not doing so causes matching on flags provided w.o mask to hit on the value being unset for all flags, which may not what the user wanted to happen. Fixes: `faa3ffce78` ('net/sched: cls_flower: Add support for matching on flags') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Paul Blakey <paulb@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 11:59:56 -05:00
Or Gerlitz	dc594ecd41	net/sched: act_tunnel_key: Fix setting UDP dst port in metadata under IPv6 The UDP dst port was provided to the helper function which sets the IPv6 IP tunnel meta-data under a wrong param order, fix that. Fixes: `75bfbca01e` ('net/sched: act_tunnel_key: Add UDP dst port option') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 11:59:56 -05:00
jpinto	567be78659	stmmac: CSR clock configuration fix When testing stmmac with my QoS reference design I checked a problem in the CSR clock configuration that was impossibilitating the phy discovery, since every read operation returned 0x0000ffff. This patch fixes the issue. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 11:46:37 -05:00
Al Viro	faf0dcebd7	Merge branch 'work.namespace' into for-linus	2016-12-22 23:04:31 -05:00
Al Viro	128394eff3	sg_write()/bsg_write() is not fit to be called under KERNEL_DS Both damn things interpret userland pointers embedded into the payload; worse, they are actually traversing those. Leaving aside the bad API design, this is very much _not_ safe to call with KERNEL_DS. Bail out early if that happens. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:42 -05:00
Jeff Layton	f698cccbc8	ufs: fix function declaration for ufs_truncate_blocks sparse says: fs/ufs/inode.c:1195:6: warning: symbol 'ufs_truncate_blocks' was not declared. Should it be static? Note that the forward declaration in the file is already marked static. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:41 -05:00
Aleksa Sarai	613cc2b6f2	fs: exec: apply CLOEXEC before changing dumpable task flags If you have a process that has set itself to be non-dumpable, and it then undergoes exec(2), any CLOEXEC file descriptors it has open are "exposed" during a race window between the dumpable flags of the process being reset for exec(2) and CLOEXEC being applied to the file descriptors. This can be exploited by a process by attempting to access /proc/<pid>/fd/... during this window, without requiring CAP_SYS_PTRACE. The race in question is after set_dumpable has been (for get_link, though the trace is basically the same for readlink): [vfs] -> proc_pid_link_inode_operations.get_link -> proc_pid_get_link -> proc_fd_access_allowed -> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS); Which will return 0, during the race window and CLOEXEC file descriptors will still be open during this window because do_close_on_exec has not been called yet. As a result, the ordering of these calls should be reversed to avoid this race window. This is of particular concern to container runtimes, where joining a PID namespace with file descriptors referring to the host filesystem can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect against access of CLOEXEC file descriptors -- file descriptors which may reference filesystem objects the container shouldn't have access to). Cc: dev@opencontainers.org Cc: <stable@vger.kernel.org> # v3.2+ Reported-by: Michael Crosby <crosbymichael@gmail.com> Signed-off-by: Aleksa Sarai <asarai@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:41 -05:00
Tomasz Majchrzak	e522751d60	seq_file: reset iterator to first record for zero offset If kernfs file is empty on a first read, successive read operations using the same file descriptor will return no data, even when data is available. Default kernfs 'seq_next' implementation advances iterator position even when next object is not there. Kernfs 'seq_start' for following requests will not return iterator as position is already on the second object. This defect doesn't allow to monitor badblocks sysfs files from MD raid. They are initially empty but if data appears at some stage, userspace is not able to read it. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:06 -05:00
Darrick J. Wong	22725ce4e4	vfs: fix isize/pos/len checks for reflink & dedupe Strengthen the checking of pos/len vs. i_size, clarify the return values for the clone prep function, and remove pointless code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:00:23 -05:00
Al Viro	33844e6651	[iov_iter] fix iterate_all_kinds() on empty iterators Problem similar to ones dealt with in "fold checks into iterate_and_advance()" and followups, except that in this case we really want to do nothing when asked for zero-length operation - unlike zero-length iterate_and_advance(), zero-length iterate_all_kinds() has no side effects, and callers are simpler that way. That got exposed when copy_from_iter_full() had been used by tipc, which builds an msghdr with zero payload and (now) feeds it to a primitive based on iterate_all_kinds() instead of iterate_and_advance(). Reported-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:00:22 -05:00
Al Viro	c00d2c7e89	move aio compat to fs/aio.c ... and fix the minor buglet in compat io_submit() - native one kills ioctx as cleanup when put_user() fails. Get rid of bogus compat_... in !CONFIG_AIO case, while we are at it - they should simply fail with ENOSYS, same as for native counterparts. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 22:58:37 -05:00
James Bottomley	3eff4c7828	Merge branch 'misc' into for-linus	2016-12-22 12:32:33 -08:00
Dave Airlie	4a401ceeef	Merge tag 'drm-intel-next-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes First set of i915 fixes for code in next. * tag 'drm-intel-next-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-intel: drm/i915: skip the first 4k of stolen memory on everything >= gen8 drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping drm/i915: Fix use after free in logical_render_ring_init drm/i915: disable PSR by default on HSW/BDW drm/i915: Fix setting of boost freq tunable drm/i915: tune down the fast link training vs boot fail drm/i915: Reorder phys backing storage release drm/i915/gen9: Fix PCODE polling during SAGV disabling drm/i915/gen9: Fix PCODE polling during CDCLK change notification drm/i915/dsi: Fix chv_exec_gpio disabling the GPIOs it is setting drm/i915/dsi: Fix swapping of MIPI_SEQ_DEASSERT_RESET / MIPI_SEQ_ASSERT_RESET drm/i915/dsi: Do not clear DPOUNIT_CLOCK_GATE_DISABLE from vlv_init_display_clock_gating drm/i915: drop the struct_mutex when wedged or trying to reset	2016-12-23 05:28:02 +10:00
Dave Airlie	d043835d08	Merge tag 'drm-misc-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes Here's the one lonely bugfix I talked about on irc. * tag 'drm-misc-fixes-2016-12-22' of git://anongit.freedesktop.org/git/drm-misc: drivers/gpu/drm/ast: Fix infinite loop if read fails	2016-12-23 05:26:55 +10:00
Dave Airlie	6df383cf90	Merge branch 'drm-next-4.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes - fix display regression on DCE6/8 - Powergating fixes for GFX8 - amdgpu SI fixes (golden settings, proper rev id setup, etc.) * 'drm-next-4.10' of git://people.freedesktop.org/~agd5f/linux: (21 commits) drm/amdgpu: update tile table for oland/hainan drm/amdgpu: update tile table for verde drm/amdgpu: update rev id for verde drm/amdgpu: update golden setting for verde drm/amdgpu: update rev id for oland drm/amdgpu: update golden setting for oland drm/amdgpu: update rev id for hainan drm/amdgpu: update golden setting for hainan drm/amdgpu: update rev id for pitcairn drm/amdgpu: update golden setting for pitcairn drm/amdgpu: update golden setting/tiling table of tahiti drm/amdgpu: fix cursor setting of dce6/dce8 drm/amdgpu: refine set clock gating for tonga/polaris drm/amdgpu: initialize cg flags for tonga/polaris10/polaris11. drm/amdgpu: add new gfx cg flags. drm/amdgpu: fix pg can't be disabled by PG mask. drm/amdgpu: always initialize gfx pg for gfx_v8.0. drm/amdgpu: enable AMD_PG_SUPPORT_CP in Carrizo/Stoney. drm/amdgpu: fix init save/restore list in gfx_v8.0 drm/amdgpu: fix enable_cp_power_gating in gfx_v8.0. ...	2016-12-23 05:25:12 +10:00
Linus Torvalds	50f6584e4c	Update Jacek Anaszewski's email address -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYXAeZAAoJEL1qUBy3i3wmDfAP/iMBlZAAdY0U3WmXpXkhQDB0 qbYXvFqy6ccZN/o5oabsHFqW/2DZqpeUnwQACCljLgbcUUuzcuNuYIES2bayp3Bi 7f4ZpE+sibJaSnhH3zikqV9OZzIArkO7HyIg8z+lSqM+Sjvbe6PLFr8oXisU47bQ +RamJRVEdzII6l8+Y2r6CAHZohAH1Fl+MSPChCuIk8LPGFZz43Oejr+odSApQgQj k+QZfLKHou98RWB++oc+t0wtfw+edpXBdW02aKV7fN/BvRffjnYR+om5rzrN5uYE Ufv6NP/WCgQN+WLulcTr7kzEk+6kqt/nVrTFcTHFVbz1ef8GYjf1bPpmIABVi23P EDHQvDJEFipYb+pr7+lu6g+xZhggM/oxS129lEuXj4mdJp8i8vtvTr05JlFH1Gij sV0NOkvkiotR77aFqbGeHU5mh2mi861QoIm2zBg9fjenRLk6el0P4m0bmOIRLY3A nzLzJiFYYE+AB54F+roKJEh+nzpbartz5cZAKtXUcpECZPxFJdFGZevX+SYrerIc /j/LmtMCO/4vEvAt4dsMxZdzpZGgFNhJ2QKPl4C/NS8uiXObHaW5/8Fp2/3cgWuI c+2Si4+AKxdf/5lTtOqb9+3l0CexKMPFYt+19sdWap+dbSNTyi54yVei7qCybKx1 aYayPhaqG74Ab/YjncWB =Fx3S -----END PGP SIGNATURE----- Merge tag 'leds_for_4.10_email_update' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED maintainer email update from Jacek Anaszewski: "Update Jacek Anaszewski's email address" * tag 'leds_for_4.10_email_update' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: MAINTAINERS: Update Jacek Anaszewski's email address	2016-12-22 10:31:30 -08:00
Linus Torvalds	af22941ae1	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "Just a set of small fixes that have either been queued up after the original pull for this merge window, or just missed the original pull request. - a few bcache fixes/changes from Eric and Kent - add WRITE_SAME to the command filter whitelist frm Mauricio - kill an unused struct member from Ritesh - partition IO alignment fix from Stefan - nvme sysfs printf fix from Stephen" * 'for-linus' of git://git.kernel.dk/linux-block: block: check partition alignment nvme : Use correct scnprintf in cmb show block: allow WRITE_SAME commands with the SG_IO ioctl block: Remove unused member (busy) from struct blk_queue_tag bcache: partition support: add 16 minors per bcacheN device bcache: Make gc wakeup sane, remove set_task_state()	2016-12-22 10:23:39 -08:00
Linus Torvalds	9be962d525	More ACPI updates for v4.10-rc1 - Move some Linux-specific functionality to upstream ACPICA and update the in-kernel users of it accordingly (Lv Zheng). - Drop a useless warning (triggered by the lack of an optional object) from the ACPI namespace scanning code (Zhang Rui). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYW90kAAoJEILEb/54YlRxS78P/RZwSOwkgdowIMGOlO7Hz5Xz 6Psjfc3QdGhVGOau4F3Irg7HIDsPaHE5+xijwujWFK0wh89tknrChrFDrOueND6K rZ4w72lYROQhtQl4bu9yXZ2TswP3I5aYERDPyqKfDAi+SaQBpZUtmWdoh0I/JN2M svyuzxoRlNglgp2GWPnnnKHFe6plEeZgKjgoGjPAufDHakqYvP0cQwY3i4+PpuYh prXb4Xr4fFvAG2MhM9ciT02ewrQJcYUgVj5bnuSBmMu0y0zUBJ1kK7abiQlf+lTy /BCqQbllhHrrXQD0zkqi2oBFqWL4Bnj9r+5zj03KmBsfoz3u+zXr5CApDei8Yc06 gGIZXX9aCqBirahkpsMFYPEQVPoPFCek0slXSyF5xKjCWYyocwgUtHnB87uIiSFO jCeSsWwT5IO9HnbsTf6x4xDWBdY6LOJgJyDirIGcNSUX+q4tfgn/LFrgN9Ck4M2g xkZn8e8H7u09ACX9l6dHZ1PCHlg7bWKeH6gcqdo5R4NOVeZxt9YDIz13ERsG3D17 JZvdrAbBdC1fXfHnOLBj/BRy+TFvieWOC+kHBTAnPQlnkr7NjXXv/vgWU8flVL/z kxlR2U2lD2XarG/b8OHvYxO2k/TDLRenN0IqqFmYbmyhH0dHYKGGBLY4TetAXKNL N4EouS+xUBulCP2XOI8N =55TG -----END PGP SIGNATURE----- Merge tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "Here are new versions of two ACPICA changes that were deferred previously due to a problem they had introduced, two cleanups on top of them and the removal of a useless warning message from the ACPI core. Specifics: - Move some Linux-specific functionality to upstream ACPICA and update the in-kernel users of it accordingly (Lv Zheng) - Drop a useless warning (triggered by the lack of an optional object) from the ACPI namespace scanning code (Zhang Rui)" * tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory() ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users ACPICA: Tables: Allow FADT to be customized with virtual address ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel ACPI: do not warn if _BQC does not exist	2016-12-22 10:19:32 -08:00

1 2 3 4 5 ...

647699 Commits