linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-19 11:56:12 +07:00

Author	SHA1	Message	Date
David Ahern	85bd05deb3	ipv6: Pass fib6_result to ip6_rt_cache_alloc Change ip6_rt_cache_alloc to take a fib6_result over a fib6_info. Since ip6_rt_cache_alloc is only the caller, update the rt6_is_gw_or_nonexthop helper to take fib6_result. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 23:10:46 -07:00
David Ahern	7e4b512875	ipv6: Pass fib6_result to rt6_find_cached_rt Simplify rt6_find_cached_rt for the fast path cases and pass fib6_result to rt6_find_cached_rt. Rename the local return variable to ret to maintain consisting with fib6_result name. Update the comment in rt6_find_cached_rt to reference the new names in a fib6_info vs the old name when fib entries were an rt6_info. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 23:08:51 -07:00
David Ahern	b1d4099150	ipv6: Rename fib6_multipath_select and pass fib6_result Add 'struct fib6_result' to hold the fib entry and fib6_nh from a fib lookup as separate entries, similar to what IPv4 now has with fib_result. Rename fib6_multipath_select to fib6_select_path, pass fib6_result to it, and set f6i and nh in the result once a path selection is done. Call fib6_select_path unconditionally for path selection which means moving the sibling and oif check to fib6_select_path. To handle the two different call paths (2 only call multipath_select if flowi6_oif == 0 and the other always calls it), add a new have_oif_match that controls the sibling walk if relevant. Update callers of fib6_multipath_select accordingly and have them use the fib6_info and fib6_nh from the result. This is needed for multipath nexthop objects where a single f6i can point to multiple fib6_nh (similar to IPv4). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 23:08:51 -07:00
Yonghong Song	ba02de1aa0	selftests/bpf: fix a compilation error I hit the following compilation error with gcc 4.8.5. prog_tests/flow_dissector.c: In function ‘test_flow_dissector’: prog_tests/flow_dissector.c:155:2: error: ‘for’ loop initial declarations are only allowed in C99 mode for (int i = 0; i < ARRAY_SIZE(tests); i++) { ^ prog_tests/flow_dissector.c:155:2: note: use option -std=c99 or -std=gnu99 to compile your code Let us fix the issue by avoiding this particular c99 feature. Fixes: `a5cb33464e` ("selftests/bpf: make flow dissector tests more extensible") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 22:29:51 -07:00
Alexei Starovoitov	193d0002ef	Merge branch 'bulk-cpumap-redirect' Jesper Dangaard Brouer says: ==================== This patchset utilize a number of different kernel bulk APIs for optimizing the performance for the XDP cpumap redirect feature. Benchmark details are available here: https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap03-optimizations.org Performance measurements can be considered micro benchmarks, as they measure dropping packets at different stages in the network stack. Summary based on above: Baseline benchmarks - baseline-redirect: UdpNoPorts: 3,180,074 - baseline-redirect: iptables-raw drop: 6,193,534 Patch1: bpf: cpumap use ptr_ring_consume_batched - redirect: UdpNoPorts: 3,327,729 - redirect: iptables-raw drop: 6,321,540 Patch2: net: core: introduce build_skb_around - redirect: UdpNoPorts: 3,221,303 - redirect: iptables-raw drop: 6,320,066 Patch3: bpf: cpumap do bulk allocation of SKBs - redirect: UdpNoPorts: 3,290,563 - redirect: iptables-raw drop: 6,650,112 Patch4: bpf: cpumap memory prefetchw optimizations for struct page - redirect: UdpNoPorts: 3,520,250 - redirect: iptables-raw drop: 7,649,604 In this V2 submission I have chosen drop the SKB-list patch using netif_receive_skb_list() as it was not showing a performance improvement for these micro benchmarks. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 19:09:26 -07:00
Jesper Dangaard Brouer	86d231459d	bpf: cpumap memory prefetchw optimizations for struct page A lot of the performance gain comes from this patch. While analysing performance overhead it was found that the largest CPU stalls were caused when touching the struct page area. It is first read with a READ_ONCE from build_skb_around via page_is_pfmemalloc(), and when freed written by page_frag_free() call. Measurements show that the prefetchw (W) variant operation is needed to achieve the performance gain. We believe this optimization it two fold, first the W-variant saves one step in the cache-coherency protocol, and second it helps us to avoid the non-temporal prefetch HW optimizations and bring this into all cache-levels. It might be worth investigating if prefetch into L2 will have the same benefit. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 19:09:25 -07:00
Jesper Dangaard Brouer	8f0504a97e	bpf: cpumap do bulk allocation of SKBs As cpumap now batch consume xdp_frame's from the ptr_ring, it knows how many SKBs it need to allocate. Thus, lets bulk allocate these SKBs via kmem_cache_alloc_bulk() API, and use the previously introduced function build_skb_around(). Notice that the flag __GFP_ZERO asks the slab/slub allocator to clear the memory for us. This does clear a larger area than needed, but my micro benchmarks on Intel CPUs show that this is slightly faster due to being a cacheline aligned area is cleared for the SKBs. (For SLUB allocator, there is a future optimization potential, because SKBs will with high probability originate from same page. If we can find/identify continuous memory areas then the Intel CPU memset rep stos will have a real performance gain.) Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 19:09:25 -07:00
Jesper Dangaard Brouer	ba0509b688	net: core: introduce build_skb_around The function build_skb() also have the responsibility to allocate and clear the SKB structure. Introduce a new function build_skb_around(), that moves the responsibility of allocation and clearing to the caller. This allows caller to use kmem_cache (slab/slub) bulk allocation API. Next patch use this function combined with kmem_cache_alloc_bulk. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 19:09:24 -07:00
Jesper Dangaard Brouer	77361825bb	bpf: cpumap use ptr_ring_consume_batched Move ptr_ring dequeue outside loop, that allocate SKBs and calls network stack, as these operations that can take some time. The ptr_ring is a communication channel between CPUs, where we want to reduce/limit any cacheline bouncing. Do a concentrated bulk dequeue via ptr_ring_consume_batched, to shorten the period and times the remote cacheline in ptr_ring is read Batch size 8 is both to (1) limit BH-disable period, and (2) consume one cacheline on 64-bit archs. After reducing the BH-disable section further then we can consider changing this, while still thinking about L1 cacheline size being active. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-17 19:09:24 -07:00
David S. Miller	6b0a7f84ea	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflict resolution of af_smc.c from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 11:26:25 -07:00
David S. Miller	cea0aa9cbd	Merge branch 's390-next' Julian Wiedmann says: ==================== s390/qeth: updates 2019-04-17 please apply some additional qeth patches to net-next. This patchset converts the driver to use the kernel's multiqueue model. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	54a50941b7	s390/qeth: stop/wake TX queues based on their fill level Current xmit code only stops the txq after attempting to fill an IO buffer that hasn't been TX-completed yet. In many-connection scenarios, this can result in frequent rejected TX attempts, requeuing of skbs with NETDEV_TX_BUSY and extra overhead. Now that we have a proper 1-to-1 relation between stack-side txqs and our HW Queues, overhaul the stop/wake logic so that the xmit code stops the txq as needed. Given that we might map multiple skbs into a single buffer, it's crucial to ensure that the queue always provides an _entirely_ empty IO buffer. Otherwise large skbs (eg TSO) might not fit into the last available buffer. So whenever qeth_do_send_packet() first utilizes an _empty_ buffer, it updates & checks the used_buffers count. This now ensures that an skb passed to qeth_xmit() can always be mapped into an IO buffer, so remove all of the -EBUSY roll-back handling in the TX path. We preserve the minimal safety-checks ("Is this IO buffer really available?"), just in case some nasty future bug ever attempts to corrupt an in-use buffer. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	e6c15b5f34	s390/qeth: simplify QoS code qeth_get_priority_queue() is no longer used for IQD devices, remove the special-casing of their mcast queue. This effectively reverts commit `70deb01662` ("qeth: omit outbound queue 3 for unicast packets in Priority Queuing on HiperSockets"). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	73dc2daf11	s390/qeth: add TX multiqueue support for OSA devices This adds trivial support for multiple TX queues on OSA-style devices (both real HW and z/VM NICs). For now we expose the driver's existing QoS mechanism via .ndo_select_queue, and adjust the number of available TX queues when qeth_update_from_chp_desc() detects that the HW configuration has changed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	3a18d75400	s390/qeth: add TX multiqueue support for IQD devices qeth has been supporting multiple HW Output Queues for a long time. But rather than exposing those queues to the stack, it uses its own queue selection logic in .ndo_start_xmit... with all the drawbacks that entails. Start off by switching IQD devices over to a proper mqs net_device, and converting all the netdev_queue management code. One oddity with IQD devices is the requirement to place all mcast traffic on the _highest_ established HW queue. Doing so via .ndo_select_queue seems straight-forward - but that won't work if only some of the HW queues are active (ie. when dev->real_num_tx_queues < dev->num_tx_queues), since netdev_cap_txqueue() will not allow us to put skbs on the higher queues. To make this work, we 1. let .ndo_select_queue() map all mcast traffic to netdev_queue 0, and 2. later re-map the netdev_queue and HW queue indices in .ndo_start_xmit and the TX completion handler. With this patch we default to a fixed set of 1 ucast and 1 mcast queue. Support for dynamic reconfiguration is added at a later time. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	333ef9d1d5	s390/qeth: don't keep statistics for tx timeout struct netdev_queue contains a counter for tx timeouts, which gets updated by dev_watchdog(). So let's not attempt to maintain our own statistics, in particular not by overloading the skb-error counter. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	fdd1a5303e	s390/qeth: don't bother updating the last-tx time As the documentation for netif_trans_update() says, netdev_start_xmit() already updates the last-tx time after every good xmit. So don't duplicate that effort. One odd case is that qeth_flush_buffers() also gets called from our TX completion handler, to flush out any partially filled buffer when we switch the queue to non-packing mode. But as the TX completion handler will _always_ wake the txq, we don't have to worry about the TX watchdog there. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	a4cdc9baee	s390/qeth: handle error from qeth_update_from_chp_desc() Subsequent code relies on the values that qeth_update_from_chp_desc() reads from the CHP descriptor. Rather than dealing with weird errors later on, just handle it properly here. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
Julian Wiedmann	41c47da3b6	s390/qeth: clarify naming for some QDIO helpers The naming of several QDIO helpers doesn't match their actual functionality, or the structures they operate on. Clean this up. s/qeth_alloc_qdio_buffers/qeth_alloc_qdio_queues s/qeth_free_qdio_buffers/qeth_free_qdio_queues s/qeth_alloc_qdio_out_buf/qeth_alloc_output_queue s/qeth_clear_outq_buffers/qeth_drain_output_queue s/qeth_clear_qdio_buffers/qeth_drain_output_queues Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:33:59 -07:00
David S. Miller	3a6f7892ac	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2019-04-16 This series contains updates to i40e driver only. Adam fixes i40e so that queues can be restored to its original value if configuring queue channels fails. Bumped the maximum API version supported and added the API version to error messages to clarify supported firmware API versions. Fixed the problem with the driver being able to add only 7 multicast MAC address filters instead of 16. Aleksandr adds support for Dynamic Device Personalization (DDP) which allows loading profiles that change the way internal parser interprets processed frames. Nick fixes an issue where if we modify the VLAN stripping options when a port VLAN is configured, it will break traffic for the VSI, so prevent changes from being made. Jake fixes an issue where a device reset can mess up the clock time because we reset the clock time based on the kernel time every reset. This causes us to potentially completely reset the PTP time, and can cause unexpected behavior in programs like ptp4l. Piotr fixes an LED blink issue with the 'ethtool -p' command, so that identification blinking will work on all hardware. Chinh fixed the error returned to correctly reflect the current state when LLDP or DCBx is not in an operational state. Grzegorz cleans up a misleading error message when untrusted VF tries to exceed addresses beyond the NIC limit. Carolyn fixes the error return code to correctly reflect the error case. v2: updated the URL provided in the DDP patch (#2) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:31:21 -07:00
Linus Torvalds	fe5cdef29e	Fixes for some bugs cause by recent changes. One crash if you feed bad data to the module parameters, one BUG that sometimes occurs when a user closes the connection, and one bug that cause the driver to not work if the configuration information only comes in from SMBIOS. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAly3SdwACgkQYfOMkJGb /4HbXQ/8C/STdGNlR5NYSj11H58XNw3WskUb9xgRjnkee1vOfjeVHuQa5DndxUNj 02ngdqwG+cg6W7CIujXiAm41Ty6YUjRttkUhlO+ByK6AR9failaf2emhLuoNDhht 8UD2mhR6SK1ZUIFZjU0vWy9YH/IngbBm4BbNQn8nvUIL8xuSL/Bhx/5X0X7UNppr 6ehBnGNlVCdeSGjor364Tero1ENYsvwJIwtI5lMWSi5Ig28BfuNtusWod7vTQU7y /yoUFUaTo8fwDczc7l5PpAgwxSqGF26bwuT7VYMEvudeNjq+FKieMNDCJy03gDmY F6OUcUcINTL+UW6cS6rdVO53lkD4qjupMM9MNSgED7XUI126OD5Igo3b+Du8fntT t9RszxXgn8VDpwBZCEiOpTg4+WpElWaWZFHxhEwhxUcrfBVQo37doqOi8JTozU9E OVOJn2M87S9AeHltNrGiMp2or8uma0X3c01loypgLCct7F5aO/YzjtzZyuHQ3InS Y2RwmNhoKs8z6sx4y7raS7BDUjsxh4N074PhFYgQ+9sWGWKJuHcp7OTlRlBXthf4 zMbyvKj9U7nSfhz43eXkP6EqC6reLXfoHAZXRxiUsjQO6ROzFsak/y+cw1PKSkca NRA9U9pmdik67bF7Y0HGG3vo1om1lFNbeSVCt9CbG9A6AQwePtY= =jOqw -----END PGP SIGNATURE----- Merge tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi Pull IPMI fixes from Corey Minyard: "Fixes for some bugs cause by recent changes. One crash if you feed bad data to the module parameters, one BUG that sometimes occurs when a user closes the connection, and one bug that cause the driver to not work if the configuration information only comes in from SMBIOS" * tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi: ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash ipmi: Fix failure on SMBIOS specified devices	2019-04-17 10:25:25 -07:00
David S. Miller	e77b8ba640	Merge branch 'stmmac-Enable-Flow-Control' Jose Abreu says: ==================== net: stmmac: Enable Flow Control I don't know of any specific reason why Flow Control is off by default but do let me know if there is any. Tested in B2B between XGMAC2 and GMAC5. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:14:28 -07:00
Jose Abreu	e998933906	net: stmmac: Set Flow Control to automatic mode in the driver By default Flow Control feature is not being enabled in stmmac. This is a useful feature that can prevent loss of packets and now that XGMAC already supports it (along with GMAC and QoS) it makes sense to activate it. Switch the module parameter to FLOW_AUTO so that Flow Control is activated. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:14:28 -07:00
Jose Abreu	ff82cfc783	net: stmmac: dwxgmac: Finish the Flow Control implementation Finish the implementation of Flow Control feature. In order for it to work correctly we need to set EHFC bit and the correct threshold values for activating and deactivating it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-17 10:14:28 -07:00
Linus Torvalds	2a3a028fc6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Handle init flow failures properly in iwlwifi driver, from Shahar S Matityahu. 2) mac80211 TXQs need to be unscheduled on powersave start, from Felix Fietkau. 3) SKB memory accounting fix in A-MDSU aggregation, from Felix Fietkau. 4) Increase RCU lock hold time in mlx5 FPGA code, from Saeed Mahameed. 5) Avoid checksum complete with XDP in mlx5, also from Saeed. 6) Fix netdev feature clobbering in ibmvnic driver, from Thomas Falcon. 7) Partial sent TLS record leak fix from Jakub Kicinski. 8) Reject zero size iova range in vhost, from Jason Wang. 9) Allow pending work to complete before clcsock release from Karsten Graul. 10) Fix XDP handling max MTU in thunderx, from Matteo Croce. 11) A lot of protocols look at the sa_family field of a sockaddr before validating it's length is large enough, from Tetsuo Handa. 12) Don't write to free'd pointer in qede ptp error path, from Colin Ian King. 13) Have to recompile IP options in ipv4_link_failure because it can be invoked from ARP, from Stephen Suryaputra. 14) Doorbell handling fixes in qed from Denis Bolotin. 15) Revert net-sysfs kobject register leak fix, it causes new problems. From Wang Hai. 16) Spectre v1 fix in ATM code, from Gustavo A. R. Silva. 17) Fix put of BROPT_VLAN_STATS_PER_PORT in bridging code, from Nikolay Aleksandrov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (111 commits) socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW tcp: tcp_grow_window() needs to respect tcp_space() ocelot: Clean up stats update deferred work ocelot: Don't sleep in atomic context (irqs_disabled()) net: bridge: fix netlink export of vlan_stats_per_port option qed: fix spelling mistake "faspath" -> "fastpath" tipc: set sysctl_tipc_rmem and named_timeout right range tipc: fix link established but not in session net: Fix missing meta data in skb with vlan packet net: atm: Fix potential Spectre v1 vulnerabilities net/core: work around section mismatch warning for ptp_classifier net: bridge: fix per-port af_packet sockets bnx2x: fix spelling mistake "dicline" -> "decline" route: Avoid crash from dereferencing NULL rt->from MAINTAINERS: normalize Woojung Huh's email address bonding: fix event handling for stacked bonds Revert "net-sysfs: Fix memory leak in netdev_register_kobject" rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check qed: Fix the DORQ's attentions handling qed: Fix missing DORQ attentions ...	2019-04-17 09:57:45 -07:00
Corey Minyard	3b9a907223	ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier free_user() could be called in atomic context. This patch pushed the free operation off into a workqueue. Example: BUG: sleeping function called from invalid context at kernel/workqueue.c:2856 in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27 CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1 Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015 Call Trace: dump_stack+0x5c/0x7b ___might_sleep+0xec/0x110 __flush_work+0x48/0x1f0 ? try_to_del_timer_sync+0x4d/0x80 _cleanup_srcu_struct+0x104/0x140 free_user+0x18/0x30 [ipmi_msghandler] ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler] deliver_response+0xbd/0xd0 [ipmi_msghandler] deliver_local_response+0xe/0x30 [ipmi_msghandler] handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler] ? dequeue_entity+0xa0/0x960 handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler] tasklet_action_common.isra.22+0x103/0x120 __do_softirq+0xf8/0x2d7 run_ksoftirqd+0x26/0x50 smpboot_thread_fn+0x11d/0x1e0 kthread+0x103/0x140 ? sort_range+0x20/0x20 ? kthread_destroy_worker+0x40/0x40 ret_from_fork+0x1f/0x40 Fixes: `77f8269606` ("ipmi: fix use-after-free of user->release_barrier.rda") Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org # 5.0 Cc: Yang Yingliang <yangyingliang@huawei.com>	2019-04-17 10:29:27 -05:00
Arnd Bergmann	e6986423d2	socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW It looks like the new socket options only work correctly for native execution, but in case of compat mode fall back to the old behavior as we ignore the 'old_timeval' flag. Rework so we treat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW the same way in compat and native 32-bit mode. Cc: Deepa Dinamani <deepa.kernel@gmail.com> Fixes: `a9beb86ae6` ("sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:52:22 -07:00
Eric Dumazet	50ce163a72	tcp: tcp_grow_window() needs to respect tcp_space() For some reason, tcp_grow_window() correctly tests if enough room is present before attempting to increase tp->rcv_ssthresh, but does not prevent it to grow past tcp_space() This is causing hard to debug issues, like failing the (__tcp_select_window(sk) >= tp->rcv_wnd) test in __tcp_ack_snd_check(), causing ACK delays and possibly slow flows. Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio, we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000" after about 60 round trips, when the active side no longer sends immediate acks. This bug predates git history. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:47:39 -07:00
David S. Miller	17f780b364	Merge branch 'dpaa2-eth-Add-flow-steering-support-without-masking' Ioana Ciocoi Radulescu says: ==================== dpaa2-eth: Add flow steering support without masking On DPAA2 platforms that lack a TCAM (like LS1088A), masking of flow steering keys is not supported. Until now we didn't offer flow steering capabilities at all on these platforms. Introduce a limited support for flow steering, where we only allow ethtool rules that share a common key (i.e. have the same header fields). If a rule with a new composition key is wanted, the user must first manually delete all previous rules. First patch fixes a minor bug, the next two cleanup and prepare the code and the last one introduces the actual FS support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu	2d6802374c	dpaa2-eth: Add flow steering support without masking On platforms that lack a TCAM (like LS1088A), masking of flow steering keys is not supported. Until now we didn't offer flow steering capabilities at all on these platforms, since our driver implementation configured a "comprehensive" FS key (containing all supported header fields), with masks used to ignore the fields not present in the rules provided by the user. We now allow ethtool rules that share a common key (i.e. have the same header fields). The FS key is now kept in the driver private data and initialized when the first rule is added to an empty table, rather than at probe time. If a rule with a new composition key is wanted, the user must first manually delete all previous rules. When building a FS table entry to pass to firmware, we still use the old building algorithm, which assumes an all-supported-fields key, and later collapse the fields which aren't actually needed. Masked rules are not supported; if provided, the mask value will be ignored. For firmware versions older than MC10.7.0 (that only offer the legacy ABIs for configuring distribution keys) flow steering without masking support remains unavailable. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu	3a1e6b84ad	dpaa2-eth: Update hash key composition code Introduce an internal id bitfield to uniquely identify header fields supported by the Rx distribution keys. For the hash key, add a conversion from the RXH_* bitmask provided by ethtool to the internal ids. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu	61f9bf0011	dpaa2-eth: Add a couple of macros Add two macros to simplify reading DPNI options. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu	df8e249be8	dpaa2-eth: Fix Rx classification status Set the Rx flow classification enable flag only if key config operation is successful. Fixes 3f9b5c9 ("dpaa2-eth: Configure Rx flow classification key") Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:46:19 -07:00
Claudiu Manoil	1e1caa9735	ocelot: Clean up stats update deferred work This is preventive cleanup that may save troubles later. No need to cancel repeateadly queued work if code is properly refactored. Don't let the ethtool -s process interfere with the stat workqueue scheduling. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:43:53 -07:00
Claudiu Manoil	a8fd48b50d	ocelot: Don't sleep in atomic context (irqs_disabled()) Preemption disabled at: [<ffff000008cabd54>] dev_set_rx_mode+0x1c/0x38 Call trace: [<ffff00000808a5c0>] dump_backtrace+0x0/0x3d0 [<ffff00000808a9a4>] show_stack+0x14/0x20 [<ffff000008e6c0c0>] dump_stack+0xac/0xe4 [<ffff0000080fe76c>] ___might_sleep+0x164/0x238 [<ffff0000080fe890>] __might_sleep+0x50/0x88 [<ffff0000082261e4>] kmem_cache_alloc+0x17c/0x1d0 [<ffff000000ea0ae8>] ocelot_set_rx_mode+0x108/0x188 [mscc_ocelot_common] [<ffff000008cabcf0>] __dev_set_rx_mode+0x58/0xa0 [<ffff000008cabd5c>] dev_set_rx_mode+0x24/0x38 Fixes: `a556c76adc` ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:43:53 -07:00
John Hurley	9bad65e515	nfp: flower: fix implicit fallthrough warning The nfp_flower_copy_pre_actions function introduces a case statement with an intentional fallthrough. However, this generates a warning if built with the -Wimplicit-fallthrough flag. Remove the warning by adding a fall through comment. Fixes: `1c6952ca58` ("nfp: flower: generate merge flow rule") Signed-off-by: John Hurley <john.hurley@netronome.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:42:06 -07:00
Nikolay Aleksandrov	600bea7dba	net: bridge: fix netlink export of vlan_stats_per_port option Since the introduction of the vlan_stats_per_port option the netlink export of it has been broken since I made a typo and used the ifla attribute instead of the bridge option to retrieve its state. Sysfs export is fine, only netlink export has been affected. Fixes: `9163a0fc1f` ("net: bridge: add support for per-port vlan stats") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:40:29 -07:00
Colin Ian King	3321b6c23f	qed: fix spelling mistake "faspath" -> "fastpath" There is a spelling mistake in a DP_INFO message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:39:10 -07:00
Antoine Tenart	3aed3e2a14	net: phy: micrel: add Asym Pause workaround The Micrel KSZ9031 PHY may fail to establish a link when the Asymmetric Pause capability is set. This issue is described in a Silicon Errata (DS80000691D or DS80000692D), which advises to always disable the capability. This patch implements the workaround by defining a KSZ9031 specific get_feature callback to force the Asymmetric Pause capability bit to be cleared. This fixes issues where the link would not come up at boot time, or when the Asym Pause bit was set later on. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:36:40 -07:00
David S. Miller	9c03e22708	Merge branch 'bnx2x-Support-for-timestamping-in-P2P-mode' Sudarsana Reddy Kalluru says: ==================== bnx2x: Support for timestamping in P2P mode. The patch series adds driver support for timestamping the ptp packets in peer-delay (P2P) mode. - Patch (1) performs the code cleanup. - Patch (2) adds the required implementation. Please consider applying it 'net-next' tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:34:48 -07:00
Sudarsana Reddy Kalluru	00165c25fa	bnx2x: Add support for detection of P2P event packets. The patch adds support for detecting the P2P (peer-to-peer) event packets. This is required for timestamping the PTP packets in peer delay mode. Unmask the below bits (set to 0) for device to detect the p2p packets. NIG_REG_P0/1_LLH_PTP_PARAM_MASK NIG_REG_P0/1_TLLH_PTP_PARAM_MASK bit 1 - IPv4 DA 1 of 224.0.0.107. bit 3 - IPv6 DA 1 of 0xFF02:0:0:0:0:0:0:6B. bit 9 - MAC DA 1 of 0x01-80-C2-00-00-0E. NIG_REG_P0/1_LLH_PTP_RULE_MASK NIG_REG_P0/1_TLLH_PTP_RULE_MASK bit 2 - {IPv4 DA 1; UDP DP 0} bit 6 - MAC Ethertype 0 of 0x88F7. bit 9 - MAC DA 1 of 0x01-80-C2-00-00-0E. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:34:48 -07:00
Sudarsana Reddy Kalluru	b320532c99	bnx2x: Replace magic numbers with macro definitions. This patch performs code cleanup by defining macros for the ptp-timestamp filters. Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:34:48 -07:00
Jie Liu	4bcd4ec101	tipc: set sysctl_tipc_rmem and named_timeout right range We find that sysctl_tipc_rmem and named_timeout do not have the right minimum setting. sysctl_tipc_rmem should be larger than zero, like sysctl_tcp_rmem. And named_timeout as a timeout setting should be not less than zero. Fixes: `cc79dd1ba9` ("tipc: change socket buffer overflow control to respect sk_rcvbuf") Fixes: `a5325ae5b8` ("tipc: add name distributor resiliency queue") Signed-off-by: Jie Liu <liujie165@huawei.com> Reported-by: Qiang Ning <ningqiang1@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:32:02 -07:00
Tuong Lien	f7a937801b	tipc: fix link established but not in session According to the link FSM, when a link endpoint got RESET_MSG (- a traditional one without the stopping bit) from its peer, it moves to PEER_RESET state and raises a LINK_DOWN event which then resets the link itself. Its state will become ESTABLISHING after the reset event and the link will be re-established soon after this endpoint starts to send ACTIVATE_MSG to the peer. There is no problem with this mechanism, however the link resetting has cleared the link 'in_session' flag (along with the other important link data such as: the link 'mtu') that was correctly set up at the 1st step (i.e. when this endpoint received the peer RESET_MSG). As a result, the link will become ESTABLISHED, but the 'in_session' flag is not set, and all STATE_MSG from its peer will be dropped at the link_validate_msg(). It means the link not synced and will sooner or later face a failure. Since the link reset action is obviously needed for a new link session (this is also true in the other situations), the problem here is that the link is re-established a bit too early when the link endpoints are not really in-sync yet. The commit forces a resync as already done in the previous commit `91986ee166` ("tipc: fix link session and re-establish issues") by simply varying the link 'peer_session' value at the link_reset(). Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:31:26 -07:00
Yuya Kusakabe	d85e8be2a5	net: Fix missing meta data in skb with vlan packet skb_reorder_vlan_header() should move XDP meta data with ethernet header if XDP meta data exists. Fixes: `de8f3a83b0` ("bpf: add meta pointer for direct access") Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com> Signed-off-by: Takeru Hayasaka <taketarou2@gmail.com> Co-developed-by: Takeru Hayasaka <taketarou2@gmail.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:29:38 -07:00
Gustavo A. R. Silva	a32b9d91b7	xen-netfront: mark expected switch fall-through In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/xen-netfront.c: In function ‘netback_changed’: drivers/net/xen-netfront.c:2038:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (dev->state == XenbusStateClosed) ^ drivers/net/xen-netfront.c:2041:2: note: here case XenbusStateClosing: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:03:02 -07:00
Gustavo A. R. Silva	899537b735	net: atm: Fix potential Spectre v1 vulnerabilities arg is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/atm/lec.c:715 lec_mcast_attach() warn: potential spectre issue 'dev_lec' [r] (local cap) Fix this by sanitizing arg before using it to index dev_lec. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/ Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 21:01:45 -07:00
Ard Biesheuvel	ad910c7c01	net/core: work around section mismatch warning for ptp_classifier The routine ptp_classifier_init() uses an initializer for an automatic struct type variable which refers to an __initdata symbol. This is perfectly legal, but may trigger a section mismatch warning when running the compiler in -fpic mode, due to the fact that the initializer may be emitted into an anonymous .data section thats lack the __init annotation. So work around it by using assignments instead. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 20:46:17 -07:00
Nikolay Aleksandrov	3b2e2904de	net: bridge: fix per-port af_packet sockets When the commit below was introduced it changed two visible things: - the skb was no longer passed through the protocol handlers with the original device - the skb was passed up the stack with skb->dev = bridge The first change broke af_packet sockets on bridge ports. For example we use them for hostapd which listens for ETH_P_PAE packets on the ports. We discussed two possible fixes: - create a clone and pass it through NF_HOOK(), act on the original skb based on the result - somehow signal to the caller from the okfn() that it was called, meaning the skb is ok to be passed, which this patch is trying to implement via returning 1 from the bridge link-local okfn() Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and drop/error would return < 0 thus the okfn() is called only when the return was 1, so we signal to the caller that it was called by preserving the return value from nf_hook(). Fixes: `8626c56c82` ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-16 20:30:40 -07:00
Alexei Starovoitov	00967e84f7	Merge branch 'af_xdp-smp_mb-fixes' Magnus Karlsson says: ==================== This patch set fixes one bug and removes two dependencies on Linux kernel headers from the XDP socket code in libbpf. A number of people have pointed out that these two dependencies make it hard to build the XDP socket part of libbpf without any kernel header dependencies. The two removed dependecies are: * Remove the usage of likely and unlikely (compiler.h) in xsk.h. It has been reported that the use of these actually decreases the performance of the ring access code due to an increase in instruction cache misses, so let us just remove these. * Remove the dependency on barrier.h as it brings in a lot of kernel headers. As the XDP socket code only uses two simple functions from it, we can reimplement these. As a bonus, the new implementation is faster as it uses the same barrier primitives as the kernel does when the same code is compiled there. Without this patch, the user land code uses lfence and sfence on x86, which are unnecessarily harsh/thorough. In the process of removing these dependencies a missing barrier function for at least PPC64 was discovered. For a full explanation on the missing barrier, please refer to patch 1. So the patch set now starts with two patches fixing this. I have also added a patch at the end removing this full memory barrier for x86 only, as it is not needed there. Structure of the patch set: Patch 1-2: Adds the missing barrier function in kernel and user space. Patch 3-4: Removes the dependencies Patch 5: Optimizes the added barrier from patch 2 so that it does not do unnecessary work on x86. v2 -> v3: * Added missing memory barrier in ring code * Added an explanation on the three barriers we use in the code * Moved barrier functions from xsk.h to libbpf_util.h * Added comment on why we have these functions in libbpf_util.h * Added a new barrier function in user space that makes it possible to remove the full memory barrier on x86. v1 -> v2: * Added comment about validity of ARM 32-bit barriers. Only armv7 and above. /Magnus ==================== Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-04-16 20:13:11 -07:00

... 2 3 4 5 6 ...

827374 Commits