linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-15 07:36:42 +07:00

Author	SHA1	Message	Date
Richard Cochran	80b14dee2b	net: Add a new socket option for a future transmit time. This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is a 8-bytes long struct provided by the uapi header net_tstamp.h defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-bytes hole found in the struct. For that reason, neither the struct size or number of cachelines were altered. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 22:30:27 +09:00
Jesus Sanchez-Palencia	c47d8c2f38	net: Clear skb->tstamp only on the forwarding path This is done in preparation for the upcoming time based transmission patchset. Now that skb->tstamp will be used to hold packet's txtime, we must ensure that it is being cleared when traversing namespaces. Also, doing that from skb_scrub_packet() before the early return would break our feature when tunnels are used. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 22:30:27 +09:00
Gustavo A. R. Silva	d287c50243	isdn: mark expected switch fall-throughs In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Warning level 2 was used: -Wimplicit-fallthrough=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 22:17:32 +09:00
Marcel Ziswiler	03fc5d4ffb	net: usb: asix: allow optionally getting mac address from device tree For Embedded use where e.g. AX88772B chips may be used without external EEPROMs the boot loader may choose to pass the MAC address to be used via device tree. Therefore, allow for optionally getting the MAC address from device tree data e.g. as follows (excerpt from a T30 based board, local-mac-address to be filled in by boot loader): /* EHCI instance 1: USB2_DP/N -> AX88772B */ usb@7d004000 { status = "okay"; #address-cells = <1>; #size-cells = <0>; asix@1 { reg = <1>; local-mac-address = [00 00 00 00 00 00]; }; }; Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 22:09:23 +09:00
Wei Yongjun	30e99ed6db	net: sched: act_pedit: fix possible memory leak in tcf_pedit_init() 'keys_ex' is malloced by tcf_pedit_keys_ex_parse() in tcf_pedit_init() but not all of the error handle path free it, this may cause memory leak. This patch fix it. Fixes: `71d0ed7079` ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 22:08:21 +09:00
David S. Miller	7184e7e7d9	Merge branch 'bridge-iproute2-isolated-port-and-selftests' Nikolay Aleksandrov says: ==================== bridge: iproute2 isolated port and selftests Add support to iproute2 for port isolation config and selftests for it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:40:02 +09:00
Nikolay Aleksandrov	a14e9fafaa	selftests: forwarding: test for bridge port isolation This test checks if the bridge port isolation feature works as expected by performing ping/ping6 tests between hosts that are isolated (should not work) and between an isolated and non-isolated hosts (should work). Same test is performed for flooding from and to isolated and non-isolated ports. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:40:02 +09:00
Nikolay Aleksandrov	967450c543	selftests: forwarding: lib: extract ping and ping6 so they can be reused Extract ping and ping6 command execution so the return value can be checked by the caller, this is needed for port isolation tests that are intended to fail. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:40:02 +09:00
David S. Miller	f744c4bb5c	Merge branch 'vhost_net-Avoid-vq-kicks-during-busyloop' Toshiaki Makita says: ==================== vhost_net: Avoid vq kicks during busyloop Under heavy load vhost tx busypoll tend not to suppress vq kicks, which causes poor guest tx performance. The detailed scenario is described in commitlog of patch 2. Rx seems not to have that serious problem, but for consistency I made a similar change on rx to avoid rx wakeups (patch 3). Additionary patch 4 is to avoid rx kicks under heavy load during busypoll. Tx performance is greatly improved by this change. I don't see notable performance change on rx with this series though. Performance numbers (tx): - Bulk transfer from guest to external physical server. [Guest]->vhost_net->tap--(XDP_REDIRECT)-->i40e --(wire)--> [Server] - Set 10us busypoll. - Guest disables checksum and TSO because of host XDP. - Measured single flow Mbps by netperf, and kicks by perf kvm stat (EPT_MISCONFIG event). Results (before -> after): UDP_STREAM 1472byte: kicks/s 247758 -> 27, Send 3645.37 -> 6958.10 Mbps, Recv 3588.56 -> 6958.10 Mbps; UDP_STREAM 1byte: kicks/s 9865 -> 37, Send 4.34 -> 5.43 Mbps, Recv 4.17 -> 5.26 Mbps; TCP_STREAM: 8801.03 -> 9592.77 Mbps, kicks/s 45794 -> 2884. v2: - Split patches into 3 parts (renaming variables, tx-kick fix, rx-wakeup fix). - Avoid rx-kicks too (patch 4). - Don't memorize endtime as it is not needed for now. ==================== Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:31:08 +09:00
Toshiaki Makita	6369fec5be	vhost_net: Avoid rx vring kicks during busyloop We may run out of avail rx ring descriptor under heavy load but busypoll did not detect it so busypoll may have exited prematurely. Avoid this by checking rx ring full during busypoll. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:30:47 +09:00
Toshiaki Makita	be294a51ad	vhost_net: Avoid rx queue wake-ups during busypoll We may run handle_rx() while rx work is queued. For example a packet can push the rx work during the window before handle_rx calls vhost_net_disable_vq(). In that case busypoll immediately exits due to vhost_has_work() condition and enables vq again. This can lead to another unnecessary rx wake-ups, so poll rx work instead of enabling the vq. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:30:46 +09:00
Toshiaki Makita	027b17603b	vhost_net: Avoid tx vring kicks during busyloop Under heavy load vhost busypoll may run without suppressing notification. For example tx zerocopy callback can push tx work while handle_tx() is running, then busyloop exits due to vhost_has_work() condition and enables notification but immediately reenters handle_tx() because the pushed work was tx. In this case handle_tx() tries to disable notification again, but when using event_idx it by design cannot. Then busyloop will run without suppressing notification. Another example is the case where handle_tx() tries to enable notification but avail idx is advanced so disables it again. This case also leads to the same situation with event_idx. The problem is that once we enter this situation busyloop does not work under heavy load for considerable amount of time, because notification is likely to happen during busyloop and handle_tx() immediately enables notification after notification happens. Specifically busyloop detects notification by vhost_has_work() and then handle_tx() calls vhost_enable_notify(). Because the detected work was the tx work, it enters handle_tx(), and enters busyloop without suppression again. This is likely to be repeated, so with event_idx we are almost not able to suppress notification in this case. To fix this, poll the work instead of enabling notification when busypoll is interrupted by something. IMHO vhost_has_work() is kind of interruption rather than a signal to completely cancel the busypoll, so let's run busypoll after the necessary work is done. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:30:46 +09:00
Toshiaki Makita	28b9b33b98	vhost_net: Rename local variables in vhost_net_rx_peek_head_len So we can easily see which variable is for which, tx or rx. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:30:46 +09:00
Qiaobin Fu	e7e3728bd7	net:sched: add action inheritdsfield to skbedit The new action inheritdsfield copies the field DS of IPv4 and IPv6 packets into skb->priority. This enables later classification of packets based on the DS field. v5: Update the drop counter for TC_ACT_SHOT v4: Not allow setting flags other than the expected ones. Allow dumping the pure flags. v3: Use optional flags, so that it won't break old versions of tc. Allow users to set both SKBEDIT_F_PRIORITY and SKBEDIT_F_INHERITDSFIELD flags. v2: Fix the style issue *Move the code from skbmod to skbedit Original idea by Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu> Reviewed-by: Michel Machado <michel@digirati.com.br> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 21:27:42 +09:00
David S. Miller	f145b0a707	Merge branch 'More-mirror-to-gretap-tests-with-bridge-in-UL' Petr Machata says: ==================== More mirror-to-gretap tests with bridge in UL This patchset adds two more tests where the mirror-to-gretap has a bridge in underlay packet path, without a VLAN above or below that bridge. In patch #1, a non-VLAN-filtering bridge is tested. In patch #2, a VLAN-filtering bridge is tested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:18:46 +09:00
Petr Machata	239e754af8	selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q Test for "tc action mirred egress mirror" that mirrors to gretap when the underlay route points at a VLAN-aware bridge (802.1q). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:18:45 +09:00
Petr Machata	35c31d5c32	selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d Test for "tc action mirred egress mirror" that mirrors to gretap when the underlay route points at a VLAN-unaware bridge (802.1d). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:18:45 +09:00
David S. Miller	2d1b138505	Merge branch 'Handle-multiple-received-packets-at-each-stage' Edward Cree says: ==================== Handle multiple received packets at each stage This patch series adds the capability for the network stack to receive a list of packets and process them as a unit, rather than handling each packet singly in sequence. This is done by factoring out the existing datapath code at each layer and wrapping it in list handling code. The motivation for this change is twofold: * Instruction cache locality. Currently, running the entire network stack receive path on a packet involves more code than will fit in the lowest-level icache, meaning that when the next packet is handled, the code has to be reloaded from more distant caches. By handling packets in "row-major order", we ensure that the code at each layer is hot for most of the list. (There is a corresponding downside in _data_ cache locality, since we are now touching every packet at every layer, but in practice there is easily enough room in dcache to hold one cacheline of each of the 64 packets in a NAPI poll.) * Reduction of indirect calls. Owing to Spectre mitigations, indirect function calls are now more expensive than ever; they are also heavily used in the network stack's architecture (see [1]). By replacing 64 indirect calls to the next-layer per-packet function with a single indirect call to the next-layer list function, we can save CPU cycles. Drivers pass an SKB list to the stack at the end of the NAPI poll; this gives a natural batch size (the NAPI poll weight) and avoids waiting at the software level for further packets to make a larger batch (which would add latency). It also means that the batch size is automatically tuned by the existing interrupt moderation mechanism. The stack then runs each layer of processing over all the packets in the list before proceeding to the next layer. 
Where the 'next layer' (or the context in which it must run) differs among the packets, the stack splits the list; this 'late demux' means that packets which differ only in later headers (e.g. same L2/L3 but different L4) can traverse the early part of the stack together. Also, where the next layer is not (yet) list-aware, the stack can revert to calling the rest of the stack in a loop; this allows gradual/creeping listification, with no 'flag day' patch needed to listify everything. Patches 1-2 simply place received packets on a list during the event processing loop on the sfc EF10 architecture, then call the normal stack for each packet singly at the end of the NAPI poll. (Analogues of patch #2 for other NIC drivers should be fairly straightforward.) Patches 3-9 extend the list processing as far as the IP receive handler. Patches 1-2 alone give about a 10% improvement in packet rate in the baseline test; adding patches 3-9 raises this to around 25%. Performance measurements were made with NetPerf UDP_STREAM, using 1-byte packets and a single core to handle interrupts on the RX side; this was in order to measure as simply as possible the packet rate handled by a single core. Figures are in Mbit/s; divide by 8 to obtain Mpps. The setup was tuned for maximum reproducibility, rather than raw performance. Full details and more results (both with and without retpolines) from a previous version of the patch series are presented in [2]. The baseline test uses four streams, and multiple RXQs all bound to a single CPU (the netperf binary is bound to a neighbouring CPU). These tests were run with retpolines. net-next: 6.91 Mb/s (datum) after 9: 8.46 Mb/s (+22.5%) Note however that these results are not robust; changes in the parameters of the test sometimes shrink the gain to single-digit percentages. For instance, when using only a single RXQ, only a 4% gain was seen. One test variation was the use of software filtering/firewall rules. 
Adding a single iptables rule (UDP port drop on a port range not matching the test traffic), thus making the netfilter hook have work to do, reduced baseline performance but showed a similar gain from the patches: net-next: 5.02 Mb/s (datum) after 9: 6.78 Mb/s (+35.1%) Similarly, testing with a set of TC flower filters (kindly supplied by Cong Wang) gave the following: net-next: 6.83 Mb/s (datum) after 9: 8.86 Mb/s (+29.7%) These data suggest that the batching approach remains effective in the presence of software switching rules, and perhaps even improves the performance of those rules by allowing them and their codepaths to stay in cache between packets. Changes from v3: * Fixed build error when CONFIG_NETFILTER=n (thanks kbuild). Changes from v2: * Used standard list handling (and skb->list) instead of the skb-queue functions (that use skb->next, skb->prev). - As part of this, changed from a "dequeue, process, enqueue" model to using list_for_each_safe, list_del, and (new) list_cut_before. * Altered __netif_receive_skb_core() changes in patch 6 as per Willem de Bruijn's suggestions (separate *ppt_prev from pt_prev; renaming). * Removed patches to Generic XDP, since they were producing no benefit. I may revisit them later. * Removed RFC tags. Changes from v1: * Rebased across 2 years' net-next movement (surprisingly straightforward). - Added Generic XDP handling to netif_receive_skb_list_internal() - Dealt with changes to PFMEMALLOC setting APIs * General cleanup of code and comments. * Skipped function calls for empty lists at various points in the stack (patch #9). * Added listified Generic XDP handling (patches 10-12), though it doesn't seem to help (see above). * Extended testing to cover software firewalls / netfilter etc. [1] http://vger.kernel.org/netconf2018_files/DavidMiller_netconf2018.pdf [2] http://vger.kernel.org/netconf2018_files/EdwardCree_netconf2018.pdf ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:20 +09:00
Edward Cree	b9f463d6c9	net: don't bother calling list RX functions on empty lists Generally the check should be very cheap, as the sk_buff_head is in cache. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:20 +09:00
Edward Cree	5fa12739a5	net: ipv4: listify ip_rcv_finish ip_rcv_finish_core(), if it does not drop, sets skb->dst by either early demux or route lookup. The last step, calling dst_input(skb), is left to the caller; in the listified case, we split to form sublists with a common dst, but then ip_sublist_rcv_finish() just calls dst_input(skb) in a loop. The next step in listification would thus be to add a list_input() method to struct dst_entry. Early demux is an indirect call based on iph->protocol; this is another opportunity for listification which is not taken here (it would require slicing up ip_rcv_finish_core() to allow splitting on protocol changes). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:20 +09:00
Edward Cree	17266ee939	net: ipv4: listified version of ip_rcv Also involved adding a way to run a netfilter hook over a list of packets. Rather than attempting to make netfilter know about lists (which would be a major project in itself) we just let it call the regular okfn (in this case ip_rcv_finish()) for any packets it steals, and have it give us back a list of packets it's synchronously accepted (which normally NF_HOOK would automatically call okfn() on, but we want to be able to potentially pass the list to a listified version of okfn().) The netfilter hooks themselves are indirect calls that still happen per- packet (see nf_hook_entry_hookfn()), but again, changing that can be left for future work. There is potential for out-of-order receives if the netfilter hook ends up synchronously stealing packets, as they will be processed before any accepts earlier in the list. However, it was already possible for an asynchronous accept to cause out-of-order receives, so presumably this is considered OK. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:20 +09:00
Edward Cree	88eb1944e1	net: core: propagate SKB lists through packet_type lookup __netif_receive_skb_core() does a depressingly large amount of per-packet work that can't easily be listified, because the another_round looping makes it nontrivial to slice up into smaller functions. Fortunately, most of that work disappears in the fast path: * Hardware devices generally don't have an rx_handler * Unless you're tcpdumping or something, there is usually only one ptype * VLAN processing comes before the protocol ptype lookup, so doesn't force a pt_prev deliver so normally, __netif_receive_skb_core() will run straight through and pass back the one ptype found in ptype_base[hash of skb->protocol]. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:20 +09:00
Edward Cree	4ce0017a37	net: core: another layer of lists, around PF_MEMALLOC skb handling First example of a layer splitting the list (rather than merely taking individual packets off it). Involves new list.h function, list_cut_before(), like list_cut_position() but cuts on the other side of the given entry. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:19 +09:00
Edward Cree	7da517a3bc	net: core: Another step of skb receive list processing netif_receive_skb_list_internal() now processes a list and hands it on to the next function. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:19 +09:00
Edward Cree	920572b732	net: core: unwrap skb list receive slightly further Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:19 +09:00
Edward Cree	e090bfb9f1	sfc: batch up RX delivery Improves packet rate of 1-byte UDP receives by up to 10%. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:19 +09:00
Edward Cree	f6ad8c1bcd	net: core: trivial netif_receive_skb_list() entry point Just calls netif_receive_skb() in a loop. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 14:06:19 +09:00
David S. Miller	2bdea157b9	Merge branch 'sctp-fully-support-for-dscp-and-flowlabel-per-transport' Xin Long says: ==================== sctp: fully support for dscp and flowlabel per transport Now dscp and flowlabel are set from sock when sending the packets, but being multi-homing, sctp also supports for dscp and flowlabel per transport, which is described in section 8.1.12 in RFC6458. v1->v2: - define ip_queue_xmit as inline in net/ip.h, instead of exporting it in Patch 1/5 according to David's suggestion. - fix the param len check in sctp_s/getsockopt_peer_addr_params() in Patch 3/5 to guarantee that an old app built with old kernel headers could work on the newer kernel per Marcelo's point. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:55 +09:00
Xin Long	0999f021c9	sctp: check for ipv6_pinfo legal sndflow with flowlabel in sctp_v6_get_dst The transport with illegal flowlabel should not be allowed to send packets. Other transport protocols already denies this. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:54 +09:00
Xin Long	4be4139f7d	sctp: add support for setting flowlabel when adding a transport Struct sockaddr_in6 has the member sin6_flowinfo that includes the ipv6 flowlabel, it should also support for setting flowlabel when adding a transport whose ipaddr is from userspace. Note that addrinfo in sctp_sendmsg is using struct in6_addr for the secondary addrs, which doesn't contain sin6_flowinfo, and it needs to copy sin6_flowinfo from the primary addr. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:54 +09:00
Xin Long	0b0dce7a36	sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams spp_ipv6_flowlabel and spp_dscp are added in sctp_paddrparams in this patch so that users could set sctp_sock/asoc/transport dscp and flowlabel with spp_flags SPP_IPV6_FLOWLABEL or SPP_DSCP by SCTP_PEER_ADDR_PARAMS , as described section 8.1.12 in RFC6458. As said in last patch, it uses '\| 0x100000' or '\|0x1' to mark flowlabel or dscp is set, so that their values could be set to 0. Note that to guarantee that an old app built with old kernel headers could work on the newer kernel, the param's check in sctp_g/setsockopt_peer_addr_params() is also improved, which follows the way that sctp_g/setsockopt_delayed_ack() or some other sockopts' process that accept two types of params does. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:54 +09:00
Xin Long	8a9c58d28d	sctp: add support for dscp and flowlabel per transport Like some other per transport params, flowlabel and dscp are added in transport, asoc and sctp_sock. By default, transport sets its value from asoc's, and asoc does it from sctp_sock. flowlabel only works for ipv6 transport. Other than that they need to be passed down in sctp_xmit, flow4/6 also needs to set them before looking up route in get_dst. Note that it uses '& 0x100000' to check if flowlabel is set and '& 0x1' (tos 1st bit is unused) to check if dscp is set by users, so that they could be set to 0 by sockopt in next patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:54 +09:00
Xin Long	69b9e1e07d	ipv4: add __ip_queue_xmit() that supports tos param This patch introduces __ip_queue_xmit(), through which the callers can pass tos param into it without having to set inet->tos. For ipv6, ip6_xmit() already allows passing tclass parameter. It's needed when some transport protocol doesn't use inet->tos, like sctp's per transport dscp, which will be added in next patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:36:54 +09:00
Linus Walleij	05bd97fc55	net: dsa: Add Vitesse VSC73xx DSA router driver This adds a DSA driver for: Vitesse VSC7385 SparX-G5 5-port Integrated Gigabit Ethernet Switch Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch These switches have a built-in 8051 CPU and can download and execute firmware in this CPU. They can also be configured to use an external CPU handling the switch in a memory-mapped manner by connecting to that external CPU's memory bus. This driver (currently) only takes control of the switch chip over SPI and configures it to route packages around when connected to a CPU port. The chip has embedded PHYs and VLAN support so we model it using DSA as a best fit so we can easily add VLAN support and maybe later also exploit the internal frame header to get more direct control over the switch. The four built-in GPIO lines are exposed using a standard GPIO chip. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:30:02 +09:00
Linus Walleij	975ae7c69d	net: phy: vitesse: Add support for VSC73xx The VSC7385, VSC7388, VSC7395 and VSC7398 are integrated switch/router chips for 5+1 or 8-port switches/routers. When managed directly by Linux using DSA we need to do a special set-up "dance" on the PHY. Unfortunately these sequences switches the PHY to undocumented pages named 2a30 and 52b6 and does undocumented things. It is described by these opaque sequences also in the reference manual. This is a best effort to integrate it anyways. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:30:02 +09:00
Linus Walleij	1decd2ec22	net: dsa: Add DT bindings for Vitesse VSC73xx switches This adds the device tree bindings for the Vitesse VSC73xx switches. We also add the vendor name for Vitesse. Cc: devicetree@vger.kernel.org Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 11:30:01 +09:00
David S. Miller	b68034087a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-03 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Various improvements to bpftool and libbpf, that is, bpftool build speed improvements, missing BPF program types added for detection by section name, ability to load programs from '.text' section is made to work again, and better bash completion handling, from Jakub. 2) Improvements to nfp JIT's map read handling which allows for optimizing memcpy from map to packet, from Jiong. 3) New BPF sample is added which demonstrates XDP in combination with bpf_perf_event_output() helper to sample packets on all CPUs, from Toke. 4) Add a new BPF kselftest case for tracking connect(2) BPF hooks infrastructure in combination with TFO, from Andrey. 5) Extend the XDP/BPF xdp_rxq_info sample code with a cmdline option to read payload from packet data in order to use it for benchmarking. Also for '--action XDP_TX' option implement swapping of MAC addresses to avoid drops on some hardware seen during testing, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-04 08:53:53 +09:00
David S. Miller	44a4c4698e	Merge branch 'aquantia-various-ethtool-ops-implementation' Igor Russkikh says: ==================== net: aquantia: various ethtool ops implementation In this patchset Anton Mikaev and I added some useful ethtool operations: - ring size changes - link renegotioation - flow control management The patch also improves init/deinit sequence. V3 changes: - After review and analysis it is clear that rtnl lock (which is captured by default on ethtool ops) is enough to secure possible overlapping of dev open/close. Thus, just dropping internal mutex. V2 changes: - using mutex to secure simultaneous dev close/open - using state var to store/restore dev state ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Igor Russkikh	1d1c212283	net: aquantia: bump driver version Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Anton Mikaev	b8d68b62d9	net: aquantia: Add renegotiate ethtool operation support Adds ethtool -r\|--negotiate operation support. It triggers special control bit on FW interface causing FW to restart link negotiation. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Anton Mikaev <amikaev@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Igor Russkikh	288551de45	net: aquantia: Implement rx/tx flow control ethtools callback Runtime change of pause frame configuration (rx/tx flow control) via ethtool. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Igor Russkikh	44e00dd8eb	net: aquantia: Improve adapter init/deinit logic We now pass link drop status to FW on init/deinit. This is required to inform FW that driver took/released a control on link. FW then will manage its own state and device power profile based on this information. To improve management we remove mpi_set function which ambiguously took both state and speed parameters. Deinit callback is now a part of FW ops, as it actually manages the FW. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Anton Mikaev	c1af542795	net: aquantia: Ethtool based ring size configuration Implemented ring size setup, min/max validation and reconfiguration in runtime. Signed-off-by: Anton Mikaev <amikaev@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:23:48 +09:00
Gustavo A. R. Silva	c18a9c0966	net: stmmac_tc: use 64-bit arithmetic instead of 32-bit Add suffix UL to constant 1024 in order to give the compiler complete information about the proper arithmetic to use. Notice that this constant is used in a context that expects an expression of type u64 (64 bits, unsigned) and following expressions are currently being evaluated using 32-bit arithmetic: qopt->idleslope * 1024 * ptr qopt->hicredit * 1024 * 8 qopt->locredit * 1024 * 8 Addresses-Coverity-ID: 1470246 ("Unintentional integer overflow") Addresses-Coverity-ID: 1470248 ("Unintentional integer overflow") Addresses-Coverity-ID: 1470249 ("Unintentional integer overflow") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 23:21:07 +09:00
Dan Murphy	00f553660a	net: phy: DP83TC811: Fix SGMII enable/disable If SGMII was selected in the DT then the device should write the SGMII enable bit. If SGMII is not selected in the DT then the SGMII bit should be disabled. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 11:38:07 +09:00
Dan Murphy	4203638359	net: phy: DP83TC811: Add INT_STAT3 Add INT_STAT3 interrupt setting and clearing support. Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 11:38:07 +09:00
David S. Miller	5cd3da4ba2	Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-03 10:29:26 +09:00
Linus Torvalds	d0fbad0aec	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md Pull MD fixes from Shaohua Li: "Two small fixes for MD: - an error handling fix from me - a recover bug fix for raid10 from BingJing" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: md/raid10: fix that replacement cannot complete recovery after reassemble MD: cleanup resources in failure	2018-07-02 12:40:59 -07:00
Linus Torvalds	8d2b6f6b4a	OpenRISC fixes for 4.18 Two fixes here which were breaking OpenRISC boot. - Fix bug in __pte_free_tlb() exposed in 4.18 by Matthew Wilcox's page table flag addition. - Fix issue booting on real hardware if delay slot detection emulation is disabled. Merge tag 'for-linus' of git://github.com/stffrdhrn/linux Pull OpenRISC fixes from Stafford Horne: "Two fixes for issues which were breaking OpenRISC boot: - Fix bug in __pte_free_tlb() exposed in 4.18 by Matthew Wilcox's page table flag addition. - Fix issue booting on real hardware if delay slot detection emulation is disabled" * tag 'for-linus' of git://github.com/stffrdhrn/linux: openrisc: entry: Fix delay slot exception detection openrisc: Call destructor during __pte_free_tlb	2018-07-02 12:38:14 -07:00
Linus Torvalds	4e33d7d479	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Verify netlink attributes properly in nf_queue, from Eric Dumazet. 2) Need to bump memory lock rlimit for test_sockmap bpf test, from Yonghong Song. 3) Fix VLAN handling in lan78xx driver, from Dave Stevenson. 4) Fix uninitialized read in nf_log, from Jann Horn. 5) Fix raw command length parsing in mlx5, from Alex Vesker. 6) Cleanup loopback RDS connections upon netns deletion, from Sowmini Varadhan. 7) Fix regressions in FIB rule matching during create, from Jason A. Donenfeld and Roopa Prabhu. 8) Fix mpls ether type detection in nfp, from Pieter Jansen van Vuuren. 9) More bpfilter build fixes/adjustments from Masahiro Yamada. 10) Fix XDP_{TX,REDIRECT} flushing in various drivers, from Jesper Dangaard Brouer. 11) fib_tests.sh file permissions were broken, from Shuah Khan. 12) Make sure BH/preemption is disabled in data path of mac80211, from Denis Kenzior. 13) Don't ignore nla_parse_nested() return values in nl80211, from Johannes Berg. 14) Properly account sock objects to kmemcg, from Shakeel Butt. 15) Adjustments to setting bpf program permissions to read-only, from Daniel Borkmann. 16) TCP Fast Open key endianness was broken, it always took on the host endianness. Whoops. Explicitly make it little endian. From Yuchung Cheng. 17) Fix prefix route setting for link local addresses in ipv6, from David Ahern. 18) Potential Spectre v1 in zatm driver, from Gustavo A. R. Silva. 19) Various bpf sockmap fixes, from John Fastabend. 20) Use after free for GRO with ESP, from Sabrina Dubroca. 21) Passing bogus flags to crypto_alloc_shash() in ipv6 SR code, from Eric Biggers. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) qede: Adverstise software timestamp caps when PHC is not available. qed: Fix use of incorrect size in memcpy call. qed: Fix setting of incorrect eswitch mode. 
qed: Limit msix vectors in kdump kernel to the minimum required count. ipvlan: call dev_change_flags when ipvlan mode is reset ipv6: sr: fix passing wrong flags to crypto_alloc_shash() net: fix use-after-free in GRO with ESP tcp: prevent bogus FRTO undos with non-SACK flows bpf: sockhash, add release routine bpf: sockhash fix omitted bucket lock in sock_close bpf: sockmap, fix smap_list_map_remove when psock is in many maps bpf: sockmap, fix crash when ipv6 sock is added net: fib_rules: bring back rule_exists to match rule during add hv_netvsc: split sub-channel setup into async and sync net: use dev_change_tx_queue_len() for SIOCSIFTXQLEN atm: zatm: Fix potential Spectre v1 s390/qeth: consistently re-enable device features s390/qeth: don't clobber buffer on async TX completion s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6] s390/qeth: fix race when setting MAC address ...	2018-07-02 11:18:28 -07:00

767632 Commits