linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Florian Westphal	9971a514ed	netfilter: nf_nat: add nat type hooks to nat core Currently the packet rewrite and instantiation of nat NULL bindings happens from the protocol specific nat backend. Invocation occurs either via ip(6)table_nat or the nf_tables nat chain type. Invocation looks like this (simplified): NF_HOOK() \| `---iptable_nat \| `---> nf_nat_l3proto_ipv4 -> nf_nat_packet \| new packet? pass skb though iptables nat chain \| `---> iptable_nat: ipt_do_table In nft case, this looks the same (nft_chain_nat_ipv4 instead of iptable_nat). This is a problem for two reasons: 1. Can't use iptables nat and nf_tables nat at the same time, as the first user adds a nat binding (nf_nat_l3proto_ipv4 adds a NULL binding if do_table() did not find a matching nat rule so we can detect post-nat tuple collisions). 2. If you use e.g. nft_masq, snat, redir, etc. uses must also register an empty base chain so that the nat core gets called fro NF_HOOK() to do the reverse translation, which is neither obvious nor user friendly. After this change, the base hook gets registered not from iptable_nat or nftables nat hooks, but from the l3 nat core. iptables/nft nat base hooks get registered with the nat core instead: NF_HOOK() \| `---> nf_nat_l3proto_ipv4 -> nf_nat_packet \| new packet? pass skb through iptables/nftables nat chains \| +-> iptables_nat: ipt_do_table +-> nft nat chain x `-> nft nat chain y The nat core deals with null bindings and reverse translation. When no mapping exists, it calls the registered nat lookup hooks until one creates a new mapping. If both iptables and nftables nat hooks exist, the first matching one is used (i.e., higher priority wins). Also, nft users do not need to create empty nat hooks anymore, nat core always registers the base hooks that take care of reverse/reply translation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:06 +02:00
Florian Westphal	1cd472bf03	netfilter: nf_nat: add nat hook register functions to nf_nat This adds the infrastructure to register nat hooks with the nat core instead of the netfilter core. nat hooks are used to configure nat bindings. Such hooks are registered from ip(6)table_nat or by the nftables core when a nat chain is added. After next patch, nat hooks will be registered with nf_nat instead of netfilter core. This allows to use many nat lookup functions at the same time while doing the real packet rewrite (nat transformation) in one place. This change doesn't convert the intended users yet to ease review. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	06cad3acf0	netfilter: core: export raw versions of add/delete hook functions This will allow the nat core to reuse the nf_hook infrastructure to maintain nat lookup functions. The raw versions don't assume a particular hook location, the functions get added/deleted from the hook blob that is passed to the functions. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	4e25ceb80b	netfilter: nf_tables: allow chain type to override hook register Will be used in followup patch when nat types no longer use nf_register_net_hook() but will instead register with the nat core. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	ba7d284a98	netfilter: xtables: allow table definitions not backed by hook_ops The ip(6)tables nat table is currently receiving skbs from the netfilter core, after a followup patch skbs will be coming from the netfilter nat core instead, so the table is no longer backed by normal hook_ops. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
Florian Westphal	1f55236bd8	netfilter: nf_nat: move common nat code to nat core Copy-pasted, both l3 helpers almost use same code here. Split out the common part into an 'inet' helper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-05-23 09:14:05 +02:00
David S. Miller	1fe8c06c4a	Merge branch 'qed-firmware-TLV' Sudarsana Reddy Kalluru says: ==================== qed*: Add support for management firmware TLV request. Management firmware (MFW) requires config and state information from the driver. It queries this via TLV (type-length-value) request wherein mfw specificies the list of required TLVs. Driver fills the TLV data and responds back to MFW. This patch series adds qed/qede/qedf/qedi driver implementation for supporting the TLV queries from MFW. Changes from previous versions: ------------------------------- v2: Split patch (2) into multiple simpler patches. v2: Update qed_tlv_parsed_buf->p_val datatype to void pointer to avoid bunch of unnecessary typecasts. Please consider applying this series to "net-next". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:55 -04:00
Manish Rangankar	269afb3603	qedi: Add get_generic_tlv_data handler. Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Manish Rangankar	534bbdf883	qedi: Add support for populating ethernet TLVs. This patch adds callbacks for providing the ethernet protocol driver TLVs. Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Chad Dupuis	8673daf4f5	qedf: Add get_generic_tlv_data handler. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Chad Dupuis	642a0b37e6	qedf: Add support for populating ethernet TLVs. This patch adds callbacks for providing the ethernet protocol driver TLVs. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru	d25b859ccd	qede: Add support for populating ethernet TLVs. This patch adds callbacks for providing the ethernet protocol driver TLVs. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru	59ccf86fe6	qed: Add driver infrastucture for handling mfw requests. MFW requests the TLVs in interrupt context. Extracting of the required data from upper layers and populating of the TLVs require process context. The patch adds work-queues for processing the tlv requests. It also adds the implementation for requesting the tlv values from appropriate protocol driver. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru	77a509e4f6	qed: Add support for processing iscsi tlv request. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:54 -04:00
Sudarsana Reddy Kalluru	f240b68822	qed: Add support for processing fcoe tlv request. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:53 -04:00
Sudarsana Reddy Kalluru	2528c38993	qed: Add support for tlv request processing. The patch adds driver support for processing TLV requests/repsonses from the mfw and upper driver layers respectively. The implementation reads the requested TLVs from the shared memory, requests the values from upper layer drivers, populates this info (TLVs) shared memory and notifies MFW about the TLV values. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:53 -04:00
Sudarsana Reddy Kalluru	dd006921d6	qed: Add MFW interfaces for TLV request support. The patch adds required management firmware (MFW) interfaces such as mailbox commands, TLV types etc. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 23:29:53 -04:00
David S. Miller	9c803cfd5f	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-05-22 This series contains updates to i40e only. Jake provides all the changes in this series starting with making it consistent in how we approach the bit lock. Fixed the reporting of the VEB statistics and the queue statistics to always return every queue even if it is not currently in use. Use WARN_ONCE() so that the first time we end up with an incorrect size we will dump a stack trace and a message to help highlight the issue early in testing. Folded the fixed string prefix into the stat string definition. Instead of using a separate char *p pointer when copying strings, use the data pointer directly. Added code comments for several of the statistic functions to better explain the number and ordering of statistics. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 15:45:06 -04:00
David S. Miller	119768c903	Merge branch 'tcp-ECN-quickack' Eric Dumazet says: ==================== tcp: reduce quickack pressure for ECN Small patch series changing TCP behavior vs quickack and ECN First patch is a refactoring, adding parameter to tcp_incr_quickack() and tcp_enter_quickack_mode() helpers. Second patch implements the change, lowering number of ACK packets sent after an ECN event. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 15:43:16 -04:00
Eric Dumazet	522040ea5f	tcp: do not aggressively quick ack after ECN events ECN signals currently forces TCP to enter quickack mode for up to 16 (TCP_MAX_QUICKACKS) following incoming packets. We believe this is not needed, and only sending one immediate ack for the current packet should be enough. This should reduce the extra load noticed in DCTCP environments, after congestion events. This is part 2 of our effort to reduce pure ACK packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 15:43:15 -04:00
Eric Dumazet	9a9c9b51e5	tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode We want to add finer control of the number of ACK packets sent after ECN events. This patch is not changing current behavior, it only enables following change. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 15:43:15 -04:00
Vlad Buslov	290aa0ad74	net: sched: don't disable bh when accessing action idr Initial net_device implementation used ingress_lock spinlock to synchronize ingress path of device. This lock was used in both process and bh context. In some code paths action map lock was obtained while holding ingress_lock. Commit `e1e992e52f` ("[NET_SCHED] protect action config/dump from irqs") modified actions to always disable bh, while using action map lock, in order to prevent deadlock on ingress_lock in softirq. This lock was removed from net_device, so disabling bh, while accessing action map, is no longer necessary. Replace all action idr spinlock usage with regular calls that do not disable bh. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 15:34:34 -04:00
David S. Miller	73bf1fc58d	Merge branch 'net-ipv6-Fix-route-append-and-replace-use-cases' David Ahern says: ==================== net/ipv6: Fix route append and replace use cases This patch set fixes a few append and replace uses cases for IPv6 and adds test cases that codifies the expectations of how append and replace are expected to work. In paricular it allows a multipath route to have a dev-only nexthop, something Thomas tried to accomplish with commit `edd7ceb782` ("ipv6: Allow non-gateway ECMP for IPv6") which had to be reverted because of breakage, and to replace an existing FIB entry with a reject route. There are a number of inconsistent and surprising aspects to the Linux API for adding, deleting, replacing and changing FIB entries. For example, with IPv4 NLM_F_APPEND means insert the route after any existing entries with the same key (prefix + priority + TOS for IPv4) and NLM_F_CREATE without the append flag inserts the new route before any existing entries. IPv6 on the other hand attempts to guess whether a new route should be appended to an existing one, possibly creating a multipath route, or to add a new entry after any existing ones. This applies to both the 'append' (NLM_F_CREATE + NLM_F_APPEND) and 'prepend' (NLM_F_CREATE only) cases meaning for IPv6 the NLM_F_APPEND is basically ignored. This guessing whether the route should be added to a multipath route (gateway routes) or inserted after existing entries (non-gateway based routes) means a multipath route can not have a dev only nexthop (potentially required in some cases - tunnels or VRF route leaking for example) and route 'replace' is a bit adhoc treating gateway based routes and dev-only / reject routes differently. This has led to frustration with developers working on routing suites such as FRR where workarounds such as delete and add are used instead of replace. After this patch set there are 2 differences between IPv4 and IPv6: 1. 'ip ro prepend' = NLM_F_CREATE only IPv4 adds the new route before any existing ones IPv6 adds new route after any existing ones 2. 'ip ro append' = NLM_F_CREATE\|NLM_F_APPEND IPv4 adds the new route after any existing ones IPv6 adds the nexthop to existing routes converting to multipath For the former, there are cases where we want same prefix routes added after existing ones (e.g., multicast, prefix routes for macvlan when used for virtual router redundancy). Requiring the APPEND flag to add a new route to an existing one helps here but is a slight change in behavior since prepend with gateway routes now create a separate entry. For the latter IPv6 behavior is preferred - appending a route for the same prefix and metric to make a multipath route, so really IPv4 not allowing an existing route to be updated is the limiter. This will be fixed when nexthops become separate objects - a future patch set. Thank you to Thomas and Ido for testing earlier versions of this set, and to Ido for providing an update to the mlxsw driver. Changes since RFC - cleanup wording in test script; add comments about expected failures and why ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:20 -04:00
David Ahern	abb1860aac	selftests: fib_tests: Add ipv4 route add append replace tests Add IPv4 route tests covering add, append and replace permutations. Assumes the ability to add a basic single path route works; this is required for example when adding an address to an interface. $ fib_tests.sh -t ipv4_rt IPv4 route add / append tests TEST: Attempt to add duplicate route - gw [ OK ] TEST: Attempt to add duplicate route - dev only [ OK ] TEST: Attempt to add duplicate route - reject route [ OK ] TEST: Add new nexthop for existing prefix [ OK ] TEST: Append nexthop to existing route - gw [ OK ] TEST: Append nexthop to existing route - dev only [ OK ] TEST: Append nexthop to existing route - reject route [ OK ] TEST: Append nexthop to existing reject route - gw [ OK ] TEST: Append nexthop to existing reject route - dev only [ OK ] TEST: add multipath route [ OK ] TEST: Attempt to add duplicate multipath route [ OK ] TEST: Route add with different metrics [ OK ] TEST: Route delete with metric [ OK ] IPv4 route replace tests TEST: Single path with single path [ OK ] TEST: Single path with multipath [ OK ] TEST: Single path with reject route [ OK ] TEST: Single path with single path via multipath attribute [ OK ] TEST: Invalid nexthop [ OK ] TEST: Single path - replace of non-existent route [ OK ] TEST: Multipath with multipath [ OK ] TEST: Multipath with single path [ OK ] TEST: Multipath with single path via multipath attribute [ OK ] TEST: Multipath with reject route [ OK ] TEST: Multipath - invalid first nexthop [ OK ] TEST: Multipath - invalid second nexthop [ OK ] TEST: Multipath - replace of non-existent route [ OK ] Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:19 -04:00
David Ahern	f9a5a9d89f	selftests: fib_tests: Add ipv6 route add append replace tests Add IPv6 route tests covering add, append and replace permutations. Assumes the ability to add a basic single path route works; this is required for example when adding an address to an interface. $ fib_tests.sh -t ipv6_rt IPv6 route add / append tests TEST: Attempt to add duplicate route - gw [ OK ] TEST: Attempt to add duplicate route - dev only [ OK ] TEST: Attempt to add duplicate route - reject route [ OK ] TEST: Add new route for existing prefix (w/o NLM_F_EXCL) [ OK ] TEST: Append nexthop to existing route - gw [ OK ] TEST: Append nexthop to existing route - dev only [ OK ] TEST: Append nexthop to existing route - reject route [ OK ] TEST: Append nexthop to existing reject route - gw [ OK ] TEST: Append nexthop to existing reject route - dev only [ OK ] TEST: Add multipath route [ OK ] TEST: Attempt to add duplicate multipath route [ OK ] TEST: Route add with different metrics [ OK ] TEST: Route delete with metric [ OK ] IPv6 route replace tests TEST: Single path with single path [ OK ] TEST: Single path with multipath [ OK ] TEST: Single path with reject route [ OK ] TEST: Single path with single path via multipath attribute [ OK ] TEST: Invalid nexthop [ OK ] TEST: Single path - replace of non-existent route [ OK ] TEST: Multipath with multipath [ OK ] TEST: Multipath with single path [ OK ] TEST: Multipath with single path via multipath attribute [ OK ] TEST: Multipath with reject route [ OK ] TEST: Multipath - invalid first nexthop [ OK ] TEST: Multipath - invalid second nexthop [ OK ] TEST: Multipath - replace of non-existent route [ OK ] Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:19 -04:00
David Ahern	7df15e6c3e	selftests: fib_tests: Add option to pause after each test Add option to pause after each test before cleanup is done. Allows user to do manual inspection or more ad-hoc testing after each test with the setup in tact. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:19 -04:00
David Ahern	1c7447b4e8	selftests: fib_tests: Add command line options Add command line options for controlling pause on fail, controlling specific tests to run and verbose mode rather than relying on environment variables. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:19 -04:00
David Ahern	37ce42c14e	selftests: fib_tests: Add success-fail counts As more tests are added, it is convenient to have a tally at the end. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:18 -04:00
David Ahern	f34436a430	net/ipv6: Simplify route replace and appending into multipath route Bring consistency to ipv6 route replace and append semantics. Remove rt6_qualify_for_ecmp which is just guess work. It fails in 2 cases: 1. can not replace a route with a reject route. Existing code appends a new route instead of replacing the existing one. 2. can not have a multipath route where a leg uses a dev only nexthop Existing use cases affected by this change: 1. adding a route with existing prefix and metric using NLM_F_CREATE without NLM_F_APPEND or NLM_F_EXCL (ie., what iproute2 calls 'prepend'). Existing code auto-determines that the new nexthop can be appended to an existing route to create a multipath route. This change breaks that by requiring the APPEND flag for the new route to be added to an existing one. Instead the prepend just adds another route entry. 2. route replace. Existing code replaces first matching multipath route if new route is multipath capable and fallback to first matching non-ECMP route (reject or dev only route) in case one isn't available. New behavior replaces first matching route. (Thanks to Ido for spotting this one) Note: Newer iproute2 is needed to display multipath routes with a dev-only nexthop. This is due to a bug in iproute2 and parsing nexthops. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:18 -04:00
David Ahern	5a15a1b07c	mlxsw: spectrum_router: Add support for route append Handle append for gateway based routes. Dev-only multipath routes will be handled by a follow on patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-22 14:44:18 -04:00
Jacob Keller	3f76bdb437	i40e: use the more traditional 'i' loop variable Since we no longer use i as an array index for the data variable, replace the use of 'j' with 'i' so that we match the general loop variable name. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	ec29bbf8ab	i40e: add function doc headers for ethtool stats functions Add documentation for the i40e_get_stats_count, i40e_get_stat_strings and i40e_get_ethtool_stats explaining that the number and ordering of statistics must remain constant for a given netdevice. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	e08696dcd9	i40e: update data pointer directly when copying to the buffer A future patch is going to add a helper function i40e_add_ethtool_stats that will help lower the amount of boiler plate code in the i40e_get_ethtool_stats function. This conversion will take place over many patches, and the helper function will work by directly updating a reference to the data pointer. Since this would not work combined with the current method of accessing data like an array, update all the code that copies stats into the data buffer to use direct updates to the pointer instead of array accesses. This will prevent incorrect stat updates for patches in between the conversion. Similarly, when copying strings, we used a separate char p pointer. Instead, use the data pointer directly as it's already a (u8 ) type which is the same size. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	bf1c39e640	i40e: fold prefix strings directly into stat names We always prefix these stats with a fixed string, so just fold this prefix into the stat string definition. This preparatory work will make it easier to implement a helper function to copy stats and strings into the supplied buffers in a future patch. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	9b10df596b	i40e: use WARN_ONCE to replace the commented BUG_ON size check We don't really want to use BUG_ON here since that would completely crash the kernel, thus the reason we commented it out. We can't use BUILD_BUG_ON because at least now (a) the sizes aren't constant (we are fixing this) and (b) not all compilers are smart enough to understand that "p - data" is a constant. Instead, just use a WARN_ONCE so that the first time we end up with an incorrect size we will dump a stack trace and a message, hopefully highlighting the issues early in testing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	019b9cd44d	i40e: split i40e_get_strings() into smaller functions Split the statistic strings and private flags strings into their own separate functions to aid code readability. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	b831236509	i40e: always return all queue stat strings The ethtool API for obtaining device statistics is not intended to allow runtime changes in the number of statistics reported. It may appear this way, as there is an ability to request the number of stats using ethtool_get_set_count(). However, it is expected that this must always return the same value for invocations of the same device. If we don't satisfy this contract, and allow the number of stats to change during run time, we could cause invalid memory accesses or report the stat strings incorrectly. This is because the API for obtaining stats is to (1) get the size, (2) get the strings and finally (3) get the stats. Since these are each separate ethtool op commands, it is not possible to maintain consistency by holding the RTNL lock over the whole operation. This results in the potential for a race condition to occur where the size changed between any of the 3 calls. Avoid this issue by requiring that we always return the same value for a given device. We can check any values which remain constant for the life of the device, but must not report different sizes depending on runtime attributes. This patch specifically fixes the queue statistics to always return every queue even if it's not currently in use. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	9955d4940e	i40e: always return VEB stat strings The ethtool API for obtaining device statistics is not intended to allow runtime changes in the number of statistics reported. It may appear this way, as there is an ability to request the number of stats using ethtool_get_set_count(). However, it is expected that this must always return the same value for invocations of the same device. If we don't satisfy this contract, and allow the number of stats to change during run time, we could cause invalid memory accesses or report the stat strings incorrectly. This is because the API for obtaining stats is to (1) get the size, (2) get the strings and finally (3) get the stats. Since these are each separate ethtool op commands, it is not possible to maintain consistency by holding the RTNL lock over the whole operation. This results in the potential for a race condition to occur where the size changed between any of the 3 calls. Avoid this issue by requiring that we always return the same value for a given device. We can check any values which remain constant for the life of the device, but must not report different sizes depending on runtime attributes. This patch specifically fixes the VEB statistics strings to always be reported. Other issues will be fixed in future patches. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Jacob Keller	bdf2752305	i40e: free skb after clearing lock in ptp_stop Use the same logic to free the skb after clearing the Tx timestamp bit lock in i40e_ptp_stop as we use in the other locations. It is not as important here since we are not racing against a future Tx timestamp request (as we are disabling PTP at this point). However it is good to be consistent in how we approach the bit lock so that future callers don't copy the old anti-pattern. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-05-22 08:37:06 -07:00
Johannes Berg	f3a7ca6458	cfg80211: add missing kernel-doc Add the kernel-doc missed earlier. Fixes: `52539ca89f` ("cfg80211: Expose TXQ stats and parameters to userspace") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-05-22 11:32:43 +02:00
Daniel Borkmann	3fb48d881d	Merge branch 'bpf-fib-mtu-check' David Ahern says: ==================== Packets that exceed the egress MTU can not be forwarded in the fast path. Add IPv4 and IPv6 MTU helpers that take a FIB lookup result (versus the typical dst path) and add the calls to bpf_ipv{4,6}_fib_lookup. v2 - add ip6_mtu_from_fib6 to ipv6_stub - only call the new MTU helpers for fib lookups in XDP path; skb path uses is_skb_forwardable to determine if the packet can be sent via the egress device from the FIB lookup ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:51:11 +02:00
David Ahern	4f74fede40	bpf: Add mtu checking to FIB forwarding helper Add check that egress MTU can handle packet to be forwarded. If the MTU is less than the packet length, return 0 meaning the packet is expected to continue up the stack for help - eg., fragmenting the packet or sending an ICMP. The XDP path needs to leverage the FIB entry for an MTU on the route spec or an exception entry for a given destination. The skb path lets is_skb_forwardable decide if the packet can be sent. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:51:09 +02:00
David Ahern	901731b882	net/ipv6: Add helper to return path MTU based on fib result Determine path MTU from a FIB lookup result. Logic is based on ip6_dst_mtu_forward plus lookup of nexthop exception. Add ip6_dst_mtu_forward to ipv6_stubs to handle access by core bpf code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:51:09 +02:00
David Ahern	50d889b178	net/ipv4: Add helper to return path MTU based on fib result Determine path MTU from a FIB lookup result. Logic is a distillation of ip_dst_mtu_maybe_forward. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:51:09 +02:00
Daniel Borkmann	fd0bfa8d6e	Merge branch 'bpf-af-xdp-cleanups' Björn Töpel says: ==================== This the second follow-up set. The first four patches are uapi changes: * Removing rebind support * Getting rid of structure hole * Removing explicit cache line alignment * Stricter bind checks The last patches do some cleanups, where the umem and refcount_t changes were suggested by Daniel. * Add a missing write-barrier and use READ_ONCE for data-dependencies * Clean up umem and do proper locking * Convert atomic_t to refcount_t ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:08 +02:00
Björn Töpel	d3b42f1422	xsk: convert atomic_t to refcount_t Introduce refcount_t, in favor of atomic_t. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00
Björn Töpel	a49049ea25	xsk: simplified umem setup As suggested by Daniel Borkmann, the umem setup code was a too defensive and complex. Here, we reduce the number of checks. Also, the memory pinning is now folded into the umem creation, and we do correct locking. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00
Björn Töpel	37b076933a	xsk: add missing write- and data-dependency barrier Here, we add a missing write-barrier, and use READ_ONCE for the data-dependency barrier. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00
Björn Töpel	1c4917da36	samples/bpf: adapt xdpsock to the new uapi Adapt xdpsock to use the new getsockopt introduced in the previous commit. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00
Björn Töpel	b3a9e0be43	xsk: remove explicit ring structure from uapi In this commit we remove the explicit ring structure from the the uapi. It is tricky for an uapi to depend on a certain L1 cache line size, since it can differ for variants of the same architecture. Now, we let the user application determine the offsets of the producer, consumer and descriptors by asking the socket via getsockopt. A typical flow would be (Rx ring): struct xdp_mmap_offsets off; struct xdp_desc ring; u32 prod, cons; void map; ... getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen); map = mmap(NULL, off.rx.desc + NUM_DESCS * sizeof(struct xdp_desc), PROT_READ \| PROT_WRITE, MAP_SHARED \| MAP_POPULATE, sfd, XDP_PGOFF_RX_RING); prod = map + off.rx.producer; cons = map + off.rx.consumer; ring = map + off.rx.desc; Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-22 10:25:06 +02:00

... 3 4 5 6 7 ...

755347 Commits