linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-18 21:06:44 +07:00

Author	SHA1	Message	Date
Gabriel Krisman Bertazi	4cace675d6	bnx2x: Alloc 4k fragment for each rx ring buffer element The driver allocates one page for each buffer on the rx ring, which is too much on architectures like ppc64 and can cause unexpected allocation failures when the system is under stress. Now, we keep a memory pool per queue, and if the architecture's PAGE_SIZE is greater than 4k, we fragment pages and assign each 4k segment to a ring element, which reduces the overall memory consumption on such architectures. This helps avoiding errors like the example below: [bnx2x_alloc_rx_sge:435(eth1)]Can't alloc sge [c00000037ffeb900] [d000000075eddeb4] .bnx2x_alloc_rx_sge+0x44/0x200 [bnx2x] [c00000037ffeb9b0] [d000000075ee0b34] .bnx2x_fill_frag_skb+0x1ac/0x460 [bnx2x] [c00000037ffebac0] [d000000075ee11f0] .bnx2x_tpa_stop+0x160/0x2e8 [bnx2x] [c00000037ffebb90] [d000000075ee1560] .bnx2x_rx_int+0x1e8/0xc30 [bnx2x] [c00000037ffebcd0] [d000000075ee2084] .bnx2x_poll+0xdc/0x3d8 [bnx2x] (unreliable) Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 15:56:42 -07:00
Neil McKee	ccea74457b	openvswitch: include datapath actions with sampled-packet upcall to userspace If new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions in the upcall. This Directly associates the sampled packet with the path it takes through the virtual switch. Path information currently includes mangling, encapsulation and decapsulation actions for tunneling protocols GRE, VXLAN, Geneve, MPLS and QinQ, but this extension requires no further changes to accommodate datapath actions that may be added in the future. Adding path information enhances visibility into complex virtual networks. Signed-off-by: Neil McKee <neil.mckee@inmon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 15:05:40 -07:00
David S. Miller	bdef7de4b8	net: Add priority to packet_offload objects. When we scan a packet for GRO processing, we want to see the most common packet types in the front of the offload_base list. So add a priority field so we can handle this properly. IPv4/IPv6 get the highest priority with the implicit zero priority field. Next comes ethernet with a priority of 10, and then we have the MPLS types with a priority of 15. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 14:56:09 -07:00
Vaishali Thakkar	493be55ac3	xen-netfront: Use setup_timer Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e, func, da; @@ -init_timer (&e); +setup_timer (&e, func, da); -e.data = da; -e.function = func; Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 22:26:03 -07:00
David S. Miller	f7f35c0209	Merge branch 'rds-next' Sowmini Varadhan says: ==================== net/rds: SOL_RDS socket option to explicitly select transport Today the underlying transport (TCP or IB) for a PF_RDS socket is implicitly selected based on the local address used to bind(2) the PF_RDS socket. This results in some non-deterministic behavior when there are un-numbered and IPoIB interfaces sharing the same IP address. It also places the constraint that the IB interface must have an IP address (and thus, IPoIB) configured on it. The non-determinism may be avoided by providing the user-space application a socket option that allows it to explicitly select the transport prior to bind(2). Patch 1 of this series provides the constant definitions needed by the application via <linux/rds.h>. Patch 2 provides the setsockopt support, and Patch 3 provides the getsockopt support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:47:23 -07:00
Sowmini Varadhan	8ba38460f3	net/rds Add getsockopt support for SO_RDS_TRANSPORT The currently attached transport for a PF_RDS socket may be obtained from user space by invoking getsockopt(2) using the SO_RDS_TRANSPORT option at the SOL_RDS level. The integer optval returned will be one of the RDS_TRANS_* constants defined in linux/rds.h. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:47:23 -07:00
Sowmini Varadhan	d97dac54bf	net/rds: Add setsockopt support for SO_RDS_TRANSPORT An application may deterministically attach the underlying transport for a PF_RDS socket by invoking setsockopt(2) with the SO_RDS_TRANSPORT option at the SOL_RDS level. The integer argument to setsockopt must be one of the RDS_TRANS_* transport types, e.g., RDS_TRANS_TCP. The option must be specified before invoking bind(2) on the socket, and may only be used once on the socket. An attempt to set the option on a bound socket, or to invoke the option after a successful SO_RDS_TRANSPORT attachment, will return EOPNOTSUPP. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:47:23 -07:00
Sowmini Varadhan	a28c257c9e	net/rds: Declare SO_RDS_TRANSPORT and RDS_TRANS_* constants in uapi/linux/rds.h User space applications that desire to explicitly select the underlying transport for a PF_RDS socket may do so by using the SO_RDS_TRANSPORT socket option at the SOL_RDS level before bind(). The integer argument provided to the socket option would be one of the RDS_TRANS_* values, e.g., RDS_TRANS_TCP. This commit exports the constant values need by such applications via <linux/rds.h> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:47:23 -07:00
Vaishali Thakkar	f16e9d86ae	ethernet/intel: Use setup_timer Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.function = a; ... when != b = e4 -e1.data = b; Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:46:01 -07:00
Daniel Borkmann	3324b584b6	ebpf: misc core cleanup Besides others, move bpf_tail_call_proto to the remaining definitions of other protos, improve comments a bit (i.e. remove some obvious ones, where the code is already self-documenting, add objectives for others), simplify bpf_prog_array_compatible() a bit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:44:44 -07:00
Daniel Borkmann	17ca8cbf49	ebpf: allow bpf_ktime_get_ns_proto also for networking As this is already exported from tracing side via commit `d9847d310a` ("tracing: Allow BPF programs to call bpf_ktime_get_ns()"), we might as well want to move it to the core, so also networking users can make use of it, e.g. to measure diffs for certain flows from ingress/egress. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:44:44 -07:00
Vaishali Thakkar	a24c85abc0	isdn/capi: Use setup_timer Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.data = b; ... when != a = e4 -e1.function = a; Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:36:37 -07:00
Vaishali Thakkar	52e0b2b15b	net: dl2k: Use setup_timer Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e1, e2, e3, e4, a, b; @@ -init_timer(&e1); +setup_timer(&e1, a, b); ... when != a = e2 when != b = e3 -e1.data = b; ... when != a = e4 -e1.function = a; Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:34:38 -07:00
Vaishali Thakkar	12d5e6fd1d	net: mv643xx_eth: Use setup_timer Use the timer API function setup_timer instead of structure field assignments to initialize a timer. A simplified version of the Coccinelle semantic patch that performs this transformation is as follows: @change@ expression e, func, da; @@ -init_timer (&e); +setup_timer (&e, func, da); -e.data = da; -e.function = func; Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 21:23:28 -07:00
David S. Miller	d803731462	As we get closer to the merge window, here are a few more things for -next: * disconnect TDLS stations on CSA to avoid issues * fix a memory leak introduced in a recent commit * switch rfkill and cfg80211 to PM ops * in an unlikely scenario, prevent a bookkeeping value to get corrupted leading to dropped packets * fix a crash in VLAN assignment * switch rfkill-gpio to more modern gpiod API * send disconnected event to userspace with proper local/remote indication -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVaEyAAAoJEDBSmw7B7bqrPiIQAKOrX4g2UNtyoTWJzA7YRu+g GEUu/CE4LQKCodCpBiEhFlhQo2WzXsHoLj5+Nr56aFAZx19VZjXWVC5JS785wYn5 r8hpOVWUUA3MVnXeL/+yz4chm0wTYN9pSpElZ4FHlUI0OkCMh2rPCTvdrbSKoGzV MN8NEO0jVE89AgOMF8gHk5YKpJ6B4QibZuUuZpgkqdwIi5udaCcrPFFrUg/NfRpA nTauP6blFUPOUV0sxbhS78uC3rqGQuYsnvab/QeGc9PDKk5ukrXzFdgRCVZq8224 Ge0JcPzwzWldk892oEJoc2OfGkg5HOil9HtC+S2ehBGuK0yEXOBIkO1ZgudTH1kC 0rLOPWVKRzTWE+sq+gWK/OjfaA7Dl6HFYYHRQ2dhm1XkqtAw8SwGQMDSIPJYWr4O jp4gYpwKVjnMmsEAg7FdKWyIiTgLyI07VnIciORXDyefddYMuofXI2pJkfzUeFeH HjCVYm2NYXDty6uneP4RC1nUbNc53FKJ5O9fW3BPMyVXD4pTjam50p9H6N7OcDN3 k3dEevWiVgvBjZPVc3HI8RaCzS/Ww1ym+MYgV97QkMfgiuE2VkiFwK+zhWn9axbc eutkzFEdDcIACCZ74hIWqMJjsMnZm9E11Uq7tifAE0bi1Wpku1xPAnxMPnI+0eiF Dgo2bmlQ/d1dHr3N3FC0 =KmwY -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== As we get closer to the merge window, here are a few more things for -next: * disconnect TDLS stations on CSA to avoid issues * fix a memory leak introduced in a recent commit * switch rfkill and cfg80211 to PM ops * in an unlikely scenario, prevent a bookkeeping value to get corrupted leading to dropped packets * fix a crash in VLAN assignment * switch rfkill-gpio to more modern gpiod API * send disconnected event to userspace with proper local/remote indication ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 17:34:26 -07:00
David S. Miller	a9ab2184f4	Included changes: - checkpatch fixes - code cleanup - debugfs component is now compiled only if DEBUG_FS is selected - update copyright years - disable by default not-so-user-safe features -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJVaCjMAAoJEOb/4TMchkvfQS0QANDW/0eOT8azlbik5+MZTC5i d+K1Xbc7Qn7ebo5F27eRGNrgV5a8Wwx1JUCANXhAfjSURItj3KoHbjLYN2lJLn5L mBoU7IWwqUzX2garm7xKm94TTaN3Q6t/NGYVeQqJXNcWBDJQcNAr7ECg8tpV16Ec +o6FPsuZBX1dKNijvcy77VNGAaauhAbfMuAYRJDx6CtCIyWg+f/vcAeTR2PCmbMD FP2qD2zHBnR5feQF9YtrCOUHX3SzKlnCBQ1DyUzWbC40eGJWQPZiml+CC0r7fNrI buOlk2yDI1Pc0/TIDrm3B3f0LqoQhmC4h0EDP/tazoiHAe/Vh06D4dmsC81XBM+H 9wEzU+C20DUjDVIyTzboIDjcSNwTN5TxK0dG72vc+yDfSSAmJVtLQ8dqQevRp6cd NPVebjCyJKXoBZWd1o7KO0s41dTbFBVHrA5ZLaEu5TcCMpKHzicJJyMr+OLgqTQE tqLMzqR+7VPmJfIwXuHX+wqHlsJCkrU1zyiuOyBn6uQ4rvbg503eadJffOAaLeCH FpOtKkQ34HNDUchgmiFVWWV1w6r3Si3/a7WRJN55B49sIZqJxxQfB2Evlk8vYNzT sVDFsNk8QnbaL2yCwxJEXj/Kgyfxj/PLAoxDnkt+cHWOF6nbGPHyIdDJQGSAHFrp NcZisqImn5iJS+2QV68a =2UBm -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Antonio Quartulli says: ==================== Included changes: - checkpatch fixes - code cleanup - debugfs component is now compiled only if DEBUG_FS is selected - update copyright years - disable by default not-so-user-safe features ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 01:07:06 -07:00
Alexei Starovoitov	abf2e7d6e2	bpf: add missing rcu protection when releasing programs from prog_array Normally the program attachment place (like sockets, qdiscs) takes care of rcu protection and calls bpf_prog_put() after a grace period. The programs stored inside prog_array may not be attached anywhere, so prog_array needs to take care of preserving rcu protection. Otherwise bpf_tail_call() will race with bpf_prog_put(). To solve that introduce bpf_prog_put_rcu() helper function and use it in 3 places where unattached program can decrement refcnt: closing program fd, deleting/replacing program in prog_array. Fixes: `04fd61ab36` ("bpf: allow bpf programs to tail-call other bpf programs") Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:27:51 -07:00
David S. Miller	d1f5f2bb91	Merge branch 'hv_netvsc-next' K. Y. Srinivasan says: ==================== hv_netvsc: Implement NUMA aware memory allocation Allocate both receive buffer and send buffer from the NUMA node assigned to the primary channel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:23:11 -07:00
K. Y. Srinivasan	5defde5946	hv_netvsc: Allocate the sendbuf in a NUMA aware way Allocate the send buffer in a NUMA aware way. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:23:03 -07:00
K. Y. Srinivasan	0a726c2b49	hv_netvsc: Allocate the receive buffer from the correct NUMA node Allocate the receive bufer from the NUMA node assigned to the primary channel. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:22:56 -07:00
Wang Long	282c320d33	netevent: remove automatic variable in register_netevent_notifier() Remove automatic variable 'err' in register_netevent_notifier() and return the result of atomic_notifier_chain_register() directly. Signed-off-by: Wang Long <long.wanglong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:03:21 -07:00
David S. Miller	583d3f5af2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next, they are: 1) default CONFIG_NETFILTER_INGRESS to y for easier compile-testing of all options. 2) Allow to bind a table to net_device. This introduces the internal NFT_AF_NEEDS_DEV flag to perform a mandatory check for this binding. This is required by the next patch. 3) Add the 'netdev' table family, this new table allows you to create ingress filter basechains. This provides access to the existing nf_tables features from ingress. 4) Kill unused argument from compat_find_calc_{match,target} in ip_tables and ip6_tables, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-31 00:02:30 -07:00
David S. Miller	5289e4a03f	Merge branch 'systemport-next' Florian Fainelli says: ==================== net: systemport: misc improvements These patches are highly inspired by changes from Petri on bcmgenet, last patch is a misc fix that I had pending for a while, but is not a candidate for 'net' at this point. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:51:37 -07:00
Florian Fainelli	25977ac77d	net: systemport: Add a check for oversized packets Occasionnaly we may get oversized packets from the hardware which exceed the nomimal 2KiB buffer size we allocate SKBs with. Add an early check which drops the packet to avoid invoking skb_over_panic() and move on to processing the next packet. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:51:25 -07:00
Florian Fainelli	c73b01837e	net: systemport: rewrite bcm_sysport_rx_refill Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx packet processing loop, after the current Rx packet has already been passed to napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists. To eliminate this situation: 1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped. In this case, the data on the current Rx skb is effectively dropped. 2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the top of Rx packet processing loop, so that the new replacement Rx skb is already in place before the current Rx skb is processed. This is loosely inspired from `d6707bec59` ("net: bcmgenet: rewrite bcmgenet_rx_refill()") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:51:17 -07:00
Florian Fainelli	baf387a8ed	net: systemport: Pre-calculate and utilize cb->bd_addr There is a 1:1 mapping between the software maintained control block in priv->rx_cbs and the buffer address in priv->rx_bds, such that there is no need to keep computing the buffer address when refiling a control block. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:51:09 -07:00
Julia Lawall	3d2f6d41d1	ipv6: drop unneeded goto Delete jump to a label on the next line, when that label is not used elsewhere. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ identifier l; @@ -if (...) goto l; -l: // </smpl> Also remove the unnecessary ret variable. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:48:36 -07:00
Arnd Bergmann	5e9615bfb9	net: thunderx: add 64-bit dependency The thunderx ethernet driver fails to build on architectures that do not have an atomic readq() and writeq() function for 64-bit PCI bus access: drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function 'bgx_reg_read': include/asm-generic/io.h:195:23: error: implicit declaration of function 'readq' [-Werror=implicit-function-declaration] It seems impossible to get this driver to work on most 32-bit hardware, so it's better to add an explicit dependency, in order to let us keep building 'allmodconfig' kernels on all architectures. As the driver is meant for the internal hardware on an arm64 SoC, this is not a problem for usability. Allowing the build on all 64-bit architectures rather than just CONFIG_ARM64 on the other hand means that we get the benefit of build testing on x86. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:38:49 -07:00
David S. Miller	d9dca9cb26	Merge branch 'mlx4-next' Or Gerlitz says: ==================== mlx4 driver update, May 28, 2015 The 1st patch fixes an issue with a function running DPDK overriding broadcast steering rules set by other functions. Please add this one to your -stable queue. The rest of the series from Matan and Ido deals with scaling the number of IRQs that serve RoCE applications to be in par with the Ethernet driver. changes from V0: - addressed feedback from Sergei, removed extra blank line in patch #4 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:35 -07:00
Matan Barak	6d90aa5cf1	net/mlx4_core: Make sure there are no pending async events when freeing CQ When freeing a CQ, we need to make sure there are no asynchronous events (on the ASYNC EQ) that could relate to this CQ before freeing it. This is done by introducing synchronize_irq. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Ido Shamay	de1618034a	net/mlx4_core: Move affinity hints to mlx4_core ownership Now that EQs management is in the sole responsibility of mlx4_core, the IRQ affinity hints configuration should be in its hands as well. request_irq is called only once by the first consumer (maybe mlx4_ib), so mlx4_en passes the affinity mask too late. We also need to request vectors according to the cores we want to run on. mlx4_core distribution of IRQs to cores is straight forward, EQ(i)->IRQ will set affinity hint to core i. Consumers need to request EQ vectors, according to their cores considerations (NUMA). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Matan Barak	c66fa19c40	net/mlx4: Add EQ pool Previously, mlx4_en allocated EQs and used them exclusively. This affected RoCE performance, as applications which are events sensitive were limited to use only the legacy EQs. Change that by introducing an EQ pool. This pool is managed by mlx4_core. EQs are assigned to ports (when there are limited number of EQs, multiple ports could be assigned to the same EQs). An exception to this rule is the ASYNC EQ which handles various events. Legacy EQs are completely removed as all EQs could be shared. When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for EQ serving on a specific port. The core driver calculates which EQ should be assigned to that request. Because IRQs are shared between IB and Ethernet modules, their names only include the PCI device BDF address. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Matan Barak	48564135cb	net/mlx4_core: Demote simple multicast and broadcast flow steering rules In SRIOV, when simple (i.e - Ethernet L2 only) flow steering rules are created, always create them at MLX4_DOMAIN_NIC priority (instead of the real priority the function created them at). This is done in order to let multiple functions add broadcast/multicast rules without affecting other functions, which is necessary for DPDK in SRIOV. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
David S. Miller	9d52bf0a23	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-05-28 Here's a set of patches intended for 4.2. The majority of the changes are on the 802.15.4 side of things rather than Bluetooth related: - All sorts of cleanups & fixes to ieee802154 and related drivers - Rework of tx power support in ieee802154 and its drivers - Support for setting ieee802154 tx power through nl802154 - New IDs for the btusb driver - Various cleanups & smaller fixes to btusb - New btrtl driver for Realtec devices - Fix suspend/resume for Realtek devices Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:26:45 -07:00
David S. Miller	1dcf3ac49f	Merge branch 'mlx5-next' Amir Vadai says: ==================== net/mlx5: ConnectX-4 100G Ethernet driver This patchset extends the mlx5_core driver to support Ethernet functionality. The Ethernet functionality in the mlx5 driver is integrated into the core driver and not as separated driver. The IB functionality remains in the mlx5_ib driver as before. This functionality will enable the Ethernet capability of Mellanox's new famility of cards - ConnectX-4. Due to the fact that backword compatability is being kept, existing Connect-IB cards that are using this driver are fully working with the modified driver, and no issues with current deployments should be seen. Like the ConnectX-3 cards, ConnectX-4 is a VPI (Virtual Port Interface - every port can be configured as Infiniband or Ethernet) card. Unlike previous generations, the ConnectX-4 has a separate PCI function per port. The current code has a limitation that Infiniband and Ethernet port types are mutually exclusive. When the driver is compiled with Ethernet support, the Infiniband functionality is disabled and vice versa. To control that we added the CONFIG_MLX5_CORE_EN config directive which is 'n' by default, but can be changed by the user. This limitation is short-lived and would be addressed soon. As part of this patchset, mlx5_ifc.h was heavily modified [1]. This file is now generated automatically from the device specification document. Since this patch is too big for the mail server, it might be missing in the mailing list, but could be pulled from an external git repository [2]. irq name selection is done at driver initialization and doesn't contain the interface name as part of the irq name. irq_balancer will still work thanks to an improvement introduced by Neil Horman [3] to use sysfs instead of /proc/interrupts. Patchset was applied on top of commit `ed2dfd9` ("tcp/dccp: warn user for preferred ip_local_port_range") [1] - Patch 4/11 ("net/mlx5_core: HW data structs/types definitions preparation for mlx5 ehternet driver") [2] - http://git.openfabrics.org/?p=~amirv/linux.git;a=shortlog;h=refs/heads/mlx5e_v1 [3] - kernel: `da8d1c8` PCI/sysfs: add per pci device msi[x] irq listing (v5) irq_balancer: 32a7757 Complete rework of how we detect and classify irqs Thanks to Achiad, Saeed, Yevheny, Or and the whole team for making this happen, Amir Changes from V4: - Removed Patch 3/12: net/mlx5_core: Add EQ renaming mechanism - Patch 12/12: net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality - irq name is created on driver initialization, therefore it won't contain the network interface name in it. This won't effect irq_balancer thanks to patches introduced by Neil Horman to use sysfs instead of /proc/interrupts. Changes from V3: - PATCH 8/11: net/mlx5_core: Set/Query port MTU commands - Return value directly - no need for err. Changes from V2: - Improved changelogs and cover-letter - Added CONFIG_MLX5_EN to disable/enable the Ethernet functionality - Moved en.h and wq.[ch] into the patch with data-path related code Changes from V1: - Added patch 1/12 ("net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory") Changes from V0: - Removed V0 Patch 1/11 ("net/mlx5_core: Virtually extend work/completion queue buffers by one page") due to misuse of DMA API. Thanks Dave. - Patch 1/11 ("net/mlx5_core: Set irq affinity hints"): - Use kcalloc instead of kzalloc - Fix build error when CONFIG_CPUMASK_OFFSTACK=n. Driver loading will fail now if cpumask allocation is failing. - Using dev_to_node helper. Thanks, Ido. - Patch 3/11 ("net/mlx5_core: HW data structs/types definitions preparation for mlx5 ehternet driver") - Removed Mellanox internal comment at the head of the file. Thanks Joe - Patch 6/11 ("net/mlx5_core: Implement get/set port status") - Use direct return of function's result. Thanks Sergei. - Added Patch 8/11 ("net/mlx5_core: Set/Query port MTU commands") - Patch 9/11 ("net/mlx5: Ethernet Datapath files") - Use rq->wqe_sz instead of skb_end_offset. Thanks Ido. - Use dma_wmb() when possible instead of wmb(). Thanks Alex. - Fix checkpatch issues - Patch 10/11 ("net/mlx5: ethernet resources handling") - checkpatch issues - Added missing include - Patch 11/11 ("net/mlx5: Ethernet driver") - checkpatch issues - fixed typo - Modified use of affinity hint - Using dev_to_node helper. Thanks, Ido. - Use new hardware commands from Patch 8/11 ("net/mlx5_core: Set/Query port MTU commands") to get/set port MTU in hardware. - Removed NETIF_F_SG since hardware ring wraparound is not supported - Use dma_wmb() when possible instead of wmb(). Thanks Alex. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:25:03 -07:00
Amir Vadai	f62b8bb8f2	net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality This is the Ethernet part of the driver for the Mellanox ConnectX(R)-4 Single/Dual-Port Adapter supporting 100Gb/s with VPI. The driver extends the existing mlx5 driver with Ethernet functionality. This patch contains the driver entry points but does not include transmit and receive (see the previous patch in the series) routines. It also adds the option MLX5_CORE_EN to Kconfig to enable/disable the Ethernet functionality. Currently, Kconfig is programmed to make Ethernet and Infiniband functionality mutally exclusive. Also changed MLX5_INFINIBAND to be depandant on MLX5_CORE instead of selecting it, since MLX5_CORE could be selected without MLX5_INFINIBAND being selected. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:51 -07:00
Amir Vadai	afb736e933	net/mlx5: Ethernet resource handling files This patch contains the resource handling files: - flow_table.c: This file contains the code to handle the low level API to configure hardware flow table. It is separated from the flow_table_en.c, because it will be used in the future by Raw Ethernet QP in mlx5_ib too. - en_flow_table.[ch]: Ethernet flow steering handling. The flow table object contain a mapping between flow specs and TIRs. This mechanism will be used also to configure e-switch in the future, when SR-IOV support will be added. - transobj.[ch] - Low level functions to create/modify/destroy the transport objects: RQ/SQ/TIR/TIS - vport.[ch] - Handle attributes of a virtual port (vPort) in the embedded switch. Currently this switch is a passthrough, until SR-IOV support will be added. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:39 -07:00
Amir Vadai	e586b3b0ba	net/mlx5: Ethernet Datapath files en_[rt]x.c contains the data path related code specific to tx or rx. en_txrx.c contains data path code which is common for both the rx and tx, this is mainly napi related code. Below are the objects that are being used by the hardware and the driver in the data path: Channel - one channel per IRQ. Every channel object contains: RQ - describes the rx queue TIR - One TIR (Transport Interface Receive) object per flow type. TIR contains attributes for a type of rx flow (e.g IPv4, IPv6 etc). A flow is defined in the Flow Table. Currently TIR describes the RSS hash parameters if exists and LRO attributes. SQ - describes the a tx queue. There is one SQ (Send Queue) per TC (traffic class). TIS - There is one TIS (Transport Interface Send) per TC. It describes the TC and may later be extended to describe more transport properties. Both RQ and SQ inherit from the object WQ (work queue). This common code to describe the layout of CQE's WQE's in memory is in the files wq.[cj] For every channel there is one NAPI context that is used for RX and for TX. Driver is using netdev_alloc_skb() to allocate skb's. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:20 -07:00
Saeed Mahameed	e725440e75	net/mlx5_core: Set/Query port MTU commands Introduce set/Query low level functions to access MTU in hardware. To be used by the netdev. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:08 -07:00
Rana Shahout	90b3e38d04	net/mlx5_core: Modify CQ moderation parameters Introduce mlx5_core_modify_cq_moderation() to be used by the netdev, to set hardware coalescing. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:59 -07:00
Rana Shahout	4c916a7980	net/mlx5_core: Implement get/set port status Implemet get/set port status low level functions to be exposed by the netdev. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:46 -07:00
Saeed Mahameed	adb0c9545b	net/mlx5_core: Implement access functions of ptys register fields Those registers will be used by the ethtool to set/get settings. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:31 -07:00
Saeed Mahameed	938fe83c8d	net/mlx5_core: New device capabilities handling - Query all supported types of dev caps on driver load. - Store the Cap data outbox per cap type into driver private data. - Introduce new Macros to access/dump stored caps (using the auto generated data types). - Obsolete SW representation of dev caps (no need for SW copy for each cap). - Modify IB driver to use new macros for checking caps. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:22 -07:00
Saeed Mahameed	e281682bf2	net/mlx5_core: HW data structs/types definitions cleanup mlx5_ifc.h was heavily modified here since it is now generated by a script from the device specification (PRM rev 0.25). This specification is backward compatible to existing hardware. Some structures/fields were added here in order to enable the Ethernet functionality of the driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:11 -07:00
Saeed Mahameed	db058a186f	net/mlx5_core: Set irq affinity hints Preparation for upcoming ethernet driver. - Move msix array from eq_table struct to priv since its not related to eq_table - Intorduce irq_info struct to hold all irq information - Move name from mlx5_eq to irq_info struct since it is irq property. - Set IRQ affinity hints Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:22:48 -07:00
Amir Vadai	64ffaa2159	net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory As David Daney pointed in mlx4_core driver [1], mlx5_core is also misusing the DMA-API. This patch is removing the code that vmap() memory allocated by dma_alloc_coherent(). After this patch, users of this drivers might fail allocating resources on memory fragmeneted systems. This will be fixed later on. [1] - https://patchwork.ozlabs.org/patch/458531/ CC: David Daney <david.daney@cavium.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:22:37 -07:00
David S. Miller	8ed9b5e1c8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-05-28 This series contains updates to ethtool, ixgbe, i40e and i40evf. John adds helper routines for ethtool to pass VF to rx_flow_spec. Since the ring_cookie is 64 bits wide which is much larger than what could be used for actual queue index values, provide helper routines to pack a VF index into the cookie. Then John provides a ixgbe patch to allow flow director to use the entire queue space. Neerav provides a i40e patch to collect XOFF Rx stats, where it was not being collected before. Anjali provides ATR support for tunneled packets, as well as stats to count tunnel ATR hits. Cleaned up PF struct members which are unnecessary, since we can use the stat index macro directly. Cleaned up flow director ATR/SB messages to a higher debug level since they are not useful unless silicon validation is happening. Greg provides a patch to disable offline diagnostics if VFs are enabled since ethtool offline diagnostic tests are not designed (out of scope) to disable VF functions for testing and re-enable afterward. Also cleans up TODO comment that is no longer needed. Vasu provides a fix an FCoE EOF case where i40e_fcoe_ctxt_eof() maybe called before i40e_fcoe_eof_is_supported() is called. Jesse adds skb->xmit_more support for i40evf. Then provides a performance enhancement for i40evf by inlining some functions which provides a 15% gain in small packet performance. Also cleans up the use of time_stamp since it is no longer used to determine if there is a tx_hang and was a part of a previous tx_hang design which is no longer used. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:09:58 -07:00
Ying Xue	1ea23a2117	tipc: unconditionally put sock refcnt when sock timer to be deleted is pending As sock refcnt is taken when sock timer is started in sk_reset_timer(), the sock refcnt should be put when sock timer to be deleted is in pending state no matter what "probing_state" value of tipc sock is. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:08:37 -07:00
Vivien Didelot	f4fb874cf0	if_vlan: fix vlaue -> value typo Fixes "vlaue" for "value" in include/linux/if_vlan.h. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 17:52:28 -07:00
Alexei Starovoitov	37e82c2f97	bpf: allow BPF programs access skb->skb_iif and skb->dev->ifindex fields classic BPF already exposes skb->dev->ifindex via SKF_AD_IFINDEX extension. Allow eBPF program to access it as well. Note that classic aborts execution of the program if 'skb->dev == NULL' (which is inconvenient for program writers), whereas eBPF returns zero in such case. Also expose the 'skb_iif' field, since programs triggered by redirected packet need to known the original interface index. Summary: __skb->ifindex -> skb->dev->ifindex __skb->ingress_ifindex -> skb->skb_iif Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 17:51:13 -07:00

1 2 3 4 5 ...

520694 Commits