linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Jack Morgenstein	6ee51a4e86	mlx4: Adjust QP1 multiplexing for RoCE/SRIOV This requires the following modifications: 1. Fix build_mlx4_header to properly fill in the ETH fields 2. Adjust mux and demux QP1 flow to support RoCE. This commit still assumes only one GID per slave for RoCE. The commit enabling multiple GIDs is a subsequent commit, and is done separately because of its complexity. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:57:12 -04:00
David S. Miller	36f6fdb749	Merge branch 'tipc' Jon Maloy says: ==================== tipc: simplifications in socket and port layer After the removal of the tipc native API the relation between tipc_port and its API types is strictly one-to-one, i.e, the latter can now only be a socket API. This change opens up for simplifications both in the code, data and locking structure. We start with this series, where we ensure that port and socket structures are co-allocated. Note that the first commit in the series is unrelated to the above. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:54 -04:00
Jon Paul Maloy	5c311421a2	tipc: eliminate redundant lookups in registry As an artefact from the native interface, the message sending functions in the port take a port ref as first parameter, and then look up in the registry to find the corresponding port pointer. This despite the fact that the only currently existing caller, tipc_sock, already knows this pointer. We change the signature of these functions to take a struct tipc_port* argument, and remove the redundant lookups. We also remove an unmotivated extra lookup in the function socket.c:auto_connect(), and, as the lookup functions tipc_port_deref() and ref_deref() now become unused, we remove these two functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	58ed944241	tipc: align usage of variable names and macros in socket The practice of naming variables in TIPC is inconsistent, sometimes even within the same file. In this commit we align variable names and declarations within socket.c, and function and macro names within socket.h. We also reduce the number of conversion macros to two, in order to make usage less obscure. These changes are purely cosmetic. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	3b4f302d85	tipc: eliminate redundant locking The three functions tipc_portimportance(), tipc_portunreliable() and tipc_portunreturnable() and their corresponding tipc_set* functions, are all grabbing port_lock when accessing the targeted port. This is unnecessary in the current code, since these calls only are made from within socket downcalls, already protected by sock_lock. We remove the redundant locking. Also, since the functions now become trivial one-liners, we move them to port.h and make them inline. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	24be34b5a0	tipc: eliminate upcall function pointers between port and socket Due to the original one-to-many relation between port and user API layers, upcalls to the API have been performed via function pointers, installed in struct tipc_port at creation. Since this relation now always is one-to-one, we can instead use ordinary function calls. We remove the function pointers 'dispatcher' and 'wakeup' from struct tipc_port, and replace them with calls to the renamed functions tipc_sk_rcv() and tipc_sk_wakeup(). At the same time we change the name and signature of the functions tipc_createport() and tipc_deleteport() to reflect their new role as mere initialization/destruction functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	8826cde655	tipc: aggregate port structure into socket structure After the removal of the tipc native API the relation between a tipc_port and its API types is strictly one-to-one, i.e, the latter can now only be a socket API. There is therefore no need to allocate struct tipc_port and struct sock independently. In this commit, we aggregate struct tipc_port into struct tipc_sock, hence saving both CPU cycles and structure complexity. There are no functional changes in this commit, except for the elimination of the separate allocation/freeing of tipc_port. All other changes are just adaptations to the new data structure. This commit also opens up for further code simplifications and code volume reduction, something we will do in later commits. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	f9fef18c6d	tipc: remove redundant 'peer_name' field in struct tipc_sock The field 'peer_name' in struct tipc_sock is redundant, since this information already is available from tipc_port, to which tipc_sock has a reference. We remove the field, and ensure that peer node and peer port info instead is fetched via the functions that already exist for this purpose. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Jon Paul Maloy	978813ee89	tipc: replace reference table rwlock with spinlock The lock for protecting the reference table is declared as an RWLOCK, although it is only used in write mode, never in read mode. We redefine it to become a spinlock. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:53:49 -04:00
Steffen Klassert	4a93f5095a	flowcache: Fix resource leaks on namespace exit. We leak an active timer, the hotcpu notifier and all allocated resources when we exit a namespace. Fix this by introducing a flow_cache_fini() function where we release the resources before we exit. Fixes: `ca925cf153` ("flowcache: Make flow cache name space aware") Reported-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Jakub Kicinski <moorray3@wp.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:31:18 -04:00
Joe Perches	1f36fc74d8	lg-vl600: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	ceffc4acfc	xilinx: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	b779d0afcc	brocade: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	184593c734	tipc: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	ec633eb5ff	ieee802154: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	2b8837aeaa	net: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Joe Perches	f0e78826e4	8021q: Convert uses of __constant_<foo> to <foo> The use of __constant_<foo> has been unnecessary for quite awhile now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:28:06 -04:00
Claudiu Manoil	b338ce270e	gianfar: Fix multi-queue support checks @probe() priv is not instantiated at gfar_of_init() time, when parsing the DT for info on supported HW queues. Before the netdev can be allocated, the number of supported queues must be known. Because the number of supported queues depends on device type, move the compatibility checks before netdev allocation. Local vars are used to hold the operation mode info before netdev allocation. This fixes the null accesses for priv->.., in gfar_of_init. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 00:47:52 -04:00
hayeswang	4f1d4d54f9	r8152: support dumping the hw counters Add dumping the tally counter by ethtool. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 00:09:09 -04:00
Thomas Stilwell	48d5dbaf94	ieee802154: at86rf230: add support for rf233 chip The rf233 and rf231 are sufficiently similar that we can treat rf233 like rf231. rf233 is missing some features that rf231 has, but we don't currently make use of them so there's nothing to handle differently yet. Should we add support in the future for rf231 *_NOCLK or SLEEP states, or PAD_IO drive strength, exceptions will need to be made for rf233. Signed-off-by: Thomas Stilwell <stilwellt@openlabs.co> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 00:05:36 -04:00
David S. Miller	62cf4be989	Merge branch 'pkt_sched_cond_resched' Eric Dumazet says: ==================== pkt_sched: allow scheduling points We have seen delays of more than 50ms in class or qdisc dumps, in case device is under high TX stress, even with the prior 4KB per skb limit. With the new 16KB limit, this could translate to 200ms delays. Add cond_resched() to give a chance to higher prio tasks to get cpu. But before doing so, we need to remove the rcu locking from tc_dump_qdisc() as David spotted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 23:54:56 -04:00
Eric Dumazet	fba373d2bb	pkt_sched: add cond_resched() to class and qdisc dump We have seen delays of more than 50ms in class or qdisc dumps, in case device is under high TX stress, even with the prior 4KB per skb limit. Add cond_resched() to give a chance to higher prio tasks to get cpu. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 23:54:23 -04:00
Eric Dumazet	15dc36ebbb	pkt_sched: do not use rcu in tc_dump_qdisc() Like all rtnetlink dump operations, we hold RTNL in tc_dump_qdisc(), so we do not need to use rcu protection to protect list of netdevices. This will allow preemption to occur, thus reducing latencies. Following patch adds explicit cond_resched() calls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 23:54:23 -04:00
stephen hemminger	a19a7ec8fc	bonding: force cast of IP address in options The option code is taking IP address and putting it into a generic container. Force cast to silence sparse warnings. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 16:37:14 -04:00
stephen hemminger	693350c2ff	netdev: set __percpu attribute on netdev_alloc_pcpu_stats This patch fixes sparse warnings in vlan driver. It propagates the sparse __percpu attribute from alloc_percpu into netdev_alloc_pcpu_stats. I expect it may trigger additional sparse warnings from other drivers that are missing the __percpu attribute. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 16:37:14 -04:00
Li RongQing	090f1166c6	ipv6: ip6_forward: perform skb->pkt_type check at the beginning Packets which have L2 address different from ours should be already filtered before entering into ip6_forward(). Perform that check at the beginning to avoid processing such packets. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-11 00:37:42 -04:00
hayeswang	fcb308d529	r8152: add skb_cow_head Call skb_cow_head() before editing the tx packet header. The header would be reallocated if it is shared. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 22:23:00 -04:00
Tobias Klauser	8dc43ddc9f	net: eth: cpsw: Use net_device_stats from struct net_device Instead of using an own copy of struct net_device_stats in struct cpsw_priv, use stats from struct net_device. Also remove the thus unnecessary .ndo_get_stats function, as it just returns dev->stats, which is the default. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 21:53:01 -04:00
Eric Dumazet	d32d9bb85c	flowcache: restore a single flow_cache kmem_cache It is not legal to create multiple kmem_cache having the same name. flowcache can use a single kmem_cache, no need for a per netns one. Fixes: `ca925cf153` ("flowcache: Make flow cache name space aware") Reported-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Fan Du <fan.du@windriver.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 21:45:11 -04:00
Gu Zheng	5812521be0	net: add a pre-check of net_ns in sk_change_net() We do not need to switch the net_ns if the target net_ns is the same as the current one, so here we add a pre-check of net_ns to avoid this as David suggested. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 16:29:48 -04:00
Eric Dumazet	431a91242d	tcp: timestamp SYN+DATA messages All skb in socket write queue should be properly timestamped. In case of FastOpen, we special case the SYN+DATA 'message' as we queue in socket write queue the two fallback skbs: 1) SYN message by itself. 2) DATA segment by itself. We should make sure these skbs have proper timestamps. Add a WARN_ON_ONCE() to eventually catch future violations. Fixes: `740b0f1841` ("tcp: switch rtt estimations to usec resolution") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 16:15:54 -04:00
Haiyang Zhang	99d3016de4	hyperv: Change the receive buffer size for legacy hosts Due to a bug in the Hyper-V host version 2008R2, we need to use a slightly smaller receive buffer size, otherwise the buffer will not be accepted by the legacy hosts. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 16:11:26 -04:00
Alexander Aring	3772ab1d37	6lowpan: reassembly: fix access of ctl table entry Correct offset is 3 of the 6lowpanfrag_max_datagram_size value in proc entry ctl table and not 2. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 16:03:03 -04:00
David S. Miller	e3ca64948b	Merge branch 'hyperv-next' K. Y. Srinivasan says: ==================== Drivers: net: hyperv: Enable various offloads This patch set enables both checksum as well as segmentation offload. As part of this effort I have enabled scatter gather I/O as well. In version 2 of these patches, I addressed comments from David Miller and Dan Carpenter. In this version I have addressed the latest comments from David Miller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:52:17 -04:00
KY Srinivasan	77bf548794	Drivers: net: hyperv: Enable large send offload Enable segmentation offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:37 -04:00
KY Srinivasan	08cd04bf6d	Drivers: net: hyperv: Enable send side checksum offload Enable send side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:37 -04:00
KY Srinivasan	e3d605ed44	Drivers: net: hyperv: Enable receive side IP checksum offload Enable receive side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:37 -04:00
KY Srinivasan	4a0e70ae5e	Drivers: net: hyperv: Enable offloads on the host Prior to enabling guest side offloads, enable the offloads on the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:37 -04:00
KY Srinivasan	8a00251a36	Drivers: net: hyperv: Cleanup the send path In preparation for enabling offloads, cleanup the send path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:37 -04:00
KY Srinivasan	54a7357f7a	Drivers: net: hyperv: Enable scatter gather I/O Cleanup the code and enable scatter gather I/O. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:51:36 -04:00
Tim Harvey	3ee2f8ce1a	sky2: allow mac to come from dt The driver reads the mac address from the device registers which would need to have been programmed by the bootloader. This patch adds the ability to pull the mac from devicetree via the pci device dt node. Signed-off-by: Tim Harvey <tharvey@gateworks.com> Cc: netdev@vger.kernel.org Cc: devicetree@vger.kernel.org Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Changes since v2: - eliminated use of stack tmpaddr per feedback Changes since v1: - simplified based on feedback - fixed formatting Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:40:30 -04:00
Eric Dumazet	746e349980	l2tp: fix unused variable warning net/l2tp/l2tp_core.c:1111:15: warning: unused variable 'sk' [-Wunused-variable] Fixes: `31c70d5956` ("l2tp: keep original skb ownership") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:32:24 -04:00
Kleber Sacilotto de Souza	c120e9e030	IB/mlx5_core: remove unreachable function call in module init The call to mlx5_health_cleanup() in the module init function can never be reached. Removing it. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 15:23:22 -04:00
Eric Dumazet	9063e21fb0	netlink: autosize skb lengthes One known problem with netlink is the fact that NLMSG_GOODSIZE is really small on PAGE_SIZE==4096 architectures, and it is difficult to know in advance what buffer size is used by the application. This patch adds an automatic learning of the size. First netlink message will still be limited to ~4K, but if user used bigger buffers, then following messages will be able to use up to 16KB. This speeds up dump() operations by a large factor and should be safe for legacy applications. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 13:56:26 -04:00
Edward Cree	cd84ff4da1	sfc: Use ether_addr_copy and eth_broadcast_addr Faster than memcpy/memset on some architectures. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 13:53:37 -04:00
David S. Miller	19433646fe	Merge branch 'gianfar-next' Claudiu Manoil says: ==================== gianfar: Tx timeout issue There's an older Tx timeout issue showing up on etsec2 devices with 2 CPUs. I pinned this issue down to processing overhead incurred by supporting multiple Tx/Rx rings, as explained in the 2nd patch below. But before this, there's also a concurrency issue leading to Rx/Tx spurious interrupts, addressed by the 'Tx NAPI' patch below. The Tx timeout can be triggered with multiple Tx flows, 'iperf -c -N 8' commands, on a 2 CPUs etsec2 based (P1020) board. Before the patches: """ root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 & [...] root@p1020rdb-pc:~# NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 1 timed out WARNING: at net/sched/sch_generic.c:279 Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc3-03386-g89ea59c #23 task: ed84ef40 ti: ed868000 task.ti: ed868000 NIP: c04627a8 LR: c04627a8 CTR: c02fb270 REGS: ed869d00 TRAP: 0700 Not tainted (3.13.0-rc3-03386-g89ea59c) MSR: 00029000 <CE,EE,ME> CR: 44000022 XER: 20000000 [...] 
root@p1020rdb-pc:~# [ ID] Interval Transfer Bandwidth [ 5] 0.0-19.3 sec 1000 MBytes 434 Mbits/sec [ 8] 0.0-39.7 sec 1000 MBytes 211 Mbits/sec [ 9] 0.0-40.1 sec 1000 MBytes 209 Mbits/sec [ 3] 0.0-40.2 sec 1000 MBytes 209 Mbits/sec [ 10] 0.0-59.0 sec 1000 MBytes 142 Mbits/sec [ 7] 0.0-74.6 sec 1000 MBytes 112 Mbits/sec [ 6] 0.0-74.7 sec 1000 MBytes 112 Mbits/sec [ 4] 0.0-74.7 sec 1000 MBytes 112 Mbits/sec [SUM] 0.0-74.7 sec 7.81 GBytes 898 Mbits/sec root@p1020rdb-pc:~# ifconfig eth1 eth1 Link encap:Ethernet HWaddr 00:04:9f:00:13:01 inet addr:172.16.1.1 Bcast:172.16.255.255 Mask:255.255.0.0 inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:708722 errors:0 dropped:0 overruns:0 frame:0 TX packets:8717849 errors:6 dropped:0 overruns:1470 carrier:0 collisions:0 txqueuelen:1000 RX bytes:58118018 (55.4 MiB) TX bytes:274069482 (261.3 MiB) Base address:0xa000 """ After applying the patches: """ root@p1020rdb-pc:~# iperf -c 172.16.1.3 -n 1000M -P 8 & [...] 
root@p1020rdb-pc:~# [ ID] Interval Transfer Bandwidth [ 9] 0.0-70.5 sec 1000 MBytes 119 Mbits/sec [ 5] 0.0-70.5 sec 1000 MBytes 119 Mbits/sec [ 6] 0.0-70.7 sec 1000 MBytes 119 Mbits/sec [ 4] 0.0-71.0 sec 1000 MBytes 118 Mbits/sec [ 8] 0.0-71.1 sec 1000 MBytes 118 Mbits/sec [ 3] 0.0-71.2 sec 1000 MBytes 118 Mbits/sec [ 10] 0.0-71.3 sec 1000 MBytes 118 Mbits/sec [ 7] 0.0-71.3 sec 1000 MBytes 118 Mbits/sec [SUM] 0.0-71.3 sec 7.81 GBytes 942 Mbits/sec root@p1020rdb-pc:~# ifconfig eth1 eth1 Link encap:Ethernet HWaddr 00:04:9f:00:13:01 inet addr:172.16.1.1 Bcast:172.16.255.255 Mask:255.255.0.0 inet6 addr: fe80::204:9fff:fe00:1301/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:728446 errors:0 dropped:0 overruns:0 frame:0 TX packets:8690057 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:59732650 (56.9 MiB) TX bytes:271554306 (258.9 MiB) Base address:0xa000 """ v2: PATCH 2: Replaced CPP check with run-time condition to limit the number of queues. Updated comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 13:17:44 -04:00
Claudiu Manoil	71ff9e3df7	gianfar: Use Single-Queue polling for "fsl,etsec2" For the "fsl,etsec2" compatible models the driver currently supports 8 Tx and Rx DMA rings (aka HW queues). However, there are only 2 pairs of Rx/Tx interrupt lines, as these controllers are integrated in low power SoCs with 2 CPUs at most. As a result, there are at most 2 NAPI instances that have to service multiple Tx and Rx queues for these devices. This complicates the NAPI polling routine having to iterate over the multiple Rx/Tx queues hooked to the same interrupt lines. And there's also an overhead at HW level, as the controller needs to service all the 8 Tx rings in a round robin manner. The combined overhead shows up for multi parallel Tx flows transmitted by the kernel stack, when the driver usually starts returning NETDEV_TX_BUSY leading to NETDEV WATCHDOG Tx timeout triggering if the Tx path is congested for too long. As an alternative, this patch makes the driver support only one Tx/Rx DMA ring per NAPI instance (per interrupt group or pair of Tx/Rx interrupt lines) by default. The simplified single queue polling routine (gfar_poll_sq) will be the default napi poll routine for the etsec2 devices too. Some adjustments needed to be made to link the Tx/Rx HW queues with each NAPI instance (2 in this case). The gfar_poll_sq() is already successfully used by older SQ_SG_MODE (single interrupt group) controllers. This patch fixes Tx timeout triggering under heavy Tx traffic load (i.e. iperf -c -P 8) for the "fsl,etsec2" (currently the only MQ_MG_MODE devices). There's also a significant memory footprint reduction by supporting 2 Rx/Tx DMA rings (at most), instead of 8, for these devices. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 13:17:22 -04:00
Claudiu Manoil	aeb12c5ef7	gianfar: Separate out the Tx interrupt handling (Tx NAPI) There are some concurrency issues on devices w/ 2 CPUs related to the handling of Rx and Tx interrupts. eTSEC has separate interrupt lines for Rx and Tx but a single imask register to mask these interrupts and a single NAPI instance to handle both Rx and Tx work. As a result, the Rx and Tx ISRs are identical, both are invoking gfar_schedule_cleanup(), however both handlers can be entered at the same time when the Rx and Tx interrupts are taken by different CPUs. In this case spurious interrupts (SPU) show up (in /proc/interrupts) indicating a concurrency issue. Also, Tx overruns followed by Tx timeout have been observed under heavy Tx traffic load. To address these issues, the schedule cleanup ISR part has been changed to handle the Rx and Tx interrupts independently. The patch adds a separate NAPI poll routine for Tx cleanup to be triggered independently by the Tx confirmation interrupts only. Existing poll functions are modified to handle only the Rx path processing. The Tx poll routine does not need a budget, since Tx processing doesn't consume NAPI budget, and hence it is registered with minimum NAPI weight. NAPI scheduling does not require locking since there are different NAPI instances between the Rx and Tx confirmation paths now. So, the patch fixes the occurrence of spurious Rx/Tx interrupts. Tx overruns also occur less frequently now. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-10 13:17:22 -04:00
dingtianhong	be14cc98e9	vlan: use use ether_addr_equal_64bits to instead of ether_addr_equal ether_addr_equal_64bits() is more efficient than ether_addr_equal(), and can be used when each argument is an array within a structure that contains at least two bytes of data beyond the array, so it is safe to use it for vlan, and it makes sense for the fast path. Cc: Joe Perches <joe@perches.com> Cc: Patrick McHardy <kaber@trash.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-09 19:03:51 -04:00
dingtianhong	375f67df28	vlan: slight optimization for vlan_do_receive() Following Joe's suggestion, it may be faster to add an unlikely() to the test for PACKET_OTHERHOST, so I added it to see whether performance improves. The difference is small and negligible, but a lower device rarely sets the skb type to PACKET_OTHERHOST, so I think it makes sense to add unlikely() to the test. Cc: Joe Perches <joe@perches.com> Cc: Patrick McHardy <kaber@trash.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-09 19:03:51 -04:00
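
Note on commit be14cc98e9: the message says the 64-bit comparison is safe only when at least two bytes of data follow each 6-byte address. The following is a minimal userspace sketch of that idea, not the kernel's implementation. The function name mac_equal_64bits and the memcpy-built mask are assumptions made for illustration; the sketch loads 8 bytes from each address and ignores the two trailing bytes when comparing.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace sketch, not the kernel's ether_addr_equal_64bits().
 * Compares two 6-byte Ethernet addresses using a single 64-bit XOR.
 * Both pointers must reference at least 8 readable bytes: the 6 address
 * bytes plus 2 bytes of padding or following data, whose values are ignored.
 */
static bool mac_equal_64bits(const uint8_t *addr1, const uint8_t *addr2)
{
	/* Mask covering the first 6 bytes in memory order. Building it with
	 * memcpy keeps it correct on little- and big-endian hosts alike. */
	static const uint8_t mask_bytes[8] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
	};
	uint64_t a, b, mask;

	/* memcpy avoids unaligned or aliasing-unsafe pointer casts. */
	memcpy(&a, addr1, sizeof(a));
	memcpy(&b, addr2, sizeof(b));
	memcpy(&mask, mask_bytes, sizeof(mask));

	/* Any differing bit inside the 6 address bytes survives the mask. */
	return ((a ^ b) & mask) == 0;
}
```

The two-byte requirement quoted in the commit message is what makes the 8-byte load legal: in this sketch the caller has to guarantee it, for example by passing an address field that is followed by other members of the same structure.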


428224 Commits