linux_dsm_epyc7002/net/ipv4
Jesper Dangaard Brouer 19952cc4f8 net: frag queue per hash bucket locking
This patch implements per hash bucket locking for the frag queue
hash.  This removes two write locks, and the only remaining write
lock is for protecting hash rebuild.  This essentially reduce the
readers-writer lock to a rebuild lock.

This patch is part of "net: frag performance followup"
 http://thread.gmane.org/gmane.linux.network/263644
of which two patches have already been accepted:

Same test setup as previous:
 (http://thread.gmane.org/gmane.linux.network/257155)
 Two 10G interfaces, on seperate NUMA nodes, are under-test, and uses
 Ethernet flow-control.  A third interface is used for generating the
 DoS attack (with trafgen).

Notice, I have changed the frag DoS generator script to be more
efficient/deadly.  Before it would only hit one RX queue, now its
sending packets causing multi-queue RX, due to "better" RX hashing.

Test types summary (netperf UDP_STREAM):
 Test-20G64K     == 2x10G with 65K fragments
 Test-20G3F      == 2x10G with 3x fragments (3*1472 bytes)
 Test-20G64K+DoS == Same as 20G64K with frag DoS
 Test-20G3F+DoS  == Same as 20G3F  with frag DoS
 Test-20G64K+MQ  == Same as 20G64K with Multi-Queue frag DoS
 Test-20G3F+MQ   == Same as 20G3F  with Multi-Queue frag DoS

When I rebased this-patch(03) (on top of net-next commit a210576c) and
removed the _bh spinlock, I saw a performance regression.  BUT this
was caused by some unrelated change in-between.  See tests below.

Test (A) is what I reported before for patch-02, accepted in commit 1b5ab0de.
Test (B) verifying-retest of commit 1b5ab0de corrospond to patch-02.
Test (C) is what I reported before for this-patch

Test (D) is net-next master HEAD (commit a210576c), which reveals some
(unknown) performance regression (compared against test (B)).
Test (D) function as a new base-test.

Performance table summary (in Mbit/s):

(#) Test-type:  20G64K    20G3F    20G64K+DoS  20G3F+DoS  20G64K+MQ 20G3F+MQ
    ----------  -------   -------  ----------  ---------  --------  -------
(A) Patch-02  : 18848.7   13230.1   4103.04     5310.36     130.0    440.2
(B) 1b5ab0de  : 18841.5   13156.8   4101.08     5314.57     129.0    424.2
(C) Patch-03v1: 18838.0   13490.5   4405.11     6814.72     196.6    461.6

(D) a210576c  : 18321.5   11250.4   3635.34     5160.13     119.1    405.2
(E) with _bh  : 17247.3   11492.6   3994.74     6405.29     166.7    413.6
(F) without bh: 17471.3   11298.7   3818.05     6102.11     165.7    406.3

Test (E) and (F) is this-patch(03), with(V1) and without(V2) the _bh spinlocks.

I cannot explain the slow down for 20G64K (but its an artificial
"lab-test" so I'm not worried).  But the other results does show
improvements.  And test (E) "with _bh" version is slightly better.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>

----
V2:
- By analysis from Hannes Frederic Sowa and Eric Dumazet, we don't
  need the spinlock _bh versions, as Netfilter currently does a
  local_bh_disable() before entering inet_fragment.
- Fold-in desc from cover-mail
V3:
- Drop the chain_len counter per hash bucket.
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-04 17:37:05 -04:00
..
netfilter net-next: replace obsolete NLMSG_* with type safe nlmsg_* 2013-03-28 14:25:25 -04:00
af_inet.c ipv4: Fix ip-header identification for gso packets. 2013-03-26 13:50:05 -04:00
ah4.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
arp.c firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection. 2013-03-26 12:32:13 -04:00
cipso_ipv4.c cipso: don't follow a NULL pointer when setsockopt() is called 2012-07-18 09:01:12 -07:00
datagram.c ipv4: Add a socket release callback for datagram sockets 2013-01-21 14:17:05 -05:00
devinet.c ipv4: provide addr and netconf dump consistency info 2013-03-24 17:16:29 -04:00
esp4.c xfrm4: Invalidate all ipv4 routes on IPsec pmtu events 2013-01-21 12:43:54 +01:00
fib_frontend.c net-next: replace obsolete NLMSG_* with type safe nlmsg_* 2013-03-28 14:25:25 -04:00
fib_lookup.h ipv4: Fix nexthop caching wrt. scoping. 2011-03-24 18:06:47 -07:00
fib_rules.c sections: fix section conflicts in net 2012-10-06 03:04:45 +09:00
fib_semantics.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
fib_trie.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
gre.c GRE: Refactor GRE tunneling code. 2013-03-26 12:27:18 -04:00
icmp.c ipv4: fix error handling in icmp_protocol. 2013-02-22 15:10:18 -05:00
igmp.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
inet_connection_sock.c tcp: Remove TCPCT 2013-03-17 14:35:13 -04:00
inet_diag.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
inet_fragment.c net: frag queue per hash bucket locking 2013-04-04 17:37:05 -04:00
inet_hashtables.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inet_lro.c ipv4: replace ip_fast_csum with csum_replace2 2013-03-15 09:12:25 -04:00
inet_timewait_sock.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inetpeer.c Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2012-10-02 09:54:49 -07:00
ip_forward.c ipv4: introduce rt_uses_gateway 2012-10-08 17:42:36 -04:00
ip_fragment.c inet: generalize ipv4-only RFC3168 5.3 ecn fragmentation handling for future use by ipv6 2013-03-24 17:16:30 -04:00
ip_gre.c ip_gre: don't overwrite iflink during net_dev init 2013-03-30 17:28:33 -04:00
ip_input.c ipv[4|6]: correct dropwatch false positive in local_deliver_finish 2013-03-01 15:56:29 -05:00
ip_options.c net/ipv4: Ensure that location of timestamp option is stored 2013-03-12 05:35:39 -04:00
ip_output.c net: Fix possible wrong checksum generation. 2013-02-13 13:30:10 -05:00
ip_sockglue.c net: prevent setting ttl=0 via IP_TTL 2013-01-08 17:57:10 -08:00
ip_tunnel.c ip_tunnel: Fix off-by-one error in forming dev name. 2013-03-29 15:24:28 -04:00
ip_vti.c Tunneling: use IP Tunnel stats APIs. 2013-03-26 12:27:19 -04:00
ipcomp.c ipcomp: Mark as netns_ok. 2013-02-04 15:46:15 -05:00
ipconfig.c ipconfig: add informative timeout messages while waiting for carrier 2013-04-02 14:35:33 -04:00
ipip.c IPIP: Use ip-tunneling code. 2013-03-26 12:27:18 -04:00
ipmr.c net-next: replace obsolete NLMSG_* with type safe nlmsg_* 2013-03-28 14:25:25 -04:00
Kconfig Tunneling: use IP Tunnel stats APIs. 2013-03-26 12:27:19 -04:00
Makefile GRE: Refactor GRE tunneling code. 2013-03-26 12:27:18 -04:00
netfilter.c netfilter: properly annotate ipv4_netfilter_{init,fini}() 2012-09-03 13:56:04 +02:00
ping.c ipv4: fix a bug in ping_err(). 2013-02-21 15:25:00 -05:00
proc.c tcp: TLP loss detection. 2013-03-12 08:30:34 -04:00
protocol.c ipv4: Disallow non-namespace aware protocols to register. 2013-02-05 14:42:23 -05:00
raw.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
route.c rtnetlink: Remove passing of attributes into rtnl_doit functions 2013-03-22 10:31:16 -04:00
syncookies.c tcp: Remove TCPCT 2013-03-17 14:35:13 -04:00
sysctl_net_ipv4.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_bic.c tcp: fix undo after RTO for BIC 2012-01-20 14:17:26 -05:00
tcp_cong.c tcp: remove Appropriate Byte Count support 2013-02-05 14:51:16 -05:00
tcp_cubic.c tcp: fix undo after RTO for CUBIC 2012-01-20 14:17:26 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_fastopen.c tcp: TCP Fast Open Server - header & support functions 2012-08-31 20:02:18 -04:00
tcp_highspeed.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_htcp.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_hybla.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_illinois.c net: fix divide by zero in tcp algorithm illinois 2012-11-01 11:55:59 -04:00
tcp_input.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-27 13:52:49 -04:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-20 12:46:26 -04:00
tcp_lp.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tcp_memcontrol.c memcg: decrement static keys at real destroy time 2012-05-29 16:22:28 -07:00
tcp_metrics.c tcp: handle tcp_net_metrics_init() order-5 memory allocation failures 2012-11-16 13:36:27 -05:00
tcp_minisocks.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-22 12:53:09 -04:00
tcp_probe.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
tcp_scalable.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_timer.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_vegas.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_vegas.h
tcp_veno.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_westwood.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_yeah.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tcp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-20 12:46:26 -04:00
tunnel4.c net: Convert printks to pr_<level> 2012-03-11 23:42:51 -07:00
udp_diag.c net-next: replace obsolete NLMSG_* with type safe nlmsg_* 2013-03-28 14:25:25 -04:00
udp_impl.h ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
udp.c Revert "udp: increase inner ip header ID during segmentation" 2013-03-25 12:29:54 -04:00
udplite.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
xfrm4_input.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
xfrm4_mode_beet.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c ip: fix warning in xfrm4_mode_tunnel_input 2013-02-18 12:42:48 -05:00
xfrm4_output.c xfrm4: Don't call icmp_send on local error 2011-07-01 17:33:19 -07:00
xfrm4_policy.c xfrm: make gc_thresh configurable in all namespaces 2013-02-06 11:36:29 +01:00
xfrm4_state.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
xfrm4_tunnel.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00