Commit Graph

312957 Commits

Author SHA1 Message Date
Alexander Duyck
540eb7bf0b net: Update alloc frag to reduce get/put page usage and recycle pages
This patch is meant to help improve performance by reducing the number of
locked operations required to allocate a frag on x86 and other platforms.
This is accomplished by using atomic_set operations on the page count
instead of calling get_page and put_page.  It is based on work originally
provided by Eric Dumazet.

In addition it also helps to reduce memory overhead when using TCP.  This
is done by recycling the page if the only holder of the frame is the
netdev_alloc_frag call itself.  This can occur when skb heads are stolen by
either GRO or TCP and the driver providing the packets is using paged frags
to store all of the data for the packets.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-13 05:48:36 -07:00
David S. Miller
391e5c22f5 ipv4: Remove tb_peers from fib_table.
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 09:39:28 -07:00
David S. Miller
10b0a7732e Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next 2012-07-12 08:18:45 -07:00
Padmanabh Ratnakar
d3bd3a5eeb be2net: Enable RSS UDP hashing for Lancer and Skyhawk
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:47 -07:00
Padmanabh Ratnakar
b4e32a7169 be2net: Fix port name in message during driver load
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Padmanabh Ratnakar
19d59aa762 be2net: Fix cleanup path when EQ creation fails
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Padmanabh Ratnakar
f67ef7bae8 be2net: Activate new FW after FW download for Lancer
After FW download, activate new FW by invoking FW reset.
Recreate rings once new FW is operational.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Padmanabh Ratnakar
bf99e50dc2 be2net: Fix initialization sequence for Lancer
Invoke only required initialization routines for Lancer.
Remove invocation of unnecessary routines.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Padmanabh Ratnakar
7aeb215643 be2net : Fix die temperature stat for Lancer
Query die temperature stat for Lancer to report it correctly
in ethtool.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Padmanabh Ratnakar
c871c5f293 be2net: Fix error while toggling autoneg of pause parameters
Autonegotiation of pause parameters is possible only on some PHYs.
Ability of autoneg of pause parameters is reported by adapter.
Autoneg of pause parameters cannot be changed from driver.
Fix driver to give error when autoneg mode is toggled by user.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:16:46 -07:00
Jiri Pirko
68c450426a team: make team_port_enabled() and team_port_txable() static inline
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:08:20 -07:00
Jiri Pirko
5fc889911a team: add broadcast mode
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:08:20 -07:00
Jiri Pirko
6e88e1357c team: use function team_port_txable() for determing enabled and up port
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:08:20 -07:00
David S. Miller
f0a70e902f ipv4: Put proper checks into icmp_socket_deliver().
All handler->err() routines expect that we've done a pskb_may_pull()
test to make sure that IP header length + 8 bytes can be safely
pulled.

Reported-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 08:06:04 -07:00
David S. Miller
065f5f9749 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next 2012-07-12 08:00:56 -07:00
Florian Westphal
6d4fa852a0 net: sched: add ipset ematch
Can be used to match packets against netfilter ip sets created via ipset(8).
skb->sk_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.

Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).

Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:46 -07:00
Flavio Leitner
fa919833e3 netxen: fix link notification order
First update the adapter variables with the current speed and
mode before fire the notification. Otherwise, the get_settings()
may provide old values.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:46 -07:00
alex.bluesman.smirnov@gmail.com
33c34c5e93 6lowpan: rework fragment-deleting routine
6lowpan module starts collecting incomming frames and fragments
right after lowpan_module_init() therefor it will be better to
clean unfinished fragments in lowpan_cleanup_module() function
instead of doing it when link goes down.

Changed spinlocks type to prevent deadlock with expired timer event
and removed unused one.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:46 -07:00
alex.bluesman.smirnov@gmail.com
abbee2effc 6lowpan: fix tag variable size
Function lowpan_alloc_new_frame() takes u8 tag as an argument. However,
its only caller, lowpan_process_data() passes down a u16. Hence,
the tag value can get corrupted. This prevent 6lowpan fragment reassembly of a
message when the fragment tag value is over 256.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
alex.bluesman.smirnov@gmail.com
428840424f mac802154: sparse warnings: make symbols static
Make symbols static to avoid the following warning shown up
by sparse:

    warning: symbol ... was not declared. Should it be static?

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
alex.bluesman.smirnov@gmail.com
79ff1db6d9 6lowpan: get extra headroom in allocated frame
Use netdev_alloc_skb_ip_align() instead of alloc_skb() to get some
extra headroom in case we need to forward this frame in a tunnel or
something else.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
alex.bluesman.smirnov@gmail.com
e885a47a47 mac802154: add get short address method
Add method to get the device short 802.15.4 address. This call
needed by ieee802154 layer to satisfy 'iz list' request from
the user space.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
alex.bluesman.smirnov@gmail.com
5b00f2eef3 drivers/ieee802154/at86rf230: rework irq handler
Fix LOCKDEP bug message for the irq handler spinlock.
Make the irq processing code more explicit and stable.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
alex.bluesman.smirnov@gmail.com
4d27de149b 6lowpan: revert: add missing spin_lock_init()
Revert the commit 768f7c7c12 to initialize
spinlock in the more preferable way and make it static to avoid sparse
warning.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
Dan Carpenter
d348446b88 smsc95xx: signedness bug in get_regs()
"retval" has to be a signed integer for the error handling to work.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
Greg Ungerer
064bff1c9f net: add support for NS8390 based eth controllers on some ColdFire CPU boards
A number of older ColdFire CPU based boards use NS8390 based network
controllers. Most use the Davicom 9008F or the UMC 9008F. This driver
provides the support code to get these devices working on these platforms.

Generally the NS8390 based eth device is direct connected via the general
purpose bus of the ColdFire CPU. So its addressing and interrupt setup is
fixed on each of the different platforms (classic platform setup).

This driver is based on the other drivers/net/ethernet/8390 drivers, and
includes the lib8390.c code. It uses the existing definitions of the
board NS8390 device addresses, interrupts and access types from the
arch/m68k/include/asm/mcf8390.h, but moves the IO access functions into
the driver code and out of that header.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:45 -07:00
Greg Ungerer
2c624880fb m68knommu: move the badly named mcfne.h to a better mcf8390.h
The mcfne.h include contains definitions to support NS8390 eth based hardware
on ColdFire based CPU boards. So change its name to reflect that better.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:54:44 -07:00
David S. Miller
99ee038d41 ipv4: Fix warnings in ip_do_redirect() for some configurations.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 07:40:05 -07:00
David S. Miller
3ec5a261ae Merge branch 'redirect_via_sock'
As described in my patch series from the other day, we need to
rearrange redirect handling so that the local initiators of packets
(sockets, tunnels, xfrms, etc.) that implement the protocols compute
the route and pass this down into the ipv4/ipv6 routing code.

These changes here do so by implementing a new dst_ops->redirect
method.

No more do we have this funny code that tries several different sets
of routing keys to try and figure out which route the redirect should
actually be applied to.

No more do we have the problem wherein TOS rewriting causes problems
for us.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 03:49:19 -07:00
David S. Miller
1ed5c48f23 net: Remove checks for dst_ops->redirect being NULL.
No longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:41:25 -07:00
David S. Miller
b587ee3ba2 net: Add dummy dst_ops->redirect method where needed.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:39:24 -07:00
David S. Miller
b94f1c0904 ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().
And delete rt6_redirect(), since it is no longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:33:37 -07:00
David S. Miller
ec18d9a269 ipv6: Add redirect support to all protocol icmp error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:25:15 -07:00
David S. Miller
3a5ad2ee5e ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:08:07 -07:00
David S. Miller
6e157b6ac6 ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().
Hook it into dst_ops->redirect as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12 00:05:02 -07:00
David S. Miller
e8599ff4b1 ipv6: Move bulk of redirect handling into rt6_redirect().
This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 23:43:53 -07:00
David S. Miller
30f2a5f379 ipv6: Export ndisc option parsing from ndisc.c
This is going to be used internally by the rt6 redirect code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 23:39:11 -07:00
David S. Miller
1f42539d25 ipv4: Kill ip_rt_redirect().
No longer needed, as the protocol handlers now all properly
propagate the redirect back into the routing code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 21:30:08 -07:00
David S. Miller
55be7a9c60 ipv4: Add redirect support to all protocol icmp error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 21:27:49 -07:00
David S. Miller
b42597e2f3 ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 21:25:45 -07:00
David S. Miller
e47a185b31 ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.
All of the redirect acceptance policy is now contained within.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 20:55:47 -07:00
David S. Miller
94206125c4 ipv4: Rearrange arguments to ip_rt_redirect()
Pass in the SKB rather than just the IP addresses, so that policy
and other aspects can reside in ip_rt_redirect() rather then
icmp_redirect().

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 20:38:08 -07:00
David S. Miller
d0da720f9f ipv4: Pull redirect instantiation out into a helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 20:27:54 -07:00
David S. Miller
d3351b75a7 ipv4: Deliver ICMP redirects to sockets too.
And thus, we can remove the ping_err() hack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 18:35:12 -07:00
David S. Miller
1de9243bbf ipv4: Pull icmp socket delivery out into a helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 18:32:17 -07:00
Eric Dumazet
46d3ceabd8 tcp: TCP Small Queues
This introduce TSQ (TCP Small Queues)

TSQ goal is to reduce number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.

sk->sk_wmem_alloc not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.

TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2 : It can help to reduce
latencies of high prio packets, having smaller TSO packets.

This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.

Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.

Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and both side socket autotuning no longer use 4 Mbytes.

As skb destructor cannot restart xmit itself ( as qdisc lock might be
taken at this point ), we delegate the work to a tasklet. We use one
tasklest per cpu for performance reasons.

If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
  but some drivers call it in their start_xmit() handler.
  These drivers should at least use BQL, or else a single TCP
  session can still fill the whole NIC TX ring, since TSQ will
  have no effect.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 18:12:59 -07:00
Alexander Duyck
2100844ca9 tcp: Fix out of bounds access to tcpm_vals
The recent patch "tcp: Maintain dynamic metrics in local cache." introduced
an out of bounds access due to what appears to be a typo.   I believe this
change should resolve the issue by replacing the access to RTAX_CWND with
TCP_METRIC_CWND.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 17:30:41 -07:00
David S. Miller
48ee3569f3 ipv6: Move ipv6 twsk accessors outside of CONFIG_IPV6 ifdefs.
Fixes build when ipv6 is disabled.

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 02:39:24 -07:00
Alexander Duyck
0b7f5d0b65 ixgbe: Merge RSS and flow director ring register caching and configuration
There are really only 3 modes that can control the number of queues.  Those
are RSS, DCB, and VMDq/SR-IOV.  Currently we have things much more broken
up than they need to be for how we are configuring the rings.  In order to
try and straiten some of this out I am going to start merging similar
functionality into single functions.  To start with I am merging the Flow
Director ring configuration into the RSS ring configuration since Flow
Director cannot function with DCB or SR-IOV.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-07-11 02:04:40 -07:00
Alexander Duyck
45e9baa515 ixgbe: Clean up a useless switch statement and dead code in configure_srrctl
This patch replaces a switch statement for an 82598 workaround with an if
statement that only applies to 82598. In addition I am pulling out several
dead pieces of code and instead of reading the SRRCTL register and then
modifying it we are just writing a value which we generate from scratch.
Finally I am also removing any drop enable related code since that was
moved to a function of its own.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-07-11 02:02:26 -07:00