Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.
In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.
The signature of the ->poll() call back goes from:
int foo_poll(struct net_device *dev, int *budget)
to
int foo_poll(struct napi_struct *napi, int budget)
The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.
The napi_struct is to be embedded in the device driver private data
structures.
Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.
With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the DIS_EARLY_DAC PHY workaround for 5709 A1. Without it, link
sometimes does not come up.
Update version to 1.6.5.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add comment to explain why we cannot read back after chip reset
before delaying.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2.c (incorrectly) sets current->state directly to
TASK_UNINTERRUPTIBLE, without going through set_task_state(). However
all the code wants to do is an msleep so just make it do that instead...
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device would not resume properly if it was shutdown before the system
was suspended. In such scenario where the netif_running state is 0,
bnx2_suspend() would not save the PCI state and so the memory enable bit
and bus master enable bit would be lost.
We fix this by always saving and restoring the PCI state in
bnx2_suspend() and bnx2_resume() regardless of netif_running() state.
Update version to 1.6.4.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function. So we can inline the generic function into the
caller and avoid going through the drivers.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change all stats related magic numbers to constants.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The management firmware may still be loading during bnx2_init_one()
because of the D3hot -> D0 transition and the firmware version may
not be available without waiting a bit.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NVRAM interface is slightly modified on the 5709. To properly
support it, we need to change the buffered flag in the flash data
structure into multiple flags to indicate buffered operation, address
translation, and the use of write enable (WREN). The 5709 flash
only requires the buffered operation bit to be set.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 963bd949b1. The
driver _does_ need the networking header files;
CC [M] drivers/net/bnx2.o
drivers/net/bnx2.c: In function 'bnx2_start_xmit':
drivers/net/bnx2.c:5177: warning: implicit declaration of function 'tcp_optlen'
drivers/net/bnx2.c:5181: error: invalid application of 'sizeof' to incomplete type 'struct ipv6hdr'
drivers/net/bnx2.c:5202: error: invalid application of 'sizeof' to incomplete type 'struct tcphdr'
drivers/net/bnx2.c:5207: warning: implicit declaration of function 'tcp_hdr'
drivers/net/bnx2.c:5207: error: invalid type argument of '->'
make[2]: *** [drivers/net/bnx2.o] Error 1
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
Cc: Ilpo Jävinen <ilpo.jarvinen@helsinki.fi>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
PCI: Only build PCI syscalls on architectures that want them
PCI: limit pci_get_bus_and_slot to domain 0
PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
PCI: hotplug: pciehp: wait for 1 second after power off slot
PCI: pci_set_power_state(): check for PM capabilities earlier
PCI: cpci_hotplug: Convert to use the kthread API
PCI: add pci_try_set_mwi
PCI: pcie: remove SPIN_LOCK_UNLOCKED
PCI: ROUND_UP macro cleanup in drivers/pci
PCI: remove pci_dac_dma_... APIs
PCI: pci-x-pci-express-read-control-interfaces cleanups
PCI: Fix typo in include/linux/pci.h
PCI: pci_ids, remove double or more empty lines
PCI: pci_ids, add atheros and 3com_2 vendors
PCI: pci_ids, reorder some entries
PCI: i386: traps, change VENDOR to DEVICE
PCI: ATM: lanai, change VENDOR to DEVICE
PCI: Change all drivers to use pci_device->revision
...
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Got bored to always recompile it for no reason.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to the periodic heartbeat, we're adding a heartbeat
request interrupt when the heartbeat is late. This is needed during
netpoll where the timer is not available. -rt kernels will also
benefit since the timer is not as accurate.
[ We discussed this patch last time and we decided that the -rt
kernel problem alone did not justify this patch. I think the
netpoll problem makes this patch necessary. ]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Spurious interrupts are often encountered especially on systems
using the 8259 PIC mode. This is because the I/O write to deassert
the interrupt is posted and won't get to the chip immediately. As
a result, the IRQ may remain asserted after the IRQ handler exits,
causing spurious interrupts.
Add read back to flush the I/O write to deassert the IRQ immediately.
We also store the last_status_idx immediately in the IRQ handler to
help detect whether the interrupt is ours or not when the IRQ is
entered again before ->poll gets called.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the link up dmesg to report remote copper or Serdes link.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the driver's ethtool_ops->get_settings and set_settings
functions to support remote PHY. Users control the remote copper
PHY settings by specifying link settings for the tp (twisted pair)
port.
The nway_reset function is also modified to support remote PHY.
mii-tool operations are not supported on remote PHY and we will
return -EOPNOTSUPP.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In blade servers, the Serdes PHY in 5708S can control the remote
copper PHY through autonegotiation on the backplane. This patch adds
the logic to interface with the firmware to control the remote PHY
autonegotiation and to handle remote PHY link events.
When remote PHY is present, the 5708S Serdes device practically
becomes a copper device with full control over the 1000Base-T
link settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Put existing code to setup the default link settings in this new
function. This makes it easier to support the remote PHY feature in
the next few patches.
Also change ETHTOOL_ALL_FIBRE_SPEED to include 2500Mbps if supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.
This patch:
* adds NETIF_F_IPV6_CSUM for those devices
* fixes bnx2 and tg3 devices that need it
* add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
* fixes assumptions about NETIF_F_ALL_CSUM in nat
* adjusts bridge union of checksumming computation
Signed-off-by: David S. Miller <davem@davemloft.net>
Update to version 1.5.11.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The statistics block DMA on 5708 can be messed up occasionally on the
average of about once per hour. If the user is reading the counters
within one second after the corruption, the counters will be all
messed up. One second later, the counters will be ok again until the
next corruption occurs.
The workaround is to disable the periodic statistics DMA. Instead,
we manually trigger the DMA once a second in bnx2_timer(). This
manual trigger of the DMA avoids the problem.
As a consequence, we can only allow 0 or 1 second settings for
ethtool -C statistics block.
Thanks to Jean-Daniel Pauget <jd@disjunkt.com> and
CaT <cat@zip.com.au> for reporting this rare problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing code to enable DMA on 5709 A1. The bit is a no-op on A0
and therefore can be set on all 5709 chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For correctness, we need to wait for the MEM_INIT bit to be cleared
in the BNX2_CTX_COMMAND register before proceeding.
[Added return -EBUSY when the MEM_INIT bit doesn't clear, suggested
by Jeff Garzik.]
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a bug in the driver that only initializes half of the context
memory on the 5708. Surprisingly, this works most of the time except
for some occasional netdev watchdogs when sending a lot of 64-byte
packets. The fix is to add the missing code to initialize the 2nd
halves of all context memory.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many drivers had code that did kill_vid, but they weren't doing vlan
filtering. With new API the stub is unneeded unless device sets
NETIF_F_HW_VLAN_FILTER.
Bad habit: I couldn't resist fixing a couple of nearby style things
in acenic, and forcedeth.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove the check for skb->len greater than MTU when doing TSO. When
the destination has a smaller MSS than the source, a TSO packet may
be smaller than the MTU at the source and we still need to process it
as a TSO packet.
Thanks to Brian Ristuccia <bristuccia@starentnetworks.com> for
reporting the problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the code to print PCI or PCIE bus information for all devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips. In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.
Put the request_irq/free_irq logic in common procedures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restructure by adding bnx2_phy_event_is_set() to make code cleaner
and easier to understand.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PCI ID and code to support the 5709 Serdes PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some common procedures to handle enabling and disabling 2.5G.
Add some missing code to resolve flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 Serdes device uses non-standard MII register offsets. This
re-structuring will make it easier to support 5709 Serdes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed to save the MSI state which will be lost during
suspend.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hot-plug scripts can call bnx2_open() as soon as register_netdev() is
called in bnx2_init_one(). We need to call pci_set_drvdata() and
setup everything before calling register_netdev(). netif_carrier_off()
also needs to be moved to bnx2_open() to avoid race conditions with
the irq.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The internal PCIE-to-PCIX bridge of the 5708 has the same 40-bit DMA
limitation as some of the tg3 chips. Set dma_mask and persistent DMA
mask to 40-bit to workaround.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device may be in D3hot state and should not allow MII register
access.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.
Ditched a no-op in bnx2 in the process.
I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open
coded skb->nh.iph uses, now to go after the rest...
Just out of curiosity, here are the idioms found to get the same result:
skb->nh.iph->ihl << 2
skb->nh.iph->ihl<<2
skb->nh.iph->ihl * 4
skb->nh.iph->ihl*4
(skb->nh.iph)->ihl * sizeof(u32)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>