With commit d75b1ade56 ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. bcm_sysport_tx_poll()
always returns 0 whether or not we completed the poll quantum.
Fix this by returning either 0 when we did complete the TX ring reclaim,
or budget to trigger a repoll.
Fixes: d75b1ade56 ("net: less interrupt masking in NAPI")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the commit d75b1ade56 ("net: less interrupt masking in NAPI") napi repoll
is done only when work_done == budget. In tx napi poll we always return 0.
So tx napi is not called again and we do not clean up the tx ring.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the host uses packet encapsulation feature, the MTU may be reduced by the
host due to headroom reservation for encapsulation. This patch handles this
new MTU value.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the types of the descriptor entries in the xgbe_ring_desc struct
from u32 to __le32 to fix endian warnings issued by sparse.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is to ensure the net_device's dev.parent is set before we used it
in dma_zalloc_coherent() from init_hash_table().
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix static checker warning that got introduced in commit e2ac962895
("cxgb4: Cleanup macros so they follow the same style and look consistent, part
2") due to accidental checkin of bogus line.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the device is unplugged or !netif_running(), the workqueue
doesn't need to wake the device, and could return directly.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear the flag of SCHEDULE_TASKLET in bottom_half() to avoid
re-schedule the tasklet again by workqueue.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The INIT_LIST_HEAD(&tp->rx_done) would be done in rtl_start_rx(),
so remove the unnecessary one in alloc_all_mem().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHY revisions E0 and F0 share the same shorter workaround initialization
sequence. Dedicate a special function for these two PHY revisions to
perform the needed workaround sequence.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHY revision D0 requires a specific workaround sequence which needs to
be applied to get the HW to behave properly in all corner cases
conditions. Do this based on the revision we just read out of the HW
using a specific function.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function performs a R/RC calibration reset and will start being
used by more than one function in the next patches, create a helper
function to factor code.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcm7445_config_init() was working around non-production version of the
PHY HW block, so just remove it entirely.
bcm7xxx_28nm_afe_config_init() was running for all PHY revisions greater
than B0, but this workaround sequence is really specific to the B0 PHY
revision, so rename the function accordingly and update the GPHY macro
to use the generic config_init callback.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcm7xxx_28nm_config_init() can be called as frequently as needed by the
PHY library upon suspend/resume cycles and interface bring up/down, just
print the PHY revision once and for all in order not to spam kernel
logs.
Fixes: d8ebfed3f1 ("net: phy: bcm7xxx: utilize PHY revision in config_init")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the normal kernel debugging mechanism which also
enables dynamic_debug at the same time.
Other miscellanea:
o Remove sysctl for irda_debug
o Remove function tracing like uses (use ftrace instead)
o Coalesce formats
o Realign arguments
o Remove unnecessary OOM messages
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable led-mode configuration for KSZ8081 and KSZ8091.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up led-mode setup by introducing proper defines for PHY Control
registers 1 and 2 and only passing the register to the setup function.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure never to update more than two bits when setting the led mode,
something which could for example change the reference-clock setting.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable PHY address 0 as the broadcast address, so that it can be used
as a unique (non-broadcast) address on a shared bus.
Note that this can also be configured using the B-CAST_OFF pin on
KSZ9091, but that KSZ8081 lacks this pin and is also limited to
addresses 0 and 3.
Specifically, this allows for dual KSZ8081 setups.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor and clean up broadcast disable.
Some Micrel PHYs have a broadcast-off bit in the Operation Mode Strap
Override register which can be used to disable PHY address 0 as the
broadcast address, so that it can be used as a unique (non-broadcast)
address on a shared bus.
Note that the KSZPHY_OMSO_RMII_OVERRIDE bit is set by default on
KSZ8021/8031.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure never to update the control register with random data (an
error code) by checking the return value after reading it.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace module init/exit which only calls phy_drivers_register with
module_phy_driver macro.
Tested using Micrel driver, and otherwise compile-tested only.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace module init/exit which only calls phy_driver_register with
module_phy_driver macro.
Compile tested only.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a NULL pointer dereference when __tx_port_find() doesn't
find a matching port.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Intel drivers were pretty much just using the plain vanilla GFP flags
in their calls to __skb_alloc_page so this change makes it so that they use
dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the calls to __skb_alloc_page that are passed NULL with calls to
__dev_alloc_page.
In addition remove __GFP_COLD flag from allocations as we only want it for
the Rx buffer which is taken care of by __dev_alloc_skb, not for any
secondary allocations such as the queue element transmit descriptors.
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the bloated use of __skb_alloc_page and replace it with
__dev_alloc_page. In addition update the one other spot that is
allocating a page so that it allocates with the correct flags.
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And use the more common mechanisms directly.
Other miscellanea:
o Coalesce formats
o Add missing newlines
o Realign arguments
o Remove unnecessary OOM message logging as
there's a generic stack dump already on OOM.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-11-11
This series contains updates to i40e, i40evf and ixgbe.
Kamil updated the i40e and i40evf driver to poll the firmware slower
since we were polling faster than the firmware could respond.
Shannon updates i40e to add a check to keep the service_task from
running the periodic tasks more than once per second, while still
allowing quick action to service the events.
Jesse cleans up the throttle rate code by fixing the minimum interrupt
throttle rate and removing some unused defines.
Mitch makes the early init admin queue message receive code more robust
by handling messages in a loop and ignoring those that we are not
interested in. This also gets rid of some scary log messages that
really do not indicate a problem.
Don provides several ixgbe patches, first fixes an issue with x540
completion timeout where on topologies including few levels of PCIe
switching for x540 can run into an unexpected completion error. Cleans
up the functionality in ixgbe_ndo_set_vf_vlan() in preparation for
future work. Adds support for x550 MAC's to the driver.
v2:
- Remove code comment in patch 01 of the series, based on feedback from
David Liaght
- Updated the "goto" to "break" statements in patch 06 of the series,
based on feedback from Sergei Shtylyov
- Initialized the variable err due to the possibility of use before
being assigned a value in patch 07 of the series
- Added patch "ixgbe: add helper function for setting RSS key in
preparation of X550" since it is needed for the addition of X550 MAC
support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
we were dereferencing dev to initialize pdata. but just after that we
have a BUG_ON(!dev). so we were basically dereferencing the pointer
first and then tesing it for NULL.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of registering the platform and PCI drivers in one module let's move
necessary bits to where it belongs. During this procedure we convert the module
registration part to use module_*_driver() macros which makes code simplier.
>From now on the driver consists three parts: core library, PCI, and platform
drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).
Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.
In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can call napi_gro_frags for all the received traffic regardless
of the checksum status. Specifically, received packets whose status
is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
are eligible for napi_gro_frags as well.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split off the setting of the RSS key into its own function. This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver. New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move setting of drop enable to support function. This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.
Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.
Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.
Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.
Change some strings in the function to be a bit less wrappy, and
express the correct limits.
Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task. However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware. This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.
Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.
The maximum delay is still 100ms.
Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After commit 1a28817282 ("mlx4: use napi_complete_done()") we ended up
calling napi_complete_done() in the case NAPI poll consumed all its
budget.
This added extra interrupt pressure, this patch restores proper
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1a28817282 ("mlx4: use napi_complete_done()")
Signed-off-by: David S. Miller <davem@davemloft.net>
The out_dropped label will only do rcu_read_unlock for non-null port.
So add the missing rcu_read_unlock() when bailing due to non-null port.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per comments in vnet_start_xmit, for the edge case
when outgoing vnet_start_xmit() data and an incoming STOPPED
ACK cross each other in flight, we may need to send the missed
START trigger from maybe_tx_wakeup() after checking for a
false value of start_cons
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vnet_start_xmit() is concurrent with vnet_ack(), we may
have a race that looks like:
thread 1 thread 2
vnet_start_xmit vnet_event_napi -> vnet_rx
__vnet_tx_trigger for some desc X
at this point dr->prod == X
peer sends back a stopped ack for X
we process X, but X == dr->prod
so we bail out in vnet_ack with
!idx_is_pending
update dr->prod
As a result of the fact that we never processed the stopped ack for X,
the Tx path is led to incorrectly believe that the peer is still
"started" and reading, but the peer has stopped reading, which will
ultimately end in flow-control assertions.
The fix is to synchronize the above 2 paths on the netif_tx_lock.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver allows MTU up to 1518 bytes which is not enought to run
batman-adv. Simply raise the maximum packet size up to the maximum
allowed by the transmit descriptor, 1792 bytes, giving a maximum MTU
of 1774 bytes.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>