Previously we refilled with much larger batches, which caused large latency
spikes. We now have many more much much smaller spikes!
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We need to clear the private data pointer in the PCI device.
Also reorder cleanup in efx_pci_remove() for symmetry.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
efx_nic_fatal_interrupt() disables DMA before scheduling a reset.
After this, we need not and *cannot* flush queues.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Artem's cleanup of the MTD API continues apace.
Fixes and improvements for ST FSMC and SuperH FLCTL NAND, amongst others.
More work on DiskOnChip G3, new driver for DiskOnChip G4.
Clean up debug/warning printks in JFFS2 to use pr_<level>.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAk92K6UACgkQdwG7hYl686NrMACfWQJRWasR78MWKfkT2vWZwTFJ
X5AAoKiSYO2pfo5gWJGOAahNC1zUqMX0
=i3Vb
-----END PGP SIGNATURE-----
Merge tag 'for-linus-3.4' of git://git.infradead.org/mtd-2.6
Pull MTD changes from David Woodhouse:
- Artem's cleanup of the MTD API continues apace.
- Fixes and improvements for ST FSMC and SuperH FLCTL NAND, amongst
others.
- More work on DiskOnChip G3, new driver for DiskOnChip G4.
- Clean up debug/warning printks in JFFS2 to use pr_<level>.
Fix up various trivial conflicts, largely due to changes in calling
conventions for things like dmaengine_prep_slave_sg() (new inline
wrapper to hide new parameter, clashing with rewrite of previously last
parameter that used to be an 'append' flag, and is now a bitmap of
'unsigned long flags').
(Also some header file fallout - like so many merges this merge window -
and silly conflicts with sparse fixes)
* tag 'for-linus-3.4' of git://git.infradead.org/mtd-2.6: (120 commits)
mtd: docg3 add protection against concurrency
mtd: docg3 refactor cascade floors structure
mtd: docg3 increase write/erase timeout
mtd: docg3 fix inbound calculations
mtd: nand: gpmi: fix function annotations
mtd: phram: fix section mismatch for phram_setup
mtd: unify initialization of erase_info->fail_addr
mtd: support ONFI multi lun NAND
mtd: sm_ftl: fix typo in major number.
mtd: add device-tree support to spear_smi
mtd: spear_smi: Remove default partition information from driver
mtd: Add device-tree support to fsmc_nand
mtd: fix section mismatch for doc_probe_device
mtd: nand/fsmc: Remove sparse warnings and errors
mtd: nand/fsmc: Add DMA support
mtd: nand/fsmc: Access the NAND device word by word whenever possible
mtd: nand/fsmc: Use dev_err to report error scenario
mtd: nand/fsmc: Use devm routines
mtd: nand/fsmc: Modify fsmc driver to accept nand timing parameters via platform
mtd: fsmc_nand: add pm callbacks to support hibernation
...
As of bb0eb217, MTD_FAIL_ADDR_UNKNOWN should be used to indicate mtd
erase failure not specific to any particular block.
Use MTD_FAIL_ADDR_UNKNOWN instead of 0xffffffff when setting
'erase->fail_addr' in 'efx_mtd_erase()'.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch renames all MTD functions by adding a "_" prefix:
mtd->erase -> mtd->_erase
mtd->read_oob -> mtd->_read_oob
...
The reason is that we are re-working the MTD API and from now on it is
an error to use MTD function pointers directly - we have a corresponding
API call for every pointer. By adding a leading "_" we achieve the following:
1. Make sure we convert every direct pointer users
2. A leading "_" suggests that this interface is internal and it becomes
less likely that people will use them directly
3. Make sure all the out-of-tree modules stop compiling and the owners
spot the big API change and amend them.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
During probe of each port, read and log the part number from VPD.
Remove the Falcon-specific board name lookup.
Initial version by Stuart Hodgson <smhodgson@solarflare.com>.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Generate a test event on each event queue whenever the interface is
brought up, then after 1 second check that we have either handled a
test event or handled another IRQ for each event queue.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
In case all event queues are broken for some reason, this means it
will only take about a second to check them all, rather than up to 32
seconds. This may also speed up testing in the successful case.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
IRQ latency can be ridiculously high for various reasons, so our
current timeouts of 100 ms or 10 ms are too short.
Change the IRQ and event tests to use polling loops starting with a
delay of 1 tick and doubling that if necessary up to a maximum total
delay of approximately 1 second.
Raise the loopback packet RX timeout to 1 second.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
RX and TX completions on the same event queue are generally not associated
with the same flows. The inclusion of TX completions in the adaptive IRQ
score is more of a source of noise rather than useful feedback. Therefore,
do not include them in the score, and adjust the default threshold scores
down.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The in-tree driver has never supported Driverlink. The rest of the
comments are rather redundant, but we can usefully state what the
requirements are on the buffer state.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Conflicts:
drivers/net/ethernet/sfc/rx.c
Overlapping changes in drivers/net/ethernet/sfc/rx.c, one to change
the rx_buf->is_page boolean into a set of u16 flags, and another to
adjust how ->ip_summed is initialized.
Signed-off-by: David S. Miller <davem@davemloft.net>
When pre-allocating skbs for received packets, we set ip_summed =
CHECKSUM_UNNCESSARY. We used to change it back to CHECKSUM_NONE when
the received packet had an incorrect checksum or unhandled protocol.
Commit bc8acf2c8c ('drivers/net: avoid
some skb->ip_summed initializations') mistakenly replaced the latter
assignment with a DEBUG-only assertion that ip_summed ==
CHECKSUM_NONE. This assertion is always false, but it seems no-one
has exercised this code path in a DEBUG build.
Fix this by moving our assignment of CHECKSUM_UNNECESSARY into
efx_rx_packet_gro().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Unify return value of .ndo_set_mac_address if the given address
isn't valid. Return -EADDRNOTAVAIL as eth_mac_addr() already does
if is_valid_ether_addr() fails.
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
efx_for_each_possible_channel_tx_queue() should do nothing for RX-only
or extra channels. The current definition results in allocating
additional unused hardware TX queues when using the mqprio qdisc and
either separate_tx_channels or SR-IOV.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We have a very simple way of allocating buffer table entries to
queues, which is just to take the next one available. The extra
channels are the highest numbered channels but they need to be
allocated the lowest entries so that the traffic channels can be
allocated new entries without any collisions.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
efx_vfdi_set_status_page() validates the peer page count by
calculating the size of a request containing that many addresses and
comparing that with the maximum valid request size (4KB). The
calculation involves a multiplication that may overflow on a 32-bit
system.
We use kcalloc() to allocate memory to store the addresses; that also
does a multiplication and it does check for integer overflow, so any
values larger than 0x1fffffff will be rejected. However, values in
the range [0x1fffffffc, 0x1fffffff] pass boh tests and result in an
attempt to allocate nearly 4GB on the heap. This should be rejected
rather quickly as it's obviously impossible on a 32-bit system, and
indeed the maximum possible heap allocation is 32MB. Still, let's
make absolutely sure by fixing the initial validation.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This requirement was meant to be implied in the name 'status page'.
One out-of-tree VF driver allocates a buffer using the structure size
and not a full page - hence the current odd specification - but in
practice that allocation will be padded and aligned to at least 4KB.
Therefore, we can specify this and have the option to extend the
structure up to 4KB without worrying about VF drivers using odd-shaped
buffers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On the SFC9000 family, each port has 1024 Virtual Interfaces (VIs),
each with an RX queue, a TX queue, an event queue and a mailbox
register. These may be assigned to up to 127 SR-IOV virtual functions
per port, with up to 64 VIs per VF.
We allocate an extra channel (IRQ and event queue only) to receive
requests from VF drivers.
There is a per-port limit of 4 concurrent RX queue flushes, and queue
flushes may be initiated by the MC in response to a Function Level
Reset (FLR) of a VF. Therefore, when SR-IOV is in use, we submit all
flush requests via the MC.
The RSS indirection table is shared with VFs, so the number of RX
queues used in the PF is limited to the number of VIs per VF.
This is almost entirely the work of Steve Hodgson, formerly
shodgson@solarflare.com.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Each port has a block of 64-bit SRAM that is divided between buffer
table and descriptor cache regions at initialisation time. Currently
we use a fixed allocation, but it needs to be changed to support
larger numbers of queues.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This lets us identify the NIC affected in case of failure, and
will be necessary to adjust for SR-IOV constraints.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Abstract some of the channel operations to allow for 'extra'
channels that do not have RX or TX queues.
- Try to assign a channel to each extra channel type that is enabled
for the NIC, but gracefully degrade if we can't allocate sufficient
MSI-X vectors
- Allow each extra channel type to generate its own channel name
- Allow channel types to disable reallocation and reinitialisation
of their channels
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The TX DMA engine issues upstream read requests when there is room in
the TX FIFO for the completion. However, the fetches for the rest of
the packet might be delayed by any back pressure. Since a flush must
wait for an EOP, the entire flush may be delayed by back pressure.
Mitigate this by disabling flow control before the flushes are
started. Since PF and VF flushes run in parallel introduce
fc_disable, a reference count of the number of flushes outstanding.
The same principle could be applied to Falcon, but that
would bring with it its own testing.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
For SR-IOV we will need to send events to event queues that belong to
VFs serviced by other drivers. Change the parameters of
efx_generate_event() to allow this and declare it extern.
While we're at it, remove the existing declaration under the wrong
name efx_nic_generate_event().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
When SR-IOV is enabled we may receive FLR (Function-Level Reset)
events, associated queue flush events and requests from VF drivers at
any time. Therefore we need to keep event queues and interrupts
enabled whenever possible.
Currently we stop interrupt-driven event processing before flushing RX
and TX queues; efx_nic_flush_queues() then polls event queues for
flush events and discards any others it finds. Change it to work with
the regular event handling functions.
Currently efx_start_channel() fills RX queues synchronously when a
device is brought up. This could now race with NAPI, so change it to
send fill events.
This was almost entirely written by Steve Hodgson, formerly
shodgson@solarflare.com.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The RMFT_DEST_MAC and TMFT_SRC_MAC register fields were previously
documented as 44 bits wide, whereas a MAC address has 48 bits.
Thankfully the hardware uses the correct width and the driver has
used separate definitions that divide each of these into 32-bit and
16-bit fields.
Fix the initial definitions for these fields and rewrite the latter
definitions to use them.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On Siena each TX queue can be configured to send only packets for
which there is a TX MAC filter that matches the source MAC address,
queue ID, and optionally VID. This will be used to implement the
'spoofchk' feature for SR-IOV virtual functions.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
On Siena all received packets that don't match a more specific filter
will match the unicast or multicast default filter. Currently we
leave these set to the default values (RSS with base queue number of
0). Allow them to be reconfigured to select a single RX queue.
These default filters are programmed through the FILTER_CTL register,
but we represent them internally as an additional table of size 2.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Log an explicit warning if we are unable to create MTDs for a net
device. Also correct the comment about why mtd_device_register() may
fail; there is no longer an MTD table to fill up.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The 'page size' for PCIe DMA, i.e. the alignment of boundaries at
which DMA must be broken, is 4KB. Name this value as EFX_PAGE_SIZE
and use it in efx_max_tx_len(). Redefine EFX_BUF_SIZE as
EFX_PAGE_SIZE since its value is also a result of that requirement,
and use it in efx_init_special_buffer().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
If efx_pci_probe_main() schedules an INVISIBLE or ALL reset (but
nothing more drastic), we retry it up to 5 times. So far as I'm
aware, this was a workaround for bugs in Falcon A0 which were fixed
in production silicon. Remove the retry.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The code in efx_process_channel() to update the RX queue after each
batch of RX completions works out as a no-op on a TX-only channel
where the RX queue structure is set to all-zeroes, but
(1) efx_channel_get_rx_queue() will BUG() if DEBUG is defined, and
(2) it's a waste of time.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This function returns the page offset of the buffer, which can be
calculated based on either its DMA address or its virtual address. It
used to use the virtual address and we would cast that to unsigned
long, as anything smaller would result in a compiler warning. Now
that it's using the DMA address we should use unsigned int, matching
the return type. It is also unnecessary to use __force.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Replace checksummed and discard booleans from efx_handle_rx_event()
with a bitmask, added to the flags field.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Currently we use type u64 for byte counts, which can very quickly
exceed 2^32, and unsigned long for packet counts, which do not. But
it can still take only 20-something minutes to send or receive 2^32
packets, and not all tools properly handle overflow even if they
sample more often than this.
The MAC statistics are all updated synchronously, so it costs very
little to make them all 64-bit regardless of native word size.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Rename efx_set_multicast_list() to efx_set_rx_mode(), in line
with the operation name net_device_ops::ndo_set_rx_mode.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>