Commit Graph

11338 Commits

Author SHA1 Message Date
Loganaden Velvindron
4a89ba04ec 3c59x: Add BQL support for 3c59x ethernet driver.
This BQL patch is based on work done by Tino Reichardt.

Tested on 0000:05:00.0: 3Com PCI 3c905C Tornado at ffffc90000e6e000 by running
Flent several times.

Signed-off-by: Loganaden Velvindron <logan@elandsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-24 12:20:58 -07:00
Jaedon Shin
b6df7d61c8 net: bcmgenet: fix uncleaned dma flags
Clean the dma flags of multiq ring buffer int the interface stop
process. This patch fixes that the genet is not running while the
interface is re-enabled.

$ ifup eth0 - running after booting
$ ifdown eth0
$ ifup eth0 - not running and occur tx_timeout

The bcmgenet_dma_disable() in bcmgenet_open() do clean ring16 dma flag
only. If the genet has multiq, the dma register is not cleaned. and
bcmgenet_init_dma() is not done correctly. in case
GENET_V2(tx_queues=4), tdma_ctrl has 0x1e after running
bcmgenet_dma_disable().

Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 23:00:41 -07:00
Florian Fainelli
eed635699a net: bcmgenet: Avoid sleeping in bcmgenet_timeout
bcmgenet_timeout() executes in atomic context, yet we will invoke
napi_disable() which does sleep. Looking back at the changes, disabling
TX napi and re-enabling it is completely useless, since we reclaim all
TX buffers and re-enable interrupts, and wake up the TX queues.

Fixes: 13ea657806 ("net: bcmgenet: improve TX timeout")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 22:59:33 -07:00
Noam Camus
41493795a4 NET: nps_enet: minor namespace cleanup
We define buf_int_enable in the minimal namespace it is used.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:08:54 -07:00
Noam Camus
3d99b74ab3 NET: nps_enet: TX done acknowledge.
This is needed for when TX done interrupt is in
"level mode".
For example it is true for some simulators of this device.

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:08:54 -07:00
Noam Camus
de6715677a NET: nps_enet: drop control frames
We set controller to drop control frames and not trying
to pass them on. This is only needed for debug reasons.

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:08:53 -07:00
Noam Camus
93fcf83eb9 NET: nps_enet: TX done race condition
We need to set tx_skb pointer before send frame.
If we receive interrupt before we set pointer we will try
to free SKB with wrong pointer.
Now we are sure that SKB pointer will never be NULL during
handling TX done and check is removed.

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:08:53 -07:00
Noam Camus
0dd20f3ce0 NET: nps_enet: replace use of cause register
When interrupt is received we read directly from control
register for RX/TX instead of reading cause register
since this register fails to indicate TX done when
TX interrupt is "edge mode".

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:08:53 -07:00
Michal Hocko
2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
David S. Miller
dc25b25897 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c

Overlapping additions of new device IDs to qmi_wwan.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-21 11:44:04 -07:00
David S. Miller
1a69205c47 enic: Fix build failure with SRIOV disabled.
err_out_vnic_unregister is used regardless of whether
SRIOV is enabled or not.

Reported-by: Jesse Brandeburg <jesse.brangeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-21 11:43:22 -07:00
Jeremy Linton
938049e18d net: xgene Remove xgene specific phy and MAC lookup functions
Convert the xgene_get_mac_address to device_get_mac_address(), and
xgene_get_phy_mode() to device_get_phy_mode().

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:43:49 -07:00
Jeremy Linton
4d14a63400 smsc911x: Remove dev==NULL check.
The dev==NULL check in smsc911x_probe_config is useless
and isn't providing any additional protection. If a fwnode
doesn't exist then an appropriate error should be returned
by device_get_phy_mode() covering the original case
of a missing of/fwnode.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:36:25 -07:00
Yuval Mintz
0f8f27de19 bnx2x: Fix vxlan endianity issue
Commit f34fa14cc0 ("bnx2x: Add vxlan RSS support") has introduced an
endianity issue when passing the vxlan UDP port to the HW.

Reported-by: <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:08:08 -07:00
David S. Miller
def63be85f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-08-18

This series contains updates to igb, e100, e1000e and ixgbe.

Shota Suzuki provides a fix for a possible overflow in
igb_set_interrupt_capability() which leads to an oops.  When changing the
number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same
manner as when initializing the igb driver.

Vasily Averin provides a fix for a missing rtnl_unlock() for when we
error out due to not being able to allocate memory for our queues.

Stefan Assman provides a couple of fixes for igb/igbvf.  First changes
the igb driver in probe to simply call igb_enable_sriov() instead of
igb_sriov_reinit() since we are starting from scratch.  Then in igbvf,
fix the driver where it does not clear the buffer_info->dma in all
cases after calling dma_unmap_single(), which was found by changing the
MTU twice.

Richard Cochran implements the periodic output function using the
programmable clock outputs available in i210 when possible, falling
back to the target time for longer periods.

Todd adds support for the Marvell PHY 1512 which is required for i354
devices.  Then updates igb to make sure SR-IOV init uses the correct
number of queues, since recent changes could result in the PF holding
onto all of the queues.

Alex Williamson provides a fix in the case where a guest OS does not
support hot-unplug, so disable SR-IOV prior to unregister_netdev() to
avoid the problem.

Jia-Ju Bai provides several patches, first knocks some collecting dust
off an old e100 driver to add a check to avoid a null pointer
dereference.  Then cleans up a possible resource leak by releasing the
skb buffer allocated when the e100_xmit_prepare() runs into an issue
in the DMA mapping.  In igb, add a missing rtnl_unlock() for when we
error out due to igb_sriov_reinit() in the igb_init_interrupt_scheme().
Provides a e1000e fix, based on suggestions from Alex Duyck to move
head/tail register writing to e1000_configure_tx/rx() to avoid a
possible null pointer dereference (similar to igb driver).  Lastly,
fix a possible memory leak in igb_probe(), where the memory shadow_vfta
allocated by kcalloc in igb_sw_init() is not freed.

Mark simplifies port-specific macros for ixgbe by eliminating explicit
comparisons with 0 and enclose formal parameters in parens to eliminate
the risk of an operator precedence issue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18 20:21:32 -07:00
Dan Carpenter
c938a00344 cxgb4: memory corruption in debugfs
You can't use kstrtoul() with an int or it causes memory corruption.
Also j should be unsigned or we have underflow bugs.

I considered changing "j" to unsigned long but everything fits in a u32.

Fixes: 8e3d04fd7d ('cxgb4: Add MPS tracing support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18 19:06:58 -07:00
David S. Miller
3dc33e2322 enic: Fix namespace pollution causing build errors.
drivers/net/built-in.o: In function `.vnic_wq_devcmd2_alloc':
(.text+0x49fe40): multiple definition of `.vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.text+0xb4318): first defined here
drivers/net/built-in.o:(.opd+0x2af00): multiple definition of `vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.opd+0xad70): first defined here
drivers/net/built-in.o: In function `.vnic_wq_init_start':
(.text+0x49f9c0): multiple definition of `.vnic_wq_init_start'
drivers/scsi/built-in.o:(.text+0xb3b58): first defined here
drivers/net/built-in.o:(.opd+0x2ae88): multiple definition of `vnic_wq_init_start'
drivers/scsi/built-in.o:(.opd+0xace0): first defined here

Rename these to 'enic_*' to avoid the conflict with the functiosn of
the same name in the snic scsi driver.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18 14:24:30 -07:00
Rajesh Borundia
f34fa14cc0 bnx2x: Add vxlan RSS support
Latest FW submission added some vxlan offload capabilities to our device.
This patch adds the ability to connect to the vxlan NDOs and configure
the UDP port associated with it in the HW.

The device would now be capable of performing RSS according to the
inner headers of the vxlan packets.

Signed-off-by: Rajesh Borundia <Rajesh.Borundia@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-18 14:21:10 -07:00
Jacob Keller
56d1392f2f ixgbe: TRIVIAL fix up double 'the' and comment style
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:07 -07:00
Mark Rustad
d147329b0a ixgbe: Simplify port-specific macros
Simplify port-specific macros by eliminating explicit comparison
with 0. More importantly, enclose formal parameter in parens to
eliminate the risk of an operator precedence surprise.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:07 -07:00
Todd Fujinaka
ceee3450b3 igb: make sure SR-IOV init uses the right number of queues
Recent changes to igb_probe_vfs() could lead to the PF holding onto all
of the queues. Reorder igb_probe_vfs() to be before
gb_init_queue_configuration() and add some more error checking.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:07 -07:00
Stefan Assmann
fae5ecaee3 igbvf: clear buffer_info->dma after dma_unmap_single()
The driver doesn't clear buffer_info->dma after calling
dma_unmap_single() in all cases. This has been discovered by changing
the mtu twice, which caused the following backtrace.

[   68.569280] WARNING: CPU: 2 PID: 1860 at drivers/iommu/intel-iommu.c:3517 intel_unmap+0x20c/0x220()
[   68.579392] Driver unmaps unmatched page at PFN fffc2a40
[   68.585322] Modules linked in: igbvf ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat kvm_intel kvm igb megs
[   68.599163] CPU: 2 PID: 1860 Comm: ifconfig Not tainted 4.2.0-rc4+ #147
[   68.606543] Hardware name: IBM  -[546025Z]-/00Y7630, BIOS -[VVE134TUS-1.51]- 10/17/2013
[   68.615473]  0000000000000dbd ffff88046441bb08 ffffffff81a5ad0b ffffffff81e2f9ea
[   68.623775]  ffff88046441bb58 ffff88046441bb48 ffffffff81056b55 ffff88047fc583c0
[   68.632075]  0000000000000000 ffff880469a8e600 00000000fffc2a40 ffff880465b32098
[   68.640375] Call Trace:
[   68.643109]  [<ffffffff81a5ad0b>] dump_stack+0x48/0x5d
[   68.648844]  [<ffffffff81056b55>] warn_slowpath_common+0x95/0xe0
[   68.655549]  [<ffffffff81056c56>] warn_slowpath_fmt+0x46/0x70
[   68.661960]  [<ffffffff8158a614>] ? find_iova+0x54/0x90
[   68.667791]  [<ffffffff815988dc>] intel_unmap+0x20c/0x220
[   68.673815]  [<ffffffff8159891e>] intel_unmap_page+0xe/0x10
[   68.680038]  [<ffffffffa0067536>] igbvf_clean_rx_ring+0x96/0x370 [igbvf]
[   68.687516]  [<ffffffffa0067915>] igbvf_down+0x105/0x110 [igbvf]
[   68.694219]  [<ffffffffa0067beb>] igbvf_change_mtu+0x16b/0x180 [igbvf]
[...]

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:06 -07:00
Jia-Ju Bai
42ad1a03b4 igb: Fix a memory leak in igb_probe
In error handling code of igb_probe, the memory adapter->shadow_vfta
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:06 -07:00
Jia-Ju Bai
0845d45e90 e1000e: Modify Tx/Rx configurations to avoid null pointer dereferences in e1000_open
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in "err_setup_rx" segment is executed.
"writel(0, tx_ring->head)" statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null poonter dereference(crash),
because "tx_ring->head" is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.

This patch moves head/tail register writing to e1000_configure_tx/rx,
which can fix this problem. It is inspired by igb_configure_tx_ring
in the igb driver.

Specially, thank Alexander Duyck for his valuable suggestion.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:06 -07:00
Jia-Ju Bai
3eb14ea8d9 igb: Fix a deadlock in igb_sriov_reinit
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Jia-Ju Bai
5e5d49422d e100: Release skb when DMA mapping is failed in e100_xmit_prepare
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer
allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not
released, which causes a possible resource leak.
This patch adds error handling code to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Jia-Ju Bai
9ad607b4a9 e100: Add a check after pci_pool_create to avoid null pointer dereference
The driver lacks the check of nic->cbs_pool after pci_pool_create
in e100_probe. When this function is failed, a null pointer dereference
occurs when pci_pool_alloc uses nic->cbs_pool in e100_alloc_cbs.
This patch adds a check and related error handling code to fix it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Alex Williamson
c23d92b80e igb: Teardown SR-IOV before unregister_netdev()
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci.  In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM.  Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown.  Disabling SR-IOV prior to unregister_netdev() avoids this
issue.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:05 -07:00
Todd Fujinaka
51045ecff0 igb: add support for 1512 PHY
This patch adds support for Marvell PHY 1512 (required for I354).

Submitted by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:04 -07:00
Richard Cochran
30c72916d7 igb: implement high frequency periodic output signals
In addition to interrupt driven target time output events, the i210
also has two programmable clock outputs.  These clocks support periods
between 16 nanoseconds and 140 milliseconds.  This patch implements
the periodic output function using the clock outputs when possible,
falling back to the target time for longer periods.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:04 -07:00
Stefan Assmann
6423fc3416 igb: do not re-init SR-IOV during probe
During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
  ->igb_probe_vfs
    ->igb_pci_enable_sriov
      ->igb_sriov_reinit

Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.

Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:04 -07:00
Vasily Averin
f468adc944 igb: missing rtnl_unlock in igb_sriov_reinit()
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:03 -07:00
Shota Suzuki
72ddef0506 igb: Fix oops caused by missing queue pairing
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.

Fix this problem as follows:
 - When changing the number of queues by "ethtool -L", set
   IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
 - When increasing the size of q_vector, reallocate it appropriately.
   (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)

Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.

Note that before commit cd14ef54d2 ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().

Fixes: 907b783579 ("igb: Add ethtool support to configure number of channels")
CC: stable <stable@vger.kernel.org>
Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-08-18 14:06:03 -07:00
David S. Miller
f376d4adfd enic: Fix sparse warning in vnic_devcmd_init().
>> drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: sparse: incorrect type in assignment (different address spaces)
   drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13:    expected void *res
   drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13:    got void [noderef] <asn:2>*

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 21:24:59 -07:00
David S. Miller
ecf842f65c mlx5e: Fix sparse warnings in mlx5e_handle_csum().
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44:    expected restricted __sum16 [usertype] n
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44:    got restricted __be16 [usertype] check_sum

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 21:22:26 -07:00
Achiad Shochat
bbceefce9a net/mlx5e: Support RX CHECKSUM_COMPLETE
Only for packets with first ethertype set to IPv4/6 for now.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:36 -07:00
Achiad Shochat
3c2d18ef22 net/mlx5e: Support ethtool get/set_pauseparam
Only rx/tx pause settings.
Autoneg setting is currently not supported.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:36 -07:00
Achiad Shochat
6fa1bcab6b net/mlx5e: Ethtool link speed setting fixes
- Port speed settings are applied by the device only upon
  port admin status transition from DOWN to UP.
  So we enforce this transition regardless of the port's
  current operation state (which may be occasionally DOWN if
  for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
  register set/query access functions.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat
d9a40271cf net/mlx5e: HW LRO changes/fixes
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in
  order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat
e842b1001d net/mlx5e: Support smaller RX/TX ring sizes
We un-intentionally limited the minimum rings size too much.

TX minimum ring size reduced from 128 to 64.
RX minimum ring size reduced from 128 to 2.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat
2d75b2bc8a net/mlx5e: Add ethtool RSS configuration options
- get_rxfh_key_size
- get_rxfh_indir_size
- get/set_rxfh indirection table and RSS Toeplitz hash key
- get_rxnfc

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat
936896e908 net/mlx5e: Make RSS indirection table size a constant
The indirection table size was defined by a variable that
was actually assigned a constant value.
Since we do not have any forseen intension to make it configurable
we simply made it a constant.

We also limit the number of channels such that the RSS indirection
table could always populate all RX rings.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Achiad Shochat
57afead544 net/mlx5e: Have a single RSS Toeplitz hash key
No need to generate a unique key per TIR.
Generating a single key per netdev and copying it to all
its TIRs.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
Govindarajulu Varadarajan
373fb0873d enic: add devcmd2
devcmd is an interface for driver to communicate with fw/adaptor. It
involves writing data to hardware registers and waiting for the result.
This mechanism does not scale well. The queuing of "no wait" devcmds is
done in firmware memory rather than on the host. Firmware memory is a
rather more scarce and valuable resource than host memory. A devcmd storm
from one vf can disrupt the service on other pf/vf. The lack of flow
control allows for possible denial of server from one VM to another.

Devcmd2 uses work queue to post the devcmds, just like tx work queue. This
allows better flow control.

Initialize devcmd2, if fails we fall back to devcmd1.

Also change the driver version.

Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:25:29 -07:00
Govindarajulu Varadarajan
fda3f52bdb enic: add devcmd2 resources
Add devcmd resources to vnic_res_type. Add data types used by devcmd.

Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:25:29 -07:00
Govindarajulu Varadarajan
6a3c2f838c enic: use netdev_<foo> or dev_<foo> instead of pr_<foo>
pr_info does not give any details about the interface involved. This patch
uses netdev_info for printing the message. Use dev_info where netdev is not
ready.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:25:29 -07:00
Govindarajulu Varadarajan
8b89f3a19d enic: move struct definition from .c to .h file
Some of the structure definitions are in .c file to make them private to
that file. This patch moves the struct definition to .h file, So that their
definitions are accessible from other files.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:25:29 -07:00
Ivan Vecera
af19e68683 be2net: avoid vxlan offloading on multichannel configs
VxLAN offloading is not functional if the NIC is running in multichannel
mode (UMC, FLEX-10, VNIC...). Enabling this additionally kills whole
connectivity through the NIC and the device needs to be down and up to
restore it. The firmware should take care about it and does not allow
the conversion of interface to tunnel type (be_cmd_manage_iface) or should
support VxLAN offloading if multichannel config is enabled.
I have tested this on the latest available firmware (10.6.144.21).

Result:
[root@sm-04 ~]# ip link set enp5s0f0 up[root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
[root@sm-04 ~]# ping -c 3 172.30.10.254PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
[root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
[root@sm-04 ~]# ip link set vxlan10 up
[ 7900.442811] be2net 0000:05:00.0: Enabled VxLAN offloads for UDP port 4789
[ 7900.455722] be2net 0000:05:00.1: Enabled VxLAN offloads for UDP port 4789
[ 7900.468635] be2net 0000:05:00.2: Enabled VxLAN offloads for UDP port 4789
[ 7900.481553] be2net 0000:05:00.3: Enabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

[root@sm-04 ~]# ip link set vxlan10 down
[ 7959.434093] be2net 0000:05:00.0: Disabled VxLAN offloads for UDP port 4789
[ 7959.444792] be2net 0000:05:00.1: Disabled VxLAN offloads for UDP port 4789
[ 7959.455592] be2net 0000:05:00.2: Disabled VxLAN offloads for UDP port 4789
[ 7959.466416] be2net 0000:05:00.3: Disabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ip link del vxlan10
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

[root@sm-04 ~]# ip link set enp5s0f0 down
[root@sm-04 ~]# ip link set enp5s0f0 up
[ 8071.019003] be2net 0000:05:00.0 enp5s0f0: Link is Up
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms

Cc: Sathya Perla <sathya.perla@avagotech.com>
Cc: Ajit Khaparde <ajit.khaparde@avagotech.com>
Cc: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Ajit Khaparde <ajit.khaparde@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 14:29:58 -07:00
Guenter Roeck
62ee783bf1 smsc911x: Fix crash seen if neither ACPI nor OF is configured or used
Commit 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT") makes
the call to smsc911x_probe_config() unconditional, and no longer fails if
there is no device node. device_get_phy_mode() is called unconditionally,
and if there is no phy node configured returns an error code. This error
code is assigned to phy_interface, and interpreted elsewhere in the code
as valid phy mode. This in turn causes qemu to crash when running a
variant of realview_pb_defconfig.

	qemu: hardware error: lan9118_read: Bad reg 0x86

Fixes: 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT")
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 14:06:16 -07:00
Dan Carpenter
2902bc66fa net: ethernet: micrel: fix an error code
The dma_mapping_error() function returns true or false.  We should
return -ENOMEM if it there is a dma mapping error.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 12:23:22 -07:00
Yuval Mintz
da3cc2da7c bnx2: Fix bandwidth allocation for some MF modes
Management firmware tells driver in case bandwidth configuration for
a specific function exists, but [regretably] the same field has different
meanings depending on the multi-function mode - it can either be
a percentile value or an actual speed.

For newer multi-function modes current logic is incorrect -
driver understands values as actual speeds instead of percentages,
causing the resulting chip configuration to be incorrect.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 10:27:57 -07:00
Claudiu Manoil
2a4eebf0c4 gianfar: Restore link state settings after MAC reset
There are some MAC registers that need to be kept in sync
with the link state parameters, see adjust_link().
However, after a MAC soft reset default values for
these registers are assumed.  In some cases (excepting
if down/ if up for example) adjust_link() does not see
that these values were reset to default because the
priv->old* link parameters were left unchanged.
So, reset the priv->old* link params as well during a
MAC reset to let adjust_link() restore the MAC link
settings to the actual link state values.

Fixes following case, for example:
Setting link to 100M, changing MTU (implies MAC reset),
link state remains unchanged to 100M but MAC registers
were reset to default (1G) breaking the connectivity w/
the PHY.  Closing and re-opening the interface would
restore the MAC link parameters to the correct values.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 21:26:23 -07:00
Hariprasad Shenai
8e3d04fd7d cxgb4: Add MPS tracing support
Handle TRACE_PKT, stack can sniff them on the first port
Add debubfs enrty to configure tracing for offload traffic like iWARP
& iSCSI for debugging purpose.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 21:14:48 -07:00
Scott Feldman
dd19f83d6c rocker: hook ndo_neigh_destroy to cleanup neigh refs in driver
Rocker driver tracks arp_tbl neighs to resolve IPv4 route nexthops.  The
driver uses NETEVENT_NEIGH_UPDATE for neigh adds and updates, but there is
no event when the neigh is removed from the device (such as when the device
goes admin down).  This patches hooks ndo_neigh_destroy so the driver can
know when a neigh is removed from the device.  In response, the driver will
purge the neigh entry from its internal tbl.

I didn't find an in-tree users of ndo_neigh_destroy, so I'm not sure if
this ndo is vestigial or if there are out-of-tree users.  In any case, it
does what I need here.  An alternative design would be to generate
NETEVENT_NEIGH_UPDATE event when neigh is being destroyed, setting state to
NUD_NONE so driver knows neigh entry is dead.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 17:05:46 -07:00
Scott Feldman
c8beb5b261 rocker: print switch ID consistent with phys_switch_id sysfs node
On sucessful probe, driver prints the switch ID.  This patch changes the
format of the printed ID to match what's used in sysfs phys_switch_id node.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 17:05:46 -07:00
Jeremy Linton
0b50dc4fc9 Convert smsc911x to use ACPI as well as DT
Add ACPI bindings for the smsc911x driver. Convert the DT specific calls
to nonspecific device* calls, This allows the driver to work
with both ACPI and DT configurations. Ethernet should now work when using
ACPI on ARM Juno.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:58:29 -07:00
Mugunthan V N
7da1160002 drivers: net: cpsw: add am335x errata workarround for interrutps
As per Am335x Errata [1] Advisory 1.0.9, The CPSW C0_TX_PEND and
C0_RX_PEND interrupt outputs provide a single transmit interrupt
that combines transmit channel interrupts TXPEND[7:0] and a
single receive interrupt that combines receive channel interrupts
RXPEND[7:0]. The TXPEND[0] and RXPEND[0] interrupt outputs are
connected to the ARM Cortex-A8 interrupt controller (INTC) rather
than the C0_TX_PEND and C0_RX_PEND interrupt outputs. So even
though CPSW interrupt is cleared by writing appropriate values to
EOI register the interrupt is not cleared in IRQ controller. So
interrupt is still pending and CPU is struck in ISR, the
workaround is to disable the interrupts in ARM irq controller.

[1] http://www.ti.com/lit/er/sprz360f/sprz360f.pdf

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:51:00 -07:00
David S. Miller
182ad468e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cavium/Kconfig

The cavium conflict was overlapping dependency
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:23:11 -07:00
Fabio Estevam
b0c6ce2491 net: fec: Remove unneeded use of IS_ERR_VALUE() macro
There is no need to use the IS_ERR_VALUE() macro for checking
the return value from pm_runtime_* functions.

Just do a simple negative test instead.

The semantic patch that makes this change is available
in scripts/coccinelle/api/pm_runtime.cocci.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:45:46 -07:00
Hariprasad Shenai
a4011fd470 cxgb4: Add debugfs support to dump tid info
Add debugfs support to dump tid info like stid, sftid, tids, atid and
hwtids

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:42:12 -07:00
Hariprasad Shenai
2248b29349 cxgb4: Differentiate between stids between server and filter region
For T4 adapter, offloaded servers tid for IPv4 connections are
allocated from filter region. So add a new field for server filter tid if
server tid is allocated from filter region.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:42:12 -07:00
Hariprasad Shenai
9a1bb9f64e cxgb4: Differentiates between TIDs being used in TCAM and HASH
For the tid info, differentiate from which region the TID is allocated
from. It can be from TCAM region or HASH region.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:42:12 -07:00
Hariprasad Shenai
e106a4d9ed cxgb4: Add some more details to sge qinfo
Adding more details to sge qinfo for debugging purpose.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:42:11 -07:00
Ron Angeles
47b344b27a net: atl1c: add BQL support
This BQL implementation is mostly derived from its related driver, alx.
Tested on AR8131 (rev c0) [1969:1063]. Saturated a 100mbps link with 5
concurrent runs of netperf. Ping latency dropped from 14ms to 3ms.

Signed-off-by: Ron Angeles <ronangeles@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 16:27:40 -07:00
Jakub Kicinski
1f2b729334 gianfar: remove faulty filer optimizer
Current filer rule optimization is broken in several ways:
 (1) Can perform reads/writes beyond end of allocated tables.
     (gianfar_ethtool.c:1326).

(2) It breaks badly for rules with more than 2 specifiers
     (e.g. matching ip, port, tos).

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: TOS  == 00000003 AND         Q:00           ctrl:0000008a prop:00000003
05: DIA  == 0a000003 AND         Q:11           ctrl:0000448c prop:0a000003
06: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
07: TOS  == 00000002 AND         Q:00           ctrl:0000008a prop:00000002
08: DIA  == 0a000002 AND         Q:09           ctrl:0000248c prop:0a000002
09: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
0a: DPT  == 00000001 AND         Q:00           ctrl:0000008e prop:00000001
0b: TOS  == 00000001     CLE     Q:01           ctrl:0000060a prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

(Entire cluster gets AND-ed together).

 (3) We observed that the masking rules it generates do not
     play well with clustering on P2020.  Only first rule
     of the cluster would ever fire.  Given that optimizer
     relies heavily on masking this is very hard to fix.

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1  action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2  action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3  action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: DIA  == 0a000003             Q:11           ctrl:0000440c prop:0a000003
05: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
06: DIA  == 0a000002             Q:09           ctrl:0000240c prop:0a000002
07: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
08: DPT  == 00000001     CLE     Q:01           ctrl:0000060e prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules.  We found no errata covering this.

The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.

Reported-by: Aleksander Dutkowski <adutkowski@gmail.com>
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 14:47:06 -07:00
Jakub Kicinski
b5c8c8906e gianfar: correct list membership accounting
At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 14:47:05 -07:00
Jakub Kicinski
a898fe040f gianfar: correct filer table writing
MAX_FILER_IDX is the last usable index.  Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 14:47:05 -07:00
Tobias Klauser
c3ffe0ca4a net: eth: altera: Remove sgdmadesclen member from altera_tse_private
altera_tse_private->sgdmadesclen is always assigned assigned the same
value and never changes during runtime.  Remove the struct member and
use a new define for sizeof(struct sgdma_descrip) instead.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-12 14:31:07 -07:00
LEROY Christophe
c68875fa82 net: fs_enet: mask interrupts for TX partial frames.
We are not interested in interrupts for partially transmitted frames.
Unlike SCC and FCC, the FEC doesn't handle the I bit in buffer
descriptors, instead it defines two interrupt bits, TXB and TXF.

We have to mask TXB in order to only get interrupts once the
frame is fully transmitted.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 12:05:34 -07:00
LEROY Christophe
8961822c46 net: fs_enet: explicitly remove I flag on TX partial frames
We are not interested in interrupts for partially transmitted frames,
we have to clear BD_ENET_TX_INTR explicitly otherwise it may remain
from a previously used descriptor.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 12:05:34 -07:00
Vivien Didelot
ce80e7bc57 net: switchdev: support static FDB addresses
This patch adds an ndm_state member to the switchdev_obj_fdb structure,
in order to support static FDB addresses.

Set Rocker ndm_state to NUD_REACHABLE.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 12:03:19 -07:00
David S. Miller
cdf0969763 Revert "Merge branch 'mv88e6xxx-switchdev-fdb'"
This reverts commit f1d5ca4344, reversing
changes made to 4933d85c51.

I applied v2 instead of v3.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 12:00:37 -07:00
David Daney
46b903a01c net, thunder, bgx: Add support to get MAC address from ACPI.
Currently there is no way to get the MAC address in a firmware
independent manner, so set the MAC address of the device directly from
the ACPI tables.

The binding agrees with the proposed standard here:

http://www.uefi.org/sites/default/files/resources/nic-request-v2.pdf

Based on code from: Narinder Dhillon <ndhillon@cavium.com>
                    Tomasz Nowicki <tomasz.nowicki@linaro.org>
                    Robert Richter <rrichter@cavium.com>

Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 11:47:57 -07:00
Robert Richter
de387e1156 net: thunder: Factor out DT specific code in BGX
Separate DT code in preparation for follow-on ACPI integration.

Based on code from: Tomasz Nowicki <tomasz.nowicki@linaro.org>

Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-11 11:47:57 -07:00
Fabio Estevam
ed8db18dea mellanox: mlxsw: Use '%zx' to print size_t format
Use '%zx' to print size_t format in order to fix the following build warning:

drivers/net/ethernet/mellanox/mlxsw/item.h:65:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'size_t' [-Wformat=]

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 21:12:49 -07:00
Yuval Mintz
0ea853dfa9 bnx2x: Free NVRAM lock at end of each page
Writing each 4Kb page into flash might take up-to ~100 miliseconds,
during which time management firmware cannot acces the nvram for its
own uses.

Firmware upgrade utility use the ethtool API to burn new flash images
for the device via the ethtool API, doing so by writing several page-worth
of data on each command. Such action might create problems for the
management firmware, as the nvram might not be accessible for a long time.

This patch changes the write implementation, releasing the nvram lock on
the completion of each page, allowing the management firmware time to
claim it and perform its own required actions.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 14:31:59 -07:00
Yuval Mintz
e1615903eb bnx2x: Prevent null pointer dereference on SKB release
On error flows its possible to free an SKB even if it was not allocated.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 14:31:58 -07:00
Dan Carpenter
2f3a87326d cxgb4: cleanup some indenting
Add or remove some tabs so that statements line up correctly.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 14:26:53 -07:00
Dan Carpenter
21a447637d cxgb4: missing curly braces in t4_setup_debugfs()
There were missing curly braces so it means we call add_debugfs_mem()
unintentionally.

Fixes: 3ccc6cf74d ('cxgb4: Adds support for T6 adapter')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 14:26:26 -07:00
Shahed Shaikh
02509f171d qlcnic: Update version to 5.3.63
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:29 -07:00
Shahed Shaikh
0e90ad9bfd qlcnic: Don't use kzalloc unncecessarily for allocating large chunk of memory
Driver allocates a large chunk of temporary buffer using kzalloc
to copy FW image. As there is no real need of this memory to be
physically contiguous, use vzalloc instead.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:28 -07:00
Shahed Shaikh
da286a6fd1 qlcnic: Add new VF device ID 0x8C30
This is a 83xx series based VF device

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:28 -07:00
Shahed Shaikh
642de51025 qlcnic: Print firmware minidump buffer and template header addresses
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:28 -07:00
Shahed Shaikh
d01a6d3c8a qlcnic: Add support to enable capability to extend minidump for iSCSI
In some cases it is required to capture minidump for iSCSI functions
as part of default minidump collection process. To enable this, firmware
exports it's capability and driver need to enable that capability
by issuing a mailbox command.

With this feature, firmware can provide additional iSCSI function's
minidump with smaller minidump capture mask (0x1f).

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:28 -07:00
Harish Patil
a930a4639d qlcnic: Rearrange ordering of header files inclusion
Include local headers files after kernel's header files.

Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:34:28 -07:00
Kevin Hao
c4bc44c65b net: fec: fix the race between xmit and bdp reclaiming path
When we transmit a fragmented skb, we may run into a race like the
following scenario (assume txq->cur_tx is next to txq->dirty_tx):
           cpu 0                                          cpu 1
  fec_enet_txq_submit_skb
    reserve a bdp for the first fragment
    fec_enet_txq_submit_frag_skb
       update the bdp for the other fragment
       update txq->cur_tx
                                                   fec_enet_tx_queue
                                                     bdp = fec_enet_get_nextdesc(txq->dirty_tx, fep, queue_id);
                                                     This bdp is the bdp reserved for the first segment. Given
                                                     that this bdp BD_ENET_TX_READY bit is not set and txq->cur_tx
                                                     is already pointed to a bdp beyond this one. We think this is a
                                                     completed bdp and try to reclaim it.
    update the bdp for the first segment
    update txq->cur_tx

So we shouldn't update the txq->cur_tx until all the update to the
bdps used for fragments are performed. Also add the corresponding
memory barrier to guarantee that the update to the bdps, dirty_tx and
cur_tx performed in the proper order.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:28:14 -07:00
Ivan Vecera
ade4dc3e61 bna: fix interrupts storm caused by erroneous packets
The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
increment from the beginning of the NAPI processing loop after the check
for erroneous packets so they are never accounted. This counter is used
to inform firmware about number of processed completions (packets).
As these packets are never acked the firmware fires IRQs for them again
and again.

Fixes: e29aa33 ("bna: Enable Multi Buffer RX")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 10:58:37 -07:00
Marcin Wojtas
edc660fa09 net: mvpp2: replace TX coalescing interrupts with hrtimer
The PP2 controller is capable of per-CPU TX processing, which means there are
per-CPU banked register sets and queues. Current version of the driver supports
TX packet coalescing - once on given CPU sent packets amount reaches a threshold
value, an IRQ occurs. However, there is a single interrupt line responsible for
CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not
support RSS).

When the top-half executes the interrupt cause is not known. This is why in
NAPI poll function, along with RX processing, IRQ cause register on both
CPU's is accessed in order to determine on which of them the TX coalescing
threshold might have been reached. Thus the egress processing and releasing the
buffers is able to take place on the corresponding CPU. Hitherto approach lead
to an illegal usage of on_each_cpu function in softirq context.

The problem is solved by resigning from TX coalescing interrupts and separating
egress finalization from NAPI processing. For that purpose a method of using
hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released
once a software coalescing threshold is reached. In case not all the data is
processed a timer is set on this CPU - in its interrupt context a tasklet is
scheduled in which all queues are processed. At once only one timer per-CPU can
be running, which is controlled by a dedicated flag.

This commit removes TX processing from NAPI polling function, disables hardware
coalescing and enables hrtimer with tasklet, using new per-CPU port structure
(mvpp2_port_pcpu).

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 10:57:00 -07:00
Marcin Wojtas
71ce391dfb net: mvpp2: enable proper per-CPU TX buffers unmapping
mvpp2 driver allows usage of per-CPU TX processing. Once the packets are
prepared independetly on each CPU, the hardware enqueues the descriptors in
common TX queue. After they are sent, the buffers and associated sk_buffs
should be released on the corresponding CPU.

This is why a special index is maintained in order to point to the right data to
be released after transmission takes place. Each per-CPU TX queue comprise an
array of sent sk_buffs, freed in mvpp2_txq_bufs_free function. However, the
index was used there also for obtaining a descriptor (and therefore a buffer to
be DMA-unmapped) from common TX queue, which was wrong, because it was not
referring to the current CPU.

This commit enables proper unmapping of sent data buffers by indexing them in
per-CPU queues using a dedicated array for keeping their physical addresses.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 10:57:00 -07:00
Marcin Wojtas
d53793c5d6 net: mvpp2: remove excessive spinlocks from driver initialization
Using spinlocks protection during one-time driver initialization is not
necessary. Moreover it resulted in invalid GFP_KERNEL allocation under the lock.

This commit removes redundant spinlocks from buffer manager part of mvpp2
initialization.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Reported-by: Alexandre Fournier <alexandre.fournier@wisp-e.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 10:57:00 -07:00
Ido Schimmel
e577516b9d mlxsw: Fix use-after-free bug in mlxsw_sx_port_xmit
Store the length of the skb before transmitting it and use it for stats
instead of skb->len, since skb might have been freed already.

This issue was discovered using the Kernel Address sanitizer (KASan).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Ido Schimmel
3bfcd34764 mlxsw: Use correct skb length when dumping payload
Do not use the length of the transmitted skb (which was freed), but
that of the response skb.

This issue was discovered using the Kernel Address sanitizer (KASan).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Ido Schimmel
d003462a50 mlxsw: Simplify mlxsw_sx_port_xmit function
Previously we only checked if the transmission queue is not full in the
middle of the xmit function. This lead to complex logic due to the fact
that sometimes we need to reallocate the headroom for our Tx header.

Allow the switch driver to know if the transmission queue is not full
before sending the packet and remove this complex logic.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Jiri Pirko
7b7b9cff74 mlxsw: Strip FCS from incoming packets
FCS of incoming packets is already checked by HW. Just strip it out.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:10 -07:00
Jiri Pirko
74ed207e2a mlxsw: Make pci module dependent on HAS_DMA and HAS_IOMEM
This resolves compile errors on um-allyesconfig.

Note that there are many other drivers which have the same issue.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Ido Schimmel
e61011b5e0 mlxsw: Make system port to local port mapping explicit
System ports are unique identifiers in a multi-ASIC environment that
represent all the available ports in the system. Local ports on the
other hand, are unique only within the local ASIC.

Since system port to local port mapping is not part of the HW-SW
contract and since only single-ASIC configurations are currently
supported, set an explicit 1:1 mapping by configuring the Switch System
Port Record (SSPR) register.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Ido Schimmel
26a80f6e54 mlxsw: Call free_netdev when removing port
When removing a port's netdevice we should also free the memory
allocated by alloc_etherdev(). Do this by calling free_netdev() at the
end of the teardown sequence.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:54:09 -07:00
Vivien Didelot
1525c386a1 net: switchdev: change fdb addr for a byte array
The address in the switchdev_obj_fdb structure is currently represented
as a pointer. Replacing it for a 6-byte array allows switchdev to carry
addresses directly read from hardware registers, not stored by the
switch chip driver (as in Rocker).

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:48:08 -07:00
Carol L Soto
fe1e1876d8 net/mlx5_core: Set log_uar_page_sz for non 4K page size architecture
failed to configure the page size for architectures with page size
different than 4K.

Fixes: 938fe83 ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 15:55:48 -07:00
Kalesh AP
649886a36b be2net: protect eqo->affinity_mask from getting freed twice
There are paths in the driver such as an unrecoverable error (UE) detection
followed by a driver unload wherein be_clear() is invoked twice.
Individual data structures are reset so that they are not cleaned/freed
twice. This patch does the same for eqo->affinity_mask. It is freed only
if EQs haven't yet been destroyed. This fixes a possible crash when
affinity_mask is freed twice.

Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 11:53:06 -07:00