This BQL patch is based on work done by Tino Reichardt.
Tested on 0000:05:00.0: 3Com PCI 3c905C Tornado at ffffc90000e6e000 by running
Flent several times.
Signed-off-by: Loganaden Velvindron <logan@elandsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean the dma flags of multiq ring buffer int the interface stop
process. This patch fixes that the genet is not running while the
interface is re-enabled.
$ ifup eth0 - running after booting
$ ifdown eth0
$ ifup eth0 - not running and occur tx_timeout
The bcmgenet_dma_disable() in bcmgenet_open() do clean ring16 dma flag
only. If the genet has multiq, the dma register is not cleaned. and
bcmgenet_init_dma() is not done correctly. in case
GENET_V2(tx_queues=4), tdma_ctrl has 0x1e after running
bcmgenet_dma_disable().
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcmgenet_timeout() executes in atomic context, yet we will invoke
napi_disable() which does sleep. Looking back at the changes, disabling
TX napi and re-enabling it is completely useless, since we reclaim all
TX buffers and re-enable interrupts, and wake up the TX queues.
Fixes: 13ea657806 ("net: bcmgenet: improve TX timeout")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We define buf_int_enable in the minimal namespace it is used.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed for when TX done interrupt is in
"level mode".
For example it is true for some simulators of this device.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We set controller to drop control frames and not trying
to pass them on. This is only needed for debug reasons.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to set tx_skb pointer before send frame.
If we receive interrupt before we set pointer we will try
to free SKB with wrong pointer.
Now we are sure that SKB pointer will never be NULL during
handling TX done and check is removed.
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When interrupt is received we read directly from control
register for RX/TX instead of reading cause register
since this register fails to indicate TX done when
TX interrupt is "edge mode".
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():
if (page->pfmemalloc && !page->mapping)
skb->pfmemalloc = true;
It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted. However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone. Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.
So the assumption is invalid if the networking code can see such a page.
And it seems it can. We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf. There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.
The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead. We can reuse the index
again but define it to an impossible value (-1UL). This is the page
index so it should never see the value that large. Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.
The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g. what SLAB and SLUB do).
[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org> [3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
err_out_vnic_unregister is used regardless of whether
SRIOV is enabled or not.
Reported-by: Jesse Brandeburg <jesse.brangeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the xgene_get_mac_address to device_get_mac_address(), and
xgene_get_phy_mode() to device_get_phy_mode().
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dev==NULL check in smsc911x_probe_config is useless
and isn't providing any additional protection. If a fwnode
doesn't exist then an appropriate error should be returned
by device_get_phy_mode() covering the original case
of a missing of/fwnode.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit f34fa14cc0 ("bnx2x: Add vxlan RSS support") has introduced an
endianity issue when passing the vxlan UDP port to the HW.
Reported-by: <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-08-18
This series contains updates to igb, e100, e1000e and ixgbe.
Shota Suzuki provides a fix for a possible overflow in
igb_set_interrupt_capability() which leads to an oops. When changing the
number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same
manner as when initializing the igb driver.
Vasily Averin provides a fix for a missing rtnl_unlock() for when we
error out due to not being able to allocate memory for our queues.
Stefan Assman provides a couple of fixes for igb/igbvf. First changes
the igb driver in probe to simply call igb_enable_sriov() instead of
igb_sriov_reinit() since we are starting from scratch. Then in igbvf,
fix the driver where it does not clear the buffer_info->dma in all
cases after calling dma_unmap_single(), which was found by changing the
MTU twice.
Richard Cochran implements the periodic output function using the
programmable clock outputs available in i210 when possible, falling
back to the target time for longer periods.
Todd adds support for the Marvell PHY 1512 which is required for i354
devices. Then updates igb to make sure SR-IOV init uses the correct
number of queues, since recent changes could result in the PF holding
onto all of the queues.
Alex Williamson provides a fix in the case where a guest OS does not
support hot-unplug, so disable SR-IOV prior to unregister_netdev() to
avoid the problem.
Jia-Ju Bai provides several patches, first knocks some collecting dust
off an old e100 driver to add a check to avoid a null pointer
dereference. Then cleans up a possible resource leak by releasing the
skb buffer allocated when the e100_xmit_prepare() runs into an issue
in the DMA mapping. In igb, add a missing rtnl_unlock() for when we
error out due to igb_sriov_reinit() in the igb_init_interrupt_scheme().
Provides a e1000e fix, based on suggestions from Alex Duyck to move
head/tail register writing to e1000_configure_tx/rx() to avoid a
possible null pointer dereference (similar to igb driver). Lastly,
fix a possible memory leak in igb_probe(), where the memory shadow_vfta
allocated by kcalloc in igb_sw_init() is not freed.
Mark simplifies port-specific macros for ixgbe by eliminating explicit
comparisons with 0 and enclose formal parameters in parens to eliminate
the risk of an operator precedence issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
You can't use kstrtoul() with an int or it causes memory corruption.
Also j should be unsigned or we have underflow bugs.
I considered changing "j" to unsigned long but everything fits in a u32.
Fixes: 8e3d04fd7d ('cxgb4: Add MPS tracing support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/built-in.o: In function `.vnic_wq_devcmd2_alloc':
(.text+0x49fe40): multiple definition of `.vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.text+0xb4318): first defined here
drivers/net/built-in.o:(.opd+0x2af00): multiple definition of `vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.opd+0xad70): first defined here
drivers/net/built-in.o: In function `.vnic_wq_init_start':
(.text+0x49f9c0): multiple definition of `.vnic_wq_init_start'
drivers/scsi/built-in.o:(.text+0xb3b58): first defined here
drivers/net/built-in.o:(.opd+0x2ae88): multiple definition of `vnic_wq_init_start'
drivers/scsi/built-in.o:(.opd+0xace0): first defined here
Rename these to 'enic_*' to avoid the conflict with the functiosn of
the same name in the snic scsi driver.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest FW submission added some vxlan offload capabilities to our device.
This patch adds the ability to connect to the vxlan NDOs and configure
the UDP port associated with it in the HW.
The device would now be capable of performing RSS according to the
inner headers of the vxlan packets.
Signed-off-by: Rajesh Borundia <Rajesh.Borundia@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simplify port-specific macros by eliminating explicit comparison
with 0. More importantly, enclose formal parameter in parens to
eliminate the risk of an operator precedence surprise.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Recent changes to igb_probe_vfs() could lead to the PF holding onto all
of the queues. Reorder igb_probe_vfs() to be before
gb_init_queue_configuration() and add some more error checking.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In error handling code of igb_probe, the memory adapter->shadow_vfta
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in "err_setup_rx" segment is executed.
"writel(0, tx_ring->head)" statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null poonter dereference(crash),
because "tx_ring->head" is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.
This patch moves head/tail register writing to e1000_configure_tx/rx,
which can fix this problem. It is inspired by igb_configure_tx_ring
in the igb driver.
Specially, thank Alexander Duyck for his valuable suggestion.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer
allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not
released, which causes a possible resource leak.
This patch adds error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver lacks the check of nic->cbs_pool after pci_pool_create
in e100_probe. When this function is failed, a null pointer dereference
occurs when pci_pool_alloc uses nic->cbs_pool in e100_alloc_cbs.
This patch adds a check and related error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci. In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM. Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown. Disabling SR-IOV prior to unregister_netdev() avoids this
issue.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for Marvell PHY 1512 (required for I354).
Submitted by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In addition to interrupt driven target time output events, the i210
also has two programmable clock outputs. These clocks support periods
between 16 nanoseconds and 140 milliseconds. This patch implements
the periodic output function using the clock outputs when possible,
falling back to the target time for longer periods.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
->igb_probe_vfs
->igb_pci_enable_sriov
->igb_sriov_reinit
Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.
Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.
Fix this problem as follows:
- When changing the number of queues by "ethtool -L", set
IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
- When increasing the size of q_vector, reallocate it appropriately.
(With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.
Note that before commit cd14ef54d2 ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().
Fixes: 907b783579 ("igb: Add ethtool support to configure number of channels")
CC: stable <stable@vger.kernel.org>
Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: sparse: incorrect type in assignment (different address spaces)
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: expected void *res
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: got void [noderef] <asn:2>*
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: expected restricted __sum16 [usertype] n
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: got restricted __be16 [usertype] check_sum
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only for packets with first ethertype set to IPv4/6 for now.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only rx/tx pause settings.
Autoneg setting is currently not supported.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Port speed settings are applied by the device only upon
port admin status transition from DOWN to UP.
So we enforce this transition regardless of the port's
current operation state (which may be occasionally DOWN if
for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
register set/query access functions.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in
order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We un-intentionally limited the minimum rings size too much.
TX minimum ring size reduced from 128 to 64.
RX minimum ring size reduced from 128 to 2.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirection table size was defined by a variable that
was actually assigned a constant value.
Since we do not have any forseen intension to make it configurable
we simply made it a constant.
We also limit the number of channels such that the RSS indirection
table could always populate all RX rings.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to generate a unique key per TIR.
Generating a single key per netdev and copying it to all
its TIRs.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devcmd is an interface for driver to communicate with fw/adaptor. It
involves writing data to hardware registers and waiting for the result.
This mechanism does not scale well. The queuing of "no wait" devcmds is
done in firmware memory rather than on the host. Firmware memory is a
rather more scarce and valuable resource than host memory. A devcmd storm
from one vf can disrupt the service on other pf/vf. The lack of flow
control allows for possible denial of server from one VM to another.
Devcmd2 uses work queue to post the devcmds, just like tx work queue. This
allows better flow control.
Initialize devcmd2, if fails we fall back to devcmd1.
Also change the driver version.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devcmd resources to vnic_res_type. Add data types used by devcmd.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pr_info does not give any details about the interface involved. This patch
uses netdev_info for printing the message. Use dev_info where netdev is not
ready.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure definitions are in .c file to make them private to
that file. This patch moves the struct definition to .h file, So that their
definitions are accessible from other files.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VxLAN offloading is not functional if the NIC is running in multichannel
mode (UMC, FLEX-10, VNIC...). Enabling this additionally kills whole
connectivity through the NIC and the device needs to be down and up to
restore it. The firmware should take care about it and does not allow
the conversion of interface to tunnel type (be_cmd_manage_iface) or should
support VxLAN offloading if multichannel config is enabled.
I have tested this on the latest available firmware (10.6.144.21).
Result:
[root@sm-04 ~]# ip link set enp5s0f0 up[root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
[root@sm-04 ~]# ping -c 3 172.30.10.254PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
[root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
[root@sm-04 ~]# ip link set vxlan10 up
[ 7900.442811] be2net 0000:05:00.0: Enabled VxLAN offloads for UDP port 4789
[ 7900.455722] be2net 0000:05:00.1: Enabled VxLAN offloads for UDP port 4789
[ 7900.468635] be2net 0000:05:00.2: Enabled VxLAN offloads for UDP port 4789
[ 7900.481553] be2net 0000:05:00.3: Enabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
[root@sm-04 ~]# ip link set vxlan10 down
[ 7959.434093] be2net 0000:05:00.0: Disabled VxLAN offloads for UDP port 4789
[ 7959.444792] be2net 0000:05:00.1: Disabled VxLAN offloads for UDP port 4789
[ 7959.455592] be2net 0000:05:00.2: Disabled VxLAN offloads for UDP port 4789
[ 7959.466416] be2net 0000:05:00.3: Disabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ip link del vxlan10
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms
[root@sm-04 ~]# ip link set enp5s0f0 down
[root@sm-04 ~]# ip link set enp5s0f0 up
[ 8071.019003] be2net 0000:05:00.0 enp5s0f0: Link is Up
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms
--- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms
Cc: Sathya Perla <sathya.perla@avagotech.com>
Cc: Ajit Khaparde <ajit.khaparde@avagotech.com>
Cc: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Ajit Khaparde <ajit.khaparde@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT") makes
the call to smsc911x_probe_config() unconditional, and no longer fails if
there is no device node. device_get_phy_mode() is called unconditionally,
and if there is no phy node configured returns an error code. This error
code is assigned to phy_interface, and interpreted elsewhere in the code
as valid phy mode. This in turn causes qemu to crash when running a
variant of realview_pb_defconfig.
qemu: hardware error: lan9118_read: Bad reg 0x86
Fixes: 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT")
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dma_mapping_error() function returns true or false. We should
return -ENOMEM if it there is a dma mapping error.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware tells driver in case bandwidth configuration for
a specific function exists, but [regretably] the same field has different
meanings depending on the multi-function mode - it can either be
a percentile value or an actual speed.
For newer multi-function modes current logic is incorrect -
driver understands values as actual speeds instead of percentages,
causing the resulting chip configuration to be incorrect.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some MAC registers that need to be kept in sync
with the link state parameters, see adjust_link().
However, after a MAC soft reset default values for
these registers are assumed. In some cases (excepting
if down/ if up for example) adjust_link() does not see
that these values were reset to default because the
priv->old* link parameters were left unchanged.
So, reset the priv->old* link params as well during a
MAC reset to let adjust_link() restore the MAC link
settings to the actual link state values.
Fixes following case, for example:
Setting link to 100M, changing MTU (implies MAC reset),
link state remains unchanged to 100M but MAC registers
were reset to default (1G) breaking the connectivity w/
the PHY. Closing and re-opening the interface would
restore the MAC link parameters to the correct values.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle TRACE_PKT, stack can sniff them on the first port
Add debubfs enrty to configure tracing for offload traffic like iWARP
& iSCSI for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rocker driver tracks arp_tbl neighs to resolve IPv4 route nexthops. The
driver uses NETEVENT_NEIGH_UPDATE for neigh adds and updates, but there is
no event when the neigh is removed from the device (such as when the device
goes admin down). This patches hooks ndo_neigh_destroy so the driver can
know when a neigh is removed from the device. In response, the driver will
purge the neigh entry from its internal tbl.
I didn't find an in-tree users of ndo_neigh_destroy, so I'm not sure if
this ndo is vestigial or if there are out-of-tree users. In any case, it
does what I need here. An alternative design would be to generate
NETEVENT_NEIGH_UPDATE event when neigh is being destroyed, setting state to
NUD_NONE so driver knows neigh entry is dead.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On sucessful probe, driver prints the switch ID. This patch changes the
format of the printed ID to match what's used in sysfs phys_switch_id node.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ACPI bindings for the smsc911x driver. Convert the DT specific calls
to nonspecific device* calls, This allows the driver to work
with both ACPI and DT configurations. Ethernet should now work when using
ACPI on ARM Juno.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Am335x Errata [1] Advisory 1.0.9, The CPSW C0_TX_PEND and
C0_RX_PEND interrupt outputs provide a single transmit interrupt
that combines transmit channel interrupts TXPEND[7:0] and a
single receive interrupt that combines receive channel interrupts
RXPEND[7:0]. The TXPEND[0] and RXPEND[0] interrupt outputs are
connected to the ARM Cortex-A8 interrupt controller (INTC) rather
than the C0_TX_PEND and C0_RX_PEND interrupt outputs. So even
though CPSW interrupt is cleared by writing appropriate values to
EOI register the interrupt is not cleared in IRQ controller. So
interrupt is still pending and CPU is struck in ISR, the
workaround is to disable the interrupts in ARM irq controller.
[1] http://www.ti.com/lit/er/sprz360f/sprz360f.pdf
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/cavium/Kconfig
The cavium conflict was overlapping dependency
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to use the IS_ERR_VALUE() macro for checking
the return value from pm_runtime_* functions.
Just do a simple negative test instead.
The semantic patch that makes this change is available
in scripts/coccinelle/api/pm_runtime.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add debugfs support to dump tid info like stid, sftid, tids, atid and
hwtids
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For T4 adapter, offloaded servers tid for IPv4 connections are
allocated from filter region. So add a new field for server filter tid if
server tid is allocated from filter region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the tid info, differentiate from which region the TID is allocated
from. It can be from TCAM region or HASH region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding more details to sge qinfo for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This BQL implementation is mostly derived from its related driver, alx.
Tested on AR8131 (rev c0) [1969:1063]. Saturated a 100mbps link with 5
concurrent runs of netperf. Ping latency dropped from 14ms to 3ms.
Signed-off-by: Ron Angeles <ronangeles@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current filer rule optimization is broken in several ways:
(1) Can perform reads/writes beyond end of allocated tables.
(gianfar_ethtool.c:1326).
(2) It breaks badly for rules with more than 2 specifiers
(e.g. matching ip, port, tos).
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: TOS == 00000003 AND Q:00 ctrl:0000008a prop:00000003
05: DIA == 0a000003 AND Q:11 ctrl:0000448c prop:0a000003
06: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
07: TOS == 00000002 AND Q:00 ctrl:0000008a prop:00000002
08: DIA == 0a000002 AND Q:09 ctrl:0000248c prop:0a000002
09: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
0a: DPT == 00000001 AND Q:00 ctrl:0000008e prop:00000001
0b: TOS == 00000001 CLE Q:01 ctrl:0000060a prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
(Entire cluster gets AND-ed together).
(3) We observed that the masking rules it generates do not
play well with clustering on P2020. Only first rule
of the cluster would ever fire. Given that optimizer
relies heavily on masking this is very hard to fix.
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: DIA == 0a000003 Q:11 ctrl:0000440c prop:0a000003
05: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
06: DIA == 0a000002 Q:09 ctrl:0000240c prop:0a000002
07: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
08: DPT == 00000001 CLE Q:01 ctrl:0000060e prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules. We found no errata covering this.
The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.
Reported-by: Aleksander Dutkowski <adutkowski@gmail.com>
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAX_FILER_IDX is the last usable index. Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
altera_tse_private->sgdmadesclen is always assigned assigned the same
value and never changes during runtime. Remove the struct member and
use a new define for sizeof(struct sgdma_descrip) instead.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are not interested in interrupts for partially transmitted frames.
Unlike SCC and FCC, the FEC doesn't handle the I bit in buffer
descriptors, instead it defines two interrupt bits, TXB and TXF.
We have to mask TXB in order to only get interrupts once the
frame is fully transmitted.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are not interested in interrupts for partially transmitted frames,
we have to clear BD_ENET_TX_INTR explicitly otherwise it may remain
from a previously used descriptor.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds an ndm_state member to the switchdev_obj_fdb structure,
in order to support static FDB addresses.
Set Rocker ndm_state to NUD_REACHABLE.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is no way to get the MAC address in a firmware
independent manner, so set the MAC address of the device directly from
the ACPI tables.
The binding agrees with the proposed standard here:
http://www.uefi.org/sites/default/files/resources/nic-request-v2.pdf
Based on code from: Narinder Dhillon <ndhillon@cavium.com>
Tomasz Nowicki <tomasz.nowicki@linaro.org>
Robert Richter <rrichter@cavium.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate DT code in preparation for follow-on ACPI integration.
Based on code from: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use '%zx' to print size_t format in order to fix the following build warning:
drivers/net/ethernet/mellanox/mlxsw/item.h:65:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'size_t' [-Wformat=]
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Writing each 4Kb page into flash might take up-to ~100 miliseconds,
during which time management firmware cannot acces the nvram for its
own uses.
Firmware upgrade utility use the ethtool API to burn new flash images
for the device via the ethtool API, doing so by writing several page-worth
of data on each command. Such action might create problems for the
management firmware, as the nvram might not be accessible for a long time.
This patch changes the write implementation, releasing the nvram lock on
the completion of each page, allowing the management firmware time to
claim it and perform its own required actions.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On error flows its possible to free an SKB even if it was not allocated.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add or remove some tabs so that statements line up correctly.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were missing curly braces so it means we call add_debugfs_mem()
unintentionally.
Fixes: 3ccc6cf74d ('cxgb4: Adds support for T6 adapter')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver allocates a large chunk of temporary buffer using kzalloc
to copy FW image. As there is no real need of this memory to be
physically contiguous, use vzalloc instead.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases it is required to capture minidump for iSCSI functions
as part of default minidump collection process. To enable this, firmware
exports it's capability and driver need to enable that capability
by issuing a mailbox command.
With this feature, firmware can provide additional iSCSI function's
minidump with smaller minidump capture mask (0x1f).
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include local headers files after kernel's header files.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we transmit a fragmented skb, we may run into a race like the
following scenario (assume txq->cur_tx is next to txq->dirty_tx):
cpu 0 cpu 1
fec_enet_txq_submit_skb
reserve a bdp for the first fragment
fec_enet_txq_submit_frag_skb
update the bdp for the other fragment
update txq->cur_tx
fec_enet_tx_queue
bdp = fec_enet_get_nextdesc(txq->dirty_tx, fep, queue_id);
This bdp is the bdp reserved for the first segment. Given
that this bdp BD_ENET_TX_READY bit is not set and txq->cur_tx
is already pointed to a bdp beyond this one. We think this is a
completed bdp and try to reclaim it.
update the bdp for the first segment
update txq->cur_tx
So we shouldn't update the txq->cur_tx until all the update to the
bdps used for fragments are performed. Also add the corresponding
memory barrier to guarantee that the update to the bdps, dirty_tx and
cur_tx performed in the proper order.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
increment from the beginning of the NAPI processing loop after the check
for erroneous packets so they are never accounted. This counter is used
to inform firmware about number of processed completions (packets).
As these packets are never acked the firmware fires IRQs for them again
and again.
Fixes: e29aa33 ("bna: Enable Multi Buffer RX")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PP2 controller is capable of per-CPU TX processing, which means there are
per-CPU banked register sets and queues. Current version of the driver supports
TX packet coalescing - once on given CPU sent packets amount reaches a threshold
value, an IRQ occurs. However, there is a single interrupt line responsible for
CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not
support RSS).
When the top-half executes the interrupt cause is not known. This is why in
NAPI poll function, along with RX processing, IRQ cause register on both
CPU's is accessed in order to determine on which of them the TX coalescing
threshold might have been reached. Thus the egress processing and releasing the
buffers is able to take place on the corresponding CPU. Hitherto approach lead
to an illegal usage of on_each_cpu function in softirq context.
The problem is solved by resigning from TX coalescing interrupts and separating
egress finalization from NAPI processing. For that purpose a method of using
hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released
once a software coalescing threshold is reached. In case not all the data is
processed a timer is set on this CPU - in its interrupt context a tasklet is
scheduled in which all queues are processed. At once only one timer per-CPU can
be running, which is controlled by a dedicated flag.
This commit removes TX processing from NAPI polling function, disables hardware
coalescing and enables hrtimer with tasklet, using new per-CPU port structure
(mvpp2_port_pcpu).
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvpp2 driver allows usage of per-CPU TX processing. Once the packets are
prepared independetly on each CPU, the hardware enqueues the descriptors in
common TX queue. After they are sent, the buffers and associated sk_buffs
should be released on the corresponding CPU.
This is why a special index is maintained in order to point to the right data to
be released after transmission takes place. Each per-CPU TX queue comprise an
array of sent sk_buffs, freed in mvpp2_txq_bufs_free function. However, the
index was used there also for obtaining a descriptor (and therefore a buffer to
be DMA-unmapped) from common TX queue, which was wrong, because it was not
referring to the current CPU.
This commit enables proper unmapping of sent data buffers by indexing them in
per-CPU queues using a dedicated array for keeping their physical addresses.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using spinlocks protection during one-time driver initialization is not
necessary. Moreover it resulted in invalid GFP_KERNEL allocation under the lock.
This commit removes redundant spinlocks from buffer manager part of mvpp2
initialization.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Reported-by: Alexandre Fournier <alexandre.fournier@wisp-e.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the length of the skb before transmitting it and use it for stats
instead of skb->len, since skb might have been freed already.
This issue was discovered using the Kernel Address sanitizer (KASan).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not use the length of the transmitted skb (which was freed), but
that of the response skb.
This issue was discovered using the Kernel Address sanitizer (KASan).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we only checked if the transmission queue is not full in the
middle of the xmit function. This lead to complex logic due to the fact
that sometimes we need to reallocate the headroom for our Tx header.
Allow the switch driver to know if the transmission queue is not full
before sending the packet and remove this complex logic.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FCS of incoming packets is already checked by HW. Just strip it out.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This resolves compile errors on um-allyesconfig.
Note that there are many other drivers which have the same issue.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
System ports are unique identifiers in a multi-ASIC environment that
represent all the available ports in the system. Local ports on the
other hand, are unique only within the local ASIC.
Since system port to local port mapping is not part of the HW-SW
contract and since only single-ASIC configurations are currently
supported, set an explicit 1:1 mapping by configuring the Switch System
Port Record (SSPR) register.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing a port's netdevice we should also free the memory
allocated by alloc_etherdev(). Do this by calling free_netdev() at the
end of the teardown sequence.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The address in the switchdev_obj_fdb structure is currently represented
as a pointer. Replacing it for a 6-byte array allows switchdev to carry
addresses directly read from hardware registers, not stored by the
switch chip driver (as in Rocker).
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
failed to configure the page size for architectures with page size
different than 4K.
Fixes: 938fe83 ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are paths in the driver such as an unrecoverable error (UE) detection
followed by a driver unload wherein be_clear() is invoked twice.
Individual data structures are reset so that they are not cleaned/freed
twice. This patch does the same for eqo->affinity_mask. It is freed only
if EQs haven't yet been destroyed. This fixes a possible crash when
affinity_mask is freed twice.
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>