Commit Graph

172 Commits

Author SHA1 Message Date
Muhammad Mahajna
f0df35037a net/mlx4_en: Add RX-FCS support
Enabled when device supports KEEP FCS. When the flag is set, Ethernet FCS
is appended to the end of the frame, controlled by ethtool.

Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:04 -04:00
Ido Shamay
3742cc6551 net/mlx4: Warn users of depracated QoS Firmware
A new capability bit was introduced in the past to to differ devices
using the QoS ETS feature. The old was deprecated since then.
If driver sees device which set only the old capabilty, it will print
warning to user suggesting to upgrade the FW.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
cda373f484 net/mlx4_en: Enable TX rate limit per VF
Support granular QoS per VF, by implementing the ndo_set_vf_rate.

Enforce a rate limit per VF when called, and enabled only for VFs in
VST mode with user priority supported by the device.

We don't enforce VFs to be in VST mode at the moment of configuration,
but rather save the given rate limit and enforce it when the VF is
moved to VST with user priority which is supported (currently 0).

VST<->VGT or VST qos value state changes are disallowed when a rate
limit is configured. Minimum BW share is not supported yet.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
241a08c3a7 net/mlx4_en: Change loopback only upon feature change
Currently any change of netdev features results in a call to
mlx4_en_update_loopback_state(). Those calls are unnecessary,
and should be called only upon loopback feature change.

Also moved some of the logic into mlx4_en_update_loopback_state().

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:24:51 -04:00
David S. Miller
9f0d34bc34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	drivers/net/usb/sr9800.c
	drivers/net/usb/usbnet.c
	include/linux/usb/usbnet.h
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:16:53 -04:00
Matan Barak
0b131561a7 net/mlx4_en: Add Flow control statistics display via ethtool
Flow control per priority and Global pause counters are now visible via
ethtool.  The counters shows statistics regarding pauses in the device.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 16:36:51 -04:00
Eran Ben Elisha
3da8a36cc5 net/mlx4_en: Protect access to the statistics bitmap
This will allow parallel access to the statistics bitmap.
A pre-step for adding PFC counters, where the statistics bitmap
can be dynamically changed when modifying the PFC setting.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 16:36:50 -04:00
Eran Ben Elisha
6fcd27354b net/mlx4_en: Support general selective view of ethtool statistics
The driver uses a bitmask to indicate which statistics should be
displayed to the user in ethtool. The bitmask is u64, therefore we are
limited for a selective view of up to 64 statistics. Extend the bitmap
in order to show more than 64 statistics.

In addition, add packet statistics to the ethtool display for PF.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 16:36:50 -04:00
Eran Ben Elisha
ffa88f37ff net/mlx4_en: Move statistics bitmap setting to the Ethernet driver
The statistics bitmap belongs to the Ethernet driver, move it there.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 16:36:50 -04:00
Toshiaki Makita
8cb65d0008 net: Move check for multiple vlans to drivers
To allow drivers to handle the features check for multiple tags,
move the check to ndo_features_check().
As no drivers currently handle multiple tagged TSO, introduce
dflt_features_check() and call it if the driver does not have
ndo_features_check().

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29 13:33:22 -07:00
Ido Shamay
e5eda89d97 net/mlx4_en: Call register_netdevice in the proper location
Netdevice registration should be performed a the end of the driver
initialization flow. If we don't do that, after calling register_netdevice,
device callbacks may be issued by higher layers of the stack before
final configuration of the device is done.

For example (VXLAN configuration race), mlx4_SET_PORT_VXLAN was issued
after the register_netdev command. System network scripts may configure
the interface (UP) right after the registration, which also attach
unicast VXLAN steering rule, before mlx4_SET_PORT_VXLAN was called,
causing the firmware to fail the rule attachment.

Fixes: 837052d0cc ("net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling")
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-24 15:22:52 -04:00
David S. Miller
0fa74a4be4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	net/core/sysctl_net_core.c
	net/ipv4/inet_diag.c

The be_main.c conflict resolution was really tricky.  The conflict
hunks generated by GIT were very unhelpful, to say the least.  It
split functions in half and moved them around, when the real actual
conflict only existed solely inside of one function, that being
be_map_pci_bars().

So instead, to resolve this, I checked out be_main.c from the top
of net-next, then I applied the be_main.c changes from 'net' since
the last time I merged.  And this worked beautifully.

The inet_diag.c and sysctl_net_core.c conflicts were simple
overlapping changes, and were easily to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 18:51:09 -04:00
Wu Fengguang
de1cf8a7d7 net/mlx4_en: mlx4_en_set_tx_maxrate() can be static
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 23:21:41 -04:00
Eran Ben Elisha
39de961a4a net/mlx4_en: Set statistics bitmap at port init
Port statistics bitmap will now be initialized at port init.  Even before
starting the port, statistics are visible to the user and must be properly masked.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 15:17:11 -04:00
Or Gerlitz
c10e4fc6c4 net/mlx4_en: Add tx queue maxrate support
Add ndo_set_tx_maxrate support.

To support per tx queue maxrate limit, we use the update-qp firmware
command to do run-time rate setting for the qp that serves this tx ring.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 14:55:19 -04:00
Joe Perches
c7bf716940 ethernet: Use eth_<foo>_addr instead of memset
Use the built-in function instead of memset.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-03 17:01:36 -05:00
Ido Shamay
cfb53f36a5 net/mlx4_en: Notify TX Vlan offload change
Notify users when TX vlan offload feature changed with ethtool.
Relevant command - ethtool -K <eth> txvlan on/off.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 16:17:46 -08:00
Moni Shoua
5da0354726 net/mlx4_en: Port aggregation configuration
Capture NETDEV events generated by the bonding driver and based on that
make decisions of how to configure port aggregation in the mlx4 core driver.

This includes setting the V2P port table and re-creating the interested
interfaces in bonded/non-bonded mode.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 16:14:25 -08:00
David S. Miller
95f873f2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/arm/boot/dts/imx6sx-sdb.dts
	net/sched/cls_bpf.c

Two simple sets of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-27 16:59:56 -08:00
Yishai Hadas
872bf2fb69 net/mlx4_core: Maintain a persistent memory for mlx4 device
Maintain a persistent memory that should survive reset flow/PCI error.
This comes as a preparation for coming series to support above flows.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-25 14:43:13 -08:00
Or Gerlitz
5eff6dadb9 net/mlx4: Don't disable vxlan offloads under DMFS-A0 optimized steering
Except for VXLAN steering rules, all offloads should work as they were
under plain DMFS mode. Fix that by enabling all the offloads under
DMFS-A0 mode, except for VXLAN steering rules.

Fixes: d57febe1a4 "net/mlx4: Add A0 hybrid steering"
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 19:35:30 -05:00
Jesse Gross
5f35227ea3 net: Generalize ndo_gso_check to ndo_features_check
GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, it can result in
duplicate code or inefficient per-packet behavior.

This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.

By actually removing features from the set that are used to do
offloading, it solves another problem with the existing
interface. In these cases, GSO would run with the original set
of features and not do anything because it appears that
segmentation is not required.

CC: Tom Herbert <therbert@google.com>
CC: Joe Stringer <joestringer@nicira.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by:  Tom Herbert <therbert@google.com>
Fixes: 04ffcb255f ("net: Add ndo_gso_check")
Tested-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-26 17:20:56 -05:00
Ido Shamay
c3f2511fea net/mlx4: Cache line CQE/EQE stride fixes
This commit contains 2 fixes for the 128B CQE/EQE stride feaure.
Wei found that mlx4_QUERY_HCA function marked the wrong capability
in flags (64B CQE/EQE), when CQE/EQE stride feature was enabled.
Also added small fix in initial CQE ownership bit assignment, when CQE
is size is not default 32B.

Fixes: 77507aa24 (net/mlx4: Enable CQE/EQE stride support)
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-16 15:23:53 -05:00
Matan Barak
7d077cd34e net/mlx4: Add support for A0 steering
Add the required firmware commands for A0 steering and a way to enable
that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT,
QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used
to configure and query the device.

The different A0 DMFS (steering) modes are:

Static - optimized performance, but flow steering rules are
limited. This mode should be choosed explicitly by the user
in order to be used.

Dynamic - this mode should be explicitly choosed by the user.
In this mode, the FW works in optimized steering mode as long as
it can and afterwards automatically drops to classic (full) DMFS.

Disable - this mode should be explicitly choosed by the user.
The user instructs the system not to use optimized steering, even if
the FW supports Dynamic A0 DMFS (and thus will be able to use optimized
steering in Default A0 DMFS mode).

Default - this mode is implicitly choosed. In this mode, if the FW
supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll
work at Disable A0 DMFS mode.

Under SRIOV configuration, when the A0 steering mode is enabled,
older guest VF drivers who aren't using the RX QP allocation flag
(MLX4_RESERVE_A0_QP) will get a QP from the general range and
fail when attempting to register a steering rule. To avoid that,
the PF context behaviour is changed once on A0 static mode, to
require support for the allocation flag in VF drivers too.

In order to enable A0 steering, we use log_num_mgm_entry_size param.
If the value of the parameter is not positive, we treat the absolute
value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this
bit field enables static A0 steering.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:36 -05:00
Matan Barak
d57febe1a4 net/mlx4: Add A0 hybrid steering
A0 hybrid steering is a form of high performance flow steering.
By using this mode, mlx4 cards use a fast limited table based steering,
in order to enable fast steering of unicast packets to a QP.

In order to implement A0 hybrid steering we allocate resources
from different zones:
(1) General range
(2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region.

When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP,
we try hard to allocate the QP from range (2). Otherwise, we try hard not
to allocate from this  range. However, when the system is pushed to its
limits and one needs every resource, the allocator uses every region it can.

Meaning, when we run out of raw-eth qps, the allocator allocates from the
general range (and the special-A0 area is no longer active). If we run out
of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that
is also exhausted, the allocator will allocate from the general range
(and the A0 region is no longer active).

Note that if a raw-eth qp is allocated from the general range, it attempts
to allocate the range such that bits 6 and 7 (blueflame bits) in the
QP number are not set.

When the feature is used in SRIOV, the VF has to notify the PF what
kind of QP attributes it needs. In order to do that, along with the
"Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According
to the combination of these bits, the PF tries to allocate a suitable QP.

In order to maintain backward compatibility (with older PFs), the PF
notifies which QP attributes it supports via QUERY_FUNC_CAP command.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:35 -05:00
Eugenia Emantayev
ddae0349fd net/mlx4: Change QP allocation scheme
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields
in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.

The current Ethernet driver code reserves a Tx QP range with 256b alignment.

This is wrong because if there are more than 64 Tx QPs in use,
QPNs >= base + 65 will have bits 6/7 set.

This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.

The new mechanism introduced here will support reservation for
"Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:

1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
   and request "BF enabled QPs" if BF is supported for the function

2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0]  - number of QPs
b. param1[31-24] - flags controlling QPs reservation

Bit 31 refers to Eth blueflame supported QPs. Those QPs must have
bits 6 and 7 unset in order to be used in Ethernet.

Bits 24-30 of the flags are currently reserved.

When a function tries to allocate a QP, it states the required attributes
for this QP. Those attributes are considered "best-effort". If an attribute,
such as Ethernet BF enabled QP, is a must-have attribute, the function has
to check that attribute is supported before trying to do the allocation.

In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those attributes
and masks out unsupported attributes as well. In order to notify VFs which
attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's
mailbox is filled by the PF, which notifies which QP allocation attributes
it supports.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 14:47:35 -05:00
Eyal Perry
947cbb0ac2 net/mlx4_en: Support for configurable RSS hash function
The ConnectX HW is capable of using one of the following hash functions:
Toeplitz and an XOR hash function. This patch extends the implementation
of the mlx4_en driver set/get_rxfh callbacks to support getting and
setting the RSS hash function used by the device.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08 21:07:10 -05:00
Jiri Pirko
02637fce3e net: rename netdev_phys_port_id to more generic name
So this can be reused for identification of other "items" as well.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 20:01:19 -08:00
Eric Dumazet
bd635c354d mlx4: fix mlx4_en_set_rxfh()
mlx4_en_set_rxfh() can crash if no RSS indir table is provided.

While we are at it, allow RSS key to be changed with ethtool -X

Tested:

myhost:~# cat /proc/sys/net/core/netdev_rss_key
b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e:c5:5d:4d:8c:22:67:30🆎8a:6e:c3:6a

myhost:~# ethtool -x eth0
RX flow hash indirection table for eth0 with 8 RX ring(s):
    0:      0     1     2     3     4     5     6     7
RSS hash key:
b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e

myhost:~# ethtool -X eth0 hkey \
03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f

myhost:~# ethtool -x eth0
RX flow hash indirection table for eth0 with 8 RX ring(s):
    0:      0     1     2     3     4     5     6     7
RSS hash key:
03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f

Reported-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: b9d1ab7eb4 ("mlx4: use netdev_rss_key_fill() helper")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23 13:49:12 -05:00
David S. Miller
1459143386 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ieee802154/fakehard.c

A bug fix went into 'net' for ieee802154/fakehard.c, which is removed
in 'net-next'.

Add build fix into the merge from Stephen Rothwell in openvswitch, the
logging macros take a new initial 'log' argument, a new call was added
in 'net' so when we merge that in here we have to explicitly add the
new 'log' arg to it else the build fails.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21 22:28:24 -05:00
Or Gerlitz
9737c6ab7a net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
This is currently missing, which results in a crash when one attempts
to set VXLAN tunnel over the mlx4_en when acting as PF.

	[ 2408.785472] BUG: unable to handle kernel NULL pointer dereference at (null)
	[...]
	[ 2408.994104] Call Trace:
	[ 2408.996584]  [<ffffffffa021f7f5>] ? vxlan_get_rx_port+0xd6/0x103 [vxlan]
	[ 2409.003316]  [<ffffffffa021f71f>] ? vxlan_lowerdev_event+0xf2/0xf2 [vxlan]
	[ 2409.010225]  [<ffffffffa0630358>] mlx4_en_start_port+0x862/0x96a [mlx4_en]
	[ 2409.017132]  [<ffffffffa063070f>] mlx4_en_open+0x17f/0x1b8 [mlx4_en]

While here, make sure to invoke vxlan_get_rx_port() only when VXLAN
offloads are actually enabled and not when they are only supported.

Reported-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-19 15:11:09 -05:00
Joe Stringer
956bdab2e4 net/mlx4_en: Implement ndo_gso_check()
Use vxlan_gso_check() to advertise offload support for this NIC.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 17:12:48 -05:00
David S. Miller
076ce44825 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/chelsio/cxgb4vf/sge.c
	drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c

sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.

ixgbe_phy.c was a set of overlapping whitespace changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 01:01:12 -05:00
Or Gerlitz
f4a1edd561 net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
Currenly we only support Large-Send and TX checksum offloads for
encapsulated traffic of type VXLAN. We must make sure to advertize
these offloads up to the stack only when VXLAN tunnel is set.

Failing to do so, would mislead the the networking stack to assume
that the driver can offload the internal TX checksum for GRE packets
and other buggy schemes.

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 13:24:45 -05:00
Shani Michaeli
f8c6455bb0 net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).

Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.

In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.

Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 13:20:02 -05:00
Eugenia Emantayev
207af6c507 net/mlx4_en: Move spinlocks and work initalizations to beginning of init_netdev
Upon failures, destroy_netdev is called, and spinlocks/works must be
initialized before calling it. Otherwise kernel panic may occur.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 17:18:01 -04:00
Ido Shamay
f4a3675158 net/mlx4_en: Call napi_synchronize on stop_port
This is instead of calling the actual implementation of
napi_synchronize, for better encapsulation.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 17:18:01 -04:00
Jack Morgenstein
c2a3d4b4ca net/mlx4_en: Cleanups suggested by clang static checker
clang flagged the following. All are actually cosmetic cleanups, not really bugs:

drivers/net/ethernet/mellanox/mlx4/en_main.c:233:3: warning: Value stored to 'err' is never read
                err = -ENOMEM;
                ^     ~~~~~~~
drivers/net/ethernet/mellanox/mlx4/en_main.c:293:3: warning: Value stored to 'err' is never read
                err = -ENOMEM;

drivers/net/ethernet/mellanox/mlx4/en_netdev.c:648:16: warning: Assigned value is garbage or undefined
        entry->reg_id = reg_id;
                      ^ ~~~~~~
drivers/net/ethernet/mellanox/mlx4/en_netdev.c:659:2: warning: Function call argument is an uninitialized value
        mlx4_en_uc_steer_release(priv, priv->dev->dev_addr, *qpn, reg_id);
(NOTE: reg_id is only used in the device-managed flow steering path, in which is it always initialized.
 This is not a bug. Cleanup here is therefore cosmetic only).

drivers/net/ethernet/mellanox/mlx4/en_rx.c:122:3: warning: Value stored to 'frag_info' is never read
                frag_info = &priv->frag_info[i];
                ^           ~~~~~~~~~~~~~~~~~~~

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 17:18:01 -04:00
Saeed Mahameed
537f6f951e net/mlx4_en: Add ethtool support for [rx|tx]vlan offload set to OFF/ON
Move mlx4_en_reset_config to en_netdev.c as it now serves more general purpose.
Add support for turning OFF/ON the rx/tx vlan offlad.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 17:18:01 -04:00
Saeed Mahameed
7787fa661b net/mlx4_en: Add support for setting rxvlan offload OFF/ON
Rename mlx4_en_timestamp_config to mlx4_en_reset_config and extend it to support
choosing RX vlan offload configuration.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-28 17:18:01 -04:00
Ido Shamay
b1b6b4da78 net/mlx4_en: Add mlx4_en_get_cqe helper
This function derives the base address of the CQE from the CQE size,
and calculates the real CQE context segment in it from the factor
(this is like before). Before this change the code used the factor to
calculate the base address of the CQE as well.

The factor indicates in which segment of the cqe stride the cqe information
is located. For 32-byte strides, the segment is 0, and for 64 byte strides,
the segment is 1 (bytes 32..63). Using the factor was ok as long as we had
only 32 and 64 byte strides. However, with larger strides, the factor is zero,
and so cannot be used to calculate the base of the CQE.

The helper uses the same method of CQE buffer pulling made by all other
components that reads the CQE buffer (mlx4_ib driver and libmlx4).

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19 17:30:11 -04:00
Or Gerlitz
b95089d00c net/mlx4: Move the tunnel steering helper function to mlx4_core
Move the function which we use to set VXLAN DMFS (flow-steering) rules
from mlx4_en to mlx4_core. This refactoring will allow the mlx4_ib driver
to call the helper for the use case of user-space RAW Ethernet QPs, such
that they can serve VXLAN traffic too.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:13:00 -07:00
Amir Vadai
0fef9d0308 net/mlx4_en: Disable blueflame using ethtool private flags
Enable the user to turn off the hardware feature called BlueFlame.
Since it is something specific to mlx4_en hardware, we control
the feature via ethtool private flags.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 19:53:14 -07:00
Eyal Perry
b94901f3ed net/mlx4_en: current_mac isn't updated in port up
When port is down dev_addr is changed (e.g. by bonding) but current_mac
is not touched. When port is up again, hash_mac is updated to dev_addr,
but current_mac isn't. This leads to inconsistency between current_mac
and mac_hash. Because of that, mlx4_en_replace_mac() fails to find
current_mac in mac_hash.

Fix is to reset current_mac to dev_addr when port is up - as we do for
mac_hash.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 19:53:13 -07:00
David S. Miller
1a98c69af1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 14:09:34 -07:00
Amir Vadai
fbc6daf197 net/mlx4_en: Ignore budget on TX napi polling
It is recommended that TX work not count against the quota.
The cost of TX packet liberation is a minute percentage of what it costs to
process an RX frame. Furthermore, that SKB freeing makes memory available for
other paths in the stack.

Give the TX a larger budget and be more aggressive about cleaning up the Tx
descriptors this budget could be changed using ethtool:
$ ethtool -C eth1 tx-frames-irq <budget>

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 20:00:49 -07:00
Noa Osherovich
2695bab2a6 net/mlx4_en: Fix mac_hash database inconsistency
Using a local copy of dev_addr in mlx4_en_set_mac() to prevent dev_addr
from being modified during error flow or when dev_addr is modified in
another context (which is another problem that is being discussed over
the mailing list [1]).
Also fixing bad naming of priv->prev_mac into priv->current_mac.

[1] - http://patchwork.ozlabs.org/patch/351489/

Reviewed-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 19:58:45 -07:00
Eugenia Emantayev
49a1e4f6b7 net/mlx4_en: Do not disable vlan filter during promiscuous mode
Promiscous mode is only for MACs.
Should not disable/enable VLAN filter when entering/leaving promisuous mode.

Signed-off-by: Aviad Yehezkel <aviadye@mellanox.co.il>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 19:58:44 -07:00
Or Gerlitz
e326f2f13b net/mlx4_en: Don't configure the HW vxlan parser when vxlan offloading isn't set
The add_vxlan_port ndo driver code was wrongly testing whether HW vxlan offloads
are supported by the device instead of checking if they are currently enabled.

This causes the driver to configure the HW parser to conduct matching for vxlan
packets but since no steering rules were set, vxlan packets are dropped on RX.

Fix that by doing the right test, as done in the del_vxlan_port ndo handler.

Fixes: 1b136de ('net/mlx4: Implement vxlan ndo calls')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:39:18 -07:00
Yuval Atias
9e311e77a8 net/mlx4_en: Use affinity hint
The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.

Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 14:58:16 -07:00