Commit Graph

71169 Commits

Author SHA1 Message Date
Tariq Toukan
7ba5e7bd64 net/mlx4_en: Use __force to fix a sparse warning in TX datapath
In TX data-path, we intentionally do not byte-swap, as documented
in code and in the cited commit log.
This fixes sparse warning:
en_tx.c:720:23: warning: incorrect type in argument 1 (different base types)
en_tx.c:720:23:    expected unsigned int [unsigned] [usertype] <noident>
en_tx.c:720:23:    got restricted __be32 [usertype] doorbell_qpn

Fixes: 492f5add4b ("net/mlx4_en: Doorbell is byteswapped in Little Endian archs")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:33:05 -07:00
Tariq Toukan
b71322d9db net/mlx4_core: Fix cast warning in fw.c
Fix the following SPARSE warning, in MLX4_GET() macro:
drivers/net/ethernet/mellanox/mlx4/fw.c:233:9: warning: cast to restricted __be64

Fixes: 17d5ceb6e4 ("net/mlx4_core: Fix unaligned accesses")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:33:05 -07:00
Tariq Toukan
bb428a5c4d net/mlx4: Fix endianness issue in qp context params
Should take care of the endianness before assigning to params2 field.

Fixes: 53f33ae295 ("net/mlx4_core: Port aggregation upper layer interface")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:33:05 -07:00
Michal Kalderon
1e28eaad07 qed: Add iWARP support for fpdu spanned over more than two tcp packets
We continue to maintain a maximum of three buffers per fpdu, to ensure
that there are enough buffers for additional unaligned mpa packets.
To support this, if a fpdu is split over more than two tcp packets, we
use an intermediate buffer to copy the data to the previous buffer, then
we can release the data. We need an intermediate buffer as the initial
buffer partial packet could be located at the end of the packet, not
leaving room for additional data. This is a corner case, and will usually
not be the case.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:27 -07:00
Michal Kalderon
c7d1d83999 qed: Add support for MPA header being split over two tcp packets
There is a special case where an MPA header is split over to tcp
packets, in this case we need to wait for the next packet to
get the fpdu length. We use the incomplete_bytes to mark this
fpdu as a "special" one which requires updating the length with
the next packet

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:27 -07:00
Michal Kalderon
d531038eeb qed: Add support for freeing two ll2 buffers for corner cases
When posting a packet on the ll2 tx, we can provide a cookie that
will be returned upon tx completion. This cookie is the ll2 iwarp buffer
which is then reposted to the rx ring. Part of the unaligned mpa flow
is determining when a buffer can be reposted. Each buffer needs to be
sent only once as a cookie for on the tx ring. In packed fpdu case, only
the last packet will be sent with the buffer, meaning we need to handle the
case that a cookie can be NULL on tx complete. In addition, when a fpdu
splits over two buffers, but there are no more fpdus on the second buffer,
two buffers need to be provided as a cookie. To avoid changing the ll2
interface to provide two cookies, we introduce a piggy buf pointer,
relevant for iWARP only, that holds a pointer to a second buffer that
needs to be released during tx completion.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:27 -07:00
Michal Kalderon
469981b17a qed: Add unaligned and packed packet processing
The fpdu data structure is preallocated per connection.
Each connection stores the current status of the connection:
either nothing pending, or there is a partial fpdu that is waiting for
the rest of the fpdu (incomplete bytes != 0).
The same structure is also used for splitting a packet when there are
packed fpdus. The structure is initialized with all data required
for sending the fpdu back to the FW. A fpdu will always be spanned across
a maximum of 3 tx bds. One for the header, one for the partial fdpu
received and one for the remainder (unaligned) packet.
In case of packed fpdu's, two fragments are used, one for the header
and one for the data.
Corner cases are not handled in the patch for clarity, and will be added
as a separate patch.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
fcb39f6c10 qed: Add mpa buffer descriptors for storing and processing mpa fpdus
The mpa buff is a descriptor for iwarp ll2 buffers that contains
additional information required for aligining fpdu's.
In some cases, an additional packet will arrive which will complete
the alignment of a fpdu, but we won't be able to post the fpdu due to
insufficient place on the tx ring. In this case we can't loose the data
and require storing it for later. Processing is therefore done
in two places, during rx completion, where we initialize a mpa buffer
descriptor and add it to the pending list, and during tx-completion, since
we free up an entry in the tx chain we can process any pending mpa packets.
The mpa buff descriptors are pre-allocated since we have to ensure that
we won't reach a state where we can't store an incoming unaligned packet.
All packets received on the ll2 MUST be processed by the driver at some
stage. Since they are preallocated, we hold a free list.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
ae3488ff37 qed: Add ll2 connection for processing unaligned MPA packets
This patch adds only the establishment and termination of the
ll2 connection that handles unaligned MPA packets.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
6f34a284f3 qed: Add LL2 slowpath handling
For iWARP unaligned MPA flow, a slowpath event of flushing an
MPA connection that entered an unaligned state is required.
The flush ramrod is received on the ll2 queue, and a pre-registered
callback function is called to handle the flush event.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
89d6511309 qed: Add the source of a packet sent on an iWARP ll2 connection
When a packet is sent back to iWARP FW via the tx ll2 connection
the FW needs to know the source of the packet. Whether it is
OOO or unaligned MPA related. Since OOO is implemented entirely
inside the ll2 code (and shared with iSCSI), packets are marked
as IN_ORDER inside the ll2 code. For unaligned mpa the value
will be determined in the iWARP code and sent on the pkt->vlan
field.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
6df60fe703 qed: Fix initialization of ll2 offload feature
enable_ip_cksum, enable_l4_cksum, calc_ip_len were added in
commit stated below but not passed through to FW. This was OK
until now as it wasn't used, but is required for the iWARP
unaligned flow

Fixes:7c7973b2ae27 ("qed: LL2 to use packed information for tx")

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
77caa792f5 qed: Add ll2 option for dropping a tx packet
The option of sending a packet on the ll2 and dropping it exists in
hardware and was not used until now, thus not exposed.
The iWARP unaligned MPA flow requires this functionality for
flushing the tx queue.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
ed468ebee0 qed: Add ll2 ability of opening a secondary queue
When more than one ll2 queue is opened ( that is not an OOO queue )
ll2 code does not have enough information to determine whether
the queue is the main one or not, so a new field is added to the
acquire input data to expose the control of determining whether
the queue is the main queue or a secondary queue.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Michal Kalderon
f5823fe689 qed: Add ll2 option to limit the number of bds per packet
iWARP uses 3 ll2 connections, the maximum number of bds is known
during connection setup. This patch modifies the static array in
the ll2_tx_packet descriptor to be a flexible array and
significantlly reduces memory size.

In addition, some redundant fields in the ll2_tx_packet were
removed, which also contributed to decreasing the descriptor size.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:21:26 -07:00
Yotam Gigi
593bc28ae2 mlxsw: spectrum_switchdev: Support bridge mrouter notifications
Support the SWITCHDEV_ATTR_ID_BRIDGE_MROUTER port attribute switchdev
notification.

To do that, add the mrouter flag to struct mlxsw_sp_bridge_device, which
indicates whether the bridge device was set to be mrouter port. This field
is set when:
 - A new bridge is created, where the value is taken from the kernel
   bridge value.
 - A switchdev SWITCHDEV_ATTR_ID_BRIDGE_MROUTER notification is sent.

In addition, change the bridge MID entries to include the router port when
the bridge device is configured to be mrouter port. The MID entries are
updated in the following cases:
 - When a new MID entry is created, update the router port according to the
   bridge mrouter state.
 - When a SWITCHDEV_ATTR_ID_BRIDGE_MROUTER notification is sent, update all
   the bridge's MID entries.

This is aligned with the case where a bridge slave is configured to be
mrouter port.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:18:11 -07:00
Yotam Gigi
c4db953f00 mlxsw: spectrum_switchdev: Add support for router port in SMID entries
In Spectrum, MDB entries point to MID entries, that indicate which ports a
packet should be forwarded to. Add the support in creating MID entries that
forward the packet to the Spectrum router port.

This will be later used to handle the bridge mrouter port switchdev
notifications.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:18:11 -07:00
Yotam Gigi
b35750f191 mlxsw: spectrum: router: Export the mlxsw_sp_router_port function
In Spectrum hardware, the router port is a virtual port that is the gateway
to the routing mechanism. Hence, in order for a packet to be L3 forwarded,
it must first be L2 forwarded to the router port inside the hardware.

Further patches in this patchset are going to introduce support in bridge
device used as an mrouter port. In this case, the router port index will be
needed in order to update the MDB entries to include the router port. Thus,
export the mlxsw_sp_router_port function, which returns the index of the
Spectrum router port.

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:18:11 -07:00
Jakub Kicinski
2de1be1db2 nfp: bpf: pass dst register to ld_field instruction
ld_field instruction is a bit special because the encoding uses
two source registers and one of them becomes the output.  We do
need to pass the dst register to our encoding helpers though,
otherwise the "write both banks" flag will not be observed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
2e85d3884f nfp: bpf: byte swap the instructions
Device expects the instructions in little endian.  Make sure we
byte swap on big endian hosts.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
1c03e03f9b nfp: bpf: pad code with valid nops
We need to append up to 8 nops after last instruction to make
sure the CPU will not fetch garbage instructions with invalid
ECC if the code store was not initialized.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
fd068ddc88 nfp: bpf: calculate code store ECC
In the initial PoC firmware I simply disabled ECC on the instruction
store.  Do the ECC calculation for generated instructions in the driver.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
18e53b6cb9 nfp: bpf: move to datapath ABI version 2
Datapath ABI version 2 stores the packet information in LMEM
instead of NNRs.  We also have strict restrictions on which
GPRs we can use.  Only GPRs 0-23 are reserved for BPF.

Adjust the static register locations and "ABI" registers.
Note that packet length is packed with other info so we have
to extract it into one of the scratch registers, OTOH since
LMEM can be used in restricted operands we don't have to
extract packet pointer.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
995e101ffa nfp: bpf: encode extended LM pointer operands
Most instructions have special fields which allow switching
between base and extended Local Memory pointers.  Introduce
those to register encoding, we will use the extra LM pointers
to access high addresses of the stack.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
9f15d0f438 nfp: bpf: encode LMEM accesses
NFP LMEM is a large, indirectly accessed register file.  There
are two basic indirect access registers.  Each access operation
may either use offset (up to 8 or 16 words) or perform post
decrement/increment.

Add encodings of LMEM indexes as instruction operands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:03 -07:00
Jakub Kicinski
8afd9c961e nfp: add more white space to the instruction defines
We need to add longer OP_* defines, move the values away.
Purely whitespace commit.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
509144e250 nfp: bpf: remove packet marking support
Temporarily drop support for skb->mark.  We are primarily focusing
on XDP offload, and implementing skb->mark on the new datapath has
lower priority.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
226e0e94ce nfp: bpf: remove register rename
Remove the register renumbering optimization.  To implement calling
map and other helpers we need more strict register layout.  We can't
freely reassign register numbers.

This will have the effect of running in 4 context/thread mode, which
should be OK since we are moving towards integrating the BPF closer
with FW app datapath anyway, and the target datapath itself runs in
4 context mode.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
3cae131933 nfp: bpf: encode all 64bit shifts
Add encodings of all 64bit shift operations.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
2a15bb1aba nfp: bpf: move software reg helpers and cmd table out of translator
Move the software reg helpers and some static data to nfp_asm.c.
They are related to the previous patch, but move is done in a separate
commit for ease of review.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
b3f868df3c nfp: bpf: use the power of sparse to check we encode registers right
Define a new __bitwise type for software representation of registers.
This will allow us to catch incorrect parameter types using sparse.

Accessors we define also allow us to return correct enum type and
therefore ensure all switches handle all register types.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
a52b35c39e nfp: bpf: lift the single-port limitation
Limiting the eBPF offload to a single port was a workaround
required for the PoC application FW which has not been
released externally.  It's not necessary any more.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Jakub Kicinski
3a4b0129bf nfp: output control messages to trace_devlink_hwmsg()
Use standard devlink trace point to allow tracing of control
messages.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:51:02 -07:00
Yunsheng Lin
1db9b1bf82 net: hns3: Cleanup for non-static function in hns3 driver
This patch fixes the following warning from sparse:
warning: symbol 'hns3_set_multicast_list' was not declared.
Should it be static.

hns3_set_multicast_list turns out to be not used, so delete it.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
a90bb9a5ea net: hns3: Cleanup for endian issue in hns3 driver
This patch fixes a lot of endian issues detected by sparse.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
d44f9b631f net: hns3: Cleanup for struct that used to send cmd to firmware
The hclge_tm module has already added _cmd to the end of struct
that used to send cmd to firmware. This will help us finding the
endian issues.
This patch adds the _cmd to the end of struct that used to send
cmd to firmware in hclge_main module.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
5392902d33 net: hns3: Consistently using GENMASK in hns3 driver
This patch uses GENMASK to generate bit mask whenever
possible in hns3 driver.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
56cf68c730 net: hns3: Cleanup indentation for Kconfig in the the hisilicon folder
This patch fixes a few indentation for Kconfig file in the
hisilicon folder.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
9780cb97af net: hns3: Add hns3_get_handle macro in hns3 driver
There are many places that will need to get the handle
of netdev, so add a macro to get the handle of netdev.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:54 -07:00
Yunsheng Lin
5bca3b94df net: hns3: Cleanup for shifting true in hns3 driver
This patch fixes a shifting true in hclge_main module.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 09:46:53 -07:00
Christos Gkekas
c49c777f9c qed: Delete redundant check on dcb_app priority
dcb_app priority is unsigned thus checking whether it is less than zero
is redundant.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Acked-By: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:21:02 -07:00
Christos Gkekas
c778c32118 net: ethernet: stmmac: Clean up dead code
Many macros in dwmac-ipq806x are unused and should be removed.
Moreover gmac->id is an unsigned variable and therefore checking
whether it is less than zero is redundant.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:19:07 -07:00
David S. Miller
51a0c00c6b Merge tag 'mlx5-updates-2017-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
Mellanox, mlx5 updates 2017-10-06

This series includes some shared code updates for kernel 4.15 to both
net-next and rdma-next trees.

The series includes mlx5 low level flow steering updates and optimizations
to support firmware command parallelism for flow steering requests from
Maor Gottlieb and two other small fixes from Matan and Maor.

One fix from Matan adds error handling for when the destination
list of the flow steering rule is full.

Maor introduced a patch to avoid NULL pointer dereference on steering cleanup.

Then Some refactoring patches needed by the series for code sharing purposes.
and split the Flow Table Entry (FTE) and Flow Group (FG) creation code to two parts:
    1) Object allocation - allocate the steering node and initialize
    its resources.

    2) The firmware command execution.

This change will give us the ability to take write lock on the
parent node (e.g. FG for FTE creating) only on the software data struct allocation
and creation part of the procedure where the synchronization is really required,
and will allow us to execute multiple firmware commands simultaneously and overcome the
firmware bottleneck.

Refactor the locking scheme of the mlx5 core flow steering as follows:

1) Replace the mutex lock with readers-writers semaphore and take
    the write lock only when necessary (e.g. allocating a new flow
    table entry index or adding a node to the parent's children list).
    When we try to find a suitable child in the parent's children list
    (e.g. search for flow group with the same match_criteria of the rule)
    then we only take the read lock.

2) Add versioning mechanism - each steering entity (FT, FG, FTE, DST)
    will have an incremental version. The version is increased when the
    entity is changed (e.g. when a new FTE was added to FG - the FG's
    version is increased).
    Versioning is used in order to determine if the last traverse of an
    entity's children is valid or a rescan under write lock is required.

Last patch adds FGs and FTEs memory pool, It is useful because these objects
are not small and could be allocated/deallocated many times.

This support improves the insertion rate of steering rules
from ~5k/sec to ~40k/sec.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:07:11 -07:00
Haiyang Zhang
0518ec4f9d hv_netvsc: Add ethtool handler to set and get TCP hash levels
The patch supports the options to switch TCP hash level between
L3 and L4 by ethtool command. TCP over IPv4 and v6 can be set
differently. The default hash level is L4. We currently only
allow switching TX hash level from within the guests.

For example, for TCP over IPv4 on eth0:
To include TCP port numbers in hashing:
	ethtool -N eth0 rx-flow-hash tcp4 sdfn
To exclude TCP port numbers in hashing:
	ethtool -N eth0 rx-flow-hash tcp4 sd
To show TCP hash level:
	ethtool -n eth0 rx-flow-hash tcp4

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:11:01 -07:00
Haiyang Zhang
486e398105 hv_netvsc: Change the hash level variable to bit flags
This simplifies the logic and make it easier to add more
options.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:11:01 -07:00
Ido Schimmel
9b63ef88d3 mlxsw: spectrum: Propagate extack further for bridge enslavements
The code that actually takes care of bridge offload introduces a few
more non-trivial constraints with regards to bridge enslavements.
Propagate extack there to indicate the reason.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add link enp1s0np1 name enp1s0np1.20 type vlan id 20
$ ip link add name br0 type bridge
$ ip link set dev enp1s0np1.10 master br0
$ ip link set dev enp1s0np1.20 master br0
Error: spectrum: Can not bridge VLAN uppers of the same port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:07:21 -07:00
Ido Schimmel
c1f2c6d025 mlxsw: spectrum: Add extack for VLAN enslavements
Similar to physical ports, enslavement of VLAN devices can also fail.
Use extack to indicate why the enslavement failed.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add name bond0 type bond mode 802.3ad
$ ip link set dev enp1s0np1.10 master bond0
Error: spectrum: VLAN devices only support bridge and VRF uppers.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 10:07:21 -07:00
Jonathan Toppins
0d7b70e836 bnxt_en: don't consider building bnxt_tc.o if option not enabled
Instead of zeroing out bnxt_tc.c with a #ifdef foo, instead don't compile
the file when the option is not enabled. Now make and the preprocessor do
not have to waste time compiling a no-op.

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 03:00:26 +01:00
David S. Miller
f5333f80c3 Merge branch '40GbE' of ra.kernel.org:/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-10-06

This series contains updates to i40e and i40evf only.

Rami fixes a typo in the code comments.

Mitch adds an ethtool private flag to control source pruning to resolve an
issue where our default behavior is to enable source pruning which breaks ARP
monitoring in channel bonding.  Fixes a couple of register definitions, which
were incorrect.

Jake fixes an issue with multiple logical CPUs per core (simultaneous
multithreading - SMT) and how we set an affinity hint based on the v_idx of
that q_vector, which is an incremental value and might lead to multiple
offline CPUs being assigned to a q_vector.  Instead, we should only assign
hints for CPUs which are online, so look to use cpumask_local_spread().
Also fixed a VF VLAN tag stripping issue, where the flag created to change
this feature was seen as unchangeable.  Lastly, organized and re-numbered
the feature flags.

Alan re-enables PTP L4 for XL710 devices with firmware version 6.0 or
greater, now that the previous bug in the older firmware is fixed.
Implements the PCI error handlers for reset_prepare() and reset_done() to
allow us to handle function level resets.

Alice cleans up code that was added to the incorrect function during a
merge.

Filip adds a change to display an error message when a module is inserted
that does not meet the thermal requirements, Talking Heads "Burning Down
the House" comes to mind.  Also fixed a flow director filter issue where
a variable was not being cleared which stores the filter number to be
removed from the list when the firmware refused to add the requested
filter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 23:29:38 +01:00
Bjorn Helgaas
d2746fe538 bnx2x: Use pci_ari_enabled() instead of local copy
Use pci_ari_enabled() from the PCI core instead of the identical local copy
bnx2x_ari_enabled().  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06 10:06:06 -07:00