For SMC it is important to know the current port state of RoCE devices.
Monitoring port states has been triggered, when a RoCE device was added
to the pnet table. To support future alternatives to the pnet table the
monitoring of ports is made independent of the existence of a pnet table.
It starts once the smc_ib_device is established.
Due to this change smc_ib_remember_port_attr() is now a local function
and shuffling its location and the location of its used functions
makes any forward references obsolete.
And the duplicate SMC_MAX_PORTS definition is removed.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata says:
====================
Fixes for running mirror-to-gretap tests on veth
The forwarding selftests infrastructure makes it possible to run the
individual tests on a purely software netdevices. Names of interfaces to
run the test with can be passed as command line arguments to a test.
lib.sh then creates veth pairs backing the interfaces if none exist in
the system.
However, the tests need to recognize that they might be run on a soft
device. Many mirror-to-gretap tests are buggy in this regard. This patch
set aims to fix the problems in running mirror-to-gretap tests on veth
devices.
In patch #1, a service function is split out of setup_wait().
In patch #2, installing a trap is made optional.
In patch #3, tc filters in several tests are tweaked to work with veth.
In patch #4, the logic for waiting for neighbor is fixed for veth.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When running the test on soft devices, there's no mechanism to
gratuitously start resolving the neighbor for remote tunnel endpoint.
So instead of passively waiting, wait for the device to be up, and then
probe the neighbor with a ping.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running mirror_gre_bridge_1d_vlan tests on veth, several issues
cause spurious failures:
- vlan_ethtype should be ip, not ipv6 even in mirror-to-ip6gretap case,
because the overlay packet is still IPv4.
- Similarly ip_proto matches the innermost IP protocol, so can't be used
to filter out GRE packet. Drop the corresponding condition.
- Because the above fixes the filters to match in slow path as well,
they need to be made skip_hw so as not to double-count packets.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several cases where traffic that would normally be forwarded
in silicon needs to be observed in slow path. That's achieved by
trapping such traffic, and the functions trap_install() and
trap_uninstall() realize that. However, such treatment is obviously
wrong if the device in question is actually a soft device not backed by
an ASIC.
Therefore try to trap if possible, but fall back to inserting a continue
if not.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split out of setup_wait() a function setup_wait_dev() that waits for a
single device. This gives tests the opportunity to wait for a selected
device after they tinkered with its upness.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Radhey Shyam Pandey says:
====================
Fixes coding style in xilinx_emaclite.c
This patchset fixes checkpatch and kernel-doc warnings in
xilinx emaclite driver. No functional change.
Changes from v2:
-In 2/5 patch refactor if-else to make failure path return early.
-In 2/5 patch coalesce the format onto a single line and add the
missing space after the comma.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes below checkpatch checks-
CHECK: spaces preferred around that '*' (ctx:VxV)
CHECK: No space is necessary after a cast
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes below checkpatch warnings-
WARNING: Block comments use a trailing */ on a separate line
WARNING: Block comments use * on subsequent lines
WARNING: networking block comments don't use an empty /* line,
use /* Comment
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes below kernel-doc warnings:
Function parameter or member 'maxlen' not described in 'xemaclite_recv_data'
Function parameter or member 'address'not described in 'xemaclite_set_mac_address'
Excess function parameter 'addr' description in 'xemaclite_set_mac_address'
No description found for return value of 'xemaclite_interrupt'
No description found for return value of 'xemaclite_mdio_write'
Function parameter or member 'dev' not described in 'xemaclite_mdio_setup'
Excess function parameter 'ofdev' description in 'xemaclite_mdio_setup'
No description found for return value of 'xemaclite_open'
No description found for return value of 'xemaclite_close'
Excess function parameter 'match' description in 'xemaclite_of_probe'
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove else as it is not required with if doing a return.
It also coalesce the format onto a single line and add the
missing space after the comma. Fixes below checkpatch warning-
WARNING: else is not generally useful after a break or return
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch hardcoded function name with a reference to __func__ making
the code more maintainable. Address below checkpatch warning:
WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_read',
this function's name, in a string
+ "xemaclite_mdio_read(phy_id=%i, reg=%x) == %x\n",
WARNING: Prefer using '"%s...", __func__' to using 'xemaclite_mdio_write',
this function's name, in a string
+ "xemaclite_mdio_write(phy_id=%i, reg=%x, val=%x)\n",
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier says:
====================
net: mvpp2: Add big-endian support
This series allows to use PPv2 on system built as big endian.
The first patch fixes the way we represent TX and RX descriptors, so that
they used fixed little endianness as expected by the PPv2 controller.
The second reworks the way we handle the software representation of the
Header Parser entries, so that we don't use a union of arrays.
The last two patches fixes some incorrect byte swapping logic, that wen't
un-noticed on little-endian.
This whole series doesn't fix any existing bug for little-endian systems, and
since big-endian never worked for this driver, I didn't include 'fixes' tags.
This was tested on MacchiatoBin (Armada 8040).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When checking the skb->protocol field, we have to make sure we use the
proper endianness using htons, and not swab16.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlan IDs must not be swapped when creating Header Parser entries. This
has no effect on little-endian systems, but is wrong for big-endian.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPv2's Header Parser use some large TCAM and SRAM entries, that are
duplicated in software so that we can write them to hardware only when
we are done modifying them.
Currently, PPv2 uses a union containing arrays of u32 and u8 to represent
these entries, to facilitate byte per byte access. This representation is
broken when we want to support big endian, and this makes the code
confusing to read.
This patch drops the union, and simply stores the TCAM and SRAM entries
as u32 arrays, each entry corresponding to a 32-bit register.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2 controller always expect descriptors to be in little endian. We
must therefore force descriptors to use that format, and convert to the
host endianness when necessary.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sk_rmem_alloc is larger than the receive buffer and we can't
schedule more memory for it, the skb will be dropped.
In above situation, if this skb is put into the ofo queue,
LINUX_MIB_TCPOFODROP is incremented to track it.
While if this skb is put into the receive queue, there's no record.
So a new SNMP counter is introduced to track this behavior.
LINUX_MIB_TCPRCVQDROP: Number of packets meant to be queued in rcv queue
but dropped because socket rcvbuf limit hit.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for CBS reconfiguration using the TC application.
A new callback was added to TC ops struct and another one to DMA ops to
reconfigure the channel mode.
Tested in GMAC5.10.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5e netdevice driver updates:
- Boris Pismenny added the support for UDP GSO in the first two patches.
Impressive performance numbers are included in the commit message,
@Line rate with ~half of the cpu utilization compared to non offload
or no GSO at all.
- From Tariq Toukan:
- Convert large order kzalloc allocations to kvzalloc.
- Added performance diagnostic statistics to several places in data path.
From Saeed and Eran,
- Update NIC HW stats on demand only, this is to eliminate the background
thread needed to update some HW statistics in the driver cache in
order to report error and drop counters from HW in ndo_get_stats.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbNVbbAAoJEEg/ir3gV/o+1SMH/R/jBMgqWKzuDL+QUTCm6BNH
Gcu7OPd3CUGT3wUujJ4sgVmYzyp70GZx0f843ct3Bmwcfwh8Yu/eOju7ldoQq2ns
6oz8sL6yvg/zmMfkc7DYFHrJSGu3Qh7fje+8zKVdoGlHkIFwR07bRkQolTKL6hem
qZHbn5o2RhFYLxjlGa7E7w41hXMx27gmHk4V/YdUGzHrExGB0fGbUXkkveCYiGYk
sB/JDSi02xjmEV2Nw8dX5jdZoBivaG2kPcBRD6Gpgmb+hntjIyTqwGdoJ7jDy1vj
A5wWHboj+Q/LtUNvpbb48fc/hhciddSxeB+c/1DMHjd4TchnBmPXyXwz5/7eTj8=
=xft1
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-updates-2018-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-06-28
mlx5e netdevice driver updates:
- Boris Pismenny added the support for UDP GSO in the first two patches.
Impressive performance numbers are included in the commit message,
@Line rate with ~half of the cpu utilization compared to non offload
or no GSO at all.
- From Tariq Toukan:
- Convert large order kzalloc allocations to kvzalloc.
- Added performance diagnostic statistics to several places in data path.
From Saeed and Eran,
- Update NIC HW stats on demand only, this is to eliminate the background
thread needed to update some HW statistics in the driver cache in
order to report error and drop counters from HW in ndo_get_stats.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
net: Geneve options support for TC act_tunnel_key
Simon & Pieter say:
This set adds Geneve Options support to the TC tunnel key action.
It provides the plumbing required to configure Geneve variable length
options. The options can be configured in the form CLASS:TYPE:DATA,
where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit
hexadecimal value and DATA as a variable length hexadecimal value.
Additionally multiple options may be listed using a comma delimiter.
v2:
- fix sparse warnings in patches 3 and 4 (first one reported by
build bot).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow setting tunnel options using the act_tunnel_key action.
Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.
# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
geneve_opts 0102:80:00800022,0102:80:00800022 \
action mirred egress redirect dev geneve0
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check the tunnel option type stored in tunnel flags when creating options
for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel
options on interfaces that are not associated with them.
Make sure all users of the infrastructure set correct flags, for the BPF
helper we have to set all bits to keep backward compatibility.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add extended ack support for the tunnel key action by using NL_SET_ERR_MSG
during validation of user input.
Cc: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Metadata may be NULL for one of two reasons:
* Missing user input
* Failure to allocate the metadata dst
Disambiguate these case by returning -EINVAL for the former and -ENOMEM
for the latter rather than -EINVAL for both cases.
This is in preparation for using extended ack to provide more information
to users when parsing their input.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is used to change TX workrequests, which helps in
host->vf communication.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The present TX workrequest(FW_ETH_TX_PKT_WR) cant be used for
host->vf communication, since it doesn't loopback the outgoing
packets to virtual interfaces on the same port. This can be done
using FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine what WR to use for
TX path. Support for setting this flags by user is added in next
commit.
Based on the original work by : Casey Leedom <leedom@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This feature is actually already supported by sk->sk_reuse which can be
set by socket level opt SO_REUSEADDR. But it's not working exactly as
RFC6458 demands in section 8.1.27, like:
- This option only supports one-to-one style SCTP sockets
- This socket option must not be used after calling bind()
or sctp_bindx().
Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
work in linux.
To separate it from the socket level version, this patch adds 'reuse' in
sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
setup limitations that are needed when it is being enabled.
"It should be noted that the behavior of the socket-level socket option
to reuse ports and/or addresses for SCTP sockets is unspecified", so it
leaves SO_REUSEADDR as is for the compatibility.
Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
functionality is nearly identical to SO_REUSEADDR, but with some
extra restrictions. Here it uses 'reuse' in sctp_sock instead of
'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
added in another patch.
Thanks to Neil to make this clear.
v1->v2:
- add sctp_sk->reuse to separate it from the socket level version.
v2->v3:
- improve changelog according to Marcelo's suggestion.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add constants and callback functions for the dwmac on px30 Soc.
The base structure is the same, but registers and the bits in
them are moved slightly, and add the clk_mac_speed for selecting
mac speed.
Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert says:
====================
ila: Cleanup
Perform some cleanup in ILA code. This includes:
- Fix rhashtable walk for cases where nl dumps are done with muliple
function calls. Add a skip index to skip over entries in
a node that have been previously visitied. Call rhashtable_walk_peek
to avoid dropping items between calls to ila_nl_dump.
- Call alloc_bucket_spinlocks to create bucket locks.
- Split out module initialization and netlink definitions into
separate files.
- Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ILA_CMD_FLUSH netlink command to clear the ILA translation table.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a main ila file that contains the module initialization functions
as well as netlink definitions. Previously these were defined in
ila_xlat and ila_common. This approach allows better extensibility.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allocate the array of bucket locks for the hash table we now
call library function alloc_bucket_spinlocks.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Perform better EAGAIN handling, handle case where ila_dump_info
fails and we missed objects in the dump, and add a skip index
to skip over ila entires in a list on a rhashtable node that have
already been visited (by a previous call to ila_nl_dump).
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li says:
====================
net: hns3: a few code improvements
This patchset fixes a few code stylistic issues from
concentrated review, no functional changes introduced.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
MACRO lower_32_bits and upper_32_bits can help to get bits 0-31
and bits 32-63 of a number, so just use it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hclge_hw is embedded in hclge_dev, so use container_of instead of
back to get hclge_dev.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interface h->ae_algo->ops->put_vector is called in both
hns3_nic_dealloc_vector_data and hns3_nic_uninit_vector_data in
hns3_client_uninit, this will cause vector freed twice.
This patch remove the Redundant put_vector to make vector freed
only once.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print the ret value in error information can help find the reason.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extraction an interface for state init|uninit to make the code
easier to read.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
linux/slab.h is not used in hnae3.h, this patch removes it.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first bd of a packet is invalid and invalid ring head for tx
IRQ is not offen, they may occur when there is error,
Add unlikely for error check branch is better for performance.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW supports UDP, TCP and SCTP packets checksum for both ipv4 and
ipv6, but do not support other type packets checksum for ipv4 or
ipv6.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the hdev->vector_status[vector_id] is already HCLGE_INVALID_VPORT,
should log the error and return.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interface init_client_instance and uninit_client_instance
do not register anything, only initialize the client instance.
This patch rename the related interface to make the function
name to indicate the purpose.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In hclge_unmap_ring_frm_vector, there are 2 steps:
step 1: get vector index.
step 2 unbind ring with vector.
But it gets vector id again in step 2 interface. This patch
removes hclge_get_vector_index from hclge_bind_ring_with_vector,
and make the step the same with hns3 PF driver.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable periodic stats update background thread and update stats in
background on demand when ndo_get_stats is called.
Having a background thread running in the driver all the time is bad for
power consumption and normally a user space daemon will query the stats
once every specific interval, so ideally the background thread and its
interval can be done in user space..
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
A per-ring counter for NOP operations already exists.
Here I add a counter that sums them up.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>