Merge branch 'switchdev_spring_cleanup'

Scott Feldman says:

====================
switchdev: spring cleanup

v7:

Address review comments:

 - [Jiri] split the br_setlink and br_dellink reverts into their own patches
 - [Jiri] some parameter cleanup of rocker's memory allocators
 - [Jiri] pass trans mode as formal parameter rather than hanging off of
     rocker_port.

v6:

Address review comments:

 - [Jiri] split a couple of patches into one-logical-change per patch
 - [Joe Perches] revert checkpatch -f changes for wrapped lines with long
     symbols.

v5:

Address review comments:

 - [Jiri] include Jiri's s/swdev/switchdev rename patches up front.
 - [Jiri] squash some patches.  Now setlink/dellink/getlink patches are in
     three parts: new implementation, convert drivers to new, delete old impl.
 - [Jiri] some minor variable renames
 - [Jiri] use BUG_ON rather than WARN when COMMIT phase fails when PREPARE
     phase said it was safe to come into the water.
 - [Simon] rocker: fix a few transaction prepare-commit cases that were wrong.
     This was the bulk of the changes in v5.

v4:

Well, it was a lot of work, but the prepare-commit transaction model now works
as davem advises: if prepare fails, abort the transaction.  The driver must do
resource reservations up front in the prepare phase and return those resources
if aborting.  The commit phase then uses the reserved resources.  The good news
is the driver code (for rocker) now handles resource allocation failures better
by not leaving partial device or driver state behind.  This is a side-effect of the
prepare phase where state isn't modified; only validation of inputs and
resource reservations happen in the prepare phase.  Since we're supporting
setting attrs and adding objs across lower devs in the stacked case, we need to
hold rtnl_lock (or ensure rtnl_lock is held) so lower devs don't move on us
during the prepare-commit transaction.  DSA driver code skips the prepare phase
and goes straight for the commit phase since no up-front allocations are done
and no device failures (that could be detected in the prepare phase) can
happen.
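
To make the transaction model concrete, here is a minimal hypothetical sketch
(not rocker code) of a driver attr_set op structured around the phases; the
example_* struct and helpers are invented for illustration:

#include <linux/netdevice.h>
#include <net/switchdev.h>

struct example_port;	/* hypothetical driver private struct */
static int example_port_reserve(struct example_port *port,
				struct switchdev_attr *attr);
static void example_port_unreserve(struct example_port *port,
				   struct switchdev_attr *attr);
static int example_port_commit(struct example_port *port,
			       struct switchdev_attr *attr);

static int example_port_attr_set(struct net_device *dev,
				 struct switchdev_attr *attr)
{
	struct example_port *port = netdev_priv(dev);

	switch (attr->trans) {
	case SWITCHDEV_TRANS_PREPARE:
		/* Validate inputs and reserve memory/table space now;
		 * don't touch device or driver state yet.
		 */
		return example_port_reserve(port, attr);
	case SWITCHDEV_TRANS_ABORT:
		/* Prepare failed somewhere in the stack: return the
		 * resources reserved in the prepare phase.
		 */
		example_port_unreserve(port, attr);
		return 0;
	case SWITCHDEV_TRANS_NONE:
	case SWITCHDEV_TRANS_COMMIT:
		/* Use only pre-reserved resources; this must not fail
		 * if prepare said it was safe.
		 */
		return example_port_commit(port, attr);
	default:
		return -EOPNOTSUPP;
	}
}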

Remove NETIF_F_HW_SWITCH_OFFLOAD from rocker and the swdev_attr_set/get
wrappers.  DSA doesn't set NETIF_F_HW_SWITCH_OFFLOAD, so it can't be in
swdev_attr_set/get.  rocker doesn't need it; or rather can't support
NETIF_F_HW_SWITCH_OFFLOAD being set/cleared at run-time after the device
port is already up and offloading L2/L3.  NETIF_F_HW_SWITCH_OFFLOAD is still
left as a feature flag for drivers that can use it.

Drop the renaming patch for netdev_switch_notifier.  Other renames are a
result of moving to the attr get/set or obj add/del model.  Everything
but the netdev_switch_notifier is still prefixed with "swdev_".

v3:

Move to two-phase prepare-commit transaction model for attr set and obj add.
The driver gets a chance in the prepare phase to NACK the transaction if the
device lacks resources or support.

v2:

Address review comments:

 - [Jiri] squash a few related patches
 - [Roopa] don't remove NETIF_F_HW_SWITCH_OFFLOAD
 - [Roopa] address VLAN setlink/dellink
 - [Ronen] print warning if attr set revert fails

Not addressed:

 - Using something other than "swdev_" prefix
 - Vendor extensions

The patch set grew a bit to not only support port attr get/set but also add
support for port obj add/del.  Examples of port objs are VLANs, FDB entries, and
FIB entries.  The VLAN support now allows the swdev driver to get VLAN ranges
and flags like PVID and "untagged".  Sridhar will be adding FDB obj support
in a follow-on patch.
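
As a rough caller-side illustration (not code from the patch set), an untagged
VLAN range could be expressed with the new object API like this;
example_port_add_vlan_range() is an invented helper and the flag comes from
if_bridge.h:

#include <linux/if_bridge.h>
#include <net/switchdev.h>

/* Illustrative only: offload VLANs 100-110, untagged on egress, onto a
 * switch port netdev.  Caller must hold rtnl_lock.
 */
static int example_port_add_vlan_range(struct net_device *port_dev)
{
	struct switchdev_obj obj = {
		.id = SWITCHDEV_OBJ_PORT_VLAN,
		.vlan = {
			.flags = BRIDGE_VLAN_INFO_UNTAGGED,
			.vid_start = 100,
			.vid_end = 110,
		},
	};

	return switchdev_port_obj_add(port_dev, &obj);
}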

v1:

The main theme of this patch set is to cleanup swdev in preparation for
new features or fixes to be added soon.  We have a pretty good idea now how
to handle stacked drivers in swdev, but there were some loose ends.  For
example, if a set failed in the middle of walking the lower devs, we would
leave the system in an undefined state...there was no way to recover back to
the previous state.  Speaking of sets, we also recognized a pattern: most
swdev API accesses are gets or sets of port attributes, so go ahead and make
port attr get/set the central swdev API, and convert everything that is
set-ish/get-ish to this new API.
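
For example, the bridge's STP state push (see the br_stp hunk below) becomes a
single attr set; a minimal caller-side sketch, with example_push_stp_state() as
an invented name:

#include <net/switchdev.h>

/* Sketch: push an STP state change down as a port attribute, mirroring
 * what br_set_state() does in the bridge patch below.
 */
static void example_push_stp_state(struct net_device *port_dev, u8 state)
{
	struct switchdev_attr attr = {
		.id = SWITCHDEV_ATTR_PORT_STP_STATE,
		.stp_state = state,
	};
	int err;

	err = switchdev_port_attr_set(port_dev, &attr);
	if (err && err != -EOPNOTSUPP)
		netdev_warn(port_dev, "failed to offload STP state\n");
}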

Features/fixes that should follow from this cleanup:

 - solve the duplicate pkt forwarding issue
 - get/set bridge attrs, like ageing_time, from/to device
 - get/set more bridge port attrs from/to device

There are some rename cleanups tagging along at the end, to give swdev
consistent naming.

And finally, some much-needed updates to the switchdev.txt documentation to
capture the current state of swdev.  Hopefully, we can do a better job keeping
this document up-to-date.

Tested with rocker, of course, to make sure nothing functional broke.  There
are a couple minor tweaks to DSA code for getting switch ID and setting STP
updates to use the new API, but not expecting any breakage there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller 2015-05-12 18:43:56 -04:00
commit a62b70ddd1
17 changed files with 1735 additions and 754 deletions

View File

@ -1,59 +1,355 @@
Switch (and switch-ish) device drivers HOWTO
===========================
Ethernet switch device driver model (switchdev)
===============================================
Copyright (c) 2014 Jiri Pirko <jiri@resnulli.us>
Copyright (c) 2014-2015 Scott Feldman <sfeldma@gmail.com>
Please note that the word "switch" is used here in a very generic meaning.
This includes devices supporting L2/L3 but also various flow offloading chips,
including switches embedded into SR-IOV NICs.
Let's describe a topology a bit. Imagine the following example:
The Ethernet switch device driver model (switchdev) is an in-kernel driver
model for switch devices which offload the forwarding (data) plane from the
kernel.
+----------------------------+ +---------------+
| SOME switch chip | | CPU |
+----------------------------+ +---------------+
port1 port2 port3 port4 MNGMNT | PCI-E |
| | | | | +---------------+
PHY PHY | | | | NIC0 NIC1
| | | | | |
| | +- PCI-E -+ | |
| +------- MII -------+ |
+------------- MII ------------+
Figure 1 is a block diagram showing the components of the switchdev model for
an example setup using a data-center-class switch ASIC chip. Other setups
with SR-IOV or soft switches, such as OVS, are possible.
In this example, there are two independent lines between the switch silicon
and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
separate from the switch driver. SOME switch chip is managed by a driver
via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
connected to some other type of bus.
Now, for the previous example, here is the representation in the kernel:
                             User-space tools

       user space                   |
      +-------------------------------------------------------------------+
       kernel                       | Netlink
                                    |
                     +--------------+-------------------------------+
                     |         Network stack                        |
                     |           (Linux)                            |
                     |                                              |
                     +----------------------------------------------+
                           sw1p2     sw1p4      sw1p6
                      sw1p1  +  sw1p3  +  sw1p5  +          eth1
                        +    |    +    |    +    |            +
                        |    |    |    |    |    |            |
                     +--+----+----+----+----+----+---+  +-----+-----+
                     |         Switch driver         |  |    mgmt   |
                     |        (this document)        |  |   driver  |
                     |                               |  |           |
                     +--------------+----------------+  +-----------+
                                    |                         |
       kernel                       | HW bus (eg PCI)         |
      +-------------------------------------------------------------------+
       hardware                     |                         |
                     +--------------+---+------------+        |
                     |         Switch device (sw1)   |        |
                     |  +----+                       +--------+
                     |  |    v offloaded data path   | mgmt port
                     |  |    |                       |
                     +--|----|----+----+----+----+---+
                        |    |    |    |    |    |
                        +    +    +    +    +    +
                       p1   p2   p3   p4   p5   p6
                             front-panel ports
+----------------------------+ +---------------+
| SOME switch chip | | CPU |
+----------------------------+ +---------------+
sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT | PCI-E |
| | | | | +---------------+
PHY PHY | | | | eth0 eth1
| | | | | |
| | +- PCI-E -+ | |
| +------- MII -------+ |
+------------- MII ------------+
Fig 1.
Let's call the example switch driver for SOME switch chip "SOMEswitch". This
driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
created for each port of a switch. These netdevices are instances
of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
of the switch chip. eth0 and eth1 are instances of some other existing driver.
The only difference between the switch-port netdevice and an ordinary netdevice
is that it implements a couple more NDOs:
Include Files
-------------
ndo_switch_parent_id_get - This returns the same ID for two port netdevices
of the same physical switch chip. This is
mandatory to be implemented by all switch drivers
and serves the caller for recognition of a port
netdevice.
ndo_switch_parent_* - Functions that serve for a manipulation of the switch
chip itself (it can be thought of as a "parent" of the
port, therefore the name). They are not port-specific.
Caller might use arbitrary port netdevice of the same
switch and it will make no difference.
ndo_switch_port_* - Functions that serve for a port-specific manipulation.
#include <linux/netdevice.h>
#include <net/switchdev.h>
Configuration
-------------
Use "depends NET_SWITCHDEV" in driver's Kconfig to ensure switchdev model
support is built for driver.
Switch Ports
------------
On switchdev driver initialization, the driver will allocate and register a
struct net_device (using register_netdev()) for each enumerated physical switch
port, called the port netdev. A port netdev is the software representation of
the physical port and provides a conduit for control traffic to/from the
controller (the kernel) and the network, as well as an anchor point for higher
level constructs such as bridges, bonds, VLANs, tunnels, and L3 routers. Using
standard netdev tools (iproute2, ethtool, etc), the port netdev can also
provide the user with access to the physical properties of the switch port, such
as PHY link state and I/O statistics.
There is (currently) no higher-level kernel object for the switch beyond the
port netdevs. All of the switchdev driver ops are netdev ops or switchdev ops.
A switch management port is outside the scope of the switchdev driver model.
Typically, the management port does not participate in the offloaded data plane
and is driven by a different driver, such as a NIC driver, on the management
port device.
Port Netdev Naming
^^^^^^^^^^^^^^^^^^
Udev rules should be used for port netdev naming, using some unique attribute
of the port as a key, for example the port MAC address or the port PHYS name.
Hard-coding of kernel netdev names within the driver is discouraged; let the
kernel pick the default netdev name, and let udev set the final name based on a
port attribute.
Using port PHYS name (ndo_get_phys_port_name) for the key is particularly
useful for dynamically-named ports where the device names its ports based on
external configuration. For example, if a physical 40G port is split logically
into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
name for each port using port PHYS name. The udev rule would be:
SUBSYSTEM=="net", ACTION=="add", DRIVER="<driver>", ATTR{phys_port_name}!="", \
NAME="$attr{phys_port_name}"
Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0
would be sub-port 0 on port 1 on switch 1.
Switch ID
^^^^^^^^^
The switchdev driver must implement the switchdev op switchdev_port_attr_get for
SWITCHDEV_ATTR_PORT_PARENT_ID for each port netdev, returning the same physical ID
for each port of a switch. The ID must be unique between switches on the same
system. The ID does not need to be unique between switches on different
systems.
The switch ID is used to locate ports on a switch and to know if aggregated
ports belong to the same switch.
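A minimal sketch of the op, assuming a hypothetical driver that keeps a
per-switch ID in its port private data (the example_switch/example_port types
are invented; the real rocker and DSA conversions appear later in this diff):

#include <linux/netdevice.h>
#include <linux/string.h>
#include <net/switchdev.h>

struct example_switch {
	u64 id;				/* unique per physical switch */
};

struct example_port {
	struct example_switch *sw;	/* parent switch */
};

static int example_port_attr_get(struct net_device *dev,
				 struct switchdev_attr *attr)
{
	struct example_port *port = netdev_priv(dev);

	switch (attr->id) {
	case SWITCHDEV_ATTR_PORT_PARENT_ID:
		attr->ppid.id_len = sizeof(port->sw->id);
		memcpy(attr->ppid.id, &port->sw->id, attr->ppid.id_len);
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

static const struct switchdev_ops example_switchdev_ops = {
	.switchdev_port_attr_get = example_port_attr_get,
};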
Port Features
^^^^^^^^^^^^^
NETIF_F_NETNS_LOCAL
If the switchdev driver (and device) only supports offloading of the default
network namespace (netns), the driver should set this feature flag to prevent
the port netdev from being moved out of the default netns. A netns-aware
driver/device would not set this flag and be responsible for partitioning
hardware to preserve netns containment. This means hardware cannot forward
traffic from a port in one namespace to another port in another namespace.
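For the simpler case, a hypothetical non-netns-aware driver would just set the
flag when it sets up the port netdev (illustrative sketch only):

#include <linux/netdevice.h>

/* Pin the port netdev to the default netns when the device cannot
 * partition its hardware per namespace.
 */
static void example_port_setup(struct net_device *dev)
{
	dev->features |= NETIF_F_NETNS_LOCAL;
}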
Port Topology
^^^^^^^^^^^^^
The port netdevs representing the physical switch ports can be organized into
higher-level switching constructs. The default construct is a standalone
router port, used to offload L3 forwarding. Two or more ports can be bonded
together to form a LAG. Two or more ports (or LAGs) can be bridged to bridge
L2 networks. VLANs can be applied to sub-divide L2 networks. L2-over-L3
tunnels can be built on ports. These constructs are built using standard Linux
tools such as the bridge driver, the bonding/team drivers, and netlink-based
tools such as iproute2.
The switchdev driver can know a particular port's position in the topology by
monitoring NETDEV_CHANGEUPPER notifications. For example, a port moved into a
bond will see its upper master change. If that bond is moved into a bridge,
the bond's upper master will change. And so on. The driver will track such
movements to know what position a port is in within the overall topology by
registering for netdevice events and acting on NETDEV_CHANGEUPPER.
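A hedged sketch of that tracking, with example_port_is_ours() and
example_port_refresh_uppers() standing in for driver-specific logic:

#include <linux/netdevice.h>
#include <linux/notifier.h>

/* hypothetical driver helpers */
static bool example_port_is_ours(const struct net_device *dev);
static void example_port_refresh_uppers(struct net_device *dev);

static int example_netdevice_event(struct notifier_block *nb,
				   unsigned long event, void *ptr)
{
	struct net_device *dev = netdev_notifier_info_to_dev(ptr);

	if (!example_port_is_ours(dev))
		return NOTIFY_DONE;

	switch (event) {
	case NETDEV_CHANGEUPPER:
		/* port was linked to or unlinked from a bond/team/bridge;
		 * re-resolve which masters sit above it
		 */
		example_port_refresh_uppers(dev);
		break;
	}

	return NOTIFY_DONE;
}

static struct notifier_block example_netdevice_nb = {
	.notifier_call = example_netdevice_event,
};

/* registered at driver init with
 * register_netdevice_notifier(&example_netdevice_nb)
 */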
L2 Forwarding Offload
---------------------
The idea is to offload the L2 data forwarding (switching) path from the kernel
to the switchdev device by mirroring bridge FDB entries down to the device. An
FDB entry is the {port, MAC, VLAN} tuple forwarding destination.
To offload L2 bridging, the switchdev driver/device should support:
- Static FDB entries installed on a bridge port
- Notification of learned/forgotten src mac/vlans from device
- STP state changes on the port
- VLAN flooding of multicast/broadcast and unknown unicast packets
Static FDB Entries
^^^^^^^^^^^^^^^^^^
The switchdev driver should implement ndo_fdb_add, ndo_fdb_del and ndo_fdb_dump
to support static FDB entries installed to the device. Static bridge FDB
entries are installed, for example, using iproute2 bridge cmd:
bridge fdb add ADDR dev DEV [vlan VID] [self]
Note: by default, the bridge does not filter on VLAN and only bridges untagged
traffic. To enable VLAN support, turn on VLAN filtering:
echo 1 >/sys/class/net/<bridge>/bridge/vlan_filtering
Notification of Learned/Forgotten Source MAC/VLANs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The switch device will learn/forget source MAC address/VLAN on ingress packets
and notify the switch driver of the mac/vlan/port tuples. The switch driver,
in turn, will notify the bridge driver using the switchdev notifier call:
err = call_switchdev_notifiers(val, dev, info);
Where val is SWITCHDEV_FDB_ADD when learning and SWITCHDEV_FDB_DEL when forgetting, and
info points to a struct switchdev_notifier_fdb_info. On SWITCHDEV_FDB_ADD, the bridge
driver will install the FDB entry into the bridge's FDB and mark the entry as
NTF_EXT_LEARNED. The iproute2 bridge command will label these entries
"offload":
$ bridge fdb
52:54:00:12:35:01 dev sw1p1 master br0 permanent
00:02:00:00:02:00 dev sw1p1 master br0 offload
00:02:00:00:02:00 dev sw1p1 self
52:54:00:12:35:02 dev sw1p2 master br0 permanent
00:02:00:00:03:00 dev sw1p2 master br0 offload
00:02:00:00:03:00 dev sw1p2 self
33:33:00:00:00:01 dev eth0 self permanent
01:00:5e:00:00:01 dev eth0 self permanent
33:33:ff:00:00:00 dev eth0 self permanent
01:80:c2:00:00:0e dev eth0 self permanent
33:33:00:00:00:01 dev br0 self permanent
01:00:5e:00:00:01 dev br0 self permanent
33:33:ff:12:35:01 dev br0 self permanent
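A hedged driver-side sketch of the notifier call described above (the
forgotten case is identical with SWITCHDEV_FDB_DEL); example_port_fdb_learned()
is an invented helper and must run in process context, since the notifier
chain takes a mutex:

#include <net/switchdev.h>

/* Report a source MAC/VLAN the device just learned on a port. */
static int example_port_fdb_learned(struct net_device *port_dev,
				    const unsigned char *addr, u16 vid)
{
	struct switchdev_notifier_fdb_info info = {
		.addr = addr,
		.vid = vid,
	};

	return call_switchdev_notifiers(SWITCHDEV_FDB_ADD, port_dev,
					&info.info);
}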
Learning on the port should be disabled on the bridge using the bridge command:
bridge link set dev DEV learning off
Learning on the device port should be enabled, as well as learning_sync:
bridge link set dev DEV learning on self
bridge link set dev DEV learning_sync on self
Learning_sync attribute enables syncing of the learned/forgotten FDB entry to
the bridge's FDB. It's possible, but not optimal, to enable learning on the
device port and on the bridge port, and disable learning_sync.
To support learning and learning_sync port attributes, the driver implements
switchdev op switchdev_port_attr_get/set for SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS. The driver
should initialize the attributes to the hardware defaults.
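A hedged sketch of how a driver might back those two flags, assuming a
hypothetical example_port_priv struct that caches brport_flags at the hardware
defaults; these helpers would be called from the driver's
switchdev_port_attr_get/set ops on SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS:

#include <linux/if_bridge.h>
#include <net/switchdev.h>

struct example_port_priv {
	unsigned long brport_flags;	/* init to hardware defaults */
};

static int example_brport_flags_get(struct net_device *dev,
				    struct switchdev_attr *attr)
{
	struct example_port_priv *port = netdev_priv(dev);

	attr->brport_flags = port->brport_flags;
	return 0;
}

static int example_brport_flags_set(struct net_device *dev,
				    struct switchdev_attr *attr)
{
	struct example_port_priv *port = netdev_priv(dev);

	switch (attr->trans) {
	case SWITCHDEV_TRANS_PREPARE:
	case SWITCHDEV_TRANS_ABORT:
		return 0;	/* nothing to reserve for a flag change */
	default:
		break;
	}

	port->brport_flags = attr->brport_flags;
	/* ...program hardware learning on/off to match BR_LEARNING... */
	return 0;
}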
FDB Ageing
^^^^^^^^^^
There are two FDB ageing models supported: 1) ageing by the device, and 2)
ageing by the kernel. Ageing by the device is preferred if many FDB entries
are supported. The driver calls call_switchdev_notifiers(SWITCHDEV_FDB_DEL, ...) to
age out the FDB entry. In this model, ageing by the kernel should be turned
off. XXX: how to turn off ageing in kernel on a per-port basis or otherwise
prevent the kernel from ageing out the FDB entry?
In the kernel ageing model, the standard bridge ageing mechanism is used to age
out stale FDB entries. To keep an FDB entry "alive", the driver should refresh
the FDB entry by calling call_switchdev_notifiers(SWITCHDEV_FDB_ADD, ...). The
notification will reset the FDB entry's last-used time to now. The driver
should rate limit refresh notifications, for example, no more than once a
second. If the FDB entry expires, ndo_fdb_del is called to remove entry from
the device. XXX: this last part isn't currently correct: ndo_fdb_del isn't
called, so the stale entry remains in the device...this needs to get fixed.
FDB Flush
^^^^^^^^^
XXX: Unimplemented. Need to support FDB flush by bridge driver for port and
remove both static and learned FDB entries.
STP State Change on Port
^^^^^^^^^^^^^^^^^^^^^^^^
Internally or with a third-party STP protocol implementation (e.g. mstpd), the
bridge driver maintains the STP state for ports, and will notify the switch
driver of STP state changes on a port using the switchdev op
switchdev_port_attr_set for SWITCHDEV_ATTR_PORT_STP_STATE.
State is one of BR_STATE_*. The switch driver can use STP state updates to
update ingress packet filter list for the port. For example, if port is
DISABLED, no packets should pass, but if port moves to BLOCKED, then STP BPDUs
and other IEEE 01:80:c2:xx:xx:xx link-local multicast packets can pass.
Note that STP BPDUs are untagged and STP state applies to all VLANs on the port
so packet filters should be applied consistently across untagged and tagged
VLANs on the port.
Flooding L2 domain
^^^^^^^^^^^^^^^^^^
For a given L2 VLAN domain, the switch device should flood multicast/broadcast
and unknown unicast packets to all ports in domain, if allowed by port's
current STP state. The switch driver, knowing which ports are within which
vlan L2 domain, can program the switch device for flooding. The packet should
also be sent to the port netdev for processing by the bridge driver. The
bridge should not reflood the packet to the same ports the device flooded.
XXX: the mechanism to avoid duplicate flood packets is being discussed.
It is possible for the switch device to not handle flooding and push the
packets up to the bridge driver for flooding. This is not ideal as the number
of ports scales in the L2 domain, since the device is much more efficient at
flooding packets than software.
IGMP Snooping
^^^^^^^^^^^^^
XXX: complete this section
L3 routing
----------
Offloading L3 routing requires that device be programmed with FIB entries from
the kernel, with the device doing the FIB lookup and forwarding. The device
does a longest prefix match (LPM) on FIB entries matching route prefix and
forwards the packet to the matching FIB entry's nexthop(s) egress ports. To
program the device, the switchdev driver is called with add/delete ops for IPv4
and IPv6 FIB entries. For IPv4, the driver implements switchdev ops:
int (*switchdev_fib_ipv4_add)(struct net_device *dev,
__be32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type,
u32 nlflags, u32 tb_id);
int (*switchdev_fib_ipv4_del)(struct net_device *dev,
__be32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type,
u32 tb_id);
to add/delete IPv4 dst/dst_len prefix on table tb_id. The *fi structure holds
details on the route and the route's nexthops. *dev is one of the port netdevs
mentioned in the route's nexthop list. If the output port netdevs referenced
in the route's nexthop list don't all have the same switch ID, the driver is
not called to add/delete the FIB entry.
Routes offloaded to the device are labeled with "offload" in the ip route
listing:
$ ip route show
default via 192.168.0.2 dev eth0
11.0.0.0/30 dev sw1p1 proto kernel scope link src 11.0.0.2 offload
11.0.0.4/30 via 11.0.0.1 dev sw1p1 proto zebra metric 20 offload
11.0.0.8/30 dev sw1p2 proto kernel scope link src 11.0.0.10 offload
11.0.0.12/30 via 11.0.0.9 dev sw1p2 proto zebra metric 20 offload
12.0.0.2 proto zebra metric 30 offload
nexthop via 11.0.0.1 dev sw1p1 weight 1
nexthop via 11.0.0.9 dev sw1p2 weight 1
12.0.0.3 via 11.0.0.1 dev sw1p1 proto zebra metric 20 offload
12.0.0.4 via 11.0.0.9 dev sw1p2 proto zebra metric 20 offload
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.15
XXX: add/del IPv6 FIB API
Nexthop Resolution
^^^^^^^^^^^^^^^^^^
The FIB entry's nexthop list contains the nexthop tuple (gateway, dev), but for
the switch device to forward the packet with the correct dst mac address, the
nexthop gateways must be resolved to the neighbor's mac address. Neighbor mac
address discovery comes via the ARP (or ND) process and is available via the
arp_tbl neighbor table. To resolve the route's nexthop gateways, the driver
should trigger the kernel's neighbor resolution process. See the rocker
driver's rocker_port_ipv4_resolve() for an example.
The driver can monitor for updates to arp_tbl using the netevent notifier
NETEVENT_NEIGH_UPDATE. The device can be programmed with resolved nexthops
for the routes as arp_tbl updates arrive.
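A hedged sketch of triggering that resolution, patterned loosely on
rocker_port_ipv4_resolve(); example_resolve_nexthop() is an invented name:

#include <linux/err.h>
#include <net/arp.h>
#include <net/neighbour.h>

/* Kick off kernel neighbour resolution for a nexthop gateway; once
 * ARP/ND resolves it, the NETEVENT_NEIGH_UPDATE handler can program
 * the MAC into the device.
 */
static int example_resolve_nexthop(struct net_device *dev, __be32 gw)
{
	struct neighbour *n;

	n = __ipv4_neigh_lookup(dev, (__force u32)gw);
	if (!n) {
		n = neigh_create(&arp_tbl, &gw, dev);
		if (IS_ERR(n))
			return PTR_ERR(n);
	}

	if (!(n->nud_state & NUD_VALID))
		neigh_event_send(n, NULL);	/* start ARP resolution */

	neigh_release(n);
	return 0;
}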

View File

@ -1015,10 +1015,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
netdev_features_t mask;
struct slave *slave;
/* If any slave has the offload feature flag set,
* set the offload flag on the bond.
*/
mask = features | NETIF_F_HW_SWITCH_OFFLOAD;
mask = features;
features &= ~NETIF_F_ONE_FOR_ALL;
features |= NETIF_F_ALL_FOR_ALL;
@ -4039,8 +4036,9 @@ static const struct net_device_ops bond_netdev_ops = {
.ndo_add_slave = bond_enslave,
.ndo_del_slave = bond_release,
.ndo_fix_features = bond_fix_features,
.ndo_bridge_setlink = ndo_dflt_netdev_switch_port_bridge_setlink,
.ndo_bridge_dellink = ndo_dflt_netdev_switch_port_bridge_dellink,
.ndo_bridge_setlink = switchdev_port_bridge_setlink,
.ndo_bridge_getlink = switchdev_port_bridge_getlink,
.ndo_bridge_dellink = switchdev_port_bridge_dellink,
.ndo_features_check = passthru_features_check,
};

File diff suppressed because it is too large.

View File

@ -65,9 +65,9 @@ enum {
#define ROCKER_TEST_DMA_CTRL 0x0034
/* Rocker test register ctrl */
#define ROCKER_TEST_DMA_CTRL_CLEAR (1 << 0)
#define ROCKER_TEST_DMA_CTRL_FILL (1 << 1)
#define ROCKER_TEST_DMA_CTRL_INVERT (1 << 2)
#define ROCKER_TEST_DMA_CTRL_CLEAR BIT(0)
#define ROCKER_TEST_DMA_CTRL_FILL BIT(1)
#define ROCKER_TEST_DMA_CTRL_INVERT BIT(2)
/* Rocker DMA ring register offsets */
#define ROCKER_DMA_DESC_ADDR(x) (0x1000 + (x) * 32) /* 8-byte */
@ -79,7 +79,7 @@ enum {
#define ROCKER_DMA_DESC_RES1(x) (0x101c + (x) * 32)
/* Rocker dma ctrl register bits */
#define ROCKER_DMA_DESC_CTRL_RESET (1 << 0)
#define ROCKER_DMA_DESC_CTRL_RESET BIT(0)
/* Rocker DMA ring types */
enum rocker_dma_type {
@ -111,7 +111,7 @@ struct rocker_desc {
u16 comp_err;
};
#define ROCKER_DMA_DESC_COMP_ERR_GEN (1 << 15)
#define ROCKER_DMA_DESC_COMP_ERR_GEN BIT(15)
/* Rocker DMA TLV struct */
struct rocker_tlv {
@ -237,14 +237,14 @@ enum {
ROCKER_TLV_RX_MAX = __ROCKER_TLV_RX_MAX - 1,
};
#define ROCKER_RX_FLAGS_IPV4 (1 << 0)
#define ROCKER_RX_FLAGS_IPV6 (1 << 1)
#define ROCKER_RX_FLAGS_CSUM_CALC (1 << 2)
#define ROCKER_RX_FLAGS_IPV4_CSUM_GOOD (1 << 3)
#define ROCKER_RX_FLAGS_IP_FRAG (1 << 4)
#define ROCKER_RX_FLAGS_TCP (1 << 5)
#define ROCKER_RX_FLAGS_UDP (1 << 6)
#define ROCKER_RX_FLAGS_TCP_UDP_CSUM_GOOD (1 << 7)
#define ROCKER_RX_FLAGS_IPV4 BIT(0)
#define ROCKER_RX_FLAGS_IPV6 BIT(1)
#define ROCKER_RX_FLAGS_CSUM_CALC BIT(2)
#define ROCKER_RX_FLAGS_IPV4_CSUM_GOOD BIT(3)
#define ROCKER_RX_FLAGS_IP_FRAG BIT(4)
#define ROCKER_RX_FLAGS_TCP BIT(5)
#define ROCKER_RX_FLAGS_UDP BIT(6)
#define ROCKER_RX_FLAGS_TCP_UDP_CSUM_GOOD BIT(7)
enum {
ROCKER_TLV_TX_UNSPEC,
@ -460,6 +460,6 @@ enum rocker_of_dpa_overlay_type {
#define ROCKER_SWITCH_ID 0x0320 /* 8-byte */
/* Rocker control bits */
#define ROCKER_CONTROL_RESET (1 << 0)
#define ROCKER_CONTROL_RESET BIT(0)
#endif

View File

@ -1924,7 +1924,7 @@ static netdev_features_t team_fix_features(struct net_device *dev,
struct team *team = netdev_priv(dev);
netdev_features_t mask;
mask = features | NETIF_F_HW_SWITCH_OFFLOAD;
mask = features;
features &= ~NETIF_F_ONE_FOR_ALL;
features |= NETIF_F_ALL_FOR_ALL;
@ -1977,8 +1977,9 @@ static const struct net_device_ops team_netdev_ops = {
.ndo_del_slave = team_del_slave,
.ndo_fix_features = team_fix_features,
.ndo_change_carrier = team_change_carrier,
.ndo_bridge_setlink = ndo_dflt_netdev_switch_port_bridge_setlink,
.ndo_bridge_dellink = ndo_dflt_netdev_switch_port_bridge_dellink,
.ndo_bridge_setlink = switchdev_port_bridge_setlink,
.ndo_bridge_getlink = switchdev_port_bridge_getlink,
.ndo_bridge_dellink = switchdev_port_bridge_dellink,
.ndo_features_check = passthru_features_check,
};

View File

@ -66,7 +66,6 @@ enum {
NETIF_F_HW_VLAN_STAG_FILTER_BIT,/* Receive filtering on VLAN STAGs */
NETIF_F_HW_L2FW_DOFFLOAD_BIT, /* Allow L2 Forwarding in Hardware */
NETIF_F_BUSY_POLL_BIT, /* Busy poll */
NETIF_F_HW_SWITCH_OFFLOAD_BIT, /* HW switch offload */
/*
* Add your fresh new feature above and remember to update
@ -125,7 +124,6 @@ enum {
#define NETIF_F_HW_VLAN_STAG_TX __NETIF_F(HW_VLAN_STAG_TX)
#define NETIF_F_HW_L2FW_DOFFLOAD __NETIF_F(HW_L2FW_DOFFLOAD)
#define NETIF_F_BUSY_POLL __NETIF_F(BUSY_POLL)
#define NETIF_F_HW_SWITCH_OFFLOAD __NETIF_F(HW_SWITCH_OFFLOAD)
/* Features valid for ethtool to change */
/* = all defined minus driver/device-class-related */
@ -161,8 +159,7 @@ enum {
*/
#define NETIF_F_ONE_FOR_ALL (NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ROBUST | \
NETIF_F_SG | NETIF_F_HIGHDMA | \
NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED | \
NETIF_F_HW_SWITCH_OFFLOAD)
NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED)
/*
* If one device doesn't support one of these features, then disable it

View File

@ -1567,7 +1567,7 @@ struct net_device {
const struct net_device_ops *netdev_ops;
const struct ethtool_ops *ethtool_ops;
#ifdef CONFIG_NET_SWITCHDEV
const struct swdev_ops *swdev_ops;
const struct switchdev_ops *switchdev_ops;
#endif
const struct header_ops *header_ops;

View File

@ -14,153 +14,210 @@
#include <linux/netdevice.h>
#include <linux/notifier.h>
#define SWITCHDEV_F_NO_RECURSE BIT(0)
enum switchdev_trans {
SWITCHDEV_TRANS_NONE,
SWITCHDEV_TRANS_PREPARE,
SWITCHDEV_TRANS_ABORT,
SWITCHDEV_TRANS_COMMIT,
};
enum switchdev_attr_id {
SWITCHDEV_ATTR_UNDEFINED,
SWITCHDEV_ATTR_PORT_PARENT_ID,
SWITCHDEV_ATTR_PORT_STP_STATE,
SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS,
};
struct switchdev_attr {
enum switchdev_attr_id id;
enum switchdev_trans trans;
u32 flags;
union {
struct netdev_phys_item_id ppid; /* PORT_PARENT_ID */
u8 stp_state; /* PORT_STP_STATE */
unsigned long brport_flags; /* PORT_BRIDGE_FLAGS */
};
};
struct fib_info;
enum switchdev_obj_id {
SWITCHDEV_OBJ_UNDEFINED,
SWITCHDEV_OBJ_PORT_VLAN,
SWITCHDEV_OBJ_IPV4_FIB,
};
struct switchdev_obj {
enum switchdev_obj_id id;
enum switchdev_trans trans;
union {
struct switchdev_obj_vlan { /* PORT_VLAN */
u16 flags;
u16 vid_start;
u16 vid_end;
} vlan;
struct switchdev_obj_ipv4_fib { /* IPV4_FIB */
u32 dst;
int dst_len;
struct fib_info *fi;
u8 tos;
u8 type;
u32 nlflags;
u32 tb_id;
} ipv4_fib;
};
};
/**
* struct switchdev_ops - switchdev operations
*
* @swdev_parent_id_get: Called to get an ID of the switch chip this port
* is part of. If driver implements this, it indicates that it
* represents a port of a switch chip.
* @switchdev_port_attr_get: Get a port attribute (see switchdev_attr).
*
* @swdev_port_stp_update: Called to notify switch device port of bridge
* port STP state change.
* @switchdev_port_attr_set: Set a port attribute (see switchdev_attr).
*
* @swdev_fib_ipv4_add: Called to add/modify IPv4 route to switch device.
* @switchdev_port_obj_add: Add an object to port (see switchdev_obj).
*
* @swdev_fib_ipv4_del: Called to delete IPv4 route from switch device.
* @switchdev_port_obj_del: Delete an object from port (see switchdev_obj).
*/
struct swdev_ops {
int (*swdev_parent_id_get)(struct net_device *dev,
struct netdev_phys_item_id *psid);
int (*swdev_port_stp_update)(struct net_device *dev, u8 state);
int (*swdev_fib_ipv4_add)(struct net_device *dev, __be32 dst,
int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 nlflags,
u32 tb_id);
int (*swdev_fib_ipv4_del)(struct net_device *dev, __be32 dst,
int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 tb_id);
struct switchdev_ops {
int (*switchdev_port_attr_get)(struct net_device *dev,
struct switchdev_attr *attr);
int (*switchdev_port_attr_set)(struct net_device *dev,
struct switchdev_attr *attr);
int (*switchdev_port_obj_add)(struct net_device *dev,
struct switchdev_obj *obj);
int (*switchdev_port_obj_del)(struct net_device *dev,
struct switchdev_obj *obj);
};
enum netdev_switch_notifier_type {
NETDEV_SWITCH_FDB_ADD = 1,
NETDEV_SWITCH_FDB_DEL,
enum switchdev_notifier_type {
SWITCHDEV_FDB_ADD = 1,
SWITCHDEV_FDB_DEL,
};
struct netdev_switch_notifier_info {
struct switchdev_notifier_info {
struct net_device *dev;
};
struct netdev_switch_notifier_fdb_info {
struct netdev_switch_notifier_info info; /* must be first */
struct switchdev_notifier_fdb_info {
struct switchdev_notifier_info info; /* must be first */
const unsigned char *addr;
u16 vid;
};
static inline struct net_device *
netdev_switch_notifier_info_to_dev(const struct netdev_switch_notifier_info *info)
switchdev_notifier_info_to_dev(const struct switchdev_notifier_info *info)
{
return info->dev;
}
#ifdef CONFIG_NET_SWITCHDEV
int netdev_switch_parent_id_get(struct net_device *dev,
struct netdev_phys_item_id *psid);
int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
int register_netdev_switch_notifier(struct notifier_block *nb);
int unregister_netdev_switch_notifier(struct notifier_block *nb);
int call_netdev_switch_notifiers(unsigned long val, struct net_device *dev,
struct netdev_switch_notifier_info *info);
int netdev_switch_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int netdev_switch_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int ndo_dflt_netdev_switch_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int ndo_dflt_netdev_switch_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int netdev_switch_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 nlflags, u32 tb_id);
int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 tb_id);
void netdev_switch_fib_ipv4_abort(struct fib_info *fi);
int switchdev_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr);
int switchdev_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr);
int switchdev_port_obj_add(struct net_device *dev, struct switchdev_obj *obj);
int switchdev_port_obj_del(struct net_device *dev, struct switchdev_obj *obj);
int register_switchdev_notifier(struct notifier_block *nb);
int unregister_switchdev_notifier(struct notifier_block *nb);
int call_switchdev_notifiers(unsigned long val, struct net_device *dev,
struct switchdev_notifier_info *info);
int switchdev_port_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
struct net_device *dev, u32 filter_mask,
int nlflags);
int switchdev_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int switchdev_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 nlflags, u32 tb_id);
int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 tb_id);
void switchdev_fib_ipv4_abort(struct fib_info *fi);
#else
static inline int netdev_switch_parent_id_get(struct net_device *dev,
struct netdev_phys_item_id *psid)
static inline int switchdev_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr)
{
return -EOPNOTSUPP;
}
static inline int netdev_switch_port_stp_update(struct net_device *dev,
u8 state)
static inline int switchdev_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr)
{
return -EOPNOTSUPP;
}
static inline int register_netdev_switch_notifier(struct notifier_block *nb)
static inline int switchdev_port_obj_add(struct net_device *dev,
struct switchdev_obj *obj)
{
return -EOPNOTSUPP;
}
static inline int switchdev_port_obj_del(struct net_device *dev,
struct switchdev_obj *obj)
{
return -EOPNOTSUPP;
}
static inline int register_switchdev_notifier(struct notifier_block *nb)
{
return 0;
}
static inline int unregister_netdev_switch_notifier(struct notifier_block *nb)
static inline int unregister_switchdev_notifier(struct notifier_block *nb)
{
return 0;
}
static inline int call_netdev_switch_notifiers(unsigned long val, struct net_device *dev,
struct netdev_switch_notifier_info *info)
static inline int call_switchdev_notifiers(unsigned long val,
struct net_device *dev,
struct switchdev_notifier_info *info)
{
return NOTIFY_DONE;
}
static inline int netdev_switch_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
static inline int switchdev_port_bridge_getlink(struct sk_buff *skb, u32 pid,
u32 seq, struct net_device *dev,
u32 filter_mask, int nlflags)
{
return -EOPNOTSUPP;
}
static inline int netdev_switch_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
static inline int switchdev_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
{
return -EOPNOTSUPP;
}
static inline int ndo_dflt_netdev_switch_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
static inline int switchdev_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
{
return -EOPNOTSUPP;
}
static inline int switchdev_fib_ipv4_add(u32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type,
u32 nlflags, u32 tb_id)
{
return 0;
}
static inline int ndo_dflt_netdev_switch_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh,
u16 flags)
static inline int switchdev_fib_ipv4_del(u32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type, u32 tb_id)
{
return 0;
}
static inline int netdev_switch_fib_ipv4_add(u32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type,
u32 nlflags, u32 tb_id)
{
return 0;
}
static inline int netdev_switch_fib_ipv4_del(u32 dst, int dst_len,
struct fib_info *fi,
u8 tos, u8 type, u32 tb_id)
{
return 0;
}
static inline void netdev_switch_fib_ipv4_abort(struct fib_info *fi)
static inline void switchdev_fib_ipv4_abort(struct fib_info *fi)
{
}

View File

@ -121,13 +121,13 @@ static struct notifier_block br_device_notifier = {
.notifier_call = br_device_event
};
static int br_netdev_switch_event(struct notifier_block *unused,
unsigned long event, void *ptr)
static int br_switchdev_event(struct notifier_block *unused,
unsigned long event, void *ptr)
{
struct net_device *dev = netdev_switch_notifier_info_to_dev(ptr);
struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
struct net_bridge_port *p;
struct net_bridge *br;
struct netdev_switch_notifier_fdb_info *fdb_info;
struct switchdev_notifier_fdb_info *fdb_info;
int err = NOTIFY_DONE;
rtnl_lock();
@ -138,14 +138,14 @@ static int br_netdev_switch_event(struct notifier_block *unused,
br = p->br;
switch (event) {
case NETDEV_SWITCH_FDB_ADD:
case SWITCHDEV_FDB_ADD:
fdb_info = ptr;
err = br_fdb_external_learn_add(br, p, fdb_info->addr,
fdb_info->vid);
if (err)
err = notifier_from_errno(err);
break;
case NETDEV_SWITCH_FDB_DEL:
case SWITCHDEV_FDB_DEL:
fdb_info = ptr;
err = br_fdb_external_learn_del(br, p, fdb_info->addr,
fdb_info->vid);
@ -159,8 +159,8 @@ static int br_netdev_switch_event(struct notifier_block *unused,
return err;
}
static struct notifier_block br_netdev_switch_notifier = {
.notifier_call = br_netdev_switch_event,
static struct notifier_block br_switchdev_notifier = {
.notifier_call = br_switchdev_event,
};
static void __net_exit br_net_exit(struct net *net)
@ -214,7 +214,7 @@ static int __init br_init(void)
if (err)
goto err_out3;
err = register_netdev_switch_notifier(&br_netdev_switch_notifier);
err = register_switchdev_notifier(&br_switchdev_notifier);
if (err)
goto err_out4;
@ -235,7 +235,7 @@ static int __init br_init(void)
return 0;
err_out5:
unregister_netdev_switch_notifier(&br_netdev_switch_notifier);
unregister_switchdev_notifier(&br_switchdev_notifier);
err_out4:
unregister_netdevice_notifier(&br_device_notifier);
err_out3:
@ -253,7 +253,7 @@ static void __exit br_deinit(void)
{
stp_proto_unregister(&br_stp_proto);
br_netlink_fini();
unregister_netdev_switch_notifier(&br_netdev_switch_notifier);
unregister_switchdev_notifier(&br_switchdev_notifier);
unregister_netdevice_notifier(&br_device_notifier);
brioctl_set(NULL);
unregister_pernet_subsys(&br_net_ops);

View File

@ -586,7 +586,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags)
struct nlattr *afspec;
struct net_bridge_port *p;
struct nlattr *tb[IFLA_BRPORT_MAX + 1];
int err = 0, ret_offload = 0;
int err = 0;
protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_PROTINFO);
afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
@ -628,16 +628,6 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags)
afspec, RTM_SETLINK);
}
if (p && !(flags & BRIDGE_FLAGS_SELF)) {
/* set bridge attributes in hardware if supported
*/
ret_offload = netdev_switch_port_bridge_setlink(dev, nlh,
flags);
if (ret_offload && ret_offload != -EOPNOTSUPP)
br_warn(p->br, "error setting attrs on port %u(%s)\n",
(unsigned int)p->port_no, p->dev->name);
}
if (err == 0)
br_ifinfo_notify(RTM_NEWLINK, p);
out:
@ -649,7 +639,7 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags)
{
struct nlattr *afspec;
struct net_bridge_port *p;
int err = 0, ret_offload = 0;
int err = 0;
afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
if (!afspec)
@ -668,16 +658,6 @@ int br_dellink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags)
*/
br_ifinfo_notify(RTM_NEWLINK, p);
if (p && !(flags & BRIDGE_FLAGS_SELF)) {
/* del bridge attributes in hardware
*/
ret_offload = netdev_switch_port_bridge_dellink(dev, nlh,
flags);
if (ret_offload && ret_offload != -EOPNOTSUPP)
br_warn(p->br, "error deleting attrs on port %u (%s)\n",
(unsigned int)p->port_no, p->dev->name);
}
return err;
}
static int br_validate(struct nlattr *tb[], struct nlattr *data[])

View File

@ -39,10 +39,14 @@ void br_log_state(const struct net_bridge_port *p)
void br_set_state(struct net_bridge_port *p, unsigned int state)
{
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_STP_STATE,
.stp_state = state,
};
int err;
p->state = state;
err = netdev_switch_port_stp_update(p->dev, state);
err = switchdev_port_attr_set(p->dev, &attr);
if (err && err != -EOPNOTSUPP)
br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
(unsigned int) p->port_no, p->dev->name);

View File

@ -98,7 +98,6 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_RXALL_BIT] = "rx-all",
[NETIF_F_HW_L2FW_DOFFLOAD_BIT] = "l2-fwd-offload",
[NETIF_F_BUSY_POLL_BIT] = "busy-poll",
[NETIF_F_HW_SWITCH_OFFLOAD_BIT] = "hw-switch-offload",
};
static const char

View File

@ -458,11 +458,15 @@ static ssize_t phys_switch_id_show(struct device *dev,
return restart_syscall();
if (dev_isalive(netdev)) {
struct netdev_phys_item_id ppid;
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_PARENT_ID,
.flags = SWITCHDEV_F_NO_RECURSE,
};
ret = netdev_switch_parent_id_get(netdev, &ppid);
ret = switchdev_port_attr_get(netdev, &attr);
if (!ret)
ret = sprintf(buf, "%*phN\n", ppid.id_len, ppid.id);
ret = sprintf(buf, "%*phN\n", attr.ppid.id_len,
attr.ppid.id);
}
rtnl_unlock();

View File

@ -1004,16 +1004,19 @@ static int rtnl_phys_port_name_fill(struct sk_buff *skb, struct net_device *dev)
static int rtnl_phys_switch_id_fill(struct sk_buff *skb, struct net_device *dev)
{
int err;
struct netdev_phys_item_id psid;
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_PARENT_ID,
.flags = SWITCHDEV_F_NO_RECURSE,
};
err = netdev_switch_parent_id_get(dev, &psid);
err = switchdev_port_attr_get(dev, &attr);
if (err) {
if (err == -EOPNOTSUPP)
return 0;
return err;
}
if (nla_put(skb, IFLA_PHYS_SWITCH_ID, psid.id_len, psid.id))
if (nla_put(skb, IFLA_PHYS_SWITCH_ID, attr.ppid.id_len, attr.ppid.id))
return -EMSGSIZE;
return 0;

View File

@ -345,6 +345,24 @@ static int dsa_slave_stp_update(struct net_device *dev, u8 state)
return ret;
}
static int dsa_slave_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr)
{
int ret = 0;
switch (attr->id) {
case SWITCHDEV_ATTR_PORT_STP_STATE:
if (attr->trans == SWITCHDEV_TRANS_COMMIT)
ret = dsa_slave_stp_update(dev, attr->stp_state);
break;
default:
ret = -EOPNOTSUPP;
break;
}
return ret;
}
static int dsa_slave_bridge_port_join(struct net_device *dev,
struct net_device *br)
{
@ -382,14 +400,20 @@ static int dsa_slave_bridge_port_leave(struct net_device *dev)
return ret;
}
static int dsa_slave_parent_id_get(struct net_device *dev,
struct netdev_phys_item_id *psid)
static int dsa_slave_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr)
{
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_switch *ds = p->parent;
psid->id_len = sizeof(ds->index);
memcpy(&psid->id, &ds->index, psid->id_len);
switch (attr->id) {
case SWITCHDEV_ATTR_PORT_PARENT_ID:
attr->ppid.id_len = sizeof(ds->index);
memcpy(&attr->ppid.id, &ds->index, attr->ppid.id_len);
break;
default:
return -EOPNOTSUPP;
}
return 0;
}
@ -675,9 +699,9 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_get_iflink = dsa_slave_get_iflink,
};
static const struct swdev_ops dsa_slave_swdev_ops = {
.swdev_parent_id_get = dsa_slave_parent_id_get,
.swdev_port_stp_update = dsa_slave_stp_update,
static const struct switchdev_ops dsa_slave_switchdev_ops = {
.switchdev_port_attr_get = dsa_slave_port_attr_get,
.switchdev_port_attr_set = dsa_slave_port_attr_set,
};
static void dsa_slave_adjust_link(struct net_device *dev)
@ -866,7 +890,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
eth_hw_addr_inherit(slave_dev, master);
slave_dev->tx_queue_len = 0;
slave_dev->netdev_ops = &dsa_slave_netdev_ops;
slave_dev->swdev_ops = &dsa_slave_swdev_ops;
slave_dev->switchdev_ops = &dsa_slave_switchdev_ops;
netdev_for_each_tx_queue(slave_dev, dsa_slave_set_lockdep_class_one,
NULL);

View File

@ -1165,13 +1165,13 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
new_fa->fa_state = state & ~FA_S_ACCESSED;
new_fa->fa_slen = fa->fa_slen;
err = netdev_switch_fib_ipv4_add(key, plen, fi,
new_fa->fa_tos,
cfg->fc_type,
cfg->fc_nlflags,
tb->tb_id);
err = switchdev_fib_ipv4_add(key, plen, fi,
new_fa->fa_tos,
cfg->fc_type,
cfg->fc_nlflags,
tb->tb_id);
if (err) {
netdev_switch_fib_ipv4_abort(fi);
switchdev_fib_ipv4_abort(fi);
kmem_cache_free(fn_alias_kmem, new_fa);
goto out;
}
@ -1215,12 +1215,10 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
new_fa->tb_id = tb->tb_id;
/* (Optionally) offload fib entry to switch hardware. */
err = netdev_switch_fib_ipv4_add(key, plen, fi, tos,
cfg->fc_type,
cfg->fc_nlflags,
tb->tb_id);
err = switchdev_fib_ipv4_add(key, plen, fi, tos, cfg->fc_type,
cfg->fc_nlflags, tb->tb_id);
if (err) {
netdev_switch_fib_ipv4_abort(fi);
switchdev_fib_ipv4_abort(fi);
goto out_free_new_fa;
}
@ -1239,7 +1237,7 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
return 0;
out_sw_fib_del:
netdev_switch_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type, tb->tb_id);
switchdev_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type, tb->tb_id);
out_free_new_fa:
kmem_cache_free(fn_alias_kmem, new_fa);
out:
@ -1517,8 +1515,8 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
if (!fa_to_delete)
return -ESRCH;
netdev_switch_fib_ipv4_del(key, plen, fa_to_delete->fa_info, tos,
cfg->fc_type, tb->tb_id);
switchdev_fib_ipv4_del(key, plen, fa_to_delete->fa_info, tos,
cfg->fc_type, tb->tb_id);
rtmsg_fib(RTM_DELROUTE, htonl(key), fa_to_delete, plen, tb->tb_id,
&cfg->fc_nlinfo, 0);
@ -1767,10 +1765,9 @@ void fib_table_flush_external(struct fib_table *tb)
if (!fi || !(fi->fib_flags & RTNH_F_EXTERNAL))
continue;
netdev_switch_fib_ipv4_del(n->key,
KEYLENGTH - fa->fa_slen,
fi, fa->fa_tos,
fa->fa_type, tb->tb_id);
switchdev_fib_ipv4_del(n->key, KEYLENGTH - fa->fa_slen,
fi, fa->fa_tos, fa->fa_type,
tb->tb_id);
}
/* update leaf slen */
@ -1835,10 +1832,9 @@ int fib_table_flush(struct fib_table *tb)
continue;
}
netdev_switch_fib_ipv4_del(n->key,
KEYLENGTH - fa->fa_slen,
fi, fa->fa_tos,
fa->fa_type, tb->tb_id);
switchdev_fib_ipv4_del(n->key, KEYLENGTH - fa->fa_slen,
fi, fa->fa_tos, fa->fa_type,
tb->tb_id);
hlist_del_rcu(&fa->fa_list);
fib_release_info(fa->fa_info);
alias_free_mem_rcu(fa);

View File

@ -15,97 +15,328 @@
#include <linux/mutex.h>
#include <linux/notifier.h>
#include <linux/netdevice.h>
#include <linux/if_bridge.h>
#include <net/ip_fib.h>
#include <net/switchdev.h>
/**
* netdev_switch_parent_id_get - Get ID of a switch
* @dev: port device
* @psid: switch ID
* switchdev_port_attr_get - Get port attribute
*
* Get ID of a switch this port is part of.
*/
int netdev_switch_parent_id_get(struct net_device *dev,
struct netdev_phys_item_id *psid)
{
const struct swdev_ops *ops = dev->swdev_ops;
if (!ops || !ops->swdev_parent_id_get)
return -EOPNOTSUPP;
return ops->swdev_parent_id_get(dev, psid);
}
EXPORT_SYMBOL_GPL(netdev_switch_parent_id_get);
/**
* netdev_switch_port_stp_update - Notify switch device port of STP
* state change
* @dev: port device
* @state: port STP state
*
* Notify switch device port of bridge port STP state change.
* @attr: attribute to get
*/
int netdev_switch_port_stp_update(struct net_device *dev, u8 state)
int switchdev_port_attr_get(struct net_device *dev, struct switchdev_attr *attr)
{
const struct swdev_ops *ops = dev->swdev_ops;
const struct switchdev_ops *ops = dev->switchdev_ops;
struct net_device *lower_dev;
struct list_head *iter;
struct switchdev_attr first = {
.id = SWITCHDEV_ATTR_UNDEFINED
};
int err = -EOPNOTSUPP;
if (ops && ops->swdev_port_stp_update)
return ops->swdev_port_stp_update(dev, state);
if (ops && ops->switchdev_port_attr_get)
return ops->switchdev_port_attr_get(dev, attr);
if (attr->flags & SWITCHDEV_F_NO_RECURSE)
return err;
/* Switch device port(s) may be stacked under
* bond/team/vlan dev, so recurse down to get attr on
* each port. Return -ENODATA if attr values don't
* compare across ports.
*/
netdev_for_each_lower_dev(dev, lower_dev, iter) {
err = netdev_switch_port_stp_update(lower_dev, state);
if (err && err != -EOPNOTSUPP)
return err;
err = switchdev_port_attr_get(lower_dev, attr);
if (err)
break;
if (first.id == SWITCHDEV_ATTR_UNDEFINED)
first = *attr;
else if (memcmp(&first, attr, sizeof(*attr)))
return -ENODATA;
}
return err;
}
EXPORT_SYMBOL_GPL(netdev_switch_port_stp_update);
EXPORT_SYMBOL_GPL(switchdev_port_attr_get);
static DEFINE_MUTEX(netdev_switch_mutex);
static RAW_NOTIFIER_HEAD(netdev_switch_notif_chain);
static int __switchdev_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr)
{
const struct switchdev_ops *ops = dev->switchdev_ops;
struct net_device *lower_dev;
struct list_head *iter;
int err = -EOPNOTSUPP;
if (ops && ops->switchdev_port_attr_set)
return ops->switchdev_port_attr_set(dev, attr);
if (attr->flags & SWITCHDEV_F_NO_RECURSE)
return err;
/* Switch device port(s) may be stacked under
* bond/team/vlan dev, so recurse down to set attr on
* each port.
*/
netdev_for_each_lower_dev(dev, lower_dev, iter) {
err = __switchdev_port_attr_set(lower_dev, attr);
if (err)
break;
}
return err;
}
struct switchdev_attr_set_work {
struct work_struct work;
struct net_device *dev;
struct switchdev_attr attr;
};
static void switchdev_port_attr_set_work(struct work_struct *work)
{
struct switchdev_attr_set_work *asw =
container_of(work, struct switchdev_attr_set_work, work);
int err;
rtnl_lock();
err = switchdev_port_attr_set(asw->dev, &asw->attr);
BUG_ON(err);
rtnl_unlock();
dev_put(asw->dev);
kfree(work);
}
static int switchdev_port_attr_set_defer(struct net_device *dev,
struct switchdev_attr *attr)
{
struct switchdev_attr_set_work *asw;
asw = kmalloc(sizeof(*asw), GFP_ATOMIC);
if (!asw)
return -ENOMEM;
INIT_WORK(&asw->work, switchdev_port_attr_set_work);
dev_hold(dev);
asw->dev = dev;
memcpy(&asw->attr, attr, sizeof(asw->attr));
schedule_work(&asw->work);
return 0;
}
/**
* register_netdev_switch_notifier - Register notifier
* switchdev_port_attr_set - Set port attribute
*
* @dev: port device
* @attr: attribute to set
*
* Use a 2-phase prepare-commit transaction model to ensure
* system is not left in a partially updated state due to
* failure from driver/device.
*/
int switchdev_port_attr_set(struct net_device *dev, struct switchdev_attr *attr)
{
int err;
if (!rtnl_is_locked()) {
/* Running prepare-commit transaction across stacked
* devices requires nothing moves, so if rtnl_lock is
* not held, schedule a worker thread to hold rtnl_lock
* while setting attr.
*/
return switchdev_port_attr_set_defer(dev, attr);
}
/* Phase I: prepare for attr set. Driver/device should fail
* here if there are going to be issues in the commit phase,
* such as lack of resources or support. The driver/device
* should reserve resources needed for the commit phase here,
* but should not commit the attr.
*/
attr->trans = SWITCHDEV_TRANS_PREPARE;
err = __switchdev_port_attr_set(dev, attr);
if (err) {
/* Prepare phase failed: abort the transaction. Any
* resources reserved in the prepare phase are
* released.
*/
attr->trans = SWITCHDEV_TRANS_ABORT;
__switchdev_port_attr_set(dev, attr);
return err;
}
/* Phase II: commit attr set. This cannot fail as a fault
* of driver/device. If it does, it's a bug in the driver/device
because the driver said everything was OK in phase I.
*/
attr->trans = SWITCHDEV_TRANS_COMMIT;
err = __switchdev_port_attr_set(dev, attr);
BUG_ON(err);
return err;
}
EXPORT_SYMBOL_GPL(switchdev_port_attr_set);
int __switchdev_port_obj_add(struct net_device *dev, struct switchdev_obj *obj)
{
const struct switchdev_ops *ops = dev->switchdev_ops;
struct net_device *lower_dev;
struct list_head *iter;
int err = -EOPNOTSUPP;
if (ops && ops->switchdev_port_obj_add)
return ops->switchdev_port_obj_add(dev, obj);
/* Switch device port(s) may be stacked under
* bond/team/vlan dev, so recurse down to add object on
* each port.
*/
netdev_for_each_lower_dev(dev, lower_dev, iter) {
err = __switchdev_port_obj_add(lower_dev, obj);
if (err)
break;
}
return err;
}
/**
* switchdev_port_obj_add - Add port object
*
* @dev: port device
* @obj: object to add
*
* Use a 2-phase prepare-commit transaction model to ensure
* system is not left in a partially updated state due to
* failure from driver/device.
*
* rtnl_lock must be held.
*/
int switchdev_port_obj_add(struct net_device *dev, struct switchdev_obj *obj)
{
int err;
ASSERT_RTNL();
/* Phase I: prepare for obj add. Driver/device should fail
* here if there are going to be issues in the commit phase,
* such as lack of resources or support. The driver/device
* should reserve resources needed for the commit phase here,
* but should not commit the obj.
*/
obj->trans = SWITCHDEV_TRANS_PREPARE;
err = __switchdev_port_obj_add(dev, obj);
if (err) {
/* Prepare phase failed: abort the transaction. Any
* resources reserved in the prepare phase are
* released.
*/
obj->trans = SWITCHDEV_TRANS_ABORT;
__switchdev_port_obj_add(dev, obj);
return err;
}
/* Phase II: commit obj add. This cannot fail as a fault
* of driver/device. If it does, it's a bug in the driver/device
because the driver said everything was OK in phase I.
*/
obj->trans = SWITCHDEV_TRANS_COMMIT;
err = __switchdev_port_obj_add(dev, obj);
WARN(err, "%s: Commit of object (id=%d) failed.\n", dev->name, obj->id);
return err;
}
EXPORT_SYMBOL_GPL(switchdev_port_obj_add);
/**
* switchdev_port_obj_del - Delete port object
*
* @dev: port device
* @obj: object to delete
*/
int switchdev_port_obj_del(struct net_device *dev, struct switchdev_obj *obj)
{
const struct switchdev_ops *ops = dev->switchdev_ops;
struct net_device *lower_dev;
struct list_head *iter;
int err = -EOPNOTSUPP;
if (ops && ops->switchdev_port_obj_del)
return ops->switchdev_port_obj_del(dev, obj);
/* Switch device port(s) may be stacked under
* bond/team/vlan dev, so recurse down to delete object on
* each port.
*/
netdev_for_each_lower_dev(dev, lower_dev, iter) {
err = switchdev_port_obj_del(lower_dev, obj);
if (err)
break;
}
return err;
}
EXPORT_SYMBOL_GPL(switchdev_port_obj_del);
static DEFINE_MUTEX(switchdev_mutex);
static RAW_NOTIFIER_HEAD(switchdev_notif_chain);
/**
* register_switchdev_notifier - Register notifier
* @nb: notifier_block
*
* Register switch device notifier. This should be used by code
* which needs to monitor events happening in particular device.
* Return values are same as for atomic_notifier_chain_register().
*/
int register_netdev_switch_notifier(struct notifier_block *nb)
int register_switchdev_notifier(struct notifier_block *nb)
{
int err;
mutex_lock(&netdev_switch_mutex);
err = raw_notifier_chain_register(&netdev_switch_notif_chain, nb);
mutex_unlock(&netdev_switch_mutex);
mutex_lock(&switchdev_mutex);
err = raw_notifier_chain_register(&switchdev_notif_chain, nb);
mutex_unlock(&switchdev_mutex);
return err;
}
EXPORT_SYMBOL_GPL(register_netdev_switch_notifier);
EXPORT_SYMBOL_GPL(register_switchdev_notifier);
/**
* unregister_netdev_switch_notifier - Unregister notifier
* unregister_switchdev_notifier - Unregister notifier
* @nb: notifier_block
*
* Unregister switch device notifier.
* Return values are same as for atomic_notifier_chain_unregister().
*/
int unregister_netdev_switch_notifier(struct notifier_block *nb)
int unregister_switchdev_notifier(struct notifier_block *nb)
{
int err;
mutex_lock(&netdev_switch_mutex);
err = raw_notifier_chain_unregister(&netdev_switch_notif_chain, nb);
mutex_unlock(&netdev_switch_mutex);
mutex_lock(&switchdev_mutex);
err = raw_notifier_chain_unregister(&switchdev_notif_chain, nb);
mutex_unlock(&switchdev_mutex);
return err;
}
EXPORT_SYMBOL_GPL(unregister_netdev_switch_notifier);
EXPORT_SYMBOL_GPL(unregister_switchdev_notifier);
/**
* call_netdev_switch_notifiers - Call notifiers
* call_switchdev_notifiers - Call notifiers
* @val: value passed unmodified to notifier function
* @dev: port device
* @info: notifier information data
@ -114,146 +345,241 @@ EXPORT_SYMBOL_GPL(unregister_netdev_switch_notifier);
* when it needs to propagate hardware event.
* Return values are same as for atomic_notifier_call_chain().
*/
int call_netdev_switch_notifiers(unsigned long val, struct net_device *dev,
struct netdev_switch_notifier_info *info)
int call_switchdev_notifiers(unsigned long val, struct net_device *dev,
struct switchdev_notifier_info *info)
{
int err;
info->dev = dev;
mutex_lock(&netdev_switch_mutex);
err = raw_notifier_call_chain(&netdev_switch_notif_chain, val, info);
mutex_unlock(&netdev_switch_mutex);
mutex_lock(&switchdev_mutex);
err = raw_notifier_call_chain(&switchdev_notif_chain, val, info);
mutex_unlock(&switchdev_mutex);
return err;
}
EXPORT_SYMBOL_GPL(call_netdev_switch_notifiers);
EXPORT_SYMBOL_GPL(call_switchdev_notifiers);
/**
* netdev_switch_port_bridge_setlink - Notify switch device port of bridge
* port attributes
* switchdev_port_bridge_getlink - Get bridge port attributes
*
* @dev: port device
* @nlh: netlink msg with bridge port attributes
* @flags: bridge setlink flags
*
* Notify switch device port of bridge port attributes
* Called for SELF on rtnl_bridge_getlink to get bridge port
* attributes.
*/
int switchdev_port_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
struct net_device *dev, u32 filter_mask,
int nlflags)
{
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS,
};
u16 mode = BRIDGE_MODE_UNDEF;
u32 mask = BR_LEARNING | BR_LEARNING_SYNC;
int err;
err = switchdev_port_attr_get(dev, &attr);
if (err)
return err;
return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode,
attr.brport_flags, mask, nlflags);
}
EXPORT_SYMBOL_GPL(switchdev_port_bridge_getlink);
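/* Example (sketch, not from this series): a driver answering the attr get
 * that switchdev_port_bridge_getlink() issues above. struct example_port and
 * its fields are hypothetical driver state; the attr id and brport_flags
 * member are taken from this file.
 */
struct example_port {
        bool learning;
        bool learning_sync;
};

static int example_port_attr_get(struct net_device *dev,
                                 struct switchdev_attr *attr)
{
        struct example_port *port = netdev_priv(dev);

        switch (attr->id) {
        case SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS:
                attr->brport_flags =
                        (port->learning ? BR_LEARNING : 0) |
                        (port->learning_sync ? BR_LEARNING_SYNC : 0);
                return 0;
        default:
                return -EOPNOTSUPP;
        }
}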
static int switchdev_port_br_setflag(struct net_device *dev,
struct nlattr *nlattr,
unsigned long brport_flag)
{
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS,
};
u8 flag = nla_get_u8(nlattr);
int err;
err = switchdev_port_attr_get(dev, &attr);
if (err)
return err;
if (flag)
attr.brport_flags |= brport_flag;
else
attr.brport_flags &= ~brport_flag;
return switchdev_port_attr_set(dev, &attr);
}
static const struct nla_policy
switchdev_port_bridge_policy[IFLA_BRPORT_MAX + 1] = {
[IFLA_BRPORT_STATE] = { .type = NLA_U8 },
[IFLA_BRPORT_COST] = { .type = NLA_U32 },
[IFLA_BRPORT_PRIORITY] = { .type = NLA_U16 },
[IFLA_BRPORT_MODE] = { .type = NLA_U8 },
[IFLA_BRPORT_GUARD] = { .type = NLA_U8 },
[IFLA_BRPORT_PROTECT] = { .type = NLA_U8 },
[IFLA_BRPORT_FAST_LEAVE] = { .type = NLA_U8 },
[IFLA_BRPORT_LEARNING] = { .type = NLA_U8 },
[IFLA_BRPORT_LEARNING_SYNC] = { .type = NLA_U8 },
[IFLA_BRPORT_UNICAST_FLOOD] = { .type = NLA_U8 },
};
static int switchdev_port_br_setlink_protinfo(struct net_device *dev,
struct nlattr *protinfo)
{
struct nlattr *attr;
int rem;
int err;
err = nla_validate_nested(protinfo, IFLA_BRPORT_MAX,
switchdev_port_bridge_policy);
if (err)
return err;
nla_for_each_nested(attr, protinfo, rem) {
switch (nla_type(attr)) {
case IFLA_BRPORT_LEARNING:
err = switchdev_port_br_setflag(dev, attr,
BR_LEARNING);
break;
case IFLA_BRPORT_LEARNING_SYNC:
err = switchdev_port_br_setflag(dev, attr,
BR_LEARNING_SYNC);
break;
default:
err = -EOPNOTSUPP;
break;
}
if (err)
return err;
}
return 0;
}
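/* Example (sketch): the driver side of the attr set issued by
 * switchdev_port_br_setflag() above, following the two-phase prepare-commit
 * model from the cover letter. attr->trans and SWITCHDEV_TRANS_PREPARE are
 * assumed from the switchdev header this series introduces;
 * example_hw_set_learning() is a hypothetical device hook.
 */
static int example_hw_set_learning(struct net_device *dev, bool learning,
                                   bool learning_sync);

static int example_port_attr_set(struct net_device *dev,
                                 struct switchdev_attr *attr)
{
        switch (attr->id) {
        case SWITCHDEV_ATTR_PORT_BRIDGE_FLAGS:
                /* PREPARE phase: only validate; nothing to reserve here */
                if (attr->trans == SWITCHDEV_TRANS_PREPARE)
                        return 0;
                return example_hw_set_learning(dev,
                                !!(attr->brport_flags & BR_LEARNING),
                                !!(attr->brport_flags & BR_LEARNING_SYNC));
        default:
                return -EOPNOTSUPP;
        }
}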
static int switchdev_port_br_afspec(struct net_device *dev,
struct nlattr *afspec,
int (*f)(struct net_device *dev,
struct switchdev_obj *obj))
{
struct nlattr *attr;
struct bridge_vlan_info *vinfo;
struct switchdev_obj obj = {
.id = SWITCHDEV_OBJ_PORT_VLAN,
};
int rem;
int err;
nla_for_each_nested(attr, afspec, rem) {
if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
continue;
if (nla_len(attr) != sizeof(struct bridge_vlan_info))
return -EINVAL;
vinfo = nla_data(attr);
obj.vlan.flags = vinfo->flags;
if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_BEGIN) {
if (obj.vlan.vid_start)
return -EINVAL;
obj.vlan.vid_start = vinfo->vid;
} else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) {
if (!obj.vlan.vid_start)
return -EINVAL;
obj.vlan.vid_end = vinfo->vid;
if (obj.vlan.vid_end <= obj.vlan.vid_start)
return -EINVAL;
err = f(dev, &obj);
if (err)
return err;
memset(&obj.vlan, 0, sizeof(obj.vlan));
} else {
if (obj.vlan.vid_start)
return -EINVAL;
obj.vlan.vid_start = vinfo->vid;
obj.vlan.vid_end = vinfo->vid;
err = f(dev, &obj);
if (err)
return err;
memset(&obj.vlan, 0, sizeof(obj.vlan));
}
}
return 0;
}
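/* Example (sketch, not part of this series): how a port driver might consume
 * the SWITCHDEV_OBJ_PORT_VLAN object built by switchdev_port_br_afspec()
 * above. The obj->vlan fields come from this file; obj->trans and
 * SWITCHDEV_TRANS_PREPARE are assumed from the series' two-phase model, and
 * example_port_vlan_install() is a hypothetical hardware hook.
 */
static int example_port_vlan_install(struct net_device *dev, u16 vid,
                                     u16 flags)
{
        /* program VLAN membership (PVID/untagged per flags) into hardware */
        return 0;
}

static int example_port_obj_add(struct net_device *dev,
                                struct switchdev_obj *obj)
{
        u16 vid;
        int err;

        if (obj->id != SWITCHDEV_OBJ_PORT_VLAN)
                return -EOPNOTSUPP;

        /* PREPARE phase: validate and reserve only; nothing to do here */
        if (obj->trans == SWITCHDEV_TRANS_PREPARE)
                return 0;

        for (vid = obj->vlan.vid_start; vid <= obj->vlan.vid_end; vid++) {
                err = example_port_vlan_install(dev, vid, obj->vlan.flags);
                if (err)
                        return err;
        }
        return 0;
}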
/**
* switchdev_port_bridge_setlink - Set bridge port attributes
*
* @dev: port device
* @nlh: netlink header
* @flags: netlink flags
*
* Called for SELF on rtnl_bridge_setlink to set bridge port
* attributes.
*/
int switchdev_port_bridge_setlink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags)
{
struct nlattr *protinfo;
struct nlattr *afspec;
int err = 0;
protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
IFLA_PROTINFO);
if (protinfo) {
err = switchdev_port_br_setlink_protinfo(dev, protinfo);
if (err)
return err;
}
afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
IFLA_AF_SPEC);
if (afspec)
err = switchdev_port_br_afspec(dev, afspec,
switchdev_port_obj_add);
return err;
}
EXPORT_SYMBOL_GPL(switchdev_port_bridge_setlink);
/**
* switchdev_port_bridge_dellink - Delete bridge port attributes
*
* @dev: port device
* @nlh: netlink header
* @flags: netlink flags
*
* Called for SELF on rtnl_bridge_dellink to delete bridge port
* attributes.
*/
int switchdev_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags)
{
struct nlattr *afspec;
afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
IFLA_AF_SPEC);
if (afspec)
return switchdev_port_br_afspec(dev, afspec,
switchdev_port_obj_del);
return 0;
}
EXPORT_SYMBOL_GPL(switchdev_port_bridge_dellink);
static struct net_device *switchdev_get_lowest_dev(struct net_device *dev)
{
const struct switchdev_ops *ops = dev->switchdev_ops;
struct net_device *lower_dev;
struct net_device *port_dev;
struct list_head *iter;
/* Recursively search down until we find a sw port dev.
* (A sw port dev supports switchdev_port_attr_get).
*/
if (ops && ops->switchdev_port_attr_get)
return dev;
netdev_for_each_lower_dev(dev, lower_dev, iter) {
port_dev = switchdev_get_lowest_dev(lower_dev);
if (port_dev)
return port_dev;
}
return NULL;
}
static struct net_device *switchdev_get_dev_by_nhs(struct fib_info *fi)
{
struct switchdev_attr attr = {
.id = SWITCHDEV_ATTR_PORT_PARENT_ID,
};
struct switchdev_attr prev_attr;
struct net_device *dev = NULL;
int nhsel;
/* For this route, all nexthop devs must be on the same switch. */
for (nhsel = 0; nhsel < fi->fib_nhs; nhsel++) {
const struct fib_nh *nh = &fi->fib_nh[nhsel];
if (!nh->nh_dev)
return NULL;
dev = switchdev_get_lowest_dev(nh->nh_dev);
if (!dev)
return NULL;
if (switchdev_port_attr_get(dev, &attr))
return NULL;
if (nhsel > 0) {
if (prev_attr.ppid.id_len != attr.ppid.id_len)
return NULL;
if (memcmp(prev_attr.ppid.id, attr.ppid.id,
attr.ppid.id_len))
return NULL;
}
prev_attr = attr;
}
return dev;
}
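/* Example (sketch): switchdev_get_dev_by_nhs() above only offloads a
 * multipath route when every nexthop port reports the same parent ID, so
 * ports of one switch ASIC would typically answer the attr get like this.
 * struct example_sw_port and its switch_id field are hypothetical driver
 * state.
 */
struct example_sw_port {
        u64 switch_id;          /* shared by all ports of one ASIC */
};

static int example_parent_id_get(struct net_device *dev,
                                 struct switchdev_attr *attr)
{
        struct example_sw_port *port = netdev_priv(dev);

        if (attr->id != SWITCHDEV_ATTR_PORT_PARENT_ID)
                return -EOPNOTSUPP;

        attr->ppid.id_len = sizeof(port->switch_id);
        memcpy(attr->ppid.id, &port->switch_id, attr->ppid.id_len);
        return 0;
}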
/**
* switchdev_fib_ipv4_add - Add IPv4 route entry to switch
*
* @dst: route's IPv4 destination address
* @dst_len: destination address length (prefix length)
* @fi: route FIB info structure
* @tos: route TOS
* @type: route type
* @nlflags: netlink flags passed in (NLM_F_*)
* @tb_id: route table ID
*
* Add IPv4 route entry to switch device.
*/
int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 nlflags, u32 tb_id)
{
struct switchdev_obj fib_obj = {
.id = SWITCHDEV_OBJ_IPV4_FIB,
.ipv4_fib = {
.dst = htonl(dst),
.dst_len = dst_len,
.fi = fi,
.tos = tos,
.type = type,
.nlflags = nlflags,
.tb_id = tb_id,
},
};
struct net_device *dev;
int err = 0;
/* Don't offload route if using custom ip rules or if
* route offload is disabled by a previous abort.
*/
if (fi->fib_net->ipv4.fib_has_custom_rules)
return 0;
if (fi->fib_net->ipv4.fib_offload_disabled)
return 0;
dev = switchdev_get_dev_by_nhs(fi);
if (!dev)
return 0;
err = switchdev_port_obj_add(dev, &fib_obj);
if (!err)
fi->fib_flags |= RTNH_F_EXTERNAL;
return err;
}
EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_add);
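/* Example (sketch): the driver side of the SWITCHDEV_OBJ_IPV4_FIB object
 * added above. The obj->ipv4_fib fields mirror the initializer in
 * switchdev_fib_ipv4_add(); obj->trans/SWITCHDEV_TRANS_PREPARE follow the
 * series' prepare-commit model, and the example_hw_* hooks are hypothetical
 * device calls, not a real driver API.
 */
static int example_hw_fib4_reserve(struct net_device *dev);
static int example_hw_fib4_insert(struct net_device *dev, u32 dst, int dst_len,
                                  struct fib_info *fi, u32 tb_id);

static int example_obj_ipv4_fib_add(struct net_device *dev,
                                    struct switchdev_obj *obj)
{
        if (obj->id != SWITCHDEV_OBJ_IPV4_FIB)
                return -EOPNOTSUPP;

        /* PREPARE: fail here (e.g. table full) so the core can abort */
        if (obj->trans == SWITCHDEV_TRANS_PREPARE)
                return example_hw_fib4_reserve(dev);

        return example_hw_fib4_insert(dev, obj->ipv4_fib.dst,
                                      obj->ipv4_fib.dst_len,
                                      obj->ipv4_fib.fi,
                                      obj->ipv4_fib.tb_id);
}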
/**
* switchdev_fib_ipv4_del - Delete IPv4 route entry from switch
*
* @dst: route's IPv4 destination address
* @dst_len: destination address length (prefix length)
* @fi: route FIB info structure
* @tos: route TOS
* @type: route type
* @tb_id: route table ID
*
* Delete IPv4 route entry from switch device.
*/
int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
u8 tos, u8 type, u32 tb_id)
{
struct switchdev_obj fib_obj = {
.id = SWITCHDEV_OBJ_IPV4_FIB,
.ipv4_fib = {
.dst = htonl(dst),
.dst_len = dst_len,
.fi = fi,
.tos = tos,
.type = type,
.nlflags = 0,
.tb_id = tb_id,
},
};
struct net_device *dev;
int err = 0;
if (!(fi->fib_flags & RTNH_F_EXTERNAL))
return 0;
dev = switchdev_get_dev_by_nhs(fi);
if (!dev)
return 0;
err = switchdev_port_obj_del(dev, &fib_obj);
if (!err)
fi->fib_flags &= ~RTNH_F_EXTERNAL;
return err;
}
EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_del);
/**
* switchdev_fib_ipv4_abort - Abort an IPv4 FIB operation
*
* @fi: route FIB info structure
*/
void switchdev_fib_ipv4_abort(struct fib_info *fi)
{
/* There was a problem installing this route to the offload
* device. For now, until we come up with more refined
* policy handling, flush the routes already offloaded and disable
* further IPv4 offloading for this net.
*/
fib_flush_external(fi->fib_net);
fi->fib_net->ipv4.fib_offload_disabled = true;
}
EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_abort);
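/* Example (sketch): the expected caller-side pattern for the abort helper,
 * per the cover letter's "if prepare fails, abort the transaction" rule.
 * The real caller is the IPv4 FIB insert path; this condensed form is
 * illustrative only.
 */
static void example_fib_offload(u32 dst, int dst_len, struct fib_info *fi,
                                u8 tos, u8 type, u32 nlflags, u32 tb_id)
{
        int err;

        err = switchdev_fib_ipv4_add(dst, dst_len, fi, tos, type,
                                     nlflags, tb_id);
        if (err)
                switchdev_fib_ipv4_abort(fi);
}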