mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-21 09:24:37 +07:00
0a58d471de
This adds a table which illustrates what combinations of management / regular traffic work depending on the state the switch ports are in. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
221 lines
11 KiB
ReStructuredText
221 lines
11 KiB
ReStructuredText
=========================
|
|
NXP SJA1105 switch driver
|
|
=========================
|
|
|
|
Overview
|
|
========
|
|
|
|
The NXP SJA1105 is a family of 6 devices:
|
|
|
|
- SJA1105E: First generation, no TTEthernet
|
|
- SJA1105T: First generation, TTEthernet
|
|
- SJA1105P: Second generation, no TTEthernet, no SGMII
|
|
- SJA1105Q: Second generation, TTEthernet, no SGMII
|
|
- SJA1105R: Second generation, no TTEthernet, SGMII
|
|
- SJA1105S: Second generation, TTEthernet, SGMII
|
|
|
|
These are SPI-managed automotive switches, with all ports being gigabit
|
|
capable, and supporting MII/RMII/RGMII and optionally SGMII on one port.
|
|
|
|
Being automotive parts, their configuration interface is geared towards
|
|
set-and-forget use, with minimal dynamic interaction at runtime. They
|
|
require a static configuration to be composed by software and packed
|
|
with CRC and table headers, and sent over SPI.
|
|
|
|
The static configuration is composed of several configuration tables. Each
|
|
table takes a number of entries. Some configuration tables can be (partially)
|
|
reconfigured at runtime, some not. Some tables are mandatory, some not:
|
|
|
|
============================= ================== =============================
|
|
Table Mandatory Reconfigurable
|
|
============================= ================== =============================
|
|
Schedule no no
|
|
Schedule entry points if Scheduling no
|
|
VL Lookup no no
|
|
VL Policing if VL Lookup no
|
|
VL Forwarding if VL Lookup no
|
|
L2 Lookup no no
|
|
L2 Policing yes no
|
|
VLAN Lookup yes yes
|
|
L2 Forwarding yes partially (fully on P/Q/R/S)
|
|
MAC Config yes partially (fully on P/Q/R/S)
|
|
Schedule Params if Scheduling no
|
|
Schedule Entry Points Params if Scheduling no
|
|
VL Forwarding Params if VL Forwarding no
|
|
L2 Lookup Params no partially (fully on P/Q/R/S)
|
|
L2 Forwarding Params yes no
|
|
Clock Sync Params no no
|
|
AVB Params no no
|
|
General Params yes partially
|
|
Retagging no yes
|
|
xMII Params yes no
|
|
SGMII no yes
|
|
============================= ================== =============================
|
|
|
|
|
|
Also the configuration is write-only (software cannot read it back from the
|
|
switch except for very few exceptions).
|
|
|
|
The driver creates a static configuration at probe time, and keeps it at
|
|
all times in memory, as a shadow for the hardware state. When required to
|
|
change a hardware setting, the static configuration is also updated.
|
|
If that changed setting can be transmitted to the switch through the dynamic
|
|
reconfiguration interface, it is; otherwise the switch is reset and
|
|
reprogrammed with the updated static configuration.
|
|
|
|
Traffic support
|
|
===============
|
|
|
|
The switches do not support switch tagging in hardware. But they do support
|
|
customizing the TPID by which VLAN traffic is identified as such. The switch
|
|
driver is leveraging ``CONFIG_NET_DSA_TAG_8021Q`` by requesting that special
|
|
VLANs (with a custom TPID of ``ETH_P_EDSA`` instead of ``ETH_P_8021Q``) are
|
|
installed on its ports when not in ``vlan_filtering`` mode. This does not
|
|
interfere with the reception and transmission of real 802.1Q-tagged traffic,
|
|
because the switch does no longer parse those packets as VLAN after the TPID
|
|
change.
|
|
The TPID is restored when ``vlan_filtering`` is requested by the user through
|
|
the bridge layer, and general IP termination becomes no longer possible through
|
|
the switch netdevices in this mode.
|
|
|
|
The switches have two programmable filters for link-local destination MACs.
|
|
These are used to trap BPDUs and PTP traffic to the master netdevice, and are
|
|
further used to support STP and 1588 ordinary clock/boundary clock
|
|
functionality.
|
|
|
|
The following traffic modes are supported over the switch netdevices:
|
|
|
|
+--------------------+------------+------------------+------------------+
|
|
| | Standalone | Bridged with | Bridged with |
|
|
| | ports | vlan_filtering 0 | vlan_filtering 1 |
|
|
+====================+============+==================+==================+
|
|
| Regular traffic | Yes | Yes | No (use master) |
|
|
+--------------------+------------+------------------+------------------+
|
|
| Management traffic | Yes | Yes | Yes |
|
|
| (BPDU, PTP) | | | |
|
|
+--------------------+------------+------------------+------------------+
|
|
|
|
Switching features
|
|
==================
|
|
|
|
The driver supports the configuration of L2 forwarding rules in hardware for
|
|
port bridging. The forwarding, broadcast and flooding domain between ports can
|
|
be restricted through two methods: either at the L2 forwarding level (isolate
|
|
one bridge's ports from another's) or at the VLAN port membership level
|
|
(isolate ports within the same bridge). The final forwarding decision taken by
|
|
the hardware is a logical AND of these two sets of rules.
|
|
|
|
The hardware tags all traffic internally with a port-based VLAN (pvid), or it
|
|
decodes the VLAN information from the 802.1Q tag. Advanced VLAN classification
|
|
is not possible. Once attributed a VLAN tag, frames are checked against the
|
|
port's membership rules and dropped at ingress if they don't match any VLAN.
|
|
This behavior is available when switch ports are enslaved to a bridge with
|
|
``vlan_filtering 1``.
|
|
|
|
Normally the hardware is not configurable with respect to VLAN awareness, but
|
|
by changing what TPID the switch searches 802.1Q tags for, the semantics of a
|
|
bridge with ``vlan_filtering 0`` can be kept (accept all traffic, tagged or
|
|
untagged), and therefore this mode is also supported.
|
|
|
|
Segregating the switch ports in multiple bridges is supported (e.g. 2 + 2), but
|
|
all bridges should have the same level of VLAN awareness (either both have
|
|
``vlan_filtering`` 0, or both 1). Also an inevitable limitation of the fact
|
|
that VLAN awareness is global at the switch level is that once a bridge with
|
|
``vlan_filtering`` enslaves at least one switch port, the other un-bridged
|
|
ports are no longer available for standalone traffic termination.
|
|
|
|
Topology and loop detection through STP is supported.
|
|
|
|
L2 FDB manipulation (add/delete/dump) is currently possible for the first
|
|
generation devices. Aging time of FDB entries, as well as enabling fully static
|
|
management (no address learning and no flooding of unknown traffic) is not yet
|
|
configurable in the driver.
|
|
|
|
A special comment about bridging with other netdevices (illustrated with an
|
|
example):
|
|
|
|
A board has eth0, eth1, swp0@eth1, swp1@eth1, swp2@eth1, swp3@eth1.
|
|
The switch ports (swp0-3) are under br0.
|
|
It is desired that eth0 is turned into another switched port that communicates
|
|
with swp0-3.
|
|
|
|
If br0 has vlan_filtering 0, then eth0 can simply be added to br0 with the
|
|
intended results.
|
|
If br0 has vlan_filtering 1, then a new br1 interface needs to be created that
|
|
enslaves eth0 and eth1 (the DSA master of the switch ports). This is because in
|
|
this mode, the switch ports beneath br0 are not capable of regular traffic, and
|
|
are only used as a conduit for switchdev operations.
|
|
|
|
Device Tree bindings and board design
|
|
=====================================
|
|
|
|
This section references ``Documentation/devicetree/bindings/net/dsa/sja1105.txt``
|
|
and aims to showcase some potential switch caveats.
|
|
|
|
RMII PHY role and out-of-band signaling
|
|
---------------------------------------
|
|
|
|
In the RMII spec, the 50 MHz clock signals are either driven by the MAC or by
|
|
an external oscillator (but not by the PHY).
|
|
But the spec is rather loose and devices go outside it in several ways.
|
|
Some PHYs go against the spec and may provide an output pin where they source
|
|
the 50 MHz clock themselves, in an attempt to be helpful.
|
|
On the other hand, the SJA1105 is only binary configurable - when in the RMII
|
|
MAC role it will also attempt to drive the clock signal. To prevent this from
|
|
happening it must be put in RMII PHY role.
|
|
But doing so has some unintended consequences.
|
|
In the RMII spec, the PHY can transmit extra out-of-band signals via RXD[1:0].
|
|
These are practically some extra code words (/J/ and /K/) sent prior to the
|
|
preamble of each frame. The MAC does not have this out-of-band signaling
|
|
mechanism defined by the RMII spec.
|
|
So when the SJA1105 port is put in PHY role to avoid having 2 drivers on the
|
|
clock signal, inevitably an RMII PHY-to-PHY connection is created. The SJA1105
|
|
emulates a PHY interface fully and generates the /J/ and /K/ symbols prior to
|
|
frame preambles, which the real PHY is not expected to understand. So the PHY
|
|
simply encodes the extra symbols received from the SJA1105-as-PHY onto the
|
|
100Base-Tx wire.
|
|
On the other side of the wire, some link partners might discard these extra
|
|
symbols, while others might choke on them and discard the entire Ethernet
|
|
frames that follow along. This looks like packet loss with some link partners
|
|
but not with others.
|
|
The take-away is that in RMII mode, the SJA1105 must be let to drive the
|
|
reference clock if connected to a PHY.
|
|
|
|
RGMII fixed-link and internal delays
|
|
------------------------------------
|
|
|
|
As mentioned in the bindings document, the second generation of devices has
|
|
tunable delay lines as part of the MAC, which can be used to establish the
|
|
correct RGMII timing budget.
|
|
When powered up, these can shift the Rx and Tx clocks with a phase difference
|
|
between 73.8 and 101.7 degrees.
|
|
The catch is that the delay lines need to lock onto a clock signal with a
|
|
stable frequency. This means that there must be at least 2 microseconds of
|
|
silence between the clock at the old vs at the new frequency. Otherwise the
|
|
lock is lost and the delay lines must be reset (powered down and back up).
|
|
In RGMII the clock frequency changes with link speed (125 MHz at 1000 Mbps, 25
|
|
MHz at 100 Mbps and 2.5 MHz at 10 Mbps), and link speed might change during the
|
|
AN process.
|
|
In the situation where the switch port is connected through an RGMII fixed-link
|
|
to a link partner whose link state life cycle is outside the control of Linux
|
|
(such as a different SoC), then the delay lines would remain unlocked (and
|
|
inactive) until there is manual intervention (ifdown/ifup on the switch port).
|
|
The take-away is that in RGMII mode, the switch's internal delays are only
|
|
reliable if the link partner never changes link speeds, or if it does, it does
|
|
so in a way that is coordinated with the switch port (practically, both ends of
|
|
the fixed-link are under control of the same Linux system).
|
|
As to why would a fixed-link interface ever change link speeds: there are
|
|
Ethernet controllers out there which come out of reset in 100 Mbps mode, and
|
|
their driver inevitably needs to change the speed and clock frequency if it's
|
|
required to work at gigabit.
|
|
|
|
MDIO bus and PHY management
|
|
---------------------------
|
|
|
|
The SJA1105 does not have an MDIO bus and does not perform in-band AN either.
|
|
Therefore there is no link state notification coming from the switch device.
|
|
A board would need to hook up the PHYs connected to the switch to any other
|
|
MDIO bus available to Linux within the system (e.g. to the DSA master's MDIO
|
|
bus). Link state management then works by the driver manually keeping in sync
|
|
(over SPI commands) the MAC link speed with the settings negotiated by the PHY.
|